
AMD launches Instinct MI350P PCIe card for enterprise AI inference in standard servers

By Soren Chau · 3 min read

AMD on Wednesday launched the Instinct MI350P, a PCIe accelerator card built so enterprises can run AI inference inside the air-cooled servers they already own, instead of rebuilding a data centre around liquid-cooled racks. The card carries 144GB of HBM3E memory at 4TB/s, draws 600W in a dual-slot form factor, and slots into existing PowerEdge and ProLiant chassis from Dell and HPE.

The MI350P is the standard-server cousin of AMD’s MI350X, which ships in the OAM module format used by hyperscalers. Same CDNA 4 architecture, smaller chip, half the compute units (128 versus 256), half the memory (144GB versus 288GB), and roughly 60 per cent of the power budget. Density and interconnect are where the cut shows up: the OAM card uses Infinity Fabric for chip-to-chip traffic, while the MI350P stops at PCIe Gen5 x16 at 128GB/s.

For most on-prem deployments, that is fine. AMD said up to eight MI350P cards can sit in a single air-cooled chassis, giving enterprises 1,152GB of HBM3E in a box they can plug into a regular rack. Peak compute per card hits 4,600 teraflops at MXFP4 precision and 2,299 teraflops at standard precision, the company said.
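As a back-of-envelope check, the per-chassis totals follow directly from the per-card figures AMD published (the constant names here are illustrative, not from AMD's materials):

```python
# Per-chassis totals from AMD's published per-card figures,
# assuming the stated maximum of eight MI350P cards per chassis.
CARDS_PER_CHASSIS = 8
HBM_PER_CARD_GB = 144          # HBM3E per card
MXFP4_TFLOPS_PER_CARD = 4600   # peak MXFP4 compute per card
POWER_PER_CARD_W = 600         # board power per card

total_hbm_gb = CARDS_PER_CHASSIS * HBM_PER_CARD_GB              # 1,152 GB
total_mxfp4_tflops = CARDS_PER_CHASSIS * MXFP4_TFLOPS_PER_CARD  # 36,800 TFLOPS
total_gpu_power_w = CARDS_PER_CHASSIS * POWER_PER_CARD_W        # 4,800 W

print(total_hbm_gb, total_mxfp4_tflops, total_gpu_power_w)
```

The 4.8kW GPU budget is well inside what a conventional air-cooled rack can feed, which is the card's whole pitch.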

Tom’s Hardware put the MI350P roughly 40 per cent ahead of Nvidia’s H200 NVL, its closest like-for-like PCIe competitor, on theoretical FP16 and FP8 compute. AMD did not disclose pricing. The headline AMD-versus-Nvidia comparison uses theoretical peak figures, not benchmarked workload throughput.

Dell said the card will land in two PowerEdge models, the XE7745 and the R7725, with availability “due in the summer”, which in Northern Hemisphere terms means roughly the September quarter. HPE confirmed support for the ProLiant DL380a Gen12 in the same window, according to StorageReview, the storage trade publication that broke the OEM specs.

The Australian angle runs through the OEM channel. ABS Business Characteristics Survey 2024 figures put AI use among large businesses at 60 per cent, but most of those workloads still sit inside corporate data centres, not in a hyperscaler tenancy. Liquid cooling is rare in those rooms, and full-rack overhauls do not get capex sign-off without a multi-year plan.

Eight MI350P cards in one air-cooled chassis can host trillion-parameter models, AMD said, without the 200kW power feed or floor reinforcement an OAM rack needs. That puts on-prem inference within reach for banks, insurers and Commonwealth departments that already buy PowerEdge and ProLiant kit and have data-residency clauses that rule out a public-cloud GPU.
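The trillion-parameter claim roughly checks out on weights alone, at least at 4-bit precision. A sketch of the arithmetic (it ignores KV cache and activation overheads, so treat it as a lower bound, not a sizing guide):

```python
# Rough memory-fit check for AMD's trillion-parameter claim.
# At MXFP4, each weight is 4 bits = 0.5 bytes. KV cache and
# activation memory are ignored here, so this is a lower bound.
PARAMS = 1_000_000_000_000    # one trillion parameters
BYTES_PER_PARAM_MXFP4 = 0.5   # 4-bit weights
CHASSIS_HBM_GB = 8 * 144      # eight MI350P cards at 144GB each

weights_gb = PARAMS * BYTES_PER_PARAM_MXFP4 / 1e9  # ~500 GB of weights
fits = weights_gb < CHASSIS_HBM_GB

print(weights_gb, CHASSIS_HBM_GB, fits)  # 500.0 1152 True
```

Around 500GB of weights against 1,152GB of HBM leaves real headroom for serving, though at FP16 the same model would need roughly 2TB and would not fit.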

What the MI350P does not solve is software. Nvidia’s CUDA stack is still the default in enterprise AI work, and AMD’s ROCm has lagged. Dell and HPE will sell the MI350P alongside Nvidia kit, not instead of it. AU channel volume will depend on Dell and HPE pricing, which neither vendor has disclosed, and on whether AMD’s ROCm 7 release later this year covers the inference frameworks Australian enterprises run. The card lands in a 2026 enterprise AI hardware cycle that includes Nvidia’s H200 NVL, Intel’s Gaudi 3 and inference-focused launches from Cerebras, Groq and Tenstorrent.

AI Hardware, AMD, data centres, Enterprise IT
Soren Chau

Enterprise editor covering AWS, Azure, and GCP in the AU region, plus the SaaS shaping local IT. Reports from Sydney.