
AMD goes after Nvidia with AI accelerator and software library

AMD has signed up major hardware companies to support its latest initiative to build out datacentre AI optimisation

AMD has finally come up with its answer to Nvidia’s dominance in the artificial intelligence (AI)-optimised server space. The chipmaker has unveiled a datacentre processor and open source software stack for artificial intelligence, and Microsoft, Meta, Oracle, Dell Technologies, HPE, Lenovo, Supermicro, Arista, Broadcom and Cisco are among the companies that have announced their support for the new AMD hardware and software.

“AI is the future of computing, and AMD is uniquely positioned to power the end-to-end infrastructure that will define this AI era, from massive cloud installations to enterprise clusters and AI-enabled intelligent embedded devices and PCs,” said AMD CEO Lisa Su.

The company’s AI strategy encompasses the AMD Instinct MI300 Series of datacentre AI accelerators and ROCm 6, an open software stack that AMD claims offers “significant optimisations and new features supporting large language models [LLMs]”. It has also launched the Ryzen 8040 Series of processors with Ryzen AI for laptop, desktop and workstation PCs.

Microsoft said it will be offering the Azure ND MI300x v5 Virtual Machine (VM) series optimised for AI workloads that run on AMD Instinct MI300X accelerators.

Oracle has unveiled plans to offer OCI bare metal compute featuring AMD Instinct MI300X accelerators, while Meta plans to add AMD Instinct MI300X accelerators to its datacentres in combination with ROCm 6 to power AI inferencing workloads.

The Dell PowerEdge XE9680 server will use the new AMD hardware, as will Lenovo’s ThinkSystem platform, and HPE has announced plans to bring AMD Instinct MI300 accelerators to its enterprise and high-performance computing (HPC) offerings.

Trish Damkroger, senior vice-president and chief product officer of HPC, AI and Labs at Hewlett Packard Enterprise, said the AMD accelerator will be used in the HPE Cray EX supercomputer to provide what she described as “powerful accelerated compute, with improved energy efficiency, to help our customers tackle compute-intensive modelling, simulation, AI and machine learning workloads, and realise a new level of insights”.

On the software side, OpenAI said it has added support for AMD Instinct accelerators to Triton 3.0. According to AMD, this out-of-the-box support allows developers to work at a higher level of abstraction on AMD hardware.

Bronis R de Supinski, chief technology officer for Livermore Computing at Lawrence Livermore National Laboratory, said the AMD accelerator will improve the Lab’s El Capitan exascale supercomputing system being built by HPE.

“El Capitan will offer incredible programmability, performance and energy efficiency through its use of the AMD Instinct MI300A APUs, virtually eliminating the challenges of repeatedly moving data between CPUs and GPUs, thus helping us achieve our mission,” he said.

Greg Diamos, co-founder and chief technology officer at Lamini, discussed how AMD’s software offers a level of compatibility with Nvidia’s Cuda libraries, which have become the de facto standard for AI developers.

“Lamini and our customers have been running LLMs on a high-performance cluster of AMD Instinct accelerators for over a year,” he said. “We’ve found AMD ROCm has similar software compatibility with Nvidia’s Cuda while supporting large language models.”
