Azure achieves fastest AI training milestone with Nvidia partnership

Microsoft’s Azure cloud platform just posted the fastest AI training results at the largest reported scale, powered by a deepened collaboration with Nvidia. The achievement, announced on March 18, 2025, centers on record-setting performance in the MLPerf Training v4.1 benchmarks, the widely recognized independent standard for measuring machine learning hardware performance.

The configuration behind the results: 512 Nvidia H200 GPUs working in concert, delivering a 28% performance improvement over previous setups built on H100 GPUs.

What the benchmarks actually show

In its 2023 benchmark submission, Azure showed it could train a GPT-3 model with 175 billion parameters on 10,752 H100 GPUs in approximately 4 minutes. The new H200-based configuration builds on that result with meaningfully better per-GPU performance, which reduces the total hardware needed to hit comparable training speeds.
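To make the hardware-reduction claim concrete, here is a rough back-of-the-envelope sketch in Python. It rests on two assumptions that the announcement does not confirm: that the 28% figure cited above is a per-GPU throughput gain, and that training throughput scales linearly with GPU count. The function name `h200_equivalent` is hypothetical and used only for illustration.

```python
def h200_equivalent(h100_count: int, uplift_percent: int = 28) -> int:
    """Estimate how many faster GPUs match the throughput of h100_count GPUs.

    Assumptions (not from the announcement): the uplift applies per GPU,
    and throughput scales linearly with the number of GPUs.
    """
    if h100_count <= 0 or uplift_percent <= -100:
        raise ValueError("need a positive GPU count and an uplift above -100%")
    # Integer ceiling division avoids floating-point rounding surprises.
    numerator = h100_count * 100
    denominator = 100 + uplift_percent
    return -(-numerator // denominator)


if __name__ == "__main__":
    # Illustrative H100 fleet sizes; 10,752 is the 2023 figure from the article.
    for fleet in (512, 1_000, 10_752):
        print(f"{fleet} H100 GPUs ~ {h200_equivalent(fleet)} H200 GPUs")
```

Under those assumptions, a 28% per-GPU gain means roughly 22% fewer GPUs for the same throughput. Real clusters rarely scale perfectly linearly, so this is an upper bound on the savings rather than a measured result.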

The full stack behind these results goes beyond just swapping in newer GPUs. Microsoft cited integrated innovations across hardware, networking, and software. The setup leverages Nvidia Quantum InfiniBand networking, which handles the massive data transfer demands between GPUs during distributed training. It also incorporates Nvidia’s microservices alongside Azure’s own AI services, including its AI Foundry platform.

