Nvidia just dropped a number that should make every data center operator do a double take. The company’s new GB300 NVL72 system can handle 61,400 concurrent AI agents per megawatt of power consumed, compared to just 2,600 on the Hopper-generation H200.

That’s a better than 20x improvement in agent density per unit of power, closer to 24x going by those two figures. For an industry where electricity costs are rapidly becoming the binding constraint on growth, this isn’t a spec sheet flex. It’s a structural shift in the economics of inference.
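The arithmetic behind that comparison is easy to check. The sketch below uses only the two agents-per-megawatt figures quoted above; the one-million-agent fleet size is a hypothetical chosen for illustration and does not come from Nvidia.

```python
# Agents-per-megawatt figures as quoted in the article.
GB300_AGENTS_PER_MW = 61_400
H200_AGENTS_PER_MW = 2_600


def density_ratio(new_per_mw: float, old_per_mw: float) -> float:
    """How many times more agents the newer system serves per megawatt."""
    return new_per_mw / old_per_mw


def megawatts_needed(agents: int, agents_per_mw: float) -> float:
    """Power budget in megawatts needed to host a given number of agents."""
    return agents / agents_per_mw


ratio = density_ratio(GB300_AGENTS_PER_MW, H200_AGENTS_PER_MW)
print(f"Density improvement: {ratio:.1f}x")

# Hypothetical fleet size, for illustration only.
FLEET_AGENTS = 1_000_000
gb300_mw = megawatts_needed(FLEET_AGENTS, GB300_AGENTS_PER_MW)
h200_mw = megawatts_needed(FLEET_AGENTS, H200_AGENTS_PER_MW)
print(f"GB300 NVL72: {gb300_mw:.1f} MW, H200: {h200_mw:.1f} MW")
```

By these numbers, a fleet that would need several hundred megawatts on H200 fits in under 20 MW on the newer system, which is why the per-megawatt framing matters more to operators than raw per-GPU speed.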

What’s inside the rack

The GB300 NVL72 is built on Nvidia’s Blackwell Ultra architecture, packing 72 Blackwell Ultra GPUs and 36 Grace CPUs into a single liquid-cooled rack. The system integrates roughly 20 to 21 TB of HBM3e memory and offers 130 TB/s of NVLink bandwidth, which is the internal data highway that keeps all those GPUs talking to each other without bottlenecking.
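To put the rack-level totals in per-GPU terms, the quoted figures can simply be divided by the GPU count. This is a rough sketch that assumes memory and NVLink bandwidth are spread evenly across all 72 GPUs and that 1 TB equals 1,000 GB; neither assumption is stated in the figures above.

```python
# Rack-level figures as quoted in the article.
GPUS_PER_RACK = 72
HBM_TB_RANGE = (20.0, 21.0)   # "roughly 20 to 21 TB" of HBM3e
NVLINK_TB_PER_S = 130.0       # aggregate NVLink bandwidth

GB_PER_TB = 1000  # assumption: decimal units


def per_gpu(total: float, gpus: int = GPUS_PER_RACK) -> float:
    """Even split of a rack-level total across all GPUs (an assumption)."""
    return total / gpus


hbm_gb_low = per_gpu(HBM_TB_RANGE[0]) * GB_PER_TB
hbm_gb_high = per_gpu(HBM_TB_RANGE[1]) * GB_PER_TB
nvlink_tb_s_per_gpu = per_gpu(NVLINK_TB_PER_S)

print(f"HBM3e per GPU: {hbm_gb_low:.0f} to {hbm_gb_high:.0f} GB")
print(f"NVLink per GPU: {nvlink_tb_s_per_gpu:.2f} TB/s")
```

That works out to somewhere between about 278 and 292 GB of HBM3e and about 1.8 TB/s of NVLink bandwidth per GPU.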

Nvidia says the platform delivers up to 50 times the AI factory output of its older Hopper-generation systems. It also claims 10 times the tokens per second per user and five times the throughput per watt.