NVIDIA Blackwell Leads on First Agentic AI Infrastructure Benchmark

AgentPerf from Artificial Analysis, the industry’s first agentic AI benchmark, gives developers, enterprises and infrastructure providers a clear way to compare systems for agentic AI. In the first round of published results, the NVIDIA Blackwell Ultra NVL72 platform delivers leading performance across the agentic AI workloads tested, running 20x more agents per megawatt than NVIDIA Hopper.

Agentic AI is a fundamentally different workload than conversational AI. A single chat completion is a sprint: one large language model (LLM) call, one response. An agent functions more like a relay: It breaks a goal into many steps and keeps going until the task is done.

Agents chain together multiple LLM calls and tool calls to gather context, observe, reason and act.

The result is dozens to hundreds of chained LLM calls, each passing a growing context to the next, with tool calls such as code compilation and execution, database search and web browsing at every handoff. The complexity isn’t additive; it’s multiplicative.
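To make the effect of growing context concrete, here is a minimal back-of-the-envelope sketch. It is not part of AgentPerf or any NVIDIA tooling. The function name and every number in it (2,000 starting tokens, 1,500 tokens appended per step, 50 steps) are hypothetical values chosen only for illustration. It also assumes each call re-reads the full context with no caching.

```python
def total_prompt_tokens(steps, initial_context, tokens_added_per_step):
    """Count prompt tokens read across a chain of LLM calls.

    Assumes (for illustration only) that every call re-reads the whole
    context and that each step appends a fixed number of tokens
    (model output plus tool result) before the next call.
    """
    total = 0
    context = initial_context
    for _ in range(steps):
        total += context                  # this call reads everything so far
        context += tokens_added_per_step  # the handoff grows the context
    return total


# Hypothetical workload sizes, not measured data.
single_call = total_prompt_tokens(steps=1, initial_context=2000, tokens_added_per_step=1500)
agent_run = total_prompt_tokens(steps=50, initial_context=2000, tokens_added_per_step=1500)

print(single_call, agent_run, agent_run / single_call)
```

With these assumed values, the 50-step run reads 1,937,500 prompt tokens versus 2,000 for the single call: 50 times the calls, but roughly 970 times the tokens read. Serving systems that reuse cached context between calls would recompute less than this simple count suggests, and tool call delays are not modeled at all.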

The distinction matters enormously for performance measurement. Existing AI inference benchmarks measure one LLM call: how fast an LLM responds to a single request and how many simultaneous requests a system can handle. They weren’t designed for agentic workloads, where chained LLM calls, tool call delays and growing context stress accelerated computing systems in fundamentally different ways than a single LLM call ever could.
