FOMO Driving GPU Overbuying, 95% of Capacity Idle

Companies are pumping billions into AI infrastructure that’s largely unused, according to a report released Tuesday by Cast AI, a global automation platform for cloud-native and AI workloads.

Based on data from 23,000 Kubernetes clusters, the report found that average GPU utilization across enterprise servers is just 5%. In other words, 95% of provisioned GPU capacity is not being used.

The report noted that a CPU core sitting idle costs cents per hour, while a GPU sitting idle costs dollars. For the first time since EC2 launched in 2006, GPU prices are rising, not falling. In January 2026, AWS raised H200 Capacity Block prices by 15%, citing supply and demand. The increase breaks a two-decade pricing trend.

At these prices, the hoarding instinct makes sense, the report acknowledged. Lead times are long, and releasing capacity that may be impossible to get back feels riskier than overpaying for it. But at 5% utilization, the math doesn’t work, and the hoarding feeds the scarcity loop that drives prices higher.
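To see why the math breaks down at 5% utilization, consider a rough sketch of the arithmetic. The hourly price and fleet size below are hypothetical placeholders, not figures from the Cast AI report; only the 5% utilization rate comes from the report.

```python
def cost_per_useful_hour(hourly_price: float, utilization: float) -> float:
    """Price paid for each GPU-hour that actually does work."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hourly_price / utilization


def idle_spend(hourly_price: float, utilization: float, gpu_hours: float) -> float:
    """Portion of the bill that pays for provisioned but unused capacity."""
    if not 0 <= utilization <= 1:
        raise ValueError("utilization must be in [0, 1]")
    return hourly_price * gpu_hours * (1 - utilization)


if __name__ == "__main__":
    price = 4.00          # hypothetical USD per GPU-hour (assumption)
    utilization = 0.05    # average utilization cited in the report
    gpu_hours = 8 * 730   # hypothetical: 8 GPUs provisioned for about a month

    print(f"Cost per useful GPU-hour: ${cost_per_useful_hour(price, utilization):,.2f}")
    print(f"Monthly spend on idle capacity: ${idle_spend(price, utilization, gpu_hours):,.2f}")
```

Under these assumptions, each productive GPU-hour effectively costs 20 times the list price, because the buyer pays for 20 provisioned hours to get one hour of work.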

“This was shocking to us, and shocking to our customers,” Cast AI President Laurent Gil told TechNewsWorld. “Almost nobody realized they were not using those machines very well.”
