The AI infrastructure earnings calls of the past eight quarters have given the public a precise vocabulary for what the build-out costs in capital. Hyperscaler GPU procurement. Power purchase agreements. Real-estate footprints. What they have not supplied is a vocabulary for the recurring cost of keeping the clusters healthy after the capital is spent. That line item, on close inspection, has become one of the largest hidden cost centers in the entire build-out, and it is growing faster than the capital line above it.

The visible numbers in the AI infrastructure conversation describe the capital story. Hyperscaler GPU procurement is on track to reach multi-trillion-dollar cumulative spend over the current cycle. Power purchase agreements have reached a scale historically associated with heavy industry. Real-estate commitments have followed. The capital narrative has been told in detail across two years of investor updates.

The operational story, the cost of keeping the clusters healthy, is less visible. The work is unglamorous and largely manual. GPU node failures have to be detected, triaged, and remediated. Pods have to be rescheduled around degraded hardware. Resource utilization across an accelerator fleet has to be monitored, balanced, and reported on. In current production environments, each of these tasks falls to a class of engineer whose compensation is among the highest in the industry.