For the past several years, the default assumption in enterprise IT has been that AI would follow the same path as many other workloads and settle into the public cloud. That assumption seemed reasonable on the surface. The hyperscalers had the infrastructure, GPU capacity, managed services, and developer ecosystems. If you wanted to move fast, public cloud AI looked like the obvious answer.

That logic is now being challenged by reality. As enterprises move from AI experiments to AI in production, they increasingly find that the public cloud is a convenient place to start but not the most practical place to stay. Enterprises are asking whether they can afford to base their long-term AI strategies on cost models they do not control, risks they cannot fully contain, and architectures that are optimized for provider scale rather than enterprise economics.

This is why private cloud AI is becoming more popular. Enterprises are not moving on-premises because it’s a fashionable choice. They are moving because, in many cases, it is the financially rational choice.

The expense of token-based AI

The market still treats token-based AI pricing as a stable, mature economic model. It is not. Much of what enterprises pay today reflects a highly competitive environment in which providers are still subsidizing adoption, offering aggressive discounts, and prioritizing market share over normalized margins. That may be good news in the short term, but it is dangerous to assume those conditions will persist.