The cost of running a capable AI model fell roughly 280-fold in two years. Over the same stretch, the average company's AI bill went up, not down. Both numbers are real, both come from credible research, and the space between them is the single most useful thing an operator can understand about AI economics in 2026. It explains why "the models keep getting cheaper" and "our AI spend is out of control" are being said in the same meeting, by the same people, about the same systems.

I watch this play out in client projects every month. Someone reads that token prices collapsed, assumes their costs are about to fall off a cliff, and then opens an invoice that did the opposite. The confusion is not a billing error. It is a structural feature of how AI is now built, and once you see the mechanism you can plan around it instead of being surprised by it.

The Number That Should Have Lowered Your Bill

Start with the collapse, because it is genuinely staggering. Stanford's 2026 AI Index finds that the price of GPT-3.5-level performance fell by a factor of about 280 between November 2022 and October 2024, from roughly 20 dollars per million tokens to about 7 cents. That is not a typo and it is not a one-off. Epoch AI measures a median price decline of about 50 times per year at equal capability. The venture firm a16z frames the same trend more conservatively at around 10 times per year, which they point out is still faster than the cost of compute fell in the PC era or the cost of bandwidth fell during the dotcom build-out.
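To make the headline figure easier to sanity-check, here is a minimal sketch of the arithmetic in Python. It uses only the two prices quoted above. The 23-month window (November 2022 to October 2024) and the function names are my own assumptions for illustration, not anything taken from the cited reports.

```python
def fold_decline(start_price: float, end_price: float) -> float:
    """How many times cheaper the end price is than the start price."""
    return start_price / end_price


def annualized_fold(total_fold: float, months: float) -> float:
    """Convert a total fold-decline over `months` into a per-year rate,
    assuming the price fell at a constant geometric rate."""
    return total_fold ** (12.0 / months)


# Prices quoted above, in dollars per million tokens.
START_PRICE = 20.00  # November 2022
END_PRICE = 0.07     # October 2024

# Assumption: November 2022 to October 2024 is counted as 23 months.
MONTHS = 23

total = fold_decline(START_PRICE, END_PRICE)
per_year = annualized_fold(total, MONTHS)

print(f"Total decline: about {total:.0f} times cheaper")
print(f"Annualized: about {per_year:.0f} times cheaper per year")
```

Under those assumptions the total comes out near 286 times, consistent with the rounded 280 figure, and the annualized rate lands at roughly 19 times per year. That sits between the a16z and Epoch AI estimates, which is what you would expect when one specific capability tier is compared against broader medians.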