AI Shrinkflation: Your AI Model Was Quietly Dialed Back

The age of token subsidization is over. AI providers are adjusting pricing, throttling capacity, and dialing back model quality — and most users haven't noticed yet.

You open your AI coding assistant on a Tuesday morning. Something feels off. The reasoning feels shallower. The context-building you relied on is gone. The assistant no longer stops to read a file before editing it. You aren't imagining it.

A senior director at AMD's AI group ran the numbers: 6,852 Claude Code session files, 17,871 thinking blocks, 234,760 tool calls. Her analysis found that reasoning depth had dropped roughly 67% following a February 2026 update. [1] The model had shifted from "read first, then edit" to "edit without reading context." Code quality suffered accordingly.
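For readers who want to run a similar check on their own session logs, here is a minimal sketch of two measurements in that spirit: how often an edit was preceded by a read of the same file, and the relative drop in an average between two periods. It is not the method used in the analysis above. The `ToolCall` record, its `"read"` and `"edit"` labels, and both function names are hypothetical, and real session files would need their own parsing step.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ToolCall:
    tool: str  # hypothetical labels: "read" or "edit"
    path: str  # file the call touched


def read_before_edit_rate(calls):
    """Fraction of edit calls whose file was read earlier in the same session."""
    seen = set()
    edits = 0
    informed = 0
    for call in calls:
        if call.tool == "read":
            seen.add(call.path)
        elif call.tool == "edit":
            edits += 1
            if call.path in seen:
                informed += 1
    # A session with no edits has no rate to report; 0.0 is an arbitrary choice.
    return informed / edits if edits else 0.0


def percent_drop(before, after):
    """Relative drop from the mean of `before` to the mean of `after`, in percent."""
    base = mean(before)
    if base == 0:
        raise ValueError("baseline mean is zero; relative drop is undefined")
    return 100.0 * (base - mean(after)) / base


# Illustrative, made-up session: one edit follows a read, one does not.
session = [
    ToolCall("read", "app.py"),
    ToolCall("edit", "app.py"),
    ToolCall("edit", "utils.py"),
]
rate = read_before_edit_rate(session)

# Illustrative, made-up thinking-block lengths before and after a change.
drop = percent_drop([300, 300], [100, 100])
```

Comparing the same two numbers across sessions before and after a suspected change gives a rough, reproducible signal, though what counts as "reasoning depth" is a choice the person measuring has to define.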

And when a developer discovered why, the story got more interesting: Anthropic had quietly injected a reasoning_effort parameter set to 25 out of 100 into consumer-facing Claude.ai sessions — visible only via extended thinking introspection. [2] The same model, a fraction of the effort. Same price. No announcement.
