AI Shrinkflation: Your AI Model Was Quietly Dialed Back
The age of token subsidization is over. AI providers are adjusting pricing, throttling capacity, and dialing back model quality — and most users haven't noticed yet.
You open your AI coding assistant on a Tuesday morning, and something feels off. The reasoning is shallower. The context-building you relied on is gone: the assistant no longer stops to read a file before editing it. You aren't imagining it.
A senior director at AMD's AI group ran the numbers: 6,852 Claude Code session files, 17,871 thinking blocks, 234,760 tool calls. Her analysis found that reasoning depth had dropped roughly 67% following a February 2026 update. [1] The model had shifted from "read first, then edit" to "edit without reading context." Code quality suffered accordingly.
And when a developer discovered why, the story got more interesting: Anthropic had quietly injected a reasoning_effort parameter set to 25 out of 100 into consumer-facing Claude.ai sessions — visible only via extended thinking introspection. [2] The same model, a fraction of the effort. Same price. No announcement.
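For readers who want to sanity-check a figure like the 67% drop against their own logs, here is a minimal Python sketch of the arithmetic. It is not the method used in the analysis cited above. It assumes, purely for illustration, that reasoning depth can be approximated by the character count of each thinking block and that you have already extracted (date, character count) pairs from your session files. The function name, the record shape, the cutoff date, and the sample numbers are all hypothetical.

```python
from datetime import date
from statistics import median


def relative_drop(records, cutoff):
    """Return the fractional drop in median thinking length after `cutoff`.

    `records` is an iterable of (day, thinking_chars) pairs. Using character
    count as a stand-in for "reasoning depth" is an assumption of this sketch,
    not something the cited analysis is known to have done.
    """
    before = [chars for day, chars in records if day < cutoff]
    after = [chars for day, chars in records if day >= cutoff]
    if not before or not after:
        raise ValueError("need at least one record on each side of the cutoff")

    baseline = median(before)
    if baseline == 0:
        raise ValueError("baseline median is zero; relative drop is undefined")

    # (old - new) / old: 0.0 means no change, 1.0 means thinking vanished.
    return (baseline - median(after)) / baseline


# Made-up sample data, chosen only to show the arithmetic.
sample = [
    (date(2026, 1, 10), 3000),
    (date(2026, 1, 20), 3200),
    (date(2026, 1, 30), 2800),
    (date(2026, 2, 15), 1000),
    (date(2026, 2, 20), 900),
    (date(2026, 2, 25), 1100),
]

# Hypothetical cutoff; substitute the date of whatever update you suspect.
drop = relative_drop(sample, cutoff=date(2026, 2, 1))
print(f"Median thinking length fell by {drop:.0%}")
```

The median is used rather than the mean so that a handful of unusually long sessions does not mask or exaggerate the shift. A real comparison would also need to control for changes in the mix of tasks before and after the cutoff.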