In short: the context tax is what you pay when every agent step re-sends the whole session transcript as input again, so step N re-bills turns 1..N and total cost grows with n(n+1)/2. Cheaper tokens lower the unit, not the shape. context_tax.py meters the re-bill multiplier offline; one debugging session measured 42.8x.

AI disclosure: I drafted this with an AI writing assistant. The tool, the fixtures, and every number below come from a real local run of the script in this post on tiktoken o200k_base. I reviewed and edited it before publishing.

Token prices have been sliding all year. Your agent bill probably hasn't.

I kept running into the same confusion in my own FinOps notes: per-token rates drop, and the monthly number goes the other way. The usual answers ("you're using a bigger model," "you have more users") didn't explain a single session getting more expensive as it ran. So I wrote a 40-line meter to look at the one thing nobody charts: the session transcript itself. On a synthetic-but-realistic debugging session, the last step billed 42.8x the input of the first step. Same model. Same task. No new users.

That gap has a boring cause and an annoying consequence. Here's both, plus the script.