I connected an MCP server last month and watched my token bill jump 37% on the first call. The actual work? A single git status. The schema for that one server consumed 42,000 tokens before the model typed a single character.
That's not a typo. Forty-two thousand.
If you ship AI agents in 2026 and you're not measuring MCP overhead, you're leaving real money on the table. Here's what I found when I actually instrumented the tax — and four patterns that brought my bill back under control.
What the "MCP tax" actually is
MCP (Model Context Protocol) defines a JSON-RPC handshake where every connected server pushes its full tool schema into the model's context window. The model needs those definitions to know what tools exist and how to call them. The protocol is clean. The economics are not.






