Here's the thing almost nobody internalizes about large language models: Claude never reads your words. It reads tokens, and to the model a token is just a number. Your prompt is chopped into pieces, each piece is mapped to an integer, and the model only ever sees those integers. Every limit you hit, every bill you pay, and half the weird behavior you've seen trace back to this one fact.

This article explains what a token actually is, why the model works in tokens instead of words, and how that single design choice explains your AI bill.

The one-sentence version: text is split into tokens (chunks that average roughly ¾ of a word in English), each token maps to a number, and you pay per token for both input and output, so understanding tokens is understanding cost.
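To make that arithmetic concrete, here is a minimal back-of-the-envelope sketch. It is not a real tokenizer: it only applies the ¾-word rule of thumb to a whitespace word count. The function names and the two prices are hypothetical placeholders, and the assumption that input and output are billed at separate per-million-token rates should be checked against your provider's pricing page.

```python
import math

# Rule of thumb from the text: one token is roughly 3/4 of an English word.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count (not a real tokenizer)."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_mtok: float,
                  output_price_per_mtok: float) -> float:
    """Dollar cost, assuming input and output tokens are billed at separate rates."""
    return (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000


# Hypothetical prices in dollars per million tokens, for illustration only.
EXAMPLE_INPUT_PRICE = 3.00
EXAMPLE_OUTPUT_PRICE = 15.00

prompt = "Summarize the attached report in three bullet points."
prompt_tokens = estimate_tokens(prompt)

# Assume the model answers with about 300 tokens.
cost = estimate_cost(prompt_tokens, 300, EXAMPLE_INPUT_PRICE, EXAMPLE_OUTPUT_PRICE)
print(f"~{prompt_tokens} input tokens, estimated cost ${cost:.6f}")
```

An estimate like this is only good for order-of-magnitude budgeting. For real numbers, count tokens with the tokenizer or token-counting endpoint your provider offers, since the actual count depends on the model's vocabulary and on the language of the text.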

What a token actually is