Author(s): GSO1
Originally published on Towards AI.
Figure by the author with assistance from Claude (Anthropic)
Calling a large language model (LLM) like ChatGPT “autocomplete” is not exactly wrong, but it is deeply misleading. Most of us think of autocomplete as a text-completion tool: a phone keyboard guessing the next word, a search bar finishing a phrase. But that picture is not powerful enough to explain what LLMs actually produce — explanations, analogies, plans, summaries, arguments, code, stories, dialogue.
A transformer-based LLM does predict one token at a time — roughly a word, though in practice often a word-fragment — but that visible sequence is only the surface trace of a much richer hidden process. Before each word appears, the model has built a high-dimensional internal state that reflects the topic, the context, the tone, the intent, and the likely directions the answer could take. The next token is not read off the prompt. It is read off this internal state. That is why a system trained only to predict the next token produces explanations, arguments, analogies, plans, and dialogue that feel far more than anything we would call autocomplete.










