Most of us use LLMs every day now, but ask the average developer what actually happens between hitting enter and getting a response, and the answer is usually some mix of "it's a neural network" and a shrug. That's fine; you don't need to know how a database B-tree works to write a query. But a working mental model of LLMs makes you dramatically better at using them: you stop being surprised when they hallucinate, you write better prompts, and you understand why things like context windows and RAG exist.
So here's the whole thing, explained from the ground up. No equations.
The one-sentence version
An LLM is a function that takes some text and predicts the next chunk of text. That's it. Everything else (answering questions, writing code, "reasoning") is an emergent side effect of doing that one thing extremely well, over and over, one chunk at a time.
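To make "predict the next chunk, then repeat" concrete, here is a deliberately tiny sketch. It is not how a real LLM works inside: it assumes a toy model that just counts which word tends to follow which, where a real model uses a neural network over a long context. The names `train_bigram_model`, `predict_next`, and `generate` are made up for this illustration. The outer loop, though, has the same shape: predict one chunk, append it, and feed the longer text back in.

```python
from collections import Counter, defaultdict
from typing import Optional


def train_bigram_model(text: str) -> dict:
    """Count which word follows which. A toy stand-in for a trained LLM."""
    words = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(model: dict, context: list) -> Optional[str]:
    """Return the most likely next word.

    Toy simplification: only the last word of the context is used.
    """
    candidates = model.get(context[-1])
    if not candidates:
        return None  # the model has never seen this word followed by anything
    return candidates.most_common(1)[0][0]


def generate(model: dict, prompt: str, max_new_tokens: int = 5) -> str:
    """Repeatedly predict the next word and append it to the text."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        nxt = predict_next(model, tokens)
        if nxt is None:
            break
        tokens.append(nxt)  # the prediction becomes part of the next input
    return " ".join(tokens)


corpus = "the cat sat on the mat . the cat sat on the sofa ."
model = train_bigram_model(corpus)
print(generate(model, "the cat", max_new_tokens=4))  # the cat sat on the cat
```

Notice that the output is fluent-looking but slightly off: the toy model has no idea what it is saying, it only knows what usually comes next. Scaled up enormously, that same property is behind both the impressive answers and the confident mistakes.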
Let's unpack how that actually produces something that feels intelligent.







