Think of the smartest autocomplete tool you’ve ever used. Now ask it to predict what happens when you push a glass off a table. It can describe the shattering in poetic detail. It has absolutely no idea why the glass falls.

That’s essentially the argument Demis Hassabis, CEO of Google DeepMind, laid out in a January 2026 interview on CNBC’s “The Tech Download.” Large language models, for all their impressive capabilities, fundamentally lack an understanding of physics, causality, and spatial dynamics, as well as the ability to plan over long time horizons. Language, Hassabis contends, describes the world but does not fully contain it.

The fix, according to Hassabis, is something called “world models”: AI systems designed to simulate and predict real-world dynamics rather than just process and generate text. It’s a distinction that sounds academic until you realize it’s reshaping DeepMind’s entire approach to building artificial general intelligence.

Language is a map, not the territory

Here’s the thing about LLMs like Google’s Gemini. They can process text, images, audio, and video. They can pass bar exams and write functional code. But ask them to reason about what happens when two objects collide, or to plan a sequence of physical actions over a long time horizon, and the cracks start showing.