Research Journey

Why LLMs are stuck — the case for world models

June 5, 2026

Source: The Blueprint of Reality — MM6451 study material

Hans Moravec observed in the 1980s that tasks easy for humans (walking, catching a ball, recognising objects) were hard for AI, while tasks hard for humans (chess, arithmetic, medical diagnosis) were relatively easy for AI. This observation, now known as Moravec’s paradox, turns out to reflect a deep structural feature of how current large language models work.

LLMs learn statistical correlations in text. They have seen the word “gravity” in a vast range of contexts, and they can define it precisely. What they can’t do is understand what happens when you push a cup off a table, because that requires a grounded model of physical causation, not a statistical model of text co-occurrence. Scaling up text training data doesn’t fix this. You can’t learn physics from descriptions of physics. You need the physics.

World models (internal simulations that represent the physical properties, causal structure, and temporal dynamics of the world) are the proposed solution. The theoretical basis goes back to Kenneth Craik’s 1943 work: intelligent behaviour depends on maintaining a small-scale model of the world and using it to simulate possible futures before acting. Current LLMs have no such explicit model.
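
To make "simulate possible futures before acting" concrete, here is a minimal toy sketch in Python. Everything in it is an illustrative assumption rather than anything from the source material: the one-dimensional table, the `TABLE_EDGE` value, and the names `CupState`, `world_model`, `score`, and `plan` are all hypothetical. A real world model would be learned from sensory data, not hand-written. The point is only the shape of the loop: predict the outcome of each candidate action, then act on the best prediction.

```python
from dataclasses import dataclass

TABLE_EDGE = 1.0  # hypothetical position of the table edge, in metres


@dataclass(frozen=True)
class CupState:
    x: float        # cup position along the table, in metres
    on_table: bool  # False once the cup has gone past the edge


def world_model(state: CupState, push: float) -> CupState:
    """Predict the next state if the cup is pushed `push` metres.

    A hand-written stand-in for a learned dynamics model.
    """
    if not state.on_table:
        return state  # a fallen cup stays fallen
    new_x = state.x + push
    return CupState(x=new_x, on_table=new_x <= TABLE_EDGE)


def score(state: CupState, target_x: float) -> float:
    """Higher is better: close to the target, and never off the table."""
    if not state.on_table:
        return float("-inf")
    return -abs(state.x - target_x)


def plan(state: CupState, candidate_pushes: list[float], target_x: float) -> float:
    """Simulate every candidate push and return the one with the best predicted outcome."""
    return max(
        candidate_pushes,
        key=lambda push: score(world_model(state, push), target_x),
    )


start = CupState(x=0.5, on_table=True)
best = plan(start, [0.1, 0.3, 0.6], target_x=0.9)
print(best)
```

With these made-up numbers, the largest push is rejected because the model predicts the cup would end up past the edge, so the planner settles on the 0.3 m push. Nothing is actually pushed until the simulation has been consulted, which is the behaviour Craik's account describes.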

The research direction this points toward — combining the language understanding of LLMs with the physical grounding of world models — is one of the most active areas of AI research in 2026. The destination, if achievable, is a qualitatively different kind of AI: one that doesn’t just describe the world, but understands it.