The Return of Recursion: How 5M-Parameter Models Are Outperforming Frontier LLMs on Reasoning in 2026
TL;DR Summary
Tiny recursive models with 5-7 million parameters are achieving state-of-the-art results on deterministic reasoning tasks where frontier LLMs score 0%, including Sudoku-Extreme, ARC-AGI puzzles, and maze navigation
The key innovation: reasoning in latent space instead of generating "thinking tokens" as Chain-of-Thought does, which delivers 100x speedups and a 75% token reduction
Probabilistic TRM (7M params) achieves 98.75% on Sudoku-Extreme using Gaussian noise to escape local optima, while DeepSeek-R1 scores 0.0%
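
To make the latent-space idea concrete, here is a minimal sketch of recursive refinement with optional Gaussian noise. It is an illustration under stated assumptions, not the TRM or Probabilistic TRM implementation: the weights are random and untrained, the names `make_params` and `recursive_reason` are hypothetical, and the point at which noise is injected is a guess made for the example.

```python
import numpy as np

def make_params(dim, rng):
    """Random, untrained weights for one small shared block (hypothetical)."""
    scale = 1.0 / np.sqrt(dim)
    return {
        "W_x": rng.normal(0.0, scale, (dim, dim)),    # mixes in the input embedding
        "W_z": rng.normal(0.0, scale, (dim, dim)),    # mixes in the current latent state
        "W_out": rng.normal(0.0, scale, (dim, dim)),  # decodes the final latent state
    }

def recursive_reason(x, params, steps=8, noise_std=0.0, rng=None):
    """Refine a latent state z by reusing the same block `steps` times.

    No intermediate tokens are produced; only the final z is decoded.
    """
    z = np.zeros_like(x)
    for _ in range(steps):
        if noise_std > 0.0:
            # Assumed placement: perturb z with Gaussian noise before each update.
            z = z + rng.normal(0.0, noise_std, z.shape)
        z = np.tanh(x @ params["W_x"] + z @ params["W_z"])
    return z @ params["W_out"]

rng = np.random.default_rng(0)
params = make_params(16, rng)
x = rng.normal(size=16)

y_det = recursive_reason(x, params)  # deterministic recursion
y_noisy = recursive_reason(x, params, noise_std=0.1,
                           rng=np.random.default_rng(1))  # noisy recursion
```

The same weights are applied at every step, so extra "thinking" costs more iterations rather than more parameters or more generated tokens. With `noise_std=0.0` the loop is fully deterministic; a nonzero value yields a different refinement trajectory for each seed.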
