Here’s a fundamental problem with building AI that can improve itself: the thing grading the homework never gets any smarter. A static evaluator eventually becomes the bottleneck. A new research framework from the University of Cambridge and Nvidia aims to fix that by letting both the AI agent and its evaluator evolve in tandem.
The preprint, titled “The Red Queen Gödel Machine: Co-Evolving Agents and Their Evaluators,” was submitted on June 24 by a team of 13 authors spanning Cambridge, Nvidia, Flower Labs, MBZUAI, and Inria. The core idea is deceptively simple: if AI agents keep getting better while their evaluators stay frozen in place, progress stalls. So make them evolve together.
How RQGM actually works
The framework, abbreviated RQGM, introduces what the researchers call “epoch-based controlled utility evolution.” The system runs in discrete rounds, or epochs, in which both the AI doing the work and the AI judging the work get upgraded.
The framework builds directly on Jürgen Schmidhuber’s 2003 Gödel Machine concept, which proposed AI systems that rewrite their own code only after formally proving that the change is an improvement. That original idea was elegant on paper but largely impractical in the real world. RQGM swaps formal proofs for something more organic: Darwinian mutation and iterative co-evolution.
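To make the loop concrete, here is a toy sketch of epoch-based co-evolution. It is not the paper's algorithm. Everything in it is an illustrative assumption: the task (fitting x squared), the agent as three polynomial coefficients, the evaluator as a list of test inputs, and all function names. In this sketch the evaluator stays frozen while the agent mutates within an epoch, then gains one harder test at the epoch boundary.

```python
import random


def target(x):
    # Hidden ground truth for the toy task (hypothetical, not from the paper).
    return x * x


def predict(agent, x):
    # The "agent" is just the coefficients (a, b, c) of a*x^2 + b*x + c.
    a, b, c = agent
    return a * x * x + b * x + c


def score(agent, evaluator):
    # The "evaluator" is a list of test inputs. Higher is better, 0 is perfect.
    errors = [(predict(agent, x) - target(x)) ** 2 for x in evaluator]
    return -sum(errors) / len(evaluator)


def mutate_agent(agent, rng, step=0.1):
    # Darwinian-style random mutation: jitter every coefficient a little.
    return tuple(w + rng.gauss(0.0, step) for w in agent)


def evolve_evaluator(agent, evaluator, rng, candidates=20):
    # Propose random test inputs and keep the one the current agent gets most wrong.
    pool = [rng.uniform(-3.0, 3.0) for _ in range(candidates)]
    hardest = max(pool, key=lambda x: (predict(agent, x) - target(x)) ** 2)
    return evaluator + [hardest]


def co_evolve(epochs=5, steps_per_epoch=200, seed=0):
    rng = random.Random(seed)
    agent = (0.0, 0.0, 0.0)
    evaluator = [0.0]  # a deliberately weak starting evaluator
    history = []
    for epoch in range(epochs):
        # Within an epoch the evaluator is frozen and only the agent changes.
        for _ in range(steps_per_epoch):
            child = mutate_agent(agent, rng)
            if score(child, evaluator) > score(agent, evaluator):
                agent = child
        history.append((epoch, len(evaluator), score(agent, evaluator)))
        # At the epoch boundary the evaluator is upgraded with a harder test.
        evaluator = evolve_evaluator(agent, evaluator, rng)
    return agent, evaluator, history


if __name__ == "__main__":
    agent, evaluator, history = co_evolve()
    for epoch, n_tests, s in history:
        print(f"epoch {epoch}: {n_tests} tests, score {s:.4f}")
```

In epoch 0 the all-zero agent already scores perfectly on the single test at x = 0, so no mutation is ever accepted. This is the frozen-evaluator bottleneck in miniature: the agent only starts improving once the evaluator adds a test it fails.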







