Stanford researchers have built a framework that lets multiple AI agents work together without anyone calling the shots. The framework, called Decentralized Language Models (DeLM), was detailed in an arXiv paper published June 9 by Yuzhen Mao and Azalia Mirhoseini. It achieved a 10.5-percentage-point improvement on a widely used coding benchmark while cutting costs by roughly 50%, to about $0.12 per task.

How DeLM actually works

Most multi-agent AI systems today rely on a central orchestrator. One main model receives a task, breaks it into pieces, assigns those pieces to worker agents, collects results, and synthesizes a final answer.

DeLM flips this architecture. Instead of a boss handing out assignments, agents asynchronously claim subtasks from a shared task queue. They also have access to a shared verified context, essentially a communal knowledge base that any agent can read from and contribute to.
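To make the claim-from-a-queue idea concrete, here is a minimal Python sketch built on the standard library's asyncio. It is not DeLM's code: the names `agent`, `run`, and `solve` are hypothetical, the model call is replaced by a placeholder, and the shared context is reduced to a plain dictionary.

```python
import asyncio


async def agent(name, queue, shared_context):
    """Hypothetical worker: keeps claiming subtasks until none are left."""
    while True:
        try:
            # Claim the next subtask directly; no coordinator assigns it.
            subtask = queue.get_nowait()
        except asyncio.QueueEmpty:
            return
        # Placeholder for a model call; yields control so other agents can run.
        await asyncio.sleep(0)
        # Write the result where every other agent can read it.
        shared_context[subtask] = f"{name} finished {subtask}"
        queue.task_done()


async def run(subtasks, n_agents=3):
    queue = asyncio.Queue()
    for subtask in subtasks:
        queue.put_nowait(subtask)
    shared_context = {}
    # All agents start at once and pull work at their own pace.
    await asyncio.gather(
        *(agent(f"agent-{i}", queue, shared_context) for i in range(n_agents))
    )
    return shared_context


def solve(subtasks, n_agents=3):
    return asyncio.run(run(subtasks, n_agents))


if __name__ == "__main__":
    for subtask, note in solve(["parse", "implement", "test"]).items():
        print(subtask, "->", note)
```

Which agent ends up with which subtask is decided at run time by whoever reaches the queue first, which is the property that distinguishes this pattern from an orchestrator handing out assignments.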

The system introduces several technical components to make this work reliably. Compression mechanisms keep the shared context from ballooning out of control. Verification gates ensure that information added to the shared pool meets quality thresholds before other agents can use it. And the asynchronous task queues mean agents don’t sit idle waiting for a central controller to tell them what to do next.
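A rough Python sketch of how a verification gate and a compression step might sit in front of a shared context follows. Everything in it is an assumption made for illustration: the `SharedContext` class, the `verify` callback, and the `max_entries` limit are hypothetical, and the "compression" simply merges the oldest entries into one line, where a real system would presumably do something more sophisticated.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SharedContext:
    """Hypothetical shared pool guarded by a verification gate."""

    verify: Callable[[str], bool]   # assumed quality check, supplied by the caller
    max_entries: int = 3            # assumed size limit that triggers compression
    entries: list[str] = field(default_factory=list)

    def contribute(self, note: str) -> bool:
        # Gate: a note that fails verification never reaches other agents.
        if not self.verify(note):
            return False
        self.entries.append(note)
        self._compress()
        return True

    def _compress(self) -> None:
        # Stand-in compression: merge the oldest entries into a single line
        # so the pool never holds more than max_entries items.
        if len(self.entries) > self.max_entries:
            overflow = len(self.entries) - self.max_entries + 1
            merged = " | ".join(self.entries[:overflow])
            self.entries = [merged] + self.entries[overflow:]


if __name__ == "__main__":
    ctx = SharedContext(verify=lambda note: note.startswith("tested:"))
    ctx.contribute("guess: the bug is in the parser")   # rejected by the gate
    for i in range(5):
        ctx.contribute(f"tested: fact {i}")
    print(ctx.entries)
```

The gate runs before the write rather than after it, so an agent reading the pool never has to ask whether an entry has been checked.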