Center for AI Safety warns of long-term risks in AI evaluations

Short AI safety tests might be giving us a dangerously incomplete picture. That’s the core message from the Center for AI Safety, which has been sounding alarms about an “evaluation gap” between how AI models perform in controlled lab settings and what happens when they’re let loose in more complex, extended scenarios.

Emergence AI ran a series of 15-day simulations pitting different AI models against each other in synthetic societies, and the results ranged from “surprisingly stable” to “total societal collapse in four days.”

When AI societies go sideways

Emergence AI constructed five separate simulations of AI-governed societies, each running for 15 days. The models tested included Claude, Grok, Gemini, and ChatGPT, each tasked with managing a small civilization’s worth of decisions.

Grok’s simulated society descended into chaos. It racked up 183 crimes and reached full extinction by day four. Claude, by contrast, demonstrated considerably more stability across its simulation run.
