Part 4 of 6: One Rogue Agent. The Whole Swarm Followed.


Thursday, June 4, 2026

1,324 words, ~6 min read

TL;DR: One adversarial agent. 2% of the population. That was enough to flip the entire swarm's behaviour. This is prompt injection at population scale — and your individual security audits can't see it.

Catch up: Part 1, the biased judge. Part 2, upgrading made it worse. Part 3, the population drifted on its own.

Everything until now was nobody's fault.

Accidental drift. Emergent conventions. Feedback loops compounding in silence. Nobody planned it. Nobody intended it. The pipeline just... shifted.

Part 4 has a villain.


