Red-teaming a network of agents: Understanding what breaks when AI agents interact at scale

At a glance

Some risks appear only when agents interact, not when they are tested alone. Actions that seem harmless in isolation can cascade, setting off a chain reaction across an agent network.

In our tests, a single malicious message passed from agent to agent, extracting private data at each step and pulling uninvolved agents into the chain.

We saw early signs that some agent networks become more resistant to these attacks, but defenses remain an open challenge and work on them is ongoing.

Agents belonging to different users and organizations are beginning to interact with each other. These networks of agents are emerging as advances in large language models (LLMs) and silicon lower the barriers to building agents, while tools like Claude, Copilot, and ChatGPT, along with existing platforms such as email and GitHub, bring them into constant contact. As a result, agents no longer work in isolation; they are becoming participants in a shared, interconnected environment.
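To make the cascade idea concrete, here is a minimal toy sketch in Python. It is not the setup used in the tests described above: the agent names, the contact graph, the `secrets` mapping, and the `resistant` set are all hypothetical. It assumes a compromised agent leaks whatever it holds and forwards the message to every contact, while a resistant agent does neither.

```python
from collections import deque


def simulate_cascade(contacts, secrets, start, resistant=frozenset()):
    """Toy model: a malicious message hops between agents, breadth-first.

    contacts:  agent -> list of agents it can message (hypothetical topology)
    secrets:   agent -> private data that agent holds (hypothetical)
    resistant: agents that ignore the message (no leak, no forwarding)
    Returns (agent, leaked_item) pairs in the order agents were reached.
    """
    leaked = []
    seen = {start}
    queue = deque([start])
    while queue:
        agent = queue.popleft()
        if agent in resistant:
            continue  # the chain stops at this agent
        leaked.append((agent, secrets[agent]))
        for peer in contacts.get(agent, []):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return leaked


# Hypothetical four-agent network; only "a" receives the original message.
contacts = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
secrets = {"a": "calendar", "b": "inbox", "c": "api-key", "d": "contacts"}

all_leaks = simulate_cascade(contacts, secrets, "a")
with_defense = simulate_cascade(contacts, secrets, "a", resistant={"b", "c"})
```

In this toy network, one message delivered to agent "a" reaches all four agents, including "d", which "a" never contacts directly. Marking "b" and "c" as resistant confines the leak to "a". Real agents are far less predictable than this deterministic model, so it illustrates only the shape of the problem.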

