Google Deepmind treats its own AI agents like rogue employees with office keys

Google Deepmind treats its own AI agents as potential insider threats. The company's new "AI Control Roadmap" ties security measures to measurable AI capabilities, and an analysis of one million coding tasks shows most problems stem from overzealous agents, not malicious intent. Deepmind warns the window for global security standards is closing fast.

Thursday, June 18, 2026

Rather than assuming alignment always works, the company built a safety framework that plans for the worst case: AI agents that go off script.

Deepmind's AI Control Roadmap details how Google locks down its most advanced internal AI systems. The company thinks it could work as a blueprint for the rest of the industry, too.

The framework assumes that a highly capable AI agent might not share its operators' goals and plans accordingly. Deepmind compares it to a driving instructor with dual controls: The instructor trusts the student but keeps a hand near the wheel and a foot near the brakes. Same idea here. AI agents only get permissions based on verified behavior, and trust builds gradually through controlled access.
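To make the idea of gradually earned permissions concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Deepmind's implementation: the tier names, the promotion threshold, and the `AgentTrust` class are all hypothetical and do not come from the roadmap.

```python
from dataclasses import dataclass

# Hypothetical permission tiers, ordered from least to most access.
TIERS = ["read_only", "sandboxed_write", "repo_write"]

# Hypothetical threshold: verified clean tasks needed to move up one tier.
PROMOTION_THRESHOLD = 3


@dataclass
class AgentTrust:
    tier: int = 0          # index into TIERS
    clean_streak: int = 0  # consecutive verified tasks without a violation

    def record_task(self, verified_ok: bool) -> None:
        """Update trust after one task has been checked."""
        if not verified_ok:
            # Any violation drops the agent back to the lowest tier.
            self.tier = 0
            self.clean_streak = 0
            return
        self.clean_streak += 1
        at_top = self.tier >= len(TIERS) - 1
        if self.clean_streak >= PROMOTION_THRESHOLD and not at_top:
            self.tier += 1
            self.clean_streak = 0

    def allowed(self, action: str) -> bool:
        """An action is allowed only if its tier is at or below the agent's."""
        return action in TIERS[: self.tier + 1]
```

In this sketch, a new agent starts with read-only access, moves up one tier after three verified clean tasks, and falls back to the lowest tier after a single failed check. That asymmetry mirrors the instructor who keeps a foot near the brake.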

Deepmind models its AI agents as insider threats

Deepmind treats its internal AI agents like employees who already have office access but might work against the company's interests. The roadmap builds on MITRE ATT&CK, a well-established cybersecurity framework that breaks potential attacks down into individual tactics and techniques. That lets Deepmind track risks systematically, spot suspicious behavior early, and test defenses in controlled exercises.

