How to Run Untrusted AI Agent Code Without Docker

Docker shares the host kernel. That was always the trade. It was fine when a human read the script before it ran. It stopped being fine the second an LLM started writing code at runtime off a prompt nobody pre-screened. So here's the practitioner version: what to actually run when your agent executes code you've never seen.

The thing that broke

The review step is gone. A model writes a script, the script lives for milliseconds, then it executes. Could be a clean chart. Could be a curl-pipe-shell because a prompt injection rewired intent four hops upstream. You don't get to read it first.

And the container under it shares one kernel with every other workload on the box. CVE-2024-1086, a netfilter use-after-free, owns every container on the host once it pops. CISA confirmed active ransomware exploitation in late 2025, years after the patch. November 2025 dropped three more under runC (CVE-2025-31133, CVE-2025-52565, CVE-2025-52881), all bypassing maskedPaths through symlink races to write procfs gadgets. Own core_pattern and the kernel runs your binary on the next coredump, as root.

In March 2026, Oxford and the UK AISI shipped SandboxEscapeBench. Frontier models reliably escaped privileged containers, writable host mounts, and exposed Docker daemons on their own. Cost per attempt: roughly a dollar. The model does the recon, picks the CVE, hands back the shell. So the fix isn't a better Docker config. It's a different boundary.

The thing that broke

How to Run Untrusted AI Agent Code Without Docker

How to Run Untrusted AI Agent Code Without Docker

Other newsrooms on this story

Related reading

Run AI Coding Agents Safely with Docker Sandboxes

Letting an AI agent run shell commands is RCE on your machine. I fixed it with…

The Untrusted Autonomous Workload and AI Sandboxes | Docker

Coding Agent Horror Stories: The rm -rf ~/ Incident | Docker

Docker AI Governance: Unlock Agent Autonomy, Safely | Docker

AI Coding Agent Horror Stories: The 13-Hour AWS Outage | Docker

Other newsrooms on this story

Related reading

Run AI Coding Agents Safely with Docker Sandboxes

Letting an AI agent run shell commands is RCE on your machine. I fixed it with…

The Untrusted Autonomous Workload and AI Sandboxes | Docker

Coding Agent Horror Stories: The rm -rf ~/ Incident | Docker

Docker AI Governance: Unlock Agent Autonomy, Safely | Docker

AI Coding Agent Horror Stories: The 13-Hour AWS Outage | Docker