One of our AI agents deleted a directory it was never supposed to touch. The Python it wrote was valid. The model was confident. It did the wrong thing.

The agent was only supposed to query a database. But we gave it a full Python runtime, so it had access to os, shutil, everything. That's when we realized the problem wasn't the model — it was us handing it way too much power.

Why sandboxing is harder than it looks

The usual options aren't great:

Full runtime (Python/Node.js): easy to set up, hard to lock down properly. Restricting it after the fact is whack-a-mole.