One of our AI agents deleted a directory it was never supposed to touch. The Python it wrote was valid. The model was confident. It did the wrong thing.
The agent was only supposed to query a database. But we gave it a full Python runtime, so it had access to os, shutil, everything. That's when we realized the problem wasn't the model — it was us handing it way too much power.
Why sandboxing is harder than it looks
The usual options aren't great:
Full runtime (Python/Node.js): easy to set up, hard to lock down properly. Restricting it after the fact is whack-a-mole.









