Your favorite AI assistant might be smart, but researchers now argue it should be treated with the same suspicion your computer applies to a random downloaded program. A May 2026 paper posted to arXiv makes the case that AI agents, especially those handling financial transactions, need to be architected as fundamentally untrusted components within larger systems.

The paper, titled “Agent Security is a Systems Problem” (arXiv:2605.18991), arrives at a moment when the crypto industry is betting heavily on autonomous AI agents to manage everything from DeFi trades to wallet operations. Circle CEO Jeremy Allaire has projected that billions of AI agents will independently conduct economic activities using stablecoins within the next three to five years.

The operating system analogy

Modern operating systems don’t trust individual processes. Applications run with limited permissions, can reach only the files and resources they have been explicitly granted access to, and are blocked or terminated when they try to go beyond those boundaries. The researchers want the same philosophy applied to AI agents.

The paper advocates three specific measures. The first is enforcing security invariants at the system level, meaning hard rules that the AI itself cannot override. The second is least-privilege sandboxing, in which an agent gets access only to the minimum resources its specific task requires. The third is effective separation of instructions from data, which addresses one of the most dangerous attack vectors in AI systems today.
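To make the three measures concrete, here is a minimal Python sketch of how a wallet service might wrap an untrusted agent. It is an illustration under stated assumptions, not the paper's design: the names `Policy`, `ToolCall`, `enforce` and `build_agent_input`, the tool names, and the transfer limit are all hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal


class PolicyViolation(Exception):
    """Raised when a proposed action breaks a system-level rule."""


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy object, created by the operator and never by the agent."""
    allowed_tools: frozenset        # least privilege: tools granted for this task only
    max_transfer: Decimal           # invariant: hard cap on any single transfer
    allowed_recipients: frozenset   # invariant: funds may only go to these addresses


@dataclass
class ToolCall:
    """An action the agent proposes. It is a request, not a command."""
    tool: str
    args: dict


def enforce(policy: Policy, call: ToolCall) -> ToolCall:
    """Check a proposed call outside the model, before anything is executed."""
    if call.tool not in policy.allowed_tools:
        raise PolicyViolation(f"tool not granted: {call.tool}")
    if call.tool == "transfer":
        amount = Decimal(str(call.args["amount"]))
        if amount <= 0 or amount > policy.max_transfer:
            raise PolicyViolation(f"amount outside limit: {amount}")
        if call.args["to"] not in policy.allowed_recipients:
            raise PolicyViolation(f"recipient not allowed: {call.args['to']}")
    return call


def build_agent_input(instructions: str, untrusted_text: str) -> dict:
    """Keep operator instructions and fetched content in separate fields.

    Assumption: the agent runtime treats only `instructions` as commands
    and `data` as material to read. The sketch does not enforce that part.
    """
    return {"instructions": instructions, "data": untrusted_text}
```

In use, the operator would build a `Policy` such as `Policy(frozenset({"get_balance", "transfer"}), Decimal("100"), frozenset({"0xTREASURY"}))` and pass every call the agent proposes through `enforce` before executing it. If text hidden in a web page persuades the model to request a transfer to an unknown address, the request still fails, because the check runs in ordinary code the model cannot rewrite.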