Your AI Agent Doesn't Need to Be Smarter. It Needs to Be Idempotent

Most of the failures I see in production AI agents aren't reasoning failures. The model picks the right tool, fills in the right arguments, and makes a perfectly sensible decision. Then the agent charges the customer twice.

The reason is mundane and has nothing to do with intelligence. A write-capable agent (one that can send an email, create a ticket, move money, or update a database) lives inside the same unreliable network as any other distributed system. Requests time out. Connections drop after the server has already committed the write but before the response comes back. An orchestration framework retries a step that looked like it failed but had actually succeeded. And because the agent is a loop that re-plans on every observation, a single ambiguous outcome can lead it to simply try the action again.

In a read-only agent, a retry is free. In a write-capable agent, a retry is a second irreversible action in the real world. That asymmetry is the whole game, and the fix is older than LLMs: idempotency.
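To make the term concrete, here is a minimal sketch of an idempotent write. Everything in it is an assumption for illustration: the in-memory `InvoiceService` class, the `idempotency_key` parameter, and the invoice fields are hypothetical, not any real provider's API. The idea is only that the caller generates one key per intended action and reuses it on every retry, so the service can recognize a repeat and return the original result instead of acting again.

```python
import uuid


class InvoiceService:
    """Hypothetical in-memory service, used only to illustrate the idea."""

    def __init__(self):
        self.invoices = []  # every invoice that was actually created
        self._seen = {}     # idempotency key -> invoice created for that key

    def send_invoice(self, customer_id, amount_cents, idempotency_key):
        # A key we have already seen means this call is a retry:
        # return the original result and create nothing new.
        if idempotency_key in self._seen:
            return self._seen[idempotency_key]

        invoice = {
            "id": len(self.invoices) + 1,
            "customer_id": customer_id,
            "amount_cents": amount_cents,
        }
        self.invoices.append(invoice)
        self._seen[idempotency_key] = invoice
        return invoice


service = InvoiceService()

# The key is generated once per intended action and reused on every retry.
key = str(uuid.uuid4())
first = service.send_invoice("cust_1", 5000, key)
retry = service.send_invoice("cust_1", 5000, key)

print(first == retry)         # True
print(len(service.invoices))  # 1
```

A real service would keep the key-to-result mapping in durable storage and record it atomically with the write itself; the dictionary here only stands in for that.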

The shape of the bug

Here's the sequence that bites teams over and over. The agent calls send_invoice. The downstream service receives it, creates the invoice, and starts sending the response. Somewhere on the way back, the connection dies. From the agent's point of view, the call failed — it got a timeout, not a 200. So the agent, doing exactly what a resilient system is supposed to do, retries. Now there are two invoices.
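The sequence above can be reproduced in a few lines. This is a sketch under stated assumptions: `FlakyInvoiceService` and `call_with_retry` are hypothetical stand-ins for a downstream service and an orchestration layer's retry logic, and the lost response is simulated by raising `TimeoutError` after the write has already been committed.

```python
class FlakyInvoiceService:
    """Hypothetical service that commits the write, then loses its first response."""

    def __init__(self):
        self.invoices = []
        self._drop_next_response = True

    def send_invoice(self, customer_id, amount_cents):
        # The write is committed before anything can go wrong on the way back.
        self.invoices.append((customer_id, amount_cents))
        if self._drop_next_response:
            self._drop_next_response = False
            # Simulates the connection dying while the response is in flight.
            raise TimeoutError("response lost after commit")
        return {"status": 200}


def call_with_retry(fn, attempts=3):
    """Naive retry loop: treats every timeout as 'nothing happened'."""
    last_error = None
    for _ in range(attempts):
        try:
            return fn()
        except TimeoutError as err:
            last_error = err
    raise last_error


flaky = FlakyInvoiceService()
result = call_with_retry(lambda: flaky.send_invoice("cust_1", 5000))

print(result)               # {'status': 200}
print(len(flaky.invoices))  # 2
```

The caller sees a single successful call, while the service holds two invoices. Nothing in the retry loop is wrong in isolation; the problem is that a timeout does not tell the caller whether the write happened.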
