After we had run AI coding agents in production for a while, one thing became clear: the failures aren't in the code the model writes. They're at the seams where the agent meets the outside world: git, CI, auth, the network.

The model itself is genuinely capable. It writes functions, writes tests, refactors. What breaks is everything around the work: pushing the result, waiting on CI, merging the PR, refreshing a token, calling another service. And the failures are often the kind a human would avoid without thinking.

Here are five incidents we hit and fixed in Codens' Purple (the orchestration core) over the last few weeks. All real, with production task IDs and dates. Every fix is merged. There's a shared design lesson at the end that ties them together.

Incident 1: a half-resolved merge nearly flooded a PR with 12,000 lines

This was the scary one.