I built a tool to catch AI coding agents misbehaving — and put zero AI in it

I lean on AI coding agents hard. Claude Code, Cursor, Codex — I drive them fast to ship fast. That's not a confession, it's the whole reason this project exists. If you push these tools to their limits every day, you stop seeing them as magic and start seeing exactly where they break.

And the thing I kept noticing is this: they never break in the chat.

In the conversation, the agent looks great. It explains its plan, it sounds reasonable, it agrees with all your constraints. The problem shows up later — in the diff, after the fact, when you're tired and the PR is green and you just want to merge.

What actually goes wrong

A short, real list of things I watched coding agents do, none of which looked wrong in the chat:
