The problem
Most code an AI agent writes looks right. It compiles, it reads well, it probably works. "Probably" is the problem. The agent stops when the work looks done, and "looks done" is rarely "proven done." You find the gap later, in production, in the case nobody tested.
proven-python
proven-python is a Claude Code skill that holds an AI coding agent to the way a disciplined engineer actually works on Python:
Write the failing test first, then the code that makes it pass.







