The problem

Most code an AI agent writes looks right. It compiles, it reads well, it probably works. "Probably" is the problem. The agent stops when the work looks done, and "looks done" is rarely "proven done." You find the gap later, in production, in the case nobody tested.

proven-python

proven-python is a Claude Code skill that holds an AI coding agent to the way a disciplined engineer actually works on Python code:

Write the failing test first, then the code that makes it pass.
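
As a rough illustration of that test-first order, here is a minimal sketch. The `slugify` helper and its expected behavior are hypothetical, chosen only for the example, and are not part of proven-python; plain `assert` statements stand in for whatever test runner a project actually uses.

```python
import re


# Step 1: the test is written first. Run before slugify exists, it fails
# with a NameError, which is the expected failing ("red") state.
def test_slugify():
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  spaces  ") == "spaces"
    assert slugify("") == ""


# Step 2: the smallest implementation that makes the test pass.
# (Hypothetical helper, for illustration only.)
def slugify(text: str) -> str:
    # Lowercase, replace each run of non-alphanumeric characters with a
    # single hyphen, then trim hyphens from both ends.
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


if __name__ == "__main__":
    test_slugify()
    print("all tests passed")
```

The point of the ordering is that the test is seen to fail before the implementation exists, so a later pass is evidence that the code works and not just that the test is vacuous.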