AI agents already write code, create files, refactor modules, and make commits. This is the daily workflow of anyone using Cursor, Claude Code, Copilot, Kiro, or any IDE with an integrated agent. The problem is that when you review the git history afterwards, you cannot distinguish what the agent did on its own from what you asked it to do — and you cannot determine why it made each change.

I work in QA and software quality engineering. Part of my work is ensuring traceability — knowing who did what, when, and why. When I started using AI agents intensively in my development workflow, I noticed that git log had become a black box. The commits were there, the diffs were there, but the intent had disappeared.

A commit message such as "feat(auth): add token refresh endpoint" records what changed. It does not record whether the agent acted autonomously or was explicitly asked, nor what condition in the code made the change necessary. If you need to audit the agent's work three months later, or if another agent needs to read that history to understand the project, there is no way to reconstruct the reasoning from the commit alone.
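To make the gap concrete, here is a minimal sketch that parses the commit header above, assuming it follows the common type(scope): description shape of the Conventional Commits convention. The regular expression and the parse_header helper are illustrative names of my own, not part of any specification. Whatever the parser returns is everything the header can tell a reviewer.

```python
import re

# Assumed header shape: type(scope)!: description (Conventional Commits style).
# The pattern and helper name are illustrative, not taken from any spec.
HEADER = re.compile(
    r"^(?P<type>\w+)"             # e.g. feat, fix, refactor
    r"(?:\((?P<scope>[^)]+)\))?"  # optional scope in parentheses
    r"(?P<breaking>!)?"           # optional breaking-change marker
    r": (?P<description>.+)$"     # free-text summary of the change
)


def parse_header(line: str) -> dict:
    """Split a commit header into the fields it actually carries."""
    match = HEADER.match(line)
    if match is None:
        raise ValueError(f"not a conventional commit header: {line!r}")
    fields = match.groupdict()
    fields["breaking"] = fields["breaking"] is not None
    return fields


parsed = parse_header("feat(auth): add token refresh endpoint")
print(parsed)
```

The result holds a type, a scope, a breaking-change flag, and a description. None of those fields says who initiated the change or what made it necessary, which is the information an audit would need.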

The specification described in this article is published at github.com/hubtheocoelho/qac-spec.