I want to tell you about the first time I watched a coding agent fix a bug correctly on the first try.

It wasn't the model. The model was the same one I'd been using for months. It was the loop around the model that had changed.

The agent didn't open three files, skim them, write a confident theory in its head, and edit something that "looked like" the cause. That was the old loop. The one where you run the code, see the same bug, and watch the agent shrug and try another file.

This time the agent did something different. It listed five hypotheses. Picked one. Added some logging. Then it stopped and asked me to reproduce the failure. I clicked the button. The bug fired. The agent read the logs, confirmed the hypothesis, wrote a two-line fix, and pulled its logging back out.

The whole thing took maybe four minutes.