The agents were doing exactly what I told them to. That was the problem.

I'd built a pipeline where AI agents could take a spec file, implement a feature, run the tests, review the result, and commit — without me writing a line of code. It mostly worked. Dozens of features shipped. But I kept reviewing the output and feeling like something was off. Not broken. Just subtly wrong in a way that was hard to name.

I spent a while blaming the models. Then the prompts. Then the validation steps. Eventually I had to sit with the obvious: the agents were implementing exactly what I'd written. My specs were underspecified. The bottleneck was always me, at the planning stage.

The thing most people throw away

There's something that feels right about vibe coding. You're operating at the level of intent — describing what you want and letting the model handle the mechanics. That part is genuinely useful.