Your AI Returns a 200 OK. That Doesn't Mean It's Right.
A problem I kept hitting while building developer tools, and what I learned trying to solve it.
A few months ago I started noticing something strange in a feature I'd built. The flow was simple: send some text to an LLM, ask for a structured JSON response, and use that response in the app. It had been working fine for weeks. Then, with no code changes on my end, a field that had always been a number started showing up as a string. Nothing crashed. No error in the logs. The API call returned 200 OK every single time. The response just quietly stopped looking the way my code expected it to.
It took me longer than I'd like to admit to figure out what was happening, mostly because I was looking in the wrong place. I kept checking my own code, assuming I'd introduced a bug. I hadn't. The model's output had simply drifted.
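To make the failure concrete, here is a minimal sketch of how a number that turns into a string slips past both the status check and the JSON parser. The payloads, the field names (`label`, `score`), and the helper functions are hypothetical, made up for illustration; they are not the actual feature or any particular provider's API.

```python
import json

# Hypothetical response bodies. The field names are assumptions for illustration.
BEFORE = '{"label": "positive", "score": 0.92}'    # what the code was written against
AFTER = '{"label": "positive", "score": "0.92"}'   # same data, but score is now a string


def parse_unchecked(status_code: int, body: str) -> dict:
    """Mimics the naive flow: check the HTTP status, parse the JSON, trust the shape."""
    if status_code != 200:
        raise RuntimeError(f"request failed with status {status_code}")
    return json.loads(body)


def type_mismatches(payload: dict, expected: dict) -> list:
    """Describes every field whose type differs from what the calling code expects."""
    problems = []
    for field, expected_type in expected.items():
        value = payload.get(field)  # a missing field shows up as NoneType
        if not isinstance(value, expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, got {type(value).__name__}"
            )
    return problems


# The shape the application code silently relies on (also an assumption).
EXPECTED = {"label": str, "score": float}

# Both bodies pass the status check and parse as valid JSON without any error.
old = parse_unchecked(200, BEFORE)
new = parse_unchecked(200, AFTER)

print(type_mismatches(old, EXPECTED))
print(type_mismatches(new, EXPECTED))
```

Neither call raises: the status is 200 and both bodies are valid JSON. Only the explicit comparison against the expected types reports that `score` is no longer a float, which is why nothing shows up in the logs unless something is checking for it.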
Why this is different from a normal API breaking







