I spent three days debugging why my GPT-4-powered app kept returning malformed JSON. It wasn't a prompt issue. I tried few-shot examples, system messages, even begged the model with 'PLEASE give me valid JSON'. And it still broke in production.

This is the story of how I finally got reliable structured output from LLMs — without playing whack-a-mole with edge cases.

The Problem: JSON Roulette

I was building a small internal tool that takes meeting notes and turns them into structured data: action items, dates, assignees. The prompt was crystal clear:

Return a JSON array of objects with fields: action, due_date (ISO 8601), assignee.
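
To make that contract concrete, here is a minimal sketch of what checking a response against it could look like in Python. It is illustrative only, not the tool's actual code: the function name `validate_action_items` is hypothetical, and reading "ISO 8601" as a plain calendar date such as `2024-03-15` is an assumption, since the prompt does not say whether a time component is allowed.

```python
import json
from datetime import date

# The three fields named in the prompt.
REQUIRED_FIELDS = ("action", "due_date", "assignee")


def validate_action_items(raw: str) -> list[dict]:
    """Parse a model response and check it against the prompt's contract.

    Hypothetical helper. Raises ValueError on any violation
    (json.JSONDecodeError is a subclass of ValueError).
    """
    items = json.loads(raw)  # fails on malformed JSON or surrounding chatter

    if not isinstance(items, list):
        raise ValueError("expected a JSON array at the top level")

    for i, item in enumerate(items):
        if not isinstance(item, dict):
            raise ValueError(f"item {i} is not a JSON object")

        missing = [f for f in REQUIRED_FIELDS if f not in item]
        if missing:
            raise ValueError(f"item {i} is missing fields: {missing}")

        # Assumption: due_date is a calendar date string like "2024-03-15".
        if not isinstance(item["due_date"], str):
            raise ValueError(f"item {i}: due_date must be a string")
        date.fromisoformat(item["due_date"])  # raises ValueError if not a date

    return items
```

With made-up sample data, `validate_action_items('[{"action": "Send recap", "due_date": "2024-03-15", "assignee": "Dana"}]')` returns the parsed list, while a response with a missing field, a due date like "next Friday", or a friendly sentence wrapped around the JSON raises `ValueError`.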