LLMs are probabilistic text generators. In a notebook demo, that's fine. In production, it means your pipeline will occasionally receive a Python-style dict literal where you expected JSON, a 900-word paragraph where you asked for three bullet points, or a hallucinated field name that breaks your downstream schema. This post is not about theory. It covers five concrete patterns, each with working code, for handling these failures reliably.
The core problem
You're calling an LLM API expecting structured output. The model has been prompted carefully. But over thousands of calls, you'll see:
Malformed JSON (trailing commas, unquoted keys, markdown code fences wrapping the payload)
Responses that exceed or fall short of length constraints











