A 1.5B model running on your laptop will return JSON that almost parses. The closing brace is missing. A trailing comma sneaks in. The whole thing is wrapped in a markdown fence with a chirpy "Sure! Here's your JSON:" on top. Cloud models do this too, but small local models do it constantly, and that is exactly where most "just prompt it harder" advice falls apart.
I wrote about validating LLM responses with Zod before: schemas as contracts, safeParse, extracting JSON from chaos. That post is the foundation. This one is the local-model-specific layer on top: Ollama's native JSON modes, retry loops that actually converge, repairing truncated output, and a single generateStructured<T>(schema) helper that ties it all together so you never hand-roll this again.
Two JSON Modes Most People Skip
Ollama gives you two ways to force structure before you ever touch Zod. Use them. They cut your parse-failure rate dramatically.
The first is format: "json". It constrains decoding so the model can only emit syntactically valid JSON. No markdown fences, no preamble, no trailing prose.







