Getting structured data out of a language model reliably is harder than it looks. The model might return JSON that is almost valid, skip required fields, or wrap the object in a markdown code fence. Three Python libraries each take a different approach to the problem: instructor, LangChain's structured output, and PydanticAI. This article is a direct comparison based on hands-on use rather than on a reading of the documentation.
What "structured output" actually means
When you call a language model you get back a string. If you want a Python object — a typed dict, a Pydantic model, a dataclass — you need something to bridge the gap. There are two broad approaches:
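JSON mode / function calling: You send the schema to the model along with the prompt, and the provider's API is expected to return JSON matching that schema. How strictly this is enforced varies by provider and mode, so validating the result is still worthwhile.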
Parse-and-retry: You ask for JSON, validate with Pydantic, and if validation fails you send the error back and ask the model to fix it.
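The parse-and-retry loop can be sketched in a few lines. This is a minimal illustration, not how any of the three libraries implements it: it assumes Pydantic v2 (`model_json_schema`, `model_validate_json`), and the `Invoice` model, the `call_model` callable, the prompt wording, and the canned replies are all hypothetical stand-ins for a real API call.

```python
import json
from typing import Callable

from pydantic import BaseModel, ValidationError

FENCE = "`" * 3  # markdown code-fence marker


class Invoice(BaseModel):
    # Hypothetical target schema, used only for illustration.
    vendor: str
    total: float


def strip_code_fence(text: str) -> str:
    """Remove a surrounding markdown code fence, if the model added one."""
    text = text.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines)
    return text


def parse_with_retry(
    call_model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> Invoice:
    """Ask for JSON, validate it, and feed validation errors back on failure."""
    schema = json.dumps(Invoice.model_json_schema())
    message = f"{prompt}\nReply with JSON matching this schema:\n{schema}"
    last_error = None
    for _ in range(max_attempts):
        raw = call_model(message)
        try:
            # Raises ValidationError for malformed JSON and for schema mismatches.
            return Invoice.model_validate_json(strip_code_fence(raw))
        except ValidationError as err:
            last_error = err
            message = (
                f"{prompt}\nYour previous reply was:\n{raw}\n"
                f"It failed validation:\n{err}\nReturn corrected JSON only."
            )
    raise ValueError(f"no valid output after {max_attempts} attempts") from last_error


# Stand-in for a real model: the first reply omits a required field,
# the second is valid JSON wrapped in a markdown fence.
replies = [
    '{"vendor": "Acme"}',
    FENCE + 'json\n{"vendor": "Acme", "total": 41.5}\n' + FENCE,
]
calls = []


def fake_model(message: str) -> str:
    calls.append(message)
    return replies[len(calls) - 1]


invoice = parse_with_retry(fake_model, "Extract the invoice.")
print(invoice)
```

Swapping `fake_model` for a function that calls a real provider is the only change needed to try this against a live model. The retry message includes both the failed reply and the validation error, since the model has no memory of the previous attempt unless you resend it.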










