Pydantic and JSON Schema guarantee the shape of a tool call. They say nothing about whether it was the right call for the user's intent.
TL;DR: We put strict Pydantic validation on every tool call our agent makes, expecting tool-call failures to drop. They barely did. When I categorized 40 logged failures, 31 of them passed schema validation cleanly. They were well-formed calls to the wrong tool, or the right tool with arguments that were valid types but wrong values. Schema validation catches structural errors. Our actual problem was semantic, and the validator is blind to it.
What schema validation actually guarantees
Pydantic checks types, required fields, enums, and ranges. A call like cancel_order(order_id="A123") is structurally perfect even when the user asked to cancel a subscription, not an order. The validator passes it. The user is still angry. Shape is not intent.
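To make that concrete, here is a minimal sketch of the cancel_order case. The CancelOrderArgs model, the validate_cancel_order helper, and the sample payloads are hypothetical stand-ins, not our production schema; the only assumption is a tool that takes a single non-empty order_id string.

```python
from pydantic import BaseModel, Field, ValidationError


class CancelOrderArgs(BaseModel):
    """Hypothetical argument schema for a cancel_order tool."""

    # Structural rule only: order_id must be present and non-empty.
    order_id: str = Field(min_length=1)


def validate_cancel_order(raw_args: dict) -> bool:
    """Return True if raw_args satisfies the schema, False otherwise."""
    try:
        CancelOrderArgs(**raw_args)
        return True
    except ValidationError:
        return False


# Assumed scenario: the user asked to cancel a *subscription*,
# but the model emitted a call to cancel_order.
wrong_tool_call = {"order_id": "A123"}

# A structurally broken call: the required field is missing.
malformed_call = {}

print(validate_cancel_order(wrong_tool_call))  # schema is satisfied
print(validate_cancel_order(malformed_call))   # schema is violated
```

The validator only ever sees the arguments. The user's request never enters the check, so nothing in the model can flag that a subscription, not an order, was what needed cancelling.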
The 40-failure breakdown






