Your AI agent calls the wrong tool — and your JSON schema is usually why

Here's the number that should worry you more than it does: an agent that calls the right tool with the right arguments 95% of the time completes an eight-step task correctly only about 66% of the time. Reliability doesn't fail in one dramatic crash. It leaks. Every step is a coin that lands heads 19 times out of 20, and you're flipping it eight times in a row.

The good news is that most of that leak isn't the model being dumb. It traces to two things you control completely: the JSON schema you hand the model, and whether you let it guess when it shouldn't. Fix those two and the per-call rate climbs. And because that rate is multiplied across every step of a task, a gain of a few points per call turns into a much larger gain in tasks completed.
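
To make the compounding concrete, here is a minimal sketch of the arithmetic. It assumes every call succeeds or fails independently with the same probability, which is a simplification; the function name `task_success_rate` and the comparison rates of 98% and 99% are chosen here for illustration.

```python
def task_success_rate(per_step: float, steps: int) -> float:
    """Chance that all `steps` calls succeed, assuming independent steps."""
    return per_step ** steps


# Compare a few per-call rates over the same eight-step task.
for rate in (0.95, 0.98, 0.99):
    overall = task_success_rate(rate, 8)
    print(f"{rate:.0%} per call -> {overall:.1%} over 8 steps")

# 95% per call -> 66.3% over 8 steps
# 98% per call -> 85.1% over 8 steps
# 99% per call -> 92.3% over 8 steps
```

Under that independence assumption, three extra points per call (95% to 98%) buy roughly nineteen extra points of end-to-end success.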

Your schema is a prompt, not documentation

This is the reframe that fixes everything downstream. When you define a tool, the description fields aren't docs for your teammates. They are the only instructions the model gets about when and how to use that tool. The model never sees your implementation. It sees the schema. That's it.

So a schema like this is not "good enough":
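
As a hypothetical illustration, an under-described tool might look like the sketch below, written as a Python dict in the JSON Schema style that tool-calling APIs commonly accept. The tool name `search`, its fields `q` and `n`, and the helper `undocumented_params` are invented for this example and are not taken from any real API.

```python
# Hypothetical tool definition: names and fields are invented for illustration.
vague_tool = {
    "name": "search",
    "description": "Searches.",  # says nothing about when to use the tool
    "parameters": {
        "type": "object",
        "properties": {
            "q": {"type": "string"},   # no description: query syntax is a guess
            "n": {"type": "integer"},  # no description: limit? page? offset?
        },
        "required": ["q"],
    },
}


def undocumented_params(tool: dict) -> list[str]:
    """Return the parameter names that carry no description text."""
    props = tool.get("parameters", {}).get("properties", {})
    return [
        name
        for name, spec in props.items()
        if not spec.get("description", "").strip()
    ]


print(undocumented_params(vague_tool))  # ['q', 'n']
```

The schema is structurally valid, yet the model learns nothing from it about what `q` should contain or what `n` controls, so it has to guess.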


Related reading

Tool-Call Accuracy Is Lying to You: A Four-Layer Eval Stack for Agents

Your schema validation passes and the agent still picks the wrong tool. The bug…

AI Agent Tool Design: What Works and What Doesn't

The Tool Call Succeeded. The Outcome Failed.

Your AI agent reports 80% task completion. It fabricated it.

Why your AI agent is flaky — and 7 rules that make it reliable
