The easiest way to make an AI gateway feel flaky is to pretend every upstream model works the same way.

On paper, a lot of tools look compatible.

They all take a prompt. They all return text. Some of them even share an OpenAI-shaped API.

In practice, the differences show up exactly where users stop forgiving you:

a tool-specific field gets dropped