TL;DR
An AI gateway is a middleware layer between your application code and your LLM providers. It centralises routing, auth, rate limiting, cost tracking, and guardrails in one place.
You probably won't think you need one until something specific breaks: a runaway cost spike, a failing model that causes silent errors, or a security audit you can't pass.
We went from scattered SDKs and shared API keys to a gateway-first setup over about three months. This post covers what changed and what we'd do differently.
Six months ago we had what I'd describe as a functional mess. We were running three LLM providers - OpenAI for our customer-facing chat, Anthropic for internal document summarisation, and a self-hosted Llama model for batch classification jobs. Each had its own SDK. Each had its own API key, living in .env files on whoever's machine had last run that service. Each had its own rate limiting logic, copy-pasted between services with slight variations.
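For contrast with that setup, here is a minimal Python sketch of the shape a gateway takes: one entry point that owns routing, a per-provider rate limit, and a running cost tally. Everything in it is an assumption for illustration. The `Route` and `Gateway` names, the task labels, the flat per-call cost, and the lambda stand-ins for real SDK calls are hypothetical and not taken from any real gateway product.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Route:
    """One upstream provider behind the gateway (all fields are illustrative)."""
    provider: str
    call: Callable[[str], str]   # stand-in for a real provider SDK call
    max_per_minute: int          # rate limit applied to this route
    cost_per_call: float         # flat, made-up cost in USD per request
    stamps: List[float] = field(default_factory=list)  # recent request times


class Gateway:
    """Single entry point: routing, rate limiting and cost tracking in one place."""

    def __init__(self, routes: Dict[str, Route],
                 clock: Callable[[], float] = time.monotonic):
        self.routes = routes
        self.clock = clock
        self.spend: Dict[str, float] = {}  # provider name -> accumulated cost

    def complete(self, task: str, prompt: str) -> str:
        route = self.routes[task]  # raises KeyError for an unknown task
        now = self.clock()
        # Sliding 60-second window: keep only timestamps still inside it.
        route.stamps = [t for t in route.stamps if now - t < 60.0]
        if len(route.stamps) >= route.max_per_minute:
            raise RuntimeError(f"rate limit exceeded for {route.provider}")
        route.stamps.append(now)
        result = route.call(prompt)
        # Cost is recorded only after the upstream call has returned.
        self.spend[route.provider] = (
            self.spend.get(route.provider, 0.0) + route.cost_per_call
        )
        return result


# Hypothetical wiring that mirrors the three workloads described above.
gateway = Gateway({
    "chat": Route("openai", lambda p: f"[openai] {p}", 60, 0.01),
    "summarise": Route("anthropic", lambda p: f"[anthropic] {p}", 30, 0.02),
    "classify": Route("llama", lambda p: f"[llama] {p}", 600, 0.0),
})
```

Application code would then call something like `gateway.complete("chat", "hello")` and never touch a provider key or a rate limiter directly. A real gateway would usually run as a separate service and add auth and guardrails, which this sketch leaves out.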






