Unify all your AI models - local and cloud - behind a single OpenAI-compatible API with LiteLLM and Ollama.
LiteLLM is a proxy server that exposes 100+ LLM providers through one endpoint. Connect it to Ollama for local inference, and you get load balancing, cost tracking, rate limits, and automatic fallback routing.
What You Need
Python 3.9+
Ollama installed and running







