How I built a 3-provider LLM fallback system in production (and what actually broke)

I'm a pre-final year student. I built Socra (https://socra-production.up.railway.app/), a multi-agent LLM SaaS that interrogates your startup idea using 5 specialist AI personas before generating an architecture masterplan. It has paying users. It runs on Railway. And for the first two weeks of production, it was quietly broken in a way I didn't notice until real users hit it.

This is the story of how I built the 3-provider fallback chain (Anthropic → Google → Groq), what broke along the way, and the actual code that runs in production today.
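For readers who have not seen the pattern before, here is a minimal sketch of an ordered fallback chain in Python. It is an illustration only, not Socra's production code: the `Provider` dataclass, `complete_with_fallback`, `AllProvidersFailed`, and the three `fake_*` functions are hypothetical names, and the fake functions stand in for real SDK calls so the sketch runs without API keys.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has failed."""


@dataclass
class Provider:
    name: str
    call: Callable[[str], str]  # takes a prompt, returns the completion text


def complete_with_fallback(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, text) from the first success."""
    errors: list[str] = []
    for provider in providers:
        try:
            return provider.name, provider.call(prompt)
        except Exception as exc:  # assumption: a real system would catch narrower error types
            errors.append(f"{provider.name}: {exc}")
    # Nothing succeeded, so surface every failure at once.
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real provider clients.
def fake_anthropic(prompt: str) -> str:
    raise TimeoutError("simulated timeout")


def fake_google(prompt: str) -> str:
    raise RuntimeError("simulated rate limit")


def fake_groq(prompt: str) -> str:
    return f"echo: {prompt}"


# Order matters: the first entry is the primary, the rest are fallbacks.
CHAIN = [
    Provider("anthropic", fake_anthropic),
    Provider("google", fake_google),
    Provider("groq", fake_groq),
]

if __name__ == "__main__":
    print(complete_with_fallback("hello", CHAIN))
```

Returning the provider name alongside the text is a deliberate choice in this sketch: it lets the caller log which provider actually served the request, which is one way to notice when the primary has been failing silently.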

Why you need a fallback chain at all

When I first deployed Socra, the LLM routing was simple: one provider, one model, one API key. It worked fine in development.
