When the LLM Refuses: A Fallback Chain That Salvages Most Refusals

Every production LLM app eats false-positive refusals. A user asks something perfectly fine, the safety filter trips, the model emits two sentences of "I can't help with that," and your UI shows a wall. Do that a few times and the user leaves.

We've measured this on HoneyChat, a Telegram-native AI companion (~300 DAU, 17 languages). On a normal day, between 2% and 8% of model calls end in a refusal or in a finish_reason="content_filter" state. Most of those are not actually problematic content; the model is just being twitchy about edge phrasing, polysemous words, or roleplay framing. The pattern below recovers about 70% of them.
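To make the idea concrete, here is a minimal sketch of a detect-and-retry loop. It is not HoneyChat's actual code: the `Reply` type, the `is_refusal` heuristic, the phrase list, the length cutoff, and `run_chain` are all hypothetical names and values chosen for illustration. Each step in the chain is assumed to be a plain callable (a different model, a rephrased prompt, and so on) that returns the reply text plus a finish reason.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical refusal phrases; a real list would be tuned on logged traffic.
REFUSAL_RE = re.compile(
    r"\bI (?:can't|cannot|won't) help with\b|\bI'm (?:unable|not able) to\b",
    re.IGNORECASE,
)


@dataclass
class Reply:
    text: str
    finish_reason: str  # e.g. "stop" or "content_filter"


def is_refusal(reply: Reply, max_len: int = 300) -> bool:
    """Heuristic: filter-flagged, empty, or short and refusal-shaped."""
    if reply.finish_reason == "content_filter":
        return True
    # Normalize curly apostrophes so "can’t" matches "can't".
    text = reply.text.strip().replace("\u2019", "'")
    if not text:
        return True  # assumption: an empty reply is treated as a refusal
    # Only short replies are checked, so a long answer that merely quotes
    # a refusal phrase is not discarded.
    return len(text) <= max_len and REFUSAL_RE.search(text) is not None


Step = Callable[[str], Reply]


def run_chain(prompt: str, steps: Sequence[Step]) -> Optional[Reply]:
    """Try each step in order; return the first reply that is not a refusal."""
    for step in steps:
        reply = step(prompt)
        if not is_refusal(reply):
            return reply
    return None  # every step refused; the caller decides what the UI shows
```

Returning `None` when every step refuses keeps the final decision (a friendly message, a retry button, a log entry) in the caller rather than in the chain itself.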

HoneyChat LLM routing at a glance (core/llm.py, plan-gated via OpenRouter):

Tier(s) | Pace
