RAG in production: the failure modes nobody warns you about

Retrieval-augmented generation looks trivial in a tutorial: embed some documents, drop them in a vector database, stuff the top matches into a prompt, done. Then you point it at real company data and real users, and you discover that the demo was the easy 10%.

We build RAG systems over private knowledge for companies, and almost every painful bug traces back to the same handful of failure modes. Here they are, and what actually fixes them.

1. Retrieval returns the wrong chunks — and the model uses them anyway

The single biggest source of "wrong" RAG answers isn't the LLM. It's retrieval handing the model irrelevant or partial context, which it then summarizes with total confidence. Naive fixed-size chunking is a frequent cause: it splits a table from its header, or a clause from the sentence that negates it.

The fix is unglamorous data engineering: chunk on semantic boundaries, not character counts; add a reranking step so the top-k you actually pass is the top-k by relevance, not by raw vector distance; and store enough metadata to filter before you search. Retrieval quality sets the ceiling on everything downstream.
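As a rough illustration of those three steps, here is a minimal sketch in plain Python. Everything in it is an assumption made for the example: the `Chunk` type, the function names, and the metadata keys are hypothetical, paragraph breaks stand in for "semantic boundaries", and `overlap_score` is a toy word-overlap scorer standing in for a real reranking model. It leaves out the vector search itself and shows only the order of operations: chunk on boundaries, filter on metadata, then rerank what is left.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_by_paragraph(text, max_chars=500, metadata=None):
    """Pack whole paragraphs into chunks instead of cutting at a character count.

    Assumption: blank lines mark the boundaries we care about. A paragraph
    longer than max_chars is kept whole rather than split mid-sentence.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current = [], ""
    for p in paragraphs:
        if current and len(current) + len(p) + 2 > max_chars:
            chunks.append(Chunk(current, dict(metadata or {})))
            current = p
        else:
            current = f"{current}\n\n{p}" if current else p
    if current:
        chunks.append(Chunk(current, dict(metadata or {})))
    return chunks


def filter_by_metadata(chunks, **required):
    """Keep only chunks whose metadata matches every required key/value."""
    return [
        c for c in chunks
        if all(c.metadata.get(k) == v for k, v in required.items())
    ]


def overlap_score(query, text):
    """Toy relevance score: fraction of query words that appear in the text.

    A stand-in for a real reranker (for example a cross-encoder model).
    """
    q = set(re.findall(r"\w+", query.lower()))
    t = set(re.findall(r"\w+", text.lower()))
    return len(q & t) / len(q) if q else 0.0


def rerank(query, candidates, top_k=3, score=overlap_score):
    """Reorder candidates by the relevance score and keep the best top_k."""
    ranked = sorted(candidates, key=lambda c: score(query, c.text), reverse=True)
    return ranked[:top_k]


doc = (
    "Refunds are allowed within 30 days.\n\n"
    "This does not apply to sale items.\n\n"
    "Shipping takes 5 days."
)
chunks = chunk_by_paragraph(doc, max_chars=80, metadata={"dept": "support"})
candidates = filter_by_metadata(chunks, dept="support")
best = rerank("shipping days", candidates, top_k=1)
```

With `max_chars=80`, the refund rule and the sentence that limits it land in the same chunk, which is the property fixed-size splitting cannot guarantee. In a real system the `score` argument would be swapped for a proper reranking model, and the metadata filter would usually be pushed down into the vector store query rather than applied in Python.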

