Most RAG pipelines embed the user's raw query and skip the cheapest recall win there is: rewriting it before search.

Pre-filter by metadata to shrink the search space before the vector index runs. The recall lift, the cardinality trap, and the code.
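
A minimal sketch of the pre-filter idea, assuming an in-memory list of documents and brute-force cosine similarity standing in for a real vector index. The names `prefiltered_search`, `DOCS`, and the `meta` fields are hypothetical, chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def prefiltered_search(query_vec, docs, filters, k=3):
    """Keep only docs whose metadata matches every filter, then rank by similarity."""
    # Step 1: metadata filter runs first and shrinks the candidate set.
    candidates = [
        d for d in docs
        if all(d["meta"].get(key) == value for key, value in filters.items())
    ]
    # Step 2: similarity ranking runs only over the surviving candidates.
    ranked = sorted(candidates, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return ranked[:k]


# Hypothetical toy corpus: 2-d vectors and two metadata fields.
DOCS = [
    {"id": "a", "vec": [1.0, 0.0], "meta": {"lang": "en", "year": 2024}},
    {"id": "b", "vec": [0.9, 0.1], "meta": {"lang": "de", "year": 2024}},
    {"id": "c", "vec": [0.0, 1.0], "meta": {"lang": "en", "year": 2023}},
]

hits = prefiltered_search([1.0, 0.0], DOCS, {"lang": "en"}, k=2)
print([d["id"] for d in hits])
```

In this toy corpus, document `b` is close to the query but is never scored, because the `lang` filter removes it before ranking.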

Extractive vs abstractive compression of retrieved chunks. Sentence-level filtering. How to cut tokens without losing the answer.
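
One way to picture sentence-level filtering, as a rough sketch: split each retrieved chunk into sentences and keep only those that share at least one term with the query. The function names and the term-overlap score are assumptions made for illustration; a production system might score sentences with embeddings or a reranker instead.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(text):
    """Lowercased set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def compress_chunk(query, chunk, min_overlap=1):
    """Extractive compression: keep sentences sharing >= min_overlap terms with the query."""
    query_terms = tokens(query)
    kept = [
        sentence for sentence in split_sentences(chunk)
        if len(query_terms & tokens(sentence)) >= min_overlap
    ]
    return " ".join(kept)


chunk = (
    "The cache expires after ten minutes. "
    "Our office is in Berlin. "
    "Expired cache entries are evicted lazily."
)
print(compress_chunk("cache eviction", chunk))
```

Because the score is plain term overlap, a query containing common words such as "the" would match almost every sentence, so stop-word removal or a stronger scorer is likely needed in practice.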

If your RAG app sometimes answers from the wrong document even though the right one was in your...