If your RAG app sometimes answers from the wrong document even though the right one was in your database, the fix usually isn't a better embedding model. It's adding a reranker. That's often the highest-leverage upgrade you can make to a basic retrieval pipeline, and it's easy to bolt on.

This is Day 8 of PromptFromZero, where I break down one technique a day.

Why plain retrieval gets the order wrong

Standard RAG embeds your question and grabs the nearest document vectors. It's lightning fast, but the question and each passage are embedded separately, and each embedding squashes a whole passage into a single vector. So it often ranks a doc that merely resembles your question (same topic, same words) above the one that actually answers it. Good enough for a first pass; not good enough to feed straight to the LLM.

const candidates = await vectorSearch(query, { k: 25 }); // fast, coarse
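
To make "fast, coarse" concrete, here is a rough sketch of what a call like vectorSearch boils down to. It is not any particular vector database's implementation: the Doc type, the topK helper, and the toy 3-number vectors are all made up for illustration, and real stores use approximate indexes rather than scoring every document.

```typescript
// Hypothetical in-memory stand-in for a vector store's nearest-neighbor search.
type Doc = { id: string; vector: number[] };

// Cosine similarity: one number summarizing how closely two vectors point the same way.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every doc against the query vector and keep the k highest-scoring ones.
function topK(queryVector: number[], docs: Doc[], k: number): Doc[] {
  return docs
    .map((doc) => ({ doc, score: cosineSimilarity(queryVector, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.doc);
}

// Toy vectors (assumed values); real embeddings have hundreds or thousands of dimensions.
const docs: Doc[] = [
  { id: "refund-policy", vector: [0.9, 0.1, 0.0] },
  { id: "refund-faq-mentions", vector: [0.8, 0.3, 0.1] },
  { id: "shipping-times", vector: [0.0, 0.2, 0.9] },
];

const nearest = topK([1, 0, 0], docs, 2);
console.log(nearest.map((d) => d.id));
```

Notice that the ranking never reads the question and a passage together. Each passage was reduced to its vector ahead of time, and the order comes from a single similarity number per doc. That is why it is cheap, and also why near-miss docs can outrank the real answer.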