Every RAG pipeline I've reviewed this year hits the same decision point: which vector store do you ship? The wrong choice compounds: it shapes your architecture, your operational overhead, and how painful a future migration will be. I've run all four of these in production or near-production contexts. Here's what matters for the decision.

What you actually need from a vector database

Before benchmarking anything, answer these:

Scale: how many vectors today, and in 12 months?

Filtering: do you need metadata filters applied before the approximate nearest neighbor (ANN) search rather than after it?
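
To make the filtering question concrete, here is a minimal sketch of why the order matters. It uses an exact brute-force search as a stand-in for an ANN index, and the corpus, the `lang` field, and the function names are all hypothetical, made up for illustration rather than taken from any particular database. Filtering after the search can return fewer than k results, because the top-k slots get spent on documents the filter then throws away.

```python
import math

# Hypothetical toy corpus: (id, embedding, metadata). All values are made up.
DOCS = [
    ("a", (1.0, 0.0), {"lang": "en"}),
    ("b", (0.9, 0.1), {"lang": "de"}),
    ("c", (0.8, 0.2), {"lang": "de"}),
    ("d", (0.1, 0.9), {"lang": "en"}),
    ("e", (0.0, 1.0), {"lang": "en"}),
]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def top_k(query, docs, k):
    """Exact nearest-neighbour search, standing in for an ANN index."""
    return sorted(docs, key=lambda d: cosine(query, d[1]), reverse=True)[:k]


def post_filter(query, k, predicate):
    """Search first, then drop results that fail the metadata filter."""
    return [d[0] for d in top_k(query, DOCS, k) if predicate(d[2])]


def pre_filter(query, k, predicate):
    """Apply the metadata filter first, then search only the survivors."""
    candidates = [d for d in DOCS if predicate(d[2])]
    return [d[0] for d in top_k(query, candidates, k)]


def is_english(meta):
    return meta["lang"] == "en"


query = (1.0, 0.0)
post_ids = post_filter(query, 2, is_english)
pre_ids = pre_filter(query, 2, is_english)

print("post-filter:", post_ids)  # one result, although k=2 was requested
print("pre-filter: ", pre_ids)   # two results, both matching the filter
```

With a selective filter, the post-filter path asks for two results and gets one, while the pre-filter path fills both slots. Real systems handle this in different ways (over-fetching, filter-aware index traversal, and so on), so treat this as an illustration of the failure mode, not of how any specific store implements filtering.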