Vector search has become load-bearing infrastructure in modern AI systems remarkably fast. A year or two ago, it was primarily a research curiosity and a niche tool for semantic search. Today it sits at the center of RAG pipelines, recommendation engines, multimodal retrieval systems, and a growing class of applications that reason over unstructured data.
Operational practice hasn't kept pace with that adoption.
Most teams that deploy vector search in production treat it the way they treated relational databases before they understood indexing: as infrastructure that works until it doesn't, with failure modes that aren't well understood until they've been encountered firsthand. The problems that emerge at scale (degraded recall, unpredictable latency, ghost results from deleted records) are preventable. But preventing them requires understanding how vector indices actually work and what happens to them under continuous change.
This post is about that.
What Vector Search Is Actually Doing
