Enterprise RAG — A practitioner's build log | Post 3 of 6

A retrieval pipeline has more design surface than it first appears. The technology choices (vector search, LLM provider, storage engine) get most of the attention. The structural choices (where filtering happens, how evaluation is wired, what the dashboard connects to) determine whether the system behaves correctly in production.

This post documents three structural decisions I made in Enterprise RAG, the constraint that drove each one, and the cost I accepted.

Decision 1: Lexical retrieval before semantic — sequencing, not a permanent choice

The default retrieval implementation uses token cosine similarity against a local SQLite chunk store (RAG_RETRIEVAL_PROVIDER=local). Not vector embeddings. Not a managed search index. Lexical scoring.
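
To make "token cosine similarity against a SQLite chunk store" concrete, here is a minimal sketch of the idea, not the project's actual code. The `chunks` table and its columns, the regex tokenizer, the raw term-frequency weighting, and the `retrieve` helper are all assumptions made for illustration. The real implementation may tokenize, weight, or store chunks differently.

```python
import math
import re
import sqlite3
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase, then keep runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def token_cosine(a, b):
    # Cosine similarity between two term-frequency vectors (Counters).
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(conn, query, top_k=3):
    # Score every stored chunk against the query; return the best matches
    # as (score, chunk_id, text) tuples, highest score first.
    query_vec = Counter(tokenize(query))
    scored = []
    for chunk_id, text in conn.execute("SELECT id, text FROM chunks"):
        score = token_cosine(query_vec, Counter(tokenize(text)))
        if score > 0:
            scored.append((score, chunk_id, text))
    scored.sort(key=lambda row: row[0], reverse=True)
    return scored[:top_k]


# Hypothetical chunk store: an in-memory database with an assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO chunks (text) VALUES (?)",
    [
        ("Employees accrue vacation days monthly under the vacation policy.",),
        ("The expense policy covers travel and meals.",),
        ("Quarterly revenue grew in the enterprise segment.",),
    ],
)

results = retrieve(conn, "vacation policy")
for score, chunk_id, text in results:
    print(f"{score:.3f}  chunk {chunk_id}: {text}")
```

The sketch scans every row on each query, which is workable for a small local store but scales linearly with the number of chunks. It also matches on exact tokens only, so a query for "holiday" would not find the vacation chunk.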