A few months ago I was drowning in documentation. My team had written hundreds of pages about our internal microservices, configuration guides, and deployment procedures. Great, right? Except that nobody read them. The same questions popped up in Slack every week. "How do I reset the staging DB?" "What's the syntax for that webhook?"

I tried throwing a basic search index on top of the wiki. It was terrible. People would type "reset staging database" and get back a page about resetting production credentials. Context? Gone. Synonyms? Useless.

So I did what any developer would do: I spent two weekends building a RAG (Retrieval-Augmented Generation) system from scratch. Here’s what I learned, including the dead ends that wasted my time.

The naïve approach: dump PDFs into a vector database

I started with the classic recipe: PDFs → text splitter → OpenAI embeddings → Pinecone. Simple. It worked... for one question. For everything else it returned irrelevant junk.