Most RAG (retrieval-augmented generation) tutorials I found were either "pip install langchain and you're done" or 50-page academic papers. I wanted something in between: a pipeline I could actually explain in an interview, where I understood every line.

So I built one from scratch. No LangChain, no LlamaIndex, no frameworks. Just FastAPI, FAISS, sentence-transformers, and an LLM API.

Here's what I built, what worked, and what broke.

The architecture

PDF --> extract text (pypdf) --> chunk (500 char, 50 overlap) --> embed (MiniLM-L6-v2)
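
To make the chunking step concrete, here is a minimal sketch of fixed-size character chunking with overlap, using the 500/50 settings from the diagram. The function name `chunk_text` and its parameters are illustrative assumptions, not necessarily the code from the project, and it assumes the text has already been extracted from the PDF.

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap.

    Hypothetical helper for illustration: each chunk is at most `size`
    characters, and each chunk after the first starts `overlap`
    characters before the end of the previous one.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")

    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once the window has reached the end of the text, so we
        # don't emit a trailing chunk that is pure overlap.
        if start + size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(1000))
    pieces = chunk_text(sample)
    print(len(pieces), [len(p) for p in pieces])
```

The overlap means a sentence that straddles a chunk boundary still appears more or less whole in at least one chunk, at the cost of embedding some text twice.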