In this article, we will look at how vector search works in Azure AI Search and how to use it as the retrieval layer in a Retrieval-Augmented Generation (RAG) system. The article is written for software engineers. We will not stop at theory: we will build a small, working example that you can run on your own machine and follow step by step.
By the end, you will have a small document search service that takes a user question, finds the most relevant text using vector similarity, and assembles the context that you can pass to a language model.
Note that Azure AI Search was previously called Azure Cognitive Search. Many older articles and code samples still use the old name, but the concepts are the same.
Let us begin.
What is a RAG system, in short








