Over the last few months, Retrieval Augmented Generation (RAG) has emerged as a popular technique for getting the most out of Large Language Models (LLMs) like Llama-2-70b-chat.

In this post, we’ll walk through building an example RAG “app” that helps you generate click-worthy titles for Hacker News submissions. All you need to do is provide a working title, idea, or phrase, and even the most boring of words will be transformed into a title destined for the front page of Hacker News.

Admittedly, this is a basic toy idea. It’s not revolutionary, and it may not land your post on the front page of Hacker News. That’s okay, because that’s not the point: the point is to give you a practical, hands-on feel for how RAG works, along with the understanding you need to use this technique in your own projects and systems.

Okay, so what is retrieval augmented generation?

Retrieval augmented generation is a technique for enriching a language model’s output by retrieving contextual information from an external data source and including that information in the prompt. The idea is that when a prompt is augmented with relevant external data, the model can ground its response in that data instead of relying only on what it learned during training, so its answers tend to be more specific and more relevant.
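
To make the retrieve-then-augment idea concrete, here is a minimal sketch in Python. It is not the app built in this post: the titles in `EXAMPLE_TITLES` are invented placeholders, the function names are hypothetical, and plain word overlap (Jaccard similarity) stands in for whatever similarity search a real system would use. It also stops at building the prompt and does not call a language model.

```python
import re

# Invented placeholder titles (an assumption for illustration);
# a real app would load actual Hacker News submissions.
EXAMPLE_TITLES = [
    "Show HN: I built a search engine in Rust",
    "Why SQLite is so great for the edge",
    "Ask HN: How do you learn Rust quickly?",
    "The unreasonable effectiveness of plain text",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, corpus, k=2):
    """Return the k corpus entries that share the most words with the query.

    Word-overlap (Jaccard) scoring is a simple stand-in for the
    similarity search a production RAG system would typically use.
    """
    query_words = tokenize(query)

    def score(doc):
        doc_words = tokenize(doc)
        union = query_words | doc_words
        return len(query_words & doc_words) / len(union) if union else 0.0

    return sorted(corpus, key=score, reverse=True)[:k]


def build_prompt(query, retrieved):
    """Augment the user's request with the retrieved examples."""
    context = "\n".join(f"- {title}" for title in retrieved)
    return (
        "Here are some example Hacker News titles:\n"
        f"{context}\n\n"
        f"Rewrite this working title in a similar style: {query}"
    )


if __name__ == "__main__":
    working_title = "learning rust"
    examples = retrieve(working_title, EXAMPLE_TITLES, k=2)
    print(build_prompt(working_title, examples))
```

The string returned by `build_prompt` is what would be sent to the language model. The retrieval step decides which examples the model sees, and that choice is the "augmentation" in the name.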