Hello, I'm Shrijith Venkatramana. I'm building git-lrc, an AI code reviewer that runs on every commit. Star Us to help devs discover the project. Do give it a try and share your feedback for improving the product.

Every developer eventually discovers the same frustrating pattern.

Your application sends a 20,000-token prompt to an LLM. The first request takes 2 seconds. The next request contains the exact same 20,000 tokens plus a tiny user message at the end.

And somehow the model processes the entire thing again.

At least, that's what many developers assume.