The Case Against LLMs as Rerankers

Authors: Apoorva Joshi, Zhenmei Shi, Akshay Goindani, Hong Liu
Research Leads: Zhenmei Shi, Akshay Goindani, Hong Liu

Large language models are increasingly used for a broad range of tasks, including reranking, but they may not be the best choice for reranking in production applications once cost, latency, and accuracy are taken into account.

In this blog post, we put our latest reranker model, rerank-2.5, to the test against some of the best-performing LLMs on the market to see whether LLMs are actually good rerankers. Our studies show the following:

Purpose-built rerankers, such as rerank-2.5, are up to 60x cheaper, 48x faster, and achieve up to 15% better reranking accuracy (NDCG@10) than state-of-the-art LLMs.

First-stage retrieval matters—pairing strong first-stage retrieval methods with specialized rerankers yields the best reranking quality.
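For readers unfamiliar with the accuracy metric cited above, here is a minimal sketch of how NDCG@10 can be computed for a single query. It assumes graded relevance labels listed in the order the reranker returned the documents, that the list covers every judged candidate for the query, and a linear gain (some evaluations use an exponential gain of 2^rel - 1 instead; the post does not say which variant was used). The function names `dcg_at_k` and `ndcg_at_k` are illustrative, not part of any library.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k results (linear gain)."""
    # Rank 0 is the top result; its discount is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    # A query with no relevant documents gets a score of 0 by convention here.
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance labels of the documents in the order a reranker returned them.
perfect_order = [3, 2, 1, 0]
swapped_order = [0, 1]

print(ndcg_at_k(perfect_order))
print(ndcg_at_k(swapped_order))
```

A score of 1.0 means the reranker placed documents in the ideal order; pushing relevant documents lower in the list reduces the score. Benchmark figures are typically the mean of this per-query score across all queries.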
Related reading

The State Of LLMs 2025: Progress, Progress, and Predictions

How to evaluate and benchmark Large Language Models (LLMs)

A Vindication of the Rights of L.L.M.s

The Scaling Laws That Made LLMs Work

LLM Trends and Future Outlook

Small language models: Rethinking enterprise AI architecture