Reciprocal Rank Fusion (RRF)
How RRF merges ranked lists from BM25 and vector search without normalizing scores, with the fusion formula and practical notes.
Reciprocal Rank Fusion (RRF) is a simple and effective way to combine multiple ranked search results into one final ranking.
In the flow of this series, RRF comes after BM25 and embeddings because it answers the next practical question: once you have multiple first-stage retrievers, how should you merge their outputs into one candidate list?
It is often used in hybrid search, where you want to merge results from a lexical retriever such as BM25 and an embedding-based vector search.
Instead of comparing raw scores from different systems, RRF only looks at rank positions.
This makes it practical because BM25 scores and embedding similarity scores usually live on different scales and cannot be compared directly.
Core idea
If a document appears high in multiple ranked lists, it should get a higher final score.
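RRF gives each document a fusion score using this formula:

$$
\mathrm{RRF}(d) = \sum_{r \in R} \frac{1}{k + \mathrm{rank}_r(d)}
$$

Where:
- $d$ = a document
- $R$ = the set of ranking systems
- $\mathrm{rank}_r(d)$ = the rank position of document $d$ in ranking system $r$
- $k$ = a constant, usually 60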
A smaller rank number means a better position:
- rank 1 = top result
- rank 2 = second result
- rank 3 = third result
So documents near the top contribute more. In practice, RRF rewards agreement: a document that ranks high in multiple lists tends to rise after fusion.
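The scoring rule described above, where each list adds 1/(k + rank) for every document it contains, can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `rrf_fuse`, the document ids, and the tie-break by id are assumptions made for the example, and each input list is assumed to contain unique ids ordered best-first.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def rrf_fuse(
    rankings: Sequence[Sequence[Hashable]], k: int = 60
) -> List[Tuple[Hashable, float]]:
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion.

    Each inner sequence is ordered best-first, so index 0 is rank 1.
    A document missing from a list contributes nothing for that list.
    """
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first. Ties are broken by document id only to keep
    # the output deterministic; that choice is arbitrary.
    return sorted(scores.items(), key=lambda item: (-item[1], str(item[0])))


# Hypothetical first-stage outputs, best result first.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse([bm25_hits, vector_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Here `doc_b` ends up first because it sits near the top of both lists, while `doc_c` and `doc_d`, each found by only one retriever, fall to the bottom.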
Where RRF fits in the pipeline
RRF belongs to first-stage retrieval, not to the later re-ranking stage.
That distinction matters:
- BM25 and embeddings produce candidate lists
- RRF fuses those candidate lists into one stronger candidate list
- Re-ranking then applies a more expensive model to reorder that already-retrieved set
Example: combining BM25 and embeddings
Let k = 60.
Imagine two first-stage ranked lists:
| Document | BM25 Rank | Embedding Rank |
|---|---|---|
| Doc A | 1 | 2 |
| Doc B | 2 | 1 |
| Doc C | 3 | — |
| Doc D | — | 3 |
RRF scores:
| Document | BM25 contribution | Embedding contribution | Total |
|---|---|---|---|
| Doc A | 1 / 61 | 1 / 62 | 0.03252 |
| Doc B | 1 / 62 | 1 / 61 | 0.03252 |
| Doc C | 1 / 63 | 0 | 0.01587 |
| Doc D | 0 | 1 / 63 | 0.01587 |
Final fused ranking
After sorting by RRF score:
| Final Rank | Document | Score |
|---|---|---|
| 1–2 (tie) | Doc A, Doc B | 0.03252 |
| 3–4 (tie) | Doc C, Doc D | 0.01587 |
The point is not the exact tie-breaking. The point is that documents supported by both retrievers rise above documents supported by only one.
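To check the arithmetic in the tables, the short sketch below recomputes each total with exact fractions. The ranks are copied from the example; the dictionary layout and variable names are only illustrative.

```python
from fractions import Fraction

K = 60

# Rank of each document per retriever, copied from the example table.
# None means the retriever did not return the document.
ranks = {
    "Doc A": {"bm25": 1, "embedding": 2},
    "Doc B": {"bm25": 2, "embedding": 1},
    "Doc C": {"bm25": 3, "embedding": None},
    "Doc D": {"bm25": None, "embedding": 3},
}

# Sum 1 / (K + rank) over the retrievers that returned the document.
totals = {
    doc: sum(Fraction(1, K + r) for r in per_retriever.values() if r is not None)
    for doc, per_retriever in ranks.items()
}

for doc, total in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{doc}: {total} = {float(total):.5f}")
```

Doc A and Doc B both come out to exactly 123/3782 (about 0.03252), and Doc C and Doc D to 1/63 (about 0.01587), which matches the totals in the table.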
Why ranks, not raw scores?
You usually cannot add BM25 scores and embedding similarity scores directly. They use different scales and different semantics.
RRF avoids that problem by ignoring raw scores and asking only one question: how high did this document rank in each list?
That gives it three practical advantages:
- no score normalization step
- no hand-tuned blend weights like α and β
- less sensitivity to score-scale changes across retrievers
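A small experiment makes the scale problem concrete. In the sketch below, all scores are invented for illustration and the helper names are hypothetical. Adding raw scores lets the larger BM25 numbers decide the order, whereas rank-based fusion treats both lists equally and is unaffected when one retriever's scores are rescaled.

```python
def ranks_from_scores(scores):
    """Map each document to its 1-based rank, best score first."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {doc: rank for rank, doc in enumerate(ordered, start=1)}


def rrf_order(score_maps, k=60):
    """Return document ids sorted by RRF score, using only rank positions."""
    fused = {}
    for scores in score_maps:
        for doc, rank in ranks_from_scores(scores).items():
            fused[doc] = fused.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


# Made-up scores: BM25 is unbounded, cosine similarity lies in [-1, 1].
bm25_scores = {"doc_a": 14.2, "doc_b": 11.7, "doc_c": 3.1}
cosine_scores = {"doc_b": 0.91, "doc_c": 0.62, "doc_a": 0.15}

# Naive sum of raw scores: the BM25 scale dominates the result.
naive_order = sorted(
    bm25_scores,
    key=lambda doc: bm25_scores[doc] + cosine_scores[doc],
    reverse=True,
)

# Rank-based fusion, before and after multiplying every BM25 score by 10.
rrf_result = rrf_order([bm25_scores, cosine_scores])
rescaled_bm25 = {doc: 10 * score for doc, score in bm25_scores.items()}
rrf_rescaled = rrf_order([rescaled_bm25, cosine_scores])

print("naive sum:   ", naive_order)
print("RRF:         ", rrf_result)
print("RRF rescaled:", rrf_rescaled)
```

With these numbers the naive sum simply reproduces the BM25 order, while RRF moves `doc_b` to the top because both retrievers rank it highly. The rescaled run yields the same RRF order.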
Why RRF is a useful baseline
- It is simple to implement, which often makes it a strong first choice for hybrid search.
- It avoids score normalization and cross-retriever score arithmetic.
- It can combine very different retrieval methods.
- It often improves recall and ranking quality over either retriever alone, because each list can surface relevant documents the other missed.
Limitations
- it only uses rank positions, not score confidence
- it may ignore useful score differences inside a ranked list
- it does not learn from user feedback by itself
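The first two limitations can be seen directly in a toy case. The sketch below uses invented similarity scores from a single retriever: one list has a clear winner, the other is a near three-way tie, yet both produce identical RRF contributions because only the order survives. The function name `contributions` is hypothetical.

```python
K = 60


def contributions(scores, k=K):
    """RRF contribution of each document from one scored list.

    Scores are used only to sort; their magnitudes are then discarded.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {doc: 1.0 / (k + rank) for rank, doc in enumerate(ordered, start=1)}


# Made-up scores for two different queries against the same retriever.
confident = {"doc_a": 0.97, "doc_b": 0.31, "doc_c": 0.30}  # clear winner
uncertain = {"doc_a": 0.52, "doc_b": 0.51, "doc_c": 0.50}  # near tie

print(contributions(confident))
print(contributions(uncertain))
print(contributions(confident) == contributions(uncertain))
```

If the size of those gaps matters for an application, that information has to come from somewhere else, for example a later re-ranking stage.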
Short conclusion
RRF merges ranked lists by summing 1/(k + rank) per retriever and sorting; no shared score scale is required. For hybrid BM25 plus embedding search, that makes it a common, low-friction baseline before you decide whether to add a later re-ranking stage.
Read in series order, RRF is the bridge between retrieval and precision ranking: it improves the first-stage candidate set, while the next article on re-ranking explains how to refine that set even further.