Reciprocal Rank Fusion
Dense (vector) search and keyword (FTS) search each return a ranked list. These lists use different scoring scales — cosine similarity vs ts_rank — so you cannot simply average the scores. Reciprocal Rank Fusion (RRF) is a position-based merging algorithm: it ignores the absolute scores and only cares about where each document appears in each list.
The RRF Formula
For each document d, across all ranked lists:
score(d) = Σ 1 / (rank(d, list) + k + 1)
Where k = 60 (a constant from the original RRF paper). The constant prevents top-ranked items from completely dominating — a document ranked 1st scores 1/62 ≈ 0.016 while one ranked 10th scores 1/71 ≈ 0.014. The difference is meaningful but not overwhelming.
Full rrfMerge() Implementation
HybridRetriever.java — rrfMerge()
View source ↗
private List<RetrievedChunk> rrfMerge(List<Document> dense,
List<DocumentChunk> keyword, int k) {
Map<String, Double> scores = new LinkedHashMap<>();
Map<String, RetrievedChunk> chunkMap = new LinkedHashMap<>();
// Score dense results by rank position
for (int i = 0; i < dense.size(); i++) {
String id = dense.get(i).getId();
scores.merge(id, 1.0 / (i + 1 + RRF_K), Double::sum);
chunkMap.computeIfAbsent(id, _ -> fromAiDocument(dense.get(i)));
}
// Score FTS results by rank position
for (int i = 0; i < keyword.size(); i++) {
String id = keyword.get(i).getQdrantId();
scores.merge(id, 1.0 / (i + 1 + RRF_K), Double::sum);
chunkMap.computeIfAbsent(id, _ -> fromDomainChunk(keyword.get(i)));
}
// Sort by combined RRF score, cap at topK
return scores.entrySet().stream()
.sorted(Map.Entry.<String, Double>comparingByValue().reversed())
.limit(k)
.map(e -> withScore(chunkMap.get(e.getKey()), e.getValue()))
.toList();
}
Implementation Notes
scores.merge(id, value, Double::sum)— adds the RRF contribution for each list. If a chunk appears in both lists, the scores accumulate.chunkMap.computeIfAbsent— stores the first representation encountered. A chunk appearing in both lists is only stored once.RRF_K = 60— the constant from the original paper (Cormack, Clarke, Buettcher 2009).- The final list is capped at
k = topK(typically 10).
Worked Example
| Chunk | Dense rank | FTS rank | Dense RRF contrib. | FTS RRF contrib. | Combined score |
|---|---|---|---|---|---|
| Chunk X | 3rd | 1st | 1/(3+61) = 0.01563 | 1/(1+61) = 0.01613 | 0.03176 |
| Chunk Y | 1st | not found | 1/(1+61) = 0.01613 | 0 | 0.01613 |
| Chunk Z | not found | 2nd | 0 | 1/(2+61) = 0.01587 | 0.01587 |
Chunk X, despite ranking only 3rd in dense search, wins because it is consistently relevant across both methods — a strong signal of genuine relevance. Chunk Y topped the dense list but was not in FTS at all, suggesting it might be a semantic near-miss.
RRF is robust and requires no hyperparameter tuning beyond the fixed k=60. It consistently outperforms simple score averaging in information retrieval benchmarks because rank position is more stable than raw scores across different scoring systems.