In our last discussion, we explored how Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by fetching external information to improve their responses.

Today, let's dive deeper into a key retrieval technique used inside RAG systems: Reciprocal Rank Fusion (RRF).


✨ What is Reciprocal Rank Fusion (RRF)?

Reciprocal Rank Fusion is a simple yet powerful algorithm for combining the ranked search results of multiple queries into a single ranking.

Instead of depending on just a single query to retrieve documents, RRF:

  • Fans out multiple subqueries
  • Retrieves results separately for each
  • Merges them so that higher-ranked results are prioritized across the different searches

🔥 In short: RRF combines multiple search results intelligently so that the most relevant documents rise to the top.


🤔 Why Use RRF?

Sometimes a single query can't capture everything the user needs.

For example, if a user asks:

"What are the challenges of task decomposition?"

Depending on how the query is phrased, a single search might miss some valuable documents!

✅ By generating multiple sub-questions, retrieving documents for each sub-question, and merging the results using RRF, you ensure broader coverage without losing precision.


🛠 How Does RRF Work?

Let's break it down step-by-step:

Stage                  | What Happens
Subquery Generation    | Create multiple focused sub-questions based on the main query.
Parallel Retrieval     | Retrieve relevant documents separately for each sub-question.
Reciprocal Rank Fusion | Merge documents based on their ranks (using the RRF formula).
Final Selection        | Select top-scoring documents for context or generation.
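
To make the four stages concrete, here is a minimal end-to-end sketch. Everything in it is an assumption made for illustration: the tiny corpus is made up, `generate_subqueries` returns hard-coded rewrites (a real system would typically prompt an LLM), and `retrieve` is a toy word-overlap ranker standing in for a real retriever.

```python
from collections import defaultdict

# Toy in-memory corpus (made-up sentences, for illustration only).
CORPUS = {
    "d1": "task decomposition splits a goal into smaller steps",
    "d2": "error propagation is a challenge when steps depend on each other",
    "d3": "planning overhead grows with the number of steps",
    "d4": "vector databases store embeddings for similarity search",
}

def generate_subqueries(query):
    # Stage 1 (stand-in): hard-coded rewrites for this one example query.
    return [
        "task decomposition steps",
        "challenge when steps depend",
        "planning overhead of many steps",
    ]

def retrieve(subquery, top_n=3):
    # Stage 2 (stand-in): rank documents by how many words they share
    # with the sub-query; ties are broken by document id.
    words = set(subquery.lower().split())
    scored = [(len(words & set(text.split())), doc_id)
              for doc_id, text in CORPUS.items()]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [doc_id for _, doc_id in ranked[:top_n]]

def fuse(ranked_lists, k=60):
    # Stage 3: sum 1 / (k + rank) over every list a document appears in.
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] += 1 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

def rrf_pipeline(query, final_n=2):
    ranked_lists = [retrieve(sq) for sq in generate_subqueries(query)]
    # Stage 4: keep only the top-scoring documents.
    return fuse(ranked_lists)[:final_n]

top_docs = rrf_pipeline("What are the challenges of task decomposition?")
print(top_docs)
```

Each stage is a separate function on purpose, so the toy retriever or the hard-coded sub-queries can be swapped for real components without touching the fusion step.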

🧮 The RRF Formula

Each document's RRF score is computed as:

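RRF_score(d) = ∑ 1 / (k + rank(d)), summed over every retrieval list that contains document d
  • rank = Position of the document in a retrieval list (starting from 0, to match the code in this article; many formulations count from 1 instead, which shifts every denominator by one)
  • k = A smoothing constant (commonly 60) that flattens the score gap between neighboring ranks, so that appearing in several lists can outweigh a single top placement.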

⚡ Thus, documents that consistently rank higher across different queries will be scored higher and surfaced earlier!
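
A quick worked example may help here. The sketch below scores two hypothetical documents using 0-based ranks: document A is first in one list only, while document B is fifth (rank 4) in two lists. The helper name `rrf_score` and the ranks are made up for illustration.

```python
def rrf_score(ranks, k=60):
    """Sum 1 / (k + rank) over the 0-based ranks a document received."""
    return sum(1 / (k + rank) for rank in ranks)

# Hypothetical ranks, chosen only to illustrate the effect of k.
once_at_top = [0]     # document A: first place in a single list
twice_mid = [4, 4]    # document B: fifth place in two lists

for k in (60, 1):
    a = rrf_score(once_at_top, k)
    b = rrf_score(twice_mid, k)
    winner = "A" if a > b else "B"
    print(f"k={k}: A={a:.4f}, B={b:.4f}, winner={winner}")
```

With k = 60 the scores are 1/60 for A and 2/64 for B, so the document found by two queries wins. With k = 1 they are 1/1 and 2/5, so the single first place wins instead. That is the trade-off the constant controls.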


🔥 Flowchart for Visual Understanding

[Figure: Flow chart of Reciprocal Rank Fusion]


🧩 Simple Code Snippet for Learning

Here’s a mini Python function showing how RRF merging happens:

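def reciprocal_rank_fusion(subquestions, k=60):
    """Fuse the retrieval results of several sub-questions with RRF.

    Assumes retrieve_chunks(question) is defined elsewhere and returns a
    ranked list of documents exposing .metadata and .page_content.
    """
    all_ranked_results = []

    # Retrieve a ranked list of chunks for each sub-question
    for subq in subquestions:
        chunks = retrieve_chunks(subq["question"])
        all_ranked_results.append(chunks)

    score_dict = {}

    # Apply RRF scoring: every appearance adds 1 / (k + rank), rank counted from 0
    for chunks in all_ranked_results:
        for rank, doc in enumerate(chunks):
            # Treat chunks with the same page and text as the same document
            key = (doc.metadata.get("page"), doc.page_content.strip())
            if key not in score_dict:
                score_dict[key] = {"doc": doc, "score": 0}
            score_dict[key]["score"] += 1 / (k + rank)

    # Sort by fused score, highest first
    fused_docs = sorted(score_dict.values(), key=lambda x: x["score"], reverse=True)

    return [entry["doc"] for entry in fused_docs]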

📌 Highlights:

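  • Retrieval runs once per subquery. The snippet does this in a simple loop, but the calls are independent, so they can run in parallel.
  • Each document's reciprocal-rank contributions are summed across lists, so a chunk found by several subqueries accumulates a higher score.
  • The fused documents are returned sorted by score, highest first.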
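
The snippet above retrieves one sub-question at a time. If your retriever is I/O-bound (a vector store or a search API, for example), a thread pool is one way to run the retrievals concurrently. This is only a sketch: `fake_retrieve` and `retrieve_all` are hypothetical names, and the fake retriever stands in for whatever `retrieve_chunks` does in your setup.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_retrieve(question):
    # Hypothetical stand-in for a real retriever call.
    return [f"{question} :: result {i}" for i in range(3)]

def retrieve_all(subquestions, retriever=fake_retrieve, max_workers=4):
    """Run one retrieval per sub-question in a thread pool."""
    questions = [subq["question"] for subq in subquestions]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so list i belongs to sub-question i.
        return list(pool.map(retriever, questions))

ranked_lists = retrieve_all([
    {"question": "What is task decomposition?"},
    {"question": "Why is task decomposition hard?"},
])
print(len(ranked_lists), "ranked lists retrieved")
```

The output has the same shape as `all_ranked_results` in the function above, so the scoring loop can consume it unchanged.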

🎯 Conclusion

Reciprocal Rank Fusion (RRF) is a powerful technique that broadens context intelligently without introducing too much irrelevant information.

It ensures:

  • Broader and more accurate retrieval
  • Preference for documents that rank well across several subqueries
  • Better context for LLMs to generate superior outputs

🌟 What's Next?

Stay tuned!

In future articles, we'll cover more advanced retrieval techniques like:

🔹 Step-Back Prompting
🔹 Chain-of-Thought Retrieval
🔹 Hybrid Search
🔹 Hypothetical Document Embeddings (HyDE)


🚀 Stay Curious. Keep Building!