(Everything you need to know about RAG)

[Paper Link]

What is RAG

RAG, or Retrieval-Augmented Generation, is a method for reducing hallucination in LLMs. When an LLM is asked about data it was never trained on, RAG retrieves relevant information and supplies it to the model along with the query.

High-level working

  1. Query classification: Deciding whether the query needs retrieval at all.
  2. Retrieval: Obtaining relevant docs for the query.
  3. Reranking: Reordering the retrieved docs by their relevance to the query.
  4. Repacking: Organizing the reranked docs into a single structured doc for better generation.
  5. Summarization: Extracting the key info for response generation from the repacked doc.
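The five steps above can be sketched as a toy pipeline. This is a minimal illustration, not the paper's implementation: every function name, the word-overlap scoring, and the "questions need retrieval" rule are assumptions made up for the example, standing in for the trained classifiers, retrievers, and rerankers a real system would use.

```python
import re
from dataclasses import dataclass


@dataclass
class Doc:
    text: str
    score: float = 0.0


def _tokens(text: str) -> set:
    """Lowercased word set; a toy stand-in for a real embedding."""
    return set(re.findall(r"\w+", text.lower()))


def classify(query: str) -> bool:
    """Step 1 (hypothetical rule): only questions trigger retrieval."""
    return query.strip().endswith("?")


def retrieve(query: str, corpus: list, k: int = 3) -> list:
    """Step 2: keep up to k docs that share at least one word with the query."""
    q = _tokens(query)
    return [Doc(t) for t in corpus if q & _tokens(t)][:k]


def rerank(query: str, docs: list) -> list:
    """Step 3: score each doc by Jaccard word overlap and sort, best first."""
    q = _tokens(query)
    for d in docs:
        t = _tokens(d.text)
        d.score = len(q & t) / len(q | t)
    return sorted(docs, key=lambda d: d.score, reverse=True)


def repack(docs: list) -> str:
    """Step 4: join the reranked docs into one context, one doc per line."""
    return "\n".join(d.text for d in docs)


def summarize(context: str, max_lines: int = 1) -> str:
    """Step 5: toy summary that keeps only the first max_lines lines."""
    return "\n".join(context.splitlines()[:max_lines])


def build_context(query: str, corpus: list) -> str:
    """Run all five steps and return the context to hand to the LLM."""
    if not classify(query):
        return ""  # no retrieval needed; the LLM answers on its own
    docs = rerank(query, retrieve(query, corpus))
    return summarize(repack(docs))


corpus = [
    "France borders Spain and Italy.",
    "Bananas are yellow.",
    "Paris is the capital of France.",
]
print(build_context("What is the capital of France?", corpus))
```

The string returned by `build_context` would be prepended to the prompt before calling the LLM. Each function can be swapped out independently, which is the modularity the paper relies on when it compares methods step by step.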

RAG workflow diagram

[Figure: RAG workflow diagram (x1.png)]

Scope of the research paper

Best methods for each step in RAG

# | RAG module    | Best performance (RAG score, latency) | Best efficiency (RAG score, latency)
1 | Retrieval     | Hybrid with HyDE (0.58, 11.71)        | Hybrid (0.498, 1.45)
2 | Reranking     | monoT5 (0.58, 11.71)                  | TILDEv2 (0.536, 11.26)
3 | Repacking     | Reverse (0.56, 11.70)                 | Forward (0.542, 11.68)
4 | Summarization | Recomp (0.56, 11.70)                  | Recomp (0.56, 11.70)
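To make the Repacking row concrete, here is a small sketch of the two orderings. It assumes, as I read the paper, that "forward" places docs in descending order of reranking score and "reverse" places them in ascending order, so the most relevant doc ends up last, closest to the query. The `repack` function and its `(text, score)` input format are illustrative choices, not the paper's code.

```python
def repack(docs, order="reverse"):
    """Concatenate (text, relevance_score) pairs in the requested order.

    forward: most relevant doc first (descending score)
    reverse: most relevant doc last (ascending score)
    """
    if order not in ("forward", "reverse"):
        raise ValueError(f"unknown order: {order!r}")
    # sorted() returns a new list, so the caller's list is left untouched.
    ranked = sorted(docs, key=lambda d: d[1], reverse=(order == "forward"))
    return "\n\n".join(text for text, _ in ranked)


docs = [("doc B", 0.5), ("doc A", 0.9), ("doc C", 0.1)]
print(repack(docs, "forward"))  # doc A, then doc B, then doc C
print(repack(docs, "reverse"))  # doc C, then doc B, then doc A
```

Per the table, the two orderings have nearly identical latency, so the choice between them only affects the RAG score.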

Multimodal extension of RAG

The paper extends RAG to multimodal settings by adding Text2Image and Image2Text retrieval, using a substantial collection of paired images and textual descriptions as the retrieval source. It calls this approach "Retrieval as Generation". Text2Image retrieval can speed up image generation when a query closely matches the description of a stored image, and Image2Text retrieval can do the same for image captioning.
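A rough sketch of the Text2Image side of this idea: look up the stored image whose description is closest to the query, and only fall back to a (slow) generator when nothing is close enough. The tiny hand-written vectors, the `threshold` value, and the `generate` callback are all assumptions for illustration; a real system would use a trained text or image encoder and an actual image generation model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def text2image(query_vec, store, generate, threshold=0.9):
    """Return a stored image if its caption is similar enough, else generate.

    store:    list of (caption_vector, image_path) pairs (hypothetical format)
    generate: zero-argument fallback that produces a new image
    """
    best_path, best_sim = None, -1.0
    for caption_vec, path in store:
        sim = cosine(query_vec, caption_vec)
        if sim > best_sim:
            best_path, best_sim = path, sim
    if best_sim >= threshold:
        return best_path  # retrieval as generation: skip the generator
    return generate()     # no close match, so pay the generation cost


store = [([1.0, 0.0], "cat.png"), ([0.0, 1.0], "dog.png")]
print(text2image([0.9, 0.1], store, lambda: "generated.png"))
print(text2image([1.0, 1.0], store, lambda: "generated.png"))
```

The first query is close to the "cat" caption vector, so the stored image is returned directly; the second sits halfway between both captions and falls through to the generator. Image2Text would follow the same pattern with the roles of image and caption swapped.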