(Everything you need to know about RAG)

What is RAG

RAG, or Retrieval-Augmented Generation, is a method for reducing LLM hallucinations. When an LLM is asked about data it has not seen, RAG supplies relevant retrieved information so the answer is grounded in it.

High-level working

Query classification: Deciding the necessity of retrieval.
Retrieval: Obtaining relevant docs for the query.
Reranking: Reranking the docs based on their relevance to the query.
Repacking: Organizing the retrieved docs into a single structured doc for better generation.
Summarization: Extracting key info for response generation from the repacked doc.
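
The five steps above can be sketched as a chain of small functions. This is only a toy illustration: the corpus, the token-overlap scoring, the "ends with a question mark" rule, and every function name are hypothetical stand-ins for the real classifiers, retrievers, rerankers, and summarizers that the paper compares.

```python
from typing import List, Set

# Hypothetical toy corpus standing in for a real document store.
CORPUS = [
    "RAG retrieves documents and passes them to the language model.",
    "Reranking orders retrieved documents by relevance to the query.",
    "Bananas are rich in potassium.",
]

def tokens(text: str) -> Set[str]:
    """Lowercase word set with basic punctuation stripped."""
    return {w.strip(".,?!").lower() for w in text.split()}

def overlap(query: str, doc: str) -> int:
    """Toy relevance score: number of shared tokens."""
    return len(tokens(query) & tokens(doc))

def needs_retrieval(query: str) -> bool:
    # Query classification (toy rule): only questions trigger retrieval.
    return query.strip().endswith("?")

def retrieve(query: str, k: int = 2) -> List[str]:
    # Retrieval: keep up to k docs that share a token with the query, in corpus order.
    return [d for d in CORPUS if overlap(query, d) > 0][:k]

def rerank(query: str, docs: List[str]) -> List[str]:
    # Reranking: most relevant doc first.
    return sorted(docs, key=lambda d: overlap(query, d), reverse=True)

def repack(docs: List[str]) -> str:
    # Repacking: join the docs into one structured context.
    return "\n".join(docs)

def summarize(packed: str, max_chars: int = 200) -> str:
    # Summarization (toy): truncate instead of running a real summarizer.
    return packed[:max_chars]

def build_context(query: str) -> str:
    if not needs_retrieval(query):
        return ""  # the LLM would answer without retrieved context
    return summarize(repack(rerank(query, retrieve(query))))

print(build_context("How does reranking order documents?"))
```

In a real system, the string returned by build_context would be placed in the LLM prompt together with the query.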

RAG workflow diagram

Scope of the research paper

Figure out the best combination of methods in each step of the RAG workflow.
Introduce a framework for evaluating RAG systems, based on a formulated dataset, to assess a RAG model's capabilities both in general and in a specialized domain.
Show that multimodal retrieval techniques can improve QA capabilities on visual inputs.

Best methods for each step in RAG

#	RAG Modules	Best Performing Methods (RAG score, Latency)	Best Efficiency (RAG score, Latency)
1	Retrieval	Hybrid with HyDE (0.58, 11.71)	Hybrid (0.498, 1.45)
2	Reranking	monoT5 (0.58, 11.71)	TILDEv2 (0.536, 11.26)
3	Repacking	Reverse (0.56, 11.70)	Forward (0.542, 11.68)
4	Summarization	Recomp (0.56, 11.70)	Recomp (0.56, 11.70)
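
To make two of the table entries concrete, here is a minimal sketch of hybrid retrieval scoring and of the two repacking orders. It rests on assumptions not stated above: that "Hybrid" means a weighted sum of a sparse (keyword) score and a dense (embedding) score, that "Forward" places docs in descending relevance order, and that "Reverse" places them in ascending order so the most relevant doc sits closest to the query. The weight alpha, the min-max normalization, and the example scores are all illustrative.

```python
from typing import Dict, List

def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so sparse and dense scores are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def hybrid_scores(sparse: Dict[str, float], dense: Dict[str, float],
                  alpha: float = 0.3) -> Dict[str, float]:
    # Assumed combination: alpha * sparse + dense, after normalization.
    s, d = min_max(sparse), min_max(dense)
    return {doc: alpha * s[doc] + d[doc] for doc in s}

def repack(scores: Dict[str, float], order: str = "reverse") -> List[str]:
    ranked = sorted(scores, key=scores.get, reverse=True)  # most relevant first
    # "forward": descending relevance; "reverse": ascending relevance.
    return ranked if order == "forward" else ranked[::-1]

# Made-up scores for three docs, for illustration only.
sparse = {"a": 12.0, "b": 4.0, "c": 8.0}
dense = {"a": 0.2, "b": 0.9, "c": 0.6}

combined = hybrid_scores(sparse, dense)
print(repack(combined, "forward"))
print(repack(combined, "reverse"))
```

With these made-up scores, doc "b" wins on the dense score and ends up first in the forward order and last in the reverse order.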

Multimodal extension of RAG

The paper introduces "retrieval as generation": it adds Text2Image and Image2Text retrieval capabilities to the RAG system, using a substantial collection of paired images and textual descriptions as the retrieval source. These techniques can be used to speed up image generation and image captioning.
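
One way to read the Text2Image case is as a cache lookup in front of a generator. The sketch below assumes that a stored image is returned when its caption embedding is similar enough to the query embedding, and that generation runs only otherwise. The two-dimensional vectors, the store contents, the file paths, the threshold, and the generate callback are all hypothetical; a real system would use embeddings from a text and image encoder.

```python
import math
from typing import Callable, List, Sequence, Tuple

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical store of (caption embedding, image path) pairs.
STORE: List[Tuple[List[float], str]] = [
    ([1.0, 0.0], "images/cat.png"),
    ([0.0, 1.0], "images/dog.png"),
]

def text_to_image(query_vec: Sequence[float],
                  generate: Callable[[Sequence[float]], str],
                  threshold: float = 0.9) -> str:
    # Find the stored image whose caption embedding is closest to the query.
    best_vec, best_img = max(STORE, key=lambda item: cosine(query_vec, item[0]))
    if cosine(query_vec, best_vec) >= threshold:
        return best_img          # retrieval replaces generation
    return generate(query_vec)   # fall back to the (slow) generator

def fake_generator(vec: Sequence[float]) -> str:
    # Stand-in for an image generation model.
    return "generated.png"

print(text_to_image([0.99, 0.05], fake_generator))
print(text_to_image([1.0, 1.0], fake_generator))
```

The Image2Text direction would mirror this, with image embeddings as keys and stored captions as values.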