The standard retrieval approach involves the following steps:
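A minimal sketch of this standard pipeline: split documents into chunks, embed each chunk, then rank chunks by similarity to the embedded query. The `embed` and `cosine` helpers here are illustrative assumptions (a toy bag-of-words vector stands in for a real embedding model), not any particular library's API.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy "embedding": a term-count vector keyed by word.
    # Assumption for illustration; real systems use a trained embedding model.
    return Counter(re.findall(r"[a-z0-9%]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1. Split documents into chunks and embed each chunk into an index.
chunks = [
    "The company's revenue grew by 3% over the previous quarter.",
    "Our new office opened in Austin last spring.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# 2. Embed the query and rank stored chunks by similarity.
query = embed("revenue growth last quarter")
ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
print(ranked[0][0])
```

A production system would persist the embeddings in a vector store rather than a Python list, but the retrieve-by-similarity step is the same.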
While embedding models excel at capturing semantic relationships, they can miss exact keyword matches. BM25 (Best Matching 25) is a ranking function that finds precise word or phrase matches in documents. It builds on TF-IDF (Term Frequency-Inverse Document Frequency), which measures how important a word is to a document while down-weighting words that are common across the corpus.
Retrieval with BM25:
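A self-contained sketch of BM25 retrieval, written from the standard BM25 scoring formula for illustration (production systems typically use a tuned library or search-engine implementation; the `bm25_scores` helper and the whitespace tokenization are assumptions of this sketch):

```python
import math
from collections import Counter

def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each document against the query with the BM25 formula."""
    tokenized = [doc.lower().split() for doc in documents]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    # Document frequency: how many documents contain each term.
    df = Counter()
    for doc in tokenized:
        df.update(set(doc))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # IDF: rare terms contribute more than common ones.
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Term-frequency saturation with length normalization.
            num = tf[term] * (k1 + 1)
            den = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * num / den
        scores.append(score)
    return scores

docs = [
    "revenue grew by 3% over the previous quarter",
    "the company announced a new product line",
    "quarterly revenue and profit figures were released",
]
print(bm25_scores("revenue growth quarter", docs))
```

Note that BM25 only matches exact tokens: "quarterly" and "grew" do not match "quarter" and "growth" here, which is precisely the gap embedding-based retrieval fills, and why the two are often combined.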

Because chunk size and overlap are limited, traditional RAG solutions lose the context that connects individual chunks when encoding information, which often leaves the system unable to retrieve the relevant information from the knowledge base.
For example, imagine you had a collection of financial information (say, U.S. SEC filings) embedded in your knowledge base, and you received the following question: "What was the revenue growth for ACME Corp in Q2 2023?"
A relevant chunk might contain the text: "The company's revenue grew by 3% over the previous quarter." However, this chunk on its own doesn't specify which company it's referring to or the relevant time period, making it difficult to retrieve the right information or use it effectively.
Contextual Retrieval solves this problem by prepending chunk-specific explanatory context (typically 50-100 tokens) to each chunk before encoding it.
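A sketch of that preprocessing step. Here `call_llm` is a stub standing in for a real LLM API call, and the prompt wording and the `contextualize` helper are illustrative assumptions, not a fixed specification:

```python
# Prompt template asking an LLM to situate a chunk within its source document.
CONTEXT_PROMPT = """\
<document>
{document}
</document>
Here is the chunk we want to situate within the whole document:
<chunk>
{chunk}
</chunk>
Give a short, succinct context to situate this chunk within the overall
document, to improve search retrieval of the chunk.
Answer only with the context."""

def call_llm(prompt):
    # Stub: a real implementation would call an LLM API here.
    # The returned string below is a hard-coded example response.
    return "This chunk is from ACME Corp's SEC filing for Q2 2023."

def contextualize(document, chunk):
    """Prepend LLM-generated, chunk-specific context before embedding/indexing."""
    context = call_llm(CONTEXT_PROMPT.format(document=document, chunk=chunk))
    return f"{context}\n\n{chunk}"

chunk = "The company's revenue grew by 3% over the previous quarter."
print(contextualize("...full filing text...", chunk))
```

The contextualized chunk, not the raw one, is what gets embedded (and BM25-indexed), so a query like "ACME Corp revenue Q2 2023" can now match it.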
