Retrieval-Augmented Generation (RAG) Project

If this project helps you, consider buying me a coffee ☕. Your support helps me keep contributing to the open-source community!

bRAGAI's official platform will launch soon. Join the waitlist to be one of the early adopters!

This repository contains a comprehensive exploration of Retrieval-Augmented Generation (RAG) for various applications. Each notebook provides a detailed, hands-on guide to setting up and experimenting with RAG from an introductory level to advanced implementations, including multi-querying and custom RAG builds.

Project Structure

If you want to jump straight in, open full_basic_rag.ipynb. It provides boilerplate starter code for a fully customizable RAG chatbot.

Make sure to run the notebooks in a virtual environment (see the Getting Started section).

The following notebooks can be found under the directory tutorial_notebooks/.

[1]_rag_setup_overview.ipynb

This introductory notebook provides an overview of RAG architecture and its foundational setup. The notebook walks through:

Environment Setup: Configuring the environment, installing necessary libraries, and API setups.
Initial Data Loading: Basic document loaders and data preprocessing methods.
Embedding Generation: Generating embeddings using various models, including OpenAI's embeddings.
Vector Store: Setting up a vector store (ChromaDB/Pinecone) for efficient similarity search.
Basic RAG Pipeline: Creating a simple retrieval and generation pipeline to serve as a baseline.
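
The retrieve-then-generate flow listed above can be sketched without any external service. This is a minimal illustration, not the notebook's code: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector store such as ChromaDB or Pinecone, and `build_prompt` only assembles the text that would be sent to an LLM. All function names here are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding (assumption): bag-of-words counts instead of a real model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, docs, k=2):
    """Return the k documents most similar to the question."""
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(question, context):
    """Assemble the prompt that a generation step would receive."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"

DOCS = [
    "Chroma is a vector store for similarity search",
    "Pinecone is a managed vector database",
    "Bananas are rich in potassium",
]
question = "which vector store supports similarity search"
top = retrieve(question, DOCS, k=1)
prompt = build_prompt(question, top)
```

In a real pipeline the prompt would be passed to a chat model. Keeping retrieval and prompt assembly as separate functions makes it easy to swap either part when comparing against later notebooks.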

[2]_rag_with_multi_query.ipynb

Building on the basics, this notebook introduces multi-querying techniques in the RAG pipeline, exploring:

Multi-Query Setup: Configuring multiple queries to diversify retrieval.
Advanced Embedding Techniques: Utilizing multiple embedding models to refine retrieval.
Pipeline with Multi-Querying: Implementing multi-query handling to improve relevance in response generation.
Comparison & Analysis: Comparing results with single-query pipelines and analyzing performance improvements.
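
As a rough sketch of the multi-query idea: several rephrasings of one question are each sent to the retriever, and the results are merged into a de-duplicated list. The query variants below are hard-coded for illustration (in practice an LLM would generate them), and `keyword_retriever` is a hypothetical stand-in for a vector-store retriever.

```python
def keyword_retriever(query, docs, k=2):
    """Hypothetical retriever: rank docs by the number of words shared with the query."""
    q = set(query.lower().split())
    scored = [(len(q & set(d.lower().split())), d) for d in docs]
    scored = [(s, d) for s, d in scored if s > 0]      # drop docs with no overlap
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]

def multi_query_retrieve(queries, docs, k=2):
    """Run every query variant and return the unique union, in first-seen order."""
    seen = set()
    merged = []
    for q in queries:
        for d in keyword_retriever(q, docs, k):
            if d not in seen:
                seen.add(d)
                merged.append(d)
    return merged

DOCS = [
    "task decomposition splits a goal into steps",
    "agents plan by breaking problems into subgoals",
    "vector stores index embeddings",
]
variants = [
    "what is task decomposition",
    "how do agents plan subgoals",
]
merged = multi_query_retrieve(variants, DOCS, k=1)
```

Here each variant surfaces a different document, so the merged list covers more of the corpus than either query would alone.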

[3]_rag_routing_and_query_construction.ipynb

This notebook delves deeper into customizing a RAG pipeline. It covers:

Logical Routing: Implements function-based routing that classifies user queries and sends each one to the appropriate data source, based on the programming language involved.
Semantic Routing: Uses embeddings and cosine similarity to direct questions to either a math or physics prompt, optimizing response accuracy.
Query Structuring for Metadata Filters: Defines structured search schema for YouTube tutorial metadata, enabling advanced filtering (e.g., by view count, publication date).
Structured Search Prompting: Leverages LLM prompts to generate database queries for retrieving relevant content based on user input.
Integration with Vector Stores: Links structured queries to vector stores for efficient data retrieval.
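
Semantic routing, as described above, can be illustrated with a small sketch: the question is embedded, compared against one description per route by cosine similarity, and the closest route's prompt is selected. The `ROUTES` table, its descriptions, and the toy `embed` function are assumptions made for this example; a real setup would use an embedding model.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding (assumption): counts of lowercase alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical routes: a short description used for matching, plus a prompt template.
ROUTES = {
    "math": {
        "description": "algebra calculus equations integrals geometry",
        "prompt": "You are a mathematician. Question: {query}",
    },
    "physics": {
        "description": "forces energy gravity black hole quantum",
        "prompt": "You are a physicist. Question: {query}",
    },
}

def route(query):
    """Pick the route whose description is most similar to the query."""
    q = embed(query)
    best = max(ROUTES, key=lambda name: cosine(q, embed(ROUTES[name]["description"])))
    return best, ROUTES[best]["prompt"].format(query=query)

name, prompt = route("What is a black hole?")
```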

[4]_rag_indexing_and_advanced_retrieval.ipynb

Continuing from the previous customization, this notebook explores:

Preface on Document Chunking: Points to external resources for document chunking techniques.
Multi-representation Indexing: Sets up a multi-vector indexing structure for handling documents with different embeddings and representations.
In-Memory Storage for Summaries: Uses InMemoryByteStore for storing document summaries alongside parent documents, enabling efficient retrieval.
MultiVectorRetriever Setup: Integrates multiple vector representations to retrieve relevant documents based on user queries.
RAPTOR Implementation: Explores RAPTOR, a tree-based indexing and retrieval technique that recursively clusters and summarizes documents, linking to in-depth resources.
ColBERT Integration: Demonstrates ColBERT-based token-level vector indexing and retrieval, which captures contextual meaning at a fine-grained level.
Wikipedia Example with ColBERT: Retrieves information about Hayao Miyazaki using the ColBERT retrieval model for demonstration.
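
The multi-representation idea above (search over short summaries, return the full parent documents) can be sketched as follows. The class and its word-overlap scoring are hypothetical and only illustrate the data layout; they are not the LangChain MultiVectorRetriever or InMemoryByteStore APIs.

```python
import uuid

def overlap(a, b):
    """Toy relevance score (assumption): number of shared lowercase words."""
    return len(set(a.lower().split()) & set(b.lower().split()))

class MultiRepresentationIndex:
    """Searches summaries, but returns the parent documents they point to."""

    def __init__(self):
        self.summaries = []   # (summary_text, doc_id): the searchable representation
        self.docstore = {}    # doc_id -> full parent document

    def add(self, document, summary):
        """Store the full document and index its summary under a shared id."""
        doc_id = str(uuid.uuid4())
        self.docstore[doc_id] = document
        self.summaries.append((summary, doc_id))
        return doc_id

    def retrieve(self, query, k=1):
        """Rank summaries against the query, then look up the parent documents."""
        ranked = sorted(self.summaries, key=lambda s: overlap(query, s[0]), reverse=True)
        return [self.docstore[doc_id] for _, doc_id in ranked[:k]]

index = MultiRepresentationIndex()
index.add("Full text of a long article about document chunking strategies.",
          "chunking strategies overview")
index.add("Full text of a long article about reranking retrieved passages.",
          "reranking passages overview")
hits = index.retrieve("how does reranking work", k=1)
```

The shared id is the design point: the compact summary is what gets matched, while the generator still receives the complete document.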

[5]_rag_retrieval_and_reranking.ipynb

This final notebook brings together the RAG system components, with a focus on scalability and optimization:

Document Loading and Splitting: Loads and chunks documents for indexing, preparing them for vector storage.
Multi-query Generation with RAG-Fusion: Uses a prompt-based approach to generate multiple search queries from a single input question.
Reciprocal Rank Fusion (RRF): Implements RRF for re-ranking multiple retrieval lists, merging results for improved relevance.
Retriever and RAG Chain Setup: Constructs a retrieval chain for answering queries, using fused rankings and RAG chains to pull contextually relevant information.
Cohere Re-Ranking: Demonstrates re-ranking with Cohere’s model for additional contextual compression and refinement.
CRAG and Self-RAG Retrieval: Explores advanced retrieval approaches like CRAG and Self-RAG, with links to examples.
Exploration of Long-Context Impact: Links to resources explaining the impact of long-context retrieval on RAG models.
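
Reciprocal Rank Fusion from the list above can be sketched in a few lines. Each document receives 1 / (k + rank) from every ranked list it appears in, with ranks starting at 1, and the scores are summed. The constant k = 60 is the value commonly used in the RRF literature; the function name and sample lists are illustrative and not taken from the notebook.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists into one list of (doc, score), best first."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)   # higher-ranked docs contribute more
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Three result lists, e.g. one per generated query variant.
results = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c"],
]
fused = reciprocal_rank_fusion(results)
order = [doc for doc, _ in fused]
```

Document "b" ends up first because it ranks at or near the top of all three lists, even though "a" leads one of them.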

Getting Started

Prerequisites: Python 3.11.7 (preferred)

Clone the repository:

git clone https://github.com/bRAGAI/bRAG-langchain.git 

cd bRAG-langchain

Create a virtual environment:

python -m venv venv

source venv/bin/activate

Install dependencies: Install the required packages listed in requirements.txt.

pip install -r requirements.txt

Run the Notebooks: Begin with [1]_rag_setup_overview.ipynb to get familiar with the setup process, then proceed sequentially through the other notebooks to build and experiment with more advanced RAG concepts.

Set Up Environment Variables:

Duplicate the .env.example file in the root directory, name the copy .env, and fill in the following keys (replace the placeholders with your actual values):

#LLM Model
OPENAI_API_KEY="your-api-key"

#LangSmith
LANGCHAIN_TRACING_V2=true
LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
LANGCHAIN_API_KEY="your-api-key"
LANGCHAIN_PROJECT="your-project-name"

#Pinecone Vector Database
PINECONE_INDEX_NAME="your-project-index"
PINECONE_API_HOST="your-host-url"
PINECONE_API_KEY="your-api-key"
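
Before running a notebook, it can help to confirm that the keys above are visible to Python. The sketch below assumes the values have already been loaded into the process environment (for example with python-dotenv's load_dotenv, which is an assumption about how the notebooks read the .env file). The `REQUIRED_KEYS` list and the `missing_keys` helper are hypothetical; trim the list if you do not use Pinecone or LangSmith.

```python
import os

# Assumption: these are the keys a given notebook needs; adjust as required.
REQUIRED_KEYS = ["OPENAI_API_KEY", "LANGCHAIN_API_KEY", "PINECONE_API_KEY"]

def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required keys that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [key for key in required if not env.get(key)]

# Example with an explicit mapping instead of the real environment.
example = missing_keys({"OPENAI_API_KEY": "your-api-key"})
```

Calling `missing_keys()` with no arguments checks the real environment, and an empty list means every required key is set.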

Notebook Order: To follow the project in a structured manner:
- Start with [1]_rag_setup_overview.ipynb
- Proceed with [2]_rag_with_multi_query.ipynb
- Then go through [3]_rag_routing_and_query_construction.ipynb
- Continue with [4]_rag_indexing_and_advanced_retrieval.ipynb
- Finish with [5]_rag_retrieval_and_reranking.ipynb

Usage

After setting up the environment and running the notebooks in sequence, you can:

Experiment with Retrieval-Augmented Generation: Use the foundational setup in [1]_rag_setup_overview.ipynb to understand the basics of RAG.
Implement Multi-Querying: Learn how to improve response relevance by introducing multi-querying techniques in [2]_rag_with_multi_query.ipynb.

Incoming Notebooks (work in progress)

Context Precision with RAGAS + LangSmith
- Guide on using RAGAS and LangSmith to evaluate context precision, relevance, and response accuracy in RAG.

Deploying a RAG Application
- Guide on how to deploy your RAG application.

The notebooks and visual diagrams were inspired by Lance Martin's LangChain Tutorial.
