Code repository for the EMNLP 2024 paper *EfficientRAG: Efficient Retriever for Multi-Hop Question Answering*.

EfficientRAG is a framework that trains a Labeler and a Filter to conduct multi-hop RAG iteratively, without calling an LLM at each retrieval round.
- 2024-09-12: open-sourced the code
Install PyTorch >= 2.1.0 first, then install the dependent Python libraries:

```bash
pip install -r requirements.txt
```
You can also create a conda environment with Python >= 3.9:

```bash
conda create -n <ENV_NAME> python=3.9 pip
conda activate <ENV_NAME>
pip install -r requirements.txt
```
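To confirm the environment is set up correctly, a quick import check (a minimal sanity test, nothing repo-specific):

```bash
# Verify that PyTorch is installed and meets the >= 2.1.0 requirement
python -c "import torch; print(torch.__version__)"
```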
- Download the datasets HotpotQA, 2WikiMQA, and MuSiQue. Split each of them into train, dev, and test sets, and put them under `data/dataset`.
- Download the retriever model Contriever and the base model DeBERTa, and put them under `model_cache` (a download sketch follows this list).
- Prepare the corpus by extracting the documents and constructing the embeddings (a quick sanity check on the resulting corpus file follows this list):

  ```bash
  python src/retrievers/multihop_data_extractor.py --dataset hotpotQA
  python src/retrievers/passage_embedder.py \
      --passages data/corpus/hotpotQA/corpus.jsonl \
      --output_dir data/corpus/hotpotQA/contriever \
      --model_type contriever
  ```

- Deploy LLaMA-3-70B-Instruct with the vLLM framework, and configure it in `src/language_models/llama.py` (a deployment sketch follows this list).
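A minimal download sketch for the model-download step, assuming the Hugging Face checkpoints `facebook/contriever` and `microsoft/deberta-v3-large` and the target folder names; swap in whichever Contriever/DeBERTa checkpoints your setup expects:

```bash
# Repo ids and target folders are assumptions; adjust to your setup
huggingface-cli download facebook/contriever --local-dir model_cache/contriever
huggingface-cli download microsoft/deberta-v3-large --local-dir model_cache/deberta-v3-large
```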
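After building the corpus, you can sanity-check that `corpus.jsonl` contains one JSON object per line (the exact field names depend on what the extractor emits):

```bash
# Pretty-print the first corpus record to inspect its fields
head -n 1 data/corpus/hotpotQA/corpus.jsonl | python -m json.tool
```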
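For the deployment step, a minimal vLLM sketch; the model id, GPU count, and port are assumptions, and `src/language_models/llama.py` should be pointed at whatever endpoint you expose:

```bash
# Serve LLaMA-3-70B-Instruct behind vLLM's OpenAI-compatible API
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Meta-Llama-3-70B-Instruct \
    --tensor-parallel-size 8 \
    --port 8000
```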
We will use the hotpotQA training set as an example; 2WikiMQA and MuSiQue can be constructed in the same way (a loop sketch follows the commands below).
```bash
# 1. Decompose each multi-hop question into single-hop sub-queries
python src/data_synthesize/query_decompose.py \
    --dataset hotpotQA \
    --split train \
    --model llama3

# 2. Label the supporting tokens in the retrieved documents
python src/data_synthesize/token_labeling.py \
    --dataset hotpotQA \
    --split train \
    --model llama3

# 3. Extract the labeled tokens from the LLM output
python src/data_synthesize/token_extraction.py \
    --data_path data/synthesized_token_labeling/hotpotQA/train.jsonl \
    --save_path data/token_extracted/hotpotQA/train.jsonl \
    --verbose

# 4. Construct the next-hop queries
python src/data_synthesize/next_hop_query_construction.py \
    --dataset hotpotQA \
    --split train \
    --model llama

# 5. Filter the synthesized next-hop queries
python src/data_synthesize/next_hop_query_filtering.py \
    --data_path data/synthesized_next_query/hotpotQA/train.jsonl \
    --save_path data/next_query_extracted/hotpotQA/train.jsonl \
    --verbose

# 6. Sample negative passages with the retriever
python src/data_synthesize/negative_sampling.py \
    --dataset hotpotQA \
    --split train \
    --retriever contriever

# 7. Label the sampled negatives
python src/data_synthesize/negative_sampling_labeled.py \
    --dataset hotpotQA \
    --split train \
    --model llama

# 8. Extract tokens for the negative samples
python src/data_synthesize/negative_token_extraction.py \
    --dataset hotpotQA \
    --split train \
    --verbose

# 9. Assemble the final training data
python src/data_synthesize/training_data_synthesize.py \
    --dataset hotpotQA \
    --split train
```
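To build the other datasets, the same pipeline can be wrapped in a loop. This is an untested convenience sketch that simply chains the commands above; the `--dataset` values for 2WikiMQA and MuSiQue are assumptions and should match the names your data folders use:

```bash
# Run the full synthesis pipeline for each dataset (names are assumptions)
for DATASET in hotpotQA 2WikiMQA musique; do
    python src/data_synthesize/query_decompose.py --dataset "$DATASET" --split train --model llama3
    python src/data_synthesize/token_labeling.py --dataset "$DATASET" --split train --model llama3
    python src/data_synthesize/token_extraction.py \
        --data_path "data/synthesized_token_labeling/$DATASET/train.jsonl" \
        --save_path "data/token_extracted/$DATASET/train.jsonl" --verbose
    python src/data_synthesize/next_hop_query_construction.py --dataset "$DATASET" --split train --model llama
    python src/data_synthesize/next_hop_query_filtering.py \
        --data_path "data/synthesized_next_query/$DATASET/train.jsonl" \
        --save_path "data/next_query_extracted/$DATASET/train.jsonl" --verbose
    python src/data_synthesize/negative_sampling.py --dataset "$DATASET" --split train --retriever contriever
    python src/data_synthesize/negative_sampling_labeled.py --dataset "$DATASET" --split train --model llama
    python src/data_synthesize/negative_token_extraction.py --dataset "$DATASET" --split train --verbose
    python src/data_synthesize/training_data_synthesize.py --dataset "$DATASET" --split train
done
```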
Train the Filter model:

```bash
python src/efficient_rag/filter_training.py \
    --dataset hotpotQA \
    --save_path saved_models/filter
```
Train the Labeler model:

```bash
python src/efficient_rag/labeler_training.py \
    --dataset hotpotQA \
    --tags 2
```
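The checkpoints produced by these two runs are what the retrieval step below expects via `--labeler_ckpt` and `--filter_ckpt`. The exact checkpoint layout under `--save_path` is an assumption (and the Labeler's default save location is not shown above), but you can inspect it with, e.g.:

```bash
# List the saved Filter checkpoints before passing one to --filter_ckpt
ls saved_models/filter
```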
Run the EfficientRAG retrieval procedure:

```bash
python src/efficientrag_retrieve.py \
    --dataset hotpotQA \
    --retriever contriever \
    --labels 2 \
    --labeler_ckpt <<PATH_TO_LABELER_CKPT>> \
    --filter_ckpt <<PATH_TO_FILTER_CKPT>> \
    --topk 10
```
Use LLaMA-3-8B-Instruct as the generator for final question answering:

```bash
python src/efficientrag_qa.py \
    --fpath <<MODEL_INFERENCE_RESULT>> \
    --model llama-8B \
    --dataset hotpotQA
```
If you find this paper or code useful, please cite:

```bibtex
@inproceedings{zhuang2024efficientrag,
    title={EfficientRAG: Efficient Retriever for Multi-Hop Question Answering},
    author={Zhuang, Ziyuan and Zhang, Zhiyang and Cheng, Sitao and Yang, Fangkai and Liu, Jia and Huang, Shujian and Lin, Qingwei and Rajmohan, Saravan and Zhang, Dongmei and Zhang, Qi},
    booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing},
    pages={3392--3411},
    year={2024}
}
```