Official PyTorch implementation of "ReMoDetect: Reward Models Recognize Aligned LLM's Generations".
- Download the cleaned full-text HC3 dataset in English from YuchuanTian/AIGC_text_detector.git.
- Download the unfilter_full/en_train_cleaned.csv file from the repository linked above and extract it into the ./data directory.
- Configure Environment Variables
  Set your Groq API key in the data_process.sh script.
      export GROQ_API_KEY="your_groq_api_key_here"
- Run Data Processing Script
  Execute the data_process.sh script to process the dataset.
      bash data_process.sh
- Train
  Execute the train.sh script to train the model.
      bash train.sh
  Alternatively, you can download the trained weights from Hugging Face.
- Generate Evaluation Data (Optional)
  If you want to generate additional evaluation data, place your Azure, Anthropic, and Groq API keys in the gen_eval_data.sh script.
      export AZURE_API_KEY="your_azure_api_key_here"
      export ANTHROPIC_API_KEY="your_anthropic_api_key_here"
      export GROQ_API_KEY="your_groq_api_key_here"
  Then, run the script:
      bash gen_eval_data.sh
- Evaluate
  Finally, run the eval.sh script to evaluate the model.
      bash eval.sh
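Before running the steps above, it can help to confirm that the API keys in a script have been replaced and that the dataset file is in place. The following is a minimal sketch of such a check and is not part of this repository. The helper names `unset_keys` and `preflight` are hypothetical, the check assumes each key is set on its own `export NAME="value"` line as in the examples above, and the path `data/en_train_cleaned.csv` is an assumption about where the CSV ends up.

```python
import re
from pathlib import Path

# Assumption: keys are set with one export per line, shaped like the
# examples above, e.g. export GROQ_API_KEY="your_groq_api_key_here".
EXPORT_LINE = re.compile(r"^\s*export\s+([A-Za-z_][A-Za-z0-9_]*)=(.*)$")
PLACEHOLDER = re.compile(r"^your_\w+_here$")


def unset_keys(script_text):
    """Return names of exported variables that are empty or still placeholders."""
    unset = []
    for line in script_text.splitlines():
        match = EXPORT_LINE.match(line)
        if not match:
            continue
        name, raw_value = match.groups()
        # Drop surrounding whitespace and quotes before inspecting the value.
        value = raw_value.strip().strip("\"'")
        if not value or PLACEHOLDER.match(value):
            unset.append(name)
    return unset


def preflight(script_path, data_file=None):
    """Collect readable problems; an empty list means none were found."""
    problems = []
    script = Path(script_path)
    if not script.is_file():
        problems.append(f"missing script: {script}")
    else:
        for name in unset_keys(script.read_text()):
            problems.append(f"{script.name}: {name} is not set")
    if data_file is not None and not Path(data_file).is_file():
        problems.append(f"missing dataset file: {data_file}")
    return problems


if __name__ == "__main__":
    # Hypothetical location: assumes the CSV sits directly under ./data.
    for problem in preflight("data_process.sh", "data/en_train_cleaned.csv"):
        print(problem)
```

Run from the repository root, the script prints one line per problem and prints nothing when no problem is found. The same `preflight` call can be pointed at gen_eval_data.sh to check the Azure, Anthropic, and Groq keys before generating evaluation data.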
The benchmark is based on the Fast-DetectGPT project, and some of the code files include its license.
@inproceedings{lee2024remodetect,
title={ReMoDetect: Reward Models Recognize Aligned LLM's Generations},
author={Lee, Hyunseok and Tack, Jihoon and Shin, Jinwoo},
booktitle={Advances in Neural Information Processing Systems},
year={2024}
}