MGTBench

MGTBench provides reference implementations of machine-generated text (MGT) detection methods. It is under active development, and we will add more detection methods and analysis tools in the future.

Supported Methods

Currently, we support the following methods (the list is continuously updated):

Metric-based methods:
- Log-Likelihood [Ref];
- Rank [Ref];
- Log-Rank [Ref];
- Entropy [Ref];
- GLTR Test 2 Features (Rank Counting) [Ref];
- DetectGPT [Ref];
- LRR [Ref];
- NPR [Ref];

Model-based methods:
- OpenAI Detector [Ref];
- ChatGPT Detector [Ref];
- ConDA [Ref] [Model Weights];
- GPTZero [Ref];
- LM Detector [Ref];
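
For intuition about the first four metric-based methods listed above, the sketch below computes per-token log-likelihood, rank, log-rank, and entropy from a next-token probability distribution and averages them over a text. It follows the common formulation of these metrics and is not taken from MGTBench's code; the function names `token_metrics` and `text_scores` and the toy distributions are illustrative assumptions. In practice the distributions would come from a language model.

```python
import math


def token_metrics(probs, token_id):
    """Metrics for one position.

    probs: next-token distribution as a list of floats summing to 1.
    token_id: index of the token that actually appears in the text.
    """
    p_observed = probs[token_id]
    # Rank 1 means the observed token was the model's most likely choice.
    rank = 1 + sum(p > p_observed for p in probs)
    return {
        "log_likelihood": math.log(p_observed),
        "rank": rank,
        "log_rank": math.log(rank),
        "entropy": -sum(p * math.log(p) for p in probs if p > 0),
    }


def text_scores(step_probs, token_ids):
    """Average each metric over all positions of a text."""
    per_token = [token_metrics(p, t) for p, t in zip(step_probs, token_ids)]
    return {
        name: sum(m[name] for m in per_token) / len(per_token)
        for name in per_token[0]
    }


# Toy example: a 4-token vocabulary and a 2-token text.
toy_probs = [[0.5, 0.25, 0.125, 0.125], [0.1, 0.7, 0.1, 0.1]]
toy_tokens = [1, 1]
scores = text_scores(toy_probs, toy_tokens)
```

A detector built on such a score typically compares it against a threshold: machine-generated text tends to have higher average log-likelihood and lower average rank under the scoring model than human-written text.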

Supported Datasets

- Essay;
- WP;
- Reuters;

Note that our datasets are constructed based on Verma et al.; you can download them from Google Drive.

Installation

git clone https://github.com/xinleihe/MGTBench.git;
cd MGTBench;
conda env create -f environment.yml;
conda activate MGTBench;

Usage

To run the benchmark on the Essay dataset:

# Distinguish Human vs. Claude:
python benchmark.py --dataset Essay --detectLLM Claude --method Log-Likelihood

# Text attribution:
python attribution_benchmark.py --dataset Essay
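
To run the same detection command over several datasets, a small driver script can build the command lines. This is a sketch, not part of MGTBench: `build_commands` and `run_all` are hypothetical helpers, and it assumes that `--dataset` accepts the names `WP` and `Reuters` exactly as they appear in the dataset list (only `Essay` is shown in the command above).

```python
import subprocess
import sys

# Dataset names as listed under Supported Datasets.
# Assumption: each is a valid value for --dataset.
DATASETS = ["Essay", "WP", "Reuters"]


def build_commands(datasets, detect_llm="Claude", method="Log-Likelihood"):
    """Return one benchmark.py argument list per dataset."""
    return [
        [sys.executable, "benchmark.py",
         "--dataset", name,
         "--detectLLM", detect_llm,
         "--method", method]
        for name in datasets
    ]


def run_all(datasets=DATASETS):
    """Run the commands one after another; stop at the first failure."""
    for cmd in build_commands(datasets):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Call `run_all()` from the repository root with the MGTBench environment active, so that `benchmark.py` and its dependencies are found.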

Note that you can also add your own datasets in dataset_loader.py.
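
As a starting point for a custom dataset, the sketch below splits labeled human and machine texts into train and test sets. The function name `load_custom`, the label convention (0 = human, 1 = machine-generated), and the returned dictionary layout are all assumptions made for illustration; check the existing loaders in dataset_loader.py for the structure the benchmark actually expects before wiring this in.

```python
import random


def load_custom(human_texts, machine_texts, test_fraction=0.2, seed=0):
    """Hypothetical loader: shuffle labeled texts and split train/test.

    Labels: 0 = human-written, 1 = machine-generated (assumed convention).
    """
    examples = [(t, 0) for t in human_texts] + [(t, 1) for t in machine_texts]
    random.Random(seed).shuffle(examples)  # seeded for reproducible splits

    n_test = max(1, int(len(examples) * test_fraction))
    test, train = examples[:n_test], examples[n_test:]

    def pack(split):
        # Parallel lists of texts and labels (assumed layout).
        return {
            "text": [text for text, _ in split],
            "label": [label for _, label in split],
        }

    return {"train": pack(train), "test": pack(test)}


data = load_custom(
    ["human a", "human b", "human c", "human d", "human e"],
    ["machine v", "machine w", "machine x", "machine y", "machine z"],
)
```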

Authors

The tool is designed and developed by Xinlei He (CISPA), Xinyue Shen (CISPA), Zeyuan Chen (Individual Researcher), Michael Backes (CISPA), and Yang Zhang (CISPA).

Cite

If you use MGTBench for your research, please cite MGTBench: Benchmarking Machine-Generated Text Detection.

@article{HSCBZ23,
author = {Xinlei He and Xinyue Shen and Zeyuan Chen and Michael Backes and Yang Zhang},
title = {{MGTBench: Benchmarking Machine-Generated Text Detection}},
journal = {{CoRR abs/2303.14822}},
year = {2023}
}
