This is the codebase for the paper Enforcing Paraphrase Generation via Controllable Latent Diffusion.
Your personal dataset should be placed in the datasets directory and split into train, valid, and test subsets.
Each dataset should be in CSV format with src and tgt as headers.
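As a minimal sketch of the expected layout, the snippet below writes paraphrase pairs into three CSV files with src and tgt headers. The file names (train.csv, valid.csv, test.csv), the 80/10/10 split, and the write_splits helper are assumptions made for illustration; check your config file for the names the code actually expects.

```python
import csv
import random
from pathlib import Path


def write_splits(pairs, out_dir, seed=0, ratios=(0.8, 0.1, 0.1)):
    """Shuffle (src, tgt) pairs and write train/valid/test CSV files.

    The file names and split ratios are illustrative assumptions.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)

    n_train = int(len(pairs) * ratios[0])
    n_valid = int(len(pairs) * ratios[1])
    splits = {
        "train": pairs[:n_train],
        "valid": pairs[n_train:n_train + n_valid],
        "test": pairs[n_train + n_valid:],  # remainder goes to test
    }

    paths = {}
    for name, rows in splits.items():
        path = out_dir / f"{name}.csv"
        with path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["src", "tgt"])  # header row named in this README
            writer.writerows(rows)
        paths[name] = path
    return paths
```

For example, write_splits(pairs, "datasets/my_dataset") would create the three files under a hypothetical datasets/my_dataset directory.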
For training, use main.py with the following arguments:
--config: the path to your YAML config file, which should be placed in the conf directory.
--mode: either train or resume.
--ckpt: required only in resume mode.
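The argument rules above can be sketched as a small helper that assembles the command line for main.py. Only the flag names come from this README; the build_train_command helper, the example config path, and the reading of --ckpt as a checkpoint path are assumptions.

```python
import sys


def build_train_command(config, mode="train", ckpt=None):
    """Assemble the argument list for main.py.

    Treating the --ckpt value as a checkpoint path is an assumption.
    """
    if mode not in ("train", "resume"):
        raise ValueError("mode must be 'train' or 'resume'")
    if mode == "resume" and ckpt is None:
        raise ValueError("--ckpt is required in resume mode")

    cmd = [sys.executable, "main.py", "--config", str(config), "--mode", mode]
    if mode == "resume":
        cmd += ["--ckpt", str(ckpt)]  # only passed when resuming
    return cmd


if __name__ == "__main__":
    # Hypothetical config path; the command is printed, not executed.
    print(" ".join(build_train_command("conf/my_config.yaml")))
```

To launch the run, the returned list can be passed to subprocess.run with check=True from the repository root.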
For inference, use seq2seq.py with the following arguments:
--ckpt_dir: the checkpoint directory.
--config: use the same config file as in training; you can find it at <SAVE_PATH>/conf.yaml.
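A minimal sketch of an inference call that reuses the config saved during training is shown below. The build_inference_command helper is hypothetical, and save_path stands in for <SAVE_PATH>; both argument values must be supplied by you.

```python
import sys
from pathlib import Path


def build_inference_command(save_path, ckpt_dir):
    """Assemble the argument list for seq2seq.py.

    save_path stands in for <SAVE_PATH>; both arguments are
    caller-supplied, hypothetical values.
    """
    config = Path(save_path) / "conf.yaml"  # config file saved during training
    return [
        sys.executable, "seq2seq.py",
        "--ckpt_dir", str(ckpt_dir),
        "--config", str(config),
    ]
```

Deriving the config path from the save path avoids accidentally running inference with a config that differs from the one used in training.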
Use controlnet_train.py with the following arguments:
--ckpt: refers to the original ldp checkpoint path.
--ldp: refers to the original ldp checkpoint path.
--ckpt_dir: the checkpoint directory.
If you find the code helpful, please cite:
@article{zou2024enforcing,
title={Enforcing Paraphrase Generation via Controllable Latent Diffusion},
author={Zou, Wei and Zhuang, Ziyuan and Huang, Shujian and Liu, Jia and Chen, Jiajun},
journal={arXiv preprint arXiv:2404.08938},
year={2024}
}