CoNT is a strong contrastive learning framework for neural text generation that outperforms MLE-based training on five generation tasks: machine translation, summarization, code comment generation, data-to-text generation, and commonsense generation.
- CoNT [NeurIPS 2022]
- [GitHub] Official implementation of CoNT
The aforementioned repository has the following issues:

- It does not support Korean models or the BART language model.
- For Korean, beam search decoding is not appropriate when sampling negative samples.
Therefore, we release the CoNKT (Contrastive Neural Korean Text Generation) model, which solves the two issues mentioned above.
A warm-up stage, in which the model is supervised only by the negative log-likelihood (NLL) loss, is recommended, as it guarantees the quality of the samples drawn from the model's predictions.
```bash
# Warm-up Training & Inference (T5)
bash run_warmup_t5.sh

# Warm-up Training & Inference (BART)
bash run_warmup_bart.sh
```
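Under the hood, the warm-up stage is ordinary teacher-forced training. Below is a minimal sketch of one NLL step with Hugging Face transformers; the checkpoint name is a placeholder, not necessarily what the scripts load:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder checkpoint; the run_warmup_*.sh scripts configure the actual model.
tokenizer = AutoTokenizer.from_pretrained("gogamza/kobart-base-v2")
model = AutoModelForSeq2SeqLM.from_pretrained("gogamza/kobart-base-v2")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

batch = tokenizer(
    ["뉴스 기사 본문 ..."],           # source document
    text_target=["기사 요약문 ..."],  # reference summary
    return_tensors="pt", truncation=True,
)
loss = model(**batch).loss  # token-level NLL (cross-entropy) with teacher forcing
loss.backward()
optimizer.step()
optimizer.zero_grad()
```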
We implement an InfoNCE version of the contrastive objective: the ground truth is treated as the positive sample, while self-generated samples (produced with top-p sampling rather than beam search decoding) are treated as negative samples.
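As a rough sketch of this objective (not the repository's exact implementation): negatives are drawn with top-p sampling via `model.generate`, and an InfoNCE loss ranks the ground truth above them by sequence-embedding similarity. How the embeddings are obtained and the temperature of 0.1 are assumptions here:

```python
import torch
import torch.nn.functional as F

def generate_negatives(model, src_ids, num_neg=8, top_p=0.9):
    # Sample diverse negatives with top-p (nucleus) sampling instead of beam search.
    return model.generate(
        src_ids,
        do_sample=True,
        top_p=top_p,
        num_return_sequences=num_neg,
        max_new_tokens=128,
    )

def info_nce_loss(src_emb, pos_emb, neg_embs, temperature=0.1):
    """src_emb, pos_emb: (d,) source / ground-truth embeddings;
    neg_embs: (num_neg, d) embeddings of the self-generated samples."""
    pos = F.cosine_similarity(src_emb, pos_emb, dim=0).unsqueeze(0)   # (1,)
    neg = F.cosine_similarity(src_emb.unsqueeze(0), neg_embs, dim=1)  # (num_neg,)
    logits = torch.cat([pos, neg]).unsqueeze(0) / temperature         # (1, 1 + num_neg)
    target = torch.zeros(1, dtype=torch.long)  # position 0 holds the positive
    return F.cross_entropy(logits, target)
```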
```bash
# CoNKT Training & Inference (T5)
bash run_conkt_t5.sh

# CoNKT Training & Inference (BART)
bash run_conkt_bart.sh
```
After decoding the model's tokenized output back to plain text, ROUGE was computed between the reference and the hypothesis, with both tokenized using Mecab.
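A minimal sketch of this evaluation, assuming konlpy's `Mecab` wrapper and the `rouge` package (the repository's actual evaluation code may differ):

```python
from konlpy.tag import Mecab
from rouge import Rouge  # pip install rouge konlpy

mecab = Mecab()
scorer = Rouge()

def mecab_rouge(hypothesis: str, reference: str):
    # Tokenize both sides into Mecab morphemes, then score ROUGE-1/2/L.
    hyp = " ".join(mecab.morphs(hypothesis))
    ref = " ".join(mecab.morphs(reference))
    return scorer.get_scores(hyp, ref, avg=True)
```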
- Dacon, Korean Abstract Summarization AI Contest [Dataset]
- Training: 29,432
- Validation: 7,358
- Test: 9,182
| Model | #Param | rouge-1 | rouge-2 | rouge-l |
| --- | --- | --- | --- | --- |
| T5-small | 77M | 51.55 | 33.26 | 45.02 |
| KoBART | 124M | 53.75 | 34.40 | 45.94 |
| CoNKT-T5-small | 77M | 54.08 | 34.42 | 45.54 |
| CoNKT-KoBART | 124M | 55.02 | 35.22 | 46.22 |
- AI-Hub, News Summarization [Dataset]
- Training: 245,626
- Validation: 27,685
- Test: 2,542
| Model | #Param | rouge-1 | rouge-2 | rouge-l |
| --- | --- | --- | --- | --- |
| T5-small | 77M | 53.44 | 34.03 | 45.36 |
| KoBART | 124M | 56.68 | 36.91 | 47.33 |
| CoNKT-T5-small | 77M | 56.40 | 36.35 | 46.90 |
| CoNKT-KoBART | 124M | 58.75 | 39.54 | 49.56 |
```bibtex
@article{an2022cont,
  title={CoNT: Contrastive Neural Text Generation},
  author={An, Chenxin and Feng, Jiangtao and Lv, Kai and Kong, Lingpeng and Qiu, Xipeng and Huang, Xuanjing},
  journal={arXiv preprint arXiv:2205.14690},
  year={2022}
}
```