
CONFIT: Toward Faithful Dialogue Summarization with Linguistically-Informed Contrastive Fine-tuning

We propose a training strategy, called ConFiT, that improves the factual consistency and overall quality of summaries via novel contrastive fine-tuning. Based on our linguistically-informed typology of errors, we design different modular objectives that each target a specific error type.
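
As a rough illustration of the idea (a minimal sketch, not the exact ConFiT objective; the margin value and the way the negative summary is built here are assumptions), contrastive fine-tuning can be written as the usual cross-entropy loss on the reference summary plus a margin term that makes the model prefer the reference over a perturbed, unfaithful summary:

import torch.nn.functional as F

def contrastive_finetuning_loss(model, batch, margin=1.0):
    # Sketch only: `batch` is assumed to hold input_ids, attention_mask,
    # labels (the reference summary) and neg_labels (a corrupted summary,
    # e.g. with a swapped speaker name, standing in for one error type).
    pos = model(input_ids=batch["input_ids"],
                attention_mask=batch["attention_mask"],
                labels=batch["labels"])          # MLE loss on the faithful reference
    neg = model(input_ids=batch["input_ids"],
                attention_mask=batch["attention_mask"],
                labels=batch["neg_labels"])      # NLL of the unfaithful negative
    # The reference should have lower NLL than the negative by at least `margin`
    contrastive = F.relu(margin - (neg.loss - pos.loss))
    return pos.loss + contrastive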

The code is based on Hugging Face's transformers. Thanks to them!

Pretrained models

Download the T5, Pegasus, and BART models from Google Drive.

Test

from transformers import PegasusTokenizer, PegasusForConditionalGeneration

# Load one of the fine-tuned checkpoints downloaded from Google Drive
model = PegasusForConditionalGeneration.from_pretrained('FROM_OUR_PRETRAINED_MODELS')
device = "cuda:0"
model = model.to(device)
tokenizer = PegasusTokenizer.from_pretrained('google/pegasus-xsum')

ARTICLE_TO_SUMMARIZE = "#ENTER"  # placeholder: replace with the dialogue to summarize
inputs = tokenizer([ARTICLE_TO_SUMMARIZE], max_length=1024, truncation=True, return_tensors='pt')
print(inputs['input_ids'].shape)
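
To actually produce a summary from these inputs (a minimal sketch; the beam size and length budget below are assumptions, not values from the paper), move the tensors to the model's device, call generate, and decode:

# Generate and decode a summary with beam search
inputs = {k: v.to(device) for k, v in inputs.items()}
summary_ids = model.generate(
    inputs['input_ids'],
    attention_mask=inputs['attention_mask'],
    num_beams=4,      # assumed beam size
    max_length=128,   # assumed summary length budget
)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])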

Annotation

The Google Drive folder contains the unblinded annotation files that we used to compute the distributions in Tables 3 and 4.

See analysis.ipynb for the script that calculates the score.

Generate Summary

import torch

# Load pre-tokenized encoder and decoder inputs saved to disk
input_ids = torch.load('./input_ids.pt')
decoder_input_ids = torch.load('./decoder_input_ids.pt')
attention_mask = torch.load('./attention_mask.pt')
print(input_ids.shape)
print(decoder_input_ids.shape)
print(attention_mask.shape)

# Forward pass through the fine-tuned model
output = model(input_ids=input_ids, attention_mask=attention_mask, decoder_input_ids=decoder_input_ids)
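
If you want summary text rather than raw logits (a minimal sketch, assuming the tokenizer loaded in the Test section above), you can take the argmax over the vocabulary at each teacher-forced position and decode, or call generate as shown earlier:

# Greedy read-out of the forward-pass logits at each decoder position
pred_ids = output.logits.argmax(dim=-1)
print(tokenizer.batch_decode(pred_ids, skip_special_tokens=True))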

Citation

@article{tang2021confit,
  title={CONFIT: Toward Faithful Dialogue Summarization with Linguistically-Informed Contrastive Fine-tuning},
  author={Tang, Xiangru and Nair, Arjun and Wang, Borui and Wang, Bingyao and Desai, Jai and Wade, Aaron and Li, Haoran and Celikyilmaz, Asli and Mehdad, Yashar and Radev, Dragomir},
  journal={arXiv preprint arXiv:2112.08713},
  year={2021}
}
