SCIroShot ⚛️

SCIroShot is an entailment-based zero-shot text classifier trained on a weakly supervised dataset built from scientific data originally gathered from Microsoft Academic Graph.

For more details, see the paper "A weakly supervised textual entailment approach to zero-shot text classification", published at EACL 2023.
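To make "entailment-based" concrete, here is a minimal sketch of the general idea, not the SCIroShot implementation: each candidate label is turned into a hypothesis through a template, an entailment scorer rates how well the input text supports each hypothesis, and a softmax over those scores yields one probability per label. The function `zero_shot_classify` and the keyword-counting `toy_entailment_logit` are hypothetical stand-ins introduced for illustration; a real setup would plug in an NLI model in place of the toy scorer.

```python
import math
from typing import Callable, Dict, List


def zero_shot_classify(
    text: str,
    candidate_labels: List[str],
    entailment_logit: Callable[[str, str], float],
    hypothesis_template: str = "This example is {}",
) -> Dict[str, float]:
    """Return one probability per label via a softmax over entailment logits."""
    # Build one hypothesis per candidate label, e.g. "This example is sports".
    hypotheses = [hypothesis_template.format(label) for label in candidate_labels]
    # Score how strongly the text (premise) entails each hypothesis.
    logits = [entailment_logit(text, hypothesis) for hypothesis in hypotheses]
    # Numerically stable softmax across the candidate labels.
    top = max(logits)
    exps = [math.exp(logit - top) for logit in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(candidate_labels, exps)}


def toy_entailment_logit(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in for an NLI model: counts keyword hits (assumption)."""
    keywords = {
        "sports": {"player", "match", "goal"},
        "politics": {"election", "senate"},
    }
    words = set(premise.lower().replace(".", "").split())
    label = hypothesis.split()[-1]  # the label is the last word of the template
    return float(len(words & keywords.get(label, set())))


probs = zero_shot_classify(
    "Leo Messi is the best player ever.",
    ["politics", "science", "sports", "environment"],
    toy_entailment_logit,
)
predicted = max(probs, key=probs.get)
print(predicted)
```

Because the scorer is passed in as a callable, the same function works with any model that can rate a premise and hypothesis pair.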

Figure 1

📖 How to use

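from transformers import pipeline

# Load SCIroShot through the Hugging Face zero-shot classification pipeline.
zstc = pipeline("zero-shot-classification", model="BSC-LT/sciroshot")

sentence = "Leo Messi is the best player ever."
candidate_labels = ["politics", "science", "sports", "environment"]
# Each label is inserted into the template to form an entailment hypothesis.
template = "This example is {}"

# multi_label=False normalizes the scores across the candidate labels.
output = zstc(sentence, candidate_labels, hypothesis_template=template, multi_label=False)

print(output)
# Labels are returned sorted by score, so the first one is the prediction.
print(f'Predicted class: {output["labels"][0]}')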
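The pipeline returns a dictionary with the keys `sequence`, `labels`, and `scores`, where labels and scores are aligned and sorted from highest to lowest score. The sketch below shows one way to post-process such a result. The helpers `scores_by_label` and `labels_above` are hypothetical names introduced here, and the scores in `example_output` are made-up placeholders, not actual SCIroShot predictions.

```python
from typing import Any, Dict, List


def scores_by_label(output: Dict[str, Any]) -> Dict[str, float]:
    """Map each label to its score from a zero-shot pipeline result."""
    return dict(zip(output["labels"], output["scores"]))


def labels_above(output: Dict[str, Any], threshold: float = 0.5) -> List[str]:
    """Keep the labels whose score reaches the threshold, preserving order."""
    return [
        label
        for label, score in zip(output["labels"], output["scores"])
        if score >= threshold
    ]


# Placeholder result shaped like the pipeline output (scores are invented).
example_output = {
    "sequence": "Leo Messi is the best player ever.",
    "labels": ["sports", "science", "politics", "environment"],
    "scores": [0.90, 0.05, 0.03, 0.02],
}

mapping = scores_by_label(example_output)
selected = labels_above(example_output, threshold=0.5)
print(mapping["sports"], selected)
```

Thresholding like this is mostly useful with `multi_label=True`, where each label is scored independently and several labels can pass at once.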

📝 Results

Scientific domain

| Model | arXiv | SciDocs-MesH | SciDocs-MAG | Konstanz | Elsevier | PubMed |
|---|---|---|---|---|---|---|
| fb/bart-large-mnli | 33.28 | 66.18🔥 | 51.77 | 54.62 | 28.41 | 31.59🔥 |
| SCIroShot | 42.22🔥 | 59.34 | 69.86🔥 | 66.07🔥 | 54.42🔥 | 27.93 |

General domain

| Model | Topic | Emotion | Situation |
|---|---|---|---|
| RTE (Yin et al., 2019) | 43.8 | 12.6 | 37.2🔥 |
| FEVER (Yin et al., 2019) | 40.1 | 24.7 | 21.0 |
| MNLI (Yin et al., 2019) | 37.9 | 22.3 | 15.4 |
| NSP (Ma et al., 2021) | 50.6 | 16.5 | 25.8 |
| NSP-Reverse (Ma et al., 2021) | 53.1 | 16.1 | 19.9 |
| SCIroShot | 59.08🔥 | 24.94🔥 | 27.42 |

📣 Citation

@inproceedings{pamies2023weakly,
  title={A weakly supervised textual entailment approach to zero-shot text classification},
  author={Pàmies, Marc and Llop, Joan and Multari, Francesco and Duran-Silva, Nicolau and Parra-Rojas, César and González-Agirre, Aitor and Massucci, Francesco Alessandro and Villegas, Marta},
  booktitle={Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics},
  pages={286--296},
  year={2023}
}

⚖️ License

This work is distributed under the Apache License, Version 2.0.

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 101004870. H2020-SC6-GOVERNANCE-2018-2019-2020 / H2020-SC6-GOVERNANCE-2020