---
title: Are large language models good annotators?
abstract: Numerous Natural Language Processing (NLP) tasks require precisely labeled data to ensure effective model training and achieve optimal performance. However, data annotation is marked by substantial costs and time requirements, especially when requiring specialized domain expertise or annotating a large number of samples. In this study, we investigate the feasibility of employing large language models (LLMs) as replacements for human annotators. We assess the zero-shot performance of various LLMs of different sizes to determine their viability as substitutes. Furthermore, recognizing that human annotators have access to diverse modalities, we introduce an image-based modality using the BLIP-2 architecture to evaluate LLM annotation performance. Among the tested LLMs, Vicuna-13b demonstrates competitive performance across diverse tasks. To assess the potential for LLMs to replace human annotators, we train a supervised model using labels generated by LLMs and compare its performance with models trained using human-generated labels. However, our findings reveal that models trained with human labels consistently outperform those trained with LLM-generated labels. We also highlight the challenges faced by LLMs in multilingual settings, where their performance significantly diminishes for tasks in languages other than English.
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: mohta23a
month: 0
tex_title: Are large language models good annotators?
firstpage: 38
lastpage: 48
page: 38-48
order: 38
cycles: false
bibtex_author: Mohta, Jay and Ak, Kenan and Xu, Yan and Shen, Mingwei
author:
- given: Jay
  family: Mohta
- given: Kenan
  family: Ak
- given: Yan
  family: Xu
- given: Mingwei
  family: Shen
date: 2023-04-24
address:
container-title: "Proceedings on \"I Can't Believe It's Not Better: Failure Modes in the Age of Foundation Models\" at NeurIPS 2023 Workshops"
volume: 239
genre: inproceedings
issued:
  date-parts:
  - 2023
  - 4
  - 24
pdf:
extras:
---