title

abstract

layout

series

publisher

issn

id

month

tex_title

firstpage

lastpage

page

order

cycles

bibtex_author

author

date

address

container-title

volume

genre

issued

pdf

extras

Are large language models good annotators?

Many Natural Language Processing (NLP) tasks require precisely labeled data for effective model training and strong performance. However, data annotation is costly and time-consuming, especially when it requires specialized domain expertise or involves a large number of samples. In this study, we investigate the feasibility of employing large language models (LLMs) as replacements for human annotators. We assess the zero-shot performance of LLMs of different sizes to determine their viability as substitutes. Furthermore, recognizing that human annotators have access to diverse modalities, we introduce an image-based modality using the BLIP-2 architecture to evaluate LLM annotation performance. Among the tested LLMs, Vicuna-13b demonstrates competitive performance across diverse tasks. To assess the potential for LLMs to replace human annotators, we train a supervised model using labels generated by LLMs and compare its performance with models trained using human-generated labels. Our findings reveal that models trained with human labels consistently outperform those trained with LLM-generated labels. We also highlight the challenges faced by LLMs in multilingual settings, where their performance diminishes significantly for tasks in languages other than English.

inproceedings

Proceedings of Machine Learning Research

PMLR

2640-3498

mohta23a

0

Are large language models good annotators?

38

48

38-48

38

false

Mohta, Jay and Ak, Kenan and Xu, Yan and Shen, Mingwei

given	family
Jay	Mohta

given	family
Kenan	Ak

given	family
Yan	Xu

given	family
Mingwei	Shen

2023-04-24

Proceedings on "I Can't Believe It's Not Better: Failure Modes in the Age of Foundation Models" at NeurIPS 2023 Workshops

239

inproceedings

date-parts

2023

4

24

https://proceedings.mlr.press/v239/mohta23a/mohta23a.pdf


2023-04-24-mohta23a.md
