Plan for including LLMs' Zero-shot performance? #132

Leezekun · 2024-06-30T22:33:30Z

Hi,

Thank you for the great work!

Given the current prevalence of Large Language Models (LLMs), are there any plans to include more LLM-based approaches in performance evaluations, especially focusing on zero-shot performance?

Here are a few relevant papers and approaches:

Hu, Yushi, et al. "In-context learning for few-shot dialogue state tracking." arXiv preprint arXiv:2203.08568 (2022).
Hudeček, Vojtěch, and Ondřej Dušek. "Are LLMs all you need for task-oriented dialogue?" arXiv preprint arXiv:2304.06556 (2023).
Heck, Michael, et al. "ChatGPT for zero-shot dialogue state tracking: A solution or an opportunity?" arXiv preprint arXiv:2306.01386 (2023).
Chung, Willy, et al. "InstructTODS: Large language models for end-to-end task-oriented dialogue systems." arXiv preprint arXiv:2310.08885 (2023).
Li, Zekun, et al. "Large Language Models as Zero-shot Dialogue State Trackers through Function Calling." arXiv preprint arXiv:2402.10466 (2024).

Are there any plans to benchmark the performance of LLMs in zero-shot settings? I would be happy to assist with this if needed.

budzianowski · 2024-07-01T15:28:56Z

Hi @Leezekun - thanks for posting this. The simple answer is: absolutely! There are a number of ongoing efforts on zero-shot approaches. If you are happy to update the benchmarks, that would be very helpful!
