Hi, thank you for the great work!
Given the current prevalence of Large Language Models (LLMs), are there any plans to include more LLM-based approaches in the performance evaluations, especially focusing on zero-shot performance?
Here are a few relevant papers and approaches:
Hu, Yushi, et al. "In-context learning for few-shot dialogue state tracking." arXiv preprint arXiv:2203.08568 (2022).
Hudeček, Vojtěch, and Ondřej Dušek. "Are LLMs all you need for task-oriented dialogue?" arXiv preprint arXiv:2304.06556 (2023).
Heck, Michael, et al. "ChatGPT for zero-shot dialogue state tracking: A solution or an opportunity?" arXiv preprint arXiv:2306.01386 (2023).
Chung, Willy, et al. "Instructtods: Large language models for end-to-end task-oriented dialogue systems." arXiv preprint arXiv:2310.08885 (2023).
Li, Zekun, et al. "Large Language Models as Zero-shot Dialogue State Trackers through Function Calling." arXiv preprint arXiv:2402.10466 (2024).
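For concreteness, the zero-shot DST setups in these papers generally prompt an LLM with the slot schema and dialogue history and parse a structured state from the reply. Below is a minimal, hypothetical sketch of that pattern; the function names and slot names are my own, and the actual LLM call is replaced by a hard-coded reply string.

```python
import json

def build_dst_prompt(domain_slots, dialogue_history):
    """Build a zero-shot DST prompt: slot schema + dialogue turns, ask for JSON."""
    slots = "\n".join(f"- {name}: {desc}" for name, desc in domain_slots.items())
    turns = "\n".join(f"{speaker}: {utt}" for speaker, utt in dialogue_history)
    return (
        "Track the dialogue state. Fill these slots from the conversation, "
        "using null for slots that are not mentioned.\n"
        f"Slots:\n{slots}\n\nDialogue:\n{turns}\n\n"
        "Answer with a JSON object mapping slot names to values."
    )

def parse_state(llm_output, domain_slots):
    """Parse the model's JSON reply, keeping only known, non-null slots."""
    try:
        raw = json.loads(llm_output)
    except json.JSONDecodeError:
        return {}
    return {s: v for s, v in raw.items() if s in domain_slots and v is not None}

# Hypothetical slot schema and a hard-coded stand-in for the LLM reply:
slots = {"restaurant-area": "part of town", "restaurant-food": "cuisine type"}
history = [("user", "I want a cheap Italian place in the centre.")]
prompt = build_dst_prompt(slots, history)
fake_reply = '{"restaurant-area": "centre", "restaurant-food": "italian"}'
state = parse_state(fake_reply, slots)
print(state)  # {'restaurant-area': 'centre', 'restaurant-food': 'italian'}
```

The function-calling variant (Li et al., 2024) moves the schema into the tool/function definition instead of the prompt text, but the parse-and-filter step is similar.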
Are there any plans to benchmark the performance of LLMs in zero-shot settings? I would be happy to assist with this if needed.
Hi @Leezekun - thanks for posting this. The simple answer is: absolutely! There are a number of efforts working in a zero-shot manner. If you are happy to update the benchmarks, that would be very helpful!