Hey Rishabh!! Been a long time since we chatted about NAMs/GAMs :)
Wow, this is a really cool paper -- thanks for sharing! As corroborating evidence, in medprompt we did ablations up to k=20 few shots and found continued performance improvements (e.g. 90.2 -> 90.6 on medQA when going from 5 shots to 20 shots), but wanted to keep the inference budget reasonable for the "standard" algorithm configuration. We didn't ablate beyond that, so it's really cool to see it studied so rigorously.
Happy to add a link to your paper in the readme when I'm back at my desk, and excited to read it more thoroughly too. Would be fun to catch up sometime!
It seems like many-shot prompting helps on several of the existing tasks here (BIG-Bench Hard, MATH, GSM8K, GPQA).
Not sure what the process is, but it seems worth mentioning or including here.
https://arxiv.org/abs/2404.11018
It also works for Claude-3 (the many-shot jailbreaking paper) and gpt-4o on multimodal tasks (the many-shot ICL in multimodal tasks paper).