Performance of WizardCoder on HumanEvalFixDocs #18
Hi @Muennighoff,
Thank you for releasing many useful resources.
QQ: Do you know what the accuracy of WizardCoder-15.5B is on HumanEvalFixDocs? (i.e., where does WizardCoder stand in Table 12 of your paper?)
We didn't run that. I think it would be somewhere between OctoCoder & GPT-4.
Thanks!
I observe an accuracy of 51.2 with WizardCoder. (Thus lower than OctoCoder, and also lower than WizardCoder's performance on HumanEval) PS: I'm reporting with greedy decoding and pass@1 score. As a sanity check, would it be possible for you to confirm if you are observing the same performance? |
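(For reference, with greedy decoding there is exactly one completion per problem, so pass@1 is simply the fraction of problems whose single fix passes all unit tests. A minimal sketch; the task IDs and outcomes below are hypothetical placeholders, not actual results:)

```python
# Greedy pass@1: one completion per problem,
# score = fraction of problems whose completion passes all tests.
# Task IDs and booleans are hypothetical placeholders.
results = {
    "HumanEvalFixDocs/0": True,   # greedy fix passed the unit tests
    "HumanEvalFixDocs/1": False,  # greedy fix failed
    # ... one entry per problem
}
pass_at_1 = sum(results.values()) / len(results)
print(f"pass@1 = {pass_at_1:.3f}")
```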
Using …? For StarCoder, which prompt are you using? Surprised it is that high. Would be curious to know what you get for …
I am using …
Will try to run with 20 samples and temperature 0.2. CC: @Muennighoff
With 20 samples and T=0.2, I observe the following result: pass@1 of 58.9, compared to 43.5 reported in the paper.
Would it be possible for you to re-compute these numbers just to be sure?
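(When n samples are drawn per problem, here n=20 at T=0.2, pass@1 is typically computed with the unbiased estimator from the Codex paper rather than from a single draw. A sketch of that estimator, assuming `c` passing completions out of `n` per problem; the `pass_counts` values are hypothetical:)

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021):
    n = samples generated per problem, c = samples that pass, k = budget."""
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Hypothetical per-problem pass counts out of n=20 samples at T=0.2.
pass_counts = [17, 0, 20, 5]
score = float(np.mean([pass_at_k(20, c, 1) for c in pass_counts]))
print(f"pass@1 = {score:.3f}")
```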
Discussion moved to #21