Issues: bigcode-project/bigcode-evaluation-harness
Issues list
[Possibly system specific] Wild (12% vs 20%) run-to-run swings in multiple-cpp reported scores
#258 opened Jul 18, 2024 by alat-rights
Using the humanevalpack to test the ChatGLM3 model results in an abnormal score.
#251 opened Jul 5, 2024 by burger-pb
API-based evaluation support (humanevalpack_openai.py is too old)
#234 opened May 10, 2024 by s-natsubori
If I want to add my own designed prompts before each question, how should I modify the code
#230 opened Apr 27, 2024 by ALLISWELL8
Multiple-E Go test file name suffix does not contain _test.go
#224 opened Apr 20, 2024 by hitesh-1997
Please add flag to log score for each sample (akin to Eleuther's LM Evaluation Harness)
#215 opened Apr 8, 2024 by RylanSchaeffer