How we compute the leaderboard #73

gasse · 2021-09-06T20:13:21Z

gasse
Sep 6, 2021
Maintainer

Dear participants,

We've realized that it was not very clear which instances we were using to evaluate your submissions and update the leaderboard. Here are some clarifications.

Train, validation and test instances

Each benchmark dataset is currently split into two sets of instances:

the training set (train)
the validation set (valid)
the test set (test)

Both train and valid have been released publicly and can be used in whichever way you want for training, while test is kept hidden until the end of the competition, and is used for evaluation (for more details, see our data description).

Intermediate evaluations

During the competition, in order to get a sense of where you're at, we perform intermediate evaluation rounds where we evaluate your submissions using only a small subset (20%) of the final test set of each problem benchmark, that is:

20 instances for item_placement
20 instances for load_balancing
4 instances for anonymous, repeated 5 times with different random seeds

Those are the numbers that you currently see on the online leaderboard.

Final evaluation

At the end of the competition, all submissions will be evaluated in a final evaluation round, on the whole test set of each problem benchmark, that is:

100 instances for item_placement
100 instances for load_balancing
20 instances for anonymous, repeated 5 times with different random seeds

After this final evaluation we will update the online leaderboard with the final numbers, and those will tell the winners of each challenge (primal, dual and config).

We hope that this clarifies things on your side. Please let us know if you have any further question or comment.

Again, we wish you good luck and fun with your submissions !

Best,
The organizing team

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How we compute the leaderboard #73

{{title}}

Replies: 0 comments

Select a reply

How we compute the leaderboard #73

gasse Sep 6, 2021 Maintainer

Train, validation and test instances

Intermediate evaluations

Final evaluation

Replies: 0 comments

gasse
Sep 6, 2021
Maintainer