get_performance_dictionary doesn't provide the desired metric #48

Open
andres-fr opened this issue Jun 23, 2022 · 1 comment
@andres-fr

The interface of the function is

def get_performance_dictionary(
    optimizer_path, mode="most", metric="valid_accuracies", conv_perf_file=None
):

But despite providing e.g. "valid_accuracies", the function sometimes returns the "test_accuracies" instead.
The explanation can be found in the following line of code (permalink to the dev branch):

metric = "test_accuracies" if "test_accuracies" in sett.aggregate else "test_losses"

This line overrides the metric provided by the user in all cases, making the metric parameter redundant.
A proposed fix is to delete this line, or alternatively to remove the metric parameter from the function. I personally think the former is more meaningful, since it gives the end user more flexibility. A minimal sketch of what I have in mind is included below.
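
For illustration, a minimal sketch of the fix, keeping only the names metric and sett.aggregate from the existing code (the helper name is hypothetical): the user-supplied metric is respected, and we only fall back to the corresponding losses if it was never recorded for the setting.

def _resolve_metric(sett, metric="valid_accuracies"):
    # Hypothetical helper: keep the user-supplied metric if it was recorded
    # for this setting, otherwise fall back to the corresponding loss-based
    # metric, mirroring the existing accuracies -> losses fallback.
    if metric in sett.aggregate:
        return metric
    return metric.replace("accuracies", "losses")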

andres-fr added the 🆕 Status: New (New Issue) and 🐛 Type: Bug (Something isn't working) labels on Jun 23, 2022
@fsschneider (Owner)

I think the issue is a little bit more complicated.

As the docstring says, the metric that is passed to get_performance_dictionary determines "how to decide the best setting". This can currently be influenced by the user, e.g. by ranking hyperparameter settings by valid_accuracies or train_losses.

The line you quoted determines which metric is used to report the "performance". This is indeed currently hard-coded to be either test_accuracies or test_losses. However, it does not affect the ranking, only which metric is reported as performance.

In general, with DeepOBS we very much encourage users to report test_accuracies as the performance measure, so hard-coding it doesn't sound too bad.
If we want to change it, we should have two parameters controlling the behavior of the analysis part, something like ranking_metric and performance_metric. This would require more thorough changes than just removing the line you quoted. A rough sketch of such an interface follows below.
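
For illustration, here is a rough sketch of how such a two-metric interface could look. The parameter names follow the suggestion above; the body is indicative only and omits the actual analysis logic.

def get_performance_dictionary(
    optimizer_path,
    mode="most",
    ranking_metric="valid_accuracies",     # decides which hyperparameter setting is "best"
    performance_metric="test_accuracies",  # decides which metric is reported as performance
    conv_perf_file=None,
):
    """Sketch only: rank settings by ranking_metric, report performance_metric."""
    # The hard-coded line quoted above would then become something like:
    # metric = (performance_metric if performance_metric in sett.aggregate
    #           else performance_metric.replace("accuracies", "losses"))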
