
negate the result and prefix the metric name for error/loss metrics #278

Merged: 6 commits from issue/268 into master on Apr 6, 2021

Conversation

@sebhrusen (Collaborator) commented on Apr 1, 2021:

#268

The result and metric columns are changed only when the metric represents an error (a short sketch follows the example below):

  • the metric name is prefixed with neg_;
  • the result is negated.

Example:

Summing up scores for current run:
                  id         task          framework constraint fold    result       metric   mode version params               app_version                  utc  duration  training_duration  predict_duration models_count       seed       acc  auc    balacc   logloss      mae        r2     rmse
0  openml.org/t/3913          kc2  constantpredictor       test    0   0.50000          auc  local  0.23.2         dev [issue/268, b990e93]  2021-04-01T15:52:54       0.2                0.0               0.0            1  933769621  0.792453  0.5  0.500000  0.510714      NaN       NaN      NaN
1  openml.org/t/3913          kc2  constantpredictor       test    1   0.50000          auc  local  0.23.2         dev [issue/268, b990e93]  2021-04-01T15:52:54       0.1                0.0               0.0            1  933769622  0.792453  0.5  0.500000  0.510714      NaN       NaN      NaN
2    openml.org/t/59         iris  constantpredictor       test    0  -1.09861  neg_logloss  local  0.23.2         dev [issue/268, b990e93]  2021-04-01T15:52:54       0.0                0.0               0.0            1  933769621  0.333333  NaN  0.333333  1.098610      NaN       NaN      NaN
3    openml.org/t/59         iris  constantpredictor       test    1  -1.09861  neg_logloss  local  0.23.2         dev [issue/268, b990e93]  2021-04-01T15:52:54       0.0                0.0               0.0            1  933769622  0.333333  NaN  0.333333  1.098610      NaN       NaN      NaN
4  openml.org/t/2295  cholesterol  constantpredictor       test    0 -45.68970     neg_rmse  local  0.23.2         dev [issue/268, b990e93]  2021-04-01T15:52:54       0.0                0.0               0.0            1  933769621       NaN  NaN       NaN       NaN  35.6774 -0.077562  45.6897
5  openml.org/t/2295  cholesterol  constantpredictor       test    1 -55.00410     neg_rmse  local  0.23.2         dev [issue/268, b990e93]  2021-04-01T15:52:54       0.0                0.0               0.0            1  933769622       NaN  NaN       NaN       NaN  45.3871 -0.049133  55.0041
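
As a sketch of the rule above (the helper name normalize_result and the set of error metrics are illustrative assumptions, not the PR's actual code):

# Illustrative sketch: for error/loss metrics (lower is better), prefix
# the name with neg_ and negate the value, so that a higher result is
# always better across the results table.
ERROR_METRICS = {"logloss", "mae", "mse", "rmse"}  # assumed set, for illustration

def normalize_result(metric, value):
    if metric in ERROR_METRICS:
        return "neg_" + metric, -value
    return metric, value

print(normalize_result("auc", 0.5))       # ('auc', 0.5)
print(normalize_result("rmse", 45.6897))  # ('neg_rmse', -45.6897)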

@sebhrusen (Collaborator, Author) commented:

@Innixma does this look reasonable?

@sebhrusen requested a review from @PGijsbers on April 1, 2021.
@PGijsbers (Collaborator) left a comment:

Prefixing neg_ is probably more sensible (since scikit-learn does it), so I agree with that choice even though it takes more space. Other than that, I tested it (with the constant predictor) and it looks good.
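
For context, scikit-learn's convention can be checked in a few lines (an aside, not code from this PR): its loss-based scorers are exposed under neg_-prefixed names and return negated values, which reproduces the iris numbers in the table above:

# scikit-learn exposes loss metrics as negated, neg_-prefixed scorers,
# so a larger score is always better for model selection.
from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.metrics import get_scorer

X, y = load_iris(return_X_y=True)
clf = DummyClassifier(strategy="prior").fit(X, y)  # a constant predictor
print(get_scorer("neg_log_loss")(clf, X, y))  # ~ -1.0986, matching the iris rows above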

@sebhrusen merged commit f7b68eb into master on April 6, 2021, and deleted the issue/268 branch.
@Innixma (Collaborator) left a comment:

Apologies for the late response, I was on PTO.

Looks good to me; I left a comment about improving long-term code quality and extensibility.

def auc(self):
"""Array Under (ROC) Curve, computed on probabilities, not on predictions"""
Collaborator:

nit: area instead of array

@sebhrusen (Collaborator, Author):

oops! will fix

return float(r2_score(self.truth, self.predictions))


def higher_is_better(metric):
Collaborator:

This seems a bit hacky. Better to have either a dictionary mapping or metrics as classes (see AutoGluon for an example).

@sebhrusen (Collaborator, Author):

I can't disagree with you: it IS a bit hacky.
Ideally, there should be a class for each metric. It's probably something I'll do at some point, to support custom metrics or other customizations in a more satisfying way than what was done in #141.
If there's demand for it, I'll do it.
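
For illustration, a minimal sketch of the class-based alternative discussed above (the Metric class and registry are assumptions loosely inspired by AutoGluon's scorer design, not code from this PR):

# Hypothetical sketch: metrics as objects carrying their own direction,
# replacing a string-based higher_is_better() lookup.
from dataclasses import dataclass
from typing import Callable

from sklearn.metrics import log_loss, roc_auc_score

@dataclass(frozen=True)
class Metric:
    name: str
    fn: Callable                 # fn(truth, predictions) -> float
    greater_is_better: bool

    def result(self, truth, predictions):
        """Return (reported name, reported value), negating losses."""
        value = float(self.fn(truth, predictions))
        if self.greater_is_better:
            return self.name, value
        return "neg_" + self.name, -value

# A registry would make adding custom metrics a one-line change:
METRICS = {
    "auc": Metric("auc", roc_auc_score, greater_is_better=True),
    "logloss": Metric("logloss", log_loss, greater_is_better=False),
}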
