Just notice something #232

Open
dalide opened this issue Dec 1, 2017 · 2 comments

Comments

dalide (Contributor) commented Dec 1, 2017

Hi @kwinkunks, I know it has been a year already. I just happened to take a look at this repo again and found in utils.py that the score used is accuracy, not the actual F1 score. Is that right?

kwinkunks (Member) commented

Hi... Indeed, I used the accuracy function in utils.py. This is the same as sklearn.metrics.f1_score with average='micro'.
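For anyone who wants to check this, here's a minimal sketch with invented labels (not the contest data) showing that accuracy and micro-averaged F1 coincide for single-label, multi-class predictions:

```python
# Minimal check that accuracy equals micro-averaged F1 for single-label,
# multi-class predictions. The labels below are invented for illustration.
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 2, 2, 1, 0, 2, 1]
y_pred = [0, 2, 2, 2, 1, 0, 1, 1]

acc = accuracy_score(y_true, y_pred)
micro = f1_score(y_true, y_pred, average='micro')

# With 'micro', precision, recall and F1 all reduce to the accuracy.
print(acc, micro)  # 0.75 0.75
assert acc == micro
```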

This was discussed in another issue. It seems that using average='weighted' may have been more sensible for a dataset with such imbalanced labels, but 'micro' is what we went with.
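To make the difference concrete, here's a hypothetical imbalanced example (the numbers are made up): a classifier that simply never predicts a rare class still gets a 'micro' score equal to its accuracy, while 'weighted' penalizes the miss:

```python
# Hypothetical imbalanced example: 90 samples of class 0, 10 of class 1,
# and a classifier that never predicts the rare class.
from sklearn.metrics import f1_score

y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100

# 'micro' reduces to accuracy: 0.90
print(f1_score(y_true, y_pred, average='micro'))

# 'weighted' averages per-class F1 by support: ~0.85 here, since the rare
# class scores 0. zero_division=0 silences the undefined-precision warning
# for the never-predicted class (available in scikit-learn >= 0.22).
print(f1_score(y_true, y_pred, average='weighted', zero_division=0))
```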

It would be interesting to test them against each other. One question I have is whether weighting the small populations would make the results rather unstable, since single instances of a small population could substantially change the score. Since I was using 100 realizations of the scores for the final ranking, maybe this would not have been a big concern, but I think it's worth investigating.
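One way to probe that is sketched below. The data is invented, and plain bootstrap resampling is an assumption here; it is not necessarily how the contest's 100 realizations were produced.

```python
# Rough sketch of the stability comparison suggested above: resample an
# imbalanced test set 100 times and compare the spread of the two scores.
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

# 200 samples, one rare class with only 5 members.
y_true = np.array([0] * 150 + [1] * 45 + [2] * 5)
y_pred = y_true.copy()
flips = rng.choice(len(y_true), size=20, replace=False)
y_pred[flips] = rng.integers(0, 3, size=20)  # inject some random errors

micro, weighted = [], []
for _ in range(100):
    idx = rng.integers(0, len(y_true), size=len(y_true))
    micro.append(f1_score(y_true[idx], y_pred[idx], average='micro'))
    weighted.append(
        f1_score(y_true[idx], y_pred[idx], average='weighted', zero_division=0)
    )

# If the rare class destabilizes the weighted score, its spread shows it.
print('micro    mean/std: %.3f / %.3f' % (np.mean(micro), np.std(micro)))
print('weighted mean/std: %.3f / %.3f' % (np.mean(weighted), np.std(weighted)))
```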

dalide (Contributor, Author) commented Dec 1, 2017

I see. I do remember seeing this discussion somewhere, but I couldn't find it last night. I agree that average='weighted' would be a better choice for such an imbalanced dataset, since 'micro' basically gives accuracy in the usual sense.

If some population is too small, I guess its precision and recall would not be weighted very heavily in the total.
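That matches how 'weighted' is defined: per-class F1 scores averaged with support as the weights, so a tiny class contributes proportionally little. A small sanity check with invented numbers:

```python
# Sanity check that 'weighted' F1 weights each class by its support.
# Invented data: class 1 has only 3 of 100 samples and is missed entirely.
import numpy as np
from sklearn.metrics import f1_score

y_true = [0] * 97 + [1] * 3
y_pred = [0] * 100

per_class = f1_score(y_true, y_pred, average=None, zero_division=0)
support = np.array([97, 3])
manual = float((per_class * support).sum() / support.sum())

# The manual support-weighted average reproduces average='weighted', and
# missing the 3-sample class costs only a few points of the total score.
print(per_class)  # [~0.985, 0.0]
print(manual, f1_score(y_true, y_pred, average='weighted', zero_division=0))
```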
