-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MetaCAT fixes and upgrades #495
Conversation
Pushing for 3 updates: 1) Removed the check and update for labels with zero data, as this was causing issues during evaluation 2) Resolved an issue where the confusion matrix couldn't be calculated when testing on a single class with an F1 score of 1, as it expected the original number of training classes (3) 3) Updated the attention mask creation to dynamically use the actual pad_idx value instead of assuming it to be 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can do a slightly better approach where I commented.
But I'll leave a more general comment here as well.
This PR attempts to fix 3 things at once. But it's not immediately clear which part of the changes corresponds to which fix. And the commit messages don't help either
If in the future there's issues with some of this code, we'll come back and see when this was changed. And we'll come to this PR. We'll try to understand why the changes were made. But the commit messages won't be able to help us. And we'll be left having to look into the change logs to deduce which of the 3 fixes are relevant to the code at hand initially.
I would recommend in the future to use more descriptive commits and do one thing at a time
- Instead of "Pushing change" with all the changes you're making
- Make one change, and commit it with a message (e.g) "Allow dynamic use of correct pad_idx value instead of defaulting to 0"
- Then make the next change and commit with a message specific to that change
By doing this you would make the (potential) future trouble shooting easier since we would be able to track a change to a specific commit and its descriptive commit message. And it also has the added bonus of lowering the chance of committing changes that you weren't meaning to include since you're only dealing with one thing at a time.
PS:
In principle, it's also best to try to limit 1 PR to 1 change. But given how small these 3 are, I'm not too worried about it here.
For the issue where the confusion matrix couldn't be calculated when testing on a single class with an F1 score of 1, as it expected the original number of training classes (3), pushing an optimized version w/o the try except block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
* MetaCAT fixes and upgrades Pushing for 3 updates: 1) Removed the check and update for labels with zero data, as this was causing issues during evaluation 2) Resolved an issue where the confusion matrix couldn't be calculated when testing on a single class with an F1 score of 1, as it expected the original number of training classes (3) 3) Updated the attention mask creation to dynamically use the actual pad_idx value instead of assuming it to be 0 * Pushing type fix * Pushing for type fix * Fixing type issues * Pushing change * Pushing update w/o try except block For the issue where the confusion matrix couldn't be calculated when testing on a single class with an F1 score of 1, as it expected the original number of training classes (3), pushing an optimized version w/o the try except block
* MetaCAT fixes and upgrades Pushing for 3 updates: 1) Removed the check and update for labels with zero data, as this was causing issues during evaluation 2) Resolved an issue where the confusion matrix couldn't be calculated when testing on a single class with an F1 score of 1, as it expected the original number of training classes (3) 3) Updated the attention mask creation to dynamically use the actual pad_idx value instead of assuming it to be 0 * Pushing type fix * Pushing for type fix * Fixing type issues * Pushing change * Pushing update w/o try except block For the issue where the confusion matrix couldn't be calculated when testing on a single class with an F1 score of 1, as it expected the original number of training classes (3), pushing an optimized version w/o the try except block
Pushing for 3 updates: