Implement Shannon entropy #73

Open
apmt opened this issue Feb 14, 2023 · 0 comments

apmt commented Feb 14, 2023

Epic: #74

This issue covers implementing the Shannon entropy feature in Lib.

An existing implementation, written by the Mindsu team, is available on GitLab (https://gitlab.com/dell-ml-unb/esw/-/blob/main/research/isolation_forest/features.py).
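
For reference, here is a minimal sketch of a character-level Shannon entropy, H = -Σ p(c) · log2 p(c), where p(c) is the relative frequency of character c in the string. It is not the Mindsu code from the GitLab link; the function name `shannon_entropy`, the use of log base 2 (bits per character), and returning 0.0 for an empty string are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character.

    Hypothetical helper: name, log base and empty-string handling are
    assumptions, not taken from the referenced implementation.
    """
    if not text:
        return 0.0  # assumed convention for empty input

    counts = Counter(text)  # occurrences of each distinct character
    total = len(text)

    # -sum(p * log2(p)) written as sum(p * log2(1 / p)) so that a
    # single-symbol string yields 0.0 rather than -0.0.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    for sample in ["aaaa", "aabb", "abcd"]:
        print(sample, shannon_entropy(sample))
```

A string with one repeated character scores 0, and the score grows as characters become more varied and evenly distributed, which is presumably why the feature is thresholded in the model.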

apmt self-assigned this Feb 14, 2023
apmt added this to the Sprint 5 milestone Feb 14, 2023
apmt pushed a commit that referenced this issue Feb 15, 2023
apmt pushed a commit that referenced this issue Feb 15, 2023
apmt added a commit that referenced this issue Feb 17, 2023
* #76 remove 'y' from consonant sequences feature

* #77 add all Mexican state abbreviations and their source in the docstring

* #73 implement Shannon entropy method and adapt the threshold calculation to also match values below

* #73 new model with Shannon entropy and notebook sets

* #73 fix Baja California abbreviation

* #73 fix keysmash sequence test to also consider special characters

* #74 fix private methods naming convention to double underscores

* #75 and #78 add KeySmash features: repeated bigrams and unique chars ratios

* #74 fix tests and parser private methods

* #74 add model test

* #74 update initial sets models

---------

Co-authored-by: atarchetti <[email protected]>