Update to work with Python 3.12 and Scikit-Learn >= 1.2 #77
self.stop = list(
    set(english_stopwords).union([sf.lower() for sf
                                  in self.shortforms])
)
`TfidfVectorizer` now only allows lists for `stop_words` and no longer accepts generic containers.
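A minimal sketch of the conversion described in this comment. The helper names `build_stop_words` and `make_vectorizer` and the sample words are hypothetical, not part of the PR. It uses `sorted` rather than `list` only to get a deterministic order, and the scikit-learn import is deferred so the list-building step can be tried on its own.

```python
def build_stop_words(english_stopwords, shortforms):
    """Merge stop words with lowercased shortforms and return a plain list."""
    merged = set(english_stopwords).union(sf.lower() for sf in shortforms)
    # A set is what older code may have passed directly; convert it to a
    # list because stop_words is validated as a list in recent scikit-learn.
    return sorted(merged)


def make_vectorizer(stop_words):
    """Build a TfidfVectorizer with a list of stop words (needs scikit-learn)."""
    # Imported lazily so build_stop_words has no scikit-learn dependency.
    from sklearn.feature_extraction.text import TfidfVectorizer
    return TfidfVectorizer(stop_words=stop_words)


# Hypothetical sample inputs, for illustration only.
stop = build_stop_words({"the", "of", "and"}, ["ER", "Ir"])
print(stop)  # ['and', 'er', 'ir', 'of', 'the']
```

With scikit-learn installed, `make_vectorizer(stop)` should then accept the list without a parameter validation error.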
f1_scorer = make_scorer(f1_score, labels=self.pos_labels,
                        pos_label=None,
Input validation for `f1_score` and friends now requires explicitly setting `pos_label=None` when no `pos_label` is desired. Before, if `average` was not `binary`, `pos_label` was completely ignored.
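A rough sketch of building such a scorer with `pos_label=None` set explicitly. The helper names `f1_scorer_kwargs` and `build_f1_scorer`, the sample labels, and the choice of `average="weighted"` are assumptions for illustration; the diff excerpt above is truncated before its `average` argument.

```python
def f1_scorer_kwargs(pos_labels, average="weighted"):
    """Keyword arguments for f1_score when no single positive label applies."""
    if average == "binary":
        # Binary averaging needs a concrete positive label, so pos_label=None
        # would not make sense there.
        raise ValueError("pos_label=None requires a non-binary average")
    return {"labels": list(pos_labels), "pos_label": None, "average": average}


def build_f1_scorer(pos_labels, average="weighted"):
    """Wrap f1_score in a scorer (needs scikit-learn)."""
    # Imported lazily so f1_scorer_kwargs can be inspected without scikit-learn.
    from sklearn.metrics import f1_score, make_scorer
    return make_scorer(f1_score, **f1_scorer_kwargs(pos_labels, average))


# Hypothetical labels, for illustration only.
kwargs = f1_scorer_kwargs(["estrogen receptor", "endoplasmic reticulum"])
print(kwargs["pos_label"], kwargs["average"])
```

Keeping the keyword arguments in one place makes it easy to see that `pos_label=None` is always passed alongside a non-binary `average`.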
- feature_names = tfidf.get_feature_names()
+ feature_names = tfidf.get_feature_names_out()
This method was renamed in scikit-learn 1.0; the old name was deprecated and then removed in 1.2.
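If code had to run against both older and newer scikit-learn releases, one option would be a small shim like the sketch below. The helper `feature_names` and the two stand-in classes are hypothetical and only mimic the old and new method names; the PR itself simply switches to `get_feature_names_out()`.

```python
def feature_names(vectorizer):
    """Return feature names as a list, whichever method name is available."""
    if hasattr(vectorizer, "get_feature_names_out"):
        # Name used from scikit-learn 1.0 onward.
        return list(vectorizer.get_feature_names_out())
    # Older name, removed in scikit-learn 1.2.
    return list(vectorizer.get_feature_names())


class OldStyleVectorizer:
    """Stand-in exposing only the pre-1.0 method name."""
    def get_feature_names(self):
        return ["er", "receptor"]


class NewStyleVectorizer:
    """Stand-in exposing only the current method name."""
    def get_feature_names_out(self):
        return ("er", "receptor")


print(feature_names(OldStyleVectorizer()))  # ['er', 'receptor']
print(feature_names(NewStyleVectorizer()))  # ['er', 'receptor']
```

A fitted `TfidfVectorizer` could be passed to `feature_names` in the same way as the stand-ins.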
Thanks @steppi, this looks great - I tested with Gilda and it works well on Python 3.12.
This PR makes the following changes:
- `_score.c`: Cython is required as a build dependency instead.
- `adeft/modeling/classify.py`: updated so that it works with the latest scikit-learn.

This supersedes #76. Hopefully everything in CI passes.