Currently, if you want to repeatedly transform text samples with `hypertools.tools.format_data()` using the same parameters, the function re-fits both the vectorizer and the text model on every call. This is inefficient, and for expensive or numerous operations it makes working directly with the underlying `sklearn` classes the better option.
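For reference, here is a minimal sketch of the workaround: fit the `sklearn` objects once and reuse them for later samples. The choice of `CountVectorizer` and `LatentDirichletAllocation`, the `n_components` value, and the toy documents are assumptions made for illustration, not necessarily what `format_data()` is configured with.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpora, made up for illustration.
train_docs = [
    "the cat sat on the mat",
    "dogs and cats are common pets",
    "stock markets fell sharply today",
    "investors worry about market volatility",
]
new_docs = [
    "my cat chased the dog",
    "the market recovered after the drop",
]

# Assumed model choices; swap in whatever vectorizer / text model you use.
vectorizer = CountVectorizer()
text_model = LatentDirichletAllocation(n_components=3, random_state=0)

# Fit both steps once on the training samples.
train_topics = text_model.fit_transform(vectorizer.fit_transform(train_docs))

# Later calls only transform, so nothing is re-fit.
new_topics = text_model.transform(vectorizer.transform(new_docs))
```

This avoids the repeated fitting, but you have to keep track of both fitted objects yourself and call them in the right order.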
We could add an argument to return the fitted models for reuse, but a really nice feature would be something like a scikit-learn `Pipeline` object that you could create, fit, save, and reuse to perform various processing steps with a single call. This would also be a very attractive feature for hypertools, since it could additionally implement methods like `.plot()` and `.describe()`.
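To make the idea concrete, here is a rough sketch of the create / fit / save / reuse workflow using a plain scikit-learn `Pipeline`. This is not an existing hypertools API. The steps, parameters, and documents are assumptions for illustration, and the proposed `.plot()` and `.describe()` methods are not shown because they do not exist yet.

```python
import pickle

from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline

# Toy corpora, made up for illustration.
train_docs = [
    "the cat sat on the mat",
    "dogs and cats are common pets",
    "stock markets fell sharply today",
    "investors worry about market volatility",
]
new_docs = ["my cat chased the dog", "the market recovered after the drop"]

# Create: chain the vectorizer and the text model (assumed choices).
pipeline = make_pipeline(
    CountVectorizer(),
    LatentDirichletAllocation(n_components=3, random_state=0),
)

# Fit: both steps are fit once, in order.
pipeline.fit(train_docs)

# Save: serialize the fitted pipeline (in memory here; a file works the same way).
saved = pickle.dumps(pipeline)

# Reuse: restore it and transform new samples with a single call, without re-fitting.
restored = pickle.loads(saved)
new_topics = restored.transform(new_docs)
```

A hypertools-level object could wrap something like this and add plotting or summary methods on top, which is the part that a bare `Pipeline` does not cover.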