I am trying to use hyperimpute on my custom data. I am using the following setup:
from hyperimpute.plugins.imputers import Imputers

method = "hyperimpute"
plugin = Imputers().get(
    method,
    optimizer="hyperband",
    classifier_seed=["logistic_regression", "catboost", "xgboost", "random_forest"],
    regression_seed=[
        "linear_regression",
        "catboost_regressor",
        "xgboost_regressor",
        "random_forest_regressor",
    ],
    # class_threshold: int. Maximum number of unique values a column may have to be treated as categorical.
    class_threshold=5,
    # imputation_order: int. 0 - ascending, 1 - descending, 2 - random
    imputation_order=2,
    # n_inner_iter: int. Number of imputation iterations.
    n_inner_iter=10,
    # select_model_by_column: bool. If true, select a different model for each column. Else, reuse the model chosen for the first column.
    select_model_by_column=True,
    # select_model_by_iteration: bool. If true, select new models for each iteration. Else, reuse the models chosen in the first iteration.
    select_model_by_iteration=True,
    # select_lazy: bool. If false, start the optimizer on every column unless other restrictions apply. Else, if there is a trend in the current iteration (at least two columns of the same type got the same model from the optimizer), reuse the same model class for all the columns without starting the optimizer.
    select_lazy=True,
    # select_patience: int. How many iterations without objective function improvement to wait.
    select_patience=5,
)

# traindataSelected is my pandas DataFrame (1000 rows x 372 columns)
# fit it on the data
plugin.fit(traindataSelected.copy())
# predict the missing values
predictedval = plugin.transform(traindataSelected.copy())
My training data has 1000 rows and 372 columns. When I run this, I get the following error:
---> [78] predictedval = plugin.transform(traindataSelected.copy())
ValueError: Length mismatch: Expected axis has 368 elements, new values have 372 elements
Can you please let me know if I am missing something or the reason for the error? Is there a way to manually specify which columns should be considered continuous and which ones should be treated as discrete?
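On the continuous vs. discrete part, here is a rough preview of how a unique-value threshold such as class_threshold=5 would split the columns. This is only a sketch: the helper split_by_unique_count is hypothetical, the toy frame stands in for my real data, and the rule "unique observed values <= threshold means discrete" is my assumption from the parameter comment, not something confirmed from the library source.

```python
import numpy as np
import pandas as pd


def split_by_unique_count(df: pd.DataFrame, class_threshold: int = 5):
    """Partition column names by their number of distinct observed values.

    Assumption: a column counts as discrete when it has at most
    `class_threshold` unique non-missing values.
    """
    n_unique = df.nunique(dropna=True)
    discrete = n_unique[n_unique <= class_threshold].index.tolist()
    continuous = n_unique[n_unique > class_threshold].index.tolist()
    return discrete, continuous


# Toy data standing in for the real frame (hypothetical values).
toy = pd.DataFrame({
    "flag": [0, 1, 0, 1, np.nan, 1, 0, 1],
    "score": [0.1, 0.4, 0.35, 0.8, 0.9, np.nan, 0.2, 0.75],
})

discrete, continuous = split_by_unique_count(toy, class_threshold=5)
print("discrete:", discrete)
print("continuous:", continuous)
```

Running the same helper on the real frame would at least show which columns are likely to be handled as categorical under the current threshold.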
Even when I use the mean imputer, the imputed data has 368 columns while my original data has 372 columns.
method = "mean"
plugin = Imputers().get(method)
# fit it on the data
plugin.fit(X.copy())
# predict the missing values
predictedval = plugin.transform(X.copy())
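One thing that might be worth checking (a guess, not a confirmed diagnosis) is whether some columns have no observed values at all. scikit-learn's SimpleImputer, for example, discards columns that are entirely missing at fit time by default, and if the imputers here behave similarly, four all-NaN columns would account for 372 becoming 368. The sketch below uses only pandas; find_empty_columns is a hypothetical helper and the toy frame stands in for the real data.

```python
import numpy as np
import pandas as pd


def find_empty_columns(df: pd.DataFrame) -> list:
    """Return the names of columns that contain no observed values at all."""
    return [col for col in df.columns if df[col].isna().all()]


# Toy data standing in for the real frame (hypothetical values).
toy = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, np.nan],  # entirely missing
    "c": [0.5, 0.7, np.nan],
})

empty_cols = find_empty_columns(toy)
print("all-NaN columns:", empty_cols)

# Dropping them up front keeps the input and output column counts aligned.
cleaned = toy.drop(columns=empty_cols)
print("shape after dropping:", cleaned.shape)
```

If this reports exactly four columns on the real data, dropping them before calling fit and transform should make the column counts match.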
Thanks!