'Expecting data to be a DMatrix object, got: ', <class 'pandas.core.frame.DataFrame'> #498
Comments
@yzheng27 Thanks for reporting the issue. Could you try the latest interpret-community release, 0.24.2, and see if the issue persists? If it does, could you provide a sample notebook so that we can reproduce it locally? A stack trace of the error would also help us greatly in triaging this issue. Regards,
@gaugup I think the issue is happening because they are using the native XGBoost API, which requires DMatrix, instead of the scikit-learn XGBoost API, which is pandas compatible, so I'm guessing that upgrading to the latest version won't fix it. @yzheng27 I will take a look to see if we can support DMatrix from XGBoost somehow, but an easy quick fix would be to use the scikit-learn API for XGBoost.
Thank you. I was able to generate the global_explanation by loading the model with the scikit-learn interface. But now my notebook has been running the code below for several hours. Is that expected? The shape of x_test is around 24000*325.
"the shape of x_test is around 24000*325"
@yzheng27 one other thing: are you importing the dashboard from the raiwidgets package, in this repository?
https://github.com/microsoft/responsible-ai-toolbox
Make sure you don't import it from the interpret-community package, as it has been moved to the other repository. Also, can you check that you have the latest version of the raiwidgets package with ExplanationDashboard?
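The exact command from this comment is not preserved here. As one possible way to check installed versions from inside a notebook, here is a small sketch using only the standard library (Python 3.8+); the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Packages mentioned in this thread; None means the package is not installed.
for name in ("raiwidgets", "interpret-community", "xgboost"):
    print(name, installed_version(name))
```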
@imatiach-msft I'm using the library from raiwidgets and the version is 0.15.1. I was able to get the dashboard with the data dimensions I mentioned, though it took several hours. Will try with smaller data.
@yzheng27 if it took several hours but eventually worked, then the UI most likely just loaded too much data, and downsampling should speed it up significantly. All of the datapoints are loaded into the UI, and I've noticed that the UI usually becomes very slow beyond about 5k datapoints. Perhaps in the future the UI could be changed to stream selected data from the Python backend, or to aggregate statistics across multiple points, for users who want to run it on a lot of data; I'm not sure. The ErrorAnalysisDashboard is actually able to work on millions of points if you pass in a sample_dataset for the Dataset Explorer, so perhaps something like that could be done for the ExplanationDashboard as well.
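To make the downsampling suggestion concrete, here is a minimal pandas sketch. The data is synthetic, shaped like the x_test described in this thread, and the 5000-row cap is an assumption taken from the rough 5k figure mentioned above, not a documented limit.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in with the shape reported in the thread (about 24000 x 325).
rng = np.random.default_rng(0)
x_test = pd.DataFrame(rng.normal(size=(24000, 325)))
y_test = pd.Series(rng.normal(size=24000))

MAX_DASHBOARD_ROWS = 5000  # assumption: rough threshold from the comment above

# Draw a reproducible random subset of rows and keep the labels aligned by index.
n_rows = min(MAX_DASHBOARD_ROWS, len(x_test))
x_sample = x_test.sample(n=n_rows, random_state=0)
y_sample = y_test.loc[x_sample.index]

print(x_sample.shape, y_sample.shape)
```

The sampled `x_sample` and `y_sample`, rather than the full test set, would then be passed to `explainer.explain_global(...)` and to the dashboard.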
Was following the example https://github.com/interpretml/interpret-community/blob/master/notebooks/explain-regression-local.ipynb on my own data and xgboost object, but got the error
('Expecting data to be a DMatrix object, got: ', <class 'pandas.core.frame.DataFrame'>)
at explainer.explain_global(x_test). Changing x_test to a DMatrix generates the error 'DMatrix' object has no attribute 'shape'. Please advise. Thank you.
Version:
interpret-community==0.23.0
interpret-core==0.2.7
xgboost==1.4.1