There seems to be an edge case which is not considered in our implementation.
For lightgbm, only the path-dependent method is supported when categorical features exist (i.e. in a pd.DataFrame we have columns with dtype category). See link here.
The interventional method is not supported because shap doesn't know how to deal with categorical features; one has to one-hot encode (OHE) them to make it work, as in the sketch below.
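To make the setup concrete, here is a minimal sketch (toy data, illustrative column names) of a lightgbm model trained on a frame with a category column, together with the OHE workaround that makes the interventional method viable:

```python
import numpy as np
import pandas as pd
import lightgbm as lgb

# Toy frame with a `category` column: the setup that triggers the edge case.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "num": rng.normal(size=100),
    "cat": pd.Categorical(rng.choice(["a", "b", "c"], size=100)),
})
y = (X["num"] > 0).astype(int)

# lightgbm handles `category` dtype natively; only path-dependent SHAP
# values can be computed for this model.
model = lgb.LGBMClassifier().fit(X, y)

# Workaround for the interventional method: one-hot encode the categorical
# columns so that shap only ever sees a purely numeric matrix.
X_ohe = pd.get_dummies(X, columns=["cat"])
model_ohe = lgb.LGBMClassifier().fit(X_ohe, y)
```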
For the path-dependent approach, there are a few parameters that are not set, one of them being num_outputs, which breaks the code in the first place. Those params are only set when self.trees is not None. See link here.
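For reference, this is how the two estimation modes are selected in shap itself: no background data means path-dependent, while supplying data switches to interventional (defaults have shifted between shap versions, so both modes are spelled out explicitly here):

```python
import shap

# `model` and `X` as in the lightgbm sketch above.

# Path-dependent: no background data needed; this is the only mode that
# works with lightgbm categorical features.
explainer_pd = shap.TreeExplainer(model, feature_perturbation="tree_path_dependent")

# Interventional: requires background data, and chokes on `category` columns.
explainer_int = shap.TreeExplainer(model, data=X, feature_perturbation="interventional")
```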
If we fix that, another problem arises in the _build_explanation method. _build_explanation calls the predict function to compute the raw predictions returned in the explanation (see link here). The predict function uses a TreeEnsemble wrapper defined in shap, which doesn't work because it relies on a C extension (cext) that also doesn't know how to handle categorical features. See link here.
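A rough reproduction of the failure path, assuming alibi's TreeShap wrapper (the exact constructor arguments are from memory and may differ):

```python
from alibi.explainers import TreeShap

# `model` and `X` as in the lightgbm sketch above.
explainer = TreeShap(model, task="classification")

# Calling fit() without background data selects the path-dependent branch,
# which leaves num_outputs unset because self.trees is None.
explainer.fit()

# Even with num_outputs patched, explain() fails: _build_explanation's
# internal predict() call goes through shap's TreeEnsemble cext, which
# cannot handle the `category` columns in X.
explanation = explainer.explain(X)
```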
Seems like two different issues. Since the interventional method is not supported, fitting with a dataset should not work (can we raise an error if fit is called with arguments?). Additionally, in the path-dependent case we might also want to raise an error if the explain step is called with e.g. a pd.DataFrame containing categorical values?
The other issue is to do with the case where fit is called correctly without arguments, so that the path-dependent method is used. Is it correct to say that the reason things don't quite work here is that we perform an additional predict call? It's not quite clear to me what the required fix is and how it interferes with this predict call.
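A hypothetical sketch of the two guards suggested above (helper and method names are illustrative, not the wrapper's actual API):

```python
import pandas as pd

def _has_categoricals(X) -> bool:
    # Illustrative helper: detect `category` columns in a DataFrame input.
    return isinstance(X, pd.DataFrame) and any(
        str(dtype) == "category" for dtype in X.dtypes
    )

def fit(self, background_data=None):
    # Guard 1: fitting with a dataset implies the interventional method,
    # which is unsupported with categorical features, so fail fast.
    if background_data is not None and _has_categoricals(background_data):
        raise NotImplementedError(
            "Interventional TreeShap does not support categorical features; "
            "one-hot encode them or fit without background data."
        )
    ...

def explain(self, X):
    # Guard 2: in the path-dependent case, reject categorical inputs at
    # explain time until the downstream predict call can handle them.
    if _has_categoricals(X):
        raise NotImplementedError(
            "explain() does not support DataFrames with `category` columns."
        )
    ...
```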
Might be a related issue with explaining catboost models.
There, the categorical features are transformed internally inside the model (docs). The input to explain() should then not be encoded, and consequently _build_explanation fails in this case as well.
The shap library is able to output explanations by first converting the input data to a catboost.Pool that handles the transformations (see here).
Would appreciate if something similar could be added to your wrapper as well; a sketch of the pattern is below.
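For concreteness, a minimal sketch (toy data) of the catboost pattern described above; shap performs an equivalent Pool conversion internally, so the wrapper could do the same:

```python
import numpy as np
import pandas as pd
import catboost
import shap

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "num": rng.normal(size=100),
    "cat": rng.choice(["a", "b", "c"], size=100),
})
y = (X["num"] > 0).astype(int)

# catboost transforms categorical features internally, so the raw
# (unencoded) strings are passed straight in.
model = catboost.CatBoostClassifier(iterations=50, verbose=False)
model.fit(X, y, cat_features=["cat"])

# Wrapping the input in a catboost.Pool lets the model apply its own
# categorical transformations before the shap values are computed.
pool = catboost.Pool(X, label=y, cat_features=["cat"])
shap_values = shap.TreeExplainer(model).shap_values(pool)
```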