Added shapley values to rest scorer endpoint #241
Conversation
One more point: please make sure that the user can:
@mmalohlava aren't the columns seen in the sample response (first comment, #241 (comment)) a decent representation of what we should expect, at least for Shapley values on transformed features?
@Rajimut I think this is different... these are shap contributions on the original features. cc @mmalohlava
Ah, I see... thanks. So that leaves us with what we have: column names listed out along with their contributions for transformed features.
Here is an example - a simple call of the Python scoring pipeline (scorer below is the pipeline's Scorer instance):
import pandas as pd

columns = [
pd.Series(['4.599999904632568', '4.900000095367432', '4.800000190734863', '4.699999809265137', '5.099999904632568', '5.199999809265137', '4.400000095367432', '5.0', '5.199999809265137', '4.900000095367432', '5.099999904632568', '4.300000190734863', '4.800000190734863', '4.300000190734863', '5.199999809265137'], name='Sepal_Length', dtype='float32'),
pd.Series(['1.600000023841858', '1.2000000476837158', '1.0', '1.5', '1.2000000476837158', '1.600000023841858', '1.2999999523162842', '1.7000000476837158', '1.0', '3.0', '1.0', '1.2999999523162842', '1.2000000476837158', '1.2999999523162842', '1.100000023841858'], name='Petal_Length', dtype='float32'),
]
df = pd.concat(columns, axis=1)
preds = (scorer.score_batch(df, apply_data_recipes=False, pred_contribs=True, pred_contribs_original=True))
print(preds)
preds.to_csv("preds.csv")
Result: see the full preds.csv file. 💡 Observation:
This is another example, using the same dataset, but forcing DAI to generate some transformed features. Then if I score data:
columns = [
pd.Series(['4.599999904632568' ], name='Sepal_Length', dtype='float32'),
pd.Series(['2.700000047683716'], name='Sepal_Width', dtype='float32'),
pd.Series(['1.2999999523162842'], name='Petal_Length', dtype='float32'),
pd.Series(['1.2999999523162842'], name='Petal_Width', dtype='float32'),
]
df = pd.concat(columns, axis=1)
preds = (scorer.score_batch(df, apply_data_recipes=False, pred_contribs=True, pred_contribs_original=True))
print(preds)
preds.to_csv("preds_only_original.csv")
Then the result is:
(see full results here: preds_only_original.csv)
print('---------- Score Frame ----------')
columns = [
pd.Series(['4.599999904632568' ], name='Sepal_Length', dtype='float32'),
pd.Series(['2.700000047683716'], name='Sepal_Width', dtype='float32'),
pd.Series(['1.2999999523162842'], name='Petal_Length', dtype='float32'),
pd.Series(['1.2999999523162842'], name='Petal_Width', dtype='float32'),
]
df = pd.concat(columns, axis=1)
preds = (scorer.score_batch(df, apply_data_recipes=False, pred_contribs=True, pred_contribs_original=False))
print(preds)
preds.to_csv("preds_only_transformed.csv")
Result - the full results:
Is the example output correct? I cannot match it to the https://github.com/h2oai/mojo2/blob/7a1ab76b09f056334842a5b442ff89859aabf518/doc/shap.md
What if we had a "richer" data structure in the output - something that would not need to be parsed to get the combination? Would that be useful? Is the notation in the example and the description something standard?
New to SHAP values; apologies if the following is not as relevant as I think it is.
It seems like the separator character between InputName and OutputIndex (i.e. _, according to the MOJO2 doc) might generate non-regular field names. E.g., since the InputNames (i.e., fields) that SHAP is generating also have at least one underscore themselves, it seems like it might be difficult to deterministically parse a label back into InputName and OutputIndex.
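To make the ambiguity concrete, here is a minimal sketch (the feature names and the naive split are made up for illustration; this is not the actual MOJO2 naming grammar):

```python
# Naive parse of a contribution label of the form "<InputName>_<OutputIndex>".
def parse_contrib_label(label):
    # Assume everything after the last "_" is the OutputIndex.
    name, _, index = label.rpartition("_")
    return name, index

# Two different combinations can collide on the same label:
#   InputName = "CVTE_Sepal_Length",   OutputIndex = "2"       -> "CVTE_Sepal_Length_2"
#   InputName = "CVTE_Sepal_Length_2", no OutputIndex appended -> "CVTE_Sepal_Length_2"
print(parse_contrib_label("CVTE_Sepal_Length_2"))  # ('CVTE_Sepal_Length', '2') - possibly wrong
```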
I assume the reason the sample output in the original description doesn't include the OutputIndex (i.e. the output class) is that the sample problem is binomial. Had the problem been multinomial, and the number of SHAP values returned been multiplied accordingly, I imagine it would be difficult to grok what each SHAP field was referring to (because of the way the contrib labels are generated).
The sample output shows the result as a mapping from shap label to shap value. I wonder what a nested structure would look like, where the top-level is a mapping from output class to the same structure in the sample response.
E.g.:
shap_val = [
  {
    output_class = "iris-setosa"
    data = {
      fields = [ "contrib_0_AGE", "contrib_12_Pay", ... ]
      contributions = [ "0.123", "0.456", ... ]
    }
  },
  {
    output_class = "iris-virginica"
    data = {
      ...
    }
  }
]
That type of structure may be more intuitive to traverse, considering it seems we may have dozens (or more) of similarly labeled SHAP fields. At least, the user doesn't have to attempt to parse the SHAP-generated field names - unless there is a simple grammar that exists but just isn't documented.
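For illustration, a minimal traversal sketch of such a nested response, assuming the field names from the pseudo-structure above (not an agreed API):

```python
# Hypothetical nested response, following the structure sketched above.
response = {
    "shap_val": [
        {
            "output_class": "iris-setosa",
            "data": {
                "fields": ["contrib_0_AGE", "contrib_12_Pay"],
                "contributions": ["0.123", "0.456"],
            },
        },
        {
            "output_class": "iris-virginica",
            "data": {
                "fields": ["contrib_0_AGE", "contrib_12_Pay"],
                "contributions": ["0.321", "0.654"],
            },
        },
    ]
}

# No label parsing needed: iterate per output class and zip fields with values.
for per_class in response["shap_val"]:
    contribs = dict(zip(per_class["data"]["fields"],
                        per_class["data"]["contributions"]))
    print(per_class["output_class"], contribs)
```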
Q: Looking at @mmalohlava's example: does the MOJO2 lib have a way for clients to specify that they want SHAP values for just the original features? The MOJO2 doc shows setShapPredictContrib(boolean), but no parameter for specifying whether values should be generated for the original or also the transformed columns. If it does (it doesn't look like it?), would that be a good parameter for this API to expose as well?
Otherwise, at the moment it looks like MOJO2 gives us SHAP values for all original + transformed columns by default. In that case, do users expect SHAP values to be lumped together, or separated into original-only and transformed-only?
So, I discussed this a little bit with @mmalohlava earlier, so I will try my best to summarize all the comments into something somewhat cohesive:
So for this input:
you could get (assuming you joined the predictions and the Shapley results):
Related to 2/3: it is important, I think, to acknowledge that the output of Shapley values can be quite large/complex, but in the context of the API response this is less important. My suggestion for the API is as follows:
^^ or something of the sort. My reasoning is as follows:
Yes, it is a copy-paste from the actual output of the Python scoring pipeline; I can provide a reproducible example.
I am not sure I understand the question.
It is the Python scoring pipeline output; in MOJO we are trying to get as close to it as possible. What is important is: (1) names of the original features, (2) names of the output category in the case of a multinomial problem, (3) clear separation of the bias term (so that it cannot be confused with any feature's SHAP value).
Reading your comment, you have good points there: I think if we are producing Shapley values for the original features (the inputs of the pipeline), we do not need to list names (like contrib_0_AGE, ...) - we just need to output them in the order of the input features. However, if we produce SHAP values for transformed features, we have to output names, since they reflect the names of internally engineered features.
Not yet - right now only transformed, but we will have to separate the API calls (CC: @pkozelka) - original vs. transformed.
At the moment MOJO provides only SHAP values for transformed features (note: the list can still contain original features if they are inputs to any of the internal models).
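A rough illustration of that distinction (all names and values below are invented, not actual scorer output):

```python
# Contributions on original features: one value per input column (plus the bias
# term), so they could be returned purely in the order of the input features.
input_features = ["Sepal_Length", "Sepal_Width", "Petal_Length", "Petal_Width"]
original_contribs = [0.12, -0.03, 0.40, 0.25, 0.10]  # last value: bias term

# Contributions on transformed features: names are required, because they refer
# to internally engineered features the caller does not know about.
transformed_contribs = {
    "0_Sepal_Length": 0.10,
    "3_CVTE:Petal_Length.0": 0.35,
    "bias": 0.10,
}
print(dict(zip(input_features + ["bias"], original_contribs)))
print(transformed_contribs)
```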
@mmalohlava I would argue we should always output the name of the column, noting that in the API the user can request a certain column to be included in the response.
Regarding the API request:
I definitely like this idea. I think if users do not provide any enum parameter, then we could default it to None, thus providing backward compatibility with the existing API.
Regarding the API response:
In that case it might make sense to define the following - the output class can be parsed into the 2D structure, as per @orendain's example.
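For concreteness, a hypothetical sketch of such a request parameter (the field name, its values, and the payload shape are assumptions, not the final API):

```python
# Default request: no Shapley parameter -> behaves like the current API
# (predictions only), preserving backward compatibility.
request_default = {
    "fields": ["Sepal_Length", "Sepal_Width", "Petal_Length", "Petal_Width"],
    "rows": [["4.6", "2.7", "1.3", "1.3"]],
}

# Explicitly requesting Shapley values via an enum-style parameter.
request_with_shapley = dict(
    request_default,
    requestShapleyValueType="TRANSFORMED",  # hypothetical values: NONE | ORIGINAL | TRANSFORMED
)
print(request_with_shapley)
```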
@orendain - as discussed, support for SHAP values for H2O-3 MOJOs is not currently available in the MOJO2 library.
@Rajimut sorry for the delayed response... I think there is a way in the pipeline to know whether it is an H2O-3 model or not. I feel like simple logic would be able to return some warning/error, no?
I think it is currently not possible for us to know whether the uploaded MOJO is from H2O-3, since this is not exposed by the MOJO pipeline. We could do a workaround by unzipping the file and reading the extension, but that might not be the correct approach, as the pipeline is doing this already.
As per the discussion on Slack, my above comment is related to #165 and #62. Since it is not specific to Shapley scores, deferring to later changes. Discussed with @Rajimut that we can add an additional field in the response: message, which is a map of {message_level: <enum of: LOG, WARN, ERROR>, message: <text>} or something.
try {
  ShapleyType requestedShapleyType = shapleyType(request.getShapleyValuesRequested());
  switch (requestedShapleyType) {
    case TRANSFORMED:
      response.setFeatureShapleyContributions(computeContribution(request));
      break;
    case ORIGINAL:
      log.info(UNIMPLEMENTED_MESSAGE);
      break;
    default:
      break;
  }
} catch (Exception e) {
  log.info("Failed shapley values: {}, due to: {}", request, e.getMessage());
  log.debug(" - failure cause: ", e);
}
We should always report an error when we cannot fulfill the request. "Failing silently" causes a lot of "why does it not return what I have requested?" situations.
This was regarding a discussion about:
> As per the discussion on Slack, my above comment is related to: #165, #62. Since it is not specific to Shapley scores, deferring to later changes. Discussed with @Rajimut that we can add an additional field in the response: message, which is a map of {message_level: <enum of: LOG, WARN, ERROR>, message: <text>} or something.
The idea is not to fail the scoring request when Shapley values cannot be computed for a model. For MOJOs that are not obtained from DAI, like H2O-3 MOJOs, the Shapley requests will fail, causing the scoring response to fail as well. In order to handle this, as per the discussion above, we have decided to return a 400 response on the Shapley-exclusive endpoint /model/contribution. The existing endpoint /model/score will remain unaffected if the Shapley values are not available.
[For the future] We also want to include a message field with the score and Shapley responses to provide some additional information describing the reason for the failure.
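A sketch of the agreed behaviour from a client's point of view (the endpoint paths come from the discussion above; the host and payload shape are assumptions):

```python
import requests

BASE_URL = "http://localhost:8080"  # assumed address of a locally running REST scorer
payload = {
    "fields": ["Sepal_Length", "Sepal_Width", "Petal_Length", "Petal_Width"],
    "rows": [["4.6", "2.7", "1.3", "1.3"]],
}

# Plain scoring keeps working even if the loaded MOJO cannot produce Shapley values.
score = requests.post(f"{BASE_URL}/model/score", json=payload)
print(score.status_code)  # expected: 200

# The Shapley-exclusive endpoint returns 400 when contributions are not supported
# (e.g. an H2O-3 MOJO), instead of failing silently.
contrib = requests.post(f"{BASE_URL}/model/contribution", json=payload)
print(contrib.status_code)  # expected: 400 for unsupported models
```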
Some thoughts:
- If a user explicitly requests Shap scores for a scorer that has loaded an H2O-3 MOJO (perhaps the client even knowing that the feature isn't supported), would we consider that to be client error? Should it be the server's responsibility to "fix/adjust" a client's request if the client hasn't requested adjustment?
- The try-catch block is catching all shap errors. If the error is due to something other than compatibility, silently failing would hide all of it.
- I envision the two points above on the same level as how we fail if one of the regular client-provided scoring rows is malformed. We don't currently ignore one row while allowing the rest to continue.
If the error is due to something other than compatibility, silently failing would hide all of it.
Yes, that is true... but we have scenarios where the error is caused by models that do not support it. My thoughts: this could be best conveyed to the user by providing a message instead of throwing the exception. But we could handle user-related errors, like passing in a wrong Shapley value type, by throwing an error.
Errors due to an H2O-3 MOJO - shouldn't we fail those silently?
Side note: could we bump the (minor) version of the API? https://github.com/h2oai/dai-deployment-templates/blob/master/common/swagger/swagger.yaml#L2-L7 May need to coordinate with @mmalohlava to double-check it does not interfere with any existing plans.
Still two questions.
LGTM. I don't want to approve it myself, as there are people more competent in Java and this repo.
Awesome, LGTM.
Thanks for untangling all of the data science + implementation + future API concerns and tackling it in this PR. Not only a huge enhancement to the API and REST scorer, but a good lesson on Shapley and how it all now connects with the H2O.ai tech stack!
LGTM 🖖
I can simply say WOW! What an enormous effort and what a great result! I'm not super proficient with Java and the scorers' code, so I'm giving my LGTM as a general ACK.
Let's have a final confirmation from @mmalohlava and let's merge it!
      response.setFeatureShapleyContributions(transformedFeatureContribution(request));
      break;
    case ORIGINAL:
      log.info(UNIMPLEMENTED_MESSAGE);
The latest MOJO2 runtime (2.7.0) should support original SHAP values as well.
Created an issue for this here: #247
@Rajimut thank you! Nice result!
Tiny request: can you create issues for the parts which are missing/not implemented:
- original shapley
- any changes in MOJO API you would suggest.
Created some issues for other things that need to be done after this PR:
In the local REST scorer:
Aim: to optionally provide Shapley values along with the /model/score endpoint, if requested by the client.
The boolean is called shapley_results (still under discussion here: https://docs.google.com/document/d/1HWArQP7RTqJ7JHt2C_Tw8MAZkBCIfwCnLT9cR2kLdT8/edit#).
The boolean can take true or false:
When true - it provides Shapley values for transformed features.
When false - it provides just the predictions.
When the boolean is not specified, the response contains just the predictions (for backward compatibility).
Sample request
Sample response:
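The original sample request and response bodies are not reproduced above; the following is a minimal hypothetical illustration of the shapley_results flag (host, payload shape, and response handling are assumptions):

```python
import requests

request_body = {
    "fields": ["Sepal_Length", "Sepal_Width", "Petal_Length", "Petal_Width"],
    "rows": [["4.6", "2.7", "1.3", "1.3"]],
    "shapley_results": True,  # True -> also return Shapley values for transformed features
}

resp = requests.post("http://localhost:8080/model/score", json=request_body)

# With shapley_results omitted or set to False, the response carries only the
# predictions; with True, it additionally carries the per-feature contributions.
print(resp.json())
```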