interpret support for model predictions with response levels #732

Merged (9 commits) on Oct 11, 2023

Conversation

GStechschulte
Collaborator

@GStechschulte GStechschulte commented Sep 28, 2023

This PR resolves #723 and will allow the sub-package interpret to work with models, such as ordinal and categorical regression, whose predictions are a vector of some quantity, e.g., probabilities.

If a model's prediction is a vector of some quantity, then a series of joins is performed on the mean predictions (y_hat_mean), the uncertainty interval bounds, and the data used to compute predictions (e.g., cap_data). The final left join ensures that each row of the data used to compute predictions is duplicated once per response level, so that it lines up with the corresponding predictions and uncertainty intervals.
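A minimal pandas sketch of the join idea described above, not the code from this PR. The `obs` key, the `estimate`, `lower`, and `upper` column names, and all values are assumptions made up for illustration; only `cap_data`, `y_hat_mean`, and `estimate_dim` are names that appear in this conversation.

```python
import pandas as pd

# Hypothetical stand-in for the data used to compute predictions:
# one row per observation, keyed by an assumed "obs" column.
cap_data = pd.DataFrame(
    {"obs": [0, 1], "length": [1.5, 2.5], "sex": ["male", "female"]}
)

# Hypothetical mean predictions in long format: one row per observation
# per response level (three levels here).
y_hat_mean = pd.DataFrame(
    {
        "obs": [0, 0, 0, 1, 1, 1],
        "estimate_dim": ["a", "b", "c", "a", "b", "c"],
        "estimate": [0.2, 0.3, 0.5, 0.6, 0.3, 0.1],
    }
)

# Hypothetical uncertainty interval bounds with the same keys.
bounds = pd.DataFrame(
    {
        "obs": [0, 0, 0, 1, 1, 1],
        "estimate_dim": ["a", "b", "c", "a", "b", "c"],
        "lower": [0.1, 0.2, 0.4, 0.5, 0.2, 0.0],
        "upper": [0.3, 0.4, 0.6, 0.7, 0.4, 0.2],
    }
)

# Attach the bounds to the mean predictions, then left-join the covariates.
# Because the predictions are the left table, each cap_data row is repeated
# once for every response level of its observation.
predictions = y_hat_mean.merge(bounds, on=["obs", "estimate_dim"], how="left")
summary = predictions.merge(cap_data, on="obs", how="left")

print(summary)
```

With two observations and three response levels, the result has six rows, and the `length` and `sex` values of each observation appear three times.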

Here is a link to a Gist demonstrating the bug fix on bmb.interpret.plot_predictions for categorical and ordinal regression.

To do:

  • support for plot_predictions
  • support for predictive differences (plot_comparisons and plot_slopes)
  • add tests
  • run Pylint
  • run black

@GStechschulte
Collaborator Author

The gist has been updated with demos for comparisons, slopes, and predictions.

@GStechschulte GStechschulte marked this pull request as ready for review October 9, 2023 19:23
@tomicapretto
Collaborator

I'm trying to replicate the plot here

image

The first thing I do is the following:

plot_predictions(model, idata, ["length"]);

which results in:

image

The problem is when I try to map "sex" to the panel variable.

plot_predictions(model, idata, {"horizontal": "length", "panel": "sex"});

which results in the following error:

TypeError: covariates must be a string or a list of strings.

And if I do the following:

plot_predictions(model, idata, ["length", "sex"]);
# or
plot_predictions(model, idata, ["length", "sex", "sex"]);

it returns a weird result because it's mapping "sex" to color as well.

image
image

And finally I tried

plot_predictions(model, idata, ["length", "choice", "sex"]);

but the result doesn't look good either

image

So, my question is: why did we remove the ability to pass a dictionary to covariates? And on top of that, do you think it should be possible to map another attribute to the color apart from the response level? My initial approach would be not to allow it, but if you think it's not hard, go ahead.

@GStechschulte
Collaborator Author

GStechschulte commented Oct 10, 2023

We removed the ability in the plot comparisons PR #684, and I described why we should remove it in this comment.

> do you think it should be possible to map another attribute to the color apart from the response level

It is already possible without any additional code. It is subtle: the plot functions use the columns of the summary dataframe, so if the user knows what those columns are a priori, they can use whichever columns they need for main, group, and panel.

To achieve your plot

bmb.interpret.plot_predictions(
    model,
    idata,
    ["length", "sex"],
    subplot_kwargs={"main": "length", "group": "estimate_dim", "panel": "sex"},
    fig_kwargs={"figsize": (10, 3)},
    legend=True
);

image

There is a section in plot_predictions discussing subplot_kwargs. However, maybe I should add a section explaining that the plot functions use the summary dataframe, so any of its existing columns can be used as the main, group, and panel variables. Additionally, I could discuss the estimate_dim column for cases with multiple response dimensions?
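As a rough illustration of why any summary column can serve as main, group, or panel, here is a pandas-only sketch. It is not how bambi implements its plots: `split_for_plot` is a hypothetical helper, and the toy summary dataframe, its `estimate` column, and the level names are assumed for the example.

```python
import pandas as pd

# Toy summary dataframe: two lengths x two sexes x two response levels.
summary = pd.DataFrame(
    {
        "length": [1.0, 2.0] * 4,
        "sex": ["female"] * 4 + ["male"] * 4,
        "estimate_dim": (["level_0"] * 2 + ["level_1"] * 2) * 2,
        "estimate": [0.7, 0.4, 0.3, 0.6, 0.6, 0.2, 0.4, 0.8],
    }
)


def split_for_plot(summary, main, group, panel, value="estimate"):
    """Hypothetical helper: return {panel value: {group value: (x, y)}}."""
    # Any column of the summary dataframe is a valid choice.
    missing = {main, group, panel} - set(summary.columns)
    if missing:
        raise KeyError(f"not in summary dataframe: {sorted(missing)}")

    lines = {}
    # One subplot per panel value, one line per group value within it.
    for (panel_val, group_val), chunk in summary.groupby([panel, group]):
        chunk = chunk.sort_values(main)
        lines.setdefault(panel_val, {})[group_val] = (
            chunk[main].tolist(),
            chunk[value].tolist(),
        )
    return lines


# Same mapping as the subplot_kwargs in the call above.
lines = split_for_plot(summary, main="length", group="estimate_dim", panel="sex")
for panel_val, groups in lines.items():
    print(panel_val, groups)
```

Swapping the arguments, for example `group="sex"` and `panel="estimate_dim"`, gives one panel per response level instead, which is the kind of flexibility the summary columns provide.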

@tomicapretto
Collaborator

@GStechschulte ha! excellent!

Do you want to modify the example to use the new functionality? After that, I think it can be merged.

Thanks a lot!

@GStechschulte
Collaborator Author

For sure, I will add it. There are also a couple of inline comments I want to add, and then it can be merged. Thanks! 👍🏼

@GStechschulte GStechschulte force-pushed the interpret-vector-preds branch from 2815c0f to c525787 Compare October 10, 2023 14:45

@GStechschulte GStechschulte merged commit 77a8fa1 into bambinos:main Oct 11, 2023
1 of 4 checks passed
@GStechschulte GStechschulte deleted the interpret-vector-preds branch January 21, 2024 20:19
Development

Successfully merging this pull request may close these issues.

interpret value errors for ordinal and categorical models
2 participants