Bug fix for plotting order in plot_comparison (fixes mismatch between labels and plot) #731
Perhaps the rationale behind centering could be added to help user decide whether to turn it on or not. But I'm not entirely sure about this.
… character, or so. Also deleted superfluous space unrelated to black's complaint
... for more informative output.
Cannot make changes locally at the moment. Hence so many commits ...
It's a bit annoying, I can only work from the web interface currently, but hopefully all typos are resolved now
@jt-lab thanks for the contribution! Can I ask for an example of how this looked before and how it looks after the change? Thanks! Bonus: If it comes with data so I can test on my end, even better!
Here you go:
Plot before the fix: [image] After the fix: [image] Data set: [attachment]
@jt-lab thanks for finding the subtle bug and opening a PR (good attention to detail) 👍🏼. The difference between

# user provided values only (no default values are computed for "factor1" and "factor3")
bmb.interpret.plot_comparisons(
    model=model,
    idata=trace,
    contrast={"factor2": ["X", "Y"]},
    conditional={"factor1": ["C", "A", "B"], "factor3": [1, 2]},
)

and

# computes default values for "factor1" and "factor3"
bmb.interpret.plot_comparisons(
    model=model,
    idata=trace,
    contrast={"factor2": ["X", "Y"]},
    conditional=["factor1", "factor3"],
)

is that in the latter code snippet, since no values are passed for … Thanks! Also, you are right. This is because in …
@GStechschulte, thanks for the explanations! I'm not sure I understand the purpose of the dictionary conditional argument. When I list all levels of the factors, I get the same as with the list version defaults. When I list a subset of the levels, I get errors. Unrelated: When reading through the plot_comparisons doc, I noticed this sentence "Bambi uses the ordering (keys if dict and elements if list)...". I'm not sure I understand correctly, but dictionaries don't guarantee any order of the items or keys. So this might also be a source of unexpected behavior. So the user might need to pass an OrderedDict to be safe. But maybe I'm missing something! Many thanks again!
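A note on the ordering concern above: since Python 3.7 the language guarantees that a plain `dict` preserves insertion order, so an `OrderedDict` should not be needed for this purpose. A minimal check, where the `conditional` dict simply mirrors the examples in this thread:

```python
# Plain dicts preserve insertion order (guaranteed by the language since Python 3.7).
conditional = {"factor1": ["C", "A", "B"], "factor3": [1, 2]}

# Keys come back in the order they were written, not sorted.
keys_in_order = list(conditional)
print(keys_in_order)  # ['factor1', 'factor3']

# Writing the same items in a different order gives a dict that compares
# equal, but iterates in the new order.
reordered = {"factor3": [1, 2], "factor1": ["C", "A", "B"]}
print(list(reordered))  # ['factor3', 'factor1']
print(conditional == reordered)  # True
```

This only speaks to the order of the keys and of the list values as Python stores them; whether the plotting code later re-sorts the levels is a separate question.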
This is due to the data types of the data being used to fit the model. When you pass the list …

if v == "main":
    if v == numeric:
        return np.linspace(v.min(), v.max(), 50)
    elif v == categorical:
        return np.unique(v)
elif v == "group":
    if v == numeric:
        return np.quantile(v, np.linspace(0, 1, 5))
    elif v == categorical:
        return np.unique(v)
elif v == "panel":
    if v == numeric:
        return np.quantile(v, np.linspace(0, 1, 5))
    elif v == categorical:
        return np.unique(v)

In your case, "main" and "group" are both categorical data types. Thus, the default values are the unique values. When you pass a dict to the conditional argument, the keys of the dict are taken as the "main", "group", and "panel" covariates. Thus, … Then, since you ended up passing all the unique values for

df["factor1"].unique(), df["factor3"].unique()

as the dict values, passing a list to conditional and passing the dict (stated above) result in the same data being generated. If you were to pass … Lastly, regarding …
This is due to a bug in the function …

def get_unique_levels(x: np.ndarray) -> np.ndarray:
    return np.unique(x)

there are no more shape errors. As an aside, if … I hope this helps 👍🏼
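The default-value rules described in the pseudocode above can be sketched as a small runnable function. This is only an illustration under stated assumptions: `default_values` and its `role` argument are hypothetical names, not Bambi's internal API, and the numeric check via `np.issubdtype` is a stand-in for however Bambi actually classifies covariates.

```python
import numpy as np


def default_values(values, role):
    """Hypothetical helper: default grid for a covariate given its role.

    `role` is one of "main", "group", or "panel" (names taken from the
    explanation above; the function itself is an illustrative assumption).
    """
    arr = np.asarray(values)
    if role not in ("main", "group", "panel"):
        raise ValueError(f"unknown role: {role!r}")

    # Categorical (non-numeric) covariates: always the unique levels.
    if not np.issubdtype(arr.dtype, np.number):
        return np.unique(arr)

    # Numeric "main" covariate: an evenly spaced grid of 50 points.
    if role == "main":
        return np.linspace(arr.min(), arr.max(), 50)

    # Numeric "group" or "panel" covariate: five quantiles (0% to 100%).
    return np.quantile(arr, np.linspace(0, 1, 5))


factor1 = ["C", "A", "B", "A", "C"]  # categorical covariate
days = [0.0, 1.0, 2.0, 3.0, 4.0]     # numeric covariate

main_cat = default_values(factor1, "main")
main_num = default_values(days, "main")
group_num = default_values(days, "group")
```

Note that `np.unique` returns its result sorted, which is consistent with the observation that passing every level in a dict produces the same grid as the list form.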
Should I make a separate PR incorporating @jt-lab commits and my local changes to fix the bugs? Or should @jt-lab keep this PR and then commit my changes to fix the bugs? @tomicapretto
@GStechschulte if your fixes are related to what @jt-lab is doing here, I think @jt-lab could allow this PR to be edited by maintainers and then you could contribute here so everything is in place and we respect the original authorship. But if they are separate things, you could open a different PR. But if you think that's too much, feel free to do something else.
@tomicapretto, how can I make it editable?
I think it's already enabled
I have pushed 3 commits to your branch. Now, the plots work with no errors. I also decided that your original commit to sort the dict values in ascending order is the most appropriate. In particular, the x-axis of graphs should be in ascending order. Thanks a lot for raising the issue and for opening a PR 😄 It is much appreciated! Below are a few of your examples:

f, ax = plt.subplots(2)
bmb.interpret.plot_comparisons(
    model=model,
    idata=trace,
    contrast={"factor2": ["X", "Y"]},
    conditional={"factor1": ["A", "B", "C"], "factor3": [1, 2]},
    ax=ax[0]
)
bmb.interpret.plot_comparisons(
    model=model,
    idata=trace,
    contrast={"factor2": ["X", "Y"]},
    conditional={"factor1": ["C", "B", "A"], "factor3": [1, 2]},
    ax=ax[1]
)
plt.tight_layout();

bmb.interpret.plot_comparisons(
    model=model,
    idata=trace,
    contrast={"factor2": ["X", "Y"]},
    conditional={"factor1": ["A", "C"], "factor3": [1, 2]},
    fig_kwargs=dict(figsize=(7, 3))
);

bmb.interpret.plot_comparisons(
    model=model,
    idata=trace,
    contrast={"factor2": ["X", "Y"]},
    conditional=["factor1", "factor3"],
    fig_kwargs=dict(figsize=(7, 3))
);
@GStechschulte @jt-lab ok I struggled a little bit with git and the fact that we're working on @jt-lab's main branch (I used a different name locally). Nevertheless, it seems everything is working. I just made a small modification to the if-else structure, I think it's clearer now, and I also wrote a fix for the HSGP and the failing tests. We can merge once CI passes.
@tomicapretto Ahh great! And thanks for the HSGP fix 👍🏼
Mmmm. The …
The problem happens when we test this

plot_slopes(model, idata, wrt={"Days": 2}, conditional={"Subject": 308})

and it's because it's calling …
Codecov Report
@@            Coverage Diff             @@
##             main     #731      +/-   ##
==========================================
- Coverage   89.56%   89.55%   -0.01%
==========================================
  Files          44       44
  Lines        3525     3524       -1
==========================================
- Hits         3157     3156       -1
  Misses        368      368
... and 1 file with indirect coverage changes
… labels and plot) (bambinos#731)
Co-authored-by: GStechschulte <[email protected]>
Co-authored-by: Tomas Capretto <[email protected]>
A call like ...
currently leads to a mismatch between the order in which the levels of factor2 are plotted on the x axis and the x-axis labels!
I have traced this down to .values.dtype.categories returning the names in sorted order which is eventually used for the labels. The actual plot is based on the user provided order.
This commit is only a quick fix: it sorts the user-provided levels so that their order matches the labels, which avoids misleading plots. It would be better to give the user control over the order by using the order of the user-provided list for the labels, but a change like that seems to interact with further code which I don't fully understand.
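To make the mismatch concrete, here is a small pure-Python sketch. It is not Bambi's code: `sorted()` stands in for the sorted categories returned by `.values.dtype.categories`, and `tick_labels`, `plotted_order`, and `plotted_order_fixed` are hypothetical names used only for illustration.

```python
# Levels as passed by the user in the `conditional` dict.
user_levels = ["C", "A", "B"]

# Assumption: tick labels come from the categorical's categories,
# which are in sorted order; sorted() stands in for that here.
tick_labels = sorted(user_levels)

# Before the fix: estimates are drawn in the user-provided order,
# so a position can show one level under another level's label.
plotted_order = list(user_levels)
mismatched = [
    (label, level)
    for label, level in zip(tick_labels, plotted_order)
    if label != level
]
print(mismatched)  # [('A', 'C'), ('B', 'A'), ('C', 'B')]

# Quick fix described in this PR: sort the user-provided levels so both agree.
plotted_order_fixed = sorted(user_levels)
print(plotted_order_fixed == tick_labels)  # True
```

Sorting removes the mismatch at the cost of ignoring the order the user asked for, which is the trade-off the paragraph above points out.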
Moreover, I'm not entirely sure what the benefit of the call above is compared to:
I thought it was to give the user control over which levels are plotted (and in which order). But removing levels from the list leads to errors (and the order is not applied). I might be missing something, but if there is no additional functionality compared to the list version, perhaps the dict version should be removed (for now).
I'm sorry for the many incremental commits just for fixing spaces and typos. I could not work with a local version and had to make changes in GitHub's web interface (and test the code on a server).