estimate_means(backend = "marginaleffects") not calculating marginal means #275

Open
wants to merge 45 commits into
base: main

Conversation

strengejacke
Member

Fixes #273

@DominiqueMakowski
Member

Should we add tests against emmeans, marginaleffects and ggeffects just so that we keep track of the similarities and discrepancies?

@strengejacke
Member Author

Yes, this is still a work in progress. I'll add tests and also contrasts with the marginaleffects engine.

@strengejacke
Member Author

One important thing is that we probably have to include a type dictionary:
https://github.com/vincentarelbundock/marginaleffects/blob/main/R/type_dictionary.R

Otherwise, our default types differ from theirs, which leads to different results compared to emmeans or marginaleffects. See the examples here:
#273 (comment)
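
To illustrate why a type dictionary matters, here is a minimal base-R sketch. It assumes a lookup table with one row per model class and supported prediction type, where the first row for a class is treated as its default. The table entries and the `default_type()` helper are hypothetical and are not copied from marginaleffects.

```r
# Hypothetical dictionary: the first row for a class is treated as its default.
# The entries are illustrative only.
type_dictionary <- data.frame(
  class = c("lm", "glm", "glm", "other"),
  type = c("response", "response", "link", "response"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: walk the model's classes in order and return the
# first type listed for the first class found in the dictionary.
default_type <- function(model, dictionary = type_dictionary) {
  for (cls in c(class(model), "other")) {
    hits <- dictionary$type[dictionary$class == cls]
    if (length(hits) > 0) {
      return(hits[1])
    }
  }
}

m_lm <- lm(mpg ~ wt, data = mtcars)
m_glm <- glm(am ~ wt, data = mtcars, family = binomial())

default_type(m_lm)
default_type(m_glm)

# Why the default matters: averaging on the response scale is not the same
# as averaging on the link scale and back-transforming afterwards.
p_resp <- mean(predict(m_glm, type = "response"))
p_link <- plogis(mean(predict(m_glm, type = "link")))
c(response = p_resp, link_then_backtransform = p_link)
```

If two backends pick different defaults for the same model class, their averaged predictions will disagree even though both are computed correctly.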

@strengejacke
Member Author

strengejacke commented Dec 15, 2024

OK, I added a lot of tests (more to come; for now, all tests pass), and I think the marginaleffects backend now produces marginal means equivalent to emmeans.

And we even have "marginal predictions" for random effects, which is what we should have, IMHO.
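
One reason results can differ between engines is the weighting used when averaging predictions. The following base-R sketch (not the package's implementation) compares equal weights over a balanced reference grid, which is how emmeans-style marginal means are usually defined, with weights proportional to the observed frequencies. The model and variable names are chosen for illustration only.

```r
dat <- mtcars
dat[c("gear", "am")] <- lapply(dat[c("gear", "am")], as.factor)
model <- lm(mpg ~ gear + am, data = dat)

# Reference grid: every combination of the factor levels
grid <- expand.grid(gear = levels(dat$gear), am = levels(dat$am))
grid$predicted <- predict(model, newdata = grid)

# Equal weights: plain average over the levels of am
balanced <- tapply(grid$predicted, grid$gear, mean)

# Proportional weights: each am level weighted by its observed share
am_share <- prop.table(table(dat$am))
grid$w <- as.numeric(am_share[as.character(grid$am)])
proportional <- tapply(grid$predicted * grid$w, grid$gear, sum)

data.frame(
  gear = levels(dat$gear),
  balanced = as.numeric(balanced),
  proportional = as.numeric(proportional)
)
```

In this additive model, the two columns differ by a constant that depends on how far the observed share of `am` is from 50%, so tests comparing engines need to use the same weighting scheme.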

@strengejacke
Member Author

I think we could simplify estimate_contrasts(); see the new get_marginalcontrasts(). I also think we don't need the contrast argument, since by already does this job. We could then use contrast instead of method, because we can now have flexible contrasts like (b1-b3)=(b2-b4). In that case, contrast is a more appropriate name than method. WDYT?
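
For readers unfamiliar with this kind of hypothesis, here is a minimal base-R sketch of what a contrast like (b1-b3)=(b2-b4) amounts to. It assumes b1 to b4 are four estimated cell means and tests the equivalent linear combination b1 - b2 - b3 + b4 = 0 by hand. The cell-means model and the variable names are illustrative assumptions, not how the package computes it.

```r
dat <- mtcars
# Four cells from vs x am; level order is 0.0, 1.0, 0.1, 1.1
dat$cell <- interaction(dat$vs, dat$am)

# Cell-means model: the coefficients b1..b4 are the four group means
model <- lm(mpg ~ 0 + cell, data = dat)
b <- coef(model)
V <- vcov(model)

# (b1 - b3) = (b2 - b4) is the same as b1 - b2 - b3 + b4 = 0
contrast_weights <- c(1, -1, -1, 1)
estimate <- sum(contrast_weights * b)
se <- sqrt(drop(t(contrast_weights) %*% V %*% contrast_weights))
t_value <- estimate / se
p_value <- 2 * pt(abs(t_value), df = df.residual(model), lower.tail = FALSE)

data.frame(estimate, se, t_value, p_value)
```

With this cell ordering, the estimate is the difference-in-differences of the group means, which matches the `vs:am` interaction coefficient of `lm(mpg ~ vs * am)`.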

@DominiqueMakowski

This comment was marked as outdated.

@strengejacke

This comment was marked as outdated.

@strengejacke

This comment was marked as outdated.

@strengejacke
Member Author

What is the purpose of fixed? Can't we just use by directly?

dat <- mtcars
dat[c("gear", "vs", "am")] <- lapply(dat[c("gear", "vs", "am")], as.factor)
dat$am
#>  [1] 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 1 1 1 1 1
#> Levels: 0 1
model <- lm(mpg ~ gear * vs * am, data = dat)
modelbased::estimate_means(model, fixed = "am")
#> We selected `by = c("gear", "vs", "am")`.
#> Estimated Marginal Means
#> 
#> am | gear | vs |  Mean |   SE |         95% CI
#> ----------------------------------------------
#> 0  |    3 |  0 | 15.05 | 1.04 | [12.90, 17.20]
#> 0  |    4 |  0 |       |      |               
#> 0  |    5 |  0 |       |      |               
#> 0  |    3 |  1 | 20.33 | 2.09 | [16.03, 24.63]
#> 0  |    4 |  1 | 21.05 | 1.81 | [17.33, 24.77]
#> 0  |    5 |  1 |       |      |               
#> 
#> Marginal means estimated at gear
modelbased::estimate_means(model, by = c("gear", "vs", "am = '0'"))
#> Estimated Marginal Means
#> 
#> gear | vs | am |  Mean |   SE |         95% CI
#> ----------------------------------------------
#> 3    |  0 |  0 | 15.05 | 1.04 | [12.90, 17.20]
#> 4    |  0 |  0 |       |      |               
#> 5    |  0 |  0 |       |      |               
#> 3    |  1 |  0 | 20.33 | 2.09 | [16.03, 24.63]
#> 4    |  1 |  0 | 21.05 | 1.81 | [17.33, 24.77]
#> 5    |  1 |  0 |       |      |               
#> 
#> Marginal means estimated at gear

Created on 2024-12-15 with reprex v2.1.1

@DominiqueMakowski
Member

fixed was just meant to "fix" variables at specific values, i.e. to set them explicitly without marginalizing over them.
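
The distinction between fixing a variable and marginalizing over it can be sketched in base R as follows. This is an illustration with an additive model and hand-built prediction grids, not the code path used by estimate_means().

```r
dat <- mtcars
dat[c("gear", "am")] <- lapply(dat[c("gear", "am")], as.factor)
model <- lm(mpg ~ gear + am, data = dat)

# Fixing: predict for each gear level with am held at "0"
fixed_grid <- expand.grid(gear = levels(dat$gear), am = "0")
fixed_grid$am <- factor(fixed_grid$am, levels = levels(dat$am))
fixed_means <- predict(model, newdata = fixed_grid)

# Marginalizing: predict for every am level, then average within gear
full_grid <- expand.grid(gear = levels(dat$gear), am = levels(dat$am))
marginalized <- tapply(predict(model, newdata = full_grid), full_grid$gear, mean)

data.frame(
  gear = levels(dat$gear),
  fixed_at_am0 = as.numeric(fixed_means),
  averaged_over_am = as.numeric(marginalized)
)
```

The first column answers "what is the expected mpg per gear when am is 0", while the second averages over both levels of am.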

@strengejacke
Member Author

Yes, but see my example, where you get the same result with by

@DominiqueMakowski
Member

ah yes fair then, yes I agree it makes sense to streamline the API here. For contrast I'm less sure, I think it makes sense for users to specify explicitly the variables we want to contrast and dissociate that from the rest

@strengejacke
Member Author

> ah yes fair then, yes I agree it makes sense to streamline the API here. For contrast I'm less sure, I think it makes sense for users to specify explicitly the variables we want to contrast and dissociate that from the rest

OK. What about renaming method to test? That's a more generic name, and it would let us use the flexible hypothesis options from marginaleffects.

@strengejacke
Member Author

OK, once these tests pass, I'd like to merge this PR and open a new one for contrasts, and then another one for slopes.

The tests are validated against our emmeans engine, so for estimate_means() we can safely change the default backend to "marginaleffects", if you like.


codecov bot commented Dec 15, 2024

Codecov Report

Attention: Patch coverage is 78.87324% with 15 lines in your changes missing coverage. Please review.

Project coverage is 37.22%. Comparing base (38a93b7) to head (7c4f4d4).
Report is 9 commits behind head on main.

| Files with missing lines     | Patch % | Lines        |
|------------------------------|---------|--------------|
| R/get_marginalcontrasts.R    | 0.00%   | 9 Missing ⚠️ |
| R/get_marginaleffects_type.R | 68.75%  | 5 Missing ⚠️ |
| R/get_marginalmeans.R        | 97.29%  | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #275      +/-   ##
==========================================
+ Coverage   36.21%   37.22%   +1.00%     
==========================================
  Files          25       27       +2     
  Lines        1226     1209      -17     
==========================================
+ Hits          444      450       +6     
+ Misses        782      759      -23     

