Clarify `multilevel` argument #253
Comments
Yeah, that sounds good to me. Though we should do a soft deprecation first, with a warning, and leave it for some time (as this is probably quite a popular feature of the package). Interestingly, I had the same confusion about multilevel factor/SEM analysis. For me, and in my field, "multilevel" is used as a synonym for mixed models (random factors models). Some day I wanted to have RE in my SEM and FA, so I looked for it and was thrilled when I saw multilevel FAs... followed by disappointment when I understood it was "just" a stratified analysis. So I can understand how users coming from the opposite direction would have the same confusion... So yeah, making things more explicit is good. We should probably think about overhauling the whole factor treatment, we could have multiple arguments like |
I'm thinking of shifting to an explicit declaration of which variables should be partialed or semipartialed, which would make a lot of the arguments easier to manage together |
Is this still relevant? My script basically centres variables within- and between clusters (similar to |
I could easily see this as a feature. I think it would be nice as either a different correlation method or a separate function because, if I understand correctly, it has a different output: it returns two correlation indices (between and within). Is that correct?
I agree with that. Moving forward, we would probably need to rethink how to make the API more explicit and flexible, and less confusing |
Yes that's exactly right. For example, it could return one correlation matrix (and table with other statistics) for the within- correlations and a second separate correlation matrix for the between- correlations. I also think a nice implementation of the |
I find the `multilevel` argument name confusing, and there have been several issues from users lately that have expressed similar confusion.

Based on the name, I would expect a decomposition of the correlation matrix into between-groups and within-groups components, similar to `psych::statsBy()`. The between-groups component is correlations among group means, the within-groups component is the pooled within-group correlation matrix (computed as the correlations among group-mean-centered variables). This is what is typically meant in my experience (at least in psychology circles) by phrases like "multilevel factor analysis", "multilevel SEM", or "multilevel correlations".

The `multilevel` argument computes what is effectively the within-groups component described above, but estimated using random effects (random intercepts for group) rather than fixed effects (group-mean-centering or including groups as dummy-coded variables). Both fixed and random specifications of this adjustment are "multilevel" in the sense that they are estimating average within-group correlations, but we currently do not report the between component of the multilevel correlations in either specification.

I think it would be clearer for the argument to be named something like `random_factors`. This would make it clearer to me that what this argument is switching is how factors are partialed out.

Estimating correct point estimates/df/p/CIs for both within-group and between-group correlations is easy for fixed factor controls (known analytic solutions).
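
To make the fixed-effects case concrete, here is a minimal base R sketch of the between/within decomposition described above. The helper name `decompose_cor()` and its arguments are hypothetical and not part of any package. It assumes a bivariate correlation, an unweighted correlation among group means for the between component, and the usual degrees of freedom (N - k - 1 within, k - 2 between, for N observations in k groups).

```r
# Hypothetical helper (not part of any package): fixed-effects decomposition
# of a bivariate correlation into within-group and between-group components.
decompose_cor <- function(x, y, g) {
  g <- factor(g)
  n <- length(x)   # number of observations
  k <- nlevels(g)  # number of groups

  # Within: pooled within-group correlation, i.e. the correlation among
  # group-mean-centered variables. Centering costs k - 1 additional df.
  r_within <- cor(x - ave(x, g), y - ave(y, g))
  df_within <- n - k - 1

  # Between: unweighted correlation among the group means.
  r_between <- cor(as.vector(tapply(x, g, mean)), as.vector(tapply(y, g, mean)))
  df_between <- k - 2

  # Two-sided p-value from the usual t transformation of r.
  p_from_r <- function(r, df) {
    2 * pt(abs(r) * sqrt(df / (1 - r^2)), df, lower.tail = FALSE)
  }

  data.frame(
    level = c("within", "between"),
    r = c(r_within, r_between),
    df = c(df_within, df_between),
    p = c(p_from_r(r_within, df_within), p_from_r(r_between, df_between))
  )
}

# Simulated example: 10 groups of 20 observations sharing a group-level effect.
set.seed(1)
g <- rep(1:10, each = 20)
u <- rnorm(10)[g]
x <- u + rnorm(200)
y <- u + rnorm(200)
decompose_cor(x, y, g)
```

Returning one row per level mirrors the idea of reporting the within and between components side by side; a matrix version would apply the same two steps to every pair of variables.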
For random factor controls, we can get reasonable point estimates/df/p/CIs for within-correlation using our current estimation approach and some choice of profile likelihood or DoF approximation, or, I'd argue, we can be close enough by just using the fixed effects df. For between-correlations, we can either (1) pivot to a long format and fit a model with `0 + name + (0 + name | id)` and get the correlation from there, then use profile likelihood for the CI, or (2) use our current estimation approach, estimate random effects for persons, and then compute the correlations among those post-hoc, using the fixed effects df. The second option there is probably close enough.
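
Option (1) could look roughly like the sketch below. It assumes the lme4 package is installed, and the simulated data and the column names `id`, `name`, and `value` are illustrative only. Note that this simple model has a single residual variance and no residual covariance between the two variables, so treat it as a starting point, not a finished specification.

```r
library(lme4)  # assumed to be installed; provides lmer() and VarCorr()

# Simulated wide data: 50 persons (id), 6 occasions each, two variables
# whose person-level effects are correlated.
set.seed(1)
n_id <- 50
n_occ <- 6
id <- rep(seq_len(n_id), each = n_occ)
a <- rnorm(n_id)                      # person effect on x
b <- 0.6 * a + rnorm(n_id, sd = 0.8)  # person effect on y
wide <- data.frame(
  id = factor(id),
  x = a[id] + rnorm(n_id * n_occ),
  y = b[id] + rnorm(n_id * n_occ)
)

# Pivot to long format by hand: one row per (observation, variable).
long <- data.frame(
  id = rep(wide$id, times = 2),
  name = factor(rep(c("x", "y"), each = nrow(wide))),
  value = c(wide$x, wide$y)
)

# Variable-specific means plus correlated variable-specific random intercepts.
fit <- lmer(value ~ 0 + name + (0 + name | id), data = long)

# The correlation between the two random intercepts is the between-id correlation.
r_between <- attr(VarCorr(fit)$id, "correlation")["namex", "namey"]
r_between

# Profile-likelihood CIs for the variance parameters (can be slow):
# confint(fit, parm = "theta_", method = "profile", oldNames = FALSE)
```

With few groups, or a between-correlation near the boundary, the fit may be singular, which is one reason the cheaper option (2) may be close enough in practice.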