Difference between "significant' and "not significant" is not itself statistically significant #67
I've spent some time thinking about this basic interpretational issue in the past, although I confess I've forgotten some of it. I think I've seen some discussion about this very thing somewhere. It's sort of a weird issue: the interaction term is itself a test of differences in slopes, so it directly addresses the problem at the heart of that paper I so often cite in my peer review reports. On the other hand, the statistical test of the interaction does not prove some important distinction between the points along the x-axis immediately before and after the transition from blue to pink (or however it may be signified). But then, at what threshold are we allowed to say there is a significant difference between the slopes at different values of X?

---
Yeah, I agree that this stuff can get super confusing. Here’s how I have come to think about this. There are basically 3 scientific questions. Each requires a different test. None of them requires us to compare the different color regions in a Johnson-Neyman plot.
Consider this model:

```r
library(ggplot2)
library(interactions)
library(marginaleffects)

mod <- lm(mpg ~ hp * wt, data = mtcars)
```

**Question 1: Does the slope of Y with respect to X depend on the value of Z?**

In a linear model like this, the answer can be read off immediately from the interaction coefficient:

```r
summary(mod)
#
# Call:
# lm(formula = mpg ~ hp * wt, data = mtcars)
#
# Residuals:
#     Min      1Q  Median      3Q     Max
# -3.0632 -1.6491 -0.7362  1.4211  4.5513
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept) 49.80842    3.60516  13.816 5.01e-14 ***
# hp          -0.12010    0.02470  -4.863 4.04e-05 ***
# wt          -8.21662    1.26971  -6.471 5.20e-07 ***
# hp:wt        0.02785    0.00742   3.753 0.000811 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#
# Residual standard error: 2.153 on 28 degrees of freedom
# Multiple R-squared: 0.8848, Adjusted R-squared: 0.8724
# F-statistic: 71.66 on 3 and 28 DF, p-value: 2.981e-13
```

The `hp:wt` coefficient is statistically significant (p = 0.000811), so the slope of `mpg` with respect to `hp` depends on the value of `wt`.
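If we want that one test on its own, we can grab the `hp:wt` row from the coefficient matrix (a minimal base-R sketch):

```r
# Estimate, std. error, t value, and p-value for the interaction term;
# this is the same hp:wt row shown in the summary() output above
coef(summary(mod))["hp:wt", ]
```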
**Question 2: Is the slope of Y with respect to X different from 0 when Z=1?**

The answer to Question 2 can be read off the interaction plot:

```r
plot_slopes(mod, variables = "hp", condition = "wt") +
  geom_hline(yintercept = 0, color = "orange")
```

For any point on the x-axis, we know if the slope of `hp` is different from 0: just check whether the confidence interval overlaps the orange line. This is equivalent to the command below, but I intentionally drew the plot without colors to emphasize that all we need is the interval to answer Question 2:

```r
johnson_neyman(mod, "hp", "wt")
```
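For example, to read the slope and its confidence interval off at a single point (`wt = 2.5` is an arbitrary value, chosen here purely for illustration):

```r
# Slope of mpg with respect to hp, evaluated at wt = 2.5, with a 95% CI;
# the interval excludes 0 exactly when that point falls in a "significant"
# region of the Johnson-Neyman plot
slopes(mod, variables = "hp", newdata = datagrid(wt = 2.5))
```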
**Question 3: Is the slope of Y with respect to X different when Z=1 versus when Z=2?**

This is just a restatement of Question 1. We already know the answer: yes, because the interaction coefficient is significant. But we can also estimate the slopes at two specific values of `wt`:

```r
slopes(mod, variables = "hp", newdata = datagrid(wt = c(3.5, 4)))
#
#  Term Estimate Std. Error      z Pr(>|z|)   2.5 %   97.5 %  hp  wt
#    hp -0.02263    0.00788 -2.872  0.00408 -0.0381 -0.00719 147 3.5
#    hp -0.00871    0.00969 -0.899  0.36886 -0.0277  0.01029 147 4.0
#
# Columns: rowid, term, estimate, std.error, statistic, p.value, conf.low, conf.high, predicted, predicted_hi, predicted_lo, mpg, hp, wt
```

The above gives us slopes at two points. We can compare them with the `hypothesis` argument:

```r
slopes(mod,
       variables = "hp",
       newdata = datagrid(wt = c(3.5, 4)),
       hypothesis = "b1 = b2")
#
#   Term Estimate Std. Error     z Pr(>|z|)   2.5 %   97.5 %
#  b1=b2  -0.0139    0.00371 -3.75   <0.001 -0.0212 -0.00665
#
# Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
```

Again, we confirm that the two slopes are different from each other. And this is trivially true for any small change on the x-axis, just because the interaction coefficient is significant:

```r
slopes(mod,
       variables = "hp",
       newdata = datagrid(wt = c(3, 3.0001)),
       hypothesis = "b1 = b2")
#
#   Term  Estimate Std. Error     z Pr(>|z|)     2.5 %    97.5 %
#  b1=b2 -2.78e-06   7.94e-07 -3.51   <0.001 -4.34e-06 -1.23e-06
#
# Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
```

(A quick hand-check of these two estimates follows my conclusion below.)

**My conclusion**

It seems to me that the standard approach already gives us all the tools to answer the basic questions of interest precisely and correctly, and I'm not sure what additional question the Johnson-Neyman plot answers.
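The promised hand-check (a sketch; it relies only on the fact that in this model the `hp` slope is linear in `wt`, so the difference between the slopes at two values of `wt` is the interaction coefficient times the difference in `wt`):

```r
# b1 - b2 = b_hpwt * (w1 - w2)
b_int <- coef(mod)["hp:wt"]
unname(b_int * (3.5 - 4))     # -0.0139,   matching the first hypothesis test
unname(b_int * (3 - 3.0001))  # -2.78e-06, matching the second one
```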
---

I just learned about the Johnson-Neyman plot this morning when I saw a Stack Overflow question about the `interactions` package. I wonder if it would be a good idea for the documentation to refer to the classic Gelman & Stern paper, *The Difference Between "Significant" and "Not Significant" is not Itself Statistically Significant*.
This is obviously not a technical issue with your package (which is great!), but highlighting significant and non-significant regions with sharply different colors might encourage bad statistical practice.