
Wrong DFs in model_parameters.fixest #892

Closed
vincentarelbundock opened this issue Aug 15, 2023 · 20 comments · Fixed by #899
Labels: 3 investigators ❔❓ Need to look further into this issue · Bug 🐛 Something isn't working
@vincentarelbundock (Contributor)

parameters::model_parameters appears to use the wrong degrees of freedom to compute p values in this model, which results in an overly conservative test:

library(parameters)
library(fixest)
library(carData)
d <- Greene
d$dv <- ifelse(Greene$decision == 'yes', 1, 0)

mod <- feglm(dv ~ language | judge,
             data = d,
             cluster = c('judge'), family = 'logit')

# many degrees of freedom
insight::get_df(mod)
# [1] 382

# normal and t give pretty similar results
pnorm(-2.48139) * 2
# [1] 0.01308711

pt(-2.48139, df = insight::get_df(mod)) * 2
# [1] 0.0135165

# but parameters::model_parameters uses 9 DF
pt(-2.48139, df = 9) * 2
# [1] 0.03491173

model_parameters(mod)
# # Fixed Effects
# 
# Parameter         | Log-Odds |   SE |         95% CI |     z | df |     p
# -------------------------------------------------------------------------
# language [French] |    -0.68 | 0.28 | [-1.31, -0.06] | -2.48 |  9 | 0.035
# 
@bwiernik (Contributor)

I haven't dug into this, but I would assume the correct df here is 9, given that it's a fixed-effects estimate, so get_dof() is what I would expect to be wrong.

@vincentarelbundock (Contributor, Author)

I don’t think so. This feglm command is equivalent to estimating a glm() model with +factor(judge). That introduces 10 additional dummies, so we are estimating at most 11 parameters. Going from 384 observations to 9 degrees of freedom would be a steep penalty…

@lrberge, what is the best way for us to extract the relevant degrees of freedom to compute the p value with pt()?

library(parameters)
library(fixest)
library(carData)

d <- Greene
d$dv <- ifelse(Greene$decision == 'yes', 1, 0)

# These two models are equivalent
coef(feglm(dv ~ language | judge,
             data = d,
             family = 'logit'))["languageFrench"]
# languageFrench 
#     -0.6837183
coef(glm(dv ~ language + factor(judge), d, family = binomial))["languageFrench"]
# languageFrench 
#     -0.6837183

@bwiernik (Contributor)

That's fair--like I said I hadn't dug into the example--what's the p value reported by the package itself?

@vincentarelbundock (Contributor, Author)

> what's the p value reported by the package itself?

Not 100% sure, but the package reports a much smaller p, so it's either using pnorm() or a large df.

@bwiernik (Contributor)

I can investigate this weekend

@lrberge commented Aug 15, 2023

Degrees of freedom are tricky because they depend on the type of standard error computed. In particular, when SEs are clustered, the variables nested in the clusters are discarded from the DoF calculation, so they differ from the regular DoF we have in mind.

I think that's what happens in your example. To get the DoF as used in fixest, there is the function degrees_freedom. To get the one we have in mind (N minus all variables), there is degrees_freedom_iid.

I hope it helps!
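For the OP's model, that distinction can be sketched like this (a sketch based on lrberge's description; it assumes degrees_freedom_iid() accepts the same type argument as degrees_freedom(), and the first result is the value reported later in this thread):

```r
library(fixest)
library(carData)

d <- Greene
d$dv <- ifelse(Greene$decision == "yes", 1, 0)
mod <- feglm(dv ~ language | judge,
             data = d, cluster = "judge", family = "logit")

# DoF as used by fixest for the t-test: with clustered SEs, the
# variables nested in the cluster (the 10 judge dummies) are discarded
degrees_freedom(mod, type = "t")
# [1] 9

# the "regular" DoF we have in mind (N minus all estimated parameters)
degrees_freedom_iid(mod, type = "resid")
```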

@strengejacke (Member) commented Aug 15, 2023

There are three types of df in fixest::degrees_freedom(); in parameters we use type "t" for Wald df and "resid" for residual df. This works for ci() and p_value(), but does not seem to be passed through by model_parameters(). The default used in fixest seems to be the residual df, i.e. 382.

library(parameters)
library(fixest)
#> 
#> Attaching package: 'fixest'
#> The following objects are masked from 'package:parameters':
#> 
#>     demean, dof
library(carData)
d <- Greene
d$dv <- ifelse(Greene$decision == 'yes', 1, 0)

mod <- feglm(dv ~ language | judge,
             data = d,
             cluster = c('judge'), family = 'logit')

summary(mod)
#> GLM estimation, family = binomial(link = "logit"), Dep. Var.: dv
#> Observations: 384 
#> Fixed-effects: judge: 10
#> Standard-errors: Clustered (judge) 
#>                 Estimate Std. Error  t value Pr(>|t|)    
#> languageFrench -0.683718   0.275539 -2.48139 0.013087 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Log-Likelihood: -211.1   Adj. Pseudo R2: 0.053406
#>            BIC:  487.6     Squared Cor.: 0.114888

ci(mod, method = "wald")
#>        Parameter   CI   CI_low     CI_high
#> 1 languageFrench 0.95 -1.30703 -0.06040618
ci(mod, method = "residual")
#>        Parameter   CI    CI_low    CI_high
#> 1 languageFrench 0.95 -1.225481 -0.1419557

p_value(mod, method = "wald")
#>        Parameter          p
#> 1 languageFrench 0.03491193
p_value(mod, method = "residual")
#>        Parameter          p
#> 1 languageFrench 0.01351663

model_parameters(mod, ci_type = "wald")
#> # Fixed Effects
#> 
#> Parameter         | Log-Odds |   SE |         95% CI |     z | df |     p
#> -------------------------------------------------------------------------
#> language [French] |    -0.68 | 0.28 | [-1.31, -0.06] | -2.48 |  9 | 0.035
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald z-distribution approximation.
#> 
#> The model has a log- or logit-link. Consider using `exponentiate =
#>   TRUE` to interpret coefficients as ratios.
model_parameters(mod, ci_type = "residual")
#> # Fixed Effects
#> 
#> Parameter         | Log-Odds |   SE |         95% CI |     z | df |     p
#> -------------------------------------------------------------------------
#> language [French] |    -0.68 | 0.28 | [-1.31, -0.06] | -2.48 |  9 | 0.035
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald z-distribution approximation.

Created on 2023-08-15 with reprex v2.0.2

@strengejacke (Member)

What would be good to know is a) whether fixest always returns a t-value (and not z) as test statistic, and if not, b) what is the best way to determine which test statistic was used? insight::find_statistic() currently returns "z-statistic" for the above model, although the output states "t value".

I guess fixing insight::find_statistic() and probably insight::get_dof() first, and then revising the methods in parameters (defaulting to ci_method = "residual") should be the next steps to resolve this issue.

@strengejacke (Member)

(cf. degrees_of_freedom.fixest <- function(model, method = "wald", ...) {)

@lrberge commented Aug 15, 2023

I'm on holiday without a computer, so I can't be very thorough, and sorry my first reply was off, but:

  • the "conservative DF" of the OP is the one fixest uses for the t-test. This is governed by how the vcov is computed; see the ssc argument t.df. If the user passes ssc(t.df = "conv") when computing the vcov, we end up with something close to the normal with large N and clustered SEs (the one the OP had in mind). The reference on this is a paper on clustered SEs by the usual suspects, Cameron et al.
  • on z vs t: feols is t; for feglm it depends on the model. It actually mimics glm, so it should be t for Poisson and logit (methinks)
  • on the best way to find out... I don't know off the top of my head. It's in the column names of the $coeftable element for sure, but there may be better ways
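A sketch of the first point, comparing the two small-sample choices (an assumption based on the description above: that summary.fixest() accepts an ssc argument built with fixest's ssc() helper):

```r
library(fixest)
library(carData)

d <- Greene
d$dv <- ifelse(Greene$decision == "yes", 1, 0)
mod <- feglm(dv ~ language | judge, data = d, family = "logit")

# default: t.df discards variables nested in the cluster -> small df
summary(mod, cluster = "judge")

# "conventional" t df: with large N and clustered SEs this gets
# close to normal-based inference (what the OP had in mind)
summary(mod, cluster = "judge", ssc = ssc(t.df = "conv"))
```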

@strengejacke (Member)

There's a mismatch between summary() and $coeftable:

library(fixest)
library(carData)

d <- Greene
d$dv <- ifelse(Greene$decision == 'yes', 1, 0)

# These two models are equivalent
m <- feglm(dv ~ language | judge,
             data = d,
             family = 'logit')

# z-value?
m$coeftable
#>                  Estimate Std. Error   z value   Pr(>|z|)
#> languageFrench -0.6837183  0.3241979 -2.108953 0.03494862

# t-value?
summary(m)
#> GLM estimation, family = binomial(link = "logit"), Dep. Var.: dv
#> Observations: 384 
#> Fixed-effects: judge: 10
#> Standard-errors: Clustered (judge) 
#>                 Estimate Std. Error  t value Pr(>|t|)    
#> languageFrench -0.683718   0.275539 -2.48139 0.013087 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Log-Likelihood: -211.1   Adj. Pseudo R2: 0.053406
#>            BIC:  487.6     Squared Cor.: 0.114888

Created on 2023-08-23 with reprex v2.0.2

@lrberge commented Aug 28, 2023

Indeed, and it's expected behavior. Since computing VCOVs is costly, it is delayed in fixest until the user asks to do something with the object (and it does impact coeftable).

Once you pass summary(), it's the "right" standard errors.

I advise in the doc (I think, but maybe not!) to use user-level methods to access the objects, like the function coeftable, but it's not super clear, and, importantly, it is not the standard in R (where we're used to accessing stuff directly in the object).

Sorry for the implementation that creates confusion.
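In other words (a sketch, using coeftable() as the user-level accessor mentioned above and assuming it forwards summary-style arguments like cluster):

```r
library(fixest)
library(carData)

d <- Greene
d$dv <- ifelse(Greene$decision == "yes", 1, 0)
m <- feglm(dv ~ language | judge, data = d, family = "logit")

# raw slot: filled at estimation time with the default vcov,
# hence the mismatch with summary()
m$coeftable

# user-level accessor: recomputes the vcov on request
coeftable(m, cluster = "judge")

# after summary(), the slot holds the requested (clustered) values
summary(m, cluster = "judge")$coeftable
```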

@strengejacke (Member)

library(parameters)
library(fixest)
#> 
#> Attaching package: 'fixest'
#> The following objects are masked from 'package:parameters':
#> 
#>     demean, dof
library(carData)
d <- Greene
d$dv <- ifelse(Greene$decision == 'yes', 1, 0)

mod <- feglm(dv ~ language | judge,
             data = d,
             cluster = c('judge'), family = 'logit')

summary(mod)
#> GLM estimation, family = binomial(link = "logit"), Dep. Var.: dv
#> Observations: 384 
#> Fixed-effects: judge: 10
#> Standard-errors: Clustered (judge) 
#>                 Estimate Std. Error  t value Pr(>|t|)    
#> languageFrench -0.683718   0.275539 -2.48139 0.013087 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Log-Likelihood: -211.1   Adj. Pseudo R2: 0.053406
#>            BIC:  487.6     Squared Cor.: 0.114888

model_parameters(mod)
#> # Fixed Effects
#> 
#> Parameter         | Log-Odds |   SE |         95% CI | t(382) |     p
#> ---------------------------------------------------------------------
#> language [French] |    -0.68 | 0.28 | [-1.23, -0.14] |  -2.48 | 0.014
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald t-distribution approximation.
#> 
#> The model has a log- or logit-link. Consider using `exponentiate =
#>   TRUE` to interpret coefficients as ratios.

Created on 2023-09-09 with reprex v2.0.2

Do we have any examples for z-statistic models?

@strengejacke (Member)

(requires easystats/insight#802 to be installed to work correctly)

@strengejacke (Member) commented Sep 9, 2023

Ok, this is confusing. How do I know, when I have a t-statistic, whether we need the residual df or the t df to compute p-values?

library(parameters)
library(fixest)
#> 
#> Attaching package: 'fixest'
#> The following objects are masked from 'package:parameters':
#> 
#>     demean, dof
library(carData)

data("qol_cancer")
d <- Greene
d$dv <- ifelse(Greene$decision == "yes", 1, 0)
qol_cancer <- cbind(
  qol_cancer,
  datawizard::demean(qol_cancer, select = c("phq4", "QoL"), group = "ID")
)

mod1 <- feglm(dv ~ language | judge,
  data = d,
  cluster = c("judge"), family = "logit"
)
mod2 <- fixest::feols(
  QoL ~ time + phq4 | ID,
  data = qol_cancer
)

summary(mod1)
#> GLM estimation, family = binomial(link = "logit"), Dep. Var.: dv
#> Observations: 384 
#> Fixed-effects: judge: 10
#> Standard-errors: Clustered (judge) 
#>                 Estimate Std. Error  t value Pr(>|t|)    
#> languageFrench -0.683718   0.275539 -2.48139 0.013087 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Log-Likelihood: -211.1   Adj. Pseudo R2: 0.053406
#>            BIC:  487.6     Squared Cor.: 0.114888
statistic <- insight::get_statistic(mod1)$Statistic
statistic
#> [1] -2.481386

# no "t" df here, mismatch between p and p from summary
degrees_freedom(mod1, type = "t")
#> [1] 9
dof <- 9
2 * stats::pt(abs(statistic), df = dof, lower.tail = FALSE)
#> [1] 0.03491193

# resid-df are close and *almost* match "p" from summary
degrees_freedom(mod1, type = "resid")
#> [1] 382
dof <- 382
2 * stats::pt(abs(statistic), df = dof, lower.tail = FALSE)
#> [1] 0.01351663

# Inf seems correct here
dof <- Inf
2 * stats::pt(abs(statistic), df = dof, lower.tail = FALSE)
#> [1] 0.01308724

summary(mod2)
#> OLS estimation, Dep. Var.: QoL
#> Observations: 564 
#> Fixed-effects: ID: 188
#> Standard-errors: Clustered (ID) 
#>      Estimate Std. Error  t value   Pr(>|t|)    
#> time  1.08785   0.669197  1.62561 1.0572e-01    
#> phq4 -3.66052   0.671116 -5.45438 1.5425e-07 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 10.1     Adj. R2: 0.613325
#>              Within R2: 0.179856
statistic <- insight::get_statistic(mod2)$Statistic
statistic
#> [1]  1.625609 -5.454377

# now t-df are correct and match "p" from summary
degrees_freedom(mod2, type = "t")
#> [1] 187
dof <- 187
2 * stats::pt(abs(statistic), df = dof, lower.tail = FALSE)
#> [1] 1.057170e-01 1.542519e-07

# resid-df are *not* correct and do not match "p" from summary
degrees_freedom(mod2, type = "resid")
#> [1] 561
dof <- 561
2 * stats::pt(abs(statistic), df = dof, lower.tail = FALSE)
#> [1] 1.045945e-01 7.379301e-08

# Inf not correct
dof <- Inf
2 * stats::pt(abs(statistic), df = dof, lower.tail = FALSE)
#> [1] 1.040329e-01 4.914483e-08

Created on 2023-09-09 with reprex v2.0.2

@strengejacke (Member)

To summarize:

  • feglm: df = Inf, test statistic = t
  • feols: df = resid, test statistic = t

@strengejacke (Member)

So we need to check

> mod1$method
[1] "feglm"
> mod1$method_type
[1] "feglm"
> mod2$method
[1] "feols"
> mod2$method_type
[1] "feols"

?
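If that check works, the dispatch could be sketched roughly like this (fixest_df() is a hypothetical helper, not a parameters function; it ignores quasi families, and the expected output is the df = Inf value from the earlier reprex):

```r
library(fixest)
library(carData)

d <- Greene
d$dv <- ifelse(Greene$decision == "yes", 1, 0)
mod1 <- feglm(dv ~ language | judge,
              data = d, cluster = "judge", family = "logit")

# hypothetical helper: feols -> fixest's t df; everything else
# (here: a logit feglm) -> normal, i.e. df = Inf
fixest_df <- function(model) {
  if (identical(model$method, "feols")) {
    fixest::degrees_freedom(model, type = "t")
  } else {
    Inf
  }
}

2 * pt(abs(insight::get_statistic(mod1)$Statistic),
       df = fixest_df(mod1), lower.tail = FALSE)
# [1] 0.01308724  (matches Pr(>|t|) from summary(mod1))
```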

strengejacke added a commit that referenced this issue Sep 9, 2023:

  • Wrong DFs in `model_parameters.fixest` (fixes #892)
  • desc, version
  • add test
  • fix test
  • fix
@lrberge commented Sep 9, 2023

The z value is used for binomial and Poisson models (quasi families use t). This is identical to what stats::glm does.

To be clear:

library(fixest)

# tiny data set
base = setNames(head(iris, 10), c("y", "x1", "x2", "x3", "species"))
set.seed(1)
base$y01 = sample(0:1, nrow(base), TRUE)

gather_pvalues = function(family){
    est_fixest = feglm(y01 ~ x1 + x2, base, family = family, vcov = iid ~ ssc(adj = FALSE))
    est_glm    =   glm(y01 ~ x1 + x2, base, family = family)
    
    tstat = tstat(est_fixest, keep = "x1")
    
    res = c(fixest = pvalue(est_fixest, keep = "x1"),
            glm = pvalue(est_glm, keep = "x1"),
            normal = pnorm(-abs(tstat)) * 2,
            "t" = pt(-abs(tstat), degrees_freedom(est_fixest, "resid")) * 2)
    
    round(res, 4)
}

# N = normal; t = Student
cbind("logit (N)" = gather_pvalues(binomial()),
      "probit (N)" = gather_pvalues(binomial("probit")),
      "poisson (N)" = gather_pvalues(poisson()),
      "quasipoisson (t)" = gather_pvalues(quasipoisson()),
      "gaussian (t)" = gather_pvalues(gaussian()))
#>           logit (N) probit (N) poisson (N) quasipoisson (t) gaussian (t)
#> fixest.x1    0.2139     0.2099      0.3168           0.3275       0.2707
#> glm.x1       0.2139     0.2099      0.3168           0.3275       0.2707
#> normal.x1    0.2139     0.2099      0.3168           0.2925       0.2318
#> t.x1         0.2539     0.2502      0.3501           0.3275       0.2707

BTW, there was a century-old display bug in fixest's output (z values displaying as t values), so using the names of the coeftable certainly didn't work. Thanks for making me fix it!

@strengejacke (Member)

Thanks, that helped me understand how to fix this in parameters!

@vincentarelbundock (Contributor, Author)

Thanks, all, for your work and insights into this. I really appreciate your time!
