Feature request: option to pass vcov to `add_global_p(anova_fun = tidy_wald_test)` for robust p-values #2076

aghaynes · 2024-11-21T11:02:00Z

Is your feature request related to a problem? Please describe.
When adjusting variances via e.g. clustered variances, it's possible to adjust the confidence interval, but not the p-value.

tbl_regression(m, exponentiate = TRUE, 
               tidy_fun = \(x, ...) 
               tidy_robust(
                 x,
                 exponentiate = TRUE,
                 conf.level = 0.95,
                 conf.int = TRUE,
                 vcov = cov
               )
               ) %>% 
  # get the results of the wald test
  add_global_p(anova_fun = tidy_wald_test)

Describe the solution you'd like
It would be great to be able to pass a vcov to (e.g. cov in the above code) to tidy_wald_test so that aod::wald.test can also use it.

e.g.

... |>
add_global_p(anova_fun = tidy_wald_test, vcov = cov)

Describe alternatives you've considered
@mroumet and I managed to backwards engineer a solution that takes a cov object from the global environment. Maybe it's possible to pass it as an option instead though. I indicate the modification we made with an arrow.

tidy_wald_test2 <- 
  function (x, tidy_fun = NULL,...) {
    # assert_package("aod", "tidy_wald_test()")
    tidy_fun <- tidy_fun %||% broom.helpers::tidy_with_broom_or_parameters
    broom.helpers::tidy_and_attach(model = x, tidy_fun = tidy_fun) %>% 
      broom.helpers::tidy_identify_variables() %>% dplyr::select(term = "variable", 
                                                                 model_terms = "term") %>% dplyr::mutate(term_id = dplyr::row_number()) %>% 
      tidyr::nest(data = -"term") %>% dplyr::rowwise() %>% 
      dplyr::mutate(model_terms = unlist(.data$data[["model_terms"]]) %>% 
                      list(), model_terms_id = rlang::set_names(.data$data[["term_id"]]) %>% 
                      list(), wald_test = aod::wald.test(Sigma = cov,       ########## <----------------------------
                                                         b = stats::coef(x), Terms = .data$model_terms_id) %>% 
                      list(), df = .data$wald_test$result$chi2 %>% purrr::pluck("df"), 
                    statistic = .data$wald_test$result$chi2 %>% purrr::pluck("chi2"), 
                    p.value = .data$wald_test$result$chi2 %>% purrr::pluck("P"), 
      ) %>% dplyr::ungroup() %>% dplyr::select("term", 
                                               "df", "statistic", "p.value")
  }

... |>
add_global_p(anova_fun = tidy_wald_test2)

I've just found that the proposed edits (passing it via an option) are very easy and seem to work... I'll make a PR shortly.

Additional context

MRE using a example data from the sandwich package

library(sandwich)
library(gtsummary)
library(rlang)

data("InstInnovation", package = "sandwich")

## replication of one-way clustered standard errors for model 3, Table I
## and model 1, Table II in Berger et al. (2017), see ?InstInnovation

## count regression formula
f1 <- cites ~ institutions + log(capital/employment) + log(sales) + year

## model 3, Table I: Poisson model
## one-way clustered standard errors
m <- glm(f1, data = InstInnovation, family = poisson)

# define var-covar
cov <- vcovCL(m, cluster = ~ company, vcov_type = "HC0")

# Define a function to perform Wald test using robust variance covariance matrix----
# The function is actually a slight modification of tidy_wald_test . In this version we define Sigma = cov (where cov is the robust variance-covariance matrix) insted of Sigma = stats::vcov(x).

tidy_wald_test2 <- 
  function (x, tidy_fun = NULL,...) {
    # assert_package("aod", "tidy_wald_test()")
    tidy_fun <- tidy_fun %||% broom.helpers::tidy_with_broom_or_parameters
    broom.helpers::tidy_and_attach(model = x, tidy_fun = tidy_fun) %>% 
      broom.helpers::tidy_identify_variables() %>% dplyr::select(term = "variable", 
                                                                 model_terms = "term") %>% dplyr::mutate(term_id = dplyr::row_number()) %>% 
      tidyr::nest(data = -"term") %>% dplyr::rowwise() %>% 
      dplyr::mutate(model_terms = unlist(.data$data[["model_terms"]]) %>% 
                      list(), model_terms_id = rlang::set_names(.data$data[["term_id"]]) %>% 
                      list(), wald_test = aod::wald.test(Sigma = cov,      ########## <----------------------------
                                                         b = stats::coef(x), Terms = .data$model_terms_id) %>% 
                      list(), df = .data$wald_test$result$chi2 %>% purrr::pluck("df"), 
                    statistic = .data$wald_test$result$chi2 %>% purrr::pluck("chi2"), 
                    p.value = .data$wald_test$result$chi2 %>% purrr::pluck("P"), 
      ) %>% dplyr::ungroup() %>% dplyr::select("term", 
                                               "df", "statistic", "p.value")
  }

# Get the Robust variance ----
tab <- tbl_regression(m, exponentiate = TRUE, 
               tidy_fun = \(x, ...) 
               tidy_robust(
                 x,
                 exponentiate = TRUE,
                 conf.level = 0.95,
                 conf.int = TRUE,
                 vcov = cov
               )
               ) 
  # get the results of the wald test
tab %>% add_global_p(anova_fun = tidy_wald_test)
tab %>%  add_global_p(anova_fun = tidy_wald_test2)

The text was updated successfully, but these errors were encountered:

ddsjoberg · 2024-11-25T15:42:10Z

Added in #2077

ddsjoberg · 2024-12-01T00:23:31Z

Hey @aghaynes ! Thanks again for the contribution to make to make it easier for robust standard errors!

Would you be open to another contribution? I think an example in our FAQ article illustrating how to use robust standard errors using gtsummary::tidy_robust() for the regular p-value from the model and gtsummary::tidy_wald(vcov) for the variable p-values? If possible, do you think we could make this happen using only packages already listed in the DESCRIPTION file?

What do you think?
https://www.danieldsjoberg.com/gtsummary/articles/gallery.html

aghaynes · 2024-12-02T08:19:40Z

Hey @ddsjoberg! Great idea. I'll have a look. Maybe there's a suitable dataset in aod. For it to be useful, it would probably be necessary to actually see a change in the p-value (and CI?), which isn't always the case with these adjustments (they're often only in the decimal places somewhere).

ddsjoberg · 2024-12-02T18:40:37Z

Awesome, thank you! If {aod} doesn't have something, I sometimes use this as an example for correlated data

gtsummary::trial |> 
  dplyr::mutate(subject_id = dplyr::row_number(), .by = trt)
#> # A tibble: 200 × 9
#>    trt      age marker stage grade response death ttdeath subject_id
#>    <chr>  <dbl>  <dbl> <fct> <fct>    <int> <int>   <dbl>      <int>
#>  1 Drug A    23  0.16  T1    II           0     0    24            1
#>  2 Drug B     9  1.11  T2    I            1     0    24            1
#>  3 Drug A    31  0.277 T1    II           0     0    24            2
#>  4 Drug A    NA  2.07  T3    III          1     1    17.6          3
#>  5 Drug A    51  2.77  T4    III          1     1    16.4          4
#>  6 Drug B    39  0.613 T4    I            0     1    15.6          2
#>  7 Drug A    37  0.354 T1    II           0     0    24            5
#>  8 Drug A    32  1.74  T1    I            0     1    18.4          6
#>  9 Drug A    31  0.144 T1    II           0     0    24            7
#> 10 Drug B    34  0.205 T3    I            0     1    10.5          3
#> # ℹ 190 more rows

^{Created on 2024-12-02 with reprex v2.1.1}

which gives us a data set where each subject received both Drug A and Drug B (there trt column).

aghaynes mentioned this issue Nov 21, 2024

add vcov argument to tidy_wald_test #2077

Merged

11 tasks

ddsjoberg closed this as completed Nov 25, 2024

ddsjoberg reopened this Dec 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature request: option to pass vcov to `add_global_p(anova_fun = tidy_wald_test)` for robust p-values #2076

Feature request: option to pass vcov to `add_global_p(anova_fun = tidy_wald_test)` for robust p-values #2076

aghaynes commented Nov 21, 2024

ddsjoberg commented Nov 25, 2024

ddsjoberg commented Dec 1, 2024

aghaynes commented Dec 2, 2024

ddsjoberg commented Dec 2, 2024 •

edited

Loading

Feature request: option to pass vcov to add_global_p(anova_fun = tidy_wald_test) for robust p-values #2076

Feature request: option to pass vcov to add_global_p(anova_fun = tidy_wald_test) for robust p-values #2076

Comments

aghaynes commented Nov 21, 2024

ddsjoberg commented Nov 25, 2024

ddsjoberg commented Dec 1, 2024

aghaynes commented Dec 2, 2024

ddsjoberg commented Dec 2, 2024 • edited Loading

Feature request: option to pass vcov to `add_global_p(anova_fun = tidy_wald_test)` for robust p-values #2076

Feature request: option to pass vcov to `add_global_p(anova_fun = tidy_wald_test)` for robust p-values #2076

ddsjoberg commented Dec 2, 2024 •

edited

Loading