need response variable in new data for add_pi #45

Open
nlichti opened this issue Dec 5, 2019 · 4 comments

@nlichti

nlichti commented Dec 5, 2019

Very useful package. I have a minor suggestion: in add_pi (and possibly other functions - I haven't checked), an error is thrown if tb does not include a column for the response variable. The actual values in the column are ignored, but the column has to be present. I'd guess this is due to some internal code similar to:
X <- formula(fit) %>% model.matrix(data = tb)
used to get the design matrix for simulation-based prediction intervals. Using a few more steps should eliminate the need to include the response. Something like:
chr_formula <- formula(fit) %>% deparse() %>% strsplit(' ') %>% getElement(1)
X <- as.formula(chr_formula[-1]) %>% model.matrix(data = tb)
I noticed this specifically with a Poisson GLM. add_ci did not require the response to be in tb.
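
A minimal sketch of the same idea using base R's `delete.response()`, which drops the response from a model's terms object so the design matrix can be built from new data that has no response column. The Poisson model on `mtcars` and the names `tt`, `X`, `eta`, and `mu` are illustrative assumptions, not ciTools internals.

```r
# Illustrative Poisson GLM on a built-in dataset (an assumption for this sketch,
# not the model from the original report).
fit <- glm(carb ~ disp, family = poisson(), data = mtcars)

# New data WITHOUT the response column.
tb <- data.frame(disp = c(100, 200, 300))

# Drop the response from the fitted model's terms, then build the design matrix.
tt <- delete.response(terms(fit))
X <- model.matrix(tt, data = tb)

# Linear predictor and mean on the response scale.
eta <- drop(X %*% coef(fit))
mu <- fit$family$linkinv(eta)
```

Working from the terms object avoids deparsing and re-parsing the formula as text, which can break on long formulas that `deparse()` splits across several strings.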

@jthaman
Owner

jthaman commented Aug 19, 2020

Thanks for this note. I will take a look.

@FlukeAndFeather

I think I might be running into a bug related to this. Here's a reprex:

library(tidyverse)
mod <- glm(mpg ~ disp, family = Gamma(), data = mtcars)
tibble(disp = seq(min(mtcars$disp), max(mtcars$disp), length.out = 10)) %>% 
    ciTools::add_pi(mod)
#> Error in model.frame.default(formula = mpg ~ disp, data = structure(list(: invalid type (list) for variable 'mpg'

Created on 2021-09-05 by the reprex package (v2.0.0)

Weirdly, there's no error if I leave out the family argument (so glm falls back to its default, gaussian):

library(tidyverse)
mod <- glm(mpg ~ disp, data = mtcars)
tibble(disp = seq(min(mtcars$disp), max(mtcars$disp), length.out = 10)) %>% 
    ciTools::add_pi(mod)
#>        disp     pred  LPB0.025 UPB0.975
#> 1   71.1000 26.66946 19.753419 33.58550
#> 2  115.6444 24.83356 17.999921 31.66719
#> 3  160.1889 22.99765 16.220266 29.77504
#> 4  204.7333 21.16175 14.413797 27.90969
#> 5  249.2778 19.32584 12.580164 26.07152
#> 6  293.8222 17.48994 10.719340 24.26053
#> 7  338.3667 15.65403  8.831623 22.47644
#> 8  382.9111 13.81813  6.917619 20.71864
#> 9  427.4556 11.98222  4.978206 18.98624
#> 10 472.0000 10.14632  3.014492 17.27814

Created on 2021-09-05 by the reprex package (v2.0.0)

@akarlinsky

Ran into this bug as well.

@akarlinsky

Has anyone found a way around it? I can't estimate a PI because of this bug. I tried fitting the glm with y = TRUE to keep the dependent variable in the model object, and I also tried adding a dependent variable column to the new data.
No luck :(
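
One possible stopgap, sketched under stated assumptions: compute a plug-in interval directly from the fitted Gamma distribution, bypassing `add_pi`. This is not what ciTools does internally. It ignores uncertainty in the estimated coefficients and dispersion, so the intervals will tend to be somewhat too narrow. The column names `lwr` and `upr` are arbitrary.

```r
# Same model as in the reprex above.
mod <- glm(mpg ~ disp, family = Gamma(), data = mtcars)

# New data with no response column.
new_data <- data.frame(
  disp = seq(min(mtcars$disp), max(mtcars$disp), length.out = 10)
)

# Fitted means on the response scale; predict() does not need the response.
mu <- predict(mod, newdata = new_data, type = "response")

# Estimated dispersion phi. A Gamma GLM has Var(Y) = phi * mu^2, which
# corresponds to a Gamma distribution with shape = 1 / phi and rate = shape / mu.
phi <- summary(mod)$dispersion
shape <- 1 / phi

# Plug-in 95% interval from the Gamma quantile function.
new_data$pred <- mu
new_data$lwr <- qgamma(0.025, shape = shape, rate = shape / mu)
new_data$upr <- qgamma(0.975, shape = shape, rate = shape / mu)
```

If coefficient uncertainty matters, a parametric bootstrap over refitted models would be a more faithful substitute, at the cost of more code.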
