Improve plot() for check_predictions() #290

Closed
strengejacke opened this issue May 23, 2023 · 11 comments · Fixed by #291
Labels
Enhancement 💥 New feature or request

Comments

@strengejacke
Member

strengejacke commented May 23, 2023

For models with integer/categorical outcomes, we could use bars instead of density lines, like in:

https://solomonkurz.netlify.app/blog/2023-05-21-causal-inference-with-ordinal-regression/

image

@strengejacke strengejacke added the Enhancement 💥 New feature or request label May 23, 2023
@strengejacke strengejacke self-assigned this May 24, 2023
strengejacke added a commit that referenced this issue May 24, 2023
strengejacke added a commit that referenced this issue May 24, 2023
@strengejacke
Member Author

library(performance)
library(see)
mtcars$vs2 <- factor(letters[mtcars$vs + 1])
model <- glm(vs2 ~ wt + mpg, data = mtcars, family = "binomial")

plot(check_predictions(model))

plot(check_predictions(model), type = "dots")

## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
d <- data.frame(treatment, outcome, counts) # combine into one data frame
model <- glm(counts ~ outcome + treatment, family = poisson(), data = d)

plot(check_predictions(model))

plot(check_predictions(model), type = "dots")

Created on 2023-05-24 with reprex v2.0.2

@strengejacke
Member Author

library(performance)
library(see)
library(glmmTMB)

data(Salamanders)
m1 <- glmmTMB(count ~ mined + (1 | site),
  zi = ~mined,
  family = poisson, data = Salamanders
)

plot(check_predictions(m1))

plot(check_predictions(m1), type = "dots")

Created on 2023-05-24 with reprex v2.0.2

@strengejacke
Member Author

Maybe this example is quite good to demonstrate why dots can be more useful than lines in some situations:

library(performance)
library(see)

set.seed(99)
d <- iris
d$skewed <- rpois(150, 1)
m2 <- glm(skewed ~ Species + Petal.Length + Petal.Width, data = d, family = poisson())
out <- check_predictions(m2)
plot(out)

plot(out, type = "dots")

Created on 2023-05-24 with reprex v2.0.2
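
The same point can be seen without plotting. The following is only a rough base-R sketch, not the code used in performance or see: a kernel density estimate of a count variable assigns density to non-integer values that the outcome can never take, whereas a plain tabulation stays on the support of the data. The object names (`y`, `obs_counts`, `dens_at_half`) are made up for illustration.

```r
set.seed(99)
y <- rpois(150, 1)

# Tabulate the observed counts: all mass sits on the integers.
obs_counts <- table(y)

# A kernel density estimate smooths across the gaps between integers.
dens <- density(y)

# Interpolate the estimated density half-way between 0 and 1,
# a value a count outcome can never take.
dens_at_half <- approx(dens$x, dens$y, xout = 0.5)$y

print(obs_counts)
print(dens_at_half)
```

A positive `dens_at_half` is the kind of artefact that the dot-style plot avoids for discrete outcomes.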

@bwiernik
Contributor

Thanks for taking the initiative on this! Really happy to see this added.

I like the use of the green dot and thin line rather than the bars for the observed data. It feels more consistent with the density.

Compared with the design of bayesplot::ppc_bars() (what's shown in the first post), I can see the benefits of both the "replications as median+interval" and "replications as dots" options. I think the median+interval option is easier for a quick check (is the observed value in the interval), but the

Could we support both? Perhaps have options type = c("density", "discrete_dots", "discrete_interval", "discrete_both"), where "density" is the original continuous density method, "discrete_dots" is what you have implemented, "discrete_interval" replaces the dots with a cross-bar at the median replication value and a ci percent interval bar, and "discrete_both" has both dots and intervals?
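
To make the "median + interval" idea concrete, here is a minimal base-R sketch of the summary such a plot would need. It is an assumption about how this could be computed, not the implementation in performance or see: replications are drawn with `stats::simulate()`, and `values`, `rep_counts`, `summary_tab` and the 90% `ci` are hypothetical names and settings chosen for illustration.

```r
set.seed(99)
d <- iris
d$skewed <- rpois(150, 1)
m2 <- glm(skewed ~ Species + Petal.Length + Petal.Width, data = d, family = poisson())

# Draw replicated outcomes from the fitted model (one column per replication).
yrep <- simulate(m2, nsim = 50)

# Count how often each outcome value occurs in every replication.
values <- 0:max(d$skewed, unlist(yrep))
rep_counts <- sapply(yrep, function(y) as.vector(table(factor(y, levels = values))))

# Summarise the replications per outcome value: median and a central interval.
ci <- 0.9 # hypothetical interval width
probs <- c((1 - ci) / 2, 0.5, 1 - (1 - ci) / 2)
summary_tab <- data.frame(
  value = values,
  observed = as.vector(table(factor(d$skewed, levels = values))),
  t(apply(rep_counts, 1, quantile, probs = probs))
)
names(summary_tab)[3:5] <- c("ci_low", "median", "ci_high")

print(summary_tab)
```

The quick check described above then amounts to asking, per row, whether `observed` falls between `ci_low` and `ci_high`.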

For all of these options, I think the green line for observed data should be added last to the plot so that it is on top and not obscured by the replication lines/dots.
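
As a sketch of the layering point, assuming the plot is built with ggplot2: layers are drawn in the order they are added, so adding the observed data last puts it on top. The numbers below are made-up toy counts, and `rep_df`, `obs_df` and the colours are hypothetical, not taken from the see package.

```r
library(ggplot2)

# Toy data: counts per outcome value for three replications and for the observed data.
rep_df <- data.frame(
  x = rep(0:3, times = 3),
  count = c(58, 50, 30, 12, 52, 57, 27, 14, 60, 49, 25, 16)
)
obs_df <- data.frame(x = 0:3, count = c(55, 52, 28, 15))

p <- ggplot() +
  # Replications first, so they end up underneath.
  geom_point(data = rep_df, aes(x = x, y = count), colour = "grey60", size = 2) +
  # Observed data last, so it is drawn on top and is not obscured.
  geom_point(data = obs_df, aes(x = x, y = count), colour = "darkgreen", size = 3)

p
```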

@DominiqueMakowski
Member

The dot plot lacks a legend to clarify which points are "model predictions" and which is the observed one.

@strengejacke
Member Author

True, why didn't I realize that? 🙄

@strengejacke
Member Author

library(performance)
library(see)

set.seed(99)
d <- iris
d$skewed <- rpois(150, 1)
m2 <- glm(skewed ~ Species + Petal.Length + Petal.Width, data = d, family = poisson())
out <- check_predictions(m2)

plot(out)
#> The model has an integer or a categorical response variable.
#>   It is recommended to switch to a dot-plot style, e.g.
#>   `plot(check_model(model), type = "discrete_dots"`.

plot(out, type = "discrete_dots")

plot(out, type = "discrete_interval")

plot(out, type = "discrete_both")

Created on 2023-05-25 with reprex v2.0.2

@strengejacke
Member Author

What would you suggest as default for models with integer/ordinal/binary/... outcome?

@strengejacke
Member Author

See the updated subtitle for intervals:

library(performance)
library(see)

set.seed(99)
d <- iris
d$skewed <- rpois(150, 1)
m2 <- glm(skewed ~ Species + Petal.Length + Petal.Width, data = d, family = poisson())
out <- check_predictions(m2)
plot(out, type = "discrete_interval")

Created on 2023-05-25 with reprex v2.0.2

@strengejacke
Member Author

Maybe simple dots are enough for the observed data points?

library(performance)
library(see)

set.seed(99)
d <- iris
d$skewed <- rpois(150, 1)
m2 <- glm(skewed ~ Species + Petal.Length + Petal.Width, data = d, family = poisson())
out <- check_predictions(m2)
plot(out)
#> The model has an integer or a categorical response variable.
#>   It is recommended to switch to a dot-plot style, e.g.
#>   `plot(check_model(model), type = "discrete_dots"`.

plot(out, type = "discrete_dots")

plot(out, type = "discrete_interval")

plot(out, type = "discrete_both")

Created on 2023-05-25 with reprex v2.0.2

@strengejacke
Member Author

strengejacke commented May 25, 2023

library(performance)
library(see)

set.seed(99)
d <- iris
d$skewed <- rpois(150, 1)
m2 <- glm(skewed ~ Species + Petal.Length + Petal.Width, data = d, family = poisson())
out <- check_predictions(m2)
plot(out) + theme_blackboard()
#> The model has an integer or a categorical response variable.
#>   It is recommended to switch to a dot-plot style, e.g.
#>   `plot(check_model(model), type = "discrete_dots"`.

plot(out, type = "discrete_dots") + theme_blackboard()

plot(out, type = "discrete_interval") + theme_blackboard()

plot(out, type = "discrete_both") + theme_blackboard()

Created on 2023-05-25 with reprex v2.0.2

strengejacke added a commit that referenced this issue May 25, 2023
* Improve plot() for check_predictions()
Fixes #290

* Improve plot() for check_predictions()
Fixes #290

* allow type arg

* minor

* fix y axis label

* add example

* example

* fix some issues

* add discrete_options

* subtitle

* set subtitle

* no lollipop

* return value

* add namespace

* fix issues