Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add some model components #5

Merged
merged 107 commits into from
Jul 10, 2024
Merged
Show file tree
Hide file tree
Changes from 27 commits
Commits
Show all changes
107 commits
Select commit Hold shift + click to select a range
6491e64
remove extra files
kaitejohnson Jun 26, 2024
342701a
add a .gitignore with R and data stuff
kaitejohnson Jun 26, 2024
4685a13
add git ignore from cfa-repo-template
kaitejohnson Jun 26, 2024
f66494a
make .github folder mirror cfa-repo-template
kaitejohnson Jun 26, 2024
d19fe05
remove attributes
kaitejohnson Jun 26, 2024
eeb3fe6
add bare bones componets of an R package
kaitejohnson Jun 26, 2024
1338587
update description
kaitejohnson Jun 26, 2024
112ae91
add description of package
kaitejohnson Jun 26, 2024
76d8574
Update README.md
kaitejohnson Jun 26, 2024
fcd6b15
fix pre-commit
kaitejohnson Jun 26, 2024
00fac5b
Merge branch 'format-as-package' of https://github.com/CDCgov/ww-infe…
kaitejohnson Jun 26, 2024
620d604
pre-commit on readme
kaitejohnson Jun 26, 2024
d8167b6
attempt to set up pkgdown
kaitejohnson Jun 26, 2024
178c1cf
correct path to deps
kaitejohnson Jun 26, 2024
9d444e8
remove call to package
kaitejohnson Jun 26, 2024
4f8ff3a
Create CODEOWNERS
kaitejohnson Jun 26, 2024
46f79c4
Create CONTRIBUTING.md
kaitejohnson Jun 26, 2024
45d949f
Create NEWS.md
kaitejohnson Jun 26, 2024
47364bd
Create SUPPORT.md
kaitejohnson Jun 26, 2024
9884e8c
add stan model and start of vignette
kaitejohnson Jun 27, 2024
70c69a5
start of generating simulated data
kaitejohnson Jun 27, 2024
d453696
add parameters, package, more documentation to gen simulated data
kaitejohnson Jun 28, 2024
513587b
add in functions for getting delay distributions
kaitejohnson Jun 28, 2024
fcad57a
add functions needed for generating simulated data
kaitejohnson Jun 28, 2024
e858a03
add simulated data as package data
kaitejohnson Jun 29, 2024
5a91ec8
add some initial pre-processing functions from other packages
kaitejohnson Jun 29, 2024
9afd74d
add functions to preprocess wastewater data
kaitejohnson Jun 30, 2024
f09fee2
add minimal hosp data preprocessing
kaitejohnson Jun 30, 2024
9f8ad5d
add internal package data for covid GI and delay
kaitejohnson Jun 30, 2024
c732701
add functions to create stan data, modify model to be more generally …
kaitejohnson Jun 30, 2024
900a5c7
simplify pre-processing in vignette
kaitejohnson Jul 1, 2024
6ead306
add draft of wrapper function to fit model
kaitejohnson Jul 1, 2024
d04ace0
add initialization functions
kaitejohnson Jul 1, 2024
e222e52
add a wrapper function around the model fitting
kaitejohnson Jul 2, 2024
0736f02
fix bug
kaitejohnson Jul 2, 2024
4f2a2d1
update gitignore to exclude stan binary and .Rproj
kaitejohnson Jul 2, 2024
3961d08
make it s.t. ww data needs a column called exclude, add preprocessing…
kaitejohnson Jul 3, 2024
d9bd1d7
add exclude column in preprocessing, make outlier flag optional step
kaitejohnson Jul 3, 2024
2a94332
add functions that set up model specification and mcmc options
kaitejohnson Jul 3, 2024
2894ae1
add wwinference function
kaitejohnson Jul 3, 2024
e6ef714
add post processing functions, realizing that I have an indexing erro…
kaitejohnson Jul 3, 2024
e8a3dfa
fix the handling of passing in col names, error was breaking LOD calling
kaitejohnson Jul 4, 2024
f5b314f
update documentation
kaitejohnson Jul 4, 2024
f558bd9
fix typo
kaitejohnson Jul 4, 2024
00c0543
adjust postprocess function, add in diagnostics
kaitejohnson Jul 5, 2024
f675dc4
add functions for quickly plotting outputs
kaitejohnson Jul 5, 2024
d272090
revert params back to current modle version
kaitejohnson Jul 5, 2024
728d83d
add plotting and diagnostics to vignette
kaitejohnson Jul 5, 2024
1028c72
fix merge conflicts with main from updating package infreastructure"
kaitejohnson Jul 5, 2024
f663ab6
fix merge conflicts with main from updating package infreastructure"
kaitejohnson Jul 5, 2024
06c8147
try running pre-commit locally
kaitejohnson Jul 5, 2024
64ab128
add dependences to DESCRIPTION
kaitejohnson Jul 5, 2024
fe88a39
add more deps
kaitejohnson Jul 5, 2024
b9c4eb8
try removing print statement
kaitejohnson Jul 5, 2024
56037c4
revert
kaitejohnson Jul 5, 2024
102da37
update vignette data
kaitejohnson Jul 5, 2024
b2d064e
fix typos
kaitejohnson Jul 5, 2024
9b3777f
remove example
kaitejohnson Jul 5, 2024
e770235
try adding bookdown to imports
kaitejohnson Jul 5, 2024
81ef6d8
fix to specify ISO8601 format for dates
kaitejohnson Jul 8, 2024
e7871a0
Update R/wwinference.R
kaitejohnson Jul 9, 2024
448af6f
Update inst/stan/functions/utils.stan
kaitejohnson Jul 9, 2024
4ee29b6
update checkers documentation, remove usesethis from imports
kaitejohnson Jul 9, 2024
10a030d
Merge branch 'add-some-model-components' of https://github.com/CDCgov…
kaitejohnson Jul 9, 2024
33390d4
remove additional stan model
kaitejohnson Jul 9, 2024
b751780
Exclude data-raw in dependency check
kaitejohnson Jul 9, 2024
933029b
remove print statement
kaitejohnson Jul 9, 2024
60eac88
move bookdown to suggests
kaitejohnson Jul 9, 2024
e59f421
Update R/delay_distribs.R
kaitejohnson Jul 10, 2024
00ce331
remove sgtf call
kaitejohnson Jul 10, 2024
8556a29
use to_simplex on discretized growth adjusted weibull
kaitejohnson Jul 10, 2024
965bc1f
fix description incubation period
kaitejohnson Jul 10, 2024
d28c419
use to simplex again
kaitejohnson Jul 10, 2024
307c966
revise figure description to be more general
kaitejohnson Jul 10, 2024
1265741
Update R/figures.R
kaitejohnson Jul 10, 2024
93c5355
Update R/figures.R
kaitejohnson Jul 10, 2024
64c7ad1
Update R/figures.R
kaitejohnson Jul 10, 2024
5b5d146
fix documentation and set rhat tolerance as fxn arg
kaitejohnson Jul 10, 2024
57399fc
change from p_high rhat to frac high rhat
kaitejohnson Jul 10, 2024
d56203b
rename postprocess to get_draws_df
kaitejohnson Jul 10, 2024
89b6178
Update R/preprocessing.R description
kaitejohnson Jul 10, 2024
6c45004
Update R/preprocessing.R description for hosp data
kaitejohnson Jul 10, 2024
40fa4c9
Update R/utils.R fix typo
kaitejohnson Jul 10, 2024
5919d8c
Update R/utils.R
kaitejohnson Jul 10, 2024
dd881a9
Update R/utils.R
kaitejohnson Jul 10, 2024
9144ec3
Update R/utils.R
kaitejohnson Jul 10, 2024
d2ebb91
Update R/utils.R
kaitejohnson Jul 10, 2024
5323019
Update R/utils.R
kaitejohnson Jul 10, 2024
6bc80d4
Update R/utils.R
kaitejohnson Jul 10, 2024
c408929
make col name for eval data general
kaitejohnson Jul 10, 2024
ba2cec8
Merge branch 'add-some-model-components' of https://github.com/CDCgov…
kaitejohnson Jul 10, 2024
8668baf
run pre-commit on all files
kaitejohnson Jul 10, 2024
cad3aa4
add col name argument to vignette
kaitejohnson Jul 10, 2024
9f8abb2
adjust R(t) figure title and y-axis label
kaitejohnson Jul 10, 2024
164fe96
fix typos in wwinference.Rmd
kaitejohnson Jul 10, 2024
8942981
Update vignettes/wwinference.Rmd
kaitejohnson Jul 10, 2024
0a61076
rename postprocess to get draws_df
kaitejohnson Jul 10, 2024
99ecb80
remove the functions needed to generate delay distribs from package f…
kaitejohnson Jul 10, 2024
2a4c7ec
Merge branch 'add-some-model-components' of https://github.com/CDCgov…
kaitejohnson Jul 10, 2024
f9849ba
fix documentation of eval data in figures
kaitejohnson Jul 10, 2024
bf26b3d
add documentation to functions in covid_pmf generation, remove drop f…
kaitejohnson Jul 10, 2024
a2f97a7
remove documentation of drop first and renormalize
kaitejohnson Jul 10, 2024
28f6d2d
Update data-raw/covid_pmfs.R
dylanhmorris Jul 10, 2024
d61400f
fix missed namespace call
kaitejohnson Jul 10, 2024
d0e35ad
add to change log with first PR into main
kaitejohnson Jul 10, 2024
c5b6348
Merge branch 'add-some-model-components' of https://github.com/CDCgov…
kaitejohnson Jul 10, 2024
1f892d4
run pre-commit
kaitejohnson Jul 10, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ export(generate_simulated_data)
export(get_count_data_sizes)
export(get_count_indices)
export(get_count_values)
export(get_draws_df)
export(get_ind_m)
export(get_mcmc_options)
export(get_model_diagnostic_flags)
Expand All @@ -30,7 +31,6 @@ export(indicate_ww_exclusions)
export(make_hospital_onset_delay_pmf)
export(make_incubation_period_pmf)
export(make_reporting_delay_pmf)
export(postprocess)
export(preprocess_hosp_data)
export(preprocess_ww_data)
export(simulate_double_censored_pmf)
Expand Down
14 changes: 7 additions & 7 deletions R/delay_distribs.R
dylanhmorris marked this conversation as resolved.
Show resolved Hide resolved
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
#' Drop the first element of a simplex
#' Drop the first element of a simplex and re-normalize the result to sum to 1.
#'
#' When this vector corresponds to the generation interval distribution, we
#' want to drop this first bin. The renewal equation assumes that same-day
Expand Down Expand Up @@ -77,9 +77,9 @@ simulate_double_censored_pmf <- function(
}

#' @title Make incubation period pmf
#' @description This makes a pmf corresponding to
#' the incubation period for COVID after Omicron used in Park et al 2023
#' These estimates are from early Omicron.
#' @description When the default arguments are used, this returns a pmf
#' corresponding to the incubation period for COVID after Omicron used in
#' Park et al 2023. These estimates are from early Omicron.
#' @param backward_scale numeric indicating the scale parameter for the Weibull
#' used in producing the incubation period distribution. default is `3.60` for
#' COVID
Expand Down Expand Up @@ -113,15 +113,15 @@ make_incubation_period_pmf <- function(backward_scale = 3.60,
# Relies on fundamental assumption about epidemic growth rate.


corrected_sgtf <- tibble::tibble(
discr_gr_adj_weibull <- tibble::tibble(
time = seq(0, 23, by = 1), # 23 seems to get most of the distribution mass
density0 = dweibull(time,
shape = backward_shape,
scale = backward_scale
) * exp(r * time)
)

inc_period_pmf <- corrected_sgtf$density0 / sum(corrected_sgtf$density0)
inc_period_pmf <- to_simplex(discr_gr_adj_weibull$density0)
return(inc_period_pmf)
}

Expand Down Expand Up @@ -180,6 +180,6 @@ make_reporting_delay_pmf <- function(incubation_period_pmf,
)

infection_to_hosp_delay_pmf <- add_pmfs(pmfs) |>
(\(x) x / sum(x))()
to_simplex()
return(infection_to_hosp_delay_pmf)
}
21 changes: 12 additions & 9 deletions R/figures.R
Original file line number Diff line number Diff line change
@@ -1,22 +1,25 @@
#' Get plot of fit and forecasted counts
#'
#' @param draws A dataframe containing the posterior draws with the data joined
#' to it. This is `draws_df` output of the call to `wwinference()`
#' to it. This is the `draws_df` output of a call to [wwinference()]
#' @param count_data_eval A dataframe containing the count data we will
#' evaluate the forecasts against.
dylanhmorris marked this conversation as resolved.
Show resolved Hide resolved
#' @param count_data_eval_col_name string indicating the name of the count
#' data to evaluate against the forecasted count data
#' @param forecast_date A string indicating the date we made the forecast, for
#' plotting, in ISO8601 format YYYY-MM-DD
#' @param count_type A string indicating what data the counts refer to,
#' default is `hospital admissions`
#'
#' @return A ggplot object containing the posterior draw of the estimated,
#' nowcasted, and forecasted hospital admissions alongside the data used to
#' calibrate the hospital admissions model and the later observed hospital
#' admissions used to evaluate the forecast performance
#' nowcasted, and forecasted counts alongside the data used to
#' calibrate the model and subsequently observed counts (if any) against which
#' to evaluate the forecast performance.
#' @export
#'
get_plot_forecasted_counts <- function(draws,
count_data_eval,
count_data_eval_col_name,
forecast_date,
count_type = "hospital admissions") {
p <- ggplot(draws |> dplyr::filter(
Expand All @@ -28,7 +31,7 @@ get_plot_forecasted_counts <- function(draws,
) +
geom_point(
data = count_data_eval,
aes(x = date, y = daily_hosp_admits_for_eval),
aes(x = date, y = .data[[count_data_eval_col_name]]),
shape = 21, color = "black", fill = "white"
) +
geom_point(aes(x = date, y = observed_value)) +
Expand Down Expand Up @@ -61,7 +64,7 @@ get_plot_forecasted_counts <- function(draws,
#' Get plot of fit and forecasted wastewater concentrations
#'
#' @param draws A dataframe containing the posterior draws with the data joined
#' to it. This is `draws_df` output of the call to `wwinference()`
#' to it. This is the `draws_df` output of a call to [wwinference()]
#' @param forecast_date A string indicating the date we made the forecast, for
#' plotting, in ISO8601 format YYYY-MM-DD
#'
Expand Down Expand Up @@ -120,7 +123,7 @@ get_plot_ww_conc <- function(draws,
#' Get plot of fit, nowcasted, and forecasted "global" R(t)
#'
#' @param draws A dataframe containing the posterior draws with the data joined
#' to it. This is `draws_df` output of the call to `wwinference()`
#' to it. This is the `draws_df` output of a call to [wwinference()]
#' @param forecast_date A string indicating the date we made the forecast, for
#' plotting, in ISO8601 format YYYY-MM-DD
#'
Expand All @@ -143,8 +146,8 @@ get_plot_global_rt <- function(draws,
) +
geom_hline(aes(yintercept = 1), linetype = "dashed") +
xlab("") +
ylab("R(t) of hypothetical state") +
ggtitle("R(t) estimate") +
ylab("Global R(t)") +
ggtitle("Global R(t) estimate") +
scale_x_date(
date_breaks = "2 weeks",
labels = scales::date_format("%Y-%m-%d")
Expand Down
17 changes: 11 additions & 6 deletions R/model_diagnostics.R
Original file line number Diff line number Diff line change
Expand Up @@ -13,8 +13,10 @@
#' (estimated bayesian fraction of missing information), default is `0.2`
#' @param divergences_tolerance float indicating the tolerance for the
#' proportion of sampling iterations that are divergent, default is `0.01`
#' @param p_high_rhat_tolerance float indicating the tolerance for the
#' proportion of parameters rhats>1.05, default is `0.05`
#' @param frac_high_rhat_tolerance float indicating the tolerance for the
#' proportion of parameters rhats>rhat_tolderance, default is `0.05`
#' @param rhat_tolerance float indicating the tolerance for the rhat for
#' individual parameters, default is `1.05`
#' @param max_tree_depth_tol float indicating the tolerance for the proportion
#' of iterations that exceed the maximum tree depth, default is `0.01`
#'
Expand All @@ -25,7 +27,8 @@
get_model_diagnostic_flags <- function(stan_fit_object,
ebmfi_tolerance = 0.2,
divergences_tolerance = 0.01,
p_high_rhat_tolerance = 0.05,
frac_high_rhat_tolerance = 0.05,
rhat_tolerance = 1.05,
max_tree_depth_tol = 0.01) {
n_chains <- stan_fit_object$num_chains()
iter_sampling <- stan_fit_object$metadata()$iter_sampling
Expand All @@ -39,9 +42,11 @@ get_model_diagnostic_flags <- function(stan_fit_object,
flag_too_many_divergences <- any(
diagnostic_summary$num_divergent >= max_n_divergences
)
p_high_rhat <- as.numeric(mean(summary[, "rhat"]$rhat > 1.05, na.rm = TRUE))
flag_high_rhat <- p_high_rhat >=
p_high_rhat_tolerance
frac_high_rhat <- as.numeric(mean(summary[, "rhat"]$rhat > rhat_tolerance,
na.rm = TRUE
))
flag_high_rhat <- frac_high_rhat >=
frac_high_rhat_tolerance
max_n_max_treedepth <- n_chains * iter_sampling * max_tree_depth_tol
flag_high_max_treedepth <- any(
diagnostic_summary$num_max_tree_depth >= max_n_max_treedepth
Expand Down
12 changes: 6 additions & 6 deletions R/postprocess.R
kaitejohnson marked this conversation as resolved.
Show resolved Hide resolved
Original file line number Diff line number Diff line change
Expand Up @@ -29,12 +29,12 @@
#' are observations, the data will be joined to each draw of the predicted
#' observation to facilitate plotting.
#' @export
postprocess <- function(ww_data,
count_data,
fit_obj,
date_time_spine,
lab_site_spine,
subpop_spine) {
get_draws_df <- function(ww_data,
count_data,
fit_obj,
date_time_spine,
lab_site_spine,
subpop_spine) {
draws <- fit_obj$result$draws()

count_draws <- draws |>
Expand Down
6 changes: 4 additions & 2 deletions R/preprocessing.R
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
#' Get input wastewater data
#' Pre-process wastewater input data, adding needed indices and flagging
#' potential outliers
#' @param ww_data dataframe containing the following columns: site, lab,
#' date, a column for concentration, and lod
#' @param conc_col_name string indicating the name of the column containing
Expand Down Expand Up @@ -69,7 +70,8 @@ preprocess_ww_data <- function(ww_data,
return(ww_preprocessed)
}

#' Get input hospital admissions data
#' Pre-process hospital admissions data, converting column names to those
#' that [get_stan_data()] expects.
#' @param hosp_data dataframe containing the following columns: date,
#' a count column, and a population size column
#' @param count_col_name name of the column containing the epidemiological
Expand Down
29 changes: 17 additions & 12 deletions R/utils.R
Original file line number Diff line number Diff line change
Expand Up @@ -53,12 +53,12 @@ add_pmfs <- function(pmfs) {
}

#' @title Get index matrix
#' @description Get the matrix needed to convert a vetor from weekly to daily
#' @description Get a matrix to broadcast a vector from weekly to daily
#' @param n_days number of days we will expand to
#' @param n_weeks number of weeks those days correspond to
#'
#' @return a n_day x n_week matrix for multiplying by weekly estimated
#' value to conver it to daily
#' value to broadcast it to daily
#' @export
#'
#' @examples
Expand All @@ -81,7 +81,7 @@ get_ind_m <- function(n_days, n_weeks) {
#' @title Create a new directory if one doesn't exist
#' @description
#' Function to create a directory for the specified output file path if needed.
#' dir_create won't throw a warning if its already made though!
#' Does nothing if the target directory already exists.
#'
#'
#' @param output_file_path file path that may or may not need to be created
Expand All @@ -94,30 +94,35 @@ create_dir <- function(output_file_path) {
}
}

#' @title Convert to logmean in lognorm distribution
#' @title Get the mean of a Normal distribution for a random variable Y
#' needed to ensure that the distribution of X = exp(Y) (which is Log-Normal)
#' has a specified mean and sd.
#' @description
#' see arithmetic moments here
#' https://en.wikipedia.org/wiki/Log-normal_distribution
#'
#' @param mean mean of the normal distribution
#' @param sd sd of the normal distribution
#' @param mean target mean for the Log-Normal distribution of X
#' @param sd target sd for the Log-Normal distribution X
#'
#' @return corresponding mean of the lognormal distribution
#' @return corresponding mean for the underlying Normal
#' distribution of Y = log(X).
#' @export
convert_to_logmean <- function(mean, sd) {
logmean <- log(mean^2 / sqrt(sd^2 + mean^2))
return(logmean)
}


#' @title Convert to logsd in lognormal distribution
#' @description@description see arithmetic moments here
#' @title Get the sd of a Normal distribution for a random variable Y
#' needed to ensure that the distribution of X = exp(Y) (which is Log-Normal)
#' has a specified mean and sd.
#' @description see arithmetic moments here
#' https://en.wikipedia.org/wiki/Log-normal_distribution
#'
#' @param mean mean of the normal distribution
#' @param sd sd of the normal distribution
#' @param mean target mean for the Log-Normal distribution of X
#' @param sd target sd for the Log-Normal distribution of X
#'
#' @return corresponding stdev of the lognormal distribution
#' @return corresponding sd for the underlying Normal distribution of Y = log(X)
#' @export
convert_to_logsd <- function(mean, sd) {
logsd <- sqrt(log(1 + (sd^2 / mean^2)))
Expand Down
2 changes: 1 addition & 1 deletion R/wwinference.R
Original file line number Diff line number Diff line change
Expand Up @@ -149,7 +149,7 @@ wwinference <- function(ww_data,
]
))

draws <- postprocess(
draws <- get_draws_df(
ww_data = ww_data,
count_data = count_data,
fit_obj = fit,
Expand Down
11 changes: 7 additions & 4 deletions man/convert_to_logmean.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 7 additions & 5 deletions man/convert_to_logsd.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/create_dir.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/drop_first_and_renormalize.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 3 additions & 3 deletions man/postprocess.Rd → man/get_draws_df.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions man/get_ind_m.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 7 additions & 3 deletions man/get_model_diagnostic_flags.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading
Loading