Update priors spatial main (#250)
* Adding class and methods for wwinference model fit (#58)

* Starting off refactoring (expected to fail) [skip ci]

* Adding new method

* Fixing bug in fit_model (was exploiting scoping)

* Updating docs (fixing S3 methods)

* 49 output class creation (#59)

* add a space

* add first test of first check

* add tests for all of the check/assert functions

* run precommit

* check bug in passing output of checkmate to cliabort

* initial tests of preprocess_ww_data

* add custom utils function for autoescaping brackets to pass to glue

* add a bunch of tests for preprocessing wastewater data

* add one more test of site lab indexing

* fix bugs caught in CI

* fix lab site spacing

* fix spacing in name again

* add test to hospital admissions preprocessing

* add additional test to ensure character to indexing of sites and labs

* remove bug in expected number of unique lab site indices

* add tests to make sure data is daily and test to checkers

* add a bunch of validation checks to the joint datasets and the user specifications

* replace with new way of getting stan data

* fix examples, add test, add warning

* fix examples, add test, add warning

* change from hosp -> count everywhere except stan and  vignette/examples

* add tests for pmfs

* fix bugs in documentation

* add padding value as a function arg

* change pmf size check to a warning not an error

* fix bug

* make initialization function more generic

* update changelog

* modify to test

* fix typo from merge

* fix parsing of cmdstan object

* change parsing of fit obj

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* some tweaks to checkers

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* fix documentation

* fix typo

* fix typo

* change outputs from wwinference() function

* fix typos, add documentation

* fix bug missing stan args

* exclude t columns in data join

* fix vignette bug

* add the ww_output documentation

* document ...

* fix missing comma

* move documentation of params around

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* change syntax and filenames

* Update R/preprocessing.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/wwinference.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/wwinference.R

Co-authored-by: Dylan H. Morris <[email protected]>

* change naming and internal checking

* change syntax

* move around documentation

* fix check

* fix tests, fix documentation

* rename assert function to specify within a certain frame

* add element to text

* fix bug in function name

* tweak to inference function

* fix two bugs

* adjust tests based on updated get stan data function which breaks up generation of input data

* Update get_stan_data.R example

* update documentation after fixing example

* add example to wwinference wrapper function

* attempt to move around documentation for wwinference methods

* play around with the documentation of the default and the S3 method functions

* export S3 method function

* add back in exporting functions to get input data formatted for stan

* make first argument of function have same name as class object

* fix bug in how max generation time is found

* update vignette to explain wwinference_fit class object vs explicit function calling, add diagnostics and show both ways

* fix naming blocks adding comma when needed

* dont export autoescape brackets function

* fix same bug

* update test and preprocessing to count at LOD values at below LOD

* fix internal call to diagnostic flags function

* Update R/validate.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update tests/testthat/test_preprocess_count_data.R

Co-authored-by: Dylan H. Morris <[email protected]>

* implement DMs suggestions

* run pre-commit

* export default functions

* Add test-coverage.yaml from epinowcast

* remove test coverage

* remove example, function not exported

* export default function

* export both diagnostics functions

* add documentation of additional arguments

* Update R/validate.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/validate.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/validate.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update tests/testthat/test_checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update tests/testthat/test_checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update tests/testthat/test_preprocess_count_data.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update tests/testthat/test_preprocess_ww_data.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/validate.R

Co-authored-by: Dylan H. Morris <[email protected]>

* manually input some suggestions

* Update tests/testthat/test_checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update tests/testthat/test_checkers.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update tests/testthat/test_preprocess_count_data.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update tests/testthat/test_preprocess_count_data.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update tests/testthat/test_preprocess_count_data.R

Co-authored-by: Dylan H. Morris <[email protected]>

* add more checknames

* run pre-commit locally

* fix typo

* add some very minimal tests

* fix wwinference function

* fix bug

* fix bug

* Update tests/testthat/test_preprocess_count_data.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update tests/testthat/test_preprocess_count_data.R

Co-authored-by: Dylan H. Morris <[email protected]>

* run pre-commit locally

* fix bugs in tests

* fix error in tests

* move forecast date, calib time, horizon time to args to wrapper function

* fix hosp only example in vignette

* fix error in example

* add dont run to examples

* check -> expect in checkmate, confirm tests pass locally

---------

Co-authored-by: Dylan H. Morris <[email protected]>
Co-authored-by: George G. Vega Yon <[email protected]>

* Making pre-commit happy

* Reworking cross-references and print method

* Removing copy of fit_model

* Fixing function call

* Addressing PR comments

* Forgot to save some changes

* Change output names (#86)

* change names of outputs of wwinference wrapper function

* fix a few other missed replacements

* fix pre-commit

* Fixing R CMD check

* Pre-commit

* Removing diagnostics_summary

---------

Co-authored-by: George G. Vega Yon <[email protected]>

* Update vignettes/wwinference.Rmd

Co-authored-by: Kaitlyn Johnson <[email protected]>

* Update vignettes/wwinference.Rmd

Co-authored-by: Kaitlyn Johnson <[email protected]>

* Adding example of summary and print in the vignette. Addressing some minor comments

* fix test for expected names after changing function args

* set seed in tests

---------

Co-authored-by: Kaitlyn Johnson <[email protected]>
Co-authored-by: Dylan H. Morris <[email protected]>
Co-authored-by: kaitejohnson <[email protected]>

* Addressing R CMD check notes due to tidyeval syntax (#108)

* Starting to use .data and others

* Removing more warnings

* Think almost all issues are now solved

* License warning and passing params as expected

* Removing  prefix

* Fixing note on license and news file

* Using str2lang in spread_draws

* Update R/get_draws_df.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Fixing R CMD check

* fixed intercept in figures

* Update R/generate_simulated_data.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Apply suggestions from code review by @dylanhmorris

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/preprocessing.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/get_stan_data.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/preprocessing.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/preprocessing.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/preprocessing.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/preprocessing.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/preprocessing.R

Co-authored-by: Dylan H. Morris <[email protected]>

* remove call to utils::globalVariables()

* Update R/preprocessing.R

* Update R/generate_simulated_data.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/preprocessing.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/get_stan_data.R

Co-authored-by: Dylan H. Morris <[email protected]>

---------

Co-authored-by: Dylan H. Morris <[email protected]>
Co-authored-by: Kaitlyn Johnson <[email protected]>
Co-authored-by: kaitejohnson <[email protected]>

* update hierarchical estimate of sigma_site in `model_definition` (#120)

* add a space

* update hierarchical estimate of sigma_site

* update prior table

* run pre-commit

* update comment when transforming to site level standard deviations

* add to change log

* Update inst/stan/wwinference.stan

Co-authored-by: Dylan H. Morris <[email protected]>

* Update model_definition.md

Co-authored-by: Dylan H. Morris <[email protected]>

* Update model_definition.md

Co-authored-by: Dylan H. Morris <[email protected]>

* Update model_definition.md

Co-authored-by: Dylan H. Morris <[email protected]>

* update notation for mode and sd of stdevs

* Update model_definition.md

Co-authored-by: Dylan H. Morris <[email protected]>

* Update model_definition.md

Co-authored-by: Dylan H. Morris <[email protected]>

* Update model_definition.md

Co-authored-by: Dylan H. Morris <[email protected]>

* tweaks to formatting

* Update model_definition.md

Co-authored-by: Dylan H. Morris <[email protected]>

---------

Co-authored-by: Dylan H. Morris <[email protected]>

* Vignette tweaks (#141)

* fix typo in indicate ww exclusions documentation

* fix typos/language in vignette

* Update R/preprocessing.R

Co-authored-by: Chirag Kumar <[email protected]>

* update docs

---------

Co-authored-by: Chirag Kumar <[email protected]>

* actually set seed

* Set seeds in test_get_stan_data (#146)

Co-authored-by: Kaitlyn Johnson <[email protected]>

* Modify package to expect log scale concentration values and LODs (#122)

* Tweaks to model definition (#134)

* Fix check for required wastewater columns (#127)

* Switch to placing prior on and inferring `i/n` at the first observed timepoint (#85)

* update vignette to reflect default NULL seed in mcmcoptions (#125)

* Fix NEWS.md (#126)

* hot fix to readme

* Update NEWS.md

* run pre-commit

* Update NEWS.md (#144)

* run pre-commit locally

* Update NEWS.md

---------

Co-authored-by: George G. Vega Yon <[email protected]>

* Update DESCRIPTION (#156)

* Adding new class and method for get_draws (#153)

* Adding new class and method (expected to fail)

* Addressing issues with names (expected to fail)

* Adding the what parameter to the docs

* Addressing final bits. Now need the test

* Adding plot method as a wrapper

* Adding some tests

* Fixing test and setting default y=NULL in plot

* Adding some lines in the vignette to explain the plot method works on wwinference_fit_draws

* Addressing review comments

* Typo in length function

* Reverting R/sysdata.rda and ensuring tests run properly

* Reverting sysdata (again)

* Better print and fixing test

* Fixing tests

* Add contributors (#160)

* 163 expand R version  (#164)

* Add hex logo to repo (#148)

* update readme with logo

* swap to svg

* use use package

* adjust size and remove extra text

* try adding new logo

* fix title

* fix title again

* delete old logos

* Various bug fixes (#128)

* fix rendering to katex, add mathcal Rt to vignette (#169)

* Tweaks to main vignette (#170)

* Adding the post-page-artifact job (#181)

* Build link comment in PRs: update comment instead of re-creating on rebuilds (#182)

* Only run post-page-artifact job on PRs (#183)

* Fix formatting so functions link (#179)

* 174 cmdstanr sample args (#175)

* Hot fix validate pmf (#191)

* Restructure hierarchical estimation based on reference subpopulation (#158)

* update validate to warn if sum(site_pop)>total pop

* modify to center around the reference pop

* temporary change to stan file path for troubleshooting

* model compiles

* reorder pops by size, reindex subpops to sites, add switch for include_ww = 0

* wip rmd

* reindex labsites
 + other changes

* ensure the sum(sites)<total_pop case works

* workaround to handle include_ww = 0 in stan data

* add documentation, fix vignette

* fix preprocessing to order by site pop, add a test for this

* add tests for the hosp only and no aux site cases

* add a test of null data being passed in

* tweaks to print methods and get draws function

* tweak diagnostics, make sure hosp logic works as expected

* switch diagnostics, fix inits, add error message if req ww for hosp only model

* update vignette package data

* update test data

* add log shift from reference pop to central dynamic

* add m prior to params and stan data, fix inits bug

* update test data

* fix preprocessing test to order in terms of site pops

* fix model diagnostics functions

* m should be centered around 1!

* fix inits

* m is log scale, it should be centered around 0

* fix inits

* update test data

* edit subpop definition in model defn

* run pre-commit

* fix arrange

* edit model definition to explain reference subpop

* run pre-commit locally

* fix example

* Update inst/stan/wwinference.stan

Co-authored-by: Dylan H. Morris <[email protected]>

* Update inst/stan/wwinference.stan

Co-authored-by: Dylan H. Morris <[email protected]>

* Update inst/stan/wwinference.stan

Co-authored-by: Dylan H. Morris <[email protected]>

* Update inst/stan/wwinference.stan

Co-authored-by: Dylan H. Morris <[email protected]>

* Update inst/stan/wwinference.stan

Co-authored-by: Dylan H. Morris <[email protected]>

* Update inst/stan/wwinference.stan

Co-authored-by: Dylan H. Morris <[email protected]>

* Update inst/stan/wwinference.stan

Co-authored-by: Dylan H. Morris <[email protected]>

* add offsets to intercept and growth rate of unobserved infection process

* update test data running on WSL2

* Change how offsets are handled (#168)

* Update model file to handle offsets slightly differently, clarify parameter name comments

* Fix missing close paren

* Fix variable name

* Fix more variable names

* Remove separate handling of reference pop, fix a few more bugs

* Update docs

* Fix check for warning in get_stan_data test

* Better fix for test_get_stan_data

* Fail more informatively if test_ww_model fails to fit entirely

* Further customize the fitting failure message for informativeness

* Update get stan data with new variable names

* Add new variable names to example_params.toml

* Fix indexing and initialization

* Update test data

* add test of no ww model

* add conditional for inits, add test for no ww

* tweak preprocessing to handle no wastewater case, add tests for all cases

* update testing data

* Update R/get_stan_data.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/get_draws.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/validate.R

Co-authored-by: Dylan H. Morris <[email protected]>

* Update R/validate.R

Co-authored-by: Dylan H. Morris <[email protected]>

* fix initialization

* update language around the sum(sites)>pop

* run pre-commit locally

* whoops, fix init

* aux site -> aux subpop

* add site_to_subpop map to get_subpop_data function

* create vectors to pass to stan using the subpopulation mappings

* revert to original initialization, use index explicitly in df column name

* remove old comments

* add functions for making spines in wwinference

* move spine functions to get stan data file

* update docs

* fix fxn input

* Fix typo

* refactor handling of sites, subpops, ww data indices internally, commented code, expect to fail

* include lod vals in plots

* fix get stan data to be all based on mappings

* fix tests to take in all inputs to get stan data

* fix lab_site_subpop_spine fxn

* first pass fix postprocessing

* minor tweaks

* update expected column names from get_draws

* update test data

* fix labsite to subpop spine handling, add docs for get ww indices and vals

---------

Co-authored-by: Dylan H. Morris <[email protected]>
Co-authored-by: Dylan H. Morris <[email protected]>

* init had wrong name... (#199)

* add multiple os to matrix strategy (#190)

* Update NEWS.md (#205)

* Update README.md (#207)

* Update DESCRIPTION (#203)

* Fix error messaging when data extends beyond forecast date (#208)

* Positive constrain mode_sigma_ww_site (#210)

* Setup pkgdown so it hosts release and dev sites (#212)

* Adding the developer mode (see if it works)

* Updating action to build twice with caching

* Fixing concurrency and workflow graph

* Fixing action

* Wrong option passed to gh release list

* Adding missing token

* Debugging

* Debugging gh release list

* Debugging gh release list v2

* Trying a different strategy

* Trying a different strategy v2

* Using jq to extract the tag info

* Another try

* Printing releases

* Trying a different strategy

* Was using the wrong pipe

* Properly using the caching

* Switching the version

* Adding person in construction icon

* Adding minor tweaks: auto dev mode and rename cache key

* Adding more links to the site and enforcing buit on new _pkgdown config

* Fixing hashing step

* Was pointing to the wrong yml

* Ensuring hashing and usage of _pkgdown.yml

* Leveraging sparse checkout

* Ensuring where the pkg is thrown

* Correcting sed

* Fixing my bash

* Devel is main and adding toggle button

* Issue 200: Modify plot methods (#218)

* fix link (#231)

* Issue 197: rename `validate_both_datasets` function (#219)

* Replacing artifact and setting retention days to 7 (#230)

* Check unique values of site_pop per site (#232)

* Adding validation of records per site

* Updated news

* Addressing co-pilot hallucination

* Explicit call to dplyr::n()

* testing data had multiple site pops per site!

* Better error and adding a test to catch the error.

---------

Co-authored-by: Kaitlyn Johnson <[email protected]>

* [Hot fix] test when pop size is not constant was failing (#235)

* Hot fix!

* There was probably a single site!

* Correcting site name

* fix cbind, don't want duplicate column names, use seq_len but differently

* Update NEWS.md

Co-authored-by: Dylan H. Morris <[email protected]>

---------

Co-authored-by: Kaitlyn Johnson <[email protected]>
Co-authored-by: Kaitlyn Johnson <[email protected]>
Co-authored-by: Dylan H. Morris <[email protected]>

* Issue 184: Add outputs to `generate_simulated_data()` fxn and package data (#220)

* Issue 238: Fix plot bug (#239)

* swap order of plotting so calib data shows up

* fix plot function

* remove free y scale on subpop rt plot (#247)

* Issue 248: Add package workflow diagram to readme (#249)

* add language to readme

* Add files via upload

* Add files via upload

* remove fig

* Add files via upload

* add svg

* delete png

* add package workflow diagram

* run pre-commit locally

* Add files via upload

* Update README.md

* Update README.md

Co-authored-by: George G. Vega Yon <[email protected]>

---------

Co-authored-by: George G. Vega Yon <[email protected]>

* Modify priors on `eta_sd` and `inf_feedback` (#236)

* fix merge conflicts

* update package data

* update documentation

* remove extra package data

* fix merge conflicts in change log

* fix merge conflict in model defn

---------

Co-authored-by: George G. Vega Yon <[email protected]>
Co-authored-by: Dylan H. Morris <[email protected]>
Co-authored-by: Chirag Kumar <[email protected]>
Co-authored-by: Dylan H. Morris <[email protected]>
5 people authored Nov 14, 2024
1 parent fe94da5 commit 4577d88
Showing 25 changed files with 166 additions and 21 deletions.
8 changes: 4 additions & 4 deletions NEWS.md
@@ -1,4 +1,3 @@

# wwinference 0.1.0.99 (dev)

## User-visible changes
@@ -7,7 +6,10 @@ hospital admissions to output of function and package data. ([#184](https://gith
- `wwinference` now checks whether `site_pop` is fixed per site (see issue [#223](https://github.com/CDCgov/ww-inference-model/issues/226) reported by [@akeyel](https://github.com/akeyel)).

## Internal changes
- Updated the workflow for posting the pages artifact to PRs (issue [#229](https://github.com/CDCgov/ww-inference-model/issues/229)).
- Modified the priors on the infection feedback term and the step size of the weekly random walk in the effective reproductive number (issue [#227](https://github.com/CDCgov/ww-inference-model/issues/227)), based on benchmarking results from the evaluation pipeline described in the [PR](https://github.com/CDCgov/ww-inference-model/pull/236) corresponding to this change.
- Add package workflow diagram to readme ([#248](https://github.com/CDCgov/ww-inference-model/issues/248))
- `get_plot_subpop_rt()` now uses a shared y-axis to facilitate comparison of R(t) estimates ([#245](https://github.com/CDCgov/ww-inference-model/issues/245))
- Updated the workflow for posting the pages artifact to PRs (issue [#229](https://github.com/CDCgov/ww-inference-model/issues/229)).
- Modify `plot_forecasted_counts()` so that it does not require an evaluation dataset ([#218](https://github.com/CDCgov/ww-inference-model/pull/218))
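
The `site_pop` check mentioned in the user-visible changes can be illustrated in base R. This is only a sketch of the idea, not the package's implementation: the helper name `find_sites_with_varying_pop()` and the toy data are hypothetical, and the actual validation in `wwinference` may be structured differently.

```r
# Hypothetical helper (not part of wwinference): return the sites whose
# reported `site_pop` takes more than one distinct value.
find_sites_with_varying_pop <- function(ww_data) {
  n_pops <- tapply(
    ww_data$site_pop,
    ww_data$site,
    function(x) length(unique(x))
  )
  names(n_pops)[n_pops > 1]
}

# Toy data (made up): site "B" reports two different population sizes.
toy_ww <- data.frame(
  site = c("A", "A", "B", "B"),
  site_pop = c(1e5, 1e5, 5e4, 6e4)
)

bad_sites <- find_sites_with_varying_pop(toy_ww)
if (length(bad_sites) > 0) {
  message("site_pop is not constant for site(s): ", toString(bad_sites))
}
```

Reporting the offending site names, rather than only failing, makes it easier for a user to locate the inconsistent records in their own data.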

# wwinference 0.1.0
@@ -22,7 +24,5 @@ As it's written, the package is intended to allow users to do the following:
- Validate input data with informative error messaging ([#37](https://github.com/CDCgov/ww-inference-model/issues/37), [#54](https://github.com/CDCgov/ww-inference-model/issues/54))
- Provide a wrapper function to generate forward simulated data with user-specified variables. It calls a number of functions to perform specific model components ([#27](https://github.com/CDCgov/ww-inference-model/issues/27))
- Contains S3 class methods applied to the output of the main model wrapper function, the `wwinference_fit` class object ([#58](https://github.com/CDCgov/ww-inference-model/issues/58)).
<<<<<<< HEAD
- Wastewater concentration data is expected to be in log scale ([#122](https://onetakeda.box.com/s/pju273g5khx3y3cwoae2zwv3e7vu03x3)).
=======
- Wastewater concentration data is expected to be in log scale ([#122](https://github.com/CDCgov/ww-inference-model/pull/122)).
47 changes: 43 additions & 4 deletions R/data.R
@@ -39,6 +39,47 @@
#' @source vignette_data.R
"ww_data"

#' Example evaluation wastewater dataset.
#'
#' A dataset containing the simulated retrospective wastewater concentrations
#' (labeled here as `log_genome_copies_per_ml_eval`) by sample collection date
#' (`date`), the site where the sample was collected (`site`) and the lab
#' where the samples were processed (`lab`). Additional columns that are
#' required attributes needed for the model are the limit of detection for
#' that lab on each day (labeled here as `log_lod`) and the population size of
#' the wastewater catchment area represented by the wastewater concentrations
#' in each `site`.
#'
#' This data is generated via the default values in the
#' `generate_simulated_data()` function. They represent the bare minimum
#' required fields needed to pass to the model, and we recommend that users
#' try to format their own data to match this format.
#'
#' The variables are as follows:
#'
#' @format ## ww_data_eval
#' A tibble with 126 rows and 6 columns
#' \describe{
#' \item{date}{Sample collection date, formatted in ISO8601 standards as
#' YYYY-MM-DD}
#' \item{site}{The wastewater treatment plant where the sample was collected}
#' \item{lab}{The lab where the sample was processed}
#' \item{log_genome_copies_per_ml_eval}{The natural log of the wastewater
#' concentration measured on the date specified, collected in the site
#' specified, and processed in the lab specified. The package expects
#' this quantity in units of log estimated genome copies per mL.}
#' \item{log_lod}{The log of the limit of detection in the site and lab on a
#' particular day of the quantification device (e.g. PCR). This should be in
#' units of log estimated genome copies per mL.}
#' \item{site_pop}{The population size of the wastewater catchment area
#' represented by the site variable}
#' \item{location}{ A string indicating the location that all of the
#' data is coming from. This is not a necessary column, but instead is
#' included to more realistically mirror a typical workflow}
#' }
#' @source vignette_data.R
"ww_data_eval"
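
To make the documented layout concrete, here is a small hand-built data frame with one column per item in the `\describe{}` block above. It is a sketch only: the values, the use of integer site and lab identifiers, and the object name `toy_ww_data_eval` are made up for illustration and are not taken from the packaged dataset.

```r
# Toy rows in the documented `ww_data_eval` layout (all values are made up).
toy_ww_data_eval <- data.frame(
  date = as.Date(c("2023-10-01", "2023-10-03")), # ISO 8601 dates
  site = c(1, 1),
  lab = c(1, 1),
  # Natural log of the concentration in genome copies per mL
  log_genome_copies_per_ml_eval = log(c(120, 85)),
  # Natural log of the limit of detection, same units
  log_lod = log(c(20, 20)),
  site_pop = c(4e5, 4e5),
  location = "example state"
)

str(toy_ww_data_eval)
```

Keeping both the concentration and the limit of detection on the natural log scale means the two columns can be compared directly when flagging observations below the LOD.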



#' Example wastewater dataset with independent site correlations.
@@ -100,9 +141,9 @@
#' to match this format.
#'
#' This data is generated via the default values in the
#' `generate_simulated_data()` function. They represent the bare minumum
#' `generate_simulated_data()` function. They represent the bare minimum
#' required fields needed to pass to the model, and we recommend that users
#' try to format their own data to match this formate.
#' try to format their own data to match this format.
#'
#' The variables are as follows:
#' \describe{
@@ -304,8 +345,6 @@
"rt_site_data"




#' COVID-19 post-Omicron generation interval probability mass function
#'
#' \describe{
2 changes: 1 addition & 1 deletion R/figures.R
@@ -268,7 +268,7 @@ get_plot_subpop_rt <- function(draws,
linetype = "dashed",
show.legend = FALSE
) +
facet_wrap(~subpop_name, scales = "free") +
facet_wrap(~subpop_name) +
geom_hline(aes(yintercept = 1), linetype = "dashed") +
xlab("") +
ylab("Subpopulation R(t)") +
36 changes: 35 additions & 1 deletion R/generate_simulated_data.R
@@ -575,7 +575,6 @@ generate_simulated_data <- function(r_in_weeks = # nolint




# Global adjusted R(t) --------------------------------------------------
# I(t)/convolve(I(t), g(t)) #nolint
# This is not used directly, but we want to have it for comparing to the
@@ -599,6 +598,41 @@ generate_simulated_data <- function(r_in_weeks = # nolint
lod_lab_site = lod_lab_site
)

ww_data_eval <- format_ww_data(
log_obs_conc_lab_site = log_obs_conc_lab_site_eval,
ot = ot + ht,
ht = 0,
date_df = date_df,
site_lab_map = site_lab_map,
lod_lab_site = lod_lab_site
) |>
dplyr::rename(
"log_genome_copies_per_ml_eval" = "log_genome_copies_per_ml"
)

# Artificially add values below the LOD----------------------------------
# Replace the minimum observed value with half of the log LOD; this will be
# used as an example of how to format data properly.
min_ww_val <- min(ww_data$log_genome_copies_per_ml)
ww_data <- ww_data |>
dplyr::mutate(
"log_genome_copies_per_ml" =
dplyr::case_when(
.data$log_genome_copies_per_ml ==
!!min_ww_val ~ 0.5 * .data$log_lod,
TRUE ~ .data$log_genome_copies_per_ml
)
)
ww_data_eval <- ww_data_eval |>
dplyr::mutate(
"log_genome_copies_per_ml_eval" =
dplyr::case_when(
.data$log_genome_copies_per_ml_eval ==
!!min_ww_val ~ 0.5 * .data$log_lod,
TRUE ~ .data$log_genome_copies_per_ml_eval
)
)


ww_data_eval <- format_ww_data(
log_obs_conc_lab_site = log_obs_conc_lab_site_eval,
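
The below-LOD substitution added in `generate_simulated_data()` can be tried on toy data. The sketch below uses base R `ifelse()` as a stand-in for the `dplyr::case_when()` call in the diff, so it is not the package's code; the toy values and the extra `below_lod` column are assumptions added for illustration.

```r
# Toy data; column names follow the diff above, values are made up.
toy_ww <- data.frame(
  log_genome_copies_per_ml = c(2.1, 3.4, 5.0),
  log_lod = c(3.0, 3.0, 3.0)
)

# Smallest observed log concentration, as in the diff
min_ww_val <- min(toy_ww$log_genome_copies_per_ml)

# Replace the minimum observation with half of the log LOD and leave every
# other observation unchanged.
toy_ww$log_genome_copies_per_ml <- ifelse(
  toy_ww$log_genome_copies_per_ml == min_ww_val,
  0.5 * toy_ww$log_lod,
  toy_ww$log_genome_copies_per_ml
)

# Illustrative flag (hypothetical column): which observations sit below the LOD
toy_ww$below_lod <- toy_ww$log_genome_copies_per_ml < toy_ww$log_lod

print(toy_ww)
```

Note that halving a log-scale LOD only produces a value below the LOD when `log_lod` is positive, which holds for the toy values used here.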
5 changes: 5 additions & 0 deletions R/get_stan_data.R
@@ -74,6 +74,7 @@ get_input_ww_data_for_stan <- function(preprocessed_ww_data,
) |>
dplyr::arrange(.data$date, .data$lab_site_index)
}

return(ww_data)
}

@@ -418,6 +419,7 @@ get_stan_data <- function(input_count_data,
# Get the last date that there were observations of the epidemiological
# indicator (aka cases or hospital admissions counts)
last_count_data_date <- max(input_count_data$date, na.rm = TRUE)

# Validate input pmfs----------------------------------------------------
validate_pmf(generation_interval,
calibration_time,
@@ -651,6 +653,7 @@ get_stan_data <- function(input_count_data,
sd_log_sigma_ww_site_prior_sd =
params$sd_log_sigma_ww_site_prior_sd,
eta_sd_sd = params$eta_sd_sd,
eta_sd_mean = params$eta_sd_mean,
sigma_i_first_obs_prior_mode = params$sigma_i_first_obs_prior_mode,
sigma_i_first_obs_prior_sd = params$sigma_i_first_obs_prior_sd,
p_hosp_prior_mean = params$p_hosp_mean,
@@ -682,6 +685,7 @@ get_stan_data <- function(input_count_data,
offset_ref_initial_exp_growth_rate_prior_sd =
params$offset_ref_initial_exp_growth_rate_prior_sd
)

return(stan_data_list)
}

@@ -797,6 +801,7 @@ get_ww_indices_and_values <- function(input_ww_data,
"Length of censored vectors incorrect" =
length(ww_censored) + length(ww_uncensored) == owt
)

ww_sampled_times <- ww_data_joined |> dplyr::pull("t")
ww_sampled_subpops <- ww_data_joined |> dplyr::pull("subpop_index")
lab_site_to_subpop_spine <- lab_site_site_spine |>
2 changes: 2 additions & 0 deletions R/initialization.R
@@ -111,10 +111,12 @@ get_inits_for_one_chain <- function(stan_data, stdev = 0.01) {
# unstructured correlation matrix
L_Omega = as.matrix(diag(2))
)

if (stan_data$corr_structure_switch == 2) {
init_list$L_Omega <- diag((n_subpops - 1))
}


if (stan_data$n_subpops > 1) {
init_list$error_rt_subpop <- matrix(
stats::rnorm((n_subpops - 1) * n_weeks,
1 change: 1 addition & 0 deletions R/wwinference.R
@@ -247,6 +247,7 @@ wwinference <- function(ww_data,
)
}


# Get site to subpop spine
site_subpop_spine <- get_site_subpop_spine(
input_ww_data = input_ww_data,
4 changes: 4 additions & 0 deletions README.md
@@ -27,6 +27,10 @@ This will help make clear the data requirements and how to structure this data t
- Sam Abbott (seabbs)
- Damon Bayer (damonbayer)

# Package workflow
The following depicts the suggested workflow for fitting the wastewater-informed forecasting model. See the ["Getting Started" vignette](https://cdcgov.github.io/ww-inference-model/articles/wwinference.html) for a full example.
![](./man/figures/wwinference_workflow.png)

# Installing and running code

## Install R
2 changes: 1 addition & 1 deletion data-raw/vignette_data.R
@@ -30,10 +30,10 @@ hosp_data_eval_ind <- simulated_data_ind$hosp_data_eval
rt_site_data_ind <- simulated_data_ind$rt_site_data
rt_global_data_ind <- simulated_data_ind$rt_global_data


usethis::use_data(hosp_data, overwrite = TRUE)
usethis::use_data(hosp_data_eval, overwrite = TRUE)
usethis::use_data(ww_data, overwrite = TRUE)

usethis::use_data(rt_site_data, overwrite = TRUE)
usethis::use_data(rt_global_data, overwrite = TRUE)
usethis::use_data(hosp_data_ind, overwrite = TRUE)
Binary file modified data/hosp_data.rda
Binary file not shown.
Binary file modified data/hosp_data_eval.rda
Binary file not shown.
Binary file modified data/hosp_data_eval_ind.rda
Binary file not shown.
Binary file modified data/hosp_data_ind.rda
Binary file not shown.
Binary file modified data/ww_data.rda
Binary file not shown.
Binary file added data/ww_data_eval.rda
Binary file not shown.
Binary file modified data/ww_data_ind.rda
Binary file not shown.
6 changes: 4 additions & 2 deletions inst/extdata/example_params.toml
@@ -50,9 +50,11 @@ offset_ref_initial_exp_growth_rate_prior_sd = 0.025

autoreg_p_hosp_a = 1 # shape1 parameter of autoreg term on IHR(t) trend
autoreg_p_hosp_b = 100 # shape2 parameter of autoreg term on IHR(t) trend
eta_sd_mean = 0.0278 # from posterior of fit to long time series
eta_sd_sd = 0.01
infection_feedback_prior_logmean = 6.37408 # log(mode) + q^2 mode = 500, q = 0.4
infection_feedback_prior_logsd = 0.4
infection_feedback_prior_logmean = 4.498 # log(~90) from posterior of fit to long
# time series
infection_feedback_prior_logsd = 0.636 # log(~1.9)


[hospital_admission_observation_process]
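
A note on the `infection_feedback_prior_logmean` values in this hunk. The removed comment (`log(mode) + q^2`) appears to rely on the mode of a lognormal distribution, $\exp(\mu - \sigma^2)$, so fixing the mode and the log-scale standard deviation $q$ determines the log-mean. A worked sketch under that reading (the symbols $\mu$ and $\sigma$ are assumptions and do not appear in the file):

$$
\mu = \log(\mathrm{mode}) + q^2 = \log(500) + 0.4^2 \approx 6.2146 + 0.16 = 6.3746
$$

This is close to the 6.37408 in the removed line. The replacement values are instead described as coming from a posterior fit: $\log(90) \approx 4.4998$ and $\log(1.9) \approx 0.6419$, which roughly match the 4.498 and 0.636 now in the file.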
5 changes: 3 additions & 2 deletions inst/stan/wwinference.stan
@@ -89,6 +89,7 @@ data {
real sd_log_sigma_ww_site_prior_mode;
real<lower=0> sd_log_sigma_ww_site_prior_sd;
real<lower=0> eta_sd_sd;
real<lower=0> eta_sd_mean;
real p_hosp_prior_mean;
real<lower=0> p_hosp_sd_logit;
real<lower=0> p_hosp_w_sd_sd;
@@ -303,10 +304,10 @@ transformed parameters {
log_r_subpop_t_in_weeks = log_r_t_in_weeks +
(n_subpops > 1 ? offset_ref_log_r_t[1] : 0);
} else {

log_r_subpop_t_in_weeks = to_vector(log_r_subpop_t_in_weeks_matrix[i-1, :]);
}


//convert from weekly to daily
unadj_r_subpop_t = exp(to_row_vector(ind_m*(log_r_subpop_t_in_weeks)));

@@ -402,7 +403,7 @@ model {
offset_ref_logit_i_first_obs_prior_sd);
offset_ref_initial_exp_growth_rate ~ normal(offset_ref_initial_exp_growth_rate_prior_mean,
offset_ref_initial_exp_growth_rate_prior_sd);
eta_sd ~ normal(0, eta_sd_sd);
eta_sd ~ normal(eta_sd_mean, eta_sd_sd);
autoreg_rt_subpop ~ beta(autoreg_rt_subpop_a, autoreg_rt_subpop_b);

autoreg_rt ~ beta(autoreg_rt_a, autoreg_rt_b);
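
Read together with `inst/extdata/example_params.toml`, the change to the `eta_sd` prior in this hunk can be written out as follows. This is a sketch using the example parameter values (`eta_sd_mean = 0.0278`, `eta_sd_sd = 0.01`); the notation $\eta_{sd}$ is an assumption made here for illustration:

$$
\eta_{sd} \sim \mathrm{Normal}(0,\ 0.01)
\quad\longrightarrow\quad
\eta_{sd} \sim \mathrm{Normal}(0.0278,\ 0.01)
$$

If `eta_sd` is declared with a lower bound of 0 (the declaration is not shown in this diff), the old prior is effectively a half-normal concentrated near zero, while the new one is a truncated normal centered on the value taken from the posterior of the long time series fit.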
1 change: 1 addition & 0 deletions man/figures/wwinference_package_workflow.svg
Binary file added man/figures/wwinference_workflow.png
4 changes: 2 additions & 2 deletions man/hosp_data.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

55 changes: 55 additions & 0 deletions man/ww_data_eval.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 2 additions & 0 deletions model_definition.md
@@ -70,8 +70,10 @@ In the case where the sum of the wastewater site catchment populations meets or

This amounts to modeling the wastewater catchments populations as approximately non-overlapping; every infected individual either does not contribute to measured wastewater or contributes principally to one wastewater catchment.
This approximation is reasonable if we restrict our analyses to primary wastewaster treatment plants, which avoids the possibility that an individual might be sampled once in a sample taken upstream and then sampled again in a more aggregated sample taken further downstream.

If the sum of the wastewater site catchment populations meets or exceeds the reported jurisdiction population ($\sum\nolimits_{k=1}^{K_\mathrm{sites}} n_k \ge n$) the model does not use a final subpopulation without sampled wastewater. In that case, the total number of subpopulations $K_\mathrm{total} = K_\mathrm{sites}$.


When converting from predicted per capita incident hospital admissions $H(t)$ to predicted hospitalization counts, we use the jurisdiction population size $n$, even in the case where $\sum n_k > n$.

This amounts to making two key additional modeling assumptions:
3 changes: 1 addition & 2 deletions tests/testthat/test_preprocess_ww_data.R
@@ -20,7 +20,6 @@ test_that("Function returns site indices in order of largest site pop", {
spine <- processed |> dplyr::distinct(site_pop, site_index)



expect_true(spine$site_pop[spine$site_index == 1] == max(spine$site_pop))
})

@@ -383,8 +382,8 @@ test_that("Function handles LOD values equal to concentration values", {

test_that("Constant population per site", {
wrong_pop <- ww_data
wrong_pop$site_pop[1] <- ww_data$site_pop[1] + 1000

wrong_pop$site_pop <- 1e6 + seq_len(nrow(ww_data))

expect_error(
preprocess_ww_data(
4 changes: 2 additions & 2 deletions vignettes/wwinference.Rmd
@@ -73,7 +73,8 @@ by the hospital admissions data, in this case, the size of the theoretical state
Additionally, we provide the `hosp_data_eval` dataset which contains the
simulated hospital admissions 28 days ahead of the forecast date, which can be
used to evaluate the model.
For the wastewater data, the expcted format is a table of observations with the

For the wastewater data, the expected format is a table of observations with the
following columns. The wastewater data should not contain `NA` values for days with
missing observations, instead these should be excluded:
- a date (column `date`): the date the sample was collected
@@ -155,7 +156,6 @@ ww_data_preprocessed <- preprocess_ww_data(
```
Note that this function assumes that there are no missing values in the
concentration column. The package expects observations below the LOD will
be replaced with a numeric value below the LOD. If there are NAs in your dataset when observations are below the LOD, we suggest replacing them with a value
be replaced with a numeric value below the LOD. If there are NAs in your dataset
when observations are below the LOD, we suggest replacing them with a value
below the LOD in upstream pre-processing.

0 comments on commit 4577d88
