From bbdd39d1b66c841b5b7fd7f61c505b8c8e814594 Mon Sep 17 00:00:00 2001
From: Zachary Susswein <utb2@cdc.gov>
Date: Tue, 13 Feb 2024 05:06:45 +0000
Subject: [PATCH 1/8] Flesh out scenario details

This is a first pass at the details we need to get scenarios started.
I made the following larger judgement calls:
* Simulating from a renewal process is good enough (vs. SEIR)
* One longer timeseries with sharply varying dynamics works because of
  the rolling windows
* Put concrete numbers on the GIs

Closes #22
---
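
Reviewer note (not part of the commit): the five-step simulation procedure added in this patch could be sketched roughly as below. This is a minimal sketch, not the manuscript's implementation. It makes several assumptions that the text does not settle: N(0, 0.1) is read as a standard deviation of 0.1, the sine period (28 days) and the GI truncation (14 days) are made up for illustration, and both PMFs use a simple one-sided discretisation rather than the double-censored versions the manuscript describes. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(12345)  # fixed seed from step 2

# Step 1: fixed Rt timeseries over 160 days (piecewise constant, then a sine curve).
levels = [(1.1, 2), (2.0, 2), (0.5, 2), (1.5, 2), (0.75, 2), (1.1, 6)]  # (Rt, weeks)
piecewise = np.concatenate([np.full(7 * weeks, value) for value, weeks in levels])
n_days = 160
t_sine = np.arange(n_days - piecewise.size)
sine_period = 28.0  # ASSUMPTION: the period is not stated in the manuscript
rt_fixed = np.concatenate(
    [piecewise, 1.0 + 0.3 * np.sin(2 * np.pi * t_sine / sine_period)]
)

# Step 2: additive Gaussian noise (ASSUMPTION: 0.1 is a standard deviation).
# Clipping at zero is an added safeguard, not something the manuscript specifies.
rt = np.clip(rt_fixed + rng.normal(0.0, 0.1, size=n_days), 0.0, None)


def gamma2_cdf(x, scale):
    """Closed-form CDF of Gamma(shape = 2, scale) for x >= 0."""
    z = np.asarray(x, dtype=float) / scale
    return 1.0 - np.exp(-z) * (1.0 + z)


# Short GI, Gamma(shape = 2, scale = 1), binned to days 1..max_gi.
max_gi = 14  # ASSUMPTION: truncation point chosen for illustration
s = np.arange(1, max_gi + 1)
gi_pmf = gamma2_cdf(s, 1.0) - gamma2_cdf(s - 1, 1.0)
gi_pmf /= gi_pmf.sum()

# Step 3: renewal equation, I_t = R_t * sum_s I_{t-s} g_s, starting from I_0 = 10.
incidence = np.zeros(n_days)
incidence[0] = 10.0
for t in range(1, n_days):
    k = min(t, gi_pmf.size)
    recent_first = incidence[t - k:t][::-1]  # I_{t-1}, ..., I_{t-k}
    incidence[t] = rt[t] * np.dot(recent_first, gi_pmf[:k])

# Step 4: delay convolution with an exponential (mean 5 days) truncated at 15 days.
d = np.arange(16)
delay_pmf = np.exp(-d / 5.0) - np.exp(-(d + 1) / 5.0)
delay_pmf /= delay_pmf.sum()
delayed = np.convolve(incidence, delay_pmf)[:n_days]

# Step 5: negative binomial noise with mean `delayed` and overdispersion 10.
overdispersion = 10.0
observed = rng.negative_binomial(overdispersion, overdispersion / (overdispersion + delayed))
```

Swapping the `scale` argument of `gamma2_cdf` to 10 (and lengthening `max_gi`) would give the long-GI variant under the same assumptions.
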
 manuscript/index.qmd | 53 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 40 insertions(+), 13 deletions(-)

diff --git a/manuscript/index.qmd b/manuscript/index.qmd
index f6bb5c2d3..24a9e57dd 100644
--- a/manuscript/index.qmd
+++ b/manuscript/index.qmd
@@ -51,24 +51,51 @@ We assume:
 
 ### Simulation model
 
-We use the generic model structure described above with a renewal process. To simulate noise in the infection process we assume additional Brownian noise for the effective reproduction number of XX.
+We use the generic model structure described above with a renewal process for reasons of convenience. The renewal equation is quite flexibie and makes minimal assumptions about the underlying infectious disease process. Although using the renewal equation to generate the data could be construed as leading to more favorable inference, all threee latent processes are able to generate the simulated data.
+
+We simulate from the renewal process through the following procedure:
+1. Take a fixed timeseries of Rt for 160 days (7 days x 8 weeks + 7 days x 15 weeks less one day). See the next subsection for more description of these scenarios.
+2. Add noise to the fixed Rt estimates draws from a N(0, 0.1) with a fixed seed of `12345`.
+3. Simulate daily incidence starting from $I_0 = 10$ cases and a fixed generation interval. See the next subsection for more detail.
+4. Convolve the true incidence timeseries through a double-censored Exp(1/5) PMF with a maximum of 15 which has ~95% of the probability density. This PMF allows for some-day observation. Note that this choice of delay is arbitrary and does not correspond to any particular delay.
+5. Simulate additional negative binomial observation noise on the delayed cases drawn with mean of the true cases and overdispersion of 10.
+
+We do not add a day-of-week effect.
+
+### Generation intervals
+
+We use two generation intervals, corresponding to pathogens with long and short GIs. We use descretized, double-censored versions of the GI PMFs.
+1. *Short:* We use a Gamma(shape = 2, scale = 1), corresponding to a pathogen with a quite short generation interval. Vaguely corresponds to flu A in Wallinga & Lipsitch, 2006
+3. *Medium:* We use a Gamma(shape = 2, scale = 5, corresponding to lots of
+2. *Long:* We use a Gamma(shape = 2, scale = 10), corresponding to a pathogen with a moderately long generation interval (Smallpox? I don't know if we need to ground this in anything real and if we do we could drop this down to 15 days and use varicella?)
+
+We don't test a longer GI because it would be impractical in the testing framework and we do not believe we would see substantially different behavior.
+
+We produce the simulations described in the next section for both of these GIs.
 
 ### Simulations
 
-We test the following general scenarios:
-- Piecewise constant Rt in an epidemic setting
-      - Generation time:
-- An endemic setting with smoothly varying Rt
-- An outbreak setting with changes in Rt comparable to that observed due to susceptible depletion
-- A mixed outbreak setting with both smooth changes and piecewise changes in Rt
+We test the following scenario:
+- Piecewise constant Rt
+    - 1.1 for two weeks
+    - 2 for two weeks
+    - 0.5 for two weeks
+    - 1.5 for two weeks
+    - 0.75 for two weeks
+    - 1.1 for six weeks
+    - sine curve centered at 1 with amplitude of 0.3 afterwards
+
+We simulate out of this scenario for the GIs described in the previous section.
+
+This scenario provides both sharp changes at the start of the timeseries and more gradual transitions towards the end. The rolling windows allow for exploration of both of these situations in a single case study. The longer fit to the entire timeseries tests the ability to flexibily handle both of these paradigms in a single fit.
+
+### Fitting to simulated data
 
-We assume a delay distribution of ** motivated by **.
+We fit the Rt estimation models with 8 week rolling windows as well as one global fit over the entire timeseries. We evaluate metrics both including the first week of the fit and dropping the first week. We also seperately evaluate the forecast over the two week horizon.
 
-We explore the following misspecification scenarios for the generation interval:
+We fit to the simulated data using both the correct GIs as well as the misspecified GIs. For the misspecified scenarios, we evaluate both the quality of the fit as well as the quality of the sampling.
 
-- Correct
-- Too short
-- Too long
+I'm picturing figure 1A here as something that echoes the Sherratt 2023 Figure 2. One of the classic fit/forecast quality but for Rt.
 
 ### Case studies
 
@@ -149,7 +176,7 @@ Say if it looked okay and reference SI
 - We do not explore more complex prior models such as splines and gaussian processes
 - We focus our efforts on situational awareness and hence real-time performance. This means we do not focus on retrospective performance which may have different characteristics.
 - We did not perform full simulation-based calibration.
-- Our simulations are produced by a model that is similar to the renewal process inference method and so represents a "best" case for this method. Potential future work could explore other versions of the infection generation process backing the simulations but we feel this choice makes sense given that the renewal process best reflects our mechanistic understanding of how transmission works of the models we explore here.
+- Our simulations are produced by a model that is similar to the renewal process inference method and so represents a "best" case for this method. It would be potentially be stronger to simulate from an SEIR process directly.
 
 ## References {.unnumbered}
 

From 60f242374b1f6fcd865efe9d0569df752bb707a9 Mon Sep 17 00:00:00 2001
From: Zachary Susswein <46581799+zsusswein@users.noreply.github.com>
Date: Tue, 13 Feb 2024 22:08:18 -0500
Subject: [PATCH 2/8] Apply suggestions from code review

Co-authored-by: Sam Abbott <azw1@cdc.gov>
---
 manuscript/index.qmd | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/manuscript/index.qmd b/manuscript/index.qmd
index 24a9e57dd..f7bf04912 100644
--- a/manuscript/index.qmd
+++ b/manuscript/index.qmd
@@ -65,11 +65,10 @@ We do not add a day-of-week effect.
 ### Generation intervals
 
 We use two generation intervals, corresponding to pathogens with long and short GIs. We use descretized, double-censored versions of the GI PMFs.
-1. *Short:* We use a Gamma(shape = 2, scale = 1), corresponding to a pathogen with a quite short generation interval. Vaguely corresponds to flu A in Wallinga & Lipsitch, 2006
+1. *Short:* We use a Gamma(shape = 2, scale = 1), corresponding to a pathogen with a relatively short generation interval. Vaguely corresponds to flu A in Wallinga & Lipsitch, 2006
 3. *Medium:* We use a Gamma(shape = 2, scale = 5, corresponding to lots of
 2. *Long:* We use a Gamma(shape = 2, scale = 10), corresponding to a pathogen with a moderately long generation interval (Smallpox? I don't know if we need to ground this in anything real and if we do we could drop this down to 15 days and use varicella?)
 
-We don't test a longer GI because it would be impractical in the testing framework and we do not believe we would see substantially different behavior.
 
 We produce the simulations described in the next section for both of these GIs.
 
@@ -91,9 +90,9 @@ This scenario provides both sharp changes at the start of the timeseries and mor
 
 ### Fitting to simulated data
 
-We fit the Rt estimation models with 8 week rolling windows as well as one global fit over the entire timeseries. We evaluate metrics both including the first week of the fit and dropping the first week. We also seperately evaluate the forecast over the two week horizon.
+We fit the Rt estimation models with 8 week rolling windows as well as one global fit over the entire timeseries.
 
-We fit to the simulated data using both the correct GIs as well as the misspecified GIs. For the misspecified scenarios, we evaluate both the quality of the fit as well as the quality of the sampling.
+We fit to the simulated data using both the correct GIs as well as the misspecified GIs. 
 
 I'm picturing figure 1A here as something that echoes the Sherratt 2023 Figure 2. One of the classic fit/forecast quality but for Rt.
 

From 221a99e6c24af379a4eb84c32c0f8dc2b0dfd99f Mon Sep 17 00:00:00 2001
From: Sam Abbott <azw1@cdc.gov>
Date: Mon, 19 Feb 2024 17:54:28 +0000
Subject: [PATCH 3/8] Update manuscript/index.qmd

---
 manuscript/index.qmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/manuscript/index.qmd b/manuscript/index.qmd
index f7bf04912..719dd313d 100644
--- a/manuscript/index.qmd
+++ b/manuscript/index.qmd
@@ -51,7 +51,7 @@ We assume:
 
 ### Simulation model
 
-We use the generic model structure described above with a renewal process for reasons of convenience. The renewal equation is quite flexibie and makes minimal assumptions about the underlying infectious disease process. Although using the renewal equation to generate the data could be construed as leading to more favorable inference, all threee latent processes are able to generate the simulated data.
+We use the generic model structure described above with a renewal process as it represents (in its equivalence to an SEIR compartmental model) domain understanding of a model that can capture known infectious dynamics.
 
 We simulate from the renewal process through the following procedure:
 1. Take a fixed timeseries of Rt for 160 days (7 days x 8 weeks + 7 days x 15 weeks less one day). See the next subsection for more description of these scenarios.

From feffb0d133405148917a527115e0fd2440defeee Mon Sep 17 00:00:00 2001
From: Sam Abbott <azw1@cdc.gov>
Date: Mon, 19 Feb 2024 17:55:42 +0000
Subject: [PATCH 4/8] Update manuscript/index.qmd

---
 manuscript/index.qmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/manuscript/index.qmd b/manuscript/index.qmd
index 719dd313d..2e9af0a99 100644
--- a/manuscript/index.qmd
+++ b/manuscript/index.qmd
@@ -175,7 +175,7 @@ Say if it looked okay and reference SI
 - We do not explore more complex prior models such as splines and gaussian processes
 - We focus our efforts on situational awareness and hence real-time performance. This means we do not focus on retrospective performance which may have different characteristics.
 - We did not perform full simulation-based calibration.
-- Our simulations are produced by a model that is similar to the renewal process inference method and so represents a "best" case for this method. It would be potentially be stronger to simulate from an SEIR process directly.
+- Our simulations are produced by a model that is similar to the renewal process inference method and so represents a "best" case for this method.
 
 ## References {.unnumbered}
 

From 146cf362b39204ccc5bd9bfe6faa6e80fb8e4a1d Mon Sep 17 00:00:00 2001
From: Sam Abbott <azw1@cdc.gov>
Date: Mon, 19 Feb 2024 17:55:53 +0000
Subject: [PATCH 5/8] Update manuscript/index.qmd

---
 manuscript/index.qmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/manuscript/index.qmd b/manuscript/index.qmd
index 2e9af0a99..db58a8bce 100644
--- a/manuscript/index.qmd
+++ b/manuscript/index.qmd
@@ -54,7 +54,7 @@ We assume:
 We use the generic model structure described above with a renewal process as it represents (in its equivalence to an SEIR compartmental model) domain understanding of a model that can capture known infectious dynamics.
 
 We simulate from the renewal process through the following procedure:
-1. Take a fixed timeseries of Rt for 160 days (7 days x 8 weeks + 7 days x 15 weeks less one day). See the next subsection for more description of these scenarios.
+1. Take a fixed timeseries of Rt for 160 days. See the next subsection for more description of these scenarios.
 2. Add noise to the fixed Rt estimates draws from a N(0, 0.1) with a fixed seed of `12345`.
 3. Simulate daily incidence starting from $I_0 = 10$ cases and a fixed generation interval. See the next subsection for more detail.
 4. Convolve the true incidence timeseries through a double-censored Exp(1/5) PMF with a maximum of 15 which has ~95% of the probability density. This PMF allows for some-day observation. Note that this choice of delay is arbitrary and does not correspond to any particular delay.

From a778a162ac5dd4710f865a2feea63b6497ddb8f7 Mon Sep 17 00:00:00 2001
From: Sam Abbott <azw1@cdc.gov>
Date: Mon, 19 Feb 2024 18:11:38 +0000
Subject: [PATCH 6/8] Update manuscript/index.qmd

Co-authored-by: Samuel Brand <48288458+SamuelBrand1@users.noreply.github.com>
---
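
Reviewer note (not part of the commit): a rough sketch of what "double interval censoring" of a continuous delay into a daily PMF could look like. The delay distribution is still a placeholder in this patch, so the sketch uses the exponential with a mean of 5 days and a maximum of 15 days from the earlier draft purely as a stand-in. The function names and the midpoint-rule integration are assumptions made for illustration, not the cited method's reference implementation.

```python
import numpy as np


def double_censored_pmf(cdf, max_delay, n_grid=1000):
    """Daily PMF when both the primary and secondary event are only known to the day.

    The primary event time u is taken as uniform within its day, so
    P(D = d) = integral over u in [0, 1) of cdf(d + 1 - u) - cdf(d - u),
    approximated here with a midpoint rule and renormalised after truncation.
    """
    u = (np.arange(n_grid) + 0.5) / n_grid  # midpoints of the within-day grid
    days = np.arange(max_delay + 1)
    upper = cdf(days[:, None] + 1.0 - u[None, :])
    lower = cdf(days[:, None] - u[None, :])
    pmf = (upper - lower).mean(axis=1)
    return pmf / pmf.sum()


def exp_cdf(x, mean=5.0):
    """Exponential CDF, zero for negative arguments (stand-in delay distribution)."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, 1.0 - np.exp(-np.clip(x, 0.0, None) / mean), 0.0)


delay_pmf = double_censored_pmf(exp_cdf, max_delay=15)

# Applying the PMF to a toy incidence series by convolution.
toy_incidence = np.full(30, 100.0)
delayed_cases = np.convolve(toy_incidence, delay_pmf)[: toy_incidence.size]
```

The first entry of `delay_pmf` is positive, so same-day observation remains possible under this construction.
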
 manuscript/index.qmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/manuscript/index.qmd b/manuscript/index.qmd
index db58a8bce..219ef05fb 100644
--- a/manuscript/index.qmd
+++ b/manuscript/index.qmd
@@ -57,7 +57,7 @@ We simulate from the renewal process through the following procedure:
 1. Take a fixed timeseries of Rt for 160 days. See the next subsection for more description of these scenarios.
 2. Add noise to the fixed Rt estimates draws from a N(0, 0.1) with a fixed seed of `12345`.
 3. Simulate daily incidence starting from $I_0 = 10$ cases and a fixed generation interval. See the next subsection for more detail.
-4. Convolve the true incidence timeseries through a double-censored Exp(1/5) PMF with a maximum of 15 which has ~95% of the probability density. This PMF allows for some-day observation. Note that this choice of delay is arbitrary and does not correspond to any particular delay.
+4. The delay between infection and case ascertainment is represented as a convolution on the true incidence timeseries, as is standard in the literature **CITATIONS**. For any given infected person the delay between infection and ascertainment is distributed **SOME GAMMA/LOGNORMAL**; this is mapped to our discrete time forward simulations using double interval censoring of both the time of infection and the time of ascertainment **CITE SWP + OTHERS**.
 5. Simulate additional negative binomial observation noise on the delayed cases drawn with mean of the true cases and overdispersion of 10.
 
 We do not add a day-of-week effect.

From 38ab93cbc386fdc8da6db44a8c4ce2006a06145c Mon Sep 17 00:00:00 2001
From: Sam <s.e.abbott12@gmail.com>
Date: Mon, 19 Feb 2024 19:05:30 +0000
Subject: [PATCH 7/8] another go at sub-heading org and revert discussion
 changes

---
 manuscript/index.qmd | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/manuscript/index.qmd b/manuscript/index.qmd
index 219ef05fb..0ea4da003 100644
--- a/manuscript/index.qmd
+++ b/manuscript/index.qmd
@@ -49,7 +49,9 @@ We assume:
    - AR(1) process
    - Differenced AR(1) process
 
-### Simulation model
+### Simulations
+
+#### Observation-generating process
 
 We use the generic model structure described above with a renewal process as it represents (in its equivalence to an SEIR compartmental model) domain understanding of a model that can capture known infectious dynamics.
 
@@ -62,7 +64,7 @@ We simulate from the renewal process through the following procedure:
 
 We do not add a day-of-week effect.
 
-### Generation intervals
+#### Generation intervals
 
 We use two generation intervals, corresponding to pathogens with long and short GIs. We use descretized, double-censored versions of the GI PMFs.
 1. *Short:* We use a Gamma(shape = 2, scale = 1), corresponding to a pathogen with a relatively short generation interval. Vaguely corresponds to flu A in Wallinga & Lipsitch, 2006
@@ -72,7 +74,9 @@ We use two generation intervals, corresponding to pathogens with long and short
 
 We produce the simulations described in the next section for both of these GIs.
 
-### Simulations
+#### Scenarios
+
+##### Reproduction number scenarios
 
 We test the following scenario:
 - Piecewise constant Rt
@@ -88,13 +92,14 @@ We simulate out of this scenario for the GIs described in the previous section.
 
 This scenario provides both sharp changes at the start of the timeseries and more gradual transitions towards the end. The rolling windows allow for exploration of both of these situations in a single case study. The longer fit to the entire timeseries tests the ability to flexibily handle both of these paradigms in a single fit.
 
-### Fitting to simulated data
-
-We fit the Rt estimation models with 8 week rolling windows as well as one global fit over the entire timeseries.
+#### Inference scenarios
+We explore the following misspecification scenarios for the generation interval:
 
-We fit to the simulated data using both the correct GIs as well as the misspecified GIs. 
+- Correct
+- Too short
+- Too long
 
-I'm picturing figure 1A here as something that echoes the Sherratt 2023 Figure 2. One of the classic fit/forecast quality but for Rt.
+For each simulated scenario we fit to 12 weeks of data or as much as possible if the scenario is shorter than 12 weeks.
 
 ### Case studies
 
@@ -175,7 +180,7 @@ Say if it looked okay and reference SI
 - We do not explore more complex prior models such as splines and gaussian processes
 - We focus our efforts on situational awareness and hence real-time performance. This means we do not focus on retrospective performance which may have different characteristics.
 - We did not perform full simulation-based calibration.
-- Our simulations are produced by a model that is similar to the renewal process inference method and so represents a "best" case for this method.
+- Our simulations are produced by a model that is similar to the renewal process inference method and so represents a "best" case for this method. Potential future work could explore other versions of the infection generation process backing the simulations but we feel this choice makes sense given that the renewal process best reflects our mechanistic understanding of how transmission works of the models we explore here.
 
 ## References {.unnumbered}
 

From 2657c10151c57124a8e2a5e88cc6ffc405725b8c Mon Sep 17 00:00:00 2001
From: Sam <s.e.abbott12@gmail.com>
Date: Mon, 19 Feb 2024 19:21:58 +0000
Subject: [PATCH 8/8] add another subheading

---
 manuscript/index.qmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/manuscript/index.qmd b/manuscript/index.qmd
index 0ea4da003..1fbc3fe04 100644
--- a/manuscript/index.qmd
+++ b/manuscript/index.qmd
@@ -92,7 +92,7 @@ We simulate out of this scenario for the GIs described in the previous section.
 
 This scenario provides both sharp changes at the start of the timeseries and more gradual transitions towards the end. The rolling windows allow for exploration of both of these situations in a single case study. The longer fit to the entire timeseries tests the ability to flexibily handle both of these paradigms in a single fit.
 
-#### Inference scenarios
+##### Inference scenarios
 We explore the following misspecification scenarios for the generation interval:
 
 - Correct