
Add differenced AR models #134

Merged 58 commits on Mar 15, 2024

Commits
0e32d50
first pass at AR model
seabbs Mar 8, 2024
57d4fcf
move files around and add differencing infrastructure
seabbs Mar 8, 2024
47613aa
remove scaffold comment
seabbs Mar 8, 2024
105d8b4
change arg order
seabbs Mar 8, 2024
34635c7
clean up use of new
seabbs Mar 8, 2024
24b29b8
reorg based on main
seabbs Mar 11, 2024
cb53aea
improve testing
seabbs Mar 11, 2024
b54918e
catching out of date export
seabbs Mar 11, 2024
4cd8307
update to use parametric types and use @kwdef to simplify construction
seabbs Mar 12, 2024
3142f20
update to use parameters.jl and drop p calc in struct as don't work w…
seabbs Mar 12, 2024
668b7ee
work through tests
seabbs Mar 12, 2024
50c5673
check tests and add basic docs
seabbs Mar 12, 2024
030193f
tweak docs
seabbs Mar 12, 2024
e11a2e9
clean up merge issues
seabbs Mar 14, 2024
9a172f9
reimplement AR
seabbs Mar 14, 2024
e21e6fc
reimplement AR
seabbs Mar 14, 2024
e34e952
fix AR
seabbs Mar 15, 2024
8f5d203
fix current AR tests
seabbs Mar 15, 2024
4164ec4
current tests passing
seabbs Mar 15, 2024
54676d5
copy doc strings everywhere
seabbs Mar 15, 2024
83b2022
reorder tests for RandomWalk
seabbs Mar 15, 2024
42364fe
add AR(2)
seabbs Mar 15, 2024
03c7c5b
update ar default
seabbs Mar 15, 2024
455c36b
first pass at AR model
seabbs Mar 8, 2024
b8668f2
move files around and add differencing infrastructure
seabbs Mar 8, 2024
d2e0e72
remove scaffold comment
seabbs Mar 8, 2024
f51c17f
clean up use of new
seabbs Mar 8, 2024
3220e4f
reorg based on main
seabbs Mar 11, 2024
2724d96
remove differenced model for its own PR
seabbs Mar 11, 2024
04e7ceb
update to use parametric types and use @kwdef to simplify construction
seabbs Mar 12, 2024
1d76b45
update to use parameters.jl and drop p calc in struct as don't work w…
seabbs Mar 12, 2024
3d2fec1
work through tests
seabbs Mar 12, 2024
5936ae4
check tests and add basic docs
seabbs Mar 12, 2024
a4d2cc7
tweak docs
seabbs Mar 12, 2024
cebe727
first pass at AR model
seabbs Mar 8, 2024
b9c853c
move files around and add differencing infrastructure
seabbs Mar 8, 2024
49b7875
add differenced latent model
seabbs Mar 11, 2024
486c25c
add default diff latent priors
seabbs Mar 11, 2024
e50db06
clean up comments
seabbs Mar 11, 2024
6d23f52
fix simple example
seabbs Mar 11, 2024
3d389fb
improve draft tests
seabbs Mar 11, 2024
d2a967f
Add normal util
SamuelBrand1 Mar 13, 2024
259909c
Change DiffLatentModel constructor
SamuelBrand1 Mar 13, 2024
3262568
distribution tests for DiffLatentModel on randomwalk
SamuelBrand1 Mar 13, 2024
80405ae
doc strings for DiffLatentModel
SamuelBrand1 Mar 14, 2024
b21b920
remove unnecessary default prior creation
SamuelBrand1 Mar 14, 2024
b7c5aba
style change to use _ for unused loop var
SamuelBrand1 Mar 14, 2024
60a673d
doc string for generated_latent method
SamuelBrand1 Mar 14, 2024
8b1e20a
clean up further merging issues
seabbs Mar 14, 2024
bab886e
clena out utils duplicate
seabbs Mar 14, 2024
a343912
update to new module structure and utils
seabbs Mar 15, 2024
5db0d93
add back in scan.jl
seabbs Mar 15, 2024
cbc892b
clean up more leakage from main/git hygiene
seabbs Mar 15, 2024
9b379ed
add defaults to difflatentmodel
seabbs Mar 15, 2024
4581522
get first difflatent test passing
seabbs Mar 15, 2024
b0a7731
fix final DiffLatent test
seabbs Mar 15, 2024
ef49536
Merge branch 'main' into diff-latent-model
seabbs Mar 15, 2024
2f0efae
Update EpiAware.jl
seabbs Mar 15, 2024
2 changes: 1 addition & 1 deletion EpiAware/src/EpiAware.jl
@@ -48,7 +48,7 @@ export spread_draws, scan, create_discrete_pmf
include("EpiLatentModels/EpiLatentModels.jl")
using .EpiLatentModels

-export RandomWalk, AR
+export RandomWalk, AR, DiffLatentModel

include("EpiInfModels/EpiInfModels.jl")
using .EpiInfModels
4 changes: 3 additions & 1 deletion EpiAware/src/EpiLatentModels/EpiLatentModels.jl
@@ -8,10 +8,12 @@ using ..EpiAwareBase

using Turing, Distributions, DocStringExtensions

-export RandomWalk, AR
+export RandomWalk, AR, DiffLatentModel

include("docstrings.jl")
include("randomwalk.jl")
include("autoregressive.jl")
include("difflatentmodel.jl")
include("utils.jl")

end
191 changes: 191 additions & 0 deletions EpiAware/src/EpiLatentModels/difflatentmodel.jl
@@ -0,0 +1,191 @@

@doc raw"""
Model the latent process as a `d`-fold differenced version of another process.

## Mathematical specification

Let ``\Delta`` be the differencing operator. If ``\tilde{Z}_t`` is a realisation of the
undifferenced latent model supplied to `DiffLatentModel`, then the differenced process is
given by,

```math
\Delta^{(d)} Z_t = \tilde{Z}_t, \quad t = d+1, \ldots.
```

We can recover ``Z_t`` by applying the inverse differencing operator ``\Delta^{-1}``, which
corresponds to the cumulative sum operator `cumsum` in Julia, `d`-times. The `d` initial
terms ``Z_1, \ldots, Z_d`` are inferred.

## Constructors

- `DiffLatentModel(model, init_prior_distribution::Distribution; d::Int)`
Constructs a `DiffLatentModel` for `d`-fold differencing with `model` as the
undifferenced latent process. All `d` initial terms share the common prior
`init_prior_distribution`.
- `DiffLatentModel(; model, init_priors::Vector{D} where {D <: Distribution})`
Constructs a `DiffLatentModel` with `model` as the undifferenced latent process.
The `d` initial terms have priors given by the vector `init_priors`, so
`length(init_priors)` sets `d`.

## Example usage with `generate_latent`

`generate_latent` can be used to construct a `Turing` model for the differenced latent
process. In this example, the underlying undifferenced process is a `RandomWalk` model.


First, we construct a `RandomWalk` struct with an initial value prior and a step size
standard deviation prior.

```julia
using Distributions, EpiAware
rw = RandomWalk(Normal(0.0, 1.0), truncated(Normal(0.0, 0.05), 0.0, Inf))
```

Then, we can use `DiffLatentModel` to construct a model for the `d`-fold differenced
process with `rw` as the undifferenced latent process.

We have two constructor options for `DiffLatentModel`. The first option is to supply a
common prior distribution for the initial terms and specify `d` as follows:

```julia
diff_model = DiffLatentModel(rw, Normal(); d = 2)
```

Or we can supply a vector of priors for the initial terms and `d` is inferred as follows:
```julia
diff_model2 = DiffLatentModel(; model = rw, init_priors = [Normal(), Normal()])
```

Then, we can use `generate_latent` to construct a Turing model for the differenced latent
process, generating a process of length `n`:

```julia
# Construct a Turing model
n = 100
difference_mdl = generate_latent(diff_model, n)
```

Now we can use the `Turing` PPL API to sample underlying parameters and generate the
unobserved latent process.

```julia
#Sample random parameters from prior
θ = rand(difference_mdl)
#Get a sampled latent process as a generated quantity from the model
(Z_t, _) = generated_quantities(difference_mdl, θ)
Z_t
```

"""
struct DiffLatentModel{M <: AbstractLatentModel, P} <: AbstractLatentModel
"Underlying latent model for the differenced process"
model::M
"The prior distribution for the initial latent variables."
init_prior::P
"Number of times differenced."
d::Int

function DiffLatentModel(
model::AbstractLatentModel, init_prior::Distribution; d::Int)
init_priors = fill(init_prior, d)
return DiffLatentModel(; model = model, init_priors = init_priors)
end

function DiffLatentModel(; model::AbstractLatentModel,
init_priors::Vector{D} where {D <: Distribution} = [Normal()])
d = length(init_priors)
init_prior = _expand_dist(init_priors)
return DiffLatentModel(model, init_prior, d)
end

function DiffLatentModel(model::AbstractLatentModel, init_prior::Distribution, d::Int)
@assert d>0 "d must be greater than 0"
@assert d==length(init_prior) "d must be equal to the length of init_prior"
new{typeof(model), typeof(init_prior)}(model, init_prior, d)
end
end

"""
Generate a `Turing` model for an `n`-step latent process ``Z_t`` using the differenced
latent model defined by `latent_model`.

# Arguments
- `latent_model::DiffLatentModel`: The differenced latent model.
- `n`: The length of the latent variables.

## Turing model specifications

### Sampled random variables
- `latent_init`: The initial latent process variables.
- Other random variables defined by the undifferenced latent model held in the `model`
field.

### Generated quantities
- A tuple containing the generated latent process as its first element and a `NamedTuple`
of sampled auxiliary variables as its second element.

# Example usage with `DiffLatentModel` model constructor

`generate_latent` can be used to construct a `Turing` model for the differenced latent
process. In this example, the underlying undifferenced process is a `RandomWalk` model.

First, we construct a `RandomWalk` struct with an initial value prior and a step size
standard deviation prior.

```julia
using Distributions, EpiAware
rw = RandomWalk(Normal(0.0, 1.0), truncated(Normal(0.0, 0.05), 0.0, Inf))
```

Then, we can use `DiffLatentModel` to construct a model for the `d`-fold differenced
process with `rw` as the undifferenced latent process.

We have two constructor options for `DiffLatentModel`. The first option is to supply a
common prior distribution for the initial terms and specify `d` as follows:

```julia
diff_model = DiffLatentModel(rw, Normal(); d = 2)
```

Or we can supply a vector of priors for the initial terms and `d` is inferred as follows:
```julia
diff_model2 = DiffLatentModel(; model = rw, init_priors = [Normal(), Normal()])
```

Then, we can use `generate_latent` to construct a Turing model for the differenced latent
process, generating a process of length `n`:

```julia
# Construct a Turing model
n = 100
difference_mdl = generate_latent(diff_model, n)
```

Now we can use the `Turing` PPL API to sample underlying parameters and generate the
unobserved latent process.

```julia
#Sample random parameters from prior
θ = rand(difference_mdl)
#Get a sampled latent process as a generated quantity from the model
(Z_t, _) = generated_quantities(difference_mdl, θ)
Z_t
```
"""
@model function EpiAwareBase.generate_latent(latent_model::DiffLatentModel, n)
d = latent_model.d
@assert n>d "n must be longer than d"
latent_init ~ latent_model.init_prior

@submodel diff_latent, diff_latent_aux = generate_latent(latent_model.model, n - d)

return _combine_diff(latent_init, diff_latent, d), (; latent_init, diff_latent_aux...)
end

function _combine_diff(init, diff, d)
combined = vcat(init, diff)
for _ in 1:d
combined = cumsum(combined)
end
return combined
end
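The `_combine_diff` helper above undoes `d`-fold differencing by prepending the inferred initial terms and applying `cumsum` `d` times. A minimal standalone sketch of the same idea (the names `combine_diff` and `undo_cumsum` are illustrative, not part of the package) checks that this round-trips with repeated differencing:

```julia
# Mirrors the _combine_diff helper: prepend the d initial terms, then cumsum d times.
function combine_diff(init, diffs, d)
    combined = vcat(init, diffs)
    for _ in 1:d
        combined = cumsum(combined)
    end
    return combined
end

# The exact inverse of one cumsum: keep the first element, then take first differences.
undo_cumsum(v) = vcat(v[1], diff(v))

# Round trip for d = 2: peel off two cumsum layers, then reconstruct the series.
Z = [1.0, 3.0, 2.0, 5.0, 4.0]
v = undo_cumsum(undo_cumsum(Z))
init, diffs = v[1:2], v[3:end]
@assert combine_diff(init, diffs, 2) == Z
```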
90 changes: 90 additions & 0 deletions EpiAware/test/test_difflatentmodel.jl
@@ -0,0 +1,90 @@
@testitem "Testing DiffLatentModel constructor" begin
using Distributions, Turing

model = RandomWalk()
@testset "Testing DiffLatentModel with vector of priors" begin
init_priors = [Normal(0.0, 1.0), Normal(1.0, 2.0)]
diff_model = DiffLatentModel(; model = model, init_priors = init_priors)

@test diff_model.model == model
@test diff_model.init_prior == arraydist(init_priors)
@test diff_model.d == 2
end

@testset "Testing DiffLatentModel with single prior and d" begin
d = 3
init_prior = Normal()
diff_model = DiffLatentModel(model, init_prior; d = d)

@test diff_model.model == model
@test diff_model.init_prior == filldist(init_prior, d)
@test diff_model.d == d
end
end

@testitem "Testing DiffLatentModel process" begin
using DynamicPPL, Turing
using Distributions
using HypothesisTests: ExactOneSampleKSTest, pvalue

n = 100
d = 2
model = RandomWalk(Normal(0.0, 1.0), truncated(Normal(0.0, 0.05), 0.0, Inf))
init_priors = [Normal(0.0, 1.0), Normal(1.0, 2.0)]
diff_model = DiffLatentModel(model = model, init_priors = init_priors)

latent_model = generate_latent(diff_model, n)
fixed_model = fix(
latent_model, (σ_RW = 0, rw_init = 0.0))

n_samples = 2000
samples = sample(fixed_model, Prior(), n_samples) |>
chn -> mapreduce(hcat, generated_quantities(fixed_model, chn)) do gen
gen[1]
end

#Because the process is undifferenced by applying cumsum d times,
#the day-two distribution should be the day-two init prior plus d
#contributions of the day-one init prior
"""
Sum two independent normal distributions `x` and `y`.

# Returns
- `Normal`: The sum of `x` and `y` as a normal distribution.
"""
function _add_normals(x::Normal, y::Normal)
return Normal(x.μ + y.μ, sqrt(x.σ^2 + y.σ^2))
end

#Implied distribution for day two
day2_dist = foldl((x, y) -> _add_normals(x, init_priors[1]), 1:d, init = init_priors[2])

ks_test_pval_day1 = ExactOneSampleKSTest(samples[1, :], init_priors[1]) |> pvalue
ks_test_pval_day2 = ExactOneSampleKSTest(samples[2, :], day2_dist) |> pvalue

@test size(samples) == (n, n_samples)
@test ks_test_pval_day1 > 1e-6 #Very unlikely to fail if the model is correctly implemented
@test ks_test_pval_day2 > 1e-6 #Very unlikely to fail if the model is correctly implemented
end

@testitem "Testing DiffLatentModel runs with AR process" begin
using DynamicPPL, Turing
using Distributions

n = 100
d = 2
model = AR()
init_priors = [Normal(0.0, 1.0), Normal(1.0, 2.0)]
diff_model = DiffLatentModel(model = model, init_priors = init_priors)

latent_model = generate_latent(diff_model, n)
fixed_model = fix(latent_model,
(latent_init = [0.0, 1.0], σ_AR = 1.0, damp_AR = [0.8], ar_init = [0.0]))

n_samples = 100
samples = sample(fixed_model, Prior(), n_samples) |>
chn -> mapreduce(hcat, generated_quantities(fixed_model, chn)) do gen
gen[1]
end

@test size(samples) == (n, n_samples)
end
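The KS checks in the distribution test above rest on how the reconstructed process depends on the initial terms once the latent steps are fixed to zero. A small standalone arithmetic check of that premise for `d = 2` (plain Julia, no packages; `a` and `b` are stand-ins for sampled `latent_init` values): applying `cumsum` twice to `[a, b]` yields `[a, 2a + b]`, so day one follows the first init prior and day two is a linear combination of both initial terms.

```julia
# With the undifferenced steps fixed to zero, the first two reconstructed
# values are determined by the initial terms alone.
a, b = 1.0, 2.0
reconstructed = cumsum(cumsum([a, b]))
@assert reconstructed == [a, 2a + b]   # day 1 = a, day 2 = 2a + b
```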