Adding idea about how to sample more efficiently
gvegayon committed Apr 25, 2024
1 parent f6f8a48 commit e973ba2
Showing 3 changed files with 100 additions and 1 deletion.
8 changes: 7 additions & 1 deletion .vscode/settings.json
@@ -70,5 +70,11 @@
"cfenv": "cpp",
"cinttypes": "cpp"
},
-"intel-corporation.oneapi-analysis-configurator.binary-path": "/home/george/Documents/development/epiworld/tests/01c.o"
+"intel-corporation.oneapi-analysis-configurator.binary-path": "/home/george/Documents/development/epiworld/tests/01c.o",
+"grammarly.selectors": [
+{
+"language": "quarto",
+"scheme": "file"
+}
+]
}
54 changes: 54 additions & 0 deletions paper/mixing.md
@@ -0,0 +1,54 @@
# Mixing probabilities in connected model
George G. Vega Yon, Ph.D.
2024-04-25

We look at the probability of drawing infected individuals in order to
simplify the sampling algorithm. There are $I$ infected individuals at
any time in the simulation; thus, instead of drawing the number of
contacts from $\text{Binomial}(N, c/N)$, we draw the number of infected
contacts directly from $\text{Binomial}(I, c/N)$, where $c$ is the
contact rate (`rate` in the code). The next step is to decide which
infected individuals are drawn. Let’s compare the two distributions,
using the hypergeometric distribution for the first approach:

``` r
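set.seed(132)
nsims <- 1e5   # number of simulation replicates
N <- 400       # population size
rate <- 2      # contact rate (c in the text)
p <- rate/N
I <- 10        # number of infected individuals

# Approach 1: draw each individual's number of contacts, then count how
# many of those contacts are infected. Note that rhyper()'s `n` is the
# number of non-infected individuals, so this urn holds N + I individuals.
sim_complex <- parallel::mclapply(1:nsims, \(i) {
  nsamples <- rbinom(N, N, p)
  sum(rhyper(N, m = I, n = N, k = nsamples) > 0)
}, mc.cores = 4L) |> unlist()

# Approach 2: draw the number of infected contacts directly.
sim_simple <- parallel::mclapply(1:nsims, \(i) {
  sum(rbinom(N, I, p) > 0)
}, mc.cores = 4L) |> unlist()

# Histograms of the number of individuals with at least one infected contact
op <- par(mfrow = c(1,2))
MASS::truehist(sim_complex)
MASS::truehist(sim_simple)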
```

![](mixing_files/figure-commonmark/Simulation-1.png)

``` r
par(op)

quantile(sim_complex)
```

      0%  25%  50%  75% 100% 
       3   16   19   22   40 

``` r
quantile(sim_simple)
```

      0%  25%  50%  75% 100% 
       3   17   19   22   40 

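These two approaches yield essentially the same distribution, but the
second one is computationally cheaper: it needs a single binomial draw
per individual instead of a binomial draw followed by a hypergeometric
draw.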
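
One way to see why the two approaches agree is binomial thinning. The
sketch below assumes a population of exactly $N$ individuals, $I$ of
them infected, so the hypergeometric step has $N - I$ non-infected
individuals (the code above uses `n = N`, i.e., $N + I$ individuals in
total, so the equivalence holds only approximately for that code). The
symbols $K$ (number of contacts), $X$ (number of infected contacts), and
$p = c/N$ are introduced here for illustration.

$$
\begin{aligned}
P(X = x) &= \sum_{k} \binom{N}{k} p^{k} (1-p)^{N-k}
            \frac{\binom{I}{x}\binom{N-I}{k-x}}{\binom{N}{k}} \\
         &= \binom{I}{x} p^{x} (1-p)^{I-x}
            \sum_{k} \binom{N-I}{k-x} p^{k-x} (1-p)^{(N-I)-(k-x)} \\
         &= \binom{I}{x} p^{x} (1-p)^{I-x}
\end{aligned}
$$

The remaining sum equals one by the binomial theorem, so under these
assumptions $X \sim \text{Binomial}(I, p)$ exactly.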
39 changes: 39 additions & 0 deletions paper/mixing.qmd
@@ -0,0 +1,39 @@
---
format: gfm
title: Mixing probabilities in connected model
author: George G. Vega Yon, Ph.D.
date: 2024-04-25
---

We look at the probability of drawing infected individuals in order to simplify the sampling algorithm. There are $I$ infected individuals at any time in the simulation; thus, instead of drawing the number of contacts from $\text{Binomial}(N, c/N)$, we draw the number of infected contacts directly from $\text{Binomial}(I, c/N)$, where $c$ is the contact rate (`rate` in the code). The next step is to decide which infected individuals are drawn. Let's compare the two distributions, using the hypergeometric distribution for the first approach:


```{r}
#| label: Simulation
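set.seed(132)
nsims <- 1e5   # number of simulation replicates
N <- 400       # population size
rate <- 2      # contact rate (c in the text)
p <- rate/N
I <- 10        # number of infected individuals

# Approach 1: draw each individual's number of contacts, then count how
# many of those contacts are infected. Note that rhyper()'s `n` is the
# number of non-infected individuals, so this urn holds N + I individuals.
sim_complex <- parallel::mclapply(1:nsims, \(i) {
  nsamples <- rbinom(N, N, p)
  sum(rhyper(N, m = I, n = N, k = nsamples) > 0)
}, mc.cores = 4L) |> unlist()

# Approach 2: draw the number of infected contacts directly.
sim_simple <- parallel::mclapply(1:nsims, \(i) {
  sum(rbinom(N, I, p) > 0)
}, mc.cores = 4L) |> unlist()

# Histograms of the number of individuals with at least one infected contact
op <- par(mfrow = c(1,2))
MASS::truehist(sim_complex)
MASS::truehist(sim_simple)
par(op)
quantile(sim_complex)
quantile(sim_simple)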
```

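These two approaches yield essentially the same distribution, but the second one is computationally cheaper: it needs a single binomial draw per individual instead of a binomial draw followed by a hypergeometric draw.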
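
As a complement to the simulation, the agreement can also be checked without Monte Carlo noise by comparing probability mass functions directly. The chunk below is a minimal sketch and not part of the original analysis: the chunk label and the names `pmf_mixture`, `pmf_direct`, and `pmf_source` are hypothetical, and the exact match assumes a population of exactly `N` individuals (`n = N - I` in `dhyper()`), whereas the simulation above uses `n = N`.

```{r}
#| label: exact-check
# Hypothetical check: exact pmfs instead of simulated draws.
N <- 400
I <- 10
p <- 2 / N

x <- 0:I  # possible numbers of infected contacts
k <- 0:N  # possible numbers of contacts

# Mixture: K ~ Binomial(N, p), then X | K ~ Hypergeometric with I infected
# and N - I non-infected individuals (assumption: population of exactly N).
pmf_mixture <- vapply(x, function(xi) {
  sum(dbinom(k, N, p) * dhyper(xi, m = I, n = N - I, k = k))
}, numeric(1))

# Direct draw: X ~ Binomial(I, p).
pmf_direct <- dbinom(x, I, p)

# Same mixture with n = N, as in the simulation above (urn of N + I).
pmf_source <- vapply(x, function(xi) {
  sum(dbinom(k, N, p) * dhyper(xi, m = I, n = N, k = k))
}, numeric(1))

# Largest absolute differences against the direct binomial pmf.
max(abs(pmf_mixture - pmf_direct))
max(abs(pmf_source - pmf_direct))
```

Under the stated assumption, the first difference should be at the level of floating-point rounding, while the second shows how far the `n = N` parameterization departs from the direct binomial draw.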
