---
title: "Base Ranker: Proportional Reporting Ratio (PRR)"
author:
  - name: Nan Xiao
    url: https://nanx.me/
    affiliation: Seven Bridges
    affiliation_url: https://www.sevenbridges.com/
  - name: Soner Koc
    url: https://github.com/skoc
    affiliation: Seven Bridges
    affiliation_url: https://www.sevenbridges.com/
  - name: Kaushik Ghose
    url: https://kaushikghose.wordpress.com/
    affiliation: Seven Bridges
    affiliation_url: https://www.sevenbridges.com/
date: "`r Sys.Date()`"
output: distill::distill_article
bibliography: rankv.bib
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, eval = TRUE, cache = TRUE)
```
PRR is a commonly used metric for safety signal detection. The vaccine-symptom pairs in the VAERS database can be summarized as a collection of $2 \times 2$ contingency tables, one per pair:

| Target vaccine | Target symptom | All other symptoms       | Total     |
| :------------- | :------------- | :----------------------- | :-------- |
| Yes            | $n_{ij}$       | $n_i - n_{ij}$           | $n_i$     |
| No             | $n_j - n_{ij}$ | $n - n_i - n_j + n_{ij}$ | $n - n_i$ |
| Total          | $n_j$          | $n - n_j$                | $n$       |

In the table, $n_{ij}$ is the number of reports that mention both vaccine $i$ and symptom $j$, $n_i = \sum_j n_{ij}$ is the marginal total for vaccine $i$ over all symptoms, $n_j = \sum_i n_{ij}$ is the marginal total for symptom $j$ over all vaccines, and $n$ is the grand total. The proportional reporting ratio (PRR) [@evans2001] for each vaccine-symptom pair is defined as
$$
PRR_{ij} = \frac{n_{ij}}{E_{ij}}
$$
where
$$
E_{ij} = \frac{n_i (n_j - n_{ij})}{n - n_i}.
$$
$PRR_{ij}$ measures disproportionality in the contingency table: it compares the proportion of reports mentioning the target symptom $j$ among reports for vaccine $i$ with the same proportion among reports for all other vaccines. It assumes that reports of symptom $j$ are independent of reports of all other symptoms for vaccine $i$, which can be a strong assumption in practice. A higher PRR indicates a stronger association between the vaccine and the symptom.

Load the packages for PRR-based signal detection and ranking:
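
As a minimal worked sketch of this ratio, the chunk below computes the PRR for a single vaccine-symptom pair directly from the cell counts of its contingency table. The helper name `prr_from_counts()` and all counts are hypothetical and chosen only for illustration; they are not taken from VAERS.

```{r}
# Hypothetical helper: PRR from the cell counts of one 2 x 2 table.
# n_ij: reports with vaccine i and symptom j
# n_i : all reports with vaccine i
# n_j : all reports with symptom j
# n   : all reports
prr_from_counts <- function(n_ij, n_i, n_j, n) {
  rate_target <- n_ij / n_i                # symptom share among vaccine i reports
  rate_other <- (n_j - n_ij) / (n - n_i)   # symptom share among all other vaccines
  rate_target / rate_other
}

# Made-up counts, for illustration only
prr_toy <- prr_from_counts(n_ij = 20, n_i = 100, n_j = 120, n = 1100)
prr_toy
```

With these made-up counts, the symptom accounts for 20% of the reports for the target vaccine and 10% of the reports for all other vaccines, so the PRR is 2.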
```{r}
suppressMessages(library("PhViD"))
library("kableExtra")
```
Load the preprocessed VAERS data and convert it into a PhViD object for analysis:
```{r}
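# Read the preprocessed VAERS vaccine-symptom counts
df_p <- readRDS("data-processed/df_p.rds")
# Keep the three columns as.PhViD() expects: two labels and the report count
df_p <- df_p[, 1:3]
# Drop vaccines and symptoms whose marginal counts are below 10
df_v <- as.PhViD(df_p, MARGIN.THRES = 10)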
```
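
To make the input format concrete, here is a small base R sketch of a long-format table with one row per vaccine-symptom pair, together with the marginal totals $n_i$, $n_j$, and $n$ used in the contingency table. The column names and counts are made up for illustration, and the assumption that `as.PhViD()` reads the first three columns as two labels followed by a count reflects our reading of the PhViD documentation rather than anything verified here.

```{r}
# Made-up long-format data: one row per vaccine-symptom pair
df_toy <- data.frame(
  vaccine = c("A", "A", "B", "B"),
  symptom = c("fever", "rash", "fever", "rash"),
  n = c(20, 80, 100, 900),
  stringsAsFactors = FALSE
)

# Marginal totals: n_i per vaccine, n_j per symptom, and the grand total n
df_toy$n_i <- ave(df_toy$n, df_toy$vaccine, FUN = sum)
df_toy$n_j <- ave(df_toy$n, df_toy$symptom, FUN = sum)
n_total <- sum(df_toy$n)

df_toy
n_total
```

Each row then carries everything needed to fill in its own $2 \times 2$ table: the pair count, the two marginal totals, and the grand total.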
We calculate the PRR [@evans2001] for each vaccine-symptom pair and use the lower bound of the two-sided 95% confidence interval of log(PRR) as the ranking statistic:
```{r}
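# MIN.n11 = 10: require at least 10 reports per pair; DECISION = 3 with
# RANKSTAT = 2: flag signals by the lower bound of the 95% CI of log(PRR)
lst_prr <- PRR(df_v, MIN.n11 = 10, DECISION = 3, RANKSTAT = 2)
# Sort by the ranking statistic (largest first) and keep the first 8 columns
df_prr <- lst_prr$SIGNALS[order(lst_prr$SIGNALS$"LB95(log(PRR))", decreasing = TRUE), 1:8]
row.names(df_prr) <- NULL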
```
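
For intuition about the ranking statistic, the sketch below computes a lower 95% bound of log(PRR) by hand for one pair, using the usual large-sample standard error of a log risk ratio. The helper name `lb95_log_prr()` and the counts are hypothetical, and whether `PRR()` uses exactly this variance formula internally is an assumption, not something checked against the package source.

```{r}
# Hypothetical helper: lower bound of the two-sided 95% CI of log(PRR),
# based on the large-sample standard error of a log risk ratio.
lb95_log_prr <- function(n_ij, n_i, n_j, n) {
  log_prr <- log((n_ij / n_i) / ((n_j - n_ij) / (n - n_i)))
  se <- sqrt(1 / n_ij - 1 / n_i + 1 / (n_j - n_ij) - 1 / (n - n_i))
  log_prr + qnorm(0.025) * se   # qnorm(0.025) is roughly -1.96
}

# Made-up counts, for illustration only
lb_toy <- lb95_log_prr(n_ij = 20, n_i = 100, n_j = 120, n = 1100)
lb_toy
```

A lower bound above 0 on the log scale means the corresponding interval for the PRR itself excludes 1. Ranking by the lower bound rather than by the point estimate penalizes pairs with few reports, because their intervals are wider.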
View the top-ranked vaccine-adverse event pairs:
```{r}
head(df_prr) %>% kable() %>% kable_styling()
```
```{r,echo=FALSE}
saveRDS(df_prr, file = "data-processed/df_prr.rds")
```