pgsCompaR is an R data package that contains performance metrics for polygenic risk score (PGS) development methods measured across five European biobanks.
This data package doesn't provide any helpful functions for comparing PGS. It only contains processed experimental data and documentation. The raw experimental data are also permissively licensed and publicly available, but are more difficult to work with.
The fastest way to install the development version of pgsCompaR is with devtools:
devtools::install_github("intervene-EU-H2020/pgsCompaR")
This data package depends only on base R. You can also download the built release and install it locally using install.packages().
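As a minimal sketch of the local install, assuming the built release has been downloaded as a source tarball into the working directory. The file name below is a hypothetical placeholder, not the name of an actual release file:

```r
# Hypothetical file name: replace with the release file you actually downloaded.
tarball <- "pgsCompaR_x.y.z.tar.gz"

if (file.exists(tarball)) {
  # repos = NULL makes install.packages() install from the local file
  # instead of looking the package up in a repository.
  install.packages(tarball, repos = NULL, type = "source")
} else {
  message("Release file not found: ", tarball)
}
```

The file.exists() guard is only there so the sketch fails with a readable message when the path is wrong.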
Development dependencies are required to run the scripts in data-raw/ that process the raw data and save the rda files in data/.
The simplest way to install the development dependencies is to use renv and restore the development profile:
$ git clone https://github.com/intervene-EU-H2020/pgsCompaR.git
$ cd pgsCompaR
$ R
R version 4.3.1 (2023-06-16) -- "Beagle Scouts"
...
> renv::activate(profile="dev")
> renv::restore()
You may need to install renv first, for example with install.packages("renv").
There are four datasets exported by this package:
Dataset | About |
---|---|
metrics | Polygenic risk score performance metrics table for single biobanks |
meta_res | Equivalent to metrics, but meta-analysed |
dst | Pairwise comparison of polygenic risk score development methods |
pv_mrg | Equivalent to dst, but meta-analysed |
Dataset documentation can be viewed in R the normal way:
library(pgsCompaR)
data(metrics)
?metrics
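To get a quick overview of all four datasets at once, something like the following may help. It is a sketch that assumes the datasets are tabular objects (so that dim() returns rows and columns) and makes no assumptions about column names. The requireNamespace() guard simply skips the loop when the package is not installed:

```r
# Names of the four datasets listed in the table above.
datasets <- c("metrics", "meta_res", "dst", "pv_mrg")

if (requireNamespace("pgsCompaR", quietly = TRUE)) {
  # Load every dataset into the global environment.
  data(list = datasets, package = "pgsCompaR")

  # Print the dimensions of each dataset (assumes tabular objects).
  for (name in datasets) {
    cat(name, ":", paste(dim(get(name)), collapse = " x "), "\n")
  }
}
```

From there, str(metrics) or head(metrics) shows the actual columns, and the ?metrics help page describes what each one means.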
The data are licensed with CC-BY-4.0.
If you reuse data from this package in published work please cite our publication:
Remo Monti, Lisa Eick, Georgi Hudjashov, Kristi Läll, Stavroula Kanoni, Brooke N. Wolford, Benjamin Wingfield, Oliver Pain, Sophie Wharrie, Bradley Jermy, Aoife McMahon, Tuomo Hartonen, Henrike Heyne, Nina Mars, Samuel Lambert, Kristian Hveem, Michael Inouye, David A. van Heel, Reedik Mägi, Pekka Marttinen, Samuli Ripatti, Andrea Ganna, Christoph Lippert. "Evaluation of polygenic scoring methods in five biobanks shows larger variation between biobanks than methods and finds benefits of ensemble learning" The American Journal of Human Genetics 2024. doi: https://doi.org/10.1016/j.ajhg.2024.06.003