Finemapping pipeline

This repository collects the scripts needed to run fine-mapping on genetic association studies. We use the Snakemake workflow management system, which is based on the Python language. For additional details, refer to the homepage of Snakemake.

In particular, we use the recently developed SuSiE algorithm through its R implementation, susieR. For further details on the algorithm, please refer to the original paper (Wang et al. 2020).

Overview

The pipeline starts from the summary statistics generated by regenie and applies a clumping step based on parameters defined in the configuration file. Afterwards, each clump is enlarged to a minimum size of 1 Mb, and if any clumps overlap, the overlapping regions are merged together. For each clump, the susieR algorithm is applied. If a credible set is found, it is reported in the summary file.
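The enlarge-and-merge step described above can be sketched in a few lines of Python. This is only an illustration and not the pipeline's actual code: the function names `enlarge` and `merge_overlapping`, the `(chromosome, start, end)` tuple layout, and the choice to pad symmetrically around the clump are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical region layout: (chromosome, start, end) in base pairs.
Region = Tuple[str, int, int]

MIN_SIZE = 1_000_000  # minimum clump size of 1 Mb, as stated in the overview


def enlarge(region: Region, min_size: int = MIN_SIZE) -> Region:
    """Pad a region until it spans at least min_size bp (symmetric padding is an assumption)."""
    chrom, start, end = region
    missing = min_size - (end - start)
    if missing <= 0:
        return region
    pad_left = missing // 2
    pad_right = missing - pad_left
    new_start = max(0, start - pad_left)
    # If the left side was clipped at position 0, move the leftover padding to the right.
    clipped = pad_left - (start - new_start)
    return (chrom, new_start, end + pad_right + clipped)


def merge_overlapping(regions: List[Region]) -> List[Region]:
    """Merge regions on the same chromosome whose intervals overlap."""
    merged: List[Region] = []
    for chrom, start, end in sorted(regions):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# Toy clumps, invented for illustration only.
clumps = [("1", 1_000_000, 1_200_000), ("1", 1_500_000, 1_600_000), ("2", 100_000, 150_000)]
regions = merge_overlapping([enlarge(c) for c in clumps])
print(regions)
```

In this toy input, the two clumps on chromosome 1 overlap after enlargement and end up as a single region, while the clump on chromosome 2 stays separate.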

Input:

Summary statistic files: pheno.regenie.gz.
Phenotype file (needed) containing the original phenotype used for the GWAS.
Genotype plink file sets [.bim, .fam, .bed] matching the GWAS analysis.

Output:

Installation

Requirements

- snakemake=8.4.8
- snakemake-executor-plugin-slurm
- git

Optional if already installed by the system administrator or already available in a conda environment.

See Install snakemake for further information and specific parameters.

conda create -n snakemake bioconda::snakemake bioconda::snakemake-executor-plugin-slurm

Pipeline installation

Now clone this repo into your working directory.

git clone https://github.com/EuracBiomedicalResearch/finemap_pipeline
cd finemap_pipeline

Write a configuration file

All the available parameters are defined through a configuration file written in YAML format. Take the file config/config.yaml as an example and modify it according to your needs.

Running the pipeline

Activate the conda environment

conda activate snakemake

Dry-run to see the number of jobs to be submitted

sbatch snakemake --configfile config/config.yaml -n

Submit the command to slurm

NB See the snakemake documentation on how to create a slurm profile to submit jobs.

Snakemake documentation

Snakemake profiles

sbatch snakemake --configfile config/config.yaml --profile ~/snake_prof/slurm \
  --executor slurm \
  --latency-wait 60 \
  --nolock

Output

The pipeline produces a summary TSV file with the lead variant for each credible set found in the analysis. The summary contains a subset of the original summary statistics.
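As a rough illustration of what "lead variant for each credible set" can mean, the sketch below keeps one variant per credible set from a toy per-variant table. It is not taken from the pipeline: the column names `ID`, `CS` and `PIP`, the toy rows, and the rule of choosing the variant with the highest posterior inclusion probability are all assumptions made for this example.

```python
import csv
import io

# Toy per-variant table; the columns ID, CS and PIP are hypothetical
# and the values are invented for illustration.
toy = """ID\tCS\tPIP
rs1\tL1\t0.10
rs2\tL1\t0.85
rs3\tL2\t0.40
rs4\tL2\t0.55
"""


def lead_variants(handle):
    """Return, for each credible set, the row with the highest PIP (assumed rule)."""
    best = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        cs = row["CS"]
        if cs not in best or float(row["PIP"]) > float(best[cs]["PIP"]):
            best[cs] = row
    return best


leads = lead_variants(io.StringIO(toy))
for cs, row in leads.items():
    print(cs, row["ID"], row["PIP"])
```

The same function accepts an open file handle instead of the in-memory string, provided the column names are adapted to the actual summary file.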

References

Wang, G., Sarkar, A., Carbonetto, P. & Stephens, M. (2020). A simple new approach to variable selection in regression, with application to genetic fine mapping. Journal of the Royal Statistical Society, Series B 82, 1273–1300. https://doi.org/10.1111/rssb.12388
