DFE analysis pipeline from Verta et al. (2021)

This repository contains the code and command lines for the re-sequencing calling pipeline and the distribution of fitness effects (DFE) analysis presented in Verta et al. (2021) (https://academic.oup.com/gbe/advance-article/doi/10.1093/gbe/evab059/6179807). Each analysis and preparation step lives in its own subdirectory; these are outlined below in rough sequential order.

A note on the cluster

Many of the pipeline scripts write and submit jobs on Puhti, the SLURM-based CSC computer cluster (https://docs.csc.fi/computing/overview/). To make a script write the batch file without submitting it, replace the line `from qsub import q_sub` with `from qsub import q_write as q_sub` at the top of the relevant Python script.
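If several scripts need switching at once, the substitution can be applied programmatically. The following is a minimal sketch, not part of the repository: the helper names `switch_to_write_only` and `patch_script` are hypothetical, and it assumes the import sits on a line of its own exactly as quoted above.

```python
from pathlib import Path

# The two import lines quoted in this README.
SUBMIT_IMPORT = "from qsub import q_sub"
WRITE_ONLY_IMPORT = "from qsub import q_write as q_sub"


def switch_to_write_only(source: str) -> str:
    """Return script text with the submitting import swapped for the write-only one.

    Hypothetical helper for illustration; not part of the pipeline.
    """
    patched = []
    for line in source.splitlines(keepends=True):
        # Only touch a line that is exactly the submitting import (ignoring
        # surrounding whitespace), so already-patched scripts are left alone.
        if line.strip() == SUBMIT_IMPORT:
            line = line.replace(SUBMIT_IMPORT, WRITE_ONLY_IMPORT)
        patched.append(line)
    return "".join(patched)


def patch_script(path: Path) -> bool:
    """Rewrite one script in place; return True if the file was changed."""
    original = path.read_text()
    updated = switch_to_write_only(original)
    if updated != original:
        path.write_text(updated)
        return True
    return False
```

Calling `patch_script(Path("my_script.py"))` (a placeholder file name) would rewrite that one file. Because the alias keeps the name `q_sub`, the rest of the script needs no changes.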

Python requirements

The scripts also make use of a number of Python modules:

anavar_utils: https://github.com/henryjuho/anavar_utils - code that makes working with anavar control files easier.

python_qsub_wrapper: https://github.com/henryjuho/python_qsub_wrapper - code to write and submit batch jobs from within the Python scripts.

sfs_utils: https://github.com/henryjuho/sfs_utils - code for extracting frequency data from VCF files.

pysam: https://pysam.readthedocs.io/en/latest/api.html - module for working with VCF, BED, FASTA and other files in Python.
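Before running the pipeline it can help to confirm that these modules are importable. The sketch below is an illustration only. The import names `qsub` and `pysam` are known (the former from the import line quoted earlier), while `anavar_utils` and `sfs_utils` are assumed to match their repository names and may need adjusting. The function name `find_missing` is hypothetical.

```python
import importlib.util

# Import names to check. "anavar_utils" and "sfs_utils" are ASSUMED to match
# the repository names; adjust them if the packages import under other names.
REQUIRED_MODULES = ["anavar_utils", "qsub", "sfs_utils", "pysam"]


def find_missing(modules):
    """Return the module names that cannot be found on the current Python path."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only looks the modules up and does not import them, so it does not trigger any cluster-specific side effects.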

Other directories

There are also the following directories that contain pipelines that did not end up in the paper:

homeoblock_alignments/: Pipeline for aligning homeoblocks in salmon; not used for any analysis.

coverage_impact/: A sanity check to see whether including low-coverage individuals, to maximise sample size and number of SNPs, was skewing the estimated DFE.

Top-level contents of the repository: annotation/, coverage_impact/, dfe/, divergence/, genome_alignment/, homeoblock_alignments/, read_mapping/, sfs/, summary_stats/, training_set/, variant_calling/ and README.md.
