DFE analysis pipeline from Verta et al. (2021)

This repository contains the code and command lines for the re-sequencing calling pipeline and DFE analysis presented in Verta et al. (2021) (https://academic.oup.com/gbe/advance-article/doi/10.1093/gbe/evab059/6179807). Each analysis and preparation step is contained in its own subdirectory; these are outlined below in rough sequential order.

A note on the cluster

Many of the pipeline scripts write and submit jobs on Puhti, the SLURM-based CSC computer cluster (https://docs.csc.fi/computing/overview/). To write the batch file without submitting the job, replace the line `from qsub import q_sub` with `from qsub import q_write as q_sub` at the top of the relevant Python scripts.
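The import substitution described above can also be applied programmatically when several scripts need changing. The following is a minimal sketch, not part of the pipeline: the function names `switch_to_write_only` and `patch_file` are hypothetical helpers introduced here for illustration, and only the two import lines themselves come from this README.

```python
import re
from pathlib import Path

# The two import lines quoted in this README.
SUBMIT_IMPORT_PATTERN = re.compile(r"^from qsub import q_sub[ \t]*$", flags=re.MULTILINE)
WRITE_ONLY_IMPORT = "from qsub import q_write as q_sub"


def switch_to_write_only(source: str) -> str:
    """Return the script text with the submitting import swapped for the write-only one.

    Only whole lines consisting of `from qsub import q_sub` are replaced, so
    calls to q_sub(...) further down the script are left untouched.
    """
    return SUBMIT_IMPORT_PATTERN.sub(WRITE_ONLY_IMPORT, source)


def patch_file(path: str) -> bool:
    """Rewrite one script in place (hypothetical helper); return True if it changed."""
    script = Path(path)
    original = script.read_text()
    patched = switch_to_write_only(original)
    if patched != original:
        script.write_text(patched)
    return patched != original
```

Because the pattern matches only the exact import line, running the helper twice on the same script leaves it unchanged the second time.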

Python requirements

The scripts also make use of a number of Python modules:

anavar_utils: https://github.com/henryjuho/anavar_utils - some code to make working with anavar control files easier.

python_qsub_wrapper: https://github.com/henryjuho/python_qsub_wrapper - some code to write and submit batch jobs from within the python scripts.

sfs_utils: https://github.com/henryjuho/sfs_utils - code for extracting frequency data from VCF files.

pysam: https://pysam.readthedocs.io/en/latest/api.html - module for working with VCF, BED, FASTA and other files in Python.
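Before running the pipeline it can help to confirm that these modules are importable in the active environment. The sketch below is an illustration under assumptions rather than part of the repository: `qsub` is the import name quoted earlier in this README and `pysam` is the package's documented import name, but `anavar_utils` and `sfs_utils` are assumed to be importable under their repository names, which may not hold. `missing_modules` is a hypothetical helper.

```python
from importlib.util import find_spec

# Top-level import names to check. ASSUMPTION: anavar_utils and sfs_utils are
# importable under their repository names; adjust if the actual module names differ.
REQUIRED_MODULES = ["anavar_utils", "qsub", "sfs_utils", "pysam"]


def missing_modules(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules were found.")
```

`find_spec` only locates a module without importing it, so the check does not execute any of the packages' code.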

Other directories

The following directories contain pipelines that did not end up in the paper:

homeoblock_alignments/: Pipeline for aligning homeoblocks in the salmon, not used for any analysis.

coverage_impact/: A sanity check to see whether including low-coverage individuals to maximise sample size and number of SNPs was skewing the estimated DFE.
