Setup Support Scripts

Matt Ravenhall edited this page Jan 24, 2019 · 1 revision

Helpful Snippets

This is a collection of Bash snippets to help prepare your variant files for SVPop.

Convert bcf to vcf.gz files

bcftools='/usr/bin/env bcftools'

# Convert each bcf file to an uncompressed vcf file
for sample in *.bcf; do
	# Remove the .bcf extension only, so other dots in the file name are kept
	ID="${sample%.bcf}"

	# Convert to vcf file
	${bcftools} view "${ID}.bcf" > "${ID}.vcf"
done

# Compress every vcf file in this directory (gzip replaces each .vcf with a .vcf.gz)
for sample in *.vcf; do
	gzip "${sample}"
done
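
Before moving on, it can be worth confirming that the compressed files are intact. The following is a minimal sketch, not part of SVPop: check_gz is a hypothetical helper name, and it relies only on gzip's built-in -t (test) option.

```shell
# Hypothetical helper: run gzip's integrity test on each file given as an argument.
# Prints one line per file and returns non-zero if any file fails the test.
check_gz() {
	status=0
	for f in "$@"; do
		if gzip -t "$f" 2>/dev/null; then
			printf 'OK\t%s\n' "$f"
		else
			printf 'FAILED\t%s\n' "$f"
			status=1
		fi
	done
	return $status
}

# Example: check every compressed vcf in the current directory
# check_gz *.vcf.gz
```

A FAILED line usually means the conversion or compression was interrupted, in which case that sample should be converted again.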

Split DELLY output to separate models

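samples=(*.vcf.gz)
models=(DEL DUP INS INV)

# Split by model
for sample in "${samples[@]}"; do
	# Remove the .vcf.gz extension only, so other dots in the file name are kept
	ID="${sample%.vcf.gz}"
	echo "${ID}"

	for model in "${models[@]}"; do
		# The output directory must exist before redirecting into it
		mkdir -p "./${model}"
		# Keep the '##' meta-information lines, then the records for this model.
		# Note: the '#CHROM' column header line is not copied; use grep '^#' if downstream tools need it.
		(zcat "${ID}.vcf.gz" | grep '^##' ; zcat "${ID}.vcf.gz" | grep "SVTYPE=${model}") | gzip > "./${model}/${ID}_${model}.vcf.gz"
	done
done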
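
As a quick sanity check after splitting, the number of records in each per-model file can be counted. This is a rough sketch rather than part of SVPop: count_records is a hypothetical helper, and gzip -dc is used in place of zcat only because it behaves the same way on more systems.

```shell
# Hypothetical helper: count the non-header records in a gzipped vcf.
count_records() {
	# grep -vc counts lines that do NOT start with '#'.
	# '|| true' keeps the exit status at zero when the count is 0.
	gzip -dc "$1" | grep -vc '^#' || true
}

# Print a "file<TAB>count" line for every split file that exists
for model in DEL DUP INS INV; do
	for f in ./"${model}"/*_"${model}".vcf.gz; do
		[ -e "$f" ] || continue
		printf '%s\t%s\n' "$f" "$(count_records "$f")"
	done
done
```

A count of 0 is not necessarily an error, since a sample may simply have no calls of that type, but a run of zeros across every sample suggests the SVTYPE labels differ from the ones in the models list.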

Create your inFile files (lists of raw vcfs)

Assuming your variant files are named <filePrefix>_<model>.vcf.gz, this writes one ForSVPop_<model>.txt file per model, each listing the absolute path of every matching file, one path per line.

for x in DEL DUP INS INV; do readlink -f *_${x}.vcf.gz > ForSVPop_${x}.txt; done
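
If no files match a model, the shell passes the unexpanded pattern to readlink, which may still print a path for it, so a list can end up pointing at a file that does not exist. The sketch below checks for that. It assumes the ForSVPop_<model>.txt naming used above, and check_infile is a hypothetical helper, not part of SVPop.

```shell
# Hypothetical helper: confirm that every path listed in an inFile exists.
# Prints each missing path to stderr and returns non-zero if any are missing.
check_infile() {
	missing=0
	while IFS= read -r path; do
		# Skip blank lines, report anything that is not a regular file
		if [ -n "$path" ] && [ ! -f "$path" ]; then
			echo "Missing: $path" >&2
			missing=1
		fi
	done < "$1"
	return $missing
}

# Check whichever per-model lists are present in the current directory
for x in DEL DUP INS INV; do
	[ -f "ForSVPop_${x}.txt" ] || continue
	check_infile "ForSVPop_${x}.txt" || echo "ForSVPop_${x}.txt lists missing files" >&2
done
```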