Wastewater Lineage De-convolution (#99)
* Add a titan wastewater variant calling workflow

* Fix formatting

* Fix formatting (for real this time)

* adjust default minlength

* adjust trimmomatic params

* Set trimmomatic minlen at workflow level

* Remove unused imports

* Add version capture task

* add meta block

* Update README.md

* Update README.md

* create titan_freyja workflow

* ensure executables are in path

* allow raw read inputs

* clean whitespace

* remove whitespace

* rename workflow and set header as samplename

* specify output file extension

* update versioning

* reset to jlevy docker for now

* small tweak

* update to staphb image

* allow user defined barcodes

* correct image tag

* revert to jlevy image

* add freyja plot workflow

* update to staphb freyja image

* add freyja plot workflow

* update freyja header for plot indexing

* adjust freyja commands

* fix interval flag

* fix plot options

* fix output variables

* adjust samplename array

* Update wf_freyja_plot.wdl

* add reference flag

* change default setting to plot summary

* Update wf_freyja_plot.wdl

* allow user-defined reference

* reference genome is optional

* close if statement

* fix task syntax

* allow user reference in ont workflows

* fix wdl syntax

* set user ref appropriately

* Output freyja demixed aggregate file

* fix collection array

* really fix collection array

* Update README.md

* Pull dockers from quay repo

* update summaries to include new outputs

* undo last commit -- wrong branch!

* allow for use of epi2me docker

* add medaka reference output

* fix typo

* modify bed dest path

* fix path

* Fix typo

* Update image digest

* comma

* capture reference from either artic or epi2me paths

* fix typo

* remove extra space

* add wildcard to ref path

* check for directory

* small tweak

* force zip

* small tweak 2

* add min/max length workflow vars

* Update test_clearlabs.yml

* Update test_ont.yml

* update to fastqscan

* Update test_ont.yml

* Update test_clearlabs.yml

* Update test_ont.yml

* check pass.vcf

* Default to freyja update_db

* Fix yml

* really fix yaml
kevinlibuit authored Feb 4, 2022
1 parent 2feacbf commit 65f4df0
Showing 9 changed files with 809 additions and 11 deletions.
15 changes: 15 additions & 0 deletions .dockstore.yml
@@ -70,6 +70,21 @@ workflows:
     primaryDescriptorPath: /workflows/wf_titan_fasta.wdl
     testParameterFiles:
       - empty.json
+  - name: Titan_WWVC
+    subclass: WDL
+    primaryDescriptorPath: /workflows/wf_titan_wwvc.wdl
+    testParameterFiles:
+      - empty.json
+  - name: Freyja_FASTQ
+    subclass: WDL
+    primaryDescriptorPath: /workflows/wf_freyja_fastq.wdl
+    testParameterFiles:
+      - empty.json
+  - name: Freyja_Plot
+    subclass: WDL
+    primaryDescriptorPath: /workflows/wf_freyja_plot.wdl
+    testParameterFiles:
+      - empty.json
   - name: TheiaCoV_Validate
     subclass: WDL
     primaryDescriptorPath: /workflows/wf_theiacov_validate.wdl
3 changes: 2 additions & 1 deletion README.md
@@ -4,4 +4,5 @@ Bioinformatics workflows for genomic characterization, submission preparation, a
 ### Contributors & Influence
 * Based on collaborative work with Andrew Lang, PhD & his [Genomic Analysis WDL workflows](https://github.com/AndrewLangvt/genomic_analyses)
 * Workflows and task development influenced by The Broad's [Viral Pipes](https://github.com/broadinstitute/viral-pipelines)
-* Titan Genomic Characterization workflows influenced by UPHL's [Cecret](https://github.com/UPHL-BioNGS/Cecret) & StaPH-B's [Monroe](https://staph-b.github.io/staphb_toolkit/workflow_docs/monroe/)
+* Titan workflows for genomic characterization influenced by UPHL's [Cecret](https://github.com/UPHL-BioNGS/Cecret) & StaPH-B's [Monroe](https://staph-b.github.io/staphb_toolkit/workflow_docs/monroe/)
+* The Titan workflow for waste water variant calling (Titan_WWVC) incorporates a modified version of the [CDPHE's WasteWaterVariantCalling WDL Workflow](https://github.com/CDPHE/WasteWaterVariantCalling).
25 changes: 16 additions & 9 deletions tasks/task_alignment.wdl
@@ -6,26 +6,34 @@ task bwa {
     File read1
     File? read2
     String samplename
-    String? reference_genome="/artic-ncov2019/primer_schemes/nCoV-2019/V3/nCoV-2019.reference.fasta"
+    File? reference_genome
     Int? cpus=6
   }

-  command {
+  command <<<
     # date and version control
     date | tee DATE
     echo "BWA $(bwa 2>&1 | grep Version )" | tee BWA_VERSION
     samtools --version | head -n1 | tee SAMTOOLS_VERSION

+    # set reference genome
+    if [[ ! -z "~{reference_genome}" ]]; then
+      echo "User reference identified; ~{reference_genome} will be utilized for alignment"
+      # move to primer_schemes dir; bwa fails if reference file not in this location
+      cp "~{reference_genome}" "/artic-ncov2019/primer_schemes/nCoV-2019/V3/nCoV-2019.reference.fasta"
+    fi
+
     # Map with BWA MEM
+    echo "Running bwa mem -t ~{cpus} bwa_reference.bwa ~{read1} ~{read2} | samtools sort | samtools view -F 4 -o ~{samplename}.sorted.bam "
     bwa mem \
-    -t ${cpus} \
-    ${reference_genome} \
-    ${read1} ${read2} |\
-    samtools sort | samtools view -F 4 -o ${samplename}.sorted.bam
+    -t ~{cpus} \
+    "/artic-ncov2019/primer_schemes/nCoV-2019/V3/nCoV-2019.reference.fasta" \
+    ~{read1} ~{read2} |\
+    samtools sort | samtools view -F 4 -o ~{samplename}.sorted.bam

     # index BAMs
-    samtools index ${samplename}.sorted.bam
-  }
+    samtools index ~{samplename}.sorted.bam
+  >>>

   output {
     String bwa_version = read_string("BWA_VERSION")
@@ -40,7 +48,6 @@ task bwa {
     cpu: 2
     disks: "local-disk 100 SSD"
     preemptible: 0
-    maxRetries: 3
   }
 }
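
The reference handling added to the `bwa` task (an optional user reference copied over the default file at a fixed path) can be tried outside WDL. The following is a minimal shell sketch, not the task itself: the helper name `stage_reference` and the temporary files are hypothetical stand-ins for the container's `/artic-ncov2019/...` location, and an empty string stands in for an unset `File?` input.

```shell
set -eu

# Hypothetical helper: if a user reference is given, copy it over the
# default path; either way, print the path the aligner should read.
stage_reference() {
  user_ref="$1"      # empty string stands in for an unset WDL File? input
  default_ref="$2"   # fixed location the aligner always reads from
  if [ -n "${user_ref}" ]; then
    echo "User reference identified; ${user_ref} will be used for alignment" >&2
    cp "${user_ref}" "${default_ref}"
  fi
  printf '%s\n' "${default_ref}"
}

# Demo with throwaway files instead of the container path.
workdir="$(mktemp -d)"
printf '>default\nACGT\n' > "${workdir}/default.fasta"
printf '>custom\nTTTT\n'  > "${workdir}/custom.fasta"

# No user reference: the default file is left untouched.
ref_unset="$(stage_reference "" "${workdir}/default.fasta")"
header_unset="$(head -n 1 "${ref_unset}")"

# User reference supplied: it replaces the default file's contents.
ref_set="$(stage_reference "${workdir}/custom.fasta" "${workdir}/default.fasta")"
header_set="$(head -n 1 "${ref_set}")"

echo "unset: ${header_unset}"
echo "set:   ${header_set}"
```

Writing to one fixed path keeps the downstream `bwa mem` call identical in both cases. The sketch copies only the FASTA file; whether any files alongside it need regenerating is not something the diff shows.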

59 changes: 59 additions & 0 deletions tasks/task_taxonID.wdl
@@ -309,3 +309,62 @@ task nextclade_output_parser_one_sample {
     String nextclade_aa_dels = read_string("NEXTCLADE_AADELS")
   }
 }
+task freyja_one_sample {
+  input {
+    File primer_trimmed_bam
+    String samplename
+    File reference_genome
+    File? freyja_usher_barcodes
+    Boolean update_db = true
+    String docker = "staphb/freyja:1.2"
+  }
+  command <<<
+    # configure barcode settings and capture version
+    #if [[ ! -z "~{freyja_usher_barcodes}" ]]; then
+    # #capture database info
+    # azfreyja_usher_barcode_version=$(basename -- "~{freyja_usher_barcodes}")
+    # echo "here"
+    # #set environment with user-defined db
+    # mv ~{freyja_usher_barcodes} /opt/conda/envs/freyja-env/lib/python3.7/site-packages/freyja/data/usher_barcodes.csv
+    #else
+    # # update db if specified
+    # if ~{update_db}; then
+    # freyja update
+    # freyja_usher_barcode_version="freyja update: $(date +"%Y-%m-%d")"
+    # else
+    # freyja_usher_barcode_version="unmodified from freyja container: ~{docker}"
+    # fi
+    #fi
+    # always update freyja barcodes until v1.3.1 release (will allow user-defined ref files)
+    freyja update
+    freyja_usher_barcode_version="freyja update: $(date +"%Y-%m-%d")"
+    echo ${freyja_usher_barcode_version} | tee FREYJA_BARCODES
+    # Call variants and capture sequencing depth information
+    freyja variants ~{primer_trimmed_bam} --variants ~{samplename}_freyja_variants.tsv --depths ~{samplename}_freyja_depths.tsv --ref ~{reference_genome}
+    # Demix variants
+    freyja demix ~{samplename}_freyja_variants.tsv ~{samplename}_freyja_depths.tsv --output ~{samplename}_freyja_demixed.tmp
+    # Adjust output header
+    echo -e "\t/~{samplename}" > ~{samplename}_freyja_demixed.tsv
+    tail -n+2 ~{samplename}_freyja_demixed.tmp >> ~{samplename}_freyja_demixed.tsv
+  >>>
+  runtime {
+    memory: "4 GB"
+    cpu: 2
+    docker: "~{docker}"
+    disks: "local-disk 100 HDD"
+  }
+  output {
+    File freyja_variants = "~{samplename}_freyja_variants.tsv"
+    File freyja_depths = "~{samplename}_freyja_depths.tsv"
+    File freyja_demixed = "~{samplename}_freyja_demixed.tsv"
+    String freyja_barcode_version = read_string("FREYJA_BARCODES")
+  }
+}
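
The "Adjust output header" step at the end of `freyja_one_sample` swaps the first line of the demix output for a tab followed by `/<samplename>`, then appends the remaining lines unchanged. A minimal shell sketch of just that step follows. It assumes only that the first line of the `.tmp` file is a header row; the sample name and file contents are made-up placeholders, not real Freyja output.

```shell
set -eu

samplename="sample1"   # hypothetical sample name
workdir="$(mktemp -d)"
tmp="${workdir}/${samplename}_freyja_demixed.tmp"
out="${workdir}/${samplename}_freyja_demixed.tsv"

# Placeholder stand-in for the demix output: a header line plus two rows.
printf '\toriginal_header\nrow1\tvalue1\nrow2\tvalue2\n' > "${tmp}"

# Write the new header (a tab, then "/<samplename>"), then append every
# line after the original header.
printf '\t/%s\n' "${samplename}" > "${out}"
tail -n +2 "${tmp}" >> "${out}"

header="$(head -n 1 "${out}")"
line_count="$(wc -l < "${out}" | tr -d ' ')"
cat "${out}"
```

`printf` is used here in place of `echo -e`, whose escape handling differs between shells; inside the task's container either form may behave the same.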
2 changes: 1 addition & 1 deletion tasks/task_versioning.wdl
@@ -8,7 +8,7 @@ task version_capture {
     volatile: true
   }
   command <<<
-    PHVG_Version="PHVG v1.5.3"
+    PHVG_Version="PHVG v1.6.0-dev"
     ~{default='' 'export TZ=' + timezone}
     date +"%Y-%m-%d" > TODAY
     echo $PHVG_Version > PHVG_VERSION
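
In `version_capture`, the placeholder `~{default='' 'export TZ=' + timezone}` should render to an `export TZ=...` line when the optional `timezone` input is set and to an empty line otherwise. A rough shell equivalent is sketched below. It assumes an empty string stands in for an unset input; the helper name `render_tz_line` and the example timezone are hypothetical.

```shell
set -eu

# Hypothetical helper mirroring the placeholder: emit an export line only
# when a timezone value is supplied.
render_tz_line() {
  if [ -n "$1" ]; then
    printf 'export TZ=%s\n' "$1"
  fi
}

line_unset="$(render_tz_line "")"                  # renders to nothing
line_set="$(render_tz_line "America/New_York")"    # renders to an export line

# Apply the rendered line, then capture version and date as the task does.
PHVG_Version="PHVG v1.6.0-dev"
eval "${line_set}"
today="$(date +"%Y-%m-%d")"
echo "${PHVG_Version} ${today} (TZ=${TZ})"
```

Because the whole expression falls back to an empty string, an unset input leaves the container's default timezone in effect for the `date` call.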