This pipeline analyzes rabies sequencing data generated on the Oxford Nanopore MinION. It performs quality control, species identification, species abundance estimation, read alignment, SNP calling, and variant annotation.
- Quality control (FastQC)
- Species identification (Kraken2)
- Species abundance estimation (Bracken)
- Read alignment (Minimap2)
- SNP calling (choice of Sniffles or BCFtools)
- Variant annotation (SnpEff)
graph TD
A[Input: FASTQ/BAM] --> B[Quality Control]
B --> C[Species Identification]
C --> D[Species Abundance Estimation]
A --> E[Read Alignment]
E --> F{SNP Calling}
F -->|Option 1| G[Sniffles]
F -->|Option 2| H[BCFtools]
G --> I[Variant Annotation]
H --> I
I --> J[Final Output]
style A fill:#f9d79b,stroke:#f39c12,stroke-width:2px
style B fill:#aed6f1,stroke:#3498db,stroke-width:2px
style C fill:#aed6f1,stroke:#3498db,stroke-width:2px
style D fill:#aed6f1,stroke:#3498db,stroke-width:2px
style E fill:#aed6f1,stroke:#3498db,stroke-width:2px
style F fill:#f5b7b1,stroke:#e74c3c,stroke-width:2px
style G fill:#d5f5e3,stroke:#2ecc71,stroke-width:2px
style H fill:#d5f5e3,stroke:#2ecc71,stroke-width:2px
style I fill:#aed6f1,stroke:#3498db,stroke-width:2px
style J fill:#f9d79b,stroke:#f39c12,stroke-width:2px
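As a rough illustration of the stages in the diagram, the sketch below prints one plausible command per stage without running anything. It is not taken from the pipeline's source: the file names (`reads.fastq`, `ref.fa`, `aln.bam`, `calls.vcf`), the `SNPEFF_DB` database name, the `map-ont` preset, and the use of `samtools sort` are all assumptions made for the example, and the pipeline's real invocations may differ.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints one plausible command per pipeline stage.
# All file names, SNPEFF_DB, the map-ont preset and the samtools step are
# hypothetical placeholders, not the pipeline's actual commands.

plan_steps() {
  local caller="$1"   # "sniffles" or "bcftools", mirroring --snp_caller
  echo "fastqc reads.fastq -o qc"
  echo "kraken2 --db /path/to/kraken2_db --report kraken2.report --output kraken2.out reads.fastq"
  echo "bracken -d /path/to/bracken_db -i kraken2.report -o bracken.S.txt -r 150 -l S"
  echo "minimap2 -ax map-ont ref.fa reads.fastq | samtools sort -o aln.bam -"
  # Branch on the caller, as in the decision node of the diagram.
  case "$caller" in
    sniffles) echo "sniffles --input aln.bam --vcf calls.vcf" ;;
    bcftools) echo "bcftools mpileup -f ref.fa aln.bam | bcftools call -mv -o calls.vcf" ;;
    *) echo "unknown SNP caller: $caller" >&2; return 1 ;;
  esac
  echo "snpEff ann SNPEFF_DB calls.vcf > calls.ann.vcf"
}

plan_steps bcftools
```

Calling `plan_steps sniffles` prints the same plan with the Sniffles branch in place of the BCFtools one.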
- Install Nextflow (>=23.04.0)
- Install any of Docker, Singularity, Podman, Shifter or Charliecloud for full pipeline reproducibility. You can use Conda both to install Nextflow itself and to manage software within pipelines, but please use it within pipelines only as a last resort; see docs.
- Download the pipeline and test it on a minimal dataset with a single command:
nextflow run /path/to/pipeline -profile test,YOURPROFILE --outdir <OUTDIR>
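Before running the test command, it can help to confirm that the prerequisites from the installation steps are on your `PATH`. The following is a minimal sketch, not part of the pipeline. The helper names (`first_available`, `version_ge`, `preflight`) are made up for this example, `ch-run` is assumed to be the Charliecloud executable, and the version parsing assumes that `nextflow -version` prints a line such as `version 23.04.1 build 5866`.

```shell
#!/usr/bin/env bash
# Pre-flight sketch: report whether Nextflow and a software manager are on PATH.
# Helper names are hypothetical; nothing here is provided by the pipeline.

# Print the first argument that resolves to an executable; fail if none does.
first_available() {
  local cmd
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
      return 0
    fi
  done
  return 1
}

# Succeed when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), as shipped with GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

preflight() {
  local engine nf_version
  if command -v nextflow >/dev/null 2>&1; then
    # Assumption: the banner contains a line like "version 23.04.1 build 5866".
    nf_version="$(nextflow -version 2>/dev/null \
      | sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p' | head -n 1)"
    if version_ge "${nf_version:-0}" "23.04.0"; then
      echo "nextflow ${nf_version}: OK"
    else
      echo "nextflow ${nf_version:-unknown}: older than 23.04.0"
    fi
  else
    echo "nextflow: not found"
  fi
  if engine="$(first_available docker singularity podman shifter ch-run conda)"; then
    echo "software manager: ${engine}"
  else
    echo "software manager: none found"
  fi
}

preflight
```

Conda is listed last so that a container engine is reported first when one is available, in line with the advice to use Conda only as a last resort.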
nextflow run main.nf -profile conda \
--input 'path/to/your/files/*.{fastq,bam}' \
--input_format auto \
--snp_caller bcftools \
--kraken2_db /path/to/kraken2_db \
--bracken_db /path/to/bracken_db \
--read_length 150 \
--bracken_levels 'S,G,F' \
--outdir results
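For repeated runs, the same parameters can be kept in a file and passed with Nextflow's `-params-file` option instead of being typed on the command line. The sketch below writes such a file using only the parameter names and placeholder values from the command above. The file name `params.yaml` is an arbitrary choice, and whether the pipeline treats YAML-typed values (for example the integer `read_length`) exactly like their command-line equivalents is an assumption worth verifying.

```shell
#!/usr/bin/env bash
# Sketch: store run parameters in a YAML file for use with -params-file.
# params.yaml is an arbitrary name; values are the placeholders from the README.
PARAMS_FILE="${PARAMS_FILE:-params.yaml}"

# The quoted heredoc delimiter stops the shell from expanding the glob.
cat > "$PARAMS_FILE" <<'EOF'
input: 'path/to/your/files/*.{fastq,bam}'
input_format: auto
snp_caller: bcftools
kraken2_db: /path/to/kraken2_db
bracken_db: /path/to/bracken_db
read_length: 150
bracken_levels: 'S,G,F'
outdir: results
EOF

# Print the matching launch command rather than running it.
echo "nextflow run main.nf -profile conda -params-file $PARAMS_FILE"
```

Switching to Sniffles then only requires changing the `snp_caller` value in the file.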