LYSSA: A Rabies Analysis Pipeline

Nextflow | run with conda | run with docker | run with singularity

Introduction

This pipeline analyzes rabies sequencing data generated on the Oxford Nanopore MinION. It performs quality control, species identification, abundance estimation, SNP calling, and variant annotation.

Pipeline summary

  1. Quality control (FastQC)
  2. Species identification (Kraken2)
  3. Species abundance estimation (Bracken)
  4. Read alignment (Minimap2)
  5. SNP calling (choice of Sniffles or BCFtools)
  6. Variant annotation (SnpEff)

Pipeline Overview

graph TD
    A[Input: FASTQ/BAM] --> B[Quality Control]
    B --> C[Species Identification]
    C --> D[Species Abundance Estimation]
    A --> E[Read Alignment]
    E --> F{SNP Calling}
    F -->|Option 1| G[Sniffles]
    F -->|Option 2| H[BCFtools]
    G --> I[Variant Annotation]
    H --> I
    I --> J[Final Output]
    
    style A fill:#f9d79b,stroke:#f39c12,stroke-width:2px    
    style B fill:#aed6f1,stroke:#3498db,stroke-width:2px     
    style C fill:#aed6f1,stroke:#3498db,stroke-width:2px     
    style D fill:#aed6f1,stroke:#3498db,stroke-width:2px    
    style E fill:#aed6f1,stroke:#3498db,stroke-width:2px     
    style F fill:#f5b7b1,stroke:#e74c3c,stroke-width:2px     
    style G fill:#d5f5e3,stroke:#2ecc71,stroke-width:2px    
    style H fill:#d5f5e3,stroke:#2ecc71,stroke-width:2px    
    style I fill:#aed6f1,stroke:#3498db,stroke-width:2px    
    style J fill:#f9d79b,stroke:#f39c12,stroke-width:2px     

Quick Start

  1. Install Nextflow (>=23.04.0)

  2. Install any of Docker, Singularity, Podman, Shifter or Charliecloud for full pipeline reproducibility. (Conda can be used both to install Nextflow itself and to manage software within pipelines, but please use it within pipelines only as a last resort; see docs.)

  3. Download the pipeline and test it on a minimal dataset with a single command:

    nextflow run /path/to/pipeline -profile test,YOURPROFILE --outdir <OUTDIR>
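
Before running the test command, it can help to confirm that the installed Nextflow meets the minimum version from step 1. The following is a minimal sketch, not part of the pipeline itself: the `version_ge` helper is a hypothetical name, and the comparison assumes a `sort` that supports `-V` (GNU coreutils or a recent BSD/macOS `sort`).

```shell
#!/usr/bin/env bash
# Sketch: check that Nextflow is at least the version required in step 1.
# version_ge is a hypothetical helper; it relies on `sort -V` being available.

required="23.04.0"

version_ge() {
  # Succeeds when $1 >= $2 under version ordering:
  # the smaller of the two, as ranked by sort -V, must be $2.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

if command -v nextflow >/dev/null 2>&1; then
  # Take the first x.y.z string printed by `nextflow -version`.
  installed="$(nextflow -version 2>/dev/null | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n1)"
  if [ -n "$installed" ] && version_ge "$installed" "$required"; then
    echo "Nextflow $installed satisfies >= $required"
  else
    echo "Nextflow ${installed:-unknown} is older than $required (or could not be parsed)"
  fi
else
  echo "nextflow not found on PATH"
fi
```

If the check reports an older version, updating Nextflow before launching the `test` profile avoids version-related failures at startup.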

To run this pipeline with Bracken for multiple taxonomic levels, you would use:

nextflow run main.nf -profile conda \
  --input 'path/to/your/files/*.{fastq,bam}' \
  --input_format auto \
  --snp_caller bcftools \
  --kraken2_db /path/to/kraken2_db \
  --bracken_db /path/to/bracken_db \
  --read_length 150 \
  --bracken_levels 'S,G,F' \
  --outdir results
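
To illustrate what `--bracken_levels 'S,G,F'` asks for, the sketch below expands a comma-separated level list into one Bracken command per level and prints the commands instead of running them. This is an assumption about how the levels map onto Bracken calls, not the pipeline's actual implementation: the `bracken_cmds` function, the `sample.kreport` input, and the `bracken_<level>.tsv` output names are hypothetical. The `-d`, `-i`, `-o`, `-r`, and `-l` options are standard Bracken flags.

```shell
#!/usr/bin/env bash
# Sketch: turn a comma-separated level list into one Bracken command per level.
# Commands are printed (dry run), not executed. Function and file names are
# hypothetical and are not taken from the pipeline.

bracken_cmds() {
  local levels="$1" db="$2" read_len="$3" report="$4"
  local IFS=','   # split the level list on commas inside this function only
  local level
  for level in $levels; do
    # -d database, -i Kraken2 report, -o output, -r read length, -l taxonomic level
    echo "bracken -d ${db} -i ${report} -o bracken_${level}.tsv -r ${read_len} -l ${level}"
  done
}

# Same values as the example invocation above: species, genus, family.
bracken_cmds 'S,G,F' /path/to/bracken_db 150 sample.kreport
```

Each level gets its own abundance table, so species (`S`), genus (`G`), and family (`F`) estimates can be inspected separately.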