Quick update of README to replace ncov2019 with mpxv (#2)
dfornika authored Aug 2, 2022
1 parent ebc34c2 commit 7d9715e
Showing 4 changed files with 20 additions and 62 deletions.
66 changes: 12 additions & 54 deletions README.md
@@ -1,58 +1,32 @@
# ncov2019-artic-nf
A Nextflow pipeline for running the ARTIC network's fieldbioinformatics tools (https://github.com/artic-network/fieldbioinformatics), with a focus on ncov2019
# mpxv-artic-nf
A Nextflow pipeline for running the ARTIC network's fieldbioinformatics tools (https://github.com/artic-network/fieldbioinformatics), with a focus on monkeypox virus (mpxv).

![push master](https://github.com/BCCDC-PHL/ncov2019-artic-nf/actions/workflows/push_master.yml/badge.svg)
![push master](https://github.com/BCCDC-PHL/mpxv-artic-nf/actions/workflows/push_master.yml/badge.svg)

#### Introduction

------------

This Nextflow pipeline automates the ARTIC network [nCoV-2019 novel coronavirus bioinformatics protocol](https://artic.network/ncov-2019/ncov2019-bioinformatics-sop.html "nCoV-2019 novel coronavirus bioinformatics protocol"). The upstream repository ([connor-lab/ncov2019-artic-nf](https://github.com/connor-lab/ncov2019-artic-nf)) was created to aid the harmonisation of the analysis of sequencing data generated by the [COG-UK](https://github.com/COG-UK) project. This fork ([BCCDC-PHL/ncov2019-artic-nf](https://github.com/BCCDC-PHL/ncov2019-artic-nf)) has a few modifications designed to support the SARS-CoV-2 sequencing efforts at the BC Centre for Disease Control Public Health Laboratory, and to conform to standardization efforts in the context of the [CanCOGeN](https://www.genomecanada.ca/en/cancogen) project. It will turn SARS-CoV-2 sequencing data (Illumina or Nanopore) into consensus sequences and provide other helpful outputs to assist the project's sequencing centres with submitting data.
This pipeline is based on the [BCCDC-PHL/ncov2019-artic-nf](https://github.com/BCCDC-PHL/ncov2019-artic-nf) pipeline, which is a fork of the [connor-lab/ncov2019-artic-nf](https://github.com/connor-lab/ncov2019-artic-nf) pipeline. It has been modified to support analysis of monkeypox virus.


#### Quick-start
##### Illumina

```
nextflow run BCCDC-PHL/ncov2019-artic-nf [-profile conda,singularity,docker,slurm,lsf] \
nextflow run BCCDC-PHL/mpxv-artic-nf -profile conda \
--illumina --prefix "output_file_prefix" \
--bed /path/to/primers.bed \
--ref /path/to/ref.fa \
--primer_pairs_tsv /path/to/primer_pairs_tsv \
--composite_ref /path/to/human_and_sars-cov-2_composite_ref \
--directory /path/to/reads
--composite_ref /path/to/human_and_mpxv_composite_ref \
--directory /path/to/reads \
--outdir /path/to/outputs
```
You can also use cram file input by passing the --cram flag.
You can also specify cram file output by passing the --outCram flag.

For production use at large scale, where you will run the workflow many times, you can avoid cloning the scheme repository, creating an ivar bed file and indexing the reference every time by supplying both `--bed /path/to/ivar-compatible.bed` and `--ref /path/to/bwa-indexed/ref.fa`.

Alternatively you can avoid just the cloning of the scheme repository to remain on a fixed revision of it over time by passing --schemeRepoURL /path/to/own/clone/of/github.com/artic-network/artic-ncov2019. This removes any internet access from the workflow except for the optional upload steps.
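
A minimal sketch of wrapping the Illumina quick-start command in a small script that checks its inputs before launching. The flags are the ones shown in the quick-start above; the helper names `check_inputs` and `build_mpxv_cmd`, and all example paths, are hypothetical and not part of the pipeline. The script only prints the command rather than running it.

```shell
# Sketch only: helper names and example paths are hypothetical placeholders.

check_inputs() {
  # Report every path that does not exist; return non-zero if any is missing.
  status=0
  for p in "$@"; do
    if [ ! -e "$p" ]; then
      echo "missing input: $p" >&2
      status=1
    fi
  done
  return "$status"
}

build_mpxv_cmd() {
  # $1 prefix, $2 primer bed, $3 reference, $4 primer-pairs TSV,
  # $5 composite reference, $6 reads directory, $7 output directory
  printf '%s' "nextflow run BCCDC-PHL/mpxv-artic-nf -profile conda"
  printf ' %s' --illumina --prefix "$1" --bed "$2" --ref "$3" \
    --primer_pairs_tsv "$4" --composite_ref "$5" \
    --directory "$6" --outdir "$7"
  printf '\n'
}

# Example: print (not run) the command for a run called "run1".
build_mpxv_cmd run1 primers.bed ref.fa primer_pairs.tsv composite_ref reads outputs
```

Calling `check_inputs` on the bed file, reference and reads directory before launching lets a typo in a path fail immediately instead of partway through a run.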

##### Nanopore
###### Nanopolish

```
nextflow run BCCDC-PHL/ncov2019-artic-nf [-profile conda,singularity,docker,slurm,lsf] \
--nanopolish --prefix "output_file_prefix" \
--basecalled_fastq /path/to/directory \
--fast5_pass /path/to/directory \
--sequencing_summary /path/to/sequencing_summary.txt
```

###### Medaka

```
nextflow run connor-lab/ncov2019-artic-nf [-profile conda,singularity,docker,slurm,lsf] \
--medaka --prefix "output_file_prefix" \
--basecalled_fastq /path/to/directory \
--fast5_pass /path/to/directory \
--sequencing_summary /path/to/sequencing_summary.txt
```

#### Installation
An up-to-date version of Nextflow is required because the pipeline is written in DSL2. Following the instructions at https://www.nextflow.io/ to download and install Nextflow should get you a recent-enough version.

#### Containers
This repo contains both [Singularity](https://sylabs.io/guides/3.0/user-guide/index.html) and Dockerfiles. You can build the Singularity containers locally by running `scripts/build_singularity_containers.sh` and use them with `-profile singularity`. The containers will be available from Docker/Singularityhub shortly.

#### Conda
The repo contains an `environment.yml` file, from which the correct conda env is built automatically if `-profile conda` is specified in the command. Although you'll need `conda` installed, this is probably the easiest way to run this pipeline.
@@ -66,23 +40,7 @@ By default, the pipeline just runs on the local machine. You can specify `-profi
You can use multiple profiles at once, separating them with a comma. This is described in the Nextflow [documentation](https://www.nextflow.io/docs/latest/config.html#config-profiles)
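
As a small illustration of combining profiles, the sketch below joins profile names with commas for the `-profile` option. The helper `join_profiles` is hypothetical, and the profile names `conda` and `slurm` are taken from the original ncov2019 quick-start; whether both are still defined in this fork is an assumption.

```shell
# Sketch only: join_profiles is a hypothetical helper, not part of the pipeline.
join_profiles() {
  # Join all arguments with commas, e.g. "conda slurm" -> "conda,slurm".
  joined=""
  for name in "$@"; do
    joined="${joined:+$joined,}$name"
  done
  printf '%s\n' "$joined"
}

# Example: conda-managed software on a Slurm cluster (profile names assumed).
echo "nextflow run BCCDC-PHL/mpxv-artic-nf -profile $(join_profiles conda slurm)"
```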

#### Config
Common configuration options are set in `conf/base.config`. Workflow-specific configuration options are set in `conf/nanopore.config` and `conf/illumina.config`. They are described and set to sensible defaults (as suggested in the [nCoV-2019 novel coronavirus bioinformatics protocol](https://artic.network/ncov-2019/ncov2019-bioinformatics-sop.html "nCoV-2019 novel coronavirus bioinformatics protocol")).

##### Options
- `--outdir` sets the output directory.
- `--bwa` to swap to bwa for mapping (nanopore only).

##### Workflows

###### Nanopore
Use `--nanopolish` or `--medaka` to run these workflows. `--basecalled_fastq` should point to a directory created by `guppy_basecaller` (if you ran with no barcodes), or `guppy_barcoder` (if you ran with barcodes). It is imperative that the following `guppy_barcoder` command be used for demultiplexing:

```
guppy_barcoder --require_barcodes_both_ends -i run_name -s output_directory --arrangements_files "barcode_arrs_nb12.cfg barcode_arrs_nb24.cfg"
```

###### Illumina
The Illumina workflow leans heavily on the excellent [ivar](https://github.com/andersen-lab/ivar) for primer trimming and consensus making. This workflow will be updated to follow ivar, as it's also in very active development! Use `--illumina` to run the Illumina workflow. Use `--directory` to point to an Illumina output directory usually coded something like: `<date>_<machine_id>_<run_no>_<some_zeros>_<flowcell>`. The workflow will recursively grab all fastq files under this directory, so be sure that what you want is in there, and what you don't, isn't!
Common configuration options are set in `conf/base.config`. Workflow-specific configuration options are set in `conf/illumina.config`. They are described and set to sensible defaults (as suggested in the [nCoV-2019 novel coronavirus bioinformatics protocol](https://artic.network/ncov-2019/ncov2019-bioinformatics-sop.html "nCoV-2019 novel coronavirus bioinformatics protocol")).

Important config options are:

@@ -94,7 +52,7 @@ Important config options are:
| `mpileupDepth` | `100000` | Mpileup depth for ivar |
| `varFreqThreshold` | `0.75` | ivar/freebayes frequency threshold for consensus variant |
| `varMinFreqThreshold` | `0.25` | ivar/freebayes frequency threshold for ambiguous variant |
| `varMinDepth` | `10` | Minimum coverage depth to call variant |
| `varMinDepth` | `10` | Minimum coverage depth to call variant |
| `ivarMinVariantQuality` | `20` | ivar minimum mapping quality to call variant |
| `downsampleMappingQuality` | `20` | Exclude reads below this mapping quality while downsampling |
| `downsampleAmpliconSubdivisions` | `3` | Number of times amplicons are subdivided to determine locations of checkpoints to test for depth while downsampling |
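
One way to change the defaults in the table is with a small override file passed through Nextflow's `-c` option. The sketch below assumes the options live under the `params` scope of the pipeline's config, which is not confirmed by this diff. The helper `write_override_config`, the file name `custom.config`, and the chosen values are hypothetical.

```shell
# Sketch only: assumes the table's options are defined under `params`.
# write_override_config, custom.config and the values are hypothetical.
write_override_config() {
  # $1: path of the override config file to create
  cat > "$1" <<'EOF'
params {
    varMinDepth = 20
    ivarMinVariantQuality = 30
}
EOF
}

write_override_config custom.config

# Print (not run) the start of the command; append the usual quick-start options.
echo "nextflow run BCCDC-PHL/mpxv-artic-nf -profile conda -c custom.config --illumina"
```

Keeping overrides in a separate file leaves `conf/illumina.config` untouched, so the defaults in the table stay visible alongside the changes.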
6 changes: 3 additions & 3 deletions environments/illumina/Dockerfile
@@ -1,10 +1,10 @@
FROM continuumio/miniconda3:latest
LABEL authors="Matt Bull" \
description="Docker image containing all requirements for an Illumina ncov2019 pipeline"
description="Docker image containing all requirements for an Illumina mpxv pipeline"

COPY environments/extras.yml /extras.yml
COPY environments/illumina/environment.yml /environment.yml
RUN apt-get update && apt-get install -y curl g++ git make procps && apt-get clean -y
RUN /opt/conda/bin/conda env create -f /environment.yml
RUN /opt/conda/bin/conda env update -f /extras.yml -n artic-ncov2019-illumina && /opt/conda/bin/conda clean -a
ENV PATH=/opt/conda/envs/artic-ncov2019-illumina/bin:$PATH
RUN /opt/conda/bin/conda env update -f /extras.yml -n artic-mpxv-illumina && /opt/conda/bin/conda clean -a
ENV PATH=/opt/conda/envs/artic-mpxv-illumina/bin:$PATH
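
Because the `COPY` instructions above use paths relative to the repository root (`environments/extras.yml`, `environments/illumina/environment.yml`), the image presumably has to be built with the repository root as the build context. A minimal sketch of such a build command follows. The helper `docker_build_cmd` and the image tag are hypothetical, and the command is printed rather than executed.

```shell
# Sketch only: docker_build_cmd and the image tag are hypothetical.
docker_build_cmd() {
  # $1: image tag, $2: path to the repository root (used as build context)
  printf 'docker build -f %s/environments/illumina/Dockerfile -t %s %s\n' "$2" "$1" "$2"
}

# Example: print the build command when run from the repository root.
docker_build_cmd mpxv-artic-nf-illumina:local .
```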
8 changes: 4 additions & 4 deletions environments/illumina/Singularity
@@ -5,13 +5,13 @@ environments/illumina/environment.yml /environment.yml
environments/extras.yml /extras.yml
%labels
authors="Matt Bull"
description="Docker image containing all requirements for the ARTIC project's ncov2019 pipeline"
description="Docker image containing all requirements for the ARTIC project's mpxv pipeline"
%post

apt-get update && apt-get install -y g++ git make procps rsync && apt-get clean -y
/opt/conda/bin/conda env create -f /environment.yml
/opt/conda/bin/conda env update -f /extras.yml -n artic-ncov2019-illumina
PATH=/opt/conda/envs/artic-ncov2019-illumina/bin:$PATH
/opt/conda/bin/conda env update -f /extras.yml -n artic-mpxv-illumina
PATH=/opt/conda/envs/artic-mpxv-illumina/bin:$PATH

%environment
export PATH=/opt/conda/envs/artic-ncov2019-illumina/bin:$PATH
export PATH=/opt/conda/envs/artic-mpxv-illumina/bin:$PATH
2 changes: 1 addition & 1 deletion environments/illumina/environment.yml
@@ -1,4 +1,4 @@
name: artic-ncov2019-illumina
name: artic-mpxv-illumina
channels:
- conda-forge
- bioconda
