cidgoh_qc

Nextflow workflow for checking quality of NGS data

Introduction

cidgoh_qc is a Nextflow-based bioinformatics workflow for quality control (QC) analysis of next-generation sequencing (NGS) data.

How to install

  1. Install Nextflow (>=21.04.0).

TIPS: You can load Nextflow on the Cedar cluster like this:

$ module load nextflow/21.04.3

  2. Install Docker, Singularity, or Conda to manage software dependencies.

TIPS: Docker and Conda are not allowed on the Cedar cluster. Singularity is available there by default, so you don't need to install anything on Cedar.

  3. Download the source code from GitHub:

$ git clone https://github.com/cidgoh/cidgoh_qc.git
$ cd cidgoh_qc

TIPS: A default installation is already set up on the Cedar cluster at /project/rrg-whsiao-ab/shared_tools/cidgoh_qc
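
The minimum version from step 1 can be checked before running anything. The snippet below is a minimal sketch, not part of the workflow: the helper name version_ge is made up for illustration, and it assumes GNU sort (for the -V version-sort option) and that `nextflow -version` prints a version number of the form x.y.z.

```shell
#!/usr/bin/env bash
# Minimum Nextflow version stated in the install steps.
required="21.04.0"

# version_ge A B: succeeds when version A >= version B.
# Hypothetical helper; relies on GNU sort's -V (version sort) option.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v nextflow >/dev/null 2>&1; then
  # Take the first x.y.z number that `nextflow -version` prints.
  installed="$(nextflow -version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
  if version_ge "$installed" "$required"; then
    echo "Nextflow $installed is new enough (>= $required)"
  else
    echo "Nextflow $installed is too old; need >= $required"
  fi
else
  echo "nextflow not found on PATH"
fi
```

On Cedar, run it after `module load nextflow/21.04.3` so that nextflow is on the PATH.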

How to use

Use Singularity, Docker, or Conda to manage dependencies

$ nextflow run ./main.nf -profile <conda/singularity> --input samplesheet.csv --adapter_trim_mode <cutadapt/trimgalore/fastp> --kraken2_db [$dbname]

Use Slurm to submit jobs

$ nextflow run ./main.nf -profile slurm --input samplesheet.csv --adapter_trim_mode <cutadapt/trimgalore/fastp> --kraken2_db [$dbname]

TIPS: If you run jobs on the Cedar cluster, you don't need to add --workDir, because a default work folder is already set up at /project/rrg-whsiao-ab/misc/tmp_work_nextflow.

How to check running performance

The Nextflow reports are under "Reports" in your results folder.

[Image: timeline report]

TIPS: Based on the resources actually used, you can adjust the default resource requests in conf/slurm.config

For example:

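params {
  account      = "xxxx"   // placeholder; replace with your own account
  runTime      = 2.h      // base wall time per task
  singleCPUMem = 1.GB     // base memory per CPU
}

process {
  // Resources for the fastqc process; memory and time grow with each attempt
  withName: fastqc {
    cpus   = 4
    memory = { params.singleCPUMem * 4 * task.attempt }
    time   = { params.runTime * task.attempt }
  }
}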
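
To see what the example values request, the arithmetic can be worked through by hand: memory is singleCPUMem (1 GB) times 4 times the attempt number, and time is runTime (2 h) times the attempt number. The sketch below only reproduces that arithmetic in shell; the function and variable names are made up for illustration. Note that task.attempt only rises above 1 when Nextflow retries a failed task, which requires a retry error strategy that is not shown in this README.

```shell
#!/usr/bin/env bash
# Values copied from the example config above.
single_cpu_mem_gb=1   # params.singleCPUMem = 1.GB
run_time_h=2          # params.runTime = 2.h
mem_factor=4          # the literal 4 in the memory expression

# Hypothetical helper: print the memory and time requested for one attempt.
request_for_attempt() {
  local attempt="$1"
  echo "attempt=${attempt} memory=$((single_cpu_mem_gb * mem_factor * attempt))GB time=$((run_time_h * attempt))h"
}

for attempt in 1 2 3; do
  request_for_attempt "$attempt"
done
```

With these values the first attempt asks for 4 GB and 2 h, the second for 8 GB and 4 h, and the third for 12 GB and 6 h. If the timeline report shows fastqc finishing well inside those limits, lowering singleCPUMem or runTime is a reasonable starting point.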