Oxford Nanopore ReadMe

MMRSVD Germline Structural Variant (SV): Oxford Nanopore (ONT) Documentation

SV Analysis Pipeline: Oxford Nanopore Data

(--workflow germline_sv, --data_type ont)

For input sample:

•   NANOSTAT statistics on pre-filter reads
•   PORECHOP adapter trimming
•   NANOQC quality control
•   NANOFILT filtering and trimming
•   NANOSTAT statistics on post-filter reads
•   MINIMAP2 mapping to reference genome
•   NANOSV SV calling
•   SNIFFLES SV calling   
•   SURVIVOR Annotation of results based on intersection with previously identified mouse SVs, genic and exonic regions
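
The NANOFILT step above filters and trims reads according to the `--quality`, `--length`, `--headcrop`, and `--tailcrop` parameters documented later on this page. The following Python sketch illustrates that behavior on a single read using the documented defaults. It is a simplified illustration, not NanoFilt's implementation: `filter_and_trim` and `mean_quality` are hypothetical helper names, and the averaging method, the inclusive thresholds, and testing length before trimming are all assumptions.

```python
import math


def mean_quality(phred_scores):
    """Mean read quality, averaged over error probabilities (assumed method)."""
    probs = [10 ** (-q / 10) for q in phred_scores]
    return -10 * math.log10(sum(probs) / len(probs))


def filter_and_trim(seq, quals, quality=10, length=400, headcrop=10, tailcrop=20):
    """Return the trimmed (seq, quals) pair, or None if the read is rejected.

    Defaults mirror the documented pipeline defaults. The length check is
    applied to the untrimmed read here, which is an assumption.
    """
    if not seq or len(seq) != len(quals):
        raise ValueError("seq and quals must be non-empty and of equal length")
    if len(seq) < length:
        return None  # shorter than the minimum read length
    if mean_quality(quals) < quality:
        return None  # below the minimum mean quality
    end = len(seq) - tailcrop
    return seq[headcrop:end], quals[headcrop:end]


if __name__ == "__main__":
    read = "A" * 500
    kept = filter_and_trim(read, [20] * 500)
    print(len(kept[0]))                               # 500 - 10 - 20 = 470
    print(filter_and_trim("A" * 300, [20] * 300))     # None: too short
```

With the defaults, a 500-base read loses 10 bases from the start and 20 from the end, while a 300-base read is dropped outright.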

Oxford Nanopore Flowchart

flowchart TD
    p00([ONT READS\nFASTQ])
    p00A[NANOSTAT_PREFILT]
    p00B[PORECHOP]
    p00C[NANOQC]
    p00D[NANOFILT]
    p00E[NANOSTAT_POSTFILT]
    p00F[NANOSV]
    p00G[PARSE_NANOSV_DEPTHS]
    p001([REFERENCE_GENOME\nGRCm39])
    p002[MINIMAP2_INDEX]
    p003[PRE-ALIGNED BAM]
    p01[MINIMAP2_MAP_ONT]
    p02[SAMTOOLS_SORT]
    p03[SAMTOOLS_FILTER]
    o1([Genomic BAM]):::output
    p04[SNIFFLES]
    p04A[PARSE_SNIFFLES_DEPTHS]
    o2([Merged Depths BED]):::output
    o3([Annotated SV Calls]):::output
    o4([Merged VCF]):::output
    o5([Intersect BEDS]):::output
    o6([SNIFFLES SV Calls]):::output
    o7([NANO SV Calls]):::output
    p05[SURVIVOR_MERGE]
    p05 --> o4
    p06[SURVIVOR_SUMMARY]
    p07[SURVIVOR_VCF_TO_TABLE]
    p08[SURVIVOR_TO_BED]
    p09[SURVIVOR_BED_INTERSECT]
    p10[SURVIVOR_ANNOTATION]
    p11[SURVIVOR_ANNOTATION_WITH_EXONS]
    p12[PYTHON_PARSE_SURVIVOR_IDS]
    p13[R_MERGE_DEPTHS]
    p14[VCFTOOLS_FILTER]
    p15[SURVIVOR_INEXON]
    p16[PYTHON_ANNOT_DEPTHS]
    p17[PYTHON_ANNOT_ON_TARGET]
    p00 --> p00A
    p00A --> p00B
    p00B --> p00C
    p00C --> p00D
    p00D --> p00E
    p001 -..-> |Generate Reference Index if Necessary| p002
    p002 --> p01
    p00D --> p01
    p01 --> p02
    p02 --> p03
    p001 --> p03
    p03 --> o1
    o1 --> p04
    p003 -..-> |If Pre-Aligned Bam Provided| p03
    o1 --> p00F
    p00F --> o7
    o7 --> p00G
    p00G -->p13
    p04 --> o6
    o6 --> p04A
    p04A -->p13
    o6 --> p05
    o7 --> p05
    o4 --> p06
    o4 --> p07
    o4 --> p12
    p12 --> p13
    p06 --> p08
    p06 --> p10
    p07 --> p10
    p07 --> p08
    p08 --> p09
    p08 --> p10
    p09 -->o5
    o5 --> p10
    o4 --> p11
    o5 --> p11
    p10 --> p13
    o4 --> p14
    p14 --> p15
    o5 --> p15
    p15 --> p16
    p13 -->o2
    o2 --> p16
    p16 --> p17
    p17 --> o3
    classDef output fill:#90aaff,stroke:#6c8eff,stroke-width:2px,color:#000000
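
To read the flowchart programmatically, the edge lines can be parsed and walked backwards from any node. The Python sketch below does this for an excerpt of the edges above (reads through to the Genomic BAM). `parse_edges` and `upstream` are hypothetical helpers written for this illustration. The regular expression only covers the two arrow forms used in this diagram, not Mermaid syntax in general.

```python
import re
from collections import defaultdict

# Excerpt of the flowchart edges: reads and reference through to Genomic BAM (o1).
EXCERPT = """
    p00 --> p00A
    p00A --> p00B
    p00B --> p00C
    p00C --> p00D
    p00D --> p00E
    p001 -..-> |Generate Reference Index if Necessary| p002
    p002 --> p01
    p00D --> p01
    p01 --> p02
    p02 --> p03
    p001 --> p03
    p03 --> o1
    p003 -..-> |If Pre-Aligned Bam Provided| p03
"""

# Matches "a --> b" and "a -..-> |label| b"; the label is ignored.
EDGE_RE = re.compile(r"^\s*(\w+)\s*-\.*->\s*(?:\|[^|]*\|\s*)?(\w+)\s*$")


def parse_edges(text):
    """Return (source, target) pairs for every edge line in the text."""
    edges = []
    for line in text.splitlines():
        match = EDGE_RE.match(line)
        if match:
            edges.append((match.group(1), match.group(2)))
    return edges


def upstream(edges, target):
    """Return every node from which the target can be reached."""
    parents = defaultdict(set)
    for src, dst in edges:
        parents[dst].add(src)
    seen, stack = set(), [target]
    while stack:
        node = stack.pop()
        for parent in parents[node]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    edges = parse_edges(EXCERPT)
    print(sorted(upstream(edges, "o1")))
```

In this excerpt, `p00E` (NANOSTAT_POSTFILT) does not appear upstream of `o1`, which reflects that it is a reporting branch rather than an input to alignment.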

Parameters for MMRSVD Germline SV Pipeline (ont)

--sampleID
- Default: <STRING>
- Comment: The sample ID for the input data (required).
--pubdir
- Default: /<PATH>
- Comment: The directory where the saved outputs will be stored.
--organize_by
- Default: sample
- Comment: How to organize the output folder structure. Options: sample or analysis.
--cacheDir
- Default: '/projects/omics_share/meta/containers'
- Comment: The directory that contains cached Singularity containers. JAX users should not change this parameter.
-w
- Default: /<PATH>
- Comment: The directory used for all intermediary files and Nextflow processes. This directory can become quite large, so it should be a location on /fastscratch or another directory with ample storage.
--data_type
- Default: null
- Comment: Options: illumina, pacbio, or ont.
--fastq1
- Default: null
- Comment: The path to a single FASTQ file, or one of a pair of FASTQs for paired-end data.
--bam
- Default: null
- Comment: The path to BAM input data if alignment has already been performed outside this pipeline.
--fasta
- Default: /<PATH>
- Comment: The path to the reference genome in FASTA format.
--fasta_index
- Default: /<PATH>
- Comment: Optional parameter to specify the index for the reference genome. If not provided, the pipeline will generate an index.
--quality
- Default: 10
- Comment: NanoFilt parameter for minimum quality.
--length
- Default: 400
- Comment: NanoFilt parameter for minimum read length.
--headcrop
- Default: 10
- Comment: NanoFilt parameter to trim N nucleotides from the start of the read.
--tailcrop
- Default: 20
- Comment: NanoFilt parameter to trim N nucleotides from the end of the read.
--targ_chr
- Default: null
- Comment: Specify targeted chromosome if data were generated using adaptive sequencing mode.
--targ_start
- Default: null
- Comment: Specify targeted start coordinate if data were generated using adaptive sequencing mode.
--targ_end
- Default: null
- Comment: Specify targeted end coordinate if data were generated using adaptive sequencing mode.
--genome_build
- Default: GRCm38
- Comment: Mouse specific. Options: GRCm38 or GRCm39. Parameter that controls reference data used for alignment and annotation.
--tandem_repeats
- Default: '/ref_data/ucsc_mm10_trf_chr_sorted.bed'
- Comment: BED file that lists the coordinates of centromeres and telomeres to exclude as alignment targets. Note: default path refers to a location within the containers quay.io/jaxcompsci/pbsv-td_refs:2.8.0--refv0.2.0 and quay.io/jaxcompsci/sniffles-td_refs:2.0.7--refv0.2.0, which require this file.
--sv_ins_ref
- Default: '/ref_data/variants_freeze5_sv_INS_mm39_to_mm10_sorted.bed.gz'
- Comment: BED file that lists previously identified insertion SVs. Note: default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
--sv_del_ref
- Default: '/ref_data/variants_freeze5_sv_DEL_mm39_to_mm10_sorted.bed.gz'
- Comment: BED file that lists previously identified deletion SVs. Note: default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
--sv_inv_ref
- Default: '/ref_data/variants_freeze5_sv_INV_mm39_to_mm10_sorted.bed.gz'
- Comment: BED file that lists previously identified inversion SVs. Note: default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
--reg_ref
- Default: '/ref_data/mus_musculus.GRCm38.Regulatory_Build.regulatory_features.20180516.gff.gz'
- Comment: GFF file that lists regulatory features. Note: default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
--genes_bed
- Default: '/ref_data/Mus_musculus.GRCm38.102.gene_symbol.bed'
- Comment: BED file that lists gene symbol IDs and coordinates. Note: default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
--exons_bed
- Default: '/ref_data/Mus_musculus.GRCm38.102.exons.bed'
- Comment: BED file that lists exons and coordinates. Note: default path refers to a location within the container quay.io/jaxcompsci/bedtools-sv_refs:2.30.0--refv0.2.0, which requires this file.
--surv_dist
- Default: 1000
- Comment: Maximum distance between breakpoints for merging SVs.
--surv_supp
- Default: 1
- Comment: The number of callers required to support an SV. In this workflow, SURVIVOR merges calls from two callers (NanoSV and Sniffles).
--surv_type
- Default: 1
- Comment: Boolean (0/1) that requires SVs to be the same type for merging.
--surv_strand
- Default: 1
- Comment: Boolean (0/1) that requires SVs to be on the same strand for merging.
--surv_min
- Default: 30
- Comment: Minimum length (bp) to output SVs.
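
As an illustration of how the parameters above fit together, the following Python sketch assembles a `nextflow run` command line for the ONT workflow. It is a sketch under stated assumptions rather than the documented invocation: `build_command` is a hypothetical helper, the entry script name `main.nf` is a placeholder, and the rule that exactly one of `--fastq1` or `--bam` is supplied is an assumption based on the parameter descriptions. Any profile or cluster options a real run may need are omitted.

```python
import shlex


def build_command(params, work_dir, pipeline_script="main.nf"):
    """Assemble an argument list for the ONT germline SV workflow.

    `pipeline_script` is a placeholder; substitute the real entry script.
    """
    if not params.get("sampleID"):
        raise ValueError("sampleID is required")
    if params.get("data_type") != "ont":
        raise ValueError("this sketch only covers --data_type ont")
    # Assumption: either raw reads or a pre-aligned BAM is given, not both.
    if bool(params.get("fastq1")) == bool(params.get("bam")):
        raise ValueError("provide exactly one of fastq1 or bam")

    cmd = ["nextflow", "run", pipeline_script,
           "-w", str(work_dir),            # Nextflow work directory
           "--workflow", "germline_sv"]
    for name, value in params.items():
        if value is None:
            continue                        # leave unset parameters at defaults
        cmd += [f"--{name}", str(value)]
    return cmd


if __name__ == "__main__":
    example = {
        "sampleID": "sample01",             # hypothetical sample ID
        "data_type": "ont",
        "fastq1": "reads.fastq.gz",         # hypothetical input path
        "bam": None,
        "pubdir": "results",
        "genome_build": "GRCm38",
        "quality": 10,
        "length": 400,
    }
    print(shlex.join(build_command(example, "/fastscratch/work")))
```

Parameters left as `None` are skipped so that the pipeline defaults listed above apply.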

Pipeline Default Outputs

Naming Convention	Description
`germline_sv_report.html`	Nextflow autogenerated report
`trace/trace.txt`	Nextflow trace of processes
`${sampleID}/${sampleID}_ONT_NS_struct_var.vcf`	VCF output combining merged NanoSV and Sniffles calls annotated for overlap with exonic regions
`${sampleID}/${sampleID}_survivor_joined_results.csv`	Table of SVs annotated with overlaps of previously identified SVs (beck), genes, exons, regulatory regions
`${sampleID}/alignments/${sampleID}.q30.bam`	Analysis-ready alignment of reads
`${sampleID}/alignments/${sampleID}.q30.bam.bai`	Index for analysis-ready alignment of reads
`${sampleID}/stats/nanostat_*fastq_${sampleID}`	NanoStat pre-Porechop log
`${sampleID}/stats/nanostat_${sampleID}_porechop_NanoFilt_${sampleID}`	NanoStat post-Porechop log
`${sampleID}/unmerged_calls/${sampleID}.nanosv_sorted_prefix.vcf`	SV calls from NanoSV
`${sampleID}/unmerged_calls/${sampleID}.sniffles_sorted_prefix.vcf`	SV calls from Sniffles
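
The naming conventions in the table can be turned into a quick completeness check after a run. The Python sketch below expands the fixed-name patterns for one sample and reports any that are missing. `expected_outputs` and `missing_outputs` are hypothetical helpers. The sketch assumes the patterns are relative to `--pubdir` with `--organize_by sample`, and it leaves out the NanoStat logs, whose names contain wildcards.

```python
from pathlib import Path

# Fixed-name output patterns taken from the table above.
OUTPUT_TEMPLATES = [
    "{sampleID}/{sampleID}_ONT_NS_struct_var.vcf",
    "{sampleID}/{sampleID}_survivor_joined_results.csv",
    "{sampleID}/alignments/{sampleID}.q30.bam",
    "{sampleID}/alignments/{sampleID}.q30.bam.bai",
    "{sampleID}/unmerged_calls/{sampleID}.nanosv_sorted_prefix.vcf",
    "{sampleID}/unmerged_calls/{sampleID}.sniffles_sorted_prefix.vcf",
]


def expected_outputs(pubdir, sample_id):
    """Return the expected per-sample output paths under pubdir."""
    root = Path(pubdir)
    return [root / t.format(sampleID=sample_id) for t in OUTPUT_TEMPLATES]


def missing_outputs(pubdir, sample_id):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_outputs(pubdir, sample_id) if not p.exists()]


if __name__ == "__main__":
    # "results" and "sample01" are hypothetical values for illustration.
    for path in missing_outputs("results", "sample01"):
        print(f"missing: {path}")
```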
