Releases: TheJacksonLaboratory/cs-nf-pipelines
v0.7.4
v0.7.3
Release 0.7.3
In this release we make updates to the ATAC workflow, and correct issues related to the PTA workflow.
ATAC:
- Merging of replicate samples is now supported. Use the --merge_replicates option, along with a CSV input file. See the wiki page for details on CSV setup.
- GRCm39 pseudo-references generated with G2Gtools are now supported. Previously, GRCm38 was supported via the --chain option. For GRCm39, VCI files are required input, also specified with --chain.
PTA:
- The mouse PTA workflow would crash when all somatic CNVs were filtered; we have corrected this.
- Numerous adjustments to memory and wall clock limits were made to support high coverage WGS data.
Pipelines Added:
None
Modules Added:
- modules/g2gtools/g2gtools_vci_convert.nf
Pipeline Changes:
- workflows/atac.nf: Replicate merging added. GRCm39 pseudo-reference support added.
- subworkflows/aria_download_parse.nf: Support for replicate merging added.
- subworkflows/concatenate_local_files.nf: Support for replicate merging added.
Module Changes:
- modules/cosmic/cosmic_add_cancer_resistance_mutations_germline.nf: wallclock and memory request increase.
- modules/gridss/gridss_assemble.nf: memory request increase, and java heap adjustment.
- modules/gridss/gripss_somatic_filter.nf: memory request increase, and java heap adjustment.
- modules/illumina/manta.nf: memory and wallclock requests were made flat rather than scaled to input file size.
- modules/picard/picard_mergesamfiles.nf: correct file vs. path nextflow issue.
- modules/python/python_somatic_vcf_finalization.nf: wallclock requests increase.
- modules/python/python_somatic_vcf_finalization_mouse.nf: wallclock requests increase.
- modules/r/plot_delly_cnv.nf: add dynamic plot naming based on sampleID.
- modules/samtools/samtools_chain_sort_fixmate_bam.nf: alter module to re-sort final filtered BAM prior to possible replicate merge.
- modules/samtools/samtools_non_chain_reindex.nf: alter module to re-sort final filtered BAM prior to possible replicate merge.
- modules/samtools/samtools_stats_insertsize.nf: wallclock request increase.
- modules/svaba/svaba.nf: memory and wallclock requests increase.
Scripts Added:
None
Script Changes:
- bin/gbrs/generate_emission_prob_avecs.py: Modify for use with non-DO strain IDs and dynamic number of strains.
- bin/pta/annotate-bedpe-with-cnv.r: Capture edge case where all somatic CNV are filtered.
- bin/pta/annotate-cnv-delly.r: Capture edge case where all somatic CNV are filtered.
- bin/pta/delly_cnv_plot.r: Capture edge case where all somatic CNV are filtered.
NF-Test Modules Added:
None
v0.7.2
Release 0.7.2
In this minor release we correct a bug in --workflow atac. In this workflow, the macs2 module was configured to use a user-defined parameter tmpdir for scratch space. However, if the specified tmpdir did not exist, macs2 would fail silently, and allow the workflow to continue. This behavior has been fixed.
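The release notes do not describe how the fix was implemented. As an illustration only, a fail-fast check on a scratch directory could look like the sketch below. The function name require_writable_dir is hypothetical and is not part of the pipeline.

```python
import os


def require_writable_dir(path: str) -> str:
    """Return the absolute path if it is an existing, writable directory.

    Raises instead of returning a status, so a missing scratch directory
    stops the calling step rather than letting it continue silently.
    """
    if not os.path.isdir(path):
        raise FileNotFoundError(f"tmpdir does not exist or is not a directory: {path}")
    if not os.access(path, os.W_OK | os.X_OK):
        raise PermissionError(f"tmpdir is not writable: {path}")
    return os.path.abspath(path)
```

Calling such a check before the peak-calling step would turn a missing tmpdir into an immediate, visible error.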
v0.7.1
Release 0.7.1
In this minor release we change the Xengsort container to include GNU sort rather than BusyBox sort. This change was required to process very large FASTQ files.
In our testing, BusyBox sort requires files to be held in memory during sorting, and does not support the use of temporary files. The use of GNU sort allows for temporary files to be generated and alleviates the need to hold entire files in memory. This change has no impact on output from Xengsort, or any associated workflow.
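The general idea behind sorting with temporary files (an external merge sort) can be sketched as follows. This is a simplified illustration of the technique, not GNU sort's actual implementation; the names external_sort and chunk_size are chosen for this example, and input lines are assumed to be newline-terminated.

```python
import heapq
import itertools
import os
import tempfile
from typing import Iterable, Iterator, List


def _write_run(lines: List[str], tmpdir: str) -> str:
    """Sort one in-memory chunk and write it to a temporary run file."""
    lines.sort()
    fd, path = tempfile.mkstemp(dir=tmpdir, suffix=".run")
    with os.fdopen(fd, "w") as handle:
        handle.writelines(lines)
    return path


def external_sort(lines: Iterable[str], chunk_size: int, tmpdir: str) -> Iterator[str]:
    """Yield lines in sorted order while holding at most chunk_size lines
    in memory during the chunking phase."""
    it = iter(lines)
    run_paths = []
    while True:
        chunk = list(itertools.islice(it, chunk_size))
        if not chunk:
            break
        run_paths.append(_write_run(chunk, tmpdir))

    handles = [open(path) for path in run_paths]
    try:
        # heapq.merge streams the sorted runs, reading one line per run at a time.
        yield from heapq.merge(*handles)
    finally:
        for handle in handles:
            handle.close()
        for path in run_paths:
            os.remove(path)
```

Peak memory is bounded by the chunk size rather than the file size, which is the property that matters for very large FASTQ files.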
v0.7.0
Release 0.7.0
In this release we add a new workflow for calling copy number variation (CNV) from raw Illumina IDAT genotype array files. Currently the IlluminaCytoSNP v2.1 array is supported, but support for additional arrays is possible.
We make additional minor changes as described below.
Pipelines Added:
- CNV calling from Illumina genotype array data (--cnv_array)
Modules Added:
- modules/bcftools/bcftools_gtct2vcf.nf
- modules/bcftools/bcftools_query_ascat.nf
- modules/illumina/iaap_cli.nf
- modules/ascat/ascat_run.nf
- modules/ascat/ascat_annotation.nf
Pipeline Changes:
None
Module Changes:
- Replaced the incorrect ${task.mem} with ${task.memory} in the Nextflow error catch statement in modules related to the SV calling workflows.
- utility_modules/gzip.nf: Memory request increase
Scripts Added:
- cnv_array/ASCAT_run.R
- cnv_array/annotate_ensembl_genes.pl
- cnv_array/seg_plot.R
- cnv_array/segment_raw_extend.pl
Script Changes:
None
NF-Test Modules Added:
- tests/workflows/cnv_array.nf.test
PIVOT_v2
This release captures workflows used in the analysis of multi-omics patient derived xenograft (PDX) data from the Pediatric Preclinical In Vivo Testing (PIVOT) Consortium.
Workflows used by this project are:
- amplicon
- rnaseq with --pdx
- pdx_wes
- cnv_array
v0.6.7
Release 0.6.7
In this release we make the following minor adjustments:
- Correct syntax errors in the Xengsort module when running single-end data.
- Minor adjustments to EMASE and GBRS help and log information to include the gen_org param.
- Bump the version of MultiQC to v1.23.
- Increase the memory request for PTA modules: python_merge_prep.nf and python_reorder_vcf_columns.nf.
- Add CHECK_STRANDEDNESS to MultiQC output for PDX RNAseq.
- Increased job memory request in example run scripts.
v0.6.6
Release 0.6.6
In this release, we add a FASTQ sorting function to the Xengsort module. Due to asynchronous multi-threading in the classification step, Xengsort produces FASTQ output with non-deterministic sort order. BWA produces subtly different mapping results when reads in otherwise identical FASTQ inputs are shuffled (as noted by the BWA developer). The slight mapping differences are not enough to impact overall results, but do prevent fully reproducible results when Xengsort is used and reads are not sorted. The addition of the sorting function allows for fully reproducible results, with no additional user action required.
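To illustrate why sorting restores reproducibility, the sketch below orders FASTQ records by read name so that any shuffling of the same records yields identical output. It is an in-memory illustration under the assumption of standard 4-line records; the release notes do not state which sort key or tool the module actually uses, and the function names here are hypothetical.

```python
from typing import Iterable, List, Tuple

# One FASTQ record: header, sequence, separator ('+'), quality string.
Record = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> List[Record]:
    """Group lines into 4-line FASTQ records."""
    rows = [line.rstrip("\n") for line in lines]
    if len(rows) % 4 != 0:
        raise ValueError("FASTQ line count is not a multiple of 4")
    return [tuple(rows[i:i + 4]) for i in range(0, len(rows), 4)]


def sort_fastq(records: Iterable[Record]) -> List[Record]:
    """Sort records by read name (header text before the first space).

    The full record is used as a tie-breaker so the result does not
    depend on the input order even when names repeat.
    """
    return sorted(records, key=lambda rec: (rec[0].split(" ", 1)[0], rec))
```

For paired-end data, applying the same name-based ordering to both mate files keeps pairs aligned, provided the read names match between the two files.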
v0.6.5
Release 0.6.5
In this minor release, we fix a subscript out of bounds bug in bin/wes/sequenza_seg_na_window.R.
v0.6.4
Release 0.6.4
In this release, we adjust memory and wallclock requirements for a number of modules, update read_group_from_fastq.py from Python 2 to Python 3, and incorporate PRs #4 and #5.
- PR #4 (contributed by @BrianSanderson) adds an optional gene and transcript count merge across samples in the RNA and PDX RNA workflows (the merge is enabled by including the --merge_rna_counts flag).
- PR #5 (contributed by @alanhoyle) adds a catch for corrupt gzip files in the Bowtie module as used by EMASE/GBRS analyses.
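The module change listed below describes the PR #5 fix as a pipefail catch in the Bowtie module. As a separate illustration of the same concern, a corrupt or truncated gzip file can be detected by decompressing it fully and treating any read error as a failure. This sketch is an assumption-based example, not code from the pipeline; gzip_is_intact is a hypothetical name.

```python
import gzip
import zlib


def gzip_is_intact(path: str, block_size: int = 1 << 20) -> bool:
    """Return True if the whole gzip file decompresses without error.

    Reading to the end forces the decompressor to validate the stream,
    so truncated or corrupt files are reported rather than passed on.
    """
    try:
        with gzip.open(path, "rb") as handle:
            while handle.read(block_size):
                pass
    except (OSError, EOFError, zlib.error):
        # gzip.BadGzipFile is a subclass of OSError; truncation raises EOFError.
        return False
    return True
```

Running a check like this before alignment makes a bad input fail early instead of producing a partial result downstream.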
Pipelines Added:
None
Modules Added:
- utility_modules/merge_rsem_counts.nf
Pipeline Changes:
- workflows/rnaseq.nf module added to merge gene and transcript expression when --merge_rna_counts is used.
- workflows/pdx_rnaseq.nf module added to merge gene and transcript expression when --merge_rna_counts is used.
Module Changes:
- bowtie/bowtie.nf pipefail catch added for corrupt gzip files, per #5.
- fastp/fastp.nf save json report as well as html report.
- nygenome/lancet.nf wallclock request increase.
- picard/picard_markduplicates.nf memory adjustment, and accounting for MarkDuplicates not fully respecting -Xmx memory limits imposed by Java.
- picard/picard_reordersam.nf memory request increase.
- picard/picard_sortsam.nf memory request increase.
- utility_modules/read_groups.nf container changed to py3.
Script Changes:
- bin/shared/read_group_from_fastq.py update from py2 to py3.