Releases: CampagneLaboratory/goby3

Release 3.3.1

09 Jan 15:38
  • Fixed alignment concatenation, where results could be truncated if several empty slices followed one another. For example, when concatenating A, B, and C with A and B empty, goby ca could yield an empty alignment and completely omit the alignments in part C.
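
The expected behavior after this fix can be illustrated with a minimal sketch. It is not Goby's implementation: `concat_entries` is a hypothetical helper, and the slices are modeled as plain Python lists rather than Goby alignment files.

```python
from typing import Iterable, Iterator, List


def concat_entries(slices: Iterable[List[str]]) -> Iterator[str]:
    """Yield entries from each slice in order (hypothetical helper).

    An empty slice is skipped; it must not end the concatenation early.
    """
    for entries in slices:
        if not entries:
            continue  # empty slice: move on to the next one
        yield from entries


# A and B are empty, C holds two entries: both entries of C must survive.
a, b, c = [], [], ["c1", "c2"]
merged = list(concat_entries([a, b, c]))
```

With the inputs above, `merged` contains exactly the two entries of part C.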

Release 3.3.0

15 Nov 19:47
  • Substantially reduced memory utilization for discover-sequence-variants (all modes).
  • discover-sequence-variants could in some rare cases output the same base twice (when an indel extended before the beginning of the read after the equivalent indel region was calculated). This fix improved indel performance when training models with variationanalysis 1.4.0+.
  • Initial work to develop models for genomic segments (see the .ssi format and concurrent work in variationanalysis). This is work in progress. The Protobuf schema is in goby-io/protobuf/SegmentInformationRecords.proto. Models are developed in parallel with Keras (in goby3/python/dl) and DL4J (in variationanalysis).
  • Updated genotyping model to state of the art (models/genotyping/1510204519948/, see evaluation results in the folder)

3.2.2

17 Feb 21:00

This release includes:

  • Fix the frequency of bases when indels are also present. Goby now correctly removes bases that
    support the flanking sequence of the indel, so that they are not counted twice.
  • Many changes to how we store varmaps introduced to support indels (vcf-to-varmap).
    The serialization format is incompatible with previous versions, so make sure you regenerate
    varmaps from VCF.
  • Adjust VCF output for compatibility with REF/ALT conventions. This makes it possible to measure
    performance with standard tools such as RTG vcfeval (http://realtimegenomics.com/products/rtg-tools/).
  • Keep counts of indels separately for forward and reverse strand.
  • vcf-to-varmap mode: improved semantics of the --chromosome-prefix option, which now allows removing
    (e.g., -chr) or adding (e.g., +chr) a prefix to the chromosome name.
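
The add/remove semantics of --chromosome-prefix can be sketched as follows. This is an illustration only, not Goby's code: `apply_chromosome_prefix` is a hypothetical helper, and leaving a name untouched when the prefix is already present (for +) or absent (for -) is an assumption, not documented behavior.

```python
def apply_chromosome_prefix(name: str, option: str) -> str:
    """Apply a '+PREFIX' or '-PREFIX' option to a chromosome name.

    Hypothetical helper: '+chr' adds the prefix, '-chr' removes it.
    """
    if not option:
        return name
    sign, prefix = option[0], option[1:]
    if sign == "+":
        # Assumption: do not add the prefix a second time.
        return name if name.startswith(prefix) else prefix + name
    if sign == "-":
        # Assumption: names without the prefix pass through unchanged.
        return name[len(prefix):] if name.startswith(prefix) else name
    raise ValueError(f"expected '+PREFIX' or '-PREFIX', got {option!r}")


added = apply_chromosome_prefix("1", "+chr")       # VCF uses "1", reference uses "chr1"
removed = apply_chromosome_prefix("chr1", "-chr")  # the opposite situation
```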

Release 3.2.1

24 Jan 17:38

This release includes the following changes:

  • fast-co-compact: fix a bug introduced on 10/6/2016 which created negative read entries.
  • Catch a number of exceptions that can be thrown by HTSJDK when processing BAM files. Exceptions
    are caught so that an error in one alignment entry does not interrupt processing of the entire alignment.
    Errors are shown in the log.
  • vcf-to-genotype-map mode now supports (b)gzipped VCF input.
  • vcf-to-genotype-map: fix a bug that manifested itself when the VCF had a single genotype field.
  • vcf-to-genotype-map: add a chromosome-prefix argument to help import VCF files where the chr prefix is missing.

Release 3.2

31 Dec 00:38

This release includes the following changes:

  • Remove a memory leak when reading SAM/BAM files. This leak was the likely cause of the out-of-memory errors in compression benchmarks (the errors had nothing to do with compression, but with the conversion of SAM/BAM to the Goby representation).
  • Disabled tests that could not succeed anymore (because of choices we made in Goby 3, such as lack of auto-upgrade for alignments produced with Goby 1 and 2.)
  • BAM/CRAM support. Added an option to bypass the header check on SO:COORDINATE. Use
    -x HTSJDKReaderImpl:force-sorted=true to force Goby to consider an alignment sorted.
  • SBI format: add ability to add true labels while writing the file. Add support for downsampling sites without variants.
  • Genotype format: reorganization to support calling with deep learning models trained with variationanalysis.
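
The header check on SO:COORDINATE mentioned above concerns the sort-order tag on the @HD line of a SAM/BAM header. The sketch below shows what such a check looks like on header text; it is an illustration based on the SAM header layout, not the code used by Goby or HTSJDK, and `is_coordinate_sorted` is a hypothetical helper.

```python
def is_coordinate_sorted(header_text: str) -> bool:
    """Return True if the @HD line declares SO:coordinate (hypothetical helper)."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs.
            fields = line.split("\t")[1:]
            tags = dict(f.split(":", 1) for f in fields if ":" in f)
            return tags.get("SO", "").lower() == "coordinate"
    return False  # no @HD line: the sort order is not declared


sorted_header = "@HD\tVN:1.5\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000"
undeclared_header = "@SQ\tSN:chr1\tLN:1000"
declared = is_coordinate_sorted(sorted_header)
undeclared = is_coordinate_sorted(undeclared_header)
```

A file whose records are in coordinate order but whose header lacks this tag fails such a check, which is the situation the force-sorted option is meant to work around.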

Release 3.1

12 Dec 22:02
  • Reorganize model prediction to facilitate installing new versions of the variationAnalysis jars.
    Goby 3.1 is now compatible with variationanalysis 1.1.1.
  • Replace models with versions trained with variationanalysis 1.1.1.
  • Add somatic mutation models trained with whole genome data (ICGC GoldSet).

3.0.0

30 Sep 15:43
  • Goby 3 estimates probabilities that a genomic site contains a somatic mutation using adaptive models trained with DeepLearning4Java. Models are provided for RNA-Seq, paired exome and trio experimental designs (subject and parents, where the subject may carry de-novo mutations). See the companion project variationanalysis, which we developed to train these models. New models can be trained with the companion project and used directly with Goby. These models can be used with the somatic_variation format of the discover-sequence-variants mode. A preprint about the method is being finalized.
  • We have enabled support for BAM and CRAM in the discover-sequence-variants mode of Goby. This means that you can directly use BAM files to call somatic variations in them. CRAM support has not been well tested, but CRAM is supported by the HTSJDK library that we also use to support BAM, so it will hopefully also work.
  • We have upgraded Goby from an older version of the Java Samtools library to the latest HTSJDK (version 2.2.4). This may cause trouble with alignments in BAM format sorted with samtools 0.x. We discovered that HTSJDK will not recognize that these BAM files are sorted unless you modify the header or sort the BAM again with a recent version of samtools (1.x+).
  • We ported the source code to Maven and consolidated dependencies. As a result you will find the goby-io jar in Maven Central, which makes it easier to use the Goby framework in any Maven project.
  • The source code of the project has been moved to a new repository on GitHub: https://github.com/CampagneLaboratory/goby3