Releases: nanoporetech/remora

v3.3.0

17 Sep 16:25

Dataset rework adding filters, more robust metadata, and the ability to train a basecaller from a Remora dataset. Also includes updates to the modified base models.

v3.2.0

21 May 13:23

Feature additions:

  • Addition of modified base models for v5.0 basecalling models
    • All models (except for inosine) provided with SUP and HAC support
    • DNA
      • 5mC+5hmC, 5mC+4mC, 6mA
    • RNA (all models now all-contexts)
      • m6A, Pseudouridine, Inosine
  • Add remora model inspect command
  • Add remora dataset copy command
  • Allow multiple models to be used with remora infer from_pod5_and_bam (support C-mod and A-mod calling simultaneously)
  • Update learning rate scheduler to use cosine decay
  • Support IUPAC revcomp
  • Picoamp scaling enabled (models maintain use of scaling via k-mer levels)
  • Allow signal padding for chunks at the end of a read
  • Add parameter to limit the size of a dataset merge
  • Allow training and validation from core or config dataset
  • Reference-anchored inference documentation
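As an illustration of the "Support IUPAC revcomp" item above, here is a minimal sketch of reverse-complementing a DNA motif that contains IUPAC ambiguity codes. The table holds the standard IUPAC complements; the names `IUPAC_COMPLEMENT` and `iupac_revcomp` are hypothetical and are not taken from the Remora code base, so this should not be read as Remora's implementation.

```python
# Standard IUPAC nucleotide complements (DNA alphabet).
# Hypothetical helper names; not Remora's actual API.
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R",  # purine <-> pyrimidine
    "S": "S", "W": "W",  # strong and weak are self-complementary
    "K": "M", "M": "K",  # keto <-> amino
    "B": "V", "V": "B",  # not-A <-> not-T
    "D": "H", "H": "D",  # not-C <-> not-G
    "N": "N",            # any base
}


def iupac_revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence with IUPAC codes."""
    # Walk the sequence back to front and complement each code.
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(seq.upper()))


print(iupac_revcomp("DRACH"))  # DGTYH
```

With a table like this, a motif given on one strand (such as `DRACH`) can be matched on the opposite strand without expanding every ambiguity code by hand.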

Bug fixes:

  • Extend access to missing_metrics_ok to allow more robust use of API for read loading
  • Pin plotnine version
  • Fix bug for chunk extraction over entire read
  • Fix bug in m6A model specification

v3.1.0

05 Dec 22:40

Add v4.3 modified base models

  • 5mC+5hmC CG-context SUP
  • 5mC+5hmC CG-context HAC
  • 5mC+5hmC all-context HAC and SUP
  • 6mA all-context HAC and SUP
  • 4mC+5mC all-context SUP (research model)

Bug fixes:

  • Fix overflow on re-squiggle (#138)
  • Fix issue with split reads in API (#134)

v3.0.0

15 Nov 05:15

This version adds several new features as well as general bug fixes and optimizations.

Key Improvements:

  • A major Remora datasets update
    • Easier dataset composition and manipulation
      • Flexible dataset mixing, allowing use of randomers, native, enzymatic, PCR, spike-in, and other dataset types
      • Datasets defined by configuration file, which can be generated automatically
    • Larger datasets enabled
      • Model training has now been demonstrated on over one billion training chunks
    • Easier hyper-parameter tuning at training time
  • Enhanced signal and metrics plotting and exploration interface
  • Improved model inference speed
  • Full RNA support, including an m6A model - also available for production modified base calling through Dorado
  • ChEBI code support
    • Allow any modified base with full pipeline support
  • Split reads support
  • Use latest POD5 update
    • Allow single POD5 or directory of POD5 files as input
  • Various bug fixes

v2.1.3

24 Jul 14:05

Fix for Cython 3.0 update

v2.1.2

05 Jun 23:00

Minor bug fixes and run settings checks

  • Fixes #82 and #83
  • Added check to avoid original issue for #82

v2.1.1

18 May 20:17

  • Add all-context models to repository
  • Add finetune option to model training

v2.1.0

03 May 19:10

This version adds several new features as well as general bug fixes and optimizations.

Major features:

  • Raw signal plotting allows users to visualize raw Nanopore signal aligned to a reference
    • In addition, the Remora API supports efficient and easy-to-use access to per-read, per-site signal-based statistics (for example, in the notebooks in the repository).
    • Allows users to explore modified bases in signal space to gain intuition for modified base model training.
    • This replaces the Tombo signal visualization and metrics extraction features.
  • Infer performance optimization and bug fix on certain systems
    • Note that Dorado is still preferred as the production modified basecalling platform, but these improvements allow users to perform modified basecalling after canonical basecalling more efficiently.
  • Initial RNA Support
    • Support 3' to 5' signal in Remora datasets and models
  • Various training improvements
    • Batch balancing
    • Dynamic training data filtering

v2.0.0

06 Dec 03:19

Remora v2.0.0 release

Feature additions:

  • Updated kit14 5mC+5hmC models
  • Simplified POD5+BAM input pipeline
  • Remove ONNX model format (PyTorch only, unified with Dorado)
  • Automatic model downloads
  • Inference and validation from modBAM format
  • Duplex modified base calling
  • Remove Taiyaki/Megalodon dependency
  • Basecall-anchored training

v1.1.1

16 Jun 19:14

Remora v1.1.1 release

Feature additions:

  • Guppy-compatible model export including version 1 Remora models

Bug fixes:

  • onnxruntime protobuf dependency version issue
  • remora validate from_modbams using strand from --regions-bed
  • Fix bug in unused chunk extraction code