Releases: pemistahl/lingua-rs

Lingua 1.2.1

08 May 16:26

Improvements

  • Language detection for sentences with more than 120 characters is now faster: only trigrams are iterated, which is enough to achieve high detection accuracy.
  • Textual input that includes logograms from Chinese, Japanese or Korean is now split at each logogram and not only at whitespace. This makes language detection more reliable for sentences that contain multi-language content.
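
The two improvements above can be illustrated with a minimal sketch that uses only the standard library. The function names (`is_logogram`, `split_words`, `trigrams`) and the Unicode ranges are assumptions chosen for illustration; they are not taken from Lingua's source and its real tokenizer may differ.

```rust
/// Returns true for characters in a few common CJK blocks.
/// The ranges are an illustrative assumption, not Lingua's actual character classes.
fn is_logogram(c: char) -> bool {
    matches!(c,
        '\u{4E00}'..='\u{9FFF}'   // CJK Unified Ideographs
        | '\u{3040}'..='\u{309F}' // Hiragana
        | '\u{30A0}'..='\u{30FF}' // Katakana
        | '\u{AC00}'..='\u{D7AF}' // Hangul syllables
    )
}

/// Splits text at whitespace and additionally emits every logogram as its own token.
fn split_words(text: &str) -> Vec<String> {
    let mut words = Vec::new();
    let mut current = String::new();
    for c in text.chars() {
        if c.is_whitespace() {
            // Whitespace ends the current word, if there is one.
            if !current.is_empty() {
                words.push(std::mem::take(&mut current));
            }
        } else if is_logogram(c) {
            // A logogram ends the current word and forms a token by itself.
            if !current.is_empty() {
                words.push(std::mem::take(&mut current));
            }
            words.push(c.to_string());
        } else {
            current.push(c);
        }
    }
    if !current.is_empty() {
        words.push(current);
    }
    words
}

/// Returns all overlapping three-character windows of a word.
/// Words shorter than three characters yield no trigrams.
fn trigrams(word: &str) -> Vec<String> {
    let chars: Vec<char> = word.chars().collect();
    chars.windows(3).map(|w| w.iter().collect()).collect()
}
```

Working on `char` values rather than bytes keeps both functions safe for non-ASCII input, which matters for exactly the mixed-script sentences this change targets.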

Bug Fixes

  • Errors in the rule engine for the Latvian language have been resolved.
  • Corrupted characters in the Latvian test data have been corrected.

Lingua 1.2.0

08 Apr 21:24

Features

  • A LanguageDetector can now be built either with lazy loading, where required language models are loaded on demand (the default), or with all language models preloaded at once by calling LanguageDetectorBuilder.with_preloaded_language_models(). (#10)
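
The difference between the two loading modes can be sketched with the standard library's `OnceLock`. Everything in this example (`Model`, `ModelStore`, the load counter) is a hypothetical stand-in for illustration; it does not reproduce Lingua's internals or its builder API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for a language model; not Lingua's real type.
struct Model {
    language: &'static str,
}

/// One slot per language. Each model is created at most once.
struct ModelStore {
    languages: Vec<&'static str>,
    slots: Vec<OnceLock<Model>>,
    loads: AtomicUsize, // counts how many models were actually loaded
}

impl ModelStore {
    /// Lazy mode: nothing is loaded until `get` is called for a language.
    fn lazy(languages: &[&'static str]) -> Self {
        ModelStore {
            languages: languages.to_vec(),
            slots: languages.iter().map(|_| OnceLock::new()).collect(),
            loads: AtomicUsize::new(0),
        }
    }

    /// Preloaded mode: every model is loaded up front.
    fn preloaded(languages: &[&'static str]) -> Self {
        let store = Self::lazy(languages);
        for index in 0..store.slots.len() {
            store.get(index);
        }
        store
    }

    /// Returns the model for a slot, loading it on first access only.
    fn get(&self, index: usize) -> &Model {
        self.slots[index].get_or_init(|| {
            self.loads.fetch_add(1, Ordering::SeqCst);
            Model { language: self.languages[index] }
        })
    }

    fn loaded_count(&self) -> usize {
        self.loads.load(Ordering::SeqCst)
    }
}
```

Lazy loading keeps startup time and memory low when only a few languages are ever needed, while preloading moves the whole cost to construction so that the first detection call is not slowed down.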

Lingua 1.1.0

31 Jan 19:22
Compare
Choose a tag to compare

Languages

  • The Maori language is now supported. Thanks to @eekkaiia for the contribution. (#5)

Performance

  • Loading and searching the language models used to be quite slow. With parallel iterators from the Rayon library, this process is now at least 50% faster, depending on how many CPU cores are available. (#8)
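
To show the general idea of loading models concurrently, here is a minimal sketch. It deliberately swaps in scoped threads from the standard library instead of Rayon so that it has no dependencies, and `load_model` is a hypothetical placeholder for reading and parsing one model file. With Rayon, the same shape would typically be written with `par_iter()`.

```rust
use std::thread;

/// Hypothetical stand-in for reading and parsing one model file.
/// It returns the language name together with a dummy "size".
fn load_model(language: &str) -> (String, usize) {
    (language.to_string(), language.len())
}

/// Loads all models concurrently, one scoped thread per language.
/// Results are returned in the same order as the input.
fn load_all_parallel(languages: &[&str]) -> Vec<(String, usize)> {
    thread::scope(|scope| {
        // Spawn every loader first so that they all run concurrently.
        let handles: Vec<_> = languages
            .iter()
            .map(|&language| scope.spawn(move || load_model(language)))
            .collect();
        // Then wait for each one and collect its result.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("loader thread panicked"))
            .collect()
    })
}
```

One thread per language is fine for a sketch; a work-stealing pool such as the one Rayon provides avoids oversubscribing the CPU when there are many more models than cores.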

Accuracy Reports

  • Accuracy reports are now also generated for the CLD2 library and included in the language detector comparison plots. (#6)

Lingua 1.0.3

15 Jan 14:45

Bug Fixes

  • Lingua could not be used within other projects because the crate accidentally tried to expose a private serde module.
    Thanks to @luananama for reporting this bug. (#9)

Lingua 1.0.2

22 Nov 20:47

Bug Fixes

  • The previous fix for bug #3 was incomplete. This has been corrected.

Lingua 1.0.1

22 Nov 20:21

Bug Fixes

  • When trying to create new language models, the LanguageModelFilesWriter panicked
    when it encountered multi-byte characters in a text corpus.
    Thanks to @eekkaiia for reporting this bug. (#3)
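
The release note does not say what exactly caused the panic. One common way this class of bug arises in Rust is slicing a `&str` at a byte index that falls inside a multi-byte character. The sketch below shows two safe alternatives; both helper names are hypothetical and not part of Lingua.

```rust
/// Returns the first `n` characters of `text`, respecting UTF-8 boundaries.
/// If `text` has fewer than `n` characters, the whole string is returned.
fn first_chars(text: &str, n: usize) -> &str {
    match text.char_indices().nth(n) {
        // `byte_index` is where character number `n` starts, so it is a valid boundary.
        Some((byte_index, _)) => &text[..byte_index],
        None => text,
    }
}

/// Byte-based variant that returns `None` instead of panicking
/// when `n` is out of range or not on a character boundary.
fn first_bytes_checked(text: &str, n: usize) -> Option<&str> {
    text.get(..n)
}
```

For example, in "Māori" the letter "ā" occupies two bytes, so `&text[..2]` would panic, whereas `first_bytes_checked` reports the problem as `None` and `first_chars` counts characters rather than bytes.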

Lingua 1.0.0

21 Nov 15:48

This is the very first release of Lingua for Rust. Took me 5 months of hard work in my free time. Hope you find it useful. :)