Releases · pemistahl/lingua-rs

Lingua 1.6.2

12 Dec 20:45

pemistahl

Improvements

Type stubs for the Python bindings are now available, allowing better static code analysis, better code completion in supported IDEs and easier understanding of the library's API.

Bug Fixes

The method LanguageDetector.detect_multiple_languages_of still returned character indices instead of byte indices when only a single DetectionResult was produced. This has been fixed.

Lingua 1.6.1

23 Nov 22:26

pemistahl

Bug Fixes

The method LanguageDetector.detect_multiple_languages_of returns byte indices. For creating string slices in Python and JavaScript, character indices are needed but were not provided. This resulted in incorrect DetectionResults for Python and JavaScript. This has been fixed now by converting the byte indices to character indices. (pemistahl/lingua-py#192)
Some minor bugs in the WASM module have been fixed to prepare the first release of Lingua for JavaScript.
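The byte-versus-character distinction behind the first fix can be illustrated with the standard library alone. The sketch below is an illustration under assumptions, not the library's actual conversion code: `byte_to_char_index` is a hypothetical helper that maps a UTF-8 byte offset to an offset counted in Unicode scalar values, which is the unit Python uses for string slicing.

```rust
/// Hypothetical helper (not part of the library's API): converts a UTF-8
/// byte offset into `text` to an offset counted in Unicode scalar values.
/// Returns `None` if `byte_index` is out of range or falls inside a
/// multi-byte character.
fn byte_to_char_index(text: &str, byte_index: usize) -> Option<usize> {
    // `is_char_boundary` is false for offsets past the end of the string
    // and for offsets in the middle of a multi-byte sequence.
    if !text.is_char_boundary(byte_index) {
        return None;
    }
    // Count the characters that precede the byte offset.
    Some(text[..byte_index].chars().count())
}
```

For the text `Grüße, world`, byte offset 7 (just past `Grüße`) corresponds to character offset 5, because `ü` and `ß` each occupy two bytes in UTF-8. A consumer that indexes strings in UTF-16 code units, as JavaScript does, would count with `encode_utf16()` instead of `chars()`.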

Lingua 1.6.0

15 Nov 08:06

pemistahl

Features

Python bindings are now available for the library. These bindings replace the pure Python implementation of Lingua in order to benefit from Rust's performance in any Python software. (#262)
Parallel equivalents for all methods in LanguageDetector have been added to give the user the choice of using the library single-threaded or multi-threaded. (#271)

Bug Fixes

Several bugs in multi-language detection that caused incomplete results to be returned have been fixed.
A significant number of Kazakh texts were incorrectly classified as Mongolian. This has been fixed.

Lingua 1.5.0

13 Jun 17:47

pemistahl

Features

The new method LanguageDetector.detect_multiple_languages_of() has been introduced. It allows detecting multiple languages in mixed-language text. (#1)
The new method LanguageDetectorBuilder.with_low_accuracy_mode() has been introduced. By activating it, detection accuracy for short text is reduced in favor of a smaller memory footprint and faster detection performance. (#119)
The new method LanguageDetector.compute_language_confidence() has been introduced. It allows retrieving the confidence value for one specific language only, given the input text. (#102)

Improvements

The computation of the confidence values has been revised and the softmax function is now applied to the values, making them easier to compare because they behave more like real probabilities. (#120)
The WASM API has been revised. Now it makes use of the same builder pattern as the Rust API. (#122)
The language model files are now compressed with the Brotli algorithm, which reduces the file size by 15 % on average. (#189)
The language model ngrams are now stored in a CompactString type which reduces the amount of consumed memory by 20 %. (#198)
Several performance optimizations have been applied, making the library nearly twice as fast as the previous version. Big thanks go out to @serega and @koute for their help. (#82, #148, #177)
The enums IsoCode639_1 and IsoCode639_3 now implement some new traits such as Copy, Hash and Serde's Serialize and Deserialize. The enum Language now implements Copy as well. (#175)
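The softmax step mentioned in the first improvement above can be sketched generically. This is a minimal illustration, not the library's implementation: the function name `softmax` is an assumption, and the release note does not say which raw scores the library passes into the function.

```rust
/// Generic softmax over a slice of raw scores (illustrative only).
/// The outputs are positive and sum to 1, so they can be read like
/// probabilities and compared with each other.
fn softmax(scores: &[f64]) -> Vec<f64> {
    if scores.is_empty() {
        return Vec::new();
    }
    // Subtract the maximum before exponentiating for numerical stability;
    // this does not change the result.
    let max = scores.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = scores.iter().map(|s| (s - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}
```

Because every output lies between 0 and 1 and the outputs sum to 1, two equal raw scores map to 0.5 each, and a larger raw score always yields a larger output value.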

Contributors

serega and koute

Lingua 1.4.0

08 Apr 10:11

pemistahl

Features

The library can now be compiled to WebAssembly and be used in any JavaScript project. Big thanks to @martindisch for bringing this forward. (#14)

Improvements

Some minor performance tweaks have been applied to the rule engine.

Contributors

martindisch

Lingua 1.3.3

22 Feb 09:56

pemistahl

Bug Fixes

This release updates outdated dependencies and fixes an incompatibility between different versions of the include_dir crate which are used in the main lingua crate and the language model crates.

Lingua 1.3.2

19 Oct 22:48

pemistahl

Bug Fixes

Another compilation error has been fixed which occurred when the Latin language was left out as a Cargo feature.

Lingua 1.3.1

19 Oct 22:34

pemistahl

Bug Fixes

When Chinese, Japanese or Korean were left out as Cargo features, there were compilation errors. This has been fixed.

Lingua 1.3.0

19 Oct 21:39

pemistahl

Features

The language model dependencies are separate Cargo features now. Users can decide which languages shall be downloaded and used in the library. (#12)

Improvements

The code that does the lazy-loading of the language models has been refactored significantly, making the code more stable and less error-prone.

Bug Fixes

In very rare cases, the language returned by the detector was non-deterministic. This has been fixed. Big thanks to @asg0451 for identifying this problem. (#17)

Contributors

asg0451

Lingua 1.2.2

02 Jun 21:24

pemistahl

Features

The enums Language, IsoCode639_1 and IsoCode639_3 now implement std::str::FromStr in order to instantiate enum variants by string values. This comes in handy for JavaScript bindings and the like. (#15)
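What implementing std::str::FromStr buys can be shown with a small stand-in. The enum `Lang`, the error type `ParseLangError` and the case-insensitive matching below are all assumptions made for illustration; the accepted spellings and the error type of the library's own enums are not described in this release note.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for a language enum (not the library's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Lang {
    English,
    German,
}

/// Hypothetical error type carrying the input that could not be parsed.
#[derive(Debug, PartialEq, Eq)]
struct ParseLangError(String);

impl FromStr for Lang {
    type Err = ParseLangError;

    // Case-insensitive matching is an assumption of this sketch.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_lowercase().as_str() {
            "english" => Ok(Lang::English),
            "german" => Ok(Lang::German),
            other => Err(ParseLangError(other.to_string())),
        }
    }
}
```

With such an implementation in place, `"German".parse::<Lang>()` yields `Ok(Lang::German)`, which is convenient when variant names arrive as plain strings from another language's bindings.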

Improvements

The performance of preloading the language models has been improved.

Bug Fixes

Language detection for sentences with more than 120 characters was supposed to be done by iterating through trigrams only but this was never the case. This has been corrected.
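To make "iterating through trigrams" concrete, the sketch below extracts overlapping character trigrams from a string. It is only an illustration of the general idea: `char_trigrams` is a hypothetical helper, and how the library itself tokenizes and normalizes text before building n-grams is not stated here.

```rust
/// Hypothetical helper: returns all overlapping character trigrams of
/// `text`, in order. Texts shorter than three characters yield no trigrams.
fn char_trigrams(text: &str) -> Vec<String> {
    // Collect characters first so that windows are taken over characters,
    // not over UTF-8 bytes.
    let chars: Vec<char> = text.chars().collect();
    chars
        .windows(3)
        .map(|window| window.iter().collect::<String>())
        .collect()
}
```

For the input `hello`, this produces `hel`, `ell` and `llo`.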
