GlinerRust is a Rust library for Named Entity Recognition (NER) using ONNX models. It provides a simple and efficient way to perform NER tasks on text data using pre-trained models.
- Easy-to-use API for Named Entity Recognition
- Support for custom ONNX models
- Asynchronous processing for improved performance
- Configurable parameters for fine-tuning
Add this to your `Cargo.toml`:
```toml
[dependencies]
glinerrust = { git = "https://github.com/srv1n/Gliner-rs.git" }
```
Alternatively, you can clone the repository and use it locally:
```shell
git clone https://github.com/srv1n/Gliner-rs.git
cd Gliner-rs
```
Then, in your `Cargo.toml`, add:
```toml
[dependencies]
glinerrust = { path = "path/to/Gliner-rs" }
```
To run the provided example:
- Ensure you have Rust and Cargo installed on your system.
- Clone this repository:
```shell
git clone https://github.com/yourusername/glinerrust.git
cd glinerrust
```
- Download the required model and tokenizer files:
  - Place `tokenizer.json` in the project root
  - Place `model_quantized.onnx` in the project root
- Run the example using Cargo:
```shell
cargo run --example basic_usage
```
Note: Make sure you have the necessary ONNX model and tokenizer files before running the example. The specific model and tokenizer files required depend on your use case and the pre-trained model you're using.
The `InitConfig` struct allows you to customize the behavior of GlinerRust:
- `tokenizer_path`: Path to the tokenizer JSON file
- `model_path`: Path to the ONNX model file
- `max_width`: Maximum width for processing (optional)
- `num_threads`: Number of threads to use for inference (optional)
`Gliner` is the main struct for interacting with the GlinerRust library.
- `new(config: InitConfig) -> Self`: Create a new `Gliner` instance
- `initialize(&mut self) -> Result<(), GlinerError>`: Initialize the `Gliner` instance
- `inference(&self, input_texts: &[String], entities: &[String], ignore_subwords: bool, threshold: f32) -> Result<Vec<InferenceResultSingle>, GlinerError>`: Perform inference on the given input texts
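Putting the API together, a minimal usage sketch might look like the following. This is illustrative only: it assumes `Gliner` and `InitConfig` are exported at the crate root, that `InitConfig` is a plain struct with the fields listed above, and that `GlinerError` implements `std::error::Error`; the actual construction may differ.

```rust
use glinerrust::{Gliner, InitConfig};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Field names follow the configuration section above (assumed layout).
    let config = InitConfig {
        tokenizer_path: "tokenizer.json".to_string(),
        model_path: "model_quantized.onnx".to_string(),
        max_width: Some(512),
        num_threads: Some(4),
    };

    let mut gliner = Gliner::new(config);
    gliner.initialize()?;

    let texts = vec!["Steve Jobs founded Apple in Cupertino.".to_string()];
    let labels = vec![
        "person".to_string(),
        "organization".to_string(),
        "location".to_string(),
    ];

    // ignore_subwords = true, confidence threshold = 0.5
    let results = gliner.inference(&texts, &labels, true, 0.5)?;
    for result in &results {
        println!("{:?}", result);
    }
    Ok(())
}
```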
- Configuration struct for initializing a `Gliner` instance (`InitConfig`).
- Represents a single entity detected in the text.
- Represents the inference result for a single input text (`InferenceResultSingle`).
- Represents the inference results for multiple input texts.
The library uses a custom `GlinerError` enum for error handling, which includes:
- `InitializationError`: Errors that occur during initialization
- `InferenceError`: Errors that occur during the inference process
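Callers can branch on the two variants to report where a failure happened. The enum below is a hypothetical mirror of `GlinerError` written only to show the matching pattern; the payload types in the real crate may differ.

```rust
// Hypothetical stand-in for the library's error enum, for illustration only.
#[derive(Debug)]
enum GlinerError {
    InitializationError(String),
    InferenceError(String),
}

// Map each variant to a short, user-facing description.
fn describe(err: &GlinerError) -> &'static str {
    match err {
        GlinerError::InitializationError(_) => "failed during initialization",
        GlinerError::InferenceError(_) => "failed during inference",
    }
}
```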
- The `num_threads` option in `InitConfig` allows you to control the number of threads used for inference. Adjust this based on your system's capabilities.
- The `max_width` option can be used to limit the maximum input size, which can help manage memory usage for large inputs.
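One way to pick a sensible `num_threads` value is to ask the OS how many hardware threads are available. `pick_num_threads` below is a hypothetical helper, not part of the crate:

```rust
use std::thread;

// Choose a thread count from the hardware parallelism the OS reports.
// Returns None when the parallelism cannot be determined.
fn pick_num_threads() -> Option<usize> {
    thread::available_parallelism().ok().map(|n| n.get())
}
```

The returned value could then be passed as the `num_threads` field of `InitConfig`, falling back to a default such as `Some(1)` when detection fails.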
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the project
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
If you encounter any issues or have questions, please file an issue on the GitHub repository.