SPDX-FileCopyrightText | SPDX-License-Identifier |
---|---|
2024 PyThaiNLP Project | Apache-2.0 |
Node.js binding for nlpO3, a Thai natural language processing library in Rust.
- Thai word tokenizer
- Uses a maximal-matching, dictionary-based tokenization algorithm and honors Thai Character Cluster boundaries
- Fast backend in Rust
- Supports custom dictionaries
- Rust 2018 Edition
- Node.js v12 or newer
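To make the "maximal-matching" feature above concrete, here is a minimal, self-contained sketch of the general idea in TypeScript: among all ways to split a text using dictionary words, pick the one with the fewest tokens. This is only an illustration, not the nlpO3 implementation. The function name `maximalMatch` is hypothetical, characters not found in the dictionary are assumed to become single-character tokens, and Thai Character Cluster boundaries are not handled here.

```typescript
// Toy maximal matching (illustrative only, NOT the nlpO3 implementation).
// Picks the segmentation with the fewest tokens. Characters that no
// dictionary word covers are emitted as single-character tokens (assumption).
function maximalMatch(text: string, dict: Set<string>): string[] {
  const chars = Array.from(text); // split by code point
  const n = chars.length;
  // best[i]: minimum number of tokens needed to cover chars[i..n)
  const best = new Array<number>(n + 1).fill(Infinity);
  // next[i]: end index of the first token in an optimal cover of chars[i..n)
  const next = new Array<number>(n + 1).fill(-1);
  best[n] = 0;

  for (let i = n - 1; i >= 0; i--) {
    // Fallback: treat chars[i] as an unknown single-character token.
    best[i] = best[i + 1] + 1;
    next[i] = i + 1;
    for (let j = i + 1; j <= n; j++) {
      const word = chars.slice(i, j).join("");
      // "<=" prefers the longer first word when token counts tie.
      if (dict.has(word) && best[j] + 1 <= best[i]) {
        best[i] = best[j] + 1;
        next[i] = j;
      }
    }
  }

  // Walk the recorded choices to rebuild the token list.
  const tokens: string[] = [];
  for (let i = 0; i < n; i = next[i]) {
    tokens.push(chars.slice(i, next[i]).join(""));
  }
  return tokens;
}

// Example with a tiny hypothetical dictionary.
const toyDict = new Set(["สวัสดี", "ครับ"]);
console.log(maximalMatch("สวัสดีครับ", toyDict));
```

A greedy longest-match approach can produce more tokens than necessary: with the dictionary `ab`, `abc`, `cde` and the text `abcde`, taking `abc` first leaves `d` and `e` unmatched, while the search above finds `ab` + `cde`.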
# In this directory
npm run release
Before the build, your nlpo3/ directory should look like this:
- nlpo3/
- index.ts
- rust_mod.d.ts
After build:
- nlpo3/
- index.js
- index.ts
- rust_mod.d.ts
- rust_mod.node
For now, copy the whole nlpo3/ directory to your project after the build.
The npm package is still experimental and may not work on all platforms. Please report issues at https://github.com/PyThaiNLP/nlpo3/issues
npm i nlpo3
In JavaScript:
const nlpO3 = require(`${path_to_nlpo3}`)
// load dictionary and tokenize a text with it
nlpO3.loadDict("path/to/dict.file", "dict_name")
nlpO3.segment("สวัสดีครับ", "dict_name")
In TypeScript:
import {segment, loadDict} from `${path_to_nlpo3}/index`
// load custom dictionary and tokenize a text with it
loadDict("path/to/dict.file", "dict_name")
segment("สวัสดีครับ", "dict_name")
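The examples above pass a dictionary file path to `loadDict`. As a rough sketch of how such a file might be prepared, the helper below turns a word list into file content. It assumes the dictionary is a plain UTF-8 text file with one word per line; that format is an assumption here, so check it against the nlpO3 documentation. `buildDictContent` is a hypothetical name and not part of the binding.

```typescript
// Hypothetical helper, not part of nlpO3.
// ASSUMPTION: a dictionary file is plain text with one word per line.
function buildDictContent(words: string[]): string {
  const unique = new Set<string>();
  for (const raw of words) {
    const word = raw.trim(); // drop surrounding whitespace
    if (word.length > 0) {
      unique.add(word); // a Set removes duplicates and keeps first-seen order
    }
  }
  if (unique.size === 0) {
    return "";
  }
  return Array.from(unique).join("\n") + "\n";
}

// Example: duplicates and blank entries are dropped.
console.log(buildDictContent([" สวัสดี ", "ครับ", "", "สวัสดี"]));
```

The resulting string can be saved to a file (for example with Node's `fs.writeFileSync`) and that path passed to `loadDict` together with a dictionary name of your choice.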
Please report issues at https://github.com/PyThaiNLP/nlpo3/issues
- Find a way to build binaries and publish on npm.
The nlpO3 Node.js binding is copyrighted by its authors and licensed under the terms of the Apache License 2.0 (Apache-2.0). See the LICENSE file for details.