MeCab (Japanese Morphological Analyzer) bindings for Dart (standalone Dart and Flutter) on all platforms.

CaptainDario/mecab_for_dart

 
 


mecab_for_dart

MeCab (Japanese Morphological Analyzer) bindings for Dart (standalone Dart and Flutter) on all platforms. Try it out in the browser.

Supported platforms: Android, iOS, Windows, macOS, Linux, Web, Web (--wasm)

Installation

  1. Add this plugin as a dependency in your pubspec.yaml file.
dependencies:
  mecab_for_dart: <your_version>
Windows-only setup: create a `blobs` folder at the top level of your application and copy the DLLs from `example/blobs` into it. Then open `windows/CMakeLists.txt` of your application and append the following at the end:
# Include the mecab binary
message(STATUS "Detected processor architecture: ${CMAKE_SYSTEM_PROCESSOR}")
if(CMAKE_SYSTEM_PROCESSOR STREQUAL "ARM64")
    set(MECAB_DLL ${PROJECT_BUILD_DIR}/../blobs/libmecab_arm64.dll)
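# CMake usually reports 64-bit Intel/AMD hosts as "AMD64" on Windows, so accept both spellings
elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "^(x86_64|AMD64)$")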
    set(MECAB_DLL ${PROJECT_BUILD_DIR}/../blobs/libmecab_x86.dll)
endif()

install(
  FILES
    ${MECAB_DLL}
  DESTINATION
    ${INSTALL_BUNDLE_DATA_DIR}/../blobs/
  RENAME
    libmecab.dll
)

Example

Initialize Mecab:

final tagger = Mecab();
await tagger.init("path/to/your/dictionary/", true);

Set the boolean argument of `init` to `true` to get tokens including their features, or to `false` to get only the token surfaces.

Use the tagger to parse text:

final tokens = tagger.parse('にわにわにわにわとりがいる。');
final buffer = StringBuffer();

// One line per token: the surface, a tab, then the comma-separated features.
for (final token in tokens) {
  buffer.writeln('${token.surface}\t${token.features.join(',')}');
}
final text = buffer.toString();
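
To see the output format of the loop above without a dictionary installed, here is a minimal, self-contained sketch. `SketchToken` and `formatTokens` are hypothetical names introduced only for illustration. `SketchToken` models just the `surface` and `features` fields used above and is not the library's token type. The sample values are hand-written and are not real MeCab output.

```dart
// Hypothetical stand-in for the token objects returned by `tagger.parse`;
// it models only the two fields used in the example above.
class SketchToken {
  final String surface;
  final List<String> features;
  const SketchToken(this.surface, this.features);
}

/// Formats each token as `surface<TAB>feature1,feature2,...`, one per line.
String formatTokens(Iterable<SketchToken> tokens) {
  final buffer = StringBuffer();
  for (final token in tokens) {
    buffer.writeln('${token.surface}\t${token.features.join(',')}');
  }
  return buffer.toString();
}

void main() {
  // Hand-written sample data, not real MeCab output.
  const tokens = [
    SketchToken('庭', ['名詞', '一般']),
    SketchToken('に', ['助詞', '格助詞']),
  ];
  print(formatTokens(tokens));
}
```

Each token becomes one line, with a tab between the surface and its features. If the tagger was initialized without features, the feature list would presumably be empty and each line would hold only the surface followed by a tab.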

Notes for web usage

This library tries to load the MeCab dictionary from the WASM filesystem. The easiest way to get the dictionary there is to bundle it when compiling MeCab to WASM. If you want to swap dictionaries instead, you need to load the dictionary into libmecab's WASM memory.

Building the binaries

Windows

Because MeCab is compiled with nmake on Windows, the MeCab DLL needs to be built separately. To do so, open a Developer Command Prompt and change into the windows/src directory. There, run nmake -f Makefile.x64.msvc (when compiling on x86) or nmake -f Makefile.arm64.msvc (when compiling on arm64). Once the build has finished, there should be a libmecab.dll in windows/src.

Web

On web this plugin uses WASM.

This project uses Emscripten to compile for WASM, so Emscripten must be installed first. A WASM binary can then be compiled by running compile_wasm_bare.sh (no dictionary included) or compile_wasm_embed.sh (ipadic embedded). This generates libmecab.js and libmecab.wasm in the folder emcc_out/. Your application then needs to load those files. For more details, see the example.
