maccha is a project that calculates sentence similarity using word mover's distance.
So far, it supports only Japanese.
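To give a rough feel for what word mover's distance measures, here is a minimal sketch of the relaxed lower bound on it: each word simply moves all of its weight to the nearest word in the other sentence. This is not maccha's own code, and it is not the exact distance, which requires solving an optimal transport problem. The function names, the pre-tokenized input, and the toy 2-d vectors are all assumptions made up for illustration.

```python
import math
from collections import Counter


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def relaxed_wmd(tokens_a, tokens_b, vectors):
    """Relaxed lower bound on word mover's distance (hypothetical helper).

    Each word sends its whole normalized weight to the closest word of the
    other sentence; the larger of the two one-sided costs is returned.
    """
    if not tokens_a or not tokens_b:
        raise ValueError("both sentences need at least one token")

    def one_sided(src, dst):
        counts = Counter(src)
        total = sum(counts.values())
        targets = set(dst)
        return sum(
            (n / total) * min(euclidean(vectors[w], vectors[t]) for t in targets)
            for w, n in counts.items()
        )

    return max(one_sided(tokens_a, tokens_b), one_sided(tokens_b, tokens_a))


# Toy vectors, invented for this example (real word vectors have many more dimensions).
toy_vectors = {
    "local": (1.0, 0.0),
    "environment": (0.0, 1.0),
    "build": (1.0, 1.0),
}

print(relaxed_wmd(["local", "build"], ["local", "build"], toy_vectors))        # 0.0
print(relaxed_wmd(["local", "environment"], ["local", "build"], toy_vectors))  # 0.5
```

Identical token lists give a distance of 0.0, and the value grows as the words of one sentence lie farther from those of the other in vector space.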
To install the required modules, simply run:
$ pip install -r requirements.txt
maccha requires NEologd, so please install it beforehand.
First, download the word vectors and the vocabulary dictionary, and store them in the data directory.
For downloading files, please access
Once the download has finished, unzip the file into maccha/data.
Please run the test to see if it works correctly:
$ python -m unittest tests.word_mover
If the following messages are displayed, everything is fine!
Distance between "JavaScript" and "JavaScript 2014" is 2.087188959121704.
Distance between "DexIndexOverflowExceptionと戦った話" and "AWS×Imagick×facedetectで困った話" is 2.034774008499384.
Distance between "ゆるっとローカル環境を作る" and "ローカル環境を作る。" is 0.0.
Distance between "PHP5.6のインストール" and "PHP5.4をインストール" is 0.0.