微笑み converted to "bi emi" instead of "hohoemi" #62

Closed
louy2 opened this issue Feb 28, 2019 · 2 comments

louy2 commented Feb 28, 2019

微笑み is broken down into 微 and 笑み, so the resulting romaji becomes "bi emi" when it should be "hohoemi". This seems to be a limitation of kuromoji, so I also opened an issue there.

takuyaa/kuromoji.js#36

louy2 (Author) commented Feb 28, 2019

Yahoo-WebAPI does not seem to have this issue, but we might want to customize the dictionary, so it would be best if we could keep using kuromoji.

hexenq (Owner) commented Mar 3, 2019

Thanks for your feedback. It's as you said. It's up to you to choose a dictionary that meets your tokenization and pronunciation requirements (#11), although kuromoji.js limits which dictionaries can be used. For now, kuroshiro will not change the results returned by the morphological analyzers, but we could think about a way to get better results. You're welcome to open a new issue if you have any ideas or suggestions. Closed.
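For readers who hit the same problem, one possible workaround is to correct known mis-split words after tokenization and before romanization. The following is only a sketch under stated assumptions, not something kuroshiro or kuromoji.js provides: the `Token` shape, the `READING_OVERRIDES` table, and the `applyOverrides` helper are all hypothetical names made up for illustration, and the sample tokens mirror the 微 + 笑み split described in this issue.

```typescript
// Hypothetical token shape: only the two fields this sketch needs.
interface Token {
  surface: string; // text as it appears in the input
  reading: string; // katakana reading reported by the analyzer
}

// Hypothetical override table: surface form -> corrected katakana reading.
const READING_OVERRIDES: Record<string, string> = {
  "微笑み": "ホホエミ",
};

// Merge runs of adjacent tokens whose joined surface matches an override,
// preferring the longest match that starts at each position.
function applyOverrides(
  tokens: Token[],
  overrides: Record<string, string>
): Token[] {
  const result: Token[] = [];
  let i = 0;
  while (i < tokens.length) {
    let merged: Token | null = null;
    let consumed = 0;
    let surface = "";
    for (let j = i; j < tokens.length; j++) {
      surface += tokens[j].surface;
      if (Object.prototype.hasOwnProperty.call(overrides, surface)) {
        merged = { surface: surface, reading: overrides[surface] };
        consumed = j - i + 1;
      }
    }
    if (merged !== null) {
      result.push(merged); // replace the whole run with one corrected token
      i += consumed;
    } else {
      result.push(tokens[i]); // no override applies, keep the token as-is
      i += 1;
    }
  }
  return result;
}

// Illustrative input: the split reported in this issue.
const sample: Token[] = [
  { surface: "微", reading: "ビ" },
  { surface: "笑み", reading: "エミ" },
];
console.log(applyOverrides(sample, READING_OVERRIDES));
```

The merged tokens could then be handed to whatever kana-to-romaji step is already in use. The table has to be maintained by hand, so this only helps for a known list of words and does not address the underlying dictionary limitation.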

hexenq closed this as completed Mar 3, 2019