Custom mappings #70
…omaji. This module is supposed to replace the constants in constants.js like FROM_ROMAJI. For now only romaji -> kana is implemented this way.
…s transliterate la to a little a instead
…o a single token; Change expected results for n edge cases, because without IME mode, all ns should be transliterated
…te parameter of toKana
At a glance, looks like there's some good things happening here.
This seems fine, but the static tests aren't adequate to ensure IME mode truly works correctly; I'll have to confirm manually with some different devices/keyboards.
Is your local/dev environment node 6 or below? Object.entries is 7+
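For runtimes that lack `Object.entries`, one possible workaround is a small fallback built on `Object.keys`. The following is a minimal sketch, assuming a hypothetical helper named `objectEntries`; it is not code from this PR or from WanaKana.

```javascript
// Hypothetical helper (not part of WanaKana): returns [key, value] pairs
// for an object's own enumerable string-keyed properties.
function objectEntries(obj) {
  if (typeof Object.entries === 'function') {
    // Use the native implementation when the runtime provides it.
    return Object.entries(obj);
  }
  // Fallback for runtimes without Object.entries.
  return Object.keys(obj).map(function (key) {
    return [key, obj[key]];
  });
}

// Sample data is illustrative only.
const pairs = objectEntries({ a: 'あ', ka: 'か' });
console.log(pairs.length); // 2
```

Calling the helper instead of `Object.entries` directly keeps the calling code identical on both old and new runtimes.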
Hmm, good question. I guess that makes perfect sense.
Agreed, nice catch.
This was before my time, but I guess the reasoning was the whole L/R combo sound of the R kana. I prefer to follow the IMEs here as well. The other day I was struggling to write a little ぁ and it should be possible to do with WanaKana. Supporting non-standard wacky romaji doesn't seem reasonable to me.
I believe it was intentional since the other IMEs still convert these and I'd say it falls under the umbrella of expected behaviour, disregarding whether the Japanese is nonsense. We should probably keep this in to keep backwards compat, and I don't really see a downside to allowing it.
A branch per feature/change/fix is a reasonable rule.
… to https://en.wikipedia.org/wiki/Sokuon" This reverts commit 1a0c293. Revert commit 1a0c293 that removed the superfluous double consonants tests, because they seem to be intended
Would you be willing to update #65 (dumb/simple conversion) and other potential future requests (revised hepburn etc) could be solved via …
Sure. Right now I'm first going to revert the double consonant test commit and then implement the sokuon even for consonants that don't actually take one in Japanese, by simply changing the …
…n mapping, without making any assumptions about IMEMode etc.
…abic_n], `ちゃっちゃ` should be transliterated as `chatcha`
… romanization methods enum
I really appreciate what you're doing here, I've just been a bit busy. Are you waiting on me to take a look at the katakanaToHiragana stuff that's going on?
Both, I guess 😆
I'm with you on this, I can't see a valid reason either that it has to convert … I'm fairly busy at the moment as well; hoping to go through this soon since I prefer the approach. I think it would also be a good idea once this has been merged (in a separate PR, probably by me) to simplify the punctuation conversion to just match Google IME. Since users will now be able to pass their own custom processing for wonderfully confusing choices like (the currently implemented...)
Merged to dev for completion.
So after discussing issue #68, the `toKana` or rather the `splitIntoKana` function seemed to be too complex to allow for changes like having a custom mapping from romaji to kana. That's why I took it upon myself to create a new method for converting romaji to kana, in order to make things easier in the long run.

High level descriptions of the most important changes:

- … `FROM_ROMAJI`) but a tree where each level is a romaji character (so instead of `map['cha']` you now do `map['c']['h']['a']['']`);
- `toKana` still functions to the outside world as it did before
- … `customKanaMapping` property, that defaults to the identity function; it takes as input a mapping tree from romaji to kana and returns a new tree (it does not modify the input in place) that is the updated mapping (the result has to be the complete mapping, not just the updates)
- `IMEMode` and `useObsoleteKana` are also implemented as custom mappings; for compatibility, they are still used via the `options` object, so you do not handle those mappings explicitly
- … `ttsu` now form a unit by themselves, so they are only replaced once you write out the whole thing

Some issues:
The script works fine in Google Chrome, but `npm run test` gives me an error saying that `Object.entries` is not a function. I did manage to run the tests though by adding … to the top of the `constants.js` file, but I'm not sure how this works (I have no experience with `node.js` whatsoever, I just used the example from object.entries).

I had to change the tests for the `ん` edge cases. I did that because, for example, I think that `nn` should be replaced by `ん` in IME mode, but `んん` without IME mode. Is that wrong? Also, I replaced `lwe` -> `ゎ` with `lwa` -> `ゎ`; I think that was just a typo by someone, since neither Microsoft's nor Google's IMEs do that. Speaking of which, just like `ltsu` gets replaced by `っ`, they transliterate `la` to `ぁ` instead of `ら`, which is what WanaKana was doing. I changed that, too. I also removed some tests that check the conversion of doubled consonants that are not supposed to take a sokuon, according to Wikipedia, like `mma` -> `っま`, but because it was explicitly checked by the test, I don't know if this incorrect behaviour was intended or not.

Because these changes are quite severe and only one direction has been adapted so far (the other one is still implemented like it used to be), I made them in a new branch (`customMappings`). Is that how you're supposed to use branches? This is the first time I'm contributing to a project that is not my own, so I'm not really sure how to use GitHub in this regard.
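
To make the tree-shaped mapping and the custom-mapping function described in this pull request more concrete, here is a rough sketch. It is not the PR's actual code: `buildTree`, `lookup`, `cloneTree`, and `mapLaToRa` are hypothetical names, and the sample table is a tiny illustrative subset. It only assumes what the description states: each tree level is one romaji character, the kana sits under the `''` key, and a custom mapping returns a complete new tree without modifying its input.

```javascript
// Illustrative sketch only; all names here are hypothetical, not WanaKana's API.

// Build a tree from a flat romaji -> kana table. Each level is one romaji
// character; the kana for a complete sequence is stored under the '' key.
function buildTree(flatMap) {
  const root = {};
  for (const romaji of Object.keys(flatMap)) {
    let node = root;
    for (const char of romaji) {
      if (!node[char]) node[char] = {};
      node = node[char];
    }
    node[''] = flatMap[romaji];
  }
  return root;
}

// Walk the tree one character at a time; returns undefined if nothing matches.
function lookup(tree, romaji) {
  let node = tree;
  for (const char of romaji) {
    node = node[char];
    if (node === undefined) return undefined;
  }
  return node[''];
}

// Recursively copy a tree so a custom mapping does not mutate its input.
function cloneTree(tree) {
  const copy = {};
  for (const key of Object.keys(tree)) {
    copy[key] = key === '' ? tree[key] : cloneTree(tree[key]);
  }
  return copy;
}

// A custom mapping in the style described above: it receives the complete
// tree and returns a new complete tree in which `la` yields ら instead of ぁ.
function mapLaToRa(tree) {
  const updated = cloneTree(tree);
  updated.l = updated.l || {};
  updated.l.a = Object.assign({}, updated.l.a, { '': 'ら' });
  return updated;
}

const defaultTree = buildTree({ cha: 'ちゃ', chi: 'ち', la: 'ぁ', ltsu: 'っ' });
const customTree = mapLaToRa(defaultTree);

console.log(defaultTree['c']['h']['a']['']); // ちゃ
console.log(lookup(defaultTree, 'la')); // ぁ (input tree is unchanged)
console.log(lookup(customTree, 'la')); // ら
```

Because the custom mapping returns a full copy rather than a patch, a caller can swap trees without any merge step, which matches the "complete mapping, not just the updates" requirement stated above.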