Add Traditional Chinese support #121

Closed
wants to merge 7 commits into from


@fshiori fshiori commented Sep 27, 2018

Add Traditional Chinese support

@michaelwittig
Owner

Hi @fshiori, thanks for your contribution. So far, the language codes in this library are ISO 639-1 codes.

zh-tw is not in this list and might be related to #47

Could we use an ISO 639-3 language code listed here: https://en.wikipedia.org/wiki/ISO_639_macrolanguage#zho ? I'm not an expert on languages at all and need help here.

@fshiori
Author

fshiori commented Oct 1, 2018

Hello @michaelwittig
I also suggest using BCP 47 in place of ISO 639-1/639-3, as in #47.

This is because Mandarin Chinese (ISO 639-1: zh / ISO 639-3: cmn / BCP 47: zh-Hans) and Taiwanese Mandarin (BCP 47: zh-Hant) are written differently.

@michaelwittig
Owner

I believe #47 is about entering a BCP 47 code and using only the part before the -. But your case is different because zh and zh-tw are two different things.
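
To illustrate the difference, here is a minimal sketch in Python of the truncation approach as I understand it. It is not this library's code, and `primary_subtag` is a hypothetical helper name. It shows that cutting a tag at the first separator maps zh-tw and zh to the same code, which is exactly the distinction this PR needs to keep.

```python
def primary_subtag(tag: str) -> str:
    """Return the part of a language tag before the first separator, lowercased.

    Hypothetical helper for illustration only; not part of this library.
    """
    # Accept both "-" and "_" as separators, since both appear in practice.
    return tag.replace("_", "-").split("-", 1)[0].lower()


for tag in ["en-US", "pt-BR", "zh", "zh-tw"]:
    # "zh" and "zh-tw" both collapse to "zh", so the script distinction is lost.
    print(tag, "->", primary_subtag(tag))
```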

@michaelwittig
Owner

#94 seems to solve the same use case

@michaelwittig
Owner

I found this:
"How does one make distinctions between traditional and simplified Chinese characters and using the ISO 639 language codes?
The differences between traditional and simplified Chinese characters cannot be represented using the ISO 639 codes because these are distinctions in script. The character sets can be coded using ISO 15924 (Code for the Representation of Names of Scripts) script codes as subtags appended to the primary subtag for Chinese."
http://www.loc.gov/standards/iso639-2/faq.html#23

@fshiori
Author

fshiori commented Oct 1, 2018

Yes, #94 solves the same use case.
And the link describes my question, too.

We can't distinguish between Simplified Chinese and Traditional Chinese using ISO 639 alone.
Before BCP 47, we usually used zh-tw for Traditional Chinese and zh-cn for Simplified Chinese.
https://en.wikipedia.org/wiki/Language_localisation

BCP 47 adds script subtags based on ISO 15924 codes, so we can use zh-Hant for Traditional Chinese, zh-Hans for Simplified Chinese, and simply zh for Chinese.
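
As a rough sketch of how the older region-based tags could be mapped onto script-based ones (Python used only for illustration): the function name `chinese_script_tag` and the region table below are assumptions of this example, not part of this library or this PR, and the table is not exhaustive.

```python
# Assumed default script per region for Chinese; illustrative, not exhaustive.
DEFAULT_SCRIPT_BY_REGION = {
    "TW": "Hant",
    "HK": "Hant",
    "MO": "Hant",
    "CN": "Hans",
    "SG": "Hans",
}


def chinese_script_tag(tag: str) -> str:
    """Map a Chinese tag such as zh-tw or zh-Hant-HK to zh-Hant, zh-Hans, or zh."""
    parts = tag.replace("_", "-").split("-")
    if parts[0].lower() != "zh":
        raise ValueError("not a Chinese language tag: " + tag)

    # An explicit four-letter script subtag (ISO 15924) takes priority.
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            return "zh-" + part.title()

    # Otherwise guess the script from a two-letter region subtag.
    for part in parts[1:]:
        if len(part) == 2 and part.isalpha():
            script = DEFAULT_SCRIPT_BY_REGION.get(part.upper())
            if script is not None:
                return "zh-" + script

    # No script and no known region: fall back to plain "zh".
    return "zh"


for tag in ["zh-tw", "zh-cn", "zh-Hant-HK", "zh"]:
    print(tag, "->", chinese_script_tag(tag))
```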

@michaelwittig
Owner

I believe that switching from ISO 639-1 codes to BCP47 is more complicated.

zh, az and ar are not valid "languages" according to https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry They are macrolanguages.

zh should likely be changed to zh-Hans
ar I have no idea
az I have no idea
...
the list continues

@fshiori
Author

fshiori commented Oct 12, 2018

Needs more work.

@fshiori fshiori closed this Oct 12, 2018
@williamli

I think you can extend the current language code to use IETF language tags https://en.wikipedia.org/wiki/IETF_language_tag

en_us = US English
en_gb = UK English

zh_hk = Hong Kong Traditional Chinese
zh_tw = Taiwan Traditional Chinese
zh_cn = Mainland Simplified Chinese
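
Note that IETF language tags are conventionally written with hyphens, with the region in upper case and the script in title case (en-US, zh-TW, zh-Hant). A minimal sketch of normalizing codes like the ones above, assuming a hypothetical helper `normalize_tag` and handling only the simple language, script, and region subtags:

```python
def normalize_tag(tag: str) -> str:
    """Rewrite a code such as en_us or ZH-hant-hk using conventional tag casing.

    Hypothetical helper; covers only language, script, and region subtags.
    """
    parts = tag.replace("_", "-").split("-")
    out = [parts[0].lower()]  # language subtag is lower case
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            out.append(part.title())  # script subtag, e.g. Hant
        elif len(part) == 2 and part.isalpha():
            out.append(part.upper())  # region subtag, e.g. TW
        else:
            out.append(part.lower())  # anything else is left lower case
    return "-".join(out)


for code in ["en_us", "en_gb", "zh_hk", "zh_tw", "zh_cn"]:
    print(code, "->", normalize_tag(code))
```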

@nathanaelmartin

We need to distinguish:
zh-Hans: Simplified Chinese
zh-Hant: Traditional Chinese
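
One way to keep both while still accepting longer tags is a truncating lookup, loosely in the spirit of the lookup scheme in RFC 4647 but simplified. This is only a sketch: `lookup`, the `AVAILABLE` list, and the "en" default are assumptions made for the example, not this library's API.

```python
from typing import Iterable

# Hypothetical set of supported locales for this example.
AVAILABLE = ["en", "zh-Hans", "zh-Hant"]


def lookup(requested: str, available: Iterable[str], default: str = "en") -> str:
    """Return the best supported tag by dropping trailing subtags one at a time."""
    by_lower = {tag.lower(): tag for tag in available}
    parts = requested.replace("_", "-").lower().split("-")
    while parts:
        candidate = "-".join(parts)
        if candidate in by_lower:
            return by_lower[candidate]
        parts.pop()  # drop the last subtag and try the shorter tag
    return default


print(lookup("zh-Hant-TW", AVAILABLE))
print(lookup("zh-Hans", AVAILABLE))
print(lookup("zh-TW", AVAILABLE))
```

With this list, a region-only tag such as zh-TW finds no match and falls through to the default, so region-based codes would still need to be mapped to a script first.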


5 participants