[core] i18n: Fresh Start #1074

Open
wants to merge 12 commits into
base: master

confused-Techie
Member

@confused-Techie confused-Techie commented Aug 9, 2024

This PR builds off of the fantastic work by @meadowsys in #715 to attempt getting Pulsar up and running with translation support.

While some aspects of this PR are inspired by or directly borrowed from Meadows' work, what helped most was the initial research that had already been done, which let me implement many of those ideas directly.

This PR now represents a completely functional implementation of translations for Pulsar and community packages. To dive a little deeper, it's worth listing each way text appears in Pulsar and how we translate it:

  • UI: This is the easiest, able to be translated with the i18n API.
  • Settings: These are translatable via LocaleLabels
  • Context Menu: Translatable via LocaleLabels
  • Menu: Translatable via LocaleLabels
  • Command Palette: Does not support translation. I've opted to leave the command-palette untouched, since its entries are generated directly from the command names. In its current form I fear we couldn't implement translations without breaking changes, so I thought it better to leave that attempt for its own PR.

Getting Started

To implement translations in a community project (or in Pulsar, for that matter), simply add a locales folder in the root of the project. This folder should contain your translation files, named like so: package-name.locale.json|cson.

When your project is initialized, these files are read automatically, just like menus. In the case of Pulsar, the i18n.initialize() function reads Pulsar's own locales files automatically.

As the files are read, the loader looks at each file's locale and picks out the ones that may apply to the user. Because we have no awareness of how complete each file is at load time, we want to load as many as possible without doing wasted work. So we look at every language the user might fall back to at some point, and load any locale that is on that list.

From here, all loaded files are available via i18n.strings (although this should not be accessed directly). This key-value store is then used to match all translations, each accessible via a keypath like pulsar.context-menu.core:undo.
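
As a rough illustration of the file selection described above (not the PR's actual code), the sketch below parses file names of the form package-name.locale.json|cson and keeps only those whose locale is on the user's list. The helper names parseLocaleFileName and filterLocaleFiles are hypothetical, and the case-insensitive comparison is an assumption.

```javascript
// Hypothetical helper, not part of the PR's API.
// Splits "package-name.locale.ext" into its parts, or returns null
// when the file name does not follow that pattern.
function parseLocaleFileName(fileName) {
  const match = /^(.+)\.([A-Za-z0-9-]+)\.(json|cson)$/.exec(fileName);
  if (!match) return null;
  return { packageName: match[1], locale: match[2], extension: match[3] };
}

// Keeps only the files whose locale appears on the user's fallback list.
// Comparing case-insensitively is an assumption made for this sketch.
function filterLocaleFiles(fileNames, fallbackList) {
  const wanted = new Set(fallbackList.map((tag) => tag.toLowerCase()));
  return fileNames.filter((name) => {
    const parsed = parseLocaleFileName(name);
    return parsed !== null && wanted.has(parsed.locale.toLowerCase());
  });
}

const files = ["pulsar.en.cson", "pulsar.es-MX.json", "pulsar.ar.cson", "README.md"];
const loaded = filterLocaleFiles(files, ["es-MX", "es", "en"]);
console.log(loaded); // [ 'pulsar.en.cson', 'pulsar.es-MX.json' ]
```

Here pulsar.ar.cson is skipped because ar is not on the list, and README.md is skipped because it does not match the naming pattern.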

Translating Strings

Each translated string has full support for the ICU Message Syntax, which allows plurals, replacements, and much more. This is all provided by Intl-MessageFormat.

Keep in mind that items needing additional properties, such as replacement values, can only be translated via the i18n API, as extra properties cannot be passed via LocaleLabels.

Methods of Translation

There are a few different ways to get a string translated, so let's take a look at all of them.

LocaleLabel

In some cases, it's impossible to access the i18n API to translate a string, such as in the files in your menus directory. Since these are cson|json files, they cannot run JS code. To translate these items we use what I'm dubbing a LocaleLabel, which is simply a keyPath surrounded by % that corresponds to a string accessible to the i18n API, such as one stored in your locales directory.

For example, let's say the contents of ./locales/pulsar.en.cson looked like:

'pulsar': {
  'context-menu': {
    'core:undo': 'Undo'
  }
}

And I wanted this string to appear from ./menus/win32.cson:

'context-menu': 
  'atom-text-editor, .overlayer': [
    {label: '%pulsar.context-menu.core:undo%', command: 'core:undo'}
  ]

The above will successfully translate the label of this context menu item when it appears for the user.

The LocaleLabel method of translation is supported in:

  • Config: Only supported on the title and description key.
  • Application Menu: Only supported on the label key.
  • Context Menu: Only supported on the label key.
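
To make the LocaleLabel idea concrete, here is a minimal sketch of how a %keyPath% label could be detected and resolved. It is not the PR's implementation: isLocaleLabel and resolveLocaleLabel are hypothetical names, and the translate argument stands in for something like atom.i18n.t. Returning the original label when no translation is found is also an assumption.

```javascript
// A LocaleLabel is a keypath wrapped in percent signs, e.g. "%pulsar.ui.myString%".
const LOCALE_LABEL_PATTERN = /^%(.+)%$/;

// Hypothetical check; only strings of the form %...% count as LocaleLabels.
function isLocaleLabel(value) {
  return typeof value === "string" && LOCALE_LABEL_PATTERN.test(value);
}

// Hypothetical resolver. `translate` is any function that maps a keypath
// to a translated string (or undefined when nothing is found).
function resolveLocaleLabel(value, translate) {
  if (!isLocaleLabel(value)) return value;
  const keyPath = LOCALE_LABEL_PATTERN.exec(value)[1];
  const translated = translate(keyPath);
  // Assumption: keep the original label if no translation exists.
  return typeof translated === "string" ? translated : value;
}

// Stand-in translation table for the demo.
const demoStrings = { "pulsar.context-menu.core:undo": "Undo" };
const demoTranslate = (keyPath) => demoStrings[keyPath];

console.log(resolveLocaleLabel("%pulsar.context-menu.core:undo%", demoTranslate)); // Undo
console.log(resolveLocaleLabel("Plain label", demoTranslate)); // Plain label
```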

API

For all other cases of translation, when we have access to the i18n API, we have more freedom.

Let's say we have ./locales/pulsar.en.cson:

'pulsar': {
  'ui': {
    'myString': 'Hello World'
  }
}

The simplest way to translate this would be:

const str = atom.i18n.t("pulsar.ui.myString");

And if we needed to pass any replacements or extra parameters to Intl-MessageFormat, we would do so like this:

const str = atom.i18n.t("pulsar.ui.myString", opts);

But let's say we wanted easy access to our namespace, or our package's namespace:

const t = atom.i18n.getT("pulsar");

const str = t.t("ui.myString");

This saves us from having to type the full API call, as well as the name of the package, dozens or hundreds of times.
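
Conceptually, a namespaced translator only needs to prefix each keypath before delegating. The sketch below shows that idea in isolation; makeNamespacedT is a hypothetical name, and fakeTranslate is a stand-in for atom.i18n.t, so this is not necessarily how getT is implemented in the PR.

```javascript
// Hypothetical sketch: returns an object whose t() prefixes the namespace
// onto the keypath and forwards any options unchanged.
function makeNamespacedT(namespace, translate) {
  return {
    t: (keyPath, opts) => translate(`${namespace}.${keyPath}`, opts),
  };
}

// Stand-in for atom.i18n.t that records the keypaths it was asked for.
const seenKeyPaths = [];
const fakeTranslate = (keyPath) => {
  seenKeyPaths.push(keyPath);
  return keyPath === "pulsar.ui.myString" ? "Hello World" : undefined;
};

const namespacedT = makeNamespacedT("pulsar", fakeTranslate);
console.log(namespacedT.t("ui.myString")); // Hello World
console.log(seenKeyPaths); // [ 'pulsar.ui.myString' ]
```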

How does a user control translations?

To control translations you simply have two settings:

  • Primary Locale: The user's most specific primary locale, such as en-US.
  • Priority List: A list, ordered from highest to lowest priority, of the most specific locales the user would like to see as alternatives.

These two values are used, along with the hardcoded default fallback of en, to construct the list of languages to display to the user. This list is created in accordance with the RFC 4647 "Lookup Filtering Fallback Pattern".

What this means is that for every single entry, we keep falling back to less and less specific locales of that language before moving on to the next option.

For example:

  • Primary: es-MX
  • Priority List: zh-Hant-CN, ja-JP
  • Constant hardcoded fallback: en

Our priority list of locales would be:

[
  'es-MX',
  'es',
  'zh-Hant-CN',
  'zh-Hant',
  'zh',
  'ja-JP',
  'ja',
  'en'
]
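
The construction of that list can be sketched in a few lines. This is a simplified illustration rather than the PR's code: buildFallbackList is a hypothetical name, duplicates are dropped as an assumption, and RFC 4647's special handling of single-character subtags is left out.

```javascript
// Hypothetical sketch: expands each locale into progressively less specific
// tags (zh-Hant-CN -> zh-Hant -> zh), in priority order, ending with the
// hardcoded fallback.
function buildFallbackList(primary, priorityList, hardcodedFallback = "en") {
  const result = [];
  for (const tag of [primary, ...priorityList, hardcodedFallback]) {
    const subtags = tag.split("-");
    while (subtags.length > 0) {
      const candidate = subtags.join("-");
      // Assumption: a locale already on the list is not added twice.
      if (!result.includes(candidate)) result.push(candidate);
      subtags.pop(); // drop the most specific subtag and try again
    }
  }
  return result;
}

const fallbackList = buildFallbackList("es-MX", ["zh-Hant-CN", "ja-JP"]);
console.log(fallbackList);
// [ 'es-MX', 'es', 'zh-Hant-CN', 'zh-Hant', 'zh', 'ja-JP', 'ja', 'en' ]
```

With the default primary of en-US and an empty priority list, the same function yields en-US followed by en, with the hardcoded en deduplicated.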

How is this priority list used?

When we load data from ./locales, we only ever load a locale that is present on the above list. If a user had the same list I typed above but there was a ./locales/pulsar.ar.cson, it would never be loaded by the system at all, because the user would never encounter that language in this fallback list.

But when we ask for any individual translation of a string, whether via the API or a LocaleLabel, we first find the string we want to translate within i18n.strings via its keypath, then iterate through the fallback list and return the first match we get. This means that partial translation is fully supported for every single string, provided there is always an en translation available. This is an important point: whenever translations are provided, the base locale that must be 100% translated should be en, not en-US or en-GB or anything else.
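
A minimal sketch of that lookup follows. The shape of the store is an assumption made for illustration (each keypath ends in an object keyed by locale); the real i18n.strings may be organized differently, and lookupString is a hypothetical name.

```javascript
// Assumed shape: walking the keypath ends at an object keyed by locale.
const exampleStrings = {
  pulsar: {
    "context-menu": {
      "core:undo": { en: "Undo", es: "Deshacer" },
      "core:redo": { en: "Redo" },
    },
  },
};

const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Hypothetical lookup: resolve the keypath first, then walk the fallback
// list and return the first locale that has a string.
function lookupString(strings, keyPath, fallbackList) {
  let node = strings;
  for (const key of keyPath.split(".")) {
    if (node === null || typeof node !== "object" || !hasOwn(node, key)) {
      return undefined;
    }
    node = node[key];
  }
  if (node === null || typeof node !== "object") return undefined;
  for (const locale of fallbackList) {
    if (typeof node[locale] === "string") return node[locale];
  }
  return undefined;
}

const userLocales = ["es-MX", "es", "en"];
console.log(lookupString(exampleStrings, "pulsar.context-menu.core:undo", userLocales)); // Deshacer
console.log(lookupString(exampleStrings, "pulsar.context-menu.core:redo", userLocales)); // Redo
```

The second call shows partial translation at work: there is no Spanish string for core:redo, so the lookup falls through to en.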


From the above you can see I've tried to cover every use case and make it as easy as possible to partially translate and start small.

These changes are 100% backwards compatible, and support having a single string translated or the entire application.

@confused-Techie confused-Techie marked this pull request as draft August 9, 2024 03:14
@confused-Techie confused-Techie marked this pull request as ready for review August 10, 2024 02:57
@savetheclocktower
Contributor

Is there any urgency to get this into 1.120, or are you cool with letting it sit for a while? I think it looks fine at first glance, but I'd love to have a few weeks to play around with it if you're open to that.

@confused-Techie
Member Author

@savetheclocktower I'm not against this one waiting around a bit. Obviously it'd be awesome to get it in sooner rather than later, but a month seems completely reasonable

Member

@meadowsys meadowsys left a comment

took quick (sleepy) look, looks good to me!

smol sidenote: the work translators have done so far can likely still be used! depending on final format, we may have to "import" manually / write a tool to change the format, but, still usable! that was probably my main worry after I lost motivation on the original one, that translators' work was for nothing ><

Comment on lines +115 to +121
if (Array.isArray(item.submenu)) {
  for (let y = 0; y < item.submenu.length; y++) {
    if (this.i18n.isAutoTranslateLabel(item.submenu[y].label)) {
      item.submenu[y].label = this.i18n.translateLabel(item.submenu[y].label);
    }
  }
}
Member


maybe this should handle submenus recursively? (or am I mistaken, and submenus aren't possible / this already handles it?)

Member Author


That's not a bad call at all. I'm not sure if there's a limit to how many submenus deep this structure can go, but I'll check whether there are docs on the topic, since we want to make sure we don't miss anything.
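
For reference, a recursive version could look roughly like the sketch below. It reuses the isAutoTranslateLabel and translateLabel method names from the quoted diff, but translateMenuLabels and the stub i18n object are hypothetical, written only to show the recursion handling any nesting depth.

```javascript
// Hypothetical recursive walk: translates each item's label, then descends
// into its submenu (if any), however deeply the menus are nested.
function translateMenuLabels(items, i18n) {
  for (const item of items) {
    if (i18n.isAutoTranslateLabel(item.label)) {
      item.label = i18n.translateLabel(item.label);
    }
    if (Array.isArray(item.submenu)) {
      translateMenuLabels(item.submenu, i18n);
    }
  }
  return items;
}

// Stub standing in for the real i18n object; its behavior is made up here.
const stubI18n = {
  isAutoTranslateLabel: (label) => typeof label === "string" && /^%.+%$/.test(label),
  translateLabel: (label) => `translated:${label.slice(1, -1)}`,
};

const menu = [
  {
    label: "%demo.file%",
    submenu: [
      { label: "%demo.open%", submenu: [{ label: "%demo.recent%" }] },
      { label: "Plain" },
    ],
  },
];

translateMenuLabels(menu, stubI18n);
console.log(menu[0].submenu[0].submenu[0].label); // translated:demo.recent
```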

primary: {
  type: "string",
  order: 1,
  default: "en-US",
Member


do you think maybe this default should just be en?

Member Author


Since we are using the RFC 4647 Lookup Filtering Fallback Pattern, it's best for your locale to be as specific as possible. If our primary is en-US by default, that automatically covers both en-US and en, since en is the next fallback if en-US is not available. In the same way, someone whose primary locale was zh-Hant-CN would still quickly fall back to just zh if no more specific locales were available.

So I'm not saying this should be the default necessarily, but whatever it is, is better off being as specific as possible in terms of locale. But since the majority of Atom was developed in English by Americans I assumed this would be the most logical default, as it's what's essentially already expected by most users.

You'll even notice the beginner locale file I made for Pulsar is ./locales/pulsar.en.cson. But that's still what's loaded with the primary locale of en-US due to the fallback behavior.

@confused-Techie
Member Author

@meadowsys Good call on importing translations after the fact.

After making this PR I took a look and realized how much has already been translated, and I was worried about that work going to waste.

But you are right, pulsar-edit/i18n-intermediate-sync still exists, and we could just put all of this stuff into a new file, then during the step where file structure is modified we could combine the files and move any translated keys to other keys in the final format.

If (hopefully when) we get this merged, I'll get started over there to work all that out.
