Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internationalize About page content #526

Merged

Conversation

mpastoring
Copy link
Contributor

@mpastoring mpastoring commented Oct 18, 2024

Internationalize About page content

Closes #492

  • Move text from HTML to en.json and fr.json locale files
  • Use translation keys in ProjectPage.js
  • Add renderHTML function for safe HTML rendering
  • Implement useTranslation hook for accessing translations

This PR makes the About page content translatable and easier to maintain across languages.

@mpastoring mpastoring changed the title PR: Internationalize About page content Internationalize About page content Oct 18, 2024
@wbazant
Copy link
Collaborator

wbazant commented Oct 18, 2024

Thanks for picking up this issue, it will get us a long way towards moving the app to production! I smiled at the dangerouslySetInnerHTML in the code, I've not seen it in a while but it's totally necessary here.

I had a go at reviewing it, it's a lot for me to proofread, so I came up with a semi-automated way of checking it:

  1. check out the branch and open localhost:3000/about
  2. open Reader view on Firefox
  3. copy text and save to file: about-new.txt
  4. open https://fallingfruit.org/about in Reader view similarly and save to about-previous.txt
  5. open https://beta.fallingfruit.org/about in Reader view similarly and save to about-beta.txt
  6. run commands: diff about-previous.txt about-new.txt and diff about-previous.txt about-beta.txt

Based on the check, I have these observations:
a) the About page in this branch misses titles and names of the executive branch, I see e.g.

25,29d24
< Executive Director
< Ethan Welty
< [email protected]
< Zürich, Switzerland

b) the live and beta site differ a bit: live says e.g. "distributed over 1,667,093 locations" whereas beta says "distributed over millions of locations"

c) where live and beta site differ, your changes match the beta site, so you've not used the live site's locale files but most likely an AI query

There's nothing wrong with using AI for this - actually it's kind of essential - but at the same time, we're trying to preserve the original translations, see Ethan's comment at #488 (comment)

Would you be up for making sure the About page has the same content as the live site, but solving this problem for us more robustly? The key idea here is to never manually do tedious work like moving text between files (since humans do it unreliably) and also not ask language models to do it directly (since they also do it unreliably) but instead leverage AI to write an automated workflow or a command line program.

It could be given a path to a source code file like src/components/about/ProjectPage.js and find all translation keys used, stuff like t('pages.about.ff_disclaimer_html'), then check they exist in an old locale file like config/locales/en.yml, and put them into the new locale file like public/locales/en.json. Then we'll be able to do it for other pages, and all languages.

I'd definitely write that program with AI :) - I use Aider with Claude Sonnet 3.5 at the moment - because it'll be obvious whether it works or not.

I also realised that this issue comes with a side quest, see #528 - feel free to not solve it and just set the text "millions" for the number of locations somewhere in the code.

@ezwelty
Copy link
Collaborator

ezwelty commented Oct 18, 2024

@wbazant Were there more deviations than what you pointed out? Could you repeat the same on the French, which obviously comes from the existing locale files. Note that in a deviation from the original key pages.about.more_about_html, I had added the wording "thousands of different types of edibles ... distributed over millions of locations" so that the beta wouldn't need to query the API (and because the exact numbers don't really matter). I had also removed the link from "imported datasets" because there was no such page at the time. @mpringuelet, you seem to have updated the French accordingly?

@mpringuelet In review, here is what I think still needs to happen to wrap this up:

  • Restore titles, names, addresses, emails of people (These are not translated, but directly in the HTML. see https://www.fallingfruit.org/about for the latest values).
  • Rename pages.about.edible_map to pages.about.more_about_html, restore the link "imported datasets" (and in the French, for "données importées"), and render accordingly.
  • The old website passes the language in a locale URL param, so links in the non-English locale files have things like ?locale=fr, which I guess should be removed.

p.s. I added a translator that had since been added to the current site's HTML.

@wbazant
Copy link
Collaborator

wbazant commented Oct 19, 2024

@ezwelty some paragraphs showed up in the diff between live and beta but that turns out to be edits you wanted, thanks for joining the review!. I've also had two paragraphs not show up in the about-new.txt, but that's my reader view missing them somehow. We can check again after the PR is ready. I've also not checked if the keys used are present in the .yaml, hence the proposal to write a program, but I guess I can write that program!

@mpastoring
Copy link
Contributor Author

mpastoring commented Oct 19, 2024

Hey @wbazant and @ezwelty, thanks for the detailed review! I really appreciate your insights. Here's what I'm planning to do:

  1. I'll add back the missing people info from the live site.
  2. I'll fix the translation keys as suggested and restore the "imported datasets" link.
  3. I'll remove those unnecessary locale params from the links.
  4. I'll double-check my content against the live site to make sure I'm using the original translations.

@wbazant, I love your idea about automating this process. I'll work on a script to handle the translation key transfers. Should make things easier for future updates too!

For now, I'll set the locations to "millions" in the code as a quick fix for #528.

I'll push these changes soon and ask for another review. Thanks again for helping me improve this PR and learn more about the project!

- Replace 'edible_map' key with correct  'more_about_html' content
- Restore personal information from fallingfruit.org/about
- Add README.md with usage instructions
- Create extractTranslationKeys.js for browser-based key extraction
- Implement migrate_translations.py for YAML to JSON migration
@mpastoring
Copy link
Contributor Author

mpastoring commented Oct 22, 2024

@wbazant @ezwelty Thanks for the review! I've made significant updates to address the feedback:

  1. Content Restoration:

    • Restored all personal info (names, emails, titles) from the live site
    • Fixed the translation key from edible_map to more_about_html
    • Reverted previous changes to en.json and fr.json to preserve original translations
  2. Migration Automation:
    I created two scripts to handle translations reliably:

    • extractTranslationKeys.js: Extracts i18n_viz keys from browser
    • migrate_translations.py: Migrates translations from YAML to JSON format
    • replace_placeholders.py: Converts placeholder syntax from %{} to {{}}
  3. Additional Content:
    I've included several items that can be dropped if not needed:

    • Original .yml files from config/locales (for reference/migration)
    • Additional language JSON files beyond en/fr
    • Temporary translations for "thousands/millions" until API integration
  4. Still Pending:

    • Removing ?locale=fr URL parameters from the non-English locale files. I can write another script to handle this automatically if you'd like.

Each change is in separate commits for easy review/reversion. Let me know if you'd like me to drop any of these additions or make other adjustments.

@mpastoring
Copy link
Contributor Author

Something, not directly related to this PR: I found a typo in the Spanish translation. It should be Necesario, single s, instead of Necessario
image

@wbazant
Copy link
Collaborator

wbazant commented Oct 23, 2024

Thanks for using the same copy as on the live site! My only request for text is to add "of" to
< distributed over millions locations.
in English.

I'm not a native Spanish speaker but I saw the spelling 'necesario' appearing in the Spanish locale a few times, and https://en.wiktionary.org/wiki/necessario claims the double s is an obsolete spelling.

I really liked your solution to show all keys on the page. For ease of comparison, I propose to sort them - I asked an AI and it gave me

diff --git a/scripts/extractTranslationKeys.js b/scripts/extractTranslationKeys.js
index d22f494..228ce40 100644
--- a/scripts/extractTranslationKeys.js
+++ b/scripts/extractTranslationKeys.js
@@ -79,8 +79,12 @@ async function copyAllTranslationKeys() {
     console.log('Grouped keys:', groupedKeys);
 
     if (keys.length > 0) {
-        const keysText = Object.entries(groupedKeys)
-            .map(([category, categoryKeys]) => `${category}:\n${categoryKeys.join('\n')}`)
+        const sortedGroupedKeys = Object.fromEntries(
+            Object.entries(groupedKeys).sort(([a], [b]) => a.localeCompare(b))
+        );
+
+        const keysText = Object.entries(sortedGroupedKeys)
+            .map(([category, categoryKeys]) => `${category}:\n${categoryKeys.sort().join('\n')}`)

Then I've also made your regex into a little program that prints all the keys used in the source file:

perl -nE 'say for m{t\((?:['"'"'"])(.+?)(?:['"'"'"])\)}g' src/components/about/ProjectPage.js  | sort | cat -n

The keys on the live site are the same on the live About page and the source file, and together with the text matching, now I'm pretty confident this issue is solved correctly.

Before I merge, can we keep the locales to just two languages for now? It was fun to see the About page in my native language, but when looking at code changes I noticed myself wanting to just scroll past them. Similarly, let's keep the .yml files in the old repository, so there's no confusion about which copy is latest.

Removing ?locale=fr URL parameters from the non-English locale files. I can write another script to handle this automatically if you'd like.

Yes please! Instead of a collection of scripts, I think a better design is to have one program. Building it out will pay off over time, and it's not too hard to give it a bit of structure - we can have classes that represent different things in our problem domain, like a component file, a .yml source, a .json target, a key, a localized field, etc. No need to invest hugely into it, but as we encounter use cases, we'll be able to extend the program so it keeps doing what we want.

Would you be up for also translating e.g. the Share The Harvest page too? We can swap for the next one - if you make an issue I can take on the implementation with the tools we built here, and give it to you or Ethan for review - or we can do it the same way as we did this one.

mpastoring and others added 5 commits October 23, 2024 18:00
- Add venv to .gitignore
- Add en.json and fr.json translation files
- Create replace_placeholders.py to convert %{} to {{}} in JSON files
- Update README.md with usage instructions
- Sort grouped keys alphabetically by category
- Sort keys within each category alphabetically
@mpastoring mpastoring force-pushed the manuelpastorrringuelet-issue-492 branch from 751fad4 to 17b7526 Compare October 23, 2024 16:04
@mpastoring
Copy link
Contributor Author

Thanks for using the same copy as on the live site! My only request for text is to add "of" to < distributed over millions locations. in English.

I'm not a native Spanish speaker but I saw the spelling 'necesario' appearing in the Spanish locale a few times, and https://en.wiktionary.org/wiki/necessario claims the double s is an obsolete spelling.

I really liked your solution to show all keys on the page. For ease of comparison, I propose to sort them - I asked an AI and it gave me

diff --git a/scripts/extractTranslationKeys.js b/scripts/extractTranslationKeys.js
index d22f494..228ce40 100644
--- a/scripts/extractTranslationKeys.js
+++ b/scripts/extractTranslationKeys.js
@@ -79,8 +79,12 @@ async function copyAllTranslationKeys() {
     console.log('Grouped keys:', groupedKeys);
 
     if (keys.length > 0) {
-        const keysText = Object.entries(groupedKeys)
-            .map(([category, categoryKeys]) => `${category}:\n${categoryKeys.join('\n')}`)
+        const sortedGroupedKeys = Object.fromEntries(
+            Object.entries(groupedKeys).sort(([a], [b]) => a.localeCompare(b))
+        );
+
+        const keysText = Object.entries(sortedGroupedKeys)
+            .map(([category, categoryKeys]) => `${category}:\n${categoryKeys.sort().join('\n')}`)

Then I've also made your regex into a little program that prints all the keys used in the source file:

perl -nE 'say for m{t\((?:['"'"'"])(.+?)(?:['"'"'"])\)}g' src/components/about/ProjectPage.js  | sort | cat -n

The keys on the live site are the same on the live About page and the source file, and together with the text matching, now I'm pretty confident this issue is solved correctly.

Before I merge, can we keep the locales to just two languages for now? It was fun to see the About page in my native language, but when looking at code changes I noticed myself wanting to just scroll past them. Similarly, let's keep the .yml files in the old repository, so there's no confusion about which copy is latest.

Removing ?locale=fr URL parameters from the non-English locale files. I can write another script to handle this automatically if you'd like.

Yes please! Instead of a collection of scripts, I think a better design is to have one program. Building it out will pay off over time, and it's not too hard to give it a bit of structure - we can have classes that represent different things in our problem domain, like a component file, a .yml source, a .json target, a key, a localized field, etc. No need to invest hugely into it, but as we encounter use cases, we'll be able to extend the program so it keeps doing what we want.

Would you be up for also translating e.g. the Share The Harvest page too? We can swap for the next one - if you make an issue I can take on the implementation with the tools we built here, and give it to you or Ethan for review - or we can do it the same way as we did this one.

Thank you @wbazant! I'll make the request changes, and then I'll respond to your questions.

- Remove locale params from English and French translations
- Improve translation_manager.py with better error handling and file checks
- Update README with more detailed usage instructions
@mpastoring
Copy link
Contributor Author

mpastoring commented Oct 23, 2024

Thanks for using the same copy as on the live site! My only request for text is to add "of" to < distributed over millions locations. in English.

I'm not a native Spanish speaker but I saw the spelling 'necesario' appearing in the Spanish locale a few times, and https://en.wiktionary.org/wiki/necessario claims the double s is an obsolete spelling.

I really liked your solution to show all keys on the page. For ease of comparison, I propose to sort them - I asked an AI and it gave me

diff --git a/scripts/extractTranslationKeys.js b/scripts/extractTranslationKeys.js
index d22f494..228ce40 100644
--- a/scripts/extractTranslationKeys.js
+++ b/scripts/extractTranslationKeys.js
@@ -79,8 +79,12 @@ async function copyAllTranslationKeys() {
     console.log('Grouped keys:', groupedKeys);
 
     if (keys.length > 0) {
-        const keysText = Object.entries(groupedKeys)
-            .map(([category, categoryKeys]) => `${category}:\n${categoryKeys.join('\n')}`)
+        const sortedGroupedKeys = Object.fromEntries(
+            Object.entries(groupedKeys).sort(([a], [b]) => a.localeCompare(b))
+        );
+
+        const keysText = Object.entries(sortedGroupedKeys)
+            .map(([category, categoryKeys]) => `${category}:\n${categoryKeys.sort().join('\n')}`)

Then I've also made your regex into a little program that prints all the keys used in the source file:

perl -nE 'say for m{t\((?:['"'"'"])(.+?)(?:['"'"'"])\)}g' src/components/about/ProjectPage.js  | sort | cat -n

The keys on the live site are the same on the live About page and the source file, and together with the text matching, now I'm pretty confident this issue is solved correctly.

Before I merge, can we keep the locales to just two languages for now? It was fun to see the About page in my native language, but when looking at code changes I noticed myself wanting to just scroll past them. Similarly, let's keep the .yml files in the old repository, so there's no confusion about which copy is latest.

Removing ?locale=fr URL parameters from the non-English locale files. I can write another script to handle this automatically if you'd like.

Yes please! Instead of a collection of scripts, I think a better design is to have one program. Building it out will pay off over time, and it's not too hard to give it a bit of structure - we can have classes that represent different things in our problem domain, like a component file, a .yml source, a .json target, a key, a localized field, etc. No need to invest hugely into it, but as we encounter use cases, we'll be able to extend the program so it keeps doing what we want.

Would you be up for also translating e.g. the Share The Harvest page too? We can swap for the next one - if you make an issue I can take on the implementation with the tools we built here, and give it to you or Ethan for review - or we can do it the same way as we did this one.

@wbazant About the typo, maybe I didn't express it clearly before - 'necesario' currently has two s's in the text, but it should be with one 's'. So yes, a change is needed there.

I've implemented all the other requested changes: added "of", sorted the keys, and kept just two languages.

I'm happy to work on the Share The Harvest page next - let me know how you'd like to proceed with that.

@wbazant wbazant merged commit 23b5ec1 into falling-fruit:main Oct 24, 2024
@wbazant
Copy link
Collaborator

wbazant commented Oct 24, 2024

Thanks again for the work on this page, and for making the changes! I've merged it in.

About the typo, maybe I didn't express it clearly before - 'necesario' currently has two s's in the text, but it should be with one 's'. So yes, a change is needed there.

Right, I see, sorry! Do you want to make the change in the locale file on the live site repo and submit as PR? I've just done something like that after seeing a few errors in the about page for the Polish translation.

I'm happy to work on the Share The Harvest page next - let me know how you'd like to proceed with that.

I've made us a slightly ambitious epic, #538. Maybe just create an issue referencing the epic, open a PR, and I'll review it? When organizing software engineering work, I find it's good to say what we'll change before we change it (PRs correspond to issues ) and have more than one person look at each change, but also it's a volunteer project so anyone can make an issue or work on what they like.

@mpastoring
Copy link
Contributor Author

Thanks again for the work on this page, and for making the changes! I've merged it in.

Thank you so much for your detailed review! I'm happy to see it was merged :)

Right, I see, sorry! Do you want to make the change in the locale file on the live site repo and submit as PR? I've just done something like that after seeing a few errors in the about page for the Polish translation.

I will do it!

I've made us a slightly ambitious epic, #538. Maybe just create an issue referencing the epic, open a PR, and I'll review it? When organizing software engineering work, I find it's good to say what we'll change before we change it (PRs correspond to issues ) and have more than one person look at each change, but also it's a volunteer project so anyone can make an issue or work on what they like.

Excellent, I'll take a look at it later.

@mpastoring mpastoring deleted the manuelpastorrringuelet-issue-492 branch October 24, 2024 10:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Internationalize the About page
3 participants