Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Translation #9

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Optimize Translation #9

wants to merge 1 commit into from

Conversation

hexakleo
Copy link
Contributor

Refactored for PEP 8 compliance and cleaner structure.

Improved inline documentation for easier comprehension.

Optimized locale discovery and target language handling.

Added robust error handling for translation failures.

Ensured efficient message translation with fallback for ignored keys.

Organized output with dynamic overwriting and better formatting.

Simplified folder creation and JSON writing process.

Refactored for PEP 8 compliance and cleaner structure.

Improved inline documentation for easier comprehension.

Optimized locale discovery and target language handling.

Added robust error handling for translation failures.

Ensured efficient message translation with fallback for ignored keys.

Organized output with dynamic overwriting and better formatting.

Simplified folder creation and JSON writing process.
Copy link
Owner

@adamlui adamlui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should make separate PR per each specific change you wanna make so it can be more likely to be merged, otherwise if 1 thing is wrong with it whole PR needs changing, also you should do it @ https://github.com/adamlui/python-utils/blob/main/translate-messages/translate-en-messages.py

@@ -1,139 +1,119 @@
'''
"""
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the quotes style you should just leave them as single

Description: Translate msg's from en/messages.json to [[output_langs]/messages.json]
Description: Translate messages from en/messages.json to other language directories.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original description is useful because reader understands the structure immediately and know what to edit to customize

chromium/utils/translate-en-messages.py Show resolved Hide resolved
Comment on lines -26 to +48
key = input('Enter key to ignore (or ENTER if done): ')
if not key : break
key = input('Enter key to ignore (or press ENTER if done): ')
if not key:
break
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and elsewhere, the style is single line to save vertical space

Comment on lines -34 to +67
for root, dirs, files in os.walk(script_dir): # search script dir recursively
if locales_folder in dirs:
locales_dir = os.path.join(root, locales_folder) ; break
else: # search script parent dirs recursively
parent_dir = os.path.dirname(script_dir)
while parent_dir and parent_dir != script_dir:
for root, dirs, files in os.walk(parent_dir):
if locales_folder in dirs:
locales_dir = os.path.join(root, locales_folder) ; break
if locales_dir : break
parent_dir = os.path.dirname(parent_dir)
else : locales_dir = None

# Print result
if locales_dir : print_trunc(f'_locales directory found!\n\n>> { locales_dir }\n')
else : print_trunc(f'Unable to locate a { locales_folder } directory.') ; exit()

# Load en/messages.json

for root, dirs, _ in os.walk(script_dir):
if LOCALES_FOLDER in dirs:
locales_dir = os.path.join(root, LOCALES_FOLDER)
break

if not locales_dir:
print_trunc(f"Unable to locate the {LOCALES_FOLDER} directory.")
exit()

print_trunc(f"_locales directory found: {locales_dir}\n")

# Load English messages
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here your deletion of the else block means the script will no longer find locales folder if it exists in a parent dir

Comment on lines -67 to +84
# Create/update/translate [[output_langs]/messages.json]
langs_added, langs_skipped, langs_translated, langs_not_translated = [], [], [], []
# Translate messages
langs_translated = []
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original comment is clearer, also the other array inits got deleted so script no longer works

Comment on lines -70 to +119
lang_added, lang_skipped, lang_translated = False, False, False
folder = lang_code.replace('-', '_') ; translated_msgs = {}
if '-' in lang_code: # cap suffix
sep_index = folder.index('_')
folder = folder[:sep_index] + '_' + folder[sep_index+1:].upper()

# Skip English locales
if lang_code.startswith('en'):
print_trunc(f'Skipped {folder}/messages.json...')
langs_skipped.append(lang_code) ; langs_not_translated.append(lang_code) ; continue
continue # Skip English locales

# Initialize target locale folder
folder = lang_code.replace('-', '_')
folder_path = os.path.join(locales_dir, folder)
if not os.path.exists(folder_path): # if missing, create folder
os.makedirs(folder_path) ; langs_added.append(lang_code) ; lang_added = True

# Initialize target messages
msgs_path = os.path.join(folder_path, msgs_filename)

if not os.path.exists(folder_path):
os.makedirs(folder_path)

messages = {}
if os.path.exists(msgs_path):
with open(msgs_path, 'r', encoding='utf-8') as messages_file : messages = json.load(messages_file)
else : messages = {}

# Attempt translations
print_trunc(f"{ 'Adding' if not messages else 'Updating' } { folder }/messages.json...", end='')
stdout.flush()
en_keys = list(en_messages.keys())
fail_flags = ['INVALID TARGET LANGUAGE', 'TOO MANY REQUESTS', 'MYMEMORY']
for key in en_keys:
with open(msgs_path, 'r', encoding='utf-8') as messages_file:
messages = json.load(messages_file)

translated_msgs = {}
for key, value in en_messages.items():
if key in keys_to_ignore:
translated_msg = en_messages[key]['message']
translated_msgs[key] = { 'message': translated_msg }
continue
if key not in messages:
original_msg = translated_msg = en_messages[key]['message']
translated_msgs[key] = value
else:
try:
translator = Translator(provider=provider if provider else '', to_lang=lang_code)
translated_msg = translator.translate(original_msg).replace('"', "'").replace(''', "'")
if any(flag in translated_msg for flag in fail_flags):
translated_msg = original_msg
except Exception as e:
print_trunc(f'Translation failed for key "{key}" in {lang_code}/messages.json: {e}')
translated_msg = original_msg
translated_msgs[key] = { 'message': translated_msg }
else : translated_msgs[key] = messages[key]

# Format messages
formatted_msgs = '{\n'
for index, (key, message_data) in enumerate(translated_msgs.items()):
formatted_msg = json.dumps(message_data, ensure_ascii=False) \
.replace('{', '{ ').replace('}', ' }') # add spacing
formatted_msgs += ( f' "{key}": {formatted_msg}'
+ ( ',\n' if index < len(translated_msgs) - 1 else '\n' )) # terminate line
formatted_msgs += '}'
with open(msgs_path, 'w', encoding='utf-8') as output_file : output_file.write(formatted_msgs + '\n')

# Print file summary
if translated_msgs == messages : langs_skipped.append(lang_code) ; lang_skipped = True
elif translated_msgs != messages : langs_translated.append(lang_code) ; lang_translated = True
if not lang_translated : langs_not_translated.append(lang_code)
overwrite_print(f"{ 'Added' if lang_added else 'Skipped' if lang_skipped else 'Updated' } { folder }/messages.json")

# Print final summary
print_trunc('\nAll messages.json files updated successfully!\n')
lang_data = [langs_translated, langs_skipped, langs_added, langs_not_translated]
for data in lang_data:
if data:
list_name = next(name for name, value in globals().items() if value is data)
status = list_name.split('langs_')[-1].replace('_', ' ')
print(f'Languages {status}: {len(data)}\n') # print tally
print('[ ' + ', '.join(data) + ' ]\n') # list languages
translator = Translator(to_lang=lang_code)
translated_msg = translator.translate(value['message'])
translated_msgs[key] = {'message': translated_msg}
except Exception:
translated_msgs[key] = value

with open(msgs_path, 'w', encoding='utf-8') as output_file:
json.dump(translated_msgs, output_file, ensure_ascii=False, indent=4)

langs_translated.append(lang_code)

print_trunc("\nTranslation process completed!\n")
print(f"Languages translated: {len(langs_translated)}")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script won't work anymore cuz you deleted a lot of important stuff

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try to update the script so that the syntax is good and the important information is preserved even though the script is more "light" and efficient. I will keep you informed...

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hexakleo please go to https://github.com/adamlui/python-utils to create each change in separate PRs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants