The translate pipeline does not perform punctuation normalization #611

benjaminking · 2024-12-19T18:04:21Z

When performing translation, whether through translate.py or experiment.py with --translate, the Moses punctuation normalizer is not used. This does not match the other pipelines or what is done in Serval. Currently, a sentence could be translated differently with test.py and translate.py.
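To illustrate the mismatch, here is a minimal sketch. It assumes a simplified stand-in normalizer with only a few illustrative rules, not the full Moses rule set. The function names (`normalize_punctuation`, `preprocess_for_test`, `preprocess_for_translate`) are hypothetical and are not taken from the repository. It only shows why skipping normalization in one pipeline can feed the model a different input string than the other pipeline does.

```python
import re

# Simplified stand-in for a Moses-style punctuation normalizer.
# These few rules are illustrative only; the real normalizer has many more.
_REPLACEMENTS = [
    ("\u201c", '"'),    # left curly double quote
    ("\u201d", '"'),    # right curly double quote
    ("\u2018", "'"),    # left curly single quote
    ("\u2019", "'"),    # right curly single quote
    ("\u2013", "-"),    # en dash
    ("\u2026", "..."),  # horizontal ellipsis
]


def normalize_punctuation(text: str) -> str:
    """Apply the illustrative replacements and collapse whitespace."""
    for old, new in _REPLACEMENTS:
        text = text.replace(old, new)
    return re.sub(r"\s+", " ", text).strip()


def preprocess_for_test(sentence: str) -> str:
    """Hypothetical test-time preprocessing: normalization is applied."""
    return normalize_punctuation(sentence)


def preprocess_for_translate(sentence: str, normalize: bool = False) -> str:
    """Hypothetical translate-time preprocessing.

    With normalize=False this mimics the behavior described in this issue;
    with normalize=True it mimics the proposed fix.
    """
    return normalize_punctuation(sentence) if normalize else sentence


if __name__ == "__main__":
    source = "\u201cHello\u201d \u2013 world\u2026"
    print(repr(preprocess_for_test(source)))
    print(repr(preprocess_for_translate(source)))
    print(repr(preprocess_for_translate(source, normalize=True)))
```

In this sketch the model input differs between the two paths until `normalize=True` is passed, which is the kind of discrepancy that could lead to different translations for the same sentence.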


benjaminking · 2024-12-23T17:59:19Z

I have a fix written and tested for NLLB. Is it worth committing this change since NLLB is the dominant use case? Or is it worth testing for other models like Madlad first?

benjaminking added the labels "bug" (Something isn't working) and "pipeline 6: infer" (Issue related to using a trained model to translate.) on Dec 19, 2024

benjaminking self-assigned this Dec 19, 2024
