Do not translate when using a different language model #17

Open
robertoronderosjr opened this issue Feb 3, 2023 · 1 comment

robertoronderosjr commented Feb 3, 2023

Hey, thanks a lot for this work, this is great. However, I wanted to use a different model (rjac/whisper-tiny-spanish) to transcribe a video in Spanish, and it did, but it translated the whole thing to English. Could the translation step be skipped?

pszemraj (Owner) commented Feb 8, 2023

Thanks for reaching out! I can look into integrating this feature when I do another PR on it. The main reason English output was assumed (just so you know) is that a lot of the "cleanup" features (spell checking, punctuation, etc.) either originally only worked for English out of the box or were much easier to implement under that assumption. However, I think this can be implemented in two parts:

  1. Add option(s)/flags to skip all the correction/translation bits, and add functionality to the Whisper integration to specify the language (so it does not assume English; iirc it currently does). I can do this in a few weeks, hopefully, when I have time to work on it again :)
  2. Add "real" multi-language support for the downstream cleanup bits I mentioned. Maybe one day... but to be honest, I'm not sure I will have time to implement this myself. We will see!
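
For illustration only, part 1 could look roughly like the sketch below. The flag names (`--language`, `--skip-cleanup`) and the helper functions are hypothetical and not the project's actual options. The `language`/`task` keys follow the decoding options that I believe recent Hugging Face `transformers` Whisper models accept in `generate`, which is an assumption to verify against the installed version.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI flags; the names are illustrative, not existing options."""
    parser = argparse.ArgumentParser(description="Transcription options (sketch)")
    parser.add_argument(
        "--language",
        default=None,
        help="Spoken language of the audio, e.g. 'spanish'. "
        "If omitted, no language is passed to the model.",
    )
    parser.add_argument(
        "--skip-cleanup",
        action="store_true",
        help="Skip the English-oriented post-processing (spell check, punctuation).",
    )
    return parser


def whisper_generate_kwargs(language):
    """Build decoding options that request transcription rather than translation."""
    kwargs = {"task": "transcribe"}  # keep the output in the source language
    if language is not None:
        kwargs["language"] = language
    return kwargs


def postprocess(text, skip_cleanup, cleanup_fn):
    """Apply the cleanup step only when it has not been disabled."""
    return text if skip_cleanup else cleanup_fn(text)


if __name__ == "__main__":
    args = build_parser().parse_args(["--language", "spanish", "--skip-cleanup"])
    print(whisper_generate_kwargs(args.language))
    # str.upper stands in for the real cleanup pipeline in this sketch.
    print(postprocess("hola mundo", args.skip_cleanup, str.upper))
```

The resulting dictionary would then be forwarded to the Whisper generation call, while `--skip-cleanup` bypasses the steps that assume English text.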

Hope this helps & let me know your thoughts!

@pszemraj pszemraj self-assigned this Feb 25, 2024
@pszemraj pszemraj added the enhancement New feature or request label Feb 25, 2024