I would like to use my own wit.ai and DeepL API keys for real-time speech-to-text translation.
Feature Background:
After using it for a while, I found that translations are often delayed when using the medium model (interval=3~5).
It also frequently produces blank output.
I don't know whether the failed translations are caused by slow speech recognition or by the language being identified incorrectly.
Also, English is not my native language, so after receiving the English output I still need time to convert it into my own language. I would therefore like more target languages to be supported.
Proposed Solution
speech-to-text: use wit.ai to convert audio files into text (see the wit.ai docs)
Free to use
Each API token is tied to a single language chosen by the user, which avoids incorrect language identification.
Recognition is fast and accurate.
(I use it to solve Google reCAPTCHA audio challenges, and it is fast and accurate there.)
translate: use DeepL or ChatGPT to translate into the user's target language
The DeepL free API and GPT-3.5 turbo are free to use
The user can set the target language (for me: KO text from wit.ai -> ZH)
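The two steps proposed above could be sketched roughly as follows. This is only a minimal illustration, not code from the repo: the function names are hypothetical, the endpoints and header formats are the ones I understand wit.ai and the DeepL free API to use, and the shape of the wit.ai response should be checked against the wit.ai docs before parsing it.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoints: wit.ai speech endpoint and the DeepL *free* API host.
WIT_SPEECH_URL = "https://api.wit.ai/speech"
DEEPL_FREE_URL = "https://api-free.deepl.com/v2/translate"


def build_wit_request(wav_bytes, wit_token):
    """Build (but do not send) a speech-to-text request for a WAV chunk.

    The spoken language is determined by the wit.ai app that owns the token,
    so no language parameter is sent here.
    """
    return urllib.request.Request(
        WIT_SPEECH_URL,
        data=wav_bytes,
        headers={
            "Authorization": f"Bearer {wit_token}",
            "Content-Type": "audio/wav",
        },
        method="POST",
    )


def build_deepl_request(text, target_lang, deepl_key):
    """Build (but do not send) a DeepL translation request."""
    body = urllib.parse.urlencode(
        {"text": text, "target_lang": target_lang}
    ).encode("utf-8")
    return urllib.request.Request(
        DEEPL_FREE_URL,
        data=body,
        headers={
            "Authorization": f"DeepL-Auth-Key {deepl_key}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def translate_text(text, target_lang, deepl_key):
    """Send the DeepL request and return the first translated string."""
    request = build_deepl_request(text, target_lang, deepl_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return payload["translations"][0]["text"]
```

For my case the call would look like `translate_text(korean_text, "ZH", my_deepl_key)`, where `korean_text` is whatever wit.ai returned for the audio chunk.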
Sorry for the delayed response. For the incorrect language identification issue, you should be able to fix that by setting the --language flag to the language spoken in the stream. The model only tries to identify the language if you leave the flag at the default ("auto").
The point of the repo was that you can use OpenAI's whisper model locally, so I don't wanna replace it with wit.ai.
Regarding adding an additional API call for translation into non-English languages: I like the idea, maybe I will add that when I get some free time. Note that OpenAI's APIs are not free to use; only the web version of GPT-3.5 turbo is free.
I have used the --language setting to specify the language, but there are still cases where it cannot be recognized correctly.
As for using an additional API for translation, I suggest letting users supply their own API key (for example, if they are using OpenAI's or DeepL's API).
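One possible way to do this, purely as a sketch: read the keys from environment variables and only enable the extra translation step when a key is present. The variable names `DEEPL_API_KEY` and `OPENAI_API_KEY` and the function name are my own assumptions, not anything the repo defines.

```python
import os

# Hypothetical environment variable names, checked in order of preference.
BACKEND_ENV_VARS = (
    ("deepl", "DEEPL_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
)


def pick_translation_backend(env=None):
    """Return (backend_name, api_key) for the first key the user supplied.

    Returns None when no key is set, so the caller can fall back to the
    existing local behaviour without any extra API call.
    """
    env = os.environ if env is None else env
    for backend, var_name in BACKEND_ENV_VARS:
        key = env.get(var_name, "").strip()
        if key:
            return backend, key
    return None
```

That way nothing changes for people who don't set a key, and no key has to be shipped with the project.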