TTS #67
Comments
Hey @yukiarimo, I am trying to do that too. Is there any progress on your side? I have made some progress on audio-to-audio.
If you are interested in working on it with me, let me know. Thanks
So, I also found 2 things
Enjoy!! :)
Gonna try it out! But how is that “without tokenizer”?
I think you are talking about audio-to-audio, so for that I built my own tokenizer hehe :'D
So, the concept behind the tokenizer is batches of data. Take the combined audio (say 50 MB for now), convert it to a mel spectrogram, encode the mel spectrogram into a sequence of integers, and decode the sequence of integers back into the mel spectrogram. The mel spectrogram values are scaled and quantized to a range of integers, and the encoding and decoding steps map between these integers and the mel spectrogram values. In more general words: at second 1 we have encoded some kind of mel spectrogram data, like we had for:
Let me know if you can contribute on top of this, thanks.
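For anyone following along, the encode/decode round trip described in the tokenizer comment above can be sketched roughly like this. It is only a guess at the idea, not the actual tokenizer from this thread: the uniform 256-level quantization, the function names `encode_mel` and `decode_mel`, and the random array standing in for a real mel spectrogram are all assumptions. In practice the input would come from something like `librosa.feature.melspectrogram`.

```python
import numpy as np

def encode_mel(mel, n_levels=256):
    """Scale a mel spectrogram to [0, n_levels - 1] and flatten it to integer tokens.

    Hypothetical helper: uniform min/max quantization is an assumption.
    """
    lo, hi = float(mel.min()), float(mel.max())
    span = hi - lo if hi > lo else 1.0  # avoid dividing by zero on constant input
    scaled = (mel - lo) / span          # now in [0, 1]
    tokens = np.round(scaled * (n_levels - 1)).astype(np.int64)
    # Keep the scaling info so the tokens can be mapped back later.
    return tokens.ravel(), (lo, hi, mel.shape)

def decode_mel(tokens, meta, n_levels=256):
    """Map integer tokens back to approximate mel spectrogram values."""
    lo, hi, shape = meta
    span = hi - lo if hi > lo else 1.0
    mel = tokens.astype(np.float64) / (n_levels - 1) * span + lo
    return mel.reshape(shape)

# Stand-in for a real log-mel spectrogram: 80 mel bins x 100 frames (assumed sizes).
rng = np.random.default_rng(0)
mel = rng.uniform(-80.0, 0.0, size=(80, 100))

tokens, meta = encode_mel(mel)
restored = decode_mel(tokens, meta)
max_err = np.abs(restored - mel).max()  # bounded by half a quantization step
```

The round trip is lossy: each value can be off by up to half a quantization step, so more levels means a larger vocabulary but a closer reconstruction.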
I will send you the Colab link for this, where it’s working for me. Thanks
Hi @yukiarimo, here is the link: But take a look at the attached images of train and test loss etc. at https://github.com/tttzof351/SimpleTransfromerTTS. They show that it takes nearly 400K iterations to generate good results. If you still have issues, just let me know. Thanks,
Hello. Do you know how to turn this: https://github.com/nivibilla/build-nanogpt into TTS instead of audio-to-audio?