The timestamp of model 'interspeech21' is incorrect #62
Comments
I am struggling with the timing as well. Is anybody aware of a library that can do forced alignment of phonemes based on the output from allosaurus? I would really appreciate any input and tips on how I can improve the output from allosaurus.
I am also looking for something like this.
Hi guys, sorry, I was busy with other projects and my internship over the last few months and did not have time to look at this. I forgot to account for the subsampling factor from the conv layer; I fixed it in the latest commit.
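For readers following along, here is a minimal sketch of why a missing subsampling factor compresses the timestamps. It is not the actual allosaurus code: the function name `frame_to_seconds` and the example values for `window_shift` and `subsampling_factor` are assumptions for illustration only. The idea is that if a conv layer emits one output frame per `subsampling_factor` input frames, each output frame index has to be scaled by that factor before it is converted to seconds.

```python
def frame_to_seconds(frame_index, window_shift=0.01, subsampling_factor=1):
    """Convert an output frame index to a start time in seconds.

    window_shift: hop between input feature frames, in seconds (assumed value).
    subsampling_factor: input frames consumed per output frame (assumed value).
    """
    return frame_index * window_shift * subsampling_factor


# Hypothetical numbers: the same output frame lands at different times
# depending on whether the subsampling factor is taken into account.
without_factor = frame_to_seconds(20, window_shift=0.01, subsampling_factor=1)
with_factor = frame_to_seconds(20, window_shift=0.01, subsampling_factor=2)
print(f"ignoring subsampling: {without_factor:.3f} s")
print(f"with subsampling:     {with_factor:.3f} s")
```

If the factor is ignored, every timestamp comes out too early by that same factor, which would match the symptom reported in this issue.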
A very useful library -- thank you for creating it. 0.840 0.045 ʔ
I assumed it was because it's returning the most likely phoneme at the 0.045 interval?
I ran the following command:
python -m allosaurus.run --timestamp=True -i sample.wav -m interspeech21
and it gives me
0.040 0.025 ɑ
0.080 0.025 l
0.100 0.025 ʌ
0.120 0.025 s
0.140 0.025 o
0.170 0.025 ɹ
0.180 0.025 ə
0.200 0.025 s
This is incorrect for the sample audio. It seems the window shift is set incorrectly.