Strange clicking sounds in some phrases on high speech rate #37

Open
cyrmax opened this issue Nov 14, 2022 · 4 comments

Comments

@cyrmax

cyrmax commented Nov 14, 2022

Hello. Thanks for fixing the speech slowdown on long phrases.
However, another problem has appeared in the latest build.
At speech rates faster than 85% (I mean the VoiceOver speech rate), I hear strange clicking and glitching at the end of some phrases.
I use Espeak TestFlight build 1.0 (11) with iOS 16.1 on an iPhone 11.
The system locale is Russia, and the voice language is also Russian, variant Max.

I will attach a screencast with a demonstration. Sorry that the demo video is in Russian, but I think that shouldn't be a problem.

espeak-glitching-demo.mp4
@LeonarddeR

I can still reproduce this at higher speech rates. It sounds as if the last phoneme is duplicated. This is especially audible in Dutch, for example on words ending with the letter g.

@cyrmax
Author

cyrmax commented Feb 28, 2023 via email

@LeonarddeR

@djphoenix Is this on your radar?

@jcsteh

jcsteh commented Nov 23, 2023

I think what might be happening is that some code is pushing too many samples to the audio buffer. I saw this while I was reworking some of NVDA's eSpeak code for WASAPI. When Sonic is being used, eSpeak generates the audio at a slower rate, but then Sonic runs and shortens the buffer. If you accidentally read more samples than Sonic left in the buffer, you end up with some of the (original speed) audio at the end.
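To make the suspected failure mode concrete, here is a minimal simulation of it. This is only a sketch under stated assumptions: `compress_in_place` is a hypothetical stand-in that drops samples, not Sonic's real algorithm, and none of these names come from eSpeak, NVDA, or the iOS code. It only shows why reading the pre-compression sample count leaves original-speed audio at the end of a chunk.

```python
def compress_in_place(buf, count, speed):
    """Hypothetical stand-in for a time-compression stage (NOT Sonic itself).

    Keeps every `speed`-th sample and writes the result to the front of
    `buf`. Samples past the returned length are left untouched, so they
    still hold the original-speed audio.
    """
    out = 0
    for i in range(0, count, speed):
        buf[out] = buf[i]
        out += 1
    return out  # number of valid samples after compression


def push(queue, buf, count):
    """Append the first `count` samples of `buf` to the playback queue."""
    queue.extend(buf[:count])


original = list(range(100, 112))   # 12 samples of original-speed audio
buf = list(original)
valid = compress_in_place(buf, len(buf), speed=3)

correct = []
push(correct, buf, valid)          # push only what the compressor produced

overrun = []
push(overrun, buf, len(original))  # bug: push the pre-compression count

stale_tail = overrun[valid:]       # leftover original-speed samples
```

In this toy model, `stale_tail` is exactly the uncompressed remainder of the chunk, which would be heard as a short burst of extra audio after the phrase.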

I know eSpeak has some bugs related to the events it passes to the callback, which is why this was happening in my code. The audio_position in the events doesn't seem to be correctly adjusted for Sonic, so you can't rely on those numbers not overrunning or underrunning the samples. However, from what I can see in the iOS code, you don't rely on eSpeak events, so this shouldn't be the cause. It seems the iOS code just pushes the number of samples passed as a callback argument, which should be correct. It mostly ignores the events currently (apart from logging), unless I'm missing something.
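For code that does use event positions, one defensive option is to clamp them to the sample count the callback actually delivered. The sketch below is an assumption on my part, not what NVDA or the iOS code does: `event_sample_offset` and its parameters are hypothetical names, and it assumes the event position is a time in milliseconds measured from the start of the utterance.

```python
def event_sample_offset(audio_position_ms, sample_rate, chunk_start, num_samples):
    """Map an event time to an offset inside the current chunk (hypothetical helper).

    audio_position_ms: event time in milliseconds (assumed unit)
    sample_rate:       samples per second of the delivered audio
    chunk_start:       absolute index of the first sample in this chunk
    num_samples:       samples actually passed to the callback

    The result is clamped to [0, num_samples], so a position that was not
    adjusted for time compression can never overrun or underrun the chunk.
    """
    absolute = audio_position_ms * sample_rate // 1000
    relative = absolute - chunk_start
    return max(0, min(relative, num_samples))


in_range = event_sample_offset(10, 22050, 0, 1000)     # inside the chunk
overrun = event_sample_offset(100, 22050, 0, 1000)     # would overrun
underrun = event_sample_offset(10, 22050, 500, 1000)   # would underrun
```

Clamping hides the symptom rather than fixing the position itself, so the event may still fire at a slightly wrong time; it only guarantees that the number of samples pushed never exceeds what was delivered.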

The only other thing I've spotted is that the iOS code calls espeak_ng_SetPhonemeEvents, but NVDA doesn't. I don't really understand why the iOS code does this though, as it doesn't seem to use them for anything other than logging. I also don't know if that could have anything to do with this or not.
