Detect speech pauses; Out of Memory Crash #14
Comments
Hi Jonathan, thanks for using CMUSphinx. Could you please elaborate on this problem with pauses? I'm not sure I get it. Also, please share the problematic files where you have issues with the aligner. Thank you.
Hi, an example where it failed with the same error on two PCs is this audio with this transcription. The aligner runs for ~45 min and then hangs at the same position in the logging output (it just stands still for >60 min).
So here it is probably not an Out of Memory problem but some other kind. Could this be correlated with poor quality of the transcription?
Hi Jonathan,
It still isn't clear what your expected and actual output are. (In reply to Jonathan Werner's message of Sun, Oct 26, 2014, 9:11 AM.)
Sincerely, Alexander
OK, let me rephrase it with an example: say you have a speech pause between 2.2 and 4. But the actual alignment looks, for example, like this:
I can take a look. By the way, for better alignment quality you should use the en-us generic acoustic model:
I'm not entirely sure if this is the best place to ask this kind of question, so please point me to a better place in case there is one.
We are currently using the Sphinx4 Long Aligner with some success for a subtitling project at the University of Hamburg.
Today was the first time that I tried it successfully "in the field".
I took the transcription and this video from the CCC Congress and aligned the 35 min video (of course I mean the WAV file converted according to your instructions) in ~88 min with the Sphinx Long Aligner, which is pretty good, I think. (You can see the manually optimized results on the linked video page.)
Right now the biggest problem for this application is pauses in speech. The aligned words are always placed directly next to each other, even if there are long pauses between them. This means a lot of manual dragging around of the results. Long story short: is there an option to turn on speech pause detection?
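To make the request concrete, here is a minimal sketch of energy-based pause detection on raw 16-bit PCM samples. It is not Sphinx4 code and does not use any Sphinx4 API: the class name `PauseDetector`, the frame length, the RMS threshold, and the minimum pause length are all assumptions chosen for illustration, and real recordings would need tuned values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Sphinx4.
final class PauseDetector {

    /**
     * Returns pauses as {startSeconds, endSeconds} pairs.
     * A frame counts as quiet when its RMS amplitude is below rmsThreshold;
     * a run of quiet frames counts as a pause when it lasts at least
     * minPauseSeconds. All thresholds are assumptions to be tuned per recording.
     */
    static List<double[]> detect(short[] samples, int sampleRate,
                                 double frameSeconds, double rmsThreshold,
                                 double minPauseSeconds) {
        int frameSize = Math.max(1, (int) Math.round(frameSeconds * sampleRate));
        List<double[]> pauses = new ArrayList<>();
        int quietStart = -1; // sample index where the current quiet run began, -1 if none

        for (int offset = 0; offset < samples.length; offset += frameSize) {
            int end = Math.min(samples.length, offset + frameSize);

            // Root-mean-square amplitude of this frame.
            double sumSquares = 0.0;
            for (int i = offset; i < end; i++) {
                sumSquares += (double) samples[i] * samples[i];
            }
            boolean quiet = Math.sqrt(sumSquares / (end - offset)) < rmsThreshold;

            if (quiet && quietStart < 0) {
                quietStart = offset;            // a quiet run starts here
            } else if (!quiet && quietStart >= 0) {
                addIfLongEnough(pauses, quietStart, offset, sampleRate, minPauseSeconds);
                quietStart = -1;                // the quiet run ended
            }
        }
        // Close a quiet run that lasts until the end of the audio.
        if (quietStart >= 0) {
            addIfLongEnough(pauses, quietStart, samples.length, sampleRate, minPauseSeconds);
        }
        return pauses;
    }

    private static void addIfLongEnough(List<double[]> pauses, int startSample,
                                        int endSample, int sampleRate,
                                        double minPauseSeconds) {
        double start = startSample / (double) sampleRate;
        double end = endSample / (double) sampleRate;
        if (end - start >= minPauseSeconds) {
            pauses.add(new double[] {start, end});
        }
    }
}
```

Pauses found this way could then be compared against the aligned word timestamps, so that a word whose span covers a detected pause is flagged for correction instead of being found by hand.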
Also, a second, smaller problem: when running the aligner on audio longer than 50 min, it fails with an Out of Memory error at the liveCMN stage after about 2 h (the Java VM has a 7 GB heap limit). Is there a way to change this?
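For reference, a small sketch for confirming the heap limit the JVM is actually running with, using the standard `Runtime.maxMemory()` call. The class name `HeapCheck` is an assumption for illustration and has nothing to do with Sphinx4.

```java
// Hypothetical helper for illustration only; not part of Sphinx4.
final class HeapCheck {

    /** Maximum heap the running JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it with the same options as the aligner, for example `java -Xmx7g HeapCheck`, shows whether the `-Xmx` setting is really in effect; the reported value can be slightly below the configured one.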
Thanks for your help and your great work, which enables us to subtitle the CCC videos an order of magnitude faster.