Parsing issue in edge_tts #79

Open
rhsanborn opened this issue Jul 25, 2024 · 1 comment
Comments

@rhsanborn

File: tts_providers/edge_tts_provider.py

I was processing a file and ran into a weird error: ValueError: invalid literal for int() with base 10: 'A bunch of book text....'

I tracked it down to line 57 in edge_tts_provider.py. In essence, it hits the first pause, then for every chunk it asks "is there a close bracket in this chunk?" If there is, it assumes the text preceding the bracket is the pause time. That works for lots of text, but if your book has brackets in it, you end up stumbling on random brackets that are not associated with a pause time. With my fix, it now uses a regex to match the beginning of the string, then any number of digits, followed by a close bracket.

Here's the fix I put in; it requires importing re:

    for part in parts:
        # if "]" in part:
        if re.search(r'^\d*]', part):
            pause_time, content = part.split("]", 1)
            yield int(pause_time), content.strip()

I'm not sure if this fixes all edge cases, but it got me past this one.
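
One edge case that still seems open: `\d*` also matches zero digits, so a chunk that begins with a bare `]` would pass the check and then reach `int('')`, raising the same kind of ValueError. Below is a minimal sketch of a stricter variant. The helper name `split_pause` and the choice to return a pause of 0 for chunks with no leading pause time are my own assumptions for illustration, not the project's actual code.

```python
import re

# Hypothetical helper, not the project's actual code.
# Requires at least one digit before "]" at the very start of the chunk.
PAUSE_PREFIX = re.compile(r"(\d+)\](.*)", re.DOTALL)


def split_pause(part):
    """Return (pause_time, content) for one chunk of text."""
    match = PAUSE_PREFIX.match(part)  # match() only matches at the start
    if match:
        return int(match.group(1)), match.group(2).strip()
    # Assumption: a chunk without a leading pause time gets a pause of 0.
    return 0, part.strip()
```

Capturing the digits in the regex also avoids the separate `split("]", 1)` step, so brackets later in the book text are left untouched.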

Thanks for the awesome tool. I'm super psyched to have the Edge TTS and not be paying the equivalent of a produced ebook for the Azure credits!!

@p0n1
Owner

p0n1 commented Jul 29, 2024

Thanks for pointing this out and sharing your fix. Yeah, there was a bug, so I tried to fix it in #71. I'm not sure if you're using the latest code. Let me know if the latest version solves your issue.
