- Allow progress to work for multiple syncs even if alignment fails for a particular input;
- Allow specifying ffmpeg exe path using --ffmpeg-path;
- Updates for Python 3.12;
- Don't report sync as successful if the best score is negative (from @ajitid);
- Turn on Audio Sync for audio extraction process (from @dvh312);
- Replace unmaintained cchardet with faust-cchardet;
- Bugfix for waitpid on Windows;
- Misc maintenance / compatibility fixes;
- Blacken code and get rid of future_annotations dependency;
- Allow --apply-offset-seconds when only subtitles specified;
- Make the option for golden section search over scale factors (--gss) available from help;
- Use -inf as objective for invalid offsets;
- Don't remove log file if --log-dir-path explicitly requested;
- Add --suppress-output-if-offset-less-than arg to suppress output for small syncs;
- Fix a couple of validation bugs that prevented certain uncommon command line options from use;
- Make typing_extensions a requirement;
- Hotfix for pysubs2 on Python 3.6;
- Support SSA embedded fonts using new pysubs2 'opaque_fonts' metadata;
- Set min required pysubs2 version to 1.2.0 to ensure the aforementioned functionality is available;
- Pin auditok to 0.1.5 to avoid API-breaking change;
- Misc sync improvements:
  - Have webrtcvad use '0' as the non-speech label instead of 0.5;
  - Allow the VAD non-speech label to be specified via the --non-speech-label command line parameter;
  - Don't try to infer framerate ratio based on length between first and last speech frames for non-subtitle speech detection;
- Lots of improvements from PRs submitted by @alucryd (big thanks!):
  - Retain ASS styles;
  - Support syncing several subs against the same ref via --overwrite-input flag;
  - Add --apply-offset-seconds postprocess option to shift alignment by prespecified amount;
- Filter out metadata in subtitles when extracting speech;
- Add experimental --golden-section-search over framerate ratio (off by default);
- Try to improve sync by inferring framerate ratio based on relative duration of synced vs unsynced;
- Make the default max offset 60 seconds and enforce it during alignment, rather than discarding alignments whose offset exceeds max_offset_seconds;
- Add experimental section for using golden section search to find framerate ratio;
- Restore ability to read stdin and write stdout after buggy permissions check;
- Exceptions that occur during syncing were mistakenly suppressed; this is now fixed;
- Use webrtcvad-wheels on Windows to eliminate dependency on compiler;
- Misc bugfixes and stability improvements;
- Bugfix for writing subs to stdout;
- Allow MicroDVD input format;
- Use output extension to determine output format;
- Use rich formatting for Python >= 3.6;
- Use versioneer to manage versions;
- Fix regression where stdout not used for default output;
- Add ability to specify path to ffmpeg / ffprobe binaries;
- Add ability to overwrite the input / unsynced srt with the --overwrite-input flag;
- Fix Python 2 compatibility bug;
- Add --reference-stream option for selecting the stream / track from the video reference to use for speech detection;
- Remove dependency on scikit-learn;
- Implement PyInstaller / Gooey build process for graphical application on macOS and Windows;
- Fix PyPI issues;
- Fix corner case bug that occurred when multiple sync attempts were scored the same;
- Attempt speech extraction from subtitle tracks embedded in video first before using VAD;
- Hotfix for test archive creation bug;
- Add ability to merge synced and reference subs into bilingual subs when reference is srt;
- Fix bug in handling ASS / SSA input (this format should work now);
- Better detection of text file encodings;
- Add ASS / SSA functionality (currently untested);
- Allow serializing speech with the --serialize-speech flag;
- Convenient --make-test-case flag to create test cases when filing sync-related bugs;
- Use utf-8 as default output encoding (instead of using same encoding as input);
- More robust test framework (integration tests!);
- Try to correct for framerate differences by picking best framerate ratio;
- Revert changes from 0.2.9 now that srt parses weird timestamps robustly;
- Revert changes from 0.2.12 (caused regression on Windows);
- Bump min required scikit-learn to 0.20.4;
- Clear O_NONBLOCK flag on stdout stream in case it is set;
- Quick and dirty fix to recover without progress info if ffmpeg.probe raises;
- Specify utf-8 encoding at top of file for backcompat with Python 2;
- Quick and dirty fix to properly handle timestamp ms fields with >3 digits;
- Allow user to specify start time (in seconds) for processing;
- Add utf-16 to list of encodings to try for inference purposes;
- Fix argument parsing regression;
- Clamp subtitles to maximum duration (default 10 seconds);
- Add six to requirements.txt;
- Set default encoding to utf8 to ensure non-ASCII filenames are handled properly;
- Minor change to subtitle speech extraction;
- Allow reading input srt from stdin;
- Allow specifying encodings for reference, input, and output srt;
- Use the same encoding for both input srt and output srt by default;
- Developer note: using sklearn-style data pipelines now;
- Developer note: change progress-only to vlc-mode and remove from help docs;
- Get rid of auditok (GPLv3, was hurting alignment algorithm);
- Change to alignment algo: don't penalize matching video non-speech with subtitle speech;
- Add Chinese to the list of encodings that can be inferred;
- Make srt parsing more robust;
- Misc bugfixes;
- Proper logging;
- Proper version handling;
- Support srt format;
- Support using srt as reference;
- Support using video as reference (via ffmpeg);
- Support writing to stdout or file (read from stdin not yet supported; can only read from file);
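
Several entries above mention golden section search over the framerate ratio (scale factor). As a rough illustration of the general technique only, and not the project's actual implementation, the sketch below maximizes a unimodal score over an interval. The function name `golden_section_maximize`, the toy `score` objective, and the search bounds are all hypothetical and chosen for the example.

```python
import math

# 1/phi, the fraction by which the bracket shrinks on every iteration.
INV_PHI = (math.sqrt(5) - 1) / 2


def golden_section_maximize(f, lo, hi, tol=1e-6):
    """Return an x in [lo, hi] that approximately maximizes a unimodal f.

    Hypothetical helper for illustration; assumes f has a single peak
    inside the interval.
    """
    a, b = lo, hi
    # Two interior probe points placed at golden-ratio positions.
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:
            # The peak lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # The peak lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return (a + b) / 2


def score(ratio):
    """Toy alignment score (an assumption) that peaks at ratio 25/24."""
    return -(ratio - 25 / 24) ** 2


best_ratio = golden_section_maximize(score, 0.9, 1.1)
```

Each iteration reuses one of the two previous evaluations, so only one new call to the objective is needed per step. That matters when every evaluation is expensive, as a full alignment attempt would be.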