Releases: ReadAlongs/Studio
v1.1.0
💥 BREAKING CHANGES
- due to edaca6a - drop support for EOL Python 3.7 (commit by @joanise): Python 3.7 is no longer supported
✨ New Features
- 567cdf3 - publish the accuracy script we used in our 2022 papers (commit by @joanise)
- 4b0ad9d - cli generates readme (commit by @deltork)
- b3f8923 - switch to standard M.m.p versioning, e.g., 1.1.0 (commit by @joanise)
- 29ec752 - including generator meta tag in .readalong files (PR #226 by @roedoejet)
- 73d2455 - structure align output into www and Offline-HTML dirs (PR #231 by @joanise)
- d4b8a69 - RAS format 1.2, new DTD with annotations support (commit by @deltork)
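As an illustration of the switch to M.m.p versioning listed above, here is a minimal sketch of how the new scheme orders against the earlier date-stamped releases. It assumes version strings are plain dot-separated integers with an optional leading `v`; the helper name `parse_version` is hypothetical and is not part of Studio.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a dotted version string such as "1.1.0" into a tuple of ints."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


old_style = parse_version("1.0.20230228")  # date-stamped patch number
new_style = parse_version("v1.1.0")  # standard major.minor.patch

# Tuples compare element by element, so 1.1.0 sorts after 1.0.20230228
# because the minor number (1 > 0) is decided before the patch number is read.
print(new_style > old_style)
```

Comparing parsed tuples rather than raw strings avoids lexicographic surprises such as "1.10.0" sorting before "1.9.0".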
🐛 Bug Fixes
- d5520ea - the CLI needs to output utf8 on Windows (commit by @joanise)
- 5f446a0 - make Studio work with old and new g2p until the next g2p release (commit by @joanise)
- 96136b1 - relax the pympi-link reqs that were unnecessarily strict (commit by @joanise)
- 65158d5 - deps: make g2p^=1.0.date instead of >=1.0.date (commit by @joanise)
- b5d1192 - deploy only 4 workers because we use too much RAM (commit by @joanise)
- daa41a2 - issue the lexicon warning for all lexicon-based g2p's (commit by @joanise)
- 6bf0f91 - all empty words from g2p should get treated as error situations (commit by @joanise)
- 6dfd7fe - deps: adjust deps for latest g2p update (commit by @joanise)
- 9a526c2 - compensate for soundswallower model breakage (commit by @dhdaines)
- adfe51f - g2p main is about to be 2.0, but we still want 1.1 on Heroku (commit by @joanise)
- fba7b3a - api: pin anyio to less than 4.0.0 (commit by @roedoejet)
- eb03934 - make Studio compatible with Pydantic 2 and thus g2p 2 (commit by @joanise)
- 022bd31 - coloredlogs: remove bold bug (commit by @joanise)
- 53be622 - deps: lock numpy<2 because 2.0.0 is coming and has breaking changes (commit by @joanise)
- ddbfaee - bump lxml to support Python 3.11 on Windows (commit by @joanise)
- 81effb9 - work around missing/broken editdistance on python 3.12 (commit by @dhdaines)
- fc5db32 - bump fastapi to minimum Pydantic v2 compatible version (commit by @joanise)
- 8d8dfc2 - editdistance can now come from PyPI again for Py 3.12 (commit by @joanise)
- 787ead4 - style: bump black to 24.3.0 to fix black's first CVE (commit by @joanise)
- 06d9b17 - with pydantic 2, Field only takes examples plural (commit by @joanise)
- 4dd0e42 - the current web-component version is 1.4.x (commit by @joanise)
- 6ac285f - update the fallback offline bundles to web-c 1.4.0 (commit by @joanise)
- 469f9ec - updated the exported readme to include default upload path and image-asset-folder attribute (commit by @deltork)
- c4cac88 - added meta data to generated HTML (commit by @deltork)
- 7548038 - docs: fix errors in sphinx docs before conversion to mkdocs (commit by @joanise)
- d928373 - added optional id to meta tag attributes (PR #232 by @deltork)
- b432bad - very minor typo correction in cli.py (commit by @MENGZHEGENG)
- 6087b74 - very minor typo correction in cli.py (commit by @MENGZHEGENG)
- f114b8d - deps: exclude panphon 0.21 not compat with Python 3.8 (PR #237 by @joanise)
- 6fa849f - deps: gunicorn 23 has vulnerability fixes that seem worthwhile (PR #239 by @joanise)
- 0533b4f - remind user to install ffmpeg when an audio file is not found (commit by @joanise)
- fa7b191 - deps: we are actually compatible with numpy 2 (commit by @joanise)
- b1db242 - deps: remove panphon declaration since g2p fixed it (commit by @joanise)
- c4328db - ci: remove the broken sigstore code, and publish only real versions (commit by @joanise)
- 56b5e1d - fix (or attempt to fix) the pypi publication process (commit by @joanise)
- 51311e9 - ci: pypi publish does not like the sigstore.json files (commit by @joanise)
⚡ Performance Improvements
- 6c2eaa6 - update Procfile to start with 5 users instead (commit by @marctessier)
- 8bf389d - with lower memory use we can have 5 workers again (commit by @joanise)
- baffdd4 - defer expensive imports to optimize readalongs -h (commit by @joanise)
- 2daf991 - even more aggressively optimize readalongs -h (commit by @joanise)
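The deferred-import idea behind the last two entries can be sketched as follows. This is a minimal illustration, not Studio's actual CLI code: it uses `argparse` with a hypothetical `demo` program, and `json` stands in for a slow dependency.

```python
import argparse


def run_align(args):
    # Deferred import: the cost is paid only when this subcommand runs,
    # so building the parser (and printing -h) never loads the module.
    # `json` is a stand-in for a genuinely slow dependency.
    import json

    return json.dumps({"text": args.text})


def build_parser():
    # Only cheap, standard-library work happens at parser-construction time.
    parser = argparse.ArgumentParser(prog="demo")
    subparsers = parser.add_subparsers(dest="command", required=True)
    align = subparsers.add_parser("align", help="align text (toy example)")
    align.add_argument("text")
    align.set_defaults(func=run_align)
    return parser


def main(argv):
    args = build_parser().parse_args(argv)
    return args.func(args)


print(main(["align", "hello"]))
```

The trade-off is that import errors surface when a subcommand runs rather than at startup, which is usually acceptable for a help-heavy command-line tool.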
♻️ Refactors
- cd5d6df - move get_langs out to g2p (commit by @joanise)
- 774e178 - get_langs returns its output in order of codes, no need to re-sort (commit by @joanise)
- 256d79b - and g2p has been released, only use g2p.get_arpabet_langs (commit by @joanise)
- 1ad311b - use g2p's lexicon-based eng mapping (commit by @joanise)
- c34ca8c - strip now-redundant lexicon-g2p code from Studio (commit by @joanise)
- 94d49ae - let us parse and load XML in just one place (commit by @joanise)
- 396b2f1 - simplify parsing input_text in web_api /assemble (commit by @joanise)
- e338f97 - docs: automatically convert from Sphinx .rst to mkdocs .md (commit by @joanise)
- 191e1fb - **...
Release v1.0.20230228
1.0.20230228 (2023-02-28)
Features
- report empty g2p for a word as a warning (b89de62)
Bug Fixes
- make capture_logs work correctly with Python >= 3.9 (aa1ffca), closes #162
- where there are no words to align, return 422, not 500 (61d45e0)
- dtd: effective-g2p-lang missing from w def (dc944ad)
- by default, output each g2p error at most twice (a1f3c5d)
- clarify the settings and run the API by default (d5ca78e)
- do not fail when the lang code is invalid (3b2a433)
Reverts
- Revert "chore: specify python 3.8 runtime" (8012258)
Code Refactoring
- we no longer need to support g2p<=0.5.20211029 imports (04eb40e)
Build Systems
Continuous Integration
Release v1.0.20230224
1.0.20230224 (2023-02-25)
⚠ BREAKING CHANGES
- smil-ectomy
- avoid using dict for things that are lists
- new web API version
- use .ras not .xml
- no more smil
- update to .ras file extension in output
Features
- a simple DTD for standalone readalongs (f94174b)
- add time and dur attributes to w (5e92b91)
- avoid using dict for things that are lists (20fc1a5)
- basically s/tei/readalong/gi (5465d22)
- capture the logs from /assemble endpoint (5713abc)
- introduce better CORS environment variables (1eb5d74), closes #146
- introduce better CORS environment variables (d8ab4cf), closes #146
- log message to say we are in development mode (a1f53bc)
- new web API version (bec85ee)
- no more smil (eb9d25b)
- output to .ras (8d4f76b)
- refine the DTD somewhat (b7285f5)
- set our .readalong format to version 1.0 for publication (2f0da60)
- smil-ectomy (24dabbd)
- update .ras to .readalong (525facf)
- update to .ras file extension in output (163de15)
- update to href= in readalong component (79b8a64)
- use .ras not .xml (09b34d6)
Bug Fixes
- accept and use dur not duration (b65a5dd)
- add class to DTD and update version (903ddd5)
- add xml:lang and anchors everywhere (7192fa6)
- address XML external entity expansion vulnerabilities (d0c57f3)
- correct main guard in test_package_urls.py (fbda869)
- don't create blank pages (54d23d5), closes #136
- filter ASCII langs from the Studio-Web via web_api (e63406f)
- frantic and unsuccessful attempts to make CORS work (52b9a39)
- handle requests.get() timeouts correctly (464a3cc)
- make test case valid (fdd8cb0)
- only wait 10 seconds for JS_ and FONT_BUNDLE_URLs (a67e360)
- tell the user why their config.json is not valid (4deb32e)
- test and fix load_xml_zip (10718dc)
- try validating a different way to see if the CodeQL warning goes away (6e62789)
- update bundle.css and bundle.js to @readalongs/[email protected] (a462688)
- update package URLs (4eebbb3)
- use .xml not .ras to avoid breaking MIME guessing (19fc2f1)
- validate path request by /file endpoint in views.py (5ba27ad)
- when g2p fails completely, send the log with the exception (f772eec)
- woohps tyop (367e539)
- words may not be aligned (e.g. do-not-align) (586154d)
- docs: fix formatting of /langs endpoint sample output (7059978)
- write out a version number in the .readalong format (2e28d86)
Tests
- add a couple more tests (c68f06e)
- add bogus alignments (959bda4)
- add test of RAS XML validation (7dd3072)
- appease the codecov beast (5dac6f3)
- basically s/tei/readalong/gi (4c1ed5a)
- not sure why "lang" not "xml:lang" (96ae876)
- test new web API (adb6d3e)
- there is no text + alignment, there is only readalong (5e623b7)
- tolerate unpkg timeouts as non-failures (dbd3942)
- update .ras to .readalong (300a622)
- update for new component and file format (1d2ca59)
Code Refactoring
- capture logs with a context manager (87716bd)
- change master branch name to main (3b64837)
- move all etree.parse calls to a single well-tested function (c2996b8)
- reformat (1874e96)
- webapi: change back to v1 (c80224a)
- remove deprecated studio (692d630)
- switch to .readalong extension (c4c6c89)
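The "capture logs with a context manager" refactoring above can be illustrated with a short sketch. This is an assumption about the general pattern, not Studio's implementation: the signature of `capture_logs` here and the `demo` logger are hypothetical.

```python
import io
import logging
from contextlib import contextmanager


@contextmanager
def capture_logs(logger: logging.Logger, level: int = logging.INFO):
    """Temporarily attach a handler that collects log output in a StringIO."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setLevel(level)
    logger.addHandler(handler)
    try:
        yield stream
    finally:
        # Always detach the handler, even if the body raised.
        logger.removeHandler(handler)


logger = logging.getLogger("demo")
logger.setLevel(logging.INFO)

with capture_logs(logger) as captured:
    logger.info("aligning")

print(captured.getvalue())
```

Because the handler is removed in the `finally` clause, captured output from one request cannot leak into the next, which matters when the log is returned to a web API caller.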
Continuous Integration
- activate CodeQL code analysis (7f8761b)
- combine debug and non-debug web_api tests (214b081)
- only run CodeQL on cron and push to master and release (6d44875)
- submit PR instead of pushing version bump (6f181a7), closes #83
- test web server in development mode too (e7d892c)
Documentation
Release v0.2.20221114
0.2.20221114 (2022-11-14)
Features
- add --align-mode to readalongs align (6023367)
- improve programmatic API to readalongs (45bfd5c)
- silence aligner logs unless --debug-aligner is used (69ab713)
- starting an API for readalongs commands (08b2307)
- api: created web api with fastapi (2d11d73)
- api: the API fns now return (status, exception, log) (639c2a9)
- flask-app: starting to make flask app use API (ddd5071)
- Add a (hidden) -oo / --output-orth option to control the output orthography (93d7228)
- Add header and theme from config.json (c0c8859)
- add heroku support (17c7d82)
- apply header and theme to basic html page too (035aeca)
- clean up log handling and fix debug_aligner (d0edc09)
- convert_alignment now also supports ELAN eaf (6fcc404)
- endpoint /convert_alignment supports srt and vtt (e035cb2)
- error handling and testing for the -oo option (6316d29)
- new /convert_to_TextGrid endpoint in web_api (42f724f)
- parse_smil() with unit testing (84af6b0)
- re-introduce the sub-word functionality in Studio (6349814)
- support "acoustic_model" in config (30537c2)
- update for soundswallower 0.4.0 (7c98e45)
Bug Fixes
- adjust Docker to requirements.* changes (ad80da3)
- align should delete its temporary files (1b09db9)
- always explicitly declare the encoding when you open a file (afb0908)
- api.prepare() still needs to exist, with a deprecation warning (79d04ad)
- case-insensitive option matching done consistently (a1f0805)
- clean up our own temp files! (a27a0c7)
- cli.align() should not modify its arguments (b422083)
- default for save_temps is None so check that, not truthiness (d955f16)
- don't save SoundSwallower logs on Windows, it's buggy (ed3e4a8)
- extract sentences correctly on page changes (e9a2a18), closes #70
- failure to g2p should be 422, not 400 (b0da504)
- get_langs() should only return supported langs in the dict (6f8e458)
- ignore BOMs when reading files, though never generate them (ca8e264)
- ignore whitespace on blank lines for paragraph and page breaks (aea222f)
- in 2022, "python" is Python 3 (64051a2)
- make sure final_end is defined (9e33ecb)
- make sure the API accepts pathlib.Path objects (50972ae)
- minor bugs and efficiency improvements (5f0080b)
- more robust sentence extraction for srt, vtt, TextGrid (3218fe4)
- new acoustic model requires new soundswallower (973611b)
- no idea why set_string method does not exist on Travis-CI? (93764a9)
- noisewords from the acoustic model to avoid misalignments/warnings (bb3c0aa)
- on Windows, don't let _version.py be generated with CRLF (dedb065)
- peg web-component version to ^0.1.6 (bbb7edd)
- remove dead code (that didn't use noisewords properly) (7ab522b)
- remove unsupported encoding argument from web_api (41d5912)
- require soundswallower 0.4.1 to fix windows (50cbc37)
- restore backwards-compatibility for getLangs==get_langs (93a9b26)
- restrict to 0.2.x soundswallower (cbf1ef9)
- switch to binary mdefs (dc212a9)
- the FSG/JSGF filters out empty ARPABET, not empty words (dc73638)
- typo (ddd9f65)
- undo change to pbeam, that was not refactoring! (5e44376)
- update soundswallower to 0.2.0, fixes failure on long inputs (31606a6)
- use --debug-g2p instead of --g2p-verbose (cd28076)
- flask-app: update for current CLI; nicer logs (065e723)
- update model layout for soundswallower 0.4.0 (d923543)
- was missing get_string()! mystery solved! (80e4697)
- work around bug in SoundSwallower on empty alignment (8cfe119)
- api: allow origins from studio app (3e9ef8a)
- api: make TextGrid work in studio demo and through API (0fc69a6)
- LICENSE: state in LICENSE difference in model license (6814640)
- sub-word: the word is all its subelements, not just word.text (1dc8527)
- test: fix the sub-word test suite (dd52238)
- test: make test_audio.py compatible with pydub 0.25.1 (d1a4712)
- test: make test_indices.py compatible with g2p PR#166 (9119681)
- test: test suite should be compatible with older g2p versi...
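Two of the fixes above, declaring the encoding explicitly when opening a file and ignoring BOMs on input while never generating them, can be sketched together. This is a minimal illustration using Python's standard codecs; the helper names `read_text` and `write_text` are hypothetical and not taken from Studio.

```python
import tempfile
from pathlib import Path


def read_text(path):
    # "utf-8-sig" decodes UTF-8 and drops a leading BOM if one is present.
    with open(path, encoding="utf-8-sig") as f:
        return f.read()


def write_text(path, text):
    # Plain "utf-8" never writes a BOM.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_bytes(b"\xef\xbb\xbfbonjour")  # file saved with a BOM
    text = read_text(sample)
    write_text(sample, text)
    raw = sample.read_bytes()

print(repr(text), raw)
```

Passing `encoding=` on every `open()` call also avoids depending on the platform default, which is not UTF-8 on many Windows setups.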
Release v0.2.20220126
0.2.20220126 (2022-01-26)
Features
- cli: "und" is now added by default to -l list of languages (fd6189b)
- cli: accept comma as separator for lists as well as colon (4f9eafd)
- cli: added readalongs langs command (dfaaf15)
- silence: add fallbacks and exception handling (e61d201)
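The list-parsing behaviour described in the first two entries above could look roughly like this. It is a sketch under assumptions: the function name `parse_language_list` is hypothetical, and appending "und" at the end of the list is an assumption about where the default is placed.

```python
import re


def parse_language_list(value):
    """Split a -l style value on commas or colons and ensure "und" is included."""
    langs = [item for item in re.split(r"[,:]", value) if item]
    # Assumption: the undetermined-language fallback goes last when not given.
    if "und" not in langs:
        langs.append("und")
    return langs


print(parse_language_list("fra,eng"))
print(parse_language_list("fra:eng:und"))
```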
Bug Fixes
- g2p: better error messages on invalid language codes (9e71372)
- requirements: remove text-unidecode, no longer used directly (1458fd0)
- requirements: studio does not actually use Flask-Cors (5128d91)
- test: reloaded audio file should tolerate a small duration change (ceff68a)
- avoid stack trace when no non-noise segments are found (fixes #88) (07817fd)
- video: force audio mimetypes for video formats (f543055)
- be less strict about failures to guess mime type (17c2492)
- better error messages on bad utf8 plain text input. Fixes #22 (3c55e15)
Performance Improvements
- optimize the CLI, mostly by deferring expensive imports (5c9e3fb)
Tests
- increase test coverage for dev.cli (fdaccfe)
Documentation
- cli: add better cli documentation for readalongs (db8f826)
- align -o is for additional formats, on top of XML+SMIL (442e01b)
- document how to contribute to the docs/ folder (8fa98dd)
- cli: document the recent changes to the CLI (67f0593)
- document installation using Anaconda on Windows (d75c7cc)
- improvements from @roedoejet feedback on PR #93 (1d90f10)
- mention OpenSamples in README.md (6184732)
- polish the updated README.md (5ae1cd3)
- README.md improve with feedback at team meeting (5b511b0)
- recommend miniconda instead of the full anaconda (3014dae)
- remove unstable warning (8c10a5d)
- update TOC in README.md (663a3d8)
Continuous Integration
- bump dist to bionic (c9b7ed2)
Code Refactoring
- silence: change syntax for adding silence and allow output to variety of audio formats (a75dca2)
- undo und work here since it is now done in g2p (b97e1e7)
- cli: allow multiple -o values to join colon-joined (9f0a837)
- cli: change formatting for align output formats help (3588f9e)
- cli: refactor output formats to -o argument (f34f426)
- cli: remove -i; auto-determine XML vs plain text (77a1ac8)
- cli: remove alignment unit option from cli (3002626)
- cli: remove epub from cli (3e53b72)
- cli: replace --g2p-fallback option by -l with multiple languages (78ab484)
- test: stub out SoundSwallower to speed tests that don't care about its output (3428184)
- remove docs for epub (55d8e04)
- simplify python version check and move it to __init__.py (ca2394f)
- use the more meaningful exceptions from make_g2p when available (52470e9)
Styles
Release v0.1.20211013
0.1.20211013 (2021-10-13)
Features
- anchors: extract_section fn for audio files with testing (df974a2)
- html: first commit for web component html output (bfc1038)
- add b64 encoding of embedded images (807d913)
- add silence insertion feature (1663779)
- anchor times now supported in h/m/s/ms, like in Audacity (c4b4ca8)
- non-caching-server-3.7.py, compatible with Python 3.7 (fe7df92)
- package: add packaging of fonts and js bundles (252e605)
Bug Fixes
- correct attribute name in readalongs (17b4dfe)
- Docker on Windows compatibility issue (6ee07b1)
- issue #73, fix correction for multiple DNA segments (277b530)
- round segment times to 3 digits, i.e. 1ms precision (54c9475)
- temp files have to be closed to get auto-cleaned on Windows (1260d75)
- use consistent Copyright notices (d4c61bf)
- ci: add push (ad2c50e)
- test: fix mimetype issue with test for b64 encoding (c25502a)
Continuous Integration
- commit and merge bumped version after release (94b547e)
- rtd: upgrade read the docs to version 2 (d3ff623)
Styles
- apply a bunch of pylint recommendations (3eb2a33)
- blackify non-caching-server-3.7.py (ee480fb)
- convert some docstrings to google (#16) (7534545)
- don't hide what is going on running make in docs (f9e68e3)
- have data/ej-fra.xml also match the current readalongs prepare output (c798f29)
- nicer argument doc for create_input_tei (db70984)
- reformat make_smil.py docstring to Google format, as per issue #16 (f0aba91)
Code Refactoring
- move audio time adj fns to audio_utils.py (7d47de7)
- align: make align work on a loop of sequences (WIP anchors) (c4bc7ec)
- change pystache to chevron (d1797c0)
- make LANGS, LANG_NAMES and parse_g2p_fallback importable (60cf1e9)
- narrow the try/except to just the parse_g2p_fallback call (753c77e)
- polish Toby's util.py refactoring (b04e4fb)
- remove dead create_input_xml (1940bea)
- remove unused jsgf file (562780c)
- rename non-caching-server.py -> 3.9.py since only Python 3.9 compat (862469f)
- splice dna_utils.py out of audio_utils.py (b20d072)
- use nrc logo instead (d17aa7a)
- docs: create advanced-use.rst and cli-guide.rst (6cf12e2)
- docs: remove cli-user-guide*, all contents is now elsewhere (06557ec)
- docs: rename cli.rst->cli-ref.rst in prep for cli-guide.rst (df403bf)
- parse_time: remove unreachable else: raise statement (97b563f)
- silence: change to using proper parse time function instead of only ms (a646574)
- silence: change variable to be explicit about ms (dda3414)
- test: create basic test case class (f189614)
- test: move the align --html test to its own function (6c9dc65)
- test: name temp dirs after the classes (212df57)
- test: rename basic test case file (3c52ac2)
Tests
- unit testing for issue #73 (90cd88a)
- anchors: check that partial wav files were created (d8eb857)
- anchors: improve test coverage (97ff810)
- anchors: fix ej-fra-anchors2.xml so it's alignable (8ae8850)
- anchors: test cases for aligning with anchors (7468589)
- anchors: test data for processing anchors (5c4f6c3)
- add basic test for single file html output (86d8f1b)
- better message for tests that might fail when dependencies change (8147b03)
- disable test_align_cli.test_permission_denied since it is unstable (251ae45)
- images for RAS testing, created by EJ@NRC (2cc6ed3)
- improve unittesting for package (231005e)
- package: add test for bundled web component assets (c190fa5)
Documentation
- add TL;DR to Contributing.md (552683a)
- convert audio_utils.py docs to Google standard (8b2e8e1)
- create CLI guide vs CLI reference sections (ac35316)
- document run.py vs readalongs/run.py (e4e0794)
- fix the README.md badges (6c10a5d)
- improve docstring readability (c95e5f0)
- improvements to the CLI documentation (9f26b98)
- include cli-user-guide in read-the-docs output (7888222)
- make README.md and docs/cli-user-guide.md more coherent ...