StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Paper list of simultaneous translation / streaming translation, including text-to-text machine translation and speech-to-text translation.
A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup.
Repository containing the open source code of works published at the FBK MT unit.
Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation"
Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"
PyTorch toolkit for streaming speech recognition, speech translation and simultaneous translation based on fairseq.
Source code for ICLR 2023 spotlight paper "Hidden Markov Transformer for Simultaneous Machine Translation"
Code for EMNLP 2022 main conference paper "Information-Transport-based Policy for Simultaneous Translation"
Code for ACL 2022 main conference paper "Modeling Dual Read/Write Paths for Simultaneous Machine Translation"
Code for ACL 2022 findings paper "Gaussian Multi-head Attention for Simultaneous Machine Translation"
Official implementation for EMNLP 2023 paper "Non-autoregressive Streaming Transformer for Simultaneous Translation"
[ICML 2024] Official implementation of "LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions."
Code for EMNLP 2021 oral paper "Universal Simultaneous Machine Translation with Mixture-of-Experts Wait-k Policy"
Implementation of the paper "Anticipation-Free Training for Simultaneous Machine Translation"
Code implementation for the paper "Learning Monotonic Attention in Transducer for Streaming Generation"
Source code for our EMNLP 2022 paper "Wait-info Policy: Balancing Source and Target at Information Level for Simultaneous Machine Translation"
Simultaneous-Interpretation is an advanced tool for real-time simultaneous interpretation. It transcribes and translates spoken language from a microphone input instantaneously, continually refining translations for accuracy. Ideal for business meetings, educational settings, and live events, it enhances multilingual communication effortlessly.