TransVisio translates between multiple languages using Large Language Models (LLMs). It accepts input in various formats (e.g., a subtitle file or a video) and extracts the text for translation.
- Models supported:
  - GPT 4o
  - GPT 4 Turbo
  - GPT 3 Turbo
  - Gemini 1.5 Pro
  - Gemini 1.5 Flash
  - Whisper 20231117 (Online)
  - Faster-Whisper v1.0.3 (Offline)
- Inputs supported:
  - Subtitle files (*.srt *.ass *.ssa).
  - Video files (*.mp4 *.mkv *.webm *.flv *.avi *.mov *.wmv *.m4v).
  - Audio files (*.wav *.ogg *.mp3 *.aac *.flac *.m4a *.oga *.opus).
  - Excel files (*.xlsx *.csv).
- You can save video/audio transcriptions to Excel.
- You can specify the number of input sentences.
- You can pause and resume translation at any point.
- You can reverse the direction of the translated output.
- You can remove or edit the translated output and the input. The tool automatically realigns the rows.
- You can specify a Start Time and Duration for video/audio inputs.
- A Temperature setting controls the randomness of the model’s output. A lower value makes the output more deterministic and focused, while a higher value makes it more diverse and creative.
- Light and Dark themes.
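
To make the Temperature setting concrete, here is a minimal sketch of how temperature is commonly applied when a language model picks its next token: the raw scores are divided by the temperature before being turned into probabilities. This is a general illustration, not TransVisio's code; the function name `softmax_with_temperature` and the example scores are assumptions, and hosted models apply this step on the server side.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, scaled by a temperature (illustrative helper)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate words.
logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, temperature=0.2)   # sharper distribution
high = softmax_with_temperature(logits, temperature=2.0)  # flatter distribution
```

With the low temperature almost all of the probability goes to the top candidate, so the output is close to deterministic. With the high temperature the probabilities are closer together, so sampling produces more varied output.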
Note:
Make sure that you specify the Start Time and Duration before selecting the video/audio input.
Online Whisper requires an API key and accepts inputs of up to 25 MB.
Offline Whisper does not require a key, but it must download a model (e.g., tiny or small) on first use.
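
The supported input types and the Online Whisper size limit can be checked before a file is submitted. The sketch below is only an illustration under stated assumptions: the names `INPUT_KINDS`, `classify_input`, and `fits_online_whisper` are hypothetical and not part of TransVisio, and 1 MB is assumed to mean 1024 * 1024 bytes.

```python
from pathlib import Path

# Assumption: "25 MB" is interpreted as 25 * 1024 * 1024 bytes.
ONLINE_WHISPER_LIMIT_BYTES = 25 * 1024 * 1024

# Extensions taken from the list of supported inputs.
INPUT_KINDS = {
    "subtitle": {".srt", ".ass", ".ssa"},
    "video": {".mp4", ".mkv", ".webm", ".flv", ".avi", ".mov", ".wmv", ".m4v"},
    "audio": {".wav", ".ogg", ".mp3", ".aac", ".flac", ".m4a", ".oga", ".opus"},
    "excel": {".xlsx", ".csv"},
}


def classify_input(path):
    """Return the input kind for a file name, or None if the extension is unsupported."""
    suffix = Path(path).suffix.lower()
    for kind, extensions in INPUT_KINDS.items():
        if suffix in extensions:
            return kind
    return None


def fits_online_whisper(size_bytes):
    """Return True if a file of this size is within the assumed 25 MB limit."""
    return size_bytes <= ONLINE_WHISPER_LIMIT_BYTES
```

For a file on disk, `Path(path).stat().st_size` gives the size in bytes to pass to `fits_online_whisper`. A video or audio file that exceeds the limit could be sent to Offline Whisper instead, or shortened with the Start Time and Duration settings.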
TransVisio is part of a collaborative research project funded by the Abdul Hameed Shoman Foundation (Agreement Number: 230800351).
Hosting Institution: English Language and Translation Department, Applied Science Private University.