Whisper Truss

Whisper is a speech-to-text model by OpenAI that transcribes audio in dozens of languages with remarkable accuracy. It is open-source under the MIT license and hosted on Baseten as a pre-trained model. Read the Whisper model card for more details.

Whisper's transcription quality makes it practical for a range of use cases, including:

Moderating audio content
Auditing call center logs
Automatically generating video subtitles
Improving podcast SEO with transcripts

Deploying Whisper

To deploy the Whisper Truss, you'll need to follow these steps:

Prerequisites: Make sure you have a Baseten account and API key. You can sign up for a Baseten account here.
Install Truss and the Baseten Python client: If you haven't already, install the Baseten Python client and Truss in your development environment using:

pip install --upgrade baseten truss

Load the Whisper Truss: Assuming you've cloned this repo, spin up an IPython shell and load the Truss into memory:

import truss

whisper_truss = truss.load("path/to/whisper_truss")

Log in to Baseten: Log in to your Baseten account using your API key (key found here):

import baseten

baseten.login("PASTE_API_KEY_HERE")

Deploy the Whisper Truss: Deploy the Whisper Truss to Baseten with the following command:

baseten.deploy(whisper_truss)

Once your Truss is deployed, you can start using the Whisper model through the Baseten platform! Navigate to the Baseten UI to watch the model build and deploy, then invoke it via the REST API.

Whisper API documentation

Input

This deployment of Whisper takes a JSON dictionary with a single key, url, whose value is a string containing a URL that points to an MP3 file. For example:

{
    "url": "https://cdn.baseten.co/docs/production/Gettysburg.mp3"
}
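As a minimal sketch, the request body above could be built and sanity-checked in Python before it is sent. The helper name build_whisper_input is hypothetical and not part of Truss or the Baseten client; it only assumes the input format described in this section.

```python
import json
from urllib.parse import urlparse


def build_whisper_input(audio_url: str) -> str:
    """Return the JSON request body for an audio URL (hypothetical helper)."""
    parsed = urlparse(audio_url)
    # Reject anything that is not an absolute http(s) URL before sending it.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected an http(s) URL, got {audio_url!r}")
    # The deployment expects a dictionary with the single key "url".
    return json.dumps({"url": audio_url})


body = build_whisper_input("https://cdn.baseten.co/docs/production/Gettysburg.mp3")
print(body)
```

Checking the URL on the client side is a design choice for faster feedback; the model itself only needs the dictionary shown above.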

Output

The model returns a fairly lengthy dictionary. For most uses, the two keys of interest are language, the detected language of the audio, and text, the full transcription. The segments key breaks the transcription into timestamped pieces, with start and end given in seconds.

{
    "language": "english",
    "segments": [
        {
            "start": 0,
            "end": 6.5200000000000005,
            "text": " Four score and seven years ago, our fathers brought forth upon this continent a new nation"
        },
        {
            "start": 6.52,
            "end": 21.6,
            "text": " conceived in liberty and dedicated to the proposition that all men are created equal."
        }
    ],
    "text": " Four score and seven years ago, our fathers brought forth upon this continent..."
}
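To illustrate how the segments list could support the subtitle use case mentioned earlier, here is a rough sketch that converts an output dictionary of the shape shown above into SRT subtitle text. The function names format_timestamp and segments_to_srt are hypothetical, and the sample dictionary is abbreviated from the example output.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(output: dict) -> str:
    """Build SRT text from the "segments" list of a Whisper output dictionary."""
    blocks = []
    for index, segment in enumerate(output["segments"], start=1):
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        # Segment text carries a leading space in the example output, so strip it.
        blocks.append(f"{index}\n{start} --> {end}\n{segment['text'].strip()}\n")
    return "\n".join(blocks)


# Abbreviated copy of the example output above.
sample_output = {
    "language": "english",
    "segments": [
        {"start": 0, "end": 6.5200000000000005, "text": " Four score and seven years ago"},
        {"start": 6.52, "end": 21.6, "text": " conceived in liberty"},
    ],
    "text": " Four score and seven years ago...",
}

srt_text = segments_to_srt(sample_output)
print(srt_text)
```

Rounding to whole milliseconds absorbs floating-point noise such as 6.5200000000000005, so adjacent segments line up cleanly.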

Example usage

You can invoke your Whisper deployment via its REST API endpoint:

curl -X POST "https://app.baseten.co/models/{MODEL_ID}/predict" \
     -H "Content-Type: application/json" \
     -H 'Authorization: Api-Key {YOUR_API_KEY}' \
     -d '{"url": "https://cdn.baseten.co/docs/production/Gettysburg.mp3"}'
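The same call can be sketched in Python using only the standard library. This mirrors the curl command above (same endpoint pattern, headers, and body); the function names and the placeholder values MODEL_ID and YOUR_API_KEY are illustrative, and the exact shape of the HTTP response is an assumption, since the REST API may wrap the dictionary described in the Output section in an outer envelope.

```python
import json
import urllib.request


def build_predict_request(model_id: str, api_key: str, audio_url: str) -> urllib.request.Request:
    """Build the POST request that the curl example sends (hypothetical helper)."""
    return urllib.request.Request(
        url=f"https://app.baseten.co/models/{model_id}/predict",
        data=json.dumps({"url": audio_url}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Api-Key {api_key}",
        },
        method="POST",
    )


def transcribe(model_id: str, api_key: str, audio_url: str, timeout: float = 300.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_predict_request(model_id, api_key, audio_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Placeholder values; substitute your own model ID and API key before calling transcribe().
request = build_predict_request(
    "MODEL_ID",
    "YOUR_API_KEY",
    "https://cdn.baseten.co/docs/production/Gettysburg.mp3",
)
print(request.get_method(), request.full_url)
```

Calling transcribe("MODEL_ID", "YOUR_API_KEY", audio_url) with real credentials performs the network request; building the request separately makes it easy to inspect before anything is sent.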
