[feature request] whisper #1251

Open
0wwafa opened this issue Dec 5, 2024 · 3 comments

Comments

@0wwafa

0wwafa commented Dec 5, 2024

Since you are the only one (my hero!) who still supports CLBlast, and since you already use Whisper, could you make KoboldCpp act like whisper.cpp, so I can use it to transcribe some long Italian movies and translate them to English?

@LostRuins
Owner

It already exists! You just need to load the whisper model with --whispermodel. See https://github.com/LostRuins/koboldcpp/wiki#what-is-whisper

KoboldCpp can also transcribe WAV audio files using the OpenAI-compatible endpoint. For example:

curl --request POST \
  --url http://localhost:5001/v1/audio/transcriptions \
  --header 'Content-Type: multipart/form-data' \
  --form file=@/path/to/file/audio.wav
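
For anyone who would rather script this than call curl, here is a rough Python equivalent of the request above, using only the standard library. It is a sketch, not KoboldCpp's own client: the helper names `build_multipart` and `transcribe` are made up for illustration, the `audio/wav` part type is my choice, and I'm assuming the server answers with JSON (OpenAI's API returns an object with a `text` field, but check what your version actually sends back).

```python
import json
import sys
import urllib.request
import uuid


def build_multipart(field, filename, data, boundary=None):
    """Encode a single file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(wav_path, url="http://localhost:5001/v1/audio/transcriptions"):
    """POST a WAV file to the transcription endpoint and return the parsed JSON.

    Assumption: the response body is JSON, as with OpenAI's API.
    """
    with open(wav_path, "rb") as f:
        audio = f.read()
    # "file" is the same form field name used in the curl example.
    body, content_type = build_multipart("file", "audio.wav", audio)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python transcribe.py /path/to/file/audio.wav
    print(transcribe(sys.argv[1]))
```

This needs KoboldCpp already running with a Whisper model loaded via --whispermodel, on the same port as in the curl example.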

@0wwafa
Author

0wwafa commented Dec 6, 2024

I see. It would be somewhat more useful to have it directly on the command line, like:
./kobold.cpp -wm model.bin -ot time_offset -p "additional prompt" input.wav

Or something like that.
But I'll try it out in server mode.

@LostRuins
Owner

Hi, can you please try the latest version, 1.80? I've added a feature to upload files for transcription from the GUI.

image
