Running your latest version on ArchLinux.
nodejs-whisper says the WAV file is valid, but later the native whisper instance says it's not. Huh?
[dev:server] [Nodejs-whisper] File is a valid WAV file.
And later it says:
[dev:server] read_wav: WAV file '/home/michael-heuberger/code/binarykitchen/videomail.io/var/local/tmp/clients/videomail.io/1ef7ae52-7eab-6f50-8362-05f8c267a8f2/videomail_preview.wav' must be 16 kHz
[dev:server] error: failed to read WAV file '/home/michael-heuberger/code/binarykitchen/videomail.io/var/local/tmp/clients/videomail.io/1ef7ae52-7eab-6f50-8362-05f8c267a8f2/videomail_preview.wav'
Here are the details from the logs:
[dev:server] DEBUG: »»-----------------------------------------►
[dev:server] [Nodejs-whisper] Checking and downloading model if needed: base
[dev:server] autoDownloadModelName base
[dev:server] options {
[dev:server] modelName: 'base',
[dev:server] autoDownloadModelName: 'base',
[dev:server] verbose: true,
[dev:server] removeWavFileAfterTranscription: false,
[dev:server] whisperOptions: { outputInVtt: true }
[dev:server] }
[dev:server] [Nodejs-whisper] Models already exist. Skipping download.
[dev:server] [Nodejs-whisper] Checking file existence: /home/michael-heuberger/code/binarykitchen/videomail.io/var/local/tmp/clients/videomail.io/1ef7ae52-7eab-6f50-8362-05f8c267a8f2/videomail_preview.wav
[dev:server] [Nodejs-whisper] Converting file to WAV format: /home/michael-heuberger/code/binarykitchen/videomail.io/var/local/tmp/clients/videomail.io/1ef7ae52-7eab-6f50-8362-05f8c267a8f2/videomail_preview.wav
[dev:server] [Nodejs-whisper] Checking if the file is a valid WAV: /home/michael-heuberger/code/binarykitchen/videomail.io/var/local/tmp/clients/videomail.io/1ef7ae52-7eab-6f50-8362-05f8c267a8f2/videomail_preview.wav
[dev:server] [Nodejs-whisper] File is a valid WAV file.
[dev:server] [Nodejs-whisper] Constructing command for file: /home/michael-heuberger/code/binarykitchen/videomail.io/var/local/tmp/clients/videomail.io/1ef7ae52-7eab-6f50-8362-05f8c267a8f2/videomail_preview.wav
[dev:server] [Nodejs-whisper] Executing command: ./main -ovtt -l auto -m ./models/ggml-base.bin -f /home/michael-heuberger/code/binarykitchen/videomail.io/var/local/tmp/clients/videomail.io/1ef7ae52-7eab-6f50-8362-05f8c267a8f2/videomail_preview.wav
[dev:server] code--- 0
[dev:server] stdout---
[dev:server] stderr--- whisper_init_from_file_with_params_no_state: loading model from './models/ggml-base.bin'
[dev:server] whisper_model_load: loading model
[dev:server] whisper_model_load: n_vocab = 51865
[dev:server] whisper_model_load: n_audio_ctx = 1500
[dev:server] whisper_model_load: n_audio_state = 512
[dev:server] whisper_model_load: n_audio_head = 8
[dev:server] whisper_model_load: n_audio_layer = 6
[dev:server] whisper_model_load: n_text_ctx = 448
[dev:server] whisper_model_load: n_text_state = 512
[dev:server] whisper_model_load: n_text_head = 8
[dev:server] whisper_model_load: n_text_layer = 6
[dev:server] whisper_model_load: n_mels = 80
[dev:server] whisper_model_load: ftype = 1
[dev:server] whisper_model_load: qntvr = 0
[dev:server] whisper_model_load: type = 2 (base)
[dev:server] whisper_model_load: adding 1608 extra tokens
[dev:server] whisper_model_load: n_langs = 99
[dev:server] whisper_model_load: CPU total size = 147.37 MB
[dev:server] whisper_model_load: model size = 147.37 MB
[dev:server] whisper_init_state: kv self size = 16.52 MB
[dev:server] whisper_init_state: kv cross size = 18.43 MB
[dev:server] whisper_init_state: compute buffer (conv) = 16.39 MB
[dev:server] whisper_init_state: compute buffer (encode) = 132.07 MB
[dev:server] whisper_init_state: compute buffer (cross) = 4.78 MB
[dev:server] whisper_init_state: compute buffer (decode) = 96.48 MB
[dev:server] read_wav: WAV file '/home/michael-heuberger/code/binarykitchen/videomail.io/var/local/tmp/clients/videomail.io/1ef7ae52-7eab-6f50-8362-05f8c267a8f2/videomail_preview.wav' must be 16 kHz
[dev:server] error: failed to read WAV file '/home/michael-heuberger/code/binarykitchen/videomail.io/var/local/tmp/clients/videomail.io/1ef7ae52-7eab-6f50-8362-05f8c267a8f2/videomail_preview.wav'
[dev:server]
[dev:server] whisper_print_timings: load time = 306.03 ms
[dev:server] whisper_print_timings: fallbacks = 0 p / 0 h
[dev:server] whisper_print_timings: mel time = 0.00 ms
[dev:server] whisper_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
[dev:server] whisper_print_timings: encode time = 0.00 ms / 1 runs ( 0.00 ms per run)
[dev:server] whisper_print_timings: decode time = 0.00 ms / 1 runs ( 0.00 ms per run)
[dev:server] whisper_print_timings: batchd time = 0.00 ms / 1 runs ( 0.00 ms per run)
[dev:server] whisper_print_timings: prompt time = 0.00 ms / 1 runs ( 0.00 ms per run)
[dev:server] whisper_print_timings: total time = 312.29 ms
[dev:server]
[dev:server] stdout---
[dev:server] [Nodejs-whisper] Transcribing Done!
[dev:server] [Nodejs-whisper] Error during processing: Transcription failed or produced no output.
Any ideas what this could be?
Thanks!
I think it's because the input sample rate is 48 kHz, while whisper expects 16 kHz. It's worth checking the actual sample rate of your file to confirm.
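If it helps to confirm the sample rate before the file reaches whisper, here is a rough sketch that reads the rate declared in the WAV header. `readWavSampleRate` is a hypothetical helper written for this comment, not part of nodejs-whisper, and it assumes a plain RIFF/WAVE file with a standard `fmt ` chunk (it does not handle RF64 or other variants).

```typescript
// Hypothetical helper (not part of nodejs-whisper): returns the sample rate
// in Hz declared in the "fmt " chunk of a RIFF/WAVE file.
function readWavSampleRate(bytes: Uint8Array): number {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);

  // Read a 4-character ASCII chunk tag at the given offset.
  const tag = (o: number): string =>
    String.fromCharCode(bytes[o], bytes[o + 1], bytes[o + 2], bytes[o + 3]);

  if (bytes.byteLength < 12 || tag(0) !== "RIFF" || tag(8) !== "WAVE") {
    throw new Error("not a RIFF/WAVE file");
  }

  // Walk the chunk list; "fmt " usually comes first but that is not guaranteed.
  let offset = 12;
  while (offset + 8 <= bytes.byteLength) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true); // sizes are little-endian
    if (id === "fmt ") {
      if (offset + 16 > bytes.byteLength) {
        throw new Error("truncated fmt chunk");
      }
      // fmt data layout: audioFormat (2 bytes), channels (2), sampleRate (4), ...
      return view.getUint32(offset + 12, true);
    }
    offset += 8 + size + (size % 2); // chunk data is padded to an even length
  }
  throw new Error("no fmt chunk found");
}
```

In Node this could be called as `readWavSampleRate(fs.readFileSync(path))`, since a `Buffer` is a `Uint8Array`. If the result is anything other than 16000, resampling first should get past the `must be 16 kHz` error, for example with the ffmpeg invocation the whisper.cpp README suggests: `ffmpeg -i input.wav -ar 16000 -ac 1 -c:a pcm_s16le output.wav`.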
Yeah, I think it's due to the sample rate. I will look into this issue.