Inference possibilities #2

amil-rp-work · 2020-09-13T17:05:58Z

Hey,
Thanks for providing such amazing, easy-to-use webapp code.

I was wondering whether this pre-trained model could be used on any arbitrary non-YouTube file (audio or video)?

jramcast · 2020-10-29T08:05:04Z

The models do not process video, only audio.

You can use any 16-bit PCM WAV file, which is then preprocessed by the VGGish network here (this is a prerequisite for models trained on AudioSet):

https://github.com/jramcast/mgr-service/blob/master/mgr/infrastructure/audioset/vggish/loader.py#L46
https://github.com/jramcast/mgr-service/blob/master/mgr/infrastructure/audioset/vggish/model/vggish_input.py#L75
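To check up front whether a file meets the 16-bit PCM requirement, something like the following sketch could help. It uses only Python's standard `wave` module; the function name `is_16bit_pcm_wav` is made up for illustration and is not part of this repository.

```python
import wave


def is_16bit_pcm_wav(path):
    """Return True if `path` is a PCM WAV file with 16-bit samples.

    Hypothetical helper for illustration; not part of mgr-service.
    """
    try:
        with wave.open(path, "rb") as wav_file:
            # getsampwidth() returns bytes per sample; 2 bytes == 16 bits.
            return wav_file.getsampwidth() == 2
    except (wave.Error, EOFError):
        # The wave module rejects malformed files and formats it cannot read.
        return False
```

A file that fails this check would need converting before it is passed to the loader.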

If you need to use other audio formats, e.g. mp3, you need to convert the audio to WAV before feeding it into the model. I used ffmpeg here to extract the WAV audio from YouTube videos, but you can also use it to convert mp3 to WAV.
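As a rough sketch of that conversion step, the snippet below calls ffmpeg from Python. It assumes `ffmpeg` is installed and on your `PATH`. The function names are made up for illustration, and the flags are a common way to get 16-bit PCM output, not necessarily the exact ones this project uses.

```python
import subprocess


def build_ffmpeg_command(src, dst):
    """Build an ffmpeg command that writes `src` as a 16-bit PCM WAV file.

    Hypothetical helper; the flags are an assumption, not taken from mgr-service.
    """
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-i", src,               # input file (mp3, mp4, ...)
        "-vn",                   # drop any video stream
        "-acodec", "pcm_s16le",  # signed 16-bit little-endian PCM
        dst,
    ]


def convert_to_wav(src, dst):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
```

For example, `convert_to_wav("song.mp3", "song.wav")` should produce a WAV file suitable for the preprocessing step above.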
