How to infer models in the TS section? #29

Open
hardikkamboj opened this issue Jul 28, 2023 · 1 comment
Comments

@hardikkamboj
Hello team,

Thanks for providing the models, I really appreciate the effort.

I am trying to run inference with the TS models listed in the table under the section "wav2vec2 based models" in the main README of this repository. However, I am unable to load them using the Hugging Face code (the model here is a .pt file, not a .bin file as in the Hugging Face models). Also, the files downloaded from the repository contain only the JSON file and the model file; the config files are missing.

Could you please tell me which script I can use to run inference with these models (for example, english_ts)?

@Awaisn25
Awaisn25 commented Nov 24, 2023

Probably too late to comment, but for anyone else wondering how to perform inference using these models:
It's a simple forward() call on the model, with a 2D PyTorch tensor of the waveform as the parameter. If your audio is mono, it will be loaded as a 1D tensor, so in that case add a leading dimension first:

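import librosa
import torch

# librosa.load returns mono audio as a 1D NumPy array
data, sr = librosa.load(<path_to_audio>, sr=<sample_rate>)
# Add a leading dimension: (num_samples,) -> (1, num_samples)
data_p = torch.unsqueeze(torch.from_numpy(data), 0)

# Load the .pt file with torch.jit.load and run the forward pass
model = torch.jit.load(<path_to_model>)
model.eval()
with torch.no_grad():
    output = model(data_p)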
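
For anyone who wants to check the shape handling before downloading a model, here is a minimal sketch of the same steps. The helpers `to_batch` and `run_inference` and the `StandInModel` class are hypothetical names made up for this illustration; the stand-in is traced and saved to an in-memory buffer only so that the `torch.jit.load` call has something to load, and it says nothing about what the repository's models actually return.

```python
import io

import torch


def to_batch(waveform: torch.Tensor) -> torch.Tensor:
    """Return a 2D float32 tensor from a 1D (mono) or 2D waveform."""
    if waveform.dim() == 1:
        # Mono audio: (num_samples,) -> (1, num_samples)
        waveform = waveform.unsqueeze(0)
    elif waveform.dim() != 2:
        raise ValueError(f"expected a 1D or 2D waveform, got {waveform.dim()}D")
    return waveform.to(torch.float32)


def run_inference(model, waveform: torch.Tensor):
    """Call the loaded model's forward() on a batched waveform, without gradients."""
    model.eval()
    with torch.no_grad():
        return model(to_batch(waveform))


class StandInModel(torch.nn.Module):
    """Hypothetical placeholder, NOT the repository's model."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Average over the sample axis so the output is easy to inspect.
        return x.mean(dim=1, keepdim=True)


# Trace and serialize the stand-in to memory, then load it back
# the same way a downloaded .pt file would be loaded from disk.
buffer = io.BytesIO()
torch.jit.save(torch.jit.trace(StandInModel(), torch.zeros(1, 16000)), buffer)
buffer.seek(0)
model = torch.jit.load(buffer)

mono = torch.ones(8000)  # 1D tensor, like mono audio converted from librosa
output = run_inference(model, mono)
```

With a real model you would pass the path to the downloaded .pt file to `torch.jit.load` instead of the buffer, and build `mono` from the array returned by `librosa.load`.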
