Data security/privacy when using pyannote diarization on Huggingface #1401

virtualarchitectures · 2023-06-08T11:23:36Z

virtualarchitectures
Jun 8, 2023

Hi,

I'm working on a basic tool for interview transcription. I'd like to use pyannote for diarization, but first I'd like to understand what information is sent to the Huggingface servers when pyannote calls their API. I'm aware of Huggingface's documentation on security (https://huggingface.co/docs/hub/security) and on security and compliance (https://huggingface.co/docs/inference-endpoints/security#data-securityprivacy). This question is to better understand what information is being exchanged: what is in the payload, and what comes back?

Many thanks.

hbredin · 2023-06-08T12:42:06Z

hbredin
Jun 8, 2023
Maintainer

pyannote can be used independently of Huggingface.

See "Can I use gated models and pipelines offline?" FAQ.
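As a rough illustration of what offline use can look like, here is a minimal sketch. It assumes the pipeline's `config.yaml` and model weights have already been downloaded and that the config has been edited to point at the local weights; the helper name `load_local_pipeline` and the example path are hypothetical. `HF_HUB_OFFLINE` is an environment variable read by the `huggingface_hub` library, and the exact behaviour may vary between versions.

```python
import os

# Ask huggingface_hub to refuse network requests. It is set before any
# library that depends on huggingface_hub is imported, because the value
# is read when that library is first loaded.
os.environ["HF_HUB_OFFLINE"] = "1"


def load_local_pipeline(config_path):
    """Load a pyannote pipeline from a local config file (hypothetical helper).

    Assumes `config_path` points to a config.yaml that references model
    weights already stored on disk.
    """
    # Imported lazily so the environment variable above is already in place.
    from pyannote.audio import Pipeline

    return Pipeline.from_pretrained(config_path)


# Example usage (paths are placeholders):
# pipeline = load_local_pipeline("models/pyannote_diarization_config.yaml")
# diarization = pipeline("audio.wav")
```

With this arrangement no access token is needed at run time, since nothing is fetched from the Hub.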

2 replies

virtualarchitectures Jun 8, 2023
Author

Ace! Thanks for the speedy reply. I was looking for a Wiki and missed the FAQ doc. That's great. Many thanks for this.

fitmintdotco Mar 20, 2024

The links in the FAQ are broken. Any chance you can update them?

e.g. https://github.com/pyannote/pyannote-audio/blob/develop/tutorials/applying_a_model.ipynb (error occurred)

virtualarchitectures · 2023-06-09T15:05:19Z

virtualarchitectures
Jun 9, 2023
Author

Hi @hbredin, your guide was really good and I have it running. However, my pipeline seems to run more smoothly when working with the Huggingface connection. For reference, I'm using it with WhisperX. Could you please clarify what the Huggingface connection does? Is it just authenticating use and providing downloads of the models and pipelines, or is any of the data I'm processing being sent over the connection to Huggingface's servers? Many thanks.

3 replies

hbredin Sep 8, 2023
Maintainer

No data leaves your computer when calling diarization = pipeline("audio.wav").

timlac Sep 8, 2023

This might be a stupid question, but can you confirm that no data leaves my computer when running the voice activity detection pipeline either? https://huggingface.co/pyannote/voice-activity-detection

hbredin Sep 8, 2023
Maintainer

Yes.

But may I suggest not blindly trusting the words of a random stranger on an internet forum?
I do consulting too...
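
One way to check the claim independently rather than taking it on trust is to run the pipeline while Python-level network connections are blocked. The sketch below is not part of pyannote; `no_network` and `NetworkBlockedError` are hypothetical names, and the guard only patches `socket.socket.connect` and `connect_ex` using the standard library.

```python
import socket
from contextlib import contextmanager
from unittest.mock import patch


class NetworkBlockedError(RuntimeError):
    """Raised when code tries to open a connection inside the guard."""


@contextmanager
def no_network():
    """Make every socket connect attempt fail while the block is active."""

    def blocked(self, address, *args, **kwargs):
        raise NetworkBlockedError(f"attempted connection to {address!r}")

    # patch.object restores the original methods when the block exits.
    with patch.object(socket.socket, "connect", blocked), \
         patch.object(socket.socket, "connect_ex", blocked):
        yield


# Example usage, assuming `pipeline` was loaded beforehand:
# with no_network():
#     diarization = pipeline("audio.wav")
```

If the call completes without raising `NetworkBlockedError`, no Python socket connection was opened during inference. This is only a partial check: it does not cover DNS lookups, connectionless sends, subprocesses, or native code that bypasses Python's `socket` module, so a firewall rule or a packet capture gives stronger assurance.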

timlac · 2023-09-08T08:51:50Z

timlac
Sep 8, 2023

I have the same question. What exactly does the Huggingface connection do? Is any of my audio data sent to the servers?

0 replies
