I used the pre-trained model directly for the speaker diarization task, but I got a bad result on the AMI dataset. Is this correct?
Hi @ChokJohn,
Yes, that is probably correct.
The sum of FA and Miss. does match the numbers reported in the paper.
The high Conf. is due to the fact that the segmentation model is not capable of tracking speakers over time (it only works on small 5s chunks).
You'd have to combine the segmentation model with speaker embedding to perform proper diarization.
See this paper and code for a way to do that.
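To illustrate why chunk-local output leads to speaker confusion, here is a minimal toy sketch of linking local speakers across chunks with embeddings. It is not the method from the paper or code linked above: the function name `assign_global_speakers`, the greedy matching strategy, the `threshold` value, and the 2-D vectors are all assumptions made up for illustration. A real system would use a trained speaker embedding model and a proper clustering step.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def assign_global_speakers(chunk_embeddings, threshold=0.8):
    """Map chunk-local speakers to global speaker ids (hypothetical helper).

    chunk_embeddings: list of chunks; each chunk is a list with one
    embedding per local speaker found in that chunk.
    Returns one list of global ids per chunk, in the same order.
    """
    references = []  # first embedding seen for each global speaker
    labels = []
    for chunk in chunk_embeddings:
        chunk_labels = []
        for emb in chunk:
            scores = [cosine(emb, ref) for ref in references]
            best = max(range(len(scores)), key=scores.__getitem__, default=None)
            if best is not None and scores[best] >= threshold:
                # Close enough to a known speaker: reuse its global id.
                chunk_labels.append(best)
            else:
                # No match: register a new global speaker.
                references.append(emb)
                chunk_labels.append(len(references) - 1)
        labels.append(chunk_labels)
    return labels


# Toy data: the same two speakers appear in both chunks,
# but the segmentation model lists them in a different local order.
chunks = [
    [(1.0, 0.0), (0.0, 1.0)],  # chunk 1: local speaker 0, local speaker 1
    [(0.1, 0.9), (0.9, 0.1)],  # chunk 2: local order is swapped
]
global_ids = assign_global_speakers(chunks)
print(global_ids)
```

In this toy case, local speaker 0 of the second chunk ends up with the global id of local speaker 1 from the first chunk. Without such a linking step, local indices are compared directly across chunks, which shows up as a high Conf. even when FA and Miss. are fine.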