Publication | Closed Access
Speaker diarization of French broadcast news
Year: 2008
Citations: 22
References: 5
Speech Corpus, Phonology, Speech Recognition, Pattern Recognition, Speaker Identification, Phonetics, Speaker Diarization, Gaussianized MFCCs, Robust Speech Recognition, Voice Recognition, Language Studies, Health Sciences, Speech Communication, Speaker Diarization Process, Multi-speaker Speech Recognition, French Media, Speech Processing, Speech Perception, Linguistics, Speaker Recognition
We report results on speaker diarization of French broadcast news and talk shows on current affairs. The diarization process is a multistage segmentation and clustering system. One of the stages is agglomerative clustering using state-of-the-art speaker identification (SID) methods. For the GMMs used in this stage, we tried many different feature parameters, including MFCCs, Gaussianized MFCCs, Gaussianized MFCCs with cepstral mean subtraction, and Gaussianized MFCCs with cepstral mean subtraction containing only frames with high energy. We found that this last set of feature parameters gave the best results. Compared to Gaussianized MFCCs, these features reduced the diarization error rate (DER) by 12% on a development set and by 19% on a test set. We also combined clusters resulting from Gaussianized and non-Gaussianized feature sets. This cluster combination resulted in another 4% reduction in DER for both the development and the test sets. The best DER we have achieved is 15.4% on the development set and 14.5% on the test set.
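As a rough illustration of the best-performing feature set named in the abstract (Gaussianized MFCCs with cepstral mean subtraction, restricted to high-energy frames), the sketch below shows one way such a pipeline could be assembled. It is not the authors' implementation. The abstract does not give the order of operations, the warping window, or the energy threshold, so the sliding-window rank warping, the `window` and `keep_fraction` parameters, and all function names are assumptions made for illustration. MFCC extraction itself is assumed to have happened upstream.

```python
from statistics import NormalDist, fmean


def warp_dimension(values, window=301):
    """Gaussianize one cepstral dimension by sliding-window rank warping.

    Each value is replaced by the standard-normal quantile of its rank
    within a window centred on its frame (window size is an assumption).
    """
    half = window // 2
    std_normal = NormalDist()
    warped = []
    for t, v in enumerate(values):
        lo, hi = max(0, t - half), min(len(values), t + half + 1)
        context = values[lo:hi]
        rank = sum(1 for x in context if x < v)  # count of smaller values
        # (rank + 0.5) / n keeps the quantile strictly inside (0, 1)
        warped.append(std_normal.inv_cdf((rank + 0.5) / len(context)))
    return warped


def cepstral_mean_subtraction(values):
    """Subtract the segment-level mean from one cepstral dimension."""
    mean = fmean(values)
    return [x - mean for x in values]


def select_high_energy(frames, energies, keep_fraction=0.5):
    """Keep frames whose energy reaches the top keep_fraction threshold.

    keep_fraction is a hypothetical parameter; ties at the threshold
    can keep slightly more frames than requested.
    """
    n_keep = max(1, int(len(frames) * keep_fraction))
    threshold = sorted(energies, reverse=True)[n_keep - 1]
    return [f for f, e in zip(frames, energies) if e >= threshold]


def prepare_features(frames, energies, keep_fraction=0.5, window=301):
    """Warp, mean-normalize, then drop low-energy frames (assumed order).

    frames: list of per-frame MFCC vectors (lists of floats)
    energies: one energy value per frame, aligned with frames
    """
    columns = [list(c) for c in zip(*frames)]
    processed = [
        cepstral_mean_subtraction(warp_dimension(c, window)) for c in columns
    ]
    normalized = [list(row) for row in zip(*processed)]
    return select_high_energy(normalized, energies, keep_fraction)
```

Calling `prepare_features(frames, energies)` on one segment returns the retained, normalized frames, which could then be used to train or score the GMMs in the clustering stage. Suitable values for `window` and `keep_fraction` would have to be tuned on development data.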