Speaker diarization of French broadcast news

Abstract

We report results on speaker diarization of French broadcast news and current-affairs talk shows. The diarization system is a multistage segmentation and clustering pipeline. One of its stages is agglomerative clustering based on state-of-the-art speaker identification (SID) methods. For the GMMs used in this stage, we tried several feature sets: MFCCs, Gaussianized MFCCs, Gaussianized MFCCs with cepstral mean subtraction, and Gaussianized MFCCs with cepstral mean subtraction restricted to high-energy frames. The last feature set gave the best results. Compared to Gaussianized MFCCs, it reduced the diarization error rate (DER) by 12% on a development set and by 19% on a test set. We also combined the clusters obtained from Gaussianized and non-Gaussianized feature sets. This cluster combination reduced DER by a further 4% on both the development and test sets. The best DER we achieved is 15.4% on the development set and 14.5% on the test set.
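As an illustration of the best-performing feature set described above, the following is a minimal sketch and not the authors' implementation. It assumes that the MFCCs are already available as a NumPy array of shape (frames, coefficients) with a separate per-frame energy array, that Gaussianization is done by rank-based feature warping over a sliding window, and that the steps are applied in the order frame selection, warping, mean subtraction. The abstract does not specify any of these choices. The function names and the parameters `keep_fraction` and `window` are hypothetical.

```python
from statistics import NormalDist

import numpy as np

_STD_NORMAL = NormalDist()  # standard normal, used for its inverse CDF


def select_high_energy(feats, energy, keep_fraction=0.8):
    """Keep the frames whose energy is in the top `keep_fraction` (assumed rule)."""
    threshold = np.quantile(energy, 1.0 - keep_fraction)
    return feats[energy >= threshold]


def warp_features(feats, window=301):
    """Gaussianize each coefficient by rank within a sliding window.

    For frame t, the rank of its value inside the surrounding window is
    mapped through the inverse standard normal CDF.
    """
    n_frames, _ = feats.shape
    half = window // 2
    out = np.empty(feats.shape, dtype=float)
    for t in range(n_frames):
        lo, hi = max(0, t - half), min(n_frames, t + half + 1)
        win = feats[lo:hi]
        # Number of window values strictly below the current frame, per coefficient.
        rank = (win < feats[t]).sum(axis=0)
        probs = (rank + 0.5) / win.shape[0]  # always strictly inside (0, 1)
        out[t] = [_STD_NORMAL.inv_cdf(p) for p in probs]
    return out


def cepstral_mean_subtraction(feats):
    """Subtract the per-coefficient mean computed over all given frames."""
    return feats - feats.mean(axis=0)


def normalize_features(feats, energy, keep_fraction=0.8, window=301):
    """Assumed order: high-energy selection, Gaussianization, then CMS."""
    kept = select_high_energy(feats, energy, keep_fraction)
    return cepstral_mean_subtraction(warp_features(kept, window))
```

Calling `normalize_features(mfcc, energy)` on one segment or cluster returns the normalized frames that would then be used to train or score the GMMs. Because rank-based warping is unaffected by a constant offset, the mean subtraction is applied after warping in this sketch; a real system may order or scope these steps differently.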
