Publication | Closed Access
Support vector machines using GMM supervectors for speaker verification
Citations: 1K | References: 14 | Year: 2006
Engineering · Machine Learning · Gaussian Mixture Models · Speech Recognition · Data Science · Pattern Recognition · New SVM Kernels · Speaker Diarization · Robust Speech Recognition · Voice Recognition · Health Sciences · Deep Learning · GMM Supervectors · Speech Communication · SVM Kernels · Multi-speaker Speech Recognition · Speech Processing · Speech Perception · Speaker Recognition
Gaussian mixture models (GMMs) have proven extremely successful for text-independent speaker recognition. The standard training method for GMM models is to use MAP adaptation of the means of the mixture components based on speech from a target speaker. Recent methods in compensation for speaker and channel variability have proposed the idea of stacking the means of the GMM model to form a GMM mean supervector. We examine the idea of using the GMM supervector in a support vector machine (SVM) classifier. We propose two new SVM kernels based on distance metrics between GMM models. We show that these SVM kernels produce excellent classification accuracy in a NIST speaker recognition evaluation task.
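The pipeline the abstract describes (relevance-MAP adaptation of UBM mixture means per utterance, stacking the adapted means into a supervector, then classifying supervectors with a linear-kernel SVM) can be sketched with scikit-learn. This is a minimal toy illustration under stated assumptions, not the authors' implementation: the synthetic two-speaker data, the function names, the relevance factor of 16, and the sqrt(weight)/sigma scaling of the stacked means are all illustrative choices.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def train_ubm(pooled_frames, n_components=4):
    """Fit a Universal Background Model (UBM): a GMM on frames pooled
    over many utterances (diagonal covariances, as is typical)."""
    ubm = GaussianMixture(n_components=n_components,
                          covariance_type="diag", random_state=0)
    ubm.fit(pooled_frames)
    return ubm

def map_adapt_means(ubm, frames, relevance=16.0):
    """Relevance-MAP adaptation of the mixture means only, using the
    utterance's zeroth- and first-order Baum-Welch statistics."""
    post = ubm.predict_proba(frames)            # (T, C) responsibilities
    n_c = post.sum(axis=0)                      # zeroth-order stats
    f_c = post.T @ frames                       # first-order stats, (C, D)
    alpha = (n_c / (n_c + relevance))[:, None]  # adaptation coefficients
    ml_means = np.where(n_c[:, None] > 0,
                        f_c / np.maximum(n_c, 1e-10)[:, None],
                        ubm.means_)
    return alpha * ml_means + (1 - alpha) * ubm.means_

def supervector(ubm, frames):
    """Stack the adapted means, each scaled by sqrt(w_c)/sigma_c so that a
    plain dot product approximates a weighted, variance-normalized kernel
    (an assumed normalization for this sketch)."""
    means = map_adapt_means(ubm, frames)
    scale = np.sqrt(ubm.weights_)[:, None] / np.sqrt(ubm.covariances_)
    return (scale * means).ravel()

# Synthetic two-"speaker" data: utterances of 200 frames x 3 features,
# with the second speaker's feature distribution shifted.
utts, labels = [], []
for spk, shift in [(0, 0.0), (1, 1.5)]:
    for _ in range(10):
        utts.append(rng.normal(shift, 1.0, size=(200, 3)))
        labels.append(spk)

ubm = train_ubm(np.vstack(utts))
X = np.array([supervector(ubm, u) for u in utts])
labels = np.array(labels)

# Linear-kernel SVM on supervectors; train on even indices, test on odd.
idx = np.arange(len(utts))
svm = SVC(kernel="linear").fit(X[idx % 2 == 0], labels[idx % 2 == 0])
acc = svm.score(X[idx % 2 == 1], labels[idx % 2 == 1])
```

Each supervector has dimension C x D (here 4 components x 3 features = 12), so the SVM operates in a fixed-length space regardless of utterance length, which is the property that makes the supervector representation attractive.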