Publication | Closed Access
Speaker adaptive training: a maximum likelihood approach to speaker normalization
Citations: 101
References: 12
Year: 2002
Unknown Venue
Engineering, Machine Learning, Joint Speaker Normalization, Speech Recognition, Natural Language Processing, Phonetics, Speaker Diarization, Robust Speech Recognition, Voice Recognition, Language Studies, Distant Speech Recognition, Speech Signal, Speech Communication, Multi-speaker Speech Recognition, Speech Processing, Speaker Adaptive Training, Speech Perception, Linguistics, Speaker Recognition
This paper describes the speaker adaptive training (SAT) approach for speaker-independent (SI) speech recognizers, a method that jointly performs speaker normalization and estimates the parameters of the SI acoustic models. In SAT, speaker characteristics are modeled explicitly as linear transformations of the SI acoustic parameters. This reduces the effect of inter-speaker variability in the training data, leading to parsimonious acoustic models that more accurately represent the phonetically relevant information in the speech signal. The proposed training method is applied to the Wall Street Journal (WSJ) corpus, which contains multiple training speakers. Experimental results with batch supervised adaptation demonstrate the effectiveness of the method in large-vocabulary speech recognition tasks and show that significant reductions in word error rate can be achieved over the common pooled speaker-independent paradigm.
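The joint estimation described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: it assumes one Gaussian mean per phonetic class, unit covariances (so maximum likelihood reduces to least squares), known class labels instead of HMM state alignments, and a hypothetical function name `sat_toy`. It alternates between fitting a per-speaker affine transform of the shared means and re-estimating those shared means with the transforms held fixed.

```python
import numpy as np

def sat_toy(frames, speakers, classes, n_iter=5):
    """Alternate between per-speaker affine transforms and shared class means.

    frames: (T, d) observations; speakers, classes: (T,) integer labels.
    Assumes unit covariances, so each ML update is a least-squares problem.
    """
    T, d = frames.shape
    n_spk, n_cls = int(speakers.max()) + 1, int(classes.max()) + 1
    # Start from the pooled model: class means over all speakers, identity transforms.
    mu = np.stack([frames[classes == c].mean(axis=0) for c in range(n_cls)])
    A = np.stack([np.eye(d) for _ in range(n_spk)])
    b = np.zeros((n_spk, d))

    def cost():
        # Squared error between each frame and its speaker-transformed class mean.
        pred = np.einsum("tij,tj->ti", A[speakers], mu[classes]) + b[speakers]
        return float(((frames - pred) ** 2).sum())

    history = [cost()]
    for _ in range(n_iter):
        # Step 1: for each speaker r, fit x ~ A_r mu_c + b_r with the means fixed.
        for r in range(n_spk):
            m = speakers == r
            X = np.hstack([mu[classes[m]], np.ones((int(m.sum()), 1))])
            W, *_ = np.linalg.lstsq(X, frames[m], rcond=None)
            A[r], b[r] = W[:d].T, W[d]
        history.append(cost())
        # Step 2: re-estimate each shared mean with the transforms fixed.
        for c in range(n_cls):
            m = classes == c
            Ar = A[speakers[m]]                         # (n, d, d)
            lhs = np.einsum("nji,njk->ik", Ar, Ar)      # sum of A_r^T A_r
            rhs = np.einsum("nji,nj->i", Ar, frames[m] - b[speakers[m]])
            mu[c] = np.linalg.solve(lhs, rhs)
        history.append(cost())
    return mu, A, b, history
```

Because each step exactly minimizes the squared error over one block of parameters, the values in `history` should not increase from one entry to the next, and the first entry is the cost of the pooled model that ignores speaker identity.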