Publication | Closed Access
Speaker normalization on conversational telephone speech
Citations: 173
References: 3
Year: 2002
Unknown Venue
Engineering, Machine Learning, Speaker Normalization, Speech Recognition, Natural Language Processing, Speaker Diarization, Robust Speech Recognition, Conversation Analysis, Voice Recognition, Health Sciences, New System, Vocal Tract Normalization, Speech Analysis, Speech Technology, Speech Communication, Voice, Speech Processing, Warp Scale, Speech Perception, Linguistics, Speaker Recognition
This paper reports on a simplified system for determining vocal tract normalization. Such normalization has led to significant gains in recognition accuracy by reducing variability among speakers, which allows the pooling of training data and the construction of sharper models. But standard methods for determining the warp scale have been extremely cumbersome, generally requiring multiple recognition passes. We present a new system for warp scale selection which uses a simple generic voiced speech model to rapidly select appropriate frequency scales. The selection is sufficiently streamlined that it can be moved completely into the front-end processing. Using this system on a standard test of the Switchboard Corpus, we have achieved relative reductions in word error rates of 12% over unnormalized gender-independent models and 6% over our best unnormalized gender-dependent models.
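The abstract does not specify the form of the generic voiced speech model or the warping function, so the following is only a minimal sketch of the general idea of front-end warp scale selection, not the paper's method. It assumes a linear rescaling of the frequency axis, a single diagonal Gaussian standing in for the generic model, and a small grid of candidate warp factors; the names `warp_spectrum`, `log_likelihood`, and `select_warp` are hypothetical. It also ignores any Jacobian correction that a practical system might apply when comparing likelihoods across warps.

```python
import math

def warp_spectrum(spectrum, alpha):
    """Linearly rescale the frequency axis: output bin k reads the input at k * alpha.

    Assumption: linear warping with linear interpolation between bins.
    """
    n = len(spectrum)
    warped = []
    for k in range(n):
        pos = min(k * alpha, n - 1)      # clamp at the highest bin
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        warped.append((1 - frac) * spectrum[lo] + frac * spectrum[hi])
    return warped

def log_likelihood(frame, mean, var):
    """Log-likelihood of one frame under a diagonal Gaussian (stand-in generic model)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, mean, var)
    )

def select_warp(frames, mean, var, alphas):
    """Return the candidate warp factor whose warped frames score highest under the model."""
    def score(alpha):
        return sum(log_likelihood(warp_spectrum(f, alpha), mean, var) for f in frames)
    return max(alphas, key=score)
```

Because the search needs only the speaker's frames and one small model, with no recognition pass, a selection step of this kind can sit in the front end, which is the property the abstract emphasizes.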