Publication | Closed Access
Speaker adaptation with all-pass transforms
Citations: 30
References: 9
Year: 1999
Unknown Venue
Engineering, Phonology, Cepstral-domain Linearity, Speech Recognition, Speaker Diarization, Robust Speech Recognition, Voice Recognition, Language Studies, All-pass Transforms, Robust Estimation, Distant Speech Recognition, Signal Processing, Speech Communication, Speech Technology, Speaker Normalization Simple, Multi-speaker Speech Recognition, Speech Processing, Speech Perception, Linguistics, Speaker Recognition
In previous work, a class of transforms was proposed that remaps the frequency axis much like conventional vocal tract length normalization. These mappings, known collectively as all-pass transforms (APT), were shown to produce substantial improvements in the performance of a large vocabulary speech recognition system when used to normalize incoming speech prior to recognition. In that application, the most advantageous characteristic of the APT was its cepstral-domain linearity: this linearity makes speaker normalization simple to implement and allows robust estimation of the parameters characterizing individual speakers. In the current work, we exploit the APT to develop a speaker adaptation scheme in which the cepstral means of a speech recognition model are transformed to better match the speech of a given speaker. In a set of speech recognition experiments conducted on the Switchboard corpus, we report reductions in word error rate of 3.7% absolute.
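The cepstral-domain linearity mentioned above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes the simplest first-order all-pass (bilinear) map as a stand-in for the APT family, assumes the cepstral convention log|X(w)| = c[0] + 2 * sum c[n] cos(n w), and builds the warping matrix by numerical integration. The names `bilinear_warp`, `warp_matrix`, and `adapt_mean`, and the parameter `alpha`, are hypothetical and chosen only for illustration.

```python
import math


def bilinear_warp(omega, alpha):
    """Warped frequency phi(omega) of the first-order all-pass map (|alpha| < 1)."""
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


def warp_matrix(alpha, n_cep=13, n_grid=512):
    """Matrix A such that A applied to c gives the truncated cepstrum of the
    frequency-warped log spectrum.

    Assumed convention: log|X(w)| = c[0] + 2 * sum_{n>=1} c[n] * cos(n * w).
    The integral over [0, pi] is approximated with the midpoint rule.
    """
    omegas = [(k + 0.5) * math.pi / n_grid for k in range(n_grid)]
    phis = [bilinear_warp(w, alpha) for w in omegas]
    matrix = []
    for m in range(n_cep):
        row = []
        for n in range(n_cep):
            weight = 1.0 if n == 0 else 2.0
            total = sum(math.cos(n * p) * math.cos(m * w)
                        for p, w in zip(phis, omegas))
            row.append(weight * total / n_grid)
        matrix.append(row)
    return matrix


def adapt_mean(matrix, mean):
    """Apply the linear warp to one cepstral mean vector (matrix-vector product)."""
    return [sum(a * x for a, x in zip(row, mean)) for row in matrix]


if __name__ == "__main__":
    A = warp_matrix(alpha=0.05)
    mean = [1.0, 0.5, -0.2] + [0.0] * 10  # illustrative values, not real model data
    print(adapt_mean(A, mean)[:4])
```

Because the warp reduces to a fixed matrix for a given `alpha`, adapting a model under these assumptions amounts to one matrix-vector product per Gaussian mean, and a single scalar per speaker is all that has to be estimated in this simplified setting.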