Publication | Closed Access
Using speech/non-speech detection to bias recognition search on noisy data
Citations: 12 | References: 7 | Year: 2003
Keywords: Recognition Search, Engineering, Machine Learning, Speech Enhancement, Recognition Error Rate, Speech Recognition, Natural Language Processing, Data Science, Pattern Recognition, Noise, Robust Speech Recognition, Voice Recognition, Statistics, Health Sciences, Noisy Speech, Computer Science, Distant Speech Recognition, Signal Processing, Speech Communication, Speech Technology, Speech Analysis, Speech Acoustics, Noisy Speech Waveform, Speech Processing, Speech Input, Speech Perception
This paper focuses on the recognition of noisy speech. We show that the decoding of a noisy speech waveform can be facilitated if the recognizer has explicit knowledge of where it should hypothesize speech phones, and where it should map the acoustics to non-speech phones. We build a speech/non-speech detector and use its output as an additional front-end feature. We show that by appropriately weighting the contribution of this feature in the decoder and by modifying the acoustic models accordingly, we can penalize speech/non-speech confusions and consequently reduce the recognition error rate. This approach gives a 12% overall error rate reduction on a wide variety of recognition tasks and noise characteristics without degrading performance on clean test data. A simple extension of the approach boosts recognition improvements on noisy test sets to 14% overall.
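The idea of appending a speech/non-speech detector output to the front-end features and weighting its contribution in the decoder can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's implementation: the abstract does not say how the detector is built, so a simple log-energy threshold stands in for it, and the names `speech_feature`, `augment`, `biased_score`, `weight`, and `p_agree` are hypothetical.

```python
import math


def frame_log_energy(frame):
    """Log of the mean squared amplitude of one frame of samples."""
    energy = sum(s * s for s in frame) / len(frame)
    return math.log(energy + 1e-10)  # small floor avoids log(0) on silence


def speech_feature(frames, threshold):
    """Stand-in detector (assumption): 1.0 if log-energy exceeds threshold, else 0.0."""
    return [1.0 if frame_log_energy(f) > threshold else 0.0 for f in frames]


def augment(features, detector_output):
    """Append the detector output as one extra dimension per feature vector."""
    return [vec + [d] for vec, d in zip(features, detector_output)]


def biased_score(acoustic_loglik, detector_value, phone_is_speech, weight, p_agree=0.9):
    """Add a weighted detector term to an acoustic log-likelihood.

    The detector term is log(p_agree) when the detector's decision matches the
    class (speech or non-speech) of the hypothesized phone, and log(1 - p_agree)
    otherwise. A larger weight penalizes speech/non-speech confusions more.
    """
    agrees = (detector_value >= 0.5) == phone_is_speech
    detector_loglik = math.log(p_agree if agrees else 1.0 - p_agree)
    return acoustic_loglik + weight * detector_loglik


# Two toy frames: near-silence, then a louder segment.
frames = [[0.0, 0.01, -0.01, 0.0], [0.5, -0.6, 0.4, -0.5]]
detector = speech_feature(frames, threshold=-4.0)
augmented = augment([[1.0, 2.0], [3.0, 4.0]], detector)

# On the frame flagged as speech, a speech phone now outscores a non-speech
# phone even though both have the same acoustic log-likelihood.
speech_score = biased_score(-10.0, detector[1], phone_is_speech=True, weight=2.0)
nonspeech_score = biased_score(-10.0, detector[1], phone_is_speech=False, weight=2.0)
```

With `weight` set to 0 the detector has no influence and the score reduces to the acoustic log-likelihood alone; raising it biases the search toward hypotheses that agree with the detector, which is the kind of trade-off the abstract describes as appropriately weighting the feature's contribution.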