Publication | Closed Access
Trapping conversational speech: extending TRAP/tandem approaches to conversational telephone speech recognition
Citations: 40
References: 16
Year: 2004
Unknown Venue
Engineering, Machine Learning, Spoken Language Processing, Communication, Phonology, Conversational Speech, Speech Recognition, Natural Language Processing, Phonetics, Speech Interface, Acoustic Front End, Conversation Analysis, Voice Recognition, Language Studies, Temporal Patterns, Front End, Computer Science, Distant Speech Recognition, Speech Communication, Speech Technology, Extending Trap/tandem, Speech Processing, Speech Input, Speech Perception, Linguistics, Telephone Speech Recognition
Temporal patterns (TRAP) and tandem MLP/HMM approaches incorporate feature streams computed from longer time intervals than the conventional short-time analysis. These methods have been used for challenging small- and medium-vocabulary recognition tasks, such as Aurora and SPINE. Conversational telephone speech recognition is a difficult large-vocabulary task, with current systems giving incorrect output for 20-40% of the words, depending on the system complexity and test set. Training and test times for this problem also tend to be relatively long, making rapid development quite difficult. In this paper we report experiments with a reduced conversational speech task that led to the adoption of a number of engineering decisions for the design of an acoustic front end. We then describe our results with this front end on a full-vocabulary conversational telephone speech task. In both cases the front end yielded significant improvements over the baseline.