Publication | Closed Access
Reducing word error rate on conversational speech from the Switchboard corpus
Citations: 21
References: 6
Year: 2002
Venue: Unknown
Engineering, Speech Corpus, Word Error Rate, Spoken Language Processing, Communication, Conversational Speech, Corpus Linguistics, Speech Recognition, Natural Language Processing, Switchboard Corpus, Computational Linguistics, Phonetics, Robust Speech Recognition, Speech Interface, Language Studies, Machine Translation, Computer Science, Additional Acoustic Training, Speech Communication, Speech Technology, Speech Analysis, Speech Processing, Speech Input, Speech Perception, Linguistics
Recognition of conversational speech is a difficult task. Performance levels on the Switchboard corpus had been in the vicinity of 70% word error rate. In this paper, we describe the results of applying a variety of modifications to our speech recognition system and show their impact on performance on conversational speech. These modifications include the use of more complex models, trigram language models, and cross-word triphone models. We also show the effect of additional acoustic training on recognition performance. Finally, we present an approach to dealing with the abundance of short words, and examine how the variable speaking rate found in conversational speech affects performance. Currently, performance is in the vicinity of 50% word error rate, a significant improvement over recent levels.
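For readers unfamiliar with the metric, word error rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring procedure used in the paper (which the abstract does not describe); the function name `word_error_rate` and the example sentences are illustrative assumptions and are not drawn from Switchboard.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; assumes whitespace tokenization
    and no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in recognizer output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one deletion ("i"), one substitution ("is" -> "was"),
# and one insertion ("now") against a six-word reference.
example = word_error_rate("so i think that is right",
                          "so think that was right now")
```

Because insertions count as errors, this ratio can exceed 100% when the recognizer outputs many extra words, which is worth keeping in mind when reading figures such as 70% or 50%.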