Publication | Closed Access
The IBM 2004 Conversational Telephony System for Rich Transcription
Citations: 103
References: 8
Year: 2006
Venue: Unknown
Topics: Engineering, Machine Learning, IBM 2004, Spoken Language Processing, Communication, Rich Transcription Evaluation, Speech Recognition, Natural Language Processing, Data Science, Phonetics, Computational Linguistics, Robust Speech Recognition, Conversation Analysis, Voice Recognition, Language Studies, Machine Translation, Speech Synthesis, Speech Output, Computer Science, Text-to-speech, Last Year, System Architecture, Speech Communication, Speech Processing, Speech Input, Speech Perception, Voice Technology, Linguistics
This paper describes the technical advances in IBM's conversational telephony submission to the DARPA-sponsored 2004 Rich Transcription evaluation (RT-04). These advances include a system architecture based on cross-adaptation; a new form of feature-based minimum phone error (MPE) training; a full-scale, discriminatively trained full-covariance Gaussian system; septaphone cross-word acoustic context in static decoding graphs; and the incorporation of 2100 hours of training data in every system component. Together, they reduced the error rate on the 2003 test set by approximately 21% relative over the best-performing system in the previous year's evaluation, and produced the best results on the RT-04 current and progress CTS data.
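The "21% relative" figure is a relative, not absolute, reduction in error rate. As a minimal sketch of that arithmetic, the snippet below computes a relative reduction from two error rates. The function name `relative_reduction` and both input values are hypothetical, chosen only to illustrate the calculation; they are not results reported in the paper.

```python
def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative error-rate reduction as a fraction of the baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical error rates (in percent), NOT taken from the paper.
baseline = 20.0  # assumed error rate of an earlier system
improved = 15.8  # assumed error rate of a newer system

absolute = baseline - improved                    # difference in percentage points
relative = relative_reduction(baseline, improved)  # fraction of the baseline removed

print(f"absolute reduction: {absolute:.1f} points")
print(f"relative reduction: {relative:.1%}")
```

Under these assumed inputs, a drop of about 4 percentage points corresponds to a relative reduction of about 21%, which is why the two kinds of figure should not be compared directly.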