Publication | Closed Access
Development of the GALE 2008 Mandarin LVCSR system
Citations: 18
References: 17
Year: 2009
Unknown Venue
Engineering, Feature Extraction, Education, Current Improvements, Speech Recognition, Phonetics, Robust Speech Recognition, Modeling And Simulation, Voice Recognition, Mandarin Language, Chinese Language, Gale 2008, Speech Synthesis, Classical Feature Extraction, Computer Science, Deep Learning, Distant Speech Recognition, Speech Technology, Speech Processing, Speech Input, Speaker Recognition
This paper describes recent improvements to the RWTH Mandarin LVCSR system. We introduce vocal tract length normalization for Gammatone features and present comparable results for Gammatone-based and classical feature extraction. To benefit from the 1600 h of data available in the GALE project, we trained acoustic models with up to 8M Gaussians, and we present detailed character error rates for different numbers of Gaussians. Several kinds of systems are developed, and a two-stage decoding framework is applied that uses cross-adaptation followed by lattice-based system combination. In addition to various acoustic front-ends, these systems use different kinds of neural network toneme posterior features. We present detailed recognition results for the development cycle and for the different acoustic front-ends of the systems. Finally, we compare the final evaluation system to last year's system and report a 10% relative improvement.
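The abstract reports character error rates and a relative improvement between systems. As a rough illustration of how such figures are commonly computed, the sketch below scores a hypothesis against a reference by character-level edit distance and derives a relative improvement from two error rates. It is a minimal sketch, not the paper's scoring tool: the function names are hypothetical, and real evaluations typically apply text normalization that is omitted here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions,
    insertions and deletions all cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """Character errors divided by the number of reference characters.
    Assumes both strings are already normalized (hypothetical helper)."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_improvement(baseline_rate, new_rate):
    """Fraction by which the error rate dropped relative to the baseline."""
    if baseline_rate <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline_rate - new_rate) / baseline_rate
```

For instance, `character_error_rate("今天天气很好", "今天天气好")` counts one deleted character out of six reference characters. The strings are made-up illustrations, not data from the paper.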