Publication | Closed Access
A study on multilingual acoustic modeling for large vocabulary ASR
Citations: 116
References: 6
Year: 2009
Unknown Venue
Engineering, Spoken Language Processing, Multilingual Pretraining, Acoustic Modeling, Speech Recognition, Natural Language Processing, Data Science, Computational Linguistics, Phonetics, Robust Speech Recognition, Language Studies, Machine Translation, Large-scale ASR Experiments, Speech Communication, Speech Technology, Large Vocabulary ASR, Automatic Speech Recognition, Multi-speaker Speech Recognition, Speech Processing, Speech Input, Speech Perception, Linguistics
We study key issues in multilingual acoustic modeling for automatic speech recognition (ASR) through a series of large-scale ASR experiments. Our study explores the shared structures embedded in a large collection of speech data spanning a number of spoken languages, in order to establish a common set of universal phone models that can be used for large vocabulary ASR of all languages, whether seen or unseen during training. Language-universal and language-adaptive models are compared with language-specific models. The comparison shows that in many cases it is possible to build general-purpose language-universal and language-adaptive acoustic models that outperform language-specific ones, provided that the set of shared units, the structure of shared states, and the shared acoustic-phonetic properties among different languages are properly utilized. Specifically, our results demonstrate that when context coverage is poor in language-specific training, we can use one tenth of the adaptation data to achieve equivalent performance in cross-lingual speech recognition.