Publication | Closed Access
A study of implicit and explicit modeling of coarticulation and pronunciation variation
Citations: 13
References: 9
Year: 2005
Unknown Venue
Music, Psycholinguistics, Handle Pronunciation Variation, Spoken Language Processing, Phonology, Acoustic Modeling, Speech Recognition, Phonetics, Pronunciation Variation, Robust Speech Recognition, Voice Recognition, Language Studies, Explicit Modeling, Health Sciences, Auditory Modeling, Speech Production, Computer Science, Speech Communication, Speech Technology, Speech Processing, Speech Input, Speech Perception, Linguistics, Most ASR Systems
In this paper, we focus on the modeling of coarticulation and pronunciation variation in automatic speech recognition (ASR) systems. Most ASR systems describe these production phenomena explicitly, through context-dependent phoneme models and multiple-pronunciation lexicons. Here, we explore the potential benefit of feature spaces that cover longer time segments as a way to model coarticulation and pronunciation variants implicitly. The study analyzes, at the phonetic level, the performance of context-independent and context-dependent acoustic models, and in particular the impact of temporal contexts ranging from 70 ms to 310 ms on typical cases of pronunciation variants. The results, confirmed by a word recognition experiment, show that generic acoustic models have some ability to handle pronunciation variation implicitly.
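As a rough illustration of what "feature spaces covering longer time segments" can mean in practice, the sketch below stacks neighbouring feature frames so that each vector spans a chosen temporal context. It is not the authors' method: the 10 ms frame shift, the 13-dimensional features, the edge padding, and the function name `stack_context` are all assumptions made for the example.

```python
import numpy as np

def stack_context(features, context_ms, frame_shift_ms=10):
    """Concatenate each frame with its neighbours so one vector spans context_ms.

    features: array of shape (T, D), one row per frame.
    frame_shift_ms is an assumed value; the abstract does not state it.
    """
    n_frames = context_ms // frame_shift_ms  # e.g. 70 ms -> 7 frames
    if n_frames < 1 or n_frames % 2 == 0:
        raise ValueError("context must cover an odd number of frames")
    half = n_frames // 2
    # Repeat the first and last frames so edge frames also get a full window.
    padded = np.pad(features, ((half, half), (0, 0)), mode="edge")
    num_frames = features.shape[0]
    # Window i covers padded[i : i + n_frames], centred on original frame i.
    return np.stack(
        [padded[i:i + n_frames].reshape(-1) for i in range(num_frames)]
    )

# Hypothetical input: 100 frames of 13-dimensional features.
feats = np.random.default_rng(0).normal(size=(100, 13))
narrow = stack_context(feats, 70)    # 7 frames per vector
wide = stack_context(feats, 310)     # 31 frames per vector
```

Under these assumptions, the 70 ms and 310 ms contexts mentioned in the abstract correspond to windows of 7 and 31 frames, so the input dimensionality grows linearly with the context length.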