Publication | Closed Access
Multilingual speech recognition: a unified approach
Citations: 24
References: 9
Year: 2005
Unknown Venue
Endangered Languages, Engineering, Multilingualism, Spoken Language Processing, Speech Recognition, World Languages, Computational Linguistics, Robust Speech Recognition, Speech Interface, Automatic Recognition, Language Studies, Spoken Language Understanding, Machine Translation, Computer Science, Speech Communication, Unified Approach, Speech Acoustics, Multilingual Speech Recognition, Language Recognition, Speech Processing, Speech Input, Hidden Markov Model, Linguistics
Abstract

In this paper, we present a unified approach for hidden Markov model based multilingual speech recognition. The proposed approach could be used across acoustically similar as well as diverse languages. We use an automatic phone mapping algorithm to map phones across languages and reduce the effective number of phones in the multilingual acoustic model. We experimentally verify the effectiveness of the approach using two acoustically similar languages, Tamil and Hindi, and also American English, which is acoustically very different from the other two languages. The experimental results are very encouraging and demonstrate the effectiveness of the approach in building a universal multilingual speech recognition system.

1. Introduction

A practical approach to multilingual speech recognition for countries like India, where more than 30 languages are spoken across the country, would be to have a truly multilingual acoustic model. This multilingual model should then be adapted to the target language with the help of a language identification system. This is more important in the case of telephony applications, where the conversations can be of short duration and the language could change from one conversation to another.

Indian languages in general belong to either the Dravidian family or the Aryan family or both. For example, a south Indian language, Malayalam, is derived from another south Indian language, Tamil, a Dravidian language. But its vocabulary is rich with words derived from Sanskrit, an Aryan language, making it resemble both the Dravidian and the Aryan languages. The acoustic characteristics of some of the Malayalam phones are similar to Tamil, while some are closer to Sanskrit.
Therefore, while dealing with Indian languages, there are acoustic similarities across languages that could be shared to our advantage to reduce the overall size of the acoustic model.

When dealing with many languages at the same time, the easiest and most effective approach would be to identify the language with the help of a language identification system and choose the appropriate monolingual acoustic model for decoding. This approach, however, suffers mainly from two drawbacks: