Publication | Closed Access
Strategies for Vietnamese keyword search
Citations: 47
References: 29
Year: 2014
Unknown Venue
Engineering; Speech Corpus; Spoken Language Processing; Phonology; Corpus Linguistics; Language Processing; Speech Recognition; Applied Linguistics; Natural Language Processing; KWS System; Information Retrieval; Computational Linguistics; Intelligent Searching; Automatic Recognition; Voice Recognition; Corpus Analysis; Language Studies; Spoken Language Understanding; Machine Translation; Creaky Voice Quality; Keyword Search; Acoustic Features; Speech Acoustics; Language Recognition; Speech Processing; Speech Input; Search Technique; Vietnamese Keyword Search; Linguistics
We propose strategies for a state-of-the-art Vietnamese keyword search (KWS) system developed at the Institute for Infocomm Research (I²R). The KWS system exploits acoustic features that characterize the creaky voice quality peculiar to lexical tones in Vietnamese, a minimal-resource transliteration framework that alleviates out-of-vocabulary issues caused by foreign loan words, and a proposed system combination scheme, FusionX. We show that the proposed creaky voice quality features complement pitch-related features, reaching fusion gains of 17.7% relative (6.9% absolute). To the best of our knowledge, the proposed transliteration framework is the first reported rule-based system for Vietnamese; it outperforms statistical baselines by 14.93% to 36.73% relative on foreign loan word search tasks. Using FusionX to combine 3 sub-systems, the actual term-weighted value (ATWV) reaches 0.4742, exceeding the ATWV = 0.3 benchmark for IARPA Babel participants in the NIST OpenKWS13 Evaluation.
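For readers unfamiliar with the metric, the following is a minimal sketch of how ATWV is commonly computed under the standard NIST keyword search definition: one minus the average over keywords of the miss probability plus a weighted false-alarm probability. It is not the paper's scoring tool. The names `KeywordCounts` and `atwv` are hypothetical, and the sketch assumes detections have already been aligned against the reference, so that only per-keyword counts remain.

```python
from dataclasses import dataclass

# Weight on false alarms in the NIST KWS definition:
# (C/V) * (1/Pr_term - 1) with C/V = 0.1 and Pr_term = 1e-4.
BETA = 999.9


@dataclass
class KeywordCounts:
    """Hypothetical container for per-keyword scoring counts."""
    n_true: int      # reference occurrences of the keyword
    n_correct: int   # detections that match a reference occurrence
    n_spurious: int  # detections with no matching reference occurrence


def atwv(counts, speech_seconds, beta=BETA):
    """Term-weighted value at the system's actual decision threshold."""
    # Keywords with no reference occurrences have an undefined miss
    # probability and are left out of the average.
    scored = [c for c in counts if c.n_true > 0]
    if not scored:
        raise ValueError("no keyword has reference occurrences")
    total = 0.0
    for c in scored:
        p_miss = 1.0 - c.n_correct / c.n_true
        # Non-target trials are approximated as one per second of speech,
        # minus the true occurrences of the keyword.
        p_fa = c.n_spurious / (speech_seconds - c.n_true)
        total += p_miss + beta * p_fa
    return 1.0 - total / len(scored)
```

Under this definition a system that finds every occurrence with no false alarms scores 1.0, and a system that outputs nothing scores 0.0. Because of the large false-alarm weight, a system that emits many spurious detections can score below zero.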