Publication | Open Access
Combining a two-step conditional random field model and a joint source channel model for machine transliteration
Citations: 14
References: 9
Year: 2009
Unknown Venue
Engineering, Corpus Linguistics, Text Mining, Speech Recognition, Natural Language Processing, Language Documentation, DOP Machine Transliteration, Computational Linguistics, Language Engineering, Machine Transliteration, NEWS 2009, Language Studies, Machine Translation, Computer-assisted Translation, Speech Synthesis, Computer Science, Neural Machine Translation, Speech Translation, Source Word, Language Recognition, Speech Processing, Linguistics, Po Tagging
This paper describes our system for the "NEWS 2009 Machine Transliteration Shared Task" (NEWS 2009). We participated only in the standard run, which is a direct orthographical mapping (DOP) between two languages without any intermediate phonemic mapping. We propose a new two-step conditional random field (CRF) model for DOP machine transliteration, in which the first CRF segments a source word into chunks and the second CRF maps the chunks to a word in the target language. The two-step CRF model obtains slightly lower top-1 accuracy than a state-of-the-art n-gram joint source-channel model. Combining the CRF model with the joint source-channel model leads to improvements in all the tasks. The official results of our system in the NEWS 2009 shared task confirm its effectiveness: we achieved a top-1 accuracy of 0.627 for Japanese transliterated to Japanese Kanji (JJ), 0.713 for English-to-Chinese (E2C), and 0.510 for English-to-Japanese Katakana (E2J).
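The two-step idea in the abstract (segment the source word, then map the chunks) and the top-1 accuracy metric can be illustrated with a minimal Python sketch. This is not the authors' implementation: the greedy segmenter and the lookup table `TOY_CHUNK_MAP` are hypothetical stand-ins for the two trained CRFs, and `top1_accuracy` assumes a prediction counts as correct when it matches any reference transliteration.

```python
from typing import Dict, List

# Hypothetical toy table standing in for the trained second-step CRF.
TOY_CHUNK_MAP: Dict[str, str] = {"to": "ト", "ma": "マ", "su": "ス"}


def segment(source: str) -> List[str]:
    """Stand-in for the first CRF: split a source word into chunks.

    Uses greedy longest match against the toy table; any letter not
    covered by a two-letter entry becomes a single-letter chunk.
    """
    chunks: List[str] = []
    i = 0
    while i < len(source):
        for length in (2, 1):
            piece = source[i:i + length]
            if piece in TOY_CHUNK_MAP or length == 1:
                chunks.append(piece)
                i += len(piece)
                break
    return chunks


def map_chunks(chunks: List[str]) -> str:
    """Stand-in for the second CRF: map each chunk to target text.

    Unknown chunks are copied through unchanged.
    """
    return "".join(TOY_CHUNK_MAP.get(chunk, chunk) for chunk in chunks)


def transliterate(source: str) -> str:
    """Run the two steps in sequence: segment, then map."""
    return map_chunks(segment(source))


def top1_accuracy(predictions: List[str], references: List[List[str]]) -> float:
    """Fraction of words whose top candidate matches any reference."""
    if not predictions:
        return 0.0
    correct = sum(1 for pred, refs in zip(predictions, references) if pred in refs)
    return correct / len(predictions)


if __name__ == "__main__":
    print(segment("tomasu"))
    print(transliterate("tomasu"))
    print(top1_accuracy([transliterate("tomasu"), "x"], [["トマス"], ["y"]]))
```

In a real system, each stand-in would be replaced by a trained sequence labeller that produces scored n-best outputs. The abstract does not say how the CRF output is combined with the joint source-channel model, so that step is omitted here.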