Publication | Open Access
A scalable decoder for parsing-based machine translation with equivalent language model state maintenance
Year: 2008
Citations: 46
References: 19
Natural Language Processing, Computer-assisted Translation, Equivalent Language Model, Syntax, Engineering, Syntactic Parsing, Computational Linguistics, Scalable Decoder, Grammar, Computer Science, Parsing-based Machine Translation, Language Studies, Large Language Model, Semantic Parsing, Linguistics, Machine Translation, Neural Machine Translation
We describe a scalable decoder for parsing-based machine translation. The decoder is written in Java and implements all the essential algorithms described in Chiang (2007): chart parsing, m-gram language model integration, beam and cube pruning, and unique k-best extraction. It also exploits parallel and distributed computing techniques to scale. In addition, we propose an algorithm for maintaining equivalent language model states that exploits the back-off property of m-gram language models: instead of maintaining a separate state for each distinct sequence of "state" words, we merge multiple states that back-off renders equivalent for language model probability calculations. We demonstrate experimentally that our decoder is more than 30 times faster than a baseline decoder written in Python. We propose to release our decoder as an open-source toolkit.
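The state-merging idea in the abstract can be illustrated with a minimal sketch. It is not the decoder's actual code: the function names, the toy n-gram list, and the trigram order are all assumptions made for illustration. The sketch covers only the right-side state (the trailing words that condition future words) and ignores the back-off weights, which a real decoder would still have to account for when it drops a word from the state.

```python
from typing import Iterable, Sequence, Set, Tuple

Ngram = Tuple[str, ...]


def extendable_contexts(ngrams: Iterable[Ngram]) -> Set[Ngram]:
    """Collect every proper prefix of every listed n-gram.

    A word sequence in this set can be extended by at least one word
    without backing off; any other sequence always triggers back-off.
    """
    contexts: Set[Ngram] = set()
    for ngram in ngrams:
        for k in range(1, len(ngram)):
            contexts.add(ngram[:k])
    return contexts


def equivalent_right_state(words: Sequence[str],
                           contexts: Set[Ngram],
                           order: int) -> Ngram:
    """Shorten the right-side state to its longest extendable suffix.

    Hypothetical helper: starts from the last (order - 1) words and drops
    leading words while the remaining sequence is not a context listed in
    the model, since such a sequence cannot affect which n-gram is matched.
    """
    state: Ngram = tuple(words[-(order - 1):]) if order > 1 else ()
    while state and state not in contexts:
        state = state[1:]
    return state


# Toy trigram model (assumed data, for illustration only).
toy_ngrams = [
    ("the", "cat", "sat"),
    ("the", "cat"),
    ("cat", "sat"),
    ("the",), ("cat",), ("sat",), ("dog",), ("big",),
]
toy_contexts = extendable_contexts(toy_ngrams)

# Two hypotheses with different trailing words collapse to the same state,
# so a chart cell would need to keep only one entry for them.
state_a = equivalent_right_state(["a", "big", "dog"], toy_contexts, 3)
state_b = equivalent_right_state(["one", "small", "dog"], toy_contexts, 3)

# A hypothesis whose trailing words can still be extended keeps them.
state_c = equivalent_right_state(["saw", "the", "cat"], toy_contexts, 3)
```

Under these assumptions, `state_a` and `state_b` are both the empty tuple, while `state_c` stays `("the", "cat")`. Fewer distinct states per chart cell means fewer items to score and prune, which is the source of the savings the abstract describes.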