Publication | Open Access
Variational decoding for statistical machine translation
Citations: 53
References: 31
Year: 2009
Venue: Unknown
Keywords: Natural Language Processing, Computer-assisted Translation, Syntactic Parsing, Structured Prediction, Simple Viterbi Approximation, Machine Learning, Engineering, Computational Linguistics, Many Derivations, Minimum-risk Decoding, Computer Science, Grammar, Language Studies, Variational Decoding, Linguistics, Machine Translation, Neural Machine Translation
Statistical models in machine translation exhibit spurious ambiguity. That is, the probability of an output string is split among many distinct derivations (e.g., trees or segmentations). In principle, the goodness of a string is measured by the total probability of its many derivations. However, finding the best string (e.g., during decoding) is then computationally intractable. Therefore, most systems use a simple Viterbi approximation that measures the goodness of a string using only its most probable derivation. Instead, we develop a variational approximation, which considers all the derivations but still allows tractable decoding. Our particular variational distributions are parameterized as n-gram models. We also analytically show that interpolating these n-gram models for different n is similar to minimum-risk decoding for BLEU (Tromble et al., 2008). Experiments show that our approach improves the state of the art.
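The contrast drawn in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy derivation table, the function names, and the choice to fit the bigram model from expected bigram counts under the derivation distribution. It is not the paper's implementation, which works over packed hypergraphs rather than an explicit list of derivations.

```python
from collections import defaultdict

# Hypothetical toy distribution: (output string, derivation id, probability).
# "a cat sat" has two derivations, so its probability mass is split.
DERIVATIONS = [
    ("the cat sat", "d1", 0.40),
    ("a cat sat", "d2", 0.30),
    ("a cat sat", "d3", 0.30),
]

def viterbi_string(derivations):
    """Viterbi approximation: string of the single most probable derivation."""
    return max(derivations, key=lambda d: d[2])[0]

def string_marginals(derivations):
    """Total probability of each string, summed over all its derivations."""
    totals = defaultdict(float)
    for string, _deriv_id, prob in derivations:
        totals[string] += prob
    return dict(totals)

def fit_bigram(derivations):
    """Fit q(word | prev) from expected bigram counts over all derivations.

    Assumption: this count-ratio estimate stands in for the variational fit.
    """
    pair_counts = defaultdict(float)
    history_counts = defaultdict(float)
    for string, _deriv_id, prob in derivations:
        tokens = ["<s>"] + string.split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            pair_counts[(prev, word)] += prob
            history_counts[prev] += prob
    return {pair: count / history_counts[pair[0]]
            for pair, count in pair_counts.items()}

def bigram_score(q, string):
    """Score a string under the fitted bigram model q."""
    tokens = ["<s>"] + string.split() + ["</s>"]
    score = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        score *= q.get((prev, word), 0.0)
    return score

def variational_string(derivations):
    """Pick the candidate string that scores highest under the bigram model."""
    q = fit_bigram(derivations)
    candidates = {string for string, _deriv_id, _prob in derivations}
    return max(candidates, key=lambda s: bigram_score(q, s))

if __name__ == "__main__":
    print("Viterbi choice:    ", viterbi_string(DERIVATIONS))
    print("String marginals:  ", string_marginals(DERIVATIONS))
    print("Variational choice:", variational_string(DERIVATIONS))
```

In this toy case the Viterbi approximation selects "the cat sat", because its single derivation is the most probable one, while both the exact string marginals and the bigram approximation prefer "a cat sat". With realistic models the set of derivations is far too large to enumerate, which is why a tractable approximation is needed in the first place.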