Publication | Open Access
Detecting Cross-Lingual Semantic Divergence for Neural Machine Translation
Citations: 56
References: 37
Year: 2017
Unknown Venue
Engineering, Cross-lingual Representation, Cross-lingual Semantic Divergence, Entailment (Linguistics), Textual Entailment, Multilingual Pretraining, Semantics, Corpus Linguistics, Text Mining, Natural Language Processing, Applied Linguistics, Divergent Examples, Language Documentation, Parallel Corpora, Computational Linguistics, Language Studies, Machine Translation, Noisy Translations, Neural Machine Translation, Linguistics
Parallel corpora are often not as parallel as one might assume: non-literal translations and noisy translations abound, even in curated corpora routinely used for training and evaluation. We use a cross-lingual textual entailment system to distinguish sentence pairs that are parallel in meaning from those that are not, and show that filtering out divergent examples from training improves translation quality.
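The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `filter_divergent`, the `equivalence_score` callable, and the 0.5 threshold are all hypothetical, and the scorer stands in for whatever cross-lingual entailment system produces a per-pair judgment.

```python
from typing import Callable, Iterable, List, Tuple

SentencePair = Tuple[str, str]  # (source sentence, target sentence)


def filter_divergent(
    pairs: Iterable[SentencePair],
    equivalence_score: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[SentencePair]:
    """Keep only the pairs scored at or above `threshold`.

    `equivalence_score` is a hypothetical stand-in for a cross-lingual
    entailment system; higher values are assumed to mean the two
    sentences are closer in meaning. The threshold is an assumption too.
    """
    return [
        (src, tgt)
        for src, tgt in pairs
        if equivalence_score(src, tgt) >= threshold
    ]
```

Under these assumptions, the pairs that survive the filter would form the training set for the translation model, while pairs scored below the threshold are treated as divergent and dropped.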