Publication | Open Access
A Semi-supervised Word Alignment Algorithm with Partial Manual Alignments
Citations: 26
References: 21
Year: 2010
Unknown Venue
Natural Language Processing, Computer-assisted Translation, Word Alignment Framework, Engineering, Information Retrieval, Data Science, Speech Translation, Corpus Linguistics, Computational Linguistics, Partial Manual Alignments, Constrained EM Algorithm, Language Studies, Text Processing, Named-entity Recognition, Linguistics, Text Mining, Machine Translation, Neural Machine Translation
We present a word alignment framework that can incorporate partial manual alignments. The core of the approach is a novel semi-supervised algorithm that extends the widely used IBM Models with a constrained EM algorithm. The partial manual alignments can be obtained by human labelling or automatically by high-precision, low-recall heuristics. We demonstrate both methods: selecting alignment links from a manually aligned corpus, and applying links generated from a bilingual dictionary to unlabelled data. For the first method, we conduct controlled experiments on Chinese-English and Arabic-English translation tasks to compare word alignment quality and to measure the effects of two different methods of selecting alignment links from the manually aligned corpus. For the second method, we experiment with a moderate-scale Chinese-English translation task. The experimental results show an average improvement of 0.33 BLEU points across 8 test sets.
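The idea of constraining EM with partial manual alignments can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes IBM Model 1 only, omits the NULL word, and treats each manual link as a hard constraint that restricts the linked target position to a single source position in the E-step. The function name `constrained_model1` and the layout of `manual_links` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def constrained_model1(corpus, manual_links, iterations=5):
    """Toy IBM Model 1 trained with a constrained E-step.

    corpus:       list of (src_tokens, tgt_tokens) sentence pairs.
    manual_links: dict mapping a sentence index to {tgt_pos: src_pos}
                  (hypothetical format for the partial manual alignments).
    Returns a dict-like table t[(tgt_word, src_word)] = t(tgt_word | src_word).
    """
    # Uniform start: every co-occurring pair gets the same unnormalised score.
    t = defaultdict(lambda: 1.0)

    for _ in range(iterations):
        count = defaultdict(float)   # expected counts for (tgt, src) pairs
        total = defaultdict(float)   # expected counts per source word

        for n, (src, tgt) in enumerate(corpus):
            fixed = manual_links.get(n, {})
            for j, f in enumerate(tgt):
                # Constrained E-step: a manually linked target position may
                # only align to its labelled source position; every other
                # position keeps the full set of candidates.
                candidates = [fixed[j]] if j in fixed else range(len(src))
                z = sum(t[(f, src[i])] for i in candidates)
                for i in candidates:
                    posterior = t[(f, src[i])] / z
                    count[(f, src[i])] += posterior
                    total[src[i]] += posterior

        # M-step: renormalise the expected counts per source word.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )

    return t


# Toy data (assumed for illustration): German source, English target.
corpus = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]
# One manual link in sentence 0: target position 1 ("house") -> source position 1 ("haus").
manual_links = {0: {1: 1}}
table = constrained_model1(corpus, manual_links)
```

In this sketch the single labelled link removes the competing pair ("house", "das") from the expected counts, so its probability mass is never accumulated. Unlabelled positions and unlabelled sentences are handled exactly as in unconstrained EM, which is what makes the procedure semi-supervised.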