Iterative, MT-based sentence alignment of parallel texts

Year: 2011

Citations: 51

References: 13

Abstract

Recent research has shown that MT-based sentence alignment is a robust approach for noisy parallel texts. However, using machine translation (MT) for sentence alignment causes a chicken-and-egg problem: to train a corpus-based MT system, we need sentence-aligned data, and MT-based sentence alignment depends on an MT system. We describe a bootstrapping approach to sentence alignment that resolves this circular dependency by computing an initial alignment with length-based methods. Our evaluation shows that iterative MT-based sentence alignment significantly outperforms widespread alignment approaches on our evaluation set, without requiring any linguistic resources other than the to-be-aligned bitext.
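
The abstract gives no implementation details, so the following is only a minimal sketch of the bootstrapping idea under loudly stated assumptions: a length-based pass produces a first alignment, a toy "MT system" is trained on it, and the text is re-aligned using that system's output. All names (`align`, `length_score`, `train_dictionary`, `mt_score`, `iterative_align`) are hypothetical. The word-for-word dictionary stands in for a real corpus-based MT system, and the word-overlap score stands in for whatever translation-similarity measure an MT-based aligner would actually use.

```python
from collections import Counter, defaultdict


def align(src, tgt, score, skip_penalty):
    """Monotone 1-1 alignment with skips, found by dynamic programming.

    Returns a list of (source_index, target_index) pairs.
    """
    n, m = len(src), len(tgt)
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            cands = []
            if i > 0 and j > 0:  # pair src[i-1] with tgt[j-1]
                s = score(src[i - 1], tgt[j - 1])
                cands.append((best[i - 1][j - 1] + s, (i - 1, j - 1)))
            if i > 0:  # leave src[i-1] unaligned
                cands.append((best[i - 1][j] + skip_penalty, (i - 1, j)))
            if j > 0:  # leave tgt[j-1] unaligned
                cands.append((best[i][j - 1] + skip_penalty, (i, j - 1)))
            best[i][j], back[i][j] = max(cands, key=lambda c: c[0])
    pairs, (i, j) = [], (n, m)
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        if (pi, pj) == (i - 1, j - 1):
            pairs.append((pi, pj))
        i, j = pi, pj
    return pairs[::-1]


def length_score(s, t):
    """Length-based score in [-1, 0]; 0 means identical character length."""
    return -abs(len(s) - len(t)) / max(len(s), len(t), 1)


def train_dictionary(pairs):
    """Toy stand-in for MT training: map each source word to the target
    word with the highest Dice coefficient over the aligned pairs."""
    co = defaultdict(Counter)
    src_n, tgt_n = Counter(), Counter()
    for s, t in pairs:
        s_words, t_words = sorted(set(s.split())), sorted(set(t.split()))
        src_n.update(s_words)
        tgt_n.update(t_words)
        for w in s_words:
            co[w].update(t_words)
    return {
        w: max(c, key=lambda v: 2 * c[v] / (src_n[w] + tgt_n[v]))
        for w, c in co.items()
    }


def mt_score(table, threshold=0.2):
    """Score = Jaccard overlap between the 'translated' source and the
    target, shifted so that poor matches score below zero."""
    def score(s, t):
        hyp = {table.get(w, w) for w in s.split()}
        ref = set(t.split())
        return len(hyp & ref) / max(len(hyp | ref), 1) - threshold
    return score


def iterative_align(src, tgt, rounds=2):
    """Bootstrap: length-based alignment first, then alternate between
    training the toy translator and re-aligning with it."""
    pairs = align(src, tgt, length_score, skip_penalty=-1.0)
    for _ in range(rounds):
        table = train_dictionary([(src[i], tgt[j]) for i, j in pairs])
        pairs = align(src, tgt, mt_score(table), skip_penalty=0.0)
    return pairs
```

Calling `iterative_align(source_sentences, target_sentences)` on two lists of sentence strings returns index pairs. The number of rounds and the overlap threshold are arbitrary illustrative values, not settings taken from the paper.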
