Publication | Open Access

Weighted Set-Theoretic Alignment of Comparable Sentences

Citations: 25

References: 15

Year: 2017

Abstract

This article presents the STACCw system for the BUCC 2017 shared task on parallel sentence extraction from comparable corpora. The original STACC approach, based on set-theoretic operations over bags of words, had been previously shown to be efficient and portable across domains and alignment scenarios. We describe an extension of this approach with a new weighting scheme and show that it provides significant improvements on the datasets provided for the shared task.
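
To make the idea of a weighted set-theoretic score over bags of words concrete, here is a minimal sketch of a weighted Jaccard overlap. It is an illustration only: the abstract does not give the actual STACC or STACCw formula, so the function name `weighted_jaccard`, the `weights` mapping, and the assumption that both sentences are already tokenized into a shared vocabulary are all hypothetical.

```python
def weighted_jaccard(tokens_a, tokens_b, weights=None, default=1.0):
    """Weighted Jaccard overlap between two bags of words.

    Hypothetical illustration, not the published STACC(w) metric.
    `weights` maps a token to a non-negative weight; tokens missing
    from the mapping receive `default`.
    """
    weights = weights or {}
    set_a, set_b = set(tokens_a), set(tokens_b)

    def w(token):
        return weights.get(token, default)

    # Total weight of the union; zero means there is nothing to compare.
    union_weight = sum(w(t) for t in set_a | set_b)
    if union_weight == 0:
        return 0.0

    # Weight of shared tokens divided by weight of all tokens.
    shared_weight = sum(w(t) for t in set_a & set_b)
    return shared_weight / union_weight


if __name__ == "__main__":
    a = ["the", "cat", "sleeps"]
    b = ["the", "cat", "eats"]
    # Assumed weights: down-weight a frequent function word.
    print(weighted_jaccard(a, b, weights={"the": 0.1}))
```

With uniform weights this reduces to the plain Jaccard index; a weighting scheme changes only how much each shared or unshared token contributes to the score.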
