Publication | Open Access
Weighted Set-Theoretic Alignment of Comparable Sentences
Citations: 25 | References: 15 | Year: 2017
This article presents the STACCw system for the BUCC 2017 shared task on parallel sentence extraction from comparable corpora. The original STACC approach, based on set-theoretic operations over bags of words, had previously been shown to be efficient and portable across domains and alignment scenarios. We describe an extension of this approach with a new weighting scheme and show that it provides significant improvements on the datasets provided for the shared task.
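As a rough illustration of what a weighted set-theoretic similarity over bags of words can look like, the sketch below computes a weighted Jaccard overlap between two tokenized sentences. It is not the system's actual scoring function, which the abstract does not specify: the function names `weighted_jaccard` and `inverse_frequency_weights`, the inverse-frequency weighting, and the assumption that both sentences have already been mapped to a shared vocabulary are all hypothetical choices made for this example.

```python
from collections import Counter


def inverse_frequency_weights(sentences):
    """Hypothetical weighting: 1 / (number of sentences containing the token)."""
    counts = Counter(tok for sent in sentences for tok in set(sent))
    return {tok: 1.0 / n for tok, n in counts.items()}


def weighted_jaccard(tokens_a, tokens_b, weights=None):
    """Weighted Jaccard overlap of two token sets.

    Tokens missing from `weights` default to 1.0, so calling the function
    without weights gives the plain Jaccard index.
    """
    weights = weights or {}
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty sentences: define the similarity as zero
    total = sum(weights.get(tok, 1.0) for tok in union)
    if total == 0:
        return 0.0  # all weights are zero: avoid dividing by zero
    shared = sum(weights.get(tok, 1.0) for tok in set_a & set_b)
    return shared / total


# Assumption: both sentences are already in one shared vocabulary.
corpus = [["the", "cat"], ["the", "dog"]]
weights = inverse_frequency_weights(corpus)
unweighted = weighted_jaccard(corpus[0], corpus[1])
weighted = weighted_jaccard(corpus[0], corpus[1], weights)
```

In this toy corpus the only shared token is the frequent word "the", so down-weighting it lowers the score relative to the unweighted index, which is the general intuition behind weighting frequent tokens less.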