Publication | Open Access

JW300: A Wide-Coverage Parallel Corpus for Low-Resource Languages

Year: 2019
Citations: 200
References: 30

Abstract

Viable cross-lingual transfer critically depends on the availability of parallel texts. Shortage of such resources imposes a development and evaluation bottleneck in multilingual processing. We introduce JW300, a parallel corpus of over 300 languages with around 100 thousand parallel sentences per language pair on average. In this paper, we present the resource and showcase its utility in experiments with cross-lingual word embedding induction and multi-source part-of-speech projection.
