Publication | Open Access
Building a German/Simple German Parallel Corpus for Automatic Text Simplification
Citations: 44 | References: 17 | Year: 2013
Syntactic Parsing, Engineering, Corpus Linguistics, Text Mining, Natural Language Processing, Syntax, Language Documentation, Data Science, Computational Linguistics, Language Engineering, Text Simplification, Grammar, Language Studies, Parallel Corpus, Machine Translation, Computer-assisted Translation, SMT Systems, Linguistics, Neural Machine Translation, Automatic Text Simplification, Lexical Complexity Prediction, Speech Translation, Parallel Data
In this paper we report our experiments in creating a parallel corpus using German/Simple German documents from the web. We require parallel data to build a statistical machine translation (SMT) system that translates from German into Simple German. Parallel data for SMT systems needs to be aligned at the sentence level. We applied an existing monolingual sentence alignment algorithm. We show the limits of the algorithm with respect to the language and domain of our data and suggest ways of circumventing them.