Publication | Closed Access
Building a Large Machine-Aligned Parallel Treebank
Citations: 21
References: 11
Year: 2009
Structured Prediction, Syntactic Parsing, Engineering, Machine Learning, Corpus Linguistics, Text Mining, Natural Language Processing, Syntax, Computational Linguistics, Grammar, Tree-aligned Parallel Treebank, Language Studies, Transfer Knowledge, Machine Translation, Tree Language, Syntax-based Machine Translation, Linguistics, Computer Science, Neural Machine Translation, Treebanks, Parallel Programming, Speech Translation
This paper reports ongoing work on building a large, automatically tree-aligned parallel treebank in the context of a syntax-based machine translation (MT) approach. For this purpose, we develop a discriminative tree aligner based on a log-linear model with a rich feature set. We incorporate various language-independent and language-specific features, taking advantage of existing tools and annotation. Our initial experiments on a small hand-aligned treebank show promising results even with small amounts of training data. The performance of our approach is well above that of unsupervised techniques reported elsewhere. This enables us to quickly create training material and alignment models for additional language pairs. In recent work, we aligned more than one million sentence pairs and started experiments on extracting transfer knowledge for our example-based machine translation system.
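As a rough illustration of what a log-linear link model can look like, the sketch below scores one candidate pair of tree nodes with a binary log-linear (logistic) model. The abstract does not list the actual features or the exact model form, so the feature names, weights, bias, and the 0.5 decision threshold are all hypothetical values chosen for illustration, not the authors' setup.

```python
import math


def link_probability(features, weights, bias=0.0):
    """Probability that a (source node, target node) pair is linked,
    under a binary log-linear (logistic) model.

    features: dict mapping feature name -> real value for this node pair
    weights:  dict mapping feature name -> weight (features without a
              weight contribute nothing)
    """
    score = bias + sum(
        weights.get(name, 0.0) * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical feature values for one candidate node pair.
features = {
    "lexical_overlap": 0.8,       # assumed: share of word-aligned leaves
    "same_category_label": 1.0,   # assumed: both nodes carry the same label
    "relative_span_diff": 0.1,    # assumed: difference in relative span size
}

# Hypothetical weights; in practice these would be estimated from
# hand-aligned training data.
weights = {
    "lexical_overlap": 2.0,
    "same_category_label": 1.0,
    "relative_span_diff": -3.0,
}

p = link_probability(features, weights, bias=-1.5)
aligned = p > 0.5  # assumed threshold for proposing a link
print(f"link probability: {p:.3f}, aligned: {aligned}")
```

In a setup like this, adding a language-specific feature only means adding another entry to the feature dictionary and learning one more weight, which is one reason log-linear models suit rich feature sets.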