Publication | Open Access
Sentence alignment of Hungarian-English parallel corpora using a hybrid algorithm
Year: 2008
Citations: 15
References: 9
Syntactic Parsing, Multilingualism, Syntactic Structure, Corpus Linguistics, Natural Language Processing, Syntax, Computational Linguistics, Language Engineering, Efficient Hybrid Method, Grammar, Language Studies, Machine Translation, Computer-assisted Translation, Named Entity Recognition, Linguistics, Cross-language Retrieval, New Algorithm, Sentence Alignment, Neural Machine Translation, Arts, Speech Translation
We present an efficient hybrid method for aligning sentences with their translations in a parallel bilingual corpus. The new algorithm combines a length-based method with an anchor-matching method that uses Named Entity recognition, pairing the speed of length-based models with the accuracy of anchor-finding methods. Because cognate detection is extremely inaccurate for the Hungarian-English language pair, we adopt a novel approach that uses Named Entity recognition to find anchors instead. Thanks to these well-selected anchors, the algorithm outperformed the two best sentence alignment algorithms published so far for the Hungarian-English language pair.
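The abstract does not give the algorithm's details, so the following Python sketch is only an illustration of the general idea of mixing a length-based cost with Named Entity anchors; it is not the authors' implementation. Everything in it is an assumption: the `anchors` heuristic (capitalized non-initial tokens and numbers) stands in for a real Named Entity recognizer, `length_cost` is a simple relative length difference rather than a probabilistic length model, and the `BEADS` penalties and `ANCHOR_BONUS` weight are hypothetical values chosen for the example.

```python
import re

# Hypothetical penalties for each alignment type (source count, target count).
BEADS = {(1, 1): 0.0, (1, 0): 1.0, (0, 1): 1.0, (2, 1): 0.5, (1, 2): 0.5}
ANCHOR_BONUS = 0.5  # hypothetical reward per anchor shared by both sides


def anchors(sentence):
    """Crude stand-in for Named Entity recognition (an assumption):
    capitalized tokens that are not sentence-initial, plus numbers."""
    tokens = re.findall(r"\w+", sentence)
    return {t for i, t in enumerate(tokens)
            if (i > 0 and t[0].isupper()) or t.isdigit()}


def length_cost(src_len, tgt_len):
    """Relative difference in character length, between 0 and 1."""
    longest = max(src_len, tgt_len)
    return abs(src_len - tgt_len) / longest if longest else 0.0


def align(src, tgt):
    """Monotonic alignment by dynamic programming over the bead types."""
    src_anchors = [anchors(s) for s in src]
    tgt_anchors = [anchors(t) for t in tgt]
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), penalty in BEADS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                src_len = sum(len(s) for s in src[i:ni])
                tgt_len = sum(len(t) for t in tgt[j:nj])
                shared = (set().union(*src_anchors[i:ni])
                          & set().union(*tgt_anchors[j:nj]))
                step = (penalty + length_cost(src_len, tgt_len)
                        - ANCHOR_BONUS * len(shared))
                if cost[i][j] + step < cost[ni][nj]:
                    cost[ni][nj] = cost[i][j] + step
                    back[ni][nj] = (i, j)
    # Walk the back-pointers from the end to recover the chosen beads.
    beads, i, j = [], n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    return beads[::-1]


hungarian = ["Tegnap Kovács Budapestre utazott.", "Az út 3 órát vett igénybe."]
english = ["Yesterday Kovács travelled to Budapest.", "The trip took 3 hours."]
alignment = align(hungarian, english)
```

In this toy pair the shared surname and the shared number act as anchors, while "Budapestre" and "Budapest" do not match because of Hungarian case inflection. That mismatch hints at why surface matching alone is unreliable for this language pair and why a real system would need proper Named Entity recognition and normalization.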