Concepedia

Publication | Closed Access

Using Parallel Corpora to enrich Multilingual Lexical Resources.

34

Citations

7

References

2002

Year

Abstract

This paper describes the use of a bilingual vector model for the automatic discovery of German translations of English terms. The model is built by analysing co-occurence patterns in a parallel corpus of English and German medical abstracts, a method also used for CrossLingual Information Retrieval. The model generates candidate German translations of English words using the cosine similarity measure between terms in the bilingual vector space. The correct translations could be added to UMLS, the multilingual dictionary in question. The accuracy of the translations is evaluated by measuring how many of the existing UMLS translations are correctly predicted by the vector translations. The model also detects synonymy, particularly acronyms. An online public demonstration of the model is available.

References

YearCitations

Page 1