Publication | Closed Access

Learning Bilingual Lexicons from Monolingual Corpora

Citations: 312

References: 12

Year: 2008

Abstract

We present a method for learning bilingual translation lexicons from monolingual corpora. Word types in each language are characterized by purely monolingual features, such as context counts and orthographic substrings. Translations are induced using a generative model based on canonical correlation analysis, which explains the monolingual lexicons in terms of latent matchings. We show that high-precision lexicons can be learned in a variety of language pairs and from a range of corpus types.
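The core idea in the abstract, projecting monolingual feature vectors from both languages into a shared space with canonical correlation analysis and then reading translations off a one-to-one matching, can be sketched roughly as follows. This is a simplified stand-in, not the paper's generative model. It assumes a small seed lexicon of known pairs (which the abstract does not mention), uses synthetic feature vectors in place of real context counts and orthographic substrings, and replaces the model's matching inference with a greedy nearest-pair heuristic. The names `fit_cca`, `greedy_match`, and `inv_sqrt` are hypothetical helpers introduced only for illustration.

```python
import numpy as np


def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T


def fit_cca(xs, ys, dim, reg=1e-3):
    """Fit regularized CCA on row-aligned seed pairs (an assumed seed lexicon).

    Returns the two feature means and the two projection matrices.
    """
    mx, my = xs.mean(axis=0), ys.mean(axis=0)
    xc, yc = xs - mx, ys - my
    n = xs.shape[0]
    cxx = xc.T @ xc / n + reg * np.eye(xs.shape[1])
    cyy = yc.T @ yc / n + reg * np.eye(ys.shape[1])
    cxy = xc.T @ yc / n
    wx, wy = inv_sqrt(cxx), inv_sqrt(cyy)
    # Singular vectors of the whitened cross-covariance give the canonical directions.
    u, _, vt = np.linalg.svd(wx @ cxy @ wy)
    return mx, my, wx @ u[:, :dim], wy @ vt[:dim].T


def greedy_match(px, py):
    """One-to-one matching: repeatedly take the closest unused (source, target) pair."""
    dist = np.linalg.norm(px[:, None, :] - py[None, :, :], axis=2)
    rows, cols = np.unravel_index(np.argsort(dist, axis=None), dist.shape)
    match, used = {}, set()
    for i, j in zip(rows, cols):
        i, j = int(i), int(j)
        if i not in match and j not in used:
            match[i] = j
            used.add(j)
    return match


# Synthetic stand-in for monolingual features: both languages observe the same
# latent "concept" vectors through different linear maps plus a little noise.
rng = np.random.default_rng(0)
n_words, n_seed, k = 60, 30, 5
z = rng.normal(size=(n_words, k))
x = z @ rng.normal(size=(k, 8)) + 0.01 * rng.normal(size=(n_words, 8))
y = z @ rng.normal(size=(k, 7)) + 0.01 * rng.normal(size=(n_words, 7))

# Fit on the seed pairs, then match the remaining words after shuffling the targets.
mx, my, ax, ay = fit_cca(x[:n_seed], y[:n_seed], dim=k)
perm = rng.permutation(n_words - n_seed)
x_test = x[n_seed:]
y_test = y[n_seed:][perm]  # y_test[j] is the translation of x_test[perm[j]]

match = greedy_match((x_test - mx) @ ax, (y_test - my) @ ay)
accuracy = float(np.mean([perm[j] == i for i, j in match.items()]))
print(f"matched {len(match)} held-out words, accuracy {accuracy:.2f}")
```

On real data the rows of `x` and `y` would be feature vectors for word types from two unaligned corpora, and the matching step would typically need a confidence threshold so that only high-scoring pairs enter the lexicon.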
