Publication | Closed Access

Literature Review of Cross Language Information Retrieval.

Citations: 38 | References: 15 | Year: 2005

TLDR

Cross‑language information retrieval (CLIR) extends traditional IR by allowing queries in one language to retrieve documents in another, using bilingual dictionaries or machine translation, but faces challenges from linguistic ambiguity, synonymy, and limited query context. This paper reviews prior CLIR work, identifies current challenges, and proposes future research directions. Keywords: Cross Language Information Retrieval, Lexical Semantics, Disambiguation, Translation.

Abstract

Classical Information Retrieval (IR) is the sifting out, from a large electronic store of documents, of those most relevant to a user’s information requirement (expressed as a “query”). A web search engine, for example, performs IR by retrieving relevant web pages from the internet. Rather than regarding foreign-language documents simply as unwanted “noise”, Cross Language Information Retrieval (CLIR) allows the user to state their query in one language and retrieve documents in another. Some CLIR systems use language resources such as bilingual dictionaries to translate the user’s original query, while other systems use machine translation to translate the foreign-language documents beforehand, enabling them to be retrieved by the original query. Problems arise from ambiguity in language, the use of synonyms to express a single idea, and the lack of context available when translating a short query. This paper discusses previous work in CLIR, examines current problems in CLIR, and makes recommendations for future work.

Keywords: Cross Language Information Retrieval, Lexical Semantics, Disambiguation, Translation.
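The dictionary-based query translation mentioned in the abstract can be illustrated with a minimal sketch. It is not the method of the reviewed paper or of any particular system: the dictionary entries, the three toy documents, and the function names `translate_query` and `retrieve` are all hypothetical, and the ranking is a plain term-overlap count chosen only to keep the example short.

```python
# Toy English-to-French dictionary (hypothetical entries, for illustration only).
# An ambiguous source term maps to several candidate translations.
BILINGUAL_DICT = {
    "bank": ["banque", "rive"],
    "river": ["rivière", "fleuve"],
    "loan": ["prêt"],
}

# Toy French document collection (hypothetical).
DOCUMENTS = {
    "d1": "la banque accorde un prêt",
    "d2": "la rive du fleuve est calme",
    "d3": "le musée ouvre demain",
}


def translate_query(query):
    """Replace each query term with all of its dictionary translations."""
    translated = []
    for term in query.lower().split():
        # Terms missing from the dictionary are passed through unchanged.
        translated.extend(BILINGUAL_DICT.get(term, [term]))
    return translated


def retrieve(query):
    """Rank documents by how many translated query terms they contain."""
    terms = set(translate_query(query))
    scores = {}
    for doc_id, text in DOCUMENTS.items():
        overlap = len(terms & set(text.split()))
        if overlap:
            scores[doc_id] = overlap
    # Highest overlap first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    print(translate_query("bank loan"))
    print(retrieve("bank loan"))
```

In this sketch the query "bank loan" also retrieves the riverbank document `d2`, because "bank" is translated into both of its senses and the two-word query offers little context for choosing between them. That is the ambiguity problem the abstract describes.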
