Publication | Closed Access
Preliminary study into query translation for patent retrieval
Citations: 19 | References: 31 | Year: 2010
Unknown Venue
Natural Language Processing, Patent Retrieval, Engineering, Information Retrieval, Information Needs, Computational Linguistics, Linguistics, Terminology Extraction, Intellectual Property, Cross-language Retrieval, Query Expansion, Language Studies, Prior Art Search, Corpus Linguistics, Text Mining, Machine Translation, Patent Analysis
Patent retrieval is a branch of Information Retrieval (IR) that aims to support patent professionals in retrieving patents that satisfy their information needs. Patent granting bodies often require patents to be partially translated into one or more major foreign languages, so that language boundaries do not hinder their accessibility. This multilinguality of patent collections offers opportunities for improving patent retrieval. In this work we exploit these opportunities by applying query translation to patent retrieval. We expand monolingual patent queries with their translations, using both a domain-specific patent dictionary that we extract from the patent collection and a general domain-free dictionary. Experimental evaluation on a standard CLEF-IP dataset shows that the two translation dictionaries yield similar results: query translation can help patent retrieval, but not always, and without great improvement over standard statistical monolingual query expansion (Rocchio). The improvement is greater when the source language is English rather than French or German, a finding due partly to the effect of complex French and German morphology on translation accuracy and partly to the prevalence of English in the collection. A thorough per-query analysis reveals that cases where standard query expansion fails (e.g., zero recall) can benefit from query translation.
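The idea of expanding a monolingual query with its translations can be illustrated with a minimal sketch. This is not the authors' implementation: the toy dictionary, the function name `expand_query`, and the `translation_weight` parameter are all hypothetical, chosen only to show how dictionary translations might be added to a query as down-weighted extra terms.

```python
from collections import Counter

# Hypothetical toy dictionary (English -> French/German terms).
# Illustrative only; not taken from the paper or from CLEF-IP.
TOY_DICTIONARY = {
    "valve": ["soupape", "ventil"],
    "pressure": ["pression", "druck"],
}


def expand_query(terms, dictionary, translation_weight=0.5):
    """Return a weighted bag of terms: the originals plus their translations.

    Original terms get weight 1.0 per occurrence; each dictionary
    translation gets `translation_weight` (an assumed value).
    Terms missing from the dictionary are kept unchanged.
    """
    weights = Counter()
    for term in terms:
        key = term.lower()
        weights[key] += 1.0
        for translation in dictionary.get(key, []):
            weights[translation] += translation_weight
    return dict(weights)


expanded = expand_query(["pressure", "valve", "assembly"], TOY_DICTIONARY)
```

The resulting weighted term set could then be passed to any retrieval model that accepts term weights. Terms without a dictionary entry, such as "assembly" here, stay in the query untranslated, which is one way a query can fail to benefit from translation.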