Concepedia

Publication | Closed Access

Similarity measures for short queries

Citations: 50

References: 0

Year: 1995

Abstract

Ad-hoc queries are usually short, of perhaps two to ten terms. However, in previous rounds of TREC we have concentrated on obtaining optimal performance for the long TREC topics. In this paper we investigate the behaviour of similarity measures on short queries, and show experimentally that two successful measures (which give similar, good performance on long TREC topics) do not work well for short queries. We explore methods for achieving greater effectiveness for short queries, and conclude that a successful approach is to combine these similarity measures with other evidence. We also briefly describe our experiments with the Spanish data.

1 Introduction

Users of interactive text retrieval systems often pose short queries, typically of a few words only. Our experience with users of real systems who pose such queries is that they are unsatisfied with the behaviour of standard similarity measures such as cosine, reporting that the answers often have little apparent correspondence t...