Concepedia

TLDR

The study develops a method for predicting query performance without relevance information. A clarity score, computed as the relative entropy between a query language model and the collection language model, quantifies the coherence of the language in documents likely to generate the query, and thus the query's ambiguity with respect to the collection. Clarity scores correlate positively with average precision across multiple TREC test sets, so low scores can, on average, flag ineffective queries. The study also introduces an algorithm that automatically sets the clarity threshold separating predicted poorly-performing queries from acceptable ones, validated on TREC data by comparison with optimal thresholds and with sampling experiments that randomly assign queries to the two classes.
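In symbols, the clarity score sketched above is the relative entropy (KL divergence, in bits) between the query language model and the collection model, where the query model is estimated from documents likely to have generated the query:

\[
\mathrm{clarity}(Q) \;=\; \sum_{w} P(w \mid Q)\,\log_2 \frac{P(w \mid Q)}{P_{\mathrm{coll}}(w)},
\qquad
P(w \mid Q) \;=\; \sum_{D \in R} P(w \mid D)\, P(D \mid Q).
\]

Here the outer sum runs over the vocabulary, \(P_{\mathrm{coll}}(w)\) is the relative frequency of \(w\) in the whole collection, and \(R\) is the set of documents used to estimate the query model.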

Abstract

We develop a method for predicting query performance by computing the relative entropy between a query language model and the corresponding collection language model. The resulting clarity score measures the coherence of the language usage in documents whose models are likely to generate the query. We suggest that clarity scores measure the ambiguity of a query with respect to a collection of documents and show that they correlate positively with average precision in a variety of TREC test sets. Thus, the clarity score may be used to identify ineffective queries, on average, without relevance information. We develop an algorithm for automatically setting the clarity score threshold between predicted poorly-performing queries and acceptable queries and validate it using TREC data. In particular, we compare the automatic thresholds to optimum thresholds and also check how frequently results as good are achieved in sampling experiments that randomly assign queries to the two classes.
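To make the computation concrete, here is a minimal sketch of a clarity score in Python. It follows the structure described above (a query model mixed from smoothed document models, then KL divergence to the collection model), but the Jelinek-Mercer smoothing weight `lam=0.6`, the uniform document prior, and the toy tokenized corpus are illustrative assumptions, not the paper's exact estimation details.

```python
import math
from collections import Counter

def collection_model(docs):
    """Unigram maximum-likelihood model over the whole collection."""
    counts = Counter(w for d in docs for w in d)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def smoothed_doc_prob(doc_counts, doc_len, word, coll_probs, lam=0.6):
    """Linearly smoothed document model P(w|D).
    (Jelinek-Mercer mixture; the smoothing choice is an assumption.)"""
    return lam * doc_counts.get(word, 0) / doc_len + (1 - lam) * coll_probs.get(word, 0.0)

def clarity_score(query, docs, lam=0.6):
    """Relative entropy (in bits) between the query model P(w|Q)
    and the collection model P_coll(w)."""
    coll = collection_model(docs)
    doc_stats = [(Counter(d), len(d)) for d in docs]
    # P(D|Q) proportional to the query likelihood under each document model,
    # assuming a uniform prior over documents.
    weights = []
    for counts, n in doc_stats:
        likelihood = 1.0
        for q in query:
            likelihood *= smoothed_doc_prob(counts, n, q, coll, lam)
        weights.append(likelihood)
    z = sum(weights)
    weights = [w / z for w in weights]
    # P(w|Q) = sum_D P(w|D) P(D|Q), then KL divergence to the collection.
    score = 0.0
    for w, p_coll in coll.items():
        p_wq = sum(wt * smoothed_doc_prob(c, n, w, coll, lam)
                   for wt, (c, n) in zip(weights, doc_stats))
        if p_wq > 0:
            score += p_wq * math.log2(p_wq / p_coll)
    return score
```

On a toy collection, a query whose likely documents use focused vocabulary (e.g. a single finance-themed document) yields a higher clarity score than a query whose probability mass is spread over documents with collection-typical language, matching the intended behavior of the score as an ambiguity measure.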