Publication | Closed Access
Evaluation strategies for top-k queries over memory-resident inverted indexes
Citations: 66
References: 18
Year: 2011
Engineering, Online Advertising Systems, Corpus Linguistics, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Data Mining, Online Advertising, Query Expansion, Data Management, Search Technology, TAAT Algorithms, Evaluation Strategies, Knowledge Discovery, Text Indexing, Computer Science, Search Engine Design, Query Optimization, Data Indexing, Search Engine Indexing, Indexing Technique
Top-k retrieval over main-memory inverted indexes is at the core of many modern applications, from large-scale web search and advertising platforms to text extenders and content management systems. In these systems, queries are evaluated using two major families of algorithms: document-at-a-time (DAAT) and term-at-a-time (TAAT). DAAT and TAAT algorithms have been studied extensively in the research literature, but mostly in disk-based settings. In this paper, we present an analysis and comparison of several DAAT and TAAT algorithms used in Yahoo!'s production platform for online advertising. The low-latency requirements of online advertising systems mandate memory-resident indexes. We compare the performance of several query evaluation algorithms using two real-world ad selection datasets and query workloads. We show how some adaptations of the original algorithms to the main-memory setting have yielded significant performance improvements, reducing running time and cost of serving by 60% in some cases. In these results, both the original and the adapted algorithms were evaluated over memory-resident indexes, so the improvements are algorithmic and not an artifact of running the experiments on main-memory indexes.
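To make the DAAT/TAAT distinction concrete, here is a minimal Python sketch of the two traversal orders in their naive, textbook form. It is not the Yahoo! implementation and does not reproduce any of the adapted algorithms evaluated in the paper. The index layout (a dict mapping each term to a posting list of `(doc_id, weight)` pairs sorted by `doc_id`), the additive scoring, the tie-breaking rule, and the names `taat_top_k` and `daat_top_k` are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def taat_top_k(index, query_terms, k):
    """Term-at-a-time: process one posting list fully before the next,
    keeping a partial score (accumulator) for every document seen."""
    accumulators = defaultdict(float)
    for term in query_terms:
        for doc_id, weight in index.get(term, []):
            accumulators[doc_id] += weight
    # Highest score first; ties go to the smaller doc_id (assumed rule).
    return heapq.nlargest(
        k, accumulators.items(), key=lambda item: (item[1], -item[0])
    )


def daat_top_k(index, query_terms, k):
    """Document-at-a-time: walk all posting lists in parallel in doc_id
    order, so each document is fully scored before the next is touched."""
    if k <= 0:
        return []
    postings = [index.get(term, []) for term in query_terms]
    # Posting lists are assumed sorted by doc_id, so a k-way merge
    # yields all postings of one document consecutively.
    merged = heapq.merge(*postings, key=itemgetter(0))
    heap = []  # min-heap of (score, -doc_id) holding the current top k
    for doc_id, group in groupby(merged, key=itemgetter(0)):
        score = sum(weight for _, weight in group)
        entry = (score, -doc_id)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            heapq.heapreplace(heap, entry)
    return [(-neg_id, score) for score, neg_id in sorted(heap, reverse=True)]
```

Both functions return the same ranking for the same input. The difference is in what they hold in memory while running: the TAAT sketch keeps an accumulator for every candidate document, while the DAAT sketch keeps only a heap of size k plus one cursor per posting list.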