Publication | Closed Access
Bayesian Browsing Model
Citations: 19
References: 37
Year: 2010
Bayesian Statistic, Engineering, Query Model, Optimal Scalability, Semantic Web, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Data Mining, Query Expansion, Statistics, Search Technology, Knowledge Discovery, Personalized Search, Probability Theory, Computer Science, Bayesian Browsing Model, Query Analysis, Search Engine Design, Search Log
A fundamental challenge in utilizing Web search click data is to infer user-perceived relevance from the search log. Not only is the inference a difficult problem involving statistical reasoning, but the bulky size of the log data, together with its ever-increasing nature, imposes extra requirements on scalability. In this paper, we propose the Bayesian Browsing Model (BBM), which performs exact inference of document relevance, requires only a single pass over the data (i.e., optimal scalability), and is shown to be effective. We present two sets of experiments to evaluate the model's effectiveness and scalability. On the first set, of over 50 million search instances of 1.1 million distinct queries, BBM outperforms the state-of-the-art competitor by 29.2% in log-likelihood while being 57 times faster. On the second click log set, spanning a quarter of a petabyte, we showcase the scalability of BBM: we implemented it on a commercial MapReduce cluster, and it took only 3 hours to compute the relevance for 1.15 billion distinct query-URL pairs.