Publication | Closed Access
Efficient answering of set containment queries for skewed item distributions
25
Citations
42
References
2011
Year
Unknown Venue
EngineeringRange QueriesInformation RetrievalData ScienceData MiningEfficient AnsweringManagementData IntegrationSemi-structured DataData ManagementStatisticsOrdered Inverted FileKnowledge DiscoveryProbability TheoryComputer ScienceBig Data SearchDatabase TheoryQuery OptimizationData IndexingApproximate Query AnsweringContainment Queries
In this paper we address the problem of efficiently evaluating containment (i.e., subset, equality, and superset) queries over set-valued data. We propose a novel indexing scheme, the Ordered Inverted File (OIF) which, differently from the state-of-the-art, indexes set-valued attributes in an ordered fashion. We introduce query processing algorithms that practically treat containment queries as range queries over the ordered postings lists of OIF and exploit this ordering to quickly prune unnecessary page accesses. OIF is simple to implement and our experiments on both real and synthetic data show that it greatly outperforms the current state-of-the-art methods for all three classes of containment queries.
| Year | Citations | |
|---|---|---|
Page 1
Page 1