Publication | Closed Access
On a model of indexability and its bounds for range queries
66
Citations
43
References
2002
Year
Storage PerformanceEngineeringStorage RedundancyRange QueriesComputer ArchitectureComputational ComplexityStorage StructureIndexing SchemeRange SearchingInformation RetrievalData ScienceDiscrete MathematicsParallel ComputingCombinatorial OptimizationData ManagementIndexing WorkloadComputer EngineeringComputer ScienceQuery OptimizationExternal-memory AlgorithmData IndexingParallel ProgrammingSearch Engine IndexingIndexing TechniqueIn-storage Computing
We develop a theoretical framework to characterize the hardness of indexing data sets on block-access memory devices like hard disks. We define an indexing workload by a data set and a set of potential queries. For a workload, we can construct an indexing scheme, which is a collection of fixed-sized subsets of the data. We identify two measures of efficiency for an indexing scheme on a workload: storage redundancy, r (how many times each item in the data set is stored), and access overhead, A (how many times more blocks than necessary does a query retrieve).For many interesting families of workloads, there exists a trade-off between storage redundancy and access overhead. Given a desired access overhead A , there is a minimum redundancy that any indexing scheme must exhibit. We prove a lower-bound theorem for deriving the minimum redundancy. By applying this theorem, we show interesting upper and lower bounds and trade-offs between A and r in the case of multidimensional range queries and set queries.
| Year | Citations | |
|---|---|---|
Page 1
Page 1