Concepedia

Publication | Closed Access

Optimal Histograms with Quality Guarantees

381

Citations

11

References

1998

Year

Abstract

Histograms are commonly used to capture attribute value distribution statistics for query optimizers. More recently, histograms have also been considered as a way to produce quick approximate answers to decision support queries. This widespread interest in histograms motivates the problem of computing histograms that are good under a given error metric. In particular, we are interested in an efficient algorithm for choosing the bucket boundaries in a way that either minimizes the estimation error for a given amount of space (number of buckets) or, conversely, minimizes the space needed for a given upper bound on the error. Under the assumption that finding optimal bucket boundaries is computationally inefficient, previous research has focused on heuristics with no provable bounds on the quality of the solutions. In this paper, we present algorithms for computing optimal bucket boundaries in time proportional to the square of the number of distinct data values, for a ...

References

YearCitations

Page 1