Publication | Closed Access
Space Efficient Streaming Algorithms for the Maximum Error Histogram
46
Citations
20
References
2007
Year
Unknown Venue
Cluster ComputingEngineeringData ScienceOptimal B-bucket HistogramStreaming EngineMaximum Error HistogramStreaming AlgorithmMaximum ErrorParallel ProgrammingComputer ScienceData Stream ManagementData Streaming ArchitectureParallel ComputingStreaming DataData ManagementSignal ProcessingB-bucket HistogramStream Processing
We propose new algorithms for constructing maximum error (L <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">∞</sub> ) histograms in the data stream model. Our first algorithm (Min-Merge) achieves the following performance guarantee: using O(B) memory, it constructs a 2B-bucket histogram whose approximation error is at most the error of the optimal B-bucket histogram. Our second algorithm (Min-Increment) achieves a (1 + ε)-approximation of a B-bucket histogram using O(ε <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">-1</sup> B log U) space, where U is the size of the domain for data values. The memory requirements of these algorithms are a significant improvement over the previous best schemes for constructing near-optimal histograms in the data stream model, making them ideal for data summary applications where memory is at a premium, such as wireless sensor networks. Our Min-Increment algorithm also extends to the sliding window model without any asymptotic increase in space. Finally, using synthetic and real-world data, we show that our algorithms are indeed as space-efficient in practice as their theoretical analysis predicts - compared to previous best algorithms, they require two or more orders of magnitude less memory for the same approximation error.
| Year | Citations | |
|---|---|---|
Page 1
Page 1