An array-based algorithm for simultaneous multidimensional aggregates

TLDR

Computing multiple related group‑by aggregates is a core OLAP operation, and the Cube operator computes all subset aggregations, yet while efficient ROLAP algorithms exist, no literature addresses computing the Cube in MOLAP systems that use sparse arrays. The study proposes a MOLAP algorithm for computing the Cube and compares it to a leading ROLAP algorithm. The algorithm processes data stored in sparse arrays, performing position‑based aggregation to compute all subset aggregates and then converting results back to table form. Tests show that, with appropriate compression, the MOLAP algorithm is significantly faster than the ROLAP algorithm, to the extent that it could also accelerate ROLAP systems by converting tables to arrays, cubing, and converting back.

Abstract

Computing multiple related group-bys and aggregates is one of the core operations of On-Line Analytical Processing (OLAP) applications. Recently, Gray et al. [GBLP95] proposed the "Cube" operator, which computes group-by aggregations over all possible subsets of the specified dimensions. The rapid acceptance of the importance of this operator has led to a variant of the Cube being proposed for the SQL standard. Several efficient algorithms for Relational OLAP (ROLAP) have been developed to compute the Cube. However, to our knowledge there is nothing in the literature on how to compute the Cube for Multidimensional OLAP (MOLAP) systems, which store their data in sparse arrays rather than in tables. In this paper, we present a MOLAP algorithm to compute the Cube, and compare it to a leading ROLAP algorithm. The comparison between the two is interesting, since although they are computing the same function, one is value-based (the ROLAP algorithm) whereas the other is position-based (the MOLAP algorithm). Our tests show that, given appropriate compression techniques, the MOLAP algorithm is significantly faster than the ROLAP algorithm. In fact, the difference is so pronounced that this MOLAP algorithm may be useful for ROLAP systems as well as MOLAP systems, since in many cases, instead of cubing a table directly, it is faster to first convert the table to an array, cube the array, then convert the result back to a table.

References

Page 1

	Year	Citations

Page 1