Publication | Closed Access

Maintenance of data cubes and summary tables in a warehouse

Citations: 185

References: 20

Year: 1997

TLDR

Data warehouses store large volumes of heterogeneous data and rely on OLAP, which requires many complex aggregate queries that cannot be recomputed on the fly, so materialized summary tables are built and must be updated when source data changes. This study introduces the summary‑delta table method to efficiently maintain large numbers of summary tables, minimizing batch windows and handling many tables over the same base data. The method loads source changes daily in a batch window, updates summary tables while the warehouse is temporarily unavailable for querying, and uses delta tables to reduce maintenance time.

Abstract

Data warehouses contain large amounts of information, often collected from a variety of independent sources. Decision-support functions in a warehouse, such as on-line analytical processing (OLAP), involve hundreds of complex aggregate queries over large volumes of data. It is not feasible to compute these queries by scanning the data sets each time. Warehouse applications therefore build a large number of summary tables, or materialized aggregate views, to help them increase the system performance.

As changes, most notably new transactional data, are collected at the data sources, all summary tables at the warehouse that depend upon this data need to be updated. Usually, source changes are loaded into the warehouse at regular intervals, usually once a day, in a batch window, and the warehouse is made unavailable for querying while it is updated. Since the number of summary tables that need to be maintained is often large, a critical issue for data warehousing is how to maintain the summary tables efficiently.

In this paper we propose a method of maintaining aggregate views (the summary-delta table method), and use it to solve two problems in maintaining summary tables in a warehouse: (1) how to efficiently maintain a summary table while minimizing the batch window needed for maintenance, and (2) how to maintain a large set of summary tables defined over the same base tables.

While several papers have addressed the issues relating to choosing and materializing a set of summary tables, this is the first paper to address maintaining summary tables efficiently.
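The abstract does not spell out the mechanics of the summary-delta table method, so the following is only a minimal sketch of the general idea, under stated assumptions: a batch of source changes is first aggregated into a small "delta" keyed like the summary table, and that delta is then applied to the summary table in a short refresh step. The function names (`compute_summary_delta`, `refresh`, `rollup_delta`), the (count, sum) aggregate, and the store/day/qty columns are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
from collections import defaultdict

def compute_summary_delta(changes, group_key, measure):
    """Aggregate a batch of changes into net (count, sum) per group.

    `changes` is a list of (sign, row) pairs: sign is +1 for an inserted
    row and -1 for a deleted row. This step only reads the change batch,
    so in principle it can run before the batch window starts.
    """
    delta = defaultdict(lambda: [0, 0])
    for sign, row in changes:
        key = tuple(row[col] for col in group_key)
        delta[key][0] += sign                  # net change in row count
        delta[key][1] += sign * row[measure]   # net change in the sum
    return {key: tuple(vals) for key, vals in delta.items()}

def refresh(summary, delta):
    """Apply a summary delta to a summary table stored as {key: (count, sum)}."""
    for key, (d_count, d_sum) in delta.items():
        count, total = summary.get(key, (0, 0))
        count, total = count + d_count, total + d_sum
        if count == 0:
            summary.pop(key, None)   # no base rows left in this group
        else:
            summary[key] = (count, total)
    return summary

def rollup_delta(delta, keep):
    """Derive a coarser delta from a finer one (assumed approach for
    maintaining several summary tables over the same base data)."""
    coarse = defaultdict(lambda: [0, 0])
    for key, (d_count, d_sum) in delta.items():
        coarse_key = tuple(key[i] for i in keep)
        coarse[coarse_key][0] += d_count
        coarse[coarse_key][1] += d_sum
    return {key: tuple(vals) for key, vals in coarse.items()}

# Hypothetical data: a summary grouped by (store, day) holding (count, sum of qty).
summary = {("s1", "day1"): (2, 30)}
changes = [
    (+1, {"store": "s1", "day": "day1", "qty": 5}),
    (-1, {"store": "s1", "day": "day1", "qty": 10}),
    (+1, {"store": "s2", "day": "day1", "qty": 7}),
]
delta = compute_summary_delta(changes, ("store", "day"), "qty")
refresh(summary, delta)
# summary is now {("s1", "day1"): (2, 25), ("s2", "day1"): (1, 7)}
```

In this sketch, only `refresh` touches the summary table, which is why pre-aggregating the changes could shorten the time the warehouse is unavailable. Sums and counts are used because they can be adjusted from a delta alone; aggregates such as minimum or maximum would need extra handling when rows are deleted.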
