Concepedia

TLDR

Data warehousing systems let managers integrate data from heterogeneous sources and query very large databases efficiently, but building a warehouse calls for design techniques different from those of operational systems, and a sound conceptual design is needed for the result to be well documented and to satisfy requirements. The paper formalizes the Dimensional Fact model, a graphical conceptual model for data warehouses, and proposes a semi-automated methodology to build it from the existing conceptual or logical schemes of the enterprise relational database. The model represents reality as a set of fact schemes made up of facts, measures, attributes, dimensions, and hierarchies; fact schemes can also express the additivity of fact attributes along dimensions, the optionality of dimension attributes, and non-dimension attributes. Compatible fact schemes can be overlapped to support drill-across queries, and a simple language that denotes queries as sets of fact instances lets the expected workload be attached to the schemes as input to logical and physical design.

Abstract

Data warehousing systems enable enterprise managers to acquire and integrate information from heterogeneous sources and to query very large databases efficiently. Building a data warehouse requires adopting design and implementation techniques completely different from those underlying operational information systems. Though most scientific literature on the design of data warehouses concerns their logical and physical models, an accurate conceptual design is the necessary foundation for building a data warehouse that is well-documented and fully satisfies requirements. In this paper we formalize a graphical conceptual model for data warehouses, called the Dimensional Fact model, and propose a semi-automated methodology to build it from the pre-existing (conceptual or logical) schemes describing the enterprise relational database. The representation of reality built using our conceptual model consists of a set of fact schemes whose basic elements are facts, measures, attributes, dimensions and hierarchies; other features which may be represented on fact schemes are the additivity of fact attributes along dimensions, the optionality of dimension attributes and the existence of non-dimension attributes. Compatible fact schemes may be overlapped in order to relate and compare data for drill-across queries. Fact schemes should be integrated with information on the conjectured workload, to be used as the input of logical and physical design phases; to this end, we propose a simple language to denote data warehouse queries in terms of sets of fact instances.
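
As a rough illustration of the elements listed above, the following minimal Python sketch models a fact scheme as a plain data structure. It is not the paper's formalization: the class names, the SALE and INVENTORY schemes, their measures and hierarchies, and the `shared_dimensions` helper are all hypothetical, and treating "compatible" as "having at least one dimension in common" is a simplifying assumption made only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hierarchy:
    """A dimension plus the coarser attributes it rolls up to (hypothetical)."""
    dimension: str
    levels: tuple[str, ...] = ()


@dataclass
class FactScheme:
    """Simplified stand-in for a fact scheme: a fact, its measures, its hierarchies."""
    name: str
    measures: set[str]
    hierarchies: list[Hierarchy]
    # (measure, dimension) pairs along which summing is NOT meaningful.
    non_additive: set[tuple[str, str]] = field(default_factory=set)

    def dimensions(self) -> set[str]:
        return {h.dimension for h in self.hierarchies}

    def is_additive(self, measure: str, dimension: str) -> bool:
        return (measure, dimension) not in self.non_additive


def shared_dimensions(a: FactScheme, b: FactScheme) -> set[str]:
    """Dimensions two schemes have in common (assumed basis for drill-across)."""
    return a.dimensions() & b.dimensions()


# Hypothetical example schemes, invented for illustration.
sale = FactScheme(
    name="SALE",
    measures={"qty_sold", "revenue"},
    hierarchies=[
        Hierarchy("date", ("month", "year")),
        Hierarchy("product", ("type", "category")),
        Hierarchy("store", ("city", "country")),
    ],
)
inventory = FactScheme(
    name="INVENTORY",
    measures={"stock_level"},
    hierarchies=[
        Hierarchy("date", ("month", "year")),
        Hierarchy("product", ("type", "category")),
        Hierarchy("warehouse", ("city",)),
    ],
    # A stock level cannot be meaningfully summed over time.
    non_additive={("stock_level", "date")},
)

print(sorted(shared_dimensions(sale, inventory)))
print(inventory.is_additive("stock_level", "date"))
```

In this sketch, a drill-across query comparing sales with inventory would be restricted to the dimensions the two schemes share, and the non-additivity flag records that a measure such as a stock level should not be summed along the date dimension.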
