Publication | Open Access

Optimizing ETL Processes in Data Warehouses

Citations: 205

References: 10

Year: 2005

TLDR

ETL tools extract, cleanse, customize, and load data into warehouses, and they usually must finish within a fixed time window, which makes execution-time optimization essential. This study addresses the logical optimization of ETL processes by modeling it as a state-space search problem: each ETL workflow is treated as a state, and the state space is constructed through a set of correct state transitions. The authors present algorithms aimed at minimizing the execution cost of an ETL workflow.

Abstract

Extraction-transformation-loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization and insertion into a data warehouse. Usually, these processes must be completed in a certain time window; thus, it is necessary to optimize their execution time. In this paper, we delve into the logical optimization of ETL processes, modeling it as a state-space search problem. We consider each ETL workflow as a state and fabricate the state space through a set of correct state transitions. Moreover, we provide algorithms towards the minimization of the execution cost of an ETL workflow.
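
To make the state-space framing concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a purely linear workflow, a single hypothetical transition (swapping two adjacent activities), and a toy cost model in which each activity has a per-row cost and a selectivity. The names `Activity`, `cost`, `transitions`, `optimize`, and `must_precede` are illustrative assumptions, as are the sample activities and numbers.

```python
from collections import deque
from typing import FrozenSet, Iterator, NamedTuple, Tuple


class Activity(NamedTuple):
    name: str
    cost_per_row: float  # assumed cost of processing one input row
    selectivity: float   # assumed fraction of input rows passed downstream


Workflow = Tuple[Activity, ...]  # a state: one ordering of the activities


def cost(workflow: Workflow, input_rows: float) -> float:
    """Toy cost model: sum of (rows reaching each activity) * (its per-row cost)."""
    total, rows = 0.0, input_rows
    for activity in workflow:
        total += rows * activity.cost_per_row
        rows *= activity.selectivity
    return total


def transitions(workflow: Workflow,
                must_precede: FrozenSet[Tuple[str, str]]) -> Iterator[Workflow]:
    """Yield neighbor states by swapping adjacent activities.

    A swap is skipped when (a, b) is in must_precede, i.e. when activity a
    is required to stay before activity b (a stand-in for a correctness check).
    """
    for i in range(len(workflow) - 1):
        a, b = workflow[i], workflow[i + 1]
        if (a.name, b.name) in must_precede:
            continue
        yield workflow[:i] + (b, a) + workflow[i + 2:]


def optimize(initial: Workflow, input_rows: float,
             must_precede: FrozenSet[Tuple[str, str]] = frozenset()
             ) -> Tuple[Workflow, float]:
    """Exhaustively explore the reachable states and return the cheapest one."""
    best, best_cost = initial, cost(initial, input_rows)
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for nxt in transitions(state, must_precede):
            if nxt in seen:
                continue
            seen.add(nxt)
            queue.append(nxt)
            nxt_cost = cost(nxt, input_rows)
            if nxt_cost < best_cost:
                best, best_cost = nxt, nxt_cost
    return best, best_cost


# Hypothetical three-activity workflow over 1000 input rows.
initial = (
    Activity("convert_currency", cost_per_row=2.0, selectivity=1.0),
    Activity("filter_not_null", cost_per_row=1.0, selectivity=0.5),
    Activity("lookup_key", cost_per_row=3.0, selectivity=0.75),
)
best, best_cost = optimize(initial, 1000)
print([a.name for a in best], best_cost)
```

Under this toy model the search moves the selective activities ahead of the expensive non-selective one, lowering the cost from 4500.0 to 3250.0. Exhaustive exploration grows factorially with the number of activities, so it is only practical for small workflows; the sketch illustrates the problem framing, not a scalable search strategy.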
