WorldCist'17 - 5th World Conference on Information Systems and Technologies

A Process Mining Approach for Discovering ETL Black Points

One of the most critical activities in developing a data warehousing system is implementing its populating process, ETL (Extract-Transform-Load). This process is not easy to implement. The number of variables involved makes it immensely difficult, from design through implementation and testing. Many of the difficulties come from the number of information sources we need to work with, the heterogeneity and dispersion of the data, and the complexity of the tasks that must be implemented to populate the data warehouse appropriately. ETL tasks are usually quite complex and their interconnections very entangled, which often leads to a large and very complex network of working processes. Thus, undesirable situations related to ETL system design errors, or to the implementation of faulty or inefficient tasks, can easily occur. Many of these situations are detectable only at run time. In this paper, we discuss in particular the case of ETL bottleneck situations (ETL black points), which can occur during the execution of an ETL system, and we identify and characterize them using process mining. Based on the analysis of the process mining results, it is possible to develop alternative implementations for inefficient tasks and improve overall system performance.
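
As a rough illustration of the idea of locating bottleneck tasks from execution data, the sketch below computes the mean duration of each task from an event log and flags the tasks that account for a large share of the total. It is not the method used in the paper: the log layout, the task names, the timestamps, and the 50% threshold are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical ETL event log: (run id, task name, start, end).
# All task names and timestamps are invented for illustration.
EVENT_LOG = [
    ("run1", "extract_sales", "2017-01-01 00:00:00", "2017-01-01 00:02:00"),
    ("run1", "clean_customers", "2017-01-01 00:02:00", "2017-01-01 00:03:00"),
    ("run1", "load_fact", "2017-01-01 00:03:00", "2017-01-01 00:23:00"),
    ("run2", "extract_sales", "2017-01-02 00:00:00", "2017-01-02 00:03:00"),
    ("run2", "clean_customers", "2017-01-02 00:03:00", "2017-01-02 00:04:00"),
    ("run2", "load_fact", "2017-01-02 00:04:00", "2017-01-02 00:34:00"),
]

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def mean_task_durations(log):
    """Return the mean duration in seconds of each task across all runs."""
    durations = defaultdict(list)
    for _run, task, start, end in log:
        delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
        durations[task].append(delta.total_seconds())
    return {task: mean(values) for task, values in durations.items()}


def black_points(log, share=0.5):
    """Return tasks whose mean duration is at least `share` of the summed means.

    The threshold is an assumed heuristic, not a definition from the paper.
    """
    means = mean_task_durations(log)
    total = sum(means.values())
    if total == 0:
        return []
    return sorted(task for task, value in means.items() if value / total >= share)


if __name__ == "__main__":
    print(mean_task_durations(EVENT_LOG))
    print(black_points(EVENT_LOG))
```

A process mining tool would typically go further, discovering the control flow between tasks and attaching waiting times to the transitions; this sketch only covers the per-task timing step.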

Author(s):

Orlando Belo    
University of Minho
Portugal

Nuno Dias    
University of Minho
Portugal

Carlos Ferreira    
University of Minho
Portugal

Filipe Pinto    
University of Minho
Portugal

 
