Data Quality in Data warehouse: problems and solution
Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 1
Abstract
In recent years, corporate scandals, regulatory changes, and the collapse of major financial institutions have brought much-warranted attention to the quality of enterprise data. If we can better understand the causes of quality issues, we can develop a plan of action to address them that is both proactive and strategic. Each instance of a quality issue presents challenges both in identifying where problems exist and in quantifying their extent. Quantifying the issues is important for determining where our efforts should be focused. It has been reported that more than $2 billion of U.S. federal loan money was lost because of poor data quality at a single agency. It has also been reported that manufacturing companies spent over 25% of their sales on wasteful practices. Over time, many researchers have studied data quality issues, but no work has collected the causes of data quality problems across all phases of data warehousing together with their possible solutions. This paper discusses problems in the different phases of data warehousing, i.e., data sources, data integration and data profiling, data staging and ETL, and data warehouse modeling and schema design. The purpose of the paper is to identify the reasons for data deficiencies, non-availability, or reachability problems at all of the aforementioned stages of data warehousing, to give a classification of these causes, and to propose solutions for improving data quality through Statistical Process Control (SPC), quality engineering management, etc. I have identified a possible set of causes of data quality issues from an extensive literature review and in consultation with data warehouse practitioners working at a renowned IT company in India.
I hope this will help developers and implementers of data warehouses examine and analyze these issues before moving ahead with data integration and data warehouse solutions for quality decision-oriented and business-intelligence-oriented applications.
Authors and Affiliations
Rahul Kumar Pandey
Preliminary Design of A Model Computerised Economic Growth Monitoring System
Abstract: In the face of economic depression and technological advancements round the world, there is growing need to design a computerized monitoring system in a bid to adapt to the global trend in financial managem...
Personal Financial Assistant
Abstract: Managing our finances has always been an important part of our life. While it is easy to let situation define our expenses, it becomes exceedingly difficult to keep track of all the spending, and is not very ap...
Comparative Study of RBFS & ARBFS Algorithm
RBFS is a best-first search that runs in space that is linear with respect to the maximum search depth, regardless of the cost function used. This algorithm allows the use of all available memory. One major ...
Implementation of AES Algorithm in MicroController Using PIC18F452
Security has become an increasingly important feature with the growth of electronic communication, which calls for more advanced ways to encrypt the raw data [1]. AES-128 is going to be implemented as the encr...
Using Data-Mining Technique for Census Analysis to Give GeoSpatial Distribution of Nigeria.
There are patterns buried within the mass of data in the various editions of population census figures in this country. These are patterns that will be impossible for humans working with bare eyes and hands, to u...