Data Quality in Data warehouse: problems and solution

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 1

Abstract

In recent years, corporate scandals, regulatory changes, and the collapse of major financial institutions have brought much-warranted attention to the quality of enterprise data. If we can better understand data quality problems, we can develop a plan of action to address them that is both proactive and strategic. Each instance of a quality issue presents challenges in both identifying where problems exist and quantifying their extent. Quantifying the issues is important for determining where efforts should be focused. It has been reported that more than $2 billion of U.S. federal loan money was lost because of poor data quality at a single agency, and that manufacturing companies spent over 25% of their sales on wasteful practices. Over time, many researchers have addressed data quality issues, but no study has collectively gathered the causes of data quality problems across all phases of data warehousing along with their possible solutions. This paper discusses problems in the different phases of data warehousing, i.e., data sources, data integration and data profiling, data staging and ETL, and data warehouse modeling and schema design. The purpose of the paper is to identify the reasons for data deficiencies, non-availability, or reachability problems at all of these stages, to classify these causes, and to give solutions for improving data quality through Statistical Process Control (SPC), quality engineering management, etc. I have identified a possible set of causes of data quality issues from an extensive literature review and in consultation with data warehouse practitioners working at a renowned IT company in India.
I hope this will help developers and implementers of data warehouses to examine and analyze these issues before moving ahead with data integration and data warehouse solutions for quality decision-oriented and business intelligence-oriented applications.

Authors and Affiliations

Rahul Kumar Pandey

How To Cite

Rahul Kumar Pandey (2014). Data Quality in Data warehouse: problems and solution. IOSR Journals (IOSR Journal of Computer Engineering), 16(1), 18-24. https://europub.co.uk/articles/-A-152060