Comparative Study of Three Imputation Methods to Treat Missing Values
Journal Title: INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY - Year 2013, Vol 11, Issue 7
Abstract
One relevant problem in data preprocessing is the presence of missing data, which leads to poor quality in the patterns extracted by mining. Imputation is a widely used procedure that replaces the missing values in a data set with plausible values. The advantage of this approach is that the missing data treatment is independent of the learning algorithm used, which allows the user to select the most suitable imputation method for each situation. This paper analyzes various imputation methods proposed in the field of statistics with respect to data mining. A comparative analysis of three imputation approaches that can be used to impute missing attribute values in data mining is given to identify the most promising method. An artificial input data file (of numeric type) of 1000 records is used to investigate the performance of these methods. A Z-test approach was used to test the significance of these methods.
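To make the evaluation setup concrete, the following is a minimal sketch of the kind of experiment the abstract describes: an artificial numeric data set of 1000 records, an imputation step, and a Z-test on the result. The abstract does not name the three methods compared, so mean imputation is used here purely as an assumed stand-in. The 10% missing rate, the normal distribution, and all function names are also illustrative assumptions, not details from the paper.

```python
import math
import random
import statistics


def make_data(n=1000, missing_rate=0.1, seed=0):
    """Build an artificial numeric column and a copy with values removed at random.

    The distribution (mean 50, sd 10) and the missing rate are assumptions.
    """
    rng = random.Random(seed)
    complete = [rng.gauss(50.0, 10.0) for _ in range(n)]
    observed = [None if rng.random() < missing_rate else x for x in complete]
    return complete, observed


def mean_impute(values):
    """Replace each missing entry (None) with the mean of the observed entries."""
    present = [v for v in values if v is not None]
    fill = statistics.fmean(present)
    return [fill if v is None else v for v in values]


def two_sample_z(a, b):
    """Two-sample Z statistic for the difference in means.

    Treats a and b as independent samples, which is a simplification here
    because the imputed column shares most of its values with the original.
    """
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se


complete, observed = make_data()
imputed = mean_impute(observed)
z = two_sample_z(complete, imputed)
print(f"missing: {observed.count(None)}, z = {z:.3f}")
```

A |z| below 1.96 would suggest that the imputed column's mean does not differ significantly from the original at the 5% level. Other imputation methods could be compared by swapping out `mean_impute` and keeping the rest of the setup fixed.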
Authors and Affiliations
Rahul Singhai