Performance Analysis of Machine Learning Algorithms for Missing Value Imputation
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2018, Vol 9, Issue 6
Abstract
Data mining requires a pre-processing task in which the data are prepared, cleaned, integrated, transformed, reduced and discretized to ensure their quality. Missing values are a universal problem in many research domains and are commonly encountered during data cleaning. A missing value occurs when the stored value of a variable is absent for an observation. Missing values have undesirable effects on analysis results, especially when they lead to biased parameter estimates. Data imputation is a common way to deal with missing values, in which substitutes for the missing values are derived through statistical or machine learning techniques. Nevertheless, examining the strengths (and limitations) of these techniques is important for understanding their characteristics. In this paper, the performance of three machine learning classifiers (K-Nearest Neighbors (KNN), Decision Tree, and Bayesian Networks) is compared in terms of data imputation accuracy. The results show that, among the three classifiers, the Bayesian network has the most promising performance.
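As an illustration of what "data imputation accuracy" can mean for a classifier-based imputer, the following minimal sketch hides known values of one categorical attribute, fills them in by a KNN majority vote, and reports the fraction recovered correctly. It is not the authors' experimental setup: the toy records, the Hamming distance, the choice k=3, and the function names `knn_impute` and `imputation_accuracy` are all assumptions made for this example.

```python
from collections import Counter


def knn_impute(rows, target, k=3):
    """Fill None in column `target` by majority vote among the k nearest complete rows."""
    complete = [r for r in rows if r[target] is not None]
    imputed = []
    for r in rows:
        if r[target] is not None:
            imputed.append(list(r))
            continue

        def dist(other, r=r):
            # Hamming distance over every column except the one being imputed.
            return sum(a != b for i, (a, b) in enumerate(zip(r, other)) if i != target)

        neighbours = sorted(complete, key=dist)[:k]
        vote = Counter(n[target] for n in neighbours).most_common(1)[0][0]
        filled = list(r)
        filled[target] = vote
        imputed.append(filled)
    return imputed


def imputation_accuracy(truth, masked, imputed, target):
    """Fraction of hidden values in column `target` that were recovered exactly."""
    hidden = [i for i, r in enumerate(masked) if r[target] is None]
    correct = sum(imputed[i][target] == truth[i][target] for i in hidden)
    return correct / len(hidden)


# Hypothetical toy records: (outlook, temperature, rain).
truth = [
    ["sunny", "hot", "no"],
    ["sunny", "mild", "no"],
    ["rainy", "mild", "yes"],
    ["rainy", "cool", "yes"],
    ["sunny", "hot", "no"],
    ["rainy", "cool", "yes"],
]

# Hide the "rain" value in the last two records to simulate missing data.
masked = [list(r) for r in truth]
masked[4][2] = None
masked[5][2] = None

imputed = knn_impute(masked, target=2, k=3)
acc = imputation_accuracy(truth, masked, imputed, target=2)
print(f"imputation accuracy: {acc:.2f}")
```

The same masking-and-scoring procedure could be reused with a decision tree or a Bayesian network in place of the KNN vote, which is what makes the three classifiers comparable on a common measure.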
Authors and Affiliations
Nadzurah Zainal Abidin, Amelia Ritahani Ismail, Nurul A. Emran