Performance Analysis of Machine Learning Algorithms for Missing Value Imputation
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2018, Vol 9, Issue 6
Abstract
Data mining requires a pre-processing task in which the data are prepared, cleaned, integrated, transformed, reduced and discretized to ensure their quality. Missing values are a universal problem in many research domains and are commonly encountered during the data cleaning process. A missing value occurs when no value is stored for a variable in an observation. Missing values have an undesirable effect on analysis results, especially when they lead to biased parameter estimates. Data imputation is a common way to deal with missing values, in which substitutes for the missing values are found through statistical or machine learning techniques. Nevertheless, examining the strengths (and limitations) of these techniques is important for understanding their characteristics. In this paper, the performance of three machine learning classifiers (K-Nearest Neighbors (KNN), Decision Tree, and Bayesian Networks) is compared in terms of data imputation accuracy. The results show that, among the three classifiers, the Bayesian network has the most promising performance.
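To make the idea of classifier-based imputation and "imputation accuracy" concrete, the following is a minimal sketch, not the authors' experimental setup. It assumes a categorical attribute, a Hamming distance, a majority vote over k = 3 neighbours, and a small made-up dataset; the function names `knn_impute` and `imputation_accuracy` are hypothetical. Known values are hidden on purpose, imputed with KNN, and then compared against the hidden truth.

```python
from collections import Counter


def hamming(a, b, skip):
    """Count attributes that differ between rows a and b, ignoring column `skip`."""
    return sum(1 for i, (x, y) in enumerate(zip(a, b)) if i != skip and x != y)


def knn_impute(rows, target, k=3):
    """Fill None in column `target` by majority vote of the k nearest complete rows."""
    complete = [r for r in rows if r[target] is not None]
    filled = []
    for r in rows:
        if r[target] is not None:
            filled.append(tuple(r))
            continue
        # sorted() is stable, so ties in distance keep the original row order
        neighbours = sorted(complete, key=lambda c: hamming(r, c, target))[:k]
        vote = Counter(n[target] for n in neighbours).most_common(1)[0][0]
        filled.append(tuple(r[:target]) + (vote,) + tuple(r[target + 1:]))
    return filled


def imputation_accuracy(truth, imputed, masked_idx, target):
    """Fraction of deliberately hidden values that were recovered exactly."""
    hits = sum(1 for i in masked_idx if truth[i][target] == imputed[i][target])
    return hits / len(masked_idx)


# Made-up toy data (an assumption for illustration): (outlook, windy, play)
truth = [
    ("sunny", "no", "yes"),
    ("sunny", "no", "yes"),
    ("sunny", "yes", "yes"),
    ("rainy", "yes", "no"),
    ("rainy", "yes", "no"),
    ("rainy", "no", "no"),
    ("sunny", "no", "yes"),
    ("rainy", "yes", "no"),
]
target = 2            # impute the "play" column
masked_idx = [6, 7]   # rows whose true value is hidden before imputation

observed = [
    row[:target] + (None,) + row[target + 1:] if i in masked_idx else row
    for i, row in enumerate(truth)
]
imputed = knn_impute(observed, target, k=3)
accuracy = imputation_accuracy(truth, imputed, masked_idx, target)
print(accuracy)
```

The same mask-impute-compare loop could in principle be repeated with a decision tree or a Bayesian network in place of `knn_impute`, which is the kind of comparison the abstract describes.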
Authors and Affiliations
Nadzurah Zainal Abidin, Amelia Ritahani Ismail, Nurul A. Emran
RPOA Model-Based Optimal Resource Provisioning
Optimal utilization of resources is the core of the provisioning process in the cloud computing. Sometimes the local resources of a data center are not adequate to satisfy the users’ requirements. So, the providers need...
Construction Project Quality Management using Building Information Modeling 360 Field
A quality management process plays a vital role in the success of engineering and construction projects. The management process needs to be effective and efficient if projects are to be completed on time and within the p...
The cybercrime process : an overview of scientific challenges and methods
The aim of this article is to describe the cybercrime process and to identify all issues that appear at the different steps, between the detection of incident to the final report that must be exploitable for a judge. It...
Academic Emotions Affected by Robot Eye Color: An Investigation of Manipulability and Individual-Adaptability
We investigate whether academic emotions are affected by the color of a robot’s eyes in lecture behaviors. In conventional human-robot interaction research on robot lecturers, the emphasis has been on robots assisting or...
Applying Genetic Algorithms to Test JUH DBs Exceptions
Database represents an essential part of software applications. Many organizations use database as a repository for large amount of current and historical information. With this context testing database applications is a...