A Comparison of Five Methods for Missing Value Imputation in Data Sets

Journal Title: International Scientific and Vocational Studies Journal - Year 2018, Vol 2, Issue 2

Abstract

Missing values in data sets prevent accurate analysis. Consequently, the correct imputation of missing values has attracted considerable research attention in recent years. This paper compares the most reliable and up-to-date estimation methods for imputing missing values. Imputation has a very high priority because it affects subsequent pre-processing, data analysis, classification, clustering, and other steps. Root mean square error (RMSE), classification accuracy, and execution time are used to evaluate the performance of the five most popular methods (mean, k-nearest neighbors, singular value decomposition, Bayesian principal component analysis, and missForest). When the RMSE and classification accuracy values of the methods were compared, the missForest method was observed to outperform the other methods on all data sets.
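To make the RMSE criterion concrete, the following is a minimal sketch of how an imputation method could be scored: values are hidden at random from a complete data set, imputed, and compared against the held-out truth. It covers only the simplest of the five methods (mean imputation). The function names `mean_impute` and `imputation_rmse`, the synthetic data, and the 10% missing rate are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def mean_impute(X):
    """Replace each NaN with the mean of the observed values in its column."""
    X = np.asarray(X, dtype=float)
    col_means = np.nanmean(X, axis=0)      # per-column mean, ignoring NaNs
    filled = X.copy()
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = col_means[cols]
    return filled


def imputation_rmse(X_true, X_imputed, mask):
    """RMSE over the masked (artificially removed) entries only."""
    X_true = np.asarray(X_true, dtype=float)
    X_imputed = np.asarray(X_imputed, dtype=float)
    diff = X_true[mask] - X_imputed[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


# Hypothetical toy data: the data sets used in the paper are not reproduced here.
rng = np.random.default_rng(0)
X_true = rng.normal(size=(100, 4))
mask = rng.random(X_true.shape) < 0.1      # assumed 10% missing-at-random rate

X_missing = X_true.copy()
X_missing[mask] = np.nan

X_filled = mean_impute(X_missing)
rmse = imputation_rmse(X_true, X_filled, mask)
print(f"Mean-imputation RMSE on held-out entries: {rmse:.3f}")
```

The same `imputation_rmse` helper could score any other imputer that returns a completed matrix, which is what allows different methods to be compared on equal footing.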

Authors and Affiliations

Pınar Cihan

How To Cite

Pınar Cihan (2018). A Comparison of Five Methods for Missing Value Imputation in Data Sets. International Scientific and Vocational Studies Journal, 2(2), -. https://europub.co.uk/articles/-A-35735