A Comparison of Five Methods for Missing Value Imputation in Data Sets

Journal Title: International Scientific and Vocational Studies Journal - Year 2018, Vol 2, Issue 2

Abstract

Missing values in data sets prevent accurate analysis. Consequently, the correct imputation of missing values has attracted considerable attention from researchers in recent years. This paper compares the most reliable and up-to-date estimation methods for imputing missing values. Imputation of missing values has a very high priority because of its impact on subsequent pre-processing, data analysis, classification, clustering, etc. Root mean square error (RMSE), classification accuracy, and execution time are used to evaluate the performance of the five most popular methods (mean, k-nearest neighbors, singular value decomposition, Bayesian principal component analysis, and missForest). When the RMSE and classification accuracy values of the methods were compared, it was observed that the missForest method outperformed the other methods on all data sets.
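
The RMSE-based comparison described in the abstract can be illustrated with a small sketch. This is not the paper's code: the toy data set, the masked positions, the helper names (`mean_impute`, `knn_impute`, `rmse`) and the choice of k are all assumptions made for illustration, and only two of the five methods (mean and k-nearest neighbors) are shown. The idea is to hide some known values, impute them, and measure the error on the hidden cells only.

```python
import math

def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def knn_impute(rows, k=1):
    """Fill each None with the average of that column over the k nearest rows.

    Distance is the root mean squared difference over the columns that both
    rows have observed. Rows sharing no observed column are skipped; a cell
    with no usable neighbor is left as None (an assumption of this sketch).
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_idx, other in enumerate(rows):
                if other_idx == i or other[j] is None:
                    continue
                shared = [c for c in range(len(row))
                          if row[c] is not None and other[c] is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((row[c] - other[c]) ** 2 for c in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled

def rmse(truth, imputed, masked_cells):
    """Root mean square error computed over the masked cells only."""
    squared = [(truth[i][j] - imputed[i][j]) ** 2 for i, j in masked_cells]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical complete data set (second column is twice the first).
complete = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]

# Cells hidden to simulate missing values (chosen by hand for the example).
masked_cells = [(0, 1), (3, 0)]
with_missing = [list(r) for r in complete]
for i, j in masked_cells:
    with_missing[i][j] = None

rmse_mean = rmse(complete, mean_impute(with_missing), masked_cells)
rmse_knn = rmse(complete, knn_impute(with_missing, k=1), masked_cells)

print("mean imputation RMSE:", round(rmse_mean, 4))
print("kNN imputation RMSE: ", round(rmse_knn, 4))
```

On this toy data the neighbor-based fill gives a lower RMSE than the column mean because the two columns are strongly correlated; that says nothing about the paper's actual data sets or results. Classification accuracy and execution time, the other two criteria named in the abstract, would need a downstream classifier and a timer and are left out of this sketch.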

Authors and Affiliations

Pınar Cihan

How To Cite

Pınar Cihan (2018). A Comparison of Five Methods for Missing Value Imputation in Data Sets. International Scientific and Vocational Studies Journal, 2(2), -. https://europub.co.uk/articles/-A-35735