Regression model approach to predict missing values in the Excel sheet databases

Abstract

Pre-processing, in which the data are prepared for mining, is the most important stage of data mining. Real-world data tend to be incomplete, noisy, and inconsistent, so an important pre-processing task is to fill in missing values, smooth out noise, and correct inconsistencies. Missing values can be handled by ignoring the data row, filling in a global constant, filling in the attribute mean, filling in the attribute mean of all samples belonging to the same class, filling in the most probable value, or using a data mining algorithm to predict the value. We use the regression method to predict missing values; regression maps a data item to a real-valued prediction variable. All of these operations can also be carried out on an Excel sheet database.
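To make the contrast between mean filling and regression-based prediction concrete, here is a minimal sketch in Python. It is not the authors' implementation: it assumes the sheet has already been loaded into a list of row dictionaries, that a single numeric attribute is used as the predictor, and that a simple least-squares line is an adequate model. The column names "experience" and "salary" and all function names are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for the line y = a + b * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def impute_by_mean(rows, y_key):
    """Fill each missing y_key with the mean of the observed values."""
    observed = [r[y_key] for r in rows if r[y_key] is not None]
    fill = mean(observed)
    return [{**r, y_key: fill} if r[y_key] is None else dict(r) for r in rows]


def impute_by_regression(rows, x_key, y_key):
    """Fill each missing y_key with a prediction from a line fitted
    on the rows where y_key is present."""
    complete = [r for r in rows if r[y_key] is not None]
    a, b = fit_line([r[x_key] for r in complete],
                    [r[y_key] for r in complete])
    return [
        {**r, y_key: a + b * r[x_key]} if r[y_key] is None else dict(r)
        for r in rows
    ]


# Hypothetical sheet contents; None marks an empty cell.
rows = [
    {"experience": 1, "salary": 30.0},
    {"experience": 2, "salary": 35.0},
    {"experience": 3, "salary": None},
    {"experience": 4, "salary": 45.0},
]

mean_filled = impute_by_mean(rows, "salary")
reg_filled = impute_by_regression(rows, "experience", "salary")

print(mean_filled[2]["salary"])  # mean of the three observed salaries
print(reg_filled[2]["salary"])   # value predicted from the fitted line
```

In this toy data the observed points lie on the line salary = 25 + 5 * experience, so the regression fill for the empty cell is about 40, while the mean fill ignores the predictor and gives the same value to every missing cell.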

Authors and Affiliations

Z. Mahesh Kum, R. Manjula

How To Cite

Z. Mahesh Kum, R. Manjula (2012). Regression model approach to predict missing values in the Excel sheet databases. International Journal of Computer Science & Engineering Technology, 3(4), 130-135. https://europub.co.uk/articles/-A-92755