Regression model approach to predict missing values in the Excel sheet databases

Abstract

Pre-processing, in which the data are prepared for mining, is the most important stage of data mining. Real-world data tend to be incomplete, noisy, and inconsistent, so an important pre-processing task is to fill in missing values, smooth out noise, and correct inconsistencies. Missing values can be handled by ignoring the data row, filling in a global constant, filling in the attribute mean, filling in the attribute mean of all samples belonging to the same class, filling in the most probable value, or, finally, using a data mining algorithm to predict the value. We use the regression method to predict missing values. This method maps a data item to a real-valued prediction variable. All of these operations can also be carried out on an Excel sheet database.
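
The fill strategies listed in the abstract can be compared on a toy table. The following is a minimal sketch, not the authors' implementation: it assumes a single numeric predictor column `x`, a target column `y` with one empty cell (`None`), and a class label per row. The data and all function names (`fit_line`, `impute_mean`, `impute_class_mean`, `impute_regression`) are hypothetical and chosen only for illustration, and simple least-squares linear regression stands in for whatever regression model the paper uses.

```python
from statistics import mean

# Hypothetical sheet rows: (x, y, class_label); None marks an empty y cell.
rows = [
    (1.0, 2.0, "a"),
    (2.0, 4.0, "a"),
    (3.0, 6.0, "a"),
    (4.0, 8.0, "b"),
    (5.0, 10.0, "b"),
    (6.0, None, "b"),
]


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def impute_mean(rows):
    """Fill every empty y with the mean of all known y values."""
    fill = mean(y for _, y, _ in rows if y is not None)
    return [(x, fill if y is None else y, c) for x, y, c in rows]


def impute_class_mean(rows):
    """Fill every empty y with the mean y of rows in the same class."""
    by_class = {}
    for _, y, c in rows:
        if y is not None:
            by_class.setdefault(c, []).append(y)
    return [(x, mean(by_class[c]) if y is None else y, c) for x, y, c in rows]


def impute_regression(rows):
    """Fill every empty y with the value predicted from x by a fitted line."""
    slope, intercept = fit_line([(x, y) for x, y, _ in rows if y is not None])
    return [(x, slope * x + intercept if y is None else y, c) for x, y, c in rows]


if __name__ == "__main__":
    for fill in (impute_mean, impute_class_mean, impute_regression):
        print(fill.__name__, fill(rows)[-1][1])
```

On this toy data the attribute mean fills the empty cell with 6.0, the class mean with 9.0, and the regression line with 12.0. Only the regression fill follows the trend in `x`, which is the motivation for predicting the value rather than substituting a constant.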

Authors and Affiliations

Z. Mahesh Kum, R. Manjula

How To Cite

Z. Mahesh Kum, R. Manjula (2012). Regression model approach to predict missing values in the Excel sheet databases. International Journal of Computer Science & Engineering Technology, 3(4), 130-135. https://europub.co.uk/articles/-A-92755