Optimization of Naïve Bayes Data Mining Classification Algorithm

Abstract

As a probability-based statistical classification method, the Naïve Bayes classifier has gained wide popularity; however, its performance suffers in domains (data sets) that involve correlated features. Correlated features are features that have a mutual relationship with one another; because they largely measure the same underlying property, they are redundant. This paper focuses on optimizing the Naïve Bayes classification algorithm to improve the accuracy of its classification results while reducing the time needed to build the model from the training dataset. The aim is to improve the performance of Naïve Bayes by removing redundant correlated features before the dataset is given to the classifier. The paper discusses the mathematical derivation of the Naïve Bayes classifier and shows theoretically how redundant correlated features reduce the accuracy of the classification algorithm. Finally, based on experiments using the WEKA data mining software, the paper reports a significant improvement in both the accuracy of the Naïve Bayes classifier and the time taken to build the model.
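The preprocessing idea described in the abstract, removing redundant correlated features before training, can be illustrated with a minimal sketch. Naïve Bayes multiplies one likelihood term per feature, so two features that carry the same information are effectively counted twice. The example below is not the authors' WEKA procedure: it assumes a simple pairwise Pearson correlation filter with a hypothetical threshold of 0.9, and the names `pearson`, `drop_correlated`, and the toy columns are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant column; treat it as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.9):
    """Return the names of the features to keep.

    Features are scanned in insertion order; a feature is kept only if its
    absolute correlation with every already-kept feature is below `threshold`.
    The threshold of 0.9 is an assumption, not a value from the paper.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data (hypothetical): height in inches is a rescaled copy of height in cm.
columns = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [40.0, 22.0, 35.0, 58.0, 30.0],
}

kept = drop_correlated(columns)
print(kept)  # ['height_cm', 'age']
```

Only the retained columns would then be passed to the classifier. A greedy pairwise filter like this depends on column order and ignores the class label, so it is a rough stand-in for the dedicated attribute selection methods available in tools such as WEKA.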

Authors and Affiliations

Maneesh Singhal, Ramashankar Sharma

How To Cite

Maneesh Singhal, Ramashankar Sharma (2014). Optimization of Naïve Bayes Data Mining Classification Algorithm. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2(8), -. https://europub.co.uk/articles/-A-18589