Optimization of Naïve Bayes Data Mining Classification Algorithm

Abstract

As a probability-based statistical classification method, the Naïve Bayes classifier has gained wide popularity; however, its performance suffers in domains (data sets) that involve correlated features. Correlated features are features that have a mutual relationship or connection with each other; because they are related, they effectively measure the same property and are therefore redundant. This paper focuses on optimizing the Naïve Bayes classification algorithm to improve the accuracy of the classification results while reducing the time taken to build the model from the training data set. The aim is to improve the performance of Naïve Bayes by removing redundant correlated features before the data set is given to the classifier. The paper discusses the mathematical derivation of the Naïve Bayes classifier and shows theoretically how redundant correlated features reduce the accuracy of the classification algorithm. Finally, based on experiments with the WEKA data mining software, the paper reports significant improvements in both the accuracy of the Naïve Bayes classification algorithm and the time it takes to build the model.
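The preprocessing idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not say how redundant features are detected, so the Pearson correlation test, the 0.9 threshold, the function names `pearson` and `drop_correlated`, and the toy data below are all assumptions made for illustration. The motivation is that Naïve Bayes multiplies per-feature likelihoods as if the features were independent, so a feature that duplicates another is in effect counted twice.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column is treated as uncorrelated
    return cov / sqrt(vx * vy)


def drop_correlated(columns, threshold=0.9):
    """Greedily keep each feature unless it is strongly correlated
    (|r| >= threshold) with a feature that was already kept.

    columns: dict mapping feature name -> list of numeric values.
    The threshold value is an illustrative assumption.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: height_in is height_cm in other units (redundant).
data = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [34.0, 22.0, 51.0, 29.0, 45.0],
}
kept = drop_correlated(data)
print(kept)
```

On this toy data the redundant `height_in` column is dropped and only the remaining columns would be passed on to the classifier. The greedy pass keeps whichever feature of a correlated pair comes first, which is a simplification; other selection criteria are equally possible.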

Authors and Affiliations

Maneesh Singhal, Ramashankar Sharma

How To Cite

Maneesh Singhal, Ramashankar Sharma (2014). Optimization of Naïve Bayes Data Mining Classification Algorithm. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2(8), -. https://europub.co.uk/articles/-A-18589