Optimization of Naïve Bayes Data Mining Classification Algorithm

Abstract

As a probability-based statistical classification method, the Naïve Bayes classifier has gained wide popularity; however, its performance suffers on data sets that contain correlated features. Correlated features are features that have a mutual relationship with one another; because they effectively measure the same underlying property, they are redundant. This paper focuses on optimizing the Naïve Bayes classification algorithm to improve the accuracy of its classification results while reducing the time needed to build the model from the training data set. The aim is to improve the performance of Naïve Bayes by removing redundant correlated features before the data set is passed to the classifier. The paper presents the mathematical derivation of the Naïve Bayes classifier and shows theoretically how redundant correlated features reduce the accuracy of the classification algorithm. Finally, experiments conducted with the WEKA data mining software show a significant improvement in both the accuracy of the Naïve Bayes classifier and the time taken to build the model.
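
The idea of removing redundant correlated features before classification can be illustrated with a minimal sketch. It is not the authors' WEKA procedure, which the abstract does not specify; it assumes a simple greedy filter based on the Pearson correlation coefficient. The function names, the 0.9 threshold, and the toy data are all hypothetical. The motivation is that Naïve Bayes multiplies per-feature likelihoods as if the features were independent, so a feature that nearly duplicates another has its evidence counted twice.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation; treat it as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_uncorrelated(features, threshold=0.9):
    """Greedily keep features whose |r| with every kept feature is below threshold.

    features: dict mapping feature name -> list of numeric values.
    Returns the kept feature names in their original order.
    (Hypothetical helper; the threshold is an illustrative assumption.)
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: "height_in" repeats "height_cm" in different units.
features = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [34.0, 22.0, 45.0, 29.0, 51.0],
}

kept = select_uncorrelated(features)
print(kept)  # the redundant "height_in" column is dropped
```

Only the kept columns would then be passed to the classifier. A greedy filter like this depends on feature order and ignores the class label, so it is a simplification of what a dedicated attribute-selection method would do.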

Authors and Affiliations

Maneesh Singhal, Ramashankar Sharma

How To Cite

Maneesh Singhal, Ramashankar Sharma (2014). Optimization of Naïve Bayes Data Mining Classification Algorithm. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2(8), -. https://europub.co.uk/articles/-A-18589