Optimization of Naïve Bayes Data Mining Classification Algorithm


Subject and more

  • LCC Subject Category: Engineering, Applied Linguistics
  • Publisher's keywords: classification, Naive Bayes, Correlated Redundant Features, CFS algorithm, Classifier Prediction Accuracy, WEKA
  • Language of fulltext: English
  • Full-text formats available: PDF
  • Time From Submission to Publication: 8


    Maneesh Singhal, Ramashankar Sharma



As a probability-based statistical classification method, the Naïve Bayes classifier has gained wide popularity; however, its performance suffers in domains (data sets) that involve correlated features. Correlated features are features that have a mutual relationship with each other; because they effectively measure the same underlying property, they are redundant. This paper focuses on optimizing the Naïve Bayes classification algorithm to improve the accuracy of its classification results while reducing the time needed to build the model from the training data set. The aim is to improve the performance of Naïve Bayes by removing redundant correlated features before the data set is given to the classifier. The paper presents the mathematical derivation of the Naïve Bayes classifier and shows theoretically how redundant correlated features reduce the accuracy of the classification algorithm. Finally, experiments using the WEKA data mining software show a significant improvement in both the accuracy of the Naïve Bayes classifier and the time taken to build the model.
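
The effect of redundant correlated features on Naïve Bayes can be illustrated with a minimal sketch. It is not the paper's method or its WEKA setup: the function name, the class labels and all probabilities below are hypothetical values chosen only for illustration. Because Naïve Bayes multiplies per-feature likelihoods as if the features were independent, an exact copy of a feature contributes its evidence a second time, and enough copies can flip the predicted class.

```python
def naive_bayes_posterior(priors, likelihoods):
    """Normalized class posteriors under the Naive Bayes independence assumption.

    priors: dict mapping class -> P(class)
    likelihoods: list with one dict per feature, mapping
                 class -> P(observed feature value | class)
    """
    scores = {}
    for cls, prior in priors.items():
        score = prior
        # Independence assumption: multiply the likelihood of every feature.
        for feature_likelihood in likelihoods:
            score *= feature_likelihood[cls]
        scores[cls] = score
    total = sum(scores.values())
    return {cls: score / total for cls, score in scores.items()}


# Hypothetical numbers, not taken from the paper.
priors = {"pos": 0.5, "neg": 0.5}
f1 = {"pos": 0.6, "neg": 0.4}  # weak evidence for "pos"
f2 = {"pos": 0.3, "neg": 0.7}  # stronger evidence for "neg"

# Each feature counted once.
clean = naive_bayes_posterior(priors, [f1, f2])

# f1 appears three times, as if two perfectly correlated copies were present.
redundant = naive_bayes_posterior(priors, [f1, f1, f1, f2])

print("without redundancy:", max(clean, key=clean.get), clean)
print("with redundancy:   ", max(redundant, key=redundant.get), redundant)
```

With these assumed values the prediction is "neg" when each feature is counted once and "pos" once the weak feature is triplicated, even though the copies add no new information. Removing such copies before training, as the abstract proposes, avoids this double counting.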
