Optimization of Naïve Bayes Data Mining Classification Algorithm

Abstract

As a probability-based statistical classification method, the Naïve Bayes classifier has gained wide popularity; however, its performance suffers on data sets that contain correlated features. Correlated features are features that have a mutual relationship with one another; because they largely measure the same underlying property, they are redundant. This paper focuses on optimizing the Naïve Bayes classification algorithm to improve the accuracy of its classification results while reducing the time needed to build the model from the training data set. The aim is to improve the performance of the algorithm by removing redundant correlated features before the data set is passed to the classifier. The paper discusses the mathematical derivation of the Naïve Bayes classifier and shows theoretically how redundant correlated features reduce classification accuracy. Finally, experiments using the WEKA data mining software show a significant improvement in both the accuracy of the Naïve Bayes classifier and the time taken to build the model.
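The preprocessing idea described in the abstract can be illustrated with a minimal sketch. Naïve Bayes multiplies per-feature likelihoods as if the features were independent, so a feature that nearly duplicates another has its evidence counted twice. The abstract does not say which selection method the authors applied in WEKA, so the pairwise Pearson filter below is an assumption, as are the function names `pearson`, `drop_correlated_features`, and `project` and the threshold of 0.9.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant column; treat it as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def drop_correlated_features(rows, threshold=0.9):
    """Return the indices of the columns to keep.

    Columns are scanned left to right. A column is kept only if its absolute
    correlation with every previously kept column is below `threshold`
    (an illustrative value, not one taken from the paper).
    """
    if not rows:
        return []
    columns = list(zip(*rows))
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


def project(rows, kept):
    """Restrict every row to the kept column indices."""
    return [[row[j] for j in kept] for row in rows]


# Hypothetical toy data: column 1 is exactly twice column 0, so it is redundant.
rows = [
    [1, 2, 5],
    [2, 4, 1],
    [3, 6, 4],
    [4, 8, 2],
]
kept = drop_correlated_features(rows)
reduced = project(rows, kept)
```

The reduced rows would then be passed to the classifier in place of the original data. Scanning left to right means the choice of which feature in a correlated pair survives depends on column order; a method that also weighs each feature's relevance to the class label could choose differently.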

Authors and Affiliations

Maneesh Singhal, Ramashankar Sharma

How To Cite

Maneesh Singhal, Ramashankar Sharma (2014). Optimization of Naïve Bayes Data Mining Classification Algorithm. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2(8), -. https://europub.co.uk/articles/-A-18589