Filter-Wrapper Approach to Feature Selection Using PSO-GA for Arabic Document Classification with Naive Bayes Multinomial
Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2015, Vol 17, Issue 6
Abstract
Text categorization and feature selection are two of the many problems in text data mining. In text categorization, a collection of text documents is converted into a dataset of features and classes: words become the features and document categories become the classes. Too many features can degrade classifier performance, because many of them are redundant or uninformative, so feature selection is required to pick the most useful ones. This paper proposes a feature selection strategy based on Particle Swarm Optimization (PSO) and Genetic Algorithm (GA) for Arabic document classification with Naive Bayes Multinomial (NBM). In the first phase, PSO eliminates insignificant features and passes the reduced feature set to the next phase. In the second phase, the reduced features are further optimized using an evolutionary computation method, GA. Together, these methods greatly reduce the number of features and achieve higher classification accuracy than the full feature set without feature selection. In the experiments, the accuracies obtained are NBM 85.31%, NBM-PSO 83.91%, and NBM-PSO-GA 90.20%.
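The two-phase strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no parameter settings or fitness function, so the binary PSO with a sigmoid transfer function, the leave-one-out accuracy used as wrapper fitness, the swarm and population sizes, the inertia and mutation rates, and the toy term-count data are all assumptions chosen for illustration.

```python
import math
import random

def train_nbm(X, y, mask):
    """Fit multinomial Naive Bayes (add-one smoothing) on the masked features."""
    idx = [j for j, keep in enumerate(mask) if keep]
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        counts = [sum(r[j] for r in rows) + 1 for j in idx]
        total = sum(counts)
        model[c] = (math.log(len(rows) / len(X)),
                    [math.log(n / total) for n in counts])
    return idx, model

def predict_nbm(idx, model, x):
    def score(c):
        prior, loglik = model[c]
        return prior + sum(x[j] * lw for j, lw in zip(idx, loglik))
    return max(model, key=score)

def fitness(X, y, mask):
    """Assumed wrapper criterion: leave-one-out accuracy of NBM."""
    if not any(mask):
        return 0.0
    hits = 0
    for i in range(len(X)):
        idx, model = train_nbm(X[:i] + X[i + 1:], y[:i] + y[i + 1:], mask)
        hits += predict_nbm(idx, model, X[i]) == y[i]
    return hits / len(X)

def pso_select(X, y, n_particles=8, iters=15, seed=0):
    """Phase 1: binary PSO over feature masks."""
    rng = random.Random(seed)
    d = len(X[0])
    pos = [[rng.random() < 0.5 for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pfit = [fitness(X, y, p) for p in pos]
    g = max(range(n_particles), key=lambda i: pfit[i])
    gbest, gfit = pbest[g][:], pfit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for j in range(d):
                vel[i][j] = (0.7 * vel[i][j]
                             + 1.5 * rng.random() * (pbest[i][j] - pos[i][j])
                             + 1.5 * rng.random() * (gbest[j] - pos[i][j]))
                # Sigmoid of the velocity gives the probability of keeping feature j.
                pos[i][j] = rng.random() < 1 / (1 + math.exp(-vel[i][j]))
            f = fitness(X, y, pos[i])
            if f > pfit[i]:
                pbest[i], pfit[i] = pos[i][:], f
            if f > gfit:
                gbest, gfit = pos[i][:], f
    return gbest, gfit

def ga_refine(X, y, base_mask, pop_size=8, gens=15, seed=1):
    """Phase 2: GA searching only among the features PSO kept."""
    rng = random.Random(seed)
    kept = [j for j, keep in enumerate(base_mask) if keep]

    def expand(genes):
        mask = [False] * len(base_mask)
        for j, bit in zip(kept, genes):
            mask[j] = bit
        return mask

    def fit(genes):
        return fitness(X, y, expand(genes))

    # Seed the population with the unchanged PSO result so GA cannot do worse.
    pop = [[True] * len(kept)]
    pop += [[rng.random() < 0.5 for _ in kept] for _ in range(pop_size - 1)]
    for _ in range(gens):
        pop.sort(key=fit, reverse=True)
        parents = pop[:pop_size // 2]          # elitist truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(kept)) if len(kept) > 1 else 0
            child = a[:cut] + b[cut:]          # one-point crossover
            children.append([(not bit) if rng.random() < 0.1 else bit
                             for bit in child])  # bit-flip mutation
        pop = parents + children
    best = max(pop, key=fit)
    return expand(best), fit(best)

# Hypothetical term-count matrix: 8 documents, 6 terms, 2 categories.
X = [[3, 0, 1, 0, 2, 1], [2, 0, 0, 1, 2, 0], [4, 1, 1, 0, 0, 1], [3, 0, 2, 1, 1, 0],
     [0, 3, 1, 0, 2, 1], [0, 2, 0, 1, 1, 0], [1, 4, 1, 0, 0, 1], [0, 3, 2, 1, 2, 0]]
y = ["sport"] * 4 + ["economy"] * 4

pso_mask, pso_fit = pso_select(X, y)
ga_mask, ga_fit = ga_refine(X, y, pso_mask)
print("PSO mask:", pso_mask, "accuracy:", pso_fit)
print("GA mask: ", ga_mask, "accuracy:", ga_fit)
```

Because the GA population in this sketch is seeded with the PSO result and the best individuals are carried over each generation, the second phase can only match or improve the first-phase fitness on the data it is evaluated on. The accuracies reported in the abstract come from the authors' own experiments and are not reproduced by this toy example.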
Authors and Affiliations
Indriyani, Wawan Gunawan, Ardhon Rakhmadi