Filter-Wrapper Approach to Feature Selection Using PSO-GA for Arabic Document Classification with Naive Bayes Multinomial

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2015, Vol 17, Issue 6

Abstract

Text categorization and feature selection are two common problems in text data mining. In text categorization, a collection of text documents is converted into a dataset of features and classes: words become the features, and document categories become the classes. Too many features can degrade classifier performance, because many of them are redundant or uninformative, so feature selection is needed to choose the most useful ones. This paper proposes a feature selection strategy based on Particle Swarm Optimization (PSO) and Genetic Algorithm (GA) for Arabic document classification with Naive Bayes Multinomial (NBM). PSO is applied in the first phase to eliminate insignificant features and pass the reduced feature set to the next phase. In the second phase, the reduced features are optimized with GA, an evolutionary computation method. Together, these methods greatly reduced the number of features and achieved higher classification accuracy than the full feature set without feature selection. The accuracies obtained in the experiments are 85.31% for NBM, 83.91% for NBM-PSO, and 90.20% for NBM-PSO-GA.
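The two-phase idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy term-count data, the leave-one-out accuracy used as fitness in both phases, the binary PSO update with a sigmoid transfer function, the GA operators (elitism, one-point crossover, bit-flip mutation), and all parameter values. The function names `train_nbm`, `predict_nbm`, `fitness`, `pso_select`, and `ga_refine` are hypothetical.

```python
import math
import random

def train_nbm(X, y, mask):
    # Multinomial naive Bayes with add-one smoothing, restricted to masked features.
    idx = [j for j, keep in enumerate(mask) if keep]
    prior, cond = {}, {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        prior[c] = math.log(len(rows) / len(X))
        counts = [sum(r[j] for r in rows) + 1 for j in idx]
        total = sum(counts)
        cond[c] = [math.log(n / total) for n in counts]
    return idx, prior, cond

def predict_nbm(model, x):
    idx, prior, cond = model
    return max(prior, key=lambda c: prior[c] + sum(x[j] * lp for j, lp in zip(idx, cond[c])))

def fitness(mask, X, y):
    # Leave-one-out accuracy of NBM on the selected features (0.0 for an empty mask).
    if not any(mask):
        return 0.0
    hits = 0
    for i in range(len(X)):
        model = train_nbm(X[:i] + X[i + 1:], y[:i] + y[i + 1:], mask)
        hits += predict_nbm(model, X[i]) == y[i]
    return hits / len(X)

def pso_select(X, y, rng, n_particles=8, iters=15):
    # Phase 1 (assumed form): binary PSO over feature masks.
    n = len(X[0])
    pos = [[rng.random() < 0.5 for _ in range(n)] for _ in range(n_particles)]
    vel = [[0.0] * n for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pfit = [fitness(p, X, y) for p in pos]
    g = max(range(n_particles), key=pfit.__getitem__)
    gbest, gfit = pbest[g][:], pfit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for j in range(n):
                vel[i][j] = (0.7 * vel[i][j]
                             + 1.5 * rng.random() * (pbest[i][j] - pos[i][j])
                             + 1.5 * rng.random() * (gbest[j] - pos[i][j]))
                # Sigmoid of the velocity gives the probability of keeping feature j.
                pos[i][j] = rng.random() < 1 / (1 + math.exp(-vel[i][j]))
            f = fitness(pos[i], X, y)
            if f > pfit[i]:
                pbest[i], pfit[i] = pos[i][:], f
            if f > gfit:
                gbest, gfit = pos[i][:], f
    return gbest, gfit

def ga_refine(X, y, start_mask, rng, pop_size=8, gens=15, p_mut=0.1):
    # Phase 2 (assumed form): GA searches only among the features PSO kept.
    kept = [j for j, keep in enumerate(start_mask) if keep]
    def expand(genes):
        mask = [False] * len(start_mask)
        for j, gene in zip(kept, genes):
            mask[j] = gene
        return mask
    def fit(genes):
        return fitness(expand(genes), X, y)
    # Seed the population with the unchanged PSO result so GA cannot do worse.
    pop = [[True] * len(kept)]
    pop += [[rng.random() < 0.5 for _ in kept] for _ in range(pop_size - 1)]
    for _ in range(gens):
        pop.sort(key=fit, reverse=True)
        nxt = pop[:2]  # elitism: carry over the two best individuals
        while len(nxt) < pop_size:
            a, b = rng.sample(pop[:pop_size // 2], 2)
            cut = rng.randrange(1, len(kept)) if len(kept) > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover
            nxt.append([(not g) if rng.random() < p_mut else g for g in child])
        pop = nxt
    best = max(pop, key=fit)
    return expand(best), fit(best)

# Toy term-count vectors over six hypothetical terms (not data from the paper).
X = [[3, 0, 1, 0, 2, 1], [2, 1, 0, 0, 1, 2], [4, 0, 0, 1, 2, 1],
     [0, 3, 1, 0, 1, 2], [1, 4, 0, 1, 2, 1], [0, 2, 0, 0, 1, 2]]
y = ["sport", "sport", "sport", "politics", "politics", "politics"]

rng = random.Random(0)
pso_mask, pso_fit = pso_select(X, y, rng)
ga_mask, ga_fit = ga_refine(X, y, pso_mask, rng)
print("PSO mask:", pso_mask, "accuracy:", pso_fit)
print("GA mask: ", ga_mask, "accuracy:", ga_fit)
```

Because the GA population is seeded with the unmodified PSO mask and the best individuals are carried over each generation, the second phase in this sketch can only match or improve on the first phase's fitness. The paper's own fitness function, filter step, and parameter settings are not given in the abstract and may differ.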

Authors and Affiliations

Indriyani, Wawan Gunawan, Ardhon Rakhmadi

How To Cite

Indriyani, Wawan Gunawan, Ardhon Rakhmadi (2015). Filter-Wrapper Approach to Feature Selection Using PSO-GA for Arabic Document Classification with Naive Bayes Multinomial. IOSR Journals (IOSR Journal of Computer Engineering), 17(6), 45-51. https://europub.co.uk/articles/-A-122852