Different Type of Feature Selection for Text Classification
Journal Title: INTERNATIONAL JOURNAL OF COMPUTER TRENDS & TECHNOLOGY - Year 2014, Vol 10, Issue 2
Abstract
Text categorization is the task of deciding which of a set of pre-specified classes a document belongs to. Automatic classification schemes can greatly facilitate the process of categorization. Categorization of documents is challenging because the number of discriminating words can be very large, and many existing algorithms simply do not work with so many features. For most text categorization tasks, there are many irrelevant and many relevant features. The main objective is to propose a text classification method based on feature selection and pre-processing, thereby reducing the dimensionality of the feature vector and increasing classification accuracy. In the proposed method, text preprocessing methods are applied to different datasets, and feature vectors are then extracted for each new document using various feature weighting methods to enhance classification accuracy. The classifier is then trained with the Naive Bayesian (NB) and K-nearest neighbor (KNN) algorithms; for KNN, the prediction is made according to the category distribution among the k nearest neighbors. Experimental results show that the methods compare favorably with other approaches in terms of effectiveness and efficiency.
Authors and Affiliations
M. Ramya , J. Alwin Pinakas
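The pipeline outlined in the abstract (preprocessing, feature selection, feature weighting, then KNN voting) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy corpus, the stop-word list, document-frequency thresholding as the feature selection step, TF-IDF as the weighting scheme, cosine similarity as the distance, and all function names. The Naive Bayesian classifier is omitted for brevity.

```python
import math
import re
from collections import Counter

# Toy stop-word list and corpus: illustrative assumptions, not the paper's data.
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in", "for", "on"}
TRAIN = [
    ("the team won the football match", "sports"),
    ("the player scored a goal in the match", "sports"),
    ("football team signs a new player", "sports"),
    ("new software update improves computer security", "tech"),
    ("the computer runs new software", "tech"),
    ("security update for the software", "tech"),
]

def tokenize(text):
    """Preprocessing: lowercase, split on non-letters, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def weight(tokens, idf):
    """Feature weighting: sparse TF-IDF vector over the selected terms only."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: count * idf[t] for t, count in tf.items()}

def fit(train, min_df=2):
    """Feature selection: keep terms that appear in at least min_df documents."""
    docs = [tokenize(text) for text, _ in train]
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + len(docs)) / (1 + n)) + 1.0
           for t, n in df.items() if n >= min_df}
    vectors = [weight(doc, idf) for doc in docs]
    labels = [label for _, label in train]
    return idf, vectors, labels

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

IDF, VECTORS, LABELS = fit(TRAIN)

def classify(text, k=3):
    """KNN: majority vote over the categories of the k most similar documents."""
    query = weight(tokenize(text), IDF)
    ranked = sorted(zip(VECTORS, LABELS),
                    key=lambda pair: cosine(query, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

print(classify("the football player joined the team"))  # sports
print(classify("computer security software"))           # tech
```

With `min_df=2`, terms that occur in only one training document (such as "won" or "runs") are dropped, which is one simple way of reducing the dimensionality of the feature vector before classification.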