Different Type of Feature Selection for Text Classification

Journal Title: INTERNATIONAL JOURNAL OF COMPUTER TRENDS & TECHNOLOGY - Year 2014, Vol 10, Issue 2

Abstract

Text categorization is the task of deciding which of a set of prespecified classes a document belongs to. Automatic classification schemes can greatly facilitate the process of categorization. Categorization of documents is challenging, as the number of discriminating words can be very large. Many existing algorithms simply do not work with so many features. For most text categorization tasks, there are many irrelevant and many relevant features. The main objective is to propose a text classification method based on feature selection and preprocessing, thereby reducing the dimensionality of the feature vector and increasing classification accuracy. In the proposed method, text preprocessing methods are applied to different datasets, and feature vectors are then extracted for each new document using various feature weighting methods to enhance classification accuracy. The classifier is then trained with the Naive Bayesian (NB) and k-nearest neighbor (KNN) algorithms; with KNN, the prediction is made according to the category distribution among the k nearest neighbors. Experimental results show that the methods are favorable in terms of effectiveness and efficiency when compared with other methods.
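The pipeline outlined in the abstract (preprocessing, feature weighting, then a vote among the k nearest neighbors) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which weighting scheme or distance measure was used, so TF-IDF weighting and cosine similarity are assumptions here, and the stop-word list, the toy training set, and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in", "on", "for"}


def tokenize(text):
    """Preprocessing: lowercase, split on non-letters, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def fit_idf(token_lists):
    """Smoothed inverse document frequency for each training-set term."""
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """Sparse, L2-normalised TF-IDF vector; terms unseen in training are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(u, v):
    """Cosine similarity of two L2-normalised sparse vectors (a dot product)."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def knn_predict(query, train_vectors, train_labels, k=3):
    """Predict the majority category among the k most similar training documents."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training set (assumed data, not from the paper).
TRAIN = [
    ("the team won the football match", "sport"),
    ("a late goal decided the match", "sport"),
    ("the striker scored a goal for the team", "sport"),
    ("the bank raised interest rates", "finance"),
    ("stock market prices fell sharply", "finance"),
    ("investors watch interest rates and stock prices", "finance"),
]

TRAIN_TOKENS = [tokenize(text) for text, _ in TRAIN]
TRAIN_LABELS = [label for _, label in TRAIN]
IDF = fit_idf(TRAIN_TOKENS)
TRAIN_VECTORS = [tfidf_vector(tokens, IDF) for tokens in TRAIN_TOKENS]


def classify(text, k=3):
    """Run preprocessing, weighting, and the k-NN vote on one new document."""
    return knn_predict(tfidf_vector(tokenize(text), IDF), TRAIN_VECTORS, TRAIN_LABELS, k)
```

With this toy data, `classify("the team scored a late goal")` yields `sport` and `classify("interest rates and stock prices")` yields `finance`, because each query shares terms only with documents of one category. A Naive Bayesian classifier could be trained on the same token counts; it is omitted here to keep the sketch short.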

Authors and Affiliations

M. Ramya, J. Alwin Pinakas

How To Cite

M. Ramya, J. Alwin Pinakas (2014). Different Type of Feature Selection for Text Classification. INTERNATIONAL JOURNAL OF COMPUTER TRENDS & TECHNOLOGY, 10(2), 102-107. https://europub.co.uk/articles/-A-147300