Text Classification by Augmenting Bag of Words (BOW) Representation with Co-occurrence Feature

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 1

Abstract

Text classification is the task of assigning predefined categories to free-text documents based on their content. Traditional approaches use unigram-based models for text classification. Unigram-based models such as the Bag of Words (BOW) model do not consider the co-occurrence of sets of words at the document level. This paper proposes a way to derive co-occurrence features from the anchor text of Wikipedia pages and a way to incorporate these features into the BOW model. Finally, the method is evaluated on the task of text classification.
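The general idea of augmenting a BOW vector with co-occurrence features can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify how the features are extracted or weighted, so the phrase list `ANCHOR_PHRASES`, the `cooc:` feature prefix, the binary weighting, and all function names below are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical multi-word phrases standing in for Wikipedia anchor text;
# the abstract does not list the phrases actually used.
ANCHOR_PHRASES = [("machine", "learning"), ("new", "york")]


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(".,;:!?()\"'") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def bow_features(tokens):
    """Plain unigram Bag of Words: token -> count."""
    return Counter(tokens)


def cooccurrence_features(tokens, phrases):
    """Add a binary feature for each phrase whose words all occur
    somewhere in the document (document-level co-occurrence)."""
    present = set(tokens)
    return Counter(
        {"cooc:" + "_".join(p): 1 for p in phrases if all(w in present for w in p)}
    )


def augmented_features(text, phrases=ANCHOR_PHRASES):
    """Unigram counts plus co-occurrence flags in one feature map."""
    tokens = tokenize(text)
    features = bow_features(tokens)
    features.update(cooccurrence_features(tokens, phrases))
    return features


if __name__ == "__main__":
    print(augmented_features("Machine learning helps; learning is fun."))
```

The resulting feature map can be passed to any classifier that accepts sparse count features. The `cooc:` prefix keeps the added features from colliding with ordinary unigram keys.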

Authors and Affiliations

Soumya George K

How To Cite

Soumya George K (2014). Text Classification by Augmenting Bag of Words (BOW) Representation with Co-occurrence Feature. IOSR Journals (IOSR Journal of Computer Engineering), 16(1), 34-38. https://europub.co.uk/articles/-A-99283