Text Classification by Augmenting Bag of Words (BOW) Representation with Co-occurrence Feature

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 1

Abstract

Text classification is the task of assigning predefined categories to free-text documents based on their content. Traditional approaches use unigram-based models for text classification. Unigram-based models such as the Bag of Words (BOW) model do not consider the co-occurrence of sets of words at the document level. This paper proposes a way to extract co-occurrence features from the anchor text of Wikipedia pages and a way to incorporate those features into the BOW model. Finally, the method is evaluated on a text classification task.
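The abstract does not spell out the feature construction, so the following is only a minimal sketch of the general idea, not the author's method. It assumes a hypothetical set of multi-word phrases (standing in for phrases mined from Wikipedia anchor text) and assumes that a co-occurrence feature fires when the phrase appears as a contiguous token sequence; the function names and the whitespace tokenizer are illustrative choices.

```python
from collections import Counter


def tokenize(text):
    # Simplistic lowercase whitespace tokenizer (an assumption for illustration).
    return text.lower().split()


def bow_with_cooccurrence(text, phrases):
    """Return unigram counts augmented with counts of known multi-word phrases.

    `phrases` is a hypothetical set of token tuples, standing in for
    co-occurrence features mined from Wikipedia anchor text.
    """
    tokens = tokenize(text)
    features = Counter(tokens)  # plain Bag of Words: unigram counts
    for phrase in phrases:
        n = len(phrase)
        # Count contiguous occurrences of the phrase in the token stream.
        count = sum(
            1
            for i in range(len(tokens) - n + 1)
            if tuple(tokens[i:i + n]) == phrase
        )
        if count:
            # Add the phrase as one extra feature alongside the unigrams.
            features["_".join(phrase)] = count
    return features


# Hypothetical phrase list; the paper derives its own from anchor text.
anchor_phrases = {("machine", "learning"), ("new", "york")}
features = bow_with_cooccurrence(
    "Machine learning in New York uses machine learning", anchor_phrases
)
```

The resulting mapping keeps every unigram count and adds one entry per matched phrase, so it can be passed to any classifier that accepts sparse count features.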

Authors and Affiliations

Soumya George K

How To Cite

Soumya George K (2014). Text Classification by Augmenting Bag of Words (BOW) Representation with Co-occurrence Feature. IOSR Journals (IOSR Journal of Computer Engineering), 16(1), 34-38. https://europub.co.uk/articles/-A-99283