Text Classification by Augmenting Bag of Words (BOW) Representation with Co-occurrence Feature

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 1

Abstract

Text classification is the task of assigning predefined categories to free-text documents based on their content. Traditional approaches use unigram-based models for text classification. Unigram-based models such as the Bag of Words (BOW) model do not consider the co-occurrence of sets of words at the document level. This paper proposes a way to extract co-occurrence features from the anchor text of Wikipedia pages and a way to incorporate those features into the BOW model. Finally, the method is evaluated on the task of text classification.
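As a rough illustration of the idea, the sketch below augments unigram counts with binary document-level co-occurrence features. It is not the paper's actual procedure, which the abstract does not spell out. The phrase list `ANCHOR_PHRASES` stands in for anchor texts harvested from Wikipedia, and the `CO::` feature prefix, the tokenizer, and all function names are assumptions made for this example.

```python
from collections import Counter

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(".,;:!?()\"'") for tok in text.lower().split())
    return [tok for tok in tokens if tok]

def bow_features(tokens):
    """Plain unigram Bag of Words: token -> count."""
    return Counter(tokens)

def cooccurrence_features(tokens, phrases):
    """Emit one binary feature per multiword phrase whose words all
    occur somewhere in the document (document-level co-occurrence)."""
    present = set(tokens)
    feats = Counter()
    for phrase in phrases:
        words = tokenize(phrase)
        if len(words) > 1 and all(w in present for w in words):
            feats["CO::" + "_".join(words)] = 1
    return feats

def augmented_features(text, phrases):
    """BOW counts plus co-occurrence features in a single feature map."""
    tokens = tokenize(text)
    feats = bow_features(tokens)
    feats.update(cooccurrence_features(tokens, phrases))
    return feats

# Hypothetical stand-in for phrases mined from Wikipedia anchor text.
ANCHOR_PHRASES = ["machine learning", "support vector machine"]

doc = "Machine learning methods such as the support vector machine learn from data."
features = augmented_features(doc, ANCHOR_PHRASES)
```

The resulting feature map can be passed to any classifier that accepts sparse count features. Whether co-occurrence should be a binary flag, a count, or a weighted score is a design choice this sketch leaves open.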

Authors and Affiliations

Soumya George K

How To Cite

Soumya George K (2014). Text Classification by Augmenting Bag of Words (BOW) Representation with Co-occurrence Feature. IOSR Journals (IOSR Journal of Computer Engineering), 16(1), 34-38. https://europub.co.uk/articles/-A-99283