Text Classification by Augmenting Bag of Words (BOW) Representation with Co-occurrence Feature
Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 1
Abstract
Text classification is the task of assigning predefined categories to free-text documents based on their content. Traditional approaches use unigram-based models. Unigram-based models such as the Bag of Words (BOW) model do not consider the co-occurrence of sets of words at the document level. This paper proposes a way to extract a co-occurrence feature from the anchor text of Wikipedia pages and a way to incorporate that feature into the BOW model. Finally, the method is evaluated on a text classification task.
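The abstract does not spell out how the co-occurrence feature is computed or merged into the BOW vector, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical list of multi-word phrases (standing in for phrases harvested from Wikipedia anchor text) and adds one extra feature for each phrase whose words all appear somewhere in the document. The function names and the `CO:` feature prefix are illustrative choices, not taken from the paper.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def bow_with_cooccurrence(text, phrases):
    """Build unigram counts, then add one feature per phrase whose words
    all occur somewhere in the document (document-level co-occurrence).

    `phrases` is a hypothetical stand-in for multi-word anchor texts.
    """
    tokens = tokenize(text)
    features = Counter(tokens)      # plain BOW: unigram counts
    present = set(tokens)
    for phrase in phrases:
        words = phrase.lower().split()
        # Only multi-word phrases can express co-occurrence.
        if len(words) > 1 and all(w in present for w in words):
            features["CO:" + "_".join(words)] += 1
    return features


# Assumed example data, for illustration only.
doc = "the neural model uses a network of units"
phrases = ["neural network", "support vector machine"]
features = bow_with_cooccurrence(doc, phrases)
```

Here `features` keeps the usual unigram counts and gains a `CO:neural_network` entry, because both words occur in the document even though they are not adjacent. A classifier trained on such vectors would see the co-occurrence as one additional dimension alongside the unigrams.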
Authors and Affiliations
Soumya George K
Smart GNC scheme for Autonomous Planetary Landing
Abstract: The moon or other nearest planets are important destinations for space science and the smart landing is key technology for exploring the different planets without fail. Due to the long round-trip delay of commun...
Enhancement of Bag-of-Words for Legal documents using Legal Statute
Abstract: In this paper Legal statute related to dowry acts has been processed to obtain a distinct set of legal keywords which don’t have a common occurrence in day to day dowry case judgments. This effort coupled with...
Preventing Web-Proxy Based DDoS using Request Sequence Frequency
Abstract: In order to control the request flow in Computer Networks, a proxy server is used. Proxy Server is a server which acts as an intermediary server between server and clients. The more adaptable and converted...
Estimation of Word Net-Based Lexical Semantic Similarity Measure for Telugu Documents
The estimation of lexical semantic relatedness has numerous applications in NLP. Several measures are available for the evaluation of lexical semantic relatedness. This paper presents two approaches for measuring...
Persuasive Cued Click Based Graphical Password with Scrambling For Knowledge Based Authentication Technique with Image Scrambling
Adequate user authentication is a persistent problem, particularly with hand-held devices such as Personal Digital Assistants (PDAs), which tend to be highly personal and at the fringes of an organization’s ...