Self Appreciating Concept Based Model For Cross Domain Document Classification

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 3

Abstract

In text mining, text categorization is an important technique for classifying documents. Most often, classification relies on statistical approaches that analyze term frequency, that is, the number of occurrences of one or more words in a document. Although statistical analysis indicates the importance of a term, it struggles when multiple terms have the same frequency but one term contributes more to the meaning than the others. In addition, a wide variety of documents is being generated across different domains, which differ in format, writing style, and so on. These domains include news articles, e-mails, online chats, blogs, wiki articles, Twitter posts, message forums, and speech transcripts. Often a classification method that works well in one domain does not work as well in another. The proposed system implements a concept-based text classification model that classifies cross-domain text data according to the semantics or theme of the text. The proposed approach also progressively strengthens the training system on each positive test of the categorizer. This system is called the Self Appreciating Concept Based Classifier (SACBC).
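
The limitation of pure term-frequency analysis noted in the abstract can be illustrated with a minimal Python sketch. The sample sentence and the helper names `term_frequencies` and `terms_with_count` are hypothetical and are not taken from the paper; the sketch only shows that raw counts can tie a topical word with a filler word, and it does not implement SACBC.

```python
from collections import Counter
import re


def term_frequencies(text):
    """Count lowercase alphabetic tokens in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def terms_with_count(tf, n):
    """Return, in sorted order, every term that occurs exactly n times."""
    return sorted(term for term, count in tf.items() if count == n)


# Hypothetical toy document, invented for illustration only.
doc = "The striker scored early. The keeper also played. Also the striker rested."
tf = term_frequencies(doc)

# "striker" carries the topic of the text, while "also" does not,
# yet both occur twice, so frequency alone cannot rank them.
print(terms_with_count(tf, 2))
```

A concept-based approach, as the abstract describes it, would need some additional signal about meaning to separate such tied terms; how the paper derives that signal is not stated on this page.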

Authors and Affiliations

Dipak A. Sutar

Download PDF file
  • EP ID EP152519
  • DOI 10.9790/0661-16329095

How To Cite

Dipak A. Sutar (2014). Self Appreciating Concept Based Model For Cross Domain Document Classification. IOSR Journals (IOSR Journal of Computer Engineering), 16(3), 90-95. https://europub.co.uk/articles/-A-152519