Self Appreciating Concept Based Model For Cross Domain Document Classification
Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 3
Abstract
In text mining, text categorization is an important technique for classifying documents. Classification most often relies on statistical approaches that analyze term frequency, that is, the number of occurrences of one or more words in a document. Although such statistical analysis indicates the importance of a term, it is of little help when several terms have the same frequency but one of them is more important in terms of meaning than the others. In addition, a wide variety of documents is being generated across different domains, which differ in format, writing style, and so on. These domains include news articles, e-mails, online chats, blogs, wiki articles, Twitter posts, message forums, and speech transcripts. Often a classification method that works well in one domain does not work as well in another. The proposed system implements a concept-based text classification model that classifies cross-domain text data based on the semantics or theme of the text. The proposed approach also progressively strengthens the training system at every positive test of the categorizer. This system is called the Self Appreciating Concept Based Classifier (SACBC).
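The limitation of frequency-based analysis noted in the abstract can be illustrated with a minimal sketch. It is not part of the proposed SACBC system; the helper `term_frequencies` and the sample sentence are assumptions made purely for illustration.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Count lowercase alphabetic tokens in a text (hypothetical helper)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


# Illustrative sample document, not taken from the paper.
doc = "The court ruled on the case. The verdict ended the case in court."
tf = term_frequencies(doc)

# "verdict" and "in" occur equally often, so raw frequency alone
# cannot tell that "verdict" carries more meaning than "in".
print(tf["verdict"], tf["in"])

# The most frequent term is a function word with little thematic content.
print(tf.most_common(1))
```

In this sample, a purely statistical ranking treats "verdict" and "in" as equally important, which is the kind of ambiguity a concept-based model is meant to resolve.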
Authors and Affiliations
Dipak A. Sutar