A Novel Approach for Semi Supervised Document Clustering with Constraint Score based Feature Supervision
Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 2
Abstract
Text document clustering is an effective way to manage large volumes of retrieval results by grouping documents into a small number of meaningful classes. Unsupervised clustering estimates the parameter values from unlabeled input data alone, whereas semi-supervised document clustering uses both labeled and unlabeled input data. This paper proposes a semi-supervised clustering method with feature supervision and a constraint score. The proposed system handles document clustering and feature supervision simultaneously and finds the number of clusters automatically. Feature supervision uses pairwise constraints to supervise the relationship between each pair of documents. The semi-supervised constraint score, which builds on these pairwise constraints, is used to identify relevant and irrelevant features in the document data set. The documents are clustered with a variational inference algorithm for the Dirichlet Process Mixture model.
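The abstract does not give the formula for the constraint score, so the following is only a minimal sketch of one common pairwise-constraint formulation, not necessarily the authors' definition. It assumes that each feature is scored by its squared spread across must-link document pairs divided by its squared spread across cannot-link pairs, so a lower score suggests a more relevant feature. The function name `constraint_scores`, the `eps` guard, and the toy data are hypothetical.

```python
import numpy as np

def constraint_scores(X, must_link, cannot_link, eps=1e-12):
    """Score each feature from pairwise constraints (assumed formulation).

    X           : (n_documents, n_features) feature matrix
    must_link   : pairs (i, j) of documents that should share a cluster
    cannot_link : pairs (i, j) of documents that should not
    Lower score = feature varies little within must-link pairs and a lot
    across cannot-link pairs.
    """
    X = np.asarray(X, dtype=float)
    ml = np.asarray(must_link, dtype=int).reshape(-1, 2)
    cl = np.asarray(cannot_link, dtype=int).reshape(-1, 2)
    # Per-feature sum of squared differences over each constraint set.
    ml_spread = ((X[ml[:, 0]] - X[ml[:, 1]]) ** 2).sum(axis=0)
    cl_spread = ((X[cl[:, 0]] - X[cl[:, 1]]) ** 2).sum(axis=0)
    # eps (an assumption of this sketch) avoids division by zero.
    return ml_spread / (cl_spread + eps)

# Toy data: feature 0 agrees with the constraints, feature 1 does not.
X = np.array([
    [1.0, 5.0],
    [1.1, 0.0],
    [4.0, 5.1],
    [4.2, 0.2],
])
must_link = [(0, 1), (2, 3)]
cannot_link = [(0, 2), (1, 3)]

scores = constraint_scores(X, must_link, cannot_link)
ranking = np.argsort(scores)  # feature indices, most relevant first
```

In this toy example the first feature receives the lower score, so it would be ranked as the more relevant one. How the paper feeds such a ranking into the Dirichlet Process Mixture clustering step is not stated in the abstract.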
Authors and Affiliations
S. Princiya, M. Prabakaran
VRIT: An Innovative Approach of Industrial Training through Virtual Reality
The emerging global competition and increasing costs are a great challenge to industries. New cost effective training methods are explored to cope with this demand. In-depth knowledge of the functions in a fact...
A Survey on Multiple Patient Data Semantic Conflicts and the Methods of Electronically Data Exchange Advantages and Disadvantages
Abstract: In last few years heterogeneous healthcare information such as electronically patient data has gains a great attention especially from clinicians, researchers, health care organizations. The government of unite...
Estimation of Word Net-Based Lexical Semantic Similarity Measure for Telugu Documents
The estimation of lexical semantic relatedness has numerous applications in NLP. Several measures are available for the evaluation of lexical semantic relatedness. This paper presents two approaches for measuring...
Analysis of Data Mining Tasks, Techniques, Tools, Applications And Trend
Data mining is a process which finds useful patterns from huge amount of data. It is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. It u...
An Efficient Hybrid Push-Pull Based Protocol for VOD In Peer-to-Peer network
Abstract : Video-on-Demand is a service where movies are delivered to distributed users with low delay and free interactivity. The traditional client-server architecture experiences scalability issues to provide vi...