A Survey on Improving the Clustering Performance in Text Mining for Efficient Information Retrieval

Journal Title: INTERNATIONAL JOURNAL OF ENGINEERING TRENDS AND TECHNOLOGY - Year 2014, Vol 8, Issue 5

Abstract

In recent years, the development of information systems in fields such as business, academia and medicine has led to a year-on-year increase in the amount of stored data. The vast majority of this data is stored in documents that are virtually unstructured. Text mining helps people process this volume of information by imposing structure on text. Clustering is a popular technique for automatically organizing a large collection of text. However, in real application domains, the experimenter possesses some background knowledge that helps in clustering the data. Traditional clustering techniques are poorly suited to multiple data types and cannot handle sparse, high-dimensional data. Co-clustering techniques address both deficiencies by clustering documents and words simultaneously. Semantic understanding has become an essential ingredient of information extraction, and it is incorporated by adopting constraints as a semi-supervised learning strategy. This survey reviews the constrained co-clustering strategies adopted by researchers to boost clustering performance. Experimental results on the 20-Newsgroups dataset show that the proposed method is effective for clustering textual documents. Furthermore, the proposed algorithm consistently outperformed all the existing constrained clustering and co-clustering methods under different conditions.
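
To make the idea of clustering documents and words simultaneously concrete, the following is a minimal sketch of one simple co-clustering scheme: alternating reassignment of rows and columns to the nearest block mean. It is not the method of the surveyed papers and it omits constraints entirely. The function names (`block_means`, `cocluster`), the toy document-term counts, and the deliberately poor starting labels are all assumptions made for illustration.

```python
def block_means(A, rows, cols, k, l):
    """Mean of each (document-cluster, word-cluster) block; empty blocks get 0."""
    total = [[0.0] * l for _ in range(k)]
    count = [[0] * l for _ in range(k)]
    for i, row in enumerate(A):
        for j, value in enumerate(row):
            total[rows[i]][cols[j]] += value
            count[rows[i]][cols[j]] += 1
    return [[total[a][b] / count[a][b] if count[a][b] else 0.0
             for b in range(l)] for a in range(k)]


def row_cost(A, i, a, cols, M):
    """Squared error of document i if it were placed in document cluster a."""
    return sum((A[i][j] - M[a][cols[j]]) ** 2 for j in range(len(cols)))


def col_cost(A, j, b, rows, M):
    """Squared error of word j if it were placed in word cluster b."""
    return sum((A[i][j] - M[rows[i]][b]) ** 2 for i in range(len(rows)))


def cocluster(A, rows, cols, k, l, max_iter=20):
    """Alternate document and word reassignment until the labels stop changing."""
    rows, cols = list(rows), list(cols)
    for _ in range(max_iter):
        # Step 1: move each document to the best cluster given the word clusters.
        M = block_means(A, rows, cols, k, l)
        new_rows = [min(range(k), key=lambda a: row_cost(A, i, a, cols, M))
                    for i in range(len(A))]
        # Step 2: move each word to the best cluster given the new document clusters.
        M = block_means(A, new_rows, cols, k, l)
        new_cols = [min(range(l), key=lambda b: col_cost(A, j, b, new_rows, M))
                    for j in range(len(cols))]
        if new_rows == rows and new_cols == cols:
            break
        rows, cols = new_rows, new_cols
    return rows, cols


# Hypothetical document-term counts: documents 0-1 use words 0-1,
# documents 2-3 use words 2-3.
A = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 1],
]

# Deliberately poor starting labels (an assumption; real methods use smarter
# or randomized initialization).
doc_labels, word_labels = cocluster(A, rows=[0, 1, 1, 1], cols=[0, 0, 0, 1], k=2, l=2)
print(doc_labels)   # [0, 0, 1, 1]
print(word_labels)  # [0, 0, 1, 1]
```

On this toy matrix the two document groups and the two word groups are recovered together, with each side's assignment informing the other. A constrained variant would additionally penalize or forbid assignments that contradict the background knowledge supplied by the experimenter.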

Authors and Affiliations

S. Saranya, R. Munieswari

How To Cite

S. Saranya, R. Munieswari (2014). A Survey on Improving the Clustering Performance in Text Mining for Efficient Information Retrieval. INTERNATIONAL JOURNAL OF ENGINEERING TRENDS AND TECHNOLOGY, 8(5), 249-256. https://europub.co.uk/articles/-A-131614