A Survey on Improving the Clustering Performance in Text Mining for Efficient Information Retrieval

Journal Title: INTERNATIONAL JOURNAL OF ENGINEERING TRENDS AND TECHNOLOGY - Year 2014, Vol 8, Issue 5

Abstract

In recent years, the development of information systems in fields such as business, academia, and medicine has steadily increased the amount of stored data. A vast majority of this data is stored in documents that are virtually unstructured. Text mining helps people process large volumes of information by imposing structure on text. Clustering is a popular technique for automatically organizing a large collection of text. In real application domains, however, the experimenter possesses some background knowledge that can help in clustering the data. Traditional clustering techniques are poorly suited to multiple data types and cannot handle sparse, high-dimensional data. Co-clustering techniques address both deficiencies by clustering documents and words simultaneously. Semantic understanding has become an essential ingredient of information extraction, and it is incorporated by adopting constraints as a semi-supervised learning strategy. This survey reviews the constrained co-clustering strategies that researchers have adopted to boost clustering performance. Experimental results on the 20-Newsgroups dataset show that the proposed method is effective for clustering textual documents. Furthermore, the proposed algorithm consistently outperformed all the existing constrained clustering and co-clustering methods under different conditions.

Authors and Affiliations

S. Saranya, R. Munieswari

How To Cite

S. Saranya, R. Munieswari (2014). A Survey on Improving the Clustering Performance in Text Mining for Efficient Information Retrieval. INTERNATIONAL JOURNAL OF ENGINEERING TRENDS AND TECHNOLOGY, 8(5), 249-256. https://europub.co.uk/articles/-A-131614