Effective Term Based Text Clustering Algorithms

Journal Title: International Journal on Computer Science and Engineering - Year 2010, Vol 2, Issue 5

Abstract

Text clustering methods can be used to group large sets of text documents. Most text clustering methods, however, do not address the central problems of text clustering, such as the very high dimensionality of the data and the understandability of the cluster descriptions. In this paper, a frequent term based approach to clustering is introduced; it provides a natural way of reducing the large dimensionality of the document vector space. The approach is based on clustering the low-dimensional frequent term sets rather than the high-dimensional vector space. Four algorithms for effective term based text clustering are presented. An experimental evaluation on classical text documents as well as on web documents demonstrates that the proposed algorithms obtain clusterings of comparable quality significantly more efficiently than existing text clustering algorithms.
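
To make the idea of frequent term sets concrete, the following minimal Python sketch enumerates small term sets and the documents that contain them (their "cover"), keeping only those with enough support. It is an illustration of the general notion only, not one of the four algorithms of the paper: the function name `frequent_term_sets`, the parameters `min_support` and `max_size`, the whitespace tokenization, and the toy documents are all assumptions made here. Each cover can be read as a candidate cluster described by its term set.

```python
from itertools import combinations


def frequent_term_sets(docs, min_support, max_size=2):
    """Map each frequent term set to its cover (indices of documents containing all its terms).

    Hypothetical helper for illustration; brute-force enumeration, suitable for toy data only.
    """
    # Naive tokenization (assumption): lowercase and split on whitespace.
    doc_terms = [set(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*doc_terms))
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(vocab, size):
            cover = frozenset(
                i for i, terms in enumerate(doc_terms) if terms.issuperset(combo)
            )
            # Keep the term set only if enough documents contain every term in it.
            if len(cover) >= min_support:
                result[frozenset(combo)] = cover
    return result


# Toy corpus (made up for this sketch).
docs = [
    "frequent term clustering",
    "frequent term sets",
    "web documents clustering",
    "web documents search",
]
term_sets = frequent_term_sets(docs, min_support=2)
for terms, cover in sorted(term_sets.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(terms), "covers documents", sorted(cover))
```

The number of frequent term sets is typically far smaller than the number of distinct terms across all documents, which is the sense in which working with them reduces dimensionality. A practical implementation would likely use an Apriori-style search instead of enumerating every combination.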

Authors and Affiliations

P. Ponmuthuramalingam, T. Devi

How To Cite

P. Ponmuthuramalingam, T. Devi (2010). Effective Term Based Text Clustering Algorithms. International Journal on Computer Science and Engineering, 2(5), 1665-1673. https://europub.co.uk/articles/-A-119030