Effective Term Based Text Clustering Algorithms

Journal Title: International Journal on Computer Science and Engineering - Year 2010, Vol 2, Issue 5

Abstract

Text clustering methods can be used to group large sets of text documents. Most text clustering methods do not address problems specific to text, such as the very high dimensionality of the data and the understandability of the cluster descriptions. In this paper, a frequent term based approach to clustering is introduced; it provides a natural way of reducing the large dimensionality of the document vector space. The approach clusters the low-dimensional frequent term sets rather than the high-dimensional vector space. Four algorithms for effective term based text clustering are presented. An experimental evaluation on classical text documents as well as on web documents demonstrates that the proposed algorithms obtain clusterings of comparable quality significantly more efficiently than existing text clustering algorithms.
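
The abstract does not spell out the four algorithms, so the following is only a minimal sketch of the general idea it describes: mine term sets that occur in at least a minimum number of documents, and treat the documents covered by each frequent term set as a candidate cluster. The function name `frequent_term_sets`, the parameters `min_support` and `max_size`, and the whitespace tokenization are all assumptions made for illustration, not the authors' method.

```python
from itertools import combinations


def frequent_term_sets(docs, min_support, max_size=2):
    """Map each frequent term set to its cover (indices of documents containing all its terms).

    Hypothetical helper: a plain level-wise search, not the paper's algorithms.
    """
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    doc_terms = [set(d.lower().split()) for d in docs]

    # Cover of every single term: the set of documents in which it occurs.
    single_cover = {}
    for idx, terms in enumerate(doc_terms):
        for term in terms:
            single_cover.setdefault(term, set()).add(idx)

    # Keep only single terms that meet the minimum support.
    frequent = {
        frozenset([t]): cover
        for t, cover in single_cover.items()
        if len(cover) >= min_support
    }

    current = dict(frequent)
    for size in range(2, max_size + 1):
        # Candidate terms are those appearing in the previous level's frequent sets.
        terms = sorted({t for term_set in current for t in term_set})
        next_level = {}
        for combo in combinations(terms, size):
            # The cover of a term set is the intersection of its terms' covers.
            cover = set.intersection(*(single_cover[t] for t in combo))
            if len(cover) >= min_support:
                next_level[frozenset(combo)] = cover
        if not next_level:
            break
        frequent.update(next_level)
        current = next_level
    return frequent
```

Each key of the returned dictionary doubles as a readable cluster description, which is one way to see why clustering term sets instead of full document vectors can help with both dimensionality and understandability. Choosing which candidate covers to keep, for example by limiting overlap between them, is left out of this sketch.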

Authors and Affiliations

P. Ponmuthuramalingam, T. Devi

How To Cite

P. Ponmuthuramalingam, T. Devi (2010). Effective Term Based Text Clustering Algorithms. International Journal on Computer Science and Engineering, 2(5), 1665-1673. https://europub.co.uk/articles/-A-119030