Effective Term Based Text Clustering Algorithms

Journal Title: International Journal on Computer Science and Engineering - Year 2010, Vol 2, Issue 5

Abstract

Text clustering methods can be used to group large sets of text documents. Most text clustering methods do not address central problems of text clustering, such as the very high dimensionality of the data and the understandability of the clustering descriptions. In this paper, a frequent term based approach to clustering is introduced; it provides a natural way of reducing the large dimensionality of the document vector space. The approach clusters the low-dimensional frequent term sets rather than the high-dimensional vector space. Four algorithms for effective term based text clustering are presented. An experimental evaluation on classical text documents as well as on web documents demonstrates that the proposed algorithms obtain clusterings of comparable quality significantly more efficiently than existing text clustering algorithms.
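The abstract does not spell out the four algorithms, so the following Python sketch only illustrates the general idea of frequent term based clustering and is not the authors' method. It assumes each document is already reduced to a set of terms, and it treats the cover of a frequent term set (the documents containing every term in the set) as a cluster candidate. The names `frequent_term_sets`, `min_support`, and `max_size`, as well as the sample documents, are hypothetical.

```python
from itertools import combinations


def frequent_term_sets(docs, min_support, max_size=2):
    """Map each frequent term set to its cover (ids of docs containing all its terms).

    Illustrative sketch only: a small Apriori-style search, not the paper's algorithms.
    """
    # Cover of every single term.
    covers = {}
    for doc_id, terms in enumerate(docs):
        for term in terms:
            covers.setdefault(frozenset([term]), set()).add(doc_id)

    # Keep only single terms that occur in at least min_support documents.
    current = {ts: c for ts, c in covers.items() if len(c) >= min_support}
    result = dict(current)

    # Grow term sets level by level; the cover of a union of two term sets
    # is the intersection of their covers.
    for size in range(2, max_size + 1):
        next_level = {}
        for (a, cover_a), (b, cover_b) in combinations(current.items(), 2):
            candidate = a | b
            if len(candidate) != size or candidate in next_level:
                continue
            cover = cover_a & cover_b
            if len(cover) >= min_support:
                next_level[candidate] = cover
        result.update(next_level)
        current = next_level
    return result


# Hypothetical toy corpus: each document is a set of terms.
docs = [
    {"cluster", "text", "term"},
    {"cluster", "text", "web"},
    {"audio", "signal"},
    {"audio", "signal", "text"},
]

candidates = frequent_term_sets(docs, min_support=2)
for term_set, cover in sorted(candidates.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(term_set), "->", sorted(cover))
```

Each cluster candidate is described by a handful of terms rather than by a point in the full vector space, which is the source of both the lower dimensionality and the more readable cluster descriptions mentioned above. Selecting a final clustering from overlapping candidates is a separate step that this sketch leaves out.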

Authors and Affiliations

P. Ponmuthuramalingam, T. Devi

How To Cite

P. Ponmuthuramalingam, T. Devi (2010). Effective Term Based Text Clustering Algorithms. International Journal on Computer Science and Engineering, 2(5), 1665-1673. https://europub.co.uk/articles/-A-119030