Effective Term Based Text Clustering Algorithms
Journal Title: International Journal on Computer Science and Engineering - Year 2010, Vol 2, Issue 5
Abstract
Text clustering methods can be used to group large sets of text documents. Most text clustering methods do not address the central problems of text clustering, such as the very high dimensionality of the data and the understandability of the cluster descriptions. This paper introduces a frequent term based approach to clustering, which provides a natural way of reducing the large dimensionality of the document vector space. The approach clusters the low-dimensional frequent term sets rather than the high-dimensional vector space. Four algorithms for effective term based text clustering are presented. An experimental evaluation on classical text documents as well as on web documents demonstrates that the proposed algorithms obtain clusterings of comparable quality significantly more efficiently than existing text clustering algorithms.
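The abstract does not spell out the four algorithms, so the following is only a minimal sketch of the general idea of clustering on frequent term sets, not the authors' method. It assumes whitespace tokenization, a brute-force enumeration of term sets up to a small size, and a simple greedy rule that repeatedly picks the term set covering the most unassigned documents. The function names `frequent_term_sets` and `greedy_clusters`, and the parameters `min_support` and `max_size`, are hypothetical and introduced here for illustration.

```python
from itertools import combinations


def frequent_term_sets(docs, min_support, max_size=2):
    """Map each frequent term set to its cover (indices of documents containing all its terms).

    Assumption: documents are tokenized by lowercasing and splitting on whitespace.
    Brute-force enumeration; only suitable for tiny vocabularies.
    """
    doc_terms = [set(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*doc_terms))
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(vocab, size):
            cover = frozenset(
                i for i, terms in enumerate(doc_terms) if terms.issuperset(combo)
            )
            if len(cover) >= min_support:
                result[frozenset(combo)] = cover
    return result


def greedy_clusters(docs, min_support, max_size=2):
    """Greedily pick term sets whose covers contain the most still-unassigned documents.

    Assumption: this largest-cover rule is a simplification chosen for illustration;
    it is not the selection criterion of the paper.
    """
    candidates = frequent_term_sets(docs, min_support, max_size)
    remaining = set(range(len(docs)))
    clusters = []
    while remaining and candidates:
        # Prefer larger remaining cover, then longer term sets (more descriptive labels),
        # then a fixed alphabetical tie-break so the result is deterministic.
        best = max(
            candidates,
            key=lambda ts: (len(candidates[ts] & remaining), len(ts), sorted(ts)),
        )
        covered = candidates[best] & remaining
        if not covered:
            break
        clusters.append((sorted(best), sorted(covered)))
        remaining -= covered
        del candidates[best]
    return clusters


if __name__ == "__main__":
    sample = [
        "apple banana fruit",
        "apple fruit juice",
        "car engine road",
        "car road trip",
    ]
    for label, members in greedy_clusters(sample, min_support=2):
        print(label, members)
```

Each cluster in this sketch is described by the term set that produced it, which illustrates why frequent term sets can double as understandable cluster descriptions. The search space is the set of frequent term sets rather than the full document vector space.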
Authors and Affiliations
P. Ponmuthuramalingam, T. Devi
Software efforts estimation using Use Case Point approach by increasing Technical Complexity and Experience Factors
The IT industry wants a simple and accurate method of effort estimation. Estimating effort before work starts is a prediction, and predictions are not always accurate. Intermediate COCOMO considered 17 factors that aff...
A SURVEY OF CALL MARKET (DISCRETE) AGENT BASED ARTIFICIAL STOCK MARKETS
Artificial stock markets are models of financial markets used to study and understand market dynamics. Agent Based Artificial Stock Markets can be seen as any market model in which prices are formed endogenously as a res...
Speech Volume Monitor for Hearing Impaired
Hearing-impaired people can be classified into those affected from birth and those who developed the problem at a later stage. People in the second category know how to speak but cannot hear themselves. They encounter embarrassment by...
EFFICIENT TRANSACTION REDUCTION IN ACTIONABLE PATTERN MINING FOR HIGH VOLUMINOUS DATASETS BASED ON BITMAP AND CLASS LABELS
Frequent pattern mining in databases plays an indispensable role in many data mining tasks namely, classification, clustering, and association rules analysis. When a large number of item sets are processed by the databas...
Grid Computing: A Collaborative Approach in Distributed Environment for Achieving Parallel Performance and Better Resource Utilization
From the very beginning, various measures have been taken or considered for better utilization of the limited resources available in the computer system for the operational environment; this came into consideration because most of the...