A Novel Document Clustering Algorithm Using Squared Distance Optimization Through Genetic Algorithms

Journal Title: International Journal on Computer Science and Engineering - Year 2010, Vol 2, Issue 5

Abstract

K-means is one of the most widely used algorithms in document clustering. However, it still suffers from shortcomings such as random initialization, convergence to local minima, and empty cluster formation. Genetic algorithms are often used for document clustering because of their global search and optimization ability on heuristic problems. In this paper, the search ability of the genetic algorithm is exploited with one modification to the general genetic algorithm: the initial population is not random. A new algorithm for population initialization is given, and its results are compared with those of the k-means algorithm.
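The sensitivity to initialization mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not describe the proposed population initialization, so none is reproduced here. The sketch assumes the objective is the sum of squared Euclidean distances from each point to its assigned cluster center, which is the kind of quantity a genetic algorithm's fitness function could minimize. The function names (`squared_distance`, `assign`, `sse`, `kmeans`) and the four-point toy dataset are illustrative assumptions.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in points
    ]


def sse(points, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum(squared_distance(p, centers[l]) for p, l in zip(points, labels))


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd-style k-means started from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, l in zip(points, labels) if l == j]
            if not members:
                # Empty cluster: keep the old center (one simple policy).
                new_centers.append(old)
            else:
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members)
                          for d in range(len(old)))
                )
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    labels = assign(points, centers)
    return centers, labels, sse(points, centers, labels)


# Toy data (assumed): corners of a 4-by-1 rectangle, k = 2.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Starting near the left and right pairs reaches the better partition.
_, _, good = kmeans(points, [(0, 0), (4, 0)])
print(good)  # 1.0

# Starting between the pairs gets stuck in a worse local minimum.
_, _, bad = kmeans(points, [(2, 0), (2, 1)])
print(bad)  # 16.0
```

Both runs converge, but the second stops at a partition whose squared-distance cost is sixteen times higher. A global search method such as a genetic algorithm aims to avoid this kind of outcome.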

Authors and Affiliations

Harish Verma, Eatesh Kandpal, Bipul Pandey, Joydip Dhar

How To Cite

Harish Verma, Eatesh Kandpal, Bipul Pandey, Joydip Dhar (2010). A Novel Document Clustering Algorithm Using Squared Distance Optimization Through Genetic Algorithms. International Journal on Computer Science and Engineering, 2(5), 1875-1879. https://europub.co.uk/articles/-A-160490