A Novel Document Clustering Algorithm Using Squared Distance Optimization Through Genetic Algorithms

Journal Title: International Journal on Computer Science and Engineering - Year 2010, Vol 2, Issue 5

Abstract

The k-means algorithm is one of the most widely used algorithms in document clustering. However, it still suffers from shortcomings such as random initialization, convergence to local minima, and empty cluster formation. Genetic algorithms are often used for document clustering because of their global search and optimization ability on heuristic problems. In this paper, the search ability of the genetic algorithm is exploited with one modification to the general genetic algorithm: the initial population is not random. A new algorithm for population initialization is given, and the results are compared with those of the k-means algorithm.
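The abstract does not spell out the objective or the initialization procedure, so the following is only an illustrative sketch, not the authors' method. It assumes the fitness is the sum of squared Euclidean distances from each point to its nearest centroid, and it uses a hypothetical deterministic seeding routine (`seed_population`) as a stand-in for the paper's initialization algorithm, which is not described here. The names `sse`, `seed_population`, and `evolve`, and all parameter values, are assumptions made for illustration.

```python
import random

def sse(points, centroids):
    """Sum of squared Euclidean distances from each point to its nearest centroid."""
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
        for p in points
    )

def seed_population(points, k, size):
    """Hypothetical non-random seeding: each individual is k data points
    taken at evenly strided offsets. A placeholder, not the paper's algorithm."""
    n = len(points)
    return [
        [points[(i + j * n // k) % n] for j in range(k)]
        for i in range(size)
    ]

def evolve(points, population, generations=50, mutation_scale=0.1, seed=0):
    """Minimal elitist genetic search over sets of k centroids (size >= 2)."""
    rng = random.Random(seed)
    population = [list(map(tuple, ind)) for ind in population]
    for _ in range(generations):
        # Rank by fitness; the better half survives unchanged (elitism).
        population.sort(key=lambda ind: sse(points, ind))
        parents = population[: max(2, len(population) // 2)]
        children = []
        while len(parents) + len(children) < len(population):
            a, b = rng.sample(parents, 2)
            # Uniform crossover: pick each centroid from one of the two parents.
            child = [ca if rng.random() < 0.5 else cb for ca, cb in zip(a, b)]
            # Gaussian mutation of every coordinate.
            child = [tuple(x + rng.gauss(0, mutation_scale) for x in c) for c in child]
            children.append(child)
        population = parents + children
    return min(population, key=lambda ind: sse(points, ind))

# Toy data: two well-separated groups standing in for document vectors.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
initial = seed_population(points, k=2, size=4)
best = evolve(points, initial)
print(sse(points, best))
```

Because the surviving parents are carried over unchanged in every generation, the best fitness in the population can never get worse than that of the best seeded individual. Seeding from actual data points also means every initial centroid has at least one point at distance zero, which is one simple way to avoid the empty clusters mentioned above.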

Authors and Affiliations

Harish Verma, Eatesh Kandpal, Bipul Pandey, Joydip Dhar

How To Cite

Harish Verma, Eatesh Kandpal, Bipul Pandey, Joydip Dhar (2010). A Novel Document Clustering Algorithm Using Squared Distance Optimization Through Genetic Algorithms. International Journal on Computer Science and Engineering, 2(5), 1875-1879. https://europub.co.uk/articles/-A-160490