Comparative analysis of mid-point based and proposed mean based K-Means Clustering Algorithm for Data Mining

Abstract

In the original k-means algorithm, the initial centroids are chosen at random from the input data set. This random selection can lead the algorithm into a local optimum, and the final clustering results can differ from one run to the next. This limitation needs to be addressed in order to make the k-means algorithm more efficient. The mid-point has been used as a metric for computing the initial centroids; that algorithm suits a wide variety of problems, but not all of them. Because it computes the mid-point of different subsets of the data set, it is best suited to problems where the input data is regularly or uniformly distributed across the space. Where the input data is irregularly or non-uniformly distributed, it does not produce appropriate results. This paper proposes the mean as the metric for choosing the initial centroids and compares the two algorithms.
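The difference between the two seeding rules can be illustrated with a minimal Python sketch. The abstract does not say how the subsets are formed or how the mid-point is defined, so the sketch makes two assumptions: points are sorted by distance from the origin and cut into k nearly equal subsets, and the mid-point of a subset is taken per coordinate as (min + max) / 2. All function names are hypothetical, and this is not the authors' implementation.

```python
def split_into_subsets(points, k):
    """Sort points by squared distance from the origin and cut them
    into k nearly equal, contiguous subsets (an assumed partitioning)."""
    ordered = sorted(points, key=lambda p: sum(x * x for x in p))
    n = len(ordered)
    bounds = [i * n // k for i in range(k + 1)]
    return [ordered[bounds[i]:bounds[i + 1]] for i in range(k)]


def midpoint_centroid(subset):
    """Assumed mid-point rule: halfway between min and max per coordinate."""
    dims = range(len(subset[0]))
    return tuple(
        (min(p[d] for p in subset) + max(p[d] for p in subset)) / 2
        for d in dims
    )


def mean_centroid(subset):
    """Mean rule: arithmetic mean of the subset per coordinate."""
    dims = range(len(subset[0]))
    return tuple(sum(p[d] for p in subset) / len(subset) for d in dims)


def initial_centroids(points, k, rule):
    """Apply a seeding rule to each of the k subsets (requires len(points) >= k)."""
    return [rule(subset) for subset in split_into_subsets(points, k)]


# Non-uniform toy data: three points close together plus one far away
# in the first subset, and four evenly spaced points in the second.
points = [(1, 1), (1, 2), (2, 1), (10, 10),
          (20, 20), (21, 21), (22, 22), (23, 23)]

print(initial_centroids(points, 2, midpoint_centroid))
print(initial_centroids(points, 2, mean_centroid))
```

On this toy data the evenly spaced second subset yields the same seed, (21.5, 21.5), under both rules. In the first subset the single distant point pulls the mid-point seed to (5.5, 5.5), while the mean seed, (3.5, 3.5), stays closer to where most of the points lie. This is consistent with the abstract's claim that the mid-point rule fits uniformly distributed data better than non-uniform data.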

Authors and Affiliations

Kirti Aggarwal, Neha Aggarwal

How To Cite

Kirti Aggarwal, Neha Aggarwal (2012). Comparative analysis of mid-point based and proposed mean based K-Means Clustering Algorithm for Data Mining. International Journal of Computational Engineering and Management IJCEM, 15(4), 71-74. https://europub.co.uk/articles/-A-97983