DETERMINING THE NUMBER OF CLUSTERS FOR A K-MEANS CLUSTERING ALGORITHM
Journal Title: Indian Journal of Computer Science and Engineering - Year 2012, Vol 3, Issue 5
Abstract
Clustering is a process used to divide data into a number of groups. Every data point has some mathematical parameters according to which grouping can be done. For instance, if we have a number of points on a two-dimensional grid, the x and y coordinates of the points are the parameters according to which clustering is done. If the k-means algorithm is run with k=3, the data points are split into 3 groups such that the total within-group variance (the sum of squared distances from each point to the mean of its group) is minimized. The problem here, of course, is the choice of the parameter k. We may get a much better model of the data if we split the data points into 2 or 4 groups. Determining the ‘best’ value of k is a broad problem: there is no obvious criterion according to which this can be done. This paper looks at a new, efficient approach to determining the number of clusters.
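To make the dependence on k concrete, the following is a minimal sketch of standard k-means (Lloyd's iteration) on two-dimensional points, reporting the within-cluster sum of squares for several values of k. It is not the approach proposed in the paper, which the abstract does not describe. The function names `sq_dist`, `kmeans` and `best_wcss`, the toy data set, and the use of random restarts are all assumptions made for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's iteration; returns (centroids, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: start from k random data points
    for _ in range(iters):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    wcss = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, wcss


def best_wcss(points, k, restarts=10):
    """Lowest objective over several seeds, since one run may stop in a local minimum."""
    return min(kmeans(points, k, seed=s)[1] for s in range(restarts))


# Toy data (an assumption): three small groups of points on a 2-D grid.
POINTS = [
    (0, 0), (1, 0), (0, 1), (1, 1),
    (10, 0), (11, 0), (10, 1), (11, 1),
    (0, 10), (1, 10), (0, 11), (1, 11),
]

if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        print(k, best_wcss(POINTS, k))
```

The objective can only be driven lower by adding clusters, reaching zero when every point is its own cluster, so the minimized value alone cannot select k. That is why a separate criterion for choosing k is needed.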
Authors and Affiliations
Abhijit Kane
APPLICABILITY OF CLOUD COMPUTING IN ACADEMIA
The Indian Education sector has seen a tremendous rise in the field of higher education which has led to the demand for the automation of education sector at all the levels in order to cater to the need of information of...
LS and MMSE based Localization Algorithm for WSNs amid obstacles
In recent years, optimization and Wireless Sensor Networks (WSNs) are extensively used in numerous milieus and hostile topographies. In this paper, we proposed an improved localization algorithm by means of Least square...
STUDY AND PERFORMANCE ANALYSIS OF THE WYLLIE’S LIST RANKING ALGORITHM USING VARIOUS PARALLEL PROGRAMMING MODELS
Wyllie’s list ranking algorithm takes a linked list data structure as input and passes the linked list successor elements to the succ1 array to find the rank. The algorithm depends upon the Pointer jumping operat...
ON SOME COMPARATIVE RESULTS OF REGIONAL HEXAGONAL TILE REWRITING GRAMMARS
Regional hexagonal tile rewriting grammars (RHTRG) are recently introduced hexagonal picture generating devices which use a simple type of tiling called regional hexagonal tiling. This model is having isometric rules...
A NOVEL APPROACH TO TEST SUITE REDUCTION USING DATA MINING
Software testing is the most important and time consuming part of software development lifecycle. The time spent in testing is mainly concerned with generating the test cases and testing them. Our goal is to reduce the t...