A Novel Benchmark K-Means Clustering on Continuous Data
Journal Title: International Journal on Computer Science and Engineering - Year 2011, Vol 3, Issue 8
Abstract
Cluster analysis is one of the prominent techniques in the field of data mining, and k-means is one of the best-known and most widely used partition-based clustering algorithms. The performance of the k-means algorithm is affected when clustering continuous data. In this paper, a novel approach for performing k-means clustering on continuous data is proposed. It organizes all the continuous data items in a sorted structure so that the items closest to a given centroid can be found efficiently. The key intuition behind this approach is to calculate the distance from the origin to each data point in the data set. The data set is partitioned into k clusters of equal size with initial centroids, and the centroids are all updated at once to the closest ones according to newly calculated distances from the data set. The experimental results demonstrate that the proposed approach improves on the direct k-means algorithm in both the total number of distance calculations and the overall computation time, particularly when handling continuous data.
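The abstract does not give the algorithm in detail, so the following is only a minimal sketch of one possible reading: sort the points by their Euclidean distance from the origin, split the sorted list into k near-equal chunks, and take each chunk's mean as an initial centroid. The refinement loop is the standard (Lloyd-style) k-means iteration, used here as a stand-in and not as the authors' update rule. All function names (`origin_distance`, `initial_centroids`, `kmeans`) are hypothetical and chosen for illustration.

```python
import math


def origin_distance(point):
    """Euclidean distance from the origin to a point."""
    return math.sqrt(sum(x * x for x in point))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def initial_centroids(points, k):
    """Sort by distance from the origin, split into k near-equal
    chunks, and return the mean of each chunk (assumed reading)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    ordered = sorted(points, key=origin_distance)
    n = len(ordered)
    bounds = [i * n // k for i in range(k + 1)]
    return [mean(ordered[bounds[i]:bounds[i + 1]]) for i in range(k)]


def kmeans(points, k, max_iter=100):
    """Standard k-means iteration seeded with the sorted-chunk centroids."""
    centroids = initial_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update all centroids at once from the new assignments.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean(members)
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, labels = kmeans(data, k=2)
    print(centroids, labels)
```

This sketch uses the sorted order only for initialization. The paper's claimed reduction in distance calculations presumably relies on exploiting that order during assignment as well, which is not reproduced here.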
Authors and Affiliations
K. Prasanna, M. Sankara Prasanna Kumar, G. Surya Narayana
Image Mining using Content Based Image Retrieval System
The image depends on the Human perception and is also based on the Machine Vision System. The Image Retrieval is based on the color Histogram, texture. The perception of the Human System of Image is based on the Human Ne...
Analyzing Motivation of Private Engineering College Students: A Fuzzy Logic Approach (A case study of private Engineering College)
A method for analyzing and comparing group of students motivation using fuzzy logic is proposed. A fuzzy inference system is designed and implemented using Simulink in Matlab[19] with fuzzy statistical analysis to includ...
LITERATURE SURVEY ON ENHANCING CLUSTER QUALITY
In this paper, we extensively study an important aspect of various clustering techniques: cluster quality. The goodness of clustering is measured in terms of cluster validity indices where the results of cluste...
Efficiency of K-Means Clustering Algorithm in Mining Outliers from Large Data Sets
This paper presents the performance of k-means clustering algorithm, depending upon various mean values input methods. Clustering plays a vital role in data mining. Its main job is to group the similar data together base...
HYBRID FEATURE SELECTION FOR NETWORK INTRUSION
In Computer Communications, collecting and storing characteristics about connections into a data set is needed to analyze its behaviour. Generally this data set is multidimensional and larger in size. When this data set...