AN IMPROVED HYBRIDIZED K-MEANS CLUSTERING ALGORITHM (IHKMCA) FOR HIGH-DIMENSIONAL DATASET & ITS PERFORMANCE ANALYSIS
Journal Title: International Journal on Computer Science and Engineering - Year 2011, Vol 3, Issue 3
Abstract
In practice, the data objects around us are growing rapidly, and with them the number of features and attributes recorded in each data set. This in turn increases the dimensionality of the data sets. As the dimensionality grows, the problem known as the "curse of dimensionality" comes into the picture. For this reason, an improved and efficient dimension reduction technique is crucial for mining a high-dimensional data set. Numerous methods have been proposed, and many experimental analyses have been carried out, to find an efficient technique that reduces the dimension of a high-dimensional data set without distorting the original data. In this paper we propose the use of Canonical Variate Analysis, which reduces the dimensions of a high-dimensional data set efficiently and effectively. A modified k-means clustering technique is then applied to the reduced, low-dimensional data set. To obtain a more accurate result, we use a genetic algorithm to select the initial centroids of the Improved Hybridized K-Means Clustering Algorithm (IHKMCA). The results obtained with the proposed approach show better accuracy, higher efficiency, and lower time complexity than the other approaches considered.
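The pipeline described in the abstract (reduce the dimension, seed the centroids with a genetic algorithm, then run k-means) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract gives no algorithmic details, so every function name and parameter below is hypothetical, the genetic operators are a simple illustrative choice, and PCA via SVD is used as a stand-in for Canonical Variate Analysis, which the sketch does not implement.

```python
import numpy as np

def reduce_dims(X, n_components):
    """PCA via SVD. Stand-in only: the paper uses Canonical Variate Analysis."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def sse(X, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

def ga_init_centroids(X, k, pop_size=20, generations=30, mutation_rate=0.2, seed=0):
    """Toy genetic algorithm (assumed design): a chromosome is k row indices of X,
    and fitness is the SSE obtained when those rows are used as centroids."""
    rng = np.random.default_rng(seed)
    n = len(X)
    pop = [rng.choice(n, size=k, replace=False) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda idx: sse(X, X[idx]))
        parents = pop[: pop_size // 2]  # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.choice(len(parents), size=2, replace=False)
            pool = np.union1d(parents[a], parents[b])  # crossover: mix both parents' genes
            child = rng.choice(pool, size=k, replace=False)
            if rng.random() < mutation_rate:  # mutation: replace one gene with an unused point
                unused = np.setdiff1d(np.arange(n), child)
                child[rng.integers(k)] = rng.choice(unused)
            children.append(child)
        pop = parents + children
    best = min(pop, key=lambda idx: sse(X, X[idx]))
    return X[best].copy()

def kmeans(X, centroids, max_iter=100):
    """Standard Lloyd iterations starting from the supplied centroids."""
    centroids = centroids.copy()
    for _ in range(max_iter):
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each centroid; an empty cluster keeps its previous centroid.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                        for j in range(len(centroids))])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

if __name__ == "__main__":
    # Synthetic data for illustration only: three Gaussian blobs in 10 dimensions.
    rng = np.random.default_rng(1)
    centers = np.zeros((3, 10))
    centers[1, 0] = 10.0
    centers[2, 1] = 10.0
    X = np.vstack([c + rng.normal(size=(50, 10)) for c in centers])
    Z = reduce_dims(X, n_components=2)
    init = ga_init_centroids(Z, k=3)
    labels, final = kmeans(Z, init)
    print("cluster sizes:", np.bincount(labels, minlength=3))
    print("SSE at GA seeds:", sse(Z, init), "SSE after k-means:", sse(Z, final))
```

Because the genetic search only chooses where k-means starts, the final refinement is ordinary k-means. The intended benefit of the seeding step is a lower chance of converging to a poor local optimum than with purely random starting centroids.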
Authors and Affiliations
H. S. Behera, Rosly Boy Lingdoh, Diptendra Kodamasingh