THE PROBLEM OF HIGH DIMENSIONALITY WITH LOW DENSITY IN CLUSTERING

Abstract

In many real-world applications, a dataset contains a number of dimensions with large variations. These dimensions scatter the clusters and distort the distance between any two samples in the dataset, which degrades the performance of many existing algorithms. The problem can occur even when the number of dimensions in the dataset is small. Moreover, no existing method can distinguish whether a dataset suffers from the highly repeated problem or the low-density problem; the only way to tell them apart is through prior knowledge supplied by the user. Many methods have been proposed to resolve this type of high-dimensionality problem. The common approach is to prune the non-significant features, so that the features with large variations are removed and high-density cluster centers are obtained. Much research has been carried out on this criterion, and subspace clustering is one of the well-known tools. In this method, the feature space is first partitioned into a number of equal-length grids, and the density of each interval is then measured. Features with low density are discarded, and clustering is conducted on the high-density regions. Although these methods work very well on synthetic datasets, the pruned dimensions can carry useful information, and hence pruning them may increase the classification error rate.
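
The grid-based pruning step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the rule of keeping a feature when its densest interval holds at least a given fraction of the samples, and the toy data are all assumptions made for illustration.

```python
def interval_densities(values, n_bins):
    """Split the range of one feature into n_bins equal-length intervals
    and return the fraction of samples that falls into each interval."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if width == 0:
            idx = 0  # constant feature: every sample lands in the first interval
        else:
            # Clamp so the maximum value falls into the last interval.
            idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def dense_features(samples, n_bins, min_density):
    """Return the indices of features whose densest interval holds at least
    min_density of the samples (an assumed pruning rule for this sketch)."""
    n_features = len(samples[0])
    kept = []
    for j in range(n_features):
        column = [row[j] for row in samples]
        if max(interval_densities(column, n_bins)) >= min_density:
            kept.append(j)
    return kept


# Hypothetical toy data: feature 0 forms two tight groups (near 1 and near 5),
# while feature 1 is spread widely over its range.
samples = [
    (1.0, 0.5), (1.1, 3.9), (0.9, 7.2), (1.2, 9.8),
    (5.0, 2.1), (5.1, 5.5), (4.9, 8.4), (5.2, 0.1),
]

kept = dense_features(samples, n_bins=4, min_density=0.4)
print(kept)  # indices of the features retained for clustering
```

In this toy case only feature 0 passes the threshold, so clustering would proceed on that feature alone. The sketch also makes the abstract's caveat concrete: the discarded feature is dropped entirely, so any useful information it carried is lost.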

Authors and Affiliations

Prof. T. Sudha and Swapna Sree Reddy. Obili

How To Cite

Prof. T. Sudha and Swapna Sree Reddy. Obili (2012). THE PROBLEM OF HIGH DIMENSIONALITY WITH LOW DENSITY IN CLUSTERING. International Journal of Engineering, Science and Mathematics, 2(2), -. https://europub.co.uk/articles/-A-26568