THE PROBLEM OF HIGH DIMENSIONALITY WITH LOW DENSITY IN CLUSTERING
Journal Title: International Journal of Engineering, Science and Mathematics - Year 2012, Vol 2, Issue 2
Abstract
In many real-world applications, there are a number of dimensions having large variations in a dataset. The dimensions of the large variations scatter the cluster and confuse the distance between two samples in a dataset. This degrades the performances of many existing algorithms. This problem can be happened even when the number of dimensions of a dataset is small. Moreover, no existing method can distinguish whether the dataset has the highly repeated problem or low-density‟s problem. The only way to distinguish the problem is by a prior knowledge, which is given by the user. There are many methods to resolve this type of high dimensionality problem. The common way is to prune the non-significant features so that the features having large variations are removed and high-density cluster centers are obtained. Much research work has been carried out based on this criterion. The subspace clustering method is one of the well-known tools. The feature space is first partitioned into a number of equal length grids. Then, the density of each interval is measured. The features having low density are discarded and the clustering is conducted on the high density regions. Although these methods work very well on synthetic datasets, the pruned dimensions can carry useful information and hence, pruning them may increase the classification error rates.
Authors and Affiliations
Prof. T. Sudha and Swapna Sree Reddy. Obili
“M-LearNiNg” - A Buzzword in Computer Technology
With the exponential development in technology, there is always the danger that the equilibrium between technology and learning is disturbed. But Mobile Learning or short M-Learning creates completely new possibilities...
A study on role of transformational leadership behaviors across cultures in effectively solving the issues in Mergers and Acquisitions
As and when Merger and acquisition comes to a discussion, financial aspects will take a lead and less will be the humanitarian approach. By understanding this fact there are considerable research work which started foc...
Challenges in Allocation of Resources in Future Cellular Mobile Networks
Efficient resource sharing is very important in mobile networks to provide satisfactory service to customers. The resource demand is increasing due to increasing number of user population and high bandwidth demanding m...
Identification of Paraphrasing in the context of Plagiarism
Paraphrasing is a very important form of processing for natural language processing (NLP). A characteristic property of natural language is that various expressions can exist to express a single concept. The aim of thi...
Challenges and Opportunities of Technology Transfer Management
Today, technology has become so rich that many of the developing countries can , t even afford to have all the facilities, It is admitted that these changes and takeovers affected economical, industrial, security, tech...