Fast and Highly Scalable Multiresolution Linear Word based Clustering in Multidimensional data

Abstract

Clustering problems are well known in the database literature for their use in numerous applications, and multidimensional data remains a challenge for clustering algorithms. Halite is a fast and scalable clustering method that looks for clusters in subspaces of multidimensional data. It summarizes the data in a multiresolution tree, the Counting-tree. The tree root corresponds to a hypercube embodying the full data set. The next level divides the space into a set of 2^d hypercubes. The resulting hypercubes are divided again, generating the tree structure. The bump-hunting task applies, at each level of the Counting-tree, one d-dimensional Laplacian mask over the respective grid to spot bumps at that resolution. Specifically, the main contributions of Halite are: Scalability: it is linear in time and space with respect to the data size and the dimensionality of the clusters' subspaces. Usability: it is deterministic, robust to noise, does not take the number of clusters as an input parameter, and detects clusters in subspaces generated by the original axes or by their linear combinations, including space rotation. Effectiveness: it is accurate, providing results of equal or better quality; this is achieved through a word-based approach. Generality: it includes a soft clustering approach.
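
The grid counting and bump-hunting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the points are normalized to the unit hypercube [0, 1)^d, that level h splits each axis into 2^h intervals, and that the Laplacian mask has weight 2d at the center cell and -1 at each of the 2d face neighbors. The names count_cells, laplacian_response, and find_bump are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple

Cell = Tuple[int, ...]


def count_cells(points: Sequence[Sequence[float]], level: int) -> Dict[Cell, int]:
    """Count points per grid cell at one resolution level.

    Assumption: points lie in [0, 1)^d and level h splits each axis
    into 2**h equal intervals, so the grid has (2**h)**d cells.
    """
    side = 2 ** level
    counts: Counter = Counter()
    for p in points:
        # Clamp so that a coordinate equal to 1.0 falls in the last cell.
        cell = tuple(min(int(x * side), side - 1) for x in p)
        counts[cell] += 1
    return counts


def laplacian_response(counts: Dict[Cell, int], cell: Cell) -> int:
    """Apply a face-neighbor Laplacian mask centered on one cell.

    Assumption: center weight is 2*d and each of the 2*d face
    neighbors has weight -1; cells outside the grid count as empty.
    """
    d = len(cell)
    total = 2 * d * counts.get(cell, 0)
    for axis in range(d):
        for step in (-1, 1):
            neighbor = cell[:axis] + (cell[axis] + step,) + cell[axis + 1:]
            total -= counts.get(neighbor, 0)
    return total


def find_bump(points: Sequence[Sequence[float]], level: int) -> Cell:
    """Return the non-empty cell with the largest mask response."""
    counts = count_cells(points, level)
    return max(counts, key=lambda c: laplacian_response(counts, c))


# Five points packed near the origin plus two scattered noise points.
points = [
    (0.10, 0.10), (0.12, 0.11), (0.05, 0.20), (0.20, 0.15), (0.15, 0.05),
    (0.90, 0.90), (0.60, 0.30),
]
bump = find_bump(points, level=2)
```

The sketch stores only non-empty cells, so the cost of one level grows with the number of points rather than with the full grid size. Repeating find_bump for several values of level mimics scanning the tree one resolution at a time.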

Authors and Affiliations

P. Rubi, M. Govindaraj

How To Cite

P. Rubi, M. Govindaraj (2014). Fast and Highly Scalable Multiresolution Linear Word based Clustering in Multidimensional data. International Journal of Innovative Research in Computer Science and Technology, 2(3), -. https://europub.co.uk/articles/-A-749392