Fast and Highly Scalable Multiresolution Linear Word based Clustering in Multidimensional data

Abstract

Clustering problems are well known in the database literature for their use in numerous applications, and multidimensional data remains a challenge for clustering algorithms. Halite is a fast and scalable clustering method that looks for clusters in subspaces of multidimensional data. It organizes the data in a Counting-tree: the tree root corresponds to a hypercube embodying the full data set, the next level divides the space into a set of 2^d hypercubes, and the resulting hypercubes are divided again, generating the tree structure. The bump hunting task applies, for each level of the Counting-tree, one d-dimensional Laplacian mask over the respective grid to spot bumps at the respective resolution. Specifically, the main contributions of Halite are: (1) Scalability: it is linear in time and space with respect to the data size and the dimensionality of the clusters' subspaces. (2) Usability: it is deterministic, robust to noise, does not take the number of clusters as an input parameter, and detects clusters in subspaces generated by the original axes or by their linear combinations, including space rotation. (3) Effectiveness: it is accurate, providing results of equal or better quality; this is achieved through a word-based approach. (4) Generality: it includes a soft clustering approach.
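The grid counting and bump hunting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the data are normalized to the unit hypercube [0, 1]^d, that level h splits every axis into 2^h parts, and that the Laplacian mask uses only the center cell (weight 2d) and its 2d face neighbors (weight -1 each). The function names count_cells, laplacian_response and find_bump are hypothetical.

```python
from collections import Counter


def count_cells(points, level):
    """Count points per grid cell when each axis of [0, 1]^d is split
    into 2**level parts (assumed normalization, one tree level)."""
    side = 2 ** level
    counts = Counter()
    for p in points:
        # Clamp so that a coordinate equal to 1.0 falls in the last cell.
        cell = tuple(min(int(x * side), side - 1) for x in p)
        counts[cell] += 1
    return counts


def laplacian_response(counts, cell):
    """Assumed face-only Laplacian mask: weight 2d at the center cell
    and -1 at each of its 2d face neighbors."""
    d = len(cell)
    response = 2 * d * counts.get(cell, 0)
    for axis in range(d):
        for step in (-1, 1):
            neighbor = cell[:axis] + (cell[axis] + step,) + cell[axis + 1:]
            # Empty or out-of-grid neighbors contribute zero.
            response -= counts.get(neighbor, 0)
    return response


def find_bump(points, level):
    """Return the non-empty cell with the largest mask response."""
    counts = count_cells(points, level)
    return max(counts, key=lambda c: laplacian_response(counts, c))


# Toy 2-dimensional data: a dense group near (0.3, 0.3) plus scattered points.
points = [
    (0.30, 0.30), (0.31, 0.32), (0.33, 0.29), (0.35, 0.34), (0.28, 0.36),
    (0.05, 0.90), (0.80, 0.10), (0.90, 0.90), (0.60, 0.55),
]

for level in (1, 2):
    print(level, find_bump(points, level))
```

Only non-empty cells are stored, so memory grows with the number of points rather than with the 2^(d*h) cells of a full grid at level h. The sketch stops at locating the densest cell per level; how bumps are turned into subspace clusters is not covered here.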

Authors and Affiliations

P. Rubi, M. Govindaraj

How To Cite

P. Rubi, M. Govindaraj (2014). Fast and Highly Scalable Multiresolution Linear Word based Clustering in Multidimensional data. International Journal of Innovative Research in Computer Science and Technology, 2(3), -. https://europub.co.uk/articles/-A-749392