Density based Clustering Algorithm for Distributed Datasets using Mutual k-Nearest Neighbors
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2019, Vol 10, Issue 3
Abstract
Privacy and security concerns have long prevented data sharing and impeded the success of many projects. Distributed knowledge computing, done correctly, can play a key role in solving this problem: the main goal is to obtain valid results while ensuring that the data are not disclosed. Density-based clustering is a powerful technique for analyzing the uncertain data that occur naturally and affect the performance of many applications, such as location-based services. A large number of datasets are now available to researchers that contain high-dimensional data points with varying densities, in which high-density regions are surrounded by sparsely distributed points. Existing clustering approaches handle these situations inefficiently, especially when the data are distributed. In this paper, we design a new decomposable density-based clustering algorithm for distributed datasets (DDBC). DDBC uses the mutual k-nearest neighbor relationship to cluster distributed datasets with differing densities. The proposed DDBC algorithm can preserve the privacy and security of the data on each site by requiring only a minimal number of transmissions to other sites.
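To make the mutual k-nearest neighbor relationship concrete, the following is a minimal single-site sketch in Python. It is not the DDBC algorithm itself, since the abstract does not give its steps. It only illustrates the underlying relation: two points are mutual neighbors when each appears in the other's k-nearest-neighbor list. The function names (`k_nearest`, `mutual_knn_pairs`, `cluster_by_mutual_knn`), the use of Euclidean distance, and the choice to form clusters as connected components of the mutual graph are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def k_nearest(points, k):
    """For each point index, return the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        neighbors.append(set(others[:k]))
    return neighbors


def mutual_knn_pairs(points, k):
    """Return pairs (i, j), i < j, where each point is among the other's k nearest."""
    knn = k_nearest(points, k)
    return {
        (i, j)
        for i in range(len(points))
        for j in knn[i]
        if i < j and i in knn[j]
    }


def cluster_by_mutual_knn(points, k):
    """Label points by connected component of the mutual k-NN graph.

    Assumption for illustration only: a cluster is a connected component.
    Points with no mutual neighbor end up alone in their own component.
    """
    parent = list(range(len(points)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in mutual_knn_pairs(points, k):
        parent[find(i)] = find(j)
    return [find(i) for i in range(len(points))]
```

Because the relation must hold in both directions, a point in a sparse region that lists dense-region points among its nearest neighbors is not linked to them unless they list it back, which is why the relation suits data with varying densities.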
Authors and Affiliations
Ahmed Salim