Efficient Way of Determining the Number of Clusters Using Hadoop Architecture
Journal Title: UNKNOWN - Year 2015, Vol 4, Issue 2
Abstract
Data mining extracts information from a data set and transforms it into an understandable structure. Clustering plays an important role in many areas, such as exploratory data analysis, pattern recognition, computer vision, and information retrieval. The key idea is to view clustering as a supervised classification problem, in which we estimate the “true” class labels. Determining the valid number of clusters is difficult. Several well-known methods, such as the Gap statistic, path-based clustering, and the Figure of Merit (FOM), have been used to estimate the number of clusters, but they do not solve the problem efficiently. This paper focuses on the “Average Intracluster Distance” index to validate the estimated number of arbitrarily shaped clusters. The proposed technique, implemented on Hadoop, is based on the local relations between patterns and their clustering labels. It uses a Minimum Spanning Tree (MST) algorithm that exploits the multiplicity property of the MST to obtain accurate results efficiently.
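The abstract does not define the index or the MST procedure in detail, so the following is only a minimal single-machine sketch of the general idea, not the paper's Hadoop implementation. It assumes that clusters are obtained by cutting the k-1 heaviest MST edges and that the “Average Intracluster Distance” is the mean Euclidean distance over all pairs of points sharing a label. The function names `mst_edges`, `mst_clusters`, and `average_intracluster_distance` are hypothetical, and the multiplicity property mentioned in the abstract is not modeled here.

```python
import math
from itertools import combinations


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the MST as a list of (weight, i, j) tuples.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def mst_clusters(points, k):
    """Assumed clustering rule: drop the k-1 heaviest MST edges,
    then label each point by its connected component."""
    edges = sorted(mst_edges(points))
    kept = edges[:len(edges) - (k - 1)]
    root = list(range(len(points)))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path compression
            x = root[x]
        return x

    for _, i, j in kept:
        root[find(i)] = find(j)
    return [find(i) for i in range(len(points))]


def average_intracluster_distance(points, labels):
    """Assumed index definition: mean distance over same-label pairs.
    Returns 0.0 when every cluster is a singleton."""
    dists = [
        math.dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
        if labels[i] == labels[j]
    ]
    return sum(dists) / len(dists) if dists else 0.0


# Two well-separated groups of three points each (illustrative data only).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
for k in range(1, 4):
    labels = mst_clusters(points, k)
    print(k, average_intracluster_distance(points, labels))
```

Under these assumptions, one would look for the value of k at which the index stops dropping sharply. In a Hadoop setting, the pairwise distance computation is the part that would most naturally be distributed across mappers, but how the paper partitions the work is not stated in the abstract.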