A Review: Hadoop Storage and Clustering Algorithms

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2016, Vol 18, Issue 1

Abstract

In the last few years there has been a voluminous increase in the storing and processing of data, which demands both considerable processing speed and storage space. Big data is defined as large, diverse and complex data sets that pose problems of storage, analysis and visualization when processing results. Four characteristics of big data (Volume, Value, Variety and Velocity) make it difficult for traditional systems to process. Apache Hadoop is a promising software framework for developing applications that process huge amounts of data in parallel on large clusters of commodity hardware in a fault-tolerant and reliable manner. Various performance metrics such as reliability, fault tolerance, accuracy, confidentiality and security are improved with the use of Hadoop. Hadoop MapReduce is an effective computation model for processing large data on distributed data clusters such as clouds. We first introduce the general idea of big data and then review related technologies, such as cloud computing and Hadoop. Various clustering techniques are also analyzed based on parameters such as number of clusters, size of clusters, type of dataset and noise.

Authors and Affiliations

Latika Kakkar, Gaurav Mehta

How To Cite

Latika Kakkar, Gaurav Mehta (2016). A Review: Hadoop Storage and Clustering Algorithms. IOSR Journals (IOSR Journal of Computer Engineering), 18(1), 23-29. https://europub.co.uk/articles/-A-112189