Highly Available Hadoop Name Node Architecture-Using Replicas of Name Node with Time Synchronization among Replicas

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 3

Abstract

Hadoop is a Java software framework that supports data-intensive distributed applications and is developed under an open source license. It enables applications to work with thousands of nodes and petabytes of data. The two major pieces of Hadoop are HDFS and MapReduce. HDFS works with two types of machines: the DataNode (slave machine), on which application data is stored, and the NameNode (master machine), which stores the metadata of the file system. The NameNode is the only machine that stores the file system metadata, which makes it the Single Point of Failure (SPOF) for HDFS and affects the overall availability of Hadoop. When the NameNode goes down, the entire system goes offline and cannot perform any operation until the NameNode is restarted. Because a failed NameNode must be restarted manually, the system is less available. This paper proposes a highly available architecture, and its working principle, for the HDFS NameNode against its SPOF, using the well-known Two-Phase Commit (2PC) protocol and bully election with a time synchronization mechanism.
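
The abstract names two building blocks, 2PC and bully election, without giving their details. The following is a minimal sketch of how they could fit together among NameNode replicas; it is not the paper's implementation. The names `Replica`, `bully_elect`, and `two_phase_commit` are hypothetical, failures are simulated with an `alive` flag, and the time synchronization step is omitted because the abstract does not describe it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Replica:
    """Hypothetical NameNode replica holding a copy of the file system metadata."""
    node_id: int
    alive: bool = True
    metadata: dict = field(default_factory=dict)
    staged: Optional[tuple] = None  # edit prepared but not yet committed

    def prepare(self, key, value):
        # Phase 1: vote yes only if this replica is reachable (assumption).
        if not self.alive:
            return False
        self.staged = (key, value)
        return True

    def commit(self):
        # Phase 2 (commit): apply the staged edit to the local metadata.
        if self.staged is not None:
            key, value = self.staged
            self.metadata[key] = value
            self.staged = None

    def abort(self):
        # Phase 2 (abort): discard the staged edit.
        self.staged = None


def bully_elect(replicas):
    """Bully rule: the live replica with the highest id becomes the active NameNode."""
    live = [r for r in replicas if r.alive]
    if not live:
        raise RuntimeError("no live replica available")
    return max(live, key=lambda r: r.node_id)


def two_phase_commit(participants, key, value):
    """Apply a metadata edit on every participant, or on none of them."""
    votes = [p.prepare(key, value) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

In this sketch, an edit is committed only when every replica votes yes, so all replicas hold identical metadata and any of them can take over. If the active replica's `alive` flag is cleared, calling `bully_elect` again returns the live replica with the next-highest id, which stands in for automatic failover without a manual restart.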

Authors and Affiliations

Soubhagya V N, Nikhila T Bhuvan

  • EP ID EP99757
  • DOI 10.9790/0661-16325862

How To Cite

Soubhagya V N, Nikhila T Bhuvan (2014).  Highly Available Hadoop Name Node Architecture-Using Replicas of Name Node with Time Synchronization among Replicas. IOSR Journals (IOSR Journal of Computer Engineering), 16(3), 58-62. https://europub.co.uk/articles/-A-99757