Speed-up Extension to Hadoop System
Journal Title: INTERNATIONAL JOURNAL OF ENGINEERING TRENDS AND TECHNOLOGY - Year 2014, Vol 12, Issue 2
Abstract
Organizations increasingly rely on Apache Hadoop and HDFS to store and analyze online or streaming data that is too large for conventional systems. Applications such as log processors and search engines use Hadoop MapReduce for computation and HDFS for storage. Although Hadoop is the most popular platform for storing, processing, and analyzing very large data sets, it still leaves room for improvement. This work addresses two problems, one in data storage and one in data processing, in order to raise Hadoop's processing speed and reduce task execution time. First, Hadoop applications require streaming access to data files, yet Hadoop's default placement policy does not consider any data characteristics; if related files were stored on the same set of nodes, efficiency could be increased and access latency reduced. Second, Hadoop uses the MapReduce framework for large-scale distributed computing on unpredicted data sets, and this process can perform duplicate computations; no mechanism exists to identify them, so they increase processing time. To solve the first problem, related files are co-located according to their content: a clustering-based locality-sensitive hashing algorithm tries to place related file streams on the same set of nodes without affecting Hadoop's default scalability and fault-tolerance properties. To avoid duplicate computation, a mechanism stores each executed task together with its result; before any task runs, it is compared against the stored tasks, and if a match is found the stored result is returned directly. Storing related files in the same cluster improves data locality, and avoiding repeated task execution reduces processing time; together they speed up Hadoop execution.
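The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use any Hadoop API. It assumes MinHash with banding as the locality-sensitive hash (the abstract does not name a specific scheme), and every name in it (`Placer`, `ResultCache`, `NUM_HASHES`, `BANDS`, the node names) is hypothetical.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

NUM_HASHES = 16  # assumed MinHash signature length
BANDS = 4        # assumed number of LSH bands (4 rows per band)


def _hash(seed: int, token: str) -> int:
    """Seeded 64-bit hash of a token, built from SHA-1."""
    digest = hashlib.sha1(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(text: str) -> List[int]:
    """MinHash signature over the set of whitespace-separated tokens."""
    tokens = set(text.lower().split()) or {""}
    return [min(_hash(seed, t) for t in tokens) for seed in range(NUM_HASHES)]


def band_keys(signature: List[int]) -> List[Tuple[int, Tuple[int, ...]]]:
    """Split a signature into bands; files sharing any band count as related."""
    rows = NUM_HASHES // BANDS
    return [(b, tuple(signature[b * rows:(b + 1) * rows])) for b in range(BANDS)]


class Placer:
    """Assigns files to node groups so that related content is co-located."""

    def __init__(self, node_groups: List[List[str]]) -> None:
        self.node_groups = node_groups
        self.band_index: Dict[Tuple[int, Tuple[int, ...]], int] = {}
        self.next_group = 0

    def place(self, content: str) -> List[str]:
        keys = band_keys(minhash_signature(content))
        # Reuse the group of any earlier file that shares a band with this one.
        group = next((self.band_index[k] for k in keys if k in self.band_index), None)
        if group is None:
            # No related file seen yet: fall back to round-robin placement.
            group = self.next_group % len(self.node_groups)
            self.next_group += 1
        for k in keys:
            self.band_index.setdefault(k, group)
        return self.node_groups[group]


class ResultCache:
    """Stores executed tasks with their results and skips repeated work."""

    def __init__(self) -> None:
        self._results: Dict[str, object] = {}
        self.executions = 0  # how many times a task body actually ran

    def run(self, task_name: str, func: Callable[[str], object], data: str) -> object:
        key = hashlib.sha256(repr((task_name, data)).encode("utf-8")).hexdigest()
        if key in self._results:
            return self._results[key]  # task seen before: return stored result
        self.executions += 1
        result = func(data)
        self._results[key] = result
        return result
```

In this sketch, files with identical content always land in the same node group, while near-duplicates do so only with high probability, which is the usual trade-off of locality-sensitive hashing. The cache key is built from the task name and its input, so a task counts as a duplicate only when both match.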
Authors and Affiliations
Sayali Ashok Shivarkar