PROCESSING IMAGE FILES USING SEQUENCE FILE IN HADOOP
Journal Title: International Journal of Engineering Sciences & Research Technology - Year 30, Vol 5, Issue 10
Abstract
This paper presents MapReduce as a distributed data processing model, using the open source Hadoop framework to work with huge volumes of data. The expansive volume of data in the modern world, especially multimedia data, creates new requirements for processing and storage. As an open source distributed computational framework, Hadoop provides the basic infrastructure for processing large numbers of images on an unbounded set of computing nodes. We have a large number of small image files and need to remove duplicate files from the available data. Most binary formats, particularly those that are compressed or encrypted, cannot be split and must be read as a single linear stream of data. Using such files as input to a MapReduce job means that a single mapper will process the entire file, causing a potentially large performance hit. The paper proposes a splittable format, SequenceFile, and uses the MD5 algorithm to improve the performance of image processing.
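The duplicate-removal idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the Hadoop API: it simulates the map, shuffle, and reduce steps in plain Python, assuming each record is a (filename, image bytes) pair such as one might store in a SequenceFile. The function names `map_image`, `reduce_duplicates`, and `deduplicate`, the sample records, and the choice to keep the alphabetically first filename are all hypothetical.

```python
import hashlib
from collections import defaultdict


def map_image(filename, image_bytes):
    """Map step (assumed): emit (MD5 hex digest, filename) for one image record."""
    return hashlib.md5(image_bytes).hexdigest(), filename


def reduce_duplicates(digest, filenames):
    """Reduce step (assumed): keep one filename per digest, here the smallest name."""
    return min(filenames)


def deduplicate(records):
    """Simulate the shuffle by grouping filenames by digest, then reduce each group."""
    groups = defaultdict(list)
    for filename, image_bytes in records:
        digest, name = map_image(filename, image_bytes)
        groups[digest].append(name)
    return sorted(reduce_duplicates(d, names) for d, names in groups.items())


# Hypothetical sample data: a.jpg and c.jpg have identical contents.
records = [
    ("a.jpg", b"\xff\xd8 image one"),
    ("b.jpg", b"\xff\xd8 image two"),
    ("c.jpg", b"\xff\xd8 image one"),
]
unique = deduplicate(records)
print(unique)  # ['a.jpg', 'b.jpg']
```

Because the digest is computed over the raw bytes, this approach detects only byte-identical files; two visually identical images with different encodings or metadata would be treated as distinct.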
Authors and Affiliations
Dr. E. Laxmi Lydia