Paralyzing Bioinformatics Applications Using Conducive Hadoop Cluster

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2013, Vol 14, Issue 6

Abstract

 Bioinformatics may be defined as the application of computer science to molecular biology in the form of statistics and analytics. Bioinformatics applications deal with bulk amounts of data. Researchers now face problems analyzing such ultra-large-scale data sets, a problem that will only grow at an alarming rate in the coming years. Moreover, processing, storing, and analyzing these petabytes of data without much delay is a major challenge. Most bioinformatics algorithms are sequential, which makes the situation worse. This implies that data manipulation on uniprocessor systems is impractical. However, most biological problems are parallel in nature. Hence a practical and effective approach involves using parallel clusters of workstations. Hadoop can be used to tackle this class of problems with good performance and scalability. This technology could be the basis of a parallel computational platform for several problems in the context of bioinformatics applications. Normally, Hadoop is deployed over high-performance computing systems, which are expensive and involve complex deployment scenarios that only big enterprises can manage. Smaller research organizations, for which cost is an important factor, cannot choose systems with high computational capabilities for cluster setup. Rocks Cluster is a viable solution in such scenarios. Rocks Cluster Distribution, originally called NPACI Rocks, is a Linux distribution intended for high-performance computing clusters. This paper implements a cost-effective cluster for parallelizing bioinformatics applications by deploying Hadoop over a Rocks cluster, and it emphasizes the use of commodity clusters for parallelizing bioinformatics applications, providing the necessary justification. Results show that parallelizing a bioinformatics application saves much time compared to the standalone mode of execution, at optimal cost.

Authors and Affiliations

Bincy P Andrews

  • EP ID EP141681

How To Cite

Bincy P Andrews (2013).  Paralyzing Bioinformatics Applications Using Conducive Hadoop Cluster. IOSR Journals (IOSR Journal of Computer Engineering), 14(6), 89-93. https://europub.co.uk/articles/-A-141681