Paralyzing Bioinformatics Applications Using Conducive Hadoop Cluster

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2013, Vol 14, Issue 6

Abstract

Bioinformatics may be defined as the application of computer science to molecular biology in the form of statistics and analytics. Bioinformatics applications deal with very large amounts of data. Researchers now face problems analyzing such ultra-large-scale data sets, and these problems will only grow in the coming years. Moreover, processing, storing, and analyzing petabytes of data without much delay is a major challenge. Most bioinformatics algorithms are sequential, which makes the situation worse and makes data manipulation on uniprocessor systems impractical. However, most biological problems are parallel in nature, so a practical and effective approach is to use parallel clusters of workstations. Hadoop can be used to tackle this class of problems with good performance and scalability, and could serve as the basis of a parallel computing platform for several problems in bioinformatics. Normally, Hadoop is deployed on high-performance computing systems, which are expensive and involve complex deployment scenarios that only large enterprises can afford. Smaller research organizations, for which cost is an important factor, cannot choose systems with high computational capabilities for cluster setup. Rocks Cluster is a viable solution in such scenarios. Rocks Cluster Distribution, originally called NPACI Rocks, is a Linux distribution intended for high-performance computing clusters. This paper implements a cost-effective cluster for parallelizing bioinformatics applications by deploying Hadoop over a Rocks cluster, and justifies the use of commodity clusters for this purpose. Results show that parallelizing a bioinformatics application saves considerable time compared with standalone execution while keeping cost low.
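To illustrate why many sequence-analysis tasks fit the map/reduce model that Hadoop provides, the following is a minimal sketch in plain Python, not the paper's implementation. It counts k-mers across reads: each read is mapped independently, which is the property that lets the work be spread over cluster nodes. The task (k-mer counting), the function names, the value of K, and the sample reads are all assumptions chosen for illustration; the abstract does not say which application was parallelized.

```python
from collections import defaultdict
from itertools import chain

K = 3  # k-mer length; an illustrative choice, not taken from the paper


def map_kmers(read):
    """Map step: emit (k-mer, 1) for every k-mer in one read."""
    return [(read[i:i + K], 1) for i in range(len(read) - K + 1)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts emitted for one k-mer."""
    return key, sum(values)


def count_kmers(reads):
    # Each read is mapped independently of the others, so on a real
    # cluster the map calls could run on different nodes.
    mapped = chain.from_iterable(map_kmers(r) for r in reads)
    return dict(reduce_counts(k, v) for k, v in shuffle(mapped).items())


reads = ["ACGTAC", "GTACGT"]  # made-up reads, for illustration only
counts = count_kmers(reads)
```

In this sketch the three steps run in a single process. On a Hadoop cluster the framework would perform the grouping between map and reduce and distribute both phases across nodes.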

Authors and Affiliations

Bincy P Andrews

How To Cite

Bincy P Andrews (2013). Paralyzing Bioinformatics Applications Using Conducive Hadoop Cluster. IOSR Journals (IOSR Journal of Computer Engineering), 14(6), 89-93. https://europub.co.uk/articles/-A-141681