Paralyzing Bioinformatics Applications Using Conducive Hadoop Cluster

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2013, Vol 14, Issue 6

Abstract

Bioinformatics may be defined as the application of computer science to molecular biology in the form of statistics and analytics. Bioinformatics applications deal with very large volumes of data. Researchers already face problems analyzing such ultra-large-scale data sets, and these problems will grow rapidly in the coming years. Moreover, processing, storing, and analyzing petabytes of data without significant delay is a major challenge. Most bioinformatics algorithms are sequential, which makes the situation worse and renders data manipulation on uniprocessor systems impractical. However, most biological problems are parallel in nature, so a practical and effective approach is to use parallel clusters of workstations. Hadoop can tackle this class of problems with good performance and scalability, and could serve as the basis of a parallel computational platform for several problems in bioinformatics. Normally, Hadoop is deployed on expensive high-performance computing systems with complex deployment scenarios that only large enterprises can manage. Smaller research organizations, for which cost is an important factor, cannot afford systems with high computational capabilities for cluster setup. Rocks Cluster is a viable solution in such scenarios. Rocks Cluster Distribution, originally called NPACI Rocks, is a Linux distribution intended for high-performance computing clusters. This paper implements a cost-effective cluster for parallelizing bioinformatics applications by deploying Hadoop over a Rocks cluster, and justifies the use of commodity clusters for this purpose. Results show that parallelizing a bioinformatics application saves considerable time compared with standalone execution while keeping costs low.
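The claim that many biological problems are parallel in nature can be illustrated with a minimal sketch of the map/shuffle/reduce pattern that Hadoop distributes across nodes. The example below is an assumption chosen for illustration, not the application evaluated in the paper: it counts k-mers (substrings of length k) in a set of sequences, and it emulates the three phases in a single local process using only the Python standard library. The function names `map_kmers`, `shuffle`, `reduce_counts`, and `count_kmers` are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(sequence, k=3):
    """Map step: emit (k-mer, 1) for every window of length k in one sequence."""
    seq = sequence.upper()
    return [(seq[i:i + k], 1) for i in range(len(seq) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as a MapReduce framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one k-mer."""
    return key, sum(values)


def count_kmers(sequences, k=3):
    # Each sequence is mapped independently of the others; on a cluster
    # these map calls could run on different nodes. Here they run locally.
    mapped = chain.from_iterable(map_kmers(s, k) for s in sequences)
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, vals) for key, vals in grouped.items())


if __name__ == "__main__":
    reads = ["ACGTAC", "GTACGT"]  # made-up example sequences
    print(count_kmers(reads, k=3))
```

Because no map call depends on any other sequence, the work divides cleanly across machines, which is the property that makes commodity clusters attractive for this class of problem.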

Authors and Affiliations

Bincy P Andrews

Keywords

Hadoop, Clusters, Rocks Cluster, commodity clusters, cluster environment, Big Data, bioinformatics

EP ID EP141681