Identifying Cancer Biomarkers Via Node Classification within a Mapreduce Framework

Abstract

Big data pose new research challenges in the life sciences because of their variety, volume, veracity, velocity, and value. Predicting gene biomarkers is one of the vital research issues in bioinformatics, where microarray gene expression and network-based methods can be used. These datasets are very large, which causes main-memory problems. In this paper, a Random Committee Node Classifier (RCNC) algorithm is proposed for identifying cancer biomarkers, based on microarray gene expression data and Protein-Protein Interaction (PPI) data. The data are enriched from other public databases, such as IntAct, UniProt, and Gene Ontology (GO). When applied to different datasets, RCNC identifies cancer biomarkers with 99.16% accuracy, 99.96% precision, 99.24% recall, a 99.16% F1-measure, and an ROC value of 99.6. To speed up performance, the algorithm is run within a MapReduce framework, where the MapReduce version of RCNC is much faster than the sequential version on large datasets.

Authors and Affiliations

Taysir Soliman

  • EP ID EP101255
  • DOI 10.14569/IJACSA.2015.061225

How To Cite

Taysir Soliman (2015). Identifying Cancer Biomarkers Via Node Classification within a Mapreduce Framework. International Journal of Advanced Computer Science & Applications, 6(12), 184-189. https://europub.co.uk/articles/-A-101255