Identifying Cancer Biomarkers Via Node Classification within a Mapreduce Framework

Abstract

Big data pose new research challenges in the life sciences because of their variety, volume, veracity, velocity, and value. Predicting gene biomarkers is a vital research problem in bioinformatics, where microarray gene expression and network-based methods can be used. These datasets are very large, which causes main-memory problems. In this paper, a Random Committee Node Classifier (RCNC) algorithm is proposed for identifying cancer biomarkers, based on microarray gene expression data and Protein-Protein Interaction (PPI) data. The data are enriched from other public databases, such as IntACT, UniProt, and Gene Ontology (GO). When applied to different datasets, the algorithm identifies cancer biomarkers with 99.16% accuracy, 99.96% precision, 99.24% recall, 99.16% F1-measure, and 99.6 ROC. To speed up performance, the algorithm is run within a MapReduce framework, where the RCNC MapReduce algorithm is much faster than the sequential RCNC algorithm on large datasets.
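As a rough illustration of the MapReduce pattern the abstract refers to, the following minimal Python sketch computes one simple node feature (the degree of each protein) from a PPI edge list in map, shuffle, and reduce steps. This is not the RCNC algorithm, whose details are not given here; the function names, the choice of node degree as a feature, and the toy edge list with placeholder protein labels are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_edge(edge):
    """Map step: emit (protein, 1) for both endpoints of one interaction."""
    a, b = edge
    return [(a, 1), (b, 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key (protein)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_degree(protein, counts):
    """Reduce step: sum the counts to obtain the protein's degree."""
    return protein, sum(counts)


def node_degrees(edges):
    """Run map, shuffle, and reduce in-process over an edge list."""
    mapped = chain.from_iterable(map_edge(e) for e in edges)
    return dict(reduce_degree(k, v) for k, v in shuffle(mapped).items())


# Hypothetical toy PPI edges; P1..P4 are placeholder labels, not real data.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P3", "P4")]
degrees = node_degrees(toy_edges)
print(degrees)
```

In a real MapReduce deployment the map and reduce functions would run on separate workers over partitions of the edge list, which is where the speed-up on large datasets would come from; the in-process version above only shows the shape of the computation.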

Authors and Affiliations

Taysir Soliman

  • EP ID EP101255
  • DOI 10.14569/IJACSA.2015.061225

How To Cite

Taysir Soliman (2015). Identifying Cancer Biomarkers Via Node Classification within a Mapreduce Framework. International Journal of Advanced Computer Science & Applications, 6(12), 184-189. https://europub.co.uk/articles/-A-101255