Machine Learning based Predictive Model for Screening Mycobacterium Tuberculosis Transcriptional Regulatory Protein Inhibitors from High-Throughput Screening Dataset

Abstract

Because dosRS plays an essential role in the survival of Mycobacterium in infected granuloma cells, the dosRS transcriptional regulatory proteins are considered a validated target for high-throughput screening (HTS). However, the cost and time involved in screening large compound libraries are a major hurdle in identifying lead compounds. The use of computational machine learning techniques to build predictive models for screening putative drug-like molecules has therefore gained significance. In this regard, a target-based predictive model was built using machine learning approaches to provide a fast and efficient virtual screening procedure for anti-dosRS molecules. In the present study, we used various structural and physicochemical attributes of compounds from an HTS dataset to train a chemoinformatics predictive model based on four state-of-the-art supervised classifiers (Random Forest, SMO, J48, and Naïve Bayes). The trained model was applied to a test dataset to validate its robustness, accuracy, and sensitivity in screening active anti-dosRS molecules. The predictive model based on the Cost-Sensitive Classifier (CSC) with the Random Forest (RF) algorithm showed high sensitivity (100%) and specificity (83.13%) in identifying active and inactive molecules, respectively, from the assay dataset (ID: 1159583). CSC-RF proved to be the most robust and efficient at classifying active molecules from an imbalanced dataset, with the highest Balancing Classification Rate (BCR) (91.57%) and the maximum Area under the Curve (AUC) value (0.999).
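The relationship between the reported metrics can be checked with a short sketch. It assumes that BCR is the arithmetic mean of sensitivity and specificity, a common definition that agrees with the figures above but is not spelled out in the abstract. The function names and the confusion-matrix counts are hypothetical, chosen only for illustration.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Percentage of active compounds that were classified as active."""
    return 100.0 * tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Percentage of inactive compounds that were classified as inactive."""
    return 100.0 * tn / (tn + fp)


def balanced_classification_rate(sens: float, spec: float) -> float:
    """Assumed definition: mean of sensitivity and specificity (in percent)."""
    return (sens + spec) / 2.0


# Values reported in the abstract for CSC-RF.
bcr_reported = balanced_classification_rate(100.0, 83.13)
print(f"BCR from reported values: {bcr_reported:.3f}%")

# Hypothetical, imbalanced confusion-matrix counts (not from the paper).
tp, fn, tn, fp = 10, 0, 80, 20
bcr_example = balanced_classification_rate(sensitivity(tp, fn), specificity(tn, fp))
print(f"BCR for hypothetical counts: {bcr_example:.1f}%")
```

Under this assumed definition, the reported sensitivity and specificity average to about 91.57%, in line with the stated BCR. Unlike plain accuracy, the measure weights the rare active class and the abundant inactive class equally, which is why it suits imbalanced HTS data.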

Authors and Affiliations

Syed Asif Hassan, Tabrej Khan

  • EP ID EP258304
  • DOI 10.14569/IJACSA.2017.081215

How To Cite

Syed Asif Hassan, Tabrej Khan (2017). Machine Learning based Predictive Model for Screening Mycobacterium Tuberculosis Transcriptional Regulatory Protein Inhibitors from High-Throughput Screening Dataset. International Journal of Advanced Computer Science & Applications, 8(12), 116-123. https://europub.co.uk/articles/-A-258304