Machine Learning based Predictive Model for Screening Mycobacterium Tuberculosis Transcriptional Regulatory Protein Inhibitors from High-Throughput Screening Dataset

Abstract

In view of the essential role played by dosRS in the survival of Mycobacterium in infected granuloma cells, the dosRS transcriptional regulatory proteins are considered a validated target for high-throughput screening (HTS). However, the cost and time involved in screening large compound libraries are a major hurdle in identifying lead compounds. The use of computational machine learning techniques to build predictive models for screening putative drug-like molecules has therefore gained significance. In this regard, a target-based predictive model was built using machine learning approaches to provide a fast and efficient virtual screening procedure for anti-dosRS molecules. In the present study, we used various structural and physicochemical attributes of compounds from an HTS dataset to train and build a chemoinformatics predictive model based on four state-of-the-art supervised classifiers (Random Forest, SMO, J48, and Naïve Bayes). The trained model was applied to a test dataset to validate its robustness, accuracy, and sensitivity in screening active anti-dosRS molecules. The predictive model based on the Cost-Sensitive Classifier (CSC) with the Random Forest (RF) algorithm showed high sensitivity (100%) and specificity (83.13%) in identifying active and inactive molecules, respectively, from the assay dataset (ID: 1159583). CSC-RF proved to be the most robust and efficient at classifying active molecules from an imbalanced dataset, with the highest Balanced Classification Rate (BCR) (91.57%) and the maximum Area under the Curve (AUC) value (0.999).
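The reported BCR can be checked against the reported sensitivity and specificity. The sketch below assumes the usual definition of BCR as the arithmetic mean of sensitivity and specificity. The abstract does not state the formula, so this definition is an assumption, and the function names and the confusion-matrix counts are hypothetical illustrations rather than the study's data.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly active molecules that are predicted active."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of truly inactive molecules that are predicted inactive."""
    return tn / (tn + fp)


def balanced_classification_rate(sens: float, spec: float) -> float:
    """Assumed definition: mean of sensitivity and specificity."""
    return (sens + spec) / 2.0


# Values reported in the abstract, in percent.
reported_bcr = balanced_classification_rate(100.0, 83.13)
print(f"BCR from reported sensitivity and specificity: {reported_bcr}")

# Hypothetical confusion-matrix counts, for illustration only
# (not taken from the assay dataset).
tp, fn, tn, fp = 10, 0, 83, 17
example_bcr = balanced_classification_rate(sensitivity(tp, fn), specificity(tn, fp))
print(f"BCR for the hypothetical counts: {example_bcr}")
```

Under this definition, the mean of 100% and 83.13% is 91.565%, which agrees with the reported 91.57% after rounding. Because each class contributes equally to the score regardless of its size, a metric of this kind is better suited than plain accuracy to an imbalanced dataset where active molecules are rare.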

Authors and Affiliations

Syed Asif Hassan, Tabrej Khan

  • EP ID EP258304
  • DOI 10.14569/IJACSA.2017.081215

How To Cite

Syed Asif Hassan, Tabrej Khan (2017). Machine Learning based Predictive Model for Screening Mycobacterium Tuberculosis Transcriptional Regulatory Protein Inhibitors from High-Throughput Screening Dataset. International Journal of Advanced Computer Science & Applications, 8(12), 116-123. https://europub.co.uk/articles/-A-258304