Machine Learning based Predictive Model for Screening Mycobacterium Tuberculosis Transcriptional Regulatory Protein Inhibitors from High-Throughput Screening Dataset
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2017, Vol 8, Issue 12
Abstract
Because dosRS plays an essential role in the survival of Mycobacterium in infected granuloma cells, the dosRS transcriptional regulatory proteins are considered a validated target for high-throughput screening (HTS). However, the cost and time involved in screening large compound libraries are a major hurdle in identifying lead compounds. The use of machine learning techniques to build predictive models for screening putative drug-like molecules has therefore gained significance. In this regard, a target-based predictive model was built using machine learning approaches to provide a fast and efficient virtual screening procedure for anti-dosRS molecules. In the present study, we used various structural and physicochemical attributes of compounds from an HTS dataset to train a chemoinformatics predictive model based on four state-of-the-art supervised classifiers (Random Forest, SMO, J48, and Naïve Bayes). The trained model was applied to a test dataset to validate its robustness, accuracy, and sensitivity in screening active anti-dosRS molecules. The predictive model based on the Cost-Sensitive Classifier (CSC) with the Random Forest (RF) algorithm showed high sensitivity (100%) and specificity (83.13%) in identifying active and inactive molecules, respectively, from the assay dataset (ID: 1159583). CSC-RF proved to be more robust and efficient in classifying active molecules from an imbalanced dataset, with the highest Balancing Classification Rate (BCR) (91.57%) and the maximum Area Under the Curve (AUC) value (0.999).
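The abstract reports sensitivity, specificity, and BCR without stating how they relate. The sketch below assumes the common definition of BCR as the arithmetic mean of sensitivity and specificity; the abstract does not give the formula, so this is an assumption, although it is consistent with the reported figures (the mean of 100% and 83.13% is about 91.57%). The function names and the toy confusion-matrix counts are hypothetical and are not taken from the assay data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary active/inactive classifier (hypothetical container)."""
    tp: int  # actives predicted active
    fn: int  # actives predicted inactive
    tn: int  # inactives predicted inactive
    fp: int  # inactives predicted active


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of truly active molecules that were predicted active."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of truly inactive molecules that were predicted inactive."""
    return c.tn / (c.tn + c.fp)


def balanced_classification_rate(sens: float, spec: float) -> float:
    """Assumed definition: mean of sensitivity and specificity."""
    return 0.5 * (sens + spec)


# Toy counts for illustration only; NOT the counts from assay 1159583.
toy = ConfusionCounts(tp=5, fn=0, tn=8, fp=2)
toy_bcr = balanced_classification_rate(sensitivity(toy), specificity(toy))

# Rates as reported in the abstract (100% sensitivity, 83.13% specificity).
reported_bcr = balanced_classification_rate(1.0, 0.8313)

print(f"toy BCR: {toy_bcr:.4f}")
print(f"BCR from reported rates: {reported_bcr:.5f}")
```

Because BCR weights the two classes equally regardless of their size, it is less affected by class imbalance than plain accuracy, which is presumably why it is reported here for an imbalanced HTS dataset.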
Authors and Affiliations
Syed Asif Hassan, Tabrej Khan