PaMSA: A Parallel Algorithm for the Global Alignment of Multiple Protein Sequences
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2017, Vol 8, Issue 4
Abstract
Multiple sequence alignment (MSA) is a well-known problem in bioinformatics whose main goal is the identification of evolutionary, structural, or functional similarities in a set of three or more related genes or proteins. We present a parallel approach for the global alignment of multiple protein sequences that combines dynamic programming, heuristics, and parallel programming techniques in an iterative process. In the proposed algorithm, the longest common subsequence technique is used to generate a first MSA by aligning identical residues. An iterative process then improves the MSA by applying a number of operators defined in the present work, in order to produce more accurate alignments. The accuracy of the alignment was evaluated through the application of optimization functions. A number of processes work independently and simultaneously, each searching for the best MSA of a set of sequences. One process acts as a coordinator, whereas the remaining processes act as slave processes. The resulting algorithm was called PaMSA, which stands for Parallel MSA. The MSA accuracy and response time of PaMSA were compared against those of Clustal W, T-Coffee, MUSCLE, and Parallel T-Coffee on 40 datasets of protein sequences. When run as a sequential application, PaMSA was the second fastest of the nonparallel MSA methods tested (Clustal W, T-Coffee, and MUSCLE). However, PaMSA was designed to be executed in parallel. When run as a parallel application, PaMSA presented better response times than Parallel T-Coffee under the conditions tested. Furthermore, the sum-of-pairs scores achieved by PaMSA when aligning groups of sequences with an identity percentage score from approximately 70% to 100% were the highest in all cases. PaMSA was implemented on a cluster platform in the C++ language using the standard Message Passing Interface (MPI) library.
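The two building blocks named in the abstract, the longest common subsequence (LCS) and the sum-of-pairs score, can be illustrated with a minimal C++ sketch. It is not the PaMSA implementation: the abstract does not say how the LCS is extended from two sequences to many, nor which scoring values are used. The function names `lcs` and `sumOfPairs` and the default scores (match +1, mismatch -1, residue-gap -2, gap-gap 0) are therefore assumptions chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Pairwise longest common subsequence via the classic dynamic-programming
// table (hypothetical helper; PaMSA's multi-sequence variant is not shown).
std::string lcs(const std::string& a, const std::string& b) {
    const std::size_t n = a.size(), m = b.size();
    // len[i][j] = LCS length of the first i residues of a and first j of b.
    std::vector<std::vector<int>> len(n + 1, std::vector<int>(m + 1, 0));
    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            len[i][j] = (a[i - 1] == b[j - 1])
                            ? len[i - 1][j - 1] + 1
                            : std::max(len[i - 1][j], len[i][j - 1]);
        }
    }
    // Trace back from the bottom-right corner to recover one LCS.
    std::string out;
    std::size_t i = n, j = m;
    while (i > 0 && j > 0) {
        if (a[i - 1] == b[j - 1]) {
            out.push_back(a[i - 1]);
            --i;
            --j;
        } else if (len[i - 1][j] >= len[i][j - 1]) {
            --i;
        } else {
            --j;
        }
    }
    std::reverse(out.begin(), out.end());
    return out;
}

// Sum-of-pairs score of an alignment whose rows all have the same length.
// The scoring values are assumed defaults, not those used in the paper.
int sumOfPairs(const std::vector<std::string>& msa,
               int match = 1, int mismatch = -1, int gap = -2) {
    if (msa.empty()) return 0;
    const std::size_t cols = msa[0].size();
    for (const std::string& row : msa) assert(row.size() == cols);

    int score = 0;
    for (std::size_t c = 0; c < cols; ++c) {
        // Score every unordered pair of rows in this column.
        for (std::size_t r = 0; r < msa.size(); ++r) {
            for (std::size_t s = r + 1; s < msa.size(); ++s) {
                const char x = msa[r][c], y = msa[s][c];
                if (x == '-' && y == '-') continue;  // gap-gap scores 0
                if (x == '-' || y == '-') score += gap;
                else score += (x == y) ? match : mismatch;
            }
        }
    }
    return score;
}
```

For instance, `lcs("MKVLA", "MKLA")` yields `"MKLA"`, the identical residues that a first alignment would anchor on. Under the assumed scores, the three-row alignment `MK-LA` / `MKVLA` / `MKVLA` has a sum-of-pairs score of 9: four fully matching columns contribute 3 each, and the gapped column contributes -3.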
Authors and Affiliations
Irma R. Andalon-Garcia, Arturo Chavoya