An Effective Identification of Species from DNA Sequence: A Classification Technique by Integrating DM and ANN

Abstract

Species classification from DNA sequences remains an open challenge in bioinformatics, the field that deals with the collection, processing, and analysis of DNA and proteomic sequences. Although data mining can guide the process, the poorly defined and heterogeneous nature of gene sequences remains a barrier. This paper proposes a classification technique that identifies an organism from its gene sequence. The technique integrates pattern mining with neural network-based classification. In the pattern mining stage, it mines nucleotide patterns and their support from a selected DNA sequence. The high dimensionality of the mined dataset is then reduced using Multilinear Principal Component Analysis (MPCA). In the classification stage, a trained neural network classifies the selected gene sequence, so the organism can be identified even from part of the sequence. The technique is evaluated with 10-fold cross validation, a statistical validation method, and the results demonstrate its efficacy.
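The pattern mining and validation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a "nucleotide pattern" is a fixed-length substring (k-mer) over A, C, G, T and that "support" is the fraction of sequences containing the pattern at least once; the abstract does not give either definition. The function names `pattern_support` and `k_fold_indices` are hypothetical, and the MPCA and neural network stages are omitted.

```python
from collections import Counter
from itertools import product


def pattern_support(sequences, k):
    """Return {pattern: support} for every length-k nucleotide pattern.

    Assumption: support = fraction of sequences that contain the pattern
    at least once. The paper may define support differently.
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # A set ensures each pattern is counted once per sequence.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        # Skip windows with ambiguous symbols such as N.
        counts.update(p for p in seen if set(p) <= set("ACGT"))
    n = len(sequences)
    patterns = ("".join(p) for p in product("ACGT", repeat=k))
    return {p: counts[p] / n for p in patterns}


def k_fold_indices(n_samples, n_folds=10):
    """Yield (train, test) index lists for plain k-fold cross validation."""
    # Sample i goes to fold i % n_folds, so fold sizes differ by at most one.
    folds = [list(range(i, n_samples, n_folds)) for i in range(n_folds)]
    for held_out, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train, test


if __name__ == "__main__":
    toy = ["ACGT", "ACGA", "TTTT"]  # made-up sequences for illustration only
    support = pattern_support(toy, k=2)
    print(support["AC"], support["TT"], support["GG"])
```

Each sequence's support values form a fixed-length feature vector (4^k entries) that a dimension reduction step and a classifier could consume; `k_fold_indices` shows how samples would be partitioned so that every sample is held out exactly once.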

Authors and Affiliations

Sathish S, Dr. N. Duraipandian

How To Cite

Sathish S, Dr. N. Duraipandian (2012). An Effective Identification of Species from DNA Sequence: A Classification Technique by Integrating DM and ANN. International Journal of Advanced Computer Science & Applications, 3(8), 104-114. https://europub.co.uk/articles/-A-145799