Evaluation of data integration strategies based on kernel method of clinical and microarray data
Journal Title: Bioinformation - Year 2012, Vol 8, Issue 3
Abstract
Cancer classification is one of the most challenging problems in bioinformatics. The data, provided by the Netherlands Cancer Institute, cover 295 breast cancer patients: 101 with distant metastases and 194 without. This study investigates combining feature sets with kernel methods to classify patients as having or not having distant metastases. Each single data set is compared with three data integration strategies and with weighted data integration strategies, all based on kernel methods. The Least Squares Support Vector Machine (LS-SVM) is chosen as the classifier because it can handle very high-dimensional features such as microarray data. The experimental results show that weighted late integration and microarray data alone perform almost equally well. In this case, data integration is not always better than using a single data set. Classification performance depends strongly on the features used to represent each object.
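To make the idea of weighted kernel-based integration with an LS-SVM concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It combines one kernel per data source as a weighted sum (a scheme often called intermediate integration, which may differ from the weighted late integration evaluated in the paper) and trains a standard LS-SVM classifier by solving its linear system. The data are synthetic, and the function names, the weight `w`, the regularization value `gamma`, and the linear kernel are all illustrative choices.

```python
import numpy as np

def linear_kernel(A, B):
    """Linear kernel matrix between the rows of A and the rows of B."""
    return A @ B.T

def normalize_trace(K):
    """Rescale a square kernel so its trace equals its size (assumed convention)."""
    return K * (K.shape[0] / np.trace(K))

def combine_kernels(K_clin, K_micro, w):
    """Weighted sum of two kernels; w in [0, 1] is a hypothetical weight."""
    return w * K_clin + (1.0 - w) * K_micro

def lssvm_train(K, y, gamma):
    """Solve the LS-SVM system [[0, y^T], [y, Omega + I/gamma]] [b, alpha] = [0, 1]."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = np.outer(y, y) * K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, dual coefficients alpha

def lssvm_predict(K_test, y_train, b, alpha):
    """Predict labels in {-1, +1} from a (test x train) kernel matrix."""
    return np.sign(K_test @ (alpha * y_train) + b)

# Synthetic stand-in data (NOT the Netherlands Cancer Institute data).
rng = np.random.default_rng(0)
n = 40
y = np.where(np.arange(n) < 14, 1.0, -1.0)                # +1: metastases, -1: none
X_clin = rng.normal(size=(n, 5)) + 1.0 * y[:, None]       # few clinical features
X_micro = rng.normal(size=(n, 200)) + 0.2 * y[:, None]    # many microarray features

# One kernel per data source, then a weighted combination.
K_clin = normalize_trace(linear_kernel(X_clin, X_clin))
K_micro = normalize_trace(linear_kernel(X_micro, X_micro))
w = 0.3        # assumed weight on the clinical kernel
gamma = 1.0    # assumed LS-SVM regularization parameter
K = combine_kernels(K_clin, K_micro, w)

b, alpha = lssvm_train(K, y, gamma)
pred = lssvm_predict(K, y, b, alpha)  # predictions on the training patients
```

In practice `w` and `gamma` would be chosen by cross-validation, and setting `w` to 0 or 1 recovers a classifier that uses a single data set, which is the baseline the abstract compares against.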
Authors and Affiliations
Ary Noviyanto, Ito Wasito