Evaluation of data integration strategies based on kernel method of clinical and microarray data

Journal Title: Bioinformation - Year 2012, Vol 8, Issue 3

Abstract

Cancer classification is one of the most challenging problems in bioinformatics. The data set provided by the Netherlands Cancer Institute consists of 295 breast cancer patients: 101 patients with distant metastases and 194 patients without. Combinations of feature sets based on kernel methods are investigated for classifying patients as with or without distant metastases. Each single data set is compared with three data integration strategies and with weighted data integration strategies, all based on kernel methods. The Least Squares Support Vector Machine (LS-SVM) is chosen as the classifier because it can handle very high-dimensional features such as microarray data. The experimental results show that weighted late integration and microarray data alone perform almost equally well. In this case, data integration is not always better than using a single data set. Classification performance depends strongly on the features used to represent the objects.
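
The abstract does not spell out how the integration strategies are built, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the three strategies are early integration (concatenate the features, then build one kernel), intermediate integration (combine one kernel per data source) and late integration (combine the outputs of one classifier per data source). The RBF kernel, the weight `mu`, the parameters `sigma` and `gamma`, and the synthetic stand-in data are all hypothetical choices made for illustration; the LS-SVM is fitted by solving its standard linear system.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_fit(K, y, gamma=1.0):
    """Solve the LS-SVM classifier linear system; returns bias b and multipliers alpha."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = np.outer(y, y) * K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]

def lssvm_decision(K_new_train, y_train, b, alpha):
    """Decision values f(x) = sum_i alpha_i * y_i * k(x, x_i) + b."""
    return K_new_train @ (alpha * y_train) + b

# Synthetic stand-in data (NOT the Netherlands Cancer Institute data):
# a small "clinical" block and a high-dimensional "microarray" block.
rng = np.random.default_rng(0)
n = 40
y = np.where(np.arange(n) < 14, 1.0, -1.0)   # +1 / -1 class labels
X_clin = rng.normal(size=(n, 5)) + 0.5 * y[:, None]
X_micro = rng.normal(size=(n, 200)) + 0.2 * y[:, None]

K_clin = rbf_kernel(X_clin, X_clin, sigma=3.0)
K_micro = rbf_kernel(X_micro, X_micro, sigma=20.0)
mu = 0.3  # hypothetical weight on the clinical source

# Early integration (assumed form): concatenate features, then one kernel.
X_all = np.hstack([X_clin, X_micro])
K_early = rbf_kernel(X_all, X_all, sigma=20.0)
b_e, a_e = lssvm_fit(K_early, y)

# Weighted intermediate integration (assumed form): weighted sum of kernels.
K_mid = mu * K_clin + (1.0 - mu) * K_micro
b_m, a_m = lssvm_fit(K_mid, y)

# Weighted late integration (assumed form): one LS-SVM per source,
# then a weighted sum of their decision values.
b_c, a_c = lssvm_fit(K_clin, y)
b_g, a_g = lssvm_fit(K_micro, y)
f_late = (mu * lssvm_decision(K_clin, y, b_c, a_c)
          + (1.0 - mu) * lssvm_decision(K_micro, y, b_g, a_g))
pred_late = np.sign(f_late)
```

The sketch scores the training samples only, to keep it short. A real comparison of the strategies would evaluate on held-out patients, for example with cross-validation, and would tune `mu`, `sigma` and `gamma` rather than fixing them by hand.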

Authors and Affiliations

Ary Noviyanto, Ito Wasito

  • EP ID EP129909
  • DOI 10.6026/97320630008147

How To Cite

Ary Noviyanto, Ito Wasito (2012). Evaluation of data integration strategies based on kernel method of clinical and microarray data. Bioinformation, 8(3), 147-150. https://europub.co.uk/articles/-A-129909