Classifying Lung Adenocarcinoma and Squamous Cell Carcinoma using RNA-Seq Data

Journal Title: Cancer Studies & Molecular Medicine – Open Journal - Year 2017, Vol 3, Issue 2

Abstract

Background: Lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) are two primary subtypes of non-small cell lung carcinoma (NSCLC). Currently, the most widely used method to discriminate between LUAD and LUSC is hematoxylin-eosin (HE) staining. However, this method cannot always distinguish LUAD from LUSC precisely, so more accurate diagnostic approaches are highly desirable.

Methods: We propose using gene expression profiles to discriminate between NSCLC subtypes. We leveraged RNA-Seq data from The Cancer Genome Atlas (TCGA) and randomly split the data into training and testing subsets. To construct classifiers from the training data, we considered three methods: logistic regression on principal components (PCR), logistic regression with LASSO shrinkage (LASSO), and k-nearest neighbors (KNN). Classifier performance was evaluated and compared on the testing data.

Results: All gene expression-based classifiers showed high accuracy in discriminating LUSC from LUAD. The LASSO classifier had the smallest overall misclassification rate, 3.42% (95% CI: 3.25%-3.60%), when using 0.5 as the cutoff for the predicted probability of belonging to a subtype, followed by the PCR classifier (4.36%, 95% CI: 4.23%-4.49%) and the KNN classifier (8.70%, 95% CI: 8.57%-8.83%). The LASSO classifier also had the highest average area under the receiver operating characteristic curve (AUC), 0.993, compared with PCR (0.987) and KNN (0.965).

Conclusions: Our results suggest that mRNA expression levels are highly informative for classifying NSCLC subtypes and may potentially be used to assist clinical diagnosis.
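The evaluation described in the abstract (a random train/test split, a misclassification rate at a 0.5 probability cutoff, and an AUC) can be illustrated with a minimal sketch using only the Python standard library. This is not the authors' code. The function names, the coding of one subtype as 1 and the other as 0, the `>=` comparison at the cutoff, and the toy labels and probabilities are all assumptions made for illustration; none of the numbers below come from the paper.

```python
import random


def train_test_split(n_samples, test_fraction, seed):
    """Randomly partition sample indices into training and testing subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def misclassification_rate(labels, probs, cutoff=0.5):
    """Fraction of samples whose thresholded prediction disagrees with the label.

    Assumption: a predicted probability >= cutoff is called class 1.
    """
    wrong = sum((p >= cutoff) != bool(y) for y, p in zip(labels, probs))
    return wrong / len(labels)


def auc(labels, probs):
    """AUC as the chance that a random class-1 sample is scored above a
    random class-0 sample, counting ties as one half."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 and 0 stand for the two subtypes (coding is arbitrary).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_probs = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

train_idx, test_idx = train_test_split(n_samples=10, test_fraction=0.3, seed=0)
print("test set size:", len(test_idx))
print("misclassification rate:", misclassification_rate(toy_labels, toy_probs))
print("AUC:", auc(toy_labels, toy_probs))
```

On the toy data, two of the six samples fall on the wrong side of the 0.5 cutoff and eight of the nine class-1/class-0 pairs are ranked correctly, which shows how the two metrics summarize different aspects of the same predicted probabilities: the misclassification rate depends on the chosen cutoff, while the AUC does not.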

Authors and Affiliations

Chi Wang

  • EP ID EP551937
  • DOI 10.17140/CSMMOJ-3-120

How To Cite

Chi Wang (2017). Classifying Lung Adenocarcinoma and Squamous Cell Carcinoma using RNA-Seq Data. Cancer Studies & Molecular Medicine – Open Journal, 3(2), 27-31. https://europub.co.uk/articles/-A-551937