Classifying Lung Adenocarcinoma and Squamous Cell Carcinoma using RNA-Seq Data

Journal Title: Cancer Studies & Molecular Medicine – Open Journal - Year 2017, Vol 3, Issue 2

Abstract

Background: Lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) are two primary subtypes of non-small cell lung carcinoma (NSCLC). Currently, the most widely used method to discriminate between LUAD and LUSC is hematoxylin-eosin (HE) staining. However, this method cannot always distinguish LUAD from LUSC precisely, so more accurate diagnostic approaches are highly desirable.

Methods: We propose using gene expression profiles to determine the subtype of NSCLC patients. We used RNA-Seq data from The Cancer Genome Atlas (TCGA) and randomly split the data into training and testing subsets. To construct classifiers from the training data, we considered three methods: logistic regression on principal components (PCR), logistic regression with LASSO shrinkage (LASSO), and k-nearest neighbors (KNN). Classifier performance was evaluated and compared on the testing data.

Results: All gene expression-based classifiers showed high accuracy in discriminating LUSC from LUAD. With 0.5 as the cutoff for the predicted probability of belonging to a subtype, the LASSO classifier had the smallest overall misclassification rate, 3.42% (95% CI: 3.25%-3.60%), followed by the PCR classifier (4.36%, 95% CI: 4.23%-4.49%) and the KNN classifier (8.70%, 95% CI: 8.57%-8.83%). The LASSO classifier also had the highest average area under the receiver operating characteristic curve (AUC), 0.993, compared with PCR (0.987) and KNN (0.965).

Conclusions: Our results suggest that mRNA expression levels are highly informative for classifying NSCLC subtypes and could potentially be used to assist clinical diagnosis.
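
The two evaluation measures reported in the abstract, the misclassification rate at a 0.5 probability cutoff and the AUC, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the toy expression values, the two hypothetical gene features, the label coding (0 = LUAD, 1 = LUSC), and the function names are all assumptions made for illustration, and a small k-nearest neighbors classifier stands in for the three methods compared in the study.

```python
from math import dist

def knn_probability(train_x, train_y, query, k=3):
    """Fraction of the k nearest training samples (Euclidean distance) labeled 1."""
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    return sum(train_y[i] for i in order[:k]) / k

def misclassification_rate(probs, labels, cutoff=0.5):
    """Share of samples whose thresholded prediction disagrees with the true label."""
    preds = [1 if p >= cutoff else 0 for p in probs]
    return sum(p != y for p, y in zip(preds, labels)) / len(labels)

def auc(probs, labels):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Invented toy data: expression of two hypothetical genes per sample.
# Assumed label coding: 0 = LUAD, 1 = LUSC.
train_x = [(1.0, 5.0), (1.2, 4.8), (0.8, 5.5), (5.0, 1.0), (4.8, 1.3), (5.5, 0.9)]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [(1.1, 5.1), (2.5, 3.4), (5.1, 1.1), (3.0, 3.2)]
test_y = [0, 0, 1, 1]

# Predicted probability of class 1 for every held-out sample.
probs = [knn_probability(train_x, train_y, x, k=3) for x in test_x]

print("predicted probabilities:", probs)
print("misclassification rate at 0.5:", misclassification_rate(probs, test_y))
print("AUC:", auc(probs, test_y))
```

The misclassification rate depends on the chosen cutoff, whereas the AUC summarizes how well the predicted probabilities rank the two subtypes across all cutoffs, which is presumably why the abstract reports both.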

Authors and Affiliations

Chi Wang

DOI: 10.17140/CSMMOJ-3-120

How To Cite

Chi Wang (2017). Classifying Lung Adenocarcinoma and Squamous Cell Carcinoma using RNA-Seq Data. Cancer Studies & Molecular Medicine – Open Journal, 3(2), 27-31. https://europub.co.uk/articles/-A-551937