Optimization Techniques for SCAD Variable Selection in Medical Research

Abstract

High-dimensional data analysis requires variable selection to identify the truly relevant variables. Most often this is done implicitly via regularization, as in penalized regression. Among the many available penalties, SCAD has good properties and has been widely adopted in medical research and many other areas. This paper reviews the optimization techniques used to solve SCAD-penalized regression.

High-dimensional data analysis is a common and important topic in biomedical, genomic, and clinical studies. For example, identifying the genetic factors behind complex diseases such as lung cancer involves a large number of genetic variants. Modeling such data runs into the well-known curse of dimensionality, so variable selection is a fundamental task in high-dimensional statistical modeling.

The "old school" approach is to run a subset selection procedure before building the model of interest. The procedure commonly uses AIC or BIC as its evaluation metric and often iterates in a stepwise fashion. Because this step is independent of the subsequent modeling task, its effectiveness can be limited. A more natural approach is to integrate variable selection into the modeling itself through penalized regression, which performs variable selection and coefficient estimation simultaneously.

Theoretically, the "best" penalty for penalized regression is the number of non-zero coefficients, which pushes as many coefficients to zero as possible. It is well known, however, that optimization under this L0 penalty (also known as the entropy penalty) [1] is computationally infeasible. The L1 (LASSO) penalty of Tibshirani [2] is therefore the "next best" candidate, and it is widely adopted in the statistics and machine learning communities for obtaining sparse solutions. However, [3] points out that the L1 penalty yields biased estimates, and proposes the Smoothly Clipped Absolute Deviation (SCAD) penalty, which can produce unbiased estimates for large coefficients while retaining the good properties of L1. The SCAD penalty has since seen a wide range of applications, including medical and clinical research [1,4-7].

Nevertheless, estimation for SCAD-penalized regression is no trivial task, because the target function (a) is a high-dimensional non-concave function, (b) is singular at the origin, and (c) does not have continuous second-order derivatives.
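The bias contrast between L1 and SCAD described above can be made concrete with a small sketch. It assumes the SCAD penalty in its standard form from the literature and the closed-form thresholding rules that hold for a single coefficient under an orthonormal design. The function names are hypothetical, and the default a = 3.7 is the value commonly suggested in the literature; none of this is taken from the paper's own code.

```python
import numpy as np


def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty value for coefficient(s) theta (assumes lam > 0, a > 2)."""
    t = np.abs(np.asarray(theta, dtype=float))
    linear = lam * t                                  # |theta| <= lam: same as L1
    quadratic = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))  # transition zone
    constant = lam**2 * (a + 1) / 2                   # |theta| > a*lam: flat
    return np.where(t <= lam, linear,
                    np.where(t <= a * lam, quadratic, constant))


def soft_threshold(z, lam):
    """L1 (LASSO) thresholding rule: every surviving estimate is shrunk by lam."""
    z = np.asarray(z, dtype=float)
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def scad_threshold(z, lam, a=3.7):
    """SCAD thresholding rule for one coefficient under an orthonormal design."""
    z = np.asarray(z, dtype=float)
    abs_z = np.abs(z)
    # Interpolates between soft thresholding and no shrinkage at all.
    middle = ((a - 1) * z - np.sign(z) * a * lam) / (a - 2)
    return np.where(abs_z <= 2 * lam, soft_threshold(z, lam),
                    np.where(abs_z <= a * lam, middle, z))


if __name__ == "__main__":
    z = np.array([0.5, 1.5, 3.0, 5.0])  # illustrative unpenalized estimates
    print("LASSO:", soft_threshold(z, lam=1.0))
    print("SCAD: ", scad_threshold(z, lam=1.0))
```

Under these assumptions, small inputs are set to zero by both rules, while a large input (here 5.0 with lam = 1.0) is shrunk by lam under L1 but left unchanged by SCAD. The flat tail of the penalty is what removes the bias for large coefficients, and the kink at zero is what makes the objective singular at the origin.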

Authors and Affiliations

Yan Fang, Yan Yan Kong, Yumei Jiao

  • EP ID EP585942
  • DOI 10.26717/BJSTR.2018.08.001632

How To Cite

Yan Fang, Yan Yan Kong, Yumei Jiao (2018). Optimization Techniques for SCAD Variable Selection in Medical Research. Biomedical Journal of Scientific & Technical Research (BJSTR), 8(2), 6425-6426. https://europub.co.uk/articles/-A-585942