A Classification Model for Imbalanced Medical Data based on PCA and Farther Distance based Synthetic Minority Oversampling Technique
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2017, Vol 8, Issue 1
Abstract
Medical data are used extensively in diagnosing human health conditions, and they play a vital role for physicians as well as in medical engineering. Accordingly, much research aims to improve disease prediction or diagnostic quality. However, most researchers address either high dimensionality or class imbalance, not both. Because both factors are equally important, treating only one of them can lead to inaccurate prediction or classification of malignant diseases. More work is therefore needed to address these biomedical challenges by combining both factors. This paper proposes a new and efficient combined algorithm based on FD_SMOTE (Farther Distance based Synthetic Minority Oversampling Technique) and Principal Component Analysis (PCA), which reduces the high dimensionality and balances the minority class. The algorithm has been evaluated on biomedical data and gives the desired results in terms of dimensionality and data balancing. The quality of the dimensionality reduction and data balancing is assessed using metrics such as covariance, Accuracy (ACC), and Area Under the Curve (AUC). The numerical results show that the algorithm achieved its best performance in terms of ACC and AUC.
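The abstract does not give the details of FD_SMOTE, so the following is only a minimal sketch of the general idea of combining PCA with minority-class oversampling, not the authors' algorithm. It assumes NumPy, synthetic toy data, PCA applied before oversampling, and a hypothetical stand-in rule (`farthest_neighbour_oversample`) that interpolates each minority sample toward its farthest minority-class neighbour. All function names and parameters are illustrative assumptions.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its top principal components (SVD of the centered data)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def farthest_neighbour_oversample(X_min, n_new, rng):
    """ASSUMPTION: a stand-in for FD_SMOTE, whose exact rule is not in the abstract.
    Each synthetic point lies on the segment between a randomly chosen minority
    sample and the minority sample farthest from it."""
    # Pairwise Euclidean distances between minority samples.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    farthest = dist.argmax(axis=1)
    base = rng.integers(0, len(X_min), size=n_new)
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[farthest[base]] - X_min[base])

def auc_score(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative
    (ties count as one half)."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

# Toy imbalanced data set (assumed): 90 majority and 10 minority samples, 10 features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(90, 10)),
               rng.normal(1.5, 1.0, size=(10, 10))])
y = np.array([0] * 90 + [1] * 10)

# Step 1: reduce dimensionality. Step 2: balance the classes in the reduced space.
Z = pca_reduce(X, n_components=3)
Z_min = Z[y == 1]
Z_new = farthest_neighbour_oversample(Z_min, n_new=80, rng=rng)
Z_bal = np.vstack([Z, Z_new])
y_bal = np.concatenate([y, np.ones(80, dtype=int)])
```

After these two steps, `Z_bal` holds 90 samples of each class in three dimensions, and a classifier trained on it could be scored with `auc_score` on held-out data. Whether oversampling should come before or after PCA is a design choice that the abstract leaves open.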
Authors and Affiliations
Nadir Mustafa, Jian-Ping Li, Raheel A. Memon, Mohammed Z. Omer