Data Mining Approach for Breast Cancer Patient Recovery

Journal Title: EMITTER International Journal of Engineering Technology - Year 2017, Vol 5, Issue 1

Abstract

Breast cancer is the second most common cancer type affecting Indonesian women. Several factors are known to increase the risk of breast cancer, but in Indonesia in particular these factors often depend on whether treatment is followed routinely. This research examines the determinant factors of breast cancer and analyzes breast cancer patient data to build a useful classification model using a data mining approach. The dataset was obtained from an oncology hospital in East Java, Indonesia, and consists of 1097 samples, 21 attributes, and 2 classes. We used three feature selection algorithms, Information Gain, Fisher's Discriminant Ratio, and Chi-square, to select the attributes that contribute most to the data. We applied Hierarchical K-means Clustering to remove the attributes with the lowest contribution. Our experiment showed that only 14 of the 21 original attributes contribute most to the breast cancer data. The error ratio of the clustering algorithm decreased from 44.48% (using the 21 original attributes) to 18.32% (using the 14 most important attributes). We also applied classification algorithms to build the classification model and measure its precision on the breast cancer patient data. In the comparison, Naïve Bayes and Decision Tree reached precisions of 92.76% and 92.99%, respectively, under leave-one-out cross-validation. Our data suggest that the recovery of breast cancer patients in Indonesia, especially in East Java, should be improved through routine treatment in the hospital, because early recovery is related to patient adherence.
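Two of the feature selection scores named in the abstract, Information Gain and Fisher's Discriminant Ratio, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses the common textbook definitions, and the attribute names (`routine_treatment`, `hospital_visits`), class labels, and values are made-up toy data, not records from the hospital dataset.

```python
import math
from collections import Counter
from statistics import mean, pvariance


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels):
    """Label entropy minus the weighted entropy after splitting
    on a discrete attribute."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def fisher_ratio(values, labels):
    """Two-class Fisher's discriminant ratio for a numeric attribute:
    (mean_a - mean_b)^2 / (var_a + var_b)."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("expected exactly two classes")
    a = [v for v, y in zip(values, labels) if y == classes[0]]
    b = [v for v, y in zip(values, labels) if y == classes[1]]
    denom = pvariance(a) + pvariance(b)
    if denom == 0:
        return float("inf")
    return (mean(a) - mean(b)) ** 2 / denom


# Hypothetical toy data (assumption: not taken from the paper's dataset).
labels = ["recovered", "recovered", "recovered",
          "not_recovered", "not_recovered", "not_recovered"]
routine_treatment = ["yes", "yes", "yes", "no", "no", "yes"]
unrelated_flag = ["a", "b", "a", "b", "a", "b"]
hospital_visits = [10, 12, 11, 3, 4, 5]

ig_routine = information_gain(routine_treatment, labels)
ig_unrelated = information_gain(unrelated_flag, labels)
fdr_visits = fisher_ratio(hospital_visits, labels)

# Rank the discrete attributes by score, highest first.
ranking = sorted(
    [("routine_treatment", ig_routine), ("unrelated_flag", ig_unrelated)],
    key=lambda item: item[1],
    reverse=True,
)
print(ranking)
print(fdr_visits)
```

In this toy example the attribute that tracks the class label scores higher than the unrelated one, which is the kind of ranking a feature selection step relies on before low-scoring attributes are dropped.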

Authors and Affiliations

Tresna Maulana Fahrudin, Iwan Syarif, Ali Ridho Barakbah

  • EP ID EP312792
  • DOI 10.24003/emitter.v5i1.190

How To Cite

Tresna Maulana Fahrudin, Iwan Syarif, Ali Ridho Barakbah (2017). Data Mining Approach for Breast Cancer Patient Recovery. EMITTER International Journal of Engineering Technology, 5(1), 36-71. https://europub.co.uk/articles/-A-312792