Classification and Diagnostic Prediction of Breast Cancers via Different Classifiers
Journal Title: International Scientific and Vocational Studies Journal - Year 2018, Vol 2, Issue 2
Abstract
Cancer is one of the leading causes of death worldwide and caused approximately 9.6 million deaths in 2018. Breast cancer is the most important cause of cancer deaths in women, yet it can be treated when diagnosed early. The aim of this study is to support early diagnosis, and therefore early treatment, using machine learning methods. The records in the Wisconsin Diagnostic Breast Cancer (WDBC) data set were classified with support vector machines (SVM), k-nearest neighbors, Naive Bayes, J48 and random forest methods. A preprocessing step was applied to the data set prior to classification. After preprocessing, the 5 classifiers were applied to the data using 10-fold cross-validation. Accuracy, sensitivity, specificity and confusion matrices were used to measure the success of the methods. SVM with a linear kernel was the most successful method, with a success rate of 98.24%. Although it is a very simple method, k-nearest neighbors was the second most successful, with a success rate of 97.72%. The results obtained with feature selection show that feature selection and the other preprocessing methods increase the success of the system. The success achieved compares well with previous studies.
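The abstract names accuracy, sensitivity and specificity, all derived from a confusion matrix, as the evaluation measures. The sketch below shows how these three values are commonly computed from true and predicted labels. It is an illustration and not the author's code: the function names and the toy label lists are hypothetical, and the labels "M" (malignant) and "B" (benign) follow the WDBC diagnosis coding, with malignant assumed to be the positive class.

```python
from typing import NamedTuple, Sequence


class ConfusionMatrix(NamedTuple):
    tp: int  # malignant cases predicted as malignant
    fn: int  # malignant cases predicted as benign
    fp: int  # benign cases predicted as malignant
    tn: int  # benign cases predicted as benign


def build_confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                           positive: str = "M") -> ConfusionMatrix:
    """Count the four outcomes of a binary classifier."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return ConfusionMatrix(tp, fn, fp, tn)


def _ratio(numerator: int, denominator: int) -> float:
    """Divide, failing loudly when a class has no samples."""
    if denominator == 0:
        raise ValueError("metric is undefined: denominator is zero")
    return numerator / denominator


def accuracy(cm: ConfusionMatrix) -> float:
    """Share of all cases that were classified correctly."""
    return _ratio(cm.tp + cm.tn, cm.tp + cm.fn + cm.fp + cm.tn)


def sensitivity(cm: ConfusionMatrix) -> float:
    """Share of malignant cases that were detected (true positive rate)."""
    return _ratio(cm.tp, cm.tp + cm.fn)


def specificity(cm: ConfusionMatrix) -> float:
    """Share of benign cases that were recognised (true negative rate)."""
    return _ratio(cm.tn, cm.tn + cm.fp)


# Toy labels for illustration only; these are NOT results from the study.
y_true = ["M", "M", "M", "B", "B", "B", "B", "B"]
y_pred = ["M", "M", "B", "B", "B", "B", "B", "M"]

cm = build_confusion_matrix(y_true, y_pred)
print(cm)
print(f"accuracy={accuracy(cm):.3f} "
      f"sensitivity={sensitivity(cm):.3f} "
      f"specificity={specificity(cm):.3f}")
```

In a 10-fold cross-validation setting, one common choice is to accumulate the four counts over all ten test folds and compute the measures once from the pooled matrix. Whether the study pooled the folds or averaged per-fold values is not stated in the abstract.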
Authors and Affiliations
Ahmet Saygılı
Exterior Scaffolds of Prefabricated Components, Loads Affecting Scaffolds and Scaffold Experiments
In the construction sector, exterior facade scaffolds are widely used in outdoor facade applications such as construction, painting, heat insulation and coating. The floats consist of temporary elements which are des...
Performance Analysis of Storage, Grid Connected Hybrid Photovoltaic System
Photovoltaic solar energy plants are rapidly increasing. These systems are generally on-grid or off-grid photovoltaic systems. In this study, a hybrid system is realized and analyzed. This system contains feature of on-g...
A Comparison of Five Methods for Missing Value Imputation in Data Sets
The missing values in the data sets do not allow for accurate analysis. Therefore, the correct imputation of missing values has become the focus of attention of researchers in recent years. This paper focuses on a compar...
Prediction of Evaporation Values of Konya Closed Basin via Developed Empirical Formula
Accurate evaporation prediction is significant for the management of water resources systems. The advantage of empirical formulas is that they don’t require a lot of parameters. In this study, evaporation values of mete...
A Reduced Reference Metric for Enhanced 3D Video Perception
Currently, one of the trending research topics among researchers working to enhance 3D video services is the 3 Dimensional (3D) video Quality of Experience (QoE) prediction metric development...