Optimal Tree Depth in Decision Tree Classifiers for Predicting Heart Failure Mortality
Journal Title: Healthcraft Frontiers - Year 2023, Vol 1, Issue 1
Abstract
The depth of a decision tree (DT) affects the performance of a DT classifier in predicting mortality caused by heart failure (HF). A deeper tree can learn more complex patterns within the data, which in theory leads to better predictive performance. However, a very deep tree can also overfit, because the model memorizes the training data rather than generalizing to new, unseen data, resulting in lower classification performance on test data. Conversely, a shallow tree captures little of the complexity within the data, leading to underfitting and, again, lower performance. Pruning has been proposed as a way to limit the maximum tree depth or the minimum number of instances required to split a node, thereby reducing computational complexity. Pruning helps avoid overfitting, but it does not by itself identify the optimal depth of the tree. To build a better-performing DT classifier, it is therefore crucial to find the tree depth that yields the best performance. This study proposes cross-validation to find the optimal tree depth using validation data. The cross-validated accuracy on training and test data is evaluated empirically on the HF dataset, which contains 299 observations with 11 features and was collected from the Kaggle machine learning (ML) data repository. The results show that tuning the DT depth is important for balancing the learning process of the DT, so that relevant patterns are captured while overfitting is avoided. Although cross-validation proves effective in determining the optimal DT depth, this study does not compare it with other methods for determining the optimal depth, such as grid search, pruning algorithms, or information criteria, which is a limitation of the study.
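The depth selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes scikit-learn, uses 5-fold stratified cross-validation and a depth range of 1 to 10 (both illustrative choices), and substitutes synthetic data of the same shape (299 observations, 11 features) for the HF dataset. The helper name `select_depth` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.tree import DecisionTreeClassifier


def select_depth(X, y, depths=range(1, 11), n_splits=5, seed=0):
    """Return the depth with the highest mean validation accuracy.

    Hypothetical helper: the fold count and depth range are assumptions,
    not values reported in the study.
    """
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    results = {}
    for depth in depths:
        model = DecisionTreeClassifier(max_depth=depth, random_state=seed)
        scores = cross_validate(
            model, X, y, cv=cv, scoring="accuracy", return_train_score=True
        )
        # Mean accuracy on the training folds and on the held-out folds.
        results[depth] = (
            scores["train_score"].mean(),
            scores["test_score"].mean(),
        )
    # Ties are resolved in favour of the shallowest depth.
    best_depth = max(results, key=lambda d: results[d][1])
    return best_depth, results


# Synthetic stand-in with the same shape as the HF dataset (299 x 11).
# This is NOT the Kaggle data; it only makes the sketch runnable.
X, y = make_classification(
    n_samples=299, n_features=11, n_informative=5, random_state=0
)

best_depth, results = select_depth(X, y)
for depth, (train_acc, val_acc) in results.items():
    print(f"depth={depth:2d}  train={train_acc:.3f}  validation={val_acc:.3f}")
print("selected depth:", best_depth)
```

Comparing the two columns shows the trade-off the abstract describes: training accuracy keeps rising as depth grows, while validation accuracy typically peaks and then flattens or falls once the tree starts to overfit.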
Authors and Affiliations
Tsehay Admassu Assegie, Ahmed Elaraby