Impact of Data Preprocessing Techniques on the Performance of Machine Learning Models for Drought Prediction
Journal Title: Acadlore Transactions on AI and Machine Learning - Year 2025, Vol 4, Issue 1
Abstract
Drought, a complex natural phenomenon with profound global impacts, including the depletion of water resources, reduced agricultural productivity, and ecological disruption, has become a critical challenge in the context of climate change. Effective drought prediction models are essential for mitigating these adverse effects. This study investigates how data preprocessing steps, specifically class imbalance handling and dimensionality reduction, affect the performance of machine learning models for drought prediction. The Synthetic Minority Over-sampling Technique (SMOTE) and NearMiss undersampling were employed to address class imbalance within the dataset. Additionally, Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) were applied for dimensionality reduction, aiming to improve computational efficiency while retaining essential features. Decision tree algorithms were trained on the preprocessed data to assess the impact of these techniques on model accuracy, precision, recall, and F1-score. The results indicate that SMOTE-based sampling significantly enhances the overall performance of the drought prediction model, particularly in terms of accuracy and robustness. Furthermore, the combination of SMOTE, PCA, and LDA demonstrates a substantial improvement in model reliability and generalizability. These findings underscore the importance of carefully selecting and applying appropriate preprocessing techniques to address class imbalance and reduce the feature space, thereby optimizing the performance of machine learning models in drought prediction. The study also offers insights for future research in climate-related prediction tasks.
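To make the oversampling step concrete, the following is a minimal sketch of the core idea behind SMOTE: each synthetic minority sample is placed on the line segment between an existing minority sample and one of its nearest minority-class neighbours. It is not the authors' implementation. The function name `smote_oversample`, the toy feature values, and the choice of `k` are assumptions made purely for illustration, and in practice a maintained library such as imbalanced-learn would normally be used instead.

```python
import math
import random


def smote_oversample(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points by interpolating between minority samples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: two scaled features per sample (illustrative only).
majority = [(0.9, 0.2), (0.8, 0.3), (0.85, 0.25),
            (0.7, 0.35), (0.95, 0.15), (0.75, 0.3)]
minority = [(0.1, 0.9), (0.2, 0.8), (0.15, 0.85)]

# Generate just enough synthetic samples to match the majority class size.
new_points = smote_oversample(minority, n_synthetic=len(majority) - len(minority))
balanced_minority = minority + new_points
```

Because the new points are interpolated rather than duplicated, the minority class gains variety within the region it already occupies. Such resampling is usually applied to the training split only, so that evaluation metrics reflect the original class distribution.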
Authors and Affiliations
Serap Erçel, Sinem Akyol