A Novel Data Handling Technique for Wine Quality Analysis using ML Techniques
Journal Title: International Journal of Experimental Research and Review - Year 2024, Vol 45, Issue 9
Abstract
Wine is a widely consumed beverage, and the industry is seeing increased sales due to product quality certification. This research aims to identify the key wine characteristics that contribute most to quality outcomes by applying machine learning (ML) classification techniques, specifically Random Forest (RF), Decision Tree (DT) and Multi-Layer Perceptron (MLP), to the white and red wine datasets from the UCI Machine Learning Repository. The goal is a multiclass classification model that accurately assesses wine quality on balanced versions of both datasets. Each dataset is balanced by random oversampling to avoid the bias toward the majority class that ML techniques exhibit on an imbalanced multiclass dataset (IMD). Furthermore, we apply a Yeo-Johnson transformation (YJT) to the datasets to reduce skewness. We validated the results of the ML algorithms using 10-fold cross-validation and found that RF yielded the highest overall accuracy, 93.14%, with accuracies across all models ranging from about 75% to 94%. For the balanced white wine dataset, the proposed approach achieves an accuracy of 93.14% using RF, 90.83% using DT, and 75.49% using MLP. Similarly, for the balanced red wine dataset, accuracy is 89.36% using RF, 85.36% using DT, and 78.00% using MLP. The proposed approach improves accuracy by 23% for RF, 30% for DT, and 21% for MLP on the white wine dataset. On the red wine dataset, RF accuracy remained the same, while DT accuracy improved by 10% and MLP accuracy by 22%. Additionally, with the proposed approach, RF, DT, and MLP yield mean squared error (MSE) values of 0.080, 0.151, and 0.443 for the white wine dataset and 0.143, 0.221, and 0.396 for the red wine dataset, respectively. RF accuracy with the proposed technique is the highest among all the specified classifiers for both the white and red wine datasets.
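The two data handling steps named in the abstract, random oversampling and the Yeo-Johnson transformation, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `random_oversample` and `yeo_johnson`, the toy feature values, and the fixed `lam` value are assumptions made for illustration only. In practice the Yeo-Johnson parameter is usually estimated from the data (for example by maximum likelihood) rather than fixed by hand.

```python
import math
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement only the rows needed to reach the target.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def yeo_johnson(x, lam):
    """Yeo-Johnson power transform of a single value for a fixed lambda."""
    if x >= 0:
        if abs(lam) < 1e-12:
            return math.log1p(x)
        return ((x + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-12:
        return -math.log1p(-x)
    return -(((-x + 1.0) ** (2.0 - lam)) - 1.0) / (2.0 - lam)


# Toy, made-up data: one skewed feature and an imbalanced quality label.
features = [[0.2], [0.3], [0.4], [0.5], [1.9], [7.5]]
quality = [5, 5, 5, 6, 6, 7]

balanced_rows, balanced_quality = random_oversample(features, quality)
print(Counter(balanced_quality))  # every class now has the same count

lam = 0.0  # assumed value; lam = 0 gives log(1 + x) for non-negative x
transformed = [[yeo_johnson(v, lam) for v in row] for row in balanced_rows]
```

Oversampling is applied before the transform here only to keep the sketch short. When the balanced data is evaluated with cross-validation, duplicated rows can land in both the training and test folds, so the order of resampling and splitting is a design choice worth stating explicitly.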
Authors and Affiliations
Onima Tigga, Jaya Pal, Debjani Mustafi