Handling Multicollinearity; A Comparative Study Of The Prediction Performance Of Some Methods Based On Some Probability Distributions
Journal Title: Annals. Computer Science Series - Year 2018, Vol 16, Issue 1
Abstract
This study used data simulated from three probability distributions (gamma, beta and chi-square) to assess the prediction performance of partial least squares regression (PLSR), ridge regression (RR) and LASSO regression (LR). Ordinary least squares may fail when the predictor variables are collinear, nearly collinear or otherwise strongly related. The three methods (PLSR, RR and LR) were therefore compared on simulated data following gamma, beta and chi-square distributions, with the number of variables P=4 and 10 and sample sizes n=60 and 90. The comparison used Mean Square Log Error (MSLE), Mean Absolute Error (MAE) and R-square (R2). With P=4 and n=60, RR performed better under the gamma distribution, while PLSR was the better method under the chi-square distribution. With P=4 and n=90, RR gave better results under both the gamma and beta distributions, while under the chi-square distribution all methods had equal predictive ability. With P=10 and n=60, RR performed better under both the gamma and chi-square distributions, while under the beta distribution all methods had equal predictive ability. With P=10 and n=90, RR gave better results under both the gamma and chi-square distributions, while PLSR performed better under the beta distribution.
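The three comparison criteria named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas the authors used, so the definitions below are assumptions: MSLE uses the common log(1 + x) convention, and R2 is the usual coefficient of determination. The function names and the sample values are hypothetical, chosen only for illustration.

```python
import math


def mean_squared_log_error(y_true, y_pred):
    """MSLE, assuming the log(1 + x) convention; values must exceed -1."""
    return sum((math.log1p(t) - math.log1p(p)) ** 2
               for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference between observed and predicted."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R2 = 1 - SS_res / SS_tot (assumes y_true is not constant)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted responses (not data from the study).
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 3.0, 8.0]

msle = mean_squared_log_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
print(msle, mae, r2)
```

Under these definitions, lower MSLE and MAE and higher R2 indicate better predictive performance, which is presumably how the methods were ranked.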
Authors and Affiliations
ZAKARI Yahaya ZAKARI, S. A. Yau, U. USMAN