Handling Multicollinearity: A Comparative Study of the Prediction Performance of Some Methods Based on Some Probability Distributions
Journal Title: Annals. Computer Science Series - Year 2018, Vol 16, Issue 1
Abstract
This study used three probability distributions (gamma, beta and chi-square) to assess the prediction performance of partial least squares regression (PLSR), ridge regression (RR) and LASSO regression (LR). Ordinary least squares may fail when the predictor variables are collinear or nearly so. These methods (PLSR, RR and LR) were therefore compared using simulated data following gamma, beta and chi-square distributions, with the number of variables P = 4 and 10 and sample sizes n = 60 and 90. The comparison used Mean Square Log Error (MSLE), Mean Absolute Error (MAE) and R-squared (R2). With P = 4 and n = 60, RR performed better under the gamma distribution, whereas PLSR was the better method under the chi-square distribution. With P = 4 and n = 90, RR gave better results under both the gamma and beta distributions, while under the chi-square distribution all methods had equal predictive ability. With P = 10 and n = 60, RR performed better under both the gamma and chi-square distributions, while under the beta distribution all methods had equal predictive ability. With P = 10 and n = 90, RR gave better results under the gamma and chi-square distributions, while PLSR performed better under the beta distribution.
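The kind of evaluation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the simulation design, the penalty value or the train/test split, so everything below is an assumption made for illustration: the helper names (`simulate_collinear_gamma`, `ridge_fit`, `scores`), the shared-component construction used to induce collinearity, the gamma shape parameter, `alpha=1.0` and the 45/15 split. Only ridge regression is shown, via its closed-form solution in NumPy; PLSR and LASSO need iterative fitting and could be added with, for example, scikit-learn's `PLSRegression` and `Lasso`.

```python
import numpy as np

def simulate_collinear_gamma(n, p, rng, shape=2.0, noise=0.1):
    """Gamma-distributed predictors sharing a common component (assumed design)."""
    base = rng.gamma(shape, 1.0, size=(n, 1))              # component shared by all columns
    X = base + noise * rng.gamma(shape, 1.0, size=(n, p))  # columns are highly correlated
    y = X @ np.ones(p) + rng.gamma(shape, 1.0, size=n)     # positive response
    return X, y

def ridge_fit(X, y, alpha):
    """Closed-form ridge estimate on centered data; the intercept is not penalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

def scores(y_true, y_pred):
    """MSLE, MAE and R2, the three criteria named in the abstract."""
    y_pred_nonneg = np.clip(y_pred, 0.0, None)  # log1p requires values >= 0 here
    msle = np.mean((np.log1p(y_true) - np.log1p(y_pred_nonneg)) ** 2)
    mae = np.mean(np.abs(y_true - y_pred))
    r2 = 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"MSLE": msle, "MAE": mae, "R2": r2}

rng = np.random.default_rng(0)
X, y = simulate_collinear_gamma(n=60, p=4, rng=rng)  # P = 4, n = 60 as in the abstract
X_train, y_train = X[:45], y[:45]                    # assumed hold-out split
X_test, y_test = X[45:], y[45:]

coef, intercept = ridge_fit(X_train, y_train, alpha=1.0)  # alpha is an arbitrary choice
results = scores(y_test, X_test @ coef + intercept)
print(results)
```

Repeating this over many simulated data sets, for each distribution and each (P, n) combination, and averaging the three scores per method would give a comparison table of the kind the abstract summarizes.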
Authors and Affiliations
ZAKARI Yahaya ZAKARI, S. A. Yau, U. USMAN