Handling Multicollinearity: A Comparative Study Of The Prediction Performance Of Some Methods Based On Some Probability Distributions
Journal Title: Annals. Computer Science Series - Year 2018, Vol 16, Issue 1
Abstract
This study used three probability distributions (gamma, beta and chi-square) to assess the prediction performance of partial least squares regression (PLSR), ridge regression (RR) and LASSO regression (LR). Ordinary least squares may fail when the predictor variables are collinear or nearly so. The three methods were therefore compared on simulated data following gamma, beta and chi-square distributions, with the number of variables P=4 and 10 and sample sizes n=60 and 90. The comparison used Mean Square Log Error (MSLE), Mean Absolute Error (MAE) and R-squared (R2). With P=4 and n=60, RR performed better under the gamma distribution, whereas PLSR performed better under the chi-square distribution. With P=4 and n=90, RR gave better results under both the gamma and beta distributions, while under the chi-square distribution all methods had equal predictive ability. With P=10 and n=60, RR performed better under both the gamma and chi-square distributions, while under the beta distribution all methods had equal predictive ability. With P=10 and n=90, RR gave better results under both the gamma and chi-square distributions, while PLSR performed better under the beta distribution.
Authors and Affiliations
ZAKARI Yahaya ZAKARI, S. A. Yau, U. USMAN
On the Estimation of Empty cell Probabilities in a Contingency Table
In this paper, an Independent Binary Model (IBM) is proposed. It is aimed at estimating cell probabilities in an r x c contingency table when some of the cells have zero count. Existing methods in this situation are eith...
Variable Selection in the Modeling of Nigeria Economic Growth
This study aimed at identifying and retaining factors that contributed immensely to economic growth in Nigeria based on some variable selection methods. Stepwise regressions are often not efficient when there is multicol...
Neuro-Fuzzy Expert System For Diagnosis Of Thyroid Diseases
The computerization of medical procedures has been identified to be one of the major challenges in the medical sector. Several techniques have been used in order to automate the processes in diagnosis of diseases; such p...
Modelling of Enugu State Monthly Rainfall using Box and Jenkins Methodology
The paper examined the rainfall distribution of Enugu State in Nigeria. The Box-Jenkins methodology was used to build an ARIMA model to analyze data and forecast for the period of 15 years, from January, 2002 to December, 2016...
Computer applications in clinical psychology
Computer-assisted analysis is no longer a novelty, but a necessity in all areas of psychology. A number of studies that examine the limits of the computer assisted and analyzed interpretations, also its advantage...