A Machine Learning Tool for Weighted Regressions in Time, Discharge, and Season
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2014, Vol 5, Issue 3
Abstract
A new machine learning tool has been developed to classify water stations with similar water quality trends. The tool is based on Weighted Regressions in Time, Discharge, and Season (WRTDS), a statistical method developed by the United States Geological Survey (USGS) to estimate daily concentrations of water constituents in rivers and streams from continuous daily discharge data and discrete water quality samples collected at the same or nearby locations. WRTDS relies on parametric survival regressions and uses a jack-knife cross-validation procedure that generates unbiased estimates of the prediction errors. One disadvantage of WRTDS is that it needs a large number of samples (n > 200) collected over at least two decades. In this article, the tool is used to evaluate Boosted Regression Trees (BRT) as an alternative to the parametric survival regressions for water quality stations with a small number of samples. We describe the development of the machine learning tool and compare the two methods, WRTDS and BRT. The purpose of the tool is to evaluate how much the variability of the estimates is reduced by clustering data from nearby stations with similar concentration and discharge characteristics. The results indicate that, with clustering, the concentrations predicted by BRT are in general higher than the observed concentrations. In addition, BRT appears to generate a higher sum of squared residuals than the parametric survival regressions.
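To make the weighting idea in the abstract concrete, the following is a minimal sketch of a WRTDS-style local regression, not the authors' tool. It fits ln(c) = b0 + b1*t + b2*ln(Q) + b3*sin(2*pi*t) + b4*cos(2*pi*t) around each estimation point, with tricube weights on the distance in time, log discharge, and season, and scores the fit with leave-one-out (jack-knife) residuals. Several things here are assumptions made for illustration: the function names, the half-window widths, the synthetic data, and the use of ordinary weighted least squares in place of the parametric survival regression (so censored samples are not handled). The sketch also does not widen the windows when too few samples receive a non-zero weight.

```python
import numpy as np

def tricube(d, h):
    """Tricube weight: 1 at distance 0, falling to 0 at |d| >= h."""
    u = np.clip(np.abs(d) / h, 0.0, 1.0)
    return (1.0 - u**3) ** 3

def design(t, log_q):
    """Design matrix: intercept, time, log discharge, annual sine and cosine."""
    return np.column_stack([
        np.ones_like(t), t, log_q,
        np.sin(2 * np.pi * t), np.cos(2 * np.pi * t),
    ])

def wrtds_weights(t, log_q, t0, log_q0, h_time=7.0, h_q=2.0, h_season=0.5):
    """Product of time, discharge, and season weights (half-widths are assumed)."""
    d_time = t - t0                              # t is in decimal years
    frac = np.abs(d_time) % 1.0
    d_season = np.minimum(frac, 1.0 - frac)      # circular distance within the year
    return (tricube(d_time, h_time)
            * tricube(log_q - log_q0, h_q)
            * tricube(d_season, h_season))

def predict_log_conc(t, log_q, log_c, t0, log_q0, exclude=None):
    """Weighted least-squares estimate of ln(c) at (t0, log_q0)."""
    w = wrtds_weights(t, log_q, t0, log_q0)
    if exclude is not None:
        w = w.copy()
        w[exclude] = 0.0                         # drop the held-out sample
    sw = np.sqrt(w)
    X = design(t, log_q)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], log_c * sw, rcond=None)
    x0 = design(np.array([t0], dtype=float), np.array([log_q0], dtype=float))[0]
    return float(x0 @ beta)

def loo_ssr(t, log_q, log_c):
    """Leave-one-out predictions and their sum of squared residuals."""
    preds = np.array([
        predict_log_conc(t, log_q, log_c, t[i], log_q[i], exclude=i)
        for i in range(len(t))
    ])
    return float(np.sum((log_c - preds) ** 2)), preds

# Synthetic example (made-up data, for illustration only).
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 20.0, 120))         # 20 years of sample dates
log_q = rng.uniform(-1.0, 1.0, 120)              # log discharge
log_c = 1.0 + 0.02 * t + 0.5 * log_q + 0.3 * np.sin(2 * np.pi * t)
log_c = log_c + rng.normal(0.0, 0.1, 120)        # add noise
ssr, _ = loo_ssr(t, log_q, log_c)
print("leave-one-out sum of squared residuals:", ssr)
```

The same `loo_ssr` score could be computed for any alternative regressor, such as a boosted regression tree, to compare the two approaches on an equal footing.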
Authors and Affiliations
Alexander Maestre, Eman El-Sheikh, Derek Williamson, Amelia Ward