Fine Particulate Matter Concentration Level Prediction by using Tree-based Ensemble Classification Algorithms

Abstract

Pollutant forecasting is an important problem in the environmental sciences, and data mining offers a way to discover knowledge from large data sets. This paper uses data mining methods to forecast the concentration level of PM2.5, an important air pollutant. Several tree-based classification algorithms are available in data mining, such as CART, C4.5, Random Forest (RF), and C5.0. RF and C5.0 are popular ensemble methods: RF builds on CART with bagging, and C5.0 builds on C4.5 with boosting. This paper builds predictive models of the PM2.5 concentration level based on RF and C5.0, using R packages. The data set covers the period 2000-2011 in a new town of Hong Kong. The PM2.5 concentration is divided into two levels, with the cut point at 25 µg/m^3 (24-hour mean). Under 10-fold cross-validation repeated 100 times, the RF model gives the best testing accuracy, at around 0.845 to 0.854.
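The evaluation setup described in the abstract (a two-level target cut at 25 µg/m^3, scored by 10-fold cross-validation repeated 100 times) can be sketched roughly as follows. This is an illustrative sketch in plain Python, not the authors' R code. The names `to_level`, `repeated_kfold`, `mean_accuracy_per_repeat`, `fit_majority`, and `predict_majority` are hypothetical, the majority-class baseline merely stands in for the RF and C5.0 models, and treating a reading of exactly 25 as the lower level is an assumption, since the abstract does not say which side the boundary falls on.

```python
import random

THRESHOLD = 25.0  # µg/m^3, 24-hour mean; the cut point stated in the abstract


def to_level(concentration):
    """Map a 24-hour mean PM2.5 concentration to one of two levels."""
    # Assumption: a value exactly at the threshold counts as "low".
    return "high" if concentration > THRESHOLD else "low"


def repeated_kfold(n_samples, k=10, repeats=100, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for r in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh random partition for every repeat
        folds = [idx[i::k] for i in range(k)]
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield r, f, train_idx, test_idx


def mean_accuracy_per_repeat(X, y, fit, predict, k=10, repeats=100, seed=0):
    """Return one mean test accuracy per repeat (averaged over the k folds)."""
    per_repeat = [[] for _ in range(repeats)]
    for r, _, train_idx, test_idx in repeated_kfold(len(y), k, repeats, seed):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict(model, [X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        per_repeat[r].append(correct / len(test_idx))
    return [sum(accs) / len(accs) for accs in per_repeat]


def fit_majority(X_train, y_train):
    """Placeholder 'model': remember the most frequent training label."""
    return max(sorted(set(y_train)), key=y_train.count)


def predict_majority(model, X_test):
    """Predict the remembered label for every test row."""
    return [model for _ in X_test]


if __name__ == "__main__":
    # Made-up readings for illustration only; not data from the paper.
    readings = [12.0, 31.5, 18.2, 40.1, 22.9, 27.3, 9.8, 15.0, 35.6, 20.4,
                11.1, 29.9, 24.0, 26.2, 17.7, 13.3, 33.0, 21.5, 19.0, 16.4]
    y = [to_level(c) for c in readings]
    X = [[c] for c in readings]  # a real model would use predictor variables
    accs = mean_accuracy_per_repeat(X, y, fit_majority, predict_majority,
                                    k=10, repeats=5)
    print(min(accs), max(accs))
```

Passing the classifier in as a `fit`/`predict` pair keeps the resampling logic separate from the model, so the same loop could score any two-level classifier. Reporting the minimum and maximum over repeats is one way to arrive at an accuracy range; whether the reported 0.845 to 0.854 was computed that way is not stated in the abstract.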

Authors and Affiliations

Yin Zhao, Yahya Hasan

  • EP ID EP114913
  • DOI 10.14569/IJACSA.2013.040503

How To Cite

Yin Zhao, Yahya Hasan (2013). Fine Particulate Matter Concentration Level Prediction by using Tree-based Ensemble Classification Algorithms. International Journal of Advanced Computer Science & Applications, 4(5), 21-27. https://europub.co.uk/articles/-A-114913