Application of imputation methods for missing values of PM10 and O3 data: Interpolation, moving average and K-nearest neighbor methods

Journal Title: Environmental Health Engineering and Management Journal - Year 2021, Vol 8, Issue 3

Abstract

Background: In air quality studies, data are often missing because of machine failure or human error. The approach used to handle such missing data can affect the results of the analysis. This study aimed to review the types of missing-data mechanisms and imputation methods, apply some of these methods to impute missing PM10 and O3 values in Tabriz, and compare their efficiency. Methods: The methods used were mean imputation, the EM algorithm, regression, classification and regression tree, predictive mean matching (PMM), interpolation, moving average, and K-nearest neighbor (KNN). PMM was investigated with the spatial and temporal dependencies included in the model. Missing data were simulated at random with 10, 20, and 30% missing values. The efficiency of the methods was compared using the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE). Results: Across all indicators, interpolation performed best, followed by moving average and then KNN. PMM performed poorly both with and without spatio-temporal information. Conclusion: Because pollution data depend on the preceding and following observations, methods that compute imputed values from the observations before and after a gap performed better than the others. These methods are therefore recommended for pollutant data.
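The evaluation design described in the abstract (mask known values at random, impute them, then score the imputed values against the truth) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' code: the series is synthetic rather than the Tabriz measurements, the function names are hypothetical, the moving-average window is arbitrary, and R² is computed as 1 - SS_res/SS_tot, which may differ from the definition used in the paper.

```python
import math
import random


def mask_at_random(series, fraction, seed=0):
    """Copy `series`, set a random `fraction` of interior points to None,
    and return the masked copy together with the masked indices."""
    rng = random.Random(seed)
    interior = list(range(1, len(series) - 1))  # endpoints stay observed
    missing = sorted(rng.sample(interior, round(fraction * len(series))))
    masked = list(series)
    for i in missing:
        masked[i] = None
    return masked, missing


def linear_interpolate(masked):
    """Fill each gap on a straight line between its nearest observed neighbors."""
    filled = list(masked)
    observed = [i for i, v in enumerate(masked) if v is not None]
    for left, right in zip(observed, observed[1:]):
        for i in range(left + 1, right):
            w = (i - left) / (right - left)
            filled[i] = masked[left] + w * (masked[right] - masked[left])
    return filled


def moving_average_impute(masked, window=2):
    """Fill each gap with the mean of observed values within `window` steps
    on either side, widening the window if no observed value is in range."""
    filled = list(masked)
    for i, v in enumerate(masked):
        if v is not None:
            continue
        w, neighbors = window, []
        while not neighbors:
            lo, hi = max(0, i - w), min(len(masked), i + w + 1)
            neighbors = [x for x in masked[lo:hi] if x is not None]
            w += 1
        filled[i] = sum(neighbors) / len(neighbors)
    return filled


def scores(truth, imputed, missing):
    """Return (MAE, RMSE, R2) over the masked positions only."""
    t = [truth[i] for i in missing]
    p = [imputed[i] for i in missing]
    n = len(t)
    mae = sum(abs(a - b) for a, b in zip(t, p)) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(t, p))
    rmse = math.sqrt(ss_res / n)
    mean_t = sum(t) / n
    ss_tot = sum((a - mean_t) ** 2 for a in t)
    return mae, rmse, 1 - ss_res / ss_tot


def main():
    # Synthetic hourly-like series with a daily cycle plus noise (illustrative only).
    rng = random.Random(42)
    truth = [50 + 20 * math.sin(2 * math.pi * t / 24) + rng.gauss(0, 2)
             for t in range(240)]
    methods = {"interpolation": linear_interpolate,
               "moving average": moving_average_impute}
    for fraction in (0.10, 0.20, 0.30):
        masked, missing = mask_at_random(truth, fraction, seed=1)
        for name, method in methods.items():
            mae, rmse, r2 = scores(truth, method(masked), missing)
            print(f"{fraction:.0%} missing, {name}: "
                  f"MAE={mae:.2f} RMSE={rmse:.2f} R2={r2:.2f}")


if __name__ == "__main__":
    main()
```

Scoring only the masked positions keeps the comparison fair, since the observed values are reproduced exactly by any method. KNN and the model-based methods named in the abstract are omitted here to keep the sketch short.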

Authors and Affiliations

Parisa Saeipourdizaj, Parvin Sarbakhsh, Akbar Gholampour

  • EP ID EP696923
  • DOI 10.34172/EHEM.2021.25

How To Cite

Parisa Saeipourdizaj, Parvin Sarbakhsh, Akbar Gholampour (2021). Application of imputation methods for missing values of PM10 and O3 data: Interpolation, moving average and K-nearest neighbor methods. Environmental Health Engineering and Management Journal, 8(3), -. https://europub.co.uk/articles/-A-696923