Abstract Ambient ozone (O3), a secondary photochemical pollutant, poses a serious risk to human health. Accurate estimation of O3 exposure requires monitoring surface O3 concentrations at high spatiotemporal resolution. Several spatiotemporal land use regression (LUR) models have integrated meteorological factors using different statistical algorithms to support such epidemiological studies. Among these algorithms, we aim to identify an efficient modeling method, as well as the most suitable length of the modeling period (time scale). Three typical types of spatiotemporal LUR models, based on parametric, semi-parametric, and non-parametric statistical methods, respectively, are considered for predicting daily ground-level O3 in the megacity of Tianjin, China. Built at monthly, seasonal (cold and warm), and annual time scales, these models are: a series of monthly hybrid LUR (Two-stage) models consisting of two sub-models based on the multiple linear regression (MLR) algorithm; generalized additive mixed models (GAMMs); and land use random forest (LURF) models. Leave-one-out cross-validation was performed to evaluate the temporal and spatial predictive accuracy of each model using the adjusted coefficient of determination (adjR2CV) and the root mean square error (RMSECV). For the GAMM and LURF models, those using a shorter time scale (monthly models) outperformed those using a longer one. Among the monthly models, the GAMMs performed best, with the highest average adjR2CV (0.747) and the lowest average RMSECV (15.721 μg/m3), followed by the LURF models (average adjR2CV = 0.695, average RMSECV = 16.405 μg/m3) and the Two-stage models (average adjR2CV = 0.466, average RMSECV = 23.934 μg/m3). Thus, combining a shorter time scale with the GAMM algorithm performs relatively well in predicting daily O3 pollution on a megacity scale.
These findings can be used to select appropriate modeling methods for epidemiological research of O3 pollution.
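The evaluation step described in the abstract, leave-one-out cross-validation scored by an adjusted R2 and an RMSE, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single-predictor linear model as a stand-in for the LUR models, an R2 computed as 1 - SS_res/SS_tot between observations and held-out predictions, and the standard adjustment for the number of predictors. The function names `fit_line` and `loocv_scores` and all data values are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x (stand-in model, an assumption)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def loocv_scores(x, y, n_predictors=1):
    """Hold out each observation in turn, refit on the rest, and predict it.

    Returns (adjusted R2, RMSE) computed from the held-out predictions.
    """
    n = len(y)
    preds = []
    for i in range(n):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        a, b = fit_line(x_train, y_train)
        preds.append(a + b * x[i])

    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    mean_y = sum(y) / n
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    # Standard adjustment for sample size n and number of predictors p.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    rmse = math.sqrt(ss_res / n)
    return adj_r2, rmse

if __name__ == "__main__":
    # Made-up illustrative values: a predictor (e.g. temperature) and daily O3.
    temperature = [12.0, 15.0, 18.0, 21.0, 24.0, 27.0, 30.0, 33.0]
    ozone = [48.0, 61.0, 66.0, 80.0, 95.0, 99.0, 118.0, 121.0]
    adj_r2_cv, rmse_cv = loocv_scores(temperature, ozone)
    print(f"adjR2_CV = {adj_r2_cv:.3f}, RMSE_CV = {rmse_cv:.3f}")
```

In a spatial validation, the held-out unit would be a monitoring site rather than a single observation; the scoring logic stays the same.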
               