"Effect of sample number and location on accuracy of land use regression model in NO2 prediction"

Abstract: The land use regression (LUR) model is one of the most commonly used methods for predicting the spatial distribution of ambient pollutant concentrations. The number and location of sampling sites are two key factors affecting LUR accuracy, yet their effects are not well understood. To explore these effects, we collected high-spatial-density NO2 monitoring data from a total of 263 sites in Shijiazhuang, China, and designed four sampling strategies: random sampling, regular sampling, attribute hierarchical sampling, and purposive sampling. Under each strategy, LUR models were built repeatedly with an increasing number of modeling sites (NMS). The results showed that NMS and site location strongly affected model performance, especially when NMS was below 30. As NMS increased, the accuracy of the LUR models gradually stabilized. For the study area, the minimum NMS required for LUR would be 30, and the ideal number would be 60. Purposive sampling was the most efficient strategy. R2 from model fitting and cross-validation was greatly inflated compared with hold-out validation, and this inflation was more pronounced at smaller NMS.
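
The contrast between model-fitting R2 and hold-out R2 at different NMS can be illustrated with a minimal sketch. It is not the authors' procedure: it uses synthetic data rather than the Shijiazhuang measurements, covers only the random sampling strategy, and fits ordinary least squares on a fixed set of five hypothetical predictors with no variable selection step. The function names, coefficients, and noise level are all assumptions chosen for illustration.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_ols(coef, X):
    return coef[0] + X @ coef[1:]


def r2_by_site_count(X, y, site_counts, n_repeats=200, seed=0):
    """For each NMS, draw random modeling sites, fit, and score the model
    on the modeling sites and on all remaining (hold-out) sites.
    Returns {nms: (mean model R2, mean hold-out R2)}."""
    rng = np.random.default_rng(seed)
    n_total = len(y)
    results = {}
    for n in site_counts:
        model_r2, holdout_r2 = [], []
        for _ in range(n_repeats):
            idx = rng.permutation(n_total)
            train, test = idx[:n], idx[n:]
            coef = fit_ols(X[train], y[train])
            model_r2.append(r_squared(y[train], predict_ols(coef, X[train])))
            holdout_r2.append(r_squared(y[test], predict_ols(coef, X[test])))
        results[n] = (float(np.mean(model_r2)), float(np.mean(holdout_r2)))
    return results


# Synthetic stand-in for 263 monitoring sites (assumed values, not real data).
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(263, 5))            # hypothetical land-use predictors
true_coef = np.array([3.0, -2.0, 1.0, 0.5, 0.0])
y = 40.0 + X @ true_coef + data_rng.normal(scale=4.0, size=263)

results = r2_by_site_count(X, y, [10, 20, 30, 60, 120])
for nms, (fit_r2, hold_r2) in results.items():
    print(f"NMS={nms:3d}  model R2={fit_r2:.2f}  hold-out R2={hold_r2:.2f}")
```

In a sketch like this, the gap between the two R2 values is expected to shrink as NMS grows, which is the qualitative pattern the abstract reports. The specific thresholds of 30 and 60 sites come from the study's own data and cannot be reproduced from synthetic inputs.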

Keywords: use regression; accuracy; regression model; number; model; land use

Journal Title: Atmospheric Environment
Year Published: 2020
