
Effect of sample number and location on accuracy of land use regression model in NO2 prediction



Abstract Land use regression (LUR) modeling is one of the most commonly used methods to predict the spatial distribution of ambient pollutant concentrations. The number and location of sampling sites are two key factors affecting LUR accuracy, yet their effects are not well understood. To explore these effects, we collected high-spatial-density NO2 monitoring data from 263 sites in Shijiazhuang, China, and designed four sampling strategies: random sampling, regular sampling, attribute hierarchical sampling, and purposive sampling. Under each strategy, LUR models were built repeatedly with an increasing number of modeling sites (NMS). The results showed that the NMS and the site locations strongly affected model performance, especially when the NMS was below 30. As the NMS increased, the accuracy of the LUR models gradually stabilized. For the study area, the minimum NMS required for LUR would be 30, and the ideal number would be 60. Purposive sampling was the most efficient strategy. R2 from model fitting and cross-validation was greatly inflated compared with hold-out validation, and the inflation was more pronounced at smaller NMS.
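
The experimental loop described in the abstract (fit a model on a subset of sites, then score it on the sites left out, for increasing NMS) can be illustrated with a minimal sketch. This is not the authors' procedure or data: it covers only the random sampling strategy, it replaces a full LUR model with a single-predictor least-squares fit, and the sites are synthetic. The names `road_density`, `make_synthetic_sites`, and `random_sampling_curve`, and all coefficients, are assumptions made for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def make_synthetic_sites(n_sites=263, seed=0):
    """Synthetic (predictor, NO2) pairs. NOT the study's monitoring data."""
    rng = random.Random(seed)
    sites = []
    for _ in range(n_sites):
        road_density = rng.uniform(0.0, 10.0)  # hypothetical land-use predictor
        no2 = 20.0 + 3.0 * road_density + rng.gauss(0.0, 8.0)  # assumed relation
        sites.append((road_density, no2))
    return sites


def random_sampling_curve(sites, sizes, seed=0):
    """For each NMS, fit on a random subset and score on the held-out sites."""
    rng = random.Random(seed)
    results = []
    for nms in sizes:
        shuffled = sites[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:nms], shuffled[nms:]
        intercept, slope = fit_line([x for x, _ in train], [y for _, y in train])
        train_pred = [intercept + slope * x for x, _ in train]
        test_pred = [intercept + slope * x for x, _ in test]
        r2_model = r_squared([y for _, y in train], train_pred)
        r2_holdout = r_squared([y for _, y in test], test_pred)
        results.append((nms, r2_model, r2_holdout))
    return results


if __name__ == "__main__":
    sites = make_synthetic_sites()
    for nms, r2_model, r2_holdout in random_sampling_curve(sites, [10, 20, 30, 60, 120]):
        print(f"NMS={nms:3d}  model R2={r2_model:.2f}  hold-out R2={r2_holdout:.2f}")
```

Repeating the loop over many random seeds, rather than one draw per NMS, would give a distribution of model and hold-out R2 at each sample size, which is closer in spirit to the repeated model building the abstract describes.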

Keywords: use regression; accuracy; regression model; number; model; land use

Journal Title: Atmospheric Environment
Year Published: 2020
