In this letter, we establish two sampling schemes to select training and test sets for supervised classification. We do so to investigate whether the estimated generalization capabilities of learned models can be positively biased by the use of spatial features. Many spatial features impose homogeneity constraints on the image data, whereby a spatially connected set of image elements is attributed identical feature values. In addition to the frequently occurring intrinsic spatial autocorrelation, this introduces extrinsic spatial autocorrelation into the image data. The first sampling scheme follows a spatially random partitioning into training and test sets. In contrast, the second strategy implements a spatially disjoint partitioning, which in particular accounts for the topological constraints that arise from the deployment of spatial features. Experimental results are obtained from multi- and hyperspectral acquisitions over urban environments. They show that a large share of the differences between the estimated generalization capabilities obtained with the spatially disjoint and non-disjoint sampling strategies can be attributed to the use of spatial features, with the differences increasing with the size of the spatial neighborhood considered for computing a spatial feature. This stresses the necessity of a proper spatial sampling scheme for model evaluation to avoid overoptimistic model assessments.
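The two partitioning strategies can be illustrated with a minimal sketch. The example below is hypothetical and not the letter's exact protocol: `random_split` draws training pixels uniformly at random from the labelled pixels of a reference map, while `disjoint_split` restricts the test set to labelled pixels whose spatial-feature window, assumed here to be a square of a given radius, does not overlap a user-supplied training region. The function names, the `train_mask` input, and the buffer construction via binary dilation are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def random_split(labels, train_fraction=0.5, seed=0):
    """Spatially random scheme: labelled pixels are assigned to the
    training set uniformly at random, regardless of their location."""
    rng = np.random.default_rng(seed)
    rows, cols = np.nonzero(labels > 0)          # labelled pixel coordinates
    order = rng.permutation(rows.size)
    n_train = int(train_fraction * rows.size)
    tr, te = order[:n_train], order[n_train:]
    return (rows[tr], cols[tr]), (rows[te], cols[te])

def disjoint_split(labels, train_mask, radius):
    """Spatially disjoint scheme (illustrative): training pixels come from a
    user-defined spatial region (train_mask); test pixels are labelled pixels
    whose (2*radius+1)^2 feature window does not touch that region, so no
    image element contributes to both sets."""
    labelled = labels > 0
    # Grow the training region by the feature-window radius so that every
    # pixel whose window overlaps the training area is excluded from testing.
    window = np.ones((2 * radius + 1, 2 * radius + 1), dtype=bool)
    buffer = binary_dilation(train_mask, structure=window)
    train = np.nonzero(labelled & train_mask)
    test = np.nonzero(labelled & ~buffer)
    return train, test
```

If the radius is chosen as the neighborhood size used to compute the spatial features, the buffer ensures that no image element enters both a training and a test feature vector, which is the kind of topological constraint the second scheme is meant to enforce.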