Abstract Selecting a proper set of covariates is one of the most important factors influencing the accuracy of digital soil mapping (DSM). Statistical and machine learning methods for selecting DSM covariates are not applicable when samples are limited. To address this problem, this paper proposes a case-based method that formalizes the covariate selection knowledge contained in practical DSM applications. The method trains Random Forest (RF) classifiers on DSM cases extracted from practical DSM applications and then uses the trained classifiers to determine whether each candidate covariate should be used in a new DSM application. In this study, we took topographic covariates as an example and extracted 191 DSM cases from 56 peer-reviewed journal articles to evaluate the proposed method by leave-one-out cross-validation. Compared with a way of selecting DSM covariates commonly used by novices, the proposed method improved accuracy by more than 30% according to three quantitative evaluation indices (recall, precision, and F1-score). The proposed method could also be applied to selecting covariates for other similar geographical modeling domains, such as landslide susceptibility mapping and species distribution modeling.
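The three evaluation indices named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the indices were computed per case, so the set-based comparison below (covariates recommended for a held-out case versus covariates the original study actually used) is an assumption, and the function name and covariate names are hypothetical.

```python
def selection_scores(recommended, used):
    """Set-based precision, recall, and F1 for one DSM case.

    Assumption: a recommended covariate counts as correct when the
    original study of the held-out case also used it.
    """
    recommended, used = set(recommended), set(used)
    hits = len(recommended & used)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(used) if used else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical case: topographic covariates used in a published study
# versus those recommended for it by some selection method.
used = {"slope", "elevation", "twi", "plan_curvature"}
recommended = {"slope", "elevation", "aspect"}

precision, recall, f1 = selection_scores(recommended, used)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Under leave-one-out cross-validation, scores of this kind would be computed once per held-out case and then averaged over all cases.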
               