Abstract
Machine Learning-based landslide prediction models can be improved by optimizing scale, by customizing training samples so that the sets contain the best examples, by feature selection, and by similar means. Herein, a novel approach named Cross-Scaling is proposed, in which the training and testing sets have different resolutions. The hypothesis is that training on a coarser-resolution dataset and testing the model on a finer resolution should help the algorithm to better generalize ambiguous examples of landslide classes and yield fewer over- and underestimations in the model. This case study uses the City of Belgrade area for training and its south-eastern suburb for testing. The dataset is exceptionally rich in detailed geological, morphological and environmental data, so 24 landslide predictors were used for multi-class mapping: Class 0 (stable ground), Class 1 (dormant landslides) and Class 2 (active landslides). Two state-of-the-art algorithms were implemented: Support Vector Machines (SVM) and Random Forest. Additionally, the modelling included variants with feature selection based on Information Gain and Correlation Feature Selection. All these variants were modelled across four resolutions (25, 50, 100 and 200 m), and Cross-Scaling was implemented as follows: training on 50 and testing on 25, training on 100 and testing on 25, training on 100 and testing on 50, training on 200 and testing on 25, training on 200 and testing on 50, and finally, training on 200 and testing on 100 m resolution datasets. The results clearly show that Cross-Scaled models perform better than their non-Cross-Scaled counterparts, especially for Class 2, which confirms the initial hypothesis. Random Forest models tend to be less sensitive to scale and feature selection effects than SVM models. Class 1 remains the most difficult to discern, leaving room for further customization and adjustments.
In conclusion, Cross-Scaling is proposed as a technique that could become a promising tool for training/testing protocols in landslide assessment.
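
The six Cross-Scaling combinations listed above are exactly the pairs in which the training resolution is coarser than the testing resolution. The following minimal Python sketch enumerates them; the function name `cross_scaling_pairs` and the constant `RESOLUTIONS_M` are hypothetical, introduced only for illustration, and the sketch does not reproduce the study's actual modelling code.

```python
from itertools import combinations

# Cell sizes in metres, as listed in the abstract.
RESOLUTIONS_M = [25, 50, 100, 200]


def cross_scaling_pairs(resolutions):
    """Return (train, test) resolution pairs with train coarser than test.

    Hypothetical helper: it only enumerates the pairing scheme and does
    not train or evaluate any model.
    """
    ordered = sorted(resolutions)
    # combinations() yields (finer, coarser); flip so training is coarser.
    pairs = [(coarse, fine) for fine, coarse in combinations(ordered, 2)]
    return sorted(pairs)


pairs = cross_scaling_pairs(RESOLUTIONS_M)
for train_res, test_res in pairs:
    print(f"train on {train_res} m, test on {test_res} m")
```

Each resulting pair would then be modelled with every algorithm and feature selection variant, and compared with the models trained and tested at a single resolution.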
               