The lack of high-quality labeled data is considered one of the main bottlenecks for training machine learning and deep learning (DL) models. Weakly supervised learning, which uses incomplete, coarse, or inaccurate labels, is an alternative strategy for overcoming the scarcity of training data. We trained a U-Net model to segment building footprints from a high-resolution digital elevation model (DEM), using the existing labels from the open-access Microsoft building footprints (MS-BF) dataset. Comparison against an independent, manually labeled benchmark indicated that weak supervision succeeded: the quality of the model prediction [intersection over union (IoU): 0.876] surpassed that of the original Microsoft data (IoU: 0.672) by approximately 0.20 IoU, that is, about 20 percentage points. Moreover, adding extra channels of elevation derivatives (slope, aspect, and profile curvature) did not enhance the weak learning process, as the model learned directly from the original elevation data. Our results demonstrate the value of using existing data for training DL models even if they are noisy and incomplete.
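To make the IoU comparison concrete, the following is a minimal sketch of how intersection over union can be computed for binary building masks. It is an illustration only, not the authors' evaluation code: the function name `iou`, the toy 4x4 masks, and the convention that two empty masks score 1.0 are all assumptions introduced here. The last lines restate the arithmetic on the two IoU values quoted above.

```python
def iou(pred, truth):
    """Intersection over union of two equally sized binary masks.

    Each mask is a list of rows holding 0 (background) or 1 (building).
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")
    inter = union = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            inter += 1 if (p and t) else 0  # pixel is building in both masks
            union += 1 if (p or t) else 0   # pixel is building in at least one
    # Assumed convention: two empty masks agree perfectly.
    return inter / union if union else 1.0


# Toy tiles (made-up data): a manual benchmark, a shifted "weak" label,
# and a prediction that is nearly correct.
benchmark = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
weak_label = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
prediction = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]

print(f"weak label vs benchmark: {iou(weak_label, benchmark):.3f}")  # 3/9
print(f"prediction vs benchmark: {iou(prediction, benchmark):.3f}")  # 6/7

# Arithmetic on the IoU values reported in the abstract.
model_iou, msbf_iou = 0.876, 0.672
absolute_gain = model_iou - msbf_iou       # about 0.204 IoU
relative_gain = absolute_gain / msbf_iou   # about 0.30, i.e. roughly 30% relative
print(f"absolute gain: {absolute_gain:.3f}, relative gain: {relative_gain:.1%}")
```

The toy masks show how a spatially shifted label is penalized heavily by IoU even when it covers the same number of pixels as the benchmark, which is one way noisy footprint labels can score lower than a model trained on them.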
               