Undersampling is one of the most popular techniques for dealing with class-imbalance problems. Various undersampling methods have emerged over the past few decades, and each shows advantages in some scenarios. However, selecting representative majority-class samples so that the structure of the selected groups still reflects the underlying imbalanced distribution remains a challenge. To address this, this paper proposes Spatial Distribution-based UnderSampling (SDUS) for imbalanced learning. SDUS uses a supervised constructive process to learn majority-class local patterns in terms of sphere neighborhoods (SPN). Two sample selection strategies, a top-down strategy and a bottom-up strategy, are proposed to preserve the distribution pattern of the original data from different perspectives when selecting majority-class sample subsets. SDUS also introduces an ensemble technique that improves learning performance by exploiting the diversity caused by the randomness of the local-pattern learning process. Numerical experiments on 38 typical datasets from the KEEL repository, against 13 state-of-the-art comparison methods, demonstrate the effectiveness of SDUS in maintaining the underlying distribution characteristics during undersampling. A Python implementation of SDUS is available at https://github.com/ytyancp/SDUS.
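
To make the idea of distribution-preserving undersampling with sphere neighborhoods more concrete, the following is a minimal sketch of the general idea, not the SDUS algorithm itself. The abstract does not specify how spheres are built or how samples are picked, so everything here is an assumption made for illustration: the function names `build_sphere_neighborhoods` and `undersample` are hypothetical, each sphere's radius is taken as the distance from a randomly chosen majority sample to its nearest minority sample, and the per-sphere quota (proportional to sphere size) merely stands in for the paper's top-down and bottom-up strategies.

```python
import numpy as np


def build_sphere_neighborhoods(X_maj, X_min, rng):
    """Partition majority samples into spheres that contain no minority sample.

    Assumption (not from the paper): a sphere is centred on a random
    uncovered majority sample, with radius equal to the distance from
    that centre to its nearest minority sample.
    """
    uncovered = np.ones(len(X_maj), dtype=bool)
    spheres = []  # each entry: array of majority indices inside one sphere
    while uncovered.any():
        center = rng.choice(np.flatnonzero(uncovered))
        radius = np.linalg.norm(X_min - X_maj[center], axis=1).min()
        dist = np.linalg.norm(X_maj - X_maj[center], axis=1)
        members = np.flatnonzero(uncovered & (dist < radius))
        if members.size == 0:
            # Centre coincides with a minority sample: keep it as a singleton.
            members = np.array([center])
        spheres.append(members)
        uncovered[members] = False
    return spheres


def undersample(X_maj, X_min, n_keep, seed=0):
    """Return indices of retained majority samples (roughly n_keep of them).

    Assumption: each sphere contributes a share proportional to its size,
    with at least one sample, so dense and sparse regions both survive.
    """
    rng = np.random.default_rng(seed)
    spheres = build_sphere_neighborhoods(X_maj, X_min, rng)
    keep = []
    for members in spheres:
        quota = max(1, round(n_keep * len(members) / len(X_maj)))
        quota = min(quota, len(members))
        keep.extend(rng.choice(members, size=quota, replace=False))
    return np.sort(np.array(keep))


# Toy usage on synthetic two-dimensional data.
data_rng = np.random.default_rng(42)
X_maj = data_rng.normal(loc=0.0, size=(200, 2))
X_min = data_rng.normal(loc=2.0, size=(20, 2))
idx = undersample(X_maj, X_min, n_keep=20, seed=0)
X_maj_reduced = X_maj[idx]
```

Because every sphere keeps at least one sample, the number of retained points is only approximately `n_keep`. Sphere construction depends on the random choice of centres, so different seeds yield different subsets; this is the kind of randomness the abstract says the ensemble step exploits, although the sketch does not implement an ensemble.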
               