Class imbalance problems are prevalent in the real world. In such cases, traditional supervised algorithms tend to have difficulty recognizing minority data, because a model can maximize overall prediction accuracy simply by ignoring the minority class. Various approaches have been tried to address the class imbalance problem, including data preprocessing techniques, cost-sensitive learning, and ensemble modeling. Recently, several hybrid models that combine sampling methods with boosting have been proposed, such as RUSBoost, LIUBoost, and CUSBoost. In this study, a novel under-sampling-based boosting method named MPSUBoost is proposed to handle the class imbalance problem. The proposed method integrates modified PSU with AdaBoost. Benchmark tests on 35 highly imbalanced datasets indicated that the proposed method improved performance over three existing methods (RUSBoost, LIUBoost, and CUSBoost). Moreover, we verified that the samples obtained by MPSUBoost effectively represented the given majority data, which led to a competitive advantage on imbalanced data, particularly when true positives are imperative.
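To make the general idea of under-sampling-based boosting concrete, the following is a minimal sketch in plain Python. It is not MPSUBoost: the abstract does not describe the modified PSU step, so plain random under-sampling (in the spirit of RUSBoost) is swapped in as an assumption. The function names (`fit_stump`, `stump_predict`, `undersample_boost`, `predict`), the decision-stump weak learner, and the label convention (+1 for the minority class, -1 for the majority class) are all hypothetical choices made for illustration.

```python
import math
import random


def fit_stump(X, y, w):
    """Return the (feature, threshold, polarity) stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for pol in (1, -1):
                # Sum the weights of the rows this candidate stump misclassifies.
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (pol if row[j] >= t else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, pol)
    return best[1:]


def stump_predict(stump, row):
    j, t, pol = stump
    return pol if row[j] >= t else -pol


def undersample_boost(X, y, rounds=10, seed=0):
    """AdaBoost with random under-sampling of the majority class in every round.

    Assumption: labels are +1 (minority) and -1 (majority), and both classes occur.
    """
    rng = random.Random(seed)
    n = len(X)
    w = [1.0 / n] * n
    minority = [i for i in range(n) if y[i] == 1]
    majority = [i for i in range(n) if y[i] == -1]
    ensemble = []
    for _ in range(rounds):
        # Keep every minority row and draw an equally sized random majority subset.
        # (A method such as MPSUBoost would choose this subset differently.)
        idx = minority + rng.sample(majority, min(len(minority), len(majority)))
        stump = fit_stump([X[i] for i in idx],
                          [y[i] for i in idx],
                          [w[i] for i in idx])
        # Evaluate the weak learner on the full training set, as in AdaBoost.
        preds = [stump_predict(stump, row) for row in X]
        err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weights of misclassified rows, then renormalize.
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Weighted vote of all stumps; ties go to the minority class."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

A call such as `model = undersample_boost(X, y, rounds=10)` followed by `predict(model, row)` exercises the sketch. The only imbalance-specific part is the line that builds `idx`; replacing random selection there with a more representative choice of majority samples is the kind of change that distinguishes the methods named above.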
               