"SPE$^{2}$: Self-Paced Ensemble of Ensembles for Software Defect Prediction"

Software defect prediction aims to identify defect-prone code regions automatically before the defects are discovered. Accurate prediction helps software practitioners prioritize their testing efforts. In recent decades, dozens of approaches have been proposed and have achieved good results in this field. In practical scenarios, however, many projects have few labeled instances, and most of those labeled instances are nondefective. Together, the lack of training data and the class imbalance problem pose serious challenges to software defect prediction. So far, few prevailing approaches handle these two difficulties simultaneously. One important reason is that they pay inadequate attention to several key instances that are difficult to classify in a small, imbalanced dataset. This article introduces the concept of “instance hardness” to integrate the various difficulties of imbalanced classification tasks. Based on this concept, a novel imbalance learning framework named self-paced ensemble of ensembles (SPE$^{2}$) is proposed for software defect prediction. SPE$^{2}$ aims to generate a strong ensemble of ensembles by harmonizing instance hardness in a self-paced manner via undersampling. SPE$^{2}$ is compared extensively with eight imbalance learning approaches on ten open-source defect datasets. The experiments indicate that SPE$^{2}$ achieves better F-measure values than its existing counterparts, and that the improvements are significant according to Brunner’s statistical significance test and Cliff’s effect sizes.
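For readers unfamiliar with the two evaluation quantities named in the abstract, the following is a minimal Python sketch of the standard definitions of the F-measure (taken here as F1, the harmonic mean of precision and recall on the defective class) and Cliff's delta effect size. It is not the authors' evaluation code: the function names `f_measure` and `cliffs_delta` are hypothetical, and the choice of F1 rather than another F-beta variant is an assumption, since the abstract does not specify it.

```python
from typing import Sequence


def f_measure(tp: int, fp: int, fn: int) -> float:
    """F1 score from confusion counts for the defective (positive) class.

    Assumption: the abstract's "F-measure" is read as F1 (beta = 1).
    """
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cliffs_delta(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Cliff's delta: (#(x > y) - #(x < y)) / (len(xs) * len(ys)).

    Ranges from -1 (every x below every y) to +1 (every x above every y);
    0 means the two samples overlap completely.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))
```

In a comparison like the one the abstract describes, `xs` and `ys` would hold the F-measure values of two approaches across repeated runs or datasets; a delta near +1 would indicate that the first approach scores higher in almost every pairing.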

Keywords: software defect; prediction

Journal Title: IEEE Transactions on Reliability
Year Published: 2022
