"Imbalanced Data Processing Model for Software Defect Prediction"

Software defect prediction is an active research topic in software engineering because it helps assure quality during software development. However, class-imbalanced datasets reduce the overall classification accuracy of software defect prediction, and this remains a key problem to be solved. To address it, this paper proposes a model named ASRA, which combines attribute selection, sampling techniques, and an ensemble algorithm. The model first applies Chi-square attribute selection to remove redundant attributes, and then applies a combined sampling technique, consisting of SMOTE over-sampling and under-sampling, to balance the datasets. The ASRA model is then built with the AdaBoost ensemble algorithm, using the J48 decision tree as the base classifier. The data used in the experiments come from UCI datasets. Comparing the precision (P), F-measure, and AUC values obtained in the experiments shows that the model improves the classification performance of software defect prediction over previous results.
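The combined sampling step can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no parameters, so the function names (`smote`, `random_under_sample`, `combined_sample`), the neighbour count `k`, the per-class `target` size, and the toy data are all assumptions made for illustration. The sketch uses only the Python standard library and assumes numeric attributes and at least two minority samples.

```python
import random

def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic minority samples by interpolating between
    a minority sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

def random_under_sample(majority, n_keep, rng=None):
    """Keep n_keep majority samples chosen at random without replacement."""
    rng = rng or random.Random(0)
    return rng.sample(majority, n_keep)

def combined_sample(minority, majority, target, k=3, seed=0):
    """Over-sample the minority class up to `target` samples and
    under-sample the majority class down to `target` samples (assumed policy)."""
    rng = random.Random(seed)
    n_new = max(0, target - len(minority))
    new_minority = minority + smote(minority, n_new, k, rng)
    new_majority = random_under_sample(majority, min(target, len(majority)), rng)
    return new_minority, new_majority

# Toy data (hypothetical): 4 defective modules versus 12 non-defective ones.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.8, 2.1]]
majority = [[float(i), float(i % 3)] for i in range(5, 17)]

balanced_min, balanced_maj = combined_sample(minority, majority, target=8)
print(len(balanced_min), len(balanced_maj))
```

In a full pipeline along the lines the abstract describes, this step would run after attribute selection and before training the boosted decision trees, and it would be applied to the training split only so that synthetic samples do not leak into evaluation.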

Keywords: imbalanced data; defect prediction; model; software defect; software

Journal Title: Wireless Personal Communications
Year Published: 2018
