
An extensive evaluation of ensemble techniques for software change prediction


Predicting which areas of the source code are more likely to change in the future is an important activity that allows developers to plan preventive maintenance operations. For this reason, several change prediction models have been proposed. Moreover, the research community has shown that the choice of classifier affects the performance of these models, and that classifiers tend to perform similarly overall even though they correctly predict the change proneness of different code elements, which possibly indicates some complementarity among them. In this paper, we investigate in more depth whether ensemble approaches, i.e., machine learning techniques able to combine multiple classifiers, can improve the performance of change prediction models. Specifically, we built three change prediction models based on different predictors, i.e., product metrics, process metrics, and developer-related factors, and compared the performance of four ensemble techniques (i.e., Boosting, Random Forest, Bagging, and Voting) with that of standard machine learning classifiers (i.e., Logistic Regression, Naive Bayes, Simple Logistic, and Multilayer Perceptron). The study was conducted on 33 releases of 10 open-source systems, and the results show that ensemble methods, and Random Forest in particular, provide a significant improvement of more than 10% in terms of F-measure. The statistical analyses conducted confirm the superiority of this ensemble technique. Moreover, the model built using developer-related factors performed better than the models that exploit product and process metrics, achieving an overall median F-measure of around 77%.
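
To make the idea of complementarity concrete, here is a minimal sketch of majority voting, one of the ensemble techniques named in the abstract, scored with the F-measure. The labels and the three classifier outputs are invented toy data, and the function names `f_measure` and `majority_vote` are hypothetical; none of this comes from the paper's dataset, models, or tooling.

```python
def f_measure(actual, predicted):
    """F-measure (harmonic mean of precision and recall) for the
    positive class, here read as 'change-prone'."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    fp = sum(1 for a, p in pairs if not a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def majority_vote(*predictions):
    """Label an element change-prone when more than half of the
    classifiers say so."""
    return [sum(votes) * 2 > len(votes) for votes in zip(*predictions)]


# Toy data (assumption, for illustration only): 1 = change-prone.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
clf_a = [1, 1, 1, 0, 1, 0, 0, 0]
clf_b = [1, 1, 0, 1, 0, 1, 0, 0]
clf_c = [1, 0, 1, 1, 0, 0, 1, 0]

voted = majority_vote(clf_a, clf_b, clf_c)

for name, pred in [("A", clf_a), ("B", clf_b), ("C", clf_c)]:
    print(name, f_measure(actual, pred))  # each prints 0.75
print("vote", f_measure(actual, voted))   # prints 1.0
```

In this contrived case the three classifiers score identically but err on different elements, so the vote corrects every individual mistake. Real classifiers overlap in their errors, so gains of this size should not be expected in practice.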

Keywords: ensemble techniques; change prediction; change; prediction models; software

Journal Title: Journal of Software: Evolution and Process
Year Published: 2019
