"A Data Transfer and Relevant Metrics Matching Based Approach for Heterogeneous Defect Prediction"

Heterogeneous defect prediction (HDP) is a promising research area in the software defect prediction domain that addresses the unavailability of past homogeneous data. In HDP, prediction is performed using a source dataset whose independent features (metrics) are entirely different from those of the target dataset. One important assumption in machine learning is that the independent features of the source and target datasets should be relevant to each other for better prediction accuracy. However, this assumption does not generally hold in HDP. Further, in HDP the source dataset selected for a given target dataset may be small, resulting in insufficient training. To resolve these issues, we propose a novel heterogeneous data preprocessing method, namely Transfer of Data from Target dataset to Source dataset selected using Relevance score (TDTSR), for heterogeneous defect prediction. In the proposed approach, we use the chi-square test to select the relevant metrics between the source and target datasets, and we perform experiments using the proposed approach with various machine learning algorithms. Our proposed method shows an improvement of at least 14% in AUC score in the HDP scenario compared to existing state-of-the-art models.
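
The abstract does not say how the chi-square test is applied, so the following is only a minimal sketch of one common way to score metric relevance, not the TDTSR procedure itself. It assumes each metric is binarized at its median and tested against the binary defect label within a single labeled dataset. The function names (`chi_square_2x2`, `relevance_scores`, `top_k_metrics`) and the median cut are illustrative assumptions, not taken from the paper.

```python
from statistics import median


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A degenerate table (an empty row or column) carries no signal.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def relevance_scores(rows, labels):
    """Score each metric against the 0/1 defect labels.

    rows   : list of metric vectors, one per software module
    labels : list of 0/1 defect labels, aligned with rows
    Assumption: a metric is binarized as "high" when above its median.
    """
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        cut = median(column)
        a = b = c = d = 0
        for value, label in zip(column, labels):
            high = value > cut
            if high and label == 1:
                a += 1  # high metric value, defective
            elif high:
                b += 1  # high metric value, clean
            elif label == 1:
                c += 1  # low metric value, defective
            else:
                d += 1  # low metric value, clean
        scores.append(chi_square_2x2(a, b, c, d))
    return scores


def top_k_metrics(rows, labels, k):
    """Return the indices of the k metrics with the highest scores."""
    scores = relevance_scores(rows, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]
```

In this sketch a higher score means a stronger association between the binarized metric and the defect label. How scores computed on the source and target datasets would be compared or matched is left open here, because the abstract does not describe it.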

Keywords: heterogeneous defect; source; prediction; defect prediction; relevant metrics; approach

Journal Title: IEEE Transactions on Software Engineering
Year Published: 2023
