Cross-project defect prediction (CPDP) aims to build a prediction model on existing source projects and predict the labels of a target project. Differences in data distribution between projects make CPDP very challenging. Moreover, most existing CPDP methods require sufficient labeled data. However, acquiring large amounts of labeled data for a new project is difficult, whereas obtaining unlabeled data is relatively easy. A desirable approach is therefore to build a prediction model on both unlabeled and labeled data. CPDP in this scenario is called cross-project semi-supervised defect prediction (CSDP). Recently, generative adversarial networks have achieved impressive results owing to their strong ability to learn data distributions and discriminative representations. To effectively learn discriminative features of data from different projects, we propose a Discriminative Adversarial Feature Learning (DAFL) approach for CSDP. DAFL consists of a feature transformer and a project discriminator, which compete with each other. The feature transformer tries to generate a feature representation that learns the discriminant information and preserves the intrinsic structure inferred from both labeled and unlabeled data. The project discriminator tries to discriminate between source and target instances on the generated representation. Experiments on 16 projects show that DAFL performs significantly better than the baselines.
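The competition between a feature transformer and a project discriminator can be illustrated with a minimal NumPy sketch. This is not the DAFL implementation: the abstract does not specify the networks or loss functions, so the linear transformer `W`, the logistic defect classifier `v`, the logistic project discriminator `u`, the trade-off weight `lam`, and the function name `train_adversarial` are all assumptions made for illustration. The structure-preserving term mentioned above is omitted.

```python
import numpy as np

def sigmoid(a):
    # Clip logits to avoid overflow in exp.
    return 1.0 / (1.0 + np.exp(-np.clip(a, -30.0, 30.0)))

def bce_grads(Z, w, y):
    """Mean binary cross-entropy of sigmoid(Z @ w) against y.
    Returns (loss, gradient w.r.t. w, gradient w.r.t. Z)."""
    p = sigmoid(Z @ w)
    eps = 1e-12
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    err = (p - y) / len(y)  # gradient w.r.t. the logits
    return loss, Z.T @ err, np.outer(err, w)

def train_adversarial(Xs, ys, Xt, k=4, lam=0.1, lr=0.1, steps=200, seed=0):
    """Hypothetical sketch: Xs/ys are labeled source data, Xt is unlabeled target data."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(Xs.shape[1], k))  # feature transformer (assumed linear)
    v = np.zeros(k)                                   # defect classifier head
    u = np.zeros(k)                                   # project discriminator head
    X_all = np.vstack([Xs, Xt])
    # Project labels: 1 = source instance, 0 = target instance.
    proj = np.concatenate([np.ones(len(Xs)), np.zeros(len(Xt))])
    for _ in range(steps):
        # Discriminator step: learn to tell source from target features.
        _, grad_u, _ = bce_grads(X_all @ W, u, proj)
        u -= lr * grad_u
        # Transformer step: fit source defect labels while making the
        # discriminator's task harder (ascend its loss, weighted by lam).
        _, grad_v, grad_Zs = bce_grads(Xs @ W, v, ys)
        _, _, grad_Z_proj = bce_grads(X_all @ W, u, proj)
        grad_W = Xs.T @ grad_Zs - lam * (X_all.T @ grad_Z_proj)
        v -= lr * grad_v
        W -= lr * grad_W
    return W, v, u
```

Under these assumptions, defect probabilities for target instances would be obtained as `sigmoid(Xt @ W @ v)`; how well this transfers depends on choices (such as `lam` and the feature dimension `k`) that the abstract does not describe.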
               