LAUSR.org creates dashboard-style pages of related content for over 1.5 million academic articles. Sign Up to like articles & get recommendations!

A variable-level automated defect identification model based on machine learning

Photo by cokdewisnu from unsplash

Static analysis tools, automatically detecting potential source code defects at an early phase during the software development process, are diffusely applied in safety-critical software fields. However, alarms reported by the… Click to show full abstract

Static analysis tools, automatically detecting potential source code defects at an early phase during the software development process, are diffusely applied in safety-critical software fields. However, alarms reported by the tools need to be inspected manually by developers, which is inevitable and costly, whereas a large proportion of them are found to be false positives. Aiming at automatically classifying the reported alarms into true defects and false positives, we propose a defect identification model based on machine learning. We design a set of novel features at variable level, called variable characteristics, for building the classification model, which is more fine-grained than the existing traditional features. We select 13 base classifiers and two ensemble learning methods for model building based on our proposed approach, and the reported alarms classified as unactionable (false positives) are pruned for the purpose of mitigating the effort of manual inspection. In this paper, we firstly evaluate the approach on four open-source C projects, and the classification results show that the proposed model achieves high performance and reliability in practice. Then, we conduct a baseline experiment to evaluate the effectiveness of our proposed model in contrast to traditional features, indicating that features at variable level improve the performance significantly in defect identification. Additionally, we use machine learning techniques to rank the variable characteristics in order to identify the contribution of each feature to our proposed model.

Keywords: model; machine learning; variable level; defect identification

Journal Title: Soft Computing
Year Published: 2020

Link to full text (if available)


Share on Social Media:                               Sign Up to like & get
recommendations!

Related content

More Information              News              Social Media              Video              Recommended



                Click one of the above tabs to view related content.