Since software emerged, locating software faults has been intensively researched, culminating in various approaches and tools that have been applied in real development. Despite the success of these developments, programmers still demand improved tools. Meanwhile, some programmers are reluctant to use any tools when locating faults in their development. The state of the art can be naturally improved by learning how programmers locate faults. The rapid development of open-source software has accumulated many bug fixes. A bug fix is a specific type of commit containing a set of buggy files and their corresponding fixed files, which reveal how programmers repair bugs. Feasibly, an automatic model can learn fault locations from bug fixes, but prior attempts to achieve this vision have been hindered by various technical challenges. For example, most bug fixes are not compilable after checkout, which prevents most advanced static/dynamic tools from analyzing them. This paper proposes an approach called ClaFa that trains a graph-based fault classifier from bug fixes. ClaFa is built on a recent partial-code tool called Grapa, which enables the analysis of partial programs by the complete-code tool WALA. Once Grapa has built a program dependency graph from a bug fix, ClaFa compares the graph from the buggy code with the graph from the fixed code, locates the buggy nodes, and extracts various graph features of the buggy and clean nodes. Based on the extraction result, ClaFa trains a classifier that combines AdaBoost and decision tree learning. The trained ClaFa can predict whether a node of a program dependency graph is buggy or clean. We evaluate ClaFa on thousands of buggy files collected from four open-source projects: Aries, Mahout, Derby, and Cassandra. The f-scores that ClaFa achieves are approximately 80% on all projects.
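
The graph-comparison and feature-extraction steps described above can be illustrated with a minimal sketch. It is not ClaFa's implementation: it assumes that a dependency graph is a plain mapping from statement text to successor statements, that a node of the buggy graph counts as buggy when it no longer appears in the fixed graph, and that in-degree and out-degree stand in for the graph features. The function names `label_buggy_nodes`, `node_features`, and `training_rows` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Assumption: a dependency graph maps each node (statement text) to its successors.
Graph = Dict[str, Set[str]]


def label_buggy_nodes(buggy: Graph, fixed: Graph) -> Dict[str, bool]:
    """Label a node of the buggy graph True if it is absent from the fixed graph."""
    return {node: node not in fixed for node in buggy}


def node_features(graph: Graph) -> Dict[str, Tuple[int, int]]:
    """Return (in-degree, out-degree) per node as a stand-in feature vector."""
    in_degree = Counter(s for succs in graph.values() for s in succs)
    return {node: (in_degree[node], len(succs)) for node, succs in graph.items()}


def training_rows(buggy: Graph, fixed: Graph) -> List[Tuple[Tuple[int, int], bool]]:
    """Pair each node's features with its buggy/clean label."""
    labels = label_buggy_nodes(buggy, fixed)
    features = node_features(buggy)
    return [(features[node], labels[node]) for node in buggy]


# Hypothetical bug fix: an off-by-one loop bound, "i <= n" repaired to "i < n".
buggy_graph: Graph = {
    "n = len(a)": {"i <= n"},
    "i = 0": {"i <= n"},
    "i <= n": {"use a[i]"},
    "use a[i]": set(),
}
fixed_graph: Graph = {
    "n = len(a)": {"i < n"},
    "i = 0": {"i < n"},
    "i < n": {"use a[i]"},
    "use a[i]": set(),
}

rows = training_rows(buggy_graph, fixed_graph)
```

Rows of this shape, collected over many bug fixes, are the kind of input a boosted decision-tree learner could be trained on; the features ClaFa actually extracts are richer than the two degrees used here.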
               