Adaptive Cross-Project Bug Localization with Graph Learning

Date

2022-06-07

Authors

Arumugam, Venkatraman

Publisher

University of Waterloo

Abstract

Bug localization is the process of identifying the source code files associated with a bug report. This is important because it allows developers to focus their efforts on fixing bugs rather than on searching for their root cause. A number of techniques have been developed for bug localization, and recent research has shown that supervised approaches trained on historical data are more effective than other methods. In practice, however, supervised approaches need large, high-quality labeled datasets, and preparing training data for new projects and retraining the bug localization models can be highly expensive. Additionally, most projects do not have rich historical bug data, as pointed out by Zimmermann et al. This necessitates cross-project bug localization, which uses data from one project to extract transferable features for localizing bugs in a new project. In this thesis, we aim to provide a bug localization model that locates buggy source code files in a new project without retraining, by leveraging the transfer learning capability of deep learning models, which can be trained once on a label-rich dataset and transferred to a new dataset. We propose AdaBL and AdaBL+GL, both of which can be trained once and transferred to a new project. The main idea behind AdaBL is to learn the syntactic and semantic relationships between bug reports and source code separately. The syntactic patterns are transferable features that are shared across projects. We pair AdaBL with a graph neural network that represents the source code as a graph to improve the semantic learning capability. We also conducted a detailed survey of bug localization research published since 2016 to examine the experimental settings used and the availability of replication packages for deep learning-based bug localization research.
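To make the task formulation concrete, the following is a minimal sketch of bug localization as a ranking problem: given a bug report, score every source file and return the files in descending order of relevance. It is not AdaBL or AdaBL+GL. It swaps in a plain TF-IDF cosine similarity as the scoring function, and the file names, file contents, and bug report text are hypothetical examples. Because it uses no labeled history, it can be run on an unseen project without retraining, which is the setting the abstract targets, although it captures only lexical overlap.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens, with camelCase identifiers split into parts."""
    tokens = []
    for word in re.findall(r"[A-Za-z]+", text):
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
        tokens.extend(p.lower() for p in parts)
    return tokens


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of token lists."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    # Smoothed inverse document frequency (an assumption of this sketch).
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in df}
    return [{t: c * idf[t] for t, c in Counter(tokens).items()} for tokens in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_files(bug_report, files):
    """Return (file name, score) pairs, most similar to the bug report first."""
    names = list(files)
    docs = [tokenize(files[name]) for name in names] + [tokenize(bug_report)]
    vectors = tfidf_vectors(docs)
    query = vectors[-1]
    scored = [(name, cosine(query, vec)) for name, vec in zip(names, vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical project and bug report, used only for illustration.
files = {
    "ReportExporter.java": "class ReportExporter { void exportPdf(Report report) {} }",
    "LoginController.java": "class LoginController { void authenticateUser(String password) {} }",
    "CacheManager.java": "class CacheManager { void evictExpiredEntries() {} }",
}
bug = "Login fails with NullPointerException when the user password is empty"

ranking = rank_files(bug, files)
for name, score in ranking:
    print(f"{score:.3f}  {name}")
```

A learned model such as the one the thesis proposes would replace `cosine` over TF-IDF vectors with a trained scoring function, while the ranking interface stays the same.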

Keywords

software engineering, graph neural networks, bug localization, cross-project, deep learning
