Optimizing Automated Bug Localization for Practical Use

dc.contributor.author: Chakraborty, Partha
dc.date.accessioned: 2024-12-13T18:54:38Z
dc.date.available: 2024-12-13T18:54:38Z
dc.date.issued: 2024-12-13
dc.date.submitted: 2024-12-11
dc.description.abstract: A considerable share of development resources and effort goes into fixing software bugs, and resolving a bug requires first identifying its root cause in the codebase. Automated bug localization tools aim to assist in this process, but their limited effectiveness has led to low adoption. This low adoption points to a gap between research goals and developers' expectations and underscores the need for better tools. This thesis explores and addresses the challenges that developers and tool builders face in putting bug localization tools into practice, with a focus on understanding developers' expectations and improving the tools' overall effectiveness. We first conduct a mixed-method empirical study of developers' expectations. The study reveals that developers are willing to use bug localization tools but have concerns about accuracy and potential leakage of intellectual property, and that only 27.5% of developers are familiar with these tools. It indicates that more reliable performance, better integration, flexibility, transparency, and contextual understanding are needed to increase adoption and effectiveness. We also examine performance issues in bug localization tools, particularly in their foundation, the embedding model. We find that pre-training strategy, data familiarity, and input sequence length significantly affect the performance of embedding techniques, and that using project-specific data and pre-training methods such as ELECTRA can improve model performance by 25.9%. Additionally, we explore the use of reinforcement learning (RL) in bug localization and propose an RL agent called RLocator. RLocator learns from developer feedback, making it suitable for low-data environments.
We also propose BLAZE, an efficient bug localization technique for cross-project and cross-language settings. By combining dynamic chunking, which adjusts the size of the input passed to the model, with hard example learning, BLAZE achieves up to a 144% improvement in Mean Average Precision (MAP) over previous tools. In conclusion, our findings highlight shortcomings in the adaptability and efficiency of current tools. We advocate for highly adaptable cross-language, cross-project bug localizers to raise adoption among developers. With our observations, curated datasets, and proposed methods, tool builders can create more user-friendly bug localization tools for software developers and inspire further innovation in this field.
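For readers unfamiliar with the metric cited in the abstract, the following is a minimal sketch of Mean Average Precision as it is commonly defined in information retrieval, applied to ranked lists of source files. It is not taken from the thesis: the function names, the file paths, and the baseline and improved scores in the usage lines are hypothetical, and the thesis may normalize average precision differently (this sketch divides by the total number of buggy files).

```python
from __future__ import annotations

from typing import Sequence


def average_precision(ranked_files: Sequence[str], buggy_files: set[str]) -> float:
    """Average precision for one bug report.

    Sums precision@k at every rank k where a truly buggy file appears,
    then divides by the total number of buggy files (assumed convention).
    """
    if not buggy_files:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, path in enumerate(ranked_files, start=1):
        if path in buggy_files:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(buggy_files)


def mean_average_precision(results: Sequence[tuple[Sequence[str], set[str]]]) -> float:
    """Mean of the per-report average precision over all bug reports."""
    if not results:
        return 0.0
    return sum(average_precision(r, b) for r, b in results) / len(results)


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, as a percentage."""
    return (improved - baseline) / baseline * 100.0


# Hypothetical rankings for two bug reports (illustrative data only).
reports = [
    (["a.py", "b.py", "c.py"], {"a.py", "c.py"}),  # AP = (1/1 + 2/3) / 2
    (["x.py", "y.py"], {"y.py"}),                  # AP = (1/2) / 1
]
map_score = mean_average_precision(reports)

# Hypothetical scores: a 144% relative gain means the new MAP is
# 2.44 times the baseline, not 1.44 points higher.
gain = relative_improvement(0.25, 0.61)
```

Under this definition, a reported relative improvement is a ratio between two MAP values, so its practical size depends on how low the baseline was.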
dc.identifier.uri: https://hdl.handle.net/10012/21249
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo (en)
dc.subject: bug
dc.subject: fault
dc.subject: localization
dc.subject: embedding
dc.subject: deep learning
dc.subject: machine learning
dc.subject: bug reports
dc.subject: large language model
dc.subject: llm
dc.title: Optimizing Automated Bug Localization for Practical Use
dc.type: Doctoral Thesis
uws-etd.degree: Doctor of Philosophy
uws-etd.degree.department: David R. Cheriton School of Computer Science
uws-etd.degree.discipline: Computer Science
uws-etd.degree.grantor: University of Waterloo (en)
uws-etd.embargo.terms: 0
uws.contributor.advisor: Nagappan, Meiyappan
uws.contributor.affiliation1: Faculty of Mathematics
uws.peerReviewStatus: Unreviewed (en)
uws.published.city: Waterloo (en)
uws.published.country: Canada (en)
uws.published.province: Ontario (en)
uws.scholarLevel: Graduate (en)
uws.typeOfResource: Text (en)

Files

Original bundle

Name: Chakraborty_Partha.pdf
Size: 21.05 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission