Show simple item record

dc.contributor.authorLiu, Taiyue 16:51:09 (GMT) 04:50:07 (GMT)
dc.description.abstractAutomated program repair has been a heated topic in software engineering. In recent years, we have witnessed many successful applications such as Genprog, SPR, RSRepair, etc. Given a bug and its test suite, which includes both passed test cases and failed test cases, these tools aim to automatically generate a patch that fixes the bug without developers' efforts. All these tools adopt a "Generate-and-Validate" approach, which assumes a tool-generated patch to be correct as long as it passes all its test cases. However, if test suites are of poor quality that cannot cover all the situations, incorrect tool-generated patches might pass all their test cases and be regarded as correct patches. We call such patches that are incorrect but can pass whose test suites as overfitted patches. In order to investigate the reasons why overfitted patches are generated and to reduce overfitted patches, we perform a deep analysis on the patches composed by developers, and the patches (i.e., the correct and the overfitted patches) that are generated by Genprog and SPR. In this thesis, we propose two orthogonal approaches to filter out overfitted patches: 1) To preserve correct tool-generated patches and filter out only overfitted patches, we propose some patterns, named anti-patterns, that can efficiently distinguish correct patches against overfitted patches. We select nine bugs from the Genprog benchmark data set to evaluate the anti-patterns. By embedding the anti-patterns into SPR and filtering out overfitted patches, on average, developers can review 44.7% less tool-generated patches to reach correct patches. Meanwhile, by filtering out overfitted patches at runtime, the anti-patterns speed up SPR's efficiency by 1.34 times on average. 2) We leverage machine learning techniques with meaningful features to predict the correctness of tool-generated patches. Our results show that the machine learning approach cannot preserve correct patches well. In other words, machine learning techniques would mis-classify correct patches as overfitted patches and filter them out. Thus, we believe the machine learning approach requires significant future work, e.g., more representative features and effective classification algorithms, to be useful in practice. These two orthogonal approaches provide automatic program repair tools with valuable guidance on how to avoid generating overfitted patches.en
dc.publisherUniversity of Waterlooen
dc.subjectAutomated Program Repairen
dc.subjectStaged Program Repairen
dc.subjectMachine Learningen
dc.subjectSoftware Engineeringen
dc.subjectOverfitted Patchesen
dc.titleAnti-Patterns for Automatic Program Repairsen
dc.typeMaster Thesisen
dc.pendingfalse and Computer Engineeringen and Computer Engineeringen of Waterlooen
uws-etd.degreeMaster of Applied Scienceen
uws-etd.embargo.terms1 yearen
uws.contributor.advisorTan, Lin
uws.contributor.affiliation1Faculty of Engineeringen

Files in this item


This item appears in the following Collection(s)

Show simple item record


University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.

DSpace software

Service outages