Improving the Correctness of Automated Program Repair
MetadataShow full item record
Developers spend much of their time fixing bugs in software programs. Automated program repair (APR) techniques aim to alleviate the burden of bug fixing from developers by generating patches at the source-code level. Recently, Generate-and-Validate (G&V) APR techniques show great potential to repair general bugs in real-world applications. Recent evaluations show that G&V techniques repair 8–17.7% of the collected bugs from mature Java or C open-source projects. Despite the promising results, G&V techniques may generate many incorrect patches and are not able to repair every single bug. This thesis makes contributions to improve the correctness of APR by improving the quality assurance of the automatically-generated patches and generating more correct patches by leveraging human knowledge. First, this thesis investigates whether improving the test-suite-based validation can precisely identify incorrect patches that are generated by G&V, and whether it can help G&V generate more correct patches. The result of this investigation, Opad, which combines new fuzz-generated test cases and additional oracles (i.e., memory oracles), is proposed to identify incorrect patches and help G&V repair more bugs correctly. The evaluation of Opad shows that the improved test-suite-based validation identifies 75.2% incorrect patches from G&V techniques. With the integration of Opad, SPR, one of the most promising G&V techniques, repairs one additional bug. Second, this thesis proposes novel APR techniques to repair more bugs correctly, by leveraging human knowledge. Thus, APR techniques can repair new types of bugs that are not currently targeted by G&V APR techniques. Human knowledge in bug-fixing activities is noted in the forms such as commits of bug fixes, developers’ expertise, and documentation pages. Two techniques (APARE and Priv) are proposed to target two types of defects respectively: project-specific recurring bugs and vulnerability warnings by static analysis. APARE automatically learns fix patterns from historical bug fixes (i.e., originally crafted by developers), utilizes spectrum-based fault-localization technique to identify highly-likely faulty methods, and applies the learned fix patterns to generate patches for developers to review. The key innovation of APARE is to utilize a percentage semantic-aware matching algorithm between fix patterns and faulty locations. For the 20 recurring bugs, APARE generates 34 method fixes, 24 of which (70.6%) are correct; 83.3% (20 out of 24) are identical to the fixes generated by developers. In addition, APARE complements current repair systems by generating 20 high-quality method fixes that RSRepair and PAR cannot generate. Priv is a multi-stage remediation system specifically designed for static-analysis security-testing (SAST) techniques. The prototype is built and evaluated on a commercial SAST product. The first stage of Priv is to prioritize workloads of fixing vulnerability warnings based on shared fix locations. The likely fix locations are suggested based on a set of rules. The rules are concluded and developed through the collaboration with two security experts. The second stage of Priv provides additional essential information for improving the efficiency of diagnosis and fixing. Priv offers two types of additional information: identifying true database/attribute-related warnings, and providing customized fix suggestions per warning. The evaluation shows that Priv suggests identical fix locations to the ones suggested by developers for 50–100% of the evaluated vulnerability findings. Priv identifies up to 2170 actionable vulnerability findings for the evaluated six projects. The manual examination confirms that Priv can generate patches of high-quality for many of the evaluated vulnerability warnings.
Cite this version of the work
Jinqiu Yang (2018). Improving the Correctness of Automated Program Repair. UWSpace. http://hdl.handle.net/10012/13224