Towards Understanding and Improving Code Review Quality

Kononenko, Oleksii

Towards Understanding and Improving Code Review Quality

Files

Kononenko_Oleksii.pdf (1.33 MB)

Date

2017-06-12

Authors

Kononenko, Oleksii

Advisor

Godfrey, Michael
Baysal, Olga

Publisher

University of Waterloo

Abstract

Code review is an essential element of any mature software development project, it is key to ensuring the long-term quality of the code base. Code review aims at evaluating code contributions submitted by developers before they are committed into the project's version control system. Code review is considered to be one of the most effective QA practices for software projects. In principle, the code review process should improve the quality of committed code changes. However, in practice, the execution of this process can still allow bugs to enter into the codebase unnoticed. Moreover, the notion of the quality of the code review process is not limited to the quality of the source code that passed a review. It goes beyond that, the quality of the code review process can affect how successful a software development project is. For instance, in the world of open source software (OSS), a particular execution code review process may encourage or deter the contributions from ``external" developers, the people who are essential to OSS projects. We claim that by analyzing various software artifacts as well as assessing developers' daily experience, we can create models that represent the established code review processes and highlight potentially weak points in their execution. Having this information, the stakeholders can channel the available resources to address the deficiencies in their code review process. To support such a claim, we perform the following studies. First, we study the tool-based code review processes of two large OSS projects that use traditional model of evaluating code contributions. We analyse the software artifacts extracted from the issue tracking systems to understand what can affect code review response time and eventual outcome. We found that code review is affected not only by technical factors (e.g., patch size, priority, etc.) but also by non-technical ones (e.g., developers' affiliation, their experience, etc.). Second, we investigate the quality of contributions that passed the code review process and explore the relationships between the reviewers' code inspections and a set of factors, both personal and social in nature, that might affect the quality of such inspections. By mining the software repository and the issue tracking system of the Mozilla project, as well as applying the SZZ algorithm to detect bug-inducing changes, we were able to find that 54\% of the reviewed changes introduced bugs in the code. Our findings also showed that both personal metrics, such as reviewer workload and experience, and participation metrics, such as the number of involved developers, are associated with the quality of the code review process. Third, we further study the topic of code review quality by studying the developers' attitude and perception of review quality as well as the factors they believe to be important. To accomplish this, we surveyed 88 Mozilla core developers, and applied grounded theory to analyze their responses. The results provide developer insights into how they define review quality, what factors contribute to how they evaluate submitted code and what challenges they face when performing review tasks. Finally, we examined the code review processes executed in a completely different environment --- an industrial project that uses pull-based development model. Our case study was Active Merchant project developed by Shopify Inc. We performed a quantitative analysis of their software repository to understand the effects of a variety of factors on pull request review time and outcome. After that, we surveyed the developers to understand their perception of the review process and how it is different from developers' perception in traditional development model. The studies presented in this thesis focus on code review processes performed by projects of different nature --- OSS vs. industrial, traditional vs. pull-based. Nevertheless, we observed similar patterns in the execution of code review that the stakeholder should be aware of to maintain the long-term health of the projects.