Mining Software Repositories to Assist Developers and Support Managers

Date

2004

Authors

Hassan, Ahmed

Publisher

University of Waterloo

Abstract

This thesis explores mining the evolutionary history of a software system to support software developers and managers in their efforts to build and maintain complex software systems. We introduce the idea of evolutionary extractors, which are specialized extractors that recover the history of software projects from software repositories such as source control systems. We discuss the challenges faced in building C-REX, an evolutionary extractor for the C programming language. Through a survey of several software practitioners, we examine the use of source control systems in industry and the quality of the data recovered by C-REX. Using this data, we develop several approaches and techniques to assist developers and managers in their activities. We propose <em>Source Sticky Notes</em> to assist developers in understanding legacy software systems by attaching historical information to the dependency graph. We present the <em>Development Replay</em> approach to estimate the benefits of adopting new software maintenance tools by reenacting the development history. We propose the <em>Top Ten List</em>, which assists managers in allocating testing resources to the subsystems that are most susceptible to faults. To assist managers in improving the quality of their projects, we present a complexity metric that quantifies the complexity of the changes to the code rather than the complexity of the source code itself. All presented approaches are validated empirically using data from several large open source systems. The presented work highlights the benefits of transforming software repositories from static record-keeping repositories into active repositories used by researchers to gain an empirically based understanding of software development, and by software practitioners to predict, plan, and understand various aspects of their projects.

Keywords

Computer Science, Mining Software Repositories, Source Control, Software Reliability, Developers, Managers, Development Chaos, Change Propagation, Software Understanding, Software Architecture, Fault Prediction
