A Bayesian Framework for Software Regression Testing
Mir arabbaygi, Siavash
MetadataShow full item record
Software maintenance reportedly accounts for much of the total cost associated with developing software. These costs occur because modifying software is a highly error-prone task. Changing software to correct faults or add new functionality can cause existing functionality to regress, introducing new faults. To avoid such defects, one can re-test software after modifications, a task commonly known as regression testing. Regression testing typically involves the re-execution of test cases developed for previous versions. Re-running all existing test cases, however, is often costly and sometimes even infeasible due to time and resource constraints. Re-running test cases that do not exercise changed or change-impacted parts of the program carries extra cost and gives no benefit. The research community has thus sought ways to optimize regression testing by lowering the cost of test re-execution while preserving its effectiveness. To this end, researchers have proposed selecting a subset of test cases according to a variety of criteria (test case selection) and reordering test cases for execution to maximize a score function (test case prioritization). This dissertation presents a novel framework for optimizing regression testing activities, based on a probabilistic view of regression testing. The proposed framework is built around predicting the probability that each test case finds faults in the regression testing phase, and optimizing the test suites accordingly. To predict such probabilities, we model regression testing using a Bayesian Network (BN), a powerful probabilistic tool for modeling uncertainty in systems. We build this model using information measured directly from the software system. Our proposed framework builds upon the existing research in this area in many ways. First, our framework incorporates different information extracted from software into one model, which helps reduce uncertainty by using more of the available information, and enables better modeling of the system. Moreover, our framework provides flexibility by enabling a choice of which sources of information to use. Research in software measurement has proven that dealing with different systems requires different techniques and hence requires such flexibility. Using the proposed framework, engineers can customize their regression testing techniques to fit the characteristics of their systems using measurements most appropriate to their environment. We evaluate the performance of our proposed BN-based framework empirically. Although the framework can help both test case selection and prioritization, we propose using it primarily as a prioritization technique. We therefore compare our technique against other prioritization techniques from the literature. Our empirical evaluation examines a variety of objects and fault types. The results show that the proposed framework can outperform other techniques on some cases and performs comparably on the others. In sum, this thesis introduces a novel Bayesian framework for optimizing regression testing and shows that the proposed framework can help testers improve the cost effectiveness of their regression testing tasks.