
Case Studies of a Machine Learning Process for Improving the Accuracy of Static Analysis Tools

Date

2016-10-18

Authors

Zhao, Peng

Publisher

University of Waterloo

Abstract

Static analysis tools analyze source code and report suspected problems as warnings to the user. The use of these tools is a key feature of most modern software development processes; however, the tools tend to generate large result sets that are hard to process and prioritize automatically. Two particular problems are (a) a high false positive rate, where warnings are generated for code that is not problematic, and (b) a high rate of non-actionable true positives, where the warnings are not acted on or do not represent significant risks to the quality of the source code as perceived by the developers. Previous work has explored the use of machine learning to build models that predict legitimate warnings, applying logistic regression to the Google Java codebase [38]. Heckman [19] experimented with 15 machine learning algorithms on two open source projects to classify actionable static analysis alerts. In our work, we seek to replicate these ideas on different target systems, using different static analysis tools along with additional machine learning techniques, and with an emphasis on security-related warnings. Our experiments indicate that these models can achieve high accuracy in classifying actionable warnings. We found that in most cases, our models outperform those of Heckman [19].
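
The abstract describes training supervised classifiers to label static analysis warnings as actionable or not. As a rough illustration only, and not the thesis's actual pipeline, tools, or features, the sketch below fits a scikit-learn logistic regression model (one of the techniques cited above) on hypothetical per-warning features; the feature names and the synthetic data are assumptions made for the example.

```python
# Minimal sketch: classifying warnings as actionable (1) or not (0).
# Feature names and data are hypothetical placeholders, not the thesis's.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Hypothetical per-warning features: [tool-assigned priority, file churn,
# warning age in revisions]; label derived from a made-up rule for demo.
X = rng.random((500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] - X[:, 2] > 0.6).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = LogisticRegression()
model.fit(X_train, y_train)

print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

In a real study such as this one, the features would come from the static analysis tool's warning metadata and the project's history, and the labels from whether developers actually acted on each warning.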

Keywords

machine learning, static analysis tool
