Pluggable Properties for Program Understanding: Ontic Type Checking and Inference

Chen, Zhuo

Pluggable Properties for Program Understanding: Ontic Type Checking and Inference

Files

zhuo_chen.pdf (431.37 KB)

Date

2018-04-26

Authors

Chen, Zhuo

Advisor

Dietl, Werner

Publisher

University of Waterloo

Abstract

Pluggable type systems is a powerful approach to add additional information on types, which can facilitate the understanding of programs. This thesis presents our work on three pluggable type systems for helping both programmers and other program analysis tools to better understand programs. Pluggable type checking ensures the absence of formalized vulnerabilities. This thesis presents the EOF Value Checker, which prevents unsafe end-of-file (EOF) value compar- isons. Unsafe EOF value comparisons can lead infinite loops in programs, and the details are described by the SEI CERT Oracle Coding standard for Java rule FIO-08J. EOF Value Checker examines additional safety-related properties on integer types, and statically en- sures no FIO-08J rule violations appear in a given program. Traditionally, pluggable type inference alleviates manual annotation efforts for type checking. This thesis presents work on extending the purpose of type inference to also pro- duce high-level abstractions for programs. We present a novel type system called Ontology, which reasons about a coarse abstraction for a given program based on ontic concepts. An ontic concept is a high-level semantic conclusion of concrete types or fields in programs. The goal of Ontology is to produce reasonable semantic abstractions for a given program, so that the produced abstraction may facilitate other program analysis tools on their tasks. To effectively solve type constraints for Ontology, as foundational work, we extend the solver framework in the Checker Framework Inference from only supporting satisfiability problem (SAT) solvers to also supporting Satisfiability Modulo Theories (SMT) solvers. Then, a new SMT encoding approach based on this extension is proposed for Ontology type system. We also apply this new encoding on Dataflow type system, which is previous work on reasoning about concrete run-time types for the components in programs. Our new encoding makes it be possible to support partially annotated programs. We evaluate EOF Value Checker on 35 real world projects, and it finds 3 defects, 8 bad coding practices, and no false positives are generated. Ontology and Dataflow type systems are evaluated on 15 scientific libraries (range from 393 to 86k LoC). For Ontology type system, it summarizes 4937 built-in ontic concepts, and propagates 274 domain- specific concepts in two pre-annotated physical libraries. For Dataflow type system, the new approach propagates 1.56 times more interesting Dataflow types. In addition, we manually examine the inference results, and verify the two type systems produce meaningful program abstractions of projects in the benchmark. These results suggest that pluggable type systems can provide confidence on preventing formalized vulnerabilities, and be able to infer high-level abstractions for programs.