AlphaSMT: A Reinforcement Learning Guided SMT Solver

Lu, Zhengyang

AlphaSMT: A Reinforcement Learning Guided SMT Solver

Files

Lu_Zhengyang.pdf (2.84 MB)

Date

2023-06-19

Authors

Lu, Zhengyang

Advisor

Ganesh, Vijay

Publisher

University of Waterloo

Abstract

Satisfiability Modulo Theories (SMT) solvers are programs that decide whether a first-order logic formula is satisfiable. Over the last two decades, these solvers have become central to many methods and tools in fields as diverse as software engineering, verification, security, and Artificial Intelligence. Most modern SMT solvers provide user-controllable strategies, i.e., users can define a strategy that customizes a solving algorithm for their own problems by combining individual tactics as building blocks. A tactic is a well-defined and implemented reasoning step provided by the SMT solver, which either simplifies, trans- forms, or solves the given input SMT formula. The flexibility of customizing a strategy to a specialized type of formula is important since no existing strategy is considered optimal for all instances. However, finding a good customized strategy is challenging even for experts. In this thesis we present a novel class of reinforcement-learning (RL) guided methods, implemented in the Z3 SMT solver and that we refer to as AlphaSMT, which adaptively constructs the expected best strategy for any given input SMT formula. Briefly, the AlphaSMT RL framework combines deep Monte-Carlo Tree Search (MCTS) and logical reasoning in a unique way in order to enable the RL agent to learn the best combination of tactics for a given class of formulas. In more detail, a deep neural network serves as both the value function and the policy, evaluating state-action pairs and making the decision of which tactic to choose at each step. The neural network is trained toward the optimal policy by learning from self-exploring sampling solving processes. MCTS is used as a lookahead planning step for each decision made in the sampling processes. We evaluated the performance of AlphaSMT on benchmark sets from three SMT logics, namely, quantifier-free non-linear real arithmetic (QF NRA), quantifier-free non-linear integer arithmetic (QF NIA), and quantifier-free bit-vector (QF BV). In all these logics, AlphaSMT outperforms its base solver, Z3, by solving up to 80.5% more instances in a testing set. Evaluation results also show that a reasonably longer tactic timeout helps solve more instances and a pre-solver contributes significantly to the speedup.

Keywords

automated reasoning, reinforcement learning, SMT solver, artificial intelligence

URI

http://hdl.handle.net/10012/19562

Collections

Theses
Electrical and Computer Engineering

Full item page

AlphaSMT: A Reinforcement Learning Guided SMT Solver

Files

Date

Authors

Advisor

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Description

Keywords

LC Subject Headings

Citation

URI

Collections