Understanding and Enhancing CDCL-based SAT Solvers
Date
2018-08-02
Authors
Zulkoski, Edward
Advisor
Ganesh, Vijay
Czarnecki, Krzysztof
Publisher
University of Waterloo
Abstract
Modern conflict-driven clause-learning (CDCL) Boolean satisfiability (SAT) solvers routinely
solve formulas from industrial domains with millions of variables and clauses, despite the Boolean
satisfiability problem being NP-complete and widely regarded as intractable in general. At the
same time, very small crafted or randomly generated formulas are often infeasible for CDCL
solvers. A commonly proposed explanation is that these solvers somehow exploit the underlying
structure inherent in industrial instances. A better understanding of the structure of Boolean
formulas not only enables improvements to modern SAT solvers, but also offers insight into why
solvers perform well or poorly on certain types of instances. Even further, examining solvers
through the lens of these underlying structures can help to distinguish the behavior of different
solving heuristics, both in theory and practice.
The first issue we address relates to the representation of SAT formulas. A given Boolean
satisfiability problem can be represented in arbitrarily many ways, and the type of encoding can
have significant effects on SAT solver performance. Further, in some cases, a direct encoding
to SAT may not be the best choice. We introduce a new system that integrates SAT solving
with computer algebra systems (CAS) to address representation issues for several graph-theoretic
problems. We use this system to improve the bounds on several finitely-verified conjectures
related to graph-theoretic problems. We demonstrate how our approach is more appropriate for
these problems than other off-the-shelf SAT-based tools.
For more typical SAT formulas, a better understanding of their underlying structural properties,
and how they relate to SAT solving, can deepen our understanding of SAT. We perform a large-scale
evaluation of many of the popular structural measures of formulas, such as community
structure, treewidth, and backdoors. We investigate how these parameters correlate with CDCL
solving time, and whether they can effectively be used to distinguish formulas from different
domains. We demonstrate how these measures can be used as a means to understand the behavior
of solvers during search. A common theme is that the solver exhibits locality during search
through the lens of these underlying structures, and that the choice of solving heuristic can greatly
influence this locality. We posit that this local behavior of modern SAT solvers is crucial to their
performance.
The remaining contributions dive deeper into two new measures of SAT formulas. We first
consider a simple measure, denoted “mergeability,” which characterizes the proportion of input
clause pairs that can resolve and merge. We develop a formula generator that takes as input a seed
formula, and creates a sequence of increasingly mergeable formulas, while maintaining many
of the properties of the original formula. Experiments over randomly-generated industrial-like
instances suggest that mergeability strongly negatively correlates with CDCL solving time, i.e., as
the mergeability of formulas increases, the solving time decreases, particularly for unsatisfiable
instances.
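The mergeability measure described above can be illustrated with a minimal sketch. The precise definition is an assumption here, not taken from the abstract: two clauses are treated as resolving when they clash on exactly one variable, and as merging when they also share at least one literal, and the count is normalized by the number of all clause pairs. The function names `resolves_and_merges` and `mergeability` are hypothetical, and the thesis may normalize differently.

```python
from itertools import combinations


def resolves_and_merges(c1, c2):
    """Assumed criterion: exactly one clashing variable and at least one shared literal."""
    clashing = {lit for lit in c1 if -lit in c2}  # literals of c1 negated in c2
    shared = c1 & c2                              # literals common to both clauses
    return len(clashing) == 1 and len(shared) >= 1


def mergeability(clauses):
    """Fraction of clause pairs that resolve and merge (assumed normalization)."""
    clause_sets = [frozenset(c) for c in clauses]
    pairs = list(combinations(clause_sets, 2))
    if not pairs:
        return 0.0
    hits = sum(1 for a, b in pairs if resolves_and_merges(a, b))
    return hits / len(pairs)


# Clauses in DIMACS-style integer notation: 3 means x3, -3 means NOT x3.
formula = [[1, 2, 3], [-1, 2, 4], [-2, -3], [1, -4]]
score = mergeability(formula)
```

In this toy formula only the first two clauses qualify: they clash on variable 1 and share the literal 2, so their resolvent (2, 3, 4) has one literal fewer than a non-merging resolvent would.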
Our final contribution considers whether one of the aforementioned measures, namely backdoor
size, is influenced by solver heuristics in theory. Starting from the notion of learning-sensitive
(LS) backdoors, we consider various extensions of LS backdoors by incorporating different branching
heuristics and restart policies. We introduce learning-sensitive with restarts (LSR) backdoors
and show that, when backjumping is disallowed, LSR backdoors may be exponentially smaller
than LS backdoors. We further demonstrate that the size of LSR backdoors is dependent on the
learning scheme used during search. Finally, we present new algorithms to compute upper bounds
on LSR backdoors that intrinsically rely upon restarts, and can be computed with a single run of
a SAT solver. We empirically demonstrate that these algorithms often produce smaller backdoors than
previous approaches to computing LS backdoors.
Keywords
SAT Solvers, Backdoors, Computer Algebra Systems