Understanding and Improving SAT Solvers via Proof Complexity and Reinforcement Learning

Ganesh, VIjayLi, Chunxiao2023-12-182023-12-182023-12-182023-12-14http://hdl.handle.net/10012/20173Despite the fact that the Boolean satisfiability (SAT) problem is NP-complete and believed to be intractable, SAT solvers are routinely used by practitioners to solve hard problems in wide variety of fields such as software engineering, formal methods, security, and AI. This gap between theory and practice has motivated an entire line of research whose primary goals are twofold: first, to develop a better theoretical framework aimed at accurately capturing solver behavior and thus prove tighter complexity bounds; and second, to further experimentally verify the soundness of the theory thus developed via rigorous empirical analysis and design theory-inspired techniques to improve solver performance. This interplay between theory and practice is at the heart of the work presented here. More precisely, this thesis contains a collection of results which attempt to resolve the above-described discrepancy between theory and practice. The first two sets of results are centered around the restart problem. Restarts are classes of methods which aim at erasing part of the progress a solver may have made at run time, in order to help solvers escape from the ``bad parts'' of the search space. We provide a detailed theoretical analysis of the power of restarts used in modern Conflict-Driven Clause Learning (CDCL) SAT solvers, where we prove a series of equivalence and separation results for various configurations of solvers with and without restarts. From the intuition developed via this theoretical analysis, we design and implement a machine learning based reset policy, where resets are variants of restarts that erase activity scores in addition to the parts of the solver state erased by restarts. We perform extensive experimental work to show that our reset policy outperforms both baseline and state-of-the-art solvers over a class of cryptographic instances derived from bitcoin mining problems. In a different direction, we propose the concept of hierarchical community structure (HCS) for Boolean formulas. We first theoretically show that formulas with ``good'' HCS parameter values have short CDCL proofs. Then we construct an Empirical Hardness Model using the HCS parameters. These HCS parameters exhibit a robust correlation with solver run time, leading to the development of a classifier capable of accurately distinguishing between easily solvable industrial instances and challenging random/crafted scenarios. We also present scaling studies of formulas with HCS structures to further support of theoretical analysis. In the latter part of the thesis, the focus shifts to satisfaction-driven clause-learning (SDCL) solvers, known to be being exponentially more powerful than CDCL solvers. Despite the theoretical strength of SDCL, it remains a challenge to automate and determinize such solvers. To address this, we again leverage machine learning techniques to strategically decide when to invoke an SDCL subroutine, with the goal of minimizing the associated overhead. The resulting SDCL solver, enhanced with MaxSAT techniques and conflict analysis, outperforms existing solvers on combinatorial benchmarks, particularly demonstrating superior efficacy on Mutilated Chess Board (MCB) problems.enSAT solvingReinforcement LearningProof ComplexityUnderstanding and Improving SAT Solvers via Proof Complexity and Reinforcement LearningDoctoral Thesis