Theoretical and Practi- Cal Aspects of Bit-Vector Reasoning

Eingereicht von Dipl.-Inf. Andreas Fröhlich Angefertigt am Institut fürFormale Mo- delle und Verifikation Erstbeurteiler Univ.-Prof. Dr. Armin Biere Zweitbeurteiler Theoretical and Practi- Univ.-Prof. Dr. Christoph Scholl cal Aspects of Bit-Vector März2016 Reasoning Dissertation zur Erlangung des akademischen Grades Doktor der Technischen Wissenschaften im Doktoratsstudium der Technischen Wissenschaften JOHANNES KEPLER UNIVERSITAT¨ LINZ Altenbergerstraße 69 4040 Linz, Osterreich¨ www.jku.at DVR 0093696 ii Abstract Satisfiability Modulo Theories (SMT) is a broad field of research and an important topic for many practical applications. In this thesis, we focus on theories of bit- vectors as used, e.g., in hardware and software verification, but also in many other areas. In particular, we discuss satisfiability of bit-vector logics and related problems. The satisfiability problem of propositional formulas (SAT) is a well-known NP-complete problem. In the past, satisfiability for quantifier-free bit-vector logics was often assumed to be NP-complete as well. We show that several earlier complexity results for bit-vector logics only hold if a unary encoding for scalars is used. Instead, complexity for many decision problems grows significantly, if a more succinct logarithmic encoding is used for representation. As one of our central results, we prove that satisfiability of quantifier-free bit-vector formulas turns out to be NEXPTIME-complete. We also show that similar results can be ob- tained for satisfiability of quantified bit-vector logics with uninterpreted functions (2-NEXPTIME-complete), and can even be extended to multi-logarithmic encodings. For the latter case, we give a very general ν-NEXPTIME-completeness result, when considering bit-vector logics with ν-logarithmic scalar encodings. We further analyze how the choice of operators affects the expressiveness of certain bit- vector logics and can lead to specific fragments of quantifier-free bit-vector logics that are PSPACE-complete or NP-complete. On the practical side, this implies that the bit-blasting followed by the use of a CDCL (conflict driven clause learning) SAT solver, which is the common approach in state-of-the-art bit-vector solvers, can be exponential. Our complexity results directly point to several possibilities for new solving approaches, proposing reductions to model checking (for the PSPACE- complete fragment), to EPR (for the general, NEXPTIME-complete class), or by applying SLS (stochastic local search) directly on the theory level. We also develop two algorithms for solving Dependency Quantified Boolean Formulas (DQBF), a further NEXPTIME-complete problem, either using a DPLL based algorithm or an instantiation based approach, similar to the one applied in EPR solving. iii iv Zusammenfassung Erfüllbarkeit Modulo einer Theorie (SMT) ist ein breites Forschungsgebiet mit vielen praktische Anwendungen. Diese Dissertation setzt ihren Fokus auf Theo- rien über Bitvektoren, wie sie zum Beispiel in der Hardware und Software Verfika- tion - aber auch in vielen anderen Bereichen - Verwendung finden. Insbesondere diskutieren wir die Erfüllbarkeit von Bitvektor-Logiken und verwandte Probleme. Das Erfüllbarkeitsproblem der Aussagenlogik (SAT) ist ein sehr bekanntes NP- vollständiges Problem. Bisher wurde oft angenommen, dass das Erfüllbarkeits- problem für quantorenfreie Bitvektor-Logiken ebenfalls NP-vollständig sei. Wir zeigen, dass viele frühere Komplexitäts-Resultate für Bitvektor-Logiken nur dann gelten, wenn enthaltene Skalare unär kodiert werden. Im Gegensatz dazu steigt die Komplexität vieler Entscheidungsprobleme drastisch, wenn eine kompaktere logarithmische Kodierung zur Darstellung verwendet wird. Unter anderem be- weisen wir, dass das Erfüllbarkeitsproblem für quantorenfreie Bitvektor Formeln NEXPTIME-vollständig ist. Des Weiteren zeigen wir, dass vergleichbare Resul- tate auch für das Erfüllbarkeitsproblem von quantifizierten Bitvektor-Logiken mit uninterpretierten Funktionen (2-NEXPTIME-vollständig), sowie für Logiken mit multi-logarithmisch kodierten Skalaren gelten. Für den zweiten Fall zeigen wir ν-NEXPTIME-Vollständigkeit, wenn eine ν-logarithmische Kodierung für Skalare verwendet wird. Wir analysieren zudem, wie sich die Wahl der Operatoren auf die Komplexität von verschiedenen Bitvektor-Logiken auswirkt und wie sie dazu führen kann, dass bestimmte Fragmente von quantorenfreien Bitvektor-Logiken PSPACE-vollständig oder NP-vollständig werden. Für die Praxis bedeuten unsere Resultate, dass der übliche Ansatz von Bitvektor Algorithmen, der “bit-blasting” mit der Verwendung von CDCL (konfliktgesteuerte Klauseln lernende) SAT Al- gorithmen kombiniert, exponentiell sein kann. Unsere Resultate eröffnen direkt mehrere neue Ansätze, z.B. durch Modellprüfung (für das PSPACE-vollständige Fragment), Übersetzung nach EPR (für die allgemeine, NEXPTIME-vollständige Klasse), oder Anwendung eines SLS (stochastische lokale Suche) Algorithmus direkt auf der Theorie-Repräsentation. Wir entwickeln zudem zwei Algorithmen um DQBF zu lösen, ein weiteres NEXPTIME-vollständiges Problem; entweder mit einem DPLL-basierten Algorithmus, oder durch einen auf Instanziierung basieren- den Ansatz, ähnlich dem für EPR. v vi Eidesstattliche Erklärung Ich erkläre an Eides statt, dass ich die vorliegende Dissertation selbstständig und ohne fremde Hilfe verfasst, andere als die angegebenen Quellen und Hilfsmittel nicht benutzt bzw. die wörtlich oder sinngemäß entnommenen Stellen als solche kenntlich gemacht habe. Die vorliegende Dissertation ist mit dem elektronisch übermittelten Textdokument identisch. vii viii Acknowledgements I would like to thank my parents, Roswitha and Wolfgang, for being supportive with me throughout my whole life—thanks for everything. This thesis is dedicated to them. I am also very thankful to my partner, Kathrin, for bearing with me when I decided to move to Linz, and for encouraging me to take the chance of working at Microsoft Research in Cambridge. Further thanks go to all my colleagues and co-authors—it was a joy working with you. Finally, I would like to thank Armin, for being the best supervisor any student could possibly hope for. ix x “In individuals, insanity is rare—but in groups, parties, nations and epochs, it is the rule.” - Friedrich Wilhelm Nietzsche, Beyond Good and Evil, 1886 Contents 1 Introduction 1 1.1 Background . .1 1.1.1 Propositional Satisfiability . .2 1.1.2 SAT Algorithms . .3 1.1.3 Quantification and Dependencies . .6 1.1.4 Satisfiability Modulo Theories . .7 1.2 Outline . .8 1.3 Bugs . 12 2 On the Complexity of Fixed-Size Bit-Vector Logics with Binary En- coded Bit-Width 13 2.1 Introduction . 13 2.2 Preliminaries . 15 2.3 Complexity . 15 2.3.1 QF_BV2 is NEXPTIME-hard . 17 2.3.2 UFBV2 is 2-NExpTime-hard . 20 2.4 Problems Bounded in Bit-Width . 22 2.4.1 Benchmark Problems . 23 2.5 Conclusion . 24 2.6 Appendix . 24 2.6.1 Example: A Reduction of DQBF to QF_BV2 ....... 24 2.6.2 Table: Completeness Results for Bit-Vector Logics . 26 3 More on the Complexity of Quantifier-Free Fixed-Size Bit-Vector Log- ics with Binary Encoding 27 3.1 Introduction . 27 3.2 Motivation . 29 3.3 Definitions . 29 3.4 Complexity Results . 31 3.5 Discussion . 36 3.6 Conclusion . 37 3.7 Appendix . 38 xi xii CONTENTS 3.7.1 Table: Completeness Results for Fixed-Size and Non-Fixed- Size Logics . 38 3.7.2 Example: A Reduction of QBF to QF_BV2!1 ...... 39 4 Complexity of Fixed-Size Bit-Vector Logics 43 4.1 Introduction . 43 4.2 Motivation . 45 4.3 Preliminaries . 46 4.3.1 SAT, QBF, and DQBF . 46 4.3.2 Circuits . 47 4.3.3 Fixed-Size Bit-Vector Logics . 48 4.4 Logics With Unary Encoding . 57 4.5 Scalar-Bounded Problems . 58 4.6 Quantifier-Free Logics with Binary Encoding . 59 4.7 Extensions and Alternative Characterizations . 69 4.7.1 Notation . 70 4.7.2 QF_BV2bw ......................... 71 4.7.3 QF_BV2!1 ......................... 73 4.7.4 QF_BV2!c ......................... 76 4.8 Logics with Quantifiers and Binary Encoding . 80 4.8.1 General Quantification . 80 4.8.2 Restricting the Bit-Width of Universal Variables . 84 4.8.3 Non-Recursive Macros . 85 4.9 Practical Considerations . 87 4.9.1 Alternative Approaches . 87 4.9.2 Benchmark Problems . 88 4.10 Conclusion . 89 4.11 Appendix . 90 4.11.1 Example: Reduction from DQBF to QF_BV2!c ..... 90 4.11.2 Example: Reduction from QBF to QF_BV2!1 ...... 92 4.11.3 Example: Bit-Width Reduction in QF_BV2bw ....... 93 4.11.4 Example: Half-Shuffle and Expand . 94 4.11.5 Example: Multiplication . 95 5 A DPLL Algorithm for Solving DQBF 97 5.1 Introduction . 97 5.2 Definitions . 98 5.3 DQDPLL Architecture . 100 5.4 Conversion of Concepts from SAT/QBF . 104 5.5 Preliminary Results . 108 5.6 Future Work . 109 5.7 Conclusion . 110 CONTENTS xiii 6 Bv2epr: A Tool for Polynomially Translating Quantifier-free Bit-Vector Formulas into EPR 111 6.1 Introduction . 111 6.2 Preliminaries . 112 6.2.1 Existing Translations . 112 6.3 The Tool . 113 6.3.1 The Translator . 114 6.4 Benchmarks and Experiments . 116 6.5 Conclusion . 117 7 IDQ: Instantiation-Based DQBF Solving 119 7.1 Introduction . 119 7.2 Preliminaries . 121 7.3 Related Work . 123 7.4 IDQ architecture . 124 7.5 Implementation . 127 7.6 Experimental Results . 130 7.7 Conclusion . 132 8 Efficiently Solving Bit-Vector Problems Using Model Checkers 135 8.1 Introduction . 135 8.2 QF_BV!1 to SMV . 137 8.3 Experiments . 139 8.4 Conclusion . 143 9 Quantifier-Free Bit-Vector Formulas with Binary Encoding: Bench- mark Description 147 9.1 Introduction . 147 9.2 Benchmarks . 148 9.2.1 Translating Bit-Vector Operations . 148 9.2.2 Bit-Vector Properties in PSPACE .............. 148 9.3 SMT2 and CNF generation . 149 9.4 Practical Considerations . 149 10 Stochastic Local Search for Satisfiability Modulo Theories 151 10.1 Introduction . 151 10.2 Preliminaries . 153 10.3 Architecture . 154 10.4 Implementation . 155 10.5 Experimental Results . 159 10.6 Discussion . 162 10.7 Related Work . 163 10.8 Conclusion . 163 xiv CONTENTS 11 Contributions 165 12 Beyond Previous Work 169 12.1 Complexity of Quantified Bit-Vector Formulas .

Theoretical and Practi- Cal Aspects of Bit-Vector Reasoning

Solving SAT and SAT Modulo Theories: from an Abstract Davis–Putnam–Logemann–Loveland Procedure to DPLL(T)

MASTER THESIS Strong Proof Systems

Satisfiability 6 the Decision Problem 7

Automated and Interactive Theorem Proving 1: Background & Propositional Logic

Foundations of Artificial Intelligence

Conjunctive Normal Form Algorithm

Conflict Driven Learning in a Quantified Boolean Satisfiability

Automated Theorem Prover Is an Algorithm That Determines Whether a Mathematical Or Logical Proposition Is Valid (Satisfiable)

A Brief History of Reasoning

A History of Satisfiability

Turing's Algorithmic Lens

MASTER THESIS Solving Boolean Satisfiability Problems