Satisfiability Modulo Theory
David Monniaux, CNRS / VERIMAG, October 2017

Schedule
- Introduction
- Propositional logic
- First-order logic
- Applications
- Beyond DPLL(T)
- Quantifier elimination
- Interpolants

Who I am
David Monniaux, senior researcher (directeur de recherche) at CNRS (the French National Center for Scientific Research), working at VERIMAG in Grenoble.

Who we are
VERIMAG is a joint research unit of
- CNRS (9 permanent researchers)
- Université Grenoble Alpes (15 faculty)
- Grenoble-INP (8 faculty)

What we do
Broadly speaking: methods for designing and verifying safety-critical embedded systems
- formal methods
- static analysis
- proof assistants
- decision procedures
- exact analysis
- modeling of hardware/software system platforms (caches, networks on chip, etc.)
- hybrid systems

Program verification
A program is a set of instructions in
- a "toy" programming language or formalism (Turing machines, λ-calculus, the "while language"), or
- a real programming language.
A program has a set of behaviors (its semantics). The goal is to prove that these behaviors are included in the acceptable behaviors (the specification).

Difficulties
Defining the semantics of a real language (C, C++, ...) is very difficult; adding parallelism etc. makes it nearly impossible. Specifications are written by humans and may themselves be buggy. I will concentrate on the proving phase.

Objectives
In order of increasing ambition:
- Advanced testing: find execution traces leading to violations, more efficiently than by hand or at random.
- Assisted proof: help the user prove the program.
- Automated proof: prove the program automatically.

Propositional logic
Subtopics: generalities, binary decision diagrams, satisfiability checking, applications.

Quantifier-free propositional logic
Connectives: ∧ (and), ∨ (or), ¬ (not; ¬a is also written ā); possibly also ⊕ (exclusive or). Constants: t = "true", f = "false".
A formula over variables: (a ∨ b) ∧ (c ∨ ¬a).
An assignment gives a value to all variables. A satisfying assignment, or model, makes the formula evaluate to t.

Disjunctive normal form
A DNF is a disjunction of conjunctions of literals (a literal is a variable or the negation of a variable), obtained by distributivity:
(a ∨ b) ∧ c → (a ∧ c) ∨ (b ∧ c)
Inconvenience? Exponential size in the input.

Conjunctive normal form
A disjunction of literals is a clause; a CNF is a conjunction of clauses, obtained by distributivity:
(a ∧ b) ∨ c → (a ∨ c) ∧ (b ∨ c)

Negation normal form
Push negations to the leaves of the syntax tree using De Morgan's laws:
¬(a ∨ b) → (¬a) ∧ (¬b)
¬(a ∧ b) → (¬a) ∨ (¬b)
This makes the formula "monotone" with respect to the ordering f < t.

Applications
Verification of combinatorial circuits; equivalence of two circuits.

Binary decision diagrams

The problem
Represent sets of Boolean states compactly. A set of vectors of n Booleans is a function from {0,1}^n into {0,1}.
Example: {(0,0,0), (1,1,0)} is represented by the function mapping (0,0,0) ↦ 1 and (1,1,0) ↦ 1, and everything else to 0.
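To make the set-as-function view concrete, here is a minimal OCaml sketch (my own illustration, not from the slides): the example set {(0,0,0), (1,1,0)} written both as an explicit list of vectors and as its characteristic function, here the DNF formula (¬a ∧ ¬b ∧ ¬c) ∨ (a ∧ b ∧ ¬c).

```ocaml
(* The example set {(0,0,0); (1,1,0)}, as an explicit list of vectors. *)
let in_set (v : bool list) : bool =
  List.mem v [ [ false; false; false ]; [ true; true; false ] ]

(* The same set as its characteristic function {0,1}^3 -> {0,1},
   written as the DNF (not a && not b && not c) || (a && b && not c). *)
let chi a b c = (not a && not b && not c) || (a && b && not c)

(* Sanity check: the two views agree on every vector of {0,1}^3. *)
let () =
  let bits = [ false; true ] in
  List.iter
    (fun a ->
      List.iter
        (fun b ->
          List.iter
            (fun c ->
              assert (in_set [ a; b; c ] = chi a b c);
              Printf.printf "(%b,%b,%b) -> %b\n" a b c (chi a b c))
            bits)
        bits)
    bits
```

BDDs, introduced next, are a compact and canonical data structure for exactly such characteristic functions.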
Expanded BDD
BDD = binary decision diagram. Given the ordered Boolean variables (a, b, c), represent (a ∧ c) ∨ (b ∧ c).
[Figure: the full decision tree, testing a, then b, then c, with 0/1 leaves.]

Removing useless nodes
It is silly to keep two identical subtrees.
[Figure: the same tree, with two identical subtrees marked.]

Compression
[Figure: the tree after merging the identical subtrees and dropping nodes whose two children coincide.]

Reduced BDD
[Figure: the resulting reduced BDD for (a ∧ c) ∨ (b ∧ c).]
Idea: turn the original tree into a DAG with maximal sharing; two different but isomorphic subtrees are never created.
Canonicity: a given function is always encoded by the same DAG.

Implementation: hash-consing
An important implementation technique that you may use in other contexts. "Consing" comes from "constructor" (cf. Lisp: cons).
"In computer science, particularly in functional programming, hash consing is a technique used to share values that are structurally equal. […] An interesting property of hash consing is that two structures can be tested for equality in constant time, which in turn can improve efficiency of divide and conquer algorithms when data sets contain overlapping blocks."
https://en.wikipedia.org/wiki/Hash_consing

Implementation: hash-consing (2)
Keep a hash table of all nodes created, with a hash code H(x) that can be computed quickly: if a node is (v, b0, b1), compute H from v and the unique identifiers of b0 and b1. The unique identifier is the node's address (if objects never move) or a serial number. If an object matching (v, b0, b1) already exists in the table, return it.
How to collect garbage (unreachable) nodes?

Garbage collection in hash consing
This needs weak pointers: the pointer from the hash table must be ignored by the GC when it computes the reachable objects.
- Java: WeakHashMap
- OCaml: the Weak module
(Another use of weak pointers: caching recent computations.)

Hash-consing is magical
It ensures:
- maximal sharing: never two identical objects at two different locations in memory;
- an ultra-fast equality test: comparing pointers (or unique identifiers) suffices.
And once we have it, BDDs are easy.

BDD operations
Once a variable ordering is chosen:
- create the BDDs for f and t (one-node constants);
- create the BDD for v, for any variable v;
- implement the operations ∧, ∨, etc.

Binary BDD operations
The operations ∧, ∨ are recursive descents on both DAGs, with memoization:
- store the values of f(a, b) already computed in a hash table;
- index that table by the unique identifiers of a and b.
Complexity with and without dynamic programming?
- Without dynamic programming, the descent unfolds each DAG into a tree ⇒ exponential.
- With dynamic programming, O(|a|·|b|), where |x| is the size of the DAG x.
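The following OCaml sketch (a simplified illustration with my own names, not the course's code) puts the last few slides together: BDD nodes are hash-consed through a global table, so equality of BDDs reduces to comparing unique identifiers, and conjunction is a recursive descent memoized on pairs of identifiers.

```ocaml
(* Toy hash-consed BDDs: variables are ints, smaller index closer to the root. *)
type bdd =
  | Leaf of bool
  | Node of { id : int; var : int; lo : bdd; hi : bdd }

let uid = function Leaf false -> 0 | Leaf true -> 1 | Node n -> n.id

(* Hash-consing table: (variable, uid of lo, uid of hi) -> existing node. *)
let table : (int * int * int, bdd) Hashtbl.t = Hashtbl.create 97
let counter = ref 2

let mk_node v lo hi =
  if uid lo = uid hi then lo (* both children equal: the test is useless *)
  else
    let key = (v, uid lo, uid hi) in
    match Hashtbl.find_opt table key with
    | Some n -> n (* reuse the existing node: maximal sharing *)
    | None ->
        let n = Node { id = !counter; var = v; lo; hi } in
        incr counter;
        Hashtbl.add table key n;
        n

let var v = mk_node v (Leaf false) (Leaf true)

(* Memoized conjunction: results cached by the pair of unique identifiers. *)
let memo : (int * int, bdd) Hashtbl.t = Hashtbl.create 97

let rec bdd_and a b =
  match (a, b) with
  | Leaf false, _ | _, Leaf false -> Leaf false
  | Leaf true, x | x, Leaf true -> x
  | Node na, Node nb -> (
      let key = (uid a, uid b) in
      match Hashtbl.find_opt memo key with
      | Some r -> r
      | None ->
          let r =
            if na.var = nb.var then
              mk_node na.var (bdd_and na.lo nb.lo) (bdd_and na.hi nb.hi)
            else if na.var < nb.var then
              mk_node na.var (bdd_and na.lo b) (bdd_and na.hi b)
            else mk_node nb.var (bdd_and a nb.lo) (bdd_and a nb.hi)
          in
          Hashtbl.add memo key r;
          r)

let () =
  let a = var 0 and c = var 2 in
  let f = bdd_and a c in
  (* With maximal sharing, f ∧ a rebuilds exactly the same DAG as f. *)
  assert (uid (bdd_and f a) = uid f);
  print_endline "ok"
```

A real implementation would use weak hash tables (e.g. OCaml's Weak) for both tables so that unreachable nodes can be garbage collected, as discussed above.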
Fixed point solving
B0 = initial states
B1 = B0 + states reachable in 1 step from B0
B2 = B1 + states reachable in 1 step from B1
B3 = B2 + states reachable in 1 step from B2
...
This converges in finite time to R = the set of reachable states.

Iteration sequence
B_{k+1}(x'_1, …, x'_n) = ∃x_1 … x_n. B_k(x_1, …, x_n) ∧ τ(x_1, …, x_n, x'_1, …, x'_n)
This requires:
- variable renaming (easy);
- constructing the BDD for the transition relation τ;
- ∃-elimination over the n variables;
- a conjunction.

Tools
E.g. NuSMV and NuXMV in BDD mode.

Satisfiability checking

The SAT problem
Given a propositional formula, decide whether it is satisfiable (and give a model if so). Satisfiable = "it has a model" = "it has a satisfying assignment".
- "SAT": arbitrary formula
- "CNF-SAT": formula in CNF
- "3CNF-SAT": formula in CNF with 3 literals per clause

Exercise
Show that one can convert SAT into 3CNF-SAT with time and output linear in the size of the input.

Tseitin encoding
Gregory S. Tseytin, On the complexity of derivation in propositional calculus, http://www.decision-procedures.org/handouts/Tseitin70.pdf
Can one transform a formula into a CNF formula of linear size while preserving solutions? Naive distribution blows up:
(a ∨ b) ∧ (c ∨ d) → (a ∧ c) ∨ (a ∧ d) ∨ (b ∧ c) ∨ (b ∧ d)

Tseitin encoding: add extra variables
For the formula ((a ∧ ¬b ∧ ¬c) ∨ (b ∧ c ∧ ¬d)) ∧ (¬b ∨ ¬c), assign a propositional variable to each sub-formula:
e ≡ a ∧ ¬b ∧ ¬c
f ≡ b ∧ c ∧ ¬d
g ≡ e ∨ f
h ≡ ¬b ∨ ¬c
ϕ ≡ g ∧ h
These definitions are then turned into clauses:
¬e ∨ a,  ¬e ∨ ¬b,  ¬e ∨ ¬c,  ¬a ∨ b ∨ c ∨ e
¬f ∨ b,  ¬f ∨ c,  ¬f ∨ ¬d,  ¬b ∨ ¬c ∨ d ∨ f
¬e ∨ g,  ¬f ∨ g,  ¬g ∨ e ∨ f
b ∨ h,  c ∨ h,  ¬h ∨ ¬b ∨ ¬c
¬ϕ ∨ g,  ¬ϕ ∨ h,  ¬g ∨ ¬h ∨ ϕ
ϕ

DIMACS format
c start with comments
p cnf 5 3
1 -5 4 0
-1 5 3 4 0
-3 -4 0
Here 5 is the number of variables and 3 the number of clauses. In a clause, -1 is variable 1 negated, 5 is variable 5, and 0 marks the end of the clause.

DPLL
Each clause acts as a propagator: e.g. assuming a and ¬b, the clause ¬a ∨ b ∨ c yields c. Boolean constraint propagation, aka unit propagation: propagate as much as possible; once the value of a variable is known, use it in the other clauses.

DPLL: branching
If unit propagation is not enough to
- find a satisfying assignment, or
- find an unsatisfiable clause (all literals forced to false),
then pick a variable and explore a search subtree for each of its two polarities.

Example
Starting from the Tseitin clauses above, propagation from the unit clause ϕ gives:
¬ϕ ∨ g → g,  ¬ϕ ∨ h → h,  and ¬g ∨ ¬h ∨ ϕ is removed.
Now g and h are t, so
¬e ∨ g, ¬f ∨ g, b ∨ h, c ∨ h are removed;
¬g ∨ e ∨ f → e ∨ f,  ¬h ∨ ¬b ∨ ¬c → ¬b ∨ ¬c.

CDCL: clause learning
A DPLL branch gets closed by a contradiction: some literal gets forced to both t and f.
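As a companion to the DPLL slides above, here is a minimal OCaml sketch (a toy illustration of my own, not an actual solver) of unit propagation plus branching, on DIMACS-style integer literals: a literal is a nonzero integer, -v is the negation of variable v, and a clause is an int list. It has none of the clause learning, watched literals, or decision heuristics of a real CDCL solver.

```ocaml
(* Assign literal l to true: drop satisfied clauses, remove -l from the rest. *)
let assign l clauses =
  List.filter_map
    (fun c ->
      if List.mem l c then None (* clause satisfied *)
      else Some (List.filter (fun m -> m <> -l) c) (* literal falsified *))
    clauses

(* Unit propagation: as long as some clause is a unit {l}, force l. *)
let rec propagate model clauses =
  match List.find_opt (fun c -> List.length c = 1) clauses with
  | Some [ l ] -> propagate (l :: model) (assign l clauses)
  | _ -> (model, clauses)

(* DPLL: propagate, detect success or conflict, otherwise branch. *)
let rec dpll model clauses =
  let model, clauses = propagate model clauses in
  if clauses = [] then Some model (* every clause satisfied *)
  else if List.mem [] clauses then None (* empty clause: conflict *)
  else
    let l = List.hd (List.hd clauses) in
    (* try both polarities of the chosen literal *)
    match dpll (l :: model) (assign l clauses) with
    | Some m -> Some m
    | None -> dpll (-l :: model) (assign (-l) clauses)

let () =
  (* (a ∨ b) ∧ (¬a ∨ c) ∧ (¬c), with a = 1, b = 2, c = 3. *)
  match dpll [] [ [ 1; 2 ]; [ -1; 3 ]; [ -3 ] ] with
  | Some m ->
      Printf.printf "SAT: %s\n" (String.concat " " (List.map string_of_int m))
  | None -> print_endline "UNSAT"
```

On this example it prints the model 2 -1 -3, i.e. b = t and a = c = f.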