Problem Solving with Boolean Satisfiability

J. Larrosa
September 26, 2017

Abstract

These notes complement the course slides. While the class slides are more oriented to examples and motivation, here we give more formal definitions. They are working notes, not completely checked for errors and typos.

1 Overview and Motivation

Propositional Logic is an extremely simple, but surprisingly powerful, language for modeling (and reasoning about) complex systems. The intuitive idea is that it can be used to model systems that can be expressed as propositions (i.e., statements that can only be either true or false).

A propositional formula describes the possible states of some part of a world that is being modeled, and it specifies which of those states are unfeasible or unwanted in the system. The model might be of your house, your car, your body, your community, an ecosystem, a stock market, etc. The language allows us to single out, from all the possible worlds that could hypothetically exist (i.e., all the possible ways in which the parts or states can be configured), those that can really exist (e.g., the car engine cannot run with a dead battery, the car cannot be used with flat tires, and a healthy body cannot show symptoms of a disorder such as fever, headache, or cough). The language also allows us to prove (or disprove) properties of the system.

Propositional Logic is a declarative representation. In this approach, we use the language to construct a model of the system about which we would like to reason. This model encodes our knowledge of how the system works in a computer-readable form. The key property of a declarative representation is the separation of knowledge and reasoning. The representation has its own clear semantics, separated from the algorithms that one can apply to it. Thus, we can develop a general suite of algorithms that apply to any model within a broad class.
The main advantage of using Propositional Logic is that it provides a compact representation of an exponentially large boolean function. Additionally, many algorithms have been designed to work with this representation, bypassing the huge boolean function.

In these notes we first describe the basics of the propositional logic language and show how it can be used to model problems. We show how the boolean satisfiability query (SAT) is general enough to include many other queries (e.g., entailment or equivalence). Then we introduce CNF, a subset of the language that is as expressive as propositional logic (i.e., every formula in propositional logic can be transformed into an equivalent one in CNF). The interest of CNF is that it facilitates the development of efficient algorithms. .... Finally, we present Max-SAT and #SAT, two popular extensions of SAT.

2 Basic Definitions

A boolean variable takes values from the boolean domain $\mathbb{B} = \{0, 1\}$, whose values are usually referred to as false and true, respectively. In these notes we follow the usual convention of denoting boolean variables with lower-case letters (possibly subscripted) such as $p, q, x_1, x_i, \ldots$. We assume that the reader is familiar with the usual boolean operators $\vee$ (logical or, also called disjunction), $\wedge$ (logical and, also called conjunction), $\neg$ (not, also called negation), $\rightarrow$ (logical implication), and $\leftrightarrow$ (logical equivalence).

A boolean formula $F$ is a well-formed expression built from boolean values, variables and operators. For evaluation purposes there is a precedence order, $(\neg, \wedge, \vee, \rightarrow, \leftrightarrow)$, where priority decreases from left to right. When needed, parentheses can be used to override the default precedence rules. For example, if $P = \{p, q, r\}$, a valid formula is

$$F = p \vee (\neg q \wedge p \vee r) \wedge 1$$

Note: Since $F \leftrightarrow G$ is equivalent to $(F \rightarrow G) \wedge (G \rightarrow F)$, and $F \rightarrow G$ is equivalent to $\neg F \vee G$, implication and equivalence are not really necessary.
We can think of these two operators as syntactic sugar and, when needed, assume that formulas only contain disjunction, conjunction and negation.

An interpretation $I$ of a set of variables $X$ is a mapping from each variable to a boolean value (i.e., either 0 or 1). Note that the number of different interpretations is $2^{|X|}$. We say that an interpretation $I$ satisfies a formula $F$ (noted $I \models F$) if replacing the variables in $F$ by their associated values yields an expression that evaluates to 1. For instance, if $I$ maps all the variables to 0, it does not satisfy the formula $p \vee (\neg q \wedge p \vee r) \wedge 1$, because $0 \vee (\neg 0 \wedge 0 \vee 0) \wedge 1$ evaluates to 0.

If an interpretation $I$ satisfies a formula $F$, we say that $I$ is a model of $F$. A formula $F$ is satisfiable if it has at least one model. It is a contradiction (or unsatisfiable) if it does not have any model. Finally, it is a tautology if every interpretation is a model. Formula $F$ entails formula $G$ (noted $F \models G$) if every model of $F$ is also a model of $G$. Two formulas are equivalent (noted $F \equiv G$) if they entail each other.

[Figure 1: Some logical equivalences]

Sometimes it is useful to think of a boolean formula $F$ as a binary tree where leaves are variables and internal nodes are operators (see Figure 3, left). The tree associated to a given formula is unique. Furthermore, two different formulas have different associated trees. The size of a formula $F$, noted $|F|$, is the size of its tree, measured as the number of leaves and internal nodes (e.g., the formula in Figure 3, left, has size 12).

It is often useful to think of a boolean formula $F$ over $X = \{x_1, x_2, \ldots, x_n\}$ as a boolean function $F(X): \mathbb{B}^n \to \mathbb{B}$. The truth table of a formula $F(X)$ (see Figure 2, bottom) is a table containing the value of $F(X)$ for each possible instantiation of $X$.

Property.
• Every truth table over a set of variables $X$ can be expressed as a propositional formula $F(X)$.
• Two different formulas may have the same truth table.
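The definitions above (interpretation, model, satisfiability, tautology, entailment, equivalence) can be made concrete with a small brute-force sketch. It is only an illustration under stated assumptions, not the method used in the course: formulas are represented as Python functions that take an interpretation (a dictionary from variable names to 0 or 1), and every helper name (`interpretations`, `models`, `entails`, and so on) is hypothetical. Python's `not`, `and`, `or` happen to follow the same precedence order as $\neg, \wedge, \vee$, so the example formula can be written almost literally.

```python
from itertools import product

def interpretations(variables):
    """Yield each of the 2^|X| interpretations as a dict variable -> 0/1."""
    for values in product((0, 1), repeat=len(variables)):
        yield dict(zip(variables, values))

def models(formula, variables):
    """Return all interpretations that satisfy the formula."""
    return [I for I in interpretations(variables) if formula(I)]

def is_satisfiable(formula, variables):
    """A formula is satisfiable if it has at least one model."""
    return any(formula(I) for I in interpretations(variables))

def is_tautology(formula, variables):
    """A formula is a tautology if every interpretation is a model."""
    return all(formula(I) for I in interpretations(variables))

def entails(f, g, variables):
    """F |= G: every model of F is also a model of G."""
    return all(g(I) for I in interpretations(variables) if f(I))

def equivalent(f, g, variables):
    """F and G are equivalent if they entail each other."""
    return entails(f, g, variables) and entails(g, f, variables)

# The example formula p ∨ (¬q ∧ p ∨ r) ∧ 1 over the variables p, q, r.
X = ["p", "q", "r"]
F = lambda I: bool(I["p"] or ((not I["q"]) and I["p"] or I["r"]) and 1)

all_zero = {"p": 0, "q": 0, "r": 0}
print(F(all_zero))            # False: the all-zero interpretation is not a model
print(len(models(F, X)))      # number of rows of the truth table that are 1
print(is_satisfiable(F, X), is_tautology(F, X))
```

The enumeration visits all $2^{|X|}$ interpretations, so it is only usable for a handful of variables. It also illustrates the second property above: the example formula has the same truth table as the different formula $p \vee r$, which `equivalent(F, lambda I: bool(I["p"] or I["r"]), X)` confirms.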
2.1 A first modeling example

Consider that we want to model the possible scenarios of a surveillance system consisting of $r$ robots and $p$ surveillance spots over $t$ different time slots (note that here $r, p, t$ are not boolean variables, but problem parameters). We want to consider how to schedule the robots (i.e., where to place each robot at each time slot), knowing that we never want two robots in the same spot.

One possibility is to use variables $x_{ijk}$ (with $1 \le i \le r$, $1 \le j \le p$, $1 \le k \le t$) associated to the statement "robot $i$ is in spot $j$ at time $k$". As you can see, we prefer to write a model in terms of the problem parameters $r, p, t$, rather than a particular model for some fixed parameters. This is good for modeling purposes, because it shows the generality of the model and helps us understand how the model would look for different scenarios. However, if we want to use this model in a particular setting, we will need to write the formula for the corresponding values.

Let us start by defining some useful sub-formulas.

• At time slot $k$, robot $i$ is at some spot, which can be written as

$$G_{ik} = x_{i1k} \vee x_{i2k} \vee \cdots \vee x_{ipk}$$

Note that the size of $G_{ik}$ is $O(p)$. Now it is easy to define the formula

$$
\begin{aligned}
G = {} & G_{11} \wedge G_{12} \wedge \cdots \wedge G_{1t} \;\wedge \\
       & G_{21} \wedge G_{22} \wedge \cdots \wedge G_{2t} \;\wedge \\
       & \cdots \\
       & G_{r1} \wedge G_{r2} \wedge \cdots \wedge G_{rt}
\end{aligned}
$$

which specifies that at every time slot, every robot has to be at some spot (there is one model for each feasible assignment). The size of $G$ is $O(rtp)$.

• At time slot $k$, spot $j$ is occupied by at most one robot, which can be written with one conjunct for each pair of distinct robots,

$$H_{jk} = \neg(x_{1jk} \wedge x_{2jk}) \wedge \neg(x_{1jk} \wedge x_{3jk}) \wedge \cdots \wedge \neg(x_{(r-1)jk} \wedge x_{rjk})$$

The size of $H_{jk}$ is $O(r^2)$. Now, the formula

$$
\begin{aligned}
H = {} & H_{11} \wedge H_{12} \wedge \cdots \wedge H_{1t} \;\wedge \\
       & H_{21} \wedge H_{22} \wedge \cdots \wedge H_{2t} \;\wedge \\
       & \cdots \\
       & H_{p1} \wedge H_{p2} \wedge \cdots \wedge H_{pt}
\end{aligned}
$$

denotes that at every time slot, every spot holds at most one robot (there is one model for each feasible assignment). Its size is $O(ptr^2)$.

The complete formula is $F = G \wedge H$, which has one model for each possible scheduling of the robots according to the system requirements.
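To see what the parametric model looks like for fixed values of $r, p, t$, the sub-formulas can be generated mechanically. The following is a minimal sketch, assuming a list-based representation chosen here for illustration: each $G_{ik}$ is a list of variable names (read as a disjunction) and each conjunct $\neg(x_{ijk} \wedge x_{i'jk})$ of $H_{jk}$ is a pair of variable names that must not both be true. The function and variable names are hypothetical.

```python
from itertools import combinations

def var(i, j, k):
    """Name of the variable x_ijk: robot i is in spot j at time k."""
    return f"x_{i}_{j}_{k}"

def build_G(r, p, t):
    """One disjunction G_ik per robot i and time slot k (robot i is at some spot)."""
    return [[var(i, j, k) for j in range(1, p + 1)]
            for i in range(1, r + 1)
            for k in range(1, t + 1)]

def build_H(r, p, t):
    """One forbidden pair per spot j, time slot k and pair of robots i1 < i2."""
    return [(var(i1, j, k), var(i2, j, k))
            for j in range(1, p + 1)
            for k in range(1, t + 1)
            for i1, i2 in combinations(range(1, r + 1), 2)]

# Instantiate the model for 3 robots, 4 spots and 2 time slots.
r, p, t = 3, 4, 2
G = build_G(r, p, t)
H = build_H(r, p, t)

print(G[0])      # G_11 as a list of p variable names
print(H[0])      # first conjunct of H_11 as a pair of variable names
print(len(G))    # r * t disjunctions, each of size p
print(len(H))    # p * t * r * (r - 1) / 2 forbidden pairs
```

The counts match the sizes stated above: $G$ contains $r \cdot t$ disjunctions of $p$ variables each, and $H$ contains $p \cdot t \cdot r(r-1)/2$ pairs, which grows quadratically in the number of robots.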
The model can be used to identify different properties of the system. For instance, it is easy to see that in any instantiation of the formula with $p < r$ (i.e., there are more robots than surveillance spots) the formula is unsatisfiable, meaning that it is not possible to place $r$ robots in fewer than $r$ spots without overlapping.

This example shows that propositional formulas can sometimes be pretty large, which may be a serious drawback. For instance, the size of the previous formula is $O(ptr^2)$, which may be a large number even for moderate values of $p, t, r$.

Note that this is a very simple example. In a more realistic one, additional elements would be added to take into account additional conditions about robots, spots and time slots. For instance, if different robots have different capabilities and different spots have different priorities, we could augment the formula to force the best robots to be placed in the most critical spots. One way to do that is to enrich the model by associating a priority to each spot and to each robot, and then add to the formula a third sub-formula encoding the following: for every time slot $k$, for each pair of spots $(j, j')$ such that $j$ has more priority than $j'$, and for each pair of robots $(i, i')$ such that $i$ has more priority than $i'$, the more powerful robot $i$ cannot be placed in the lower-priority spot $j'$ while robot $i'$ occupies the higher-priority spot $j$ (that is, $\neg(x_{ij'k} \wedge x_{i'jk})$).
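The claim that $F = G \wedge H$ is unsatisfiable whenever $p < r$ can be checked on tiny instances by enumerating every interpretation. The sketch below is an assumption-laden illustration rather than a practical procedure: it uses 0-based indices, evaluates $G$ and $H$ directly instead of building the formula, and the function name `satisfiable` is hypothetical. With $r \cdot p \cdot t$ variables there are $2^{rpt}$ interpretations, so it is only feasible for very small parameters.

```python
from itertools import combinations, product

def satisfiable(r, p, t):
    """Brute-force test of F = G ∧ H for r robots, p spots and t time slots."""
    cells = [(i, j, k) for i in range(r) for j in range(p) for k in range(t)]
    for values in product((False, True), repeat=len(cells)):
        x = dict(zip(cells, values))  # one interpretation of all x_ijk
        # G: at every time slot, every robot is at some spot.
        g = all(any(x[i, j, k] for j in range(p))
                for i in range(r) for k in range(t))
        # H: at every time slot, every spot holds at most one robot.
        h = all(not (x[i1, j, k] and x[i2, j, k])
                for j in range(p) for k in range(t)
                for i1, i2 in combinations(range(r), 2))
        if g and h:
            return True  # found a model
    return False         # no interpretation satisfies F

print(satisfiable(2, 2, 1))  # True: two robots fit in two spots
print(satisfiable(3, 2, 1))  # False: three robots cannot share two spots
```

For realistic sizes one would hand the formula to a SAT solver instead of enumerating interpretations; the enumeration is used here only because it follows the definition of satisfiability directly.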
