Master Parisien de Recherche en Informatique 2011–2012 Proof Assistants

Project – Quantifier Elimination in Presburger Arithmetic

1 Instructions

• A significant part of the project consists in designing datatypes and writing functions. This part can be done early, leaving the proofs for later if you are not yet able to write proofs in Coq. Keep in mind, however, that proofs often catch bugs in your definitions! When no proof is required, it is wise to test the definitions (using Eval compute) on simple but not too trivial inputs.

• Don’t forget to write a short report (5 pages max.). It should provide all information needed to understand the sources, like the meaning of the main definitions or the design choices (when this is relevant).

• It is better to give a partial solution than no solution at all, either by proving properties less general than what is required, or by finishing proofs relying on justified assumptions or axioms.

2 Overview of the project

The goal of this project is to write the main components of a decision procedure for Presburger arithmetic, that is, Peano arithmetic without multiplication. The problem is to decide whether a first-order formula is valid or not. The main idea is, given an arbitrary arithmetic formula, to find an equivalent formula which contains no quantifier. The latter formula can be decided by mere computation.

The quantifier elimination process is the following: for a formula ∃x. P(x), we first eliminate quantifiers in P (recursively). Assuming we are able to eliminate one quantifier (see below), we are done. The other cases are simpler to deal with. This is the subject of section 5.

For the elimination of one quantifier, it is easier to reason on formulas of a particular shape, called disjunctive normal form (DNF). Every quantifier-free formula can be turned into an equivalent DNF. This will be proven in section 4. It remains to see how an existentially quantified DNF can be transformed into an equivalent formula.

To illustrate this, consider the formula P = x ≥ 2y ∧ 2x ≤ 3y ∧ 3 | x + y ∧ 2 | x + 4y. The expression n | t means that n divides the value of expression t. Scalar multiples like 2x are allowed in Presburger arithmetic: 2x is just a notation for x + x. Given a value for y, the formula ∃x. P(x) cannot be decided as is, because the range of the existential quantifier is infinite. We notice that the divisibility conditions are periodic: if some x0 satisfies 3 | x + y ∧ 2 | x + 4y then so does x0 + 6k for all k (6 is the least common multiple of 2 and 3). Thus, if we look for the least number that satisfies P, we only need to test P on a period starting from the lower bound 2y. It can be shown that

∃x, P (x) ⇐⇒ P (2y) ∨ P (2y + 1) ∨ P (2y + 2) ∨ P (2y + 3) ∨ P (2y + 4) ∨ P (2y + 5)

As a last remark, we point out that there might be several lower bounds, and which one is the tightest might depend on the value of y. To be sure not to miss the least solution, we have to test P on one period next to each of the bounds. This construction is dealt with in section 6.

3 Presburger Arithmetic

3.1 Expressions

Presburger arithmetic is a subset of Peano arithmetic where all expressions are linear. It can be encoded by the following inductive type:

    Definition var := nat.

    Inductive term : Type :=
    | Nat : nat -> term
    | Var : var -> term
    | Add : term -> term -> term
    | Mul : nat -> term -> term.

The intended meaning of each constructor is the following:

• Nat is used for constant expressions

• Var stands for variable expressions. Variables are encoded by a natural number, but any decidable set could be used instead. The type var has been introduced to distinguish numbers that represent variables from those denoting integer values (as in Nat).

• Add and Mul represent respectively addition and scalar multiplication.

a. Write the function tinterp of type (var → nat) → term → nat such that (tinterp e t) returns the value of term t in the variable assignment e.

b. Write a substitution function tsubst of type term → term → var → term such that (tsubst t u x) is t where every occurrence of x has been replaced by u.
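One possible shape for these two functions is sketched below, assuming a recent Coq version in which Nat.eqb is available. The helper var_eqb, the sample term and the assignment env are illustrative names that do not come from the project statement. The helper is defined before the term type so that the module name Nat is used before a constructor of the same name is introduced.

```coq
Require Import Arith.

Definition var := nat.

(* Hypothetical helper: boolean equality on variables. *)
Definition var_eqb (x y : var) : bool := Nat.eqb x y.

Inductive term : Type :=
| Nat : nat -> term
| Var : var -> term
| Add : term -> term -> term
| Mul : nat -> term -> term.

(* Value of a term under the variable assignment e. *)
Fixpoint tinterp (e : var -> nat) (t : term) : nat :=
  match t with
  | Nat n => n
  | Var x => e x
  | Add t1 t2 => tinterp e t1 + tinterp e t2
  | Mul n t1 => n * tinterp e t1
  end.

(* tsubst t u x: replace every occurrence of (Var x) in t by u. *)
Fixpoint tsubst (t u : term) (x : var) : term :=
  match t with
  | Nat n => Nat n
  | Var y => if var_eqb x y then u else Var y
  | Add t1 t2 => Add (tsubst t1 u x) (tsubst t2 u x)
  | Mul n t1 => Mul n (tsubst t1 u x)
  end.

(* Illustrative input: 2 * x0 + (x1 + 3), with x0 = 5 and every other variable = 7. *)
Definition sample : term := Add (Mul 2 (Var 0)) (Add (Var 1) (Nat 3)).
Definition env (x : var) : nat := if var_eqb x 0 then 5 else 7.

Eval compute in tinterp env sample.        (* = 20 *)
Eval compute in tsubst sample (Nat 4) 0.
(* = Add (Mul 2 (Nat 4)) (Add (Var 1) (Nat 3)) *)
```

Testing on a term that mixes all four constructors, as here, exercises every branch of both functions.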

3.2 Normalizing expressions

Since our expressions are linear, we want to show that given a variable x, any expression t is equivalent to αx + t′ where x does not appear in t′. If α is 0, then we say that expression t does not depend on x. This normal form can be computed either by structural recursion on t, or by the formula t(x) = t(0) + (t(1) − t(0))x, but the latter is slightly more complex.

a. Write a normalization function computing the normal form of an expression t with respect to a variable x, returning a pair made of a natural number α and an expression t′ not depending on x.

b. Prove the correctness of the above function, by showing that t and αx + t′ agree on every variable assignment: ∀e. tinterp e t = α × e(x) + tinterp e t′
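The structural-recursion variant can be sketched as follows. The name tnorm, the helper var_eqb and the sample term are assumptions made for illustration, and the correctness proof is left out.

```coq
Require Import Arith.

Definition var := nat.
(* Hypothetical helper: boolean equality on variables. *)
Definition var_eqb (x y : var) : bool := Nat.eqb x y.

Inductive term : Type :=
| Nat : nat -> term
| Var : var -> term
| Add : term -> term -> term
| Mul : nat -> term -> term.

(* tnorm t x returns (alpha, t') such that t is meant to be
   equivalent to alpha * x + t', with x not occurring in t'. *)
Fixpoint tnorm (t : term) (x : var) : nat * term :=
  match t with
  | Nat n => (0, Nat n)
  | Var y => if var_eqb x y then (1, Nat 0) else (0, Var y)
  | Add t1 t2 =>
      let (a1, r1) := tnorm t1 x in
      let (a2, r2) := tnorm t2 x in
      (a1 + a2, Add r1 r2)
  | Mul n t1 =>
      let (a1, r1) := tnorm t1 x in
      (n * a1, Mul n r1)
  end.

(* Illustrative input: 2 * x0 + (x0 + 3), normalized with respect to x0. *)
Definition sample : term := Add (Mul 2 (Var 0)) (Add (Var 0) (Nat 3)).

Eval compute in tnorm sample 0.
(* = (3, Add (Mul 2 (Nat 0)) (Add (Nat 0) (Nat 3))) *)
```

The remainder keeps the shape of the input term, with each occurrence of x replaced by the constant 0. It could be simplified further, at the price of a slightly longer correctness proof.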

3.3 Formulas

The formulas we will consider are the minimal first-order formulas (built only from negation, disjunction and the existential quantifier). The atomic predicates are inequality (≤) and divisibility (n | t). Other logical connectives and common arithmetical predicates can be derived easily. For instance A ∧ B is a notation for ¬(¬A ∨ ¬B) and t1 = t2 is a notation for t1 ≤ t2 ∧ t2 ≤ t1.

a. Define formula, the inductive type of formulas. Binders (∃) can be represented in different ways, but we recommend sticking to the informal notation, where ∃x. P(x) is represented by the existential quantification constructor applied to a variable x (the name of the bound variable) and a formula, in which every occurrence of (Var x) refers to the bound variable.

b. Write the substitution function (called fsubst) operating on formulas. Beware of variable captures: replacing x by 1 in ∃x. x = x should return the formula unchanged! (Try it.) It is however assumed that in (fsubst t u x), the free variables of u do not overlap with the bound variables of t.

c. Write the interpretation function (called holds) for formulas: given a variable assignment and a formula, return an object of type Prop expressing that the given formula is true when the free variables have been substituted as specified by the assignment.

d. Explain why it is not possible to directly define holds e (∃x. P) as exists n, holds e (fsubst P (Nat n) x). However, prove that this property holds for the definition accepted by Coq.
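A minimal sketch of one possible encoding follows. The constructor names Le, Div, Not, Or and Ex, and the helpers var_eqb and update, are assumptions and not names imposed by the project. In this sketch the existential case of holds extends the assignment instead of substituting, so that the recursive call is made on a direct subformula.

```coq
Require Import Arith.

Definition var := nat.
(* Hypothetical helper: boolean equality on variables. *)
Definition var_eqb (x y : var) : bool := Nat.eqb x y.

Inductive term : Type :=
| Nat : nat -> term
| Var : var -> term
| Add : term -> term -> term
| Mul : nat -> term -> term.

Fixpoint tinterp (e : var -> nat) (t : term) : nat :=
  match t with
  | Nat n => n
  | Var x => e x
  | Add t1 t2 => tinterp e t1 + tinterp e t2
  | Mul n t1 => n * tinterp e t1
  end.

Fixpoint tsubst (t u : term) (x : var) : term :=
  match t with
  | Nat n => Nat n
  | Var y => if var_eqb x y then u else Var y
  | Add t1 t2 => Add (tsubst t1 u x) (tsubst t2 u x)
  | Mul n t1 => Mul n (tsubst t1 u x)
  end.

Inductive formula : Type :=
| Le : term -> term -> formula      (* t1 <= t2 *)
| Div : nat -> term -> formula      (* n | t *)
| Not : formula -> formula
| Or : formula -> formula -> formula
| Ex : var -> formula -> formula.   (* exists x. P *)

(* Substitution stops at a binder that rebinds x, which avoids capture. *)
Fixpoint fsubst (f : formula) (u : term) (x : var) : formula :=
  match f with
  | Le t1 t2 => Le (tsubst t1 u x) (tsubst t2 u x)
  | Div n t => Div n (tsubst t u x)
  | Not a => Not (fsubst a u x)
  | Or a b => Or (fsubst a u x) (fsubst b u x)
  | Ex y a => if var_eqb x y then Ex y a else Ex y (fsubst a u x)
  end.

(* Assignment e where x is rebound to n. *)
Definition update (e : var -> nat) (x : var) (n : nat) : var -> nat :=
  fun y => if var_eqb x y then n else e y.

Fixpoint holds (e : var -> nat) (f : formula) : Prop :=
  match f with
  | Le t1 t2 => tinterp e t1 <= tinterp e t2
  | Div n t => exists k, tinterp e t = n * k
  | Not a => ~ holds e a
  | Or a b => holds e a \/ holds e b
  | Ex x a => exists n, holds (update e x n) a
  end.

(* Capture check on "exists x0. x0 <= x0" (<= is used instead of the derived =). *)
Definition ex_refl : formula := Ex 0 (Le (Var 0) (Var 0)).
Eval compute in fsubst ex_refl (Nat 1) 0.   (* the formula is returned unchanged *)
```

Relating this definition of holds to the substitution-based reading asked for in question d then requires a lemma connecting update and tsubst, which is not shown here.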

3.4 Quantifier-free formulas

a. Write a boolean-valued function that determines whether a formula is quantifier-free or not.

b. Write a boolean-valued function that decides whether a quantifier-free formula is valid or not in a given variable assignment.

c. Prove the soundness of this decision function, that is, that the boolean computed above reflects the proposition returned by holds.
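The two boolean functions could look like the sketch below. The names is_qf, qf_eval, nat_leb and divides, as well as the formula constructors, are illustrative assumptions. The value returned on a quantified formula is arbitrary, since the function is only meant to be used on quantifier-free inputs.

```coq
Require Import Arith.

Definition var := nat.

(* Hypothetical helpers, defined before the constructor Nat is introduced. *)
Definition nat_leb (m n : nat) : bool := Nat.leb m n.
(* divides n m tests n | m; the case n = 0 is handled separately. *)
Definition divides (n m : nat) : bool :=
  match n with
  | 0 => Nat.eqb m 0
  | S _ => Nat.eqb (Nat.modulo m n) 0
  end.

Inductive term : Type :=
| Nat : nat -> term
| Var : var -> term
| Add : term -> term -> term
| Mul : nat -> term -> term.

Fixpoint tinterp (e : var -> nat) (t : term) : nat :=
  match t with
  | Nat n => n
  | Var x => e x
  | Add t1 t2 => tinterp e t1 + tinterp e t2
  | Mul n t1 => n * tinterp e t1
  end.

Inductive formula : Type :=
| Le : term -> term -> formula
| Div : nat -> term -> formula
| Not : formula -> formula
| Or : formula -> formula -> formula
| Ex : var -> formula -> formula.

(* true exactly when no Ex constructor occurs in the formula. *)
Fixpoint is_qf (f : formula) : bool :=
  match f with
  | Le _ _ | Div _ _ => true
  | Not a => is_qf a
  | Or a b => andb (is_qf a) (is_qf b)
  | Ex _ _ => false
  end.

(* Boolean evaluation; the Ex case is a dummy value. *)
Fixpoint qf_eval (e : var -> nat) (f : formula) : bool :=
  match f with
  | Le t1 t2 => nat_leb (tinterp e t1) (tinterp e t2)
  | Div n t => divides n (tinterp e t)
  | Not a => negb (qf_eval e a)
  | Or a b => orb (qf_eval e a) (qf_eval e b)
  | Ex _ _ => false
  end.

(* Illustrative assignment: x0 = 5, every other variable = 7. *)
Definition env (x : var) : nat := match x with 0 => 5 | _ => 7 end.

Eval compute in is_qf (Ex 0 (Le (Var 0) (Nat 1))).           (* = false *)
Eval compute in qf_eval env (Div 3 (Add (Var 0) (Var 1))).   (* = true, since 3 | 12 *)
```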

4 Clauses in Disjunctive Normal Form

A clause is a triple of sets¹ of atomic formulas:

• the first set contains formulas of the form t1 ≤ t2,

• the second set contains formulas of the form n|t,

• the third set contains formulas of the form ¬n|t.

We can show that any quantifier-free formula is equivalent to a set of clauses called a “disjunctive normal form” (DNF), with the interpretation that [[A11; ... ; A1i1]; ... ; [Ak1; ... ; Akik]] means ⋁i ⋀j Aij.

a. Define the type of clauses.

b. Write clauses equivalent to the following formulas: ⊤, ⊥ and t1 ≤ t2

¹ To encode sets, you can either use the List library of Coq or the seq library of SSReflect.

c. Write the interpretation function of DNFs towards quantifier-free formulas described above.

d. Given two DNFs C1 and C2, write a function computing a DNF C′ (written C1 ∧ C2) such that whenever C1 is equivalent to A and C2 is equivalent to B, then C′ is equivalent to A ∧ B, and prove it.

e. Write a function (qf_to_dnf) that transforms a quantifier-free formula into a DNF, and prove its correctness. To make it simpler, it is better to add an extra argument that accumulates negations: qf_to_dnf is of type formula → bool → dnf and (qf_to_dnf A b) represents A if b is true, and ¬A if b is false.
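For the conjunction of two DNFs, one option is to distribute: every clause of C1 is merged with every clause of C2. The sketch below assumes that sets are encoded with the List library, and all names (clause, dnf, clause_and, dnf_and and so on) are illustrative.

```coq
Require Import List.

Definition var := nat.

Inductive term : Type :=
| Nat : nat -> term
| Var : var -> term
| Add : term -> term -> term
| Mul : nat -> term -> term.

(* A clause: atoms t1 <= t2, atoms n | t, and negated atoms ~ (n | t). *)
Definition clause : Type :=
  (list (term * term) * list (nat * term) * list (nat * term))%type.

(* A DNF is a disjunction (list) of clauses. *)
Definition dnf : Type := list clause.

(* The empty conjunction is true; the empty disjunction is false. *)
Definition clause_true : clause := (nil, nil, nil).
Definition dnf_true : dnf := clause_true :: nil.
Definition dnf_false : dnf := nil.

(* Conjunction of two clauses: union of their atoms, component-wise. *)
Definition clause_and (c1 c2 : clause) : clause :=
  match c1, c2 with
  | (le1, dv1, nd1), (le2, dv2, nd2) => (le1 ++ le2, dv1 ++ dv2, nd1 ++ nd2)
  end.

(* Conjunction of two DNFs: every clause of d1 merged with every clause of d2. *)
Definition dnf_and (d1 d2 : dnf) : dnf :=
  flat_map (fun c1 => map (clause_and c1) d2) d1.

(* Illustrative DNF with the single clause "1 <= x0". *)
Definition d : dnf := ((Nat 1, Var 0) :: nil, nil, nil) :: nil.

Eval compute in length (dnf_and (d ++ d) (d ++ d ++ d)).   (* = 6 *)
```

The result has as many clauses as the product of the sizes of the two inputs, which is where the usual exponential blow-up of DNF conversion comes from.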

5 Quantifier Elimination

In this section, we will assume that we have a way to eliminate one existential quantifier. This process is called a projector. More precisely, a projector is a function φx from clauses to quantifier-free formulas such that φx(C) is equivalent to ∃x. C.

a. Write the assumptions corresponding to the existence of a projector.

b. Generalize to DNFs: given a DNF D, return a quantifier-free formula equivalent to ∃x. D.

c. Write a recursive function that translates arbitrary formulas of Presburger arithmetic into an equivalent quantifier-free formula.

d. Write a decision procedure for Presburger arithmetic: given a variable assignment and a formula, decide whether this formula holds in the given assignment, by composing quantifier elimination with the decision function for quantifier-free formulas.

6 Projector

In this section we propose to code a projector. The correctness of a projector is beyond the scope of the current project and we only ask to code the functions and prove a reduced number of properties. We assume we want to eliminate variable x from a clause

C = {(ti ≤ ui)i;(nj | vj)j;(¬mk | wk)k}

Solutions of C are the intersection of an interval (defined by (ti ≤ ui)i) and the divisibility atoms, which is periodic: there is a number πx(C) such that if x0 satisfies the divisibility atoms, then so does x0 + kπx(C) for all k. This period should be a multiple of all n such that there is an atom (n | vj) or (¬n | wk) such that vj or wk depends on x.

6.1 Computing the period

a. Write a simple function computing πx(C). The optimal choice is the least common multiple, but there are choices easier to implement.
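As a hedged illustration, suppose the moduli n of the atoms that depend on x have already been collected in a list (this collection step is not shown). The sketch below compares the plain product with the least common multiple; period_prod, lcm2 and period_lcm are hypothetical names.

```coq
Require Import Arith List.

(* Easiest choice: the product of all moduli. *)
Definition period_prod (ns : list nat) : nat :=
  fold_right Nat.mul 1 ns.

(* Least common multiple of two numbers, computed from the gcd. *)
Definition lcm2 (a b : nat) : nat :=
  Nat.mul a (Nat.div b (Nat.gcd a b)).

(* Optimal choice: the least common multiple of all moduli. *)
Definition period_lcm (ns : list nat) : nat :=
  fold_right lcm2 1 ns.

(* Moduli of the overview example: 3 | x + y and 2 | x + 4y. *)
Eval compute in period_prod (3 :: 2 :: nil).   (* = 6 *)
Eval compute in period_lcm (3 :: 2 :: nil).    (* = 6 *)

(* The two choices differ when the moduli share a factor. *)
Eval compute in period_prod (4 :: 6 :: nil).   (* = 24 *)
Eval compute in period_lcm (4 :: 6 :: nil).    (* = 12 *)
```

Both results are multiples of every modulus, which is all that is required of the period. The product is easier to reason about, but it makes the final disjunction larger.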

6.2 Computing the set of lower bound inequalities

Using the normalization of expressions, we know that (ti ≤ ui) is equivalent to (ax+b ≤ cx+d) for some a, b, c and d not depending on x. Depending on a and c, it is possible to simplify it further, obtaining

• either (a − c)x + b ≤ d, which gives an upper bound to x,

• or b ≤ (c − a)x + d, which gives a lower bound to x.

a. Write a function that, given a clause, returns the set of lower bound inequalities (as a list of triples (αi, βi, γi) standing for αix + βi ≥ γi) and a clause corresponding to all the remaining atomic formulas.

6.3 Changing the variable

For a lower bound αix + βi ≥ γi, we would like to say that the lower bound is (γi − βi)/αi, but this is not a Presburger formula. To work around this, the idea is to multiply each atomic formula by a natural number so that x appears with the same scalar λ. We also want λx to be added to the same expression a in all lower bound inequalities. This does not change the solutions, and it is then possible to replace all occurrences of λx + a by a variable y, with the additional constraint that y should be equal to a modulo λ.

a. Collect all factors αi of x and added expressions βi in lower bound formulas. Take α = Πi αi and β = Σi (α/αi) βi. We want to perform the variable change y = αx + β.

b. Apply the variable change to clause C, producing a new clause C′ obtained as follows:

• Lower bound inequalities αix + βi ≥ γi are equivalent to αx + β ≥ (α/αi) γi + (β − (α/αi) βi). So C′ should contain y ≥ (α/αi) γi + (β − (α/αi) βi).

• Other inequalities ax + b ≤ cx + d are transformed into αax + αb + aβ + cβ ≤ αcx + αd + aβ + cβ, which means that C′ contains ay + αb + cβ ≤ cy + αd + aβ.

• Divisibility atoms n | ax + b become αax + αb + aβ ≡ aβ (mod αn), that is, ay + αb ≡ aβ (mod αn). This equation can be turned into a divisibility atom because Z/αnZ is a group. It is possible to write a subtraction modulo αn function on expressions, and C′ should contain αn | (ay + αb − aβ).

• Negated divisibility atoms are processed similarly.

• C′ should also contain an extra divisibility atom reflecting that y is αx + β, namely α | (y − β), using subtraction modulo α.

To avoid the risk of variable clash, the same name should be used to refer to x and y.

c. Write a function that computes the set lbx(C) of all lower bounds, that is, the right-hand sides of the inequalities y ≥ νi above.

6.4 Building the projector

Since the divisibility atoms are periodic, if the inequalities have a solution, there is a (possibly different) one that lies at a distance smaller than the period from one of the formal lower bounds. The idea is then to generate the following big disjunction

⋁_{a ∈ lbx(C)} ⋁_{k ∈ [0; πx(C)[} fsubst C′ (a + k) y

that enumerates all lower bounds and tests C′ on one period above each lower bound. Important note: if the set of lower bounds is empty, then it can be replaced by {0} because we work with the natural numbers.

a. Write the function that generates the formula above. We will assume that this is indeed a projector.
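The shape of this disjunction can be sketched as follows, under simplifying assumptions: the clause C′ is passed as an already built quantifier-free formula, the lower bounds and the period are given as arguments, and ⊥ is encoded as 1 ≤ 0. The names ffalse, big_or and project, like the formula constructors, are hypothetical.

```coq
Require Import Arith List.

Definition var := nat.
(* Hypothetical helper: boolean equality on variables. *)
Definition var_eqb (x y : var) : bool := Nat.eqb x y.

Inductive term : Type :=
| Nat : nat -> term
| Var : var -> term
| Add : term -> term -> term
| Mul : nat -> term -> term.

Inductive formula : Type :=
| Le : term -> term -> formula
| Div : nat -> term -> formula
| Not : formula -> formula
| Or : formula -> formula -> formula
| Ex : var -> formula -> formula.

Fixpoint tsubst (t u : term) (x : var) : term :=
  match t with
  | Nat n => Nat n
  | Var y => if var_eqb x y then u else Var y
  | Add t1 t2 => Add (tsubst t1 u x) (tsubst t2 u x)
  | Mul n t1 => Mul n (tsubst t1 u x)
  end.

Fixpoint fsubst (f : formula) (u : term) (x : var) : formula :=
  match f with
  | Le t1 t2 => Le (tsubst t1 u x) (tsubst t2 u x)
  | Div n t => Div n (tsubst t u x)
  | Not a => Not (fsubst a u x)
  | Or a b => Or (fsubst a u x) (fsubst b u x)
  | Ex y a => if var_eqb x y then Ex y a else Ex y (fsubst a u x)
  end.

(* Encoding of the false formula, used as the empty disjunction. *)
Definition ffalse : formula := Le (Nat 1) (Nat 0).

Fixpoint big_or (fs : list formula) : formula :=
  match fs with
  | nil => ffalse
  | f :: r => Or f (big_or r)
  end.

(* One disjunct per lower bound a and per offset k < p;
   an empty list of lower bounds is replaced by the single bound 0. *)
Definition project (c' : formula) (y : var) (lbs : list term) (p : nat) : formula :=
  let lbs' := match lbs with nil => Nat 0 :: nil | _ => lbs end in
  big_or (flat_map
            (fun a => map (fun k => fsubst c' (Add a (Nat k)) y) (seq 0 p))
            lbs').

(* Illustrative call: C' is "y <= 3", one lower bound 2, period 2. *)
Eval compute in project (Le (Var 0) (Nat 3)) 0 (Nat 2 :: nil) 2.
(* = Or (Le (Add (Nat 2) (Nat 0)) (Nat 3))
        (Or (Le (Add (Nat 2) (Nat 1)) (Nat 3)) (Le (Nat 1) (Nat 0))) *)
```

The number of disjuncts is the number of lower bounds times the period, so a smaller period directly yields a smaller output formula.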
