<<

A Decision Procedure for an Extensional Theory of Arrays

Aaron Stump, Clark W. Barrett, and David L. Dill Jeremy Levitt Computer Systems Laboratory 0-In Design Automation, Inc.

Stanford University, Stanford, CA 94305, USA San Jose, CA 95110, USA g E-mail: fstump,dill,barrett @cs.stanford.edu Email: [email protected]

Abstract viduals that may be stored in arrays. The sort V is the sort for primitive values stored in arrays. The of value sorts

A decision procedure for a theory of arrays is of inter- is defined to be the least set X satisfying

est for applications in formal verification, program analy-

V X sis, and automated -proving. This paper presents a

decision procedure for an extensional theory of arrays and

X X array proves it correct.

Every value sort except V is an array sort. The value sorts

V I together with I are all the sorts of the language. and 1. Introduction need not be distinct.

A decision procedure for a theory of arrays is of interest Definition 1 (dimensionality of a value sort) The dimen-

) for applications in formal verification and program analy- sion dim( of a value sort is defined by

sis. Such a procedure is also of value for theorem-provers.

(V ) = 0 dim The PVS theorem-prover [11] has an undocumented deci-

sion procedure for a theory of arrays [12], and HOL has

( ( ) + 1 = dim array ) dim some automatic support for a theory of arrays via a library for finite partial functions [3]. Terms The language has countably infinitely many Two kinds of array theories have been studied previously. variables and constants, with countably infinitely many of

Extensional theories require that if two arrays store the same each distinct sort. The constants are uninterpreted, in the i value at index i, for each index , then the arrays must be sense they will not occur in any or axiom scheme. the same. Non-extensional theories do not make this re- The symbols of the language are

quirement. This paper is the first to present a procedure for

I

read of type (array ), for every value sort checking satisfiability of arbitrary quantifier-free formulas

in an extensional theory of arrays and prove its correctness.

I

write of type (array array ), for every 2. Theories of arrays value sort

Decision procedures for various theories of arrays have Subscripts on read and write will generally be omitted. In-

a i) a

been studied previously. Most of these theories can be di- formally, read( will denote the value stored in array at

(a i v )

vided into extensional and non-extensional varieties. In this index i, and write will denote an array which stores i section, several families of array theories are axiomatized the same value as a for every index except possibly , where

in classical first-order multi-sorted with . The it stores value v . theory Arr decided in this paper is then presented and com- Terms are built up in the usual way from constants and pared to previously decided theories. variables using the function symbols. Terms whose sort is

an array sort will be called array terms. Terms whose sort

(a)

2.1. The language is I will be called index terms. The dimension dim of

(a) = n

an array term a is the dimension of its sort. If dim ,

a n n 1 a Sorts The language has a basic sort I for indices into array is said to be -dimensional. If , is also said arrays. It also has value sorts, which are the sorts of indi- to be multi-dimensional. Formulas The atomic formulas of the language are the and the read-over-write scheme. This is because single- equations between terms of the same sort. Formulas are sorted first-order theories with function symbols and equal- built up from atomic formulas using propositional connec- ity may be translated into this array theory in such a way tives and quantifiers in the usual way. A formula is closed if that a first-order formula is valid iff its translation is. The

it has no free variables. A literal is an atomic formula or the translation maps constant symbols to index constants, n-

negation of an atomic formula. A theory is a set of closed ary function symbols to n-dimensional array constants,

f (i i ) n

formulas. and terms like to nested read expressions

( (f i ) i ) i ) f i i

read( read read , where

n n

f i i n 2.2. Theories are the translations of . The undecidability re- sults for classical first order logic with just function symbols Some theories restrict which array sorts are allowed. If a and equality (see, e.g., [5]) can then be applied to show that

theory allows array sorts of dimension at most n, it is said to even quite restricted quantified fragments of the extensional

have just n-dimensional arrays. If a theory allows all array theory of arrays are undecidable. sorts, it is said to have multi-dimensional arrays. A decision procedure for Arr may be useful even for The following scheme, which is schematic in a value applications which require a fully quantified logic. Many

sort , is called the read-over-write axiom scheme. Infor- theorem provers, such as the widely used PVS [11], pro-

i j

mally, it says that for all arrays a, indices and , and val- vide strategies to reduce goals to subgoals in decidable frag- j

ues v of suitable type, reading the value stored at index of ments of their logic.

a i v ) v (a j ) write( is if the two indices are equal and read if they are different. 2.4. Comparison with related work Axiom scheme 1 (read-over-write)

In this section, related work is summarized by describing

a : i : I j : I v : V array which theories are decided. These theories often use axiom-

atizations different from but equivalent to that of Arr. All

( (a i v ) j ) = v ) i = j ( read write

the theories decided are quantifier-free. Kaplan is the only

( (a i v ) j ) = (a j )) i = j

( read write read I one to distinguish the sorts V and . Many of the previous theories allow arithmetic operators or uninterpreted func- The following scheme, which is schematic in a value sort

tions over sort I to be used in addition to the symbols read

, is called the extensionality axiom scheme. Informally, and write. The restriction here to just the essential theory of it expresses a principle of extensionality for arrays: if two

arrays is justified by the fact that, as will be shown in Sec- i arrays store the same value at index i, for each index , they tion 6 below, the satisfiability procedure for Arr is suitable are equal. for incorporation into a framework for cooperating decision Axiom scheme 2 (extensionality) procedures [2]. In such a framework, separate decision pro-

cedures for arithmetic and uninterpreted functions may be

a : b :

array array combined with the decision procedure for Arr to decide the

combined theory.

(a i) = (b i)) a = b i : I ( read read The first two works present but no decision pro- The extensional theories are those axiomatized by the cedure for their theories. With the exception of Levitt’ read-over-write and extensionality axiom schemes. The work, the others give decision procedures for theories that non-extensional theories are those axiomatized by just the are strictly weaker than Arr, either because they restrict the read-over-write axiom scheme. Note that since a theory is a form of formulas in the theory (e.g., to just equations), dis- set of closed formulas, quantifier-free array theories have no allow equations between arrays, or are non-extensional. variables; all 0-ary symbols are (uninterpreted) constants. McCarthy In [8], McCarthy introduces the function symbols read and write and gives an informal semantics for 2.3. The theory Arr an extensional theory of arrays based on them. Collins and Syme Collins and Syme present in HOL The theory Arr decided in this paper is the quantifier- a theory of finite higher-order partial functions similar to a

free fragment of the extensional theory with multi- theory with multi-dimensional arrays [3]. I dimensional arrays where sort V is defined to be sort . So Kaplan In [6], Kaplan gives a decision procedure for a indices are the values stored in 1-dimensional arrays. non-extensional equational theory with just 1-dimensional The restriction to the quantifier-free fragment is justi- arrays. He considers equations between index terms only, fied by the fact that the fully quantified theory is undecid- which is reasonable since his theory contains no non-trivial

able, even in the absence of the function symbols write equations between arrays. He then shows how to extend his procedure to decide an extensional equational theory, where 3.1. Informal overview the equations may be between array as well as index terms.

He imposes the restriction that distinct variables of sort I The procedure works in two phases. In the first phase, must receive distinct interpretations. the original goal is transformed into a set of subgoals such Suzuki and Jefferson In [15], Suzuki and Jeffer- that (i) no subgoal contains write and (ii) the original goal son present a decision procedure for a theory with just 1- is satisfiable iff one of the subgoals is. Eliminating write dimensional arrays, where equations between arrays are not expressions is straightforward except when they occur as allowed. The theory has axioms for extensionality and the the left or right hand side of an equation. How to eliminate existence of constant arrays (arrays that store the same value such occurrences of write expressions is the crucial insight at all indices), but these appear to be included for technical of this algorithm. reasons only; the theory decided is equivalent to the one without those axioms under the restrictions they impose. Definition 2 (= )

They extend their procedure to decide a theory with a new

a = b i : I i I (a i) = (b i)

I read read

a b) predicate PERM, where PERM( holds iff the def

multiset of the values stored in a is contained in the mul-

a = b I =

Formulas of the form I with are called partial tiset of the values stored in b. Sentences of the theory are

equations.

(a b) P restricted to the form P PERM , where is any (quantifier-free) sentence not containing PERM. Arr does The crucial observation is that not have the PERM predicate, but inspection of the way

Suzuki and Jefferson extend their algorithm to treat PERM

(a i v ) = b (a = b (b i) = v )

ig write f read shows that it could just as easily be used to extend the algo- rithm for Arr, as long as their restriction disallowing equa- write expressions occurring as sides of equations may thus tions between array terms were retained. be eliminated by introducing partial equations. Downey and Sethi In [4], Downey and Sethi present The second phase of the procedure is based on the ob- a decision procedure for an extensional equational theory servation that in the absence of write, arrays behave like with just 1-dimensional arrays. Equations between array uninterpreted functions and read behaves like function ap- terms are allowed. They prove that determining the invalid- plication. So in the absence of write, a congruence closure ity of an equation in their theory of arrays is NP-complete. algorithm (cf. [1]) could be used to decide the theory. The Nelson and Oppen In [10], Nelson and Oppen describe algorithm must be modified to work with partial equations an extensional theory of arrays. Their theory allows multi- as well as equations, but this can be done. For simplicity, the dimensional arrays. They do not present their satisfiabil- very simple congruence closure algorithm described in [14] ity procedure for the extensional theory, but in [9], Nelson is used, but it should be possible to modify a more complex gives a detailed presentation of a satisfiability procedure for algorithm. a non-extensional theory. Levitt In Chapter 5 of his PhD thesis [7], Levitt presents 3.2. Formal presentation a decision procedure for an extensional theory of arrays based on solving equations and canonizing terms, in the Figure 1 presents our procedure as a proof system. The style of Shostak [13]. A detailed proof of correctness is proof system determines a non-deterministic procedure, not given, and has proved elusive to the authors. In con- where rules are applied bottom-up to analyze a goal into trast, a detailed proof of correctness is given below for the one or more subgoals. The system may be thought of as procedure for Arr. a rewrite system, where, for each rule, the goal below the line is rewritten to the subgoals above the line. The sys- 3. The satisfiability procedure for Arr tem resembles a Gentzen-Sch¨utte system where only left rules of the corresponding sequent system are used (i.e., a

Arr is decided by a refutation procedure. The procedure sequent system where sequents are restricted to be of the

decides satisfiability of conjunctions of literals, which are form ). The derivable objects of this system are sets equations and disequations between terms. Deciding satis- of literals. It is intended that a set of literals be derivable iff fiability of arbitrary boolean combinations of atomic formu- their conjunction is unsatisfiable. A deduction of a goal is a las can be reduced to this problem by well-known means. tree obtained by applying the proof rules bottom-up to that A conjunction of literals whose satisfiability is to be tested goal. A goal to which no rule can be applied is said to be will be called a goal. Comma will be used to denote con- normal. junction. Two goals are said to be equisatisfiable when one is satisfiable iff the other is.

Phase 1:

(a k ) = (b k )

read read

a b

(ext) k is not free in the conclusion; and are arrays

a = b

v ] i = j [ (a j )] i = j [ read

(r-over-w)

( (a i v ) j )]

[read write

a = b i I a = b (b i) = v i I

iI I read

(w-elim)

(a i v ) = b

write I

b = a

I a

(w-elim-helper) b is a write expression, and is not

a = b I

Phase 2:

a = b (a i) = (b i) i I a = b i I I I read read

(partial-eq)

a = b

I

b I = (a i)

where a ; ; read occurs in

0 0

c c b = a = b a =

I I I I

I = =

(trans) and I

0

c a = b a =

I I

[y ] x = y

x y x y [ ]

(subst) , x not in

[x] x = y

y = x

I

y

(symm) x

x = y I

Both phases:

i = j i I i I i = j

(-split) ( -expand)

i (j I ) i (j I )

(-empty) (ax)

i x = x

Figure 1. The decision procedure as a proof system The system has two phases. Some rules may be applied cedure is complete iff when it reports a goal satisfiable, the in just one phase, while others may be applied in either goal is indeed satisfiable. A procedure is correct iff it ter- phase. The rules of phase 1 are applied to a goal until no minates on all inputs, and it is sound and complete. In this rule applies, and then the rules of phase 2 are applied. The section, a detailed proof of completeness for the satisfiabil- procedure stops and reports that the original conjunction is ity procedure for Arr is given. The proof of termination is satisfiable if it encounters a normal subgoal. Otherwise, it routine and omitted for lack of space. The following theo- reports that the original goal is unsatisfiable. As mentioned rem implies soundness. before, phase 2 is a modified congruence closure algorithm. The core congruence closure algorithm consists of just the Theorem 1 (equisatisfiability) The conclusion of each rules (symm) and (subst) [14]. rule of the system is satisfiable iff one of its premises is sat-

The set-theoretic operators have their usual meanings; isfiable.

I fig I I

note that i denotes , where does not contain [ ]

i. denotes a context, which is an expression contain- Proof: The proof is routine. Consider just the rule (trans).

0

c a = b a = I

ing one or more occurrences of a single free variable. The If I and are true in some model, then it is

0

c = b = I I

expression obtained by substituting the term t for the con- easy to see by the definition of that is also

c a

t]

text’s free variable is written [ . In the rule (subst), since true in some model. If agrees with at every index except

I a b

the side condition requires that [ ] contain no occurrences those in and agrees with at every index except those

I i I I c a x

of the term x, applying (subst) replaces all occurrences of in , then clearly implies that agrees with at

i a b i c b

x] y

in [ with the term . denotes syntactic identity. The and also that agrees with at . Hence, agrees with i

symbol denotes an ordering on terms by size, which is at . For the other direction, if the premise has a model, so

y x y

defined on terms in the usual way. Let x iff and are does the conclusion, since the conclusion is a of the

2 y

such that the size of x is less than or equal to the size of . premise.

The variants and are derived from in the usual way. Recall that a normal goal is one to which no rule applies. 3.3. Avoiding non-termination in phase 2 By the equisatisfiability theorem, to prove completeness of the algorithm it suffices to show that any normal goal is satisfiable. This may be done by constructing a model for a In phase 2, applications of (partial-eq) and (trans) must normal goal. The following lemma is easily established. be restricted to avoid certain sources of non-termination. There is nothing preventing (partial-eq) and (trans) from be- Lemma 1 (effect of phase 1) A goal that is normal with ing applied repeatedly with the same partial equations, be- respect to phase 1 of the algorithm contains no write ex- cause for both rules, the partial equations are retained in the pressions and no disequations between array expressions. goal. For (partial-eq), this form of non-termination may be prevented by adding a side condition to the rule that pre-

4.1. A convenient form for normal goals

a i)

vents it from being applied if, informally, read( and

b i) i read( are already known to be equal or if is already In preparation for constructing a model, several trans- known to be equal to an of I . Formally, the proce-

formations, which are not actually performed by the algo- t dure can test whether or not t and are already known to

be equal by applying all the rules of phase 2 except (partial- rithm, are applied to a normal goal to give an equisatisfiable

normal goal , which is in a more convenient form. If the

= t

eq) and (trans) to the current goal with t added, and

= x seeing whether or not that goal is reported unsatisfiable. If normal goal contains equations of the form x , clearly

they may be removed and the result will be equisatisfiable. neither (-split) nor ( -expand) applies to the current goal, then this is equivalent just to comparing normal forms as de- Next, modify the goal by doing the following. Let G be the

termined by the core congruence closure algorithm. So in goal as it currently stands. If there is a term of the form

a i) G

an implementation, this non-termination may easily be pre- read( in that is not the left hand side of any equation

c G

vented. A similar approach can be used to prevent (trans) in G, choose a constant symbol not occurring in , and

(a i) c

from being applied repeatedly to the same formulas. The re- modify G by replacing read everywhere in it with

a i) = c

quired machinery, however, has been omitted from the proof and adding the equation read( to it. If there is no

a i) G system for simplicity. such term read( in , stop. It is easy to show that the resulting goal is normal and equisatisfiable with the original normal goal. This resulting goal consists of formulas of one

4. Correctness of the Procedure

y z of the following four forms, where x, , and are constant symbols:

A satisfiability procedure is sound iff when it reports a

x y ) = z goal unsatisfiable, the goal is indeed unsatisfiable. A pro- I. read(

a ) R a R x = y (a R

n I I I

II. The chain is denoted

n1 2 1

x = y I

n

III. I , where every element of is a constant symbol is the length of the chain.

S

x = y

I IV.

The union along the chain is defined to be j .

j n

= y

Since this resulting goal is normal, no formula x of

x y a x

The chain is said to be from to iff and

the form (IV) has its left hand side appearing anywhere else

a y

n . in the goal, since otherwise (subst) would apply. Let be this resulting goal, except without the equations of the form

(IV). will be said to be in convenient normal form. Any a a' b

model M of may be extended to a model of with those

equations of the form (IV) by giving the same

y M y

for the constant x as for the constant , if interprets , a a' b y and a single arbitrary interpretation for both x and other- a wise. a' 4.2. Construction of a model c

b' In this section, a kind of term model for the goal in b convenient normal form is constructed. Several definitions,

in terms of , are required. The fact that the core congru- ence closure algorithm (rules (subst) and (symm)) is correct

Figure 2. Standard forms for -chains

is used (see [14] for the proof).

Definition 3 ( and ) Let and be the ternary

a b

relations defined, respectively, by Lemma 2 (standard form for chains) Suppose I ,

=

with I . Then one of the following is true:

a b (a = b) I

I iff

a b b a

i. there is a -chain from to or from to , where

a b (b = a) I I iff

the union along the chain is I

I I

Note that for any , I and need not be symmetric,

a c

ii. for some c, there is a -chain from to and another

(a = b) (b = a) I

since I does not imply . c

from b to , where the unionof the unionsalongthe two

I Definition 4 ( ) Let be the least ternary sat- chains is . isfying

Figure 2 shows the possibilities.

a a a

1. , for every array constant appearing in

a C a

n I I

Proof Let be a -chain from

n1 1

S

(a b) (b a) a b

I I

2. I

I a b I = C

to , with i . Assume is of mini-

in

i 1 i

mal length of all such chains. For every with

Definition 5 ( ) Let be the least ternary relation con-

n 1

I I

, let i be either or , and suppose we

i i

taining and satisfying

a a

i n n

have . It is easy to prove that if

this latter chain is not of one of the forms described in

0 0

b b) a ( c a c c

I I I I

1 i n 1

(i) and (ii), there must be an i with

I i I

such that i is and is . So we have

i1 i

Definition 6 () Let be the binary relation defined by

a a a a a =

i i I i I i I

i . So both and

i i1 i1

a a = I

i I i

i are in . It must be the case that both and

i

a b I a b

iff I I

i are non-empty, since otherwise (subst) would apply to re-

place the left hand side of one of those equations by the right

The context will help distinguish and . Note that

is an . hand side of the other. No rules can apply, since is nor-

I I

i mal. Since both i and are non-empty, (trans) would

Definition 7 (chains) A chain of applications of a ternary be applicable, unless the conditions described in Section 3.3

R

symbol R like or , called an -chain, is defined to for preventing non-termination were keeping it from being

a ) a ) (a R a (a R a =

I i I i I I

be a conjunction of the form applied. This implies that either or

2 1 i1 i

a ) (a R a n 2 a = a a

n n I i I I

, with . i is in , since and must be their

n1 i1 i

own normal forms as determined by the core congruence Since is normal, no rules can apply. So we must have

a a I = a = a

i I I

closure algorithm. Hence, we have i . , since otherwise (subst) would apply with

i1 i

a a a a (a i)

n i I i I I I

So the chain , and read . Furthermore, since (partial-eq) cannot

n1 i1 i 1 C whose union is I , has smaller length than . This contra- apply, it must be the case that the conditions of Section 3.3

dicts the assumption that C is of minimal length of such for preventing non-termination are what is prohibiting its

a 2 a = (a i)

I

chains. application with and read . In particular,

1

(a i)

it must be the case that read is already known to ]]

Now an interpretation, given as a function [[ from the

(a i)

be equal to read . The other possibility, namely that

constant and function symbols of to their interpretations, I

i is known to be equal to an element of , is excluded

]] a

is defined. [[ is defined to every constant symbol I

because i is not in by hypothesis, and correctness of

a [[ ]] of basic type I to itself. will map array constants to the core congruence closure algorithm would require i to

functions. To satisfy extensionality, functions that give the i appear in I in a normal goal if were known to be equal

same value for every input are required to be equal. First

I (a i) (a i)

to an element of . For read and read to have

let C be a new symbol not occurring in , for every - the same normal form with respect to the core congruence

C [[ ]]

a i) = x equivalence . Define read to be the operation of (

closure algorithm, we must have read ;

function application, except that when it is given C , it may this follows from the definition of convenient normal form.

a [[a]]

just return C . Intuitively, for an array constant , will Now the induction hypothesis may be applied to conclude

a i) = x 2 be a function mapping all but a finite number of inputs to a (

that read n .

a

default value C . Formally, suppose is in -equivalence

a b

C [[a]] Proof (of Lemma 3) Suppose I and suppose

class . Define to be the function that returns C for

=

every input, except those assigned values by the following: I . Then by Lemma 2, there is either a -chain

b b a c

from a to or from to , or there is a constant

a c

Definition 8 (interpretation of array constants) such that there is a -chain from to and another

b a c

for every constant symbol of the same type as , from b to . By Lemma 4, in the first case either

(b i) = x (a i) = x

I a b

for every set such that I , read or read , and in the sec-

(c i) = x (c i) = x

i I

for every index constant not appearing in , ond, read read . Since is nor-

x y z (x i) = y (x i) = z

b i) = x x

if read( for some , then mal, for all , , and , read read

y z

a]] [[i]] [[x]]

the value of [[ for input is defined to be . implies , since otherwise (subst) would apply.

x x I = So in either case, . If , then it must be

Notice that the body of Definition 8 may specify the

b (a i) (b i)

the case that a , since read and read are

a]] i [[]] value for [[ on input more than once. So for to be well-

both in ; otherwise, (subst) would apply. But again,

[[a]] i [[x ]]

defined, if the value of on input is specified to be

a i) = x (a i) = y x y 2

read( read implies that .

0

[[x ]] [[x ]] = [[x ]] a b a c

I I

and , we need . So if and

I I [[]] with i not in and not in , then for to be well-defined,

Lemma 5 (correctness of the constructed model) The

(b i) = x (c i) = x

it must be the case that if read read

model constructed in the previous section satisfies every

0

c [[x ]] = [[x ]] a b a

I I , then . Since the conditions , ,

formula of the goal in convenient normal form.

0

c i I i I b i

not in , and not in together imply I I and

I [[]] not in I , the following lemma suffices to prove that Proof Consider the types (I), (II), and (III) of formulas is indeed well-defined. from the list in section 4.1; recall that goals in convenient

normal form consist of formulas of just these types.

[[]] a b i I

I

x y ) = z x

Lemma 3 (well-definedness of ) If , not in , Case I: read( Since is an array constant,

(a i) = x (b i) = x x x

x and read read , then . x

, and so the construction of Definition 8 will assign

x]] [[y ]] [[z ]] the value that function [[ takes on to be .

The proof of this lemma relies on the following sub-

(x y )]] = [[z ]] Hence [[read .

lemma.

= y

Case II: x Since all disequations in are between y index expressions, x and must be index constants. Hence,

Lemma 4 (certain reads equal along chains) Suppose

x]] = x [[y ]] = y x y

[[ and , by construction. If , then the

a a a i a

n I n I

, and are such that for

n1 1

S goal would not be normal, because (ax) would apply. So the

I I I i

j n

some , where is not in . Sup-

j n

= y

interpretation satisfies x .

x (a i) = x

pose there is a constant such that read .

x = y

Case III: I It must be shown that for every

(a i) = x

Then read n .

I ]] [[x]] [[y ]]

index constant not in [[ , and give the same value.

x]] [[y ]] [[ and have the same default value since they are in the

Proof The proof is by inductionon n. The base case is triv-

(a i) = x i ial. For the induction case, suppose read . same -. For those index constants not

(y i) = z in I that appear in a formula of the form read , The procedure for Arr always does this for index terms but

they store the same values, by Definition 8. 2 not always for array terms. If the rules of Figure 3 are added

t to phase 2, however, it can be shown that if t and are ar-

From the fact that a model has been constructed for a ray terms in a normal goal that are entailed to be equal, then

normal goal, the main result now follows.

t t

.

Theorem 2 (completeness) The satisfiability procedure

0 0

c c a = a = b b =

I I I

for Arr is complete. (trans2) I

0

c a = b b =

I I

= I =

5. Complexity analysis where I and

a = b a = b

I I

Observe that each application of (w-elim) or (partial-eq) (patch) i

a = b I

leads to one new subgoal for each element of the indexing i

I I

(a i) = (b i)

set in the rule. The size of is easily seen to be bounded where is read read by the size N of the original goal . So any deduction from

may be viewed as a tree with branching factor no more Figure 3. Rules to propagate entailed equa- N than N . It is not hard to show, in fact, that is an upper tions

bound on the number of branching nodes in the tree, so there

N N lg N

(N ) = O (2 ) are at most O branches. Each branch can be shown to be of polynomial length, so the algorithm 6.2. Propagating properly entailed disjunctions runs in worst-case exponential time.

Theorem 3 (NP-completeness) The problem of testing a Definition 9 (proper entailment of disjunctions) A dis- conjunction of literals for satisfiability in Arr is NP- junction that is entailed when neither of its disjuncts is complete. entailed is said to be properly entailed. Incorporating the procedure into the framework of [2] also

Proof Downey and Sethi showed that a subproblem of requires it to have the following . Let and be the problem decided here is NP-hard [4]. To show that the

equations whose sides appear in goal . If the procedure re-

problem is in NP, observe that the size of the model con-

ports satisfiable, then cannot properly entail . The structed in the previous section for a goal in convenient original procedure for Arr does not have this property; an

normal form is polynomial in the size of . The conver-

a = b a = b (b i) =

ig fj g example is the normal goal f read

sion of a normal goal to convenient normal form incurs at

(b j ) = v i = j a = b v read , which entails but nei-

most a polynomial expansion of the goal. So the size of the

= j a = b ther i nor . It can be proved, however, that the model constructed is polynomial in the size of the normal modified procedure of section 6.1 does have this property. goal. Hence a model can be nondeterministically guessed in polynomial time. Checking whether or not a conjunction 6.3. Allowing constant arrays of literals is satisfied by a model can be done deterministi- cally in polynomial time. So satisfiability of a conjunction Constant arrays are arrays that store a single value for of literals can be checked nondeterministically in polyno- all indices. The language is extended with function sym-

mial time. 2

bols const for each value sort , and the following axiom schema is added:

6. Extensions

x : i : I ( (x) i) = x read const In this section, several extensions to the refutation pro- The procedure of section 6.1 is modified to obtain a pro- cedure for Arr are considered. Due to lack of space, cor- cedure for this extended theory by adding the rules of Fig- rectness proofs are omitted. ure 4. (const-elim1) is added to both phases, and (const- symm) and (const-elim2) are added to phase 2. To ensure 6.1. Propagating all entailed equations that the conclusion of (const-elim2) entails its premise, the simplifying assumption is made that the interpretation of the

Full incorporation of the satisfiability procedure into the type I of indices is infinite. With this modified procedure, framework for cooperating procedures of [2] requires that goals that are normal with respect to phase 2 may fail to be

the procedure can discover all equations between terms oc- normal with respect to phase 1. For example, the applica-

(a i v )) = (b) curring in a satisfiable goal that are entailed by that goal. tions of const in the goal const(write const

are removed using (const-elim2) in phase 2, but this adds [4] P. Downey and R. Sethi. Assignment Commands with Array

a i v ) = b the equation write( to the goal, which could be References. Journal of the ACM, 25(4):652–666, Oct. 1978. analyzed with the (w-elim) rule of phase 1. So it is neces- [5] E. B¨orger, E. Gr¨adel, and Y. Gurevich. The Classical Deci- sary to repeat the phases. sion Problem. Springer, 1997. [6] D. Kaplan. Some Completeness Results in the Mathematical

Theory of Computation. Journal of the ACM, 15(1):124–34,

x]

(const-elim1) [ Jan. 1968.

( (x) i)] [read const [7] J. Levitt. Formal Verification Techniques for Digital Sys- tems. PhD thesis, Stanford University, 1999.

[8] J. McCarthy. Towards a Mathematical Science of Computa-

a = (x)

I const

(const-symm) tion. In IFIP Congress 62, 1962.

(x) = a

const I [9] G. Nelson. Techniques for Program Verification. Technical

(y ) where a is not of the form const Report CSL-81-10, Xerox PARC, June 1981. [10] G. Nelson and D. Oppen. Simplification by cooperating de-

cision procedures. ACM Transactions on Programming Lan-

x = y

(const-elim2) guages and Systems, 1(2):245–57, 1979.

(x) = (y )

const I const [11] S. Owre, J. Rushby, and N. Shankar. PVS: A Prototype Veri- fication System. In D. Kapur, editor, 11th International Con- Figure 4. Rules to treat constant arrays ference on Automated Deduction, volume 607 of Lecture Notes in Artificial Intelligence, pages 748–752. Springer- Verlag, 1992. 7. Conclusion [12] H. Ruess. Private communication. 2000. [13] R. Shostak. Deciding combinations of theories. Journal of the Association for Computing Machinery, 31(1):1–12, A refutation procedure for an extensional theory of 1984. multi-dimensional arrays has been presented and proved [14] A. Stump, D. Dill, J. Giesl, and C. Barrett. On correct. The theory Arr decided essentially subsumes all a Very Simple Abstract Higher-Order Congruence Clo- previously decided array theories. The procedure is suitable sure Algorithm. In preparation, 2000. Available from for incorporation into a framework for cooperating decision http://verify.stanford.edu/˜ stump/. procedures. [15] N. Suzuki and D. Jefferson. Verification of Presburger Array Programs. In Proceedings of a Confer- ence on Theoretical Computer Science, 1977. University of 8. Acknowledgements Waterloo, Waterloo, Ontario, Canada.

We thank the anonymous reviewers for their very help- ful criticism. The first author was supported during part of this work by a National Science Foundation Graduate Fel- lowship. Support was also provided in part by NSF contract CCR-9806889-002 and ARPA/AirForce contract F33615- 00-C-1693. This paper does not necessarily reflect the po- sition or the policy of the U.S. Government; no official en- dorsement should be inferred.

References

[1] L. Bachmair and A. Tiwari. Abstract Congruence Clo- sure and Specializations. In D. McAllester, editor, 17th International Conference on Automated Deduction, volume 1831 of Lecture Notes in Artificial Intelligence, pages 64– 78. Springer-Verlag, 2000. [2] C. Barrett, D. Dill, and A. Stump. A Framework for Co- operating Decision Procedures. In D. McAllester, editor, 17th International Conference on Computer Aided Deduc- tion, volume 1831 of Lecture Notes in Artificial Intelligence, pages 79–97. Springer-Verlag, 2000. [3] G. Collins and D. Syme. A Theory of Finite Maps. In Con- ference on Higher Order Logic Theorem Proving and its Ap- plications, 1995.