Behavioral Verification of Distributed Concurrent Systems with BOBJ
Joseph Goguen Kai Lin Dept. Computer Science & Engineering San Diego Supercomputer Center University of California at San Diego [email protected] [email protected]
Abstract grams [4]), whereas design level verification is easier and more likely to uncover subtle bugs, because it does not re- Following condensed introductions to classical and behav- quire dealing with the arbitrary complexities of program- ioral algebraic specification, this paper discusses the veri- ming language semantics. fication of behavioral properties using BOBJ, especially its These points are illustrated by our proof of the alternat- implementation of conditional circular coinductive rewrit- ing bit protocol in Section 4. There are actually many dif- ing with case analysis. This formal method is then applied ferent ways to specify the alternating bit protocol, some of to proving correctness of the alternating bit protocol, in one which are rather trivial to verify, but our specification with of its less trivial versions. We have tried to minimize mathe- fair lossy channels is not one of them. The proof shows that matics in the exposition, in part by giving concrete illustra- this specification is a behavioral refinement of another be- tions using the BOBJ system. havioral specification having perfect channels, and that the latter is behaviorally equivalent to perfect transmission (we thank Prof. Dorel Lucanu for this interpretation). 1. Introduction Many important contemporary computer systems are distributed and concurrent, and are designed within the ob- Faced with increasingly complex software and hardware ject paradigm. It is a difficult challenge for formal meth- systems, including distributed concurrent systems, where ods to handle all the features involved within a uniform the interactions among components can be very subtle, de- framework. Hidden algebra, which was introduced in [12] velopers are turning more and more to formal methods. and elaborated in [21, 22, 16, 32, 34], is a systematic ap- Such methods use specifications written in mathematical proach to such problems. Hidden algebra allows models logic, and sound proof rules that support refinements, yield- that only satisfy their specifications behaviorally, in that ing rigorous mathematical proofs of significant properties. they appear to exhibit the required behavior under all rel- The goal is not only to increase quality and decrease cost, evant experiments; this is important because many clever but also to allow assertions about reliability that can be implementations used in practice only satisfy their specifi- checked in a precise way. Formal methods can be applied cations in this sense. Hidden algebra extends many sorted throughout a development cycle, or at selected steps, in algebra by distinguishing between “visible” sorts used to which case refinement proofs can insure the correctness of model data, and “hidden” sorts used to model states. This key decisions at those steps. framework provides natural ways to handle the most trou- Formal methods involve specification and verification. bling features of large systems, including concurrency, dis- Formal specifications describe a system and its desired tribution, nondeterminism, as well as the usual features of properties in a formal language, using notation derived from the object paradigm, including classes, subclasses (inheri- the underlying logic. Formal verification uses formal logic tance), and local states with attributes and methods, in addi- to prove that a specification satisfies certain desirable prop- tion to abstract data types, generic modules, and more gen- erties. A proof can be viewed as a test of a specification, erally, the powerful module system of parameterized pro- which can help understand requirements, improve speci- gramming [11]. Hidden algebra generalizes the approaches fications, and detect design errors. A specification with- of process algebra and transition system to include non- out proofs may contain inconsistencies or inappropriate as- monadic parameterized methods and attributes; this extra sumptions. In our opinion, code level verification is a dif- power can sometimes dramatically simplify verification. ficult task that is often not worth the trouble (since code Behavioral equations were introduced by Reichel [31] in level errors are a small percentage of the total errors in pro- 1981, and have since been used by many researchers. Be- havioral logic is a diverse research area, including not just We offer many thanks to Grigore Ros¸u for his work hidden algebra, but also the coherent hidden algebra of Di- on circular coinductive rewriting, and to Kokichi Futatsugi aconescu [8, 7], and the observational logic of Bidoit and for his ongoing support and encouragement. We also thank Hennicker [25, 3]. These approaches fall into two broad cat- Monica Marcus for finding a bug, and Dorel Lucanu for his egories, depending on whether or not a fixed data algebra is helpful remarks. assumed for all models. A new generalization of hidden al- gebra treats these variants in a uniform way [16, 34]. Coal- 2. Algebraic Specification and BOBJ gebra is another related area, in that is also supports behav- ioral specification, and uses coinduction; e.g., see the sur- This section begins with a review of classical algebraic vey paper [26]. specification, including both loose and initial specification, Of course, the proof rules in these logics are sound, and then moves on to behavioral algebraic specification, but they are also incomplete [6], so there cannot be any particularly its foundation in hidden algebra, and its imple- algorithm for proving all true statements. Context induc- mentation in BOBJ. tion [24, 2] and general coinduction [18, 19] are two pop- ular proof techniques for verifying behavioral properties, 2.1. Loose and Data Specification but both need intensive human intervention. Circular coin- duction, introduced in [34], is a powerful method, effec- This subsection introduces the basic concepts, notation and tively implemented by the circular coinductive rewriting al- terminology that we need from classical algebraic specifi-
gorithm, which has automatically proved many behavioral cation; readers may wish to consult this material on an as-
properties [16, 17]. Behavioral equivalence generalizes the needed basis. Given a set , an -sorted set ¡ is a fam-
¤¦¥
notion of bisimilarity used in process algebra, where there ily of sets ¡£¢ , one for each . The elements of are
§¨¡ ¤ ¥
is a very large literature, including proof methods that seem called sorts and the notation ¢ © is used. A signa-
§ ¤"! ¥
to be special cases of coinduction. Howerer, this lies outside ture is an -sorted set ¢© . The el-
the scope of this paper, so we just mention Milner’s very in- ements of ¢ are called operation (or function) symbols
¤"! ¤ $%¥&£'(
#
fluential process algebra CCS [29], and [30], where the no- of arity , sort , and type ; in particular, ¢ $
tion of bisimilarity seems to have originated. is a constant symbol ( )+* denotes the empty string). If has
¤"! $-, ¤
BOBJ [16, 17, 34, 32] is an executable algebraic spec- the type , we write /. , and constants are writ-
0 , ¤ 0£¥12'(
. ification language developed in the Meaning and Compu- ten when ¢ . tation Lab at the University of California, San Diego, for Signatures are given in BOBJ by giving sorts after the supporting behavioral specification and verification, based keywords sort or sorts, and operations after the key- on recent developments in hidden algebra. In addition to words op or ops. The form of an operation follows the op rewriting for order sorted equational logic1, BOBJ also im- keyword, then a colon followed by a list of the sorts for ar- plements order sorted behavioral rewriting and conditional guments to that operation, followed by an arrow, followed circular coinductive rewriting with case analysis (the latter by the value sort of the operation. Underbar characters may abbreviated C4RW). This paper illustrates this method by serve as place holders within the form, to indicate where verifying the Alternating Bit Protocol (abbreviated ABP), the arguments should go; the number of underbars and ar- in one of its less trivial versions; BOBJ seems to be the first gument sorts should be the same, as in the in operation be- system to support automatic coinduction proofs of anything low. If there are no underbars but the argument sort list is like this complexity. Such proofs have only recently become non-empty, as with the insert operation below, the op- possible, due to the implementation of C4RW in BOBJ, and eration is assumed to have syntax that requires opening and a new event mark stream approach to fairness, as discussed closing parentheses, with commas between arguments, as in in Section 4.2 below. CafeOBJ [7] and Spike [2] also sup- insert(2, S). The following is a simple signature for a port behavioral specification and verification, but C4RW is theory of sets in BOBJ notation: only implemented in BOBJ. sorts Elt Set . Section 2 explains some basics of classical and hidden op empty : -> Set . op _in_ : Elt Set -> Bool . algebraic specification, Section 3 discuss the C4RW algo- op insert : Elt Set -> Set . rithm, while Section 4 discusses conclusions and some fu- Note that overloading is possible (and is helpful for read- ture work. We try to keep theory to a minimum, and we ability) in this framework, since the same form can have also describe only features of BOBJ that are necessary for more than one type. For example, the form _in_ could
our examples; much more information about BOBJ can be ! also be an operation on lists, with type #35476"8 35476¨89 +:<;=;?> , found in the thesis of Kai Lin [27]. where Bool is the sort of Booleans, from the builtin mod- ule BOOL, which is imported by default into every other 1 I.e., many sorted with subsorts [20, 23]. module.
2
¦
¡
2' ( ' ( ¤ ¥
¢£¢¥¤
is a ground signature iff ¢ whenever the same sort ; we may write such an equation ab-
¦
¤¨§ ¤ © ¢ ) * SRT G G©
¤ ¤ ¤
¤ and unless . Union is defined com- stractly in the form and concretely in the form
© ©
SR"P G G© §WP
¢ ¢ U 8V © © XU XV
¤ ¤ ¤
ponentwise, by ¢ . A common case when and the sorts of
P G GX©
XU5 8V
is union with a ground signature , where we use the nota- can be inferred from their uses in and in . Simi-
tion £ for . larly, a -conditional equation consists of a ground signa-
¡ £
A -algebra consists of an S-sorted set also denoted ture of variable symbols plus a set of pairs of -terms
Y Y
¡ ¡ * © G G©
/
-+4 4
, plus an interpretation of in , which is a family of ar- * and , each pair of the same sort and ,
Y Y Y Y
+¢ ¢£ ¢ , ¢ ¢ ¢ ) ¡2¢ ¡2¢£ ¡2¢ * ZRT G G© © ©
. .
¤ ¤ ¤
rows for written in the form if .
-
¤ ¤ ¤¨! ¥
each type , which interpret the oper- Hereafter we use the word “equation” for both the condi-
¡ [
ation symbols in as actual operations on . For constant tional and unconditional cases. A specification or theory
' ( , ' ( ¡2¢
. 8\ \
¢
symbols, the interpretation is given by ¢ . is a pair , consisting of a signature and a set of
¢ #$
Usually we write just $ for , but if we need to -equations.
Y
©
¡ $"! ] SR" G G ¤
make the dependence on explicit, we may write . Suppose -equation is ¤ if
Y Y Y
¡2¢ ¡ ¤ © © ¡ ¡
¤
is called the carrier of of sort . Given -algebras and is a -algebra, we say satisfies
¡ # $ , ¡ # ¡ B^] _ ,` ¡
. © .
and , a -homomorphism is an - this equation, written ¤ , iff for any map ,
Y Y
$ , ¡ # $ #$ % % _< * _< © _<)G _<G©
. ¢ ! " /
¤ ¤ -+4 4 ¤
sorted arrow such that if * for , then . Given
$'&($ ")% $ % $/¥ 7 [ ¡ [ ¡ ]
¢ ¢ ¢ ¢ ¢ X\ © © B
¤ ¤
for each and all a specification ¤ , iff for ev-
%+*¥ ¡ $ 0 ] ¥ 00&
/ ¢£, ¢ ! \ ¤.-
for , and such that ¤ for ery .
0£¥¦2'(
\
each constant symbol ¢ . Given a set of -equations , we define the provabil-
1 ¡ a
A -congruence relation on a -algebra is a - ity relation B for -equations is defined by the following
$ , ¤ ¤ ¤
.
indexed equivalence relation such that if rules:
* * * *
2 ¥ ¡£¢ 2 1
3 / , 3
-54 4
a"B SR" G G \
and with for , then 1. ¤
a"B SR" G G© aFB SR" G© G
$ 26 27 1 $ \ \
3 83 ¤
. Given a -congruence rela- 2. If ¤ , then
a ZRT G G© a ZRT G© G© ©
\ B \ B
¤ ¤
¡ ¡:9;1
tion 1 on , the quotient -algebra is a -algebra 3. If and , then
a"B SR" G G© ©
\
¤
:9<1 ¡:9<1 ¢ ¡£¢09=1 ¢ ¤
¡ such that is for any sort and
Y Y Y Y
© © ©
ZR"b G G ¥
\
¤ ¤ ¤
if
*
$ ) 2> * ) 2? ?* ) $ 26 27 * 2 ¥ ¡2¢
, 4. If and
¤ for any and
Y Y
_ ,
. \ B
BFJKNL ¤
*
$ , ¤@ ¤ ¤ ¥¦
and for / . 4
-:4 and .
a ZRT _9)G _9)G©
/ \ B
¤ -c4 4
ACB , then .
Given an -sorted signature , the -sorted set of -
a ZR"b G* G© G*¥+A
\ B /
¤ BCJKML ¢ -d4 4 ,
5. If * and for
£' ( AFB ¢
¢ED
terms is the smallest -sorted set that such that
a SR" $ G G $ , ¤ ¤ ¤
\ B ¨ 1.
and , then ¤
$/¥% = G* ¥HA $ G G ¥
¢ ¢ ¢ B ¢£,
and given and then
$ )G© G©
.
A $ ¥ A
B ¢
B . Notice that is a -algebra by interpreting
' ( $ $ ¥ 7
¢ ¢ ¢
¢ as just , and as the operation sending ] B
\ ©
Theorem 3 Completeness: If is a -equation, ¤
G G $ )G A G
¨ B
to the list . Thus, is called the -term a"Be] I ] iff \ .
algebra. Note that because of overloading, terms do not al-
X\
ways have a unique parse. The following is the key property The loose semantics of a specification is the class of this algebra: of all -algebras that satisfy the equations in \ . Example 1 A Simple Loose Theory In BOBJ, loose spec- Theorem 1 Given a signature with no overloaded
ifications are given in modules delimited by the keywords constants and a -algebra A, there is a unique -
th and end, with sorts and operations as before, with vari-
ACB ¡ I homomorphism . . ables following the keywords var or vars, and equations The proofs of this and other theorems in this section are following the keyword eq. The following is a loose theory
omitted; they can be found in [28, 9], among other places. for monoids, which are sets with a binary operation (here in-
Given a signature and a ground signature disjoint
dicated by juxtaposition) that is associative and has an iden-
£) A from , we can form the -algebra BFJKML and then tity (which is here denoted e):
view it as a -algebra by forgetting the names of the new
th MONOID is sort Elt .
A constants in X; let us denote this -algebra by BFJKNL . It has op e : -> Elt . the following universal freeness property: op _ _ : Elt Elt -> Elt .
vars E E’ E’’ : Elt .
¡ 22,N ¡ Theorem 2 Given a -algebra and . , there is eq e E = E .
eq E e = E .
22,OA ¡ 2 .
a unique -homomorphism BFJKML extending , in
eq E (E’ E’’) = (E E’) E’’ .
2 )P 2 )P P ¥Q ¤ ¥
¢ ¢ ¢
the sense that ¤ for each and ; end
2 I
sometimes we will write just 2 instead of . Instead of writing out the associative and identity laws ex-
A -equation consists of a ground signature of vari- plicitly, we can also give them as what are called attributes
£ able symbols (disjoint from ) plus two -terms of in BOBJ, in the following manner:
3
©
ZRT G G© © a
\ \
th MONOID is sort Elt . such that if ¤ is an equation in , then
op e : -> Elt .
SR 5G )G© ¤ ¥ ¢
¤ ¢£L ¤ where J for any sort
op _ _ : Elt Elt -> Elt [assoc id: e].
,MA A
.
BCJKML B'¢)J
end and KML is the -homomorphism induced
2, A AE© by ; we may write . . Note that BOBJ does not The assoc attribute actually does more than the associa- check semantic correctness of views, but only their syntax; tive equation: it enables parsing and pattern matching mod- therefore users must check the semantics. ulo associativity; similarly, the attribute id: enables pat-
tern matching modulo identity (see Section 2.2). I Example 3 A Simple View
X\ Given a specification , a natural congruence rela- view V from MONOID to PEANO is
sort Elt to Nat .
¢¡ a G£ ¤¡.G©
tion can be defined directly from B by iff ¦
op (_ _ ) to (_+_) .
a ZR G G©
\ B ¤ , and we have the following important re- end
sult:
The BOBJ syntax for views is straightforward, except that
X\ Theorem 4 Given a specification ¤ , for
when items are omitted, BOBJ attempts to figure out those
¡ ¡ © any -algebra with ¤ , there exists a unique
missing items; the resulting views are called default views;
¡
ACB;9¥ ¡ I -homomorphism from to .
see [23] for details. I
A 9¦ ¤¡ This property of B is called initiality. A useful char- A parameterized specification or parameterized theory
acterization of initiality is the following:
)A A A;
is a pair of specifications such that is included
¡
\
A A¥
Theorem 5 Given a set of of -equations, a -algebra in A ; we call the parameter or interface theory and
A B ¡
.
A A
is initial iff it has no junk (the -homomorphism the body. In Example 4 below, is ELT and is SET. In-
)A A [
¨
is surjective) and no confusion (it satisfies only the equa- stantiation of with an actual parameter requires a
I
\
A [ . tions that can be deduced from ). view to describe the binding of actual to formal pa-
rameters; often a default view can be used. Following ideas
8\ The initial semantics of a specification is the class of its initial algebras. It can be shown that all the initial al- developed for the Clear specification language [5, 13], the instantiation is given by a colimit construction. Although gebras of a specification are -isomorphic. By Theorem 4,
not needed in this paper, by exploiting the power of colim-
¡
A"B 9§ 8\ is an initial algebra of . Because any ele-
its, the “parameterized programming” module system used
¡ ¦ ment in AFB;9 can be generated by operations, induction is valid for proving properties of initial algebras. Generally, in BOBJ (and other algebraic specification languages) goes more than one induction scheme is valid for a given specifi- well beyond that of standard programming languages, and cation. in fact, the parameterized modules of Clear and earlier ver- sions of OBJ strongly influenced the module systems of Example 2 The Peano Numbers Below is a simple ini- Ada, ML, and C++; see [11, 23] for details. tial theory of natural numbers in the style of Peano, with addition. Initial theories in BOBJ are delimited by the key- Example 4 A Parameterized Initial Theory of Sets The words dth (the “d” is for “data”) and end. initial theory SET allows us to form sets of elements from any collection that has an equality relation defined on it sat- dth PEANO is sort Nat . op 0 : -> Nat . isfying the law of identity, given in its interface theory ELT. op s_ : Nat -> Nat . Parameterization of a module M by an interface I is indi- op _+_ : Nat Nat -> Nat . cated with the notation M[X :: I], where X is the formal vars M N : Nat . eq M + 0 = M . parameter of the parameterized module. eq M + s N = s(M + N). end th ELT is sort Elt . op eq : Elt Elt -> Bool . The first two operations are constructors, and one can do in- var E : Elt . eq eq(E, E) = true . duction over them to prove properties of addition, in the end usual way. (These numbers differ from those provided in the builtin module NAT, which use Java numbers and pro- dth SET[X :: ELT] is sort Set . op empty : -> Set .
vide many operations beyond addition.) I op _in_ : Elt Set -> Bool .
op insert : Elt Set -> Set .
©
©
Given signatures with sorts , then a signature vars E1 E2 : Elt . var S : Set .
¨ ¨
©
, ©
. © . ©
morphism is a pair where , and eq E1 in empty = false .
2 eq E1 in insert(E2, S) =
©
,
© ¢ ¢ .
J L ¢£L is an -indexed function J .
eq(E1, E2) or E1 in S .
A
8\
A view, or theory morphism, from a theory ¤ to eq insert(E1, S) = S if E1 in S .
A © N© © ,
X\ . a theory ¤ is a signature morphism end
4
G G© ) ) G * * G
. ¦
We can tell BOBJ to instantiate SET with the builtin mod- ¦ ; we may write for the normal form of un-
ule INT of integers, and call the result INTSET using a de- der ¥ . A TRS is canonical iff it is confluent and terminat-
fault view, with the following: ing. It can be shown that in a canonical TRS, every -term dth INTSET is has a unique normal form, called its canonical form. A sur- pr SET[INT] . vey of basic term rewriting for the one sorted case is given end in [1]. BOBJ’s term rewriting capability provides an opera- Note that this uses a default view from ELT to NAT, and the tional semantics for modules, by viewing equations as pr (for “protecting”) indicates a module importation. I rewrite rules, i.e., by applying equations in the forward di- Two additional features from parameterized program- rection. Term rewriting for initial and loose theories is in- ming that will be used in our main example are renaming voked with the command red, followed by a term (and a and sums of modules. The first allows selected sorts and period). For example, operations to be renamed within a module; this can be very select INTSET . helpful when reusing modules in new contexts. The sum red 3 in insert(1,insert(2,insert(3,empty))).
just combines two or more modules, taking proper account
§
? constructs the set - and then tests whether 3 is in it, of any shared submodules that may have arisen through im- in the context of the module INTSET, which is made the portation. The syntax of these features is illustrated in the module currently in focus by using the select command. following: Here is the output produced by the above (slightly reformat- dth PEANO+INT is ted to fit within the two column format of this paper): pr PEANO *(sort Nat to Peano, op 0 to zero) + NAT . reduce in INTSET : 3 in insert(1, insert(2, end insert(3, empty))) result Bool: true Here the sort and constant of PEANO are renamed to avoid rewrite time: 165ms parse time: 4ms conflict with those of NAT, and are then combined. Now the If some operations have attributes for associativity, com- BOBJ parser will be able to determine whether the sort of mutativity, or identity, then rewriting is done module those any given term is Peano or Nat, even though the opera- equations; we do not go into the details here, because this tion _+_ is still overloaded. feature is not needed in our ABP example, but the details can be found in [23, 1] and many other places. 2.2. Term Rewriting The builtin BOBJ module TRUTH, which is included in
BOOL and is therefore by default imported into every other
b Given a signature and ground signatures of vari- module, provides a polymorphic binary operation denoted
== _
able symbols (disjoint from ), a substitution is a -sorted which compares the normal forms of its two arguments.
5
§ _¢ , ¢ AFB ¢¨ b
set . . By Theorem 2, every such For example,
_ _ , A B)
extends uniquely to a -homomorphism . red insert(3,insert(3,empty))
== insert(3,empty) .
A"B b G ¥ AFB ¢¨ _<)G _=¢¨)G
. For any term , let ¤ . Given
¥ AFB ¢¨) G ¥ AFB ¢¨ b a term and a term , we say
returns true, since the two canonical forms are identi-
_ _<¡ matches G if there exists a substitution such that is cal; otherwise it returns false. If the TRS is canonical,
syntactically the same as G . then true is returned iff the two terms are provably equal,
Given a signature and a ground signature of vari- but if the TRS is non-terminating, reduction may go into
able symbols (disjoint from ), a -rewrite rule is a pair an infinite loop, and if the TRS is not confluent, reduc-
¢ ¢
¤£ £
of terms, written . , such that and have the same tion could return false when the terms are nonetheless
¢ sort and all variables in £ also appear in . A -rewrite sys- provably equal. The negation of ==, denoted =/=, is also a
tem, or term rewriting system, abbreviated TRS, ¥ is a set of BOBJ builtin polymorphic operation.
G G© ¥ -rewrite rules. A term rewrites to a term using , writ-
Example 5 Substitution It may be interesting to see an
G G© G G©
§¦ . ten . or just , iff there exists a rewrite rule
example of substitution in BOBJ, using the open and
¢ ¥ _ G ¨£
. in and a substitution such that has a subterm
close feature, which allows material to be temporarily
<©¢ G© G _<©¢
_ and can be obtained from by replacing with
added to the module currently in focus. Note that a period
_< _9©¢ £ ; the term is called the redex of the rewrite. Let
is required after open, but is forbidden after close.
¥
. . ¦
¦ be the reflexive and transitive closure of . is con-
G G G G G©
.
. open . ¦
fluent iff ¦ and , then there exists a term
ops x y z : -> Nat .
G© G© G G ¥
. 2. ¦
such that ¦ and . is terminating iff there
eq x = 1 .
G G8
. . ¦
is no infinite rewriting ¦ . A normal form eq y = 2 .
¥ G© G© of G under is a term such that cannot be written and eq z = 3 .
5
\ red 3 in insert(x,insert(y,insert(z,empty))) . den signature, and © is a hidden subsignature of , and is
close a (finite) set of -equations. The operations in © are called Here the ops keyword allows a number of operations with behavioral. the same type to be introduced together; here they serve as The definition hidden algebra given above allows a loose variables for the term to be reduced, while the three equa- interpretation for the data algebra, following the general ap-
tions define a substitution. I proach of [16, 34]. However, this is not appropriate for our
ABP example, since if true and false become identi-
¢ ¨£ A conditional rewrite rule is a rewrite rule . with
fied in the Boolean subalgebra, the protocol cannot work
G G© G G©
¤ ¤
¡ a condition such that all vari- correctly. This may be remedied by requiring every hidden ables in the condition also appear in ¢ , and it can be de-
algebra over a given signature to have a fixed data algebra,
¢ GX G© G G©
. ££¢ ¤
¤ ¤ noted as . A conditional as in the original version of hidden algebra, or alternatively, term rewriting system ¥ is a set of conditional or uncondi- by allowing so called data constraints, in the sense of [13], tional rewrite rules. A meaningful conditional TRS should as additional sentences. Note that general results proved for
contain at least one unconditional rewrite rule. The rewrite the loose data approach will also apply to fixed data alge-
¥ G §¦
relation . induced by is defined as follows: a term bras, so there is no loss of generality in proceeding in this
© ¥ rewrites to a term G using iff there exists a rewrite rule
way.
¢ GX G© G G© ¥ _
. £-4¦¥
¤ ¤
in and a substitution such
) I *
© ©
Given a hidden signature , a -context denoted
_<©¢ GX© G
that G has a subterm , and can be obtained from by re-
¤ A § I
for sort is a © -term in having exactly one
_<©¢ _9 ) ) G*#* * ) ) G© * *
/ £
¦ ¦
¤ -=4 4
placing with , and * for .
¤
special variable I of the sort , where is an infinite set of .
Since this is a recursive definition, we take ¦ to be the
I ) I * special variables different from . If is a © -context
least binary relation satisfying the definition.
¤ G ¥/ )G7*
of sort and ¢ , let denote the result of substi-
G I )0I * ¤ tuting for . A © -context for hidden sort is called 2.3. Hidden Logic
© -experiment if its sort is visible.
¡ If © is a subsignature of a hidden signature and is
Behavioral specifications characterize how systems behave
1 ¡
-algebra and is an equivalence on , then an opera-
in response to relevant experiments, rather than how they
$ ¢8 ¢ ¢ 1 $ ! 26 2? 1
tion in is congruent for iff
are implemented. It distinguishes visible from hidden sorts,
27© $ 27© 2>* 1 27©
/ !
-¨4 4
* whenever for . A hidden
with equality being strict on visible sorts and behavioral
¡ ¡ © -congruence on is an equivalence relation on that is on hidden sorts, in the sense of indistinguishability under
congruent for each operation in © and is the identity on vis-
experiments. More concretely, a hidden sort is treated as
©
ible sorts. The -congruence B , called behavioral equiv- a black box, whose status can only be observed and up-
alence, on ¡ is defined as follows: two data elements are dated by the operations defined on it. Behavioral specifica- equivalent iff they are equal, and two states are equivalent tion uses observations to specify the behavior of operations
iff they cannot be distinguished by © -experiments,i.e., iff without looking inside the black box. Behavioral specifica- any experiment produces the same value when applied to tion provides a more natural and flexible abstraction layer them. The following is a basic result of hidden algebra:
for system design. Behavioral satisfaction is more general
than strict satisfaction, so that behavioral specifications im- Theorem 6 Given a hidden subsignature © of and a -
¡ ¡ I pose fewer constraints on the semantics of modules. As a © algebra , B is the largest hidden -congruence on .
result, not all the inference rules of the ordinary equational
¡
reasoning are valid for behavioral modules, and special care An operation $ is -behaviorally congruent for (or
must be taken in proving behavioral properties; however, a simply congruent) iff it is congruent for B . A hidden
¡ ] ©
-algebra -behaviorally satisfies a -equation ¤
small modification of equational deduction works for be-
Y Y Y Y
SR" G G© © © ¡ ]
©
¤ ¤ ¤
if , written B ,
havioral specifications.
Y Y
*
_ , ¡ _9 _< ©
. *
iff for any mapping , if B for
A hidden signature is a signature with its sorts par-
©
_<)G _<)G
/ \
-c4 4
, then B . If is a set of -equations, ¨
titioned into visible sorts § and hidden sorts . Opera-
¡ ¡ ] ] ¥ ¡
© \ © \ B
then B iff for any . We say behav-
tions in with one hidden argument and a visible result
© 8\
are called attributes, and those with one hidden argument iorally satisfies a behavioral specification ¤ iff
¡ ¡ ] ¡
© \ © \%© © \
B B B
and a hidden result are called methods. Constants of hid- B ; we write . Define iff
¡ ] ¡ ]
© © B implies B for every algebra . Define iff
den sort are called hidden constants. A hidden -algebra is
]
\%©
B .
just a -algebra. The elements of visible sort in a hidden
© X\ Given a behavioral specification ¤ , the
-algebra represent data, and those of hidden sorts repre-
sent states; the subalgebra of visible sorts and operations of provability relation B for -equations is defined by the
visible type is called the data algebra. A behavioral spec- following rules:
ZRT G G
© X\ \ ¤ ification or theory is a triple where is a hid- 1. Reflexivity: B .
6
ZRT GX G
\ \
¤ B
2. Symmetry: If B , then called the constructors, generators, or basis (when induc-
SR" G GX
¤ . tion is involved). A general definition of cobasis is intro-
duced in [33], and a simplified version can be given as fol-
ZRT GX G
\ \
¤ B
3. Transitivity: If B and
©
lows: a cobasis is a subset of operations in that gener-
SR" G G¡ ZRT GX G¡
\
¤ ¤
, then B .
ates enough experiments, in the sense that no other exper-
Y Y Y Y
SRFb G G© © ©
¤ ¤ ¤
4. Substitution: If if iment can distinguish any two states that cannot be distin-
Y
*
_ , b AFB ZRT _<
\ . \ ¤
in and and B guished by these experiments.
Y
_< © ZRT _<G _<)G©
\ /
¤ -c4 4 * for , then B . Behavioral rewriting [7] is to behavioral deduction 5. Congruence: what standard rewriting is to standard equational deduc-
tion, a simple but useful proof method. Circular coinductive
¤
ZRT G G© G G© ¥ A"B£¢7K
\ ¤
(a) If B where and rewriting proves behavioral equalities by combining be-
*¦¥ *¨§
¥ § GX G G G ¥AFB£¢7K
, and , then havioral rewriting with circular coinduction [16]; it also
*¦¥ *©§
ZRT $ )GX G G G G
\ ¤
B strengthens the duality with induction by allowing coinduc-
* ¥ *©§
$ GX G G© G G
. tive hypotheses to be used in proofs.
SRT G* G©
\ /
¤ - 4 4 *
(b) If B for Based on the notion of cobasis, a more powerful proof
$ \
and is congruent operation in , then B method called circular coinduction is introduced in [34]. A
ZRT $ )G G $ G© G©
¤
. enriched behavioral deduction system can be got by adding
the following rule: Suppose is a cobasis of a behavioral
© ©
SR" G G SR" G G
\
¤ ¤
Define iff B . These
© X\ specification ¤ and is a well founded partial rules specialize those of ordinary equational deduction by
order on © -contexts which is preserved by the operations in
considering all sorts visible. Note that (5b) only applies to
G8 G AFB) ¥
© . For any terms and in , if for any and
congruent operations. If all operations are congruent, then
ZRT SR ?)GX
¤
for appropriate variables , B
ordinary equational deduction is sound for behavioral satis-
0) _<)GX * SR" ZR ?G 0) _<)G *
¤
and B and
faction. The following expresses soundness with respect to Y
0 SR" ZR ?GX
¤
, or else B and
Y
both equational and behavioral satisfaction, generalizing a Y
ZRT SR ?)G
© ¤
B for some -term , then
result in [6,7] that equational deduction is sound when all
ZRT G G
¤
B .
operations are congruent.
SR" G G© ZRT G G©
©
¤ ¤
Theorem 7 If , then B 2.4. Behavioral Specification
ZRT G G© I
\ © ¤ and also ¤ . This section describes behavioral modules in BOBJ. Their General coinduction [18, 19, 32] can be used to prove that
denotational semantics is the class of all algebras (i.e., im-
ZRT G G© a -equation ¤ is behaviorally satisfied by a be- plementations) that behaviorally satisfy specifications, and havioral specification by the following steps:
their operational semantics is given by behavioral rewriting.
¥ Define a binary relation ¥ on terms ( is called the Behavioral modules in BOBJ are defined between the key-
candidate relation). words bth and end. Sorts in behavioral modules are con-
Show that ¥ is a hidden -congruence. sidered hidden unless declared with the keyword dsort,
¥ G© Prove that G . for visible sorts in behavioral modules. Similarly, opera- The soundness of general coinduction follows directly from tions in behavioral modules are considered congruent un- Theorem 6. The major problem with this coinduction is that less given the attribute ncong. it requires human intervention to define an appropriate can- didate relation. Example 6 Behavioral Theory of Sets The following is a Context induction [24] can also be used to prove behav- behavioral specification for sets: ioral properties, using well-founded induction on the con- bth BSET[X :: ELT] is sort Set . text structure to show that it is valid for all experiments. But op empty : -> Set . in many real examples, context induction is not trivial and op _in_ : Elt Set -> Bool . op insert : Elt Set -> Set . requires extensive human input, for example, in the form of vars E1 E2 : Elt . var S : Set . inductive lemmas that can be difficult to discover and diffi- eq E1 in empty = false . cult to prove [10]. eq E1 in insert(E2, S) = eq(E1, E2) or E1 in S . It often happens that some experiments are unnecessary end in a context induction, because the equations imply that some experiments are equivalent to others. A similar but The first equation describes the observational result on dual situation occurs in abstract data type theory when all empty via _in_, and the next two equations give the ob- the elements can be generated from a subset of operations, servation result on insert via _in_.
7 Comparing this behavioral theory with the initial theory and S are behaviorally equivalent. However, with ordi- for sets given in Example 4, the most important difference nary rewriting, push(pop(push(S))) is reduced to
is that this theory does not have the equation insert(E1, push(S). I S) = S if E1 in S . We will later show that this equation Behavioral rewriting is invoked with the command red, can be proved as a behavioral property of this specifica-
which handles non-congruent operations properly. A term
tion (see Example 15). Although the other equations look
)_< ¢ * )_< * )0I * behaviorally rewrites to £ , where is
the same, they are methodologically different. Data theo-
¢ £ a context and . is a rewrite rule, iff one of the following ries are usually designed with respect to some constructors, is satisfied: but behavioral theories are designed with respect to obser-
vations. For example, empty and insert are “construc- 1. The redex does not have a non-trivial context. I 2. All operations from the top of down to are con- tors” of the data theory SET, i.e., all ground sets can be cre- gruent. ated with them; and then all other operations can be defined
3. The context of the redex has a subcontext such that
based on the terms generated by these constructors. In fact, I all the operations from the top of to are congruent the operation _in_ is defined in this style. I
and has a visible sort. 2.4.1. Methodology To design a behavioral theory, some For example, push(pop(push(S))) cannot be reduced
operations are selected as a cobasis to generate the behav-
¡£¢ I ¥¤ to push(S), because the context 6 doesn’t satisfy ioral equivalence relation, and other operations are defined the conditions above. with respect to these basic observers. For example, in the behavioral theory BSET, the operation _in_ is selected as Example 8 Behavioral Theory of Streams The behav- a unique observer in a cobasis, which means that, two sets ioral specifications STREAM and ZIP below are much used are behaviorally equivalent iff they always return the same in this section. The first declares infinite streams parameter- visible results under the observation of _in_, i.e., iff they ized by the “trivial” interface theory TRIV, which only re- have the same elements. quires that some sort be designated. th TRIV is sort Elt . end
2.5. Behavioral Rewriting bth STREAM[X :: TRIV] is sort Stream . op head_ : Stream -> Elt . BOBJ provides an operational semantics for behavioral op tail_ : Stream -> Stream . op _&_ : Elt Stream -> Stream . modules, again by applying equations as rewrite rules. But var E : Elt . var S : Stream . because of non-congruent operations, ordinary rewriting is eq head(E & S) = E. not in general sound, as illustrated by the following exam- eq tail(E & S) = S . ple of a behavioral theory with a non-congruent operation. end The operation _&_ inserts an element into the head of a Example 7 Nondeterministic Stacks This behavioral the- stream, and head and tail respectively return the first el- ory illustrates one way that nondeterminism can arise in hid- ement, and the stream after removing its first element. den algebra specifications, on a variant of stacks: The next specification adds an operation which “zips” bth NDSTACK is sort Stack . two streams together by taking elements from them alter- protecting NAT . op push _ : Stack -> Stack [ncong] . nately: op top _ : Stack -> Nat . bth ZIP[X :: TRIV] is pr STREAM[X] . op pop _ : Stack -> Stack . op zip : Stream Stream -> Stream . var S : Stack . vars S S’ : Stream . eq pop push S = S . eq head zip(S,S’) = head S . end eq tail zip(S,S’) = zip(S’, tail S) . end The operation push places a nondeterministically chosen natural number on the top of a stack. Notice that even for The picture below shows the application of zip to two in- behaviorally equivalent stacks S1 and S2, push(S1) and put streams:
push(S2) may insert different natural numbers onto S1
2 2 2
and S2; therefore push(S1) and push(S2) may be dis- RRRR RRRR top push R)
tinguishable by the attribute , so that should be ) ¦¨§ © /
2 2 2
3 30 5 / 3 declared non-congruent. The only equation in this specifica- lll5 01237654 lll
tion says that a stack is not behaviorally changed by push- ll
3 3 ing a new element and then popping it. 3 Notice that push(pop(push(S))) == push(S) The command red does behavioral rewriting when in is not behaviorally satisfied, although pop(push(S)) the context of a behavioral theory. For example,
8 open ZIP[NAT] . as APP[X::BSET[LIST]]. However, the corresponding ops ones twos : -> Stream . property
vars S S’ : Stream .
vars N M : Nat .
ZR £, ¤£ ZR¥, ¥
8 35476"8 ; 56 ; 6
eq head ones = 1 . ¤
¥
6 eq tail ones = ones . ; eq head twos = 2 .
eq tail twos = twos . is not behaviorally satisfied by LIST, because the exper-
red head tail tail zip(ones, twos).
¨©£5 £ I
8 4 > close iment ¤ can usually distinguish cons(N, cons(N,S)) and cons(N,S). This means the view I BSET-TO-LIST should not be used to instantiate APP above, and shows the inadequacy of the definition for be- 2.6. Behavioral Views
havioral theory morphism suggested above. I
* * * *
© X\
¤ -
Behavioral parameterized theories can use any kind of the- Given behavioral theories ¤ for , *
ory as their interfaces, but the interfaces of non-behavioral let the set of visible sorts and the set of hidden sorts in
* * ¨
theories must not be behavioral theories, i.e., behavioral the- be § and , respectively. Then a behavioral view from
£,
ories are only allowed as interfaces for other (parameter- to is a signature morphism . such
5 ¤ ¥ § ¤ ¥ §"
ized) behavioral theories. that: (1) for any sort ; and (2) for
SRT G G© SR" G G©
© ¤
For behavioral theories, we might require a view to any equation ¤ , if , then
SR 5G )G© ¢
©
¤
¤ ¢£L<¤
send visible sorts to visible sorts, and to satisfy the fol- where J for any sort
©
A ZRT G G ¤ ¥/ ,=A
\: .
¤ BT8JKNL
B J
lowing condition: For any equation in , and KML is the homomorphism
SR )G 5G©
¤ is behaviorally satisfied by the target induced by . module. Unfortunately this definition does not work in gen- Notice that this definition of behavior views requires ver- eral, as shown by the following: ifying all behavioral properties of the source module, which is impossible in practice. It is sufficient to define a signature
Example 9 Views of Behavioral Theories A behavioral
morphism from to such that (1) all translated equa-
theory for lists of natural numbers is below:
tions of are behaviorally satisfied by ; and (2) the im-
bth LIST is sort List .
age of a cobasis of under is a cobasis of . This is
pr NAT .
op nil : -> List . because it then follows that any behavioral property of op cons : Nat List -> List . is also behaviorally satisfied by . In practice, the condi- op _in_ : Nat List -> Bool . tion (2) above can be satisfied by making some operations op head_ : List -> Nat . op tail_ : List -> List . non-congruent. vars N M : Nat . vars L L1 L2 : List . eq N in nil = false . Example 10 Views between Behavioral Theories (Cont.) eq N in cons(M, L) = N == M or N in L . A different behavioral theory for lists of natural numbers is eq head cons(N,L) = N . the following: eq tail cons(N,L) = L . end bth LISTNC is sort List . pr NAT . Now we define a view from BSET[NAT] to LIST, with op nil : -> List . BSET from Example 6, as follows: op cons : Nat List -> List . op _in_ : Nat List -> Bool . view BSET-TO-LIST from BSET[NAT] to LIST is op head_ : List -> Nat [ncong] . sort Set to List . op tail_ : List -> List [ncong] . op empty to nil . vars N M : Nat . vars L L1 L2 : List . op (_in_) to (_in_) . eq N in nil = false . op insert to cons . eq N in cons(M, L) = N == M or N in L . end eq head(cons(N,L)) = N . eq tail(cons(N,L)) = L . Then the translations of equations in the module end BSET[NAT] by the view above are all behaviorally sat- The only difference between LISTNC and LIST is that LIST isfied by , and it is also straightforward to prove the head and tail are declared non-congruent. Now we de- BSET
following behavioral property in : fine the following view:
SR¡ 2,¢ ¤£ SR¦¥ ,§¥©¨ ¨ ¨ ¥
8 8 4 6 8 4 56 8 ¤ view BSET-TO-LISTNC from BSET[NAT] to LISTNC is
sort Set to List .
¨ ¥
6 8 4 op empty to nil . op (_in_) to (_in_) . This property and similar properties might have been used op insert to cons . in designing other parameterized behavioral theories, such end
9 Because the cobasis {_in_} of BSET[NAT] is mapped to The connection operation <_,_> is a congruent operation the cobasis {_in_} of LISTNC and all the equations trans- with two hidden arguments, but the untupling operations lated from the equations in BSET[NAT] are behaviorally are non-congruent, because two pairs may be behaviorally satisfied by LISTNC, we see that BSET-TO-LISTNC is equivalent, while their corresponding components are not
indeed a view. I behaviorally equivalent. BOBJ also has builtin polymorphic tupling for visible sorts, but we do not discuss this here be- The definition of behavioral views allows the source to cause it is not needed for our ABP example. be a behavioral, initial, or loose module; hidden sorts may map to visible sorts, but it does not allow views from initial or loose modules to behavioral modules, since these must 3. Verification of Behavioral Properties map visible sorts to hidden sorts; for example, the instan- tiation STREAM[LIST[NAT]] is not correct, because the Behavioral rewriting can prove simple behavioral proper- interface TRIV of STREAM has a unique visible sort TRIV ties, but more powerful methods are needed to verify more whereas the principal sort List of LIST[NAT] is hidden, difficult behavioral properties. Unlike general coinduction and can not be used to replace a visible sort. [19] and context induction [2], conditional circular coinduc- tive rewriting with case analysis provides a powerful way to prove behavioral properties, with intensive human interven- 2.7. Concurrent Connection tion. This section describes and illustrates the use of circu- The concurrent connection operation is important for con- lar coinductive rewriting in BOBJ. structing distributed concurrent systems from compo- nents. BOBJ provides a binary associative infix operator 3.1. Cobasis Discovery and Declaration || that takes two (or more) modules as arguments, cre- ating a new hidden sort that is the tupling of the prin- Cobases are important for behavioral specification; they are cipal sorts of the component modules. More concretely, not only used in designing behavioral specifications and in a concurrent connection yields a behavioral specifica- defining views, but also in the verification of behavioral tion whose states are tuples of the states of its compo- properties. BOBJ has an algorithm that computes default nents, adding a new sort called Tuple, a tupling operation cobases for behavioral specifications based on the congru- < , ,..., > : S1 S2 ... Sn -> Tuple and pro- ence criteria of [34]. This algorithm first takes all operations jection operations i* : Tuple -> Si where i ranges in © as a cobasis, and then removes operations that it finds from 1 to the number of modules connected and Si is the to be redundant. The current version uses behavioral rewrit- ing to check behavioral equivalence, which is stricter than principal sort of the -th component, plus the following “tu- pling equation” the congruence criterion in [34], which requires behavioral equivalence. The computed default cobasis can be displayed eq < 1*(T), 2*(T),..., n*(T) > = T by the command “show cobasis
a term from concurrency theory, methods in component , The algorithm starts with the cobasis containing the op-
§ for ¤ , can be proved to hold behaviorally. erations empty, insert and _in_. Because of the
The code below is equivalent to what BOBJ provides for equations E in empty = false and E1 in in-
/ the case of ¤ modules. The sort Tuple is always hid- sert(E2,S)) = E1 == E2 or E1 in S, the op- den, even when the module is instantiated with component erations empty and insert are removed. I where all sorts are visible. Example 12 Default Cobases of STREAM and ZIP Let bth 2||[1 2 :: TRIV] is sort Tuple . op <_,_ > : Elt.1 Elt.2 -> Tuple . STREAM and ZIP be the behavioral theories defined in Ex- op 1*_ : Tuple -> Elt.1 [ncong]. ample 8. BOBJ’s cobasis algorithm discovers that & and op 2*_ : Tuple -> Elt.2 [ncong]. zip are not needed for STREAM and ZIP. Thus the com- var E1 : Elt.1 . var E2 : Elt.2 . var T : Tuple . puted default cobasis for sort Stream is: eq 1*
end I
10 Given a behavioral theory, its default cobasis might not
C4RW, is presented in Figure 1; its input is a behavioral
This declares a cobasis for the behavioral theory with the
© X\ © ¤ specification with a cobasis D , and a
given name, where the body of the declaration gives a list of
G G© pair of -terms . In Figure 1, denotes the goal equa- congruent operations in the behavioral theory. For example, tions, which must have visible sorted conditions; elements
if LIST is the behavioral theory defined in Example 9, then
¡ ¢¤£ )G
¤¦¥ § of the set are called circularities; denotes the
we can declare a cobasis of LIST with:
¡ normal form of G under behavioral rewriting with and ,
cobasis SIMPLE-COBASIS-OF-LIST from LIST is where ¡ can only be applied at the top level; and for a sig-
op head : List -> Nat .
©¨
op tail : List -> List . nature, a -case definition is a pair where is a - end term called the pattern, and ¨ is a list of sets of -equations,
where each such set is called a case. The variables in get
This is correct because two lists are behaviorally equivalent
bound to terms when the case definition is used. Also, \ , iff the results of observing them by head and tail are be- where is the condition of an equation, indicates the trans- haviorally equivalent. Note that this cobasis is smaller than lation from a Boolean expression to a set of equations. Fi- the default cobasis of LIST, since it doesn’t contain the op- nally, note that the equality sign plays three different roles:
eration _in_. It is straightforward to see that if two lists ¡ for equations in and , it is a symbol that separates the left cannot be distinguished by any experiment with head and and right sides; for let (i.e., assignment) statements, it sep- tail, then they also can not be distinguished by any ex- arates the variable from the term assigned to that variable; periment with head, tail and nil. and otherwise, e.g., in if statements, it denotes the syntac- Another way to declare a cobasis for a module is to use tic identity relation on terms. the command Example 13 A Simple Case Analysis We first define a set cobasis of
3.2. The C4RW Algorithm
of - for any natural number N. The following is a Circular coinductive rewriting proves behavioral equalities case analysis named ODD-EVEN for this module: by integrating behavioral rewriting [7] with circular coin- cases ODD-EVEN for APP is var N : Nat . duction [16, 17]. This section presents this algorithm, and context sum(N) . gives examples showing that it is quite powerful in practice. case eq odd(N) = true . It takes as input a pair of terms, and returns true whenever it case eq even(N) = true . can prove the terms behaviorally equivalent, and otherwise end returns false or else fails to terminate, much as with prov- A case analysis must be associated with a previously de- ing term equality by rewriting. Here we describe the BOBJ fined module, and is valid only for that module. In the ex- implementation; its correctness is shown in [34]. ample above, ODD-EVEN is associated with APP and it can Case analyses are first class citizens, that can be named, be used only for APP. For any case analysis, a unique con- reused, and combined with other case analyses. The follow- text must be defined. A context is a term preceded by the ing is (part of) the syntax, though it is probably easier to ab- keyword context, which specifies the target of the analy- sorb through examples: sis. ODD-EVEN has the context sum(N), which tells BOBJ
11 to use case analysis on any subterms matched by this pat-
tern. A typical case definition should contain several cases, Input: (1) a behavioral theory ¦¨§ © ¦
and each may contain one or more equations that can be (2) a cobasis of
(3) a set of conditional -equations
©
used in that case. ODD-EVEN defines two cases: one is for (4) a -case definition odd natural numbers, and the other is for even natural num- Output: true if a proof of ¦¨ is found, bers. Showing correctness of case analysis declarations is otherwise false (or non-terminating)
the user’s responsibility. I Procedure:
1. let §!
The following intuitive explanation may help. If case 2. for each ©#"%$&(')§*',+¡-/.10 in
©#"%$&(')§*' -/.0
analysis is used with a cred command, whenever the left 3. move + from to $ 4. let 2 be a substitution on assigning new constants to and right sides of a goal are expanded by a cobasis opera-
the variables in 0
¦ §¦4356© 2(©708 tion and they reduce to different normal forms under behav- 5. let +
ioral rewriting with circularities, BOBJ checks if the con- 6. for each 91:;
< §>=/?A@CB ©79HG 2(© 'IJ /K!LM
ED F 7. let ¢ and
text of some case analysis matches a subterm in either nor-
B
< §>=,?A@ © 98G 2(©E' J NK!LM
+ +
¢ED F
mal form. If it does and the matching substitution is _ ,
8. if + matching context, with each case enriched by the substi- 10. then continue B ©R"%$;%©#"£KS%=,?A@ ©79HG 'T NK!LM¡§ F 11. else add D tution instances under _ of the equations in that case. This B =/?A@ © 98G ',+U JK!LMQ-/.0 F D to is repeated until all subgoals reduce to true, which means 12. else continue the proof task is proved; otherwise, a new subgoal is cre- <%+ 0 ¦+ 2 Procedure CASE-ANAL( < , , , , , ) ated, whose left and right sides are the two different nor- < < V 1. if matches a subterm of or + with substitution V V mal forms. If no further case analysis applies to some sub- 2. then let + be with a new constant goal, then the original goal fails, and a new circularity is substituted for each variable 3. for each case W in added for it. = @J?%B © 2(©708_§a@/bTc dJe IXZY¢\[R]_^\D ` 4. if ¢ The following command invokes circular coinductive 5. then continue +§¦_+f3gVH+U©EWh£3i6©72(© 08U rewriting with a specific case analysis declaration on a spe- 6. else let ¦+ B ©7V ©M + D F ¢ cific equational goal: 7. let ¢ and j §l=,?A@B ©MV ©7< U + + + ¢ ¢ED F cred with j j 0 ¦ 3oV © Wh + + CASE-ANAL( , + , , , ) In executing this command, BOBJ automatically uses case is true analysis whenever the context of the indicated case defini- 9. then continue tion is found in proof tasks. 10. else return false 11. else return true Example 14 Behavioral Equivalence over Streams We define two operations map and iter over streams in the Figure 1. The C4RW Algorithm following modules: th FUN is sort Elt . op f_ : Elt -> Elt . c-reduce in MAP : map (iter E) == iter (f E) end using cobasis for MAP: op head _ : Stream -> Elt [prec 15] bth MAP[X :: FUN] is pr ZIP[X] . op tail _ : Stream -> Stream [prec 15] op map_ : Stream -> Stream . ------op iter_ : Elt -> Stream . reduced to: map (iter E) == iter (f E) var E : Elt . var S : Stream . ------eq head map S = f head S . add rule (C1) : map (iter E) = iter (f E) eq tail map S = map tail S . ------eq head iter E = E . target is: map (iter E) == iter (f E) eq tail iter E = iter f E . expand by: op head _ : Stream -> Elt [prec 15] end reduced to: true nf: f E ¨¡ ¨£¢¡¨£¤ 6 For any stream ¤ , map(s) returns " ------ ¨¥ ¨£¢ ¨£¤ ] ¥ ¥ ¥ . For any element of Elt, iter(e) target is: map (iter E) == iter (f E) " ¢ expand by: op tail _ : Stream -> Stream [prec 15] ¨ ¨ ¨ ¥ gives ¥ . We show that map iter E = deduced using (C1) : true iter f E with the following code below, in which trac- nf: iter (f (f E)) ing is first enabled: ------result: true set cred trace on . c-rewrite time: 81ms parse time: 3ms cred map iter E == iter f E . Here is the BOBJ output: Similarly, we can do the following 12 open . current non-deterministic systems; it is perhaps the simplest vars S1 S2 : Stream . non-trivial example of such a system. The ABP is a com- cred map zip(S1, S2) == zip(map(S1), map(S2)) . munication protocol, i.e., a distributed algorithm for reli- close ably transferring data from a source to a target using un- reliable channels. There are actually many different ways for which BOBJ returns true. I to formalize the ABP, based on different assumptions about Example 15 A Proof Using C4RW Example 6 claimed process structure, time, reliability of channels, and so on. that insert(E,S) = S if E in S is a behavioral This paper proves correctness for an ABP model of inter- property of BSET. The following is a proof using C4RW: mediate complexity, assuming synchronous discrete time, cases CASES for ELT is fair channels, and the ability to recognize transmission er- vars X Y : Elt . context eq(X, Y) . rors. The proof relies heavily on the specification and ver- case eq X = Y . ification methods of hidden algebra, and their implementa- case eq eq(X, Y) = false . tion in the BOBJ system, especially its C4RW algorithm. As end far as we know, this paper presents the first complete alge- 2 set cred trace on . braic proof for a system of this kind . cred with CASES insert(E1,S) == S if E1 in S . ack line ¨¡ 5)G G Two cases are tried when a subterm of the format data bit data bit input stream data line output stream is found in a proof goal, and one is enriched with the equa- source target GX G ¨¢ 5)G G ¤ tion ¤ , and the other is enriched with £ ¨ > 6 ¥ . Here is the BOBJ output (slightly reformatted to fit): Figure 2. The Alternating Bit Protocol c-reduce in BSET : insert(E1, S) == S if E1 in S use: CASES 3 using cobasis for BSET: The structure of the ABP is illustrated in Figure 2. It op _ in _ : Elt Set -> Bool [prec 41] has: an input stream of data to be transmitted; a source and a ------target process, each having a data buffer and a one bit state; reduced to: insert(E1, S) == S 2?G2 (G 83 ------a data channel, for pairs called packets; an ac- add rule (C1) : insert(E1, S) = S if E1 in S knowledgement channel, for packets consisting of a single ------bit; and an output data stream. target is: insert(E1, S) == S if E1 in S expand by: op _ in _ : Elt Set -> Bool [prec 41] Here’s how the ABP works: the source process starts by # ¨ reduced to: eq(˜sysconst˜Elt-0, e1) or repeatedly sending packets into the data channel, (˜sysconst˜Elt-0 in s) == ˜sysconst˜Elt-0 # where is the first element of the input stream, and is 0 in s ------or 1. The target process starts by waiting until it receives the # # case analysis by CASES packet , and then it repeatedly sends over the ac------knowledgement channel. When the source process receives case 1 : # # ¤£ assume: ˜sysconst˜Elt-0 = e1 , it begins repeatedly sending the packet - , reduce: eq(˜sysconst˜Elt-0, e1) or where is the second element of the input stream, which (˜sysconst˜Elt-0 in s) == ˜sysconst˜Elt-0 is what the receiver process is now waiting for. When the in s # ¥£ nf: true target receives - , it begins sending packets con- ------# ¦£ taining - to the source process. And so on ... Note that case 2 : we must assume that the two bits are distinct, or this pro- assume: eq(˜sysconst˜Elt-0, e1) = false reduce: eq(˜sysconst˜Elt-0, e1) or cess will fail, and that for convenience, our formalization of (˜sysconst˜Elt-0 in s) == ˜sysconst˜Elt-0 the ABP uses Booleans instead of bits; we also need that the in s natural numbers are distinct. So we take these to be a fixed nf: true ------data algebra for this problem. analyzed 2 cases, all cases succeeded It must be assumed that the channels are fair in the ------sense that if the sender persists, eventually a packet will get result: true c-rewrite time: 53ms parse time: 3ms through; without this assumption, the algorithm is not cor- I rect, because data transmission might fail forever. This as- 2 Here we mean “algebraic” in the sense of algebraic specification the- 4. The Alternating Bit Protocol ory, rather than, for example, process algebra. 3 This diagram and the subsequential informal discussion are intended The Alternating Bit Protocol is a well-established bench- to motivate the formal specification and proof; they should not be con- mark for proof technologies that address distributed con- fused with the formalization itself, which is given later. 13 sumption also implies that the system is non-deterministic, eq data-ok-after Cs = data-ok-after ftl Cs since we do not know how long it will take before a packet if not data-ok Cs . eq data-ok-after Cs = Cs if data-ok Cs . gets through. Perhaps the single most challenging problem eq ack-ok-after Cs = ack-ok-after ftl Cs associated with algebraic correctness proofs for algorithms if not ack-ok Cs . like ABP has been to formalize fairness. The formalization eq ack-ok-after Cs = Cs if ack-ok Cs . end used here is novel, simple, and powerful; moreover, it makes good use of the capabilities of BOBJ, and is easily extended bth ABP is dsort 2PrState . to other situations. pr 2CHAN-STATE + STREAM . op <_,_|_,_> : Bool Nat Bool Nat -> 2PrState . The formalization of correctness is a crucial part of the op [_,_,_] : 2PrState DataStream 2ChState -> proof process. For the ABP, this is straightforward: the out- DataStream . put stream should equal the input stream, except that the vars B B1 B2 : Bool . vars M N : Nat . var Is : DataStream . var Cs : 2ChState . initial content of the target buffer and all erroneous trans- eq [, Is, Cs] = missions should be disregraded. [, Is, ftl Cs] Since streams are infinite “lazy” structures, coinductive if not ack-ok Cs . eq [ 14 behavior of the builtin equality ==, which always returns The final module, ABP, specifies the operation of the true or false (or possibly fails to terminate). This in- protocol, using all the previous modules. The operation completeness of eq can be important when expressions like <_,_|_,_> constructs states of sort 2PrState for the not eq(X,ok) occur in the condition of an equation and two processes. The arguments of this constructor represent X is (for example) m, since then the equation cannot be ap- the states of a data buffer and a one bit register for each pro- plied, whereas it could be applied if =/= (the negation of cess, where the bits are represented by Boolean values. The ==) were used instead; of course, the equation can be ap- keyword dsort indicates that it has initial semantics for plied when X is err. If == and =/= were used instead of the sort 2PrState. The operation [_,_,_] takes as its eq and not eq, the proof in this paper would fail. (By arguments a two process state, an input data stream, and the way, the attribute “[comm]” of eq makes it a symmet- a two channel state, returning the next output data stream ric relation.) state. Its four equations express the operational semantics The second module also contains our novel formaliza- of the ABP in a straightforward way, although presuming tion of fairness, based on the infinite streams of natural the tricks discussed above. When the two bits in the pro- numbers of the first module. Here the number 0 represents cess state are the same, the source process is ready for an ¡ £¢ an immediately successful transmission, while / rep- acknowledgement to arrive, and to then start sending an- resents / consecutive failures followed by a success. Thus, other data item; and when they are different, the target pro- wherever you are in such a stream, there is always a suc- cess is ready to receive a new data packet and then to start cess some finite distance in the future, which is the meaning sending acknowledgements. The first equation says that the of fairness. These streams define when a transmission suc- source process waits when it is ready but the acknowledge- ceeds, but not what is transmitted; let’s call them event mark ment channel fails. The second equation says the target pro- streams4. Notice that this module introduces head and tail cess waits when it is ready but the data channel fails. The operations that are quite different from those of STREAM; third equation says the source process flips its bit when it is fhd returns a mark telling whether or not transmission suc- ready and receives an acknowledgement. The fourth equa- ceeds, and ftl returns the next event mark stream. tion says the target process flips its bit when it is ready and The third module, 2CHAN-STATE, provides event mark receives a data packet. streams for the two channels. The sort 2ChState is con- The correctness criterion for the ABP is very simply ex- structed as the concurrent connection of two fair streams, pressed by the equation using BOBJ’s builtin || operation; it has behavioral se- tl [St, Is, Cs] = Is, mantics. The operations fhd1 and fhd2 respectively ex- where St is a system state where the source is ready, Is tract the heads of the data and acknowledgement streams, is an input data stream, and Cs is a state for the two chan- while ftl extracts the pair of tails of the two streams. nels; it just says that the entire input data stream is success- This much is straightforward, but some other operations fully transmitted (with the initial output ignored); note that are more tricky. The operations data-ok and ack-ok this expresses both safety and liveness for the ABP. How- respectively check whether the data and the acknowledge- ever, we do not prove this form, but instead we prove ment stream have succeeded; note that they both use the specially defined eq operation. The operations data-ok- [ 15 than choice within models6, or we could say that the “real [, Is, ack-ok-after ftl c] . model” of a specification is the class of all hidden algebras close that satisfy it. But no matter what view we take, this is a ***> proof of Lemma C very convenient framework for specification and verifica- open . *** base case tion, which covers every possible behavior of the system. eq fhd1 c = err . eq data-ok-after ftl c = ftl c . red [ The proof begins by proving the following five lemmas for open . *** induction step the ABP specification: eq fhd1 c = err . eq fhd1 ftl c = err . (A) fhd1 data-ok-after Cs = ok red [ (D) [, Is, Cs] = set cobasis of STREAM . [ 16 Lemmas A, B and C are proved by induction, whereas Lem- Lemma C, and it is handled in the same way, for the same mas D and E are proved by case analysis. The proof of reasons. Lemma A is a straightforward induction on the natural num- Lemma D is proved by case analysis on whether ber in the head of the event mark stream for the data chan- fhd1 ftl Cs is ok, or is err, and the first equation as- nel. It says that the function data-ok-after always de- sumed for each case follows from one of these assump- livers an event mark stream indicating an immediate suc- tions. Similar reasoning to that used for the base cases cessful data transmission. The equation of Lemmas B and C, based on consequences of the con- fhd2 ack-ok-after Cs = ok dition of the lemma, justifies the second equation for each case. The second case of Lemma D also uses a spe- can also be proved as a lemma, but fortunately it is cial case of Lemma C, but under the assumptions of this not needed, because adding it would cause some non- case, its leftside is not reduced; so as before, we use its re- terminating reductions in the proof. duced form instead. (The reduction of the leftside is given Each of the next four lemmas is a behavioral conse- as a comment here, since it cannot actually be run in- quence of a corresponding axiom among the first four in the side a case declaration.) ABP specification. However, the assumptions that appear in Similarly, Lemma E is proved by case analysis on the cases of their proofs are a bit tricky. For Lemma B, first whether fhd2 ftl Cs is ok, or is err, giving rise notice that its condition, not ack-ok Cs, can only be to the first equations in its cases, while the second equa- satisfied if eq(fhd2 Cs, ok) is false, which accord- tions follows from assuming its condition. Finally, the sec- ing to the initial semantics of the module MARK, can only ond case of Lemma E uses a special case of Lemma D; happen if fhd2 Cs is err. Therefore when setting up the again, its left side is not reduced, and this is handled as be- base case for Lemma B, implication elimination allows us fore. fhd2 c = err to assert before doing reduction to check We use coinduction to prove Lemmas D and E, as indi- the equation. The same reasoning allows us to assert the cated by the command cred, and therefore we need a coba- fhd1 c = err equation for the base case of Lemma C. sis. BOBJ can compute a default cobasis, but it is not the Lemmas B and C are proved by induction over the num- right one for this problem, because correctness of the ABP ber of transmission failures. Since the condition of each depends only on the input and output streams. For this rea- equation says there must be at least one failure, the sim- son, the correct cobasis is that of the module STREAM. Fi- plest case, which must be the base case, is that there is just nally, we prove ABP correctness using these lemmas: one failure, and this implies that the tail of the appropri- ate channel indicates an immediate success. This and the bth ABP+ is dsort 2PrState . pr 2CHAN-STATE + STREAM . definitions of the operations ack-ok-after and data- op <_,_|_,_> : Bool Nat Bool Nat -> 2PrState . ok-after justify the second equation asserted for the base op [_,_,_] : 2PrState DataStream 2ChState -> cases of Lemmas B and C. DataStream . vars B B1 B2 : Bool . vars M N : Nat . For the induction steps of Lemmas B and C, we know var Is : DataStream . var Cs : 2ChState . there must be at least one more failure. Therefore tail of the eq fhd1 data-ok-after Cs = ok . *** A appropriate channel must also indicate failure, which jus- eq [, Is, Cs] = [, Is, ack-ok-after ftl Cs] tifies the second assumed equation, the first being just as if not ack-ok Cs . *** B in the base cases. The third equation assumed in each case eq [ [, Is, ftl ftl c] , cases CASE-OF-CHANNEL for ABP+ is vars B1 B2 : Bool . vars N1 N2 : Nat . and this is what actually appears in the proof score. The re- var Is : DataStream . var Cs : 2ChState . duction before the induction hypothesis justifies this substi- context [ 17 end some of which are subject to errors and some of which are not. A builtin module providing all these options could be a open . vars B1 B2 : Bool . vars N M : Nat . useful addition to BOBJ. var Is : DataStream . var Cs : 2ChState . We are now applying the techniques of this paper to cred with CASE-OF-CHANNEL other non-deterministic, distributed, concurrent algorithms; [ 18 ¡ [2] Narjes Berregeb, Adel Bouhoula, and Micha l Rusinowitch. [16] Joseph Goguen, Kai Lin, and Grigore Ros¸u. Circular coin- Observational proofs with critical contexts. In Fundamen- ductive rewriting. In Automated Software Engineering ’00, tal Approaches to Software Engineering, volume 1382 of pages 123–131. IEEE, 2000. Proceedings of a workshop held Lecture Notes in Computer Science, pages 38–53. Springer, in Grenoble, France. 1998. [17] Joseph Goguen, Kai Lin, and Grigore Ros¸u. Behav- [3] Michel Bidoit and Rolf Hennicker. Constructor-based obser- ioral and coinductive rewriting. In Proceedings, Rewrit- vational logic. Technical Report LSV–03–9, Laboratoire Sp- ing Logic Workshop, 2000. Elsevier, 2001. Electronic cification et Verification, CNRS de Cachan, March 2003. Notes on Theoretical Computer Science, Volume 36, at [4] Barry Boehm. Software Engineering Economics. Prentice- www.elsevier.nl/locate/entcs/volume36.html. Hall, 1981. [18] Joseph Goguen and Grant Malcolm. Hidden coinduction: [5] Rod Burstall and Joseph Goguen. An informal introduc- Behavioral correctness proofs for objects. Mathematical tion to specifications using Clear. In Robert Boyer and Structures in Computer Science, 9(3):287–319, June 1999. J Moore, editors, The Correctness Problem in Computer Sci- [19] Joseph Goguen and Grant Malcolm. A hidden agenda. The- ence, pages 185–213. Academic, 1981. Reprinted in Soft- oretical Computer Science, 245(1):55–101, August 2000. ware Specification Techniques, Narain Gehani and Andrew Also UCSD Dept. Computer Science & Eng. Technical Re- McGettrick, editors, Addison-Wesley, 1985, pages 363–390. port CS97–538, May 1997. [6] Samuel Buss and Grigore Ros¸u. Incompleteness of behav- [20] Joseph Goguen and Jose´ Meseguer. Order-sorted algebra I: ioral logics. In Horst Reichel, editor, Proceedings, Coalge- Equational deduction for multiple inheritance, overloading, braic Methods in Computer Science (CMCS’00), volume 33 exceptions and partial operations. Theoretical Computer Sci- of Electronic Notes in Theoretical Computer Science, pages ence, 105(2):217–273, 1992. Drafts exist from as early as 61–79. Elsevier Science, March 2000. 1985. [7] Razv˘ an Diaconescu and Kokichi Futatsugi. CafeOBJ Re- [21] Joseph Goguen and Grigore Ros¸u. Hiding more of hid- port: The Language, Proof Techniques, and Methodologies den algebra. In Jeannette Wing, Jim Woodcock, and Jim for Object-Oriented Algebraic Specification. World Scien- Davies, editors, FM’99 – Formal Methods, pages 1704– tific, 1998. AMAST Series in Computing, Volume 6. 1719. Springer, 1999. Lecture Notes in Computer Sciences, [8] Razv˘ an Diaconescu and Kokichi Futatsugi. Behavioural co- Volume 1709, Proceedings of World Congress on Formal herence in object-oriented algebraic specification. Journal of Methods, Toulouse, France. Universal Computer Science, 6(1):74–96, 2000. [22] Joseph Goguen and Grigore Ros¸u. A protocol for distributed [9] Hartmut Ehrig and Bernd Mahr. Fundamentals of Algebraic cooperative work. In Gheorghe Stefanescu, editor, Proceed- Specification 1: Equations and Initial Semantics. Springer, ings, FCT‘99, Workshop on Distributed Systems (Ias¸i, Ro- 1985. EATCS Monographs on Theoretical Computer Sci- mania), volume 28, pages 1–22. Elsevier, 1999. Electronic ence, Vol. 6. Lecture Notes in Theoretical Computer Science. [10] Marie-Claude Gaudel and Igor Privara. Context induction: [23] Joseph Goguen, Timothy Winkler, Jose´ Meseguer, Kokichi an exercise. Technical Report 687, LRI, Universite´ de Paris- Futatsugi, and Jean-Pierre Jouannaud. Introducing OBJ. In Sud, 1991. Joseph Goguen and Grant Malcolm, editors, Software Engi- [11] Joseph Goguen. Principles of parameterized program- neering with OBJ: Algebraic Specification in Action, pages ming. In Ted Biggerstaff and Alan Perlis, editors, Software 3–167. Kluwer, 2000. Reusability, Volume I: Concepts and Models, pages 159–225. Addison Wesley, 1989. [24] Rolf Hennicker. Context induction: a proof principle for be- havioural abstractions. In A. Miola, editor, Proceedings, In- [12] Joseph Goguen. Types as theories. In George Michael ternational Symposium on the Design and Implementation Reed, Andrew William Roscoe, and Ralph F. Wachter, ed- of Symbolic Computation Systems, volume 429 of Lecture itors, Topology and Category Theory in Computer Science, Notes in Computer Science, pages 101–110. Springer, 1990. pages 357–390. Oxford, 1991. Proceedings of a Conference held at Oxford, June 1989. [25] Rolf Hennicker and Michel Bidoit. Observational [13] Joseph Goguen and Rod Burstall. Institutions: Abstract logic. In Algebraic Methodology and Software Technology model theory for specification and programming. Journal (AMAST’98), volume 1548 of Lecture Notes in Computer of the Association for Computing Machinery, 39(1):95–146, Science, pages 263–277. Springer, 1999. January 1992. [26] Bart Jacobs and Jan Rutten. A tutorial on (co)algebras and [14] Joseph Goguen and Razv˘ an Diaconescu. Towards an alge- (co)induction. Bulletin of the European Association for The- braic semantics for the object paradigm. In Hartmut Ehrig oretical Computer Science, 62:222–259, June 1997. and Fernando Orejas, editors, Proceedings, Tenth Workshop [27] Kai Lin. Machine Support for Behavioral Algebraic Specifi- on Abstract Data Types, pages 1–29. Springer, 1994. Lec- cation and Verification. PhD thesis, University of California ture Notes in Computer Science, Volume 785. at San Diego, 2003. [15] Joseph Goguen and Kai Lin. Web-based support for coop- [28] Jose´ Meseguer and Joseph Goguen. Initiality, induction and erative software engineering. Annals of Software Engineer- computability. In Maurice Nivat and John Reynolds, edi- ing, 12:25–32, 2001. Special issue for papers from a confer- tors, Algebraic Methods in Semantics, pages 459–541. Cam- ence in Taipei, December 2000. bridge, 1985. 19 [29] Robin Milner. A Calculus of Communicating Systems. Springer, 1980. Lecture Notes in Computer Science, Vol- ume 92. [30] David M.R. Park. Concurrency and Automata on Infinite Se- quences. Springer, 1980. Lecture Notes in Computer Sci- ence, Volume 104. [31] Horst Reichel. Behavioural equivalence – a unifying con- cept for initial and final specifications. In Proceedings, Third Hungarian Computer Science Conference. Akademiai Ki- ado, 1981. Budapest. [32] Grigore Ros¸u. Hidden Logic. PhD thesis, University of Cal- ifornia at San Diego, 2000. [33] Grigore Ros¸u and Joseph Goguen. Hidden congruent deduc- tion. In Ricardo Caferra and Gernot Salzer, editors, Proceed- ings, 1998 Workshop on First Order Theorem Proving, pages 213–223. Technische Universitat¨ Wien, 1998. (Schloss Wil- helminenberg, Vienna, November 23-25, 1998). [34] Grigore Ros¸u and Joseph Goguen. Circular coinduc- tion. In Proceedings, Int. Joint Conf. Automated Deduction. Springer, 2000. Sienna, June 2001. 20 == ok . ***> proof of Lemma E close cases CASE-D for SETUP is vars B B1 B2 : Bool . ***> proof of Lemma B vars M N P Q : Nat . open . *** base case vars Is Ds : DataStream . eq fhd2 c = err . var Cs : 2ChState . eq ack-ok-after ftl c = ftl c . context [