
Refinement Calculus:

A Systematic Introduction

Ralph-Johan Back

Joakim von Wright

Draft, December 31, 1997

Preface

The refinement calculus is a framework for reasoning about correctness and refinement of programs. The primary application of the calculus is the derivation of traditional imperative programs, where programs are constructed by stepwise refinement. Each refinement step is required to preserve the correctness of the previous version of the program. We study the basic rules for this kind of program derivation in detail, looking at specification statements and their implementation, how to construct recursive and iterative program statements, and how to use procedures in program derivations. The refinement calculus is an extension of Dijkstra's weakest precondition calculus, where program statements are modeled as predicate transformers.

We extend the traditional interpretation of predicate transformers as executable program statements to that of contracts that regulate the behavior of competing agents and to games that are played between two players. The traditional interpretation of predicate transformers as programs falls out of this as a special case, where one of the two players has only a very rudimentary role to play. Our extension yields a number of new potential applications for the refinement calculus theory, like reasoning about correctness of interactive program statements and analyzing two-person games and their winning strategies.

Program refinement involves reasoning in many domains. We have to reason about properties of functions, predicates and sets, relations, and program statements described as predicate transformers. These different entities are in fact mathematically all very similar, and they form the refinement calculus hierarchy. In this book we examine the ways in which these different domains are related to each other and how one can move between them. Lattice theory forms a unifying conceptual basis

that captures similarities between the domains, while higher-order logic serves as a common logic that we can use to reason about entities in these domains.

We present a new foundation for the refinement calculus based on lattice theory and higher-order logic, together with a simple theory of program variables. These topics are covered in the first part of the book. Higher-order logic is described in a new way that is strongly influenced by lattice theory. This part also introduces notions needed for imperative programs: program variables, expressions, assignment statements, blocks with local variables, and procedures. The second part of the book describes the predicate transformer approach to programming logic and program semantics, as well as the notion of program refinement. We also study the semantics of contracts, games, and program statements, and show how it is related to the predicate transformer interpretation of these. The third part of the book shows how to handle recursion and iteration in the refinement calculus and also describes how to use the calculus to reason about games. It also presents case studies of program refinement. The last part of the book studies more specific issues related to program refinement, such as implementing specification statements, making refinements in context, and transforming iterative structures in a correctness-preserving way.

The refinement calculus has been a topic of investigation for nearly twenty years now, and a great number of papers have been published in this area. However, a systematic introduction to the mathematical and logical basis for program refinement has been missing. The information required to study this field is scattered through many journal and conference papers, and has never been presented as a coherent whole. This book addresses this problem by giving a systematic introduction to the field. The book partly summarizes work that has been published earlier, by ourselves and by other authors, but most of the material is new and has not been published before.

The emphasis in this book is on the mathematical and logical basis of program refinement. Although we do give examples and case studies of program construction, this is not primarily a hands-on book on how to construct programs in practice. For reasons of space we have been forced to omit a great deal of material that we had wanted to include in the book. In particular, we decided not to include the important topics of data refinement and parallel and reactive system refinement here (though a simple approach to data refinement is described in Chapter 17). We plan to treat these topics in a comprehensive way in a separate book.

The book is intended for graduate students and for advanced undergraduate students interested in the mathematics and logic needed for program derivations, as well as for programmers and researchers interested in a deeper understanding of these issues. The book provides new insight into the methods for program derivations, as well as new understanding of program semantics and of the logic for reasoning about programs.

The book should be easy to use in teaching. We have tried to explain things quite carefully, and proofs are often given in reasonable detail. We have also included quite a number of examples and exercises. A previous understanding of logic and programming methodology is helpful but not necessary in order to understand the material in the book. The first, second, and third parts of the book can be given as successive courses, with suitable material added from the fourth part.

Acknowledgments

This book was inspired and influenced by a number of other researchers in the field. In particular, Edsger Dijkstra's work on predicate transformers and his systematic approach to constructing programs has been an important source of inspiration, as has the work of on the logic of programming. Jaco de Bakker's work on programming semantics has formed another strong influence, in particular the stringent way in which he has approached the topic. We very much appreciate Carroll Morgan's work on the refinement calculus, which has provided many ideas and a friendly, competitive environment. Finally, Michael Gordon's work has shown the usefulness of higher-order logic as a framework for reasoning about both hardware and software and about formal theories in general. Our close coworkers Reino Kurki-Suonio and Kaisa Sere deserve special thanks for many years of fruitful and inspiring collaboration in the area of programming methodology and the logic for systematic program construction. We are indebted to many other colleagues for numerous enlightening discussions on the topics treated in this book. Our thanks go to Sten Agerholm, Roland Backhouse, Eike Best, Michael Butler, Rutger Dijkstra, Wim Feijen, Nissim Francez, Marcel van de Goot, Jim Grundy, John Harrison, Ian Hayes, Eric Hehner, Wim Hesselink, Peter Hofstee, Cliff Jones, Bengt Jonsson, He Jifeng, Joost Kok, Rustan Leino, Alain Martin, Tom Melham, David Nauman, Greg Nelson, John Tucker, Amir Pnueli, Xu Qiwen, Emil Sekerinski, Jan van de Snepscheut, Mark Staples, Reino Vainio, and Wolfgang Weck. We also want to thank our students, who have provided an inspiring environment for discussing these topics and have read and given detailed comments on many previous versions of the book: Thomas Beijar, Martin Büchi, Eric Hedman, Philipp Heuberger, Linas Laibinis, Thomas Långbäcka, Leonid Mikhajlov, Anna Mikhajlova, Lasse Nielsen, Rimvydas Rukšėnas, Mauno Rönkkö, and Marina Waldén. The Department of at Åbo Akademi University and Turku Centre for Computer Science (TUCS) have provided a most stimulating working environment. The sabbaticals at the California Institute of Technology, Cambridge

University, and Utrecht University have also been very fruitful and are hereby gratefully acknowledged. The generous support of the Academy of Finland and the Ministry of Education of Finland for the research reported in this book is also much appreciated. Last but not least, we want to thank our (sometimes more, sometimes less) patient wives and children, who have been long waiting for this book to be finished: Barbro Back, Pia, Rasmus, and Ida Back, Birgitta Snickars von Wright, Heidi, Minea, Elin, and Julius von Wright.

Contents

1 Introduction
  1.1 Contracts
  1.2 Using Contracts
  1.3 Computers as Agents
  1.4 Algebra of Contracts
  1.5 Programming Constructs
  1.6 Specification Constructs
  1.7 Correctness
  1.8 Refinement of Programs
  1.9 Background
  1.10 Overview of the Book

I Foundations

2 Posets, Lattices, and Categories
  2.1 Partially Ordered Sets
  2.2 Lattices
  2.3 Product and Function Spaces
  2.4 Lattice Homomorphisms
  2.5 Categories
  2.6 Summary and Discussion
  2.7 Exercises

3 Higher-Order Logic
  3.1 Types and Terms
  3.2 Semantics
  3.3 Deductive Systems
  3.4 Theories
  3.5 Summary and Discussion
  3.6 Exercises

4 Functions
  4.1 Properties of Functions
  4.2 Derivations
  4.3 Definitions
  4.4 Product Type
  4.5 Summary and Discussion
  4.6 Exercises

5 States and State Transformers
  5.1 State Transformers
  5.2 State Attributes and Program Variables
  5.3 Reasoning with Program Variables
  5.4 Straight-Line Programs
  5.5 Procedures
  5.6 Blocks and Value Parameters
  5.7 Summary and Discussion
  5.8 Exercises

6 Values
  6.1 Inference Rules for Boolean Lattices
  6.2 Truth Values
  6.3 Derivations with Logical Connectives
  6.4 Quantification
  6.5 Assumptions
  6.6 Derivations with Local Assumptions
  6.7 Summary and Discussion
  6.8 Exercises

7 Predicates and Sets
  7.1 Predicates and Sets
  7.2 Images and Indexed Sets
  7.3 Inference Rules for Complete Lattices
  7.4 Bounded Quantification
  7.5 Selection and Individuals
  7.6 Summary and Discussion
  7.7 Exercises

8 Boolean Expressions and Conditionals
  8.1 Boolean Expressions
  8.2 Reasoning About Boolean Expressions
  8.3 Conditional Expressions
  8.4 Proving Properties About Conditional State Transformers
  8.5 Summary and Discussion
  8.6 Exercises

9 Relations
  9.1 Relation Spaces
  9.2 State Relation Category
  9.3 Coercion Operators
  9.4 Relational Assignment
  9.5 Relations as Programs
  9.6 Correctness and Refinement
  9.7 Summary
  9.8 Exercises

10 Types and Data Structures
  10.1 Postulating a New Type
  10.2 Constructing a New Type
  10.3 Record Types
  10.4 Array Types
  10.5 Dynamic Data Structures
  10.6 Summary and Discussion
  10.7 Exercises

II Statements

11 Predicate Transformers
  11.1 Satisfying Contracts
  11.2 Predicate Transformers
  11.3 Basic Predicate Transformers
  11.4 Relational Updates
  11.5 Duality
  11.6 Preconditions and Guards
  11.7 Summary and Discussion
  11.8 Exercises

12 The Refinement Calculus Hierarchy
  12.1 State Categories
  12.2 Homomorphisms
  12.3 Summary and Discussion
  12.4 Exercises

13 Statements
  13.1 Subtyping, Sublattices and Subcategories
  13.2 Monotonic Predicate Transformers
  13.3 Statements and Monotonic Predicate Transformers
  13.4 Derived Statements
  13.5 Procedures
  13.6 Summary and Discussion
  13.7 Exercises

14 Statements as Games
  14.1 Game Interpretation
  14.2 Winning Strategies
  14.3 Correctness and Refinement
  14.4 Summary and Discussion
  14.5 Exercises

15 Choice Semantics
  15.1 Forward and Backward Semantics
  15.2 Comparing Semantics for Contract Statements
  15.3 Refinement in the Choice Semantics
  15.4 Summary and Discussion
  15.5 Exercises

16 Subclasses of Statements
  16.1 Homomorphic Predicate Transformers
  16.2 Subcategories of Statements
  16.3 Summary and Discussion
  16.4 Exercises

17 Correctness and Refinement of Statements
  17.1 Correctness
  17.2 Stepwise Refinement
  17.3 Refinement Laws
  17.4 Refinement in Context
  17.5 Refinement with Procedures
  17.6 Example: Data Refinement
  17.7 Summary and Conclusions
  17.8 Exercises

III Recursion and Iteration

18 Well-founded Sets and Ordinals
  18.1 Well-Founded Sets and Well-Orders
  18.2 Constructing the Natural Numbers
  18.3 Primitive Recursion
  18.4 Ordinals
  18.5 Ordinal Recursion
  18.6 How Far Can We Go?
  18.7 Summary and Discussion
  18.8 Exercises

19 Fixed Points
  19.1 Fixed Points
  19.2 Fixed Points as Limits
  19.3 Properties of Fixed Points
  19.4 Summary and Discussion
  19.5 Exercises

20 Recursion
  20.1 Fixed Points and Predicate Transformers
  20.2 Recursion Introduction and Elimination
  20.3 Recursive Procedures
  20.4 Example: Computing the Square Root
  20.5 Summary and Discussion
  20.6 Exercises

21 Iteration and Loops
  21.1 Iteration
  21.2 Properties of Iterations
  21.3 Correctness of Iterations
  21.4 Loops
  21.5 Loop Correctness
  21.6 Loop Introduction and Elimination
  21.7 Summary and Discussion
  21.8 Exercises

22 Continuity and Executable Statements
  22.1 Limits and Continuity
  22.2 Continuity of Statements
  22.3 Continuity of Derived Statements
  22.4 Executable Statements
  22.5 Interactive Guarded Commands
  22.6 Summary and Discussion
  22.7 Exercises

23 Working with Arrays
  23.1 Resetting an Array
  23.2 Linear Search
  23.3 Selection Sort
  23.4 Counting Sort
  23.5 Summary and Discussion
  23.6 Exercises

24 The N-Queens Problem
  24.1 Analyzing the Problem
  24.2 First Step: The Terminating Case
  24.3 Second Step: Extending a Partial Solution
  24.4 Third Step: Completing for Recursion Introduction
  24.5 Final Result
  24.6 Summary and Discussion
  24.7 Exercises

25 Loops and Two-Person Games
  25.1 Modeling Two-Person Games
  25.2 Winning Strategies
  25.3 Extracting Winning Strategies
  25.4 Strategy Improvement
  25.5 Summary and Discussion
  25.6 Exercises

IV Statement Subclasses

26 Statement Classes and Normal Forms
  26.1 Universally Conjunctive Predicate Transformers
  26.2 Conjunctive Predicate Transformers
  26.3 Disjunctive Predicate Transformers
  26.4 Functional Predicate Transformers
  26.5 Continuous Predicate Transformers
  26.6 Homomorphic Choice Semantics
  26.7 Summary and Discussion
  26.8 Exercises

27 Specification Statements
  27.1 Specifications
  27.2 Refining Specifications by Specifications
  27.3 Combining Specifications
  27.4 Refining Conjunctive Specifications
  27.5 General Refinement of Specifications
  27.6 Summary and Discussion
  27.7 Exercises

28 Refinement in Context
  28.1 Taking the Context into Account
  28.2 Rules for Propagating Context Assertions
  28.3 Propagation in Derived Statements
  28.4 Context Assumptions
  28.5 Summary and Discussion
  28.6 Exercises

29 Iteration of Conjunctive Statements
  29.1 Properties of Fixed Points
  29.2 Relating the Iteration Statements
  29.3 Iteration of Meets
  29.4 Loop Decomposition
  29.5 Other Loop Transformations
  29.6 Example: Finding the Period
  29.7 Summary and Discussion
  29.8 Exercises

30 Appendix

References

1 Introduction

The refinement calculus is a logical framework for reasoning about programs. It is concerned with two main questions: is a program correct with respect to a given specification, and how can we improve, or refine, a program while preserving its correctness. Both programs and specifications can be seen as special cases of a more general notion, that of a contract between independent agents. Refinement is defined as a relation between contracts and turns out to be a lattice ordering. Correctness is a special case of refinement where a specification is refined by a program. The refinement calculus is formalized within higher-order logic. This allows us to prove the correctness of programs and to calculate program refinements in a rigorous, mathematically precise manner. Lattice theory and higher-order logic together form the mathematical basis for the calculus. We start with an informal introduction to some of the central concepts in the calculus: the notion of a contract as a generalization of programs and specifications, how program correctness is generalized to contracts, and how refinement of contracts is defined. We also show how these general concepts are related to more traditional notions used in programming and program correctness. We will elaborate on these issues in the rest of the book, along the way introducing and studying many other notions that we need for program correctness and refinement.

1.1 Contracts

Let us start from the most general notion used, that of a contract. Consider a collection of agents, where each agent has the capability to change the world

in various ways through its actions and can choose between different courses of action. The behavior of agents and their cooperation is regulated by contracts. Before going into this in more detail, a short remark on the notation that we use is in order. We denote function application with an infix dot, writing f.x, rather than the more traditional f(x). We write A → B for the set of all functions f : A → B. Functions can be of higher order, so that the argument or value of a function can be a function itself. For instance, if f is a function in A → (B → C) and a an element of A, then f.a is a function in B → C. If b is an element of B, then (f.a).b is an element of C. Function application associates to the left, so (f.a).b can be written as just f.a.b. We will use standard set-theoretic and logical notation in our informal discussion in this and the next chapter. In particular, we write T, F, ¬b, b ∧ c, b ∨ c, b ⇒ c, b ≡ c for, respectively, truth, falsity, negation, conjunction, disjunction, implication, and equivalence in logical formulas. We write universal quantification as (∀x • b) and existential quantification as (∃x • b). These notions will all be given precise definitions when higher-order logic is introduced, but for now, we rely on an informal understanding.
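For readers who like to see the notational conventions executed, the following small Haskell sketch (the function name add is chosen only for this illustration) mirrors the curried application written f.a.b in the text: applying a function in A → (B → C) to one argument yields a function in B → C, and application associates to the left.

add :: Int -> (Int -> Int)
add a b = a + b

addThree :: Int -> Int
addThree = add 3        -- corresponds to f.a, a function in B -> C

result :: Int
result = add 3 4        -- corresponds to (f.a).b, written f.a.b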

States and State Changes

We assume that the world is described as a state σ. The state space Σ is the set of all possible states (or worlds). An agent changes the state by applying a function f to the present state, yielding a new state f.σ. An agent can have different functions for changing the world. We will write ⟨f⟩ for the action that changes the state from σ to f.σ, and refer to this as an update action.

A special case is the identity function id : Σ → Σ, which does not change the world at all, id.σ = σ. We write skip = ⟨id⟩ for the action that applies the identity function to the present state.

We think of the state as having a number of attributes x1,...,xn, each of which can be observed and changed independently of the others. Such attributes are usually called program variables. The value of an attribute x, val.x, is a function that depends on the state: val.x.σ gives the value of attribute x in state σ . It models an observation of some property of the state. For each attribute xi , there is also an update function set.xi that we use to change the state so that attribute xi gets a specific value: set.x.a.σ is the new state that we get by setting the value of attribute x to a without changing the values of any of the other attributes.

State changes are conveniently expressed as assignments. For instance, xi := xi + xj describes a function from state σ to the state set.xi.a.σ, where a = val.xi.σ + val.xj.σ is the value of expression xi + xj in state σ. The action that carries out this assignment is then ⟨xi := xi + xj⟩. We drop the angular brackets around assignments in the examples below, as it will always be clear that an assignment action rather than just the assignment function is intended.
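To make the attribute model concrete, here is a minimal Haskell sketch, with the attribute names x and y and all function names chosen only for this illustration: the record selectors play the role of val.x and val.y, the setter functions play the role of set.x and set.y, and the assignment x := x + y becomes an ordinary state-transforming function.

-- A state with two attributes, x and y.
data State = State { x :: Int, y :: Int } deriving Show

-- set.x.a and set.y.a: change one attribute, leave the others unchanged.
setX :: Int -> State -> State
setX a s = s { x = a }

setY :: Int -> State -> State
setY a s = s { y = a }

-- The assignment x := x + y as a function on states: read the current values
-- of x and y, then update attribute x only.
assignXPlusY :: State -> State
assignXPlusY s = setX (x s + y s) s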

Contracts

A contract regulates the behavior of an agent. The contract can stipulate that the agent must carry out actions in a specific order. This is written as a sequential action S1 ; S2 ; ... ; Sm, where S1, ..., Sm are the individual actions that the agent has to carry out. For any initial state σ = σ0, the agent has to produce a succession of states σ1, ..., σm, where σi is the result of performing action Si in state σi−1. A contract may also require the agent to choose one of the alternative actions S1, ..., Sm. The choice is written as the alternative action S1 ⊔ ··· ⊔ Sm. The choice between these alternatives is free; i.e., any alternative may be carried out by the agent, but the agent must choose one of the given alternatives. These two constructs, together with the basic actions described above, give us a very simple language for contracts:

S ::= ⟨f⟩ | S1 ; S2 | S1 ⊔ S2 . Each statement in this language describes a contract for an agent. We have restricted ourselves to binary composition only. Arbitrary composition can be expressed in terms of binary composition, as sequential and alternative composition are both assumed to be associative: S1 ; (S2 ; S3) = (S1 ; S2) ; S3, and S1 ⊔ (S2 ⊔ S3) = (S1 ⊔ S2) ⊔ S3. We assume that semicolon binds more strongly than choice. We can observe certain other properties that this language of contracts is expected to satisfy. For instance, we expect that

skip ; S = S and S = S ; skip , because continuing by not changing the world should not affect the meaning of a contract. The order in which the alternatives are presented should also not be important; only the set of alternatives matters, so that

S ⊔ S = S and S1 ⊔ S2 = S2 ⊔ S1 .

The following is an example of a contract statement: x := 1 ; (x := x + y ⊔ skip) ; y := x − 1 . This stipulates that the agent first has to set the attribute x to 1, then it must choose between either changing the state by assigning the value of x + y to the attribute x or not changing the state, and finally it has to set attribute y to the value of x − 1. This assumes that the state space has the two attributes x and y (it may have more attributes, but the values of other attributes are not changed). The agent is said to satisfy (or follow) the contract if it behaves in accordance with it.
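The basic contract language can be written down as an ordinary datatype. The following Haskell sketch encodes the grammar S ::= ⟨f⟩ | S1 ; S2 | S1 ⊔ S2 together with the example contract above; states are represented, purely as an assumption of this example, by pairs of integers giving the values of x and y. It illustrates the syntax only, not the book's formalization.

type St = (Int, Int)            -- (value of x, value of y)

data Contract
  = Update (St -> St)           -- <f>: an update action
  | Seq Contract Contract       -- S1 ; S2
  | Choice Contract Contract    -- S1 ⊔ S2: the agent picks one of the alternatives

skip :: Contract
skip = Update id

-- x := 1 ; (x := x + y ⊔ skip) ; y := x - 1
example :: Contract
example =
  Update (\(_, yv) -> (1, yv))
    `Seq` Choice (Update (\(xv, yv) -> (xv + yv, yv))) skip
    `Seq` Update (\(xv, _) -> (xv, xv - 1))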

Cooperation Between Agents

Consider next two agents, a and b, that work on the same state independently of each other. Each agent has a will of its own and makes decisions for itself. If these two agents want to cooperate, they need a contract that stipulates their respective obligations. A typical situation is that one of the agents, say a, acts as a client, and the other, b, as a server. Assume that agent a follows contract S,

S = x := 0 ; (T ; x := x + 1 ⊔ skip) , where T is the contract for agent b,

T = (y := 1 ⊔ y := 2) ; x := x + y . The occurrence of T in the contract statement S signals that a asks agent b to carry out its contract T. We can combine the two statements S and T into a single contract statement that regulates the behavior of both agents. Because this combined contract is carried out by two different agents, we need to explicitly indicate for each choice whether it is made by a or by b. The combined contract is described by

S = x := 0 ; ((y := 1 ⊔_b y := 2) ; x := x + y ; x := x + 1 ⊔_a skip) . Thus, the combined contract is just the result of substituting the contract of T for the call on T in the contract of S and explicitly indicating for each choice which agent is responsible for it. For basic actions like skip and assignments, it does not matter which agent carries out the state change, because the effect will be the same.

Assertions

An assertion is a requirement that the agent must satisfy in a given state. We express an assertion as {g}, where g is a condition on the state. For instance, {x + y = 0} expresses that the sum of (the values of attributes) x and y in the state must be zero. An example of a contract with an assertion is the following:

S = x := 1 ; (x := x + y ; {x > 1} ⊔ x := x + z) ; y := x − 1 . Here we require that x > 1 must hold after the assignment in the first alternative has been completed. If this assertion does in fact hold there when the agent carries out the contract, then the state is unchanged, and the agent simply carries on with the rest of the contract. If, however, the assertion does not hold, then the agent has breached the contract. Analyzing this example, we can see that the agent will breach the contract if y ≤ 0 initially and the agent chooses the first alternative. The agent does not, however, need to breach the contract, because he can choose the second alternative, which does not have any assertions, and then successfully complete the contract.

The assertion {true} is always satisfied, so adding this assertion anywhere in a contract has no effect; i.e., {true} = skip. The assertion {false} is an impossible assertion. It is never satisfied and will always result in the agent breaching the contract. When looking at the combined contract of two agents, a and b, it is again very important to explicitly indicate which party breaches the contract. Thus, if we have

S = x := 0 ; (T ; x := x + 1 ⊔ {false}) ; {y = x} , T = (y := 1 ⊔ y := 2 ; {false}) , then the combined contract is

S = x := 0 ; ((y := 1 ⊔_b y := 2 ; {false}_b) ; x := x + 1 ⊔_a {false}_a) ; {y = x}_a . Here the index indicates which agent has the obligation to satisfy the assertion.

Assumptions

Besides assertions, there is another notion that is important when making a contract between two parties. This is the assumptions that each of the parties makes, as their condition for engaging in the contract. We will express an assumption as [g], where g is a condition on the state. Consider as an example the contract [x ≥ 0] ; (x := x + y ⊔ x := x + z) ; y := x − 1 . Here the assumption of the agent carrying out this contract is that x ≥ 0 holds initially. If this is not the case, then the agent is released of any obligation to carry out his part of the contract. The assumption [true] is always satisfied, so we have that [true] = skip. The assumption [false] is never satisfied and will always release the agent from his obligation to continue with the contract. It is thus an impossible assumption. As with assertions, we permit assumptions to occur at any place in the contract statement. For instance, consider the statement x := 1 ; (x := x + y ; [x > 1] ⊔ x := x + z) ; y := x − 1 . Here the agent assumes that x > 1 after carrying out the assignment of the first choice. If this is not the case, the agent is released from its contract. Again, we notice that the agent does not have to choose the first alternative; he can choose the second alternative and carry out the contract to the end. Finally consider combining the contracts for two agents. As with assertions and choice, we also have to indicate explicitly for each assumption whether it is made by agent a or agent b.

1.2 Using Contracts

We use agents to achieve something, i.e., to establish some new, more desirable state. We describe the desired states by giving the condition that they have to satisfy (the postcondition). The possibility of an agent achieving such a desired state depends on the means it has available, i.e., on the functions that it can use to change the state. If the agent cannot achieve the required state for itself, then it needs to involve other agents.

Taking Sides

We will assume that the contract is always made between an agent whose goals we want to further (our agent) and another agent whose goals are not important to us. We denote our agent's choices by ⊔, and the choice of the other agent by ⊔◦. In a similar way, we distinguish between {g} and {g}◦ and between [g] and [g]◦. This avoids the introduction of arbitrary names for agents when analyzing a contract. We extend the language for describing contracts to include combined contracts with two different agents interacting. The syntax is then as follows, with assumptions and assertions added as new contract statements: S ::= ⟨f⟩ | {g} | {g}◦ | [g] | [g]◦ | S1 ; S2 | S1 ⊔ S2 | S1 ⊔◦ S2 . Combining the two contracts

S = x := 0 ; (T ; x := x + 1 ; {x = y} ⊔ skip) , T = [x ≥ 0] ; (y := 1 ⊔ y := 2) , where contract S is for our agent and contract T is for the other agent, results in the single contract S = x := 0 ; ([x ≥ 0]◦ ; (y := 1 ⊔◦ y := 2) ; x := x + 1 ; {x = y} ⊔ skip) . This is the contract statement from our agent's point of view. Given a contract for a single agent and a desired postcondition, we can ask whether the contract can establish the postcondition. This will again depend on the initial state, the state in which we start to carry out the contract. For instance, consider the contract S = x := x + 1 ⊔ x := x + 2 . The agent can establish postcondition x = 2 if x = 1 or x = 0 initially. When x = 1 initially, the agent should choose the first alternative; but when x = 0, the agent should choose the second alternative. The agent can also establish postcondition x = 1 with this contract if the initial state satisfies x = −1 or x = 0. From initial state x = 0, the agent can thus establish either x = 1 or x = 2, by suitable choice of alternatives.

The postcondition need not determine a unique final state. For instance, the agent may want to achieve either x = 1 or x = 2. The agent can establish this condition in an initial state where x = −1 or x = 0 or x = 1.

Satisfying a Contract

We need to be more precise about what it means for an agent to achieve some condition by satisfying a contract. Obviously, breaching the contract does not satisfy the contract. On the other hand, an agent cannot be required to follow a contract if the assumptions that it makes are violated. Violating the assumptions releases the agent from the contract. As the fault is not with the agent, it is considered to have satisfied the contract in this case. We can illustrate these points by considering first the statement S that consists of a single assertion action, say S = {x ≥ 0}. An agent can establish a condition q with this action only if x ≥ 0 holds (so that the contract is not broken) and if, in addition, q already holds (as the state is not changed if x ≥ 0 holds). Consider next S = [x ≥ 0]. If x ≥ 0 holds initially, then the effect of the action is just a skip. In this case q will be established provided that q already was true of the state before the action. If x ≥ 0 does not hold, then the assumptions that the agent makes are violated, so the agent is considered to have satisfied its contract to establish q. We will say that the agent can satisfy contract S to establish postcondition q in initial state σ if the agent either can achieve a final state that satisfies q without breaching the contract or is released from the contract by an assumption that is violated. Let us denote this by σ {| S |} q.

Two Agents

Consider now the situation where the agent is not alone in carrying out its contract but cooperates with another agent. Whether the contract can be used to establish a certain postcondition will then also depend on what the other agent does. In analyzing whether our agent can achieve a certain condition with S, we will assume that the other agent does not breach S with its actions. We will say that σ {| S |} q holds if

(i) assuming that the assumptions that our agent makes are satisfied, and (ii) assuming that the other agent does not breach the contract, then from state σ, our agent can always establish a final state that satisfies condition q without itself breaching the contract. When there is another agent involved, our agent has to be able to satisfy the contract no matter how the other agent makes its choices. This means that the agent has to take into account all possible alternatives

that the other agent has and be able to establish the desired condition no matter which alternatives are chosen by the other agent. The above definition means that our agent satisfies the contract (for any condition q) if the other agent breaches its contract. This relieves our agent from any further duties to follow the contract. It is considered to have satisfied the contract, as the fault was not with it. In particular, this means that a total breach of contract by the other agent, {false}◦, has the same effect as if our agent had made the impossible assumption [false]. Thus {false}◦ = [false] when we consider only what it means to satisfy a contract. In both cases, our agent is considered to have satisfied the contract, no matter what condition was to be established.
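The satisfaction relation σ {| S |} q just described can be computed directly for contracts built from the constructs introduced so far. The Haskell sketch below supersedes the minimal datatype above; it makes the simplifying assumption that the other agent's assertion {g}◦ is treated like our assumption [g], as the remark on {false}◦ = [false] suggests, and so gives it no separate constructor. The clauses mirror the text: an update transforms the state, an assertion must hold, a violated assumption releases our agent, an angelic choice needs some alternative to succeed, and a demonic choice needs every alternative to succeed.

data Contract s
  = Update (s -> s)                   -- <f>
  | Assert (s -> Bool)                -- {g}: breach unless g holds
  | Assume (s -> Bool)                -- [g]: release unless g holds
  | Seq (Contract s) (Contract s)     -- S1 ; S2
  | Angel (Contract s) (Contract s)   -- S1 ⊔ S2: our agent chooses
  | Demon (Contract s) (Contract s)   -- S1 ⊓ S2: the other agent chooses

-- sat s q st computes σ {| S |} q for state st and postcondition q.
sat :: Contract s -> (s -> Bool) -> s -> Bool
sat (Update f)    q st = q (f st)
sat (Assert g)    q st = g st && q st          -- state unchanged, but g must hold
sat (Assume g)    q st = not (g st) || q st    -- a violated assumption releases us
sat (Seq s1 s2)   q st = sat s1 (sat s2 q) st  -- establish "s2 can then establish q"
sat (Angel s1 s2) q st = sat s1 q st || sat s2 q st
sat (Demon s1 s2) q st = sat s1 q st && sat s2 q st

Evaluating sat on the earlier contract x := x + 1 ⊔ x := x + 2 with postcondition x = 2 yields True exactly for the initial values 0 and 1 of x, matching the discussion in Section 1.2.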

Many Agents

We have assumed above that there are at most two agents, our agent and the other agent. Let us now complete the picture by considering the situation where there are more agents. Consider first the situation where we have three agents, a, b, and c. Assume that a is our agent. Then it does not matter from our point of view whether the two other agents b and c are really separate agents, each with its own free will, or whether there is a single will that determines the choices made by both b and c. Therefore, in analyzing what can be achieved with a contract, we will denote a by ⊔, and b and c both by ⊔◦. Similarly, if our agent a cooperates with agent b to achieve some common goals, but agent c has different goals, then we would use ⊔ for a and b, while we would use ⊔◦ for c. In general, situations with more than two agents can always be reduced to the case where we only need to consider two agents: our agent and its allies, considered as a single agent, and all the other agents together, also considered as a single agent.

Contracts as Games

Our main concern with a contract is to determine whether we (our agent) can use the contract to achieve our stated goals. In this sense, our agent has to make the right choices in its cooperation with other agents, who are pursuing different goals that need not be in agreement with our goals. The other agents need not be hostile to our goals; they just have different priorities and are free to make their own choices. However, because we cannot influence the other agents in any way, we have to be prepared for the worst and consider the other agent as hostile if we want to achieve certainty of reaching a specific condition. As far as analyzing what can be achieved with a contract, it is therefore justified to consider the agents involved as the two opponents in a game. The actions that the

agents can take are the moves in the game. The rules of the game are expressed by the contract S: it states what moves the two opponents can take and when. Given an initial state σ, the goal for our agent is to establish a given postcondition q. The other agent tries to prevent our agent from establishing this postcondition. We will make this a little bit more dramatic and call our agent the angel and the other agent the demon. In the game, we talk about an angelic choice when the choice is made by our agent, and about demonic choice when the choice is made by the demon. A player in a game is said to have a winning strategy in a certain initial state if the player can win (by doing the right moves) no matter what the opponent does. Satisfaction of a contract now corresponds to the existence of a winning strategy: σ {| S |} q holds if and only if the angel has a winning strategy to reach the goal q when playing with the rules S, when the initial state of the game is σ . If our agent is forced to breach an assertion, then it loses the game directly. If the other agent is forced to breach an assertion, it loses the game, and our agent then wins. In this way, the angel can win the game either by reaching a final state that satisfies the stated condition, by forcing the demon to breach an assertion, or by choosing an alternative where one of its own assumptions is false. In all other cases, the angel loses the game.

1.3 Computers as Agents

We have not yet said what agents are. They could be humans, they could be computers, or they could be processes in a computer system. They could be programmers building software systems. A central feature of these agents is that they can make free choices. This means that the behavior of agents is nondeterministic, in the sense that it cannot be predicted with certainty. Explaining free will for humans is a well-known problem in philosophy. For computers, the fact that we need to postulate a free will might be even more confusing, but actually, it is not. Basically, nondeterminism in computer systems comes from information hiding. Certain information that determines the behavior of a computing agent is not revealed to other agents interacting with this agent. Therefore, behavior that is actually deterministic will appear nondeterministic to an outside agent because it cannot see the reasons for the choices made. It will appear as if the agent is making arbitrary choices, when in fact these choices may be completely determined by some hidden information. Typically there are some internal, hidden, state components that are not visible to outside viewers. A choice might also depend on some natural random phenomena, or it may depend on time-critical features that are unknown to the other agents. Information hiding is the central principle in designing a large computer or software system as a collection of cooperating agents. The purpose is to restrict information

disclosed about an agent's behavior to a minimum, in order to keep as many options open as possible for later changes and extensions of the behavior. A common situation is that the user of a computer system is seen as our agent (the angel), the other agent (the demon) being the computer system itself. Then a contract corresponds to the protocol by which the user interacts with the computer. The protocol lays down the alternatives that the user has available in his interaction with the computing agent (menu choices, executable commands, etc.). In a text editor, these alternatives could include opening a new file for editing, typing some text into the editor, and saving the file. There may also be certain restrictions that have to be satisfied when using the operations of the system, such as that the user can only open a file that is registered in the file system. These restrictions are assumptions from the point of view of the system. The response to violating restrictions can vary, from a polite warning or error message to a system crash or the system being locked in an infinite loop. The restrictions are thus assertions for the user, who must satisfy them if he wants to achieve his goals. Another common situation is that both agents are components of a computer system. For example, one component/agent could be a client, and the other could be a server. The client issues requests to the server, who carries out the requests according to the specification that has been given for the possible requests. The server procedures that are invoked by the client's requests have conditions associated with them that restrict the input parameters and the global system state. These conditions are assertions for the client, who has to satisfy them when calling a server procedure. The same conditions are assumptions for the server, who may assume that they hold when starting to comply with a request. The server has to achieve some final state (possibly returning some output values) when the assumptions are satisfied. This constitutes the contract between the client and the server. There is an interesting form of duality between the client and the server. When we are programming the client, then any internal choices that the server can make increase our uncertainty about what can happen when a request is invoked. To achieve some desired effect, we have to guard ourselves against any possible choice that the server can make. Thus, the client is the angel and the server is the demon. On the other hand, when we are programming the server, then any choices that the client makes, e.g., in choosing values for the input parameters, increases our uncertainty about the initial state in which the request is made, and all possibilities need to be considered when complying with the request. Now the server is the angel and the client is the demon.

1.4 Algebra of Contracts

A contract statement regulates the means for achieving something. Since we are only interested in our own agent, we see the contract as a tool for our agent to achieve some goals. If we compare two contract statements for our agent, say S and S′, then we can say that the latter is at least as good as the former if any condition that we can establish with the first contract can also be established with the second contract. We then say that S is refined by S′ and denote this by S ⊑ S′. More formally, we can define S ⊑ S′ to hold if σ {| S |} q ⇒ σ {| S′ |} q, for any σ and q.

Evidently S ⊑ S always holds, so refinement is reflexive. It is also easy to see that refinement is transitive: S ⊑ S′ and S′ ⊑ S″ implies S ⊑ S″. We will also postulate that two contracts are equal if each refines the other. This means that the two contracts are equally good for achieving any conditions that our agent might need to establish. This gives us antisymmetry for contracts: S ⊑ S′ and S′ ⊑ S implies S = S′. In terms of satisfying contracts, we then have that S = S′ if and only if σ {| S |} q ≡ σ {| S′ |} q, for any σ and q. These properties imply that contracts form a partial ordering.
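Refinement as defined here quantifies over all initial states and all postconditions, so in general it cannot be established by enumeration; over an explicitly listed finite set of states and candidate postconditions, however, it becomes a simple test. The following Haskell sketch expresses that test for any satisfaction function of the shape sketched above; the parameter names are illustrative only.

-- S ⊑ S': anything our agent can establish with S it can also establish with S'.
refines :: [s] -> [s -> Bool] -> (c -> (s -> Bool) -> s -> Bool) -> c -> c -> Bool
refines states posts satisfies s s' =
  and [ not (satisfies s q st) || satisfies s' q st | q <- posts, st <- states ]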

It is evident that {false} ⊑ S for any contract S, because we cannot satisfy the contract {false} in any initial state, for any final condition. Hence, any contract is an improvement over this worst of all contracts. Also, it should be clear that S ⊑ [false]: the assumptions of contract [false] are never satisfied, so this contract is satisfied in any initial state for any final condition. Hence, this is the best of all contracts. This means that the partial order of contracts is in fact bounded: it has a least and a greatest element. Consider next the contract S = x := x + 1 ⊔ x := x + 2. Then σ {| S |} q holds if our agent, by choosing either x := x + 1 or x := x + 2, can establish q in initial state σ. Thus, we have that σ {| x := x + 1 ⊔ x := x + 2 |} q iff σ {| x := x + 1 |} q or σ {| x := x + 2 |} q . For instance, we have that x = 0 {| x := x + 1 ⊔ x := x + 2 |} x = 1 holds, because x = 0 {| x := x + 1 |} x = 1 . (Here, x = 0 stands for the predicate that holds in exactly those states in which the value of x is 0, and similarly for x = 1.) Consider then the contract x := x + 1 ⊔◦ x := x + 2. In this case, the other agent is making the choice. Hence, our agent can be certain to satisfy the contract to achieve a certain final condition q (in initial state σ) if and only if he can satisfy

both x := x + 1 and x := x + 2 to achieve the required final conditions. In other words, we have that σ {| x := x + 1 ⊔◦ x := x + 2 |} q iff σ {| x := x + 1 |} q and σ {| x := x + 2 |} q . This reflects the fact that our agent cannot influence the choice of the other agent, and hence must be able to satisfy whichever contract the other agent chooses. The contracts will in fact form a lattice with the refinement ordering, where ⊔ is the join in the lattice and ⊔◦ is the meet. Following standard lattice notation, we will, therefore, in what follows write ⊓ for ⊔◦. The impossible assertion {false} is the bottom of the lattice of contracts, and the impossible assumption [false] is the top of the lattice. We will generalize the notion of a contract to permit choices over an arbitrary set of contracts, ⊔{Si | i ∈ I} and ⊓{Si | i ∈ I}. The set of contracts may be empty, or it may be infinite. With this extension, the contracts will in fact form a complete lattice. The sequential composition operation is also important here. It is easy to see that contracts form a monoid with respect to the composition operation, with skip as the identity element. Contracts as we have described them above thus have a very simple algebraic structure; i.e., they form a complete lattice with respect to ⊑, and they form a monoid with respect to sequential composition. A further generalization of contracts will permit the initial and final state spaces to be different. Thus, the contract may be initiated in a state σ in Σ, but we permit operations in the contract that change the state space, so that the final state may be in another state space Γ. The simple monoid structure of contracts is then not sufficient, and we need to consider the more general notion of a category of contracts. The different state spaces will form the objects of the category, while the morphisms of the category are the contracts. The skip action will be the identity morphism, and composition of morphisms will be just the ordinary sequential composition of actions. We will explain these algebraic concepts in more detail in the next chapter. Here we have only indicated the relevance of the lattice and category-theoretic concepts in order to motivate the reader to study and appreciate these notions.

1.5 Programming Constructs

We have above given a rather restricted set of statements for describing contracts. As we argued above, contracts can be used to describe the way in which program components interact with each other. Hence, we should be able to describe traditional programming constructs, such as conditional statements, iteration and recursion, as contracts. This is what they essentially are: statements that oblige the computer system to carry out specific sequences of actions.

Conditional Contracts

Consider first a conditional statement such as if x ≥ 0 then x := x + 1 else x := x + 2 fi . We can consider this a conditional contract, defined in terms of previous constructs, as follows: if x ≥ 0 then x := x + 1 else x := x + 2 fi = {x ≥ 0} ; x := x + 1 ⊔ {x < 0} ; x := x + 2 . Thus, the agent can choose between two alternatives. The agent will, however, always choose only one of these, the one for which the guard x ≥ 0 is true, because choosing the alternative where the guard is false would breach the contract. Hence the agent does not have a real choice if he wants to satisfy the contract. We could also define the conditional in terms of choices made by the other agent as follows: if x ≥ 0 then x := x + 1 else x := x + 2 fi = [x ≥ 0] ; x := x + 1 ⊓ [x < 0] ; x := x + 2 . These two definitions are equivalent. The choice made by the other agent is not controllable by our agent, so to achieve some desired condition, our agent has to be prepared for both alternatives, to carry out x := x + 1 assuming x ≥ 0, and to carry out x := x + 2 assuming x < 0. If the other agent is to carry out the contract without violating our agent's assumptions, it has to choose the first alternative when x ≥ 0 and the second alternative when x ≥ 0 is false.

The guarded conditional statement introduced by Dijkstra [53], of the form if g1 → S1 [] ··· [] gm → Sm fi, is a generalization of the traditional conditional statement. In this case, the guards g1, ..., gm need not be mutually exclusive (as they are in the traditional conditional statement). The choice between executing Si or Sj is done nondeterministically when the two guards gi and gj both hold. Dijkstra's interpretation of the nondeterminism is that it should not be controllable. In our framework, this means that the choice is demonic. This gives the following definition:

if g1 → S1 [] ··· [] gm → Sm fi =

{g1 ∨ ··· ∨ gm} ; ([g1] ; S1 ⊓ ··· ⊓ [gm] ; Sm) . The first assert statement tests that at least one of the guards is true in the state. If not, then the result is an abortion of the execution, which amounts to our agent breaching the contract. If at least one of the guards holds in the state, then the

other agent will choose which alternative to continue with. To avoid breaching the contract, the other agent chooses an alternative for which the guard is true. If we take the first definition of the traditional conditional statement as the starting point, we get another generalized conditional statement, which has a very different interpretation. Here the nondeterminism is controllable by our agent; the choice is angelic:

if g1 :: S1 [] ··· [] gm :: Sm fi = {g1} ; S1 ⊔ ··· ⊔ {gm} ; Sm . This statement will again cause a breach of the contract if none of the assertions holds in the state. If at least one of the assertions holds, then our agent gets to choose the alternative to continue with. This construct can be interpreted as a context-sensitive menu choice: the alternatives from which our agent chooses are items on a menu, and the assertions determine which of these items are enabled in a particular state.
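Both generalized conditionals can be written as functions that build contract terms. The Haskell sketch below uses the Contract datatype from the earlier satisfaction sketch, so those constructors are assumed to be in scope, and it expects a nonempty list of guard and statement pairs; guardedIf is the demonic, Dijkstra-style conditional and menuIf the angelic one.

-- if g then s1 else s2 fi  =  {g} ; s1 ⊔ {not g} ; s2
cond :: (s -> Bool) -> Contract s -> Contract s -> Contract s
cond g s1 s2 = Angel (Seq (Assert g) s1) (Seq (Assert (not . g)) s2)

-- if g1 -> S1 [] ... [] gm -> Sm fi  =  {g1 or ... or gm} ; ([g1] ; S1 ⊓ ... ⊓ [gm] ; Sm)
guardedIf :: [(s -> Bool, Contract s)] -> Contract s
guardedIf alts =
  Seq (Assert (\st -> any (\(g, _) -> g st) alts))
      (foldr1 Demon [ Seq (Assume g) s | (g, s) <- alts ])

-- if g1 :: S1 [] ... [] gm :: Sm fi  =  {g1} ; S1 ⊔ ... ⊔ {gm} ; Sm
menuIf :: [(s -> Bool, Contract s)] -> Contract s
menuIf alts = foldr1 Angel [ Seq (Assert g) s | (g, s) <- alts ]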

Recursive Contracts

We can make the language of contracts even more interesting from a programming point of view by permitting recursive statements of contracts. This can be defined by extending the syntax as follows:

S ::= ··· | X | (µX • S1) .

Here X is a variable that ranges over contract statements, while (µX • S1) is the contract statement S1, where each occurrence of X in S1 is interpreted as a recursive invocation of the contract S1. It is more customary to define a recursive contract by an equation of the form

X = S1 ,

where S1 usually contains some occurrences of X. This defines the recursive contract (µX • S1). An example of a recursive contract is

X = (x := x + 1 ; X ⊔ skip) . Here the agent can choose between increasing x by one and repeating the process, or finishing. In this way, the agent can set x to any value that is equal to or greater than the initial value of x. We will later show that we can introduce recursive contracts in terms of infinite choices, so we do not actually need to postulate recursive contracts. Like the conditional contract, they can be introduced as convenient abbreviations defined in terms of more primitive contracts.

A recursive contract introduces the possibility of nontermination, so we have to decide what nontermination means in our framework. We will interpret nontermination as a breach of contract by our agent, making this something to be avoided. This corresponds to intuition: delaying a task indefinitely is not considered to be an acceptable way of satisfying a contract.

Iteration

Iteration is defined in terms of recursion. For example, the standard while loop is defined as follows:

while g do S od = (µX • if g then S ; X else skip fi). An example is the following little contract, which applies the function f to an initial value x = a until the value becomes larger than some fixed constant b:

x := a ; while x ≤ b do x := f.x od . The same contract can be defined as x := a ; X where X is defined by the recursive definition

X = if x ≤ b then x := f.x ; X else skip fi .
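Recursive contracts can be approximated by finite unfolding, which is enough to experiment with loops like the one above. The Haskell sketch below builds on the Contract datatype and the cond function of the earlier sketches: the body of (µX • S) is unfolded a fixed number of times, and any deeper recursive call is replaced by the impossible assertion {false}, so a computation that would have to recurse further counts as a breach, in line with the interpretation of nontermination as a breach of contract given above.

approx :: Int -> (Contract s -> Contract s) -> Contract s
approx 0 _    = Assert (const False)      -- deeper recursion counts as a breach
approx n body = body (approx (n - 1) body)

-- while g do s od = (µX • if g then s ; X else skip fi), approximated to depth n.
whileDo :: Int -> (s -> Bool) -> Contract s -> Contract s
whileDo n g stmt = approx n (\rest -> cond g (Seq stmt rest) (Update id))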

1.6 Specification Constructs

A program is usually built as a collection of interacting components. When programming a specific component, we consider the other components to be controlled by other agents. We may need some of these components in order to carry out the task set for our component, but we cannot influence the choices made by the other agents. The situation is analogous to a contractor using subcontractors. A specification of a program component is a contract that gives some constraints on how the component is to behave but leaves freedom for the other agent (the subcontractor or the implementer) to decide how the actual behavior of the component is to be realized. A specification therefore often involves uncontrollable choice between a number of alternatives. For instance, if we want to specify that the variable x is to be set to some value between 0 and 9, then we can express this as the contract x := 0 ⊓ x := 1 ⊓ ··· ⊓ x := 9 . Then our agent does not know precisely which alternative will be chosen by the other agent, so whatever it wants to achieve, it should achieve it no matter which alternative is chosen.

This contract can be improved from the point of view of our agent by getting rid of the uncertainty. We have, e.g., that

x := 0 ⊓ x := 1 ⊓ ··· ⊓ x := 9 ⊑ x := 3 . This means that x := 3 is a correct implementation of the specification. Any other statement x := n, n = 0, ..., 9, is also a refinement and thus a correct implementation of the specification. Moreover, any choice that cuts down on the number of alternatives will also be a refinement of the specification. An example is

x := 0 ⊓ x := 1 ⊓ ··· ⊓ x := 9 ⊑ x := 3 ⊓ x := 7 . This explains the central importance of the notion of refinement in analyzing contracts: it captures and generalizes the notion of implementing a specification. In some situations we need to consider an infinite number of alternatives. For instance, to set x to any nonnegative value, we can use the following specification: ⊓{x := n | n ≥ 0} . Here one of the statements x := n is chosen from the infinite set of possible statements. As an executable statement, this construct would be difficult to implement in its full generality (in the sense that any possibility could be chosen in practice). However, when we look at it as a contract, it is perfectly acceptable. It just states some constraint on what the other agent can choose, where the constraint happens to be such that it permits an infinite number of possible choices for the other agent.

Relational Assignments

We have assumed that an agent can change the state by applying a state-changing function f to the present state. More generally, we can assume that the agent can change the state by applying a relation R to the present state. In a given state σ, the agent is free to choose as next state any state σ′ that satisfies σ R σ′. Let us write {R} for the action that changes the state in this way. We consider {R} to be a contract that shows the freedom that the agent has in choosing the next state. However, the agent must choose one of the states permitted by R. If there is no state σ′ such that σ R σ′ holds, then the agent is forced to breach the contract. Thus the agent can satisfy the contract {R} to establish condition q in initial state σ if and only if there is some state σ′ such that σ R σ′ and σ′ ∈ q. The contract {R} describes the freedom our agent has in choosing the next state. The contract {R}◦ is denoted by [R]. Here the choice of the next state is made by the other agent, and our agent has no influence on this choice. If there is no state σ′ such that σ R σ′, then the other agent is forced to breach the contract. This means that our agent can satisfy the contract [R] to establish condition q in initial state σ if σ′ ∈ q for every σ′ such that σ R σ′. In particular, our agent will satisfy the contract if there is no σ′ such that σ R σ′. Keeping the game-theoretic interpretation of contracts in mind, we refer to {R} as an angelic update and to [R] as a demonic update. We can generalize the assignment x := e to a relational assignment (x := a | b). An example is the relational assignment (x := n | n > x). This relates the initial state σ to a final state σ′ when σ′ = set.x.n.σ and n > val.x.σ hold. In other words, the relational assignment nondeterministically chooses some value n that satisfies the condition n > val.x.σ and assigns it to x. In general, (x := a | b) relates states σ and σ′ to each other when σ′ = set.x.a.σ and b holds in σ. We can now describe angelic updates like {x := n | n > x} and demonic updates like [x := n | n > x], where the relation between initial and final states is described by (x := n | n > x). We can use demonic updates as specifications. We have, e.g., that

[x := n | n > x] ⊑ x := x + 2 ,

so x := x + 2 is a possible implementation of this specification, because the number of possible final states that the other agent can choose in a given initial state is restricted to just one. Another possible implementation is x := x + 1 ⊓ x := x + 2, which is less deterministic than the first, but still much more deterministic than the original specification.

The angelic update can also be used as a specification. In this case, the contract {x := n | n > x} expresses that we are free to choose a new value for x that is greater than the present value. A refinement in this case is a contract that gives our agent even more freedom to choose. For instance,

{x := n | n > x} ⊑ {x := n | n ≥ x} .

Here the refinement adds the possibility of not increasing the value of x, so there is at least one more alternative to choose from. In general, a refinement decreases the number of choices for the other agent and increases the number of choices for our agent.
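The two satisfaction conditions just stated can be transcribed into a small executable sketch (here in Haskell, which is of course not the notation of this book). The finite truncation of the relation for (x := n | n > x) and all of the names are assumptions made only to keep the sketch runnable; it is meant as an illustration of the angelic and demonic readings, not as a definition.

    type State = Int
    type Pred  = State -> Bool
    type Rel   = State -> [State]   -- sigma is mapped to the states sigma' with sigma R sigma'

    -- {R}: our agent chooses the next state, so it can establish q
    -- exactly when some permitted next state satisfies q.
    angelic :: Rel -> Pred -> Pred
    angelic r q sigma = any q (r sigma)

    -- [R]: the other agent chooses, so our agent can count on q only when
    -- every permitted next state satisfies q (vacuously true when none exists).
    demonic :: Rel -> Pred -> Pred
    demonic r q sigma = all q (r sigma)

    -- The relation of (x := n | n > x), truncated to five successors:
    increase :: Rel
    increase x = [x + 1 .. x + 5]

    -- demonic increase (> 0) 3   evaluates to True
    -- angelic increase (> 7) 3   evaluates to True (our agent can pick 8)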

Pre–postcondition Specifications

A common way to specify a program statement is to give a precondition p and a postcondition q on the state that the implementation must satisfy, assuming that only certain program variables are changed. Consider as an example the specification with the precondition x ≥ 0 and postcondition x′ > x, where x′ stands for the new value of the program variable. Assume further that only the program variable x may be changed. This pre–postcondition specification is expressed by

the contract

{x ≥ 0} ; [x := x′ | x′ > x] .

Thus, the contract is breached unless x ≥ 0 holds in the state initially. If this condition is true, then x is assigned some value x′ for which x′ > x holds in the initial state.
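As a rough illustration (again only a sketch, with names of our own choosing and the same finite truncation as before), the condition under which our agent can satisfy this pre–postcondition contract to establish a postcondition q can be written out directly: the assertion must hold in the initial state, and q must hold in every final state the other agent may choose.

    type State = Int
    type Pred  = State -> Bool

    -- Satisfaction condition for {pre} ; [R] in an initial state:
    -- pre must hold there, and q must hold in every state permitted by R.
    canSatisfy :: Pred -> (State -> [State]) -> Pred -> State -> Bool
    canSatisfy pre rel q sigma = pre sigma && all q (rel sigma)

    -- The contract {x >= 0} ; [x := x' | x' > x], with the update truncated:
    prePost :: Pred -> State -> Bool
    prePost = canSatisfy (>= 0) (\x -> [x + 1 .. x + 5])

    -- prePost (> 0) 3     evaluates to True
    -- prePost (> 0) (-1)  evaluates to False: the assertion x >= 0 is breached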

Contracts as Generalized Specifications

Consider the following example of a contract that uses both kinds of nondeterminism:

{x, e := x0, e0 | x0 ≥ 0 ∧ e0 > 0} ; [x := x1 | −e ≤ x − x1² ≤ e] .

The purpose of this contract is to compute the square root of a number. Our agent first chooses new values x0 and e0 for x and e such that the condition x0 ≥ 0 ∧ e0 > 0 holds. Then the other agent chooses some new value x1 for x such that the condition −e ≤ x − x1² ≤ e holds. This contract requires that our agent first choose a value x0 for which the square root should be computed and a precision e0 with which the square root should be computed. Then the other agent is given the task of actually producing a new value x1 as the required square root. The computed value is not uniquely determined by the specification, but is known only up to the given precision.

The above contract describes a simple interaction between a user of a square-root package and the system that computes the square root. Contracts generalize the traditional notion of a pre–postcondition specification to more intricate user interactions with a system (or client interactions with a server). In general, the interaction may be much more involved and may require further choices by the user of the system and the system itself during the course of computation. For instance, the following describes a simple event loop interaction:

(µ X • if g1 :: S1 [] · · · [] gm :: Sm fi ; X ⊔ skip) .

Here the user of the system repeatedly executes one of the actions that is enabled until he decides to stop the repetition. The user will be forced to stop the repetition if none of the actions is enabled, but he may otherwise freely repeat the loop as many times as he desires.
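Returning to the square-root contract above, the demonic half of that interaction can be illustrated by enumerating, on a finite grid, the values x1 that the other agent is permitted to return for a given x and e. The grid step is an assumption of this sketch, chosen only to keep the search finite; the contract itself places no such restriction.

    -- Values x1 satisfying -e <= x - x1^2 <= e, searched on a finite grid.
    permittedRoots :: Double -> Double -> [Double]
    permittedRoots x e = [ x1 | x1 <- [0, step .. x + 1], abs (x - x1 * x1) <= e ]
      where step = e / 4

    -- permittedRoots 2 0.01 contains several values near 1.4142: the result is
    -- determined only up to the precision e, exactly as described above.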

1.7 Correctness

Let S be a contract statement, and let p and q be two predicates (the precondition and the postcondition). Then we say that S is correct with respect to p and q, denoted by p {| S |} q, if σ {| S |} q holds for every σ that satisfies p. Program statements are special kinds of contract statements, so this also defines correctness for program statements. Thus p {| S |} q expresses that for any initial state in p, our agent can choose an execution of S that either establishes q or leads to some of its assumptions being violated.

Let us compare this definition with the traditional notion of total correctness of programs. This requires that program S be guaranteed to terminate in a state that satisfies postcondition q whenever the initial state satisfies precondition p. Compared to a program, our statements are more powerful; they can describe behavior that is not usually associated with programs. Programs are assumed to have only demonic nondeterminism (or sometimes only angelic nondeterminism) but not both kinds of nondeterminism. Breaching the contract can be seen as a form of nontermination (abortion), but there is no counterpart to breaching an assumption for ordinary programs. Total correctness takes demonic nondeterminism into account by requiring that the program be guaranteed to terminate in some desired final state no matter what demonic choices are made. Our definition extends this by requiring that there be a way for our agent (the angel) to make its choices so that it can establish the required postcondition no matter what demonic choices are made. If there are no angelic choices or assumptions, then our definition coincides with the traditional notion of total correctness for nondeterministic programs.
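When the state space is finite and a contract is represented, as in the earlier sketches, by the condition under which our agent can satisfy it, the correctness condition p {| S |} q becomes a simple check; that representation is an assumption of the sketch, not the definition given above.

    type State = Int
    type Pred  = State -> Bool

    -- S is correct with respect to p and q when every initial state satisfying p
    -- is one from which our agent can satisfy S to establish q.
    correct :: [State] -> Pred -> (Pred -> Pred) -> Pred -> Bool
    correct states p canS q = and [ canS q sigma | sigma <- states, p sigma ]

    -- For the deterministic update x := x + 2:
    -- correct [-10 .. 10] (>= 0) (\q x -> q (x + 2)) (> 0)   evaluates to True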

Correctness Problems

Our general goal is to construct program statements (or more general contract statements) that are correct with respect to some given specification. This is the programming problem. Of course, programs are also required to satisfy other criteria, such as efficiency and portability, but we concentrate here on correctness. There are, however, different ways of approaching this problem.

(i) We may assume that precondition p and postcondition q are given and that statement S has already been constructed by some means. We are asked to prove that p {| S |} q holds. This is the correctness problem, the traditional center of interest in programming.

(ii) The precondition p and postcondition q are given. Our task is to derive a suitable program statement S such that p {| S |} q holds. This is the program derivation problem, which is one of the central topics of this book.

(iii) The program statement S is given. We want to find a suitable precondition p and postcondition q such that p {| S |} q holds. This is the program analysis problem; we ask what the program does. This is sometimes referred to as reverse engineering. If the postcondition q is also given, then we need to find a suitable precondition p such that p {| S |} q holds. This is precondition analysis: we want to determine the conditions under which the program behaves as required. Alternatively, the precondition p could be given, and we should

find a suitable postcondition q such that p {| S |} q holds. This is postcondition analysis; we want to describe the effect of a program statement for certain initial states.

(iv) We want to find an interesting precondition p, postcondition q, and program statement S such that p {| S |} q holds. This could be called explorative programming. We construct the program and its specification hand in hand, making the program and its specification more precise as part of the program construction process. A variant of explorative programming arises when the postcondition q is given, and the task is to find an interesting statement S and a precondition p such that p {| S |} q holds. This leads to a form of backward program derivation, where we try to derive from the postcondition a program that will achieve this postcondition. Another variant is that the precondition p is given, and we try to derive an interesting statement S and a postcondition q such that p {| S |} q holds. This kind of forward program derivation is probably the least interesting of the problems to be considered, because the precondition usually does not contain much information and does not give us much direction in searching for interesting programs.

The word suitable above is important, because for many problems there are trivial solutions that are immediately available. For the program derivation problem, [false] will satisfy any pre- and postcondition. For precondition analysis we can always choose the trivial precondition false, and for postcondition analysis we can often at least get true as a feasible postcondition (when the program statement is guaranteed to terminate).

The refinement calculus is a framework for studying these different kinds of questions. In what follows we will look at some of these paradigms in somewhat more detail.

1.8 Refinement of Programs

A central application of the refinement calculus is the derivation of a program that satisfies a given specification. The method for program derivation that we will be particularly interested in is known as stepwise refinement. The basic idea is as follows. We start with a high-level specification of what a program is required to do. This specification is then implemented by a program statement that achieves what the specification says, but usually contains parts that are not implemented but only specified. These subspecifications are then implemented by other program statements, possibly containing other unimplemented parts in turn. This process continues until all specifications have been implemented by the program text, and we then have an executable program. Thus, this classic stepwise refinement method proceeds in a top-down fashion.

Besides this top-down method for program development, another idea that was merged with the stepwise refinement method was that of program transformation. Here a program is changed by applying a transformation that changes it in a way that preserves the correctness of the original program. Transformations of source text, such as loop folding and unfolding, data refinement, and program optimization, are examples of this approach.

The refinement relation between contract statements is defined so that it preserves correctness: if S ⊑ S′ and p {| S |} q holds, then p {| S′ |} q will also hold. This is the case for any choice of precondition p and postcondition q. The formalization of the stepwise refinement method in the refinement calculus is based on this observation. We start with an initial statement S0 that satisfies some correctness criteria that we have been given; i.e., we assume that p {| S0 |} q. Then we derive a sequence of successive refinements

S0 ⊑ S1 ⊑ · · · ⊑ Sn .

By transitivity, we have that S0 ⊑ Sn, and because refinement preserves correctness, we have that p {| Sn |} q also holds. Thus, we have constructed a new statement Sn from the original statement S0, where the new statement also satisfies our initial requirement.

An individual refinement step Si ⊑ Si+1 can be established by, e.g., implementing some specification statement in Si to get Si+1, by applying some transformation rule to derive Si+1 from Si, or by some other means. Sometimes, we need a whole subderivation to establish that the refinement step Si ⊑ Si+1 is valid. The methods that we use for calculating or validating a refinement step form a collection of so-called refinement laws. These tell us how to find suitable refinement steps and under what conditions these steps are valid.
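For contracts represented as in the previous sketches, a refinement step can likewise be tested on a finite state space: Si+1 must establish every postcondition that Si establishes, from every initial state. Sampling a few postconditions instead of quantifying over all of them is an approximation made only for this sketch.

    type State = Int
    type Pred  = State -> Bool
    type Trans = Pred -> Pred   -- condition under which our agent can satisfy the contract

    refines :: [State] -> [Pred] -> Trans -> Trans -> Bool
    refines states posts s s' =
      and [ s' q sigma | q <- posts, sigma <- states, s q sigma ]

    -- The specification [x := n | n > x] (truncated) and its implementation x := x + 2:
    specT, implT :: Trans
    specT q x = all q [x + 1 .. x + 5]
    implT q x = q (x + 2)

    -- refines [0 .. 5] [(> 3), even, (>= 0)] specT implT   evaluates to True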

The purpose of the derivation is to find a program Sn that in some ways is better than the original program S0. This program could be more space or time efficient, it could be implementable on a specific computer system or in a specific programming language (whereas the original one may not have been), or it could have other desirable properties that the original program was missing. The determination of these criteria and checking that the proposed refinements are indeed improvements over the original statements is considered to be outside the scope of this book. We focus on methods for proving that a refinement preserves correctness and on finding general rules that are known to preserve correctness.

1.9 Background

We give below a brief overview of the background of the refinement calculus, concentrating on earlier developments in this area. More recent work is described and referenced in the subsequent chapters.

The origins of the refinement calculus are in the stepwise refinement method for program construction that was put forward by Edsger W. Dijkstra [51] and Niklaus Wirth [141]; the transformational approach to programming, put forward by Susan Gerhart [61] and by Rod Burstall and John Darlington [43]; and the early work on program correctness and data refinement by C.A.R. Hoare [84, 87]. The purpose of the refinement calculus is to provide a solid logical foundation for these methods, based on the weakest precondition approach to total correctness of programs proposed by Dijkstra [53].

The refinement calculus is a further development of the weakest precondition theory. It was invented by Ralph-Johan Back in his Ph.D. thesis in 1978 [6] and shortly thereafter described in book form [7]. This work introduced the central notions of the calculus: the refinement relation between program statements, with the emphasis on preserving total correctness rather than partial correctness of programs; modeling program statements as predicate transformers; using specification statements as primitive program statements; the emphasis on monotonic program constructs to guarantee that component replacement preserves correctness; using assertion statements for refinement in context; and data refinement. The basic refinement rules for program statements were identified and proved to be sound, as were the basic rules for data refinement of program statements.

A few journal papers describing the approach followed in the next few years. Specifications treated as executable statements were described by Back in a 1981 article [10]. The general notion of program refinement was also described that same year [9], and the semantics of statements with unbounded nondeterminism (which was the result of adding arbitrary specification statements to a programming language) was described in yet another article in the same year [8]. However, the topic did not attract much interest from other researchers in the field at that time.

Interest in refinement calculus was revived around 1987–88. An overview and update of the refinement calculus approach was published by Back [11, 12], with a better treatment of iteration and recursion, among other things. Back and Kaisa Sere applied the refinement calculus to stepwise construction of parallel and reactive programs and to the refinement of atomicity in parallel programs [15, 22, 24]. This work was based on the action system framework that had been developed earlier by Back and Reino Kurki-Suonio [20]. This work uncovered a number of interesting foundational issues in the refinement calculus and also convinced the authors of the general usability of the approach in practice.

Carroll Morgan [102, 103, 104] and Joseph Morris [108] also started to publish papers on this topic around 1987–88. Both were essentially using the original refinement calculus framework, but they extended it with different specification statements and with miraculous statements (our assumptions), which were not allowed in the original calculus. Furthermore, Morris used a fixed-point approach to handle recursion. Greg Nelson [113] also studied miraculous statements, although he had a slightly different approach to program refinement (requiring both total and partial correctness to be preserved). Wim Hesselink initiated the study of algebraic properties of program statements, based on the predicate transformer interpretation, but he concentrated on equivalence rather than refinement [77].

Morgan later published an influential book on program refinement [106]. This book incorporated his extensions to the refinement calculus and contained a number of case studies that described how to derive programs from specifications in the refinement calculus framework. This established the applicability of the calculus to practical programming problems. The book reformulates the original refinement calculus in a notation that is influenced by the Z specification style [133].

Another driving force for the further development of the calculus was the need to reengineer its mathematical basis. The lattice-theoretic study of refinement was initiated in a paper by Back and Joakim von Wright in 1989 [25, 27] on the duality in specification languages, where the fundamental program statements were identified and analyzed. In particular, this work introduced the central themes of angelic and demonic nondeterminism in the calculus. The use of lattice theory has led to a considerable simplification of the theory, as compared to earlier attempts, and made it much more streamlined. At the same time, the theory has also become much more general, and new applications for these generalizations have suggested themselves along the road.

The original formalization of the refinement calculus was done in infinitary logic [10]. The need for a logic stronger than first-order logic was evident already then, but the step to a full higher-order logic was taken only later by Back and von Wright [28, 30]. The higher-order logic formalization has since then been extended and reworked considerably, as documented in this book. A complete implementation of the refinement calculus has also been made by von Wright [144] using the HOL theorem-proving system [62]. The HOL system has influenced a number of design decisions in our formalization of the calculus. Many of the theorems presented in this book have been checked mechanically for correctness and now form the basis for a mechanized program-refinement environment that is being developed in a project led by von Wright [45, 97].

There are many other authors besides those mentioned above who have contributed to the development of the refinement calculus. Some of the work has aimed at extending the domain of applicability of the calculus to new areas, such as parallel programs [23], reactive programs [14, 31], and object-oriented programs [111, 129, 138]. Others have been looking at larger-scale applications of refinement methods [21, 117]. Theory development has also continued and has brought many new insights into this area [46, 58, 112]. A rather active research area is to build better computer-supported tools for program refinement [47, 67]. There has also been considerable work on transporting some of the techniques from the refinement calculus framework to other formalisms for program development, such as the CSP process algebra [44].

1.10 Overview of the Book

The previous sections have given an informal introduction to some of the main concepts that we will study in this book. The first part of the book covers the foundations of the refinement calculus. As indicated above, lattices form the main algebraic structures needed in the refinement calculus, so the book opens with an overview of some basic notions of lattice theory. More lattice theory is introduced later on as needed. Higher-order logic, which we use as the logical framework for reasoning about refinement, is presented next. We give an overview of higher-order logic, with a lattice-oriented axiomatization of this logic, and introduce the derivational proof style that we are using throughout the book. We show how to reason about functions, about truth values, sets, predicates, and relations. We also study the properties of states with attributes and show how to formalize the notion of program variables in higher-order logic.

Very simple contracts with no choices (deterministic programs) can be modeled as functions from states to states. More complicated contracts may be modeled as relations on states, provided that all choices are made by only one agent. General contracts, where choices can be made by two or more agents, are modeled as predicate transformers, i.e., functions that map postconditions to preconditions. This is the central notion to be studied in the second part of the book. We will describe the basic algebraic properties of predicate transformers. The notions of correctness and refinement of predicate transformers will also be defined precisely. The lattice-theoretic interpretation of contracts is a direct extension of set theory, so programs, specifications, and contracts in general share many properties with sets. This provides a familiar framework in which to reason about correctness and refinement of programs. We will show that contract statements in fact can be identified with the monotonic predicate transformers, and that more standard program statements correspond to subclasses of such predicate transformers. The notion of correctness is analyzed more carefully in this context, and we show how our formalization of correctness is related to standard ways of reasoning about total correctness of programs.

We will also look more carefully at alternative interpretations of contracts. As we explained above, a contract can be seen as a game between an angel (our agent) and a demon (the other agent), where correctness corresponds to the existence of a winning strategy for the angel. We define an operational game semantics for contracts that formalizes this viewpoint. We also define the choice semantics, an alternative interpretation of contract statements that can be seen as a denotational semantics for contracts. The predicate transformer semantics and the choice semantics are equivalent, and both are abstractions of the game semantics.

The third part of the book introduces recursion and iteration of statements. Recursion is explained in terms of fixed points of monotonic functions on lattices, and iteration is explained in terms of recursion, in the way we have described above.

Some of the material is based on ordinal arithmetic, so we give a short introduction to the concepts and results needed from this field. Iteration is studied in depth, because of its central status in reasoning about imperative programs. We illustrate the use of the refinement calculus framework by three case studies: one on deriving a nontrivial sorting algorithm, another on deriving a recursive search algorithm, and the third on determining the winning strategy for a classical game.

The fourth part of the book considers subclasses of contracts that are characterized by some homomorphism properties. These subclasses correspond to more standard models for sequential programs. For instance, Dijkstra's guarded commands fall into the subclass of conjunctive and continuous statements. We show how to describe traditional semantic models, like denotational semantics for deterministic and nondeterministic programs, in the refinement calculus framework. We also give normal forms for contracts in these different subclasses and show that the normal forms can be understood as specifications for the class of contracts in question.

We will in particular look at one very general subclass of contract statements, the conjunctive statements. This class is interesting because it corresponds to the standard model for program statements and specification statements. We will consider a collection of important refinement techniques in more detail for this class of statements. We analyze the notion of specification statements and study methods for refining specification statements into more concrete program statements. We show how the context of a program component can be taken into account and used in a refinement step. We also consider program transformation rules that change the control structure of programs, in particular rules for changing iteration statements.

Part I

Foundations

2 Posets, Lattices, and Categories

This chapter introduces the central mathematical structures that are needed to formalize the refinement calculus: partially ordered sets (posets), lattices, and categories. We identify the basic properties of posets and lattices, and use them for a classification of lattices. We also show how to construct new lattices out of old ones as cartesian products and function spaces. We study structure-preserving mappings (homomorphisms) on lattices. Finally, we show how to form a certain kind of category out of these lattices. The simple notions identified in this chapter underlie the formal reasoning about properties of programs, specifications, and contracts in general.

2.1 Partially Ordered Sets

Let A be a nonempty set and ⊑ a binary relation on A. We say that ⊑ is respectively reflexive, transitive, symmetric, and antisymmetric if

a ⊑ a , (⊑ reflexive)
a ⊑ b ∧ b ⊑ c ⇒ a ⊑ c , (⊑ transitive)
a ⊑ b ⇒ b ⊑ a , (⊑ symmetric)
a ⊑ b ∧ b ⊑ a ⇒ a = b , (⊑ antisymmetric)

for any choice of a, b, c in A (note that we assume that ⊑ binds more strongly than ∧, which in turn binds more strongly than ⇒; for a complete collection of precedence rules we refer to Appendix A).

Here a note on notation is in order. We could write a ⊑ b ⊑ c rather than a ⊑ b ∧ b ⊑ c in the left-hand side of the transitivity rule. However, we will

restrict the use of such continued relations to a minimum; they will appear only to indicate ranges of numbers of the form a ≤ x ≤ b and occasionally with continued equalities of the form a = b = c.

The relation ⊑ is called a preorder if it is reflexive and transitive. The pair (A, ⊑) is then called a preordered set, or a preset. If the preorder ⊑ is also symmetric, it is said to be an equivalence relation. If the preorder ⊑ is antisymmetric, it is said to be a partial order, and (A, ⊑) is then called a partially ordered set, or a poset. Generally, a < b stands for a ⊑ b ∧ ¬(b ⊑ a). For posets, this means that a < b stands for a ⊑ b ∧ a ≠ b. We also write a ⊒ b for b ⊑ a and a > b for b < a. The dual of the poset (A, ⊑) is the poset (A, ⊒). A partial order is linear (or total) if

a ⊑ b ∨ b ⊑ a (⊑ linear)

holds for every a, b ∈ A. A linear partial order is called a linear order, and (A, ⊑) is then a linearly ordered set.

In a discrete order, the only element that is comparable with an element a is a itself. This order is thus defined by a ⊑ b if and only if a = b for any a, b ∈ A. A discrete order is clearly both a partial order and an equivalence relation.
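On a finite carrier these defining properties can be checked mechanically. The following is a small sketch of such a checker; the function names are ours, not notation of the book.

    reflexive, transitive, antisymmetric, linear :: Eq a => [a] -> (a -> a -> Bool) -> Bool
    reflexive     as le = and [ le a a | a <- as ]
    transitive    as le = and [ le a c | a <- as, b <- as, c <- as, le a b, le b c ]
    antisymmetric as le = and [ a == b | a <- as, b <- as, le a b, le b a ]
    linear        as le = and [ le a b || le b a | a <- as, b <- as ]

    partialOrder :: Eq a => [a] -> (a -> a -> Bool) -> Bool
    partialOrder as le = reflexive as le && transitive as le && antisymmetric as le

    -- The truth values ordered by implication form a linear partial order:
    -- partialOrder [False, True] (\a b -> not a || b)   evaluates to True
    -- linear       [False, True] (\a b -> not a || b)   evaluates to True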

Let B ⊆ A, B ≠ ∅. Evidently, the restriction ⊑B of ⊑ to B is reflexive (transitive, symmetric, antisymmetric, linear) on B if ⊑ is reflexive (transitive, symmetric, antisymmetric, linear) on A. Hence, (B, ⊑B) is a poset whenever (A, ⊑) is a poset. We say that (B, ⊑B) is a subposet of (A, ⊑).

Assume that (A, ⊑) is a partially ordered set and that B is a subset of A. The element a ∈ B is the least element of B if a ⊑ x for every element x in B. Similarly, b ∈ B is the greatest element of B if x ⊑ b for every element x in B. The least and greatest elements are unique if they exist. The least element of the whole poset A is called the bottom of the poset and is denoted by ⊥. The greatest element of the whole poset A is called the top of the poset and is denoted by ⊤. A poset is said to be bounded if it has a least and a greatest element. For such posets, we have

⊥ ⊑ a , (⊥ smallest)
a ⊑ ⊤ . (⊤ greatest)

A finite poset (A, ⊑) is often pictured as a Hasse diagram. Each element in A is a node in the diagram. If there is an upward edge from a node a to a node b, then a < b. Relations a ⊑ b that can be inferred by reflexivity or transitivity are not indicated. Thus there is an edge from a to b in the Hasse diagram if and only if a < b and there is no c ≠ a, b such that a < c ∧ c < b. We also use Hasse diagrams to describe parts of infinite posets. Figure 2.1 gives examples of Hasse diagrams.

FIGURE 2.1. Poset examples: (a) the truth-value poset, (b) the powerset poset of {a, b}, (c) the powerset poset of {a, b, c}, (d) an initial segment of the natural numbers

Truth-Value Poset

The set of truth values Bool = {F, T}, ordered by implication, is an important poset. We have that F ⇒ F and T ⇒ T, so implication is reflexive. If b1 ⇒ b2 and b2 ⇒ b3, then b1 ⇒ b3, so implication is transitive. Finally, if b1 ⇒ b2 and b2 ⇒ b1, then b1 = b2, so implication is also antisymmetric. Thus (Bool, ⇒) is a poset. In fact, implication is a linear ordering, because either b1 ⇒ b2 or b2 ⇒ b1, for any choice of truth values for b1 and b2. The truth-value poset is shown as a Hasse diagram in Figure 2.1. The truth-value poset is evidently also bounded, with F as the least and T as the greatest element.

Powerset Poset

The set of subsets P(A) of a given set A forms another important poset. The pair (P(A), ⊆) is a poset for any set A. This poset is described in Figure 2.1 for A = {a, b} and for A = {a, b, c}. Note that for A = {a}, we get a poset with the same structure as the truth-value poset in Figure 2.1, if we make the correspondence ∅ = F and {a} = T. Inclusion is linear only when A = ∅ or A = {a}; for A = {a, b}, neither {a} ⊆ {b} nor {b} ⊆ {a} holds. The powerset poset is also bounded, with ∅ as the least element and A as the greatest element.

Natural-Numbers Poset

The usual arithmetic ordering ≤ on natural numbers Nat = {0, 1, 2, ...} is also a partial order (Figure 2.1). It is reflexive (n ≤ n), transitive (n ≤ m ∧ m ≤ k ⇒ n ≤ k), and antisymmetric (n ≤ m ∧ m ≤ n ⇒ n = m). Hence, (Nat, ≤) is a poset. In fact, it is a linearly ordered set, as m ≤ n or n ≤ m holds for any two natural numbers n and m. Another example is the linearly ordered set of (positive and negative) integers (Int, ≤), with the usual ordering. The natural numbers have

a least element, 0, but do not have a greatest element. The integers have neither a least nor a greatest element.

Congruence and Monotonicity

Let (A, ⊑A) and (B, ⊑B) be preorders. The function f : A → B is then said to be order-preserving if

a ⊑A a′ ⇒ f.a ⊑B f.a′ ( f order-preserving)

holds for all a and a′ in A. The function f is said to be order-embedding if the converse also holds, i.e., if

a ⊑A a′ ≡ f.a ⊑B f.a′ . ( f order-embedding)

Assume that f : A → B is order-preserving. If the preorders ⊑A and ⊑B are equivalence relations, then f is called a congruence. If they are partial orders, then f is said to be monotonic. A binary function is said to be congruent (monotonic) if it is congruent (monotonic) in each of its arguments separately. For f : A × B → C, this is the case if and only if the following holds:

a ⊑A a′ ∧ b ⊑B b′ ⇒ f.(a, b) ⊑C f.(a′, b′) . ( f monotonic)

A function f is said to be antimonotonic if

a ⊑A a′ ⇒ f.a ⊒B f.a′ . ( f antimonotonic)

Antimonotonicity is really a monotonicity property: f as a function from poset (A, ⊑A) to poset (B, ⊑B) is antimonotonic if and only if f as a function from poset (A, ⊑A) to the dual poset (B, ⊒B) is monotonic.

Squaring is an example of a monotonic function on the natural numbers: n ≤ m implies that n² ≤ m². Squaring is antimonotonic on the negative numbers: for natural numbers n and m, we have that −n ≤ −m implies that n ≥ m, which in turn implies that (−n)² ≥ (−m)². Addition and multiplication of natural numbers are examples of functions that are monotonic in both arguments: if n ≤ n′ and m ≤ m′, then n + m ≤ n′ + m′ and n · m ≤ n′ · m′.
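Order preservation can be tested by brute force in the same way. The sketch below checks the defining implication directly, with squaring on finite ranges of natural numbers and of integers as in the examples just given; the names are ours.

    monotonic :: [a] -> (a -> a -> Bool) -> (b -> b -> Bool) -> (a -> b) -> Bool
    monotonic as leA leB f = and [ leB (f a) (f a') | a <- as, a' <- as, leA a a' ]

    -- monotonic [0 .. 20 :: Int]   (<=) (<=) (^ 2)   evaluates to True
    -- monotonic [-20 .. 20 :: Int] (<=) (<=) (^ 2)   evaluates to False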

Meet and Join

Let B be a subset of poset A. An element a ∈ A is a lower bound of B if a ⊑ b for every b ∈ B. Dually, an element a ∈ A is an upper bound of B if b ⊑ a for every b ∈ B. The least element of a set is thus always a lower bound of the set, but the converse need not hold, as a lower bound need not be an element of the set B.

The element a is the greatest lower bound (or infimum, or meet) of B ⊆ A if a is the greatest element of the set of lower bounds of B. We write ⊓B for the greatest lower bound of a set B. When B = {b1, b2}, we write b1 ⊓ b2 for ⊓B. A set of elements B need not have a greatest lower bound. However, if there is a greatest lower bound, then it is unique. The greatest lower bound of an arbitrary set B is characterized by the following two properties:

b ∈ B ⇒ ⊓B ⊑ b , (general ⊓ elimination)
(∀b ∈ B • a ⊑ b) ⇒ a ⊑ ⊓B . (general ⊓ introduction)

The first property states that ⊓B is a lower bound, and the second that it is the greatest of all lower bounds. The names for these properties reflect their intended use. The first property is used when we need to go from the meet to an element greater than the meet. The second is used when we want to go from some element to the meet of a set.

When the set B is given as an indexed set, B = {bi | i ∈ I}, then we write (⊓ i ∈ I • bi) for the greatest lower bound ⊓B (the set I, which is used to index the elements, can be any set). For two elements, the properties of the meet are as follows:

b1 ⊓ b2 ⊑ b1 and b1 ⊓ b2 ⊑ b2 , (⊓ elimination)
a ⊑ b1 ∧ a ⊑ b2 ⇒ a ⊑ b1 ⊓ b2 . (⊓ introduction)

The dual definition is as follows. The element a is the least upper bound (or supremum, or join) of B if a is the least element of the set of upper bounds of B. We denote the least upper bound of B by ⊔B. We write b1 ⊔ b2 for ⊔{b1, b2}. The least upper bound of a set B is unique if it exists. The supremum of an arbitrary set B is characterized by the following two properties:

b ∈ B ⇒ b ⊑ ⊔B , (general ⊔ introduction)
(∀b ∈ B • b ⊑ a) ⇒ ⊔B ⊑ a . (general ⊔ elimination)

The first property states that ⊔B is an upper bound, and the second that it is the least of all upper bounds. The names for these properties again reflect their intended use. The first property shows how to go from an element to the join, which is a greater element. The second property shows how to go from the join to an element greater than the join.

Again, when the set B is given as an indexed set, B = {bi | i ∈ I}, we write (⊔ i ∈ I • bi) for the least upper bound ⊔B. For binary joins (i.e., joins of two elements), we have that

b1 ⊑ b1 ⊔ b2 and b2 ⊑ b1 ⊔ b2 , (⊔ introduction)
b1 ⊑ a ∧ b2 ⊑ a ⇒ b1 ⊔ b2 ⊑ a . (⊔ elimination)
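On a finite poset the greatest lower bound can be computed directly from its definition: collect the lower bounds and look for a greatest element among them. This is a sketch only; the divisibility order used in the usage note is merely a convenient finite example and is not discussed in the text.

    import Data.List (find)

    glb :: [a] -> (a -> a -> Bool) -> [a] -> Maybe a
    glb carrier le bs = find greatest lowers
      where
        lowers     = [ a | a <- carrier, all (a `le`) bs ]
        greatest a = all (`le` a) lowers

    -- In {1,...,12} ordered by divisibility, the meet of {8, 12} is gcd 8 12 = 4:
    divides :: Int -> Int -> Bool
    divides a b = b `mod` a == 0

    -- glb [1 .. 12] divides [8, 12]   evaluates to Just 4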

FIGURE 2.2. Poset with meets and joins

Figure 2.2 illustrates meets and joins in a poset. We have that c = a ⊔ b and d = a ⊓ b in the poset. The top ⊤ and bottom ⊥ of the poset are also indicated in the figure. We have the following basic correspondence between the partial ordering, the least upper bound, and the greatest lower bound.

Lemma 2.1 The following properties hold for any a, b in a poset:

a ⊓ b = a ≡ a ⊑ b , (correspondence)
a ⊔ b = b ≡ a ⊑ b ,

whenever the meet and join in question exist.

Proof We prove the required property for infimum. The proof for supremum is analogous. We prove the equivalence by showing implication in both directions. We first prove that a ⊓ b = a ⇒ a ⊑ b:

a = a ⊓ b
⇒ {reflexivity of ⊑}
a ⊑ a ⊓ b
⇒ {⊓ elimination and transitivity of ⊑}
a ⊑ b

We then prove that a ⊑ b ⇒ a = a ⊓ b:

a ⊑ b
⇒ {reflexivity of ⊑}

a ⊑ a ∧ a ⊑ b
⇒ {⊓ introduction}
a ⊑ a ⊓ b
⇒ {a ⊓ b ⊑ a holds by ⊓ elimination}
a ⊑ a ⊓ b ∧ a ⊓ b ⊑ a
⇒ {antisymmetry of ⊑}
a = a ⊓ b

□

We have here written the (informal) proofs as linear derivations, where each implication is written and justified on a line of its own. In Chapter 4 we will introduce a formal way of writing proofs that includes linear derivations such as these.

Because ⊥ ⊑ a and a ⊑ ⊤ hold in any poset that has a top and a bottom, we have as special cases of this lemma that

a ⊓ ⊥ = ⊥ , a ⊔ ⊥ = a , (⊥ properties)
a ⊓ ⊤ = a , a ⊔ ⊤ = ⊤ , (⊤ properties)

for arbitrary a in a poset with top and bottom.

The conjunction a ∧ b of truth values a and b satisfies the conditions for finite meet: (i) a ∧ b ⇒ a and a ∧ b ⇒ b, so a ∧ b is a lower bound of {a, b}; and (ii) if c ⇒ a and c ⇒ b, then c ⇒ a ∧ b, so a ∧ b is the greatest lower bound of {a, b}. Similarly, disjunction satisfies the properties of finite join. Thus, conjunction is the meet and disjunction the join in Bool.

In the powerset poset, we have again that A ⊓ B = A ∩ B for subsets A and B of D: (i) A ∩ B ⊆ A and A ∩ B ⊆ B, so A ∩ B is a lower bound of {A, B}; (ii) if C ⊆ A and C ⊆ B, then C ⊆ A ∩ B, so A ∩ B is the greatest lower bound of {A, B}. Similarly, we have that A ⊔ B = A ∪ B. We also have that ⊥ = ∅ and ⊤ = D in (P(D), ⊆). For an arbitrary family K of sets in P(D), we have that ⊔K = ∪K and ⊓K = ∩K.

For the natural numbers (with ordering ≤), it is easy to see that m ⊓ n = m min n, while m ⊔ n = m max n, where min gives the smaller and max the larger of the two elements. If K is an arbitrary nonempty set of natural numbers, then ⊓K always exists, but ⊔K exists only if K is finite. If we extend the natural numbers with the first infinite ordinal ω (ordinals are treated in detail in Chapter 18) to get Nat ∪ {ω}, then ⊔K also exists for arbitrary sets of natural numbers (and is always ω when K is infinite).

2.2 Lattices

A poset (A, ⊑) (or, for simplicity, just A) is called a lattice if the meet a ⊓ b and join a ⊔ b exist in A for all pairs a, b of elements of A. A direct consequence of this is that a poset (A, ⊑) is a lattice if and only if ⊓B and ⊔B exist for all finite nonempty subsets B of A. The poset (Bool, ⇒) is a lattice, as are (Nat, ≤), (Int, ≤), and (P(D), ⊆) for any D. The posets in Figure 2.1 are thus lattices, as is the poset in Figure 2.2. The poset (A, ⊑) is called a meet semilattice if a ⊓ b exists in A for all pairs a and b of elements of A. A join semilattice is defined analogously.

If the partial order (A, ⊑) is a lattice, then meet and join are defined for any pair of elements in A, so lattice meet and join can be seen as binary operations on A. They thus form an algebra on A. These operations can be shown to have the following properties:

a ⊓ a = a , a ⊔ a = a , (idempotence)
a ⊓ b = b ⊓ a , a ⊔ b = b ⊔ a , (commutativity)
a ⊓ (b ⊓ c) = (a ⊓ b) ⊓ c , a ⊔ (b ⊔ c) = (a ⊔ b) ⊔ c , (associativity)
a ⊓ (a ⊔ b) = a , a ⊔ (a ⊓ b) = a . (absorption)

Conversely, assume that we have an algebra A where the operations ⊓ and ⊔ satisfy the above properties. Define a ⊑ b to hold if and only if a = a ⊓ b (or a ⊔ b = b); i.e., assume that the correspondence between meet (or join) and ordering holds. Then we can show that (A, ⊑) is a lattice and that the operations ⊓ and ⊔ correspond to meet and join in this lattice. Hence, lattices can be introduced either as a special kind of algebra or as a special kind of poset. We have above chosen the latter approach.

Meet and join, seen as operations, are easily shown to be monotonic with respect to the lattice ordering:

a ⊑ a′ ∧ b ⊑ b′ ⇒ a ⊓ b ⊑ a′ ⊓ b′ , (⊓ monotonic)
a ⊑ a′ ∧ b ⊑ b′ ⇒ a ⊔ b ⊑ a′ ⊔ b′ . (⊔ monotonic)
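These algebraic laws can be confirmed by brute force for min and max on a finite segment of the natural numbers; this is only a sketch of such a check, not a proof, and min and max merely play the roles of meet and join here.

    latticeLaws :: [Int] -> Bool
    latticeLaws xs = and
      [ a `min` a == a && a `max` a == a                           -- idempotence
          && a `min` b == b `min` a && a `max` b == b `max` a      -- commutativity
          && a `min` (b `min` c) == (a `min` b) `min` c            -- associativity
          && a `max` (b `max` c) == (a `max` b) `max` c
          && a `min` (a `max` b) == a && a `max` (a `min` b) == a  -- absorption
      | a <- xs, b <- xs, c <- xs ]

    -- latticeLaws [0 .. 12]   evaluates to True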

We next define a number of additional properties that lattices may have and give a classification of lattices based on these properties. These additional properties turn out to be very important and will occur in different disguises over and over again.

Complete Lattices

A lattice A is complete if ⊓B and ⊔B exist in A for all subsets B ⊆ A. In particular, greatest lower bounds and least upper bounds also must exist for empty B as well as for any infinite B.

Every complete lattice has a top and a bottom, so a complete lattice is always bounded. The bottom can be characterized either as the greatest lower bound of the whole lattice or as the least upper bound of the empty set. A dual characterization holds for the top. We thus have that

⊥ = ⊓A and ⊥ = ⊔∅ , (bottom characterization)
⊤ = ⊔A and ⊤ = ⊓∅ . (top characterization)

The truth-value lattice is a complete lattice, because any finite lattice is complete. Consider a set B = {bi | i ∈ I} of truth values. The set B must be either ∅, {F}, {T}, or {F, T}. For B = ∅, we already know that ⊓B = T and ⊔B = F, by the above characterization. For B ≠ ∅, we have that

(⊓ i ∈ I • bi) = T iff bi = T for all i ∈ I ,
(⊔ i ∈ I • bi) = T iff bi = T for some i ∈ I .

The meet and join of indexed sets of truth values thus model universal and existential quantification, as we will show later.

The intersection and union of an arbitrary set of subsets of A is also a subset of A. Hence, the subset lattice is also a complete lattice.

The natural numbers form a complete meet semilattice, i.e., a poset where every nonempty subset has a meet. However, they do not form a complete lattice, because the join of an infinite set of natural numbers does not exist. The extended natural numbers Nat ∪ {ω} do form a complete lattice.
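Over a finite index set, the indexed meet and join of truth values are simply the conjunction and disjunction of the listed values, matching the two equivalences above; the empty cases give the top and bottom characterizations. A one-line sketch:

    meetB, joinB :: [Bool] -> Bool
    meetB = and   -- T exactly when every bi is T; the meet of the empty set is T
    joinB = or    -- T exactly when some bi is T; the join of the empty set is F

    -- meetB [ even i | i <- [2, 4, 6] ]   evaluates to True   (a "for all i" reading)
    -- joinB [ odd  i | i <- [2, 4, 6] ]   evaluates to False  (a "for some i" reading)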

Distributive Lattices

A lattice A is distributive if for any elements a, b, and c of A the following two properties hold, referred to as meet distributivity and join distributivity, respectively:

a ⊔ (b ⊓ c) = (a ⊔ b) ⊓ (a ⊔ c) , (⊓ distributivity)
a ⊓ (b ⊔ c) = (a ⊓ b) ⊔ (a ⊓ c) . (⊔ distributivity)

The distributivity properties do not necessarily generalize to infinite meets and joins in a lattice. We call the corresponding properties for infinite meets and joins the infinite meet distributivity condition and the infinite join distributivity condition. These conditions are

a ⊔ (⊓ i ∈ I • bi) = (⊓ i ∈ I • a ⊔ bi) , (infinite ⊓ distributivity)
a ⊓ (⊔ i ∈ I • bi) = (⊔ i ∈ I • a ⊓ bi) , (infinite ⊔ distributivity)

where a and bi are elements of the lattice A, and I is any index set.

The property a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c) holds for truth values, as does a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c). Thus, conjunction and disjunction satisfy the distributivity laws,

so the truth-value lattice is in fact a distributive lattice. The infinite distributivity conditions are trivially satisfied in any finite distributive lattice.

The subset lattice also satisfies the distributivity laws: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) and A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C). In fact, the subset lattice satisfies the infinite distributivity laws:

A ∪ (∩ i ∈ I • Bi) = (∩ i ∈ I • A ∪ Bi) ,
A ∩ (∪ i ∈ I • Bi) = (∪ i ∈ I • A ∩ Bi) .

The natural numbers also satisfy the distributivity laws:

m min (n max k) = (m min n) max (m min k) ,
m max (n min k) = (m max n) min (m max k) .

In addition, the infinite meet distributivity condition holds for natural numbers, but not the corresponding join condition, as max is not defined for infinite sets of natural numbers.

Boolean Lattices

A complete distributive lattice A is called a complete boolean lattice if every element a in A has a unique complement ¬a in A satisfying the conditions

a ⊓ ¬a = ⊥ , (¬ contradiction)
a ⊔ ¬a = ⊤ . (¬ exhaustiveness)

Every complete boolean lattice satisfies both infinite distributivity conditions. For a proof of this fact, we refer to standard texts on lattice theory. The following de Morgan rules also hold in any complete boolean lattice:

¬(⊓ i ∈ I • ai) = (⊔ i ∈ I • ¬ai) , (de Morgan)
¬(⊔ i ∈ I • ai) = (⊓ i ∈ I • ¬ai) ,

where ai are elements of the lattice and I is any index set. We also note that complement is an involution:

¬(¬a) = a . (¬ involution)

A direct consequence of this is that complement is a bijective function from a complete boolean lattice onto itself:

a = b ≡ ¬a = ¬b . (¬ bijective)

Lattice complement is also antimonotonic with respect to the lattice ordering:

a ⊑ b ≡ ¬a ⊒ ¬b . (¬ antimonotonic)

The truth values form a boolean lattice, where ¬ is negation of truth values: we have that a ∧ ¬a = F and a ∨ ¬a = T. The following theorem summarizes the properties of the truth-value lattice that we have noted above.

Theorem 2.2 The truth values with the implication ordering form a complete, boolean, and linearly ordered lattice.

In fact, it can be shown that any partially ordered set with at least two elements that satisfies the conditions of Theorem 2.2 is isomorphic to the truth-value lattice. Thus the truth-value lattice is completely characterized by the fact that it is a complete, boolean, and linearly ordered lattice.

The subset lattice is also a boolean lattice. It is bounded and distributive, as we have shown above. The complement of a subset A of D is the set ¬A = D − A. It is easy to see that A ∪ ¬A = D and A ∩ ¬A = ∅, so the requirements on the complement are met.

The natural numbers do not form a boolean lattice. Even if we add the limit ordinal ω to Nat, so that we get a complete distributive lattice, we still do not have a boolean lattice.
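For the subset lattice of a small set D, the complement conditions and one of the de Morgan rules can be checked exhaustively. Representing sets as sublists of a fixed ordered carrier, and the names used, are assumptions of this sketch.

    import Data.List (subsequences)

    type Set = [Int]

    d :: Set
    d = [1, 2, 3]                        -- the underlying set D

    compl :: Set -> Set
    compl a = [ x | x <- d, x `notElem` a ]

    cap, cup :: Set -> Set -> Set
    cap a b = [ x | x <- d, x `elem` a && x `elem` b ]
    cup a b = [ x | x <- d, x `elem` a || x `elem` b ]

    booleanLaws :: Bool
    booleanLaws = and [ cup a (compl a) == d
                        && cap a (compl a) == []
                        && compl (cup a b) == cap (compl a) (compl b)
                      | a <- subsequences d, b <- subsequences d ]

    -- booleanLaws   evaluates to True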

Atomic Lattices

Let A be a lattice with a least element. The element a ∈ A (where a ≠ ⊥) is called an atom if the following condition holds for each b ∈ A:

b ⊑ a ⇒ b = ⊥ ∨ b = a .

Thus, an atom is an element that is one step above the bottom element in a Hasse diagram. A lattice A is said to be atomic if for each element x in A, we have that

x = ⊔{a ∈ A | a is an atom ∧ a ⊑ x} .

The truth-value lattice is atomic. In fact, any finite complete boolean lattice is atomic. The representation theorem states that any atomic complete boolean lattice is isomorphic to the subset lattice of its atoms.

The subsets of a set also form an atomic lattice, where the atoms are the singleton sets. The definition of atomicity then states that any set in the subset lattice is the join (union) of its singleton subsets:

B = (∪ b ∈ B • {b}).

The following theorem summarizes the properties of the subset lattice that we have noted and proved above.

Theorem 2.3 The subsets of a set D with the inclusion ordering form a complete, boolean, and atomic lattice.

Dual Lattice and Duality

The dual of a lattice A = (A, ⊑) is the lattice A◦ = (A, ⊒). The dual lattice is complete (distributive, boolean) if and only if the lattice is complete (distributive, boolean). Meet in the dual lattice is join in the original lattice, and similarly, join in the dual lattice is meet in the original lattice. Complement is the same in both lattices.

Lattices satisfy the following duality principle. Let φ be a universal statement about lattices (A, ⊑). The dual statement φ◦ is then constructed by interchanging ⊑ and ⊒, ⊓ and ⊔, ⊥ and ⊤ in φ, while leaving ¬ unchanged. The dual statement is true of all lattices if and only if the original statement is true of all lattices. The de Morgan laws above provide an example of how this duality principle works.

We have shown earlier that the complement operation is antimonotonic rather than monotonic. This means that the complement operation is monotonic onto the dual of the lattice. If we consider complement as a function ¬ : A → A◦, then it is in fact monotonic.

2.3 Product and Function Spaces

We now show how to construct new posets and lattices out of given posets or lattices as product and function spaces.

Product Spaces

Let us first look at the cartesian product. Let (A, ⊑A) and (B, ⊑B) be two posets. We define a partial ordering on the cartesian product A × B componentwise, by

(a, b) ⊑A×B (a′, b′) ≡ a ⊑A a′ ∧ b ⊑B b′ . (⊑ of pairs)

The ordering thus holds between two pairs if and only if it holds for each component separately. The cartesian product of two posets is a poset itself, as the following lemma shows. In addition, the property of being a lattice is also inherited in a cartesian product, as are completeness, boundedness, (infinite) distributivity, and being a boolean lattice.

Lemma 2.4 Let (A, ⊑) and (B, ⊑) be posets. Then (A × B, ⊑) is also a poset when the ordering ⊑ on A × B is defined componentwise. This poset is a lattice (complete, bounded, distributive, boolean lattice) if A and B are lattices (complete, bounded, distributive, boolean lattices).

The "if" in this lemma can be strengthened to "if and only if" when A and B are nonempty.

FIGURE 2.3. Products of the truth-value lattice: (a) Bool, (b) Bool × Bool, (c) Bool × Bool × Bool

Assume that A and B are (bounded, boolean) lattices. Bottom, top, complement, meet, and join in the product lattice are then componentwise extensions of the corresponding operations of the component lattices:

⊥A×B = (⊥A, ⊥B) , (⊥ of pairs)
⊤A×B = (⊤A, ⊤B) , (⊤ of pairs)
¬(a, b) = (¬a, ¬b) , (¬ of pairs)
(a, b) ⊓ (a′, b′) = (a ⊓ a′, b ⊓ b′) , (⊓ of pairs)
(a, b) ⊔ (a′, b′) = (a ⊔ a′, b ⊔ b′) . (⊔ of pairs)

These definitions are extended in the obvious way to the cartesian product of an arbitrary number of posets. The lattice (b) in Figure 2.1 has the same structure as the product lattice Bool × Bool, while lattice (c) in the same figure has the structure of the product lattice Bool × Bool × Bool. These product lattices are shown in Figure 2.3.
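The componentwise definitions can be transcribed directly. In the sketch below the two components are the integers under ≤ (with min and max) and the truth values under implication (with conjunction and disjunction); these concrete choices and the names are assumptions made only to have two specific lattices at hand.

    type P = (Int, Bool)

    leP :: P -> P -> Bool
    leP (a, b) (a', b') = a <= a' && (not b || b')    -- ordering of pairs, componentwise

    meetP, joinP :: P -> P -> P
    meetP (a, b) (a', b') = (min a a', b && b')       -- meet of pairs
    joinP (a, b) (a', b') = (max a a', b || b')       -- join of pairs

    -- leP (meetP p q) p and leP p (joinP p q) hold for all p and q,
    -- mirroring the elimination and introduction rules componentwise.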

Function Spaces

Let A be a set and B a poset. Recall that A → B stands for the set of all functions f : A → B. Two functions f and g in A → B are equal if they give the same values for each argument in A (this is known as the extensionality principle):

f = g ≡ (∀x ∈ A • f.x = g.x) .

We use this same idea to define a partial order on A → B as the extension of the partial order on B:

f ⊑A→B g ≡ (∀x ∈ A • f.x ⊑B g.x) (⊑ of functions)

for any functions f, g ∈ A → B. This is the extensionality principle for functions generalized to partial orderings. We say that the poset A → B is a pointwise extension of the poset B. The notion of pointwise extension is illustrated in Figure 2.4.

FIGURE 2.4. Pointwise extension ordering

The function indicated by the dashed arrows is here greater than the function indicated by solid lines. We have the same result for the pointwise extension to function spaces as for cartesian products.

Lemma 2.5 Let B be a poset and A any set. Then the pointwise extension of B to A → B is also a poset. Furthermore, this poset is a lattice (complete, bounded, distributive, boolean lattice) if B is a lattice (complete, bounded, distributive, boolean lattice).

Note that atomicity is a property that is not inherited by this extension (Exercise 2.6). The lattice operations in A → B are defined in terms of the corresponding operations on B, when B is a (bounded, boolean) lattice. For any x ∈ A, we define

⊥A→B.x = ⊥B , (⊥ of functions)
⊤A→B.x = ⊤B , (⊤ of functions)
(¬f).x = ¬(f.x) , (¬ of functions)
(f ⊓ g).x = f.x ⊓ g.x , (⊓ of functions)
(f ⊔ g).x = f.x ⊔ g.x . (⊔ of functions)

When B is a complete lattice, we define

(⊓ i ∈ I • fi).x = (⊓ i ∈ I • fi.x) , (general ⊓ of functions)
(⊔ i ∈ I • fi).x = (⊔ i ∈ I • fi.x) . (general ⊔ of functions)

Subsets form a prime example of a pointwise extension construction. A subset A ⊆ D can be described by a function pA : D → Bool, which we define by pA.x = T if and only if x ∈ A. Functions of this kind are known as predicates. The predicates inherit lattice properties from the truth-value lattice. The pointwise extension of the truth-value lattice to D → Bool is in fact a lattice that has exactly the same structure as the subset lattice P(D).
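Predicates over a finite state space make the pointwise extension concrete: the ordering is pointwise implication and the lattice operations are pointwise conjunction and disjunction, so the predicate lattice mirrors the subset lattice of D. The finite carrier below is an assumption of the sketch, made only so that the ordering is decidable.

    type D    = Int
    type Pred = D -> Bool

    dom :: [D]
    dom = [0 .. 9]                                      -- a finite D

    lePred :: Pred -> Pred -> Bool
    lePred p q = and [ not (p x) || q x | x <- dom ]    -- pointwise implication

    meetPred, joinPred :: Pred -> Pred -> Pred
    meetPred p q x = p x && q x
    joinPred p q x = p x || q x

    botPred, topPred :: Pred
    botPred _ = False
    topPred _ = True

    -- lePred (meetPred even (> 3)) even   evaluates to True
    -- lePred botPred p and lePred p topPred hold for every predicate p.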

Relations provide another example of pointwise extension. Set-theoretically, a relation is seen as a subset r ⊆ A × B, but it can also be modeled as a function Rr : A → P(B) by defining Rr.a = {b ∈ B | (a, b) ∈ r}. The relations R : A → P(B) form a pointwise extension of the subset lattice, and this extension inherits the lattice properties of this simpler lattice. We will study both predicates and relations of this form in more detail later on.

2.4 Lattice Homomorphisms

Given a function from one lattice to another, we are interested in whether the function preserves some or all of the lattice structure of its domain. For instance, whether the bottom in the domain is mapped to the bottom in the range, whether meets are preserved, and so on. Functions that preserve this kind of lattice properties are generally known as homomorphisms.

Let A and B be lattices and h : A → B. The following are special homomorphism properties that h may satisfy:

h.⊥A = ⊥B , (bottom homomorphism)
h.⊤A = ⊤B , (top homomorphism)
h.(a ⊓A a′) = h.a ⊓B h.a′ , (meet homomorphism)
h.(a ⊔A a′) = h.a ⊔B h.a′ , (join homomorphism)
h.(¬A a) = ¬B (h.a) . (complement homomorphism)

Thus, a bottom homomorphism maps the bottom of A to the bottom of B, while a top homomorphism maps the top of A to the top of B. Complement, meet, and join homomorphisms preserve complements, meets, and joins, respectively. A function on lattices can be homomorphic for only some of the lattice operations, or it can be homomorphic for all lattice operations. We say that the function h is a lattice homomorphism if it preserves meets and joins, and that it is a boolean lattice homomorphism if it also preserves negation.
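When both lattices are finite, the homomorphism conditions can be tested exhaustively. In the sketch below both lattices are finite linear orders with min and max as meet and join; in that special case any monotonic function preserves both, which is why the sample checks succeed. The names and the choice of linear orders are ours.

    meetHom, joinHom :: [Int] -> (Int -> Int) -> Bool
    meetHom xs h = and [ h (min a b) == min (h a) (h b) | a <- xs, b <- xs ]
    joinHom xs h = and [ h (max a b) == max (h a) (h b) | a <- xs, b <- xs ]

    -- meetHom [0 .. 10] (* 2)    evaluates to True
    -- joinHom [0 .. 10] (* 2)    evaluates to True
    -- meetHom [0 .. 10] negate   evaluates to False (negate reverses the order)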

Figure 2.5 gives an example of a function from one lattice to another. The mapping indicated is a bottom and top homomorphism, and it is also a join homomorphism, but it is not a meet homomorphism, as is easily seen (a ⊓ b = ⊥, but f.a ⊓ f.b = c and f.⊥ ≠ c).

A (boolean) lattice homomorphism h : A → B is said to be an embedding of A in B if h is injective (one-to-one). It is said to be an epimorphism (and B is an image of A) if h is surjective (onto). If h is both injective and surjective, i.e., a bijection, then we call it a (boolean) lattice isomorphism and say that A and B are isomorphic.

FIGURE 2.5. Homomorphic function

Binary Lattice Operations as Homomorphisms

The lattice operations themselves provide simple examples of homomorphisms. The property a ⊓ (b ⊓ c) = (a ⊓ b) ⊓ (a ⊓ c) holds in any lattice. This property states that the function f.x = a ⊓ x is meet homomorphic; we have that f.(b ⊓ c) = a ⊓ (b ⊓ c) = (a ⊓ b) ⊓ (a ⊓ c) = f.b ⊓ f.c. Thus, the meet operation itself is meet homomorphic in its right argument. By commutativity, meet is also a meet homomorphism in its left argument. Meet is also bottom homomorphic (a ⊓ ⊥ = ⊥) but usually not top homomorphic (a ⊓ ⊤ ≠ ⊤) in a complete lattice. Similarly, join is join homomorphic and top homomorphic (a ⊔ ⊤ = ⊤) but not usually bottom homomorphic (a ⊔ ⊥ ≠ ⊥) in a complete lattice.

Distributivity is also a homomorphism property. For instance, a ⊔ (b ⊓ c) = (a ⊔ b) ⊓ (a ⊔ c) states that the function f.x = a ⊔ x is meet homomorphic, i.e., that join is meet homomorphic in its right argument (and hence in its left argument also, by commutativity). Similarly, a ⊓ (b ⊔ c) = (a ⊓ b) ⊔ (a ⊓ c) states that meet is join homomorphic in its right argument. Thus, lattice join is meet homomorphic and lattice meet is join homomorphic in a distributive lattice. Combining this with the previous result, we note that meet and join are both meet and join homomorphisms in a distributive lattice.

If we consider complement as a function ¬ : A → A◦, then it is a boolean lattice isomorphism: ¬⊥ = ⊤, ¬(a ⊔ b) = (¬a) ⊓ (¬b), and so on.

Universal Meets and Joins

We extend the notion of a homomorphism to arbitrary sets of elements in a lattice, as follows. We say that h : A → B is a universal meet homomorphism if the condition

h.(⊓ i ∈ I • ai) = (⊓ i ∈ I • h.ai) (universal ⊓ homomorphism)

holds for any set of elements {ai | i ∈ I} in A. We say that h is a positive meet homomorphism if this condition holds for any nonempty set of elements {ai | i ∈ I}. Dually, we say that h is a universal join homomorphism if the condition

h.(⊔ i ∈ I • ai) = (⊔ i ∈ I • h.ai) (universal ⊔ homomorphism)

holds for any set {ai | i ∈ I} and that it is a positive join homomorphism if the condition holds for any nonempty set {ai | i ∈ I}.

It is easy to see that lattice meet is a positive meet homomorphism: for I ≠ ∅, we have a ⊓ (⊓ i ∈ I • ai) = (⊓ i ∈ I • a ⊓ ai). However, lattice meet is usually not a universal meet homomorphism: we have a ⊓ (⊓ i ∈ ∅ • ai) = a ⊓ ⊤ = a, but on the other hand, (⊓ i ∈ ∅ • a ⊓ ai) = ⊤.

Lattice meet is a positive join homomorphism if the lattice is infinitely join distributive. In fact, in this case it is even a universal join homomorphism, because a ⊓ (⊔ i ∈ ∅ • ai) = a ⊓ ⊥ = ⊥ = (⊔ i ∈ ∅ • a ⊓ ai). The dual results hold for lattice joins.

Homomorphism Inclusions

The homomorphisms on a lattice are not independent of each other, as shown by the following lemma.

Lemma 2.6 Let h : A → B be a function from lattice A to lattice B. Then

(a) h is a universal meet (universal join) homomorphism if and only if h is a top (bottom) homomorphism and a positive meet (positive join) homomorphism;

(b) if h is a positive meet (positive join) homomorphism, then h is a meet (join) homomorphism; and

(c) if h is a meet or join homomorphism, then h is monotonic.

Proof We show the proof of (c). The other properties follow directly from the definitions. Assume that h is a meet homomorphism. Then

a ⊑ b
≡ {correspondence}
a = a ⊓ b
⇒ {functionality}
h.a = h.(a ⊓ b)
≡ {h is a ⊓-homomorphism}
h.a = h.a ⊓ h.b
≡ {correspondence}
h.a ⊑ h.b

The proof for join homomorphism is similar. □

FIGURE 2.6. Homomorphism properties

The implications between the different homomorphisms are shown in Figure 2.6. In this hierarchy, monotonicity and top and bottom homomorphisms are the weakest properties, while universal meet and join homomorphisms are the strongest properties.

Tabular Representation

Because there are many lattices and operations on lattices that we consider in this book and there are also many homomorphism properties of interest, it becomes cumbersome to list and memorize all the properties. On the other hand, the homomorphism properties are extremely useful when reasoning about lattices, so it is worth identifying them all. We therefore introduce a concise tabular representation for describing monotonicity and homomorphism properties of functions on lattices.

The homomorphism properties of the binary lattice operations were discussed above and are summarized in Table 2.1. The operations that we test for monotonicity and homomorphism are listed on the left. The monotonicity property is listed first on top of the table, in the column denoted ⊑. The homomorphism properties are then listed, with ⊥ standing for bottom homomorphism, ⊤ for top homomorphism, ⊓ for positive meet homomorphism, ⊔ for positive join homomorphism, and ¬ for complement homomorphism. We do not treat finite meet (or join) homomorphism separately, because in most cases the same results hold for finite meet (join) homomorphism and positive meet (join) homomorphism. If needed, we explicitly indicate when a result holds only for finite meets or joins. Also, we do not have a separate entry for universal meet homomorphism, because that is equivalent to top and positive meet homomorphism both holding (and similarly for universal join homomorphism).

We write the label is in the top left corner to indicate that each table entry states whether the following assertion is true:

The function f (on the left) is homomorphic with respect to the operation g (on the top)

(for the relation ⊑, the assertion is that f is monotonic with respect to ⊑). The table entry can be "yes" (meaning that this is always the case) or "no" (meaning that this need not be the case), or it can be a qualified "yes", with an indication of the condition under which the assertion holds. We write yes◦ for the case when the function is a homomorphism (or is monotonic) onto the dual lattice.

Pointwise Extension as Homomorphism

The pointwise extension theorem shows that if A is a partial order, then B → A is also a partial order, with the pointwise extension of ⊑A as the ordering, and that in fact the properties of being a lattice, complete lattice, distributive lattice, or boolean lattice are all preserved by pointwise extension. Consider the function extend defined by extend.f = f.x, where x is a fixed element in B. This function maps elements in the lattice (B → A, ⊑B→A) to elements in the lattice (A, ⊑A). Thus, we can ask whether it is a monotonic function, and to what extent it is a homomorphism. The results above answer all these questions affirmatively: pointwise extension is monotonic, and it is homomorphic for all lattice operations. Pointwise extension is thus a boolean lattice homomorphism. Table 2.2 shows the relevant entries. We have, e.g., that

extend.(g ⊓ h) = extend.g ⊓ extend.h

in the case of meet homomorphism (see Exercise 2.3).
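The homomorphism property of extend can also be checked on a small instance. The following Haskell sketch is only an illustration: the lattice A is Bool with conjunction as meet and disjunction as join, B → A is modeled by functions Int -> Bool with pointwise operations, and extendAt x is evaluation at the fixed point x.

```haskell
-- Lattice A = Bool with meet = (&&) and join = (||).
-- The function space Int -> Bool is a lattice under pointwise operations.
meetF, joinF :: (Int -> Bool) -> (Int -> Bool) -> (Int -> Bool)
meetF g h = \y -> g y && h y
joinF g h = \y -> g y || h y

-- Evaluation at a fixed point x (the function called "extend" in the text).
extendAt :: Int -> (Int -> Bool) -> Bool
extendAt x f = f x

main :: IO ()
main = do
  let g y = even y
      h y = y > 2
      x   = 4
  -- extend.(g ⊓ h) = extend.g ⊓ extend.h, and similarly for join.
  print (extendAt x (meetF g h) == (extendAt x g && extendAt x h))   -- True
  print (extendAt x (joinF g h) == (extendAt x g || extendAt x h))   -- True
```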

is            ⊑     ⊥     ⊤     ⊓     ⊔     ¬
f.x = x ⊓ a   yes   yes   no    yes   yes¹  no
f.x = x ⊔ a   yes   no    yes   yes²  yes   no
f.x = ¬x      yes◦  yes◦  yes◦  yes◦  yes◦  yes

¹ when the lattice is infinitely join distributive
² when the lattice is infinitely meet distributive

TABLE 2.1. Homomorphic properties of lattice operations

is       ⊑    ⊥    ⊤    ⊓    ⊔    ¬
extend   yes  yes  yes  yes  yes  yes
fst      yes  yes  yes  yes  yes  yes
snd      yes  yes  yes  yes  yes  yes

TABLE 2.2. Homomorphic properties of pointwise extension

For the extension of lattices to a product of lattices, the corresponding result states that projection onto the first or the second component preserves monotonicity and is a lattice homomorphism. We have, e.g., that fst.(⊥A×B) = ⊥A, that fst.(a ⊓ b) = fst.a ⊓ fst.b, and so on, as also shown in Table 2.2.

Lattice Operations as Homomorphism

Consider finally the homomorphism properties of meet and join on lattices. Assume that A is a complete lattice. Then meet is a function from sets of elements in A to elements in A, i.e., ⊓ : P A → A, and similarly for join. As we have shown above, P A is also a complete lattice ordered by inclusion. The characterization of top and bottom shows that join is both bottom and top homomorphic:

⊔∅ = ⊥ and ⊔A = ⊤ .

Join is also monotonic,

B ⊆ B′ ⇒ ⊔B ⊑ ⊔B′ ,

and it is join homomorphic,

⊔(B ∪ B′) = (⊔B) ⊔ (⊔B′) .

However, join is neither meet homomorphic nor negation homomorphic. In a similar way, it is easy to see that meet is bottom and top homomorphic to the dual lattice A◦:

⊓∅ = ⊤ and ⊓A = ⊥ .

Meet is antimonotonic,

B ⊆ B′ ⇒ ⊓B′ ⊑ ⊓B ,

and it is join homomorphic onto the dual lattice:

⊓(B ∪ B′) = (⊓B) ⊓ (⊓B′) .

Table 2.3 summarizes these properties.

is   ⊑     ⊥     ⊤     ⊓    ⊔     ¬
⊓    yes◦  yes◦  yes◦  no   yes◦  no
⊔    yes   yes   yes   no   yes   no

TABLE 2.3. Homomorphic properties of meet and join

2.5 Categories

Consider a collection of lattices and the possible homomorphisms between them. Take for instance all meet homomorphisms. We can see that the collection of meet homomorphisms is closed with respect to composition: if h : A → B is a meet homomorphism and k : B → C is a meet homomorphism, then k ◦ h : A → C is also a meet homomorphism, as is easily checked:

  (k ◦ h).(a ⊓ b)
= {definition of functional composition}
  k.(h.(a ⊓ b))
= {h is a meet homomorphism}
  k.(h.a ⊓ h.b)
= {k is a meet homomorphism}
  k.(h.a) ⊓ k.(h.b)
= {definition of functional composition}
  (k ◦ h).a ⊓ (k ◦ h).b

The identity function id : A → A is a homomorphism for any operations or constants we choose to preserve. In particular, it is a meet homomorphism. A collection of lattices together with all meet homomorphisms between these lattices is an example of a concrete category; i.e., a category in which the objects are sets and the morphisms are structure-preserving functions from objects to objects.

In general, any collection of sets with a certain structure together with all the structure-preserving functions (homomorphisms) between these sets forms a concrete category. The collection of all posets together with all monotonic functions between the posets is another example of a concrete category. We can even consider the collection of all sets with a certain structure. This need not be a set itself, so the notion of a concrete category goes outside ordinary set theory. However, this notion provides a very useful abstraction, because it permits us to study all possible homomorphisms in a general way.

Abstract Categories

We can generalize the notion of a concrete category to an (abstract) category. A category C consists of a collection of objects Obj(C) and a collection of morphisms Mor(C). Each morphism has a source and a target, both of which are objects. We write A −f→ B to indicate that f is a morphism with source A and target B. Furthermore, there is a composition operator that takes a morphism A −f→ B and a morphism B −g→ C to the morphism A −f;g→ C. Composition f ; g is defined only if the target of f is the source of g. Composition is required to be uniquely defined and associative. Finally, for every object A there is a special morphism 1A, the identity morphism on A, which is an identity element for composition. If there is no danger of confusion, we usually drop the index and simply write 1 for 1A. Summarizing, the following always hold in a category:

f ; (g ; h) = ( f ; g) ; h , (; associative)
1 ; f = f and f ; 1 = f . (1 unit)

We write C(A, B) for the collection of all morphisms A −f→ B in C. In category theory, we do not need to assume that the morphisms form a set. If they do, then the category is said to be small. Figure 2.7 illustrates these notions. The objects are A, B, C, and D; the morphisms are A −f→ B, B −g→ C, B −h→ D, C −j→ D, D −k→ B, the identity morphisms for each object, and all morphisms that can be constructed from the explicitly indicated morphisms by composition. Any concrete category will be an abstract category, where the sets with structure are the objects and the structure-preserving functions are the morphisms of the category.

FIGURE 2.7. Objects and morphisms
In particular, the concrete category of sets with functions forms a category. The sets A, B, ... are the objects and the functions f : A → B, ... are the morphisms A −f→ B. Composition in this category is forward functional composition, defined by ( f • g).x = g.( f.x), and the identity functions id : A → A are the unit morphisms 1A in the category. Functional composition is associative, so the category properties hold.

We could also choose the functions f : B → A (the reverse direction) as the morphisms A −f→ B. In this case, composition is backward functional composition, defined by ( f ◦ g).x = f.(g.x), while the identity function id : A → A still serves as the unit morphism 1A.

The collection of all relations also forms a category. Each object is a set, and as morphisms A −R→ B we choose relations R ⊆ A × B. The identity relation serves as the unit element, and composition is ordinary relational composition R • Q, where (x, z) ∈ R • Q iff (x, y) ∈ R and (y, z) ∈ Q, for some y. This is an example of a category that is not a concrete category.

A preorder can be seen as a special kind of category. The objects of the category are the elements of the preorder. There is a morphism a −→ b between two elements a and b if and only if a ⊑ b. Reflexivity guarantees that there is an identity morphism, and transitivity guarantees that the composition of two morphisms is a morphism in the poset category.
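For readers who know Haskell, the category of functions is available directly as the Category instance for (->) in Control.Category, with id as the unit morphism and (.) (or its forward variant (>>>)) as composition. The sketch below only spot-checks the unit and associativity laws at sample arguments; it is an illustration, not part of the development.

```haskell
import Prelude hiding (id, (.))
import Control.Category (Category (..), (>>>))

-- Functions form a category: id is the unit morphism and (>>>) is
-- forward composition (the ; of the text, read left to right).
f, g, h :: Int -> Int
f = (+ 1)
g = (* 2)
h = subtract 3

main :: IO ()
main = do
  -- Unit laws: 1 ; f = f and f ; 1 = f (checked at a sample point).
  print ((id >>> f) 10 == f 10 && (f >>> id) 10 == f 10)
  -- Associativity: f ; (g ; h) = (f ; g) ; h.
  print ((f >>> (g >>> h)) 10 == ((f >>> g) >>> h) 10)
```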

Functors and Homomorphisms

The analogue to a lattice homomorphism in a category is a functor. Given two categories C and D, a functor F maps objects of C to objects of D and morphisms of C to morphisms of D in such a way that sources, targets, composition, and identities are preserved. This means that F maps every morphism A −f→ B to a morphism F. f with source F.A and target F.B; i.e., F.A −F.f→ F.B, and the following holds:

F.( f ; g) = (F. f ) ; (F.g) , (functor properties)
F.1A = 1(F.A) .

A trivial example of a functor is the identity functor, which maps a category to itself. The functor F reduces to a homomorphism in the special case when it does not change the objects; i.e., F.A = A for each object A.
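Haskell's Functor class gives a programming-language analogue of this definition for functors from the category of Haskell types and functions to itself: fmap sends each morphism f to the morphism fmap f, and the functor laws are exactly preservation of identities and of composition. The following spot checks on the list functor are an illustration only.

```haskell
main :: IO ()
main = do
  let f = (+ 1) :: Int -> Int
      g = (* 2) :: Int -> Int
      xs = [1, 2, 3] :: [Int]
  -- F.1 = 1: mapping the identity morphism is the identity.
  print (fmap id xs == xs)
  -- F.(g ; f) = F.g ; F.f (with ; read as forward composition):
  -- mapping a composite equals composing the mapped morphisms.
  print (fmap (f . g) xs == (fmap f . fmap g) xs)
```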

Order-Enriched Categories

The function space construction gives us new lattices where the partial order and the lattice operations are defined by pointwise extension. Since the elements are functions, we can also compose them, and there is an identity element for functional composition. Relations have a similar structure. The function space constructions thus have both a lattice structure and a category structure. Most of the basic structures that are needed for the refinement calculus turn out to be categories of this kind.

Consider a category C where for arbitrary objects A and B the morphisms C(A, B) are ordered by a relation ⊑A,B. We say that C is an order-enriched category if composition is monotonic in both arguments:

f ⊑A,B f′ ∧ g ⊑B,C g′ ⇒ f ; g ⊑A,C f′ ; g′

(any category can in fact be viewed as such an order-enriched category if we choose the discrete partial order as the ordering). If composition is monotonic only in its left argument,

f ⊑ f′ ⇒ f ; g ⊑ f′ ; g ,

then we call the category a left order-enriched category. A right order-enriched category is defined analogously. A category is obviously order-enriched if and only if it is both left and right order-enriched. If the ordering relation is a preorder, poset, lattice, etc., we call the category a preorder-enriched category, a poset-enriched category, a lattice-enriched category, and so on.

The interaction between the order and category structures gives rise to numerous distributivity possibilities. For example, we may want to know whether composition distributes over lattice operations from the left or from the right. For instance, does c ; (a ⊓ b) = c ; a ⊓ c ; b (left distributivity over meet) hold, or does (a ⊓ b) ; c = a ; c ⊓ b ; c (right distributivity over meet) hold? If the morphisms for a given source and target form a complete lattice, we may ask whether (⊔ i ∈ I • ai) ; b = (⊔ i ∈ I • ai ; b). Similarly, we may ask whether meet and join distribute over composition. For instance, is (a ; b) ⊓ c = (a ⊓ c) ; (b ⊓ c)? In this case we need not distinguish left from right distributivity, since meet and join are commutative.

Example: Pointwise Extension

Assume that C is a category where the objects are posets (A, ⊑A), (B, ⊑B), ... and the morphisms are functions f : A → B, ..., ordered by the pointwise extension of the ordering relation. Composition is backward functional composition. We can see that this category is left poset-enriched, as follows. Assume that f ⊑ g. Then

  ( f ; h).x
= {definition of composition}
  f.(h.x)
⊑ {assumption}
  g.(h.x)

= {definition of composition}
  (g ; h).x

so f ⊑ g ⇒ f ; h ⊑ g ; h follows by pointwise extension (a dual argument shows that if composition is forward function composition, then the category is right poset-enriched).

If we choose the monotonic functions f : A → B as the morphisms in this category, rather than all functions, then we get a (proper) poset-enriched category. Assume that f ⊑ g and h is monotonic. Then

  (h ; f ).x
= {definition of composition}
  h.( f.x)
⊑ {assumptions; pointwise extension}
  h.(g.x)
= {definition of composition}
  (h ; g).x

Next we consider distributivity of composition over lattice operators in either one of these categories. The following derivations show that composition distributes from the right over meet:

  (( f ⊓ g) ; h).x
= {definition of composition}
  ( f ⊓ g).(h.x)
= {pointwise extended meet}
  f.(h.x) ⊓ g.(h.x)
= {definition of composition}
  ( f ; h).x ⊓ (g ; h).x
= {pointwise extended meet}
  ( f ; h ⊓ g ; h).x

so ( f ⊓ g) ; h = f ; h ⊓ g ; h.

Right distributivity also holds for join, bottom, top, and complement. However, left distributivity does not in general hold for any of the lattice operations. Forward function composition again dually distributes from the left over lattice operations.
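The right-distributivity law derived above can be replayed on a small instance. In this Haskell sketch (ours, purely illustrative), the target lattice is Bool with conjunction as meet, and backward function composition (.) plays the role of the composition ; used in the derivation.

```haskell
-- Functions into Bool with pointwise meet (conjunction).
meetF :: (a -> Bool) -> (a -> Bool) -> (a -> Bool)
meetF f g = \x -> f x && g x

-- (.) plays the role of the composition ; of the text: (f ; h).x = f.(h.x).
main :: IO ()
main = do
  let f, g :: Int -> Bool
      f = even
      g = (> 0)
      h :: Int -> Int
      h = (* 3)
      xs = [-2 .. 5]
  -- Right distributivity over meet: (f ⊓ g) ; h = (f ; h) ⊓ (g ; h).
  print (map (meetF f g . h) xs == map (meetF (f . h) (g . h)) xs)   -- True
```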

2.6 Summary and Discussion

We have given an overview of the basic notions of partial orders, of monotonicity on posets, and of meets and joins in posets. We have defined what it means for a poset to be a lattice and have identified a number of subclasses of lattices. We have also shown how to build new posets and lattices from old ones using product and function spaces, and shown to what extent the lattice properties are inherited by these constructions. We have looked at structure-preserving mappings (homomorphisms) between lattices and shown how collections of lattices form categories.

The category theory that we need in this book will be quite elementary. We mainly use it as an organizational concept for the different kinds of algebraic structures that we need for the refinement calculus. It also allows us to treat different kinds of compositions in a uniform manner, in the same way as lattice theory allows us to treat ordering in a uniform manner across many different kinds of structures.

The material on lattices presented here is rather standard and can be found in many classical references such as Birkhoff [40] and George Grätzer [63]. In particular, these references contain proofs of the theorems of infinite distributivity for complete boolean lattices and the representation theorem for atomic lattices. A modern textbook is Davey and Priestley [50]. Good introductions to category theory and its use in computer science have been written by Benjamin Pierce [116] and by Barr and Wells [38].

We have chosen to start our exposition of the refinement calculus with an overview of basic lattice-theoretic notions. The reason is that these simple ideas form the basis for the formalization of higher-order logic that we will consider next. To a large extent, reasoning about truth values, sets, relations, and program statements can be seen as different instantiations of the same basic lattice-theoretic concepts that we have described above. This has also motivated us to present the material in somewhat more detail than what might be strictly needed, because it is important to have a good intuitive grasp of the concepts involved.

2.7 Exercises

2.1 Let the set A = {1, 2, 3, 6, 8, 12} be ordered such that m ⊑ n means that m is divisible by n. Find two numbers in A that do not have a least upper bound in A. Is it possible to find two numbers in A that do not have a greatest lower bound in A?

2.2 Show that the composition of two monotonic functions is a monotonic function.

2.3 Assume that B is a lattice and A is an arbitrary set. Show that pointwise extension to the function space A → B is a lattice homomorphism.

2.4 Let L be the set of all cofinite sets of natural numbers (i.e., all sets that have a finite complement), ordered by set inclusion. Show that L is a lattice but not a complete lattice. Does it have a top or a bottom element?

2.5 Prove that the structure (Bool × Nat, ⊑) is a lattice, where ⊑ is the following ordering: (b, n) ⊑ (b′, n′) ≡ (b ⇒ b′) ∧ n ≤ n′. What are the least upper bounds and greatest lower bounds in this lattice? Is the lattice bounded, distributive, complete, boolean? Show that f : Bool × Nat → Nat, where f.(F, n) = 2n and f.(T, n) = 2n + 1, is monotonic but not an order embedding of (Bool × Nat, ⊑) into (Nat, ≤).

2.6 Let A be any infinite set. Then (A → Bool, ⊆) is an atomic complete boolean lattice. Show that (A → Bool) → (A → Bool) ordered by pointwise-extended set inclusion is not atomic.

2.7 Assume that f and g are bijective monotonic functions with f ⊑ g. Show that g⁻¹ ⊑ f⁻¹.

2.8 Show that the implications in Figure 2.6 are all valid.

2.9 The function f : P(Bool) → P(Nat) is defined so that f.∅ = ∅, f.{F} = {0}, f.{T} = {x | x ≠ 0}, f.Bool = Nat. Show that f is a homomorphic embedding.

2.10 Let A be an arbitrary set and ⊥ and ⊤ two elements not in A. Show that the set A ∪ {⊥, ⊤}, ordered by a ⊑ b ≡ a = b ∨ a = ⊥ ∨ b = ⊤, is a complete lattice.

2.11 Show that the meet operation in a complete lattice has the following property:

x = ⊓A ≡ (∀y • y ⊑ x ≡ (∀a ∈ A • y ⊑ a)) .

Since the original definition characterizes meet using two properties (that it is a lower bound and that it is greater than all lower bounds), this characterization can make proofs about meets shorter. What is the dual characterization of join?

3 Higher-Order Logic

We now introduce the logical basis for program refinement, higher-order logic. This is an extension of the simply typed lambda calculus with logical connectives and quantifiers, permitting logical reasoning about functions in a very general way. In particular, it allows quantification over higher-order entities such as predicates and relations, a feature that we will find very useful in our formalization. The power of higher-order logic is similar to that of set theory. It is sufficient for expressing most ordinary mathematical theories and has a simple set-theoretic semantics.

We describe the basic framework of higher-order logic in this chapter: the syntactic entities of the logic, the semantic meaning of these entities, the notions of deduction and proof, and the notion of a theory. In the next chapter we then give the specific axioms and inference rules that are needed to reason about functions. These are the usual rules of the simply typed lambda calculus. The axioms and inference rules needed for reasoning about logical propositions in general are postponed to Chapter 6.

3.1 Types and Terms

In informal mathematics, functions have to be introduced with an explicit name before they can be used in mathematical expressions. Lambda calculus provides a mechanism for building expressions that denote functions (so-called lambda expressions), so that functions can be used without explicit definitions and naming.

Functions are constructed by λ-abstraction. The function that maps each natural number n to its successor n + 1 is written as the lambda abstraction (λn : Nat • n + 1). Here Nat is a type that denotes the set of natural numbers. The part λn : Nat expresses that the argument of the function is denoted by n and ranges over the set of natural numbers Nat. The value of the function for argument n is given by the expression n + 1. The parentheses delineate the scope of the bound variable n.

Lambda abstraction permits us to be precise about the variables and constants in an expression. For instance, we can make a distinction between the functions (λn : Nat • n − m) and (λm : Nat • n − m). In the first case, n is the argument of the function and m is a constant that is assumed to have some fixed value (determined by the context). In the second lambda expression, it is the other way around. We can also nest lambda abstractions, so that, e.g., (λm : Nat • (λn : Nat • n − m)) is the function that maps each natural number m to the function (λn : Nat • n − m).

Applying a function to an argument is called application. Application is an operation that takes two terms, t denoting a function and s denoting an argument, and gives as result the value of function t at argument s. The application of t to s is written t.s. For example, we have that

(λn : Nat • n − m).(a + 3) = (a + 3) − m ,
(λm : Nat • (λn : Nat • n − m)).(a + 3) = (λn : Nat • n − (a + 3)) .

The exact rules for computing with abstractions and applications will be made precise later. But before that, we need to be more precise about the syntax and semantics of lambda calculus types and expressions.

Types

The types of higher-order logic are expressions that denote sets. Let AT be a type structure, i.e., a set of type operators, each having a name and an arity (an arity is a natural number). Types over AT are defined by the following context-free grammar:

Σ ::= C | Σ1 → Σ2 | Op(Σ1, ..., Σn) . (types)

Here C is a nullary type operator (also called an atomic type) in AT, while Σ1, ..., Σn are types (we generally use Σ, Γ, and Δ as metavariables for types). The type operator → associates to the right, so that, e.g., Σ1 → Σ2 → Σ3 abbreviates Σ1 → (Σ2 → Σ3). In the last alternative, Op is an n-ary type operator in AT that constructs a new type out of the types Σ1, ..., Σn.
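The grammar of types is easily mirrored by an algebraic data type. The Haskell sketch below is our own rendering and not part of the logic itself; a type structure is represented simply by operator names, with arities left implicit in the argument lists.

```haskell
-- Types over a type structure: atomic types are nullary operators,
-- the function type constructor is built in, and Op covers the rest.
data Ty
  = Atomic String        -- a nullary type operator, e.g. Bool or Ind
  | Fun Ty Ty            -- Σ1 → Σ2, associating to the right
  | Op String [Ty]       -- an n-ary type operator applied to n types
  deriving (Eq, Show)

-- (Ind → Bool) → Bool as a value of this data type.
example :: Ty
example = Fun (Fun (Atomic "Ind") (Atomic "Bool")) (Atomic "Bool")
```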

Each type denotes some nonempty set. The atomic types denote designated sets in the universe of all sets (this will be made more precise below). Higher-order logic postulates two atomic types: Bool, which denotes the two-element set of truth values, and Ind, which denotes an infinite set of individuals.

The function type Σ1 → Σ2 stands for the set of all (total) functions from the set denoted by Σ1 to the set denoted by Σ2. Thus, e.g., Ind → Bool denotes the set of functions from individuals to truth values. Similarly, each type operator denotes a set-forming operation, and Op(Σ1, ..., Σn) stands for the result of applying that operation to the sets denoted by Σ1, ..., Σn. In fact, we can view → as a binary type operator, mapping sets Σ1 and Σ2 to the set Σ1 → Σ2. However, we prefer to give it special status, since it is so fundamental. It is possible to add new type operators (including atomic types) to the logic. This is illustrated in Chapter 4 and described in more detail in Chapter 10. Part of the power of higher-order logic comes from the fact that we can iterate type construction to ever higher levels of functionality. For instance, (Ind → Bool) → Bool denotes the set of functions that take functions from individuals to truth values as arguments and return truth values as results. This is a set of higher-order functions.

Terms

The terms of higher-order logic are expressions that denote elements of types. Each term t in higher-order logic is associated with a unique type. We assume that two disjoint infinite sets of names are given, constant names and variable names. A constant is a pair (c, Σ), written c_Σ, where c is a constant name and Σ is a type. Similarly, a variable is a pair (v, Σ), written v_Σ, where v is a variable name and Σ is a type. Just as the definition of types is relative to a particular type structure AT, the definition of terms is relative to a given set SG of constants, called a signature, where each constant has a type over AT. The terms of higher-order logic are defined by the following context-free grammar:

t ::= v_Σ (variable)
    | c_Σ (constant)
    | t1.t2 (application)
    | (λv_Σ • t1) (abstraction)

(we use s and t as metavariables for terms; c, f, and g as metavariables for constants; and u, v, and w as metavariables for variables). We associate a type with each term using the following rules:

(i) Each variable v_Σ is a term of type Σ.

(ii) Each constant c_Σ in SG is a term of type Σ.

(iii) If t is a term of type Σ1 → Σ2 and t1 a term of type Σ1, then the application t.t1 is a term of type Σ2.

(iv) If v_Σ1 is a variable and t2 a term of type Σ2, then the abstraction (λv_Σ1 • t2) is a term of type Σ1 → Σ2.

A term t is well-typed if it has a type Σ according to the above rules. The type of each well-typed term is uniquely determined by the syntax. We write t : Σ to express that t has type Σ. A well-typed term t : Σ denotes some element in the set denoted by Σ, in the way described intuitively above. From now on, we always take 'term' to mean 'well-typed term'.

Type indications are usually omitted from constants and variables when this information can be deduced from the structure of the term or from the context in which the term is used. When needed, it is often sufficient to indicate the types of some variables in lambda abstractions. In this case, we prefer to write (λv : Σ • t1) for (λv_Σ • t1), particularly if the type expression Σ is more complex. Two different variables can have the same name, if they have different types. Two different constants can also have the same name, provided that they have different types. Both kinds of overloading are quite convenient, provided that one adds extra type information or explanations whenever necessary to avoid ambiguity.
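The typing rules (i)–(iv) amount to a small recursive type checker. The following Haskell sketch is our own illustration: it repeats the Ty type from the previous sketch, attaches a type annotation to every variable and constant (as in v_Σ and c_Σ), and returns the unique type of a well-typed term, or Nothing for an ill-typed one.

```haskell
-- Ty repeated from the earlier sketch of the type grammar.
data Ty = Atomic String | Fun Ty Ty | Op String [Ty]
  deriving (Eq, Show)

-- Terms, with every variable and constant carrying its type annotation.
data Term
  = Var String Ty         -- variable v_Σ
  | Con String Ty         -- constant c_Σ
  | App Term Term         -- application t1.t2
  | Lam String Ty Term    -- abstraction (λv_Σ • t)
  deriving (Eq, Show)

-- Rules (i)-(iv): the type of a well-typed term, Nothing otherwise.
typeOf :: Term -> Maybe Ty
typeOf (Var _ s)   = Just s
typeOf (Con _ s)   = Just s
typeOf (App t u)   =
  case (typeOf t, typeOf u) of
    (Just (Fun a b), Just a') | a == a' -> Just b
    _                                   -> Nothing
typeOf (Lam _ s t) = Fun s <$> typeOf t
```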

Precedence and Association

Application associates to the left, so that t.t′.t″ stands for (t.t′).t″. In some cases we use a constant f : Σ → Γ → Δ as an infix operator, writing x f y rather than f.x.y. Typical examples are addition and subtraction; we prefer to write x + y rather than +.x.y. Infix operators have lower precedence than ordinary function application, and they always associate to the right. The precedences of different infix operators are listed in Appendix A. Parentheses can always be used to override (or emphasize) the precedences.

For some constants we use a prefix format, writing f x for f.x, with the same precedence. In particular, we write negation and complement as prefix constants, and similarly for meets and joins over sets. Prefix operators have lower precedence than function application, but higher precedence than infix operators. A small number of constants are written as postfix operators. Most postfix operators are written as superscripts (e.g., S◦ is the duality operator ◦ applied to S). Postfix operators have higher precedence than ordinary function application.

Abstraction is written as a mixfix, with delimiters (the parentheses) indicating the scope of the abstraction. Abstraction has lower precedence than application. To simplify nested abstractions, we can write (λv • (λv′ • t)) as (λv v′ • t).

Substitution and Replacement

We make a distinction between free and bound occurrences of variables in a term. An occurrence of the variable v in term t is bound if it is within a subterm of t of the form (λv • t′); otherwise it is free. A variable is free in t if it occurs free in t, and it is bound in t if it occurs bound in t. A term in which all occurrences of variables are bound is called closed. Consider as an example the term (λn • n − m).(a + 1). The variable n is here bound, while m and a are free. A variable can occur both free and bound in the same term. For example, the variable n occurs both free and bound in the term n + (λn • n + 1).m.

The term t[v := t′] stands for the result of substituting the term t′ for all free occurrences of variable v in term t. If no free variable of t′ becomes bound in t[v := t′], then we say that t′ is free for v in t. Thus, (λn • n − m).(a + 1)[m := (a − 1)] is the same term as (λn • n − (a − 1)).(a + 1). The term (a − 1) is free for m in (λn • n − m).(a + 1), because a does not become bound in the substitution. The term (n − 1) would not be free for m in this term. We write t[v1, ..., vm := t1, ..., tm] for the simultaneous substitution of terms t1, ..., tm for variables v1, ..., vm in t. Substitution is sometimes used to indicate a term with specific occurrences of some subterm singled out; t[v := s] is a term where the variable v is used in term t to indicate the positions of the interesting occurrences of subterm s.
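The side condition "t′ is free for v in t" is exactly what a capture-avoiding substitution must enforce. The Haskell sketch below (ours, on untyped terms for brevity) computes free variables and performs t[v := s], renaming a bound variable whenever the plain textual substitution would capture a free variable of s.

```haskell
import qualified Data.Set as S

data Term = Var String | App Term Term | Lam String Term
  deriving (Eq, Show)

free :: Term -> S.Set String
free (Var x)   = S.singleton x
free (App t u) = free t `S.union` free u
free (Lam x t) = S.delete x (free t)

-- subst t v s computes t[v := s], renaming bound variables where the
-- plain substitution would not be "free for v in t".
subst :: Term -> String -> Term -> Term
subst (Var x) v s
  | x == v    = s
  | otherwise = Var x
subst (App t u) v s = App (subst t v s) (subst u v s)
subst (Lam x t) v s
  | x == v                 = Lam x t                  -- v is bound here: no free occurrences
  | x `S.notMember` free s = Lam x (subst t v s)      -- s is free for v under this binder
  | otherwise              = Lam x' (subst (subst t x (Var x')) v s)  -- α-convert first
  where
    x' = head [ y | n <- [1 :: Int ..]
              , let y = x ++ show n
              , y `S.notMember` free s
              , y `S.notMember` free t
              , y /= v ]
```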

Standard Types and Constants

It is necessary to impose some additional structure on the framework introduced so far. For now, we assume only a minimal structure that is needed to reason about functions. In Chapter 6, we introduce further structure for reasoning about truth and falsity. We assume that every type structure contains the standard type Bool. Furthermore, we assume that every constant signature contains a standard constant called equality,

= : Σ → Σ → Bool (equality)

for each type Σ. The intended (standard) interpretation is that Bool denotes a distinguished two-element set of truth values and that = denotes equality on the set denoted by Σ.

3.2 Semantics

We give a brief description of the meaning (semantics) of types and terms in higher-order logic. The semantics is set-theoretic; a type Σ denotes a set in a universe of sets, and a term t : Σ denotes an element in the set denoted by Σ. We assume a universe (family) U of nonempty sets, which is closed under the following operations: nonempty subset, cartesian product, powerset, and function space. This means that, e.g., if X is in U, and Y is a nonempty subset of X, then also Y is in U. Furthermore, we assume that U contains a distinguished infinite set I (its elements are called individuals) and a distinguished two-element set 2 = {ff, tt} (its elements are called truth values). Finally, we assume that there is a choice function called choice on U that chooses some (arbitrary) element from every nonempty set, i.e., choice.X ∈ X for all X ∈ U, X ≠ ∅. Not all these assumptions are needed in this chapter, but we find it convenient to make all assumptions about the universe explicit from the start.

Semantics of Types

A model for a type structure AT is given by associating every n-ary type operator op in AT with some function f : Uⁿ → U (the meaning of the operation).

We require that the type Bool be given a standard meaning as the set 2 and that the function type constructor → be given a standard meaning as the function in U² → U that maps the pair (X, Y) to the set of all (set-theoretic) total functions from X to Y. Given a model for AT, every type Σ denotes a set M.Σ (the meaning of Σ) in U. In a standard model, M.Bool = 2 and M.(Σ → Γ) = { f | f : M.Σ → M.Γ }.

For example, if × denotes cartesian product, then (Bool × Bool) → Bool denotes the set of all functions f : 2 × 2 → 2, i.e., functions that map pairs of truth values to truth values.

Semantics of Terms

Now assume that type structure AT is given and that SG is a signature over AT. A model for SG determines (1) a model for AT and (2) for every constant c : Σ in SG an element M.c ∈ M.Σ (the meaning of c). In a standard model we require that = : Σ → Σ → Bool be given its standard meaning as equality on the set denoted by Σ.

Consider a term t. An environment E for the term is a set of variable-value pairs {(x1, a1), ..., (xn, an)} that assigns a value ai in M.Σi to every free variable xi : Σi that occurs in t. The meaning of a term with free variables is determined relative to such an environment.

The meaning M_E.t of a term t in environment E is defined inductively, as follows. The model gives the meanings of constants,

M_E.c = M.c ,

and the environment gives the meanings of variables,

M_E.x = a, when (x, a) ∈ E.

For application, we define

M_E.(t.t′) = (M_E.t).(M_E.t′) .

Thus, if t denotes a function f and t′ denotes an element x, then t.t′ denotes the element f.x. Finally, the denotation M_E.(λv : Σ • t) of lambda abstraction is the function f, defined by

f.a = M_E′.t ,

that maps any a ∈ M.Σ to M_E′.t, where E′ is the same environment as E except that v is mapped to a. In this way, the meanings assigned to the constants determine the meanings of all terms that can be built out of these constants in a given environment. For a closed term t, the meaning is independent of the environment (since there are no free variables), and we can define M.t = M_∅.t.
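The inductive definition of M_E.t reads directly as a small interpreter. In the following Haskell sketch (an illustration only, over an untyped term representation and a toy semantic domain), the environment supplies the meanings of variables, a model function supplies the meanings of constants, and the example evaluates (λn • n − m).(a + 3) with m = 1 and a = 2.

```haskell
import qualified Data.Map as M

data Term = Var String | Con String | App Term Term | Lam String Term

-- A toy semantic domain: numbers and functions.
data Val = NumV Integer | FunV (Val -> Val)

type Env = M.Map String Val

apply :: Val -> Val -> Val
apply (FunV f) a = f a
apply _        _ = error "ill-typed application"

-- The meaning of a term in an environment, given meanings for constants.
meaning :: (String -> Val) -> Env -> Term -> Val
meaning model env term = case term of
  Var x   -> env M.! x                        -- the environment gives variables
  Con c   -> model c                          -- the model gives constants
  App t u -> apply (meaning model env t) (meaning model env u)
  Lam v t -> FunV (\a -> meaning model (M.insert v a env) t)

-- A tiny model: addition, subtraction, and the numeral 3 as constants.
model0 :: String -> Val
model0 "+" = FunV (\(NumV x) -> FunV (\(NumV y) -> NumV (x + y)))
model0 "-" = FunV (\(NumV x) -> FunV (\(NumV y) -> NumV (x - y)))
model0 "3" = NumV 3
model0 c   = error ("no meaning for constant " ++ c)

main :: IO ()
main = do
  let sub a b = App (App (Con "-") a) b
      add a b = App (App (Con "+") a) b
      env     = M.fromList [("m", NumV 1), ("a", NumV 2)]
      -- (λn • n − m).(a + 3) in an environment with m = 1, a = 2.
      term    = App (Lam "n" (sub (Var "n") (Var "m"))) (add (Var "a") (Con "3"))
  case meaning model0 env term of
    NumV n -> print n          -- prints 4, i.e. (2 + 3) − 1
    _      -> putStrLn "not a number"
```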

3.3 Deductive Systems

The syntax of higher-order logic is next extended with a deductive system in which we can prove properties of elements denoted by terms. We assume below that a standard type structure AT and a standard signature SG are given.

A formula is a term of type Bool. Given a specific assignment of values to the free variables in a formula (an environment), a formula thus always denotes a truth value. A sequent is a pair (Φ, t), written

Φ ⊢ t , (sequent)

where Φ is a finite set of formulas over SG (the assumptions of the sequent) and t is a single formula over SG (the consequent of the sequent). A sequent Φ ⊢ t is satisfied in a model of the types in AT and constants in SG if any environment (with assignment of values to the free variables in Φ and t) that makes every formula in Φ true also makes t true. A sequent is valid if it is satisfied in every standard model.

An inference is a tuple of sequents (Φ1 ⊢ t1, ..., Φm ⊢ tm, Φ ⊢ t), m ≥ 0. An inference is usually written as

Φ1 ⊢ t1   ···   Φm ⊢ tm
------------------------ . (inference)
Φ ⊢ t

The sequents Φ1 ⊢ t1, ..., Φm ⊢ tm are called the hypotheses and Φ ⊢ t is called the conclusion of the inference. An inference of higher-order logic is said to be valid if any standard model that satisfies all the hypotheses also satisfies the conclusion.

A deductive system DS is a set of inferences. We usually define the inferences of a deductive system DS with a small collection of inference rules. These have the same form as inferences but may contain metavariables that stand for arbitrary terms and sets of formulas. As before, we use s and t as metavariables for terms; c, f, and g for constants; and u, v, and w for variables. In addition, we use Φ for finite sets of formulas. Any instantiation of these metavariables that satisfies an associated side condition is then a permitted inference of the deductive system. An inference rule is written as

Φ1 ⊢ t1   ···   Φm ⊢ tm
------------------------ {R} (inference rule)
Φ ⊢ t
• side condition.

The name of the inference rule is R. Here Φ1, ..., Φm, and t1, ..., tm, t may contain metavariables. The side condition states the requirements that an instantiation of the metavariables with actual terms has to satisfy. An inference with no hypotheses (m = 0) is called an axiom. An inference rule with no hypotheses is called an axiom scheme of DS. The line that separates the (empty) set of hypotheses from the conclusion is usually omitted in axioms (but usually not in axiom schemes).

Deductions

We can construct new inferences from old ones by composition. If

Φ1 ⊢ t1   ···   Φi ⊢ ti   ···   Φn ⊢ tn
----------------------------------------
Φ ⊢ t

is an instance of inference rule R (and thus satisfies the side condition if there is one) and

Φi,1 ⊢ ti,1   ···   Φi,n ⊢ ti,n
--------------------------------
Φi ⊢ ti

is an instance of inference rule R′ with a conclusion that is the same as the ith hypothesis of the first inference, then their composition is the inference

Φ1 ⊢ t1   ···   Φi,1 ⊢ ti,1   ···   Φi,n ⊢ ti,n   ···   Φn ⊢ tn
----------------------------------------------------------------- ,
Φ ⊢ t

which we get from the first inference by replacing the hypothesis Φi ⊢ ti with the hypotheses Φi,1 ⊢ ti,1, ..., Φi,n ⊢ ti,n. A deduction in DS records the intermediate steps and the inference rules of DS that we have used to construct an inference. The deduction for the above inference is

                Φi,1 ⊢ ti,1   ···   Φi,n ⊢ ti,n
                -------------------------------- {R′}
Φ1 ⊢ t1   ···   Φi ⊢ ti   ···   Φn ⊢ tn
---------------------------------------- {R}
Φ ⊢ t

For each inference, we indicate the inference rule that determines it. The hypotheses of this deduction are the sequents that are not conclusions of inferences, and the conclusion of the whole derivation is Φ ⊢ t. By composing inferences, we can build deductions of arbitrary complexity. A sequent Φ ⊢ t is a theorem if there exists a proof of it, i.e., a deduction without any hypotheses and with the sequent as its conclusion.

Deductions can also be built by composing inference rules rather than inferences. In this case, we are building deduction schemes. One then has to be careful that the side conditions of all rules that are used in the deduction are satisfied. Each such deduction scheme gives us an inference rule where the conclusion is the final conclusion of the whole deduction scheme, the hypotheses are the topmost hypotheses of the deduction scheme, and the side condition consists of the accumulated side conditions of the rules used in the deduction. An inference rule arrived at in this way is called a derived inference rule.

A derived inference rule without any hypotheses is called a theorem scheme. A theorem scheme may thus have side conditions, and it may contain metavariables that stand for arbitrary terms or sets of formulas (a theorem cannot contain any metavariables).

3.4 Theories

A theory is determined by a type structure AT, a set of constants SG over AT (the signature), and a set of inference rules DS (the primitive rules of inference). The first determines the types that can be formed in the theory, the first and second together determine the terms and formulas that can be expressed in the theory, and all three together determine the proofs (and thus also the theorems and derived inference rules) in the theory.

A theory is consistent if there exists some sequent (over the signature of the theory) that cannot be derived. Once we have introduced the notion of falsity F, we can formulate this in a more familiar way; a theory is consistent if F cannot be proved as a theorem within the theory. A theory is sound (with respect to a given model) if every sequent that can be proved in it is valid in the model. It is complete (with respect to a model) if any valid sequent can be proved using the deductive system of the theory.

A theory is extended by adding new types, new constants, and new inference rules. A theory extension thus makes the language more expressive and may permit new theorems to be proved. Anything provable in the original theory is, however, also provable in the extended theory. An extension of a consistent theory need not be consistent. That is why it is desirable to extend the theory in a conservative way. An extension is said to be conservative if it does not extend the set of theorems over the old signature. This means that a sequent of the original theory can be proved in the extension if and only if it can be proved in the original theory. The advantage of a conservative extension is that it has a standard model whenever the original theory has one. This means that a conservative extension can never introduce inconsistency.

3.5 Summary and Discussion

This chapter has introduced the basic ingredients of higher-order logic: the syntax and semantics of types and terms, the notion of a deductive system, and the notion of a proof in higher-order logic. The basic notions of soundness and completeness of a deductive system have also been defined, as well as the notion of conservative theory extensions. For more details and two slightly different formulations of higher-order logic, we refer to the books by Peter Andrews [3] and by Michael Gordon and Tom Melham [62].

Higher-order logic is based on Alonzo Church's work on a simple theory of types [48], originally invented as a tool for studying the foundations of mathematics. The central notion of types was first proposed by Bertrand Russell and Alfred North Whitehead [140] in an attempt to avoid the paradoxes in naive set theory discovered by Russell. There are also stronger and more expressive type theories, such as the Martin-Löf type theory [99] and the calculus of constructions [49]. These permit subtypes and dependent types to be expressed quite elegantly. However, typing may then not be decidable, so correct typing needs to be proved for an expression at the same time that the properties expressed by the formulas are proved. Thus, the price to be paid for the added expressiveness is that proofs become more complex. The semantics of the more expressive type theories are also more complex, and we

lose the simple set-theoretic interpretation that higher-order logic has. Overall, our feeling is that classical higher-order logic closely matches the intuition of a working mathematician, and that the loss of expressive power is not a serious drawback in practice. The lack of subtypes is perhaps the biggest disadvantage of higher-order logic, but we will show that there are some rather simple ways of getting around this restriction.

3.6 Exercises

3.1 Assume that Σ and Γ are types and that c : Σ and g : Γ → Bool are constants. Deduce the types of all subterms of the following term: (λx : Bool • λy • f.x.c = g.y).

3.2 Show that the typing rules of higher-order logic rule out self-application of the form t.t; i.e., show that a term can never be applied to itself.

3.3 We want to create a theory over a single type T with the following constants:

P : T → T → Bool ,
A : T → T → Bool ,

and with the following two inference rules:

⊢ P.t.t′              ⊢ A.t.t′   ⊢ P.t′.t″
---------- ,          -------------------- .
⊢ A.t.t′              ⊢ A.t.t″

(a) Write a deduction of ⊢ A.x.z from ⊢ P.x.y and ⊢ P.y.z.

(b) Show that if there are deductions of ⊢ A.t.t′ and ⊢ A.t′.t″ using only the two inference rules above, then there is also a deduction of ⊢ A.t.t″.

4 Functions

We start our overview of the basic axioms and inference rules of higher-order logic by studying the rules for reasoning about functions. These are essentially the rules of simply typed lambda calculus. A method of extending the logic by defining new constants is described. In association with this we also describe two ways of introducing new notation without extending the logic: abbreviations (syntactic sugaring) and local definitions. Furthermore, we consider an important practical issue: how to present proofs in a simple and readable way. The natural deduction style is the traditional way of presenting proofs and is the one that we have described in Chapter 3. However, we prefer an alternative derivational proof style, because we find it more readable and it scales up better to large proofs. The natural deduction style and the derivational proof style are just two different ways of writing proofs; the underlying notion of a proof is the same in both styles. Finally, we postulate a new type constructor that denotes a cartesian product and investigate its basic properties.

4.1 Properties of Functions

There are basically three classes of rules that we use to reason about functions in higher-order logic: (1) rules that determine the basic properties of equality, (2) substitution and congruence rules for terms, and (3) conversion rules for abstraction and application. We look at all these rules in more detail below.

Equality

Equality is the most basic constant of higher-order logic. The following three inference rules establish that equality is an equivalence relation.

⊢ t = t (= reflexive)

Φ ⊢ t = t′   Φ′ ⊢ t′ = t″
-------------------------- (= transitive)
Φ ∪ Φ′ ⊢ t = t″

Φ ⊢ t = t′
----------- (= symmetric)
Φ ⊢ t′ = t

There are no side conditions associated with these inference rules. The metavariables t, t′, and t″ stand for arbitrary terms of arbitrary type. In particular, this means that the rules are also valid for terms of new types that are added to the logic later.

The transitivity rule is easily seen to hold for an arbitrary number of intermediate terms:

Φ1 ⊢ t1 = t2   Φ2 ⊢ t2 = t3   ···   Φn−1 ⊢ tn−1 = tn
------------------------------------------------------
Φ1 ∪ ··· ∪ Φn−1 ⊢ t1 = tn

The following is a deduction of this (derived) inference rule for n = 4:

               Φ2 ⊢ t2 = t3   Φ3 ⊢ t3 = t4
               ---------------------------- {= trans.}
Φ1 ⊢ t1 = t2   Φ2 ∪ Φ3 ⊢ t2 = t4
--------------------------------- {= trans.}
Φ1 ∪ Φ2 ∪ Φ3 ⊢ t1 = t4

Substitution and Congruence

The principle of substitution is the general inference rule for equality. It states that replacing subterms with equal terms preserves truth:

Φ ⊢ t1 = t1′   Φ′ ⊢ t[v := t1]
-------------------------------- (substitution)
Φ ∪ Φ′ ⊢ t[v := t1′]
• t1 and t1′ are free for v in t.

From this inference rule we can derive a rule that says that replacing equals by equals preserves equality:

Φ ⊢ t1 = t1′
------------------------------- (substitute equals for equals)
Φ ⊢ t[v := t1] = t[v := t1′]
• t1 and t1′ are free for v in t.

The derivation of this rule uses reflexivity:

               ------------------------------- {= refl.}
Φ ⊢ t1 = t1′   ⊢ t[v := t1] = t[v := “ t1 ”]
--------------------------------------------- {Subst.}
Φ ⊢ t[v := t1] = t[v := t1′]

We have here indicated the subterm that is replaced in the substitution within quotation marks (“ t1 ”). The quotation marks have no other significance; they serve only to emphasize the part of the term that is being changed.

A function associates a unique value with each argument: if the arguments are equal, then the values of the function on these arguments must also be equal. This is expressed by the following inference rule:

Φ ⊢ t = t′
------------------ (operand congruence)
Φ ⊢ t″.t = t″.t′

In higher-order logic, there is also another possibility: we can have two different terms that describe the same function. If these two terms are applied to the same argument, the values should again be equal:

Φ ⊢ t = t′
------------------ (operator congruence)
Φ ⊢ t.t″ = t′.t″

Operator and operand congruence are both easily derived from the rule for replacing equals for equals. Abstraction also preserves equality, as stated by the following rule of inference:

Φ ⊢ t = t′
-------------------------- (abstraction)
Φ ⊢ (λv • t) = (λv • t′)
• v is not free in Φ.

The side condition states that this rule may be used only with sequents where the variable v does not occur free in any formula in Φ. That the side condition is necessary is seen from the sequent x = 0 ⊢ x + 1 = 1. This is obviously true, but it does not follow that the function (λx • x + 1) is the same as the function (λx • 1), because the equality of the function values is established only in the special case when x = 0.

Symmetry and transitivity of equality can be proved from the other rules that we have given. Symmetry is proved using reflexivity and substitution:

             --------------- {= reflexive}
Φ ⊢ t = t′   ⊢ “ t ” = t
---------------------------- {Substitution}
Φ ⊢ t′ = t

In a similar way, transitivity is proved using only substitution (Exercise 4.3).

Conversion Rules

The lambda calculus conversion rules provide the basic mechanism for manipulating terms that denote functions. Two terms are considered equivalent if they differ only with respect to the naming of bound variables. In other words, the names of the bound variables are not significant. This property is expressed by the rule of α conversion:

⊢ (λv • t) = (λw • t[v := w]) (α conversion)
• w does not occur free in t.

An example of this rule is that (λx • x + y) = (λz • z + y). However, (λx • x + y) = (λy • y + y) does not follow by this rule (and indeed, is not true), because y occurs free in (x + y). Remember that t[v := w] stands for the substitution of variable w for all free occurrences of v in t. The side condition guarantees that no free variable w in t becomes bound when we replace v with w.

The β conversion rule gives the basic property of function application:

⊢ (λv • t).t′ = t[v := t′] (β conversion)
• t′ is free for v in t.

An example of this rule is that (λx • x + y).(y + 1) = (y + 1) + y. Ignoring the side condition for β conversion is a common error in manipulating lambda terms. Consider as an example the term (λx • (λy • x)).y. If we carry out the β conversion without considering the side condition, we get (λx • (λy • x)).y = (λy • y), i.e., the identity function, which is obviously wrong (we should get the function that returns y for any argument). The free variable y has become bound in the resulting expression. Thus, we cannot apply the β-conversion rule directly to simplify this term. The correct method is first to use α conversion to replace the bound variable y with another variable z that does not occur elsewhere in the formula, and then use β conversion and transitivity to establish the required result. In practice, α conversion is a basic step that is usually applied directly to a subterm, without further explanation.

The η conversion rule expresses that any term t : Σ → Γ can be written as an abstraction of an application:

⊢ (λv • t.v) = t (η conversion)
• v is not free in t.

An example of this rule is that (λx • (λz • z + y).x) = (λz • z + y). We can understand the η conversion rule as essentially stating that abstraction is the inverse

of application: application followed by abstraction is the identity transformation. The β conversion rule describes the converse situation, i.e., abstraction followed by application. Finally, we have the principle of extensionality: two functions are equal if they give the same value for each argument in the domain:

Φ ⊢ t.v = t′.v
---------------- (extensionality)
Φ ⊢ t = t′
• v not free in Φ, t, or t′.

The requirement that v not be free in Φ means that v is not restricted by the assumptions in Φ; i.e., Φ is satisfied by any value for v. The extensionality principle and the rule of operand congruence together state that two functions are equal if and only if they give the same value for each argument in the domain.

We can prove extensionality from the more basic rules. Assume that v is not free in Φ, t, or t′. Figure 4.1 shows the deduction for the extensionality inference rule. The left subderivation uses η conversion and symmetry, the middle subderivation uses abstraction, and the right subderivation again uses η conversion. The conclusion follows by transitivity. The assumptions guarantee that the side conditions for abstraction and η conversion are satisfied in this deduction.
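Assuming the Term type and the capture-avoiding subst function sketched in Section 3.1, β conversion becomes a one-line rewrite, and the problematic example (λx • (λy • x)).y is handled correctly because subst performs the necessary α conversion itself. This fragment is again an illustration and deliberately reuses the earlier definitions rather than repeating them.

```haskell
-- Reuses Term, free, and subst from the substitution sketch in Section 3.1.

-- One β-conversion step at the top of a term: (λv • t).t′  ↦  t[v := t′].
betaStep :: Term -> Maybe Term
betaStep (App (Lam v t) t') = Just (subst t v t')
betaStep _                  = Nothing

-- (λx • (λy • x)).y: the capture-avoiding substitution α-converts the
-- inner binder, giving (λy1 • y) instead of the wrong result (λy • y).
example :: Maybe Term
example = betaStep (App (Lam "x" (Lam "y" (Var "x"))) (Var "y"))
```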

4.2 Derivations

The format for deductions that we have given above shows the structure of the argument quite clearly but is not suitable for writing larger deductions (either deductions with many inference steps or deductions where the terms are large), as should be evident from the proof of extensionality given here. We therefore introduce an alternative format for writing deductions that scales up better and that we choose as the main format for writing proofs throughout the rest of the book. This format is based on a few simple notational conventions: writing inference rules and deductions sideways with indentation to show hypothesis and subproofs, using assumptions with scope to avoid repeating the same assumption over and over again in a deduction, and identifying the special role that transitivity and substitution play in deductions. We refer to deductions in the format we propose

⊢ (λv • t.v) = t        Φ ⊢ t.v = t′.v
------------------      ------------------------------
Φ ⊢ t = (λv • t.v)      Φ ⊢ (λv • t.v) = (λv • t′.v)      ⊢ (λv • t′.v) = t′
------------------------------------------------------------------------------
Φ ⊢ t = t′

FIGURE 4.1. Deduction of extensionality inference rule

here as derivations or calculational proofs. We emphasize that a derivation is just an alternative way of writing deductions; the basic notion of a deduction is not changed. Consider a deduction of the form

D1   D2   ···   Dm
------------------- {rule name}
Φ ⊢ t = t′

We can write this deduction sideways, as follows:

Φ ⊢ t
= {rule name}
  • D1
  • D2
    ⋮
  • Dm
. t′

where we assume that the subdeductions D1, D2, ..., Dm are also written sideways in the same way. Thus, rather than presenting the deduction as an upward-growing tree, we present it as a nested outline structure, with subdeductions shown as indented deductions.

This format contains exactly the same information as the usual format, but the information is arranged in a different way. Each part of the conclusion (the assumption, the first term, the equality symbol, and the second term) is written on a line of its own. The equality symbol is decorated with the name of the rule. The subdeductions are written indented one step to the right (each one marked with a bullet •). We add a small dot in front of the second term when necessary to show the nesting more clearly. This dot has no semantic meaning.

In most cases the hypotheses have the same assumptions as the conclusion in an inference. In such cases, the assumptions in the hypotheses are omitted, to make the proof more concise. If a hypothesis adds some formulas to the assumptions of the conclusion, then only the added assumptions are shown for the hypothesis, within square brackets (examples of this will be shown in later chapters). In both cases, the turnstile symbol ⊢ is omitted. The turnstile symbol is used only when we want to list the exact set of assumptions that are made for a hypothesis (thus overriding the present scoping assumption).

On the abstract level of an inference rule, where a single symbol Φ is used for the assumptions, this may seem like a small gain. However, in real proofs the assumptions may be very large formulas, and the gain is considerable. Most repetitions of assumptions are implicit with this convention, and the proof is compressed considerably. The convention is quite intuitive in the sideways format, where we can think of the assumption as having a scope that covers all nested subderivations (unless the implied assumptions are explicitly replaced by a new set of assumptions, marked by the turnstile symbol). An assumption in square brackets adds a new assumption to the assumptions already in force, so it is a local assumption.

The real advantage of the notational conventions above comes when we also permit inference steps to be strung together by transitivity. Consider the following proof:

DD1              DD2                          DDm
----------- {R1} ----------- {R2}    ···      -------------- {Rm}
Φ ⊢ t0 = t1      Φ ⊢ t1 = t2                  Φ ⊢ tm−1 = tm
------------------------------------------------------------ {T}
Φ ⊢ t0 = tm

where DD1, DD2, ..., DDm are the subdeductions for each inference step (so each DDi is a list of subdeductions Di,1, ..., Di,ni). Note that the assumptions Φ are the same in each hypothesis. We write this proof in the sideways format as follows:

Φ ⊢ t0
= {R1}
  DD1
• t1
= {R2}
  DD2
• t2
  ⋮
• tm−1
= {Rm}
  DDm
• tm

Transitivity is used implicitly in this derivation. We interpret the whole derivation as a proof of the statement Φ ⊢ t0 = tm. The inference step that explicitly states Φ ⊢ t0 = tm (by transitivity) is omitted. This format is more economical than the original proof format, because the intermediate terms ti are written out only once, while there are two occurrences of these terms in the original format. This again becomes important when the terms are large. Besides transitivity, we also use reflexivity and symmetry implicitly in derivations, as illustrated in the example below.

Example: Extensionality

The proof of extensionality is as follows when written as a derivation. The hypothesis is Φ ⊢ t.v = t′.v (recall that we assumed that v is not free in Φ, t, or t′). We have that

Φ ⊢ t
= {η conversion; side condition satisfied by assumption}
  (λv • t.v)
= {abstraction, v not free in Φ}
    • t.v
    = {hypothesis}
      t′.v
. (λv • t′.v)
= {η conversion; side condition satisfied by assumption}
  t′

We also use symmetry of equality implicitly: η conversion states that (λv • t.v) = t, but our derivation uses equality in the opposite direction in the first step. Note also that the assumptions Φ are available in the subderivation, so the hypothesis is really Φ ⊢ t.v = t′.v.

The nested derivation in the example is typical of this proof technique. In proving that (λv • t.v) = (λv • t′.v), we focus on the subcomponent t.v of the first term and show that t.v = t′.v. This establishes the outer-level equality by abstraction. In this example, the nested derivation is the hypothesis of the whole proof, but in general, we could have a longer proof here. If a nested derivation is obvious (e.g., using reflexivity), we may omit it and simply state its conclusion as a comment. The nested subderivation here is actually too short and obvious to be indicated explicitly. In such cases, it is sufficient to indicate the subterm being replaced and explain the subderivation in the justification. The derivation is then as follows:

Φ ⊢ t
= {η conversion; side condition satisfied by assumption}
  (λv • “ t.v ”)
= {abstraction, v not free in Φ, hypothesis}
  (λv • t′.v)
= {η conversion; side condition satisfied by assumption}
  t′

The derivational proof format shows the proofs of hypotheses in-line, indented one step to the right. This works as long as the indented proof is short. Deeply nested or long subproofs make the derivation difficult to follow. It is then better to make the proof of the hypothesis into a separate lemma that is proved either before or after the main proof. Alternatively, one can use a word processor with an outliner to construct the derivations. The outliner permits nested derivations to be hidden and shown as required and thus allows a better overall view of the whole derivation. An even better alternative is to have a proof editor with outlining capabilities, so that the correctness of the inference steps can also be checked mechanically.

4.3 Definitions

The most secure way of extending a theory is to define new constants explicitly. This is a conservative extension, so the consistency of the extended theory is guaranteed. We describe this method of extending a theory below, together with another simple mechanism, the use of local definitions.

Constant Definition

A new constant can be added to a theory by giving an (explicit) constant definition of the form

c ≙ t , (constant definition)

where t is a closed term. The definition adds the new constant c to the signature, and the new axiom ⊢ c = t to the inference rules. Explicit constant definition is a conservative extension, since the added constant c has the same denotation as the term t. This means that any proof that uses the constant c can be rewritten as an equally valid proof that uses t instead of c.

An example of an explicitly defined constant is the function composition operator ◦ (written infix). It is typed as

◦ : (Γ → Δ) → (Σ → Γ) → (Σ → Δ) ,

and its definition is

◦ ≙ (λ f g x • f.(g.x)) . (backward function composition)

The following variation on β conversion can be used to unfold an explicit constant definition:

•    t = (λv1 ···vn t )  (use of definition)   t.t1. ···.tn = t [v1,...,vn := t1,...,tn] • none of the terms ti contain any of the vi free. We derive this inference rule for the special case n = 1. Let the hypothesis be that •    t = (λv t ) and assume that v is not free in t1. We then have the following derivation of the inference rule: 

  Φ ⊢ t.t1
  = {substitute equals for equals, hypothesis}
    (λv • t').t1
  = {β conversion}
    t'[v := t1]

An explicit definition of a function constant is often more readable if we write the arguments on the left-hand side, writing c.v ≜ t rather than c ≜ (λv • t). These two ways of writing definitions are equivalent, in the sense that either one can be derived from the other. As an example, we define the identity function and forward composition (written infix) of functions and give an alternative definition for backward composition of functions:

  id.x ≜ x ,                 (identity)
  ( f • g).x ≜ g.( f.x) ,    (forward composition)
  (g ◦ f ).x ≜ g.( f.x) .    (backward composition)

Here typing requires that f and g be functions, where the result type of f is the same as the domain type of g.
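As an informal aside (not part of the logic itself), these three constants have direct counterparts in a functional programming language. The following Haskell sketch uses our own names forward and backward for the two composition orders; it is only meant to illustrate the definitions above.

  -- identity, forward and backward composition, mirroring id, (f • g) and (g ∘ f)
  identity :: a -> a
  identity x = x

  forward :: (a -> b) -> (b -> c) -> (a -> c)    -- (f • g).x = g.(f.x)
  forward f g = \x -> g (f x)

  backward :: (b -> c) -> (a -> b) -> (a -> c)   -- (g ∘ f).x = g.(f.x)
  backward g f = \x -> g (f x)

  main :: IO ()
  main = do
    print (forward (+ 1) (* 2) (3 :: Int))   -- add 1, then double: 8
    print (backward (* 2) (+ 1) (3 :: Int))  -- the same function, written backward: 8
    print (identity (5 :: Int))

The typings of forward and backward reflect the typing requirement stated above: the result type of the function applied first must match the domain type of the function applied second.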

Abbreviations

An alternative way of introducing new constructs into the logic is by using syntactic sugaring, or abbreviations. The syntax of higher-order logic is very terse; this makes the logic conceptually simple, but terms can be difficult to read. We can increase readability by introducing conventions that allow us to write terms in a new way. This does not extend the signature; it merely adds a layer of “pretty-printing” on top of the logic. This layer is very shallow; it is always possible to strip it away and work with the underlying pure terms if one wants to do that. However, by also stating inference rules in the sugared syntax, we can actually work on the abbreviation level all the time, forgetting completely about the pure syntax. We use abbreviations quite frequently in the subsequent chapters, in order to get a readable syntax for program constructs. We have already seen some examples of syntactic sugar, e.g., the infix notation used for equality and function composition.

Local Definitions

A very useful example of syntactic sugaring is local definitions. We define the following abbreviation:

  (let v = t in t') ≜ (λv • t').t .

This gives us a way of writing terms with a kind of local context. A (sub)term of the form (let v = t in t') can always be rewritten as (λv • t').t (and then β-converted). However, by using the let syntax we indicate that the term is going to be used in a specific way; we do not intend to rewrite it using β conversion. An example of a local definition is (let f = (λx • x + 1) in f ◦ f ). We have by the definition that

  (let f = (λx • x + 1) in f ◦ f ).0
  = {local definition}
    (λ f • f ◦ f ).(λx • x + 1).0
  = {β conversion}
    ((λx • x + 1) ◦ (λx • x + 1)).0
  = {functional composition}
    (λx • x + 1).(“ (λx • x + 1).0 ”)
  = {β conversion and simplification 0 + 1 = 1}
    (λx • x + 1).1
  = {β conversion and simplification 1 + 1 = 2}
    2

A typical useful property that can be expressed nicely in this syntax is the following:

  ⊢ (let v = t in t') = t'    (vacuous let)
  • v not free in t'.

Local definitions also distribute into applications:

  ⊢ (let v = t in t1.t2) = (let v = t in t1).(let v = t in t2) ,

and they distribute into lambda abstractions under certain conditions:

  ⊢ (let v = t in (λy • t')) = (λy • let v = t in t')
  • y not free in t.

The let construct is useful for introducing temporary (local) names for subterms inside a term. This makes it possible to structure terms in a way that makes them easier to read.
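The let abbreviation has an exact counterpart in most functional languages. The following Haskell sketch (ours, not the book's) evaluates the example above both with a let expression and in its desugared form as an application of a lambda abstraction; both reduce to 2.

  main :: IO ()
  main = do
    -- (let f = (λx • x + 1) in f ∘ f).0, written with Haskell's let
    print ((let f = \x -> x + 1 in f . f) (0 :: Int))    -- 2
    -- the desugared form (λf • f ∘ f).(λx • x + 1) applied to 0
    print (((\f -> f . f) (\x -> x + 1)) (0 :: Int))     -- also 2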

4.4 Product Type

Lambda abstraction permits us to describe functions over one variable. In practice, we very often need functions over two or more variables. There are essentially two different ways of describing such functions. Either we define functions over tuples directly, where tuples are elements of a product type, or we use currying. In the first case, a function for summing two natural numbers is written as

f.(x, y) = x + y . Here (x, y) is an element of the cartesian product type Nat × Nat, and f itself is of the type Nat × Nat → Nat. In the second approach (currying), we write the function as

f.x.y = x + y . In this case, f is of the type Nat → Nat → Nat. This approach uses higher-order functions in an essential way: f.x is a function of type Nat → Nat. The advantage of currying is that we can also talk about functions for which only some of the arguments are given; e.g., f.x stands for the function (λy • x + y) : Nat → Nat. This is useful in many situations, as we will notice, so we will in general prefer currying for functions with many arguments. However, the other approach also has its merits, as will also become evident later on. We extend the type structure of higher-order logic with product types, which stand for cartesian products of types. Product types are added to higher-order logic in the form of a binary type operator × (written infix). The operator × associates to the right, so Σ1 × Σ2 × Σ3 abbreviates Σ1 × (Σ2 × Σ3) (so a triple is really a nested pair). We also assume that the product constructor binds more strongly than the function constructor, so Σ → Γ × Δ stands for Σ → (Γ × Δ). The product type Σ1 × Σ2 stands for the cartesian product of the sets denoted by Σ1 and Σ2. Thus Nat × Nat → Nat × Nat denotes functions on pairs of natural numbers. In order to make use of products, we extend the logic with three new constants. Elements of the product Σ × Γ are constructed using the pairing function (pair :

Σ → Γ → Σ × Γ). In addition, we introduce projections (fst : Σ × Γ → Σ and snd : Σ × Γ → Γ). We use an infix comma for pairing, writing (t, t') for pair.t.t'. We postulate the following properties for the product operations:

  ⊢ (fst.x, snd.x) = x ,                      (pairing)
  ⊢ fst.(x, y) = x ,                          (projection 1)
  ⊢ snd.(x, y) = y ,                          (projection 2)
  ⊢ (x, y) = (x', y') ≡ x = x' ∧ y = y' .     (congruence)

The first three axioms state that pairing and projections are inverses of each other. The last one states that pairing is a congruence and that equality on pairs is uniquely determined by equality on its components. We assume that pairing associates to the right, so that (t, t', t'') stands for (t, (t', t'')).
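As an informal illustration (ours, not part of the logic), the product operations and the relation between the paired and curried styles can be seen directly in Haskell, where fst, snd, curry, and uncurry are built in.

  addPair :: (Int, Int) -> Int      -- f.(x, y) = x + y, defined over the product type
  addPair (x, y) = x + y

  addCurried :: Int -> Int -> Int   -- f.x.y = x + y, the curried form
  addCurried x y = x + y

  main :: IO ()
  main = do
    print (fst (1 :: Int, 2 :: Int), snd (1 :: Int, 2 :: Int))  -- projections
    print (addPair (3, 4))                 -- 7
    print (addCurried 3 4)                 -- 7
    print (curry addPair 3 4)              -- moving between the two styles: 7
    print (map (addCurried 3) [1, 2, 3])   -- partial application: [4,5,6]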

We introduce a special paired abstraction for products:

  (λu : Σ, v : Γ • t) ≜                          (paired abstraction)
  (λw : Σ × Γ • t[u, v := fst.w, snd.w]) ,

where w is an arbitrary variable that does not occur free in t. Since the syntax of higher-order logic does not allow paired abstraction, this is syntactic sugar; we can write terms with paired abstraction, but such a term always stands for a corresponding term in the basic syntax. The definition may not seem to be unique, but the α-conversion rule shows that it does not matter how we choose w.

The paired abstraction (λu,v • t) introduces names for the components of the product, which can then be used inside the term, thus considerably simplifying the term. An example of paired abstraction is the function

(λn : Nat, n' : Nat • n + n'), which maps pairs (n, n') of numbers to the number n + n'. It is convenient to write this expression as (λn, n' : Nat • n + n'); i.e., the type indication is given only once for the preceding list of bound variables. Without paired abstraction we would have to write this term as (λx : Nat × Nat • fst.x + snd.x).

The conversion rules (α, β, and η conversion) can be generalized to paired abstractions. Thus, we have the following:

  (λu, v • t) = (λu', v' • t[u, v := u', v'])    (α conversion)
  • u' and v' do not occur in t,

  (λu, v • t).(t1, t2) = t[u, v := t1, t2]       (β conversion)
  • t1 is free for u and t2 for v in t,

  (λu, v • t.(u, v)) = t                         (η conversion)
  • u and v not free in t.

Note that these rules work directly on the syntactically sugared terms. Exactly as for infix operators, this shows that we do not need to rewrite sugared terms into the basic syntax in order to do manipulations.

The rules above are easily justified using the basic conversion rules. For example, paired β conversion is justified as follows:

  (λu, v • t).(t1, t2)
  = {definition of paired abstraction syntax}
    (λw • t[u, v := fst.w, snd.w]).(t1, t2)
  = {β conversion}
    t[u, v := fst.w, snd.w][w := (t1, t2)]
  = {properties of substitution; w not free in t}
    t[u, v := fst.(t1, t2), snd.(t1, t2)]
  = {properties of projections}
    t[u, v := t1, t2]

4.5 Summary and Discussion

We have presented the fundamental rules that are needed for proving properties about functions. The rules for equality formalize the fact that equality is an equivalence relation. The substitution and abstraction rules formalize the principle of substituting equals for equals. Finally, the α, β, and η conversion rules formalize the basic notions of abstraction and application, and show how they are interdependent. These rules for functions are not specific to higher-order logic; they are

the general rules of the simply typed lambda calculus, also due to Church [48]. An overview of various lambda calculi is given by Henk Barendregt [37]. We have introduced derivations as an alternative format for writing proofs. This format is based on a few simple observations that permit a one-to-one correspondence to be established between proofs in the style of natural deduction and derivational proofs. This derivational style is used for proofs throughout the rest of the book. Derivations as described here are inspired by the calculational proof format introduced by van Gasteren, Dijkstra, and Wim Feijen and advocated by many computer scientists [54, 66, 60]. Although similar in spirit to their proof style, our approach differs in some important aspects from theirs. The basic underlying framework that we use is a sequent formulation of natural deduction for higher-order logic, while their proof style is based on Hilbert-style proofs. They combine the proof steps with a pointwise extended ordering relation, whereas we permit only simple ordering relations between proof steps. We also allow nested derivations in the proof. Nested calculational proofs have been proposed by, e.g., Jan van de Snepscheut, who suggested embedding calculational proofs within the comment brackets [131]. The approach taken here is more direct, in that the derivations are directly given in a nested (outline) format. The use of nested derivations is a big advantage when the terms become larger. This is the case with program derivations, where the terms are program statements. The proof style developed here also lends itself quite nicely to machine-supported presentation, browsing, and construction of proofs. Work in this direction has been done by Jim Grundy and Thomas Långbäcka [70].

4.6 Exercises

4.1 Write the proof of the rule “Substitute equals for equals” as a linear derivation.
4.2 Derive the rules of operator and operand congruence from the rule for replacing equals for equals.
4.3 Derive the transitivity rule for equality using only the substitution rule.
4.4 Derive the rule of η-conversion using other rules given in this chapter.
4.5 Show that the function composition operation ◦ is associative.

4.6 Derive the inference rules given for the let construct.

4.7 We define the constant twice by

  twice ≜ (λ f x • f.( f.x)) .

(a) What is the type of twice?

(b) Simplify (twice ◦ twice).(λx • x + 2).3 (note that this is not a case of self-application).

4.8 Define three constants (known as combinators) as follows:

  I.x = x ,
  S. f.g.x = f.x.(g.x) ,
  K.x.y = x .

(a) What are the types of I , S and K ?

(b) Prove that I = S.K.K . 4.9 Assume that T is a type and that e : T , inv : T → T , and + : T → T → T (infix) are constants satisfying the following rules:

  ⊢ s + (t + u) = (s + t) + u ,
  ⊢ s + e = s ,
  ⊢ s + inv.s = e ,

  ⊢ s + t = e
  ──────────────
  ⊢ t = inv.s .

Prove ⊢ inv.(s + t) = inv.t + inv.s by a derivation.

States and State Transformers

Imperative programs work by making successive changes to a state. In the intro- duction we assumed that an agent has available a collection of functions that it can apply to a state in order to change the state in some desired way. By composing such state functions we can describe more complicated state changes. This gives us a first, very simple model of programs, where a program is seen as a total function from a set of initial states to a set of final states. The state in imperative programs is really an abstraction of the computer memory, where values can be stored in locations and subsequently read from these locations. A program variable is essentially an abstract location. We will show in this chapter how to formalize the notion of program variables in higher-order logic and give some basic properties of program variables. At the same time, this chapter serves as an illustration of how to use the axioms and inference rules for functions that we developed in Chapter 4 to reason about states and state changes.

5.1 State Transformers

A function from states to states describes a state change. In principle, any type in higher-order logic can be used as a state space. The types that we have introduced thus far are, however, rather poor at modeling the often very complicated aspects of states that we meet in practice. We therefore prefer to use states with program variables as described in the next section to model more complicated states and computations. A state change may be within a single state space or from one state space to another. Let Σ and Γ be two state spaces. A function f : Σ → Γ is then called a state transformer from Σ to Γ (a function f : Σ → Γ where Σ is considered as a state space but Γ is not is called a state function). A state transformer category is a category where the objects are state spaces and the morphisms are functions between these state spaces. The identity function is the identity morphism, and forward functional composition is the composition of morphisms:

  1 ≜ id ,          (1 of state transformers)
  f ; g ≜ f • g .   (; of state transformers)

Composition and identity satisfy the category requirements:

  ( f • g) • h = f • (g • h) ,       (; associative)
  id • f = f  and  f • id = f .      (id unit)

A morphism thus maps states to states. In this case, the two arrow notations f : Σ → Γ (function) and Σ −f→ Γ (morphism) stand for the same thing. A state transformer category is determined by a collection of types (state spaces) and a collection of state transformers that includes the identity state transformers on the state spaces. In addition, we require that the composition of two state transformers in the category be again a state transformer in the category.

In particular, let Tran X be the state transformer category where the collection X of types forms the objects (state spaces) in the category and the morphisms are defined by

  Tran X (Σ, Γ) ≜ Σ → Γ ,    (state transformers)

where Σ and Γ are state spaces in X. Thus, the morphisms from Σ to Γ are all functions f : Σ → Γ. In what follows, we will simply write Tran for Tran X when the specific collection X of types is not important.

5.2 State Attributes and Program Variables

A program state will usually need to record a number of different aspects or attributes of the computation domain, which are then manipulated in the program. An attribute of type Γ over state space Σ is a pair (g, h) of functions, where g : Σ → Γ is an access function and h : Γ → Σ → Σ is an update function. Given a state σ, g.σ gives the value of the attribute in this state (the value is of type Γ), while h.γ.σ is a new state that we get by setting the value of the attribute to the value γ. The access and update functions of an attribute are accessed by the two projection functions fst and snd. However, to make it easier to remember which is which, we will define two new functions, val = fst and set = snd, as alternative names for the projection functions, so that val.(g, h) = g and set.(g, h) = h.

A sequence of updates to a state σ , with x1 first set to a1, then x2 set to a2, ..., xm set to am , is described in terms of forward composition as ((set.x1.a1) •···• (set.xm .am )).σ . In the category of state transformers, this is written as (set.x1.a1 ; ···; set.xm .am ).σ. We prefer the latter notation for composing state transformers. Consider an attribute x. Setting this attribute to a specific value and then accessing the attribute should yield the same value; i.e.,

val.x.(set.x.a.σ ) = a . (a)

We also expect that the attributes can be set independently of each other. In particular, this means that setting the value of an attribute x does not change the value of attribute y when x ≠ y:

val.y.(set.x.a.σ ) = val.y.σ . (b)

Next, an attribute can record only one value, so if we set the value of the attribute twice in succession, then only the last value is stored. This can be captured by the requirement

set.x.a ; set.x.b = set.x.b . (c)

Also, the order in which two different attributes are set should not matter, so we require that

set.x.a ; set.y.b = set.y.b ; set.x.a . (d)

Finally, we expect that setting an attribute to the value it already has does not change the state:

set.x.(val.x.σ ).σ = σ. (e)

Distinct attributes x1,...,xn are called program variables if they satisfy these independence requirements for any choice of x and y as two different attributes xi and x j in this list. The independence requirements essentially state the properties of an abstract memory with independent locations for storing values, where each location is associated with a specific type.
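To make the independence requirements concrete, here is a small Haskell sketch (ours, not the book's formalization) of a state space with two integer locations, where each program variable is a pair of an access function and an update function, and some of requirements (a)–(e) are checked on a sample state.

  data State = State { loc1 :: Int, loc2 :: Int } deriving (Eq, Show)

  data Attr a = Attr { val :: State -> a, set :: a -> State -> State }

  x, y :: Attr Int
  x = Attr loc1 (\a s -> s { loc1 = a })
  y = Attr loc2 (\a s -> s { loc2 = a })

  main :: IO ()
  main = do
    let s = State 0 0
    print (val x (set x 7 s) == 7)              -- (a): reading x after setting x
    print (val y (set x 7 s) == val y s)        -- (b): setting x leaves y unchanged
    print (set x 2 (set x 1 s) == set x 2 s)    -- (c): only the last write survives
    print (set x (val x s) s == s)              -- (e): writing back the current value

Any choice of record fields with independent locations satisfies these requirements; the point of the axioms is that the rest of the theory depends only on properties (a)–(f), not on one particular representation.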

We often use a single name for a list of distinct state attributes, x = x1,...,xm. Property (d) shows that it makes sense to write set.x.a.σ if a is a corresponding list of values. Similarly, val.x.σ can be used to stand for the list of values val.x1.σ,...,val.xm.σ. We can use these axioms to determine the value of a program variable in a given state. For instance, assume that we first set x1 to 3, then x2 to 5, and then finally set

x1 to 0. What is the value of program variable x2 in the resulting state? We have

  val.x2.((set.x1.3 ; “ set.x2.5 ; set.x1.0 ”).σ)
  = {property (d), associativity of functional composition}
    val.x2.(“ (set.x1.3 ; set.x1.0 ; set.x2.5).σ ”)
  = {definition of forward composition}
    val.x2.(set.x2.5.((set.x1.3 ; set.x1.0).σ))
  = {property (a)}
    5

Since we have postulated axioms for state attributes, we need to show that they do not introduce inconsistency into the logic. To show that the axioms for a collection x1 : Γ1,...,xm : Γm of attributes are consistent, it is sufficient to consider the model where Σ = Γ1 × ··· × Γm is the state space and xi is the ith projection function on Σ. Furthermore, it is possible to show that the axioms are complete with respect to this model, in the sense that any state change in Σ can be described as a sequence of state changes using update functions set.xi (in fact, in a normal form where no attribute is updated more than once), and any state function f : Σ → Γ can be described as a function over the access functions.

Expressions

A central use of program variables is in assignment statements like x := x + y. Intuitively, this says that (the value of) program variable x is changed to the sum of (the values of) program variables x and y. In terms of attributes, this means that the state σ is changed by setting the attribute x to a new value a, where a = val.x.σ + val.y.σ. The value a depends on the state σ, so the expression x + y is a function from states to values. An expression e on state space Σ is determined by the following grammar:

  e ::= g | ẏ | ḟ.e1. ··· .em .

Here g is an access function, g : Σ → Γ; y is a variable (not a program variable); and f is some (function) constant. We denote the pointwise extension of y to a state function on Σ by ẏ, where ẏ = (λσ : Σ • y). The pointwise extension of f is denoted by ḟ and is defined for m ≥ 0 by

  ḟ. f1. ··· . fm.σ = f.( f1.σ). ··· .( fm.σ) .

A boolean expression is an expression of type Σ → Bool, a natural number expression is an expression of type Σ → Nat, and so on. An expression is thus simply a term of higher-order logic that is built in a special way.

The pointwise-extended variable y˙ permits us to have free variables that are not attributes in expressions. This will be needed later on, when we consider quantification of boolean expressions. An expression is thus constructed out of access functions, pointwise-extended vari- ables, and pointwise-extended operations. We have earlier used pointwise-extended operations in connection with pointwise extension of lattices. Our definition of f˙ generalizes this to operations with any number of arguments. For instance, we can define the pointwise extension of arithmetic operations by

  0̇.σ = 0 ,
  1̇.σ = 1 ,
  ...
  (−̇ f1).σ = −( f1.σ) ,
  ( f1 +̇ f2).σ = ( f1.σ) + ( f2.σ) ,
  ...

An example of an expression written in infix notation is

  (g1 −̇ 1̇) +̇ (g2 +̇ ż) .

This expression contains two access functions, g1 and g2; the pointwise-extended variable ż; the pointwise-extended constant 1̇; and two pointwise-extended binary operations, +̇ and −̇.
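The idea of pointwise extension is easy to experiment with: an expression is just a function from states to values, and lifted operators combine such functions argumentwise. The following Haskell sketch (our own names and state representation, used only as an illustration) lifts constants, addition, and subtraction.

  type Expr s a = s -> a                      -- an expression over state space s

  litE :: a -> Expr s a                       -- pointwise-extended constant
  litE a = \_ -> a

  addE, subE :: Num a => Expr s a -> Expr s a -> Expr s a
  addE f1 f2 = \s -> f1 s + f2 s              -- (f1 +̇ f2).σ = f1.σ + f2.σ
  subE f1 f2 = \s -> f1 s - f2 s              -- (f1 −̇ f2).σ = f1.σ − f2.σ

  main :: IO ()
  main = do
    -- take the state to be a pair; g1 and g2 are the access functions of its components
    let g1 = fst
        g2 = snd
        e  = (g1 `subE` litE 1) `addE` (g2 `addE` litE 5)   -- (g1 − 1) + (g2 + 5)
    print (e (10 :: Int, 20 :: Int))   -- (10 − 1) + (20 + 5) = 34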

Simpler Notation for Expressions

Explicit indication of pointwise extension becomes quite cumbersome in practice. Remember that a constant is determined by its name and its type. Hence, we are free to use the same name for both the function constant c and its pointwise extension ċ, because these two constants have different types. Thus, we can define 0.σ = 0, 1.σ = 1, (− f1).σ = −( f1.σ), ( f1 + f2).σ = ( f1.σ) + ( f2.σ), and so on, just as we did when extending lattice operations. We drop the explicit indication of pointwise extension when the type information is sufficient to determine that the operations must be pointwise extended. Thus, we can write the above expression in a more standard way as

(g1 − 1) + (g2 + z).

The access functions that we use in an expression on Σ are the access functions of the attributes that we assume for the state space Σ. Thus, if we assume that Σ has attributes x1,...,xm, then the access functions for Σ are val.x1,...,val.xm. The expression would then be written as

(val.x1 − 1) + (val.x2 + z),

assuming that g1 and g2 are the access functions of attributes x1 and x2. In practice, it is cumbersome to indicate the access of each attribute explicitly. Most programming languages therefore have the convention that the attribute name also stands for the access function of the attribute. We follow this convention by postulating that for any attribute x we also have an access function with the same name x (both will usually be assumed to be variables) such that

  x = val.x .    (f)

Note that on the left-hand side we have x : Σ → Γ and on the right-hand side x : (Σ → Γ) × (Γ → Σ → Σ), so the two variables have the same name but different types; hence they are two different variables. We will choose this as our last assumption about program variables. With assumption (f), we can finally write the expression in a standard form, as it would be written in most imperative programming languages:

  (x1 − 1) + (x2 + z) .

We will assume that expressions are written in this way, using access functions x directly, rather than going via the attributes (val.x). As expressions are just ordinary terms of higher-order logic, we can prove properties of expressions in the usual way. As an example, consider proving that (x + y) − y = x. We have that

  (x + y) − y
  = {η-conversion}
    (λσ • ((x + y) − y).σ)
  = {pointwise extension}
    (λσ • (x.σ + y.σ) − y.σ)
  = {arithmetic}
    (λσ • x.σ)
  = {η-conversion}
    x

Note that the justification “arithmetic” is used for an equality of the form (a + b) − b = a, on the level of natural numbers, while the equality that we prove has the same form but on the level of expressions. In Chapter 8 we will show how this kind of proof is generally done in a single step.

Assignment

The update operation is a simple form of assignment, where an explicit value is assigned to an attribute (used as a program variable). More generally, we want to

permit an attribute to be assigned the value of an expression. Let e be an expression and let x be an attribute. We define the assignment (x := e) : Σ → Σ to be the following function:

  (x := e).σ ≜ set.x.(e.σ).σ .    (assignment)

An example is the assignment x := x + y − 3, which maps an initial state σ to a final state σ' = set.x.a.σ, where a = (x + y − 3).σ = (x.σ + y.σ − 3) = (val.x.σ + val.y.σ − 3). For attributes z ≠ x, we have val.z.σ' = val.z.σ if the independence axioms are assumed to hold. The assignment permits us to express independence axiom (e) more nicely as (x := x) = id. Axioms (a)–(d) allow us to prove a result like (x := a) ; (x := x) = (x := a), showing that an assignment of the value of an attribute to the same attribute is the identity function if the component has been assigned a value at least once. However, we cannot deduce (x := x) = id from the other axioms, i.e., that the same assignment is an identity also when it is the first assignment to the component (Exercise 5.2). The assignment is easily generalized to permit an assignment to a list of attributes, a so-called multiple assignment:

  (x1,...,xm := e1,...,em).σ ≜

(set.x1.(e1.σ ) ; set.x2.(e2.σ) ; ... ; set.xm .(em .σ )).σ ,

where the attributes x1,...,xm must be distinct. The distinctness requirement implies, by independence property (d), that the order in a multiple assignment does not matter (Exercise 5.4):

(x1, x2 := e1, e2) = (x2, x1 := e2, e1).

The assignment statement is built out of an attribute x and an expression e. Hence, we can apply the principle of operand congruence to assignment statements to determine when two assignment statements are equal. This gives us the rule

  Φ ⊢ e = e'
  ───────────────────────────
  Φ ⊢ (x := e) = (x := e') .

In fact, we can say something even stronger about the equality of two assignment statements, as will be shown in Section 8.4.
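The assignment construct is also easy to animate. The following Haskell sketch (ours; the record fields and helper names are assumptions, not the book's notation) implements (x := e).σ = set.x.(e.σ).σ over a two-variable state and runs the example x := x + y − 3.

  data State = State { xv :: Int, yv :: Int } deriving Show

  data Attr = Attr { val :: State -> Int, set :: Int -> State -> State }

  x, y :: Attr
  x = Attr xv (\a s -> s { xv = a })
  y = Attr yv (\a s -> s { yv = a })

  assign :: Attr -> (State -> Int) -> (State -> State)
  assign attr e = \s -> set attr (e s) s        -- (x := e).σ = set.x.(e.σ).σ

  main :: IO ()
  main = do
    let s0   = State { xv = 10, yv = 4 }
        step = assign x (\s -> val x s + val y s - 3)   -- x := x + y − 3
    print (step s0)   -- State {xv = 11, yv = 4}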

5.3 Reasoning with Program Variables

Let us next consider how to establish properties of expressions and state transformers that are described in terms of program variables. Let Pa.x stand for assumption

(a) for attribute x, Pb.x.y for assumption (b) for x and y, and so on. We write

  var x1 : Γ1,...,xm : Γm

for the set of properties assumed to hold for program variables x1,...,xm . More precisely, this set consists of the assumptions

Pa.xi , Pc.xi , Pe.xi , xi = val.xi , i = 1,...,m ,

that hold for each attribute xi , and the assumptions

Pb.xi.xj , Pd.xi.xj , i = 1,...,m, j = 1,...,m, i ≠ j ,

that hold for any pair xi, xj of attributes. These assumptions arise from independence assumptions (a)–(e) together with assumption (f), by choosing x and y in all possible ways as two distinct attributes xi and xj. We establish a property like t = t', where t and t' are two expressions or two state transformers in program variables x1,...,xm, by proving that

  var x1,...,xm ⊢ t = t' .

Thus, in proving the equality of the two terms, we may assume that the independence assumptions hold for attributes x1,...,xm that occur in the terms. Higher-order logic is monotonic, in the sense that adding assumptions to a theorem (sequent) produces a new theorem. This is reflected in the inference rule

  Φ ⊢ t
  ───────────
  Φ, t' ⊢ t .

In the context of program variables, this means that if we have proved a property like t = t' for program variables x1,...,xm, then this property will hold also if we assume that there are more program variables on the state. In other words, we have

  var x1,...,xm ⊢ t = t'
  ──────────────────────────────────────
  var x1,...,xm, xm+1,...,xn ⊢ t = t' .

Another simple inference rule that we need when reasoning about program variables is the following:

  var x1,...,xm ⊢ t
  ─────────────────────────────────────────────
  var x1',...,xm' ⊢ t[x1,...,xm := x1',...,xm'] .

We assume here that the new attributes x1',...,xm' are all distinct and are free for x1,...,xm in t. This means that we are free to replace the program variables with new program variables: the corresponding properties will still hold for the new program variables. This inference rule is also an easy consequence of the inference rules that are introduced in the next chapter.

Substitution Property

The following result (the substitution lemma) turns out to be of central importance to reasoning about programs with attributes and expressions.

Theorem 5.1   Assume that x and y are state attributes. Furthermore, assume that e and f are expressions over x, y, both of the same type as x. Then

  var x, y ⊢ e.((x := f ).σ) = e[x := f ].σ .

Note that on the left-hand side we have the assignment operation x := f, defined by (x := f ).σ = set.x.( f.σ).σ, while on the right-hand side we have a substitution, where the term (expression) f is substituted for variable x in the term (expression) e. We can use substitution on the right-hand side, because x is an access function, i.e., a variable (of function type).

Proof The proof is by structural induction. First, assume that e is just an attribute. If e is access function x, then we have

  x.(set.x.( f.σ).σ)
  = {attribute properties (a), (f)}
    f.σ
  = {property of substitution}
    x[x := f ].σ

If e is an attribute y that is different from x, then we have

  y.(set.x.( f.σ).σ)
  = {attribute properties (b), (f)}
    y.σ
  = {property of substitution}
    y[x := f ].σ

If e is a pointwise-extended variable ẏ, then we have

  ẏ.((x := f ).σ)
  = {definition of ẏ}
    y
  = {definition of ẏ}
    ẏ.σ
  = {ẏ has no subterms}
    ẏ[x := f ].σ

Finally, assume that e is an expression of the form ġ.e1. ··· .em, m ≥ 0, where e1,...,em are expressions, and that the theorem holds for e1,...,em (the induction

hypothesis). Then we have that

  (ġ.e1. ··· .em).(set.x.( f.σ).σ)
  = {definition of pointwise extension, σ' = set.x.( f.σ).σ}
    g.(e1.σ'). ··· .(em.σ')
  = {induction assumption}
    g.(e1[x := f ].σ). ··· .(em[x := f ].σ)
  = {definition of pointwise extension}
    (ġ.e1[x := f ]. ··· .em[x := f ]).σ
  = {property of substitution}
    (ġ.e1. ··· .em)[x := f ].σ

This last case also covers the case when e is a constant, by choosing m = 0. □

Note that the assumptions in var x, y form a subset of var x, y, z1,...,zm , for any collection of additional attributes z1,...,zm . Therefore, the proof will in fact establish

  var x1,...,xn ⊢ e.((x := f ).σ) = e[x := f ].σ

for any collection of program variables that includes x, y. This is the monotonicity principle that we already referred to above.

The following is a simple example of the use of this lemma:

  var x, y : Nat ⊢ (x · y).((x := x + y).σ) = ((x + y) · y).σ .

The substitution lemma is immediately generalized to multiple assignment, as follows. Assume that x1,...,xm are attributes of Σ, e is an expression, and f1,..., fm are other expressions, where fi is of the same type as xi, for i = 1,...,m. Then

  var x1,...,xm ⊢ e.((x1,...,xm := f1,..., fm).σ) =

e[x1,...,xm := f1,..., fm ].σ . In fact, we can see the substitution lemma as stated in this form by considering x := f as a multiple assignment, with x = x1,...,xm and f = f1,..., fm . From now on, we let x := e stand for a multiple assignment, unless we explicitly state that x is a single attribute.

An immediate consequence of the substitution lemma is the following.

Corollary 5.2 Let e be an expression not involving attribute (list) x. Then

  var x ⊢ e.((x := f ).σ) = e.σ .
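The substitution lemma is a statement about syntax, but its semantic content is easy to test on a concrete state space. In the following Haskell sketch (ours; expressions are simply functions on a record state) we compare evaluating x · y after the assignment x := x + y with evaluating the substituted expression (x + y) · y in the original state.

  data State = State { xv :: Int, yv :: Int } deriving Show

  assignX :: (State -> Int) -> State -> State    -- x := f
  assignX f s = s { xv = f s }

  main :: IO ()
  main = do
    let s    = State { xv = 3, yv = 5 }
        f    = \st -> xv st + yv st               -- the expression x + y
        e    = \st -> xv st * yv st               -- the expression x · y
        eSub = \st -> (xv st + yv st) * yv st     -- e[x := f], i.e. (x + y) · y
    print (e (assignX f s))   -- 40
    print (eSub s)            -- 40, as the substitution lemma predicts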

Assignments as State Transformers

State transformers form a category, with the identity function as the identity morphism and forward sequential composition as composition. An assignment is one way of describing state transformers that in practice is very useful. Let us therefore consider what the category operations are on assignments. For identity, we have already shown that x := x is the same as id for any list of program variables x. The following theorem shows how to compose two successive assignments to the same variable x.

Theorem 5.3   Let x be a list of attributes and let e and f be lists of expressions. Then

  var x ⊢ (x := e) ; (x := f ) = (x := f [x := e]).

Proof   We prove the result for the case when x is a single attribute; the general case is proved in the same way.

  ((x := e) ; (x := f )).σ
  = {definition of function composition}
    (x := f ).((x := e).σ)
  = {definition of assignment}
    set.x.( f.((x := e).σ)).((x := e).σ)
  = {definition of assignment}
    set.x.( f.((x := e).σ)).(set.x.(e.σ).σ)
  = {attribute property (c)}
    set.x.( f.((x := e).σ)).σ
  = {substitution lemma}
    set.x.( f [x := e].σ).σ
  = {definition of assignment}
    (x := f [x := e]).σ
□

Theorem 5.3 really only restates the definition of function composition, f ; g = (λx • g.( f.x)), but using the assignment notation with attributes and expressions. However, it is useful as a basic rule for merging two assignment statements into one or (reading it from right to left) splitting an assignment statement into two parts. Theorem 5.3 requires that the two assignments be compatible, in the sense that they update exactly the same variables. The following lemma shows how we can make two assignments compatible.

Lemma 5.4 Assume that x and y are disjoint lists of attributes. Then

  var x, y ⊢ (x := e) = (y, x := y, e).

Proof   We have that

  (y, x := y, e).σ
  = {definition of multiple assignment}
    (set.y.(y.σ) ; set.x.(e.σ)).σ
  = {definition of sequential composition}
    set.x.(e.σ).(set.y.(y.σ).σ)
  = {attribute property (e)}
    set.x.(e.σ).σ
  = {definition of assignment}
    (x := e).σ
□

Theorem 5.3 together with Lemma 5.4 can now be used to compute the sequential composition of two arbitrary assignments. Assume, for example, that x, y, and z are distinct attributes. Then

  (x, y := e, f ) ; (y, z := g, h)
  = {add superfluous variables}
    (x, y, z := e, f, z) ; (x, y, z := x, g, h)
  = {merge assignments}
    (x, y, z := x[x, y, z := e, f, z], g[x, y, z := e, f, z], h[x, y, z := e, f, z])
  = {simplify}
    (x, y, z := e, g[x, y := e, f ], h[x, y := e, f ])

Immediate consequences of Theorem 5.3 are the following rules for independent assignments:

Corollary 5.5 (a) Assume that x and y are disjoint (lists of) attributes. Then

  var x, y ⊢ (x := e) ; (y := f ) = (x, y := e, f [x := e]).

(b) Furthermore, if x is not free in f and y is not free in e, then

  var x, y ⊢ (x := e) ; (y := f ) = (y := f ) ; (x := e).

Proof   We first prove (a):

  ((x := e) ; (y := f )).σ
  = {Lemma 5.4}
    ((x, y := e, y) ; (x, y := x, f )).σ
  = {Theorem 5.3}
    (x, y := x[x, y := e, y], f [x, y := e, y]).σ
  = {simplify substitutions}
    (x, y := e, f [x := e]).σ

Now (b) follows by applying (a) twice. □

5.4 Straight-Line Programs

We can use the framework built thus far to model straight-line programs, i.e., programs that perform only a sequence of successive changes to a state. These can be defined by the following very simple grammar:

f ::= id | x := e | f1 ; f2 .

Here x is a state attribute, e an expression, and f , f1, and f2 straight-line programs. Note that this definition does not introduce a new syntactic category. Instead, it identifies a specific subclass of state transformers: those that can be constructed in the manner described by the production rules. Consider a simple straight-line program like

  S = (x := x + y) ; (y := x − y) ; (x := x − y) .    (∗)

The attributes here are (at least) x and y. The type of these attributes determines the state space of S. If x is an attribute on Σ, then S : Σ → Σ; if x is an attribute on Γ, then S : Γ → Γ. The straight-line program itself is of exactly the same form in both cases. We will in such cases say that S : Σ → Σ and S : Γ → Γ are similar. Similarity captures the notion that two state transformers expressed in terms of program variables are syntactically the same but may operate on different underlying state spaces. Properties of S that are expressed in terms of attributes of the state do not depend on whether the underlying state space is Σ or Γ. This is the main advantage of using state attributes with expressions and the assignment notation: we can prove properties that hold for the whole family of similar statements, rather than establishing a property only for a specific statement on a specific state space. Alternatively, we can think of straight-line programs as being polymorphic. This means that we consider the underlying type Σ as a type variable that can be

instantiated to different concrete types. We have avoided using a polymorphic version of higher-order logic, for simplicity, but adding polymorphic terms does not present any difficulties and is, e.g., used in the underlying logic of the HOL system [62]. Only very small parts of programs are straight-line programs. However, straight-line programs form the building blocks of ordinary programs, because most imperative programs are built up from such small pieces. It is therefore important to be able to reason about these simple program fragments. Let us give an example of how to use the assignment rules to reason about a straight-line program. The following derivation shows that the state transformer (∗) above has the effect of swapping the values of the two attributes without using an extra variable as temporary storage:

  var x, y : Nat
  ⊢ “ (x := x + y) ; (y := x − y) ” ; (x := x − y)
  = {merge assignments}
    (x, y := x + y, “ (x + y) − y ”) ; (x := x − y)
  = {simplify expression (derivation in Section 5.2)}
    (x, y := x + y, x) ; (x := x − y)
  = {merge assignments}
    (x, y := “ (x + y) − x ”, x)
  = {simplify expression}
    (x, y := y, x)

We can compress any sequence of assignments to a single multiple assignment by successive applications of the split/merge theorem. In some situations, this will simplify the program, but in other cases it will make the resulting program longer and more difficult to understand. Hence, there are situations where a sequence of assignments is to be preferred and situations where a single multiple assignment is better. Therefore, we need different ways of describing the same state transformer and ways of moving between different equivalent descriptions.
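As a quick sanity check (ours, not part of the calculus), the swap derivation above can be replayed on a concrete state: composing the three single-variable assignments gives the same state transformer as the multiple assignment x, y := y, x.

  data State = State { xv :: Int, yv :: Int } deriving (Eq, Show)

  assignX, assignY :: (State -> Int) -> State -> State
  assignX e s = s { xv = e s }
  assignY e s = s { yv = e s }

  (|>) :: (a -> b) -> (b -> c) -> (a -> c)    -- forward composition, playing the role of ;
  f |> g = g . f

  swap3 :: State -> State      -- (x := x + y) ; (y := x − y) ; (x := x − y)
  swap3 = assignX (\s -> xv s + yv s)
       |> assignY (\s -> xv s - yv s)
       |> assignX (\s -> xv s - yv s)

  swapMulti :: State -> State  -- x, y := y, x
  swapMulti s = State { xv = yv s, yv = xv s }

  main :: IO ()
  main = do
    let s = State 7 11
    print (swap3 s)                  -- State {xv = 11, yv = 7}
    print (swap3 s == swapMulti s)   -- True

Note that the derivation proves the equality for all states and all independent attributes, whereas running the code only checks one instance; the point of the calculus is precisely that such checks become unnecessary.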

5.5 Procedures

Let us now also show how to extend the simple straight-line programs with a procedure mechanism. This mechanism is quite simple and will be used as such also for the more complicated program models that we will describe later on. Let us write the state transformer for swapping the values of program variables x and y in the previous section in full, without the abbreviations for assignments and expressions. Then (x := x + y) ; (y := x − y) ; (x := x − y) becomes

  (λσ • set.x.(val.x.σ + val.y.σ).σ) ;
  (λσ • set.y.(val.x.σ − val.y.σ).σ) ;
  (λσ • set.x.(val.x.σ − val.y.σ).σ) .

The attributes x and y are thus the (only) free variables of this state transformer. This state transformer will only swap the values of the program variables x and y. We could define a function of more general use by defining a constant swap that does what we want for any two attributes:

  swap ≜ (λ x y • x := x + y ; y := x − y ; x := x − y) .

Then the application of this function to some other attributes a and b, of the same type as x and y, is just swap.a.b. A simple calculation shows that

  swap.a.b = (a := a + b ; b := a − b ; a := a − b) .

Lambda abstraction and application are thus available for program variables just as they are available for any other variables. Let us compare this with the standard terminology used in programming languages. The swap function is actually a procedure that can be used in straight-line programs. The constant name swap is the procedure name, the attributes x and y that we have used to define the constant are the formal parameters, and the term that we use to define the constant is the procedure body. An application of the constant to other variables, like swap.a.b, is known as a procedure call, where a and b are the actual parameters. The specific method for parameter passing that we get by the above conventions is usually referred to as call by reference. Thus, the basic mechanisms of higher-order logic already contain all the features that we need for defining a procedure mechanism for straight-line programs. This same mechanism is also applicable as such to define procedures in the more expressive program models that we will study later on. We will prove properties of straight-line programs only under the assumption that the attributes involved satisfy the independence assumptions, i.e., are in fact program variables. Thus, we can prove that

  var x, y ⊢ (x := x + y ; y := x − y ; x := x − y) = (x, y := y, x) ,

although x := x + y ; y := x − y ; x := x − y in general is not equal to x, y := y, x (as this requires equality of the two state transformers for any attributes x and y, not only for independent attributes).

Using the definition of the swap procedure, we can deduce that

  var x, y ⊢ swap.x.y = (x, y := y, x) ,    (∗)

and we can also deduce that var a, b ⊢ swap.a.b = (a, b := b, a) using the rules for changing program variables described above.

Aliasing

Assuming that we have shown property (∗) for swap.x.y, what do we know about the properties of, say, swap.a.a? In other words, what happens if we give the same program variable a as actual parameter for both x and y? In programming logics this is known as aliasing: two formal parameters are identified in the procedure call. The problem with aliasing is that the independence assumptions actually imply that the program variables must be different attributes (provided that there is a state attribute that can take two distinct values). To see this, assume that var x, y holds and that in addition, val.x = val.y. Consider a state σ, a state σ' = set.x.1.σ, and a state σ'' = set.x.2.σ'. Then

  2
  = {assumption (a)}
    val.x.σ''
  = {assumption val.x = val.y}
    val.y.σ''
  = {assumption (b)}
    val.y.σ'
  = {assumptions val.x = val.y and (a)}
    1

which is a contradiction. Therefore, x and y must be different attributes. In a similar way, we can show that the assumption set.x = set.y also leads to a contradiction. The property (∗) thus implies that swap.x.y = (x, y := y, x) holds for any two different attributes x and y that satisfy var x, y. This means that we know nothing about swap.x.y when x and y are the same variable: the proof has established property (∗) only for the case when x and y are different. Aliasing as such is not illegal or wrong. On the contrary, the procedure call swap.a.a is a well-formed term that denotes a specific state transformer. We can even show that var a ⊢ swap.a.a = (a := 0) holds, but this requires a new proof; it cannot be deduced from (∗).

This means that we always have to apply a procedure with two or more formal arguments to different program variables. This is known as the nonaliasing requirement for passing parameters by reference. In our context, it is a direct consequence of the way we have formalized procedures with parameters.

5.6 Blocks and Value Parameters

A state transformer f : Σ → Γ models a state change, starting in an initial state σ : Σ and terminating in some final state f.σ : Γ. State transformers that are expressed using the program variable notation stay within the same state space. We shall now show how to define a block, which temporarily moves to a different state space. This construct permits us to embed state transformers on a “larger” state space into a state transformer on a “smaller” state space. The definition of the block construct rests on the notion of state extension functions. The state transformers begin : Σ → Γ and end : Γ → Σ form a state extension pair if they satisfy

  begin ; end = id ,    (g)

i.e., if end is a left inverse of begin (the reason for calling end a left inverse of begin becomes clear if we write the condition as end ◦ begin = id). A state extension pair establishes a correspondence between elements of the two state spaces, so that every σ ∈ Σ has a copy begin.σ in Γ and every γ ∈ Γ has an original end.γ in Σ. A similar idea will be used to extend the logic with new types (see Chapter 10). For given state spaces Σ and Γ there may exist many state extension pairs. In fact, one can show that any injective function can be used as the function begin (Exercise 5.7). However, the idea is that begin maps Σ to a copy of Σ inside Γ in a particular way that will be made clear below. Given a state extension pair and a state transformer f : Γ → Γ, we define a block to be a state transformer of the form

  begin ; f ; end .    (block)

The block maps state space Σ into the state space Γ and then performs the state change f before moving back to the original state space. The extended state space Γ is ‘larger’ than Σ, because begin is injective. This construct is illustrated as a commuting diagram in Figure 5.1.
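A minimal Haskell sketch of the block construct (ours, using the product-state representation discussed later in this section): begin pairs the global state with an initial local part, the body works on the extended state, and end projects the global part back out.

  type Sigma = Int               -- global state: a single integer location
  type Delta = Int               -- local part: one local integer
  type Gamma = (Delta, Sigma)    -- extended state space

  begin :: Sigma -> Gamma
  begin s = (0, s)               -- the local part gets some initial value

  end :: Gamma -> Sigma
  end (_, s) = s                 -- begin ; end = id holds by construction

  block :: (Gamma -> Gamma) -> (Sigma -> Sigma)
  block f = end . f . begin      -- the block begin ; f ; end

  main :: IO ()
  main = do
    -- body: local := global + 1 ; global := local * 2
    let body (l, g) = let l' = g + 1 in (l', l' * 2)
    print (block body 10)            -- 22; the local part is discarded by end
    print (end (begin 10) == 10)     -- property (g): begin ; end = id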

FIGURE 5.1. Blocks as state extensions (commuting diagram: begin : Σ → Γ, f : Γ → Γ, and end : Γ → Σ; the block begin ; f ; end maps Σ back to Σ).

Local Variables

Now assume that the state space Σ has program variables x1,...,xm satisfying the independence axioms. Also assume that Γ is another state space with program variables x1,...,xm, y1,...,yn satisfying the independence axioms. Furthermore, assume that a state extension pair between Σ and Γ satisfies the following properties:

  (xi := z) ; begin = begin ; (xi := z) ,    (h)
  (xi := z) ; end = end ; (xi := z) ,        (j)
  val.xi = end ; val.xi ,                    (k)
  (yj := e) ; end = end .                    (l)

Conditions (h), (j), and (k) state that there is a correspondence between the program variables x1,...,xm on the two state spaces. Condition (l) states that a change in yj does not cause a change in the underlying global state, so the program variables y1,...,yn are local to Γ. These conditions are illustrated as commuting diagrams in Figure 5.2. They generalize in a straightforward way to multiple assignments. For reference, we collect all the state attribute properties in Table 5.1.

  val.x.(set.x.a.σ) = a                        (a)
  val.y.(set.x.a.σ) = val.y.σ                  (b)
  set.x.a ; set.x.b = set.x.b                  (c)
  set.x.a ; set.y.b = set.y.b ; set.x.a        (d)
  set.x.(val.x.σ).σ = σ                        (e)
  x = val.x                                    (f)
  begin ; end = id                             (g)
  (xi := z) ; begin = begin ; (xi := z)        (h)
  (xi := z) ; end = end ; (xi := z)            (j)
  val.xi = end ; val.xi                        (k)
  (yj := e) ; end = end                        (l)

TABLE 5.1. Independence requirements for attributes

FIGURE 5.2. Block function properties (commuting diagrams over Σ and Γ relating the assignments x := z and y := e, the access function val.x, and the functions begin and end).

Let us introduce a special syntax for a block where these assumptions are satisfied:

  begin var y1,...,yn := e1,...,en ; f end ≜

  begin ; (y1,...,yn := e1 ◦ end,...,en ◦ end) ; f ; end ,

where e1,...,en are expressions over the global state space Σ (i.e., expressed in terms of program variables x1,...,xm). The initialization yj := ej ◦ end makes the initial value of yj the value of expression ej in the global state. The variables y1,...,yn are called local variables in the block. One may wonder whether it is possible to define state attributes in such a way that conditions (g)–(l) are satisfied. To see that this is indeed possible, assume that state space Σ is given, with program variables x1,...,xm. By letting Γ be of the form Δ × Σ, where Δ has program variables y1,...,yn, we can define the attributes over Γ in terms of the attributes over Σ and Δ as follows:

  xi ≜ xi ◦ snd ,
  yj ≜ yj ◦ fst .

The conditions (g)–(l) are then satisfied when begin.σ = (γ0, σ) and end.γ = snd.γ, for some arbitrary γ0 in Δ. We permit local program variables to have the same name as global program variables. In this case, the corresponding access and update functions inside the block always refer to the local rather than the global attribute. This corresponds to a redeclaration of a variable that hides the corresponding attribute of the global state, making it inaccessible to the statement inside the block (except to the initialization). This attribute is, of course, available as usual outside the block. As an example consider the following state transformer:

  begin var y, z := x + y, x − y ; x := y + z ; v := x + v end ; v := x + v .

Here the initial state space Σ has program variables v, x, and y (it may have other program variables as well). The state transformer (x := y + z) ; (v := x + v)

operates on a state space  that has program variables v, x, y, z, where v and x are global and y and z are local. The local program variables y and z are initialized to the values x + y and x − y, where x and y access the global program variables. Note that the state transformer v := x + v inside the block is similar to but not the same as the state transformer v := x + v outside the block. The state transformer inside the block works on the extended state space of the block body, while the state transformer outside the block works on the global state space. The state attributes involved have the same names, but they have different domains. The use of expressions and assignment notation allows us to ignore this difference when writing programs. In most cases, this is a good convention, because the distinctions between the two statements are not important. However, when reasoning about general properties of statements in the refinement calculus, we need to keep this distinction in mind. The distinction can always be made explicit by indicating the types of the state attributes involved, so there is no confusion on the logical level, where typing makes the two state transformer terms distinct. The expression e that is used to initialize the value of the local program variable may depend on the values of the global variables at block entry. A traditional block initializes the local program variables to some fixed value, which may or may not be known. For instance, if y is of type Nat, then we may choose e to be some arbitrary natural number, or we may choose e to be 0. In either case, once the convention is known, we do not need to indicate e explicitly in the syntax of the block and can write the block simply as begin var y ; f end.

Working with Blocks

The basic properties of state extension pairs and program variables permit us to prove properties about blocks, as the following example shows. The aim is to swap two program variables using only single-variable assignments:

  x, y := y, x
  = {characterization of state extension pair, local variable z}
    begin ; “ end ” ; x, y := y, x
  = {block property (j)}
    begin ; z := x ; “ end ; x, y := y, x ”
  = {properties (g) and (h)}
    begin ; “ z := x ; x, y := y, x ” ; end
  = {assignment reasoning (Exercise 5.6)}
    begin ; z := x ; x := y ; y := z ; end
  = {definition of block syntax}
    begin var z := x ; x := y ; y := z end

Here the first three steps accomplish an instance of what can be seen as a general principle of block introduction:

  f = begin var y := e ; f' end ,    (block introduction)

where f is a state transformer written using only assignment notation (with no occurrence of the local variable y) and f' and f are similar (in fact, we can allow f' to contain additional assignments to the local variable). Reading the block introduction from right to left, we see that at the same time we are given a principle of block elimination. In fact, block introduction and elimination can be seen as special cases of introduction and elimination of local variables. The rule for this has the following form:

  begin var z := e' ; f end =                   (local variable introduction)
  begin var y, z := e, e' ; f' end

when f and f' are similar. The proof is a similar derivation as for the special case of block introduction. In this more general case, we can add or remove one or more local variables to or from a block.

Value Parameters

Recall the definition of procedures in Section 5.5. In principle, we can get along with just call by reference for procedures. In practice, other parameter-passing mechanisms are also useful for procedures, and the block construct gives us a way of handling these. The most important other mechanism is call by value. A procedure declaration usually indicates for each parameter whether it should be called by reference or called by value (or in some other way). Assume that we have a procedure declaration of the form

proc f (var x, val y) = s .

Here we use standard notation, with var x indicating that parameter x is to be passed by reference and val y indicating that y is to be passed by value. This procedure declaration is seen as syntactic sugaring for the following definition of constant f :

  f.x.y ≜ s .

A call on the procedure f is assumed to be of the form f (u, e), where u is some attribute and e an expression. This call is again seen as syntactic sugaring:

  f (u, e) ≜ begin var y' := e ; f.u.y' end .

Thus, the call f (u, e) is actually a block that introduces a local variable y' for the value parameter, with y' distinct from u, and assigns the actual value expression e

to this variable. We need to assume that no recursive calls on f are made within the body of f , because straight-line programs are not expressive enough to model recursive procedures. Recursive procedures will be treated in detail in Chapter 20.
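A small Haskell sketch (ours; the record fields and the incr name are illustrative assumptions) of this call-by-value convention: the actual value expression is evaluated once at the call site and bound to a fresh local value, while the reference parameter is passed as an access/update pair.

  data State = State { av :: Int, bv :: Int } deriving Show

  -- incr(var x, val y) = (x := x + y); x is passed as (access, update), y by value
  incr :: (State -> Int, Int -> State -> State) -> Int -> State -> State
  incr (getX, setX) y s = setX (getX s + y) s

  main :: IO ()
  main = do
    let s0    = State { av = 2, bv = 3 }
        aAttr = (av, \v s -> s { av = v })
        -- the call incr(a, a + b): evaluate a + b in the current state first
        call s = incr aAttr (av s + bv s) s
    print (call s0)   -- State {av = 5, bv = 3}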

Locally Defined Procedures

Recall that the let construct allows us to name a term locally. When we focus on the body t in let x = s in t, we get x = s as a local assumption. The same idea can be used to define procedures locally and to have the procedure definition available as an assumption when reasoning about statements that contain procedure calls. We define

  proc f (var x, val y) = s in    (local procedure)
  ··· f (u, e) ···
  ≜
  let f = (λ x y • s) in
  ··· begin var y' := e ; f.u.y' end ···

Here the dots (···) surrounding the procedure call indicate that the call is part of a state transformer. We choose y' such that there is no name clash with u. The difference, compared with global procedures, is that f is now not a constant but a variable that has a local meaning inside the body of the let construct. The definition shows a body with only a single procedure call, but statements with multiple procedure calls work in exactly the same way. As an example consider the following state transformer:

  proc Incr(var x, val y) = (x := x + y) in
  Incr(a, a + b) ; b := 0

The following derivation shows what statement this is:

  proc Incr(var x, val y) = (x := x + y) in
  Incr(a, a + b) ; b := 0
  = {definition of procedure}
    let Incr = (λ x y • x := x + y) in
    begin var y := a + b ; Incr(a, y) end ; b := 0
  = {definition of let}
    (λIncr • begin var y := a + b ; Incr(a, y) end ; b := 0).(λ x y • x := x + y)
  = {β-conversion}
    begin var y := a + b ; (λ x y • x := x + y).(a, y) end ; b := 0
  = {β-conversion}
    begin var y := a + b ; a := a + y end ; b := 0

Note that this is exactly what the meaning of Incr(a, a + b) ; b := 0 would have been if Incr had been defined as a global procedure (constant). One advantage of local procedures is that we can focus on the body of a procedure definition and rewrite it, since it is part of the term that we are transforming. For example, we can use the results of earlier derivations as follows:

  proc Swap(var x, y) = “ (x, y := y, x) ” in
  Swap(a, b) ; Swap(c, d)
  = {replace according to earlier derivation}
    proc Swap(var x, y) = begin var z := x ; x := y ; y := z end in
    Swap(a, b) ; Swap(c, d)

5.7 Summary and Discussion

In this chapter we have shown how to model program variables as independent attributes in higher-order logic, investigated the basic properties of attributes, and shown how to define expressions and assignments in terms of attributes. A number of basic properties of expressions and assignments were identified. As a result, we have a small and convenient theory of state transformers, which we can use to show equivalence of terms in this domain. We have also shown how to model blocks with local variables and procedures in this little equational theory of program variables. The notion of a state with independent attributes is central to the imperative pro- gramming paradigm. We have tried to be precise about the notion of a program variable, an expression, and an assignment, because so much in what follows depends on a precise understanding of these issues. States in programming logics are usually fixed to a specific model. For instance, one can model states as functions that map a set of program variable names to values. A detailed treatment of this approach is given by Jaco de Bakker [36]. This approach either requires all values to be of the same type, or it needs a logic with dependent types (or an untyped logic). Another approach is to define states as tuples, as done by, e.g., Back and von Wright [30]. Yet another popular model is to treat states as records. Higher-order logic can be extended with records (as is, e.g., done in the programming language Standard ML [100]). Dijkstra and Carel Scholten [54] also consider program variables as functions of a hidden state, but their treatment of program variables is otherwise quite different from ours. We have chosen to state only the minimal assumptions that we need about states and their attributes in order to reason about programs with expressions and assignments. Models that consider states as functions, tuples or records will all satisfy the axioms that we have given for states. In this way, we avoid choosing one specific model in favor of others. 108 5. States and State Transformers

5.8 Exercises

5.1 Show that the following property follows from the independence axioms:

  set.x.e.σ = set.x.e′.σ ⇒ e.σ = e′.σ .

5.2 Show that the following property follows from the independence axioms (a)–(d):

  (x := a) ; (x := x) = (x := a) .

Verify informally that it is still not possible to prove (x := x) = id from the same axioms.

5.3 Assume a state space Nat where the two program variables b : Bool and x : Nat are encoded as follows. For a given state σ : Nat we let val.b.σ = even.σ and val.x.σ = σ div 2. Describe the operations set.b and set.x.

5.4 Prove the following property:

  (x, y := e1, e2) = (y, x := e2, e1) .

5.5 Verify the following using Corollary 5.2:

  (x ∗ y).((x, y := x + y, y + 1).σ) = ((x + y) ∗ (y + 1)).σ ,
  (x ∗ y).((z := x + y).σ) = (x ∗ y).σ .

5.6 Use the properties of assignments to show the following equality:

  (z := x) ; (x, y := y, x) = (z := x) ; (x := y) ; (y := z) .

5.7 Assume that a function f : Σ → Γ is given. Show that f has a left (right) inverse if and only if f is injective (surjective).

5.8 When computing a sum, a local variable can be used to collect intermediate sums. Show the following equality:

  var x, y, z, s ⊢
  (y := x + y + z) = begin var s := 0 ; s := s + x ; s := s + y ; s := s + z ; y := s end .

5.9 Assume the following procedure definitions:

  Max(var x, val y, z) = (x := max.y.z) ,
  Min(var x, val y, z) = (x := min.y.z) ,

where the operator max returns the greater of its two arguments and min returns the smaller. Prove in full formal detail the following:

  Max(u, e, e′) ; u := e + e′ − u = Min(u, e, e′) .

6

Truth Values

In previous chapters we have introduced higher-order logic and described the constants and inference rules that are used for reasoning about functions. In this chapter we present the constants and inference rules that are needed for reasoning about logical formulas. We have shown earlier that the truth values form a complete boolean lattice. Here we describe the general inference rules that are available for reasoning about boolean lattices. The basic inference rules for truth values are then special cases of these general rules. Furthermore, inference rules for quantification follow from the fact that the truth values form a complete lattice. We also show how the derivation style works for reasoning about formulas with logical connectives. In particular, we show how subderivations that focus on a subterm of the term being transformed can use context-dependent local assumptions. We also give some inference rules for assumptions and show how to manipulate assumptions in proofs.

6.1 Inference Rules for Boolean Lattices

Assume that the type Σ denotes the set A. Then ⊑ : Σ → Σ → Bool denotes a relation on A. If t1 : Σ and t2 : Σ are two terms, then t1 ⊑ t2 is a formula that is true if the relation denoted by ⊑ holds between the element denoted by t1 and the element denoted by t2.

The inference rule

  ⊢ t ⊑ t  (⊑ reflexive)

now states that ⊑ is reflexive. This inference rule may be taken as an axiom if we want to postulate that the relation ⊑ is reflexive. Alternatively, it may also be possible to derive it from other axioms and inference rules for the relation ⊑.

Similarly, the inference rule

  Γ ⊢ t ⊑ t′    Γ′ ⊢ t′ ⊑ t″
  -------------------------------- (⊑ transitive)
  Γ ∪ Γ′ ⊢ t ⊑ t″

is valid if the relation ⊑ is transitive. Writing the transitivity property as an inference rule avoids the explicit use of an implication when expressing this property. Either inference rules such as these may be postulated for the relation ⊑, or they may be derivable by using other axioms and inference rules for the relation ⊑.

The collection of inference rules that hold for boolean lattices is summarized in Table 6.1. These inference rules are really schemes, where any concrete choice for the constants ⊑, ⊥, ⊤, ⊓, ⊔, and ¬ gives a collection of inference rules stating that these constants form a boolean lattice.

  ⊢ t ⊑ t  (⊑ reflexive)

  Γ ⊢ t ⊑ t′    Γ′ ⊢ t′ ⊑ t″
  -------------------------------- (⊑ transitive)
  Γ ∪ Γ′ ⊢ t ⊑ t″

  Γ ⊢ t ⊑ t′    Γ′ ⊢ t′ ⊑ t
  -------------------------------- (⊑ antisymmetric)
  Γ ∪ Γ′ ⊢ t = t′

  ⊢ ⊥ ⊑ t  (⊥ least)        ⊢ t ⊑ ⊤  (⊤ greatest)

  Γ ⊢ s ⊑ t    Γ′ ⊢ s ⊑ t′
  -------------------------------- (⊓ introduction)
  Γ ∪ Γ′ ⊢ s ⊑ t ⊓ t′

  Γ ⊢ t ⊑ s    Γ′ ⊢ t′ ⊑ s
  -------------------------------- (⊔ elimination)
  Γ ∪ Γ′ ⊢ t ⊔ t′ ⊑ s

  ⊢ t ⊓ t′ ⊑ t    ⊢ t ⊓ t′ ⊑ t′  (⊓ elimination)

  ⊢ t ⊑ t ⊔ t′    ⊢ t′ ⊑ t ⊔ t′  (⊔ introduction)

  ⊢ t ⊔ (t′ ⊓ t″) = (t ⊔ t′) ⊓ (t ⊔ t″)  (⊓ distributivity)

  ⊢ t ⊓ (t′ ⊔ t″) = (t ⊓ t′) ⊔ (t ⊓ t″)  (⊔ distributivity)

  ⊢ t ⊓ ¬t = ⊥  (¬ contradiction)        ⊢ t ⊔ ¬t = ⊤  (¬ exhaustion)

TABLE 6.1. Inference rules for boolean lattices

6.2 Truth Values

Traditionally, the properties of truth values are given by picking out a few of the operations as basic, giving the axioms that determine their properties, and then defining the other logical constants explicitly in terms of these postulated constants. This approach works well if we want to do metamathematical reasoning about the higher-order logic itself, but it is unnecessarily cumbersome when we are interested only in using the logic. Lattice theory provides a nicer way of characterizing the properties of truth values. We have already noted that the truth values form a boolean and linearly ordered lattice. Any poset of this kind with at least two elements must be isomorphic to the standard model for truth values. Hence, we can take this collection of lattice properties as the axioms that characterize truth values.

The type Bool of truth values was postulated as a standard type in higher-order logic. We now further postulate the following constants:

  T : Bool ,                        (true)
  F : Bool ,                        (false)
  ¬ : Bool → Bool ,                 (not)
  ⇒ : Bool → Bool → Bool ,          (implies)
  ∧ : Bool → Bool → Bool ,          (and)
  ∨ : Bool → Bool → Bool .          (or)

We prefer to use the symbol ≡ (rather than =) for equality on Bool. We consider t ⇐ t′ as an alternative way of writing t′ ⇒ t. The constants ⇒, ∧, and ∨ are (exactly like = and ≡) written infix and associate to the right. These constants are all intended to have their traditional meanings. The infix logical connectives have lower precedence than other infixes (e.g., arithmetic operators and equality). Conjunction ∧ has highest precedence, followed by the operations ∨, ⇒, and ≡, in that order. The precedence is overridden when necessary by explicit parentheses. Note that ≡ and = are actually the same operation on Bool, but they have different precedences. For a complete list of infix precedences, see Appendix A. Negation is a prefix operator. It has higher precedence than all infix operators except for the function application dot.

Inference Rules

We postulate that the truth values form a boolean lattice with implication ⇒ as the ordering relation. Falsity F is the bottom, truth T is the top, conjunction ∧ is meet, disjunction ∨ is join, and negation ¬ is complement. This means that the above-listed inference rules for boolean lattices hold for the truth values. This gives

  ⊢ t ⇒ t  (⇒ reflexive)

  Γ ⊢ t ⇒ t′    Γ′ ⊢ t′ ⇒ t″
  -------------------------------- (⇒ transitive)
  Γ ∪ Γ′ ⊢ t ⇒ t″

  Γ ⊢ t ⇒ t′    Γ′ ⊢ t′ ⇒ t
  -------------------------------- (⇒ antisymmetric)
  Γ ∪ Γ′ ⊢ t ≡ t′

  ⊢ F ⇒ t  (F least)        ⊢ t ⇒ T  (T greatest)

  Γ ⊢ s ⇒ t    Γ′ ⊢ s ⇒ t′
  -------------------------------- (∧ introduction)
  Γ ∪ Γ′ ⊢ s ⇒ t ∧ t′

  Γ ⊢ t ⇒ s    Γ′ ⊢ t′ ⇒ s
  -------------------------------- (∨ elimination)
  Γ ∪ Γ′ ⊢ t ∨ t′ ⇒ s

  ⊢ t ∧ t′ ⇒ t    ⊢ t ∧ t′ ⇒ t′  (∧ elimination)

  ⊢ t ⇒ t ∨ t′    ⊢ t′ ⇒ t ∨ t′  (∨ introduction)

  ⊢ t ∨ (t′ ∧ t″) ≡ (t ∨ t′) ∧ (t ∨ t″)  (∧ distributivity)

  ⊢ t ∧ (t′ ∨ t″) ≡ (t ∧ t′) ∨ (t ∧ t″)  (∨ distributivity)

  ⊢ t ∧ ¬t ≡ F  (¬ contradiction)        ⊢ t ∨ ¬t ≡ T  (¬ exhaustion)

TABLE 6.2. Inference rules for truth values

us the inference rules for truth values shown in Table 6.2. The truth-value lattice has two additional properties that are special to it. The first property is that it is linearly ordered. This is expressed by the following axiom:

  ⊢ (t ⇒ t′) ∨ (t′ ⇒ t) .  (⇒ linear)

The second property is that we identify truth with theoremhood. This is expressed by the following axiom:

  ⊢ T .  (truth)

This axiom thus states that T is a theorem.

6.3 Derivations with Logical Connectives

The notion of a proof is not changed when we add the new rules (like truth and linearity of implication above); it is only the underlying theory that is extended. The rules have been expressed as inference rules, in the format used for natural deduction. However, we prefer to use derivations rather than deductions also for reasoning about logical formulas in general. There is then one small problem. The format requires that each proof step must be of the form t ⊑ t′, where ⊑ is a preorder. Only equality and implication are of this form. How should we give a derivational proof for sequents of the form Γ ⊢ t1 ∧ t2 or Γ ⊢ t1 ∨ t2? The following inference rule shows the way out of this dilemma. It essentially says that stating a proposition is the same as stating that it is equivalent to the constant T (which stands for truth):

  ⊢ t ≡ (t ≡ T) .  (≡ T rule)

Similarly, we have a rule that says that stating the negation of a proposition is the same as stating that it is equivalent to the constant F (which thus stands for falsity):

  ⊢ ¬t ≡ (t ≡ F) .  (≡ F rule)

The proofs of these rules are left as exercises to the reader (Exercise 6.1). The proofs are rather long, involving antisymmetry of ≡, substitution, and properties of T and F. In fact, the following variations of these rules are also valid:

  ⊢ t ≡ (T ⇒ t) ,  (T ⇒ rule)

  ⊢ ¬t ≡ (t ⇒ F) ,  (⇒ F rule)

since the implications in the opposite direction are trivially true. The solution to the problem posed above is now simple. To prove a theorem Γ ⊢ t, where t is not of the form required for a derivation, it is sufficient to give a derivation for Γ ⊢ t ≡ T, as shown by the following deduction:

  ⊢ t ≡ (t ≡ T)  {≡ T rule}        Γ ⊢ t ≡ T
  ------------------------------------------------ {Substitution}
  Γ ⊢ t

The T ⇒ rule shows that it is in fact sufficient to give a derivation for Γ ⊢ T ⇒ t. Similarly, if we want to prove Γ ⊢ ¬t, then it is sufficient to give a derivation for Γ ⊢ t ≡ F or for Γ ⊢ t ⇒ F (the latter amounts to a proof by contradiction). We illustrate derivations with logical connectives below by looking at a selection of standard propositional inference rules. There are many such rules, but space does not permit us to present them all. They can in most cases be easily derived from existing rules.
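Since Bool has only two elements, rules like the ones above can also be sanity-checked semantically by enumeration. A small sketch in Haskell (our own illustration, not part of the formal development; implication on Bool is the ordering (<=) in Haskell):

  -- Brute-force check of the ≡ T, ≡ F, T ⇒, and ⇒ F rules over both truth values.
  main :: IO ()
  main = do
    let bools = [False, True]
    print (and [ t == (t == True)      | t <- bools ])  -- t ≡ (t ≡ T)
    print (and [ not t == (t == False) | t <- bools ])  -- ¬t ≡ (t ≡ F)
    print (and [ t == (True <= t)      | t <- bools ])  -- t ≡ (T ⇒ t)
    print (and [ not t == (t <= False) | t <- bools ])  -- ¬t ≡ (t ⇒ F)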

Properties of Truth and Falsity

For truth and falsity, we have the following useful rules (in fact, the corresponding rules hold in every complete lattice):

  ⊢ T ∧ t ≡ t ,  (T∧ rule)

  ⊢ T ∨ t ≡ T ,  (T∨ rule)

  ⊢ F ∧ t ≡ F ,  (F∧ rule)

  ⊢ F ∨ t ≡ t .  (F∨ rule)

Let us consider the first one of these, as an example of a simple derivational proof in logic.

  t
⇒ {∧ introduction}
    • t
      ⇒ {T greatest}
        T
    • t
      ⇒ {reflexivity}
        t
  T ∧ t

This establishes that ⊢ t ⇒ T ∧ t. The opposite implication (T ∧ t ⇒ t) is an instance of ∧ elimination, so T ∧ t ≡ t follows by antisymmetry.

Two-Valued Logic

Another important property is that there are exactly two truth values. This is stated by the following inference rule:

  ⊢ (t ≡ T) ∨ (t ≡ F) .  (Boolean cases)

Higher-order logic is thus a two-valued logic. We prove this rule with a derivation, rewriting the goal in the form ⊢ T ≡ (t ≡ T) ∨ (t ≡ F). The following derivation establishes the theorem:

  (t ≡ T) ∨ (“ t ≡ F ”)
≡ {substitution; ≡ F rule}
  (t ≡ T) ∨ ¬t
≡ {substitution; ≡ T rule}
  (t ≡ T) ∨ ¬(t ≡ T)
≡ {exhaustiveness}
  T

Modus Ponens

The modus ponens rule is the following:

  Γ ⊢ t    Γ′ ⊢ t ⇒ t′
  ------------------------ (modus ponens)
  Γ ∪ Γ′ ⊢ t′

The proof is as follows, with hypotheses Γ ⊢ t and Γ′ ⊢ t ⇒ t′:

Γ ∪ Γ′
⊢ T
⇒ {T ⇒ rule; first hypothesis}
  t
⇒ {second hypothesis}
  t′

The derivation establishes Γ ∪ Γ′ ⊢ T ⇒ t′, from which the required result follows by the T ⇒ rule.

6.4 Quantification

The inference rules postulated above formalize propositional logic, but for quantification we need additional constants and inference rules. We introduce the following operations for quantification:

  ∀ : (Σ → Bool) → Bool ,  (for all)
  ∃ : (Σ → Bool) → Bool .  (exists)

The quantifiers ∀ and ∃ thus take predicates as their arguments. Quantification provides an example of the advantage of higher-order logic over first-order logic. In first-order logic, we have to build universal quantification into the syntax of formulas. In higher-order logic, universal and existential quantification are constants.

Following standard practice, we use binder notation for quantifiers, writing (∀v • t) for ∀.(λv • t) and (∃v • t) for ∃.(λv • t). This notation has the same (low) precedence as λ-abstraction.

Let A be a predicate of some type Σ; i.e., A : Σ → Bool. Then the intention is that ∀.A ≡ T if and only if A.x ≡ T for all x in Σ. Also, we want that ∃.A ≡ T if and only if A.x ≡ T for some x in Σ. Because Σ is assumed to be nonempty, the latter is equivalent to A.x ≡ F not holding for every x in Σ.

Rules for Universal Quantification

What rules can we postulate for universal quantification? Consider the set B = {A.x | x ∈ Σ}, i.e., the range of A. This set cannot be empty, because Σ ≠ ∅, so it must be either {F}, {T}, or {F, T}. We have that ∧B = T if and only if B = {T}, which again can hold only when A.x = T for each x in Σ. Thus, we see that ∧B ≡ ∀.A. This suggests that we should use the properties of general meets as the inference rules for universal quantification.
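To make the connection concrete, here is a small sketch in Haskell (our own illustration; the finite universe stands in for a nonempty type Σ), in which ∀.A is computed as the meet (conjunction) of the range {A.x | x ∈ Σ} and ∃.A as its join:

  -- A small nonempty "type" and predicates over it.
  sigma :: [Int]
  sigma = [0 .. 5]

  type Pred = Int -> Bool

  -- Universal quantification as the meet of the range, existential as its join.
  forAll, exists :: Pred -> Bool
  forAll p = and [ p x | x <- sigma ]
  exists p = or  [ p x | x <- sigma ]

  main :: IO ()
  main = do
    print (forAll (< 10))   -- True:  the range is {T}
    print (forAll even)     -- False: the range is {F, T}
    print (exists even)     -- True:  the range is not {F}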

Let t be a boolean term with a possible free occurrence of variable v. Then A = (λv : Σ • t) is of type Σ → Bool, i.e., a predicate. Using binder notation, ∀.A is written as (∀v • t). We have that B = {A.x | x ∈ Σ} = {(λv • t).x | x ∈ Σ} = {t[v := x] | x ∈ Σ}, so any element of B must be of the form t[v := t′] for some t′. Because ∀.A is also ∧B, we can use the rules for general meets. General meet introduction again gives us the introduction rule for universal quantification:

  Γ ⊢ s ⇒ t
  -------------------- (∀ introduction)
  Γ ⊢ s ⇒ (∀v • t)

  • v not free in s or Γ.

General meet elimination gives us the elimination rule for universal quantification:

  ⊢ (∀v • t) ⇒ t[v := t′]  (∀ elimination)

  • t′ is free for v in t.

The ∀ introduction rule is also known as generalization, the ∀ elimination rule as specialization. Intuitively, the introduction rule says that if we can prove that s ⇒ t holds for an arbitrary v, i.e., (∀v • s ⇒ t), then we have that s ⇒ (∀v • t). The elimination rule is also straightforward: if we have proved that a property holds for all v, then it must hold for any specific element (denoted by some term t′) that we choose to consider.

Rule for Existential Quantification

For existential quantification we can reason similarly. If B = {A.x | x ∈ Σ}, then ∨B = T if and only if B ≠ {F}, which is again equivalent to A.x = T for some x in Σ. Thus, ∨B ≡ ∃.A. General join introduction gives us the introduction rule

  ⊢ t[v := t′] ⇒ (∃v • t)  (∃ introduction)

  • t′ is free for v in t.

General join elimination gives us the elimination rule:

  Γ ⊢ t ⇒ s
  -------------------- (∃ elimination)
  Γ ⊢ (∃v • t) ⇒ s

  • v not free in s or Γ.

Again, the hypothesis in the elimination rule states that t ⇒ s holds for every value of v in Σ (because v is not free in s or in Γ). The introduction rule simply states that if we have proved a property for some element (denoted by a term t′), then there exists a v such that the property holds. The elimination rule again says that if we are able to prove that s follows from t without making any assumptions about which value is chosen for v, then the mere existence of some value v that makes t true implies s.

The lattice-theoretic arguments for the introduction and elimination rules should be seen as arguments for the soundness of these rules. Formally, we need to postulate the quantifier introduction and elimination rules above for the truth values Bool, because they cannot be derived from the other rules that have been postulated for truth values. Table 6.3 shows some other useful inference rules for reasoning about quantified formulas, which are derivable from the basic introduction and elimination rules that we have presented here. We also define the exists-unique quantifier ∃! by

  (∃!v • t) ≡ (∃v • t ∧ (∀v′ • t[v := v′] ⇒ v′ = v)) .

Thus (∃!x • t) means that there exists a unique value x that makes t true.
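Over a finite universe the exists-unique quantifier amounts to a counting condition, as in this small Haskell sketch (our own illustration; on finite types it is equivalent to the definition above):

  -- (∃!v • t) over a finite universe: exactly one element satisfies t.
  existsUnique :: [a] -> (a -> Bool) -> Bool
  existsUnique universe t = length (filter t universe) == 1

  main :: IO ()
  main = do
    print (existsUnique [0 .. 9 :: Int] (\x -> x * x == 4))  -- True (only x = 2)
    print (existsUnique [0 .. 9 :: Int] even)                -- False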

Bounded Quantification

The scope of the bound variable in a universal or existential quantification can be restricted to satisfy some additional condition b. Bounded universal and existential quantification are defined by

  (∀v | b • t) ≜ (∀v • b ⇒ t) ,
  (∃v | b • t) ≜ (∃v • b ∧ t) .

  ⊢ (∀v • v = t ⇒ t′) ≡ t′[v := t]        ⊢ (∃v • v = t ∧ t′) ≡ t′[v := t]  (∀, ∃ one-point)
  • v not free in t

  ⊢ (∀v • t) ≡ t        ⊢ (∃v • t) ≡ t  (∀, ∃ vacuous)
  • v not free in t

  ⊢ ¬(∀v • t) ≡ (∃v • ¬t)        ⊢ ¬(∃v • t) ≡ (∀v • ¬t)  (de Morgan)

  ⊢ (∀v • t ∧ t′) ≡ (∀v • t) ∧ (∀v • t′)  (distribute ∧)

  ⊢ (∃v • t ∨ t′) ≡ (∃v • t) ∨ (∃v • t′)  (distribute ∨)

  ⊢ (∀v • t ⇒ t′) ≡ ((∃v • t) ⇒ t′)        ⊢ (∀v • t ⇒ t′) ≡ (t ⇒ (∀v • t′))  (distribute ∀ over ⇒)
  • v not free in t′                        • v not free in t

  ⊢ (∃v • t ⇒ t′) ≡ ((∀v • t) ⇒ t′)        ⊢ (∃v • t ⇒ t′) ≡ (t ⇒ (∃v • t′))  (distribute ∃ over ⇒)
  • v not free in t′                        • v not free in t

TABLE 6.3. Inference rules for quantified formulas

For instance, we have

(∀x | x > 0 • x/x = 1) ≡ (∀x • x > 0 ⇒ x/x = 1). Note that functions in higher-order logic are always total, so 0/0 is well-defined, but the properties of division may not allow us to conclude anything about the value of 0/0. The inference rules for bounded quantification are generalizations of the basic quantifier rules given earlier and can be derived from these.

6.5 Assumptions

A sequent Γ ⊢ t states that t follows from Γ. Hence, assumptions and implication are closely related. In fact, we can consider assumptions as just a convenient device for working with implications. The two basic rules for handling assumptions show how we can move the left-hand side of an implication into the assumptions and vice versa:

  Γ, t ⊢ t′
  -------------- (discharge)
  Γ ⊢ t ⇒ t′ ,

  Γ ⊢ t ⇒ t′
  -------------- (undischarge)
  Γ, t ⊢ t′ .

The first rule is also known as the deduction theorem (following standard practice, we write Γ, t for Γ ∪ {t} in inference rules). Together these two rules show that assumptions and sequents are just a variant of implication, and that we can move between these two representations quite freely. However, it is important to note the difference between an inference and an implication. For example, the rule of generalization allows us to infer ⊢ (∀x • t) from ⊢ t, but the implication t ⇒ (∀x • t) is generally not a theorem.

Derived Rules

We exemplify the above rules of inference by using them in the derivation of some further rules that are useful in practice. The first inference rule says that we may always conclude a formula that is taken as an assumption:

  Γ, t ⊢ t .  (assume)

This rule is a direct consequence of the reflexivity of implication and undischarging:

  • {⇒ reflexivity}
    Γ ⊢ t ⇒ t
  • {undischarge}
    Γ, t ⊢ t

The following inference rule formalizes the basic method of splitting up a proof of a theorem t′ into a subproof where a lemma t is proved first and then later used to prove the main theorem:

  Γ ⊢ t    Γ′, t ⊢ t′
  ---------------------- (use of lemma)
  Γ ∪ Γ′ ⊢ t′

This inference rule is also known as the cut rule. It is a variant of the modus ponens inference rule. The proof is a simple application of modus ponens and discharging. We finally give a convenient rule for case analysis:

  Γ, t ⊢ t′    Γ′, ¬t ⊢ t′
  -------------------------- (case analysis)
  Γ ∪ Γ′ ⊢ t′

The proof for this is as follows, with hypotheses Γ, t ⊢ t′ and Γ′, ¬t ⊢ t′.

Γ ∪ Γ′
⊢ T
≡ {¬ exhaustion}
  t ∨ ¬t
⇒ {∨ elimination}
    • t
      ⇒ {undischarge in hypothesis Γ, t ⊢ t′}
        t′
    • ¬t
      ⇒ {undischarge in hypothesis Γ′, ¬t ⊢ t′}
        t′
  t′

Here we have a derivation with a mix of ≡ and ⇒, and the conclusion is then that Γ ∪ Γ′ ⊢ T ⇒ t′. In general, we can mix relations in a derivation, and the overall conclusion then states that the initial and the final terms are related by the composition of the relations involved. Composition of relations will be treated formally in Chapter 9. Here it is sufficient to note that a composition of ⇒ and ≡ is ⇒.

As we already mentioned in the previous chapter, higher-order logic is monotonic, in the sense that adding assumptions to a sequent cannot make it false:

  Γ ⊢ t′
  -------------- (add assumption)
  Γ, t ⊢ t′ .

This rule can be derived using rules of substitution, ≡ T, and discharging.

Another inference rule that we referred to in the previous chapter allowed us to change the free variables in a sequent:

  Γ, s ⊢ t
  ------------------------------ (change free variables)
  Γ, s[x := x′] ⊢ t[x := x′]

  • x′ is not free in Γ, x′ free for x in s and t.

This follows from the inference rules for assumption discharging and undischarging, together with the rules for introduction and elimination of universal quantification.

6.6 Derivations with Local Assumptions

The substitution rule tells us that we can always start a subderivation by focusing on a subterm if the relation that we are working on is equality. There are special situations in which focusing allows us to make local assumptions that depend on the context.

Focusing Rules for Connectives

The following inference rule is easily derived from the rules that have been given earlier (Exercise 6.6):

  Γ, t ⊢ t1 ≡ t2
  ------------------------
  Γ ⊢ t ∧ t1 ≡ t ∧ t2

This rule shows that when we focus on one conjunct t1 of a boolean term t ∧ t1, we may use t as an additional assumption in the subderivation. A dual rule shows that when focusing on a disjunct we may assume the negation of the other disjunct:

  Γ, ¬t ⊢ t1 ≡ t2
  ------------------------
  Γ ⊢ t ∨ t1 ≡ t ∨ t2

Using the fact that implication can be rewritten using negation and disjunction, we immediately get rules for focusing on the subterms of an implication:

  Γ, t ⊢ t1 ≡ t2                        Γ, ¬t ⊢ t1 ≡ t2
  ------------------------------        ------------------------------
  Γ ⊢ (t ⇒ t1) ≡ (t ⇒ t2)               Γ ⊢ (t1 ⇒ t) ≡ (t2 ⇒ t)

The following small example shows how local assumptions are used in proofs. We prove the tautology

  ⊢ p ⇒ ((p ⇒ q) ⇒ q) .

The derivation illustrates the rule for focusing on the consequent of an implication.

  p ⇒ (“ (p ⇒ q) ⇒ q ”)
≡ {replace subterm, use local assumption}
    • [p]
      (p ⇒ q) ⇒ q
    ≡ {local assumption says p ≡ T}
      (T ⇒ q) ⇒ q
    ≡ {T ⇒ rule}
      q ⇒ q
    ≡ {reflexivity}
      T
  p ⇒ T
≡ {T greatest}
  T

Here the main derivation has no assumptions. In the subderivation, p is added as a local assumption and used when rewriting.

Focusing and Monotonicity

Assume that f is a monotonic function. Then the following inference rule holds for f :

  Γ ⊢ t ⊑ t′
  --------------
  Γ ⊢ f.t ⊑ f.t′

Thus, we can focus on the subterm t in f.t and work with the partial order in the subderivation. Here the partial order can be arbitrary (including ⇒ and ⇐ on the booleans). Furthermore, it can be shown that the rules for local assumptions are valid also when working with ⇒ or ⇐ rather than ≡ (Exercise 6.7).

Since the quantifiers are monotonic with respect to implication (and reverse implication), it is possible to focus inside quantifiers. However, it should be noted that if we focus on t in an expression (Qv • t) (where Q is a quantifier or a binder), then no assumptions involving the variable v are available, since the subterm t is not within the scope of that variable v. This problem can always be avoided by α-converting the expression so that v is replaced by a fresh variable (by a fresh variable we mean a variable that does not appear free in any of the terms that we are currently considering).

Note that in the case of antimonotonic functions (such as negation on the booleans) the subderivation works with the ordering in the opposite direction to the main derivation.

As an example, we prove one half of the one-point rule for the existential quantifier:

  ⊢ (∃v • v = t ∧ t′) ⇒ t′[v := t]

  • v not free in t.

The derivation is as follows, assuming that v is not free in t:

  (∃v • v = t ∧ “ t′ ”)
≡ {substitute using local assumption v = t}
  (∃v • “ v = t ∧ t′[v := t] ”)
⇒ {∧ elimination in monotonic context}
  (∃v • t′[v := t])
≡ {drop vacuous quantifier; v not free in t′[v := t]}
  t′[v := t]

(in the last step we use another quantifier rule, see Exercise 6.8).

Here we have skipped the details of the subderivation, where we focus both inside a quantifier and inside a conjunction. Since the conjunction is within the scope of the bound variable v, we may use the assumption v = t in the subderivation.

This derivation also illustrates (in the step justified by ∧ elimination) how we do a replacement in a single step, rather than starting a subderivation that would make the proof unnecessarily long.

Focusing and Local Definitions

We end this chapter by showing how the idea of focusing allows us to use local definitions as assumptions. The following inference rule shows that if we focus on the body t1 of a term let v = t in t1, then v = t becomes a local assumption:

  Γ, v = t ⊢ t1 ∼ t1′
  ------------------------------------------
  Γ ⊢ (let v = t in t1) ∼ (let v = t in t1′)

where ∼ is an arbitrary relation (Exercise 6.10). The rules for working with the let construct can be used to introduce (and eliminate) local procedures and procedure calls in the middle of a derivation. The following derivation shows how a procedure can be introduced by identifying a state transformer and naming it.

  (a, b := b, a) ; (c, d := d, c)
= {introduce vacuous procedure definition}
  proc Swap = (λx y • x, y := y, x) in (a, b := b, a) ; (c, d := d, c)
= {use local assumption to rewrite body of let construct}
  proc Swap = (λx y • x, y := y, x) in Swap.a.b ; Swap.c.d
= {switch to procedure syntax}
  proc Swap(var x, y) = (x, y := y, x) in Swap(a, b) ; Swap(c, d)

In practice, we can omit some of the intermediate steps of a derivation like this and simply introduce the procedure in one step and the calls to it in another step.

6.7 Summary and Discussion

We have presented the axioms and inference rules for reasoning about truth values. These rules allow us to reason about the standard logical connectives and about quantification. We have also given rules for manipulating assumptions in proofs. We showed how to prove arbitrary formulas with a derivation, using the fact that asserting a formula is the same as asserting that the formula is equivalent to truth. This device allows us to retain the one-to-one correspondence between derivational

proofs and proofs in the style of natural deduction. The derivations that we give here generalize the traditional calculational style in that they permit manipulation of assumptions inside proofs and permit nested derivations.

Andrews' textbook [3] gives a comprehensive overview of type theory and higher-order logic. The version of higher-order logic that we describe in this book is closer to the version underlying the interactive theorem-proving system HOL [62]. The main difference is that we do not have polymorphic types and that we have chosen a different set of basic axioms and inference rules for the logical operations. For a more detailed description and analysis of the semantics of higher-order logic, we refer to these same authors. Both Andrews and Gordon and Melham attempt to formalize higher-order logic with as few constants and inference rules as possible and define most of the logical constants explicitly in terms of the few postulated constants. Gordon and Melham choose as primitive inference rules (variants of) assume, reflexivity of equality, α and β conversion, substitution, abstraction, discharge, and modus ponens, plus a rule of type instantiation. In addition, they have five axioms: boolean cases, antisymmetry of implication, η conversion, selection introduction, and infinity (we will meet the last two of these in the next chapter). Andrews has a slightly different collection of axioms and inference rules for higher-order logic.

We have here presented an alternative, lattice-based formalization of higher-order logic. The correspondence between lattice theory and logic is well known and old, dating back to George Boole in the nineteenth century. The advantage of formalizing higher-order logic in this way is that we can build a single collection of inference rules that are applicable across a large range of different domains, from reasoning about logical terms and sets to refinement of program statements and large modular systems. This cuts down on the conceptual overhead that is needed in program construction and clearly identifies the common patterns of reasoning in the different domains of discourse. The idea of using focusing and subderivations in posets (and generally in structures other than Bool) builds on the refinement diagrams of Back [16], the hierarchical reasoning method of Robinson and John Staples [122] (window inference), and Grundy's extension to this method [68, 69].

6.8 Exercises

6.1 Prove the ≡ T and ≡ F rules using only the general rules of higher-order logic and the basic inference rules for truth values.

6.2 Prove that meet and join in a lattice are commutative, using only antisymmetry and the introduction, elimination, and distributivity rules for meet and join.

6.3 Prove the correspondence rule for meet:

  ⊢ t ⊑ t′ ≡ (t ⊓ t′ = t)

using only the basic inference rules for lattices.

6.4 Prove the classical principle of double negation ¬¬t ≡ t using only rules given in this chapter (hint: use case analysis; first prove ¬T ≡ F and ¬F ≡ T).

6.5 Prove that implication can be expressed with negation and disjunction:

  ⊢ (t ⇒ t′) ≡ ¬t ∨ t′ .

6.6 Prove the rule for focusing on a conjunct:

  Γ, t ⊢ t1 ≡ t2
  ------------------------
  Γ ⊢ t ∧ t1 ≡ t ∧ t2

(hint: do case analysis on t).

6.7 Prove the rule for focusing on a conjunct under implication:

  Γ, t ⊢ t1 ⇒ t2
  ------------------------
  Γ ⊢ t ∧ t1 ⇒ t ∧ t2 .

Then formulate and prove the corresponding rules for disjunction and implication.

6.8 Derive the rules for dropping vacuous quantifications:

  ⊢ (∀v • t) ≡ t        ⊢ (∃v • t) ≡ t
  • v not free in t,     • v not free in t.

6.9 Derive the one-point rule for universal quantification:

  ⊢ (∀v • v = t ⇒ t′) ≡ t′[v := t]
  • v not free in t.

6.10 Derive the rules for focusing inside let expressions.

7

Predicates and Sets

In this chapter we show how to formalize predicates in higher-order logic and how to reason about their properties in a general way. Sets are identified with predicates, so the formalization of predicates also gives us a formalization of set theory in higher-order logic. The inference rules for predicates and sets are also special cases of the inference rules for boolean lattices. These structures are in fact complete lattices, so the general rules for meets and joins are also available. We complete our formalization of higher-order logic by giving the final constants and inference rules that need to be postulated: an axiom of infinity and an axiom of choice.

7.1 Predicates and Sets

Let Σ be some type in higher-order logic. A predicate on Σ is a function p : Σ → Bool. The predicates over Σ thus form a function space. We write P(Σ) for this space:

  P(Σ) ≜ Σ → Bool .  (predicate space)

A predicate p : Σ → Bool determines a subset Ap ⊆ Σ, defined by Ap = {σ ∈ Σ | p.σ ≡ T}. Conversely, any subset A ⊆ Σ determines a predicate pA : Σ → Bool satisfying pA.σ ≡ T if and only if σ ∈ A. Thus, we have two ways of describing sets, either using the traditional set-theoretic framework or using functions that range over truth values. The pointwise extension of the implication ordering on Bool gives us an ordering of predicates:

  p ⊆ q ≜ (∀σ : Σ • p.σ ⇒ q.σ) .  (⊆ of predicates)

Because Bool is a complete boolean lattice, the predicates over Σ also form a complete boolean lattice.

Corollary 7.1 (P(Σ), ⊆) is a complete boolean and atomic lattice.

The operations on predicates are defined by pointwise extension of the operations on truth values:

  false.σ ≜ F ,                 (⊥ of predicates)
  true.σ ≜ T ,                  (⊤ of predicates)
  (p ∩ q).σ ≜ p.σ ∧ q.σ ,       (⊓ of predicates)
  (p ∪ q).σ ≜ p.σ ∨ q.σ ,       (⊔ of predicates)
  (¬p).σ ≜ ¬p.σ .               (¬ of predicates)

The operations on predicates are inherited from the operations on truth values, and so is the naming of these operations: false is the identically false predicate and true the identically true predicate. Predicate meet is referred to as conjunction of predicates, predicate join is called disjunction of predicates, and predicate complement is called negation of a predicate. Implication and equivalence on predicates are also defined pointwise, by

  (p ⇒ q).σ ≜ p.σ ⇒ q.σ ,  (⇒ of predicates)
  (p ≡ q).σ ≜ p.σ ≡ q.σ .  (≡ of predicates)

Implication (p ⇒ q) and ordering (p ⊆ q) are related by

p ⊆ q ≡ (∀σ • (p ⇒ q).σ ) . A similar relationship holds between equality = and equivalence ≡, namely

p = q ≡ (∀σ • (p ≡ q).σ ) .
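The pointwise definitions above translate directly into executable form. A minimal sketch in Haskell (our own illustration; the helper subsetOn checks the ordering over a finite list of sample states only):

  -- Predicates over a state type s, with the pointwise-extended operations.
  type Pred s = s -> Bool

  false, true :: Pred s
  false _ = False
  true  _ = True

  pMeet, pJoin :: Pred s -> Pred s -> Pred s
  pMeet p q s = p s && q s          -- (p ∩ q).σ = p.σ ∧ q.σ
  pJoin p q s = p s || q s          -- (p ∪ q).σ = p.σ ∨ q.σ

  pNot :: Pred s -> Pred s
  pNot p s = not (p s)              -- (¬ p).σ = ¬ p.σ

  -- p ⊆ q, checked over a finite list of sample states.
  subsetOn :: [s] -> Pred s -> Pred s -> Bool
  subsetOn states p q = and [ not (p s) || q s | s <- states ]

  main :: IO ()
  main = print (subsetOn [0 .. 20 :: Int] (<= 5) (< 8))   -- True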

Sets as Predicates

The predicate ordering coincides with subset inclusion: A ⊆ B if and only if pA ⊆ pB. Predicate false corresponds to the empty set, p∅ = false, and predicate true corresponds to the universal set, pΣ = true. Meet (or conjunction) of predicates is intersection: σ ∈ A ∩ B if and only if (pA ∩ pB).σ ≡ T, and similarly, join (or disjunction) of predicates is union, and complement (negation) of predicates is set complement. The traditional set-theoretic notation is introduced as an abbreviation in higher-order logic:

  {σ : Σ | t} ≜ (λσ : Σ • t) .  (set comprehension)

Thus, {σ : Σ | A.σ} is just syntactic sugar for (λσ : Σ • A.σ), which in turn is the same as A (by η-conversion). Similarly, σ ∈ A is just syntactic sugar for A.σ. We often write just Σ for the universal set U when there is no danger of confusion. Any set in higher-order logic must be a subset of some type Σ. Hence, each type is a universal set for its subsets, so there is an infinite number of universal sets (and an infinite number of empty sets).

Properties of Predicates

The general properties for boolean lattices hold for predicates. For instance, ⊆ is a partial ordering, so the transitivity rule holds,

  Γ ⊢ t ⊆ t′    Γ′ ⊢ t′ ⊆ t″
  -------------------------------- (⊆ transitive)
  Γ ∪ Γ′ ⊢ t ⊆ t″

Similarly, the rule of meet introduction is

  Γ ⊢ s ⊆ t    Γ′ ⊢ s ⊆ t′
  -------------------------------- (∩ introduction)
  Γ ∪ Γ′ ⊢ s ⊆ t ∩ t′

These properties can be derived in higher-order logic from the corresponding properties for truth values, using the definitions of the operations on predicates. Hence, we need not postulate these properties; they can be proved from the postulates that we have introduced earlier. The difference with truth values is that neither linearity nor theoremhood holds for the predicate lattice.

We show how to derive two of the boolean lattice rules for predicates and sets: reflexivity and transitivity of set inclusion ⊆. Reflexivity is proved as follows:

  A ⊆ A
≡ {definition of inclusion}
  (∀a • A.a ⇒ A.a)
≡ {reflexivity of ⇒}
  (∀a • T)
≡ {drop vacuous quantification}
  T

This establishes that A ⊆ A.

For transitivity, we reason as follows:

  A ⊆ B ∧ B ⊆ C
≡ {definition of inclusion}
  (∀a • A.a ⇒ B.a) ∧ (∀a • B.a ⇒ C.a)
≡ {quantifier rule}
  (∀a • (A.a ⇒ B.a) ∧ (B.a ⇒ C.a))
⇒ {transitivity of implication; monotonic context}
  (∀a • A.a ⇒ C.a)
≡ {definition of inclusion}
  A ⊆ C

7.2 Images and Indexed Sets

Let f : Σ → Γ be a function. Then we define the image of a set A ⊆ Σ under the function f by

  im.f.A ≜ {y : Γ | (∃x • x ∈ A ∧ y = f.x)} .  (image)

The image is thus a subset of Γ. We follow ordinary mathematical practice and write f.A for im.f.A when this cannot cause confusion. The range of a function is then the image of its domain:

  ran.f ≜ f.Σ .  (range)

Note that f.{x} = {f.x} for x ∈ Σ, so function application and set formation commute for singletons.

The extension of a function f : Σ → Γ to im.f ∈ P(Σ) → P(Γ) always has an inverse preim.f : P(Γ) → P(Σ), defined by

  preim.f.B ≜ {x | f.x ∈ B}  (preimage)

for arbitrary B ⊆ Γ. Again, we follow ordinary mathematical practice and write the preimage as f⁻¹.B when this cannot cause confusion. The set preim.f.B is called the preimage (or inverse image) of B under f.
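A small executable reading of these definitions for finite sets, represented as lists (a Haskell sketch, our own illustration; preim is given the finite domain to search in):

  import Data.List (nub)

  -- Image of a finite set A under f.
  im :: Eq b => (a -> b) -> [a] -> [b]
  im f as = nub [ f x | x <- as ]

  -- Preimage of a finite set B under f, relative to a finite domain.
  preim :: Eq b => (a -> b) -> [b] -> [a] -> [a]
  preim f bs domain = [ x | x <- domain, f x `elem` bs ]

  main :: IO ()
  main = do
    print (im (`div` 2) [0 .. 7])          -- [0,1,2,3]
    print (preim (`div` 2) [1] [0 .. 7])   -- [2,3]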

For a fixed f, the image im.f : P(Σ) → P(Γ) is a function from one lattice to another. Hence, we may ask about the homomorphism properties of this function. These are easily established: the image is monotonic,

  A ⊆ A′ ⇒ im.f.A ⊆ im.f.A′ ,

and it is bottom and join homomorphic,

  is                               ⊑      ⊥      ⊤      ⊓      ⊔      ¬
  im.f                             yes    yes    no     no     yes    no
  preim.f                          yes    yes    yes    yes    yes    yes
  (λa • (⊓ i ∈ I • a.i))           yes    yes¹   yes    yes    no     no
  (λa • (⊔ i ∈ I • a.i))           yes    yes    yes¹   no     yes    no
  (λI • (⊓ i ∈ I • a.i))           yes°   yes°   yes°   no     yes°   no
  (λI • (⊔ i ∈ I • a.i))           yes    yes    yes    no     yes    no

  ¹ when index set I is nonempty
  ° with respect to the dual lattice

TABLE 7.1. Homomorphic properties of images

  im.f.∅ = ∅ ,
  im.f.(A ∪ A′) = im.f.A ∪ im.f.A′ .

For the preimage preim.f : P(Γ) → P(Σ), we have even stronger homomorphism properties. The preimage is both bottom and top homomorphic, it is monotonic, and it is meet and join homomorphic. It is even negation homomorphic. The homomorphism properties of the image and preimage operation are summarized in Table 7.1.

We have used the indexed set notation earlier in an informal way, in connection with meets and joins of arbitrary sets in a poset. Let us now define it formally in terms of images. The indexed set {a.i | i ∈ I} is really a set image. For a : Δ → Σ and I ⊆ Δ, we define

  {a.i | i ∈ I} ≜ im.a.I .

In general, we define

  {t | i ∈ I} ≜ im.(λi • t).I

for t a term of type Σ. The indexed set is thus determined by an index function (λi • t) and an index range I. The index i itself is a bound variable. The traditional notation {ai | i ∈ I} is also used as an alternative for {a.i | i ∈ I}.

7.3 Inference Rules for Complete Lattices

We gave the inference rules for binary meets and joins in lattices in the preceding chapter. Let us now consider complete lattices where meets and joins are defined for arbitrary sets of elements. A meet takes a set of elements as an argument and returns the greatest lower bound of the elements in the set. It is thus a set operation,

similar to quantification, whereas binary meet and binary join are operations on pairs of elements. If the set is given as an index set {t | i ∈ I }, then we can use the indexed notation for its meet and join. We define

  (⊓ i ∈ I • t) ≜ ⊓ {t | i ∈ I} ,  (indexed meet)
  (⊔ i ∈ I • t) ≜ ⊔ {t | i ∈ I} .  (indexed join)

If I is of the form {i | b}, then we use the notation (⊓ i | b • t) as a shorthand for (⊓ i ∈ {i | b} • t), and similarly for joins. The characterizations of meet and join then justify the following inference rules for indexed meets and joins:

  Γ, i ∈ I ⊢ s ⊑ t
  ------------------------ (general ⊓ introduction)
  Γ ⊢ s ⊑ (⊓ i ∈ I • t)

  • i not free in s, Γ, or I,

  t′ ∈ I ⊢ (⊓ i ∈ I • t) ⊑ t[i := t′]  (general ⊓ elimination)

  • t′ is free for i in t,

  t′ ∈ I ⊢ t[i := t′] ⊑ (⊔ i ∈ I • t)  (general ⊔ introduction)

  • t′ is free for i in t,

  Γ, i ∈ I ⊢ t ⊑ s
  ------------------------ (general ⊔ elimination)
  Γ ⊢ (⊔ i ∈ I • t) ⊑ s

  • i not free in s, Γ, or I.

Index Range Homomorphisms

The indexed meet is a function of the form ⊓ (im.a.I). For a fixed a, we have that (λI • ⊓ (im.a.I)) : P(Δ) → Σ, where Σ is a complete lattice. Hence, we can ask about the homomorphism properties of this function from the powerset lattice to the lattice Σ. We first notice that the indexed meet is antimonotonic and the indexed join is monotonic in the index range:

  I ⊆ J ⇒ (⊓ i ∈ I • a.i) ⊒ (⊓ i ∈ J • a.i) ,  (range monotonicity)
  I ⊆ J ⇒ (⊔ i ∈ I • a.i) ⊑ (⊔ i ∈ J • a.i) .

For the empty range, we have that

  (⊓ i ∈ ∅ • a.i) = ⊤ ,  (empty range)
  (⊔ i ∈ ∅ • a.i) = ⊥ .

We can split up an index range into smaller sets using the following range split property:

  (⊓ i ∈ I ∪ J • a.i) = (⊓ i ∈ I • a.i) ⊓ (⊓ i ∈ J • a.i) ,  (range split)
  (⊔ i ∈ I ∪ J • a.i) = (⊔ i ∈ I • a.i) ⊔ (⊔ i ∈ J • a.i) .

These properties follow directly from the homomorphism properties of the meet, join, and range operators. Thus, indexed joins are bottom and join homomorphic onto Σ, while indexed meets are bottom and join homomorphic onto the dual lattice Σ°. These properties are summarized in Table 7.1.

Index Function Homomorphisms

We can also consider the homomorphism properties of the indexed meet ⊓ (im.a.I) when we fix I instead. The function (λa • ⊓ (im.a.I)) maps functions in Δ → Σ to elements in Σ. Because Σ is a complete lattice, we may assume that the functions are pointwise ordered. Then (λa • ⊓ (im.a.I)) is a function from the complete lattice Δ → Σ to the complete lattice Σ, so we can again ask about its homomorphic properties. These will be generalizations of the homomorphic properties for binary meet and join that we have described earlier. For binary meet and join, we have that b ⊑ b′ and c ⊑ c′ imply that b ⊓ c ⊑ b′ ⊓ c′ and b ⊔ c ⊑ b′ ⊔ c′; i.e., binary meet and join are monotonic in both arguments. For indexed meet and join, this generalizes to

  a ⊑ b ⇒ (⊓ i ∈ I • a.i) ⊑ (⊓ i ∈ I • b.i) ,  (image monotonicity)
  a ⊑ b ⇒ (⊔ i ∈ I • a.i) ⊑ (⊔ i ∈ I • b.i) .

Furthermore, indexed meets and joins are both bottom and top homomorphic:

  (⊓ i ∈ I • ⊥.i) = ⊥ , when I ≠ ∅ ,
  (⊔ i ∈ I • ⊥.i) = ⊥ ,
  (⊓ i ∈ I • ⊤.i) = ⊤ ,
  (⊔ i ∈ I • ⊤.i) = ⊤ , when I ≠ ∅ .

Indexed meet is also meet homomorphic (so it is in fact universally meet homomorphic), and indexed join is also join homomorphic (universally join homomorphic):

  (⊓ i ∈ I • (a ⊓ b).i) = (⊓ i ∈ I • a.i) ⊓ (⊓ i ∈ I • b.i) ,
  (⊔ i ∈ I • (a ⊔ b).i) = (⊔ i ∈ I • a.i) ⊔ (⊔ i ∈ I • b.i) .

These results are summarized in Table 7.1.

7.4 Bounded Quantification

Let us write (∀i ∈ I • t) for (∀i | i ∈ I • t) and (∃i ∈ I • t) for (∃i | i ∈ I • t). Then it is easily seen that (∀i ∈ I • t) = (∧ i ∈ I • t) and that (∃i ∈ I • t) = (∨ i ∈ I • t). Thus, bounded quantification is a special case of indexed meet and join, and has properties that follow directly from the general properties by specializing. In particular, the rules for introduction and elimination of bounded quantification given in the previous chapter are instances of the general rules for indexed meets and joins, as is easily shown.

The homomorphism properties for indexed meets and joins also hold for bounded quantification. For example, universal quantification is antimonotonic, and existential quantification is monotonic in the index range:

  I ⊆ J ⇒ ((∀i ∈ I • t) ⇐ (∀i ∈ J • t)) ,  (range monotonicity)
  I ⊆ J ⇒ ((∃i ∈ I • t) ⇒ (∃i ∈ J • t)) .

If the range is empty, the universal quantification is trivially true and existential quantification is trivially false:

  (∀i ∈ ∅ • t) = T ,  (empty range)
  (∃i ∈ ∅ • t) = F .

The technique for range splitting is also very useful:

  (∀i ∈ I ∪ J • t) = (∀i ∈ I • t) ∧ (∀i ∈ J • t) ,  (range split)
  (∃i ∈ I ∪ J • t) = (∃i ∈ I • t) ∨ (∃i ∈ J • t) .

The range-splitting and empty-range properties become particularly useful when reasoning about properties of arrays in programs. These properties are all directly derivable in higher-order logic by using the definition of bounded quantification and the quantifier rules. The index function homomorphisms also give us useful rules for reasoning about bounded quantification. The formulation and proof of these are given as an exercise (Exercise 7.4).
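For instance, a quantifier over the indices of an array can be split at an arbitrary point, as in this small Haskell sketch (the array, split point, and property are our own example, not from the book):

  -- Range split for bounded universal quantification over array indices.
  arr :: [Int]
  arr = [2, 4, 6, 8, 10, 12]

  allIn :: [Int] -> (Int -> Bool) -> Bool
  allIn ixs p = and [ p i | i <- ixs ]

  main :: IO ()
  main = do
    let i = [0 .. 2]
        j = [3 .. 5]
        p k = even (arr !! k)
    -- (∀ k ∈ I ∪ J • p.k) ≡ (∀ k ∈ I • p.k) ∧ (∀ k ∈ J • p.k)
    print (allIn (i ++ j) p == (allIn i p && allIn j p))   -- True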

General Conjunction and Disjunction

The predicate lattice is a complete lattice, so indexed meets and joins are also available for predicates. We write (∩ i ∈ I • pi) for the indexed meet of a set {pi | i ∈ I} of predicates and define (∪ i ∈ I • pi) analogously. In set theory we talk about the intersection and union of a family of sets. Indexed meet and join of predicates can be seen as pointwise extensions of bounded quantification:

  (∩ i ∈ I • pi).σ ≡ (∀i ∈ I • pi.σ) ,  (general conjunction)
  (∪ i ∈ I • pi).σ ≡ (∃i ∈ I • pi.σ) .  (general disjunction)

The inference rules for these are the ones for general meets and joins, specialized to predicates. The general ∩ introduction rule is, e.g.,

  Γ, i ∈ I ⊢ p ⊆ qi
  ------------------------ (general ∩ introduction)
  Γ ⊢ p ⊆ (∩ i ∈ I • qi)

  • i not free in p, Γ, or I.

7.5 Selection and Individuals

Higher-order logic needs two more postulates in order to allow a proper formalization of arithmetic. These are specific properties that we need for the type structure: we need at least one type with an infinite number of elements, and we need to be able to choose one element from any set of elements in a type. These two requirements correspond to the set-theoretic axioms of infinity and choice.

Selection

We need to postulate one more constant to get the full power of higher-order logic. This is the selection operator of type

  ε : (Σ → Bool) → Σ .  (select)

This operator takes a set (predicate) A of elements in Σ as its argument and returns an element ε.A in A if A is not empty. If A is empty, then ε.A is some (unknown) element in the universal set Σ. We use binder notation for selection and write (εv • t) for ε.(λv • t). The selection operation embodies the axiom of choice, since it allows us to choose an element from any set. The rules for introducing and eliminating selection reflect the idea that ε.A denotes an arbitrary element selected from the set A.

  ⊢ (∃v • t) ⇒ t[v := (εv • t)] ,  (ε introduction)

  Γ ⊢ t ⇒ s
  ------------------------------ (ε elimination)
  Γ ⊢ t[v := (εv • t)] ⇒ s

  • v is not free in Γ or s.

The first rule states that if there is an element that makes t true, then (εv • t) denotes such an element. The second rule is justified in the same way as the ∃ elimination rule. The selection operator allows us to introduce a constant denoting an arbitrary element for any type:

  arb ≜ (εx : Σ • F) .  (arbitrary element)

Since the selection operation is always defined, arb always returns some value. However, we have no information about arb : Σ other than that it is of type Σ.

The Type of Individuals

In addition to the type Bool, which denotes the truth values, higher-order logic also postulates a standard type Ind, which denotes a set of individuals. The only thing we know about the individuals is that there are infinitely many of them. The axiom of infinity (one of the basic axioms of higher-order logic) is formulated as follows:

  ⊢ (∃ f : Ind → Ind • oneone.f ∧ ¬onto.f) .  (infinity axiom)

This states that there exists a function f on Ind that is injective but not surjective. This implies that Ind can be mapped injectively onto a proper subset of itself, which is possible only if Ind is infinite. The notions of a function being injective (or one-to-one) and surjective (or onto) are defined as follows:

  oneone.f ≜ (∀x x′ • f.x = f.x′ ⇒ x = x′) ,  (injectivity)
  onto.f ≜ (∀y • ∃x • y = f.x) .  (surjectivity)

A function that is both injective and surjective is said to be bijective:

  bijective.f ≜ oneone.f ∧ onto.f .  (bijectivity)

Figure 7.1 illustrates the way in which this requirement forces the interpretation of Ind to be infinite. For the finite set {0, 1, 2, 3}, it is easy to see that there is no way in which we can define a function on this set that is injective but not surjective. However, for an infinite set {0, 1, 2, 3,...}, the figure on the right-hand side gives an example of a function that is injective but not surjective (0 is not in the range of this function). In this case, the function illustrated is the successor function on the natural numbers. Later we show how this idea can be used to define the natural numbers as a set that is isomorphic to a subset of the individuals.
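The point made by Figure 7.1 can be checked by brute force on a small finite set: no function on a four-element set is injective but not surjective. A Haskell sketch (our own illustration, not part of the formal development):

  -- oneone and onto for functions on a small finite domain, and a brute-force
  -- search showing that none of them is injective but not surjective.
  dom :: [Int]
  dom = [0 .. 3]

  oneone, onto :: (Int -> Int) -> Bool
  oneone f = and [ x == x' | x <- dom, x' <- dom, f x == f x' ]
  onto   f = and [ or [ y == f x | x <- dom ] | y <- dom ]

  -- Every function dom -> dom, given by its table of values.
  allFuns :: [[Int]]
  allFuns = sequence (replicate (length dom) dom)

  main :: IO ()
  main = print (any (\table -> let f x = table !! x
                               in oneone f && not (onto f)) allFuns)   -- False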

7.6 Summary and Discussion

This chapter has shown how to formalize reasoning about predicates and sets in higher-order logic. The notions of indexed meets and joins were formally defined, and general inference rules for a complete boolean lattice were given in terms of indexed meets and joins. In particular, we considered bounded quantification, which arises as a special case of indexed meets and joins. To our knowledge, the homomorphism properties of indexed meets and joins have not been investigated in detail before.

FIGURE 7.1. Injective functions on finite and infinite sets

The remaining axioms and inference rules of higher-order logic were also presented. These axioms postulate that there exists an infinite set, and that we can choose some element from any set. Individuals were introduced as a new standard type, in addition to Bool. In this respect, we give the same foundation for higher-order logic as Andrews [3] and Gordon and Melham [62]. Church's original formulation had an operator for definite description rather than the selection operator [48].

7.7 Exercises

7.1 A set B over a type Σ is really a function B : Σ → Bool. This means that we can talk about the image im.B.A of another set A under B. This image is a set of truth values. Show the following:

(a) im.B.B = {T}, if B ≠ ∅ .

(b) im.B.∅ = ∅ .

Under what condition is im.B.A = {F}? Under what condition is im.B.A = {F, T}?

7.2 Show that images and inverse images are related as follows:

  im.f.(f⁻¹.Y) ⊆ Y and X ⊆ f⁻¹.(im.f.X) .

7.3 Show that the complement ¬ in a complete boolean lattice is a bijective function.

7.4 The index-function homomorphism properties for lattices have counterparts for bounded quantification. For example, image monotonicity for meets corresponds to the following rule:

  (∀i • p.i ⇒ q.i) ⇒ (∀i ∈ I • p.i) ⇒ (∀i ∈ I • q.i) .

Prove this rule and formulate other similar rules.

7.5 In first-order logic it is common to introduce a Skolem function when one has proved a theorem of the form (∀v • ∃w • t). Such a function f is assumed to satisfy the property (∀v • t[w := f.v]). In higher-order logic, the introduction of a Skolem function can be based on a derived inference rule:

  Γ ⊢ (∀v • ∃w • t.w)
  ------------------------ (skolemization)
  Γ ⊢ (∃ f • ∀v • t.( f.v))

  • w not free in t.

Derive this inference rule using only the rules given in the text.

7.6 Prove the following theorem (the axiom of choice):

  (∃ f : P(Σ) → Σ • ∀A ∈ P(Σ) | A ≠ ∅ • f.A ∈ A) .

7.7 Prove the following theorems:

(a) injective.f ≡ (∃g • g ◦ f = id).

(b) surjective.f ≡ (∃g • f ◦ g = id).

(c) bijective.f ≡ (∃g • g ◦ f = id ∧ f ◦ g = id).

Note that (c) is not a trivial consequence of (a) and (b).

7.8 Try to formulate an intuitively appealing definition of the constant finite, where finite.A means that the set A is finite.

8

Boolean Expressions and Conditionals

In the previous chapter we showed how to reason about predicates and sets in general. In programming, predicates arise in two different disguises. First, they are used to describe control structures with branching, like conditional statements and loops. Second, they are used to express correctness conditions for programs. In both cases, the predicates are expressed in terms of program variables, so they are really boolean expressions. In this chapter, we will look more closely at boolean expressions and at how to reason about their properties.

8.1 Boolean Expressions

A boolean expression is an expression of type Σ → Bool; i.e., it is a predicate on Σ. We usually prefer to use the (pointwise extended) truth-value operations for boolean expressions, writing b1 ∧ b2 ∧ T ⇒ b3 ∨ b4 ∨ F rather than the predicate/set notation b1 ∩ b2 ∩ true ⇒ b3 ∪ b4 ∪ false. We are usually interested in the value of a boolean expression b in a specific state σ, and the correspondence between the boolean expression and its value in the state is then very direct:

(b1 ∧ b2 ∧ T ⇒ b3 ∨ b4 ∨ F).σ =

b1.σ ∧ b2.σ ∧ T ⇒ b3.σ ∨ b4.σ ∨ F .

We can also have quantified boolean expressions, such as (∀i ∈ I • b) = (∩ i ∈ I • b) and (∃i ∈ I • b) = (∪ i ∈ I • b). Then (∀i ∈ I • b).σ ≡ (∀i ∈ I • b.σ) and (∃i ∈ I • b).σ ≡ (∃i ∈ I • b.σ). For example, the boolean expression (∃z : Nat • x = y · z) states that the value of attribute x is divisible by the value of attribute y. We use the same abbreviations for quantified boolean expressions as

we have for quantified truth-value formulas. The boolean expression ¬(∃y z | y ≥ 2 ∧ z ≥ 2 • x = y · z) contains a bounded quantification and is equivalent to ¬(∃y z • y ≥ 2 ∧ z ≥ 2 ∧ x = y · z). It states that the value of attribute x is prime.

Let e be an expression. We define the base term ē associated with the expression e as follows:

  x̄ = x ,
  (ẏ)‾ = y ,
  (ḟ.g1. ··· .gm)‾ = f.ḡ1. ··· .ḡm .

For boolean expressions, we also define

  (∀i ∈ I • b)‾ = (∀i ∈ I • b̄) ,
  (∃i ∈ I • b)‾ = (∃i ∈ I • b̄) .

For each expression e of type Σ → Γ, this determines a base term ē of type Γ. For instance, for attributes x, y, z : Nat, repeated rewriting using the definitions above gives

  ((x − 1) + (y + z))‾ = (x − 1) + (y + z) ,

where the x, y, and z on the right-hand side are ordinary variables of type Nat.

The value of an expression e in a state σ can be determined using the base term ē, as shown by the following lemma. Note that in the lemma, the substitution replaces a variable x by the term x.σ (the value of the access function of attribute x in state σ).

Lemma 8.1 Let x1,...,xm be the attributes of expression e. Then

  e.σ = ē[x1, ..., xm := x1.σ, ..., xm.σ] .

Proof Assume that the substitution x1, ..., xm := x1.σ, ..., xm.σ is written as x := x.σ. We prove this theorem scheme by induction over the structure of the expression e. When e is the access function of attribute x, then we have that x.σ = x[x := x.σ]. When e is ẏ, we have that ẏ.σ = (λσ • y).σ = y = y[x := x.σ]. Consider next the case when e is an application ḟ.e1. ··· .em. We then have that

  ḟ.e1. ··· .em.σ
= {definition of pointwise extension}
  f.(e1.σ). ··· .(em.σ)
= {induction hypothesis}
  f.(ē1[x := x.σ]). ··· .(ēm[x := x.σ])
= {property of substitution, f is a constant}
  (f.ē1. ··· .ēm)[x := x.σ]

The case when e is a quantified boolean expression is proved in a similar way. □

Note that Lemma 8.1 states that a certain property holds for all expressions. Since the property of being an expression is not defined inside higher-order logic, the proof is necessarily informal, in the sense that it is outside the logic. The rule in Lemma 8.1 is a theorem scheme, and e is a metavariable that can be instantiated to any expression (but not to an arbitrary term of higher-order logic).

8.2 Reasoning About Boolean Expressions

The properties that we want to prove about programs can often be reduced to proving inclusion b ⊆ c or equality b = c between two boolean expressions. Here we will show how to reduce such inclusions and equalities to properties of the corresponding base terms b̄ and c̄.

Corollary 8.2 Let x1,...,xm be the attributes of the boolean expression b. Then

  b.σ ≡ (∃x1 ··· xm • x1 = x1.σ ∧ ··· ∧ xm = xm.σ ∧ b̄) .

This corollary to Lemma 8.1 allows us to reduce inclusion between boolean expressions to implication between the corresponding base terms. Consider as an example how to establish that (x ≤ y) ⊆ (x < y + 2). Intuitively, this says that whenever condition x ≤ y holds (in a state), condition x < y + 2 also holds (in this same state). We have that

  x ≤ y
= {set theory}
  {σ | (x ≤ y).σ}
= {Corollary 8.2}
  {σ | ∃x y • x = x.σ ∧ y = y.σ ∧ “ x ≤ y ”}
⊆ {focus on boolean term}
    • x ≤ y
      ⇒ {arithmetic}
        x < y + 2
  {σ | ∃x y • x = x.σ ∧ y = y.σ ∧ x < y + 2}
= {Corollary 8.2}
  x < y + 2

Notice how the corollary allows us to focus on proving an implication between boolean terms in order to establish inclusion between the boolean expressions. In the same way, we can establish equality of two boolean expressions by proving equivalence for their corresponding base terms. We refer to this way of establishing inclusion or equality of a boolean expression as reduction.

In practice, this level of detail in a proof is unnecessary. We will omit the details of the reduction in subsequent proofs. The above derivation is then

  x ≤ y
⊆ {reduction}
    • x ≤ y
      ⇒ {arithmetic}
        x < y + 2
  x < y + 2

The fact that the step is justified as a reduction signals that the inner derivation is done on boolean formulas and not on boolean expressions. In simple cases like this one, we will usually omit the nested derivation altogether and make implicit use of reduction:

  x ≤ y
⊆ {arithmetic}
  x < y + 2

Here the use of ⊆ (rather than ⇒) shows that we are relating two boolean expressions rather than two boolean terms.
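Inclusions like the one above can also be spot-checked over sample states, as in this small Haskell sketch (our own illustration; it samples a grid of values for the attributes x and y):

  -- Spot-check of (x <= y) ⊆ (x < y + 2) over a grid of sample states.
  main :: IO ()
  main = print (and [ not (x <= y) || (x < y + 2)
                    | x <- [-5 .. 5 :: Int], y <- [-5 .. 5] ])   -- True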

Alternative Reduction

We can also reduce b ⊆ c directly to an implication between base terms, as shown by the following lemma.

Lemma 8.3 Let b and c be boolean expressions, and let x = x1,...,xm be the attributes that occur in these. Then

  (a) ⊢ (∀x1 ··· xm • b̄ ⇒ c̄) ⇒ b ⊆ c ,
  (b) var x1, ..., xm ⊢ (∀x1 ··· xm • b̄ ⇒ c̄) ≡ b ⊆ c .

Proof Let x := x.σ stand for x1,...,xm := x1.σ,...,xm .σ . We have for (a) that

  (∀x1 ··· xm • b̄ ⇒ c̄)
⇒ {specialization; σ a fresh variable}
  b̄[x := x.σ] ⇒ c̄[x := x.σ]
≡ {substitution property}
  b.σ ⇒ c.σ

and the result then follows by generalization. For (b), we also have to prove the reverse implication:

  var x1, ..., xm ⊢
  b ⊆ c
≡ {definition of inclusion}
  (∀σ • b.σ ⇒ c.σ)
≡ {substitution property}
  (∀σ • b̄[x := x.σ] ⇒ c̄[x := x.σ])
⇒ {specialize, σ := σ′, where σ′ = (set.x1.x1 ; ··· ; set.xm.xm).σ0 for an arbitrary state σ0}
  b̄[x := x.σ′] ⇒ c̄[x := x.σ′]
≡ {property of state attributes: xi.σ′ = xi}
  b̄ ⇒ c̄

The result then follows by generalization, as the variables x1, ..., xm are not free in b and c, nor in var x1, ..., xm. □

As an example, consider again establishing (x ≤ y) ⊆ (x < y + 2). We have that

  var x, y ⊢
  (x ≤ y) ⊆ (x < y + 2)
≡ {reduction}
  (∀x y • x ≤ y ⇒ x < y + 2)
≡ {arithmetic}
  T

This form of reduction is simpler to use when we want to establish an inclusion like (x ≤ y) ⊆ (x < y + 2) as a single formula, whereas the first form of reduction is more convenient when we want to derive a superset (x < y + 2) from a subset (x ≤ y) or vice versa.

8.3 Conditional Expressions

The conditional construct, usually denoted by if ... then ... else ... fi, is of central importance in mathematics as well as in programming languages. In mathematics, it allows us to define functions by cases. In programs, it permits us to choose between alternative computation paths depending on the present state of the computation. In this section, we define a conditional construct that allows us to define functions by cases in general. Then we show how to extend this to conditional expressions and conditional state transformers.

Conditionals

Let us define a new constant, the conditional cond : Bool → Σ → Σ → Σ. The intended meaning is that cond.t.t1.t2 is t1 if t ≡ T and t2 otherwise:

    cond.t.t1.t2 ≜ (ε x • ((t ≡ T) ⇒ x = t1) ∧ ((t ≡ F) ⇒ x = t2)) .    (conditional)

We can make the conditional operator more palatable by adding some syntactic sugar:

    if t then t1 else t2 fi ≜ cond.t.t1.t2 .

The if ... fi construct makes conditionals easier to read and manipulate. For this to work, we need inference rules for the new construct. Three basic inference rules are the following (Exercise 8.2):

    Φ ⊢ t
    ──────────────────────────────────── ,
    Φ ⊢ if t then t1 else t2 fi = t1

    Φ ⊢ ¬ t
    ──────────────────────────────────── ,
    Φ ⊢ if t then t1 else t2 fi = t2

    ──────────────────────────────────── .
    Φ ⊢ if t then t' else t' fi = t'

The conditional operator is needed to define functions by cases. For instance, we can define the maximum function on natural numbers as follows:

    max.x.y ≜ if x ≤ y then y else x fi .

The rules above then allow us to deduce, e.g., that

      max.x.(x + 2)
    = {definition of max}
      if "x ≤ x + 2" then x + 2 else x fi
    = {arithmetic}
      if T then x + 2 else x fi
    = {conditional rule}
      x + 2

Similarly as for let terms, the condition (or its negation) becomes a local assumption when focusing on either branch of a conditional. This is justified by the following rules:

    Φ, t ⊢ t1 ∼ t1'
    ────────────────────────────────────────────────────────────
    Φ ⊢ (if t then t1 else t2 fi) ∼ (if t then t1' else t2 fi)

and

    Φ, ¬ t ⊢ t2 ∼ t2'
    ────────────────────────────────────────────────────────────
    Φ ⊢ (if t then t1 else t2 fi) ∼ (if t then t1 else t2' fi)

where ∼ is an arbitrary relation.

Conditional Expressions and State Transformers

The conditional operator is extended pointwise as follows. Let b be a predicate on Σ, and let f : Σ → Γ and g : Σ → Γ be two state functions. Then we define the conditional state function if b then f else g fi : Σ → Γ by

    (if b then f else g fi).σ ≜ if b.σ then f.σ else g.σ fi .

When b, f, and g are expressions, then we refer to this construct as a conditional expression. When f and g are state transformers, then we call this a conditional state transformer. The following state transformer sets attribute x to the maximum of attributes x and y:

    if x ≤ y then x := y else id fi .

An example of a conditional expression is, e.g.,

    if x ≤ y then y else x fi ,

which gives the maximum of attributes x and y directly. The following result shows how to establish equality of two conditional state transformers with the same condition:

Theorem 8.4  Let f, f', g, and g' be state transformers and b a predicate. Then

    (if b then f else g fi) = (if b then f' else g' fi)
      ≡ b ⊆ (f = f') ∧ ¬ b ⊆ (g = g') .

This follows directly from the definitions and the inference rules for conditionals. The condition b ⊆ (f = f') is a kind of conditional equality. It says that the functions f and f' must have the same values in a state whenever the state satisfies b. We can identify a number of other useful equality rules for working with conditional expressions and conditional state transformers.

Theorem 8.5 The following rules hold for conditional state functions:

(a) if b then f else f fi = f ,

(b) if b then f else g fi = if ¬ b then g else f fi ,

(c) if b then f else g fi ; h = if b then f ; h else g ; h fi ,

(d) h ; if b then f else g fi = if h ; b then h ; f else h ; g fi .

We leave the proofs of these equalities as exercises (Exercise 8.4). In the last rule we use the fact that h ; b is a predicate (but not a boolean expression) if h is a state transformer and b is a predicate. By noting that (x := e) ; b = b[x := e] (Exercise 8.5), we get a special case of the last rule that uses only boolean expressions:

    (x := e) ; if b then f else g fi = if b[x := e] then (x := e) ; f else (x := e) ; g fi .

8.4 Proving Properties About Conditional State Transformers

Let us add conditional state transformers to the straight-line programs that we have described earlier. The language is thus defined by the following syntax:

    f ::= id | x := e | f1 ; f2 | if b then f1 else f2 fi .

Here x is a list of distinct state attributes, e a corresponding list of expressions, and b a boolean expression. An obvious property that we want to prove of a straight-line program is that it is equal to another one. As an example, consider the simple program x := x + 1 ; x := x − 1. Let us prove that it is really the identity state transformer. We have that

      var x : Nat ⊢
      (x := x + 1) ; (x := x − 1)
    = {merge assignments (Theorem 5.3)}
      (x := "(x + 1) − 1")
    = {functionality of assignment, arithmetic}
      (x := x)
    = {property (e)}
      id

The second step uses the functionality of assignment, i.e., that e = e' ⇒ (x := e) = (x := e').
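Although the calculus itself works entirely within higher-order logic, this little language is easy to prototype as a deep embedding. The following Haskell sketch is our own illustration (the names Prog and run and the map-based state are assumptions of the sketch, not notation from the text): each construct of the grammar becomes a constructor, and run interprets a program as a state transformer. Evaluating x := x + 1 ; x := x − 1 on a sample state suggests the identity behavior established by the derivation above.

    import qualified Data.Map as M

    -- A state assigns integer values to named attributes (a simplification).
    type State = M.Map String Int
    type Expr  = State -> Int          -- expressions, pointwise extended
    type BExpr = State -> Bool         -- boolean expressions

    -- Deep embedding of the straight-line language.
    data Prog
      = Id
      | Assign String Expr
      | Seq Prog Prog
      | Cond BExpr Prog Prog

    -- Each program denotes a state transformer.
    run :: Prog -> State -> State
    run Id           s = s
    run (Assign x e) s = M.insert x (e s) s
    run (Seq f g)    s = run g (run f s)
    run (Cond b f g) s = if b s then run f s else run g s

    -- x := x + 1 ; x := x - 1
    incDec :: Prog
    incDec = Seq (Assign "x" (\s -> s M.! "x" + 1))
                 (Assign "x" (\s -> s M.! "x" - 1))

    main :: IO ()
    main = do
      let s = M.fromList [("x", 7), ("y", 3)]
      print (run incDec s == s)   -- True: behaves as the identity on this state

Such a test only samples the behavior on particular states; the derivation above is what establishes the equality for all states.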

Pointwise Functionality

Any function f is a congruence with respect to equality; a = b implies that f.a = f.b. In many cases, functionality is not sufficient, and we need a stronger property. Let us say that a function f is pointwise functional if

    (∀σ • e.σ = e'.σ ⇒ f.e.σ = f.e'.σ) .    (pointwise functionality)

This condition can be written as

    (e = e') ⊆ (f.e = f.e') .

The equalities that occur in these expressions are pointwise extended operations (the typing would not be correct otherwise). Pointwise extended functions f˙ are always pointwise functional: if e.σ = e'.σ, then

    f˙.e.σ = f.(e.σ) = f.(e'.σ) = f˙.e'.σ .

This means that expressions in general are pointwise functional. We generalize this to arbitrary components of expressions.

Lemma 8.6  Let e and e' be two expressions and let h be another expression. Then

    (e = e') ⊆ (h[x := e] = h[x := e']) .

The proof is by induction on the structure of expressions. As an example, we have that

(x + y = z) ⊆ ((x + y) ∗ z = z ∗ z).

The assignment function x := e is not a pointwise extended function, but it is pointwise functional:

    (e = e') ⊆ ((x := e) = (x := e')) .

This follows directly from the definition of assignment. We can use this property to reason about equivalence of assignments on certain states. For instance, we have that

      x ≤ y
    ⊆ {reduction: (∀x y • x ≤ y ⇒ x max y = y)}
      x max y = y
    ⊆ {pointwise functionality of assignment}
      (x := x max y) = (x := y)

so the state transformers x := x max y and x := y are equal in those states where x ≤ y. Sequential composition is pointwise functional in its first argument:

    (f = f') ⊆ (f ; g = f' ; g) .

However, sequential composition is not pointwise functional in its second argument. Instead, we have that

    (g = g').(f.σ) ⇒ (f ; g = f ; g').σ .

As we compare the functions in two different states, pointwise functionality does not hold. Conditional state functions (expressions and state transformers) are pointwise extended operations, so they are also pointwise functional in all arguments:

    (b = b' ∧ f = f' ∧ g = g') ⊆
      (if b then f else g fi = if b' then f' else g' fi) .

Example Proof

Let us illustrate these proof rules by showing that the little straight-line program for computing the maximum of two program variables that we have given above is correct, i.e., that it is the same state transformer as (x := x max y):

      (x := x max y)
    = {conditional expression rule}
      if x ≤ y then (x := x max y) else (x := x max y) fi
    = {equality of conditionals, generalization}
        • x ≤ y
          ⊆ {derivation above}
            (x := x max y) = (x := y)
        • x > y
          ⊆ {reduction and assignment property}
            (x := x max y) = (x := x)
      if x ≤ y then x := y else "x := x" fi
    = {property (e) of attributes}
      if x ≤ y then x := y else id fi
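As a concrete, if informal, check of this result, the two state transformers can be compared pointwise in Haskell. The sketch below is our own encoding, with a two-attribute state type S that is an assumption of the sketch; it merely tests the equality on a grid of states and does not replace the derivation.

    -- A tiny state with attributes x and y.
    data S = S { x :: Int, y :: Int } deriving (Eq, Show)

    -- The conditional program: if x <= y then x := y else id fi.
    prog :: S -> S
    prog s = if x s <= y s then s { x = y s } else s

    -- The specification: x := x max y.
    spec :: S -> S
    spec s = s { x = max (x s) (y s) }

    main :: IO ()
    main = print (and [ prog s == spec s
                      | a <- [-3 .. 3], b <- [-3 .. 3], let s = S a b ])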

8.5 Summary and Discussion

We have shown how to reason about boolean expressions and conditional straight-line programs. We have shown how to reduce properties about boolean expressions to properties about ordinary boolean terms. An alternative way of handling predicates is taken by Dijkstra and Scholten in [54]. They build a theory of boolean expressions ab initio, by defining the syntax for these directly and postulating a collection of properties that boolean expressions satisfy.

They take pointwise functionality (which they call punctuality) as the basic rule, in combination with the everywhere operator [b], which corresponds to universal quantification in our framework: [b] ≡ (∀σ • b.σ ). With these correspondences, the laws for predicate expressions derived by Dijkstra and Scholten can be derived in our framework too. The approach that we have chosen here is closer to traditional logic, in that we compare boolean expressions directly for inclusion or equality and use the reduction theorem to move between reasoning on the predicate level and reasoning on the truth-value level.

8.6 Exercises

8.1 Show that if t1 and t2 are boolean terms, then

    if t then t1 else t2 fi ≡ (t ∧ t1) ∨ (¬ t ∧ t2) .

8.2  Derive the three inference rules for the if ... fi construct given in the text.

8.3  Derive the two focusing rules for the if ... fi construct given in the text.

8.4  Prove the rules for conditional state transformers given in Theorem 8.5.

8.5  Show that if b is a boolean expression, then (x := e) ; b = b[x := e].

8.6  Show that conditionals on two levels are related as follows:

    if b then x := e1 else x := e2 fi = (x := if b then e1 else e2 fi) .

8.7  Generalize Theorem 8.4 to permit the conditions to be different.

9

Relations

Let us next look at how relations are handled in higher-order logic. Relations, like predicates, form a complete boolean lattice and thus have essentially the same set of inference rules as predicates. We can define new operations on relations, giving additional properties. In particular, relations combine a lattice and a category structure. Relations also provide us with a more expressive way of describing state changes, where the effect can be nondeterministic in some states and undefined in other states.

9.1 Relation Spaces

We define the relation space from Σ to Γ, denoted by Σ ↔ Γ, by

    Σ ↔ Γ ≜ Σ → P(Γ) .    (relation space)

A relation between Σ and Γ is in set theory seen as a subset r ⊆ Σ × Γ. The set-theoretic view and the function view taken here are linked in a simple way. The set r determines a function Pr : Σ → P(Γ), defined by Pr.a.b = T if and only if (a, b) ∈ r. Conversely, the function P : Σ → P(Γ) determines the set rP ⊆ Σ × Γ, by rP = {(a, b) | P.a.b = T}.

Lattice of Relations

The relation space arises as a pointwise extension of the predicate space. For two relations P, Q ∈ Σ ↔ Γ, the partial ordering is defined by

    P ⊆ Q ≜ (∀σ : Σ • P.σ ⊆ Q.σ) .    (⊆ of relations)

It is easy to see that P ⊆ Q is equivalent to (∀σ γ • P.σ.γ ⇒ Q.σ.γ). By the pointwise extension theorem, the relation space inherits the property of being a complete boolean lattice from the predicate lattice. The relation space is also atomic.

Theorem 9.1  (Σ ↔ Γ, ⊆) is a complete, boolean, and atomic lattice.

The lattice operations on relations are also the pointwise extensions of the corresponding operations on predicates. We define these as follows, for σ : Σ:

    False.σ ≜ false ,              (⊥ of relations)
    True.σ ≜ true ,                (⊤ of relations)
    (P ∩ Q).σ ≜ P.σ ∩ Q.σ ,        (∩ of relations)
    (P ∪ Q).σ ≜ P.σ ∪ Q.σ ,        (∪ of relations)
    (¬ P).σ ≜ ¬ P.σ .              (¬ of relations)

The naming here is similar to that for predicates: False is the identically false relation, True the identically true relation, and we talk about the conjunction, disjunction and negation of relations. The inference rules for boolean lattices are valid for relations. Similarly as for predicates, we can derive these inference rules from the corresponding rules for predicates and truth values, so we do not need to postulate them.

In a set-theoretic interpretation, False is the empty relation, and True is the universal relation that relates any two elements to each other. Relation ¬ P is the complement of the relation P, while P ∩ Q is the intersection of two relations P and Q, and P ∪ Q is the union of these two relations, as would be expected.

Since relations form a complete lattice, indexed meets and joins are also available for relations. We write (∩ i ∈ I • Ri) for the indexed meet of a set {Ri | i ∈ I} of relations, and (∪ i ∈ I • Ri) for the indexed join. The indexed meet and join of relations are pointwise extensions of the corresponding constructs for predicates:

    (∩ i ∈ I • Ri).σ = (∩ i ∈ I • Ri.σ) ,    (general conjunction)
    (∪ i ∈ I • Ri).σ = (∪ i ∈ I • Ri.σ) .    (general disjunction)

The inference rules for these are again the rules for general meets and joins, specialized to relations.

Other Operations on Relations

Relations form a rather rich domain, and there are a number of other useful operations that we can define on them. We consider here the most important ones.

The identity relation and composition of relations are defined as follows:

    Id.σ.γ ≜ (σ = γ) ,                              (identity)
    (P • Q).σ.δ ≜ (∃γ • P.σ.γ ∧ Q.γ.δ) .            (composition)

As with functional composition, the range type of P must be the same as the domain type of Q for the typing to be correct. The inverse of relation R is R⁻¹, defined by

    R⁻¹.σ.γ ≜ R.γ.σ .    (inverse relation)

9.2 State Relation Category

A state relation category has state spaces as objects and state relations as morphisms. Thus, Σ −P→ Γ means that P is a relation of type Σ ↔ Γ. The identity relations are identity morphisms, and composition is forward relational composition:

    1 ≜ Id ,             (1 for state relations)
    P ; Q ≜ P • Q .      (; for state relations)

A special kind of state relation category is RelX, which has all types in X as objects and all relations between two objects as morphisms; i.e.,

    RelX(Σ, Γ) ≜ Σ ↔ Γ .

We write Rel for RelX when the collection X of state spaces is not important. We leave it to the reader to verify that Rel is indeed a category (Exercise 9.5), i.e., that

    (P ; Q) ; R = P ; (Q ; R) ,           (; associative)
    Id ; R = R  and  R ; Id = R .         (Id unit)

It is easy to see that composition is monotonic with respect to relation ordering:

    P ⊆ P' ∧ Q ⊆ Q' ⇒ P ; Q ⊆ P' ; Q' .    (; monotonic)

We summarize the results about relations in the following theorem.

Theorem 9.2 RelX forms a complete atomic boolean lattice-enriched category. Furthermore, composition distributes over bottom and join, both from left and right:

    P ; False = False ,                   (False distributivity)
    False ; P = False ,
    P ; (Q ∪ R) = P ; Q ∪ P ; R ,         (∪ distributivity)
    (P ∪ Q) ; R = P ; R ∪ Q ; R .

Composition does not, however, distribute over meet or over top (Exercise 9.6).

9.3 Coercion Operators

We can construct relations from predicates and functions using coercion operations. Let p : Σ → Bool and f : Σ → Γ. We define the test relation |p| : Σ ↔ Σ and the mapping relation |f| : Σ ↔ Γ, by

    |p|.σ.γ ≜ (σ = γ ∧ p.σ) ,     (test)
    |f|.σ.γ ≜ (f.σ = γ) .         (mapping)

Thus, |p| is a subset of the identity relation and holds where p is satisfied, while |f| is the relation that describes the function f.

These operations are useful when we want to express conditions involving predicates, functions, and relations. Using the |·| operators, we can often write the conditions on the relational level. For example, the expression p.σ ∧ R.σ.γ can be written as (|p| ; R).σ.γ, as shown by the following derivation:

      (|p| ; R).σ.γ
    ≡ {definition of composition}
      (∃σ' • |p|.σ.σ' ∧ R.σ'.γ)
    ≡ {definition of test relation}
      (∃σ' • σ = σ' ∧ p.σ ∧ R.σ'.γ)
    ≡ {one-point rule}
      p.σ ∧ R.σ.γ

The test constructor |·| : P(Σ) → Rel(Σ, Σ) is easily seen to be monotonic. In addition, it preserves bottom, positive meets and universal joins. However, it does not preserve top or negation. Furthermore, we have the following properties:

    |true| = Id ,
    |p ∩ q| = |p| ; |q| .

It is also easily checked that the mapping relation operator preserves identity and sequential composition:

    |id| = Id ,
    |f ; g| = |f| ; |g| .

The domain and range of a relation are defined as follows:

                     is ⊆    ⊥     ⊤     ∩     ∪     ¬     1     ;
    (λp • |p|)       yes     yes   no    yes   yes   no    yes   yes
    (λf • |f|)       yes     –     –     –     –     –     yes   yes
    (λR • dom.R)     yes     yes   yes   no    yes   no    yes   no
    (λR • ran.R)     yes     yes   yes   no    yes   no    yes   no

TABLE 9.1. Homomorphic properties of coercion operations

    dom.R.σ ≜ (∃γ • R.σ.γ) ,     (domain)
    ran.R.γ ≜ (∃σ • R.σ.γ) .     (range)

The operations dom.R and ran.R, which give the domain and range of a relation R, are both monotonic with respect to the relation ordering, and they are bottom, top, and join homomorphic (but not meet homomorphic). These homomorphism properties are summarized in Table 9.1.

Partial Functions

A relation R : Σ ↔ Γ is said to be deterministic (or functional) if every σ ∈ Σ is related to at most one γ ∈ Γ, i.e., if

    (∀σ γ γ' • R.σ.γ ∧ R.σ.γ' ⇒ γ = γ') .    (deterministic relation)

Deterministic relations model partial functions, which thus form a subset of relations. The empty relation False is a partial function, as are the test |p| and map |f|. Composition of relations preserves determinism, and so does intersection. Relations described in terms of these constants and operations are therefore always deterministic.
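For finite state spaces, the relational operations of this section can be prototyped directly. The Haskell sketch below is our own encoding (a relation is represented extensionally as a list of pairs, and all the function names are ours): it implements composition, inverse, the test and mapping coercions, domain, range, and a determinism check, and tests the property |p ∩ q| = |p| ; |q| on a small universe.

    import Data.List (nub)

    -- A finite relation between a and b, represented as a set of pairs.
    type Rel a b = [(a, b)]

    compR :: (Eq a, Eq b, Eq c) => Rel a b -> Rel b c -> Rel a c   -- P ; Q
    compR p q = nub [ (s, d) | (s, g) <- p, (g', d) <- q, g == g' ]

    invR :: Rel a b -> Rel b a                                     -- inverse
    invR r = [ (g, s) | (s, g) <- r ]

    test :: [a] -> (a -> Bool) -> Rel a a                          -- |p| over a universe
    test univ p = [ (s, s) | s <- univ, p s ]

    mapping :: [a] -> (a -> b) -> Rel a b                          -- |f| over a universe
    mapping univ f = [ (s, f s) | s <- univ ]

    domR :: Eq a => Rel a b -> [a]                                 -- dom.R
    domR r = nub (map fst r)

    ranR :: Eq b => Rel a b -> [b]                                 -- ran.R
    ranR r = nub (map snd r)

    deterministic :: (Eq a, Eq b) => Rel a b -> Bool
    deterministic r = and [ g == g' | (s, g) <- r, (s', g') <- r, s == s' ]

    main :: IO ()
    main = do
      let univ = [0 .. 9] :: [Int]
          p n  = even n
          q n  = n > 4
      print (test univ (\n -> p n && q n) == compR (test univ p) (test univ q))
      print (deterministic (mapping univ (* 2)))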

9.4 Relational Assignment

The assignment notation is very convenient for describing functions that change the values of attributes of a state. We can extend the assignment notation in a rather straightforward way to relations between states with attributes. Assume that x : Σ → Γ is an attribute and b a boolean expression over Σ with possible free occurrences of a fresh variable x' : Γ. We define the relational assignment (x := x' | b) as the following state relation:

    (x := x' | b).σ ≜ {set.x.x'.σ | b.σ} .    (relational assignment)

An example of a relational assignment is

    (x := x' | x' ≥ x + y) .

This assignment relates states σ and σ' = set.x.x'.σ whenever x' satisfies x' ≥ val.x.σ + val.y.σ. Note that we could as well write this assignment as (x := z | z ≥ x + y). The use of primed variables in relational assignments is merely a traditional way of indicating new values for program variables.

The above definition of the relational assignment is slightly awkward to use in proofs, so we often use the following alternative characterization:

Lemma 9.3  The relational assignment has the following property:

    (x := x' | b).σ.σ' ≡ (∃x' • σ' = set.x.x'.σ ∧ b.σ) .

Proof  The proof is a simple derivation:

      (x := x' | b).σ.σ'
    ≡ {definition of relational assignment}
      σ' ∈ {set.x.x'.σ | b.σ}
    ≡ {set membership}
      (∃x' • σ' = set.x.x'.σ ∧ b.σ)
□

The relational assignment is also easily generalized to multiple relational assignment, which permits two or more state components to be changed simultaneously:

    (x1,...,xm := x1',...,xm' | b).σ ≜ {(set.x1.x1' ; ··· ; set.xm.xm').σ | b.σ} ,

where we assume that x1,...,xm are distinct attributes.

Basic Properties of Relational Assignment

The relational assignment gives us a convenient way of expressing relations in terms of program variables. When working with relational assignments, we can use properties of relations, but in order to make this work smoothly, we need properties stated explicitly in terms of assignment statements. We start with a number of lattice-oriented homomorphism properties that reduce concepts from the level of relations to the level of boolean expressions inside relational assignments:

Theorem 9.4  Assume that x is a list of program variables and that b and c are boolean expressions. Then

(a) var x ⊢ (x := x' | F) = False .

(b) var x ⊢ (x := x' | b ∧ c) = (x := x' | b) ∩ (x := x' | c) .

(c) var x ⊢ (x := x' | b ∨ c) = (x := x' | b) ∪ (x := x' | c) .

(d) var x ⊢ (x := x' | b) ⊆ (x := x' | c) ≡ (∀x' • b ⊆ c) .

Proof  We show the proof of case (d), for the case when x is a single state attribute. The generalization to the case when x is a list of variables is straightforward.

      (x := x' | b) ⊆ (x := x' | c)
    ≡ {definition of ⊆, Lemma 9.3}
      (∀σ γ • (∃x' • γ = set.x.x'.σ ∧ b.σ) ⇒ (∃x' • γ = set.x.x'.σ ∧ c.σ))
    ≡ {quantifier rules}
      (∀σ γ x' • γ = set.x.x'.σ ∧ b.σ ⇒ (∃x' • γ = set.x.x'.σ ∧ c.σ))
    ≡ {one-point rule (rename captured variable)}
      (∀σ x' • b.σ ⇒ (∃x'' • set.x.x'.σ = set.x.x''.σ ∧ c[x' := x''].σ))
    ≡ {state attribute properties say set.x.x'.σ = set.x.x''.σ ≡ x' = x''}
      (∀σ x' • b.σ ⇒ (∃x'' • x' = x'' ∧ c[x' := x''].σ))
    ≡ {one-point rule}
      (∀σ x' • b.σ ⇒ c.σ)
    ≡ {pointwise extension}
      (∀x' • b ⊆ c)

Cases (a)–(c) are proved in a similar style. □

The rules for meet and join are easily extended to arbitrary meets and joins:

    var x ⊢ (x := x' | (∀i ∈ I • bi)) = (∩ i ∈ I • (x := x' | bi)) ,
    var x ⊢ (x := x' | (∃i ∈ I • bi)) = (∪ i ∈ I • (x := x' | bi)) .

However, there are no similar results for negation or top. The rules in Theorem 9.4 require that the relational assignments involved all update exactly the same variables. When reasoning about assignments that work on different collections of variables we can use the following rule to make the assignments compatible:

Lemma 9.5  Assume that x and y are lists of program variables and that b is a boolean expression in which y' does not occur free. Then

    var x, y ⊢ (x := x' | b) = (x, y := x', y' | b ∧ y' = y) .

Proof  The proof follows the idea of the proof of Theorem 9.4, rewriting with definitions and using state attribute properties. □

Next consider properties of sequential composition and identity:

Theorem 9.6 Assume that x is a list of attributes. Then

(a) var x ⊢ (x := x' | x' = x) = Id .

(b) var x ⊢ (x := x' | b) ; (x := x' | c) = (x := x' | (∃x'' • b[x' := x''] ∧ c[x := x''])) .

The proof is again a straightforward application of definitions and state attribute properties, so we omit it. Finally, we show how the coercions between predicates and functions on the one hand and relations on the other hand are reflected in the assignment notation. We omit the proof, since it is done the same way as in the previous lemmas.

Theorem 9.7  Assume that x is a list of program variables, e is an expression, and b and c are boolean expressions. Furthermore, assume that x' does not occur free in b or e. Then

(a) var x ⊢ |b| = (x := x' | b ∧ x' = x) .

(b) var x ⊢ |x := e| = (x := x' | x' = e) .

(c) var x ⊢ dom.(x := x' | c) = (∃x' • c) .

Derived Properties of Relational Assignments

From Theorem 9.6 follow two rules for independent relational assignment.

Corollary 9.8  Assume that x and y are disjoint (lists of) attributes.

(a) If y' does not occur free in b, and x' does not occur free in c, then

      var x, y ⊢ (x := x' | b) ; (y := y' | c) = (x, y := x', y' | b ∧ c[x := x']) .

(b) If x and x' do not occur free in c, and y and y' do not occur free in b, then

      var x, y ⊢ (x := x' | b) ; (y := y' | c) = (y := y' | c) ; (x := x' | b) .

Two other special cases of Theorem 9.6 (b) are the following rules for adding leading and trailing assignments:

Corollary 9.9
(a) If x' and y' are not free in e, then

      var x, y ⊢ (x, y := x', y' | b[x := e]) = |x := e| ; (x, y := x', y' | b) .

(b) If x' is not free in b and y' is not free in e, then

      var x, y ⊢ (x, y := x', y' | b ∧ x' = e[y := y']) = (y := y' | b) ; |x := e| .

Example: Dividing a Problem into Subproblems

The rule in Theorem 9.6 (b) can be used to divide a problem into two (smaller) subproblems. The following simple example illustrates this:

    (x, y := x', y' | factor.x'.x ∧ factor.x'.y ∧ m ≤ x' · y' ≤ n)
      = (x, y := x', y' | factor.x'.x ∧ factor.x'.y) ; (y := y' | m ≤ x · y' ≤ n) .

The original assignment requires us to set x and y to new values x' and y' such that x' is a factor in both x and y and the product x' · y' is within certain bounds. This can be done by first finding a suitable value for x (and allowing y to change arbitrarily) and then finding a suitable value for y. The following derivation justifies the equality:

      (x, y := x', y' | factor.x'.x ∧ factor.x'.y) ; (y := y' | m ≤ x · y' ≤ n)
    = {make assignments compatible}
      (x, y := x', y' | factor.x'.x ∧ factor.x'.y) ;
      (x, y := x', y' | x' = x ∧ m ≤ x · y' ≤ n)
    = {composition of relational assignments}
      (x, y := x', y' |
        (∃x'' y'' • factor.x''.x ∧ factor.x''.y ∧ x' = x'' ∧ m ≤ x'' · y' ≤ n))
    = {one-point rule, vacuous quantification}
      (x, y := x', y' | factor.x'.x ∧ factor.x'.y ∧ m ≤ x' · y' ≤ n)
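The same decomposition can be played through on concrete numbers. In the Haskell sketch below (our own encoding; a state is a pair of the current values of x and y, a relation maps a state to the list of its possible successor states, and the candidate range vals is an artefact of the finite test, not part of the specification), the one-step specification is compared with the composition of the two smaller steps on a sample state.

    import Data.List (sort, nub)

    type State = (Int, Int)           -- (value of x, value of y)
    type Rel   = State -> [State]     -- a state relation, Sigma -> P(Sigma)

    -- Forward composition of relations: (r ; s).
    compR :: Rel -> Rel -> Rel
    compR r s sigma = nub [ sigma'' | sigma' <- r sigma, sigma'' <- s sigma' ]

    factorOf :: Int -> Int -> Bool
    factorOf d k = d /= 0 && k `mod` d == 0

    vals :: [Int]                     -- candidate new values (finite, for testing)
    vals = [1 .. 30]

    -- (x, y := x', y' | factor.x'.x /\ factor.x'.y /\ m <= x'*y' <= n)
    spec :: Int -> Int -> Rel
    spec m n (x, y) =
      [ (x', y') | x' <- vals, y' <- vals
                 , factorOf x' x, factorOf x' y, m <= x' * y', x' * y' <= n ]

    -- Step 1: (x, y := x', y' | factor.x'.x /\ factor.x'.y)
    step1 :: Rel
    step1 (x, y) = [ (x', y') | x' <- vals, y' <- vals, factorOf x' x, factorOf x' y ]

    -- Step 2: (y := y' | m <= x * y' <= n), with x left unchanged
    step2 :: Int -> Int -> Rel
    step2 m n (x, _) = [ (x, y') | y' <- vals, m <= x * y', x * y' <= n ]

    main :: IO ()
    main = do
      let sigma  = (12, 18)
          (m, n) = (10, 20)
      print (sort (spec m n sigma) == sort (compR step1 (step2 m n) sigma))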

9.5 Relations as Programs

Relations are more expressive than (total) functions and are therefore better suited to model programs. The simplest program model (apart from a total function) is that of a partial function, which permits us to say that a program is defined only for certain initial states. For instance, we might want to say that the assignment x := x/y is defined only in an initial state where y ≠ 0. Similarly, we can model nontermination of a program with undefinedness. This is a rather classical approach to modeling programs. With relations we can also model nondeterminism. In this case, a program may have two or more possible final states, but we do not know which one is going to be chosen (demonic nondeterminism), or we can ourselves choose the final state (angelic nondeterminism).

We can also make the syntax of relations more program-like by adding conditional state relations, in the same way as we defined conditional state transformers. Consider first the relation |p| ; R, where p is a predicate. The intuition is that this relation maps initial states in p as R does, but initial states outside p are mapped to the empty set. We then define the conditional state relation if p then P else Q fi by using this as a building block:

    if p then P else Q fi ≜ |p| ; P ∪ |¬ p| ; Q .

Note that conditional relations preserve determinism (Exercise 9.4). We can use a guarded relation to define a conditional relation with an arbitrary number of alternatives that need not be mutually exclusive:

|p1| ; R1 ∪···∪|pm | ; Rm .

Here pi and pj may both hold for some state, and thus both relations are applicable in that state. We can also define iteration with relations. We define the reflexive transitive closure of a relation R, denoted by R∗, by

    R∗ ≜ (∪ i | i ≥ 0 • Rⁱ) ,

where

    R⁰ ≜ Id ,
    Rⁱ⁺¹ ≜ R ; Rⁱ ,    i = 0, 1, 2, ... .

We can then define an iterated relation of the form while p do R od by

    while p do R od ≜ (|p| ; R)∗ ; |¬ p| .

It is easy to see that (while p do R od).σ.σ' holds if and only if there exists a sequence of states σ = σ0, σ1, ..., σn = σ' such that p.σi ∧ R.σi.σi+1 holds for i = 0, 1, ..., n − 1 and ¬ p.σn holds (i.e., if σ' can be reached from σ by iterating R as long as condition p holds). If condition p always holds for states reachable by R from some initial state σ, then (while p do R od).σ will be empty. Thus, we model nontermination with partiality.

Let us collect the above constructs into a language in which to express state relations. We define this as follows:

    R ::= Id | |b| | (x := x' | b) | R1 ; R2 | R1 ∪ R2 | R1∗ .

Here b is a boolean expression, while x is a list of state attributes. This language permits nondeterministic choice in the assignment, the union, and the iteration. By replacing these with functional assignment, deterministic conditional, and while loops, we have a standard deterministic programming language where the constructs are interpreted as partial functions. The syntax for this is as follows:

    R ::= Id | |b| | |x := e| | R1 ; R2 | if b then R1 else R2 fi | while b do R od .

Here the test relation and the iteration introduce partiality into the language. Blocks and procedures were defined for state transformers and can obviously also be defined for relations in a similar manner. Thus, we can introduce a relational block with local variables (see Exercise 9.12). Similarly, we can define a relational procedure, i.e., a name for a relation that can be called using reference and value parameters. However, we will not make use of these constructs on the relational level. We return to blocks and procedures on the program statement level in later chapters.
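For finite relations, both the reflexive transitive closure and the while construct can be computed by fixpoint iteration. The following Haskell sketch is our own encoding (relations as finite lists of pairs over a given universe; the names star and while are ours): it implements R∗ and while p do R od as defined above and checks that while x > 0 do x := x − 1 od relates every initial state in a small universe to the final state 0.

    import Data.List (nub, sort)

    type Rel a = [(a, a)]               -- a finite homogeneous relation

    idR :: [a] -> Rel a                 -- Id over a finite universe
    idR univ = [ (s, s) | s <- univ ]

    compR :: Eq a => Rel a -> Rel a -> Rel a            -- R ; S
    compR r s = nub [ (a, c) | (a, b) <- r, (b', c) <- s, b == b' ]

    unionR :: Eq a => Rel a -> Rel a -> Rel a
    unionR r s = nub (r ++ s)

    testR :: [a] -> (a -> Bool) -> Rel a                -- |p|
    testR univ p = [ (s, s) | s <- univ, p s ]

    -- Reflexive transitive closure: iterate Id, Id ∪ R, Id ∪ R ∪ R² ... until stable.
    star :: Ord a => [a] -> Rel a -> Rel a
    star univ r = go (idR univ)
      where
        go acc | sort next == sort acc = acc
               | otherwise             = go next
          where next = unionR acc (compR acc r)

    -- while p do R od  =  (|p| ; R)* ; |not p|
    while :: Ord a => [a] -> (a -> Bool) -> Rel a -> Rel a
    while univ p r =
      compR (star univ (compR (testR univ p) r)) (testR univ (not . p))

    main :: IO ()
    main = do
      let univ = [0 .. 5 :: Int]
          dec  = [ (x, x - 1) | x <- univ, x > 0 ]       -- x := x - 1, as a relation
          loop = while univ (> 0) dec
      print (sort loop == sort [ (x, 0) | x <- univ ])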

9.6 Correctness and Refinement

Let us go back to our original goal: to model the behavior of agents whose interaction is bound by contracts. Relations allow us to model contracts where only one agent is permitted to make choices.

Consider first the situation where only our agent can make choices. We assume that the contract statement is expressed as a state relation in the language above, where we interpret R1 ∪ R2 as R1 ⊔ R2. It is then clear that our agent can establish the postcondition q in initial state σ with relation R if

    (∃σ' • R.σ.σ' ∧ q.σ') .

This therefore defines σ {| R |} q when all choices are angelic. To achieve the postcondition, it is sufficient that some state σ' that satisfies q can be reached from σ with R. If there is no σ' such that R.σ.σ', then our agent clearly cannot establish the postcondition and has to breach the contract.

Consider then the situation where only the other agent can make any choices. In this case, we interpret R1 ∪ R2 as R1 ⊓ R2. Our agent can then be sure that condition q is established in initial state σ with R only if

    (∀σ' • R.σ.σ' ⇒ q.σ') .

If this holds, the other agent cannot avoid establishing the postcondition q, because every state reachable from σ will establish this postcondition. If there is no σ' such that R.σ.σ', then the other agent has to breach the contract, so our agent trivially satisfies the contract.

Thus, there is an angelic and a demonic interpretation of relations in our language for state relations. We can choose to let ∪ model either kind of nondeterminism. However, we have to fix one of these interpretations for ∪, so we cannot model both kinds of nondeterminism at the same time with relations.

We obviously also get two different notions of correctness for state relations. We say that R is angelically correct with respect to precondition p and postcondition q if

    p.σ ⇒ (∃σ' • R.σ.σ' ∧ q.σ')

holds for each σ. This is the same as requiring (∀σ • p.σ ⇒ R.σ ∩ q ≠ ∅) or ran.(|p| ; R) ∩ q ≠ ∅. Dually, we say that R is demonically correct with respect to p and q if

    p.σ ⇒ (∀σ' • R.σ.σ' ⇒ q.σ') .

This is again the same as requiring that (∀σ • p.σ ⇒ R.σ ⊆ q) or ran.(|p| ; R) ⊆ q.

Deterministic Programs

If the relation R is deterministic, then we have a partial function. A choice between two or more alternatives is then not possible, but there is still the possibility that R is not defined for some initial states. In this case the angelic correctness requirement states that for any initial state σ that satisfies p there must exist a final state σ' reachable from σ such that q.σ' holds. In other words, the execution must terminate in a final state that satisfies q. Angelic nondeterminism for deterministic relations thus amounts to what is more commonly known as total correctness.

On the other hand, demonic correctness states that for any initial state σ that satisfies p, if there exists a final state σ' reachable from σ, then q.σ' must hold. Demonic nondeterminism for deterministic relations thus captures the notion of partial correctness.

The difference between these two notions of correctness for deterministic programs is whether we consider the situation in which there is no final state as good or bad. If we have an angelic interpretation, then our agent is supposed to carry out the execution, so a missing final state is a breach of contract and is thus bad. In the demonic interpretation the other agent is making the choices, so a missing final state is good for our agent.
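Both correctness notions are directly executable for finite relations. In the Haskell sketch below (our own encoding; a relation maps a state to the list of its possible successors), angelic correctness demands some successor in q and demonic correctness demands that all successors lie in q. The example relation x := x − 1 is deterministic but defined only when x ≥ 1, so for the chosen pre- and postcondition it is demonically (partially) correct but not angelically (totally) correct.

    type Rel a = a -> [a]     -- a state relation as a function to possible next states

    angelicallyCorrect, demonicallyCorrect
      :: [a] -> (a -> Bool) -> Rel a -> (a -> Bool) -> Bool
    angelicallyCorrect univ p r q =
      and [ any q (r s) | s <- univ, p s ]            -- some final state satisfies q
    demonicallyCorrect univ p r q =
      and [ all q (r s) | s <- univ, p s ]            -- every final state satisfies q

    -- A partial, deterministic relation: x := x - 1, defined only when x >= 1.
    decR :: Rel Int
    decR x = [ x - 1 | x >= 1 ]

    main :: IO ()
    main = do
      let univ = [-3 .. 3]
          pre  = const True
          post = (>= 0)
      print (demonicallyCorrect univ pre decR post)   -- True  (partial correctness)
      print (angelicallyCorrect univ pre decR post)   -- False (no final state when x < 1)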

Refinement

Since we have two notions of correctness, there are also two different kinds of refinement for state relations, depending on whether we assume that the nondeterminism expressed by the relation is angelic or demonic. If it is angelic, then we can define refinement by

    R ⊑ R' ≡ R ⊆ R' .    (angelic refinement)

The definition of refinement says that R ⊑ R' holds iff σ {| R |} q ⇒ σ {| R' |} q holds for every σ and every q. With an angelic interpretation this is the case if and only if R ⊆ R' (Exercise 9.10). Intuitively, this means that a refinement can only increase the set of final states among which our agent can choose and hence makes it easier for it to satisfy the contract. In a similar way a demonic interpretation of the nondeterminism in relations implies that refinement is defined by

    R ⊑ R' ≡ R ⊇ R' .    (demonic refinement)

In this case the refinement decreases the set of final states among which the other agent can choose, and hence it can only make it easier for our agent to satisfy the contract.

Relations as Specifications

Relations are very useful for describing the intended effect of programs, i.e., to give specifications. The nondeterminism is then interpreted as a don't care nondeterminism: the implementer is given freedom to choose any of the permitted outcomes. This is demonic nondeterminism, where the user is prepared for any choice that the system makes. Consider our example specification from the introduction, computing the square root of x with precision e. This can be expressed by the relational assignment (x := y | −e < x − y² < e). Any program that yields a final result that satisfies this relation is then acceptable. We might be more specific and request the square root to be computed only if x ≥ 0 and e > 0 initially. This is captured by the following relation:

    if x ≥ 0 ∧ e > 0 then (x := y | −e < x − y² < e) else True fi .

We do not require anything of the final state when x < 0 or e ≤ 0. We can also be more specific if we want, and state that in this case nothing should be done. This would correspond to the specification

    if x ≥ 0 ∧ e > 0 then (x := y | −e < x − y² < e) else Id fi .

Looking at a relation as a specification, a correct implementation is the same as a (demonic) refinement of the relation. Any relation that is more deterministic than the specification is a correct implementation. In particular, we would have that

    if x ≥ 0 ∧ e > 0 then (x := y | −e < x − y² < e) else True fi
      ⊇ if x ≥ 0 ∧ e > 0 then |x := √x| else Id fi .

9.7 Summary

This chapter introduced state relations and described their lattice- and category-theoretic properties. We showed how relations can be used to model programs where there is only one kind of nondeterminism, either angelic or demonic. We also looked at the notions of correctness and refinement for both these interpretations of state relations. For deterministic programs, angelic correctness is the same as total correctness, and demonic correctness is the same as partial correctness.

There is a long tradition of modeling programs as relations. An early textbook where this approach is described was written by de Bakker [36]. A relational view is also at the heart of formalisms such as VDM [94] and Z [133]. The origin of the relational view is in the relation calculus of Alfred Tarski [135]. Relational assignments are useful for specifying the required behavior of programs without overspecification. We described a simple programming language based on relations that is more expressive than the straight-line programs and discussed the notions of correctness and refinement for programs described in this language. The notion of a relational assignment is based on the nondeterministic assignment statement originally proposed by Back [6] and further studied in [12].

The relational model is restricted in that we cannot have both angelic and demonic nondeterminism at the same time in a program. Neither does it permit us to express either kind of nondeterminism together with nontermination. This shows that the relational model has its limitations, and that there are situations in which we need to choose a more general model for programs, such as the predicate transformer model that we will describe shortly. However, for many purposes, the relational model is quite sufficient.

9.8 Exercises

9.1  Show that any relation space is an atomic lattice.

9.2  Show that the identity relation is deterministic and that composition of two deterministic relations is deterministic.

9.3  Prove Theorem 9.4 (a). Then explain why there is no rule of the form (x := x' | T) = True.

9.4  Show that the conditional relation preserves determinism; i.e., if relations Q and R are deterministic, then any relation of the form if p then Q else R fi is deterministic.

9.5  Verify that Rel is a category.

9.6  Show that in the category of state relations, composition distributes over bottom and join both from the left and from the right. Give a counterexample showing that composition does not distribute over meet or over top.

9.7  Show that if R is a state relation, then the following properties hold:

       |dom.R| ⊆ R ; R⁻¹ ,
       |ran.R| ⊆ R⁻¹ ; R .

9.8  Prove Corollary 9.9.

9.9  Prove the following:

       R∗ ; R∗ = R∗ ,
       (R∗)∗ = R∗ .

9.10 Show that relations R and R' satisfy

       (∀q σ • {R}.q.σ ⇒ {R'}.q.σ)

     if and only if R ⊆ R'.

9.11 Carry out the detailed proof of Corollary 9.8.

9.12 Define a relational block with local variable y as follows:

       begin var y | b ; R end ≜ |begin| ; (y := y' | b[y := y']) ; |end| ,

     where the initialization predicate b is a predicate over the global variables and y. Show that begin var y | b ; Id end = Id.

10

Types and Data Structures

So far, we have assumed a fixed collection of types, type operators, constants and axioms in the logic. We next consider how to extend our theories to model new domains of discourse by defining new types with associated constants and inference rules. Either we can postulate a new type by giving a collection of axioms and inference rules that determine the properties of the type, or we can construct a new type as being isomorphic to a subset of some existing type. We describe both these methods below. We use the axiomatic approach to introduce the important type of natural numbers. As another example, we show how to define lists and disjoint sums. The latter can be seen as the category-theoretic dual to cartesian products. The constructive approach is illustrated by showing an alternative definition of natural numbers. We then look at how to define the standard data structures commonly used in programming languages: records, arrays and pointers. We show how to formalize their properties as rather straightforward extensions of the way we formalized program variables. We also show how to reason about such structures.

10.1 Postulating a New Type

We postulate a new type by introducing a new atomic type or type operator, extend- ing the signature of the logic with the operations needed for the type and extending the deductive system with new axioms and inference rules that postulate the prop- erties of these operations. We have already shown how to postulate a product type. Here we show how to define natural numbers, lists, and disjoint sums. The first is an example of an atomic type; the two others are type constructors like the product. 168 10. Types and Data Structures

Natural Numbers

The type Nat of natural numbers is an atomic type. It has constants 0 : Nat and suc : Nat → Nat (the successor function). We introduce the type Nat by adding 0 and suc to the signature and postulating Peano's axioms:

    ⊢ (∀n • ¬ (suc.n = 0)) ,
    ⊢ (∀m n • suc.m = suc.n ⇒ m = n) ,
    ⊢ (∀p • p.0 ∧ (∀n • p.n ⇒ p.(suc.n)) ⇒ (∀n • p.n))

(from now on, we will generally write axioms and theorems without outermost universal quantifiers and without leading turnstile if there are no assumptions). The first axiom states that 0 is not a successor. The second axiom states that the successor function is injective. The last axiom is the induction axiom. Peano's axioms are sufficient for developing arithmetic in higher-order logic, because addition and multiplication (and other operations on naturals) can be defined as new constants.

Note that the induction principle is stated as a single theorem in higher-order logic, using quantification over all predicates p. In a first-order logic, we would need to use an axiom scheme to define induction. The numerals 1, 2, ... can be seen as a (potentially) infinite collection of new constants, each with its own definition (1 = suc.0, 2 = suc.1, etc.).
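The same inductive structure can be written down directly as an algebraic datatype. The Haskell sketch below is our own illustration, not the logical development used in the text: the two constructors are distinct and injective by construction, and addition is defined by exactly the primitive recursion that the induction axiom licenses.

    data Nat = Zero | Suc Nat deriving (Eq, Show)

    -- Addition by primitive recursion on the first argument.
    add :: Nat -> Nat -> Nat
    add Zero    n = n
    add (Suc m) n = Suc (add m n)

    -- Numerals as abbreviations, as in the text: 1 = suc.0, 2 = suc.1, ...
    one, two, three :: Nat
    one   = Suc Zero
    two   = Suc one
    three = Suc two

    main :: IO ()
    main = print (add one two == three)   -- True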

Sequences

A sequence is a finite ordered collection of elements that are all of the same type. We shall here introduce sequences by postulating a number of axioms, much in the same way as we introduced the natural numbers. There are two basic constants over sequences: the empty sequence ⟨⟩ : Seq of Σ and the (infix) sequence constructor & : (Seq of Σ) → Σ → (Seq of Σ), which extends a sequence with an element. The postulates are the following:

    ¬ (s&x = ⟨⟩) ,
    s&x = s'&x' ⇒ s = s' ∧ x = x' ,
    p.⟨⟩ ∧ (∀s x • p.s ⇒ p.(s&x)) ⇒ (∀s • p.s) .

Note how these axioms resemble the axioms for the natural numbers. In addition to the two basic constants, we introduce two other constants, last : Seq of Σ → Σ and butlast : Seq of Σ → Seq of Σ, with axioms

    last.(s&x) = x ,
    butlast.(s&x) = s .

Since all functions are total, last applied to ⟨⟩ will return some value, about which we know nothing. In practice, we use angle bracket notation for concrete sequences, so ⟨1, 2, 3⟩ stands for ((⟨⟩&1)&2)&3.
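A snoc-style sequence type with these constructors can be sketched in Haskell as follows (our own illustration; the operator name :& stands in for &). Unlike in higher-order logic, where all functions are total, last and butlast can here simply be left undefined on the empty sequence.

    data Seq a = Empty | Seq a :& a      -- <> and the constructor &
    infixl 5 :&

    lastS :: Seq a -> a
    lastS (_ :& x) = x
    lastS Empty    = error "last of empty sequence"

    butlast :: Seq a -> Seq a
    butlast (s :& _) = s
    butlast Empty    = error "butlast of empty sequence"

    -- <1, 2, 3> = ((<> & 1) & 2) & 3
    example :: Seq Int
    example = Empty :& 1 :& 2 :& 3

    main :: IO ()
    main = print (lastS example)         -- 3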

Sum Type

Dually to products, we extend the type structure of higher-order logic with sum types, standing for disjoint unions. Sums are added through a binary type operator +, written infix. Exactly as for products, we assume that + associates to the right, so Σ + Γ + ∆ stands for Σ + (Γ + ∆). Furthermore, we assume that + has higher precedence than → but lower than ×. Elements of the sum are constructed from elements of Σ and Γ using the injection functions inl : Σ → Σ + Γ and inr : Γ → Σ + Γ. In addition to the injection functions, we introduce four other new constants for handling sums. The test functions isl : Σ + Γ → Bool and isr : Σ + Γ → Bool tell us where an element in the sum originates, and the projection functions outl : Σ + Γ → Σ and outr : Σ + Γ → Γ project an element from the sum back into the component where it originates. Thus the sum type is characterized by the following property, for an arbitrary δ : Σ + Γ:

    (∃σ : Σ • δ = inl.σ) ∨ (∃γ : Γ • δ = inr.γ) .    (sum characterization)

    isl.(inl.σ) ≡ T ,    isl.(inr.γ) ≡ F ,    (left test)
    isr.(inl.σ) ≡ F ,    isr.(inr.γ) ≡ T ,    (right test)

    outl.(inl.σ) = σ ,    (left project)
    outr.(inr.γ) = γ .    (right project)

Note that outl : Σ + Γ → Σ and outr : Σ + Γ → Γ are not uniquely defined by these equations; we do not say anything about the value of outl.(inr.γ) or outr.(inl.σ). Equality on sums is as follows:

    inl.σ = inl.σ' ≡ σ = σ' ,     (sum equality)
    inl.σ = inr.γ ≡ F ,
    inr.γ = inr.γ' ≡ γ = γ' .

If Σ and Γ are each equipped with an ordering, then we assume that the sum Σ + Γ is ordered as follows:

    inl.σ ⊑ inl.σ' ≡ σ ⊑ σ' ,     (sum ordering)
    inl.σ ⊑ inr.γ ≡ T ,
    inr.γ ⊑ inl.σ ≡ F ,
    inr.γ ⊑ inr.γ' ≡ γ ⊑ γ' .

Thus the orderings within Σ and Γ are preserved, but every element of Σ is smaller than every element of Γ.
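The sum type corresponds closely to Haskell's built-in Either type. The following sketch (our own illustration) spells out inl, inr, the tests and the projections in that setting; as in the text, the projections are simply unspecified (here: an error) on the component where they do not apply.

    inl :: a -> Either a b
    inl = Left

    inr :: b -> Either a b
    inr = Right

    isl, isr :: Either a b -> Bool
    isl (Left _)  = True
    isl (Right _) = False
    isr = not . isl

    outl :: Either a b -> a
    outl (Left x) = x
    outl _        = error "outl applied to a right injection"

    outr :: Either a b -> b
    outr (Right y) = y
    outr _         = error "outr applied to a left injection"

    main :: IO ()
    main = do
      print (isl (inl 'a' :: Either Char Int))     -- True
      print (outr (inr 7  :: Either Char Int))     -- 7

Incidentally, the Ord instance that Haskell derives for Either orders every Left element below every Right element, which matches the sum ordering given above.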

10.2 Constructing a New Type

Another way of extending a theory with a new type is to construct the new type in terms of existing types. We extend higher-order logic with a new type T by asserting that this type is isomorphic to a nonempty subset A of some existing type Σ. The new type T is not a subset of Σ, because all types in higher-order logic must be disjoint. Rather, there is a bijection between the elements of T and the elements in A. Using the bijection, we can then define operations over the new type in terms of already existing operations over the old type. The principle of extension by type definition in higher-order logic allows us to introduce a new type in this way. The resulting extension of the logic is conservative, so this approach also guarantees consistency of the extension. The steps required are the following:

(i)   Specify a subset A of some existing type.
(ii)  Prove that the set A is nonempty.
(iii) Assert that the new type is in one-to-one correspondence with A.

Formally, the logic is extended with a new type by a definition

    T ≅ A ,    (type definition)

where T is the name of the new type and A : Σ → Bool is a closed term such that (∃x • A.x) is a theorem (so A denotes a nonempty set). This extends the theory with a new type T, the new constants abs : Σ → T and rep : T → Σ, and the following theorem:

    ⊢ (∀x : T • abs.(rep.x) = x) ∧ (∀y | y ∈ A • rep.(abs.y) = y) .

This theorem states that the correspondence between the type T and the representing set A is given by the abstraction function abs and the representation function rep. This construction is illustrated in Figure 10.1.

[FIGURE 10.1. Defining a new type T: the representing set A ⊆ Σ, with the abstraction function abs and the representation function rep relating T and A.]

The type Σ can contain uninstantiated types Σ1,...,Σn; in this case T becomes an n-ary type operator.

Unit Type

As a first simple example, we introduce the type Unit, which contains exactly one element. This can be done by choosing the value T : Bool to represent the unique element of Unit. We have that {T} = (λx : Bool • x), so we define the type by

    Unit ≅ (λx : Bool • x) .    (unit type)

There is at least one value in the representation set; i.e., ⊢ (∃x • (λx : Bool • x).x) holds, because

      (∃x • (λx : Bool • x).x)
    ≡ {β conversion}
      (∃x • x)
    ⇐ {∃ introduction with witness T}
      T

Now we can define unit to be the single element of Unit, using an ordinary constant definition:

    unit ≜ (ε x : Unit • T) .

Natural Numbers

As a more elaborate example, we show how the natural numbers themselves can be defined in terms of the individuals. Here we make use of the infinity axiom; we assume that the function f : Ind → Ind is injective but not surjective (see Chapter 7). We will not go through all the details of the construction, since we only want to illustrate how the type-definition mechanism works. The natural numbers can be built so that 0 is represented by an arbitrary element outside the range of f (such an element exists since f is not surjective), i.e., (ε x • x ∉ ran.f). Furthermore, the successor function is represented by f. The subset of Ind that we then choose to represent the natural numbers is the smallest subset that contains (ε x • x ∉ ran.f) and is closed under f. This set N is characterized as

    N = ∩ {A | (ε x • x ∉ ran.f) ∈ A ∧ (∀x • x ∈ A ⇒ f.x ∈ A)} .

The proof that N is nonempty is trivial. We can thus give the type definition:

    Nat ≅ N .    (natural numbers)

[FIGURE 10.2. Constructing the natural numbers Nat: 0, 1, 2, ... are represented inside Ind by iterating f, with suc corresponding to f.]

Now we can define 0 and the successor function using the abstraction and representation functions that the type definition has added:

    0 ≜ abs.(ε x • x ∉ ran.f) ,
    suc.n ≜ abs.(f.(rep.n)) .

This definition is illustrated in Figure 10.2. The Peano postulates can now be proved as theorems rather than asserted as axioms. The first two postulates follow directly from the fact that f is injective (this ensures that the successor function is injective) and not surjective (this ensures that 0 is not a successor). The proof of the induction postulate is more complicated, so we omit it here.
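The construction can be imitated with an abstract type in Haskell (our own illustration, not the formal type-definition mechanism of the text): Integer plays the role of Ind, doubling plays the role of the injective but non-surjective function f, and a newtype hides the representation behind abs and rep. The representing set is then {1, 2, 4, 8, ...}, with 1 (an element outside the range of f) representing 0.

    -- In the role of Ind we take Integer, and as the injective but non-surjective
    -- function f we take doubling (its range misses all odd numbers).
    f :: Integer -> Integer
    f = (* 2)

    -- The new type: a wrapper around the representing subset {1, 2, 4, 8, ...}.
    newtype Nat = Nat Integer deriving (Eq, Show)

    rep :: Nat -> Integer
    rep (Nat i) = i

    -- abs is only meant to be applied to elements of the representing set;
    -- membership is not checked here, since this is just a sketch.
    absN :: Integer -> Nat
    absN = Nat

    zero :: Nat
    zero = absN 1            -- an arbitrary element outside ran f (1 is odd)

    suc :: Nat -> Nat
    suc n = absN (f (rep n))

    toInt :: Nat -> Int      -- only for display: count the doublings
    toInt n = length (takeWhile (> 1) (iterate (`div` 2) (rep n)))

    main :: IO ()
    main = print (map toInt [zero, suc zero, suc (suc zero)])   -- [0,1,2]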

10.3 Record Types

The previous section introduced a number of types that are useful for expressing mathematical properties in general. In programming languages we use types to structure data that are stored in the state. Below we consider such data types (or data structures) in more detail. We will look at record types in this section and consider array types and pointer types in the following sections. We have earlier assumed that a state space is a type with attributes that can be read and changed, and that we describe state-changing functions (state transformers) by giving their effect on the attributes of the state. These attributes are program variables when they satisfy the independence assumptions.

We can also define a type with specific attributes directly, as a record type. Let us define a new m-ary type operator record for each m ≥ 1. Let us write

    Σ = record k1 : Γ1 ; ··· ; km : Γm end

for the definition Σ = record(Γ1,...,Γm) together with the definition of the attributes k1,...,km on Σ, where attribute ki has type (Σ → Γi) × (Γi → Σ → Σ). An element of a record type is called a record. Note that we here assume that the attributes are constants, whereas program variables are modeled as variables. The definition of the access and update operations is as before. We require that the constants k1,...,km satisfy the independence assumptions (a)–(f), i.e., that

    var k1 : Γ1,...,km : Γm

holds. The explicit definition of Σ as a record type means that it is a completely new type, of which we know nothing except that it has the constants k1,...,km. These constants satisfy the independence assumptions, now taken as axioms. A state space may have attributes other than those explicitly declared, but for record types we assume that it has only the declared attributes. We introduce a shorthand for the access and update operations on records, writing

    γ[k] ≜ val.k.γ ,
    γ(k ← a) ≜ set.k.a.γ .

In addition to the independence axioms, we assume one more property of records that we do not assume for states with attributes in general:

    γ = γ' ≡ k1.γ = k1.γ' ∧ ··· ∧ km.γ = km.γ' .    (identification)

This states that two records are equal if and only if they have the same values for all attributes. In other words, a record is identified by its attributes. This is essentially the same property that we assumed for the product type (that a pair is identified by its projections) and is a form of extensionality. This property is not assumed for states with attributes in general, because it would invalidate the monotonicity property for state transformers defined in terms of attributes. The monotonicity property states that if we have proved a property assuming a specific collection of state attributes, then the same property still holds if we assume that the state also has other attributes.

It is not obvious that we actually want to have the identification axiom for record types. This depends on whether or not we want to compare two records for equality. The object-oriented programming paradigm assumes that records are extendable by adding new attributes and that the monotonicity property holds for records. If we follow this paradigm, then we would not assume the identification property. In traditional Pascal-like languages the identification property is assumed to hold, as there is no mechanism for record type extension.

Record Expressions and Record Assignment

Let us now consider expressions that have records as values. The most central record operation is one that computes a new record value from a given record value by changing some specific attribute. Assume that R is a state attribute (i.e., a program variable) of record type Σ, k is an attribute of the record, and e is an expression. We define a record update, denoted by R(k ← e), as follows:

    (R(k ← e)).σ ≜ (R.σ)(k ← e.σ) .    (record update)

In a given state σ, this expression returns a new record γ' that we get by changing the attribute k of the record γ = R.σ to the value e.σ. Thus, both the record and the new value for the attribute are given as expressions on the state. The record update is the pointwise extension of the set.k operation: (R.σ)(k ← e.σ) = set.k.(e.σ).(R.σ). Let us also define a record attribute expression by

    R[k].σ ≜ (R.σ)[k] .    (record attribute expression)

Thus, the record R.σ is first computed, and then the attribute k is determined from the computed record value. This is again the pointwise extension of the val operation on record values. We finally define record assignment by

    R[k] := e ≜ (R := R(k ← e)) .    (record assignment)

Record assignment is thus written as if it were an assignment to the attribute k of the record. In reality, a record assignment is an assignment of a new updated record value to the record variable.

The notation for record attributes varies. In Pascal, the standard notation is R.k, whereas we write this here as R[k], in conformance with the array indexing operation that we will introduce below. The Pascal dot notation cannot be used here because application is the wrong way around; we think of k as the function that is applied to R. The record assignment notation is in conformance with established notation in Pascal and many other languages. A simple example of a record type is an account, which we can define by

Account = record name : String ; amount : Nat end . If we have two record variables A, B : Account, then

    A[amount] := A[amount] + B[amount] ; B[amount] := 0

describes the transfer of all the money in B to A.
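The Account example carries over directly to a Haskell record (our own sketch). Just as in the text, the record assignment A[amount] := A[amount] + B[amount] is really an assignment of a whole updated record value, here written with Haskell's record-update syntax.

    data Account = Account { name :: String, amount :: Int } deriving (Eq, Show)

    -- A[amount] := A[amount] + B[amount] ; B[amount] := 0
    transfer :: (Account, Account) -> (Account, Account)
    transfer (a, b) =
      let a' = a { amount = amount a + amount b }   -- update one attribute, keep the rest
          b' = b { amount = 0 }
      in (a', b')

    main :: IO ()
    main = print (transfer (Account "Alice" 100, Account "Bob" 25))
    -- (Account {name = "Alice", amount = 125}, Account {name = "Bob", amount = 0})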

Records as Constructed Types

We have here defined record types by postulating a collection of constants (the attributes or fields) and postulating that these satisfy the independence assumptions. We could also define records as constructed types. The type record k1 : Γ1 ; ··· ; km : Γm end could be represented by the product type Γ1 × ··· × Γm. We would then define, for i = 1,...,m,

    val.ki.γ = π_i^m.(rep.γ) ,
    set.ki.a.γ = abs.(update_i^m.a.(rep.γ)) .

Here π_i^m denotes the ith component in the tuple with m components, and update_i^m.a denotes the operation that changes the ith component of a tuple to the value a, without changing the other components. These operations are straightforward to define for any particular values of m and i. With these definitions, it is easy to show that the record type satisfies the independence axioms that we have postulated for it above (Exercise 10.9).

10.4 Array Types

An array is used to store a collection of data items of the same type, to be accessed by indices (which usually are natural numbers). We can define a binary type operator array that constructs arrays from given types. Let us write

    Σ = array ∆ of Γ

for Σ = array(∆, Γ). Here ∆ is the index type, and Γ is the component type.

Let us postulate two operations, val : ∆ → Σ → Γ for retrieving the value of an array, and set : ∆ → Γ → Σ → Σ for updating a specific array element. Thus val.x.γ stands for the value of array γ at index x, while set.x.φ.γ stands for the new array that we get from γ by setting the element indexed by x to the value φ. Note the difference here as compared to records and program variables, for which the access functions returned the first or second component of an attribute. Here the access and update functions are really functions on the index set. We use the same abbreviations for arrays as for records:

    γ[x] ≜ val.x.γ ,
    γ(x ← a) ≜ set.x.a.γ .

We will assume that val.x and set.x satisfy the independence axioms for all possible values of x ∈ ∆. In other words, we postulate the axioms in Table 10.1 (x and y are indices and a, b array element values). We also assume extensionality for arrays:

    γ(x ← a)[x] = a ,                                   (a)
    x ≠ y ⇒ γ(x ← a)[y] = γ[y] ,                        (b)
    γ(x ← a)(x ← b) = γ(x ← b) ,                        (c)
    x ≠ y ⇒ γ(x ← a)(y ← b) = γ(y ← b)(x ← a) ,         (d)
    γ(x ← γ[x]) = γ .                                   (e)

TABLE 10.1. Independence requirements for array elements

    γ = γ' ≡ (∀x ∈ ∆ • γ[x] = γ'[x]) .    (identification)

Array Expressions and Array Assignment

Assume that A is an expression of array type and i an expression of index type. We define an array access operation as follows:

    A[i].σ ≜ (A.σ)[i.σ] .    (array access)

This is really the pointwise extension of the val operation:

val.(i.σ ).(A.σ ) = (A.σ )[i.σ ] = A[i].σ . We define array update by

    (A(i ← e)).σ ≜ (A.σ)(i.σ ← e.σ) .    (array update)

Thus, a difference between arrays and records is that for arrays the index may also depend on the state, whereas for records, the index (attribute) is always fixed. In this way arrays permit indirect addressing of data elements, which is not possible with records or global variables. The array update is again the pointwise extension of the set operation. Finally, we define array assignment in the same way as for records:

    A[i] := e ≜ (A := A(i ← e)) .    (array assignment)

The above formalization of arrays assumes that a type is given as the index type. Thus, array Nat of Bool is an array that permits any natural number to be used as an index, so the array itself is an infinite data structure. Programming languages usually constrain the indices to some subset of the index type: we would define something like array 10 ... 120 of Nat to indicate that only indices in the range 10 ≤ x ≤ 120 are permitted. An array specified in this way is defined by adding the index constraint to the index axioms. As an example, if the index range is m ... n, then the second axiom becomes

m ≤ x ≤ n ∧ m ≤ y ≤ n ∧ x ≠ y ⇒ γ(x ← a)[y] = γ[y] . The identification axiom would then state that two arrays are the same if they have the same values for all indices in the permitted range.

Arrays as Constructed Types We can also define Φ = array Δ of Γ as a constructed type. We choose Δ → Γ as the representation type. Then we define

val.x.γ = rep.γ.x , set.x.φ.γ = abs.(update.x.φ.(rep.γ )) , where we define the update operation for functions by

update.x.a.f = (λy • if x = y then a else f.y fi). It is easy to show that the independence axioms above are satisfied with these definitions (Exercise 10.9). This shows that we could in fact consider functions Δ → Γ directly as arrays. This is, however, only the simplest form of arrays. As soon as we put more restrictions on the array, such as restricting the indices to some range, the representation becomes more involved, and the advantages of defining an array as a type of its own become more substantial. For instance, we could choose to represent a bounded array array m ... n of Γ by the type (Nat → Γ) × Nat × Nat, where the first component determines the value of the array at different indices, the second component records the lower bound, and the last component records the upper bound. We could also add operations min and max for the arrays that return the lower and upper limit for the permitted indices. This construction of arrays would also permit us to define dynamic arrays. In this case, we would in addition to the usual array operations also have operations that could change the index range of the array. For instance, we could define an operation incmax that extends the maximum index of the array by one, and an operation decmax that decreases the maximum permitted index by one, and similarly for changing the minimum permitted index.
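To make the functions-as-arrays view concrete, here is a small Haskell sketch (ours, not part of the calculus; all names such as Array', val, set and update are assumptions introduced for illustration). It represents an array over index type i and component type a as a plain function and checks two of the independence axioms on sample values.

```haskell
-- A minimal sketch, assuming the functions-as-arrays representation above.
type Array' i a = i -> a

-- update.x.a.f = (\y -> if x = y then a else f.y fi)
update :: Eq i => i -> a -> Array' i a -> Array' i a
update x a f = \y -> if x == y then a else f y

val :: i -> Array' i a -> a
val x g = g x

-- for this representation, rep and abs are both the identity, so set = update
set :: Eq i => i -> a -> Array' i a -> Array' i a
set = update

main :: IO ()
main = do
  let g = const 0 :: Array' Int Int
  -- axiom (a): gamma(x <- a)[x] = a
  print (val 3 (set 3 7 g) == 7)
  -- axiom (b): x /= y  ==>  gamma(x <- a)[y] = gamma[y]
  print (val 5 (set 3 7 g) == val 5 g)
```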

Defining Constants over Arrays When reasoning about arrays and about programs that handle arrays, we need more than just the basic array concepts. For example, we will use the notion of swapping two elements in an array. We define the constant swap such that swap.x.y.γ stands for the result of swapping elements x and y in array γ :

(swap.x.y.γ)[z] = γ[y]  if z = x ,
(swap.x.y.γ)[z] = γ[x]  if z = y ,
(swap.x.y.γ)[z] = γ[z]  otherwise .

All constants over arrays (and records) are automatically also available as pointwise extended constants. Thus, e.g., we have that

(swap.i.j.A).σ = swap.(i.σ).(j.σ).(A.σ) . This means that we can always apply reduction to expressions with arrays and records. As another example, we define what it means for the restriction of an array to a certain index set to be sorted (here, and in what follows, we take Nat as the default index set for arrays):

∧ sorted.γ.X = (∀xy| x ∈ X ∧ y ∈ X • x ≤ y ⇒ γ [x] ≤ γ [y]).

We can then use the explicit definition to derive properties for sorted. For example, we have that

  sorted.γ.∅
≡ {definition}
  (∀x y | x ∈ ∅ ∧ y ∈ ∅ • x ≤ y ⇒ γ[x] ≤ γ[y])
≡ {quantification over empty range}
  T

Thus sorted.γ.∅ always holds. Another property that we will find useful later on is the following:

(∀x y | x ∈ X ∧ y ∈ Y • x < y ∧ γ[x] ≤ γ[y]) ⇒
  (sorted.γ.X ∧ sorted.γ.Y ≡ sorted.γ.(X ∪ Y)) .

The proof of this theorem illustrates the way in which we reason about arrays using bounded quantification and range splitting:

(∀x y | x ∈ X ∧ y ∈ Y • x < y ∧ γ[x] ≤ γ[y])
⊢ sorted.γ.(X ∪ Y)
≡ {definition}
  (∀x y | x ∈ X ∪ Y ∧ y ∈ X ∪ Y • x ≤ y ⇒ γ[x] ≤ γ[y])
≡ {range split}
  (∀x y | x ∈ X ∧ y ∈ X • x ≤ y ⇒ γ[x] ≤ γ[y]) ∧
  (∀x y | x ∈ Y ∧ y ∈ Y • x ≤ y ⇒ γ[x] ≤ γ[y]) ∧
  (∀x y | x ∈ X ∧ y ∈ Y • x ≤ y ⇒ γ[x] ≤ γ[y]) ∧
  (∀x y | x ∈ Y ∧ y ∈ X • x ≤ y ⇒ γ[x] ≤ γ[y])
≡ {definition of sorted}
  sorted.γ.X ∧ sorted.γ.Y ∧
  (∀x y | x ∈ X ∧ y ∈ Y • x ≤ y ⇒ γ[x] ≤ γ[y]) ∧
  (∀x y | x ∈ Y ∧ y ∈ X • x ≤ y ⇒ γ[x] ≤ γ[y])
≡ {assumption}
  sorted.γ.X ∧ sorted.γ.Y ∧
  (∀x y | x ∈ X ∧ y ∈ Y • (x ≤ y ⇒ γ[x] ≤ γ[y]) ∧ x < y) ∧
  (∀x y | x ∈ Y ∧ y ∈ X • (x ≤ y ⇒ γ[x] ≤ γ[y]) ∧ y < x)
≡ {properties of predicates}
  sorted.γ.X ∧ sorted.γ.Y ∧
  (∀x y | x ∈ X ∧ y ∈ Y • x < y ∧ γ[x] ≤ γ[y]) ∧
  (∀x y | x ∈ Y ∧ y ∈ X • y < x)
≡ {assumption, last two conjuncts simplify to T}
  sorted.γ.X ∧ sorted.γ.Y

In fact, these two properties characterize sorted for finite index sets, so we could also introduce the constant sorted by postulating them as axioms. We will in our examples use the notion of a permutation of an array. For this we need the number of occurrences of an element a in array γ restricted to index set X:

∧ occurrence.a.γ.X = (#x ∈ X • γ[x] = a) ,

where # is the “number of” quantifier (for a definition, see Exercise 10.10). We define permutation.γ.X.δ.Y to mean that the restriction of array γ to index set X is a permutation of the restriction of δ to index set Y :

∧ permutation.γ.X.δ.Y = (∀a • occurrence.a.γ.X = occurrence.a.δ.Y) .

For some properties of permutation, see Exercise 10.11.
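The constants just introduced are easy to mirror in executable form. The following Haskell sketch (ours; sorted, occurrence and permutation are assumed names mirroring the definitions above, restricted to finite index sets) can be used to experiment with properties such as sorted.γ.∅ ≡ T.

```haskell
-- A sketch, assuming arrays modelled as functions and finite index sets.
import Data.List (nub)

type Arr = Int -> Int

-- sorted.g.xs: x <= y implies g[x] <= g[y] for indices in xs
sorted :: Arr -> [Int] -> Bool
sorted g xs = and [ g x <= g y | x <- xs, y <- xs, x <= y ]

-- occurrence.a.g.xs: number of indices in xs whose value is a
occurrence :: Int -> Arr -> [Int] -> Int
occurrence a g xs = length [ x | x <- nub xs, g x == a ]

-- permutation.g.xs.d.ys: equal occurrence counts for every value
permutation :: Arr -> [Int] -> Arr -> [Int] -> Bool
permutation g xs d ys =
  all (\a -> occurrence a g xs == occurrence a d ys) values
  where values = nub (map g xs ++ map d ys)  -- only these can occur at all

main :: IO ()
main = do
  let g i = [3, 1, 2] !! i
      d i = [1, 2, 3] !! i
  print (sorted g [])                       -- True: sorted over the empty set
  print (sorted d [0, 1, 2])                -- True
  print (permutation g [0, 1, 2] d [0, 1, 2])  -- True
```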

10.5 Dynamic Data Structures

Dynamic data structures are often identified with pointer structures. The machinery that we already have introduced is sufficient to define such structures. Assume that we want to define a data type that consists of nodes with specific fields. Some of these fields are basic data, while others are pointers to other nodes. Say that we want to define nodes that have two fields, d and q, where d stores the data in the

node while q is a pointer to another node. We then define the whole dynamic data structure as a type of the form

Φ = array I of record d : T ; q : I end .

Here the index set I is the set of pointer values, while the nodes in the structure are records, with d : T as the basic data field and q the pointer field. We can declare a program variable D : Φ to contain the whole (dynamic) data structure, usually referred to as the heap. In the spirit of Pascal, most programming languages do not declare the heap explicitly. Declaring the heap as a state attribute makes the fact that dynamic data structures are just (infinite) arrays explicit. Accessing a node determined by pointer p is then simply expressed as D[p]; i.e., it is an array expression. Similarly, updating a node determined by p in the data structure is expressed as D[p] := D[p](d ← e) or D[p] := D[p](q ← e), depending on which field we want to change in the node. We can write this even more concisely as D[p][d] := e or D[p][q] := e. Dynamic data structures differ from ordinary arrays in that we have a way of generating new pointer (index) values that have not been used before. A simple way is to use Nat as the index set for the heap D and keep a separate counter new for the heap to give the next unused index, with {1, 2, ..., new.σ − 1} being the indices that already have been used in state σ. This permits us to generate new indices dynamically whenever needed and to use index value 0 (called nil) for a pointer that is never used. Summarizing the above, a declaration of the form

type T = record ... end ;
var p : pointer to T

can be seen as a shorthand for the following declaration:

type T = record ... end ;
var D : array Nat of T ;
var new : Nat := 1 ;
var p : Nat

Here D is the heap variable associated with the record type T, and new is the variable that records the least unused index in D. We then write p↑ for D[p] so that the heap is kept implicit, p↑x for the access D[p][x], and p↑x := e for the update operation D[p][x] := e. The state transformer new p is defined by

∧ new p = p, new := new, new + 1 .

The compiler enforces the restriction that p can be used only in these three contexts, and in particular, that no arithmetic operations are performed on p.

This approach to modeling pointers is very simple, but it shows the basic ideas involved. In particular, it shows that dynamic data structures with pointers in essence are nothing but (infinite) arrays. We can forbid pointer arithmetic by choosing, e.g.,

Ind rather than Nat as the pointer type, and arbInd as the value nil. We then have to explicitly record the set of all pointer values that have been generated this far. More elaborate pointer types can easily be defined, where we can have two or more heaps, each with its own pointer type. A more detailed discussion of dynamic data structures with examples is, however, outside the scope of this book.
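The heap-plus-counter view of dynamic data structures can be made concrete with a small Haskell sketch (ours, not the book's; Node, State, alloc and deref are assumed names). The heap is a total function on Nat-like indices, nodes carry a data field d and a pointer field q, and allocation simply hands out the next unused index.

```haskell
-- A sketch, assuming the heap model described above; 0 plays the role of nil.
data Node = Node { d :: Int, q :: Int } deriving Show

data State = State { heap :: Int -> Node, new :: Int }

nil :: Int
nil = 0

-- allocate a fresh node and return its pointer (models generating a new index)
alloc :: Node -> State -> (Int, State)
alloc node (State h n) =
  (n, State (\i -> if i == n then node else h i) (n + 1))

-- dereferencing a pointer is just array access into the heap
deref :: State -> Int -> Node
deref s p = heap s p

main :: IO ()
main = do
  let s0 = State (const (Node 0 nil)) 1
      (p1, s1) = alloc (Node 10 nil) s0
      (p2, s2) = alloc (Node 20 p1) s1     -- the second node points at the first
  print (p1, p2)                           -- (1,2)
  print (d (deref s2 p2), q (deref s2 p2)) -- (20,1)
```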

10.6 Summary and Discussion

We have shown above how to build theories in higher-order logic by extending a given theory with new types, constants, and inference rules. Theories are extended either axiomatically, by postulating new types, constants, and inference rules, or constructively, by defining new types and constants in terms of existing constructs and deriving their properties. The big advantage of the latter approach is that the resulting theory extension is conservative and hence known to be consistent. We have illustrated these two methods with a number of examples. The type-definition mechanism that we use has been described in detail (both for basic types and type operators) by Gordon and Melham [62]. The way in which the product and sum theories are postulated is quite standard; see, e.g., Larry Paulson’s book [115]. This chapter illustrates how theories are built in higher-order logic. One theory is built on top of another theory, as an extension. Different extensions of a basic theory can be combined by just joining all the extensions together. As long as the types and constants introduced are syntactically different in all these extensions and the extensions are conservative, the resulting combined extension is also a conservative extension of the original theory. Data types as they are used in programming languages are defined just as other types. Either we can postulate new data types, or we can introduce them as constructed types. We have here chosen the first approach, because the basic data types that we are interested in, records, arrays, and pointers, come as quite straightforward generalizations of the notion of independent state attributes. However, we have also indicated for these data types how they would be defined explicitly as constructed types, as a way of demonstrating the consistency of extending higher-order logic with theories about records, arrays, and pointer structures.

10.7 Exercises

10.1 Construct a type tri with three elements A, B, and C and then define a function of type tri → Bool that maps A and B to T and C to F.

10.2 Construct product types using Σ → Γ → Bool as the type representing Σ × Γ. What is the predicate that characterizes the appropriate subset of Σ → Γ → Bool? How are pair, fst, and snd defined? Prove that the characteristic properties of products are in fact theorems under your definitions.

10.3 The sum type operator can be defined explicitly as follows:

Σ + Γ ≅ {(b, σ, γ) : Bool × Σ × Γ | (b ∧ γ = arb) ∨ (¬b ∧ σ = arb)} .

Define the injection, test, and projection functions (inl, inr, isl, isr, outl, outr) and prove that they have the properties given in Section 10.1.

10.4 Use the axioms and definitions for sequences given in the text to derive the following properties:

s ≠ 〈 〉 ⇒ (butlast.s)&(last.s) = s ,
s = 〈 〉 ∨ (∃s′ x • s = s′&x) .

10.5 We have seen how the higher-order capabilities of higher-order logic allow the induction principle to be stated as a theorem. Another important feature of the natural numbers that can also be derived within the logic is the principle of function definition by primitive recursion. In fact, it is possible to derive the following theorem from the Peano axioms:

(∀e f • ∃!g • g.0 = e ∧ (∀n • g.(suc.n) = f.(g.n).n)) .

We can then define (see exercises below) a new constant natrec that has the following properties:

natrec.e.f.0 = e ,
natrec.e.f.(suc.n) = f.(natrec.e.f.n).n .

This constant can be used as a tool for defining recursive functions. For example, we define

∧ even = natrec.T.(λb n • ¬b) .

Show that even has the following property:

(even.0 ≡ T) ∧ (∀n • even.(suc.n) ≡ ¬(even.n)) .

10.6 Define the constant +, standing for addition of natural numbers, as follows:

∧ + = natrec.(λx • x).(λ f m n • suc.(f.n)) .

Show that addition has the following property:

(∀n • +.0.n = n) ∧ (∀m n • +.(suc m).n = suc.(+.m.n)) .

Then use the induction principle to prove that (∀n • +.n.0 = n).

10.7 Derive the following theorem from the Peano axioms:

(∀e f • ∃g • (g.0 = e) ∧ (∀n • g.(suc n) = f.(g.n).n)) .

10.8 Use the theorem proved in Exercise 10.7 to find a suitable definition of the constant natrec. Verify that natrec satisfies the property

(∀e f • natrec.e.f.0 = e ∧ (∀n • natrec.e.f.(suc n) = f.(natrec.e.f.n).n)) .

10.9 Show that both records and arrays satisfy the independence axioms.

10.10 Assume that the constant cardrel satisfies the following:

(cardrel.A.0 ≡ A = ∅) ∧ (∀n • cardrel.A.(n + 1) ≡ (∃x ∈ A • cardrel.(λy ∈ A • y ≠ x).n))

(such a constant can be defined using primitive recursion; see previous exercises). Then define

∧ card.A = (εn • cardrel.A.n)

and convince yourself that card.A denotes the number of elements in A if A is finite. Now we can define the “number of” quantifier as the binder version of card:

∧ (#x ∈ A • p.x) = card.(λx ∈ A • p.x) .

10.11 Show that permutation satisfies the following properties:

permutation.A.I.A.I ,
permutation.A.I.B.J ⇒ permutation.B.J.A.I ,
permutation.A.I.B.J ∧ permutation.B.J.C.K ⇒ permutation.A.I.C.K ,
i ∈ I ∧ j ∈ I ⇒ permutation.A.I.(swap.i.j.A).I .

Part II

Statements

11

Predicate Transformers

We described the basic notion of a contract between independent agents in the introduction and showed a number of properties that it would be reasonable to expect contracts to have. We did not, however, say what kind of mathematical entities contracts are. The interpretation of contracts depends on what features are considered essential and what are inessential. We now make the assumption that the essential feature of a contract is what can be achieved with it. From this point of view, contracts can be interpreted as predicate transformers, functions from predicates to predicates. We will explore this approach below in detail, concentrating first on basic mathematical properties of predicate transformers and on how contracts are interpreted as predicate transformers.

11.1 Satisfying Contracts

We have defined σ {| S |} q to mean that our agent can satisfy the contract S to establish condition q when the initial state is σ . We want to define this notion precisely for contracts. We could do this by first defining how an agent carries out a contract and, based on this, then define what it means for the agent to satisfy a contract. We will also do this, but later on. Here we will follow a simpler path, where we define σ {| S |} q directly, by induction on the structure of contracts, using our intuitive understanding of how contracts are satisfied.

Let us look at the language of contract statements introduced in the first chapter:

S ::= ⟨f⟩ | {g} | [g] | S1 ; S2 | S1 ⊓ S2 | S1 ⊔ S2 .

Consider first the update ⟨f⟩. It has the effect of changing the present state σ to a new state f.σ. There is no choice involved here, neither for our agent nor for the other agent, since there is only one possible next state. In this case, we obviously have that σ {| ⟨f⟩ |} q holds if and only if f.σ ∈ q. The assertion {g} has no effect on the state if g holds, but otherwise it will breach the contract. Hence, σ {| {g} |} q holds if and only if σ ∈ g ∩ q. Condition g must hold initially in order to avoid breaching the contract, and condition q must hold initially if it is to hold for the final state, because the state is not changed. The assumption [g] also has no effect on the state if g holds, but if g does not hold, then the assumption is false and the contract is satisfied trivially. Thus, σ {| [g] |} q holds iff either σ ∈ ¬g or σ ∈ q, i.e., iff σ ∈ ¬g ∪ q.

Consider next the angelic choice S1 ⊔ S2. Our agent gets to choose the alternative, so it is sufficient that either one of the two contracts can be satisfied. Thus σ {| S1 ⊔ S2 |} q holds iff either σ {| S1 |} q or σ {| S2 |} q holds.

For the demonic choice S1 ⊓ S2, we have that σ {| S1 ⊓ S2 |} q holds iff both σ {| S1 |} q and σ {| S2 |} q hold. Because the other agent gets to choose the alternative, our agent must be able to satisfy both contracts in order to satisfy the whole contract.

Let us finally consider satisfaction of a sequential composition S1 ; S2 of contracts. Our agent can satisfy this contract to establish q from σ if he can satisfy S1 to establish some intermediate condition from which he then can satisfy S2 to establish q. Thus, σ {| S1 ; S2 |} q holds if and only if σ {| S1 |} q′, where q′ is the set of states σ′ for which σ′ {| S2 |} q holds.

Weakest Preconditions For any contract S and any condition q to be established, we define wp.S.q as the set of all initial states from which S can be satisfied to establish q:

∧ wp.S.q ={σ | σ {| S |} q} .

We refer to wp.S.q as the weakest precondition for satisfying S to establish q. We can use this definition to give an alternative definition of satisfaction for sequential composition: σ {| S1 ; S2 |} q holds iff σ {| S1 |} (wp.S2.q). Strictly speaking, this is equivalent to the previous definition only if we assume that if a contract establishes a condition q, then it will also establish any weaker condition q′. This monotonicity property of weakest preconditions for contracts will be shown to hold in Chapter 13.

We can look at wp as a function that assigns to each contract statement S a function from predicates to predicates; i.e., wp.S : P(Σ) → P(Σ), when Σ is the

state space. Thus, wp.S is a predicate transformer. From the definitions of satisfaction above, we can immediately determine wp.S for the statements of our simple language for contracts:

wp.⟨f⟩.q = f⁻¹.q ,
wp.{g}.q = g ∩ q ,
wp.[g].q = ¬g ∪ q ,
wp.(S1 ; S2).q = wp.S1.(wp.S2.q) ,
wp.(S1 ⊔ S2).q = wp.S1.q ∪ wp.S2.q ,
wp.(S1 ⊓ S2).q = wp.S1.q ∩ wp.S2.q .

Let us check the first equation. We have that

  wp.⟨f⟩.q
= {definition of weakest preconditions}
  {σ | σ {| ⟨f⟩ |} q}
= {satisfaction of the update statement}
  {σ | f.σ ∈ q}
= {definition of inverse image}
  f⁻¹.q

The other equalities are also easily checked. These definitions permit us to compute the weakest precondition for any contract to establish any postcondition. The satisfaction of contracts is captured very directly by the notion of weakest precondition:

σ {| S |} q ≡ wp.S.q.σ .

There are, however, definite advantages in formalizing the notion of satisfaction in terms of predicate transformers rather than in terms of a ternary relation. This will become obvious as the theory is further developed.
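The equations above translate almost literally into executable form. The following Haskell sketch (ours, not the book's; the constructor and function names are assumptions) represents predicates as Boolean-valued functions on a state type and computes wp for the small contract language by structural recursion.

```haskell
-- A sketch, assuming predicates as Boolean functions on states.
data Contract s
  = Update (s -> s)                 -- <f>
  | Assert (s -> Bool)              -- {g}
  | Assume (s -> Bool)              -- [g]
  | Seq (Contract s) (Contract s)   -- S1 ; S2
  | Angel (Contract s) (Contract s) -- angelic choice
  | Demon (Contract s) (Contract s) -- demonic choice

type Pred s = s -> Bool

-- wp.S.q, read off directly from the equations above
wp :: Contract s -> Pred s -> Pred s
wp (Update f)    q = q . f                          -- inverse image
wp (Assert g)    q = \s -> g s && q s
wp (Assume g)    q = \s -> not (g s) || q s
wp (Seq s1 s2)   q = wp s1 (wp s2 q)
wp (Angel s1 s2) q = \s -> wp s1 q s || wp s2 q s
wp (Demon s1 s2) q = \s -> wp s1 q s && wp s2 q s

main :: IO ()
main = do
  -- states are integers; increment, then one of the agents may double
  let c  = Seq (Update (+ 1)) (Angel (Update (* 2)) (Update id))
      c' = Seq (Update (+ 1)) (Demon (Update (* 2)) (Update id))
  print (wp c  (>= 4) 1)   -- True : our agent can reach 4 by doubling
  print (wp c' (>= 4) 1)   -- False: the other agent may choose not to double
```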

11.2 Predicate Transformers

The preceding section justifies studying the mathematical properties of predicate transformers in more detail. The set of predicate transformers from Σ to Γ, denoted by Σ ↦ Γ, is defined by

∧ Σ ↦ Γ = P(Γ) → P(Σ) . (predicate transformers)

A predicate transformer is thus a function that maps postconditions to preconditions. By using the notation Σ ↦ Γ, we uniformly treat the first argument (Σ) as the initial state space and the second (Γ) as the final state space for state transformers, state relations, and predicate transformers.

Predicate Transformer Lattice We define the refinement ordering on predicate transformers as the pointwise extension of the subset ordering on P(Γ): for S, T ∈ Σ ↦ Γ, we have

∧ S ⊑ T = (∀q ∈ P(Γ) • S.q ⊆ T.q) . (⊑ for predicate transformers)

Hence, (Σ ↦ Γ, ⊑) is again a complete boolean lattice, by the pointwise extension property. The lattice operations on predicate transformers are pointwise extensions of the corresponding operations on predicates:

∧ abort.q = false , (⊥ of predicate transformers)
∧ magic.q = true , (⊤ of predicate transformers)
∧ (¬S).q = ¬S.q , (¬ of predicate transformers)
∧ (S ⊓ T).q = S.q ∩ T.q , (⊓ of predicate transformers)
∧ (S ⊔ T).q = S.q ∪ T.q . (⊔ of predicate transformers)

The names for the bottom and top of the predicate transformer lattice, abort and magic, are by now traditional, and we have chosen not to change them. We comment on their interpretation below. Implication and equivalence on predicate transformers can also be defined by pointwise extension:

∧ (S ⇒ T).q = S.q ⇒ T.q , (⇒ of predicate transformers)
∧ (S ≡ T).q = S.q ≡ T.q . (≡ of predicate transformers)

Thus, these two operations satisfy the usual laws: (S ⇒ T) = (¬S ⊔ T) and (S ≡ T) = (S ⇒ T) ⊓ (T ⇒ S). Meet and join are also defined on arbitrary sets of predicate transformers:

∧ (⊓ i ∈ I • Si).q = (∩ i ∈ I • Si.q) , (general ⊓)
∧ (⊔ i ∈ I • Si).q = (∪ i ∈ I • Si.q) . (general ⊔)
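The pointwise character of these operations is easy to see in code. The Haskell sketch below (ours; names like refines, meet and join are assumptions) checks the refinement ordering only against a small sample of predicates over a small finite state space, whereas the definition quantifies over all predicates, so it is an illustration rather than a decision procedure.

```haskell
-- A sketch, assuming a finite state space and a finite sample of predicates.
type Pred s   = s -> Bool
type PTrans s = Pred s -> Pred s

statespace :: [Int]
statespace = [0 .. 5]

-- S |= T approximated by checking S.q.sigma ==> T.q.sigma on the samples
refines :: PTrans Int -> PTrans Int -> [Pred Int] -> Bool
refines s t preds =
  and [ not (s q sigma) || t q sigma | q <- preds, sigma <- statespace ]

abort', magic' :: PTrans s
abort' _ = const False      -- bottom: establishes no postcondition
magic' _ = const True       -- top: establishes every postcondition

meet, join :: PTrans s -> PTrans s -> PTrans s
meet s t q sigma = s q sigma && t q sigma
join s t q sigma = s q sigma || t q sigma

main :: IO ()
main = do
  let somePreds = [const True, const False, (> 2), even]
      s q = q . (+ 1)                      -- a sample predicate transformer
  print (refines abort' s somePreds)       -- True: abort is below everything
  print (refines s magic' somePreds)       -- True: magic is above everything
  print (refines (meet s s) s somePreds)   -- True
```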

Predicate Transformer Category A predicate transformer category has types as objects and predicate transformers as morphisms. We have that

∧ 1 = skip , (1 of predicate transformers)
∧ S ; T = S ◦ T . (; of predicate transformers)

Composition of predicate transformers S;T is thus backward function composition S ◦ T , while the identity morphism is

skip = (λq • q). These operations satisfy the category requirements:

(S ◦ T ) ◦ U = S ◦ (T ◦ U), (◦ associative) skip ◦ S = S and S ◦ skip = S . (skip unit) The reader should notice that we use forward composition in state transformer cat- egories and in state relation categories, while composition in predicate transformer categories is backward. Since the morphisms in a predicate transformer category are functions ordered by pointwise extension and the composition operator is backward function composi- tion, we have the monotonicity property from Section 2.5:

S ⊑ S′ ⇒ S ; T ⊑ S′ ; T .

The predicate transformer category PtranX has the collection of state spaces X as objects and all predicate transformers between two state spaces as its morphisms:

∧ PtranX(Σ, Γ) = Σ ↦ Γ .

We summarize the results about predicate transformers in this category in the following theorem.

Theorem 11.1 PtranX is a left complete boolean lattice-enriched category. As also noted in Section 2.5, we get a proper (both a left and a right) complete lattice-enriched category if we restrict ourselves to monotonic predicate transformers. Monotonic predicate transformers are studied in more detail in Chapter 13. The same asymmetry between left and right also holds for the distributivity properties of composition. We have distributivity from the right for all lattice operations:

magic ; T = magic ,
abort ; T = abort ,
(¬S) ; T = ¬(S ; T) ,
(S1 ⊓ S2) ; T = (S1 ; T) ⊓ (S2 ; T) ,
(S1 ⊔ S2) ; T = (S1 ; T) ⊔ (S2 ; T) .

The last properties generalize to meets and joins for arbitrary statements; i.e.,

(⊓ i ∈ I • Si) ; T = (⊓ i ∈ I • Si ; T) ,
(⊔ i ∈ I • Si) ; T = (⊔ i ∈ I • Si ; T) .

However, none of the corresponding left distributivity properties hold in general (not even if we assume monotonicity).

11.3 Basic Predicate Transformers

Let us now return to the weakest precondition predicate transformers. From the definitions above, we see that the three basic ways of composing contracts – sequential composition and angelic and demonic choice – all correspond to operations in the lattice-enriched category of predicate transformers. We can underline this close correspondence between contract statements and predicate transformers by introducing specific predicate transformers that directly correspond to weakest preconditions of the basic contract statements. Let us define the functional update, the assertion, and the assumption predicate transformers as follows:

∧ ⟨f⟩.q = f⁻¹.q , (functional update)
∧ {g}.q = g ∩ q , (assertion)
∧ [g].q = ¬g ∪ q . (assumption)

The definitions of weakest preconditions for basic contract statements then simplify to

wp.⟨f⟩ = ⟨f⟩ , wp.{g} = {g} , wp.[g] = [g] .

On the left-hand side we have a wp applied to a contract statement; on the right-hand side we have a predicate transformer. We can express the main constants of the predicate transformer category in terms of the functional update, assertion, and assumption predicate transformers:

abort = {false} , skip = ⟨id⟩ = {true} = [true] , magic = [false] .

In particular, we then have that

wp.{false} = abort , wp.[false] = magic ,

which shows how to interpret the impossible assertion and the impossible assumption. The impossible assertion is the bottom of the predicate transformer lattice, while the impossible assumption is the top of this lattice. For the contract that does not change anything, we have

wp.skip = skip .

Any contract statement S defined by the language above can thus be directly expressed as a weakest-precondition predicate transformer. The function wp is really a homomorphism from the syntactic algebra of contract statements to the semantic algebra of predicate transformers. For instance, we have that

wp.(⟨f⟩ ; (⟨h⟩ ; {false} ⊔ skip) ; ⟨k⟩) = ⟨f⟩ ; (⟨h⟩ ; abort ⊔ skip) ; ⟨k⟩ .

Again, on the left-hand side we have the weakest precondition of a contract statement, on the right-hand side a predicate transformer. This permits us to consider contracts directly as predicate transformers. The context will usually tell whether we have a contract statement or a predicate transformer. The names for the bottom and top of the predicate-transformer lattice are by now traditional. They arise from interpreting predicate transformers as describing the execution of program statements. Then abort is seen as a failure: the computation has to be aborted because something went wrong. The execution may have gone into an infinite loop, there may have been a run-time error, or something else bad might have happened. The essential thing is that we lose control of the execution and do not know what happens after an abort. Our interpretation of contracts interprets abort as the assertion that is impossible to satisfy and thus always leads to a contract violation, preventing our agent from satisfying the contract, no matter what condition was supposed to be achieved. In the program execution interpretation magic is a statement that immediately achieves whatever condition our program was supposed to, as if a miracle had occurred. This does not have any direct counterpart in existing programming languages, as this statement is not implementable. However, it serves as a convenient abstraction when manipulating program statements. In this respect, it is similar to the use of imaginary numbers: these simplify calculations even though they strictly speaking do not have any counterpart in reality. In our interpretation of contracts, magic is the assumption that is impossible to satisfy and thus will always discharge our agent from any obligation to follow the contract. Interpreting the statements we have introduced as talking about contracts in general, rather than just about program execution, allows us to give simple and intuitive interpretations to both abort and magic. As we have already shown, program statements can then be seen as a special kind of contract.

Expressions and Assignment Statements In practice we use expressions with program variables to describe assert and assumption statements. For instance, the assert statement {x ≥ y + 2} has no effect

in a state σ where x.σ ≥ y.σ + 2 but leads to a breach of the contract in states where this condition is not satisfied. Using the definition of assert statements, we can, e.g., determine that

{x ≥ y + 2}.(x ≤ 10) = (x ≥ y + 2 ∧ x ≤ 10) because {p}.q = p ∩ q (written p ∧ q when p and q are boolean expressions). For the corresponding assumption, we get

[x ≥ y + 2].(x ≤ 10) = (x ≥ y + 2 ⇒ x ≤ 10) because [p].q = (p ⇒ q). The assignment statement x := e is a particularly useful functional update. Here x is a program variable and e an expression, as usual. By the definition, we have that x := e .q.σ = q((x := e).σ ). This can be further simplified by the substitution lemma for assignments to q[x := e].σ . This gives us the following important result (usually referred to as the assignment axiom, although here it is not an axiom but a theorem):

Theorem 11.2 Let x be an attribute (or a list of attributes), e an expression (or a list of expressions), and q a boolean expression. Then

var x ⊢ ⟨x := e⟩.q = q[x := e] .

We usually omit the explicit angular brackets for assignment statements when the context makes it clear that an assignment statement rather than an assignment is intended. As an example, we can use this rule to calculate the weakest precondition for x := x + y ; y := y + 1 establishing condition x ≥ y. We have that

var x, y : Nat ⊢
  (⟨x := x + y⟩ ; ⟨y := y + 1⟩).(x ≥ y)
= {definition of sequential composition}
  ⟨x := x + y⟩.(⟨y := y + 1⟩.(x ≥ y))
= {assignment rule}
  ⟨x := x + y⟩.(x ≥ y + 1)
= {assignment rule}
  x + y ≥ y + 1
= {arithmetic}
  x ≥ 1

This shows how the calculation of weakest preconditions is kept simple using the assignment rule.
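The result of this calculation can be cross-checked mechanically. The Haskell sketch below (ours; the state is a pair of the attributes x and y, and wpAssign is an assumed name for the functional-update semantics) compares the computed weakest precondition with the claimed simplification x ≥ 1 on a sample of states.

```haskell
-- A sketch, assuming the state is the pair (x, y) of natural-number attributes.
type State = (Integer, Integer)      -- (x, y)
type Pred  = State -> Bool

-- wp.<f>.q = f^-1.q, i.e. q composed with the state transformer
wpAssign :: (State -> State) -> Pred -> Pred
wpAssign f q = q . f

-- wp of (x := x + y ; y := y + 1) as a predicate transformer
stmt :: Pred -> Pred
stmt = wpAssign (\(x, y) -> (x + y, y))       -- x := x + y
     . wpAssign (\(x, y) -> (x, y + 1))       -- ; y := y + 1

main :: IO ()
main = do
  let post (x, y)   = x >= y        -- the postcondition x >= y
      claimed (x, _) = x >= 1       -- the calculated weakest precondition
      samples = [ (x, y) | x <- [0 .. 5], y <- [0 .. 5] ]
  print (all (\s -> stmt post s == claimed s) samples)   -- expect True
```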

11.4 Relational Updates

Let us now look at the angelic update {R} and the demonic update [R]. Satisfying the angelic update {R} in initial state σ to establish q requires that there be a state γ such that R.σ.γ holds and γ ∈ q. Our agent can then satisfy the contract by choosing such a next state γ. If there is no state γ such that R.σ.γ, then the contract is breached. Satisfying the demonic update [R] in initial state σ to establish q again requires that for every state γ such that R.σ.γ, the condition γ ∈ q must hold. Then no matter how the other agent chooses the next state, condition q will be established. If there is no final state γ such that R.σ.γ holds, then the other agent has to breach the contract, and our agent is discharged of its obligations. The functional update and the two relational updates are illustrated in Figure 11.1. The argument above justifies defining predicate transformers corresponding to the two updates: the angelic update {R} : Σ ↦ Γ and the demonic update [R] : Σ ↦ Γ, defined as follows:

∧ {R}.q.σ = (∃γ ∈ Γ • R.σ.γ ∧ q.γ) , (angelic update)
∧ [R].q.σ = (∀γ ∈ Γ • R.σ.γ ⇒ q.γ) . (demonic update)

Since these predicate transformers capture the behaviors of the update contract statements, we simply define

wp.{R}≡{R} and wp.[R] ≡ [R] .

In set notation the two updates are as follows:

{R}.q.σ ≡ R.σ ∩ q ≠ ∅ and [R].q.σ ≡ R.σ ⊆ q .

As special cases, we have

{False} = abort and [False] = magic .

This characterization of abort and magic is more general than the one we can achieve with asserts and guards, because the relation False can go from one state space Σ to a different state space Γ, and hence we can define abort : Σ ↦ Γ and magic : Σ ↦ Γ, whereas with asserts and guards, we can only define abort : Σ ↦ Σ and magic : Σ ↦ Σ on a single state space Σ. Special cases of the relational updates are

∧ choose = {True} , (arbitrary choice)
∧ chaos = [True] . (chaotic choice)

The first stands for an arbitrary angelic choice and the second for an arbitrary demonic choice.
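For finite relations the two relational updates are easy to compute. The Haskell sketch below (ours; angelic and demonic are assumed names) represents a relation as the list of successor states of each state and illustrates both updates, including the special cases {False} = abort and [False] = magic.

```haskell
-- A sketch, assuming a relation is given as the list of successors of a state.
type Rel s  = s -> [s]
type Pred s = s -> Bool

angelic :: Rel s -> Pred s -> Pred s   -- {R}.q.s: some successor satisfies q
angelic r q s = any q (r s)

demonic :: Rel s -> Pred s -> Pred s   -- [R].q.s: every successor satisfies q
demonic r q s = all q (r s)

main :: IO ()
main = do
  let r :: Rel Int
      r s = [s + 1, s + 2]                  -- next state is one or two larger
      falseRel _ = []                       -- the empty relation False
  print (angelic r (> 3) 2)                 -- True : the angel can pick 2+2 = 4
  print (demonic r (> 3) 2)                 -- False: the demon can pick 2+1 = 3
  print (angelic falseRel (> 3) (2 :: Int)) -- False: {False} = abort
  print (demonic falseRel (> 3) (2 :: Int)) -- True : [False] = magic
```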


FIGURE 11.1. (a) Functional update, (b) demonic update, (c) angelic update

Commutativity of Updates with Test and Map The injection of predicates (or functions) into relations, and the subsequent construction of demonic and angelic updates from these, commutes with forming guards and asserts (or functional updates) directly; i.e.,

{|p|} = {p} , [|p|] = [p] , {| f |} = ⟨f⟩ , [| f |] = ⟨f⟩ .

The proofs of these properties are left as an exercise (Exercise 11.3).

Relational Assignment Statements A demonic assignment statement is a demonic update [R], where R is a relational assignment. It is thus a statement of the form [(x := x′ | b)]. We omit the inner parentheses, writing the demonic assignment as [x := x′ | b]. Similarly, an angelic assignment statement is an angelic update {R}, where R is a relational assignment. Again, the inner parentheses are usually omitted, so we write an angelic assignment as {x := x′ | b}. This is generalized to simultaneous updates of two or more program variables in the obvious way. For instance, the statement

{x, e := x′, e′ | x′ ≥ 0 ∧ e′ > 0} ; [x := x′ | −e ≤ x − x′² ≤ e]

first updates both the x and e components of the state (angelically) so that the new value of x is nonnegative and the new value of e is positive. After this, the

x component is changed so that the square of its new value differs by less than the value of e from its old value. We can also derive substitution rules for the relational assignment statements. These rules make use of the pointwise extension of the existential and universal quantifiers that we have defined above, in connection with meets and joins over arbitrary sets of predicates.

Theorem 11.3 Assume that x is a list of program variables and b and q are boolean expressions, where x′ does not occur free in q. Then

(a) var x ⊢ {x := x′ | b}.q = (∃x′ • b ∧ q[x := x′]) .

(b) var x ⊢ [x := x′ | b].q = (∀x′ • b ⇒ q[x := x′]) .

Proof We prove the first case here; the second one is established by analogous reasoning.

  {x := x′ | b}.q.σ
≡ {definition of angelic update}
  (∃γ • (x := x′ | b).σ.γ ∧ q.γ)
≡ {characterization of relational assignment}
  (∃γ • (∃x′ • γ = set.x.x′.σ ∧ b.σ) ∧ q.γ)
≡ {properties of existential quantification}
  (∃γ x′ • γ = set.x.x′.σ ∧ b.σ ∧ q.γ)
≡ {one-point rule}
  (∃x′ • b.σ ∧ q.(set.x.x′.σ))
≡ {substitution property}
  (∃x′ • b.σ ∧ q[x := x′].σ)
≡ {pointwise extension}
  (∃x′ • b ∧ q[x := x′]).σ
□

From now on, we will omit the assumption var x from theorems and rules unless there is some special reason to state it. Consider as an example the demonic assignment statement [x := x′ | x′ > x + y]. We have by the above rule that

  [x := x′ | x′ > x + y].(x > y)
= {rule for demonic assignment}
  (∀x′ • x′ > x + y ⇒ x′ > y)
= {reduction, arithmetic}
  x ≥ 0

Note that the result of the simplification is a boolean expression.
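This example, too, can be sanity-checked in code. The Haskell sketch below (ours; all names are assumptions) approximates the unbounded quantification over x′ by a finite window of candidate values — which happens to suffice here, because the smallest admissible x′ is the critical one — and compares the result with the simplified condition x ≥ 0 on a grid of sample states.

```haskell
-- A sketch, assuming a finite sample of candidate values for x'.
type State = (Integer, Integer)          -- (x, y)

-- [x := x' | x' > x + y].q : q must hold for every admissible new value of x
wpDemonicAssign :: (State -> Bool) -> State -> Bool
wpDemonicAssign q (x, y) =
  all (\x' -> q (x', y)) [ x' | x' <- candidates, x' > x + y ]
  where candidates = [x + y - 3 .. x + y + 3]   -- finite window only

main :: IO ()
main = do
  let post (x, y)    = x > y
      claimed (x, _) = x >= 0
      samples = [ (x, y) | x <- [-3 .. 3], y <- [-3 .. 3] ]
  print (all (\s -> wpDemonicAssign post s == claimed s) samples)  -- expect True
```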

11.5 Duality

The dual S° : Σ ↦ Γ of a predicate transformer S : Σ ↦ Γ is defined by

∧ S◦.q =¬S.(¬ q). (dual predicate transformer) A consequence of the definition is that (S◦)◦ = S, so dualization is an involution (Exercise 11.5).

Theorem 11.4 The following dualities hold between predicate transformers:

⟨f⟩° = ⟨f⟩ ,           (S1 ; S2)° = S1° ; S2° ,
{p}° = [p] ,           (⊓ i ∈ I • Si)° = (⊔ i ∈ I • Si°) ,
{R}° = [R] .

Proof For assertions, we have

  {p}°.q
= {definitions of assert and duals}
  ¬(p ∩ ¬q)
= {lattice properties}
  ¬p ∪ q
= {definition of guard}
  [p].q

The proofs of the other parts are similar (Exercise 11.6). □

Thus, the basic operations for constructing predicate transformers come in pairs, each one with its dual. Functional update and sequential composition are their own duals. The characterization of the constant predicate transformers also implies that

magic° = abort , skip° = skip .

Since (S°)° = S, all the duality properties listed above can be read in reverse. Thus, for example, [p]° = {p} and abort° = magic.
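The dual construction is a one-liner in the executable view. The Haskell sketch below (ours; dual, assertP and guardP are assumed names) checks the assertion/guard duality {p}° = [p] from Theorem 11.4 on a small sample of predicates and states.

```haskell
-- A sketch, assuming a small finite state space and a sample of predicates.
type Pred s   = s -> Bool
type PTrans s = Pred s -> Pred s

dual :: PTrans s -> PTrans s
dual t q = \s -> not (t (not . q) s)     -- S°.q = ¬ S.(¬q)

assertP, guardP :: Pred s -> PTrans s
assertP p q s = p s && q s               -- {p}
guardP  p q s = not (p s) || q s         -- [p]

main :: IO ()
main = do
  let p = even :: Int -> Bool
      qs = [const True, const False, (> 2)]
      states = [0 .. 5]
      agree f g = and [ f q s == g q s | q <- qs, s <- states ]
  print (agree (dual (assertP p)) (guardP p))   -- {p}° = [p] on the samples
```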

11.6 Preconditions and Guards

Let S : Σ ↦ Γ be a predicate transformer. Let us define the following operations in (Σ ↦ Γ) → P(Σ):

∧ m.S = (∩ q : P(Γ) • S.q) , (miracle precondition)
∧ t.S = (∪ q : P(Γ) • S.q) . (abortion guard)

Thus, m.S holds in an initial state if S is guaranteed to establish every postcondition in this state, while t.S holds in an initial state if S is guaranteed to establish some postcondition.

In addition, we introduce special notation for the complements of these operations as well.

∧ a.S =¬t.S , (abortion precondition) ∧ g.S =¬m.S . (miracle guard) Intuitively, the abortion precondition characterizes those initial states in which our agent cannot avoid breaching the contract (abortion will occur). The abortion guard is the complement of this and characterizes those initial states in which a breach can be avoided (it guards against abortion). Similarly, the miracle precondition characterizes those initial states in which our agent can choose to discharge the obligations of the contract (a miracle can occur), while a miracle guard characterizes the complement of this, i.e., the initial states in which this is not possible (it guards against miracles). In practice, the abortion guard t.S and the miracle guard g.S are more useful than the other two.

It is easily seen that m.S ⊆ t.S. Hence, the abortion and the miracle guards partition the initial state space into three disjoint sets:

(i) The states in m.S from which any postcondition can be established. These are the initial states in which our agent can choose to be discharged from its obligations to satisfy the contract.

(ii) The states in a.S from which no postcondition can be established. These are the states in which our agent cannot avoid breaching the contract.

(iii) The states in t.S ∩ g.S from which some postconditions can be established and some cannot. These are the states in which a discharge of obligation is impossible for our agent, and a breach of contract can be avoided, so some postcondition will be established.

This trichotomy turns out to be a fundamental aspect of the predicate transformer approach to program semantics, and we will meet it again and again in different disguises in the subsequent discussion (Figure 11.2).


FIGURE 11.2. Partitioning of state space
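Over a finite state space the partition of Figure 11.2 can be computed by brute force. The Haskell sketch below (ours; the primed names m', t', a', g' are assumptions) enumerates all postconditions over a four-element state space and exhibits the three disjoint regions for a sample predicate transformer.

```haskell
-- A sketch, assuming a finite state space so that all predicates can be listed.
import Data.List (subsequences)

type Pred s   = s -> Bool
type PTrans s = Pred s -> Pred s

states :: [Int]
states = [0 .. 3]

allPreds :: [Pred Int]
allPreds = [ \s -> s `elem` xs | xs <- subsequences states ]

m', t' :: PTrans Int -> Pred Int
m' s sigma = all (\q -> s q sigma) allPreds   -- every postcondition achievable
t' s sigma = any (\q -> s q sigma) allPreds   -- some postcondition achievable

a', g' :: PTrans Int -> Pred Int
a' s = not . t' s                             -- abortion precondition
g' s = not . m' s                             -- miracle guard

main :: IO ()
main = do
  -- sample transformer: behaves like abort on 0, like magic on 3, like skip else
  let s q sigma = case sigma of
        0 -> False
        3 -> True
        _ -> q sigma
  print (filter (m' s) states)                    -- [3]
  print (filter (a' s) states)                    -- [0]
  print (filter (\x -> t' s x && g' s x) states)  -- [1,2]
```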

11.7 Summary and Discussion

This chapter has introduced the fundamental notion for modeling program statements in higher-order logic, the predicate transformer. We have shown that the predicate transformers form a special kind of lattice-enriched category, which determines the basic algebraic properties of predicate transformers. Predicate transformers were introduced by Dijkstra [52, 53] to give a semantics to a language with nondeterminism. They were used to represent a variety of semantic functions for programs: the most commonly used are weakest precondition, weakest liberal precondition, and strongest postcondition. An older overview of the semantic treatment of predicate transformers can be found in the work of de Bakker [36]. Dijkstra and Scholten [54] give a treatment of predicate transformers with an attempt to derive their mathematical and logical treatment directly from first principles. Most semantic treatments of predicate transformers have concentrated on more restricted classes, in the style of traditional denotational semantics [119, 130]. The fact that the set of all predicate transformers forms a complete boolean lattice was first explicitly used by Morris [108], Hesselink [79], and Back and von Wright [26]. In addition to introducing the algebraic structure of predicate transformers, we also identified a particular set of basic predicate transformers. These form the main vehicle for subsequent investigations on how to use predicate transformers to model program constructs and programming. Dijkstra’s original language of guarded commands [52, 53] identified skip, abort, assignments (generalized here to arbitrary functional updates), and sequential composition as basic constructs, and additionally introduced a (finitely demonic) nondeterministic conditional and a loop construct. Back [6] introduced a nondeterministic assignment (modeled here as the demonic update), and de Bakker [36] used a meet (choice) operation on predicate transformers. Miracles (guard and magic) were added by Morgan [104], Morris [108], and Nelson [113] for basic sequential statements, and by Back [15] to model enabledness of statements executed in a parallel context. Angelic nondeterminism was added (as an angelic assignment) by Back [13] and by Gardiner and Morgan [59] (as a “conjunction operation”). The full collection of basic predicate transformers as described here were first identified by Back and von Wright [25]. Duality was used in reasoning about predicate transformers already by Pedro Guerreiro in [71]. The specification language of Back and von Wright [26, 27] is based on the fundamental dualities between demonic and angelic nondeterminism and between nontermination and miracles. Dijkstra and Scholten [54] introduce a notion of converse predicate transformers that is closely related to duals. The original formalization of the refinement calculus [7] was based on infinitary logic [95, 127], in order to permit a simple characterization of iteration. First-order predicate calculus is not very suitable as a basis for the refinement calculus, because the very definition of refinement involves a quantification over functions (all possible postconditions), which cannot be expressed directly in the syntax of first-order logic. Higher-order logic permits quantification over functions, so it is a good framework for the formalization of refinement calculus.
The first systematic investigation of higher-order logic as a basis for the refinement calculus was undertaken by Back and von Wright [30]. The notion of angelic nondeterminism goes back to the theory of nondeterministic automata and the nondeterministic programs of Robert Floyd [56]. Angelic versions of weakest preconditions were investigated by Dean Jacobs and Gries [92], but this work did not include angelically nondeterministic statements. Manfred Broy [42] discusses the use of demonic and angelic nondeterminism, in particular with respect to concurrency. The basic duality between angelic and demonic statements was identified and studied in detail by Back and von Wright [25, 27, 29]. Some applications of angelic nondeterminism are shown by Nigel Ward and Ian Hayes [139] based on this work. Angelic nondeterminism is also used in a theory of program inversion by von Wright [142]. A binary join operator, but interpreted as parallel independent execution, was studied by Eike Best [39]. Category-theoretic studies of program refinement have been pioneered by Claire Martin, David Naumann, Paul Gardiner, and Oege de Moor [57, 58, 98, 112]. It is not our intention to go very deep into category theory. We mainly use it as a framework for unifying the treatment of the different domains of discourse in the refinement calculus. The assignment rule derived here is exactly the definition of the weakest-precondition semantics for assignments of Dijkstra [53], which in turn is based on the assignment axiom of Hoare [84]. Thus we can recover Dijkstra’s syntactic characterization of weakest preconditions as theorems in our more general framework.

11.8 Exercises

11.1 Show that {p} ⊑ [p] holds for an arbitrary predicate p.

11.2 Relation R is said to be total if (∀σ • ∃γ • R.σ.γ) holds. Show that [R] ⊑ {R} holds if and only if the relation R is total.

11.3 Prove the following properties of the test and map relations for an arbitrary predicate p and an arbitrary function f :

{|p|} = {p} , [|p|] = [p] , {| f |} = ⟨f⟩ , [| f |] = ⟨f⟩ .

11.4 Prove the following:

chaos ⊑ skip , skip ⊑ choose .

Furthermore, show that equality holds if and only if the state space is a singleton.

11.5 Show that (S°)° = S holds for an arbitrary predicate transformer S.

11.6 Complete the proof of Theorem 11.4.

11.7 Use the assignment rules to compute the following:

(a) ⟨x, y := x + z, y − z⟩.(x + y = w) .

(b) ⟨x, y := x + z, y − z⟩.(x = y) .

11.8 Complete the proof of Theorem 11.3.

12

The Refinement Calculus Hierarchy

The predicate transformers that we use to model contracts are constructed from other predicate transformers or from simpler elements, like state predicates, state transformers, or state relations. The predicate transformer constructors can thus be seen as functions that map elements in one domain to the domain itself or to another domain. The basic question that we investigate in this chapter is to what extent the lattice and category structure is preserved by these constructors, i.e., to what extent the constructors are monotonic and homomorphic.

The use of monotonicity and homomorphism is particularly relevant in programming, because we may view programs as different kinds of entities, depending on what features we need. For instance, a straight-line program is best viewed as just a state transformer, and its properties are easiest to analyze in this simple framework. However, if a straight-line program occurs as a component of a larger program, with relational assignments, then we need to use state relations as our program model. In this case, we can embed the straight-line program f into the relational program as a mapping | f |. Or if we need the even more powerful framework of contracts and predicate transformers, then we can embed the straight-line program as an update statement ⟨f⟩ in a larger predicate transformer context.

12.1 State Categories

We have shown earlier that state transformers, state relations, and predicate transformers form categories of a certain kind, where the objects are state spaces (or types) and the morphisms are either functions, relations, or predicate transformers between the state spaces. In general, we will refer to these categories as state categories. Essentially, all the state categories are order-enriched categories. Relations form a complete boolean lattice-enriched category, while predicate transformers form a left complete boolean lattice-enriched category. State transformers form a discrete order-enriched category (we assume only the discrete ordering for state spaces, so f ⊑ g ≡ f = g).

State Predicate Category We can also model predicates as categories with types as objects. A state predicate category has state spaces as objects and predicates as morphisms. Each predicate p in P(Σ) is a morphism with source Σ and target Σ. There are no morphisms between two different state spaces. The composition operator is meet (∩), and the identity on Σ is the predicate true:

∧ 1 = true , (1 for state predicates) ∧ p ; q = p ∩ q . (; for state predicates)

The reader may verify that the defining properties for a category are satisfied (Exercise 12.6):

p ∩ (q ∩ r) = (p ∩ q) ∩ r , (∩ associative) true ∩ p = p and p ∩ true = p . (true unit) To avoid confusion, we consistently use ∩ as the symbol for the composition operator on predicates. Similarly, we always write true rather than 1 for the unit. It is easy to see that composition is monotonic with respect to the predicate ordering:

p ⊆ p′ ∧ q ⊆ q′ ⇒ p ∩ q ⊆ p′ ∩ q′ . (∩ monotonic)

We let PredX denote the state predicate category with objects X and all predicates on Σ as morphisms PredX(Σ, Σ).

Theorem 12.1 PredX is a complete atomic boolean lattice-enriched category. Composition is easily seen to distribute over bottom, meet, and join in this category, both from the left and from the right. For example, distribution from the left over bottom means that p ∩ false = false (to see that this is a distributivity property, recall that false is the empty join). Since composition is also a lattice operation, we will not make much direct use of the fact that Pred is a category.

Ptran(Σ, Γ) = Pred(Γ) → Pred(Σ)
Rel(Σ, Γ) = Σ → Pred(Γ)
Tran(Σ, Γ) = Σ → Γ
Pred(Γ) = Γ → Bool

FIGURE 12.1. Pointwise extended poset and lattice categories

The refinement calculus hierarchy All the basic entities of the refinement calculus are built by (iterated) pointwise extension from the lattice of truth values Bool and an arbitrary collection X of state spaces ,,,... The latter can be any collection of types, as we have noted earlier. For imperative programs, we would usually choose types that define states with attributes as state spaces. The pointwise extension hierarchy of categories that we have constructed is shown pictorially in Figure 12.1. All the basic domains of the refinement calculus are order-enriched categories, in most cases different kinds of lattice-enriched categories. Thus they have many properties in common, which simplifies reasoning in these domains. Rather than having one set of inference rules for predicates, one for state transformers, one for relations, and one for predicate transformers, we have a single collection of inference rules based on the general lattice and categorical properties of order- enriched categories. Classifying each of the basic domains as forming a specific kind of order-enriched category, as we have done above, determines what inference rules are available for reasoning about entities in this particular domain. The basic categories and the operations that link them are shown in Figure 12.2. We refer to this collection of domains and operations as the refinement calculus hierarchy. It provides the basic framework within which we reason about properties of programs, specifications, and contracts in general. Due to pointwise extension, the basic domains of the refinement calculus hierarchy share the same algebraic structure. Furthermore, a term at one level of the hierarchy can be expressed in terms of elements at a lower level of the hierarchy by using a homomorphism property or the definition of pointwise extension directly. This 206 12. The Refinement Calculus Hierarchy


FIGURE 12.2. Operations in the refinement calculus hierarchy

means that we can often choose the level in the refinement calculus hierarchy at which to carry out an argument. It is usually preferable to argue at as high a level as possible. The notation is then more concise, and the proof is more algebraic and shows the structure of the argument more clearly. However, if an argument at one level becomes too involved, we can reduce the level by writing out the pointwise extension explicitly. This movement between levels in the refinement calculus hierarchy is typical of arguments in the calculus.

12.2 Homomorphisms

We will here be particularly interested in the monotonicity and homomorphism properties of predicate transformer constructors. We are interested in questions such as whether the structure of predicates is preserved by the guard and assert constructors, and whether the structure of functions and relations is preserved by the updates. Knowing the basic algebraic properties of the basic constructors allows us to easily determine the algebraic properties of program statements that are defined in terms of these.

Sequential Composition All the basic predicate transformers, considered as mappings from elements of other domains to predicate transformers, preserve sequential composition and identities. Thus they are homomorphisms. Duality is also a homomorphism on predicate transformers.

Theorem 12.2 The following homomorphism properties hold for the basic predicate transformers:

(a) {p ∩ q} = {p} ; {q} , {true} = skip ,
(b) [p ∩ q] = [p] ; [q] , [true] = skip ,
(c) ⟨f ; g⟩ = ⟨f⟩ ; ⟨g⟩ , ⟨id⟩ = skip ,
(d) {P ; Q} = {P} ; {Q} , {Id} = skip ,
(e) [P ; Q] = [P] ; [Q] , [Id] = skip ,
(f) (S1 ; S2)° = S1° ; S2° , skip° = skip .

Proof The proofs are straightforward. For example, we have

  {p ∩ q}.r
= {definition}
  p ∩ q ∩ r
= {definition of assert}
  {p}.(q ∩ r)
= {definition of assert}
  {p}.({q}.r)
= {definition of sequential composition}
  ({p} ; {q}).r

□

The homomorphism properties for sequential composition and identity are summarized in Table 12.1.

Guards and Asserts For asserts and guards we have the following lattice homomorphism properties, in addition to the functor properties listed above.

Theorem 12.3 The assert and guard constructors satisfy the following properties:

(a) p ⊆ q ≡ {p} ⊑ {q} , p ⊆ q ≡ [p] ⊒ [q] ;
(b) {false} = abort , [false] = magic ;
(c) {∪ i ∈ I • pi} = (⊔ i ∈ I • {pi}) , [∪ i ∈ I • pi] = (⊓ i ∈ I • [pi]) ;
(d) {∩ i ∈ I • pi} = (⊓ i ∈ I • {pi}) , when I ≠ ∅ , [∩ i ∈ I • pi] = (⊔ i ∈ I • [pi]) , when I ≠ ∅ .

Proof The proofs are straightforward. □

The assert constructor (λp • {p}) : P(Σ) → (Σ ↦ Σ) is thus monotonic, and it preserves bottom, positive meets, and universal joins. The guard constructor (λp • [p]) : P(Σ) → (Σ ↦ Σ), being the dual of assert, is antimonotonic. It preserves bottom, positive meets, and universal joins onto the dual lattice (Σ ↦ Σ)°.

In fact, the equivalences in Theorem 12.3 show that the assert constructor homomorphically embeds the predicates P(Σ) in Σ ↦ Σ. This means that reasoning about asserts can be reduced to reasoning about predicates. Similarly, the guard constructor homomorphically embeds the predicates into the dual of the predicate transformer lattice.

Updates We have the following results for the homomorphic properties of functional and relational updates.

             is ⊑    ⊥     ⊤    ⊓      ⊔     ¬    1    ;
(λp • {p})   yes     yes   no   yes¹   yes   no   yes  yes
(λp • [p])   yes°    yes°  no   yes°¹  yes°  no   yes  yes
(λf • ⟨f⟩)   yes     -     -    -      -     -    yes  yes
(λR • {R})   yes     yes   no   no     yes   no   yes  yes
(λR • [R])   yes°    yes°  no   no     yes°  no   yes  yes

¹ when the meet is taken over nonempty sets

TABLE 12.1. Homomorphic properties of basic predicate transformers

Theorem 12.4 The update constructs satisfy the following lattice-theoretic properties:

(a) f = g ≡ ⟨f⟩ = ⟨g⟩ ;
(b) P ⊆ Q ≡ {P} ⊑ {Q} , P ⊆ Q ≡ [P] ⊒ [Q] ;
(c) {False} = abort , [False] = magic ;
(d) {∪ i ∈ I • Ri} = (⊔ i ∈ I • {Ri}) , [∪ i ∈ I • Ri] = (⊓ i ∈ I • [Ri]) .

Proof The proofs are again straightforward. □

The angelic update construction (λR • {R}) : Rel(Σ, Γ) → (Σ ↦ Γ) is thus monotonic, and it preserves bottom and universal joins. Dually, the demonic update construction (λR • [R]) : Rel(Σ, Γ) → (Σ ↦ Γ) is antimonotonic, and it preserves bottom and universal joins onto the dual lattice (Σ ↦ Γ)°. These properties are also listed in Table 12.1. Note further that the function space Tran(Σ, Γ) is homomorphically embedded in Σ ↦ Γ through the functional update constructor. Similarly, angelic update embeds Rel(Σ, Γ) into Σ ↦ Γ, and demonic update embeds Rel(Σ, Γ) into the dual of Σ ↦ Γ.

Compound Predicate Transformers The monotonicity and homomorphism properties of compound predicate transformers follow directly from the general lattice and category properties, e.g.,

(S1 ⊓ S2) ; T = (S1 ; T) ⊓ (S2 ; T) .

We repeat these properties in Table 12.2 for reference, together with the properties for the dualization construct.

Preconditions and Guards of Statements We finally consider the projection of predicate transformers onto predicates by the precondition and guard constructors, and to what extent these functions preserve the structure of predicate transformers.

              is ⊑   ⊥     ⊤     ⊓     ⊔     ¬     1    ;
(λS • S ; T)  yes    yes   yes   yes   yes   yes   no   no
(λS • S ⊓ T)  yes    yes   no    yes   yes   no    no   no
(λS • S ⊔ T)  yes    no    yes   yes   yes   no    no   no
(λS • ¬S)     yes°   yes°  yes°  yes°  yes°  yes   no   no
(λS • S°)     yes°   yes°  yes°  yes°  yes°  yes°  yes  yes

TABLE 12.2. Homomorphic properties of compound predicate transformers

Theorem 12.5 Let S and T be predicate transformers. Then the following properties hold.

(a) S ⊑ T ⇒ m.S ⊆ m.T , S ⊑ T ⇒ t.S ⊆ t.T ;
(b) m.abort = false , t.abort = false ;
(c) m.magic = true , t.magic = true ;
(d) m.(⊓ i ∈ I • Si) = (∩ i ∈ I • m.Si) , t.(⊔ i ∈ I • Si) = (∪ i ∈ I • t.Si) .

Proof The proofs are straightforward. □

Thus, the miracle precondition constructor m : (Σ ↦ Γ) → P(Σ) is monotonic, and it preserves bottom, top, and universal meets. The abortion guard constructor t : (Σ ↦ Γ) → P(Σ) is also monotonic, and it preserves bottom, top, and universal joins. These properties are listed in Table 12.3. We also show there the corresponding properties for the two other domain constructors, g.S and a.S.

Example As an example of using homomorphisms in proofs, we show that the property ⟨f⟩ ; [f⁻¹] ⊑ skip holds for an arbitrary state transformer f :

  ⟨f⟩ ; [f⁻¹]
= {properties of mapping relation construct}
  [| f |] ; [f⁻¹]
= {homomorphism}
  [| f | ; f⁻¹]
⊑ {|dom.R| ⊆ R ; R⁻¹ by Exercise 9.7}
  [|dom.| f ||]
= {all functions are total}
  [|true|]

             is ⊑   ⊥     ⊤     ⊓     ⊔     ¬
(λS • m.S)   yes    yes   yes   yes   no    no
(λS • t.S)   yes    yes   yes   no    yes   no
(λS • a.S)   yes°   yes°  yes°  no    yes°  no
(λS • g.S)   yes°   yes°  yes°  yes°  no    no

TABLE 12.3. Homomorphic properties of preconditions and guards

= {|true| = Id, demonic update is functor}
  skip

Note how the derivation uses homomorphism properties to move between different levels in the refinement calculus hierarchy. A similar derivation proves the dual property skip ⊑ {f⁻¹} ; ⟨f⟩ (Exercise 12.5).

12.3 Summary and Discussion

We have here identified the basic algebraic structure that underlies the refinement calculus: the refinement calculus hierarchy. This structure is formed by the state categories, all constructed by pointwise extensions from the state spaces and the truth-value lattice. All state categories share the same basic algebraic structure of order-enriched categories, in most cases combining a lattice structure with a category structure. It forms a rich (many-sorted) algebra, where refinement arguments can be carried out on many levels in the hierarchy. The refinement calculus hierarchy and the fundamental role that pointwise extension plays in it were identified in [26]. We have looked here at the operations within this structure and in particular studied to what extent these operations preserve the lattice and category structures of their domains. Homomorphic properties of predicate transformers have, to our knowledge, not been investigated in this way systematically before, although many of the homomorphism properties that we have listed above are well known from the literature. Algebraic collections of laws for programs in a relational setting have been given by Hoare et al. in [90]. Hesselink uses algebraic properties to define his notion of command algebras, for which positive meet-homomorphic predicate transformers provide models [80]. The homomorphism properties are important in practical reasoning, because they form the main method by which an argument on one level of the refinement calculus hierarchy can be reduced to a simpler argument on another (in most cases lower) level of the hierarchy. Another important aspect of the refinement calculus hierarchy that we have identified above is that it clearly demonstrates the different semantic models for programming languages that are available. Methods for program derivations based on these different models (partial functions, relations, and predicate transformers) need not be seen as competing but rather as complementary. A theory of functional programming is in fact embedded as a subtheory in the theory of predicate transformers. Similarly, relational programming and the denotational semantics of deterministic programs form subtheories of the predicate transformer theory. The homomorphism properties show how results in one theory can be carried over to another theory. It also shows that mixed program derivations are quite feasible. For instance, a functional program derivation might well start with a relational or

(demonically) nondeterministic program statement. This aspect will become even clearer later on, when we study iteration and recursion, and when we show how to relate different semantics of programming languages. The tabular representation of homomorphisms that we introduced in this chapter is expanded with new operations later on, and also with new operations that we may wish to be preserved by homomorphisms. We are in fact building one large table describing the homomorphism properties of useful predicate transformer constructors. This table will be revealed piece by piece as we go along, whenever a new constructor is introduced. The whole table is given in Appendix B.

12.4 Exercises

12.1 Use suitable rules to simplify the following in a detailed derivation:

(a) x := x + y ; y := x − y .

(b) x := x + 1 ;[x := x | x < x < x + 3] . 12.2 Prove the following parts of Theorem 12.2: {P ; Q}={P} ; {Q} and [P ; Q] = [P];[Q] for arbitrary relations P and Q. 12.3 Prove the following homomorphism properties over relations: {P ∪ Q}={P} {Q} and [P ∪ Q] = [P]  [Q] . 12.4 Give an example of predicate transformers S, T , and U such that S  T but U ; S  U ; T . 12.5 Use homomorphism properties to prove the following for an arbitrary state trans- former f : − skip {f 1} ;  f . 12.6 Show that Pred is a category, with ∩ as the composition operator. 12.7 Prove that composition is monotonic with respect to the ordering in the category of state predicates and in the category of state relations. 12.8 Show that the following homomorphism properties hold for arbitrary state predi- cates p and q: |p|⊆|q|≡p ⊆ q , |p| ; |q|=|p ∩ q| , |p|∩|q|=|p ∩ q| , |p|∪|q|=|p ∪ q| . This is page 213 Printer: Opaque this 13

Statements

In this chapter we show that the basic predicate transformers that we have intro- duced to model contracts (asserts, guards, functional updates, demonic and an- gelic updates) are all monotonic. In addition, composition, meet, and join preserve monotonicity. Conversely, any monotonic predicate transformer can be described in terms of these constructs, in a sense to be made more precise below. In fact, it is possible to write any monotonic predicate transformer in a normal form. There are also other reasons for considering the monotonic predicate transformers in more detail. By restricting our attention to monotonic predicate transformers, we get that sequential composition is monotonic in both arguments. This means that we have a complete lattice-enriched category. Without monotonicity, we have only a left boolean lattice-enriched category. Although we now lose the boolean lattice property, we gain a simpler category structure.

13.1 Subtyping, Sublattices and Subcategories

We are interested in the sublattice of monotonic predicate transformers. Before we can study this particular sublattice in more detail, we need to be more precise about the terminology and define the notions of subtypes, subposets, sublattices, and subcategories generally.

Subtyping We have until now assumed that a poset is a pair (, ), where  is a type and :  →  → Bool is a relation on . In practice, this is too restrictive; we also 214 13. Statements

need to consider posets (A, A), where A is a subset of . In set theory, we would then choose A to be the restriction of  to A, so that A⊆ A × A. In higher-order logic, we would like to define A: A → A → Bool. This is not, however, possible, because A is not a type, except when A = . Instead, we choose to relativize the notions needed for posets and lattices. We say that the relation :  →  → Bool is reflexive on A if a  a holds for every a ∈ A, that it is transitive on A if a  b ∧ b  c ⇒ a  c for any a, b, c ∈ A, and so on. In fact, the definitions that we have given for posets and lattices in the introductory chapter are already relativized, because we were careful in each definition to point out that the requirement expressed need hold only for elements in the set A. Thus, we can talk about the bottom of A, the top of A, the meet in A, the join in A, the negation in A, and so on. We say that (A, ) is a poset if  is reflexive, transitive, and antisymmetric on A. The lattice definitions are relativized in a similar way. We also have to relativize the notion of functions from one poset to another. Let (A, 1) and (B, 2) be two posets, where A ⊆ , B ⊆ , 1:  →  → Bool, and 2:  →  → Bool. The function f :  →  is said to be monotonic on A    if a 1 a ⇒ f.a 2 f.a holds for any a, a ∈ A. Order-preserving functions and congruences are relativized in the same way. Consider next the function space construction. For A ⊆  and B ⊆ , we would like to say that f is a function from A to B; i.e., f : A → B. However, again we have the problem that A → B is not a type. We solve this problem by defining the set

∧ A → B ={f :  →  | (∀a ∈ A • f.a ∈ B)} . (function space) In other words, A → B is the set of all functions of type  →  that map elements in A to elements in B. Instead of writing f : A → B, we can then express essentially the same thing by stating that f ∈ A → B.Bydefinition, we have that A → B ⊆  → . Hence, we can consider A → B as a subtype of  → . We can consider any set A ⊆  as a subtype of  (permitting also empty sub- types). The type constructor → applied to subtypes then generates a subtype of the corresponding type. We have that

A ⊆ A ∧ B ⊆ B ⇒ A → B ⊆ A → B . Thus, the function space constructor is antimonotonic in its first argument and monotonic in its second. Now consider the pointwise extensions A → , where the domain is a subset and the range is a type. We have that A →  = A → , for any A, A ⊆ . In particular, this means that P(A) = A → Bool is the same collection of boolean functions as P() =  → Bool. Continuing from here, we also find that 13.1. Subtyping, Sublattices and Subcategories 215

A ↔ B = A → B → Bool is the same as  ↔ , and A %→ B = P(B) → P(A) is the same as  %→ . Thus, predicates, relations and predicate transformers can be subtyped in an arbitrary fashion. A subtyping assertion f ∈ A → B is then trivially true when B is some type , but otherwise we need to check that f.a ∈ B really holds for every a ∈ A.

Pointwise Extension, Antisymmetry We also need to check whether the pointwise extension theorem holds for subtypes. • We define the ordering on A → B by f A→B g if and only if (∀a ∈ A f.a B g.a). Assume that B is a poset. Then f A→B f holds, and f A→B g and g A→B h implies that f A→B h;soA→B is reflexive and transitive. However, it is not antisymmetric: f A→B g and g A→B f implies that f.a = g.a for every a ∈ A, but it does not imply that this holds for every a in . In other words, equality • would also need to be relativized. Writing f =C g for (∀c ∈ C f.c = g.c), which says that f = gonC, we have that f A→B g and g A→B f implies f =A g. Antisymmetry does hold in  → B, i.e., when the domain is chosen to be a type, because f = g is the same as f = g. Generally, the pointwise extension theorem holds for any extension of B to  → B.

We could fix the pointwise extension theorem by making a small adjustment in our definition of the function space. Defining A → B instead as { f :  →  | (∀x ∈ A • f.x ∈ B) ∧ (∀x ∈ A • f.x = arb)}, we would also get antisymmetry, and the pointwise extension theorem would be in full force. This trick implies that all functions f ∈ A → B have the same value outside domain A, so they are equal if and only if they have the same value for each element in A. We do not choose this approach here, because for our purposes it will be sufficient to consider only pointwise extensions of the form  → B.

Sublattices and Subcategories Let (A, ) be a poset, where A is a set in  and  is a relation on . Let B be a nonempty subset of A. Then (B, ) is also a poset, said to be a subposet of (A, ).

The ordering relation :  ↔  is thus the same in both the poset A and in its subposet B. However, the supremum and infimum in these posets may depend on the elements that are included in the poset, and hence need not be the same. This is illustrated in Figure 13.1. Let A consist of all the elements in this lattice, and let B consist of the elements that are marked black. Then the supremum of the set {a, b} is d in the poset A and c in the poset B. Hence, we may need to indicate explicitly in which poset the supremum and the infimum are taken, writing a A b = d and a B b = c. 216 13. Statements

c

d

ab

FIGURE 13.1. Sublattice

A subset B of a lattice (A, ) is a sublattice of A if (B, ) is a lattice and a B b = aA b and a B b = a A b for any a, b in B. A subset B of the lattice A is a complete sublattice of A if B is a complete lattice and BC =AC and BC = AC for any subset C of B. In particular, this implies that ⊥B =⊥A and B =A. Note that the subset B of the lattice A in Figure 13.1 is a lattice, but it is not a sublattice of A, since a A b = a B b.

If we view a lattice A as an algebra (A, A, A), then the sublattice definition boils down to requiring that the subset B of A must be a subalgebra; i.e., the operations B and B must be defined for any a, b in B (with values in B), and they must give the same result as A and A for any two elements a, b in B: a B b = a A b and a B b = a A b. This notion of a subalgebra leads to the notion of a subcategory, as follows. Let C be a category, with objects Obj and morphisms Mor, and C a category with   objects Obj and morphisms Mor. Then C is a subcategory of C if Obj ⊆ Obj, Mor ⊆ Mor, and if units and composition are the same in C as in C.

Monotonicity

Assume that posets A and B are given. We write A →m B for the set of monotonic functions from A to B:

∧ • A →m B ={f ∈ A → B | (∀xy∈ A x  y ⇒ f.x  f.y)} .

Thus f ∈ A →m B if and only if f maps A into B and f is monotonic on A. We refer to A →m B as the set of monotonic functions from poset A to poset B. If A is a type , then  →m B forms a sublattice of the lattice  → B when 13.2. Monotonic Predicate Transformers 217

the ordering is determined by pointwise extension of a lattice. This is stated more generally in the following theorem.

Theorem 13.1 Let (, ) and (B, ) be posets. Then the pointwise extension of these posets to ( →m B, ) is also a poset. This poset is a (complete, distributive) sublattice of ( → B, ) if B is a (complete, distributive) lattice.

Proof The proof is straightforward. For example, ( →m B, ) is sublattice of ( → B, ), since meets and joins preserve monotonicity. 2

Note that the property of being a boolean lattice is not preserved, because the point- wise extension of a complement is not a complement in the lattice of monotonic functions. If f is monotonic and a  b, then a  b ⇒{functionality} f.a  f.b ≡{negation is antimonotonic} ¬ ( f.a) ¬( f.b) ≡{pointwise extension} (¬ f ).a  (¬ f ).b so ¬ f is antimonotonic and not monotonic.

13.2 Monotonic Predicate Transformers

Since a predicate transformer is a function between partial orders, it makes sense to consider monotonic predicate transformers. Let us use the notation  %→m  for the monotonic predicate transformers for given state spaces  and :

∧  %→m  = P() →m P() . The pointwise extension theorem (Theorem 13.1) gives us the following result.

Corollary 13.2 ( %→m , ) is a complete sublattice of ( %→ , ). We let MtranX denote the category in which the objects are types in X and the morphisms are the monotonic predicate transformers, MtranX (, ) =  %→m . This is easily seen to be a subcategory of PtranX : the identity predicate transformer on a given state space  is monotonic, and the composition of two monotonic predicate transformers is always monotonic.

Corollary 13.3 MtranX is a subcategory of PtranX . 218 13. Statements

Sequential Composition In Section 12.2 we described those monotonicity and homomorphism properties that hold in general for the basic predicate transformer constructors. We get more homomorphism properties when we restrict our attention to monotonic predicate transformers only. Sequential composition in particular is monotonic only with respect to the refinement ordering in its first argument, as shown in Section 12.2. However, it is monotonic also in its second argument if the first argument is a monotonic predicate transformer (Table 13.1).

Lemma 13.4 Let T1 and T2 be arbitrary predicate transformers and S a monotonic predicate transformer. Then

T1  T2 ⇒ S ; T1  S ; T2 .

Proof

S monotonic, T1  T2

 (S ; T1).q ={definition of composition}

S.(T1.q)

⊆{monotonicity of S, assumption T1.q ⊆ T2.q}

S.(T2.q) ={definition of composition}

(S ; T2).q 2

Lemma 13.4 is in fact a special case of the more general fact that a function f is monotonic if and only if g1  g2 ⇒ f ◦ g1  f ◦ g2 (Exercise 13.1). Combining Lemma 13.4 with the fact that sequential composition is always mono- tonic in its left argument, we have the following central result.

Corollary 13.5 Let S1,S2,T1, and T2 be monotonic predicate transformers. Then

S1  S2 ∧ T1  T2 ⇒ S1 ; T1  S2 ; T2 . Thus we can characterize the monotonic predicate transformers as follows:

Theorem 13.6 MtranX is a complete lattice-enriched category.

Preconditions and Guards The guards and preconditions have simple characterizations and more properties when the predicate transformers involved are monotonic. 13.3. Statements and Monotonic Predicate Transformers 219

is  (λT • S ; T ) yes 1 1when S is monotonic

TABLE 13.1. Monotonicity of sequential composition

Lemma 13.7 If S, S1, and S2 are monotonic predicate transformers, then t.S = S.true , a.S =¬S.true , m.S = S.false , g.S =¬S.false , .( ) = .( . ), .( ) = ◦.( . ), t S1 ; S2 S1 t S2 a S1 ; S2 S1 a S2 .( ) = .( . ), .( ) = ◦.( . ). m S1 ; S2 S1 m S2 g S1 ; S2 S1 g S2

Proof The proofs are straightforward (Exercise 13.4). 2

13.3 Statements and Monotonic Predicate Transformers

From now on, we use the word (predicate transformer) statement to denote a predicate transformer built using only the five basic predicate transformers (assert, guard, functional update, angelic update, and demonic update) and the three basic predicate transformer constructors (sequential composition, meet, and join). We emphasize that a statement is a higher-order logic term of a special form that denotes a predicate transformer. Statements do not form a new syntactic category, but rather a subset of an existing syntactic category (that of all predicate transformer terms). The basic statements are constructed out of predicates, functions, and relations: {p}, [p],  f , {R}, and [R]. By combining basic statements with sequential composition, meet, and join, we get compound statements. Our first result establishes that all basic statements are monotonic and that the operations for constructing compound statements also preserve monotonicity. Lemma 13.8 The predicate transformers  f , {p}, [p], {R}, and [R] are all monotonic, for an arbitrary function f , predicate p, and relation R. Furthermore, composition, meet, and join of predicate transformers preserve monotonicity. Proof We prove that {p} is monotonic. Let q and q be arbitrary. Then  q ⊆ q {p}.q ={definition of assert} p ∩ q ⊆{monotonicity of meet, assumption q ⊆ q}  p ∩ q 220 13. Statements

={definition of assert}  {p}.q The proofs for [p],  f , {R}, and [R] are similar. For sequential composition, the derivation is as follows: S monotonic, S monotonic, q ⊆ q   (S ; S ).q ={definition of composition}  S.(S .q) ⊆{monotonicity of S and S}   S.(S .q ) ={definition of composition}   (S ; S ).q The results for meets and joins follow from Theorem 13.1. 2

Lemma 13.8 is illustrated in Table 13.2. This table shows for each operation on the left whether it preserves the homomorphism property indicated on the top. As a direct consequence of this lemma, we get the following important result.

Theorem 13.9 All statements are monotonic.

Normal Form We do not usually distinguish between a statement as a term in higher-order logic (syntax) and the monotonic predicate transformer that it denotes (semantics). We can make the distinction if we need to, by talking about the statement term and the statement denotation, respectively. We shall now show that any monotonic predicate transformer term can in fact be written as a statement term.

Theorem 13.10 Let S be an arbitrary monotonic predicate transformer term. Then there exist state relation terms P and Q such that S ={P} ;[Q].

is  preserves  {p} yes ; yes [p] yes  yes  f yes yes {R} yes [R] yes

TABLE 13.2. Monotonicity of basic and compound statements 13.3. Statements and Monotonic Predicate Transformers 221

Proof Assume that S :  →  is a monotonic predicate transformer. Define relations P :  ↔ P() and Q : P() ↔  as follows: P.σ.δ ≡ S.δ.σ , Q.δ.γ ≡ δ.γ . Thus P holds for initial state σ and final state δ if the predicate S.δ holds in the state σ, and Q holds for initial state δ and final state γ if the predicate δ holds in the state γ (note that the intermediate state space has the type of predicates over the final state space).

Then, for an arbitrary predicate q and arbitrary state σ0 we have

({P} ;[Q]).q.σ0 ≡{definitions}

• • (∃δ S.δ.σ0 ∧ (∀γ δ.γ ⇒ q.γ )) ≡{definitions}

• (∃δ S.δ.σ0 ∧ δ ⊆ q) ≡{mutual implication}

• • (∃δ S.δ.σ0 ∧ δ ⊆ q) ⇒{S is monotonic, context says δ ⊆ q} • (∃δ S.q.σ0 ∧ δ ⊆ q) ⇒{∧elimination} • (∃δ S.q.σ0) ⇒{quantifier rule} S.q.σ0

• • (∃δ S.δ.σ0 ∧ δ ⊆ q) ⇐{∃introduction; witness δ := q} S.q.σ0 ∧ q ⊆ q ≡{reflexivity} S.q.σ0 • S.q.σ0 which shows that S ={P} ;[Q]. 2

This result has the following important corollary. Corollary 13.11 Any term that denotes a monotonic predicate transformer can be expressed as a statement term. In other words, if S is a term that denotes a monotonic predicate transformer, then there is a statement term S such that S = S. This follows directly from the theorem above by choosing S ={P} ;[Q], where P and Q are terms that are constructed from S in the way shown in the proof of the theorem. Note that this result does not mean that every monotonic predicate transformer can be expressed as a statement. That this is impossible is shown by a simple cardinality 222 13. Statements

argument. There are uncountably many monotonic predicate transformers (because the type Ind is infinite), but there can only be countably many terms that denote monotonic predicate transformers, because there is only a countable number of terms altogether in higher-order logic. What the corollary says is that if a monotonic predicate transformer can be described by a term in higher-order logic, then it can also be described as a statement. The decomposition used in the proof of Theorem 13.10 is interesting in itself. Intuitively, it splits execution of S into two parts. The intermediate state space is the predicate space over , so its elements correspond to subsets of . Given initial state σ, the angelic update {P} chooses a predicate δ such that S is guaranteed to establish δ. Then the demonic update [Q] chooses an arbitrary state γ in δ. This kind of two-stage update forms a very simple game between the angel and the demon, with one move for each. The normal form theorem shows that if we wanted to be economical, we would need to use only {R},[R], and sequential composition to describe all monotonic predicate transformers. The other constructs would be definable in terms of these. However, we have introduced a larger number of constructs because they corre- spond to important abstractions that we meet over and over again in the subsequent development. We have shown earlier that the basic statements can all be expressed as angelic or demonic update:  f ={|f |} = [| f |], {p}={|p|} and [p] = [|p|]. In reasoning about statements, we usually take advantage of this fact. We may thus without loss of generality assume that a statement S is built as follows:

• • S ::={Q}|[Q] | S1 ; S2 | ( i ∈ I Si ) | ( i ∈ I Si ).

The use of the arbitrary meet and join in a syntax definition needs a little bit of • explanation. An indexed meet ( i ∈ I Si ) can always be written more explicitly as

• ( i ∈ I Si ) =(im.S.I ), where S is a function that assigns to each index i a statement S.i, and I is some subset of all possible indices. A similar argument holds for indexed joins. As before, the statement syntax does not define a new syntactic category, but merely serves to identify a specific subclass of predicate transformers.

Shunting Rules The duality between assert and guard predicate transformers gives rise to the following useful shunting rules.

Lemma 13.12 Let p be an arbitrary predicate and S and S arbitrary monotonic predicate transformers. Then 13.4. Derived Statements 223

(a) {p} ; S  S ≡ S  [p];S .

(b) S  S ; {p}≡S ;[p]  S .

Proof Part (a) follows directly from the definitions and from the shunting rule for predicates (i.e., p ∩ q ⊆ r ≡ p ⊆¬q ∪ r). For part (b), we have

 S  S ; {p} ≡{definition}  (∀q • S.q ⊆ S .(p ∩ q)) ⇒{specialization q :=¬p ∪ q}  S.(¬ p ∪ q) ⊆ S .(p ∩ (¬ p ∪ q)) ≡{lattice properties of predicates}  S.(¬ p ∪ q) ⊆ S .(p ∩ q) ⇒{S monotonic}  S.(¬ p ∪ q) ⊆ S .q ≡{definition of guard}  (S ;[p]).q ⊆ S .q

so the forward implication half of (b) follows by generalization. The reverse implication is proved in the same way. 2

13.4 Derived Statements

The statements that we have introduced above are sufficient to express all mono- tonic predicate transformer terms, as we just have shown. However, the notation is quite terse and somewhat removed from traditional programming language con- structs. We therefore introduce a number of derived statements, i.e., statements that really are abbreviations for other statements but that have a useful operational interpretation. We have seen how abort, chaos, skip, choose, and magic can be ex- pressed in terms of basic predicate transformers, so we can view them as derived statements. We can think of them merely as abbreviations that can be removed whenever needed.

Let us next consider two other important derived statements: conditional statements and blocks with local variables. We have met these constructs before, as state transformers and state relations. Now we show how to define these as predicate transformers. Other important derived statements, such as recursion and iteration, will be introduced later on, in Chapters 20 and 21. 224 13. Statements

Conditional Statements As we already showed in the introduction, we can define the deterministic condi- tional statement by ∧ if g then S1 else S2 fi = [g];S1  [¬ g];S2 . (conditional)

From the definition it follows that this determines the following predicate trans- former:

(if g then S1 else S2 fi).q = (¬ g ∪ S1.q) ∩ (g ∪ S2.q) or, equivalently,

(if g then S1 else S2 fi).q = (g ∩ S1.q) ∪ (¬ g ∩ S2.q).

Intuitively, if g then S1 else S2 fi is a deterministic choice. If g holds, then S1 is exe- cuted; otherwise S2 is executed. Thus the meet in the definition does not introduce any real nondeterminism. In fact, from the above it follows that the deterministic conditional can also be written using asserts and a join:

if g then S1 else S2 fi ={g} ; S1 {¬g} ; S2 .

We can also define the nondeterministic guarded conditional statement in terms of the more basic constructs:

if g1 → S1 [] ··· [] gm → Sm fi =

{g1 ∨···∨gm } ; ([g1];S1 ···[gm ];Sm ). The assert statement tests that at least one of the guards is true in the state. If not, then we have abortion; otherwise one of the alternatives for which the guard is true is chosen demonically and executed. In contrast with the deterministic conditional, this statement may be genuinely (demonically) nondeterministic. This happens when two or more guards are satisfied in a given state, and the effect of executing the corresponding statements is different. Because we have an explicit definition of the conditional statement, we can de- rive its homomorphism properties from the corresponding properties of its basic constituents. For instance, the deterministic conditional constructor is (jointly) monotonic in its predicate transformer arguments:   ∧   ⇒    . S1 S1 S2 S2 if g then S1 else S2 fi if g then S1 else S2 fi The deterministic conditional is (jointly) bottom and top homomorphic in its pred- icate transformer arguments. This means that if S1 = abort and S2 = abort, then the conditional is also abort, and similarly for magic. However, the conditional is neither jointly meet homomorphic nor jointly join homomorphic in its predicate transformer arguments. Furthermore, it is neither monotonic nor homomorphic in its predicate argument. Choosing the deterministic conditional as a basic predicate 13.4. Derived Statements 225

transformer constructor would therefore not be very wise algebraically, because there are so few homomorphism properties that hold in general. The properties for the guarded conditional statement are essentially the same as for the deterministic conditional statement. In other words, the guarded conditional statement is monotonic in its statement arguments and jointly top and bottom homo- morphic in all its statement arguments, but no other homomorphism properties hold in general. The monotonicity and homomorphism properties of the conditionals are summarized in Table 13.3.

Blocks The block construct for state transformers allows us to model a situation where the state space is temporarily extended. In exactly the same way, we can define a block construct for predicate transformers that permits us to embed statements on a “larger” state space into a statement on a “smaller” state space. The block construct for predicate transformers is defined in analogy with the state transformer block:

begin ; S ; end . (block statement) Given state extension pair begin :  →  and end :  →  and statement S :  %→  (the body of the block), the block has functionality  %→ . Intuitively, the block maps state space  into  and then executes S before moving back to the original state space. In practice, the block construct is used with program variable notation, under the same assumptions as for state transformer blocks (Chapter 5). Thus,

∧ begin var y := e ; S end =begin ; y := e ◦ end ; S ; end , and similarly for a block with multiple local variables. Let us compute the predicate transformer that the block defines.

Lemma 13.13 Assume that begin var y := e ; S end is a block statement where the expression e does not mention variable y and the body S is written using only program variable notation, and that q is a boolean expression over the global variables. Then

(begin var y := e ; S end).q = (S.q)[y := e] .

Proof We have (begin var y := e ; S end).q ={definitions of block and sequential composition} begin .(y := e ◦ end .(S.(end .q))) 226 13. Statements

≡{assignment axiom, e does not mention y} begin .(S.(end .q))[y := e] The fact that S is using only program variable notation guarantees that the assign- ment axiom can be applied in the second step of the derivation. We leave it to the reader to show that end .q = q. Similarly, we leave it to the reader to show that begin .q = q if q does not mention the local variable. 2

The block constructor (λS • begin var y := e ; S end) is very regular, as it satisfies all basic homomorphism properties that can be expected of statements. These properties are summarized in Table 13.3. Theorem 13.14 The block constructor is monotonic:

  S  S ⇒ begin var y := e ; S end  begin var y := e ; S end . Furthermore, the block constructor is ⊥-, -, - and -homomorphic, and it preserves skip. Proof The proof is left to the reader as an exercise (Exercise 13.12). 2

As an example, consider the following statement:

begin var z := x ; x := y ; y := z end , where the global variables are x and y. This is the swapping example of Chapter 5 lifted to the level of statements. In fact, it is possible to prove (Exercise 13.13) the following homomorphism property for blocks:

begin var y := e ; f end =begin var y := e ;  f end . Thus we immediately have

begin var z := x ; x := y ; y := z end =x, y := y, x by the example in Section 5.6.

Blocks with Nondeterministic Initialization We generalize the block statement by allowing a (demonically) nondeterministic initialization of the local variables. We define

is  ⊥ ¬ 1; (λS, T • if g then S else T fi) yes yes yes no no no yes no • (λS1,...,Sn if g1 → S1 [] ··· [] gn → Sn fi) yes yes yes no no no yes no (λS • begin var y := e ; S end) yes yes yes yes yes no yes no

TABLE 13.3. Homomorphic properties of derived statements 13.5. Procedures 227

∧   begin var y | b ; S end =begin ;[y := y | b[y := y ]] ; S ; end , where b is a boolean expression that can mention both global and local variables. A derivation like the one in the proof of Lemma 13.13 shows which predicate transformer this block is:

(begin var y | b ; S end).q = (∀y • b ⇒ S.q), where q is a boolean expression in which y does not occur free. Exactly as for state transformer blocks, we can also permit implicit initialization conventions. Thus,

begin var y ; S end can by convention mean a totally nondeterministic initialization (with initial- ization predicate true). Alternatively, it can mean a standard initialization, e.g., initialization to 0 for a variable of type Nat.

13.5 Procedures

The notion of procedures can be lifted directly from the state transformer level (Section 5.6) to the predicate transformer level. Thus, a definition of the form

proc N(var x, val y) = S ∧ is syntactic sugaring for the definition N.x.y = S. Furthermore, a procedure call of the form N(z, e), where z is a variable and e an expression, is to be interpreted as follows:

∧   N(z, e) = begin var y := e ; N.z.y end .

As an example, consider the following procedure, which stores the smaller of the values m and n in the variable x:

proc Min(var x, val m, n) = if m ≤ n → x := m [] m ≥ n → x := n fi . A typical call to this procedure could look as follows, where y and z are program variables:

Min(y, z + 1, y). 228 13. Statements

According to the definition, this call really stands for begin var m, n := z + 1, y ; if m ≤ n → y := m [] m ≥ n → y := n fi end

Local procedures on the statement level are handled exactly as on the state trans- former level. We define a statement with local procedure as follows:

proc N(var x, val y) = S in (local procedure) ··· N(u, e) ··· =∧ let N = (λxy• S) in   ···begin var y := e ; N.u.y end ··· Procedures and calls to them can be introduced and eliminated on the statement level in exactly the same way as for state transformers. The following example illustrates the use of a local procedure: proc Min(var x, val m, n) = if m ≤ n → x := m [] m ≥ n → x := n fi in Min(y, y, z) ; Min(x, x, y).

13.6 Summary and Discussion

In this chapter we have shown how contract statements can be interpreted as mono- tonic predicate transformers. The monotonic predicate transformers form a sublat- tice and subcategory of predicate transformers, so we can reason about monotonic predicate transformers in the larger context of all predicate transformers. We have defined predicate transformer statements as predicate transformer terms that are built in terms of the basic predicate transformers and three algebraic composition operators: sequential composition, meet, and join. We have shown that any such statement is a monotonic predicate transformer. Conversely, we have also shown that predicate transformer statements can be taken as a normal form for monotonic predicate transformers: any monotonic predicate transformer that can be expressed as a higher-order logic term can also be expressed as such a statement. We also de- fined three standard programming constructs (conditional statements, blocks with local variables, and procedures) in terms of more basic constructs. We use a two-phase approach to defining the language of contracts, specifications, and programs. First, we introduce basic predicate transformers and predicate trans- former constructors, which have been chosen so that they have simple and regular 13.6. Summary and Discussion 229 algebraic properties. Then we define more traditional constructs in terms of these. This approach allows us to derive the algebraic properties of the traditional con- structs in a simple way, and also shows more clearly the underlying mathematical notions. The (less attractive) alternative would be to postulate the more traditional program constructs such as conditionals and blocks directly as predicate transform- ers, and then derive their monotonicity and homomorphism properties directly from these definitions. Hesselink develops a similar but more restricted algebra in which statements are conjunctive predicate transformers (i.e., predicate transformers that preserve nonempty meets). He takes assert, guard, meet, and sequential composition as ba- sic (and later includes a join operator also), additionally assuming a collection of procedure names [80, 81, 82]. We use the same constructs for both contract statements and predicate transformer statements. This allows us to identify these two whenever convenient and rea- son about contract statements as if they were predicate transformers. This way of identifying contract statements with their associated predicate transformers is an important device, because it allows a purely algebraic treatment of contract, speci- fication, and program statements, where the specific syntax chosen for expressing statements is irrelevant. Another consequence of this is that we do not need to make the distinction between syntax, semantics, and proof theory that is traditional in programming logics. The syntax for predicate transformer statements is just the syntax of higher-order logic, with a little bit of syntactic sugaring here and there. The semantics is the semantics of higher-order logic. This is a simple set-theoretic semantics, as we have shown. The special properties that these statements have are a consequence of the operations that we have introduced by explicit constant definitions in higher- order logic. The proof theory that we use is also the proof theory of higher-order logic. 
The refinement calculus is a theory within higher-order logic, and as such, it uses exactly the same set of basic axioms and inference rules as any other theory in higher-order logic. What is new in the refinement calculus is expressed as theorems and derived rules of inference, which are proved (not postulated) in higher-order logic. A consequence of choosing higher-order logic as the underlying logic for the refinement calculus is that the calculus is consistent by construction. This is because higher-order logic has been shown to be consistent, and the refinement calculus is a conservative extension of this logic. The theorems are also sound in the standard semantics of higher-order logic, where statements are interpreted as higher-order functions, because the basic axioms and inference rules of higher-order logic are sound. Of course, this alone does not prove that the theorems are also correct in the sense that they are true in an operational interpretation of contract statements, which 230 13. Statements

explains a contract in terms of the possible behaviors that it permits for the agents involved. That this is also the case will be demonstrated in the next chapter, where the operational semantics of contract statements is defined. This is also the reason why we need to distinguish between contract statements and predicate transformer statements. The predicate transformer interpretation, al- though extremely useful for reasoning about properties of contracts, is still just one possible interpretation of contracts. The operational interpretation gives another semantics for contracts. A third interpretation of contract statements will also be presented, where we show how to interpret contracts as special kinds of functions on the initial state. A notation where every monotonic predicate transformer term is expressible as a statement was originally introduced by Back and von Wright [26, 27], with an original normal form theorem. The normal form theorem presented here was proved by von Wright [143]. The guarded conditional was originally introduced by Dijkstra [53]. Blocks and procedures have been handled formally by Hoare [85] and Gries and Levin [65], among others.

13.7 Exercises

13.1 Prove that if  and  are partially ordered, then f monotonic ≡ • • (∀g1 :  →  ∀g2 :  →  g1  g2 ⇒ f ; g1  f ; g2) for an arbitrary function f :  → . 13.2 Show that if S is a monotonic predicate transformer, then the following subdis- tributivity property holds:

S1 ; (S2  S3)  S1 ; S2  S1 ; S3 . 13.3 Fill in the missing parts of the proof of Lemma 13.8. 13.4 Prove Lemma 13.7. 13.5 By the duality principle, there is a dual normal form [P];{Q} in addition to the normal form {P} ;[Q] for monotonic statements given in this chapter. Derive this normal form and interpret it intuitively. 13.6 Show that if S is a monotonic predicate transformer, then the following properties hold for arbitrary predicates p and q: S.(p ∩ q) ⊆ S.p ∩ S.q , S.p ∪ S.q ⊆ S.(p ∪ q). 13.7 Show that the following holds for arbitrary state relation R: 13.7. Exercises 231

[R] = [λσ • ∃γ • R.σ.γ ];{f |(∀σ • R.σ =∅⇒R.σ.( f.σ ))} . Interpret this equality intuitively. 13.8 Show that the deterministic conditional is monotonic with respect to substatement replacement:   ∧   ⇒ S1 S1 S2 S2    . if g then S1 else S2 fi if g then S1 else S2 fi 13.9 Prove the following rule:

if g then S1  S2 else S fi =

if g then S1 else S fi  if g then S2 else S fi . 13.10 Recall the definitions of the chaos and choose statements: ∧ ∧ chaos = [Tr ue] , choose ={Tr ue} . Show that the following refinements hold for an arbitrary monotonic predicate transformer S: S ; chaos  chaos ; S , abort ; S  S ; abort , S ; skip  skip ; S , skip ; S  S ; skip , S ; magic  magic ; S , choose ; S  S ; choose . 13.11 Find instantiations for the monotonic predicate transformer S to show the following: S ; abort  abort ; S , chaos ; S  S ; chaos , S ; choose  choose ; S , magic ; S  S ; magic . 13.12 Prove that the block construct begin var y := e ; S end is ⊥-, -, -, and - homomorphic in its statement argument S. 13.13 Prove the following rule:

begin var y := e ; f end =begin var y := e ;  f end . This is page 232 Printer: Opaque this This is page 233 Printer: Opaque this 14

Statements as Games

In the previous chapter we showed that contract statements can be interpreted as (monotonic) predicate transformers. But predicate transformers are pure mathe- matical entities, functions from predicates (sets) to predicates, and as such have no direct computational interpretation. The way in which statements are assumed to be executed as programs has only been explained informally. Because both angelic and demonic nondeterminism is involved in statement execution, it is not obvious how statement execution should be defined formally. In this chapter, we therefore proceed to give an operational interpretation of contract statements as games.We show that the predicate transformer semantics is an abstraction of the game seman- tics, in the sense that we can compute the former from the latter. Both correctness of statements and refinement can be explained operationally in terms of winning strategies in games.

14.1 Game Interpretation

We will assume that contract statements have the following syntax:

• • S ::={Q}|[Q] | S1 ; S2 | ( i ∈ I Si ) | ( i ∈ I Si ). Here Q is a relation term and I a set term of higher-order logic. We assume that there is a rule that associates a contract statement Si with each index i ∈ I in a • meet ( i ∈ I Si ) (and similarly for a join). We will interpret contract statements in terms of a game that is played between two opponents, called the angel and the demon. The game semantics describes how a contract statement S encodes the rules of a game. For a play (or execution) of the 234 14. Statements as Games

game (statement) S, we also need to give the initial position (initial state) σ ∈  of the game and a goal (a postcondition) q ⊆  that describes the set of final states that are winning positions. For a given postcondition q, the angel tries to reach a final state that satisfies q starting from the initial state σ . The demon tries to prevent this. Both follow the rules of the game, as described by the statement S. The statement determines the turns of the game, as well as the moves that are permitted for each player. The game proceeds by a sequence of moves, where each move either leads to a new state or forces the player that is to make the move to quit (and hence lose the game). Each move is carried out by either the angel or the demon. There are two basic moves, described by the demonic and the angelic update statements. The angelic update statement {Q} is executed by the angel in state σ by choosing some state γ such that Q.σ.γ holds. If no such state exists, then the angel cannot carry out this move and quits (loses). The demonic update statement [Q] is similar, except that it is carried out by the demon. It chooses some state γ such that Q.σ.γ holds. If no such state exists, then the demon cannot carry out the move and quits.

The sequential composition S1 ; S2 is executed in initial state σ by first playing the game S1 in initial state σ. If this game leads to a final state γ , then the game S2 is played from this state. The final state of this game is the final state of the whole game S1 ; S2.

• A demonic choice ( i ∈ I Si ) is executed in initial state σ by the demon choosing some game Si , i ∈ I , to be played in initial state σ .IfI =∅, then there is no game that can be played in initial state σ , and the demon has to quit. The angelic choice • ( i ∈ I Si ) is executed similarly, but by the angel. Skip, abort, and magic statements, as well as guards, asserts, and functional up- dates, can all be seen as special cases of demonic and angelic updates. Thus, we do not need to define the meaning of these constructs in the operational semantics. Similarly, the operational semantics also explains the execution of derived state- ments (like the conditional statement) that are built out of the constructs, for which we give an explicit interpretation.

Operational Semantics We define the game interpretation of statements using structured operational se- mantics. The basic notion in such a semantics is that of a transition relation → between pairs (configurations) (S,σ)where S is a contract statement and σ a state. The relation (S,σ)→ (S,σ) holds (and we say that a transition from configura- tion (S,σ)to configuration (S,σ) is possible) if one execution step of statement S in state σ can lead to state σ , with statement S remaining to be executed. Ex- 14.1. Game Interpretation 235 ecution terminates when a configuration (, σ ) is reached, where  is the empty statement symbol. In our operational semantics, a configuration is a pair (S,σ), where

• S is either an ordinary statement in  %→  or the empty statement symbol , and • σ is either a state in , the special symbol  standing for success, or the special symbol ⊥ standing for failure.

We take the point of view of the angel with regard to the notions of failure and success. Thus, success means that the angel has won while failure means that the angel has lost. From the demon’s point of view, success is something to avoid. The transition relation (i.e., the permitted moves) is inductively defined by a col- lection of axioms and inference rules. The relation → is defined to be the smallest relation that satisfies the following:

• Angelic update .σ.γ .σ =∅ R , R , ({R},σ)→ (, γ ) ({R},σ)→ (, ⊥)

, ; ({R}, ⊥) → (, ⊥) ({R}, ) → (, ) • Demonic update .σ.γ .σ =∅ R , R , ([R],σ)→ (, γ ) ([R],σ)→ (, )

, ; ([R], ⊥) → (, ⊥) ([R], ) → (, ) • Sequential composition ( ,σ)→ (  ,γ),  =  ( ,σ)→ (, γ ) S1 S1 S1 , S1 ( ,σ)→ (  ,γ) ( ,σ)→ ( ,γ) ; S1 ; S2 S1 ; S2 S1 ; S2 S2 • Angelic choice ∈ =∅ k I , I • • ; (( i ∈ I Si ), σ ) → (Sk ,σ) (( i ∈ I Si ), σ ) → (, ⊥) • Demonic choice ∈ =∅ k I , I . • • (( i ∈ I Si ), σ ) → (Sk ,σ) (( i ∈ I Si ), σ ) → (, )

These axioms and inference rules formalize the informal description given above of the moves in the game. 236 14. Statements as Games

Classification of Configurations Since there are no rules for empty configurations of the form (, σ ), moves are possible only in configurations (S,σ)where S is a proper statement. Furthermore, by inspecting the rules, we notice that for any given configuration (S,σ), at most one of the axioms and inference rules is applicable. If S is an angelic update or an angelic choice (join) of statements, then only the angel can make a move. In this case we say that the configuration (S,σ)is angelic. Dually, if S is a demonic update or a demonic choice of statements, then only the demon can make a move, and the configuration is called demonic. Finally, if S is a sequential composition, S = S1 ; S2, then the possible transitions are given by the possible transitions of S1. In this case, we say that the configuration is angelic if (S1,σ) is angelic and demonic if (S1,σ)is demonic. This means that every configuration is either empty, demonic, or angelic.

Plays A play of the game S in initial state σ is a sequence of configurations

C0 → C1 → C2 →··· where

(i) C0 = (S,σ),

(ii) each move Ci → Ci+1 is permitted by the axiomatization above, and

(iii) if the sequence is finite with last configuration Cn, then Cn = (, γ ), for some γ ∈  ∪{, ⊥}.

Intuitively, a play shows us, step by step, what choices the demon and the angel have made in a sequence of moves. A finite play cannot be extended, since no moves are possible from an empty configuration (in fact, infinite plays are never possible; see Lemma 14.1 below).

The game interpretation of contract statement S, denoted by gm.S, is a function that maps each initial state σ to the set of plays of S in initial state σ . The game interpretation of S thus describes for each initial state σ the possible behaviors of the angel and the demon when following contract S. For a given goal q, we say that the angel wins a play of a game (or the play is a win) if the final configuration (, σ ) satisfies σ ∈ q ∪ {}.

Example Game Let us illustrate the concepts introduced above with the following simple game. A   state of the game is given by an integer x. Let inc.x.x ≡ (0 ≤ x < x ≤ 2) and 14.1. Game Interpretation 237

  dec.x.x ≡ (0 ≥ x > x ≥−2), where x ranges over the integers. The rules of the game are given by the statement

S = [inc];({dec} [inc])  ({dec} [inc]) ; {dec} . We consider a play where the initial state is x = 0 and the goal (postcondition) q is x =−2 ∨ x = 2. The following is a possible sequence of moves: ([inc];({dec} [inc])  ({dec} [inc]) ; {dec}, 0) →{demon chooses first alternative} ([inc];({dec} [inc]), 0) →{demon increases state value; inc.0.1 holds} ({dec} [inc], 1) →{angel chooses second alternative} ([inc], 1) →{demon increases state value; inc.1.2 holds} (, 2) Note that the initial configuration is demonic. After the demon chooses the first substatement of the meet, the second configuration is also demonic: a sequential composition with a leading demonic update. After the demon has executed the update, the result is an angelic configuration, and the angel chooses the second alternative, which again gives a demonic configuration. Finally, the demon executes this update, and the empty configuration is reached. Another possible sequence of moves is the following: ([inc];({dec} [inc])  ({dec} [inc]) ; {dec}, 0) →{demon chooses second alternative} (({dec} [inc]) ; {dec}, 0) →{angel chooses first component in leading statement} ({dec} ; {dec}, 0) →{angel decreases state value; dec.0.(−1) holds} ({dec}, −1) →{angel decreases state value; dec.(−1).(−2) holds} (, −2) In both these cases the angel wins the play, since the state component of the final (empty) configuration is in both cases in the set q ={−2, 2}. 238 14. Statements as Games

Finally, the following is a play where the angel loses: ([inc];({dec} [inc])  ({dec} [inc]) ; {dec}, 0) →{demon chooses second alternative} (({dec} [inc]) ; {dec}, 0) →{angel chooses second component in leading statement} ([inc];{dec}, 0) →{demon increases state value; inc.0.1 holds} ({dec}, 1) →{angel has no legal move; dec.1 =∅} (, ⊥)

Figure 14.1 shows all possible sequences of moves of the game described above when the initial state is x = 0 (in this figure, we have decorated the transition arrows with A and D to indicate whether the angel or the demon makes the move). As this example shows, a given initial configuration can permit a variety of plays, with very diverse outcomes (final configurations). In accordance with our game- theoretic point of view, we are interested in what sets of outcomes the angel can force the game to reach, regardless of how the demon makes its moves. We shall consider this in the next section, but first we show that all plays are necessarily finite.

0 D D

00 D D AA

1 2 00 AA AA A A DD

1122-2 -1 1 2

ADADAAAA

⊥ 2 ⊥ T ⊥ -2 ⊥ ⊥

FIGURE 14.1. All sequences of moves from initial state 0 14.2. Winning Strategies 239

Termination A game always ends with either the demon or the angel winning. This is because all plays of a game S are finite. Intuitively, we can justify this by noting that every move decreases the size of the game that remains to be played. The formal argument requires us to be explicit about what we mean by the size of a game. This again requires us to use ordinals. Readers who are not familiar with these may at this point choose to first read Chapter 18 on well-founded sets and ordinals, before going through the proof of the lemma. Lemma 14.1 All plays of a game are finite. Proof Define the following inductive ordinal measure on statement terms (includ- ing the empty statement):

M. = 0 , M.{R}=1 , M.[R] = 1 , M.(S1 ; S2) = M.S2 + M.S1 , • • M.( i ∈ I Si ) = (lim i ∈ I M.Si ) + 1 , • • M.( i ∈ I Si ) = (lim i ∈ I M.Si ) + 1 . This measure is well-defined, since I is always a set (this guarantees that the measure of every statement is an ordinal). By induction on the structure of the statement S, one can show that any move according to the transition relation → decreases this measure in the statement part of a given configuration (S,σ). Since the ordinals are well-founded, this guarantees that no configuration can permit more than a finite number of moves. 2

Even though every play is finite, it is possible to devise games in which there is no upper bound on the number of moves needed to play the game. The following example shows this. Let R be an arbitrary relation and define statements Sn for each natural number n by S0 = [R] and Sn+1 = [R];Sn. Then consider the statement • ( n ∈ Nat Sn) in any initial state σ . If we are given some purported bound m on the number of moves, then the angel can start by making the move

• (( n ∈ Nat Sn), σ ) → (Sm ,σ) . After this, the demon makes exactly m + 1 moves, so the total number of moves will be greater than m.

14.2 Winning Strategies

We shall now use the operational semantics to define what a strategy is and what it means for the angel to have a winning strategy for game S in initial state σ with 240 14. Statements as Games

respect to goal (postcondition) q. The idea is that a strategy (for the angel) tells the angel what to do in each angelic configuration and that a strategy is winning if the angel can win a game by always moving according to the strategy. A strategy (for the angel) is a function f that maps configurations to configurations in such a way that C → f.C holds for all angelic configurations C. Thus, given an arbitrary angelic configuration C, a strategy f determines a transition (a move for the angel) C → f.C. Now assume that we are given statement S and initial state σ . A strategy f admits a play C0 → C1 →···→Cn if C0 = (S,σ) and if Ci+1 = f.Ci whenever configuration Ci is angelic. The strategy f is a winning strategy for game S in initial state σ with respect to goal q if for every play (S,σ) → ··· → (, γ ) admitted by f , the condition γ ∈ q ∪ {} holds. Thus, f is a winning strategy if the angel is guaranteed to win when it makes its moves according to f . Obviously, we can define the notion of a strategy for the demon dually. However, since the notion of winning is connected to the angel, we concentrate on strategies for the angel.

The Winning Strategy Theorem We shall now show that the existence of winning strategies is closely related to the predicate transformer semantics. In fact, we show that there is a winning strategy for game S in initial state σ with respect to postcondition q if and only if wp.S.q.σ holds. Since the notion of winning strategy was derived from the operational se- mantics, this shows how the predicate transformer semantics can be derived from the operational semantics. Theorem 14.2 Assume that contract statement S, state σ , and predicate q are given. Then there is a winning strategy for game S in initial state σ with respect to postcondition q if and only if wp.S.q.σ holds. Proof The proof is by induction over the structure of the statement S. In the proof, we need a few additional notions about strategies. To simplify the reasoning, we assume that strategies can be “undefined” for configurations that we are not interested in (this undefinedness can be accomplished by augmenting the collection of all configurations with an extra “undefined” element). The domain Dom. f of a strategy f is the set of configurations for which f is defined. We further define the domain of interest for a strategy f with respect to configuration (S,σ)(written IDom( f, S,σ)) to be the set of all angelic configurations in all plays of (S,σ) admitted by f . Obviously, f ’s restriction to IDom( f, S,σ) determines whether f is a winning strategy or not, since configurations outside IDom( f, S,σ)cannot occur while the game is being played. 14.2. Winning Strategies 241

We now consider the five cases of the inductive proof. Cases 1 and 2 are the base cases (the two relational updates), while cases 3, 4, and 5 are step cases (meet, join, and sequential composition). (i) For the demonic update [R], the domain of interest of any winning strategy is empty, since the angel never makes any moves. Thus, winning strategy exists for [R]inσ with respect to q ≡{definitions} for every transition ([R],σ)→ (, γ ) we must have γ ∈ q ∪ {} ≡{operational semantics} R.σ ⊆ q ≡{definition of update} wp.[R].q.σ

(ii) For the angelic update {R}, the reasoning is similar. In this case, the domain of interest contains the single configuration ({R},σ), and a winning strategy is one that maps this configuration to some (, γ ) where γ ∈ q ∪ {}. Thus, winning strategy exists for {R} in σ with respect to q ≡{definitions} there is a transition ({R},σ)→ (, γ ) such that γ ∈ q ∪ {} ≡{operational semantics} R.σ ∩ q =∅ ≡{definition of update} wp.{R}.q.σ

(iii) Next, we consider meets. For this part of the proof, we need to define a merge of strategies. Assume that a set of strategies { fi | i ∈ I } is given. A strategy f is a merge of this set if the following holds for all configurations (S,σ):

• (∃i ∈ I (S,σ)∈ Dom. fi ) ⇒ • (∃i ∈ I (S,σ)∈ Dom. fi ∧ f.(S,σ)= fi .(S, σ )) . In other words, we require that if a configuration is in the domain of one or more of the strategies fi , then f agrees with one of them. Using the axiom of choice, it is easily shown that a merge always exists for any collection of strategies. The following observation is now crucial. Assume that f is a winning strategy for game S in initial state σ with respect to goal q and that f is undefined outside   IDom( f, S,σ). Similarly, assume that f is a winning strategy for game S in initial state σ  with respect to the same goal q and that f  is undefined outside     IDom( f , S ,σ ). Then any merge of f and f is also a winning strategy for both S in σ and S in σ  with respect to q. This is true, because if f and f  disagree on some 242 14. Statements as Games

configuration that is in the domain of interest of both ( f, S,σ) and ( f , S,σ), then both strategies must lead to a win from this configuration. The induction assumption for the meet is now that for all i ∈ I there is a winning strategy for game Si in initial state σ with respect to postcondition q if and only if wp.Si .q.σ holds. Then we have

• f is winning strategy for ( i ∈ I Si ) in σ with respect to q ⇒{operational semantics, definition of winning strategy}

f is winning strategy for each Si in σ with respect to q ≡{induction assumption}

• (∀i ∈ I wp.Si .q.σ ) ≡{definition of meet}

• wp.( i ∈ I Si ).q.σ

• so the existence of a winning strategy for the meet implies wp.( i ∈ I Si ).q.σ . For the reverse implication, the induction assumption tells us that for each i ∈ I , σ  there exists a winning strategy fi for Si in with respect to goal q. Let fi be the result of restricting strategy fi to the domain of interest of ( fi , Si ,σ). The  observation above then guarantees that any merge of the strategies fi is a winning • strategy for ( i ∈ I Si ). (iv) Now consider join. First,

f is a winning strategy for (⊔ i ∈ I • Si) in σ with respect to q
⇒ {operational semantics, definition of winning strategy}
f is a winning strategy for Si in σ with respect to q, for some i ∈ I
≡ {induction assumption}
(∃i ∈ I • wp.Si.q.σ)
≡ {definition of join}
wp.(⊔ i ∈ I • Si).q.σ

so the existence of a winning strategy for the join implies wp.(⊔ i ∈ I • Si).q.σ. For the opposite implication here, we note that if f is a winning strategy for Si in σ with respect to q, then we can adjust f so that f.((⊔ i ∈ I • Si), σ) = (Si, σ) to get a winning strategy for (⊔ i ∈ I • Si). This works because ((⊔ i ∈ I • Si), σ) is necessarily outside IDom(f, Si, σ).

(v) Finally, we consider sequential composition. First, we note from the definition of the operational semantics that a play of the game S1 ; S2 is of the form

(S1 ; S2, σ0) → ··· → (S2, σm) → ··· → (Λ, σn) ,

and furthermore,

(a) (S1, σ0) → ··· → (Λ, σm) is a play of the game S1 in initial state σ0, and

(b) (S2, σm) → ··· → (Λ, σn) is a play of the game S2 in initial state σm.

The induction assumption is that the statement of the theorem is true for S1 and S2. Assume that f is a winning strategy for S1 ; S2 in initial state σ with respect to q and let p be the set containing all second (state) components of the final  configurations of plays of (S1,σ) admitted by f . Then define a strategy f by   f .(S,γ)= f.(S ; S2,γ)so that f is a winning strategy for S1 in σ with respect to p. The induction assumption then tells us that wp.S1.p.σ holds. Furthermore, we know that f is a winning strategy for S2 in any initial state γ ∈ p with respect • to q, so the induction assumption tells us that (∀γ ∈ p wp.S2.q.γ ). Then

• wp.S1.p.σ ∧ (∀γ ∈ p wp.S2.q.γ ) ≡{definition of bounded quantification}

• wp.S1.p.σ ∧ (∀γ p.γ ⇒ wp.S2.q.γ ) ≡{pointwise extension}

wp.S1.p.σ ∧ p ⊆ wp.S2.q

⇒{monotonicity of wp.S1}

wp.S1.(wp.S2.q).σ ⇒{definition of sequential composition}

wp.(S1 ; S2).q.σ

For the opposite direction, we assume that wp.(S1 ; S2).q.σ . Again, this means that we have, with p = wp.S2.q,

• wp.S1.p.σ ∧ (∀γ ∈ p wp.S2.q.γ ) .

The induction assumption then tells us that there is a winning strategy f for S1 in initial state σ with respect to p and that for every γ ∈ p there is a winning  strategy gγ for S2 in initial state γ with respect to q.Nowdefine f such that  f .(S ; S2,δ) = (T ; S2,γ) whenever f.(S,δ) = (T,γ) for all S and all δ, and  f is undefined everywhere else. Next let gγ be the result of restricting gγ to  IDom(gγ , S2,γ), for all γ ∈ p, and let g be a merge of f and all the strategies in  {gγ | γ ∈ p}. This makes g a winning strategy for S1 ; S2 in initial state σ with respect to q, and the proof is finished. 2

Theorem 14.2 shows that the operational semantics uniquely defines the predicate transformer semantics. Since the operational semantics contains information not only about input–output behavior, but also about intermediate states, it is obviously not possible in general to go the other way.

14.3 Correctness and Refinement

The notion of correctness can now also be explained operationally. Recall that contract statement S is correct with respect to precondition p and postcondition q, denoted by p {| S |} q,ifp ⊆ wp.S.q. The postcondition plays the role of a goal, while the precondition plays the role of permitted initial states. Correctness operationally means that the angel has a winning strategy for goal q in any initial state that satisfies predicate p. Consider again the game described by the statement

S = [inc] ; ({dec} ⊔ [inc]) ⊓ ({dec} ⊔ [inc]) ; {dec} ,

where inc.x.x' = (0 ≤ x < x' ≤ 2) and dec.x.x' = (0 ≥ x > x' ≥ −2). Figure 14.1 shows all possible sequences of moves of this game when the initial state is x = 0. Consider the goal x = −2 ∨ x = 2. From Figure 14.1 it is easy to see that the angel always can win the game with this goal: no matter what the demon does, the angel can choose his moves so that either the demon has to quit (symbolized by ⊤) or the game terminates in a final state where x = 2 or x = −2. Hence, the angel has a winning strategy for this goal in initial state 0, so x = 0 {| S |} x = −2 ∨ x = 2 holds operationally. On the other hand, one can see that the angel does not have a winning strategy for the goal x = −1 ∨ x = 1 from the same initial state 0. Let us compute the weakest precondition for S to establish the postcondition q = (x = −2 ∨ x = 2). We have that

([inc] ; ({dec} ⊔ [inc]) ⊓ ({dec} ⊔ [inc]) ; {dec}).q
= {definition of meet}
([inc] ; ({dec} ⊔ [inc])).q ∧ (({dec} ⊔ [inc]) ; {dec}).q
= {definition of sequential composition}
[inc].(({dec} ⊔ [inc]).q) ∧ ({dec} ⊔ [inc]).({dec}.q)
= {definition of join}
[inc].({dec}.q ∨ [inc].q) ∧ ({dec}.({dec}.q) ∨ [inc].({dec}.q))
= {use [inc].q = (x ≠ 0) and {dec}.q = (x = 0 ∨ x = −1)}
[inc].(x = 0 ∨ x = −1 ∨ x ≠ 0) ∧ ({dec}.(x = 0 ∨ x = −1) ∨ [inc].(x = 0 ∨ x = −1))
= {simplify}
[inc].T ∧ ({dec}.(x = 0 ∨ x = −1) ∨ [inc].(x = 0 ∨ x = −1))
= {definitions of updates}
T ∧ (x = 0 ∨ F)

={simplify} x = 0

Thus, the angel has a winning strategy only for x = 0.
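The two update values cited in the derivation above can be checked mechanically once x is restricted to a finite range. The following Python sketch is only an illustration (the encoding of a relation as a function from a state to its set of successors, and the function names, are ours, not the book's); it evaluates wp.[R].q and wp.{R}.q directly from the conditions R.σ ⊆ q and R.σ ∩ q ≠ ∅.

# Illustrative sketch: states are integers, a relation R is encoded as a
# function from a state to the set of its possible successor states, and a
# predicate q is a set of states.

def wp_demonic(R, q):
    # wp.[R].q.sigma: the demon picks the successor, so R.sigma must lie
    # inside q; if R.sigma is empty the demon must quit and the angel wins
    return lambda sigma: R(sigma) <= q

def wp_angelic(R, q):
    # wp.{R}.q.sigma: the angel picks the successor, so R.sigma must meet q
    return lambda sigma: bool(R(sigma) & q)

# the relations inc and dec of the running example above
inc = lambda x: {y for y in range(-2, 3) if 0 <= x < y <= 2}
dec = lambda x: {y for y in range(-2, 3) if 0 >= x > y >= -2}

q = {-2, 2}
print([s for s in range(-2, 3) if wp_demonic(inc, q)(s)])  # [-2, -1, 1, 2]: (x ≠ 0) on this range
print([s for s in range(-2, 3) if wp_angelic(dec, q)(s)])  # [-1, 0]: (x = 0 ∨ x = -1)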

Let us then look more closely at refinement of contract statements. In the predicate transformer semantics, the refinement S ⊑ S' is valid if and only if (∀q σ • S.q.σ ⇒ S'.q.σ) holds. With the game interpretation, this means that for any choice of goal q and any choice of initial state σ, if the angel has a winning strategy in S, then it must also have a winning strategy in S'. Thus, a refinement has to increase the angel's possibilities of winning, or at least keep them the same.

The angel’s possibilities of winning can be increased in two ways, either by adding some choices for the angel or by removing some choices for the demon. Figure 14.2 shows a refinement of the previous game, where we have added two new choices for the angel and removed two choices for the demon (the latter shown as dashed lines). Note that when we remove the demon’s only choice, then the result is an empty choice for the demon, i.e., a miracle. As a result, the angel now also has a winning strategy for establishing x =−1 ∨ x = 1 in initial state 0, besides also being able to establish x =−2 ∨ x = 2 as before.

FIGURE 14.2. Refinement of game

14.4 Summary and Discussion

We have described an operational semantics for statements, based on the notion of a game that is played between an angel and a demon. Correctness has been shown to amount to the angel having a winning strategy for the game. We defined the game semantics using structured operational semantics. This semantic definition method was originally introduced in Gordon Plotkin’s lecture notes [120]. A game-theoretic interpretation of logical formulas was proposed by Jaakko Hin- tikka [83]. Disjunction and existential quantification are there resolved as moves by Player (∃loise or angel) and conjunction and universal quantification as moves by Opponent (∀belard or demon). This has been developed further by Yannis Moschovakis [109] and Peter Aczel [2]. David Harel [72] introduced the idea of developing programs using AND/OR-trees. These are quite similar to the games that we have presented here. The notion that game semantics is a suitable operational interpretation for monotonic predicate transformers was originally put forward by Back and von Wright [25]. The idea of viewing the interaction between a system and its environment as a game has also been considered by Martin Abadi, Leslie Lamport and Pierre Wolper [1], and by Moschovakis [110]. A different, operationally flavored, semantics using the notion of games is given for a language similar to ours by Hesselink [82]. There are also a number of similarities with the process algebras developed by Hoare [88] and Robin Milner [101], in particular with the notions of internal and external nondeterminism. Process algebras do not, however, consider state-based programs. The notion of total correctness is also not used there; instead, correctness is seen as a simulation relation between automata.

14.5 Exercises

14.1 Consider the game

{dim} ; [dim] ; {dim} ; [dim]

where dim = (x := x' | 1 < x ∧ x − 2 ≤ x' < x) under declaration var x : Nat. Assume an initial state where x = 6 and goal q = (x = 0). Compute two different plays for this game, one of which is a win.

14.2 Assume that the program variable x ranges over the natural numbers. Draw trees to illustrate the operational semantics of the following two statements:

(a) S1 : (x := 1 ⊔ x := 2) ⊓ (x := 1 ⊔ x := 3).

(b) S2 : x := 1 ⊔ (x := 2 ⊓ x := 3).

Although S1 and S2 denote the same predicate transformer (prove this!), the trees are different. Is it possible to define an equivalence relation on this kind of tree such that the trees representing two statements S and S' are equivalent if and only if S and S' denote the same predicate transformer?

14.3 Consider the game

[x := x' | T] ; {y := y' | T}

under declaration var x, y : Nat. Assume an arbitrary initial state and goal q = (x = y). Describe a winning strategy. Compare this problem with proving that (∀x • ∃y • x = y) is a theorem.

14.4 Consider the game S defined as

(x := 1 ⊔ x := 2) ⊓ (x := 2 ⊔ x := 3) ⊓ (x := 3 ⊔ x := 1)

under declaration var x : Nat. Assume an arbitrary initial state σ0 and goal q = (x = 2).

(a) Show σ0 {| S |} q.

(b) Describe a winning strategy.

14.5 Consider the game S defined as

(x := 1 ⊓ x := 2) ⊔ (x := 2 ⊓ x := 3) ⊔ (x := 3 ⊓ x := 1)

under declaration var x : Nat.

(a) Show that this game denotes the same predicate transformer as the game given in the preceding exercise.

(b) Assume an arbitrary initial state and goal q = (x = 2). Describe a winning strategy for S.

15 Choice Semantics

We now introduce a third semantic interpretation of statements, the choice seman- tics. Statements are here interpreted as functions that map each initial state to a set of predicates over the final state space. This means that we have altogether three different semantics for statements: an (operational) game semantics, a (backward) predicate transformer semantics, and a (forward) choice semantics. We show that these semantics are consistent with each other (we have already seen that this holds for the predicate transformer semantics and the game semantics). The predicate transformer semantics and the choice semantics are equivalent in the sense that either of them determines the other. The set of predicates that the choice semantics gives for an initial state is in fact the set of postconditions that the statement is guar- anteed to establish from this initial state. Thus both choice semantics and predicate transformer semantics are abstractions of the game semantics in the sense that they can be determined from the game semantics for a statement. The converse does not hold, so the game semantics is more detailed than either of the other two. The choice semantics can be interpreted as describing a simple two-step game, so it is also directly connected to the view of statements as games.

15.1 Forward and Backward Semantics

The choice semantics describes any contract statement S as a function that maps initial states to sets of predicates on final states, with the intuition that if initial state σ is mapped to a set containing predicate q, then our agent can satisfy S in initial state σ to establish postcondition q. Note that we think of the postcondition as

being specified before the execution, so it is perfectly acceptable that a statement S establishes two postconditions q1 and q2 that contradict each other. First we introduce an auxiliary function that is useful in discussing the choice semantics. Let S : Σ ↦ Γ be a predicate transformer. This means that S is a function with the following type: S : (Γ → Bool) → (Σ → Bool). Now let Ŝ be the function of type Ŝ : Σ → (Γ → Bool) → Bool defined by Ŝ.σ.q ≙ S.q.σ. In other words, Ŝ is almost the same function as S, except for the ordering of the arguments σ and q. Obviously, Ŝ.σ is a set of predicates, i.e., a set of sets, so the defining condition can be written equivalently as

q ∈ Ŝ.σ ≡ S.q.σ .

Thus, Ŝ is a function from initial states to sets of sets of final states. We have defined Ŝ above for an arbitrary predicate transformer S. However, statements always denote monotonic predicate transformers, so it is of interest to know what property of Ŝ corresponds to monotonicity of S.

Lemma 15.1 Let S be a monotonic predicate transformer and σ an arbitrary state. Then Ŝ.σ is upward closed; i.e.,

p ∈ Ŝ.σ ∧ p ⊆ q ⇒ q ∈ Ŝ.σ

holds for all predicates p and q.

Proof The proof is straightforward, by the definition of Ŝ. 2

The converse of Lemma 15.1 also holds, in the following sense:

Lemma 15.2 Given an upward closed family of sets of states Aσ ⊆ P(Γ) for every state σ, there is a unique monotonic predicate transformer S such that Ŝ.σ = Aσ for all σ.

Proof We construct S according to the intuition of the normal form provided by Theorem 13.10. Its operational interpretation is that given initial state σ, the angel first chooses a set γ in Aσ, and the demon then chooses one of the elements of γ as the final state. Hence, we set

S = {λσ γ • γ ∈ Aσ} ; [λγ δ • δ ∈ γ] .

The following calculation verifies that this is the desired statement:

q ∈ Ŝ.σ
≡ {definition of Ŝ and of S}
({λσ γ • γ ∈ Aσ} ; [λγ δ • δ ∈ γ]).q.σ
≡ {statement definitions}
(∃γ • γ ∈ Aσ ∧ (∀δ • δ ∈ γ ⇒ δ ∈ q))
≡ {subset inclusion is pointwise extension}
(∃γ • γ ∈ Aσ ∧ γ ⊆ q)
≡ {Aσ was assumed to be upward closed}
q ∈ Aσ

The proof of uniqueness is straightforward. 2

Choice Semantics We now define the choice semantics ch.S of a contract statement S inductively, so that ch.S is a function that maps every initial state to a set of predicates over the final state space. The contract statement S (viewed through the game semantics) is equivalent to the simple game in which the angel chooses a predicate q ∈ ch.S.σ , and the demon then chooses a final state γ ∈ q.Ifch.S.σ is empty, then the angel loses, since it cannot choose any predicate. Similarly, if the angel chooses the empty predicate false, then it wins, since the demon cannot choose any state in false.

The choice semantics is defined inductively by the following rules, for arbitrary contract statement S:

ch.{R}.σ = {q | R.σ ∩ q ≠ ∅} ,
ch.[R].σ = {q | R.σ ⊆ q} ,
ch.(S1 ; S2).σ = (∪ p ∈ ch.S1.σ • (∩ σ' ∈ p • ch.S2.σ')) ,
ch.(⊓ i ∈ I • Si).σ = (∩ i ∈ I • ch.Si.σ) ,
ch.(⊔ i ∈ I • Si).σ = (∪ i ∈ I • ch.Si.σ) .

The intuition behind the definitions for the updates, meet, and join should be clear. For sequential composition, the intuition requires some explanation. For each set p that the angel offers the demon according to S1, the demon chooses some state σ' ∈ p. Thus, the angel can guarantee only what is common to all such states (hence the intersection over σ').
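On a finite state space the five defining equations can be executed directly. The sketch below is ours (an ad hoc Python encoding in which a predicate is a frozenset of states and ch.S is a function from a state to a set of such frozensets); it implements the updates, sequential composition, meet, and join exactly as defined above.

from functools import reduce

STATES = range(-2, 3)                       # a small finite state space

def predicates():
    # all predicates over STATES, represented as frozensets of states
    xs = list(STATES)
    for mask in range(2 ** len(xs)):
        yield frozenset(x for i, x in enumerate(xs) if mask >> i & 1)

ALL = set(predicates())

def angelic(R):        # ch.{R}.sigma = {q | R.sigma ∩ q ≠ ∅}
    return lambda s: {q for q in ALL if R(s) & q}

def demonic(R):        # ch.[R].sigma = {q | R.sigma ⊆ q}
    return lambda s: {q for q in ALL if R(s) <= q}

def seq(S1, S2):       # union over p ∈ ch.S1.sigma of the intersection
    def ch(s):         # over s' ∈ p of ch.S2.s'
        result = set()
        for p in S1(s):
            result |= reduce(lambda a, b: a & b, (S2(s2) for s2 in p), ALL)
        return result
    return ch

def meet(*Ss):         # ch.(⊓ i • Si).sigma = ∩ i ch.Si.sigma
    return lambda s: reduce(lambda a, b: a & b, (S(s) for S in Ss))

def join(*Ss):         # ch.(⊔ i • Si).sigma = ∪ i ch.Si.sigma
    return lambda s: reduce(lambda a, b: a | b, (S(s) for S in Ss))

# example (ours): {dec} ; [inc] from the previous chapter, started in x = 0
dec = lambda x: {y for y in STATES if 0 >= x > y >= -2}
inc = lambda x: {y for y in STATES if 0 <= x < y <= 2}
print(frozenset({1, 2}) in seq(angelic(dec), demonic(inc))(0))
# True: the angel can move to -1, where [inc] is a miracle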

15.2 Comparing Semantics for Contract Statements

We have now introduced three different semantics for statements: the backward predicate transformer semantics, the forward choice semantics, and the operational game semantics. In the previous chapter, we described the relationship between predicate transformer semantics and game semantics. Here we look at the relationship between choice semantics and predicate transformer semantics. The relationship between choice semantics and the game semantics follows from this.

Theorem 15.3 For an arbitrary contract statement S, the choice semantics satisfies ch.S = (wp.S)ˆ.

Proof The proof is by induction on the structure of statement S. First consider the angelic update:

q ∈ (wp.{R})ˆ.σ
≡ {definition of Ŝ}
wp.{R}.q.σ
≡ {definition of wp for {R}}
R.σ ∩ q ≠ ∅

The derivation for the other base case (demonic update) is similar. Next we consider the three step cases. First we have sequential composition, with the induction assumption that ch.S1 = (wp.S1)ˆ and ch.S2 = (wp.S2)ˆ:

wp.(S1 ; S2).q.σ ≡{definition of sequential composition}

wp.S1.(wp.S2.q).σ

≡{induction assumption for S1}

(wp.S2.q) ∈ (ch.S1.σ )

≡ {induction assumption for S2}
{σ' | q ∈ ch.S2.σ'} ∈ ch.S1.σ
≡ {⇒: choose p := {σ' | q ∈ ch.S2.σ'}; ⇐: p ⊆ {σ' | q ∈ ch.S2.σ'} and upward closure}
(∃p ∈ ch.S1.σ • (∀σ' ∈ p • q ∈ ch.S2.σ'))
≡ {set theory}
q ∈ (∪ p ∈ ch.S1.σ • (∩ σ' ∈ p • ch.S2.σ'))

For meet (the second step case) we have the following derivation, with the induction assumption that ch.Si = (wp.Si)ˆ for all i ∈ I:

wp.(⊓ i ∈ I • Si).q.σ
≡ {definition of meet}
(∀i ∈ I • wp.Si.q.σ)
≡ {induction assumption}
(∀i ∈ I • q ∈ ch.Si.σ)
≡ {set theory}
q ∈ (∩ i ∈ I • ch.Si.σ)

The derivation for the third step case, (⊔ i ∈ I • Si), is similar. 2

Computing ch.S.σ for a given statement S and initial state σ can be very tedious, since the inductive definition for sequential composition is complicated. Thus The- orem 15.3 is an important help; it shows that we can compute the choice semantics directly from the predicate transformer semantics. Using the definitions of derived statements, we also get the following equations for assertions, guards, and functional updates (Exercise 15.1):

ch.{p}.σ ={q | σ ∈ p ∩ q} , ch.[p].σ ={q | σ ∈¬p ∪ q} , ch. f .σ ={q | f.σ ∈ q} . In particular,

ch.abort.σ =∅, ch.magic.σ = P() , ch.skip.σ ={q | σ ∈ q} .

Relating the Three Semantics The game semantics gm.S describes S as a function from initial states to the set of plays from that initial state. The predicate transformer semantics wp.S describes S as a predicate transformer. Finally, the choice semantics ch.S describes S as a function from initial states to sets of sets of final states. These three semantics are illustrated in Figure 15.1. Let us define a function ws such that ws.G.q.σ holds if there is a winning strategy for the set of plays G.σ to establish q. The result that we proved about weakest preconditions and winning strategies then states that

wp.S = ws.(gm.S). The result that we proved about choice semantics and weakest precondition se- mantics states that

ch.S = (wp.S)ˆ .

Let us further define ws'.G.σ.q ≙ ws.G.q.σ. Then

ws'.G = (ws.G)ˆ .

FIGURE 15.1. Semantics of statements (contract statements are mapped by gm, wp, and ch to the game semantics, the predicate transformer semantics, and the choice semantics; ws and ws' relate the game semantics to the other two)

Then it follows from these equalities that

ch.S = ws'.(gm.S) .

Thus, the diagram in Figure 15.1 is really a commuting diagram. It shows all three semantic interpretations and the way these interpretations are related to each other. As the figure shows, the game semantics is the most detailed one, while the two other semantics are abstractions of the game semantics and derivable from each other.

15.3 Refinement in the Choice Semantics

The choice semantics introduces two new domains into the refinement calculus hierarchy: sets of predicates and functions from states to sets of predicates. We introduce the name Res(Γ) for the upward closed sets of predicates over Γ and the name Stran(Σ, Γ) for the functions that map states in Σ to upward closed sets of predicates over Γ:

Res(Γ) ≙ {Q : P(P(Γ)) | Q upward closed} ,
Stran(Σ, Γ) ≙ Σ → Res(Γ) .

Then Res(Γ) is a complete lattice when ordered by subset inclusion ⊆. It follows directly that Stran(Σ, Γ) is a complete lattice when ordered by the pointwise extension of subset inclusion. As usual, we let ResX and StranX denote the collections of all upward-closed sets of predicates and all sets of functions from states to (upward-closed) sets of predicates, respectively, for a given collection X of state spaces. Both ResX and StranX can be seen as categories (see Exercise 15.3). Using the relationships between the different semantics, we can easily characterize refinement in the choice semantics.

Corollary 15.4 The refinement relation on predicate transformers corresponds to pointwise set inclusion for choice semantics; for all contract statements S and S', we have

wp.S ⊑ wp.S' ≡ (∀σ • ch.S.σ ⊆ ch.S'.σ) .

Proof

wp.S ⊑ wp.S'
≡ {definition of refinement}
(∀q σ • wp.S.q.σ ⇒ wp.S'.q.σ)
≡ {definition of Ŝ}
(∀q σ • (wp.S)ˆ.σ.q ⇒ (wp.S')ˆ.σ.q)
≡ {pointwise extension}
(∀σ • (wp.S)ˆ.σ ⊆ (wp.S')ˆ.σ)
≡ {Theorem 15.3}
(∀σ • ch.S.σ ⊆ ch.S'.σ)
2

Thus, refinement means adding sets of possible final states for the angel to choose among. In terms of the game-theoretic interpretation, this implies that refinement increases the choices of the angel. By duality, it is obvious that decreasing the demon’s choices is also refinement. Because the set ch.S.σ is upward closed, removing elements from a set of states that the angel can choose implies adding new sets for the angel to choose among (a smaller set has more supersets). The extended hierarchy of poset- and lattice-enriched categories is illustrated in Figure 15.2.
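Corollary 15.4 can be confirmed by brute force for small state spaces. In the sketch below (our own encoding, with ch computed via Theorem 15.3 as ch.[R].σ = {q | R.σ ⊆ q}), removing some of the demon's alternatives from a relation yields a refinement, and correspondingly the sets ch.[R].σ can only grow.

STATES = [0, 1, 2, 3]

def predicates():
    for mask in range(2 ** len(STATES)):
        yield frozenset(s for i, s in enumerate(STATES) if mask >> i & 1)

def ch_demonic(R):                 # ch.[R].sigma = {q | R.sigma ⊆ q}
    return lambda s: {q for q in predicates() if R(s) <= q}

R1 = lambda x: {y for y in STATES if y > x}       # demon may pick any larger value
R2 = lambda x: {y for y in STATES if y == x + 1}  # demon restricted to the successor (R2 ⊆ R1)

# [R1] ⊑ [R2]: for every initial state, the refined statement offers at least
# the same guaranteed postconditions for the angel
print(all(ch_demonic(R1)(s) <= ch_demonic(R2)(s) for s in STATES))   # True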

15.4 Summary and Discussion

We have shown in this chapter how to define a forward semantics for statements that is equivalent to the predicate transformer semantics. It models a two-step game. 256 15. Choice Semantics

FIGURE 15.2. Extended refinement calculus hierarchy (the hierarchy now includes Res(Γ) = Pred(Γ) → Bool and Stran(Σ, Γ) = Σ → Res(Γ) alongside Γ, Bool, Pred(Γ), Tran(Σ, Γ), Rel(Σ, Γ), and Ptran(Σ, Γ))

The choice semantics is a generalization of traditional relational semantics, where a program statement is described by a function that maps every initial state to the set of all possible final states. This kind of semantics has been studied extensively; for an overview see, e.g., [5]. We will return to it in more detail in Chapter 26. The choice semantics is clearly more expressive than functional semantics or relational semantics, and thus predicate transformer semantics is also more expressive than these more traditional semantic frameworks. The essential power that the choice semantics and predicate semantics bring is that we can model both angelic and demonic nondeterminism in a single statement. With relational semantics, we have to choose whether to interpret relations angelically or demonically, but we cannot have both interpretations in the same program.

15.5 Exercises

15.1 Derive the choice semantics for the assert, guard, and functional update:

ch.{p}.σ ={q | σ ∈ p ∩ q} , ch.[p].σ ={q | σ ∈¬p ∪ q} , ch. f .σ ={q | f.σ ∈ q} . 15.2 Recall that a set A of sets is upward closed if the following holds:

(∀AB• A ∈ A ∧ A ⊆ B ⇒ B ∈ A). Show that if A and B are upward closed sets, then A ∩ B is also upward closed. 15.3 Recall the definitions of Res and Stran: 15.5. Exercises 257

∧ Res() ={Q : P(P()) | Q upward closed} , ∧ Stran(, ) =  → Res() .

(a) Show that Res() is a complete lattice.

(b) Show that both Res and Stran are categories when composition in Res is set intersection and composition in Stran is defined as follows:   ( f ; g).σ = (∪ p ∈ f.σ • (∩ σ ∈ p • g.σ )) for f : Stran(, ) and g : Stran(, ). 15.4 Use the inductive definition to compute the choice semantics for statement {dim} ;[dim];{dim} ,   where dim = (x := x | 1 < x ∧ x − 2 ≤ x < x). Assume a state space Nat (so x.σ = σ) and initial state x = 5.

15.5 Derive the choice semantics for if g then S1 else S2 fi.

16 Subclasses of Statements

Not only monotonicity, but also other homomorphism properties of predicate trans- formers turn out to be important in modeling programming constructs. In this chap- ter we study homomorphic predicate transformers, i.e., predicate transformers that preserve some of the predicate lattice structure. We consider the four basic ho- momorphism properties — bottom, top, meet, and join homomorphism — to see to what extent the constructs in our statement language preserve these properties. These homomorphism properties are closely linked to distributivity properties of statements.

If a collection of basic statements has a certain homomorphism property and the statement constructors preserve this property, then we get a subalgebra of state- ments generated by these constructs, where every statement has the homomorphism property in question. We use this to identify a number of important subclasses of statements. Programming language semantics has to a large extent concentrated on studying such subclasses of statements. Partly this is because the subclasses are closer to actual implementable programming languages, and partly because their mathematical models are simpler.

16.1 Homomorphic Predicate Transformers

We introduce specific (and by now traditional) names for the basic homomorphism properties of predicate transformers. A predicate transformer S is said to be strict (or nonmiraculous) if it preserves bottom, and terminating (or nonaborting)ifit preserves top. Furthermore, S is conjunctive if it preserves nonempty meets of pred- 260 16. Subclasses of Statements

icates, and disjunctive if it preserves nonempty joins of predicates. Summarizing, the properties are:

S.false = false , (S strict)
S.true = true , (S terminating)
S.(∩ i ∈ I • qi) = (∩ i ∈ I • S.qi), I ≠ ∅ , (S conjunctive)
S.(∪ i ∈ I • qi) = (∪ i ∈ I • S.qi), I ≠ ∅ . (S disjunctive)

Furthermore, a predicate transformer is universally conjunctive if it preserves ar- bitrary meets of predicates (i.e., it is both terminating and conjunctive). Dually, it is universally disjunctive if it preserves arbitrary joins of predicates (i.e., it is both strict and disjunctive). In addition to the above four properties, we will comment on negation homo- morphism (the property S.(¬ q) =¬S.q), whenever it holds. However, negation homomorphism is not a very common property, so we will not have much to say about it. The four homomorphism properties stated above are independent of each other. Thus, a predicate transformer can be strict without being conjunctive, it can be conjunctive without being terminating, etc. However, conjunctivity and disjunctiv- ity both imply monotonicity, as was shown for lattice homomorphisms in general (Lemma 2.6).
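For a finite state space the four properties can be tested exhaustively over all nonempty families of predicates. The sketch below is only illustrative (the encoding of predicate transformers as functions on frozensets, and the sample relation R, are ours); run on the two relational updates of R it reproduces the corresponding rows of Table 16.1.

from itertools import combinations

STATES = [0, 1, 2]
PREDS = [frozenset(c) for n in range(len(STATES) + 1)
         for c in combinations(STATES, n)]
BOT, TOP = frozenset(), frozenset(STATES)

def families():
    # all nonempty families {q_i | i ∈ I} of predicates
    for n in range(1, len(PREDS) + 1):
        yield from combinations(PREDS, n)

def is_strict(S): return S(BOT) == BOT
def is_terminating(S): return S(TOP) == TOP
def is_conjunctive(S):
    return all(S(frozenset.intersection(*f)) ==
               frozenset.intersection(*[S(q) for q in f]) for f in families())
def is_disjunctive(S):
    return all(S(frozenset.union(*f)) ==
               frozenset.union(*[S(q) for q in f]) for f in families())

R = lambda x: {y for y in STATES if y > x}                    # a sample relation
ang = lambda q: frozenset(s for s in STATES if R(s) & q)      # the update {R}
dem = lambda q: frozenset(s for s in STATES if R(s) <= q)     # the update [R]

for name, S in [("{R}", ang), ("[R]", dem)]:
    print(name, is_strict(S), is_terminating(S), is_conjunctive(S), is_disjunctive(S))
# prints: {R} True False False True   and   [R] False True True False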

Homomorphism Properties The homomorphism properties of basic statements are given in the following theorems.

Theorem 16.1 Let state transformer f , state predicate p, and state relation R be arbitrary. Then

(a) {p} is conjunctive and universally disjunctive.

(b) [p] is disjunctive and universally conjunctive.

(c)  f is universally conjunctive, universally disjunctive, and negation homo- morphic.

(d) {R} is universally disjunctive.

(e) [R] is universally conjunctive.

Proof We show the proof for conjunctivity of assert; the other proofs are similar. I nonempty

• {p}.(∩ i ∈ I qi ) ={definition of assert} 16.1. Homomorphic Predicate Transformers 261

• p ∩ (∩ i ∈ I qi ) ={infinite distributivity, I nonempty}

• (∩ i ∈ I p ∩ qi ) ={definition of assert}

• (∩ i ∈ I {p}.qi ) 2

Table 16.1 shows the homomorphism properties of the basic statements. In this table, as in others, ⊓ and ⊔ stand for arbitrary nonempty meets and joins. We see that the functional update ⟨f⟩ has all possible homomorphism properties as a predicate transformer: it even preserves negations. The assert {p} is nearly as well behaved, but it does not necessarily preserve top or negation. Dually, the guard [p] does not in general preserve bottom or negation. The relational updates have even fewer homomorphism properties. Since the basic statement constructors (sequential composition, meet, and join) can have two or more statement arguments, we consider joint preservation of homomorphism properties. For example, joint strictness preservation of sequential composition means that S1 ; S2 is strict whenever S1 and S2 are both strict.

Theorem 16.2 The statement constructors have the following properties:

(a) Sequential composition preserves joint strictness and joint termination, joint conjunctivity and joint disjunctivity, and joint negation homomorphism.

(b) Meets preserve joint termination and joint conjunctivity.

(c) Joins preserve joint strictness and joint disjunctivity.

Proof We show that sequential composition preserves conjunctivity; the other proofs are similar.

S1 conjunctive, S2 conjunctive •  (S1 ; S2).(∩ i ∈ I qi ) ={definition of composition}

is      ⊥     ⊤     ⊓     ⊔     ¬
{p}     yes   no    yes   yes   no
[p]     no    yes   yes   yes   no
⟨f⟩     yes   yes   yes   yes   yes
{R}     yes   no    no    yes   no
[R]     no    yes   yes   no    no

TABLE 16.1. Homomorphic properties of basic predicate transformers

• S1.(S2.(∩ i ∈ I qi ))

={S2 conjunctive} • S1.(∩ i ∈ I S2.qi )

={S1 conjunctive} • (∩ i ∈ I S1.(S2.qi )) ={definition of composition}

• (∩ i ∈ I (S1 ; S2).qi ) 2

Table 16.2 shows for each statement constructor (composition, meet, and join) which homomorphism properties it preserves. Thus, we have that sequential com- position is the most well behaved, preserving all homomorphism properties. The other operations are less well behaved. For instance, meets do not preserve join homomorphisms or negation homomorphisms, while joins do not preserve meet homomorphisms or negation homomorphisms. These results generalize directly to indexed meets and joins.

As a direct consequence of the above theorems, we find that skip has all homo- morphism properties, abort is conjunctive and universally disjunctive, and magic is disjunctive and universally conjunctive. For the other derived statements we have the following result.

Theorem 16.3 The derived statement constructors preserve homomorphism properties in their statement arguments as follows:

(a) The deterministic conditional preserves (joint) strictness, termination, con- junctivity, and disjunctivity.

(b) The guarded conditional preserves (joint) strictness, termination, and con- junctivity.

(c) The block preserves strictness, termination, conjunctivity, and disjunctivity.

Proof The proof is left as an exercise to the reader (Exercise 16.3). 2

preserves   ⊥     ⊤     ⊓     ⊔     ¬
;           yes   yes   yes   yes   yes
⊓           no    yes   yes   no    no
⊔           yes   no    no    yes   no

TABLE 16.2. Homomorphic properties preserved by statement constructors

preserves                                            ⊥     ⊤     ⊓     ⊔     ¬
(λS, T • if g then S else T fi)                      yes   yes   yes   yes   yes
(λS1, ..., Sn • if g1 → S1 [] ··· [] gn → Sn fi)     yes   yes   yes   no    no
(λS • begin var y := e ; S end)                      yes   yes   yes   yes   no

TABLE 16.3. Homomorphic properties preserved by derived constructors

The homomorphism preservation of derived statements is summarized in Ta- ble 16.3.

Left Distributivity Sequential composition always distributes from the right over lattice operations (see Section 12.2) but does not in general distribute from the left over these operations. To get distributivity from the left, we need additional homomorphism assumptions on the components of the sequential composition, in the same way as we needed additional assumptions for monotonicity of sequential composition.

Theorem 16.4 Let S and Si be predicate transformers, for i in some nonempty index set I. Then

S ; abort = abort if S is strict,
S ; magic = magic if S is terminating,
S ; (⊓ i ∈ I • Si) = (⊓ i ∈ I • S ; Si) if S is conjunctive,
S ; (⊔ i ∈ I • Si) = (⊔ i ∈ I • S ; Si) if S is disjunctive,
S ; (¬ T) = ¬ (S ; T) if S is ¬-homomorphic.

Table 16.4 summarizes these distributivity properties of sequential composition (they follow directly from the definition of sequential composition).

Homomorphism Properties of Procedure Calls Recall that a procedure is a (named) predicate transformer that is referred to using procedure calls. Since a procedure call itself is a predicate transformer, we can ask what homomorphism properties it has. A procedure call executes the body of the procedure, with suitable instantiations, so we would expect a procedure call to have the same homomorphism properties as the procedure body. In fact, this is the case, both for global and local procedures.

is               ⊥      ⊤      ⊓      ⊔      ¬
(λT • S ; T)     yes¹   yes²   yes³   yes⁴   yes⁵

¹ if S is strict   ² if S is terminating   ³ if S is conjunctive   ⁴ if S is disjunctive   ⁵ if S is ¬-homomorphic

TABLE 16.4. Homomorphic properties of composition

Theorem 16.5 Assume that procedure N(var x, val y) is defined as statement S, either globally or locally. Then any call N(u, e) is strict (terminating, conjunctive, disjunctive) if S is strict (terminating, conjunctive, disjunctive).

Proof By the definition, a procedure call N(u, e) is equal to a statement of the form   begin var y := e ; N.u.y end. The result now follows, since N.u.y is the same as S, except for a change in variable names, and since deterministically initialized blocks preserve the four basic homomorphism properties. 2

16.2 Subcategories of Statements

Predicate transformers are very general and expressive. By restricting ourselves to subcollections of predicate transformers that satisfy different homomorphism properties, we lose some of this expressiveness. On the other hand, there are then more properties that can be used in reasoning about computations. When identifying such restricted collections of predicate transformers, we con- sider the four basic homomorphism properties: strictness (⊥-homomorphism), ter- mination (-homomorphism), conjunctivity (-homomorphism) and disjunctivity ( -homomorphism). Since these four homomorphism properties are independent of each other, there are 16 different ways of combining them. We consider some of these, but not all, in more detail below. We first show that any homomorphism characterizes a subcategory of the predicate transformers. Lemma 16.6 Assume that φ is an n-ary operation on predicates. Let X be a collection of state  (, )  %→  spaces, and let PtranX consist of all those predicate transformers in φ  that are -homomorphic. Then PtranX is a category, and it is a subcategory of PtranX . Proof Obviously, it is sufficient to show that skip is φ-homomorphic and that composition in Ptran preserves φ-homomorphism for an arbitrary operation φ.We have, for an n-ary operation φ,

skip.(φ.p1. ···.pn) ={skip is identity}

φ.p1. ···.pn ={skip is identity}

φ.(skip.p1). ···.(skip.pn)  which shows that skip is φ-homomorphic. Similarly, assuming that S and S are φ-homomorphic, we have 16.2. Subcategories of Statements 265

 (S ; S ).(φ.p1. ···.pn) ={definition of composition}  S.(S .(φ.p1. ···.pn)) ={S homomorphic}   S.(φ.(S .p1). ···.(S .pn)) ={S homomorphic}   φ.(S.(S .p1)). ···.(S.(S .pn)) ={definition of composition}   φ.(((S ; S ).p1). ···.((S ; S ).pn)) 2

A similar argument can be used if φ is an operation on sets of predicate transformers (such as meet or join). In particular, Lemma 16.6 shows that we have subcategories of Ptran when we restrict ourselves to conjunctive or to disjunctive predicate transformers. Further- more, the conjunction of two homomorphism properties is also a homomorphism property, so the functional (i.e., conjunctive and disjunctive) predicate transform- ers are also a subcategory of Ptran. To make it easier to talk about these major subcategories, we introduce the following names: • CtranX is the category of conjunctive predicate transformers on X. • DtranX is the category of disjunctive predicate transformers on X. • FtranX is the category of functional predicate transformers on X.

Since both conjunctivity and disjunctivity imply monotonicity, we find that Ctran and Dtran are both subcategories of Mtran. Similarly, Ftran is a subcategory of both Ctran and Dtran (see Figure 16.1). Lemma 16.6 also shows that we get subcategories of Ptran when we use strict- ness and termination as restricting properties. We consider these as giving a sec- ondary classification and use superscripts (⊥ and ) on the names to indicate these additional restrictions. Thus, for example, • ⊥ DtranX is the category of universally disjunctive (disjunctive and strict) pred- icate transformers on X. • ⊥ FtranX is the category of partial function (strict functional) predicate trans- formers on X.

Every category C constructed in this way is a subcategory of all categories that we get by dropping one or more of the properties used to restrict C. For example, ⊥  ⊥ Ctran is a subcategory both of Ctran and of Mtran . 266 16. Subclasses of Statements

FIGURE 16.1. Main predicate transformer subcategories (Ftran is a subcategory of Ctran and Dtran, which are subcategories of Mtran, which in turn is a subcategory of Ptran)

The reader should be warned that not all homomorphism properties are visible from a name; if S ∈ Ctran, then S is not only conjunctive but also monotonic (since conjunctivity implies monotonicity). Some of the subcategories of Ptran that we have identified come in dual pairs. For  ⊥ example, Ctran and Dtran are dual. Because of the general duality principle, every property of a class immediately implies a dual property for the dual class.

16.3 Summary and Discussion

Dijkstra’s original weakest precondition semantics [53] assumed that every predi- cate transformer determined by a program statement satisfies a number of “health- iness conditions”. In our terminology these conditions are essentially strictness, conjunctivity, and continuity (continuity is treated in Chapter 22). The original healthiness restrictions were later weakened one by one, in order to enlarge the range of applications and to achieve algebraic simplicity. Continuity was first dropped by Back [6] with the introduction of the nondeterministic assignment statement. Strictness was dropped by Morgan [104], Morris [108], and Nelson [113] with the introduction of miracles (although de Bakker [36] and Eric Hehner [75] earlier had indirectly introduced nonstrict constructs). Finally, conjunctivity was weakened to monotonicity with the angelic abstraction statement of Back [13], the lattice-based specification language of Back and von Wright [26, 27], and the conjunction operator of Gardiner and Morgan [59]. The argument for dropping these restrictions is quite simple. From a proof-theoretic point of view, there is no reason to insist on a more restricted classes of statements, unless properties of these restricted classes are really needed to carry through the proof. Otherwise, a derivation can as well be carried out in the general predicate transformer framework, where no restrictions are made on the predicate trans- formers. If a refinement between two statements can be derived in this general 16.4. Exercises 267

framework, it is also valid when the statements belong to some subclasses of statements. The fact that some intermediate steps in the derivation might not be executable on a real computer has no consequences for the derivation, since the intermediate stages were not intended to be implemented anyway.

16.4 Exercises

16.1 Show that the assert {p} is universally disjunctive but not terminating. 16.2 Show that the demonic update [R] is universally conjunctive. 16.3 Prove Theorem 16.3. 16.4 In Dijkstra’s original formulation of the “healthiness conditions” for predicate transformers, finite conjunctivity was required, rather than the (positive) conjunc- tivity used here. A predicate transformer S is finitely conjunctive if S.(p ∩ q) = S.p ∩ S.q for arbitrary predicates p and q. Give an example of a predicate transformer S that is finitely conjunctive but not positively conjunctive. Hint: A counterexample with Nat as the underlying state space can be found by considering cofinite predicates, i.e., predicates p such that ¬ p is finite. 16.5 Show that if S is finitely conjunctive then S.(¬ p ∪ q) ⊆¬S.p ∪ S.q for arbitrary predicates p and q. Under what condition(s) does equality hold. 16.6 Find a counterexample to show that the guarded conditional does not preserve disjunctivity. 16.7 Let R :  ↔  be a state relation.

(a) Show that {R} is conjunctive if R is a deterministic relation.

(b) Show that [R] is strict if R is a total relation, i.e., if dom.R = Σ.

17 Correctness and Refinement of Statements

Previous chapters have introduced monotonic predicate transformers as the basic model for contract statements. These can be used to describe programs, program specifications, games, and, of course, contracts and agreements. The predicate transformer semantics was originally designed for reasoning about the correctness of program statements. We will show below how to use predicate transformers to reason about correctness-preserving program refinements. We define the notion of correctness for predicate transformer statements and show how to establish the correctness of statements. The relationship between correctness and refinement is analyzed. We then show how to derive correct programs from given specifications, using the stepwise refinement paradigm. Finally, we show how the properties of predicate transformers that we have derived in previous chapters permit us to find and justify individual refinement steps.

17.1 Correctness

Recall the definition of correctness for contract statements: S is correct with respect to precondition p and postcondition q, written p {| S |} q,if(∀σ | σ ∈ p • σ {| S |} q). Hence, we have that

p {| S |} q ≡ p ⊆ wp.S.q .

This motivates us to define the notion of correctness also directly for predicate transformers: The predicate transformer S is correct with respect to precondition p and postcondition q, denoted by p {| S |} q, if p ⊆ S.q holds.

The traditional method for proving correctness of programs is known as Hoare logic. This is a collection of inference rules that permit us to reduce the correctness of a program statement to the correctness of its components. Dijkstra’s predicate transformer method provides an alternative approach, which can be used in the same way to reduce the correctness of any statement to the correctness of its constituents. We give below the rules for reasoning about correctness of statements and illustrate their use with a simple example.

Correctness Rules For basic statements, correctness is reduced to properties of predicates, functions, and relations.

Theorem 17.1 For an arbitrary relation R and predicates p, r, and q:

(a) p {| {r} |} q ≡ p ⊆ r ∩ q ,

(b) p {| [r] |} q ≡ p ∩ r ⊆ q ,

(c) p {| ⟨f⟩ |} q ≡ p ⊆ f⁻¹.q ,

(d) p {| {R} |} q ≡ p ⊆ dom.(R ; |q|) ,

(e) p {| [R] |} q ≡ ran.(|p| ; R) ⊆ q .

Proof The results follow directly from the statement definitions and the definition of correctness. 2

The last two correctness rules of this theorem can be written equivalently as follows:

p {| { R}|}q ≡ (∀σ • p.σ ⇒ (∃γ • R.σ.γ ∧ q.γ )) , p {| [R] |} q ≡ (∀σ • p.σ ⇒ (∀γ • R.σ.γ ⇒ q.γ )) .

As special cases, we have

p {| abort |} q ≡ p = false , p {| skip |} q ≡ p ⊆ q , p {| magic |} q ≡ T .
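Each of these rules amounts to a containment check that can be carried out directly when the state space is finite. The following sketch is ours (the statement encodings and the sample data are only illustrative); it tests p {| S |} q by computing S.q and checking p ⊆ S.q.

STATES = range(0, 5)

def correct(p, S, q):
    # p {| S |} q  iff  p ⊆ S.q, with predicates represented as sets of states
    return set(p) <= S(set(q))

assert_ = lambda r: (lambda q: set(r) & q)                        # {r}.q = r ∩ q
fupdate = lambda f: (lambda q: {s for s in STATES if f(s) in q})  # functional update: f⁻¹.q
angelic = lambda R: (lambda q: {s for s in STATES if R(s) & q})   # {R}
demonic = lambda R: (lambda q: {s for s in STATES if R(s) <= q})  # [R]

# every state in {2, 3} is mapped into (x > 2) by x := x + 1
print(correct({2, 3}, fupdate(lambda x: x + 1), {s for s in STATES if s > 2}))  # True

# abort = {false} satisfies a correctness claim only for p = false
print(correct({0}, assert_(set()), set(STATES)))                                # False

# with R.x = {x, x+1}: [R] maps {0, 1} into {0, 1, 2}; {R} can reach {1, 2} from 0, 1, and 2
R = lambda x: {x, x + 1} & set(STATES)
print(correct({0, 1}, demonic(R), {0, 1, 2}))                                   # True
print(correct({0, 1, 2}, angelic(R), {1, 2}))                                   # True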

For the compound statements, we have the following rules. These rules allow us to reduce the correctness of a compound statement to the correctness of its constituents.

Theorem 17.2 For arbitrary statements Si and predicates p, r, and q:

(a) p {| S1 ; S2 |} q ≡ (∃r • p {| S1 |} r ∧ r {| S2 |} q).

(b) p {| (⊓ i ∈ I • Si) |} q ≡ (∀i ∈ I • p {| Si |} q).

(c) p {| (⊔ i ∈ I • Si) |} q ≡ (∀σ | p.σ • ∃i ∈ I • {σ} {| Si |} q).

Proof We show the first result here. We have that

p ⊆ S1.r, r ⊆ S2.q  p ⊆{first assumption}

S1.r ⊆{second assumption, S monotonic}

S1.(S2.q) ={definition of composition}

(S1 ; S2).q Hence the implication from right to left follows by the rule for eliminating existen- tial quantification. Implication in the other direction follows directly by choosing r = S1.q. 2

Note that the correctness rule for angelic choice (join) permits the alternative Si that is chosen to depend on the present state σ. From the rules for update statements we can derive rules for assignment statements when the precondition and the postcondition are boolean expressions:

p {| x := e |} q ≡ p ⊆ q[x := e] ,
p {| [x := x' | b] |} q ≡ p ⊆ (∀x' • b ⇒ q[x := x']) ,
p {| {x := x' | b} |} q ≡ p ⊆ (∃x' • b ∧ q[x := x']) .

Rules for the correctness of derived statements are easily derived from the basic rules. For the conditional statements, we have

p {| if g then S1 else S2 fi |} q ≡ (p ∩ g {| S1 |} q) ∧ (p ∩¬g {| S2 |} q) for the deterministic conditional and

p {| if g1 → S1 [] ... [] gn → Sn fi |} q ≡ (p ⊆ (∃i | i ≤ n • gi)) ∧ (∀i | i ≤ n • p ∩ gi {| Si |} q)

for the guarded conditional. Finally, for the block we have the following rule:

p {| begin var y | b ; S end |} q ≡ (∀y • p ∩ b {| S |} q) ,

provided that y is not free in p or q. The justification for this rule is as follows:

p {| begin var y | b ; S end |} q ≡{definition of correctness} p ⊆ (begin var y | b ; S end).q ≡{block semantics, rewrite p ⊆ q as true ⊆ (p ⇒ q)} true ⊆ (p ⇒ (∀y • b ⇒ S.q)) ≡{y not free in p, rewrite} true ⊆ (∀y • p ∧ b ⇒ S.q) ≡{pointwise extension, T ⇒ rule} (∀y • p ∩ b ⊆ S.q) ≡{definition of correctness} (∀y • p ∩ b {| S |} q)

We can use these rules from left to right (top-down) to reduce the correctness of a larger program statement to a smaller one. This is the method that we use to determine whether a given program statement is correct with respect to given pre- and postconditions. The application of the rules in this direction is straightforward, except for the sequential composition rule, which requires us to invent an inter- mediate condition. The situation will be similar later on, when we consider how to establish the correctness of loops. There we also have to invent an intermediate condition, known as the loop invariant.

We can also use the rules from right to left (bottom-up), to gradually build up more and more complex program statements. This could be one way to do explorative programming.

There is an important difference between the Hoare-like axioms and inference rules for establishing correctness, and the use of the predicate transformer semantics as originally proposed by Dijkstra for the same purpose (aside from the fact that Hoare’s axiomatization is concerned with partial correctness and Dijkstra’s with total correctness). The former permits us to prove that a program is correct, if indeed it is, but does not allow us to say anything about an incorrect program. In the predicate transformer approach, the correctness assertion p {| S |} q is just a formula like any other, so we can try to prove it correct, or we can try to prove it incorrect, or we can prove other desirable properties of statements. Hence, the predicate transformer approach is more flexible than Hoare-like axiomatizations. In addition, the predicate transformer formulation of correctness is done without introducing any new axioms or inference rules, being essentially just an application of higher-order logic. Hoare logic, on the other hand, requires a completely new axiom system to be built, with its own notion of terms and formulas and its own axioms and inference rules, usually built on top of a standard first-order logic. 17.2. Stepwise Refinement 273

Example

The following simple derivation illustrates the way in which these rules are used to establish correctness of a statement. We show that

true {| x := n ; [y := y' | y' > x] |} y > n .

Here x and y are program variables, and n can be either a constant, a free variable, or a program variable (the derivation looks the same regardless of which alternative is chosen). We have that

true {| x := n ; [y := y' | y' > x] |} y > n
⇐ {sequential rule, x = n as intermediate predicate}
(“ true {| x := n |} x = n ”) ∧ (x = n {| [y := y' | y' > x] |} y > n)
≡ {assignment rule}
“ true ⊆ (n = n) ” ∧ (x = n {| [y := y' | y' > x] |} y > n)
≡ {reduction}
T ∧ (x = n {| [y := y' | y' > x] |} y > n)
≡ {T greatest, demonic assignment rule}
(x = n) ⊆ (∀y' • y' > x ⇒ y' > n)
≡ {reduction: (∀x n • x = n ⇒ (∀y' • y' > x ⇒ y' > n))}
T
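The assertion just derived can also be confirmed by exhaustive testing over a small range of values. The following sketch is only a test harness of ours (the bound on the range is arbitrary): it executes x := n, lets the demon choose any y' > x within the range, and checks that the postcondition y > n holds for every such choice.

VALUES = range(-3, 4)

def holds(n):
    # true {| x := n ; [y := y' | y' > x] |} y > n, checked by brute force:
    # for every initial state, after x := n every demonic choice of y' > x
    # must satisfy y' > n (an empty set of choices is a miracle and is fine)
    for x0 in VALUES:
        for y0 in VALUES:
            x = n                                   # x := n
            choices = [y2 for y2 in VALUES if y2 > x]
            if not all(y2 > n for y2 in choices):
                return False
    return True

print(all(holds(n) for n in VALUES))                # True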

17.2 Stepwise Refinement

Let us next study in more detail how the notion of refinement of statements is used in program construction. In particular, we are interested in the stepwise refine- ment method for program construction. This method emphasizes systematic pro- gram construction, starting from given program requirements and working stepwise towards a final program that satisfies these requirements. The following lemma shows that refinement of a statement is the same as preserving correctness of the statement.

Lemma 17.3 Let S and S' be two predicate transformers. Then

S ⊑ S' ≡ (∀p q • p {| S |} q ⇒ p {| S' |} q) .

Proof  S  S ≡{definition of refinement}  (∀q • “ S.q ⊆ S .q ”) 274 17. Correctness and Refinement of Statements

≡{mutual implication} • S.q ⊆ S.q ⇒{transitivity of inclusion}  (∀p • p ⊆ S.q ⇒ p ⊆ S .q) • S.q ⊆ S.q ≡{reflexivity, T ⇒ rule} S.q ⊆ S.q ⇒ S.q ⊆ S.q ⇐{specialization, p := S.q}  (∀p • p ⊆ S.q ⇒ p ⊆ S .q)  • (∀pq• p ⊆ S.q ⇒ p ⊆ S .q) ≡{definition of correctness}  (∀pq• p {| S |} q ⇒ p {| S |} q) 2

Note that the proof uses only the reflexivity and transitivity of the refinement ordering, so it would hold even if refinement were only a preorder.

Thus refinement is a relation that preserves the correctness of a statement: if S ⊑ S', then S' satisfies any correctness condition that S satisfies. The result also shows that refinement is the weakest relation that preserves correctness (the strongest relation is equality). Thus, refinement exactly characterizes the property of preserving correctness of statements.

The stepwise refinement method is based on this observation. We start with an initial statement S0 that satisfies some correctness criteria that we have given; i.e., we assume that p {| S0 |} q. Then we derive a sequence of successive refinements S0 ⊑ S1 ⊑ ··· ⊑ Sn. By transitivity, we have that S0 ⊑ Sn, and because refinement preserves correctness, we have that p {| Sn |} q. We have constructed a new statement Sn from the original statement S0, where the new statement also satisfies our initial requirement.

Even if our aim is to derive a program statement from a program specification, we may move within the full predicate transformer category when carrying out a program derivation. Only the final program that we derive has to fall into a subclass of predicate transformers that are considered executable. This is illustrated in Figure 17.1, where we indicate that Pascal-like programs form a subclass of the functional statements.

On the other hand, if we restrict our derivation to program statements in a certain class, then the monotonicity and homomorphism properties that we have identified for this class of statements permit us to prove stronger properties, which might be crucial in the derivation. Monotonicity is a good example of this, as it permits us to focus on a component of a statement and refine it in isolation from its context. 17.2. Stepwise Refinement 275


FIGURE 17.1. Program derivation in the refinement calculus

Correctness as Refinement

We have shown that refinement can be described in terms of correctness. We can also describe correctness in terms of refinement. Let predicates p and q be given. Then the statement {p} ; [qˆ] is a pre–post specification statement, where qˆ = (λσ • q) is the pointwise extension of the predicate q to a relation. This specification statement establishes postcondition q whenever precondition p holds in the initial state. An example of a pre–post specification is the statement

{n > 0} ; [(x² ≤ n < (x + 1)²)ˆ]

This statement specifies that the final value of program variable x should be the integer square root of n. Note that this specification allows other program variables to be changed in arbitrary ways. This may not be what we desire, so in practice we prefer to use assignments in specifications, as described later. The semantics of the pre–post specification is as follows (Exercise 17.4):

({p} ; [qˆ]).r.σ ≡ p.σ ∧ q ⊆ r .

Lemma 17.4 Let p and q be predicates and S a monotonic predicate transformer. Then

p {| S |} q ≡ {p} ; [qˆ] ⊑ S .

Proof S monotonic {p} ;[qˆ]  S ≡{semantics of pre–post specification} (∀r σ • p.σ ∧ q ⊆ r ⇒ S.r.σ ) 276 17. Correctness and Refinement of Statements

≡{mutual implication} • (∀r σ • p.σ ∧ q ⊆ r ⇒ S.r.σ ) ⇒{specialize r := q, simplify} (∀σ • p.σ ⇒ S.q.σ ) • (∀r σ • p.σ ∧ q ⊆ r ⇒ “ S.r.σ ”) ⇐{monotonicity of S, local assumption q ⊆ r} (∀r σ • p.σ ∧ q ⊆ r ⇒ S.q.σ ) ≡{quantifier rules} (∃r • q ⊆ r) ⇒ (∀σ • p.σ ⇒ S.q.σ ) ≡{antecedent is trivially true} (∀σ • p.σ ⇒ S.q.σ ) • (∀σ • p.σ ⇒ S.q.σ )) ≡{pointwise extension} p ⊆ S.q 2
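Lemma 17.4 lends itself to the same kind of exhaustive check. In the sketch below (ours; the demonic update used as the test statement is just an arbitrary monotonic example), the pre–post specification {p} ; [qˆ] is encoded as the predicate transformer mapping r to p when q ⊆ r and to false otherwise, refinement is tested pointwise, and the equivalence of the lemma is verified for every choice of p and q over a three-element state space.

from itertools import combinations

STATES = [0, 1, 2]
PREDS = [frozenset(c) for n in range(len(STATES) + 1)
         for c in combinations(STATES, n)]

def prepost(p, q):
    # the specification {p} ; [q^]: ({p} ; [q^]).r = p if q ⊆ r, else false
    return lambda r: frozenset(p) if frozenset(q) <= r else frozenset()

def refines(S, T):
    # S ⊑ T  iff  S.r ⊆ T.r for every predicate r
    return all(S(r) <= T(r) for r in PREDS)

def correct(p, S, q):
    # p {| S |} q  iff  p ⊆ S.q
    return frozenset(p) <= S(frozenset(q))

# an arbitrary monotonic statement to test against: [R] with R.x = {x, x+1}
R = lambda x: {y for y in STATES if y in (x, x + 1)}
S = lambda r: frozenset(x for x in STATES if R(x) <= r)

print(all(correct(p, S, q) == refines(prepost(p, q), S)
          for p in PREDS for q in PREDS))            # True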

Because correctness can be expressed as refinement, we can start the stepwise refinement from an initial specification statement:

{p} ; [qˆ] ⊑ S1 ⊑ S2 ⊑ ··· ⊑ Sn .

From this we see that Sn is correct with respect to the original pre- and postcondition pair:

p {| Sn |} q .

A refinement step may introduce a new specification statement as a component, Si ⊑ Si+1[X := {p'} ; [q'ˆ]]. If we implement the subspecification with a statement S', so that {p'} ; [q'ˆ] ⊑ S', then the monotonicity of statements allows us to conclude that Si+1[X := {p'} ; [q'ˆ]] ⊑ Si+1[X := S']. This top-down derivation step thus gives that Si ⊑ Si+1[X := S']. Top-down derivation is an important part of the stepwise refinement paradigm and motivates our choice of monotonicity as a fundamental property of statements.

We can actually treat any statement as a specification. This enlarges the class of specifications to abstract statements. An abstract statement need not be imple- mentable, because it is not intended to be executed. It is just intended to serve as a high-level description of the kind of behavior that we are interested in achieving. Any refinement of the abstract statement is an implementation of it (the implemen- tation may or may not be executable). The notion of abstract statements permits us to treat both implementation of specifications and program transformations as two special cases of the general notion of refining a statement. 17.2. Stepwise Refinement 277

General Specification Statements The specification statement {p} ;[qˆ] establishes the postcondition q independently of what the state is initially, as long as the state satisfies the precondition p.A specification statement that permits us to define the final state as a function of the initial state is of the form {p} ;  f . We refer to this as a functional specification statement. The functional update  f is usually given as an assignment statement. An example is the specification of the factorial statement:

{x ≥ 0} ; x := factorial.x .

This requires that the initial state satisfy the condition x ≥ 0 and that the final state be such that x is set to the factorial of the original x without changing other program variables.

An even more powerful way of specifying what final states are acceptable is to use a relation Q between initial and final states, rather than just a predicate q on the final state. This gives us a specification of the form {p} ;[Q], which we refer to as a conjunctive specification statement. This requires that the precondition p be satisfied and that for any initial state σ ∈ p the final state be in the set Q.σ .If this set is empty for some σ , then the specification is impossible to satisfy for that specific initial state.

In practice, the precondition is usually a boolean expression, and the relation is a relational assignment. The specification is then of the form {b1} ; [x := x' | b2]. This specification shows that only the program variables x are to be changed; all other variables are to keep their old values. An example is the specification that we mentioned in the introduction:

{x ≥ 0 ∧ e > 0} ; [x := x' | −e < x − x'² < e] .

This requires that x ≥ 0 ∧ e > 0 initially and that x be set to the integer square root of its initial value with tolerance e.

The boolean expression b in a specification {p} ; [x := x' | b] can be very complex. The specification can often be made more readable by dividing the demonic assignment into different cases, giving a conditional specification:

{p} ; if g1 → [x := x' | b1] [] ··· [] gn → [x := x' | bn] fi .

Such a specification can always be reduced to the form with a single demonic assignment (Exercise 17.5):

{p} ; if g1 → [x := x' | b1] [] ··· [] gn → [x := x' | bn] fi =
{p ∧ (g1 ∨ ··· ∨ gn)} ; [x := x' | (g1 ∧ b1) ∨ ··· ∨ (gn ∧ bn)] .

Specification Constants A specification constant is a free variable introduced to name a certain value (for example, the initial value of some variable) in a subsequent derivation. In this section we justify the use of specification constants and show how they make it possible to work with pre–post specifications {p} ;[qˆ] rather than general con- junctive specifications {p} ;[Q] without losing generality. This will be particularly useful in connection with loops, since there is a simple rule for refining a pre–post specification into a loop (Section 21.6). A specification constant is introduced as a fresh (free) variable in a statement. Specification constants are not intended to appear in specifications or programs; they are used as a vehicle for performing program derivations. The following theorem justifies the use of specification constants.

Theorem 17.5 Let b be a boolean expression with possible free occurrences of x0. The following rule for handling specification constants is then generally valid:

Φ ⊢ {b} ; S ⊑ S′
――――――――――――――――――――――   • x0 not free in Φ, S or S′
Φ ⊢ {∃x0 • b} ; S ⊑ S′

Proof We want to derive {∃x0 • b} ; S ⊑ S′ from (∀x0 • {b} ; S ⊑ S′). We have

  (∀x0 • {b} ; S ⊑ S′)
≡ {definition of refinement}
  (∀x0 q σ • b.σ ∧ S.q.σ ⇒ S′.q.σ)
≡ {quantifier rules, x0 not free in S or S′}
  (∀q σ • (∃x0 • b.σ) ∧ S.q.σ ⇒ S′.q.σ)
≡ {pointwise extension}
  (∀q σ • (∃x0 • b).σ ∧ S.q.σ ⇒ S′.q.σ)
≡ {definition of refinement}
  {∃x0 • b} ; S ⊑ S′
□

The rule is used as follows. If we want to refine the statement {∃x0 • t} ; S (where x0 is a variable that does not occur free in S), we start a subderivation from {t} ; S. The assertion can then be used to simplify and refine S. Whenever we reach a statement S′ that does not contain x0 free, we can close the subderivation and deduce {∃x0 • t} ; S ⊑ S′. Thus the specification constant does not occur free on the outermost level in the derivation.

The simplest and most common use of specification constants is where x0 stands for the initial value of some program variable x. Then the predicate t is x = x0, and the assertion {∃x0 • t} is trivially equal to skip.

Implementing Specifications

Using specification constants, we can derive a rule for reducing a refinement to a correctness condition with program variables.

Theorem 17.6 Assume that statement S is built from assignments to program variables x and y using statement constructors. The following rule for reducing refinement to correctness is valid:

(p ∧ x = x0 ∧ y = y0) {| S |} (b[x, x′ := x0, x] ∧ y = y0)
―――――――――――――――――――――――――――――――――――――――――――
{p} ; [x := x′ | b] ⊑ S

where x0 and y0 are fresh variables.

Proof The assumption that S is built from assignments to x and y makes it possible to prove (by induction over the structure of S) the following: p {| S |} q ⇒

• •   •   (λσ p.σ ∧ σ =σ0) {| S |} (λσ q.σ ∧ (∃x y σ =set.x.x .(set.y.y .σ0))) .  We then have (where b0 abbreviates b[x, x := x0, x])

(p ∧ x = x0 ∧ y = y0) {| S |} b0 ∧ y = y0 ≡{property above}

• (λσ p.σ ∧ σ = σ0 ∧ x.σ = x0 ∧ y.σ = y0) {| S |}

• • (λσ ∃x1 y1 b0.σ ∧ σ = set.x.x1.(set.y.y1.σ0) ∧ y.σ = y0) ≡{use state attribute properties to rewrite}

• (λσ p.σ ∧ σ = σ0 ∧ x.σ = x0 ∧ y.σ = y0) {| S |}

• • (λσ ∃x1 b0.σ ∧ σ = set.x.x1.(set.y.y0.σ0) ∧ y.σ = y0) ≡{Lemma 17.4}

• {λσ p.σ ∧ σ = σ0 ∧ x.σ = x0 ∧ y.σ = y0} ;  • •    [λσ σ ∃x1 b0.σ ∧ σ = set.x.x1.(set.y.y0.σ ) ∧ y.σ = y0]  S ≡{use assertion to rewrite}

• {λσ p.σ ∧ σ = σ0 ∧ x.σ = x0 ∧ y.σ = y0} ;  • •   [λσ σ ∃x1 b0.σ ∧ σ = set.x.x1.σ0] 280 17. Correctness and Refinement of Statements

 S ⇒{subderivation below, transitivity of }

• {λσ p.σ ∧ σ = σ0 ∧ x.σ = x0 ∧ y.σ = y0} ;     [λσ σ • (∃x • σ = set.x.x .σ ∧ b.σ )]  S ≡{rewrite to assignment notation}

•  {λσ p.σ ∧ σ = σ0 ∧ x.σ = x0 ∧ y.σ = y0} ;[x := x | b]  S and the result then follows from Theorem 17.5. The subderivation referred to above is as follows:

• {λσ p.σ ∧ σ = σ0 ∧ x.σ = x0 ∧ y.σ = y0} ;     [λσ σ • (∃x • σ = set.x.x .σ ∧ b.σ )] 

• {λσ p.σ ∧ σ = σ0 ∧ x.σ = x0 ∧ y.σ = y0} ;  • •    [λσ σ ∃x1 b[x, x := x0, x].σ ∧ σ = set.x.x1.σ0] ≡{use the rule {p} ;[P] {p} ;[Q] ≡|p| ; Q ⊆ P (Exercise 17.8)}

 • (∀σσ x1 p.σ ∧ σ = σ0 ∧ x.σ = x0 ∧ y.σ = y0 ∧    b[x, x := x0, x].σ ∧ σ = set.x.x1.σ0 ⇒    (∃x • σ = set.x.x .σ ∧ b.σ )) ≡{one-point rule}

 • (∀σ x1 p.σ0 ∧ x.σ0 = x0 ∧ y.σ0 = y0 ∧  b[x, x := x0, x].(set.x.x1.σ0) ⇒  •  (∃x set.x.x1.σ0 = set.x.x .σ0 ∧ b.σ0))

≡{use x1 as existential witness} T The details of the last step are left as an exercise to the reader. 2

This shows that any refinement of the form {p} ; [x := x′ | b] ⊑ S can be proved by proving a corresponding correctness condition. The correctness condition can then be decomposed using the rules given earlier in this chapter, and in the end, proving the refinement has been reduced to proving theorems that do not include any programming constructs.

17.3 Refinement Laws

A refinement law is an inference rule that allows us to deduce that a certain refinement S ⊑ S′ is valid. Since equality is stronger than refinement, any law with a conclusion of the form S = S′ can also be used as a refinement law. We use refinement laws to establish that the individual refinement steps are correct in a derivation S0 ⊑ S1 ⊑ ··· ⊑ Sn. We have already encountered numerous refinement laws. Essentially all the monotonicity and homomorphism properties of statements proved in previous chapters can be used as refinement laws. The monotonicity and homomorphism properties that we have established for other entities in the refinement calculus hierarchy (state transformers, state predicates, and state relations) are also very useful in program derivation, because refinement of statements is often reduced to ordering or homomorphism properties at lower levels in the hierarchy. Below we look more closely at some of the basic classes of refinement laws and justify their use in program derivations.

Monotonicity Laws

The monotonicity properties that we have identified in earlier chapters are central to the stepwise refinement paradigm. For example, we can write the monotonicity law for assert as follows:

Φ ⊢ p ⊆ p′
――――――――――――
Φ ⊢ {p} ⊑ {p′}

This rule tells us that if we prove Φ ⊢ p ⊆ p′ (in a subderivation), then we have established Φ ⊢ {p} ⊑ {p′}, where Φ is the set of current assumptions. In practice, the predicates p and p′ are usually boolean expressions, so proving p ⊆ p′ can be further reduced to proving implication between two boolean terms. In this way, refinement between statements is reduced to implication between boolean terms, two steps down in the refinement calculus hierarchy. This kind of reduction in the hierarchy is very common. This is the reason why we have put so much emphasis on all domains within the refinement calculus hierarchy. Similarly, the fact that sequential composition of monotonic predicate transformers is monotonic in both arguments gives us the following rule:

Φ ⊢ S1 ⊑ S1′    Φ ⊢ S2 ⊑ S2′
――――――――――――――――――――――――――
Φ ⊢ S1 ; S2 ⊑ S1′ ; S2′

A step according to this rule in a program derivation has two subderivations, one to establish S1 ⊑ S1′ and another to establish S2 ⊑ S2′. In practice, we usually leave one of the two substatements unchanged, and the trivial subderivation (of the form S ⊑ S) is then omitted. The general refinement rule for monotonic statement constructs is just a reformulation of the monotonicity property:

Φ ⊢ t ⊑ t′    (λx • S) is monotonic
――――――――――――――――――――――――――
Φ ⊢ S[x := t] ⊑ S[x := t′]

Here t and t′ may be substatements, but they may also be predicates, relations, or other entities that occur in a monotonic position inside S (with ⊑ replaced by a suitable ordering).

Laws for Refining Specification Statements

We will obviously need rules for refining different kinds of specifications. We will here consider one important special case: refining a conjunctive specification to a functional update. A more extensive treatment of refining specification statements is given in Chapter 27. We have the following coercion rule (Exercise 17.9) for this purpose:

Φ ⊢ |p| ; |f| ⊆ Q
――――――――――――――
Φ ⊢ {p} ; [Q] ⊑ ⟨f⟩

Such a rule is well suited for general reasoning about update statements, but as already noted, in practice we work with the assignment notation. We can formulate the above rule for general assignments as follows:

Φ ⊢ b1 ⊆ b2[x′ := e]
――――――――――――――――――――――   (introduce assignment)
Φ ⊢ {b1} ; [x := x′ | b2] ⊑ x := e
• x′ not free in Φ or b1.

To show that this is in fact a special case of the rule given above, we show correspondence between the hypotheses. Assume that p is the boolean expression b1, f is (x := e), and Q is (x := x′ | b2). Then

  |p| ; |f| ⊆ Q
≡ {definitions of coercions, composition, and relation order}
  (∀σ σ′ • (∃σ′′ • σ′′ = σ ∧ p.σ ∧ σ′ = f.σ′′) ⇒ Q.σ.σ′)
≡ {one-point rules}
  (∀σ • p.σ ⇒ Q.σ.(f.σ))
≡ {assumption about p and f}
  (∀σ • b1.σ ⇒ Q.σ.(set.x.(e.σ).σ))
≡ {assumption about Q}
  (∀σ • b1.σ ⇒ (∃x′ • “ set.x.(e.σ).σ = set.x.x′.σ ” ∧ b2.σ))
≡ {state algebra property (Exercise 5.1)}
  (∀σ • b1.σ ⇒ (∃x′ • e.σ = x′ ∧ b2.σ))
≡ {one-point rule}
  (∀σ • b1.σ ⇒ b2[x′ := e].σ)
≡ {pointwise extension}
  b1 ⊆ b2[x′ := e]

As an example of applying this rule, consider the refinement

{x > 0} ; [x := x′ | 0 ≤ x′ < x] ⊑ x := x − 1 .

The coercion rule says that this refinement holds if

(x > 0) ⊆ (0 ≤ x − 1 < x). By reduction, this follows from the fact that x − 1 < x and x > 0 ⇒ x − 1 ≥ 0 hold for any natural number x. Hence, the refinement is valid.

Laws for Assignment Statements

We use homomorphism properties to lift properties of state transformers, predicates and relations to laws for refinement of statements. The properties of assignments proved in Chapter 5 can, e.g., be lifted to properties of assignment statements using the homomorphism of functional update. Thus, Theorem 5.3 gives us the following rule for merging and splitting assignment statements:

x := e ; x := f = x := f[x := e] (merge assignments)

(recall that we implicitly assume that var x is satisfied in rules like this). The corollaries of Theorem 5.3 give us additional rules as special cases. The merge/split rule is as follows for assignments to distinct variables x and y:

x := e ; y := f = x, y := e, f[x := e] .

Furthermore, if y is not free in e, and x is not free in f, then

x := e ; y := f = y := f ; x := e . (commute assignments)
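The merge and commute rules can be checked pointwise on concrete states. The following Python sketch (assumed helper names, not the book's formalism) represents statements as state-to-state functions over dictionaries and verifies one instance of each rule.

def assign(var, expr):
    """Assignment statement var := expr as a state-to-state function."""
    return lambda s: {**s, var: expr(s)}

def seq(*stmts):
    """Sequential composition of state-to-state functions."""
    def run(s):
        for stmt in stmts:
            s = stmt(s)
        return s
    return run

s0 = {'x': 1, 'y': 2}

# merge assignments: x := x + 1 ; y := x * 2  =  x, y := x + 1, (x + 1) * 2
lhs = seq(assign('x', lambda s: s['x'] + 1), assign('y', lambda s: s['x'] * 2))
rhs = lambda s: {**s, 'x': s['x'] + 1, 'y': (s['x'] + 1) * 2}
assert lhs(s0) == rhs(s0)

# commute assignments (y not free in e, x not free in f):
# x := 5 ; y := y + 1  =  y := y + 1 ; x := 5
assert seq(assign('x', lambda s: 5), assign('y', lambda s: s['y'] + 1))(s0) == \
       seq(assign('y', lambda s: s['y'] + 1), assign('x', lambda s: 5))(s0)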

Laws for Relational Assignment Statements

The homomorphism properties of relational updates give us similar rules. From Theorem 9.6 we get the following basic rules:

[x := x′ | b] ; [x := x′ | c] = (demonic merge)
  [x := x′ | (∃x′′ • b[x′ := x′′] ∧ c[x := x′′])] ,
{x := x′ | b} ; {x := x′ | c} = (angelic merge)
  {x := x′ | (∃x′′ • b[x′ := x′′] ∧ c[x := x′′])} .

Special rules then follow from the corollaries of Theorem 9.6. First,

[x := x′ | b] ; [y := y′ | c[x′ := x]] = [x, y := x′, y′ | b ∧ c] .

Furthermore, if x and x′ do not occur free in c and y and y′ do not occur free in b, then

[x := x′ | b] ; [y := y′ | c] = [y := y′ | c] ; [x := x′ | b] . (commute)

The rules for angelic assignments are analogous. Finally, we have two rules derived from Corollary 9.9. If x′ and y′ are not free in e, then

[x, y := x′, y′ | b[x := e]] = (add leading assignment)
x := e ; [x, y := x′, y′ | b] ,

and if x′ is not free in b and y is not free in e, then

[x, y := x′, y′ | b ∧ x′ = e] = (add trailing assignment)
[y := y′ | b] ; x := e[y′ := y] .

The following derivation fragment illustrates adding a trailing assignment:

  [x, y := x′, y′ | y′ ≤ y ∧ x′ > x]
⊑ {replace subcomponent in antimonotonic context}
  [x, y := x′, y′ | y′ ≤ y ∧ x′ = x + 1]
= {add trailing assignment}
  [y := y′ | y′ ≤ y] ; x := x + 1
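As a concrete check of the last step, the following Python sketch models the two relational updates as functions from a state (x, y) to the set of possible successor states, restricting the chosen values to small natural numbers so that the sets are finite; the encoding is ours and serves only as an illustration.

def combined(state):
    # [x, y := x', y' | y' <= y  and  x' = x + 1]
    x, y = state
    return {(x + 1, y_new) for y_new in range(0, y + 1)}

def split(state):
    # [y := y' | y' <= y] ; x := x + 1
    x, y = state
    after_y = {(x, y_new) for y_new in range(0, y + 1)}
    return {(xx + 1, yy) for (xx, yy) in after_y}

assert all(combined((x, y)) == split((x, y))
           for x in range(4) for y in range(4))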

Introducing and Eliminating Conditional Statements

The monotonicity rules preserve the structure of the statement involved; an assert statement is transformed into another assert statement, or a sequential composition is transformed into another sequential composition. Using various homomorphism rules we can also do derivation steps that change or introduce structure. We have already seen how sequential composition can be introduced by splitting assignments (and eliminated by merging assignments). Let us consider similar introduction/elimination rules for conditional statements. An intuitively obvious rule for a deterministic conditional is the following:

S = if g then S else S fi . (introduce conditional)

Since the rule is an equality, it can be used for both introducing and eliminating conditionals. It is formally derived as follows:

  S
= {exhaustion, unit}
  [g ∪ ¬g] ; S
= {homomorphism}
  ([g] ⊓ [¬g]) ; S
= {distributivity}
  [g] ; S ⊓ [¬g] ; S
= {characterization of conditional}
  if g then S else S fi

The generalization of the introduction rule to the guarded conditional statement is

S ⊑ [g1 ∪ ··· ∪ gn] ; if g1 → S [] ··· [] gn → S fi .

We leave the proof of this rule as an exercise to the reader (Exercise 17.6). Note that the guard statement on the right-hand side is needed, for without it the right-hand side would be aborting in a state where all the guards are false. The following rule for eliminating conditionals is useful when we know that a certain branch of the conditional is enabled:

g ⊆ gi ⇒ (eliminate conditional)
{g} ; if g1 → S1 [] ··· [] gn → Sn fi ⊑ Si

when 1 ≤ i ≤ n.

Introduction and Elimination of Local Variables

A local variable introduction is a refinement of the form begin var z := e ; S end ⊑ begin var y, z := e′, e ; S′ end, where S and S′ are related in some way. We saw in Chapter 5 that such block introductions work on the state transformer level. We shall now show how a corresponding basic principle works on the statement level. Local variable introduction for statements works in the same way as for state transformers. The general principle can be formulated as follows: if S is a statement built by statement constructors from assignment statements, asserts, and guards only, and y does not appear free in q or S, then

begin var z | q ; S end ⊑ (introduce local variable)
begin var y, z | p ∧ q ; S′ end ,

where S′ is similar to S, possibly with assignments to y added. An important special case is the situation where the initial statement has no local variables. Then the rule reduces to block introduction:

S ⊑ begin var y | p ; S′ end . (introduce block)

The following derivation outlines how the principle works in the case of block introduction when S is a simple demonic assignment, where y does not occur free in b:

  [x := x′ | b]
⊑ {properties of begin and end}
  begin var y | p ; skip end ; [x := x′ | b]
= {end = [y := y′ | T] ; end by state attribute properties}
  begin var y | p ; [y := y′ | T] end ; [x := x′ | b]
= {end ; [x := x′ | b] = [x := x′ | b] ; end by state attribute properties}
  begin var y | p ; [y := y′ | T] ; [x := x′ | b] end
= {end = [y := y′ | T] ; end by state attribute properties}
  begin var y | p ; [y := y′ | T] ; [x := x′ | b] ; [y := y′ | T] end

Now the assignments to y can be made more deterministic and/or merged with other assignment statements. In this case, we can arrive at the following:

[x := x′ | b] ⊑ begin var y | p ; [x, y := x′, y′ | b] end .

For a more complex statement S, the same idea is used, pushing the end of the block past all substatements and adding assignments to y. The added variable y is a ghost variable; it appears only in assignments to itself. However, once it has been introduced, further refinement steps can make it play an active role in the computation. The rule for local variable introduction has refinement rather than equality, because the initialization and the added assignment to the local variables may introduce miracles. Thus it cannot directly be used for block elimination. However, if all the assignments to a local (ghost) variable are nonmiraculous, then it can be eliminated. Thus, if S is a statement built by statement constructors from assignment statements, asserts, and guards only; and y occurs in S only as a ghost variable, and the initialization p and all assignments to y are nonmiraculous; and y does not occur free in q, and z does not occur free in p, then

begin var y, z | p ∧ q ; S end = (eliminate local variable)
begin var z | q ; S′ end ,

where S′ is similar to S, but with all assignments to y removed.

Block Shunting

An alternative to the rules for block introduction and elimination is the following shunting rule for blocks:

Theorem 17.7 Assume that the block begin var y := e ; S end is well-defined, with global variable(s) x, and that S is constructed by statement constructors from assignment statements, asserts, and guards only. Then

{p ∧ y = e} ; [x, y := x′, y′ | b] ⊑ S
≡ {p} ; [x := x′ | b] ⊑ begin var y := e ; S end ,

provided that y and y′ are not free in p or b.

Proof For the forward implication we do not need the assumption about S. An outline of the proof is as follows:

  {p ∧ y = e} ; [x, y := x′, y′ | b] ⊑ S
⇒ {monotonicity of the block construct}
  begin var y := e ; {p ∧ y = e} ; [x, y := x′, y′ | b] end ⊑ begin var y := e ; S end
≡ {drop redundant assertion}
  begin var y := e ; {p} ; [x, y := x′, y′ | b] end ⊑ begin var y := e ; S end
≡ {push end leftward, use properties of state attributes}
  begin ; end ; {p} ; [x := x′ | b] ⊑ begin var y := e ; S end
≡ {homomorphism, begin ; end = id}
  {p} ; [x := x′ | b] ⊑ begin var y := e ; S end

The reverse implication in the first step is tricky to prove, since it depends on the fact that the only state changes caused by statement S are changes in the attributes x and y (this can be proved by induction on the structure of statement S). The full proof is long and not very informative, so we omit it. For an indication of the idea used, we refer to the proof of Theorem 17.6. □

17.4 Refinement in Context

The refinement relation that we have defined for program statements is very re- strictive; it requires that a refinement should preserve the correctness of the original statement with respect to any correctness requirements that could be stated for the original statement. This means that a substatement refinement preserves correct- ness in any monotonic context. In practice this is often too strong, because we are usually interested in refining a subcomponent in a specific context and do not care whether this refinement preserves correctness in other contexts also. 288 17. Correctness and Refinement of Statements

The general problem of refinement in context can be described as follows. Let C[X := S] be a statement with a specific occurrence of substatement S singled out. Let us write C[X := S] as C[S] and refer to C as the context of S. Since the statement constructors are monotonic, we can always replace S by another statement S′, where S ⊑ S′, and get C[S] ⊑ C[S′]. However, it may be possible to replace S with a statement S′ that does not refine S in isolation, but where C[S] ⊑ C[S′] still holds. This can be done systematically in the following way: First find a statement S0 such that C[S] = C[S0]. Then find a refinement S′ of S0. The refinement C[S] ⊑ C[S′] then follows by monotonicity and transitivity. The less refined we can make S0 without changing the overall effect of C[S0], the more freedom it gives us in the subsequent refinement. Consider the simple refinement

x := 0 ; y := x + 1 ⊑ x := 0 ; y := 1 .

This holds even though y := 1 is not a refinement of y := x + 1. To prove this refinement, we use the fact that x must have the value 0 right after the first assignment. Therefore (and this is easily verified using statement definitions)

x := 0 ; y := x + 1 = x := 0 ; {x = 0} ; y := x + 1 .

Intuitively, this holds because the added assert statement will necessarily act as a skip. Next we note that

{x = 0} ; y := x + 1 ⊑ y := 1 .

Transitivity then gives us the required result. The assert statement {x = 0} is known as a context assertion, and the refinement {x = 0} ; y := x + 1 ⊑ y := 1 is a refinement in context. Intuitively, this says that y := x + 1 is refined by y := 1 when x = 0 initially. We collect information about the context of statement S in a context assertion {p}, and then refine {p} ; S. A derivation that takes the context into account thus has the following structure:

  C[S]
= {introduce context assertion}
  C[{p} ; S]
⊑ {prove {p} ; S ⊑ S′, monotonicity}
  C[S′]
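The following Python sketch (our own operational reading, not part of the calculus) replays the same refinement in context: with states as dictionaries, the context assertion is a runtime assert that indeed acts as a skip, and the original and refined fragments compute the same final state.

def original(s):
    s = {**s, 'x': 0}
    assert s['x'] == 0            # context assertion {x = 0}, acts as skip here
    return {**s, 'y': s['x'] + 1}

def refined(s):
    s = {**s, 'x': 0}
    return {**s, 'y': 1}

assert original({'x': 7, 'y': 7}) == refined({'x': 7, 'y': 7})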

Adding Context Information

To make the above method work, we need rules that add context assertions to a statement. The most general rule is the following: If S is conjunctive, then

(p ∩ S.true) {| S |} q ⇒ {p} ; S = {p} ; S ; {q} , (add assertion)

which is proved as follows:

(p ∩ S.true) {| S |} q, S conjunctive
⊢ ({p} ; S ; {q}).r
  = {definitions}
    p ∩ S.(q ∩ r)
  = {S conjunctive}
    p ∩ S.q ∩ S.r
  = {correctness assumption implies p ∩ S.r ⊆ S.q, lattice property}
    p ∩ S.r
  = {definitions}
    ({p} ; S).r

If we are faced with a statement S that has no initial assertion, then we can always add the trivial assertion {true} (i.e., skip). An important special case is where S is a deterministic assignment. In this case we get the rule

{p} ; x := e = {p} ; x := e ; {∃x0 • x = e[x := x0]} .

We can also use guard statements [p] to indicate assumptions about the context of a subcomponent to be refined. The use of such context assumptions is dual to the use of context assertions. We will study the rules for adding context assertions and assumptions in much more detail in Chapter 28 and will also study refinement in context for different kinds of statements. Here we have limited ourselves to introducing the basic ideas involved in this refinement technique.

Context Assertions in Conditionals and Blocks

A branch in a conditional statement is chosen depending on the truth of a guard. This means that we can use the information that the guard is true when refining a branch. In particular, we can push assertions into conditionals as follows:

{p} ; if g then S1 else S2 fi =

if g then {p ∩ g} ; S1 else {p ∩¬g} ; S2 fi and

{p} ; if g1 → S1 [] ··· [] gn → Sn fi =

if g1 → {p ∩ g1} ; S1 [] ··· [] gn → {p ∩ gn} ; Sn fi

When a block construct initializes a local variable, the initialization can be used as context information when refining the block further. If S is written using program variable notation (with global variables x) and the block adds local variables, then we can generalize the local variable introduction in a simple way:

begin var z | b ; {p} ; S end ⊑ begin var y, z | b′ ∧ b ; {p ∧ b′} ; S′ end ,

provided that y does not appear in b or S. Here S′ is similar to S, possibly with assignments to y added. The special case in which there are no local variables to start with reduces to the following:

{p} ; S ⊑ begin var y | b′ ; {p ∧ b′} ; S′ end .

We leave the verification of this rule to the reader as an exercise (Exercise 17.12).

Using Context Information

Context information is propagated in order to be used in later refinement steps. The most common use of context information is in rewriting an expression inside an assignment statement or a guard of a conditional. We shall here give rules for this use of context information; more details and proofs are found in Chapter 28. Context information right in front of an assignment statement can be used for rewriting as follows:

Φ ⊢ b ⊆ (e = e′)
―――――――――――――――――――――
Φ ⊢ {b} ; x := e = {b} ; x := e′

For relational assignments, we have that

Φ ⊢ (p ∧ c) ⊆ b
―――――――――――――――――――――――――――――
Φ ⊢ {p} ; [x := x′ | b] ⊑ {p} ; [x := x′ | c]

Φ ⊢ (p ∧ b) ⊆ c
―――――――――――――――――――――――――――――
Φ ⊢ {p} ; {x := x′ | b} ⊑ {p} ; {x := x′ | c}

Context information in front of a conditional can be used to modify the guards:

Φ ⊢ b ⊆ (g1 = g1′ ∧ ··· ∧ gn = gn′)
――――――――――――――――――――――――――――――――――――――――――――――――――――
Φ ⊢ {b} ; if g1 → S1 [] ··· [] gn → Sn fi = {b} ; if g1′ → S1 [] ··· [] gn′ → Sn fi

Example Derivation

We illustrate these rules by developing a program fragment that computes the minimum of two numbers. We do not assume a min operator; instead we start from a relational assignment.

  [z := z′ | (z′ = x ∧ x ≤ y) ∨ (z′ = y ∧ x ≥ y)]
= {introduce conditional}
  if x ≤ y → {x ≤ y} ; [z := z′ | “ (z′ = x ∧ x ≤ y) ∨ (z′ = y ∧ x ≥ y) ”]
  [] x ≥ y → {x ≥ y} ; [z := z′ | (z′ = x ∧ x ≤ y) ∨ (z′ = y ∧ x ≥ y)]
  fi
= {rewrite using context information x ≤ y}
  if x ≤ y → {x ≤ y} ; [z := z′ | z′ = x]
  [] x ≥ y → {x ≥ y} ; [z := z′ | “ (z′ = x ∧ x ≤ y) ∨ (z′ = y ∧ x ≥ y) ”]
  fi
= {rewrite using context information x ≥ y}
  if x ≤ y → {x ≤ y} ; [z := z′ | z′ = x]
  [] x ≥ y → {x ≥ y} ; [z := z′ | z′ = y]
  fi
= {rewrite deterministic updates, drop assertions}
  if x ≤ y → z := x [] x ≥ y → z := y fi
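Transcribed into Python (as an illustration only), the final program and a brute-force check against the original relational specification look as follows.

def z_min(x, y):
    # if x <= y -> z := x [] x >= y -> z := y fi  (both guards are enabled when
    # x = y, and either branch then gives the same result)
    return x if x <= y else y

def spec_ok(x, y, z):
    # (z = x and x <= y) or (z = y and x >= y)
    return (z == x and x <= y) or (z == y and x >= y)

assert all(spec_ok(x, y, z_min(x, y)) for x in range(-3, 4) for y in range(-3, 4))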

17.5 Refinement with Procedures

Recall the definition of a (local) procedure:

proc N(var x, val y) = S in (local procedure)
··· N(u, e) ···
≙ let N = (λ x y • S) in
  ··· begin var y := e ; N.u.y end ···

From the definition of the let construct we have

proc N(var x, val y) = S in ··· N(u, e) ···
= ··· begin var y := e ; S[x := u] end ···

This shows that if we prove a refinement S ⊑ S′, then we have shown how to refine the whole statement construct including the local procedure declaration, provided that the context indicated by the dots ··· is monotonic. This justifies the following rule of procedure refinement:

S ⊑ S′
――――――――――――――――――――――――――――――――――――――――――――――
proc N(var x, val y) = S in T ⊑ proc N(var x, val y) = S′ in T

In practice, we assume that T is built from basic statements and procedure calls N(u, e) using monotonic constructors only, so the rule can be applied.

Thus we can identify and introduce a procedure in an early stage of a derivation and then refine the procedure body at a later stage. This makes the derivation more structured and simpler, especially if the program contains many different calls to the same procedure.

The following simple example illustrates the idea of procedure refinement:

  ··· [b := b′ | 0 ≤ b′ ≤ a + c] ···
= {procedure introduction}
  proc Between(var y, val x, z) = {x ≤ z} ; [y := y′ | x ≤ y′ ≤ z] in
  ··· Between(b, 0, a + c) ···
⊑ {make procedure body deterministic}
  proc Between(var y, val x, z) = y := (x + z)/2 in
  ··· Between(b, 0, a + c) ···
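A small Python sketch (hypothetical function name, integer division assumed) of the refined procedure body, together with a check that it establishes the condition x ≤ y ≤ z required by the abstract body:

def between(x, z):
    assert x <= z            # the assertion {x <= z} of the abstract body
    return (x + z) // 2      # deterministic choice of a value in the range

assert all(x <= between(x, z) <= z for x in range(-5, 6) for z in range(x, 6))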

17.6 Example: Data Refinement

During a program derivation we are often in a situation where we have a block with an abstract (inefficient or unimplementable) local data structure a, and we want to replace it with another, more concrete (efficient, implementable) data structure. Such a data refinement can be justified using the rules of block introduction and block elimination. The general idea works in three steps:

  begin var a ; S1 end
⊑ {introduce ghost variable c}
  begin var a, c ; S2 end
⊑ {refine the block body so that a becomes a ghost variable}
  begin var a, c ; S3 end
= {eliminate ghost variable a}
  begin var c ; S4 end

In the first step S2 is syntactically like S1, but with assignments to c added (recall that c is a ghost variable if it appears only in assignments to c). The added assign- ments are chosen such that a certain relationship between c and a (the abstraction invariant) always holds. This relationship is then used to refine S2 into S3, where c has taken over the role that a played before and a has become a ghost variable. Finally, in the block elimination S4 is syntactically like S3 but with all assignments to a removed. 17.6. Example: Data Refinement 293

Problem Description: The Least Nonoccurrence

We illustrate the idea of data refinement with an example, developing a program that finds the smallest natural number not occurring in a given array. We assume that the array B : array Nat of Nat is given and that we want to find the smallest number not occurring in B[0 ... n − 1]. Furthermore, we assume that all elements in B[0 ... n − 1] are within the range 0 ... m − 1. In other words, we assume (∀i | i < n • B[i] < m). To make specifications easier to read, we introduce notation for when a value occurs as an element in an array segment:

elem.z.a[j ... k] ≙ (∃i | j ≤ i ≤ k • a[i] = z) .

For simplicity, we abbreviate elem.z.B[0 ... n − 1] as elem.z.B. Our specification is then

[x := x′ | ¬ elem.x′.B ∧ (∀y • ¬ elem.y.B ⇒ x′ ≤ y)] .

First Refinement: A Block Introduction

Our aim is now to develop a program that is reasonably efficient when n is large and m is small. Our idea is first to collect information about what numbers occur in B[0 ... n − 1] and then find the smallest number that is not among these. To collect this information we add a local variable A : Set of Nat (in variable declarations, we use the notation Set of Σ for P(Σ)).

  [x := x′ | ¬ elem.x′.B ∧ (∀y • ¬ elem.y.B ⇒ x′ ≤ y)]
⊑ {introduce block with initial assignment}
  begin var A : Set of Nat ;
    [A := A′ | (∀y • y ∈ A′ ≡ elem.y.B)] ; {∀y • y ∈ A ≡ elem.y.B} ;
    [x := x′ | “ ¬ elem.x′.B ∧ (∀y • ¬ elem.y.B ⇒ x′ ≤ y) ”]
  end
⊑ {use assertion to rewrite}
  begin var A : Set of Nat ;
    [A := A′ | (∀y • y ∈ A′ ≡ elem.y.B)] ;
    [x := x′ | x′ ∉ A ∧ (∀y • y ∉ A ⇒ x′ ≤ y)]
  end

Now the problem is divided into two steps: first build the set A and then find the smallest number not in A.

Second Refinement: Change Data Structure

We shall now replace the data structure A (a set) with a more concrete structure. Since we have assumed that the bound m of the numbers in A is reasonably small, we can implement a set as an array C : array Nat of Bool so that C[x] is true if and only if x is in the set A. We do this data refinement in the way described above. When adding the local variable C, we make sure that the abstraction invariant

(∀y • y ∈ A ≡ y < m ∧ C[y]) holds at every point of the block. After that, we make small changes to the program text with the aim of making A a ghost variable.

  begin var A : Set of Nat ;
    [A := A′ | (∀y • y ∈ A′ ≡ elem.y.B)] ;
    [x := x′ | x′ ∉ A ∧ (∀y • y ∉ A ⇒ x′ ≤ y)]
  end
⊑ {introduce local variable with abstraction invariant}
  begin var A : Set of Nat, C : array Nat of Bool ;
    [A, C := A′, C′ | (∀y • y ∈ A′ ≡ elem.y.B) ∧ “ (∀y • y ∈ A′ ≡ y < m ∧ C′[y]) ”] ;
    {∀y • y ∈ A ≡ y < m ∧ C[y]} ;
    [x := x′ | x′ ∉ A ∧ (∀y • y ∉ A ⇒ x′ ≤ y)]
  end
= {rewrite with local assumption (∀y • y ∈ A′ ≡ elem.y.B)}
  begin var A : Set of Nat, C : array Nat of Bool ;
    [A, C := A′, C′ | (∀y • y ∈ A′ ≡ elem.y.B) ∧ (∀y • elem.y.B ≡ y < m ∧ C′[y])] ;
    {∀y • y ∈ A ≡ y < m ∧ C[y]} ;
    [x := x′ | “ x′ ∉ A ∧ (∀y • y ∉ A ⇒ x′ ≤ y) ”]
  end
= {rewrite using context information}
  begin var A : Set of Nat, C : array Nat of Bool ;
    “ [A, C := A′, C′ | (∀y • y ∈ A′ ≡ elem.y.B) ∧ (∀y • elem.y.B ≡ y < m ∧ C′[y])] ” ;
    [x := x′ | (x′ ≥ m ∨ ¬C[x′]) ∧ (∀y • y ≥ m ∨ ¬C[y] ⇒ x′ ≤ y)]
  end

⊑ {split assignment}
  begin var A : Set of Nat, C : array Nat of Bool ;
    [A := A′ | (∀y • y ∈ A′ ≡ elem.y.B)] ;
    [C := C′ | (∀y • elem.y.B ≡ y < m ∧ C′[y])] ;
    [x := x′ | (x′ ≥ m ∨ ¬C[x′]) ∧ (∀y • y ≥ m ∨ ¬C[y] ⇒ x′ ≤ y)]
  end
= {eliminate local ghost variable A}
  begin var C : array Nat of Bool ;
    [C := C′ | (∀y • elem.y.B ≡ y < m ∧ C′[y])] ;
    [x := x′ | (x′ ≥ m ∨ ¬C[x′]) ∧ (∀y • y ≥ m ∨ ¬C[y] ⇒ x′ ≤ y)]
  end

Now we have reached a version in which the array C is used to represent the set of numbers that occur in the array B. This version can be implemented in an efficient way using rules for introducing loops (Chapter 21, Exercise 21.9).
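For comparison, here is one possible executable rendering of this final version in Python (our sketch; the actual loop introduction is only justified by the rules of Chapter 21):

def least_nonoccurrence(B, n, m):
    # assumes all elements of B[0..n-1] lie in the range 0..m-1
    C = [False] * m
    for i in range(n):            # build C: C[y] iff y occurs in B[0..n-1]
        C[B[i]] = True
    x = 0
    while x < m and C[x]:         # least x with x >= m or not C[x]
        x += 1
    return x

assert least_nonoccurrence([0, 1, 1, 3], 4, 5) == 2
assert least_nonoccurrence([0, 1, 2], 3, 4) == 3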

17.7 Summary and Conclusions

We have analyzed the way in which correctness and refinement are related to each other and used this as a basic justification for the stepwise refinement method for program construction. We have also given some practical refinement laws for program derivation. These are proved by using the basic properties of program statements, such as monotonicity and homomorphism properties, together with the basic properties of assignments and assignment statements given in the previous chapter. We will later study other important kinds of refinement laws, concerned with recursion, iteration, specification, and context refinement. Here we wanted only to give a first impression of the kinds of rules that can be proved formally in refinement calculus. There are many more such refinement rules, most of which can be derived quite easily. Rather than trying to present a long list of more or less ad hoc rules for prac- tical program derivation, we wanted to emphasize the central algebraic structures that underlie the refinement calculus, and have tried to identify the basic definitions and properties that one needs when reasoning in the calculus. The hope is that with this basis it will be simple to prove less common refinement rules on the spot, as they are needed, and only memorize the most important ones. Early work on correctness and verification of programs was done by Floyd [56]. Hoare gave rules for partial correctness of deterministic programs with while loops [84]. Correctness proofs of programs (both partial and total correctness), in par- ticular for loops, has been treated extensively by, e.g., Dijkstra [53], Gries [64], 296 17. Correctness and Refinement of Statements

John Reynolds [121], and Hehner [75]. An overview of earlier work is given by Krysztof Apt [4]. The notion of refinement as a total correctness-preserving relation between program statements and its formulation in terms of predicate transformers was introduced by Back [6, 7]. Back [9] generalized the notion of refinement to preserving arbitrary correctness properties and showed how this notion is related to the denotational se- mantic view of approximation of statements, and to the Smyth ordering [119, 130]. He also introduced a notion of refinement that preserves partial correctness and showed that the refinement relation that preserves both partial and total correct- ness is equivalent to the Egli–Milner ordering [35, 118, 123] of (demonically) nondeterministic program statements. Specification statements as a way of expressing input–output properties directly in a programming language were introduced by Back [6], replacing correctness rea- soning with a calculus of refinement. Morgan [104] used the pre-post specification (with syntax x :[p, q]) together with rules for introducing structures rather than correctness rules. Specification constants have been a standard method of making postconditions refer to initial values of program variables. Morgan developed a block with a “logical constant” (essentially an angelically initialized local variable) to make specification constants part of the programming notation [106]. Stepwise refinement, essentially in the form of top-down program refinement, was originally introduced as a systematic programming technique by Dijkstra in the late ’60s and early ’70s [51]. Wirth was also an early proponent of the technique, and published an influential paper on the practical application of the stepwise refine- ment method to program derivation [141]. Hoare’s axiomatic correctness calculus was also inspired by the stepwise and top-down method for program construction, and he illustrated a formal approach to top-down program construction in an early paper [86]. The formalization of this method based on Dijkstra’s weakest precon- dition calculus was the original motivation for developing the refinement calculus [6, 7]. Since then, the refinement calculus has become much broader in scope and aim, but the stepwise refinement method is still one of the main motivations for studying and applying this calculus. The top-down program derivation method has continued to be one of the main programming methods taught in programming courses. The basic idea is one of separation of concerns, articulated by Dijkstra in his early work: one should try to separate the issues and design decisions to be made, and attack one problem at a time. The simple top-down derivation method has also some disadvantages. It tends to lead to large, monolithic, single-purpose programs that are difficult to reuse. The advantage of the refinement calculus for top-down program construction is the possibility of using specification statements and the possibility of combining 17.8. Exercises 297

pure top-down derivations with optimizing program transformations, which do not fit into the standard top-down design method. Such transformations are usually needed, because the top-down method tends to produce independent refinement of specification and duplication of code, which can often be removed by identifying common substructures.

17.8 Exercises

17.1 Prove the following rules for total correctness:

p {| { R}|}q ≡ ran.(|p| ; R) ∩ q = false , p {| [R] |} q ≡ ran.(|p| ; R) ⊆ q . 17.2 Prove the correctness rules for the deterministic conditional and for the block construct. 17.3 Use the rules for relational assignments to prove the following correctness asser- tions:

(a) x ≥ y {| [x := x | x > x] |} x > y .

(b) x ≤ y {| { x := x | x > x}|}x > y . 17.4 Show that the semantics of the pre–post specification is as follows: ({p} ;[rˆ]).q.σ ≡ p.σ ∧ r ⊆ q . 17.5 Show that a conditional specification can be reduced to the form with a single demonic assignment, as follows:   {p} ; if g1 → [x := x | b1][]g2 → [x := x | b2] fi =  {p ∧ (g1 ∨ g2)} ;[x := x | (g1 ∧ b1) ∨ (g2 ∧ b2)] . 17.6 Prove the rule for introduction of a guarded conditional:

S  [g1 ∪···∪gn];if g1 → S [] ··· [] gn → S fi . 17.7 Prove the general rule for introduction of a deterministic conditional:

{p} ; S = if g then {p ∩ g} ; S else {p ∩¬g} ; S fi . 17.8 Prove the following rule for refining a specification with another specification: {p} ;[P] {p} ;[Q] ≡|p| ; Q ⊆ P . 17.9 Prove the rule for introduction of a functional update:  | | | |⊆ p ; f Q .  {p} ;[Q] f 298 17. Correctness and Refinement of Statements

17.10 Use rules for introducing conditionals and assignments to refine the following specification:

z := 100 min (x max y) (“set z to the greater of x and y but no greater than 100”). The resulting state- ment should not use the min and max operators. Hint: either you can modify the assignment introduction rule or you can rewrite the specification as a relational assignment. 17.11 Use the rule for adding a leading assignment to derive the following version of the minimum calculation: z := x; if y < z then z := y else skip fi . 17.12 Verify the principle of block introduction with context information:  var x {p} ; S  begin var y | b ; {p ∧ b} ; S end , where S is similar to S, provided that y does not appear in S. 17.13 Recall that statement S is finitely conjunctive if S.(p∩q) = S.p∩ S.q for arbitrary predicates p and q.Define Corr(p, q) (the correctness category for precondition p : P() and postcondition q : P()) to be the set of all finitely conjunctive statements S satisfying p {| S |} q. Show that for any given collection of underlying state spaces, the collection of all Corr(p, q) is a category. 17.14 Prove the following property of the correctness category:     Corr(p, q) ∩ Corr(p , q ) ⊆ Corr(p ∪ p , q ∪ q ). Hint: Show that if S is finitely conjunctive, then (p {| S |} q) ∧ (p {| S |} q) ⇒ p ∩ p {| S |} q ∩ q . 17.15 Show that if p ⊆ S.true, then there is a least (strongest) predicate q that satisfies p {| S |} q. 17.16 We define weak correctness as follows: statement S is weakly correct with respect to precondition p and postcondition q, written p (| S |) q, if the following holds:

p ∩ S.true ⊆ S.q .

Now let S = x := 0 ⊔ x := 1 and T = {x = 1}, with predicates p = true, q = (x = 0), and r = (x = 0). First show that

S ; T = x := 1 .

Then prove that p (| S |) q and q (| T |) r, but not p (| S ; T |) r.

Part III

Recursion and Iteration

18

Well-founded Sets and Ordinals

So far, we have looked at lattices as a special kind of posets. In this chapter, we focus on another important special kind of posets: well-founded sets and well-ordered sets. Well-founded sets are partially ordered sets with the additional property that it is possible to use induction to prove that all elements have a given property. Well-ordered sets are simply well-founded sets on which the order is total.

The construction of the natural numbers in higher-order logic provides us with a standard example of a well-ordered set. A more general example of well-ordered sets is given by ordinals. The ordinals can be seen as a generalization of the natural numbers. They are useful in arguments in which the set of natural numbers is “too small” to be sufficient. We show how to define primitive recursion on natural numbers and how to extend this notion to ordinal recursion. These notions in turn justify the principle of induction for the natural numbers and ordinal induction (also known as transfinite induction). These will be found very useful, in particular in reasoning about recursion and iteration.

The sections on ordinals contain a considerable amount of material, and to avoid making the chapter too long, many proofs are only sketched or are omitted alto- gether.

18.1 Well-Founded Sets and Well-Orders

An element a is minimal in a set A if there is no smaller element in the set, i.e., if (∀b ∈ A • ¬(b < a)). A partially ordered set (W, ⊑) is said to be well-founded if every nonempty subset of W has a minimal element, i.e., if

(∀A ⊆ W • A ≠ ∅ ⇒ (∃a ∈ A • ∀b ∈ A • ¬(b < a))) .

For example, any finite poset is well-founded (strictly speaking, a relation can be well-founded without being a partial order; here we consider only well-founded partial orders). It is also easy to see that Nat (the natural numbers ordered by ≤) is well-founded. A further example is Nat × Nat with the lexicographical order (see Exercise 18.1). The natural numbers ordered by ≥ are a simple example of a partially ordered set that is not well-founded. An important property of well-founded sets is that any strictly descending sequence must be finite.
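The following Python sketch illustrates these definitions on finite data (the names and encodings are ours): the lexicographic order on pairs of naturals, minimal elements of a finite subset, and a strictly descending chain that necessarily bottoms out.

def lex_less(a, b):
    # (a1, a2) < (b1, b2)  iff  a1 < b1, or a1 = b1 and a2 < b2
    return a[0] < b[0] or (a[0] == b[0] and a[1] < b[1])

def minimal_elements(A):
    return {a for a in A if not any(lex_less(b, a) for b in A)}

assert minimal_elements({(2, 5), (1, 9), (1, 3), (4, 0)}) == {(1, 3)}

# a strictly descending chain from (2, 3): it must reach a minimal point,
# even though the second component may jump upward along the way
chain = [(2, 3)]
while chain[-1] != (0, 0):
    x, y = chain[-1]
    chain.append((x, y - 1) if y > 0 else (x - 1, x + 5))
assert all(lex_less(chain[i + 1], chain[i]) for i in range(len(chain) - 1))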

Lemma 18.1 Assume that the set (W, ⊑) is well-founded and that

w0 > w1 > w2 > ···

is a strictly descending chain in W. Then the chain must be finite.

Proof If the chain is not finite, then the set {wn | n ∈ Nat} does not have a minimal element. □

Well-Founded Induction

In any well-founded set W, the principle of well-founded induction is valid. This principle states that in order to prove that a property p holds for all elements in W, it is sufficient to prove that p holds for arbitrary w, assuming that p holds for all v ∈ W such that v < w. Thus, if W is well-founded, then

(∀w ∈ W • (∀v ∈ W • v < w ⇒ p.v) ⇒ p.w) ⇒ (∀w ∈ W • p.w)

for arbitrary p (in fact, the implication can be strengthened to equivalence). In this rule, (∀v ∈ W • v < w ⇒ p.v) is called the induction assumption. The principle of well-founded induction is derived from the definition of well-foundedness as follows:

  (W, ⊑) is well-founded
≡ {definition}
  (∀A ⊆ W • A ≠ ∅ ⇒ (∃a ∈ A • ∀b ∈ A • ¬(b < a)))
⇒ {specialize A := {x ∈ W | ¬ p.x}}
  (∃a ∈ W • ¬ p.a) ⇒ (∃a ∈ W • ¬ p.a ∧ (∀b ∈ W • “ ¬ p.b ⇒ ¬(b < a) ”))
≡ {contrapositive on subterm}
  (∃a ∈ W • ¬ p.a) ⇒ (∃a ∈ W • ¬ p.a ∧ (∀b ∈ W • b < a ⇒ p.b))
≡ {contrapositive}
  (∀a ∈ W • “ p.a ∨ ¬(∀b ∈ W • b < a ⇒ p.b) ”) ⇒ (∀a ∈ W • p.a)
≡ {rewrite as implication}
  (∀a ∈ W • (∀b ∈ W • b < a ⇒ p.b) ⇒ p.a) ⇒ (∀a ∈ W • p.a)

In the special case of the natural numbers, the principle of well-founded induction is called complete induction. In fact, this principle can be shown to be equivalent to the ordinary principle of induction on Nat (see Exercise 18.2).

Well-Ordered Sets

Recall that a partial order is linear if for arbitrary a and b, either a ⊑ b or b ⊑ a holds. A partial order that is both linear and well-founded is called a well-order. Well-orders are central to the foundations of mathematics because of the well-ordering principle, which states that every set can be well-ordered. This principle can be shown to be equivalent to the axiom of choice, which states that for an arbitrary collection 𝒜 of nonempty sets there exists a choice function f that satisfies f.A ∈ A for all A ∈ 𝒜. Because higher-order logic contains a choice operator, the axiom of choice can be proved as a theorem (see Chapter 7). We return to the well-ordering principle in Section 18.4 below. The natural numbers is an example of a well-ordered set. Any subset of the natural numbers is also well-ordered. Ordinals, introduced in Section 18.4 below, are also well-ordered.

18.2 Constructing the Natural Numbers

In Section 10.2, we described how the natural numbers can be defined as a new type that is isomorphic to a subset of the type Ind of individuals. We shall now show a different construction that makes Nat isomorphic to a subset of P(Ind). This type definition acts as an introduction to the definition of ordinals in Section 18.4.

We let every natural number be represented by a subset of Ind. Intuitively speak- ing, the size of the set shows which natural number is represented, and we get successively larger sets by adding one element at a time (always using the choice operator to select a new element of Ind). To facilitate this construction, we first define the following constants:

rep zero ≙ ∅ ,
rep suc.B ≙ B ∪ {(ε y : Ind • y ∉ B)} .

FIGURE 18.1. Constructing the type of natural numbers

We intend rep zero (the empty set of individuals) to represent zero and rep suc to represent the successor function. We can now describe the set that represents the naturals:

Natset ≙ ∩ {ℬ : P(P(Ind)) | rep zero ∈ ℬ ∧ (∀B • B ∈ ℬ ⇒ rep suc.B ∈ ℬ)}

This makes Natset the smallest set that contains rep zero and is closed under rep suc (Exercise 18.3). We use here the convention that lower-case letters stand for el- ements, upper-case letters for sets, and script letters for collections of sets. The construction of the natural numbers type is illustrated in Figure 18.1.

We can now prove the following characterization lemma, which shows that Natset is a suitable set to represent the natural numbers:

Lemma 18.2 The set Natset has the following property:

B ∈ Natset ≡ B = rep zero ∨ (∃C ∈ Natset • B = rep suc.C).

Defining the Type of Natural Numbers

We now have all the pieces needed for the type definition. The set Natset is nonempty, since it contains rep zero, so we introduce the name Nat for the new type represented by Natset:

Nat ≅ Natset .

As described in Chapter 10, this means that the abstraction and representation functions abs : P(Ind) → Nat and rep : Nat → P(Ind) satisfy the following properties:

(∀n • abs.(rep.n) = n), (∀B • B ∈ Natset ⇒ rep.(abs.B) = B).

Zero and the successor function now have obvious definitions; they are the abstrac- tions of rep zero and rep suc defined earlier:

0 ≙ abs.rep zero ,
suc.n ≙ abs.(rep suc.(rep.n)) .

Having defined 0 and suc, we really do not want to refer to the functions abs and rep any more when reasoning about the natural numbers. This is made possible by a single theorem that characterizes the natural numbers completely, and which we will describe next.

18.3 Primitive Recursion

In general, it is desirable to be able to characterize a data type in a single characterization theorem. For Nat this is indeed possible; the theorem in question is known as the primitive recursion theorem:

Theorem 18.3 For arbitrary type Σ, the following holds:

(∀g1 : Σ • ∀g2 : Σ → Nat → Σ • ∃! f : Nat → Σ •
  f.0 = g1 ∧ (∀n • f.(suc.n) = g2.(f.n).n)) .

A detailed proof of Theorem 18.3 is beyond the scope of this text; it involves proving numerous properties about the set that represents Nat (see Exercise 18.4). Let us look at the primitive recursion theorem in some detail. It asserts that if we are given an element g1 of  and a function g2 :  → Nat → , then there always exists a uniquely defined function f such that g1 is the value of f at 0 and such that g2 gives the value of f at any successor suc.n in terms of n and of the value of f at n. This is a foundation for making primitive recursive function definitions. Given g1 and g2, the principle of constant definition then permits us to give a name to the function whose existence is guaranteed by the primitive recursion theorem. As a simple example of function definition by primitive recursion, we show how to define evenness. We give the recursive definition in the form

(even.0 ≡ T) ∧ (∀n • even.(suc.n) ≡ ¬ even.n) . (∗)

Note that this is not a proper definition in higher-order logic, since even is “defined” in terms of itself. In fact, we are here using the primitive recursion theorem with

g1 := T and g2 := (λ b m • ¬ b) to get

(∃! f • (f.0 ≡ T) ∧ (∀n • f.(suc.n) ≡ ¬ f.n))

(the reader is encouraged to verify that this is indeed what we get). We then get the “definition” above by specifying that even is the name of the function whose existence we have just shown. Formally, we define

even ≙ (ε f • (f.0 ≡ T) ∧ (∀n • f.(suc.n) ≡ ¬ f.n))

and then use selection introduction to deduce (∗) as a theorem. We can now define other functions in the same way. The following examples show how addition and the standard order are defined. In both cases we really define functions of more than one argument, but the primitive recursion is over exactly one of the arguments.

(∀n • 0 + n = n) ∧ (∀m n • suc.m + n = suc.(m + n)) ,
(∀n • 0 ≤ n ≡ T) ∧ (∀m n • suc.m ≤ n ≡ m ≤ n ∧ m ≠ n) .

We leave it to the reader to verify that these are in fact valid definitions (compare this with Exercise 10.6).
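The scheme of Theorem 18.3 can also be mimicked computationally. The following Python sketch (an illustration, not a formal definition) builds the unique f from g1 and g2 and instantiates it to evenness and to addition by recursion on the first argument.

def primrec(g1, g2):
    """Given g1 and g2, return the f with f(0) = g1 and f(n+1) = g2(f(n), n)."""
    def f(n):
        acc = g1
        for k in range(n):          # build f(k+1) from f(k) and k
            acc = g2(acc, k)
        return acc
    return f

even = primrec(True, lambda b, m: not b)
add = lambda m, n: primrec(n, lambda acc, _: acc + 1)(m)   # recursion on the first argument

assert [even(n) for n in range(5)] == [True, False, True, False, True]
assert add(3, 4) == 7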

Induction

The primitive recursion theorem allows us to deduce that induction over Nat is valid. To show this, we let p : Nat → Bool be arbitrary and instantiate the type Σ in the primitive recursion theorem to Bool:

  (∀g1 g2 • ∃! f : Nat → Bool • (f.0 ≡ g1) ∧ (∀n • f.(suc.n) ≡ g2.(f.n).n))
⇒ {definition of unique existence}
  (∀g1 g2 f1 f2 •
     (f1.0 ≡ g1) ∧ (∀n • f1.(suc.n) ≡ g2.(f1.n).n) ∧
     (f2.0 ≡ g1) ∧ (∀n • f2.(suc.n) ≡ g2.(f2.n).n) ⇒ f1 = f2)
⇒ {specialize f1 := (λm • T) and f2 := p}
  (∀g1 g2 • g1 ∧ (∀n • g2.T.n) ∧ (p.0 ≡ g1) ∧ (∀n • p.(suc.n) ≡ g2.(p.n).n)
     ⇒ (λm • T) = p)
⇒ {specialize g1 := T and g2 := (λ b m • b ∨ p.(suc.m))}
  p.0 ∧ (∀n • p.(suc.n) ≡ p.n ∨ p.(suc.n)) ⇒ (λm • T) = p
≡ {rewrite using tautology (t ≡ t′ ∨ t) ≡ (t′ ⇒ t)}
  p.0 ∧ (∀n • p.n ⇒ p.(suc.n)) ⇒ (λm • T) = p
≡ {pointwise extension, β reduction: ((λm • T) = p) ≡ (∀m • p.m)}
  p.0 ∧ (∀n • p.n ⇒ p.(suc.n)) ⇒ (∀m • p.m)

Further Properties

It is also possible to prove the other Peano postulates as theorems, by specializing g1 and g2 in the primitive recursion theorem in suitable ways. The details are left to the reader as an exercise (Exercise 18.5).

Lemma 18.4 The following properties hold for Nat as constructed above:

(∀n : Nat • 0 ≠ suc.n) ,
(∀m n : Nat • suc.m = suc.n ⇒ m = n) .

Since the Peano postulates are proved as theorems, we know that our construction of Nat in fact gives us the natural numbers. Thus, the primitive recursion theorem gives an abstract characterization of the natural numbers; no other facts are needed in order to build the whole theory of arithmetic.

Primitive Recursion for Sequences

Sequence types can be characterized with a primitive recursion theorem that is similar to the corresponding theorem for natural numbers:

(∀g1 : Γ • ∀g2 : Γ → Seq of Σ → Σ → Γ • ∃! f : Seq of Σ → Γ •
  (f.⟨ ⟩ = g1) ∧ (∀s x • f.(s&x) = g2.(f.s).s.x)) .

From this theorem the axioms given for sequences in Chapter 10 can be derived. Furthermore, the recursion theorem justifies recursive definitions of functions over sequences. A simple example is the length of a sequence, defined as

len.⟨ ⟩ = 0 ∧ len.(s&x) = len.s + 1 .
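The corresponding computational sketch for sequences, again in Python with lists standing for sequences and purely illustrative names:

def seqrec(g1, g2):
    """Return the f with f(<>) = g1 and f(s & x) = g2(f(s), s, x)."""
    def f(s):
        acc, prefix = g1, []
        for x in s:
            acc = g2(acc, list(prefix), x)   # prefix is the sequence before x
            prefix.append(x)
        return acc
    return f

length = seqrec(0, lambda acc, s, x: acc + 1)
assert length([7, 7, 7]) == 3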

18.4 Ordinals

All natural numbers are constructed by repeated application of the successor oper- ation, starting from zero. We now generalize this construction process by adding limits. The objects that we construct using repeated application of the successor 308 18. Well-founded Sets and Ordinals

and the limit operations are called ordinals. It is well known that this characteri- zation cannot be realized consistently in set theory; the collection of all ordinals is “too big” to be a set. Thus there is no way of introducing the ordinals as a type in higher-order logic. However, we introduce the notion of ordinals over a given type  and show that by choosing  in a suitable way, we have access to arbitrarily large ordinals.

Construction of Ordinals

Assume that a type Σ is fixed. We shall now show how the type of ordinals over Σ is constructed. Exactly as for the natural numbers, we let every ordinal be represented by a set over Σ. Intuitively speaking, we get successively larger sets by adding one element at a time or by taking the union of all sets constructed so far. We first make the following constant definitions:

rep osuc.B ≙ B ∪ {(ε y : Σ • y ∉ B)} ,
rep olim.ℬ ≙ ∪ {B | B ∈ ℬ} ,

where the type of ℬ is P(P(Σ)). The constant rep olim is intended to represent the operation that produces the limit of a set of ordinals. We do not need to mention an explicit representative for zero; we use the limit of the empty set for this. Next, we need to describe the set that represents the ordinals over Σ. We choose the smallest set that is closed under rep osuc and rep olim:

Ordset ≙ ∩ {ℬ | (∀B • B ∈ ℬ ⇒ rep osuc.B ∈ ℬ) ∧ (∀𝒞 • 𝒞 ⊆ ℬ ⇒ rep olim.𝒞 ∈ ℬ)} .

Note that this set must be nonempty, since it contains rep olim.∅ (i.e., ∅). Exactly as we did for the natural numbers, we can now prove a characterization lemma that shows that Ordset is in fact a suitable set to represent the ordinals over Σ (we leave the proof to the reader):

Lemma 18.5 The set Ordset has the following property:

B ∈ Ordset ≡ (∃C ∈ Ordset • B = rep osuc.C) ∨ (∃C ⊆ Ordset • B = rep olim.C).

Defining a Type of Ordinals

Since Ordset is nonempty, we can introduce the name Ord(Σ) for a new type represented by Ordset (this way, we really introduce a new unary type operator Ord).

We now define the successor and limit operations for the ordinals, in the obvious way:

suc.a ≙ abs.(rep osuc.(rep.a)) ,
lim.A ≙ abs.(rep olim.{rep.a | a ∈ A}) .

Intuitively speaking, these definitions imply that we simply take over the notions of successor and limit from the representing set. We allow binder notation for limits, writing (lim a ∈ A • f.a) for lim.{f.a | a ∈ A}. The least and greatest ordinals over Σ can now be defined in terms of the lim operator:

0 ≙ lim.∅ ,
maxord ≙ lim.Ord .

The greatest ordinal maxord has a slight anomaly; it has the undesirable property of being its own successor (Exercise 18.8).

An Order on Ordinals

We define an order on ordinals, corresponding to subset inclusion on the representing set:

a ≤ b ≙ rep.a ⊆ rep.b .

In fact, it is possible to prove that this gives a linear order on Ord(Σ). Note that lim is the join operation with respect to this order. As usual, we let a < b stand for a ≤ b ∧ a ≠ b. Next we define an ordinal to be a limit ordinal if it is the limit of all ordinals smaller than itself:

is limit.a ≙ (a = lim.{b ∈ Ord(Σ) | b < a}) .

Lemma 18.6 Every ordinal over Σ except maxord is either a successor ordinal or a limit ordinal (but not both):

a ≠ maxord ⇒ (is limit.a ≡ (∀b • a ≠ suc.b)) .

We have that Natset ⊆ Ordset(Ind); i.e., the set that represents the type Nat is a subset of the set that represents the type of ordinals over Ind. Furthermore, ∪Natset represents a nonzero limit ordinal (Exercise 18.7). This ordinal is called 310 18. Well-founded Sets and Ordinals

ω (“omega”) and is the first infinite ordinal. The ordinals thus extend the natural numbers past the first infinite ordinal. The ordinals (over some sufficiently large type Σ) form a “sequence” of the form

0, 1, 2, 3, ..., ω, ω + 1, ω + 2, ω + 3, ..., ω · 2, ω · 2 + 1, ... .

18.5 Ordinal Recursion

In analogy with the natural numbers, it is possible to derive an ordinal recursion theorem:

Theorem 18.7 For arbitrary types Σ and Γ, the following holds:

(∀g1 : Γ → Ord(Σ) → Γ • ∀g2 : P(Γ) → Ord(Σ) → Γ • ∃! f : Ord(Σ) → Γ •
  (∀a • a ≠ maxord ⇒ f.(suc.a) = g1.(f.a).a) ∧
  (∀a • is limit.a ⇒ f.a = g2.{f.b | b < a}.a)) .

This theorem gives a foundation for defining functions by recursion over the or- dinals. Since 0 is the limit of the empty set of ordinals, we do not need a separate conjunct for the zero case. In practice, though, functions are usually defined by giving three clauses: one for zero, one for successors, and one for (nonzero) limits. Note that the successor clause is qualified; this is because maxord is its own suc- cessor. Since every ordinal (except maxord) is either a successor ordinal or a limit ordinal (but not both), the limit clause applies only when the successor clause does not apply. As an example, we define addition for ordinals. Rather than writing the definition as one big conjunction, we give the three clauses separately:

a + 0 = a ,
a + (suc.b) = suc.(a + b) ,
a + b = lim.{a + c | c < b} for nonzero limit ordinals b .

This makes addition a generalization of addition on Nat. Note that in the second clause we permit b = maxord, since this makes both sides trivially equal to maxord. Ordinal addition has certain surprising properties, such as not being commutative (see Exercise 18.9).
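Ordinal addition below ω² can be simulated with pairs (a, b) standing for ω · a + b. The following Python sketch (a finite illustration only, not part of the formal development) implements the clauses for such pairs and exhibits the noncommutativity: 1 + ω = ω while ω + 1 > ω.

def oadd(x, y):
    (a1, b1), (a2, b2) = x, y
    if a2 == 0:                  # y is a natural number: ordinary addition on the tail
        return (a1, b1 + b2)
    return (a1 + a2, b2)         # a limit part in y swallows the finite tail b1

one, omega = (0, 1), (1, 0)
assert oadd(one, omega) == omega      # 1 + omega = omega
assert oadd(omega, one) == (1, 1)     # omega + 1 is strictly greater than omega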

Ordinal Induction and Other Properties Peano’s postulates for the natural numbers tell us that zero is not a successor and that the successor operation always generates new numbers. For the ordinals, as we have constructed them here, the situation is not as simple. If the underlying set  is 18.5. Ordinal Recursion 311

finite, then the number of ordinals is also finite and the successor operation cannot generate new ordinals forever. If  is infinite, then the limit operator sometimes generates new ordinals but sometimes not. These facts have already shown up when we needed to qualify the clauses in the recursion theorem for ordinals. We shall now show that similar qualifications appear when we prove results that correspond to the Peano postulates for Nat. From the ordinal recursion theorem, it is possible to derive a principle of ordinal induction (also known as transfinite induction):

Theorem 18.8 The following induction principle holds for arbitrary type Σ and arbitrary predicate p : Ord(Σ) → Bool:

(∀a • a ≠ maxord ∧ p.a ⇒ p.(suc.a)) ∧
(∀a • is limit.a ∧ (∀b • b < a ⇒ p.b) ⇒ p.a)
⇒ (∀a • p.a) .

In practice, we split an inductive proof into three cases; the zero case is considered separately from the nonzero limits. We can also prove a result corresponding to the second Peano postulate for the natural numbers.

Lemma 18.9 With the exception of maxord, distinct ordinals have distinct successors:

suc.a = suc.b ∧ a ≠ maxord ∧ b ≠ maxord ⇒ a = b .

In particular, Lemma 18.9 implies that if a ≠ maxord, then the successor of a is always distinct from all ordinals up to a. We know that the natural numbers are well-ordered; i.e., they are linearly ordered and every subset has a least element. The same is also true of ordinals.

Lemma 18.10 Any type Ord() of ordinals is well-ordered.

Proof First, linearity is proved by proving that the subset order on Ordset is linear. Then, well-foundedness is proved by showing that for an arbitrary nonempty set A of ordinals, the ordinal abs.(∩ a ∈ A • rep.a) is the smallest element in A. 2

Induction Example

As an example of ordinal induction, we prove the following property of addition:

0 + a = a . The proof is by induction on a. The zero case follows directly from the definition of addition. For the successor case, we have the following derivation:

0 + a = a, a ≠ maxord
⊢   0 + suc.a
  = {definition of addition}
    suc.(0 + a)
  = {induction assumption}
    suc.a

Finally, we have the limit case:

is limit.a, a ≠ 0, (∀b • b < a ⇒ 0 + b = b)
⊢   0 + a
  = {definition of addition}
    lim.{0 + b | b < a}
  = {induction assumption}
    lim.{b | b < a}
  = {a is a limit ordinal}
    a

The ordinal induction principle now asserts that 0 + a = a holds for an arbitrary ordinal a.

18.6 How Far Can We Go?

The ordinals over Ind can be seen as an extension of the natural numbers. We shall now see that we can in fact construct ordinal types that are ‘arbitrarily large’ in a very specific sense.

If the underlying type Σ is finite, then the type Ord(Σ) is also finite. In general, we want ordinals for reasoning when the type Nat is not “big enough”. The following result (a version of Hartog’s lemma) guarantees that we can always find a type of ordinals that is big enough to index any given type Σ.

Lemma 18.11 Assume that a type Σ is given. Then there exists a type Ord of ordinals such that there is no injection from Ord to Σ.

Proof Choose P(Σ) as the underlying set for the construction of the ordinals. The construction in Exercise 18.11 shows that the cardinality of Ord(P(Σ)) is always at least the same as the cardinality of P(Σ). The result then follows from the fact that for an arbitrary nonempty set A there can be no injection from P(A) to A. 2

We call a set of the form {a : Ord(Σ) | a < a0} an initial segment of the ordinals over Σ. It is possible to show that for any two types of ordinals, Ord(Σ) and Ord(Γ), one is always order-isomorphic to an initial segment of the other. Thus, whenever we have used two different types of ordinals in some reasoning, we can replace the “smaller” by the “bigger”. In fact, one can show that for any two well-ordered sets, one is always order-isomorphic to an initial segment of the other. In particular, we can consider Nat to be an initial segment of the ordinals. Thus, there is really no confusion using the constants 0 and suc for both the natural numbers and the ordinals.

In traditional developments of ordinals, one considers ordinals as equivalence classes of well-ordered sets rather than specific instances, as we have done. This allows one to talk about the collection of all ordinals, which is sufficient for all set-theoretic reasoning. This collection is not a set, but a proper class, so we cannot construct it inside higher-order logic. However, we can construct arbitrarily large types of ordinals, so once we know what type we want to reason over, we can select a type of ordinals that is sufficient. In subsequent chapters, we talk about “ordinals” without specifying the underlying type. We then assume that the type of ordinals is chosen so large that we do not have to worry about conditions of the form a = maxord.

Ordinals as Index Sets Assume that A is an arbitrary set. Lemma 18.11 then shows that we can find a type Ord of ordinals that cannot be injected into A. In fact, we can find a function f : Ord → A and an ordinal γ : Ord such that f maps {a : Ord | a <γ} bijectively into A. This means that we can index A using this set of ordinals. For the construction of this function f , see Exercise 18.11 (in fact, the function f gives a well-ordering of A, so this proves the well-ordering principle).

Using the ordinals as an index set, we can also construct generalized ordinal-indexed chains. If (A, ⊑) is any partially ordered set, an ordinal-indexed chain is a monotonic function f : Ord → A. Ordinal-indexed chains are typically defined by ordinal recursion, and ordinal induction can be used to reason about them. They are useful for generalized limit arguments, in situations where the natural numbers are not sufficient.

18.7 Summary and Discussion

We have in this chapter defined the notion of well-founded and well-ordered sets, and have shown how the induction principle applies to such structures. As an appli- cation, we have given an alternative construction of the type of natural numbers in higher-order logic. This construction motivates the principle of primitive recursion, which is the main tool for defining functions over natural numbers. The reason for giving the alternative construction of natural numbers here is that it generalizes directly to a construction of ordinals (over a given base type ), and we showed above how to construct the ordinals in this way. Ordinal recursion was then defined as a principle for defining functions over all ordinals over a given set, and transfinite induction was defined based on ordinal recursion. The theory of ordinals may seem a little bit esoteric to those who are not familiar with it, but it is surprisingly useful in programming logics. We need ordinals, e.g., when proving termination of programs that are not necessarily continuous. For a good introduction to the construction of the ordinals within the type-free framework of Zermelo–Fraenkel set theory, see Johnstone [93]. A theory of well- ordered sets within higher-order logic is described by John Harrison in [73]. Apt and Plotkin [5] give a nice overview of ordinals and some good justification for the use of ordinals in programming language semantics. The main difference between ordinals as presented here and the classical theory of ordinals was already pointed out above: we can define only an initial segment of all ordinals, whereas in the classical treatment, it is possible to define the (proper) class of all ordinals. In practice, this does not seem to cause any problems, because the initial segment can always be chosen large enough for the problem at hand.

18.8 Exercises

18.1 The lexicographical order ⊑ on Nat × Nat is defined as follows:

(m, n) ⊑ (m′, n′) ≜ m < m′ ∨ (m = m′ ∧ n ≤ n′) .

Prove that this order is well-founded.

18.2 Show that the two induction principles on natural numbers,

(∀n ∈ Nat • (∀m < n • p.m) ⇒ p.n) ⇒ (∀n ∈ Nat • p.n) and

p.0 ∧ (∀n ∈ Nat • p.n ⇒ p.(n + 1)) ⇒ (∀n ∈ Nat • p.n), are equivalent.

18.3 Prove that the set Natset contains rep zero and is closed under rep suc, where

Natset = ∩ {B : Ind → Bool | rep zero ∈ B ∧ (∀x • x ∈ B ⇒ rep suc.x ∈ B)} .

Deduce that Natset is the smallest set that contains rep zero and is closed under rep suc.

18.4 In the text, we have assumed that the primitive recursion theorem (Theorem 18.3) is proved before any other properties of Nat. Show that it is possible to go the other way, proving the primitive recursion theorem from the Peano postulates.

18.5 Prove the following two Peano postulates from the primitive recursion theorem:

(∀n : Nat • 0 ≠ suc.n) ,
(∀m n : Nat • suc.m = suc.n ⇒ m = n) .

Hint: specialize g1 and g2 in the primitive recursion theorem to suitable boolean-valued functions.

18.6 Use list recursion to define the function append : list(Σ) → list(Σ) → list(Σ) that appends two lists into a new list.

18.7 Show that Natset ⊆ Ordset(Ind); i.e., the set that represents the type Nat is a subset of the set that represents the type of ordinals over Ind. Furthermore, show that ∪ Natset represents a nonzero limit ordinal.

18.8 Prove the following properties of ordinals:

a ≠ maxord ⇒ a < suc.a ,
suc.maxord = maxord ,
a = lim.{b | b < a} ∨ a = suc.(lim.{b | b < a}) .

18.9 Prove the following properties of ordinal addition for any type of ordinals that includes the first infinite ordinal ω (see Exercise 18.7).

ω + 1 = suc.ω ,
1 + ω = ω .

This shows that addition of ordinals is not commutative.

18.10 Prove that the successor and limit operations on the ordinals are monotonic, in the following sense:

a ≤ b ⇒ suc.a ≤ suc.b , (∀a ∈ A • ∃b ∈ B • a ≤ b) ⇒ lim.A ≤ lim.B . Furthermore, prove the following property for the order on the ordinals:

b ≠ maxord ∧ a < b ⇒ suc.a < suc.b .

18.11 Let some type Σ be given. Define a function f that maps any ordinal a ≠ maxord over Σ to the unique element b ∈ Σ such that b ∉ rep.a ∧ b ∈ rep.(suc.a). Show that the restriction of this function to B = {a | a ≠ maxord} is bijective. Now define the relation ≤ on Σ as follows:

b ≤ b′ ≜ f⁻¹.b ≤ f⁻¹.b′ .

Show that this gives a well-ordering of Σ.

19 Fixed Points

The statement constructors that we have introduced earlier — sequential compo- sition, join, and meet — are all monotonic with respect to the refinement relation on Mtran. Hence, we can use these constructs to define monotonic functions from the complete lattice of predicate transformers to itself. From a mathematical point of view, monotonicity of functions over complete lat- tices is important because it guarantees the existence of (least and greatest) fixed points, which are needed for defining recursion and iteration. In this chapter, we investigate fixed points of functions over complete lattices in general. We also show how ordinals are used to give an alternative characterization of the fixed points of monotonic functions as limits of ordinal-indexed approximation sequences. In the next chapter we apply this theory to recursively defined statements, in particular predicate transformers, and show how fixed points are used to give meaning to recursive definitions.

19.1 Fixed Points

Let f : Σ → Σ be an arbitrary function. An element a : Σ is said to be a fixed point of f if f.a = a. In general, a function f need not have a fixed point. However, when a complete lattice A ⊆ Σ is closed under monotonic f , then the situation is different. The following Knaster–Tarski theorem is a central result about fixed points in lattice theory.

Theorem 19.1 Let A be a complete lattice and f a monotonic function on A; i.e., f ∈ A →m A. Then f has a least and a greatest fixed point in A.

Proof We show that f has a least fixed point; the proof for the greatest fixed point is dual. In fact, we show that the least fixed point is given by (⊓ x ∈ A | f.x ⊑ x • x). We have

    f.(⊓ x ∈ A | f.x ⊑ x • x)
  ⊑ {f monotonic implies f.(⊓ X) ⊑ (⊓ x ∈ X • f.x)}
    (⊓ x ∈ A | f.x ⊑ x • f.x)
  ⊑ {replace every element by a larger element}
    (⊓ x ∈ A | f.x ⊑ x • x)

and

    T
  ⇒ {derivation above}
    f.(⊓ x ∈ A | f.x ⊑ x • x) ⊑ (⊓ x ∈ A | f.x ⊑ x • x)
  ⇒ {f monotonic}
    f.( f.(⊓ x ∈ A | f.x ⊑ x • x)) ⊑ f.(⊓ x ∈ A | f.x ⊑ x • x)
  ≡ {general rule p.a ≡ a ∈ {b | p.b}}
    f.(⊓ x ∈ A | f.x ⊑ x • x) ∈ {x ∈ A | f.x ⊑ x}
  ⇒ {general rule x ∈ A ⇒ ⊓A ⊑ x}
    (⊓ x ∈ A | f.x ⊑ x • x) ⊑ f.(⊓ x ∈ A | f.x ⊑ x • x)

so (⊓ x ∈ A | f.x ⊑ x • x) is a fixed point. For arbitrary y ∈ A, we have

    f.y = y
  ⇒ {a = b implies a ⊑ b}
    f.y ⊑ y
  ≡ {general rule p.a ≡ a ∈ {b | p.b}}
    y ∈ {x ∈ A | f.x ⊑ x}
  ⇒ {general rule x ∈ A ⇒ ⊓A ⊑ x}
    (⊓ x ∈ A | f.x ⊑ x • x) ⊑ y

so (⊓ x ∈ A | f.x ⊑ x • x) is in fact the least fixed point. 2

The least fixed point of f is denoted by µ. f , while the greatest fixed point is denoted by ν. f . The set A is assumed to be the whole underlying type unless otherwise stated explicitly. If there is danger of confusion, we write µA. f to indicate that we are considering the least fixed point on the set A, and similarly for the greatest fixed point. The proof of the Knaster–Tarski theorem gives the following explicit constructions of the least and greatest fixed points:

µA. f = (⊓ x ∈ A | f.x ⊑ x • x) , (µ characterization)
νA. f = (⊔ x ∈ A | x ⊑ f.x • x) . (ν characterization)

Thus the least fixed point is the meet of all prefixed points (a is a prefixed point of f if f.a ⊑ a). Dually, the greatest fixed point is the join of all postfixed points (a is a postfixed point if a ⊑ f.a). We permit the use of µ and ν as binders. Thus, we can write (µ x • t) rather than µ.(λx • t) and (ν x • t) for ν.(λx • t). Similarly, we can use index notation (µ x ∈ A • t) for µA.(λx • t) and (ν x ∈ A • t) for νA.(λx • t). The least fixed point of f is characterized by the following two properties (Exercise 19.1):

f.(µ. f ) = µ. f , (folding least fixed point)
f.x ⊑ x ⇒ µ. f ⊑ x . (least fixed point induction)

When we use the folding property as a rewrite rule from left to right, we say that we fold the fixed point. Dually, we say that we unfold the fixed point when we use it from right to left. The induction property implies that µ. f is the least (pre)fixed point (this follows from the last derivation in the proof of Theorem 19.1). The greatest fixed point is characterized by the following dual properties:

f.(ν. f ) = ν. f , (folding greatest fixed point)
x ⊑ f.x ⇒ x ⊑ ν. f . (greatest fixed point induction)

Note that least fixed point induction can be used as a µ elimination rule, while greatest fixed point induction is a ν introduction rule. Fixed points of a function f are at the same time solutions of the equation x = f.x. Thus, µ. f is the least and ν. f the greatest solution of this equation. This indicates that fixed points can be used to give meaning to recursively defined statements. We will consider this in more detail after we have studied general properties of fixed points.
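For a finite lattice, the characterizations above can be checked by direct computation. The following sketch (our own illustration, not part of the calculus; all names are ad hoc) works in the complete lattice of subsets of a small set ordered by inclusion, so meet is intersection and join is union, and computes µ. f as the meet of all prefixed points and ν. f as the join of all postfixed points.

    # A minimal sketch: Knaster-Tarski on the powerset lattice of a finite set U.
    from itertools import combinations

    U = frozenset(range(4))

    def subsets(s):
        # all elements of the powerset lattice of s
        items = list(s)
        return [frozenset(c) for r in range(len(items) + 1)
                             for c in combinations(items, r)]

    def intersect_all(sets):
        result = U
        for s in sets:
            result = result & s
        return result

    def union_all(sets):
        result = frozenset()
        for s in sets:
            result = result | s
        return result

    # A monotonic function: include 0 and the successor (mod 4) of every even element.
    def f(s):
        return s | {0} | frozenset((x + 1) % 4 for x in s if x % 2 == 0)

    # mu.f = meet of all prefixed points, nu.f = join of all postfixed points
    mu_f = intersect_all(s for s in subsets(U) if f(s) <= s)
    nu_f = union_all(s for s in subsets(U) if s <= f(s))

    assert f(mu_f) == mu_f and f(nu_f) == nu_f   # folding properties
    print(sorted(mu_f), sorted(nu_f))            # here: [0, 1] and [0, 1, 2, 3]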

Monotonicity of Fixed Points Monotonicity is sufficient to guarantee the existence of least and greatest fixed points of a function f over a complete lattice. We shall now show that the least and greatest fixed point operators themselves are monotonic (Table 19.1).

Lemma 19.2 Assume that A is a complete lattice and that the functions f ∈ A →m A and g ∈ A →m A satisfy (∀x ∈ A • f.x ⊑ g.x). Then µA. f ⊑ µA.g and νA. f ⊑ νA.g.

Proof

f monotonic, g monotonic, f ⊑ g
⊢   µ. f ⊑ µ.g
  ⇐ {least fixed point induction with x := µ.g}

              is ⊑-monotonic
(λ f • µ. f )      yes¹
(λ f • ν. f )      yes¹
¹ for monotonic f

TABLE 19.1. Fixed-point monotonicity

    f.(µ.g) ⊑ µ.g
  ≡ {unfolding least fixed point}
    f.(µ.g) ⊑ g.(µ.g)
  ≡ {assumption f ⊑ g, pointwise order}
    T

The proof for the greatest fixed point is dual. 2

19.2 Fixed points as Limits

In connection with the Knaster–Tarski fixed-point theorem, we showed an explicit construction of the least fixed point µ. f of a monotonic function f on a complete lattice. We shall now show that µ. f has another characterization in terms of a spe- cial kind of join (limit) that can be given a computational interpretation. Similarly, ν. f has a dual characterization.

Limits and Continuity

In order to define limits and continuity, we first need the notion of a directed set. A subset C of a poset (A, ⊑) is directed if every finite subset of C is bounded above, i.e., if the following condition is satisfied:

(∀a, b ∈ C • ∃c ∈ C • a ⊑ c ∧ b ⊑ c) . (directedness)

Note that the empty set is directed and that a finite nonempty directed set must have a greatest element. We refer to the join of a directed set as its limit. This is in agreement with the notion of limit for ordinals, since a well-ordered set is always directed. When we want to show explicitly that an indexed join is a limit, we write (lim i ∈ I • xi) rather than (⊔ i ∈ I • xi). We say that the poset (A, ⊑) is a complete partial order (a cpo) if lim C exists for every directed subset C of A. In particular, a cpo A must have a bottom element (since the empty set is directed: see Exercise 19.2). A monotonic function f : A → B, where A and B are complete partial orders, is said to be (join) continuous if it preserves limits, i.e., if

f.(lim i ∈ I • xi) = (lim i ∈ I • f.xi) ( f continuous)

for an arbitrary directed set {xi | i ∈ I }. Note that continuity is a weaker property than universal join homomorphism; every universally join homomorphic function is continuous. Furthermore, we note that continuity implies strictness (Exercise 19.2).

Fixed Points as Limits and Colimits

Let f be a monotonic function on a complete lattice, f ∈ A →m A, and define an ordinal-indexed collection of derived functions:

f^0.x = x ,
f^(α+1).x = f.( f^α.x) for arbitrary ordinals α,
f^α.x = (lim β | β < α • f^β.x) for nonzero limit ordinals α.

We now show that there exists an ordinal γ such that µ. f = f^γ.⊥ (each f^α.⊥ is called an approximation of µ. f ).

Theorem 19.3 Assume that a complete lattice A ⊆ Σ is given. Then there exists an ordinal γ such that for any function f ∈ A →m A the least fixed point of f in A satisfies µ. f = f^γ.⊥.

Proof The full proof is fairly long, so we only outline it here. First we note that (by induction over ordinals)

α ≤ β ⇒ f^α.⊥ ⊑ f^β.⊥ . (∗)

Now let γ be an ordinal that cannot be injected into A (such an ordinal exists by Lemma 18.11). Then in the ordinal-indexed chain { f^α.⊥ | α ≤ γ } some element must occur more than once. Thus we can find α and α′ where α < α′ ≤ γ and f^α′.⊥ = f^α.⊥. Then

    f^α.⊥
  ⊑ {(∗) above}
    f^(α+1).⊥

and

    f^(α+1).⊥
  ⊑ {α < α′, (∗) above}
    f^α′.⊥
  = {assumption}
    f^α.⊥

which shows that f^α.⊥ = f^(α+1).⊥, so f^α.⊥ is a fixed point. Furthermore, for an arbitrary ordinal β ≥ α we get f^β.⊥ = f^α.⊥. In particular, f^γ.⊥ = f^α.⊥, so f^γ.⊥ is a fixed point. Now let x be an arbitrary fixed point of f . By ordinal induction one easily shows that f^α.⊥ ⊑ x, for an arbitrary ordinal α. This shows that f^γ.⊥ ⊑ x, and so f^γ.⊥ is in fact the least fixed point. 2

Note from the proof of Theorem 19.3 that the ordinal γ can be chosen so that it does not depend on the function f , only on the complete lattice A. In Chapter 22 we show that if the function f is continuous on A, then the natural numbers are sufficient to index the approximations, so we need not go beyond the first infinite ordinal ω. For greatest fixed points, the dual result holds, using coapproximations:

Corollary 19.4 Assume that a complete lattice A ⊆ Σ is given. Then there exists an ordinal γ such that for arbitrary f ∈ A →m A the greatest fixed point satisfies ν. f = f_γ.⊤, where the coapproximations f_α.x are defined as follows:

f_0.x = x ,
f_(α+1).x = f.( f_α.x) for arbitrary ordinals α,
f_α.x = (⊓ β | β < α • f_β.x) for nonzero limit ordinals α.
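On a finite lattice the approximation chain of Theorem 19.3 stabilizes after finitely many steps, so the ordinals are only needed in the general (infinite, noncontinuous) case. The sketch below (ours, with ad hoc names) computes µ. f as the limit of f^0.⊥, f^1.⊥, . . . on the same kind of powerset lattice as above.

    # A minimal sketch: mu.f as the limit of approximations from bottom.
    U = frozenset(range(4))

    def f(s):
        # a monotonic function: include 0 and the successor (mod 4) of every even element
        return s | {0} | frozenset((x + 1) % 4 for x in s if x % 2 == 0)

    approx = frozenset()          # f^0.bottom = bottom
    while f(approx) != approx:    # iterate until the chain stabilizes
        approx = f(approx)        # f^(n+1).bottom = f.(f^n.bottom)
    print(sorted(approx))         # the least fixed point, here [0, 1]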

19.3 Properties of Fixed Points

The following fusion theorem is very useful for proving properties of fixed points. Its proof is also a good example of induction over the ordinals.

Theorem 19.5 Assume that h : Σ → Γ is a continuous function and that f : Σ → Σ and g : Γ → Γ are monotonic functions on complete lattices. Then

h ◦ f = g ◦ h ⇒ h.(µ. f ) = µ.g .

The fusion theorem is illustrated in Figure 19.1.

[Figure 19.1. Fusion theorem: a commuting diagram in which h maps Σ to Γ, carrying f over to g and µ. f over to µ.g.]

Proof We show that h.( f^α.⊥) = g^α.⊥ for an arbitrary ordinal α, under the assumption h ◦ f = g ◦ h. The result then follows from Theorem 19.3. The base case h.⊥ = ⊥ follows because h is continuous. The successor case is proved as follows:

h.( f^α.⊥) = g^α.⊥
⊢   h.( f^(α+1).⊥)
  = {definition of approximation}
    h.( f.( f^α.⊥))
  = {assumption h ◦ f = g ◦ h}
    g.(h.( f^α.⊥))
  = {induction assumption}
    g.(g^α.⊥)
  = {definition of approximation}
    g^(α+1).⊥

Finally, the limit case is as follows, for a nonzero limit ordinal α:

(∀β | β < α • h.( f^β.⊥) = g^β.⊥)
⊢   h.( f^α.⊥)
  = {definition of approximation}
    h.(lim β | β < α • f^β.⊥)
  = {assumption h continuous}
    (lim β | β < α • h.( f^β.⊥))
  = {induction assumption}
    (lim β | β < α • g^β.⊥)
  = {definition of approximation}
    g^α.⊥

and the proof is finished. 2
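The fusion theorem can also be checked on small finite lattices. In the sketch below (ours; the particular h, f , and g are ad hoc choices), h is the direct-image map under x ↦ 2x, which preserves all unions and is therefore continuous, and f and g are monotonic functions satisfying h ◦ f = g ◦ h.

    # A small numeric check of the fusion theorem (our own illustration).
    # Sigma = subsets of {0,1}, Gamma = subsets of {0,2}, ordered by inclusion.
    def h(s):
        # direct image under x |-> 2x: preserves all unions, hence continuous
        return frozenset(2 * x for x in s)

    def f(s):                    # monotonic on Sigma
        return s | {1}

    def g(t):                    # monotonic on Gamma
        return t | {2}

    def lfp(fun):
        # least fixed point as the limit of approximations from bottom
        x = frozenset()
        while fun(x) != x:
            x = fun(x)
        return x

    # h o f = g o h holds here: h(f(s)) = h(s) | {2} = g(h(s))
    assert all(h(f(s)) == g(h(s)) for s in
               [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})])
    assert h(lfp(f)) == lfp(g)   # the conclusion of the fusion theorem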

We note a special corollary of the fusion theorem.

Corollary 19.6 Assume that f : (Σ → Γ) → (Σ → Γ) and g : Γ → Γ are monotonic functions on complete lattices and that x : Σ is fixed. Then

(∀h • f.h.x = g.(h.x)) ⇒ µ. f.x = µ.g .

Proof Since the function (λk • k.x) is continuous (Exercise 19.3), we have

    T
  ≡ {fusion theorem with h := (λk • k.x)}
    (λk • k.x) ◦ f = g ◦ (λk • k.x) ⇒ (λk • k.x).(µ. f ) = µ.g
  ≡ {definition of composition, extensionality}
    (∀h • (λk • k.x).( f.h) = g.((λk • k.x).h)) ⇒ (λk • k.x).(µ. f ) = µ.g
  ≡ {beta reduction}
    (∀h • f.h.x = g.(h.x)) ⇒ µ. f.x = µ.g
2

Fixed Points Preserve Monotonicity As another example of how the limit characterization of fixed points can be used in proofs, we show that µ and ν preserve monotonicity (Table 19.2) in the sense that if f is a function that preserves monotonicity and f itself is monotonic, then µ. f is a monotonic function.

Lemma 19.7 Assume that A and B are complete lattices and that f ∈ (A → B) →m (A → B) and f preserves monotonicity. Then µ(A→B). f = µ(A→m B). f and ν(A→B). f = ν(A→m B). f .

Proof We show the proof for least fixed points. The existence of both fixed points is guaranteed by Tarski’s theorem. We show that each approximation f^α.⊥ of µ(A→B). f is monotonic. This guarantees that the approximations for µ(A→B). f and µ(A→m B). f are pairwise equal, so the fixed points are also equal. For the base case, we have

    f^0.⊥ monotonic
  ≡ {definition of approximation}

    ⊥(A→B) monotonic
  ≡ {constant function is monotonic}
    T

Next, the successor case:

f^α.⊥ monotonic
⊢   f^(α+1).⊥ monotonic
  ≡ {definition of approximation}
    f.( f^α.⊥) monotonic
  ≡ {f preserves monotonicity by assumption, induction assumption}
    T

Finally, the limit case:

f^β.⊥ monotonic for all β < α
⊢   f^α.⊥ monotonic
  ≡ {definition of approximation}
    (lim β | β < α • f^β.⊥) monotonic
  ⇐ {all joins preserve monotonicity}
    (∀β | β < α • f^β.⊥ monotonic)
  ≡ {induction assumption}
    T

and the proof is finished. 2

Generally, we say that µ preserves property P if P.(µ. f ) holds whenever the monotonic function f preserves P (and similarly for ν). The above lemma then shows that the fixed-point operators preserve monotonicity.

Homomorphism Preservation for Fixed Point Constructs In addition to monotonicity, we are also interested in what other homomorphism properties the fixed-point constructors preserve.

Lemma 19.8 Least fixed points preserve disjunctivity and strictness, while greatest fixed points preserve conjunctivity and termination.

Proof We consider only least fixed points, as the properties for greatest fixed points are dual. To prove that µ preserves disjunctivity we follow the same idea as in the proof that µ preserves monotonicity (Lemma 19.7); i.e., we show that all approximations f^α.abort of the fixed point µ. f are disjunctive if f preserves disjunctivity. For the base case we know that abort is disjunctive. For the successor case the assumption that f preserves disjunctivity tells us that f.( f^α.abort) is disjunctive if f^α.abort is disjunctive. Finally, the limit case follows from the fact that joins preserve disjunctivity. Preservation of strictness is proved in the same way as preservation of disjunctivity. 2

The following counterexample can be used to show that least fixed points do not preserve conjunctivity. Let the function f : (Unit ↦ Nat) → (Unit ↦ Nat) be defined as follows:

f.S ≜
  chaos                      if S ⊏ chaos,
  [λ u x • x ≥ i + 1]        if S ⊏ [λ u x • x ≥ i + 1] and S ⊒ [λ u x • x ≥ i], for i ∈ Nat,
  choose ; [λ n x • x ≥ n]   if S = choose ; [λ n x • x ≥ n],
  magic                      otherwise.

This function is monotonic, so µ. f exists. From the definition it follows that µ. f = choose ;[λnx• x ≥ n] is the least fixed point (the only other fixed point is magic). However, µ. f is not conjunctive, even though f preserves conjunctivity ( f maps all predicate transformers except µ. f to conjunctive predicate transformers). To see that µ. f is not conjunctive, consider the collection of predicates {pi | i ∈ Nat} where pi .x ≡ x ≥ i. Then µ. f establishes any pi but not their meet (false). We leave it to the reader to fill in the details of this argument.

In spite of the above counterexample, it is still possible that any least fixed point (µ X • S) preserves conjunctivity when S is a statement built from asserts, demonic updates, sequential composition, meet, and the statement variable X. This is true in the case of tail recursion, since any tail-recursive statement can be written using iteration statements that preserve conjunctivity (see Chapter 21). We leave the general case as an open question; our conjecture is that it is in fact true.

The fixed-point homomorphism properties are summarized in Table 19.2.

19.4 Summary and Discussion

This chapter has given an overview of fixed-point theory, which is central to any treatment of recursion. We gave the basic fixed-point theorem by Tarski and showed that fixed points also can be characterized as limits of directed sets. This character- ization required the use of ordinals. We then proved a collection of basic properties about the fixed-point operators that will be very useful later on.

Tarski’s influential fixed-point theorem for lattices was originally published in [136], but its significance for recursion was discovered by Dana Scott, who used fixed points to give a semantics for loops and recursive programs, and started the tradition of denotational semantics. The early pioneers in this area were Scott and Christopher Strachey [128] and de Bakker [36]. A good overview of denotational semantics is given by, e.g., Robert Tennent [137], or David Schmidt [125]. The fusion theorem is a classic theorem in fixed-point theory. Its usefulness (in a slightly weaker form requiring universal join homomorphism rather than continuity) is demonstrated by Roland Backhouse et al. [34].

preserves    ⊑      ⊔     ⊓     ⊤     ⊥
µ            yes¹   yes   no    no    yes
ν            yes¹   no    yes   yes   no
¹ for monotonicity-preserving f

TABLE 19.2. Fixed-point preservation of properties (monotonic arguments)

19.5 Exercises

19.1 Show that the least fixed point of a monotonic function f on a complete lattice is characterized by the following two properties:

f.(µ. f ) = µ. f ,
f.x ⊑ x ⇒ µ. f ⊑ x .

Furthermore, show that if the equality in the first property is weakened to ⊑, the resulting two properties still characterize the least fixed point.

19.2 Show that the empty set is directed. Then show that every cpo must have a least element. Finally, show that continuity implies strictness.

19.3 Show that the function (λ f • f.x) is continuous (in fact, it is a universal join homomorphism).

20 Recursion

We now apply fixed-point theory to recursively defined statements, interpreting a recursive statement as the least fixed point of a monotonic function on pred- icate transformers. We show how to construct recursive statements as limits of approximation chains and develop inference rules for introducing recursion as a refinement of a simpler nonrecursive statement. Finally, we show how to define recursive procedures with parameters and give inference rules for working with procedures.

20.1 Fixed Points and Predicate Transformers

Let us now consider recursive contract statements of the form (µX • S), which were briefly described in the introduction. The basic intuitive interpretation of (µ X • S) is that of recursive unfoldings. This means that the contract is carried out (executed) by executing S, and whenever the statement variable X is encountered, it is replaced by the whole statement (µ X • S) before execution continues. If unfolding can go on infinitely, then the computation is considered to be aborting. As an example, consider the contract statement

(µ X • S ; X ⊓ skip) .

This contract is executed by the demon first choosing either the left- or right-hand side of the meet. If the first alternative is chosen, then S is executed, after which X is replaced by (µ X • S ; X ⊓ skip) and execution continues with this substatement. If the second alternative is chosen, then the skip is executed and execution terminates.

We need to determine the weakest precondition predicate transformer for a recur- sive contract, i.e., define wp.(µX • S). In the recursive unfolding interpretation, we find that (µX • S) must have the same effect as its recursive unfolding, S[X := (µX • S)]. Hence, we can postulate that wp.(µX • S) = wp.S[X := (µX • S)]. Let us extend the definition of wp so that wp.X = X, in the sense that the contract state- ment variable X is mapped to a predicate transformer variable X (on the same state space). Then we have that wp.S[X := (µX • S)] = (wp.S)[X := wp.(µX • S)], because of the way wp is defined as a homomorphism on the contract statement syntax. In other words, wp.(µX • S) is a fixed point of the function (λX • wp.S), which maps predicate transformers to predicate transformers.

This narrows the scope of what the weakest precondition of a recursive contract can be, but does not determine it uniquely, as there may be more than one fixed point for any given function. It is also possible that a function has no fixed point, in which case the definition would not give a weakest precondition interpretation to all contract statements.

The latter problem is settled by fixed-point theory. The predicate transformer wp.S is built out of basic predicate transformers and X using sequential composition, meets, and joins. Hence T ⊑ T′ ⇒ (λX • wp.S).T ⊑ (λX • wp.S).T′, when T and T′ are both monotonic predicate transformers (T and T′ must be monotonic so that sequential composition is monotonic in both arguments). Thus, (λX • wp.S) is a monotonic function on the set of monotonic predicate transformers Σ ↦m Σ. As Σ ↦m Σ is a complete lattice, the Tarski–Knaster fixed-point theorem shows that this function has a (least and greatest) fixed point.

The first problem (uniqueness) is settled by the requirement that nonterminating execution of a recursive contract be identified with breaching the contract. The argument here is that infinitely delaying the satisfaction of a condition required by the contract cannot be considered an acceptable way of satisfying the contract. A contract that never terminates is (µX • X), for which we therefore must have that wp.(µX • X) = abort. On the other hand, we know that wp.(µX • X) must be a fixed point of (λX • wp.X), i.e., of (λX • X). Any predicate transformer is a fixed point of this function. Of these predicate transformers, abort is the least one. We take this as a general rule and define the least fixed point of the function (λX • wp.S) as the weakest precondition of the recursive contract statement (µX • S):

wp.(µX • S) ≜ (µX • wp.S) .

This means that we can study the satisfaction of recursive contract statements by considering recursive predicate transformers (µX • S) (standing for µ.(λX • S)). In this way, the properties of recursive contract statements, as far as satisfaction of contracts goes, are reduced to properties of least fixed points of functions on predicate transformers.

Recursive Statements as Limits The above argument shows that we should take the least fixed point as the meaning of a recursive contract in the special case of the never terminating contract (µX • X). We can give further credibility to the choice of least fixed points by considering the way in which these are determined as limits of approximation sequences. Assume that f is a monotonic function on predicate transformers. Applying The- orem 19.3, we see that µ. f can be expressed using approximations

f^0.abort = abort ,
f^(α+1).abort = f.( f^α.abort) for arbitrary ordinals α,
f^α.abort = (lim β | β < α • f^β.abort) for nonzero limit ordinals α.

Theorem 19.3 then shows that µ. f = f^γ.abort for some ordinal γ . For ν. f , we have a dual formulation with coapproximations starting from magic. For a given contract statement (µ X • S), we can define the approximations S^α.{false} of (µ X • S) in the same way as for predicate transformers. Each of these approximations denotes a contract statement (remember that we also permit contract statements that are joins over infinitely indexed sets of contract statements).

Assume that the approximation S^α.{false} is executed in an initial state from which execution of (µ X • S) is guaranteed to terminate. We can then think of a counter initially set at the ordinal α. Each time the recursive definition is unfolded, the counter is decreased. If the ordinal is chosen large enough, then termination occurs before α = 0, and the effect of S^α.{false} will be the same as the effect of (µ X • S) in this initial state. If the number of unfoldings needed is too large, then the effect of S^α.{false} will be a breach of contract in this state.

As the ordinal α is chosen larger and larger, more and more unfoldings are possible, and for more and more initial states S^α.{false} will have the same effect as (µ X • S). At γ , we reach the point where all terminating executions are accounted for. For those initial states from which unfolding is infinite, every approximation acts as abort. Hence, the effect of (µ X • S) will be the same as the effect of S^γ.{false}, as far as satisfaction of the contract to establish some postcondition goes. Hence wp.(µ X • S) = wp.(S^γ.{false}) for some γ . By the definition of weakest preconditions for contract statements, we then have that wp.(S^γ.{false}) = (λX • wp.S)^γ.abort = (µX • wp.S). This shows that the weakest precondition of a recursive contract statement should be taken as the least fixed point of the corresponding function on monotonic predicate transformers in the general case also.

This explains why S^α can be seen as an approximation of the recursive contract statement (µ X • S). In some initial states, the number of unfoldings needed for (µ X • S) to establish the postcondition may be so small that S^α gives the same result, but in some other states, α may not be sufficiently large, so S^α leads to a breach of contract.

Example

It may not be evident why we need to use ordinals when defining the approximations, i.e., why is it not sufficient to consider only the natural numbers (and their limit ω). Let us therefore give an example where transfinite ordinals are needed to reach the least fixed point. Consider the statement

(µ X • {x = 0} ; [x := x′ | x′ ≥ 1] ; X ⊔ {x = 1} ⊔ {x > 1} ; x := x − 1 ; X) .

The behavior of this statement in initial state x ≠ 0 is to decrease the value of x until x = 1, when the execution terminates. In initial state x = 0, the program variable x is first assigned some arbitrary nonzero natural number, after which the same steps of decreasing x are carried out, eventually leading to termination when x = 1. Computing successive approximations we get, after some simplifications,

S^0 = abort ,
S^1 = {x = 1} ,
S^2 = {1 ≤ x ≤ 2} ; x := 1 ,

and inductively, for all n ∈ Nat, we find that S^n = {1 ≤ x ≤ n} ; x := 1:

    S^(n+1)
  = {definition of approximation}
    {x = 0} ; [x := x′ | x′ ≥ 1] ; S^n ⊔ {x = 1} ⊔ {x > 1} ; x := x − 1 ; S^n
  = {induction assumption}
    {x = 0} ; [x := x′ | x′ ≥ 1] ; {1 ≤ x ≤ n} ; x := 1 ⊔ {x = 1} ⊔ {x > 1} ; x := x − 1 ; {1 ≤ x ≤ n} ; x := 1
  = {simplification}
    abort ⊔ {x = 1} ⊔ {2 ≤ x ≤ n + 1} ; x := 1
  = {easily verified equality}
    {1 ≤ x ≤ n + 1} ; x := 1

(for the rules used in the last two steps see Exercises 20.3 and 20.4). At the first infinite ordinal ω a similar derivation gives us

S^ω = (⊔ n ∈ Nat • S^n) = {1 ≤ x} ; x := 1 .

Note that this statement behaves as abort when x = 0. We can still continue the process of computing approximations, getting 20.1. Fixed Points and Predicate Transformers 333

S^(ω+1) = x := 1 ,
S^(ω+2) = x := 1

(again, we get S^(α+1) by substituting S^α for X in the recursion body). At this point, we have reached the ordinal where no new approximations are produced. This means that the least fixed point is the functional update x := 1.

The need for transfinite ordinals does not, of course, mean that execution of a recursive statement could terminate after an infinite number of recursive calls. Rather, this argument shows that if we want to set a counter at the start so that its value can be decreased at each recursive call, then we have to set the counter to ω (or higher) at the start.
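To make the approximation chain concrete, here is a small simulation sketch (ours, not from the book). It computes the weakest-precondition approximations of the statement above with the state space artificially bounded to x ∈ {0, . . . , N}, so the demonic update [x := x′ | x′ ≥ 1] is cut off at N and the chain happens to close after finitely many steps; with the unbounded update the chain does not close at ω, as just explained. All names are ad hoc.

    # A sketch (ours): wp approximations of the recursive statement on a
    # bounded state space x in {0,...,N}.
    N = 5
    STATES = frozenset(range(N + 1))

    def wp_body(X):
        """wp of the recursion body, where X is the wp of the recursive occurrence."""
        def wp(q):
            result = set()
            for x in STATES:
                if x == 0:
                    # {x = 0} ; [x := x' | x' >= 1] ; X  -- demonic choice of x'
                    if all(x2 in X(q) for x2 in STATES if x2 >= 1):
                        result.add(x)
                elif x == 1:
                    # {x = 1} -- acts as skip here, abort elsewhere
                    if x in q:
                        result.add(x)
                else:
                    # {x > 1} ; x := x - 1 ; X
                    if x - 1 in X(q):
                        result.add(x)
            return frozenset(result)
        return wp

    approx = lambda q: frozenset()            # S^0 = abort
    for n in range(N + 3):
        print(f"S^{n}.true = {sorted(approx(STATES))}")
        approx = wp_body(approx)              # S^(n+1): substitute S^n for X

The printed sets grow as {1}, {1, 2}, . . . , {1, . . . , N} and then jump to all of {0, . . . , N}, mirroring the approximations S^n = {1 ≤ x ≤ n} ; x := 1 computed above.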

Greatest Fixed Points

We can also introduce (ν X • S) as an alternative kind of recursively defined statement. In fact, the greatest fixed point is important when we handle different kinds of iterations in Chapter 21.

If we want to understand greatest fixed points in terms of unfoldings, then we cannot use the same argument as for least fixed points. Since the greatest fixed point of a function on predicate transformers is a colimit of a chain of coapproximations starting from magic, this requires that infinite unfolding be treated as miraculous computation. This is not generally the case, but it can be justified in certain cases if a notion of biased execution is introduced.

Consider, for example, the recursive definition X:

X = skip ⊓ S ; X

Successive unfolding of this definition leads to a meet of repeated executions of S:

skip ⊓ S ⊓ S ; S ⊓ S ; S ; S ⊓ · · · .

(here we have assumed that S is conjunctive, to deduce S ; (skip ⊓ S ; X) = S ⊓ S ; S ; X). If S is not miraculous, we can intuitively think of the demon always choosing one more unfolding, so that an infinite unfolding is possible. This means that (µ X • S ; X ⊓ skip) is nonterminating. However, our interpretation of (ν X • S ; X ⊓ skip) is such that the demon must avoid infinite unfolding. Thus the demon will at some point choose the skip alternative, and the whole statement is guaranteed to terminate. This means that there is a bias in the choice; the demon may not neglect the skip alternative forever.

20.2 Recursion Introduction and Elimination

Let P = {pw | w ∈ W} be a collection of predicates that are indexed by the well-founded set W, such that v ≠ w ⇒ pv ∩ pw = false. We refer to a predicate in this set as a ranked predicate. In practice, a ranked predicate pw is often written as the conjunction of an invariant expression I and a variant expression t (also known as a termination function):

pw = (λσ • I.σ ∧ t.σ = w)

or, with pointwise extension, pw = (I ∧ t = w). Here w is a fresh variable ranging over W. Assuming a collection of ranked predicates P as above, we define

p ≜ (∪ w ∈ W • pw) ,
p<w ≜ (∪ v ∈ W | v < w • pv) .

Thus, p holds if some ranked predicate holds, while p<w holds if some ranked predicate with index smaller than w holds.

Recursion Introduction We shall now derive a rule for recursion introduction based on the use of ranked predicates. This rule states under what conditions it is possible to refine a statement S by a recursive statement µ. f . The rule is based on the following theorem.

Theorem 20.1 Assume that W is a well-founded set, f a monotonic function on monotonic predicate transformers, and {pw | w ∈ W} a collection of state predicates indexed by W. Then the following holds:

(∀w ∈ W • {pw} ; S ⊑ f.({p<w} ; S)) ⇒ {p} ; S ⊑ µ. f .

Proof Assume that {pw} ; S ⊑ f.({p<w} ; S) holds for all w ∈ W. We prove

(∀w ∈ W • {pw} ; S ⊑ µ. f ) (∗)

by well-founded induction, as follows. The induction assumption is (∀v • v < w ⇒ {pv} ; S ⊑ µ. f ):

    {pw} ; S
  ⊑ {assumption}
    f.({∪ v ∈ W | v < w • pv} ; S)
  ⊑ {induction assumption; monotonicity of f , assert, and sequential composition}
    f.(µ. f )
  = {folding least fixed point}
    µ. f

This establishes (∗). The claim of the theorem now follows, since {p} ; S = (⊔ w ∈ W • {pw} ; S) ⊑ µ. f . 2

From Theorem 20.1 we get the following rule for introducing a recursive statement, when I is a boolean expression and t an expression that ranges over some well-founded set:

⊢ {I ∧ t = w} ; S ⊑ T [X := {I ∧ t < w} ; S]
⊢ {I } ; S ⊑ (µ X • T )
• w is a fresh variable.

In practice, this rule of recursion introduction can be used as follows. Viewing a statement S as a specification, we first decide on an invariant I and a variant t ranging over some well-founded set W. We then transform {I ∧ t = w} ; S (where w is a fresh variable), applying refinement rules, until we get as a refinement a term of the form T [X := {I ∧ t < w} ; S]. The rule then allows us to conclude that {I } ; S ⊑ (µ X • T ).

Recursion Elimination

The recursion introduction rule shows how we can introduce recursion in a derivation. The opposite, recursion elimination, is also sometimes useful. As already noted, the principle of least fixed point induction is itself a rule for µ elimination. Another rule that can be used is the following:

(µ elimination) ⊢ (µ X • S) = S • X not free in S.

In fact, this rule can be used both to introduce and to eliminate a (vacuous) recursion. Similarly, the principle of greatest fixed point induction is itself a rule for ν introduction, while a (vacuous) greatest fixed point construct can be eliminated (and introduced) using

(ν elimination) ⊢ (ν X • S) = S • X not free in S.

20.3 Recursive Procedures

The definition of a recursive procedure is of the following form:

rec N(var x1,...,xm , val y1,...,yn) = S , (recursive procedure) where S is a statement that may contain (recursive) calls to N of the form

N(z1,...,zm , e1,...,en).

Here each zi is one of the variables x1,...,xm , y1,...,yn (the parameters), each e j is an expression in the parameters, and the zi are all distinct. Note that we do not permit references to “global” program variables in a procedure body.

Intuitively, a call N(z1,...,zm , e1,...,en) is executed as a statement

begin var y1,...,yn := e1,...,en ; S[x1,...,xm := z1,...,zm ] end , and calls to N inside S are executed in the same way, recursively. If a formal value parameter y j has the same name and type as an actual reference parameter, then the local variable y j has to be renamed. To simplify the presentation, we assume that no such name clashes occur. Note that the intuitive interpretation of recursive calls means that the execution builds up a collection of nested blocks with local variables. The same happens if a recursive call to N inside S is within the scope of a local variable declaration.

Semantics of Recursive Procedures

Even though the intuitive description of a recursive procedure includes a potentially unbounded nesting of blocks, we do not need this construction in order to describe mathematically what a recursive procedure means. In order to describe what type of predicate transformer such a recursive definition defines, we first consider an example with a single recursive call:

rec N(var x, y, val z) = · · · N(y, z, x + z) · · · .

We consider this definition as syntactic sugaring, as follows:

N.x.y.z ≜ (µ X • · · · (⊔ x0 • {x = x0} ; x, y, z := y, z, x + z ; X ; x, y, z := x0, x, y) · · ·) .

The effect of the angelic choice and the assignments that are wrapped around the recursion variable X is (a) to store and restore the values of the variables (here x) that are not used as reference parameters in the recursive call, and (b) to map the right variables (here y and z) to the reference parameter positions in the call. Using a local variable to get the same effect would lead to a typing problem, since the recursive call would then occur in a different state space. The combination of angelic choice and asserts keeps the initial value of x in a kind of “suspended animation” during the execution of the nested block; see Exercise 20.5. Thus we have a method of reducing a recursive procedure definition into a definition where the right-hand side includes a µ-recursion. Furthermore, we consider a call N(u, v, e) to be syntactic sugaring:

N(u, v, e) = begin var z := e ; N.u.v.z end

(recall that we assume that there is no name clash between z and u or v). In Section 13.5, we showed that a procedure body can be refined freely, provided that all calls to the procedure are in monotonic positions and that procedure calls preserve the homomorphism properties (strictness, termination, conjunctivity, and disjunctivity) of the procedure body. The results obviously remain valid for recursive procedures.

Recursive Procedures with Local Variables

Special care has to be taken when a recursive call occurs within the scope of a block with local variables in the procedure body. We assume that the body is of the form begin var z := e ; S end where S does not contain blocks with local variables (this is no real restriction because inner blocks can always be pushed to the outermost level in a statement). Then z is handled like a value parameter in the recursive calls. On the outermost level it is a local variable outside the recursion.

The following example should make this clear. Assume the following recursively defined procedure with a single recursive call:

rec N(var x, val y) = begin var z := 0 ; · · · N(y, x + z) · · · end .

This really defines N as follows:

N.x.y = begin var z := 0 ;

(µ X • · · · (⊔ x0, z0 • {x, z = x0, z0} ; x, y, z := y, x + z, 0 ;

X ; x, y, z := x0, x, z0) · · ·) end .

Before the recursive call, the old values of x and z are stored, while x plays the role of y, and y and z are reinitialized. After the call, x and z are restored, and y takes back its old role. Again, we assume for simplicity that if the procedure N is called from a state in which a global variable z is used as a reference parameter, then the local variable of the body has been renamed to avoid a name clash.

Procedure Introduction From the rule for recursion introduction (Section 20.2) we can derive a rule for intro- duction of recursive procedures. For simplicity, we assume only a single recursive call in the body of the procedure definition. Assume that N is a recursive procedure defined as

N(var x, val y) ≜ begin var z := f ; S[X := N(x̂, e)] end ,

where x, y, and z are lists of distinct variables; f is a list of expressions over x and y; e is a list of expressions over x, y, and z; and x̂ is a list of distinct variables drawn from x, y, and z. Furthermore, assume that {pw | w ∈ W} is a collection of ranked predicates over x and y, and that b is a boolean expression that does not contain y′ or z free. Then the following rule of procedure introduction is valid:

⊢ {pw} ; [x, y := x′, y′ | b] ⊑ begin var z := f ; S[X := {p<w} ; · · ·] end
⊢ {p} ; [x, y := x′, y′ | b] ⊑ N.x.y

We show the proof for the special case when x̂ = x, i.e., when the reference parameters take their own places in the recursive calls. The proof of the general case is analogous, except that the assignments immediately surrounding the recursion variable X become more complicated.

    {p} ; [x, y := x′, y′ | b] ⊑ N.x.y
  ≡ {definition of recursive procedure}
    {p} ; [x, y := x′, y′ | b]
      ⊑ begin var z := f ;
          (µX • S[X := (⊔ y0, z0 • {y = y0 ∧ z = z0} ; y := e ; X ; y, z := y0, z0)]) end
  ≡ {Theorem 17.7, assume z not free in b}
    {p ∧ z = f } ; [x, y, z := x′, y′, z′ | b]
      ⊑ (µX • S[X := (⊔ y0, z0 • {y = y0 ∧ z = z0} ; y := e ; X ; y, z := y0, z0)])
  ⇐ {recursion introduction}
    {pw ∧ z = f } ; [x, y, z := x′, y′, z′ | b] ⊑

• S[X := “ ( y0, z0 {y = y0 ∧ z = z0} ; y := e ;   {p

• ···( y0, z0 {y = y0 ∧ z = z0} ; y := e ;   {p

• ···( y0, z0 {y = y0 ∧ z = z0} ;  {p

Note the use of ellipsis to hide a context in this proof. We will use this device quite frequently to shorten a derivation and to highlight the places where changes are made in a term. Ellipses are used together with quotation marks: when we quote a subterm in one step, then the ellipses denote the part of the term outside the quotation marks in the next step.

Once the introduction of the recursive procedure N is verified, we can introduce calls to N, using the following rule of procedure call introduction:

⊢ {p} ; [x, y := x′, y′ | b] ⊑ N.x.y
⊢ {p[y := e]} ; [x := x′ | b[y := e]] ⊑ N(x, e)

Here the hypothesis is that the required refinement holds for the formal procedure parameters x and y, both of which may be lists of program variables. The conclusion then permits us to infer that the required refinement holds for the actual reference parameters x (a list of distinct program variables) and actual value parameters e (a list of expressions). The proof is a short derivation:

    {p[y := e]} ; [x := x′ | b[y := e]] ⊑ N(x, e)
  ≡ {definition of procedure call}
    {p[y := e]} ; [x := x′ | b[y := e]] ⊑ begin var y := e ; N.x.y end
  ≡ {Theorem 17.7}
    {p[y := e] ∧ y = e} ; [x, y := x′, y′ | b[y := e]] ⊑ N.x.y
  ≡ {use assertion information to simplify}
    {p ∧ y = e} ; [x, y := x′, y′ | b] ⊑ N.x.y
  ⇐ {weaken assertion}
    {p} ; [x, y := x′, y′ | b] ⊑ N.x.y

Using the Rule in Practice In practice, the introduction rules can be used in the following stepwise fashion. We start from a specification {p};[x := x | b] of the procedure, where x is intended to be a reference parameter and where other free variables y in p and b are intended to be value parameters. We then develop a recursive procedure N such that

{p} ; [x, y := x′, y′ | b] ⊑ N.x.y .

After this, we can replace any specification of the form {p[x, y := u, e]} ; [u := u′ | b[x, x′, y := u, u′, e]] with a procedure call N(u, e). As a trivial example, we develop a procedure for calculating whether a number is even or not, relying only on the following recursive characterization of evenness:

even.0 ∧ (∀n • even.(n + 1) ≡ ¬even.n) .

We want the argument n : Nat to be a value parameter and the result to be stored in a reference parameter x : Bool. Thus our specification is

x := even.n

or, equivalently, {T } ; [x := x′ | x′ = even.n]. As invariant we use true and as variant n. The derivation of the procedure is as follows:

    {n = w} ; [x, n := x′, n′ | x′ = even.n]
  ⊑ {introduce conditional, simplify if-branch}
    if n = 0 → x := T
    [] n > 0 → “{n = w ∧ n > 0} ; [x, n := x′, n′ | x′ = even.n]”
    fi
  = {characterization of even, context information}
    · · · {n = w ∧ n > 0} ; [x, n := x′, n′ | x′ = ¬even.(n − 1)] · · ·
  = {split assignment}
    · · · {n = w ∧ n > 0} ; [x, n := x′, n′ | x′ = even.(n − 1)] ; x := ¬x · · ·
  ⊑ {weaken assertion, make assignment deterministic}
    · · · {n − 1 < w} ; [x, n := x′, n′ | x′ = even.(n − 1)] ; x := ¬x · · ·

By the rule of procedure introduction (with pw := (n = w) and x := x and y := n and b := (x′ = even.n)) this derivation proves

[x, n := x′, n′ | x′ = even.n] ⊑ Even.x.n ,

where Even is defined as follows:

Even(var x, val n) =
  if n = 0 → x := T
  [] n > 0 → Even(x, n − 1) ; x := ¬x
  fi .

Furthermore, the rule for introduction of procedure calls allows us to replace statements of the right form with calls to Even. For example, the following replacement is correct:

z := even.(a + b) ⊑ Even(z, a + b) .

Note how the refinement theorem for the procedure itself contains the value param- eters as program variables, making the left-hand side look a bit more complicated than we might want (here we must use a demonic assignment rather than a deter- ministic one). However, the important thing is that we have a rule for deriving a procedure that allows us to replace program statements with procedure calls.
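For comparison, the derived procedure can be transcribed directly as an ordinary recursive function (our own sketch; the reference parameter x is modelled by the return value):

    # A direct transcription of the derived procedure Even (a sketch).
    def even(n: int) -> bool:
        if n == 0:
            return True
        else:
            # Even(x, n - 1) ; x := not x
            return not even(n - 1)

    assert even(10) and not even(7)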

Mutual Recursion A mutually recursive definition is of the form 342 20. Recursion

rec N1(var x1, val y1), . . . , Nm(var xm, val ym) = S1, . . . , Sm .

Here the defining expression Si for each Ni may contain calls to all of the procedures N1,...,Nm . This kind of definition can easily be handled by considering it to define a single m-tuple N.x.y = (N1.x1.y1,...,Nm .xm .ym ). Since the cartesian product of complete lattices is a complete lattice, N.x.y is then a least fixed point in analogy with the single-recursion case. Now, the definition of N1,...,Nm above is taken to mean that Ni .xi .yi is the ith component of this least fixed point, for i = 1,...,m. We will not consider mutual recursion in any more detail here.
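For a concrete programming analogue of this construction (our own sketch, not derived formally from the rules above), two functions defined in terms of each other can be read as the two components of a single least fixed point on a product:

    # A small sketch of mutual recursion: is_even and is_odd are defined in
    # terms of each other; together they correspond to one fixed point on the
    # product of their domains.
    def is_even(n: int) -> bool:
        return True if n == 0 else is_odd(n - 1)

    def is_odd(n: int) -> bool:
        return False if n == 0 else is_even(n - 1)

    assert is_even(10) and is_odd(7)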

20.4 Example: Computing the Square Root

As another example, we derive a recursive procedure for computing the integer square root of a given natural number n, i.e., the (unique) natural number x satisfying x² ≤ n < (x + 1)². Our plan is to let two variables l and r bound an interval within which the square root of n is guaranteed to be, and to make this interval smaller and smaller until r = l + 1. This suggests a recursive procedure with one reference parameter (x) and three value parameters (n, l, and r) and using l² ≤ n < r² as invariant and r − l as variant. The initial specification is

{l² ≤ n < r²} ; [x, n, l, r := x′, n′, l′, r′ | x′² ≤ n < (x′ + 1)²] .

We then have the following derivation, where S stands for the statement [x := x′ | x′² ≤ n < (x′ + 1)²]:

    {l² ≤ n < r² ∧ r − l = w} ; [x, n, l, r := x′, n′, l′, r′ | x′² ≤ n < (x′ + 1)²]
  ⊑ {make assignment more deterministic}
    {l² ≤ n < r² ∧ r − l = w} ; [x := x′ | x′² ≤ n < (x′ + 1)²]
  = {introduce conditional, simplify assertions}
    if l + 1 = r → “{l² ≤ n < r² ∧ r − l = w} ; S”
    [] l + 1 ≠ r ∧ (l + r)²/4 ≤ n → {l + 1 < r ∧ (l + r)²/4 ≤ n ∧ l² ≤ n < r² ∧ r − l = w} ; S
    [] l + 1 ≠ r ∧ (l + r)²/4 > n → {l + 1 < r ∧ n < (l + r)²/4 ∧ l² ≤ n < r² ∧ r − l = w} ; S
    fi
  ⊑ {introduce assignment}

    if l + 1 = r → x := l
    [] l + 1 ≠ r ∧ (l + r)²/4 ≤ n → {l + 1 < r ∧ (l + r)²/4 ≤ n ∧ l² ≤ n < r² ∧ r − l = w} ; S
    [] l + 1 ≠ r ∧ (l + r)²/4 > n → {l + 1 < r ∧ n < (l + r)²/4 ∧ l² ≤ n < r² ∧ r − l = w} ; S
    fi
  ⊑ {weaken assertions}
    if l + 1 = r → x := l
    [] l + 1 ≠ r ∧ (l + r)²/4 ≤ n → {(l + r)²/4 ≤ n < r² ∧ r − (l + r)/2 < w} ; S
    [] l + 1 ≠ r ∧ (l + r)²/4 > n → {l² ≤ n < (l + r)²/4 ∧ (l + r)/2 − l < w} ; S
    fi

By the rule of procedure introduction we can therefore define

Sqrt(var x, val n, l, r) =
  if l + 1 = r → x := l
  [] l + 1 ≠ r ∧ (l + r)²/4 ≤ n → Sqrt(x, n, (l + r)/2, r)
  [] l + 1 ≠ r ∧ (l + r)²/4 > n → Sqrt(x, n, l, (l + r)/2)
  fi

and deduce

{l² ≤ n < r²} ; [x, n, l, r := x′, n′, l′, r′ | x′² ≤ n < (x′ + 1)²] ⊑ Sqrt.x.n.l.r .

Furthermore, the rule for procedure call introduction tells us that if e is an expression over the program variables, then

[x := x′ | x′² ≤ e < (x′ + 1)²] ⊑ Sqrt(x, e, 0, e + 1) .

To avoid the redundant parameters (0 and e + 1) we can define an additional procedure with only two parameters (x and e), the body of which consists of only the call Sqrt(x, e, 0, e + 1).
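The derived procedure translates directly into an ordinary recursive function (our own sketch; the reference parameter x is modelled by the return value, and the midpoint m = (l + r) // 2 plays the role of (l + r)²/4 via m·m):

    # A transcription of Sqrt as a recursive function (a sketch).
    # Invariant: l*l <= n < r*r ; variant: r - l.
    def sqrt_rec(n: int, l: int, r: int) -> int:
        if l + 1 == r:
            return l
        m = (l + r) // 2          # midpoint; m*m corresponds to (l + r)^2 / 4
        if m * m <= n:
            return sqrt_rec(n, m, r)
        else:
            return sqrt_rec(n, l, m)

    def isqrt(n: int) -> int:
        # initial call with the redundant parameters 0 and n + 1 filled in
        return sqrt_rec(n, 0, n + 1)

    assert [isqrt(k) for k in range(10)] == [0, 1, 1, 1, 2, 2, 2, 2, 2, 3]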

20.5 Summary and Discussion

This chapter has given an overview of the fixed-point theory of recursion, applied to the framework of predicate transformers and refinement calculus. We have also

shown how recursive procedures with value and reference parameters are derived and used. Dijkstra’s original weakest precondition calculus [53] did not handle recursion, it did not use fixed points for defining loops, and it required that loop bodies be continuous. The weakest precondition calculus was extended to recursion by Hehner in [74]. A predicate transformer semantics for general recursion (including mutual recursion) is given by Hesselink [78]. Ordinals were first used to describe recursion in terms of limits by Boom [41], who described his results in the framework of constructive mathematics. Dijkstra and van Gasteren used fixed-point reasoning and well-founded induction for loops, to avoid the restriction to continuity [55]. However, most authors have preferred to stay within the realm of bounded nondeterminism (i.e., continuity), where the natural numbers are sufficient. A continuous semantics for unbounded nondeterminism can be given, as shown by Back [8], but such a semantics cannot then be fully abstract; i.e., it requires additional information not related to the input–output behavior of the program [5]. The original refinement calculus in [7] had only iteration constructs, and no general recursion. Recursion with noncontinuous statements was added to the refinement calculus by Back [12] using ordinals, essentially following Boom [41]. A fixed- point version of recursion in the refinement calculus was given by Morris [108]. The rule for recursion introduction in the refinement calculus was first described by Back in [11]. Greatest fixed points are used in a partial correctness framework but have not attracted much attention in total correctness formalisms. An exception is David Park [114], who used greatest fixed points to describe parallelism and fair termination. Rules for introducing and reasoning about procedures, including recursive pro- cedures, within the refinement calculus framework were given by Back [11]. Independently, Morgan also derived rules for procedures, but without recursion [103].

20.6 Exercises

20.1 Show that the approximation S^ω in the example in Section 20.1 is {1 ≤ x} ; x := 1, while S^(ω+1) is x := 1.

20.2 Assume that each predicate pw in the ranked collection {pw | w ∈ W} (where W is a well-founded set) is of the form I ∧ t = w, where w does not occur free in I or t. Show that the predicate (∪ w ∈ W • pw) is I .

20.3 Prove the following rules, which were used in Section 20.1:

(a) (∃σ • q.σ ∧¬p.σ ) ⇒ [q˙];{p}=abort .

(b) ⟨ f ⟩ ; {p} = {λσ • p.( f.σ )} ; ⟨ f ⟩ .

(c) x := e ; x := e′ = x := e′ if x does not appear in e′.

20.4 Prove the following, which was used in Section 20.1:

{x = 1} ⊔ {2 ≤ x ≤ n + 1} ; x := 1 = {1 ≤ x ≤ n + 1} ; x := 1 .

y := a ; (⊔ y0 • {y = y0} ; S ; y := y0) = y := a ; S ; y := a

for an arbitrary value a. This shows that the statement (⊔ y0 • {y = y0} ; S ; y := y0) manages to execute S but then restore y to its initial value.

20.6 Prove the following result used in the proof of the rule for procedure introduction:

(⊔ y0 • {y = y0} ; {p} ; [x := x′ | b] ; y := y0) = {p} ; [x := x′ | b] .

20.7 Prove the following result used in the proofs of the procedure rules:

{p} ; [x := x′ | b] ⊑ begin var y := e ; S end ≡ {p ∧ y = e} ; [x, y := x′, y′ | b] ⊑ S

when y and y′ do not occur free in p or b.

20.8 Derive a recursive procedure for computing the factorial of a natural number, using the following primitive recursive characterization of fact:

(fact.0 = 1) ∧ (∀n • fact.(n + 1) = (n + 1) · fact.n) .

20.9 Derive a recursive procedure for computing the greatest common factor (also known as the greatest common divisor) for two positive natural numbers, using the following properties of gcf, whenever x and y are positive:

gcf.x.x = x ,
x ≤ y ⇒ gcf.x.y = gcf.x.(y − x) ,
gcf.x.y = gcf.y.x .

A suitable initial specification is {x > 0 ∧ y > 0} ; z := gcf.x.y .

20.10 Fill in the missing details in the derivation of the recursive procedure that computes the integer square root, i.e., the details of the derivation steps justified by “introduce assignment” and “weaken assertion”.

21 Iteration and Loops

The recursive constructs provide powerful programming tools. In the general case, the body S of a recursion (µ X • S) or (ν X • S) can be an arbitrary term of predicate transformer type. An important special case of recursion is iteration, i.e., the repeated execution of a statement. In this chapter we consider different kinds of iterations. We introduce two basic iteration constructs, a strong iteration statement given by the least fixed point construct and a weak iteration statement given by the greatest fixed point construction. We study the monotonicity and homomorphism properties of these statements and give rules for reasoning about iteration statements. We then show how to describe ordinary program constructs such as while loops as derived statements in terms of iterations. In particular, we derive correctness and introduction rules for iteration statements and loops using the corresponding rules for recursive statements.

21.1 Iteration

Let S be a monotonic predicate transformer. We define two basic iteration constructs over S, strong iteration and weak iteration, as the following fixed points:

Sω ≙ (µ X • S ; X ⊓ skip) , (strong iteration)
S∗ ≙ (ν X • S ; X ⊓ skip) . (weak iteration)

Iterations are important building blocks when we introduce and reason about loops. Intuitively speaking, statement S∗ is executed so that S is repeated a demonically

chosen (finite) number of times. Execution of statement Sω is similar, but if S can be executed indefinitely, then Sω is aborting (recall that infinite unfolding corresponds to nontermination for least fixed point statements).

Predicate-Level Characterization The two iterations can also be described using fixed points on the predicate level.

Lemma 21.1 Let S be an arbitrary monotonic predicate transformer. Then

Sω.q = (µ x • S.x ∩ q) ,
S∗.q = (ν x • S.x ∩ q)

for arbitrary predicate q.

Proof For strong iteration we have, for arbitrary predicate q,

  (µ X • S ; X ⊓ skip).q = (µ x • S.x ∩ q)
⇐ {Corollary 19.6}
  (∀T • (λX • S ; X ⊓ skip).T.q = (λx • S.x ∩ q).(T.q))
≡ {β reduction}
  (∀T • (S ; T ⊓ skip).q = S.(T.q) ∩ q)
≡ {statement definitions}
  T

The proof for S∗ is dual. 2

As an example of an iteration, consider the weak iteration (x := x + 1)∗. It increments x a demonically chosen (finite) number of times. Thus it has the same effect as the update [x := x′ | x ≤ x′]. On the other hand, (x := x + 1)ω is nonterminating, since x can be incremented indefinitely, so it has the same effect as abort (Exercise 21.4).
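To make the fixed-point characterization of Lemma 21.1 concrete, here is a small Python sketch of our own (none of the names below come from the book): predicates over a finite state space are sets of states, a statement is represented by its weakest-precondition function, and the two fixed points are computed by plain iteration. A saturating increment on {0, ..., N} stands in for the unbounded x := x + 1 of the example.

```python
# A minimal sketch (our own, not the book's): predicates over the finite state
# space {0, ..., N} are frozensets of states, a statement is its weakest-
# precondition function, and the fixed points of Lemma 21.1 are computed by
# plain iteration (the lattice of predicates over a finite space is finite).

N = 10
STATES = frozenset(range(N + 1))

def wp_increment(q):                     # wp of x := min(x + 1, N)
    return frozenset(s for s in STATES if min(s + 1, N) in q)

def lfp(f):                              # least fixed point, iterating from false
    x = frozenset()
    while f(x) != x:
        x = f(x)
    return x

def gfp(f):                              # greatest fixed point, iterating from true
    x = STATES
    while f(x) != x:
        x = f(x)
    return x

def strong_iter_wp(S, q):                # S^omega.q = (mu x . S.x /\ q)
    return lfp(lambda x: S(x) & q)

def weak_iter_wp(S, q):                  # S^*.q = (nu x . S.x /\ q)
    return gfp(lambda x: S(x) & q)

q = frozenset(s for s in STATES if s >= 3)
print(sorted(strong_iter_wp(wp_increment, q)))   # []: strong iteration behaves like abort
print(sorted(weak_iter_wp(wp_increment, q)))     # [3..10]: q must survive any number of increments
```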

Tail Recursion The choice of skip as the exit alternative (i.e., as the second argument of the meet in the body of the recursion) in the definitions of the iterations may seem arbitrary, and in a sense it is. The following lemma shows how we get an iteration with an arbitrary exit alternative T . Lemma 21.2 Let S and T be arbitrary monotonic predicate transformers. Then

Sω ; T = (µ X • S ; X ⊓ T ) ,
S∗ ; T = (ν X • S ; X ⊓ T ) .

Proof We have

  Sω ; T = (µ X • S ; X ⊓ T )
≡ {definitions}
  (µ X • S ; X ⊓ skip) ; T = (µ X • S ; X ⊓ T )
⇐ {fusion (Theorem 19.5) with h.X = X ; T}
  (∀X • (S ; X ⊓ skip) ; T = S ; (X ; T ) ⊓ T )
≡ {distributivity}
  T

which proves the first equality. The proof for weak iteration is similar. 2

Statements of the form (µ X • S ; X  T ) are important, since they model tail recursion, i.e., recursion in which the body of the recursion has no substatement that sequentially follows the recursive call. Thus Lemma 21.2 shows how tail recursion can be written using strong iteration.

Infinite Repetition Statement We define the infinite repetition statement S∞ as follows:

S∞ ≙ (µ X • S ; X) . (repetition statement)

The following lemma shows what the predicate transformer S∞ is.

Lemma 21.3 Assume that S is a monotonic predicate transformer. Then S∞.q = µ.S. Proof (µ X • S ; X).q = µ.S ⇐{Corollary 19.6} (∀T • (λX • S ; X).T.q = S.(T.q)) ={β reduction, definition of composition} T 2

Thus, S∞.q = µ.S for an arbitrary postcondition q. Intuitively, this means that the predicate µ.S characterizes those states from which repeated execution of S eventually terminates miraculously. Infinite repetition can be expressed using strong iteration (Exercise 21.5):

S∞ = Sω ; magic .

This justifies considering the strong and weak iteration statements as more basic than the repetition statement.

21.2 Properties of Iterations

Both iteration statements can be viewed as statement constructors with one substatement argument. We now investigate their monotonicity and homomorphism properties. Both iteration statements are monotonic, in a restricted way, as shown by the following lemma.

Theorem 21.4 The iterative constructs S∗, Sω, and S∞ are monotonic when restricted to a monotonic argument S. Furthermore, they preserve monotonicity: if S is monotonic, then so are S∗, Sω, and S∞.

Proof Consider first the monotonicity of the iterative constructs. We have

  S ⊑ S′
⇒ {monotonicity properties of statements}
  (∀X • S ; X ⊓ skip ⊑ S′ ; X ⊓ skip)
⇒ {monotonicity of pointwise extension}
  (λX • S ; X ⊓ skip) ⊑ (λX • S′ ; X ⊓ skip)
⇒ {monotonicity of least fixed point, Lemma 19.2}
  (µ X • S ; X ⊓ skip) ⊑ (µ X • S′ ; X ⊓ skip)
≡ {definition of strong iteration}
  Sω ⊑ S′ω

Monotonicity of S∗ is proved with the same argument. We then also have

  Sω is monotonic
≡ {definition of strong iteration}
  (µ X • S ; X ⊓ skip) is monotonic
⇐ {Lemma 19.7}
  (λX • S ; X ⊓ skip) is monotonic ∧ (λX • S ; X ⊓ skip) preserves monotonicity
⇐ {basic statements are monotonic and preserve monotonicity}
  T

2

These properties are summarized in Tables 21.1 and 21.2. We next consider which homomorphism properties are preserved by iterations and loops.

                 is ⊑
(λS • Sω)        yes¹
(λS • S∗)        yes¹
(λS • S∞)        yes¹
¹ monotonic S

TABLE 21.1. Iteration monotonicity

Theorem 21.5 The iterations S∗ and Sω and the infinite repetition S∞ have the following properties:

(a) Weak and strong iterations are always strict, and infinite repetitions preserve strictness.

(b) Weak iterations preserve termination.

(c) Weak iterations and infinite repetitions preserve conjunctivity.

(d) Infinite repetitions preserve disjunctivity.

Proof We first show that S∗ is strict:

  S∗.false
= {Lemma 21.1}
  (ν x • S.x ∩ false)
= {predicate calculus}
  (ν x • false)
= {fixed point of constant function}
  false

The proof for strong iteration is similar. Next, Lemma 19.8 is used to show that weak iteration preserves conjunctivity and that infinite repetition preserves disjunctivity. That infinite repetition preserves conjunctivity follows directly from the fact that S∞.(∩ i ∈ I • pi ) and (∩ i ∈ I • S∞.pi ) are both equal to µ.S when I is nonempty. To see that S∗ preserves termination, we have the following derivation:

  S terminating
⊢ S∗.true
= {Lemma 21.1}
  (ν x • S.x ∩ true)
= {predicate calculus}
  (ν x • S.x)

preserves      ⊑    ⊥    ⊤    ⊓    ⊔
(λS • Sω)      yes  yes  no   yes  no
(λS • S∗)      yes  yes  yes  yes  no
(λS • S∞)      yes  yes  no   yes  yes

TABLE 21.2. Property preservation for derived statements

={assumption S.true = true means true must be greatest fixed point} true

We leave the proof that infinite repetition preserves strictness as an exercise to the reader (Exercise 21.6). 2

In fact, it can be shown that strong iteration also preserves conjunctivity. However, the proof of this is most easily done after some further properties of conjunctive statements have been investigated, so we postpone it to Chapter 29 (Theorem 29.7).

21.3 Correctness of Iterations

Recall the fixed-point properties and the rule of recursion introduction in Chapter 20. As an application of these rules, we can derive correctness rules for the iteration statements. Our aim is to show under what conditions the total correctness assertions p {| S∗ |} q, p {| Sω |} q, and p {| S∞ |} q are valid. We start with weak iteration.

Theorem 21.6 Assume that S is a monotonic predicate transformer.

(a) The following correctness rule holds for weak iteration:

r {| S |} r ⇒ r {| S∗ |} r .

(b) Furthermore, if {rw | w ∈ W} is a collection of ranked predicates, then the following correctness rule holds for strong iterations:

(∀w ∈ W • rw {| S |} r<w) ⇒ r {| Sω |} r .

Proof Part (a) is really a special case of greatest fixed point induction:

  T
≡ {specialize greatest fixed point induction}
  r ⊆ (λx • S.x ∩ r).r ⇒ r ⊆ (ν x • S.x ∩ r)
≡ {β reduction, Lemma 21.1}
  (r ⊆ S.r ∩ r) ⇒ r ⊆ S∗.r
≡ {lattice property (x ⊑ y ⊓ x) ≡ (x ⊑ y)}
  r ⊆ S.r ⇒ r ⊆ S∗.r

and the proof is finished. For (b), recall that r = (∪ w ∈ W • rw) and r<w = (∪ v ∈ W | v < w • rv); the rule then follows from the rule of recursion introduction in Chapter 20, applied to the least fixed point that defines Sω. The details are left as an exercise (Exercise 21.8). 2

Note the similarities between the two rules in Theorem 21.6. The rule for strong iteration is somewhat more complicated, as it makes use of ranked predicates. The basic idea is that each execution of the statement S has to decrease the rank of the predicate in order to guarantee termination. Since preconditions can be strengthened and postconditions weakened, we imme- diately get the following corollary for strong iteration.

Corollary 21.7 Assume that p and q are predicates and S is a monotonic predicate transformer. Assume that {rw | w ∈ W} is a ranked collection of predicates. Then p {| Sω |} q if

(i) p ⊆ r,

(ii) (∀w ∈ W • rw {| S |} r<w),

(iii) r ⊆ q. Intuitively, the first condition in Corollary 21.7 means that the invariant r is es- tablished by the precondition p. The second condition asserts that each iteration decreases the rank of the invariant. Finally, the third condition states that the post- condition q holds after termination (regardless of which rw was reached when execution terminated).

21.4 Loops

We now consider a derivative of the strong iteration Sω, the while loop and its generalizations. The while loop is less general than strong iteration, and its syntax is more restrictive. It models the standard while loop of imperative programming languages.

While loops

We define the construct while g do S od as follows:

while g do S od ≙ (µ X • if g then S ; X else skip fi) ,

where g is a predicate and S a predicate transformer. By unfolding, we see that execution of while g do S od consists in repeatedly executing S as long as the predicate g holds. Predicate g is called the guard and S the body of the loop. While loops can be reduced to iterations. This is useful, since it means that we can reduce reasoning about loops to reasoning about iteration.

Lemma 21.8 Let g be a predicate and S a monotonic predicate transformer. Then

while g do S od = ([g] ; S)ω ; [¬ g] .

Proof The proof is a short calculation:

  S monotonic
⊢ while g do S od
= {definition}
  (µ X • if g then S ; X else skip fi)
= {rewrite conditional}
  (µ X • [g] ; S ; X ⊓ [¬ g])
= {Lemma 21.2}
  ([g] ; S)ω ; [¬ g]

2

A consequence of Lemma 21.8 is the following predicate-level fixed-point characterization of the while loop:

Lemma 21.9 Let g and q be predicates and S a monotonic predicate transformer. Then while g do S od.q = (µ x • (g ∩ S.x) ∪ (¬ g ∩ q)) .

Proof We have

  while g do S od.q
= {Lemma 21.8}
  (([g] ; S)ω ; [¬ g]).q
= {definitions of sequential composition and guard}
  ([g] ; S)ω.(g ∪ q)
= {Lemma 21.1}
  (µ x • ([g] ; S).x ∩ (g ∪ q))
= {definitions of sequential composition and guard}
  (µ x • (¬ g ∪ S.x) ∩ (g ∪ q))
= {predicate property}
  (µ x • (g ∩ S.x) ∪ (¬ g ∩ q))

which proves the lemma. 2
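Under the assumption of a finite state space, Lemmas 21.8 and 21.9 can be checked by direct computation. The following Python sketch (our own construction; all names are illustrative) does this for the loop while x > 0 do x := x − 1 od with postcondition x = 0.

```python
# A small self-contained check (our own sketch, finite state space {0,...,N})
# that the direct fixed point of Lemma 21.9 agrees with the reduction of
# Lemma 21.8 for the loop  while x > 0 do x := x - 1 od.

N = 10
STATES = frozenset(range(N + 1))

def lfp(f):
    x = frozenset()
    while f(x) != x:
        x = f(x)
    return x

g = frozenset(s for s in STATES if s > 0)                          # guard x > 0
S = lambda p: frozenset(s for s in STATES if max(s - 1, 0) in p)   # wp of x := x - 1
q = frozenset({0})                                                 # postcondition x = 0

# Lemma 21.9: (while g do S od).q = (mu x . (g /\ S.x) \/ (~g /\ q))
wp_loop = lfp(lambda x: (g & S(x)) | ((STATES - g) & q))

# Lemma 21.8: while g do S od = ([g]; S)^omega ; [~g], where
# ([g]; S)^omega.(g \/ q) is computed via Lemma 21.1.
guarded_body = lambda x: (STATES - g) | S(x)                       # wp of [g]; S
wp_iter = lfp(lambda x: guarded_body(x) & (g | q))

assert wp_loop == wp_iter == STATES
print(sorted(wp_loop))    # every state in the model establishes x = 0
```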

Guarded Iteration Statements We have already seen how the guarded conditional generalizes the deterministic conditional by permitting nondeterminism and also more than two alternatives. The guarded iteration statement similarly generalizes the while loop. It is defined as

do g1 → S1 [] ··· [] gn → Sn od ≙ while g1 ∪ ··· ∪ gn do [g1] ; S1 ⊓ ··· ⊓ [gn] ; Sn od .

This gives the while loop as a special case, when n = 1:

do g → S od = while g do S od . A while loop with a deterministic conditional is easier to understand when rewritten as a guarded conditional:

while g do if h then S1 else S2 fi od =

do g ∧ h → S1 [] g ∧¬h → S2 od . This is also true in general; the logic behind a program is often more clearly expressed by the guarded conditional and iteration statements than by their deter- ministic counterparts. The guarded conditional and iteration statements are also very convenient to use in practical program derivations, as the different cases can be handled more independently of one another.

Properties of Loops

We now investigate the monotonicity and homomorphism properties of loops.

Theorem 21.10 The while loop and guarded iteration are both monotonic with respect to monotonic predicate transformer arguments. Furthermore, both loop constructs preserve monotonicity.

Proof For the while loop, both properties follow from Lemma 21.8 and the corresponding properties of strong iteration, the guard statements and sequential composition. The properties for guarded iteration are then easily derived from the properties for while loops and guarded conditionals. 2

Monotonicity preservation of guarded iteration means the following:

S1 ⊑ S1′ ∧ ··· ∧ Sn ⊑ Sn′ ⇒
do g1 → S1 [] ··· [] gn → Sn od ⊑ do g1 → S1′ [] ··· [] gn → Sn′ od

                                                    is ⊑
(λS • while g do S od)                              yes¹
(λS1 ··· Sn • do g1 → S1 [] ··· [] gn → Sn od)      yes¹
¹ when restricted to monotonic arguments S

TABLE 21.3. Loop monotonicity

Theorem 21.10 shows that we can extend our syntax of monotonic statements with the while loop and guarded iteration. Thus, whenever S is a monotonic statement, while g do S od is then also a monotonic statement, and similarly for do ... od. Tables 21.3 and 21.4 show the monotonicity properties of loops. The homomorphism preservation properties of loops are as follows.

Theorem 21.11 While loops and guarded iterations both preserve strictness. Furthermore, while loops preserve disjunctivity.

Proof Since guards are conjunctive, sequential composition and strong iteration both preserve conjunctivity, and while g do S od = ([g] ; S)ω ; [¬ g], it follows that while loops preserve conjunctivity. Next, from Lemma 19.8 it follows that while loops preserve disjunctivity. The proof of strictness is left to the reader as an exercise. The properties of guarded iterations follow directly from the properties of while loops and guarded conditionals. 2

In fact, while loops and guarded iterations also preserve conjunctivity. This will be proved in Chapter 29 (Theorem 29.7). Note that the infinite repetition statement is a special case of the while loop (Exercise 21.5):

S∞ = while true do S od .

21.5 Loop Correctness

From the correctness rule for strong iteration we can now derive correctness rules for loops.

preserves                                           ⊑    ⊥    ⊤    ⊓    ⊔
(λS • while g do S od)                              yes  yes  no   yes  yes
(λS1 ··· Sn • do g1 → S1 [] ··· [] gn → Sn od)      yes  yes  no   yes  no

TABLE 21.4. Loop preservation of monotonicity

Theorem 21.12 Assume that g is a predicate and S a monotonic predicate transformer. Assume that {rw | w ∈ W} is a ranked collection of predicates. Then

(∀w ∈ W • rw ∩ g {| S |} r<w) ⇒ r {| while g do S od |} r ∩ ¬ g .

Proof We use the corresponding rule for strong iteration:

  r {| while g do S od |} r ∩ ¬ g
≡ {Lemma 21.8}
  r {| ([g] ; S)ω ; [¬ g] |} r ∩ ¬ g
≡ {definition of correctness}
  r ⊆ (([g] ; S)ω ; [¬ g]).(r ∩ ¬ g)
≡ {definition of sequential composition and guard}
  r ⊆ ([g] ; S)ω.(g ∪ (r ∩ ¬ g))
≡ {lattice property}
  r ⊆ ([g] ; S)ω.(g ∪ r)
⇐ {monotonicity}
  r ⊆ ([g] ; S)ω.r
≡ {definition of correctness}
  r {| ([g] ; S)ω |} r
⇐ {correctness rule for strong iteration (Theorem 21.6)}
  (∀w ∈ W • rw {| [g] ; S |} r<w)
≡ {definitions of correctness and guard}
  (∀w ∈ W • rw ∩ g {| S |} r<w)

2

From Theorem 21.12 we get the following general rule for proving correctness of loops:

Theorem 21.13 Assume that g1,...,gn, p, and q are predicates and Si are monotonic statements for i = 1,...,n. Furthermore, assume that {rw | w ∈ W} is a ranked collection of predicates. Then

p {| do g1 → S1 [] ... [] gn → Sn od |} q ,

provided that

(i) p ⊆ r,

(ii) (∀w ∈ W • rw ∩ gi {| Si |} r<w) for i = 1, . . . , n,

(iii) r ∩¬g ⊆ q.

where g = g1 ∨ ··· ∨ gn. The first condition in Theorem 21.13 states that the loop invariant r holds initially; the second condition asserts that each iteration of the loop preserves the loop invariant while decreasing the rank of the particular predicate rw; and the third condition states that the postcondition q is established upon termination of the loop. Once we have found a loop invariant, we may use it to modify the loop guard. We have the following general inference rule for this:

p ⊆ (g ≡ g′)
⊢ {p} ; while g do S od = {p} ; while g′ do S od ,

provided that p is preserved by the loop; i.e., (g ∩ p ∩ S.true) {| S |} p.

Example As a simple example of loop correctness, we consider the following loop:

L = while y > 0 do x, y := x · y, y − 1 od . It stores the factorial of the initial value of y in x if the condition x = 1 holds initially. We show that it satisfies the total correctness assertion

var x, y : Nat ⊢ (x = 1 ∧ y = y0 ∧ y0 ≥ 0) {| L |} x = y0! ,

where y0 is a free variable standing for the initial value of y, and y0! is the factorial of y0.

We choose 0 ≤ y ≤ y0 ∧ x · y! = y0! as the invariant and y as the variant. Thus we have to prove the following conditions:

(i) x = 1 ∧ y = y0 ∧ y0 ≥ 0 ⇒ 0 ≤ y ≤ y0 ∧ x · y! = y0! .

(ii) 0 ≤ y ≤ y0 ∧ x · y! = y0! ∧ y > 0 ∧ y = n {| x, y := x · y, y − 1 |} 0 ≤ y ≤ y0 ∧ x · y! = y0! ∧ y < n .

(iii) 0 ≤ y ≤ y0 ∧ x · y! = y0! ∧ ¬ (y > 0) ⇒ x = y0! .

The first and third conditions are straightforward arithmetic. The second condition is reduced using the assignment rule to get the following condition:

0 ≤ y ≤ y0 ∧ x · y! = y0! ∧ y > 0 ∧ y = n
⇒ 0 ≤ y − 1 ≤ y0 ∧ x · y · (y − 1)! = y0! ∧ y − 1 < n .

This now follows by standard arithmetic reasoning.
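For readers who prefer to see the conditions exercised, the following Python sketch (ours, not the book's notation) runs the loop L directly and asserts the invariant, the decrease of the variant, and the postcondition at run time for a few initial values.

```python
# An executable rendering of the loop L (our own sketch; fact is an ordinary
# factorial helper), with the invariant, variant, and postcondition of
# conditions (i)-(iii) asserted at run time.

def fact(n):
    r = 1
    for i in range(2, n + 1):
        r *= i
    return r

def run_L(y0):
    x, y = 1, y0                                         # precondition: x = 1, y = y0 >= 0
    while y > 0:
        assert 0 <= y <= y0 and x * fact(y) == fact(y0)  # invariant
        old_variant = y
        x, y = x * y, y - 1
        assert y < old_variant                           # variant decreases
    assert x == fact(y0)                                 # postcondition x = y0!
    return x

for y0 in range(8):
    run_L(y0)
print("conditions hold for y0 = 0..7")
```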

21.6 Loop Introduction and Elimination

From the correctness rule we can now derive a simple rule for while-loop introduction, where a pre–postcondition specification is refined by a loop:

{p} ; [x := x′ | b] ⊑ while g do {g ∧ I} ; [x := x′ | I[x := x′] ∧ t[x := x′] < t] od ,

provided that

p ⊆ I ,
(I ∧ ¬ g) ⊆ b[x′ := x] ,
x does not occur free in b ,

where t is a state expression ranging over some well-founded set. This loop introduction rule may seem restrictive, since the original demonic assignment cannot refer to the initial value of the variable x. However, by using specification constants (treated in Section 17.2) for the initial values of program variables, we can avoid this restriction. In some cases we know (or can guess) what the body should be in a loop introduction. In this case we can use a proof rule derived directly from the introduction rule given above:

Theorem 21.14 Assume that g, p, I, and b are boolean expressions such that x does not occur free in b. Furthermore, assume that state expression t ranges over some well-founded set. Then

{p} ; [x := x′ | b] ⊑ while g do S od ,

provided that

(i) p ⊆ I,

(ii) (∀w ∈ W • (g ∧ I ∧ t = w) {| S |} (I ∧ t < w)),

(iii) (¬ g ∧ I) ⊆ b[x′ := x].

For the elimination of loops, the following rule can be used:

{¬ g} ; while g do S od ⊑ skip .

Example: Exponentiation

As an example of program refinement using loop introduction, we develop a program for computing x^y, where x and y are program variables that range over natural numbers. We start from the following specification:

[x, y, r := x′, y′, r′ | r′ = x^y] .

The idea is to compute the exponentiation step by step, storing intermediate results in r. Note that the specification allows x and y to be changed arbitrarily. We will make use of this freedom, but first we must introduce specification constants to stand for the initial values of x and y. We divide the derivation into two parts. The first part introduces specification constants and a loop in which y is successively decreased. The loop body is a demonic assignment, and in the second part we implement it in an efficient way.

  [x, y, r := x′, y′, r′ | r′ = x^y]
= {introduce trivially true assertion}
  {∃x0 y0 • x = x0 ∧ y = y0} ; [x, y, r := x′, y′, r′ | r′ = x^y]
⊑ {introduce leading assignment}
  r := 1 ;
  {∃x0 y0 • x = x0 ∧ y = y0 ∧ r = 1} ; [x, y, r := x′, y′, r′ | r′ = x^y]
⊑ {specification-constant rule}
  r := 1 ;
  {x = x0 ∧ y = y0 ∧ r = 1} ; [x, y, r := x′, y′, r′ | r′ = x^y]
⊑ {introduce loop with invariant r · x^y = x0^y0, variant y}
  r := 1 ;
  while y > 0 do
    {y > 0 ∧ r · x^y = x0^y0} ;
    [x, y, r := x′, y′, r′ | r′ · x′^y′ = x0^y0 ∧ y′ < y]
  od

In the loop introduction, the entry condition for Theorem 21.14 is

x = x0 ∧ y = y0 ∧ r = 1 ⇒ r · x^y = x0^y0 ,

and the exit condition is

¬ (y > 0) ∧ r · x^y = x0^y0 ⇒ r = x0^y0 .

These are both easily verified. We now refine the body of the loop. Rather than decreasing y by 1 all the time (where we would use the property x^(y+1) = x · x^y), we use the following property:

x^(2·y) = (x · x)^y to halve y whenever possible. The derivation thus continues as follows (focusing on the loop body):

  {y > 0 ∧ r · x^y = x0^y0} ;
  [x, y, r := x′, y′, r′ | r′ · x′^y′ = x0^y0 ∧ y′ < y]
= {introduce conditional}
  if even.y then
    {even.y ∧ y > 0 ∧ r · x^y = x0^y0} ;
    [x, y, r := x′, y′, r′ | r′ · x′^y′ = x0^y0 ∧ y′ < y]
  else
    {odd.y ∧ y > 0 ∧ r · x^y = x0^y0} ;
    [x, y, r := x′, y′, r′ | r′ · x′^y′ = x0^y0 ∧ y′ < y]
  fi
⊑ {introduce assignment in both branches}
  if even.y then x, y := x · x, y/2 else y, r := y − 1, x · r fi

We leave it to the reader to verify that the assignment introductions are correct. Thus we have reached the following refinement of our initial specification:

  [x, y, r := x′, y′, r′ | r′ = x^y]
⊑ {above derivations, switch to guarded iteration}
  r := 1 ;
  while y > 0 do
    if even.y then x, y := x · x, y/2 else y, r := y − 1, x · r fi
  od
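The derived program can be transcribed directly into an executable form. The following Python sketch (our own; x0 and y0 stand for the specification constants) checks the loop invariant r · x^y = x0^y0 on every iteration.

```python
# The final program of the derivation transcribed into Python (our own sketch;
# x0 and y0 play the role of the specification constants).

def power(x0, y0):
    x, y, r = x0, y0, 1
    while y > 0:
        assert r * x ** y == x0 ** y0        # invariant from the loop introduction
        if y % 2 == 0:                       # even.y
            x, y = x * x, y // 2
        else:                                # odd.y
            y, r = y - 1, x * r
    assert r == x0 ** y0                     # exit condition
    return r

print([power(2, n) for n in range(6)])       # [1, 2, 4, 8, 16, 32]
```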

21.7 Summary and Discussion

The while loop has been part of predicate transformer reasoning since Dijkstra’s original work [53]. The weak iteration statement constructor is in many ways similar to the star operation of regular languages, which has been used to describe imperative programs by Backhouse and Carré [33]. However, in regular languages the choice operation has no zero element, but for statements we have abort ⊓ S = abort, so the algebra of regular expressions cannot be applied directly. Stefan Rönn [124] has also investigated the use of regular language algebra in program reasoning. Weak and strong iteration statements essentially similar to the ones described here were originally introduced in the refinement calculus (as action sequences) by Back [15] in connection with a study of refining the atomicity of parallel program constructs. They were later studied more carefully by Back [17, 18] and by Michael Butler and Morgan [44, 46]. The careful study of the homomorphism properties of weak and strong iteration is believed to be new. Dijkstra’s original definition of the loop semantics (which we will derive in Theorem 22.10) attracted a lot of attention, and a number of solutions to the problem of noncontinuous loops, unbounded nondeterminism, and weak termination were proposed. These include Back [10] and Boom [41]. The use of implicit fixed points and well-founded sets proved a simple and stable solution to the problem, as in Park [114] and Dijkstra and van Gasteren [55]. As an alternative, ordinals were used by Boom [41], Morris [108], and Back [12]. For a comprehensive study of the problems connected with lack of continuity (unbounded nondeterminism), see Apt and Plotkin [5].

21.8 Exercises

21.1 Recall that an iteration operation for relations was defined in Chapter 9. Show that the following homomorphism property holds:

[R]∗ = [R∗] .

21.2 Show the following properties for the weak iteration of an arbitrary statement S:

S ; S∗ ⊑ S∗ ; S ,
S∗ ; S∗ = S∗ ,
(S∗)∗ = S∗ .

21.3 Investigate to what extent the properties of weak iteration in Exercise 21.2 hold for strong iteration also.

21.4 Show that (x := x + 1)ω = abort and (x := x + 1)∗ = [x := x′ | x ≤ x′].

21.5 Show that the following characterizations of the infinite repetition statement are correct:

S∞ = Sω ; magic ,
S∞ = while true do S od .

21.6 Show that the infinite repetition construct preserves strictness.

21.7 Let predicate g and statement S be arbitrary. Show that

while g do S od = while g do S od ; {¬ g} .

21.8 Fill in the details in the proof of Theorem 21.6.

21.9 In Chapter 17 we introduced the problem of finding the smallest natural number not occurring in the array segment B[0..n − 1], assuming that all values in this array segment are in the range 0..m − 1. We derived the following:

  [x := x′ | ¬ elem.x′.B ∧ (∀y • ¬ elem.y.B ⇒ x′ ≤ y)]
⊑ begin var C : array Nat of Nat ;
    [C := C′ | (∀y • elem.y.B ≡ y < m ∧ C′[y])] ;
    [x := x′ | (x′ ≥ m ∨ ¬ C[x′]) ∧ (∀y • y ≥ m ∨ ¬ C[y] ⇒ x′ ≤ y)]
  end

Continue this derivation until the block body contains three loops: one to set all elements in C[0..m − 1] to F, one to make C[0..m − 1] represent the set of all elements in B[0..n − 1], and finally one to set x to the smallest index such that x ≥ m ∨ ¬ C[x].

21.10 Prove the following rule:

p {| S∞ |} q ≡ p {| Sω |} true .

Then use it to prove that p {| S∞ |} q holds if

(i) p ⊆ r, and

(ii) (∀w ∈ W • rw {| S |} r<w).

22 Continuity and Executable Statements

In Chapter 19, we used ordinals to express fixed points as limits. In practice, the natural numbers are sufficient as long as all statements involved are continuous. In this chapter we will study the notion of continuity more closely and show how fixed points and recursion are handled in the continuous case. Continuity is a homomorphism property. A predicate transformer is continuous if it preserves limits, and since limits are a special kind of join, continuity is in fact a restricted form of join homomorphism. We consider continuity of statements, identifying which statements are continuous and which constructors preserve con- tinuity. Continuity is important in computer science because it is connected with the notion of what can be computed in finite time from a finite amount of input information. We use continuity as a guiding principle in defining a subclass of statements that one can consider as executable. The executable statements that we define generalize the traditional guarded commands language to interactive statements, in which the user can carry on a continued dialogue with the computer.

22.1 Limits and Continuity

We first look at the algebraic notions of complete partial orders and continuity in general. In subsequent sections, we then investigate in more detail continuity properties of predicate transformers. Recall that a subset C of a poset (A, ⊑) is directed if every finite subset is bounded above and that the poset (A, ⊑) is a complete partial order (a cpo) if the limit

lim C (i.e., ⊔C) exists for every directed subset C of A. A monotonic function f : A → B, where A and B are complete partial orders, is continuous if it preserves limits. Note that a complete lattice is always a cpo. Any unordered set can be made into a cpo by adding a bottom element (such a cpo is called a flat cpo).

We write A →c B for the set of all continuous functions f ∈ A → B from a cpo A to a cpo B. When A is a type Σ, then this set is itself a cpo with the pointwise extended order. In general, A →c B need not be a cpo even if A and B are cpos. The reason is the same as for monotonic functions A →m B; antisymmetry need not hold.

Lemma 22.1 Let type Σ and set B be cpos. Then the pointwise extension of B to Σ →c B is also a cpo.

Since limits preserve continuity (Exercise 22.2), we have the following result.

Lemma 22.2 The cpo (Σ →c B, ⊑) is a subposet of (Σ →m B, ⊑). Furthermore, limits give the same result in (Σ →c B, ⊑) as in (Σ →m B, ⊑).

Fixed Points and Continuity For continuous functions f ∈ A → A, where A is a cpo, we can give an explicit characterization of the least fixed point µ. f using only natural numbers:

Lemma 22.3 Assume that f is a continuous function on a cpo A. Then f has a least fixed point µ. f . Furthermore, µ. f can be described as a limit of a countable set of approximations:

µ. f = (lim n ∈ Nat • f^n.⊥) .

For a proof of this lemma we refer to standard textbooks on order theory, such as [50]. Lemma 22.3 shows that if f is continuous, then we need not go beyond the first infinite ordinal; the least fixed point µ. f is the limit of a chain of approximations indexed by the natural numbers. For statements, Lemma 22.3 means that if (λX • S) is a continuous function, then the recursive statement (µ X • S) is the limit (lim n ∈ Nat • Sn), where S0 = abort and Sn+1 = S[X := Sn].
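The approximation sequence of Lemma 22.3 is easy to compute when the underlying lattice is finite. The following Python sketch (our own; the relation EDGES and the function f are arbitrary illustrations, not the book's) iterates f from the bottom element and prints the chain of approximations together with the least fixed point it converges to.

```python
# A minimal sketch (our own) of Lemma 22.3 on a finite lattice: the powerset of
# {0,...,7} ordered by inclusion is a cpo, every monotonic function on it is
# trivially continuous, and the least fixed point is reached after finitely
# many of the approximations f^n(bottom).

EDGES = {(0, 1), (1, 2), (2, 3), (5, 6)}        # an arbitrary example relation

def f(X):
    # f.X = {0} union successors of X ; monotonic, hence continuous here
    return frozenset({0}) | frozenset(b for (a, b) in EDGES if a in X)

def lfp_by_approximation(f):
    x, chain = frozenset(), []                   # bottom of the lattice
    while True:
        chain.append(x)
        nxt = f(x)
        if nxt == x:
            return x, chain
        x = nxt

fix, chain = lfp_by_approximation(f)
print([sorted(c) for c in chain])   # [[], [0], [0, 1], [0, 1, 2], [0, 1, 2, 3]]
print(sorted(fix))                  # [0, 1, 2, 3]: the least fixed point
```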

Colimits and Cocontinuity The dual notions of directedness and limits are codirectedness and colimits. Thus, C is codirected if 22.2. Continuity of Statements 367

(∀a, b ∈ C • ∃c ∈ C • c ⊑ a ∧ c ⊑ b), (codirectedness)

and the meet of a codirected set is called its colimit. We write colim.C for the colimit of the set C. A partially ordered set A is a dual cpo if every codirected set has a greatest lower bound, i.e., if all colimits exist in A.

A linearly ordered set (such as Nat) is an example of a set that is both directed and codirected. Using colimits, we can define cocontinuity (the dual notion of continuity). A func- tion f ∈ A → B is cocontinuous if it preserves colimits. The properties of cocontinuous functions on dual cpos are in every respect dual to the properties of continuous functions on cpos. Thus the greatest fixed point of a cocontinuous function on a dual cpo is the colimit

ν. f = (colim n ∈ Nat • f^n.⊤) .

In particular, a complete lattice is a dual cpo, so this construction can be used for the greatest fixed point of a cocontinuous function on a complete lattice.

22.2 Continuity of Statements

We shall now investigate which basic statements are continuous and which statement constructors preserve continuity. This allows us to identify statement subclasses that generate continuous statements.

Basic Statements Since limits are a special kind of join, universally disjunctive predicate transformers are always continuous (note that since the empty set is directed, a continuous predicate transformer is always strict). Hence, we get the following result as an immediate consequence of Theorem 16.1:

Corollary 22.4 The assert statement, the functional update, and the angelic update are all continuous.

The demonic update need not be continuous, however. As a counterexample, consider the demonic update

[x := x′ | T] .

This chaotic assignment assigns an arbitrary natural number to x. Define qi to be the predicate x < i, for i ∈ Nat. Then {qi | i ∈ Nat} is an ascending chain.

Consequently, {S.qi | i ∈ Nat} is also an ascending chain when S is monotonic. As an ascending chain is also a directed set, we have

  (lim i ∈ Nat • [x := x′ | T].qi )
= {definitions of demonic update and qi}
  (lim i ∈ Nat • (∀x′ • T ⇒ x′ < i))
= {reduction, order reasoning}
  (lim i ∈ Nat • F)
= {property of join}
  false

However, with S = [x := x′ | T],

  S.(lim i ∈ Nat • qi )
= {definition of qi}
  S.(lim i ∈ Nat • x < i)
= {pointwise extension}
  S.(∃i ∈ Nat • x < i)
= {Nat is upward unbounded}
  S.true
= {[x := x′ | T] is terminating}
  true

Thus, limits are not preserved, so statement S is not continuous. We can give the following very general conditions under which the demonic update is continuous. A relation R : Σ ↔ Γ is said to be image-finite if {γ | R.σ.γ} is finite for each σ : Σ.

Theorem 22.5 The demonic update [R] is continuous if and only if R is total and image-finite.

Proof First assume that R is total and image-finite. Let the directed set {qi | i ∈ I } and the state σ be arbitrary. If the set is empty, then its limit is false and [R].false = false holds when R is total (Exercise 22.3). Otherwise, define a new directed set {pi | i ∈ I } by

pi = qi ∩ R.σ

for all i ∈ I . Since R is image-finite, the set {pi | i ∈ I } can contain at most a finite number of distinct elements, which means that (lim i ∈ I • pi ) = pn for some n. Then

  (lim i ∈ I • pi ) = pn
⊢ [R].(lim i ∈ I • qi ).σ
≡ {definition of [R]}
  (∀γ • R.σ.γ ⇒ (lim i ∈ I • qi ).γ )
≡ {pointwise extension, predicate properties}
  (∀γ • R.σ.γ ⇒ (∃i ∈ I • qi .γ ∧ R.σ.γ ))
≡ {definition of pi , pointwise extension}
  (∀γ • R.σ.γ ⇒ (lim i ∈ I • pi ).γ )
≡ {assumption}
  (∀γ • R.σ.γ ⇒ pn.γ )
≡ {definition of pn}
  (∀γ • R.σ.γ ⇒ qn.γ ∧ R.σ.γ )
≡ {predicate calculus}
  (∀γ • R.σ.γ ⇒ qn.γ )
≡ {definition of [R]}
  [R].qn.σ
⇒ {∃ introduction, witness i := n}
  (∃i ∈ I • [R].qi .σ )
≡ {pointwise extension}
  (lim i ∈ I • [R].qi ).σ

which shows that [R].(lim i ∈ I • qi ) ⊆ (lim i ∈ I • [R].qi ). Since the reverse ordering follows by monotonicity, we have shown that if R is image-finite, then [R] is continuous. In order to prove the reverse implication, we now assume that R is not image-finite. In particular, let σ be arbitrary but fixed and assume that R.σ.γ holds exactly for a set of distinct states γ ∈ {γi | i ∈ Nat}. (This argument assumes that R.σ is countable; otherwise we have to partition R.σ into a countable number of nonempty subsets and change the argument accordingly.) Define the ascending chain {qi | i ∈ Nat} as follows:

qi = {γj | j < i} .

Then qi is a proper subset of R.σ for all i. We have

  (lim i ∈ Nat • [R].qi ).σ
≡ {pointwise extension}
  (∃i ∈ Nat • [R].qi .σ )
≡ {definition of demonic update}
  (∃i ∈ Nat • R.σ ⊆ qi )
≡ {qi is a proper subset of R.σ for all i}
  F

However,

  [R].(lim i ∈ Nat • qi ).σ
≡ {definition of qi}
  [R].(R.σ ).σ
≡ {definition of [R]}
  (∀γ • R.σ.γ ⇒ R.σ.γ )
≡ {predicate calculus}
  T

which shows that [R] cannot be continuous. This shows that if [R] is continuous, then R must be image-finite. 2

A basic constraint that we would expect any computing machine to satisfy is that it should not be able to choose among an infinite number of alternatives in finite time. If the machine attempts to make such a choice, then either the execution might not terminate, because it has to spend some minimum amount of time on each case, or else it systematically considers only a finite subset of the possible choices. In either case, the machine would be guaranteed to terminate when executing the demonic update [R] only if R is image-finite. In this way, continuity and executability are linked, with continuity being a necessary (but not sufficient) condition for executability. A more detailed discussion on executability, or computability, is outside the scope of this book.

Statement Constructors

We shall now see that both sequential composition and joins preserve continuity, while meets in general do not. However, meets preserve continuity under the same condition as the demonic update statement does, i.e., when the choice is finite.

Theorem 22.6 Sequential composition and joins of predicate transformers preserve continuity. Furthermore, S1 ⊓ S2 is continuous if both S1 and S2 are continuous.

Proof For finite meet, assume that S1 and S2 are continuous and let {qi | i ∈ I } be a directed set of predicates. Then

  (S1 ⊓ S2).(lim i ∈ I • qi ).σ
≡ {definition of meet}
  S1.(lim i ∈ I • qi ).σ ∧ S2.(lim j ∈ I • qj ).σ
≡ {continuity, predicate calculus}
  (∃i ∈ I • S1.qi .σ ) ∧ (∃ j ∈ I • S2.qj .σ )
≡ {for ⇒ choose k with qi ⊆ qk and qj ⊆ qk ; for ⇐ choose i = j = k}
  (∃k ∈ I • S1.qk .σ ∧ S2.qk .σ )

       is lim          preserves lim
{p}    yes             ;    yes
[p]    no              ⊓    yes¹
⟨f⟩    yes             ⊔    yes
{R}    yes             ¬    no
[R]    yes²
¹ meet is finite
² R total, image-finite

TABLE 22.1. Continuity of statements

≡ {definition of meet}
  (lim k ∈ I • (S1 ⊓ S2).qk ).σ

which shows that S1 ⊓ S2 is continuous. The proofs for sequential composition and join are left to the reader as exercises (Exercise 22.4). 2

Infinite meets do not necessarily preserve continuity. As a counterexample, consider statements Si = (x := i) for all i ∈ Nat. Each of these statements is universally disjunctive and hence also continuous. However, their meet is the chaotic assignment [x := x′ | T], which was already shown to be noncontinuous. We can summarize our investigation of continuity by adding a column that extends the tables of homomorphism properties given earlier. In Table 22.1 we state which homomorphic properties the constructs have and preserve.

22.3 Continuity of Derived Statements

We can consider a limit as a new statement constructor. Since limits are a restricted form of joins, we get some homomorphism properties as immediate consequences of Theorem 16.2. Theorem 22.7 Limits of predicate transformers preserve monotonicity, strictness, disjunctivity, and continuity. Proof The preservation of monotonicity, strictness, and disjunctivity follows directly from Theorem 16.2. For continuity, we leave it as an exercise for the reader to prove the general result that the limit of a set of continuous functions is a continuous function (Exercise 22.2). 2

The following counterexample shows that limits do not preserve conjunctivity (it is derived from the counterexample for least fixed points given in Section 19.3). Define a chain {Si : Unit ↦ Nat | i ∈ Nat} of conjunctive predicate transformers by Si = [λ u x • x ≥ i]. The limit of this chain is the predicate transformer choose ; [λ n x • x ≥ n], which is not conjunctive.

preserves   ⊑    ⊥    ⊤    ⊓    ⊔    ¬    lim
lim         yes  yes  no   no   yes  no   yes

TABLE 22.2. Homomorphic properties preserved by limits

Table 22.2 presents a row for limit that can be added to Table 16.2 of homomorphism properties. We now consider to what extent conditionals, iterations, and loops preserve continuity.

Theorem 22.8 Both the deterministic and the nondeterministic conditional preserve continuity.

Proof The result follows immediately by rewriting the conditionals using asserts, guards, and sequential composition and applying Corollary 22.4 and Theorem 22.6. 2

Next, consider iterations and loops. Strong iteration, infinite repetition, while loops, and guarded iteration can all be defined as least fixed points, and hence as limits. Thus they should all preserve continuity.

Theorem 22.9 Strong iteration, infinite repetition, while loops, and guarded iteration all preserve continuity.

Proof We outline the proof for strong iteration; the other proofs are similar. Assume that S is a continuous predicate transformer. Then Sω is the limit of an ordinal-indexed sequence of approximations starting from abort. By ordinal induction, it is easily shown that every approximation in this sequence is continuous. For the base case, we know that abort is continuous. For the step case, it follows from earlier results that S ; X ⊓ skip is continuous if S is continuous. Finally, the limit case follows from Theorem 22.7. 2

As a counterexample showing that weak iteration does not preserve continuity, consider the statement x := x + 1. This statement is continuous, but (x := x + 1)∗ is [x := x′ | x′ ≥ x], which is not continuous. Table 22.3 summarizes continuity preservation of derived statements. Let us finally consider loops in the special case of continuity. Assume that predicate transformer S is continuous. Then the semantics of the loop while g do S od can be described using the limit of a countable chain of predicates:

Theorem 22.10 Define a sequence of functions {Hn | n ∈ Nat} recursively as follows:

H0.g.S.q = false ,
Hn+1.g.S.q = (¬ g ∩ q) ∪ (g ∩ S.(Hn.g.S.q)) .

Then for continuous predicate transformer S and arbitrary predicates g and q,

preserves lim
(λS1 S2 • if g then S1 else S2 fi)                  yes
(λS1 ··· Sn • if g1 → S1 [] ··· [] gn → Sn fi)      yes
(λS • Sω)                                           yes
(λS • S∗)                                           no
(λS • S∞)                                           yes
(λS • while g do S od)                              yes
(λS1 ··· Sn • do g1 → S1 [] ··· [] gn → Sn od)      yes

TABLE 22.3. Continuity preservation of derived statements

(while g do S od).q = (∃n • Hn.g.S.q) .

Proof First, we need to show that if S is continuous, then the function

(λX • if g then S ; X else skip fi)

is continuous. This follows from the fact that skip is continuous and that sequential and conditional composition both preserve continuity. Next, we use Lemma 22.3 to construct the sequence

S0 = abort , Sn+1 = if g then S ; Sn else skip fi ,

for which while g do S od = (lim n ∈ Nat • Sn). It is now possible to show (by induction on n, Exercise 22.8) that Sn.q = Hn.g.S.q for all n, and from the definition of limits it follows that (lim n ∈ Nat • Sn).q.σ ≡ (∃n • Hn.g.S.q.σ ). 2
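For a continuous loop the chain Hn of Theorem 22.10 can be computed explicitly. The following Python sketch (our own construction over a finite state space; all names are ours) does so for while x > 0 do x := x − 1 od with postcondition x = 0 and shows that the chain stabilizes at the loop's weakest precondition.

```python
# A small illustration (our own sketch, finite state space {0,...,N}) of the
# chain H_n from Theorem 22.10 for the continuous loop
# while x > 0 do x := x - 1 od with postcondition q = (x = 0).

N = 6
STATES = frozenset(range(N + 1))
g = frozenset(s for s in STATES if s > 0)                          # guard x > 0
S = lambda p: frozenset(s for s in STATES if max(s - 1, 0) in p)   # wp of x := x - 1
q = frozenset({0})

H = frozenset()                                                    # H_0.g.S.q = false
chain = [H]
while True:
    H_next = ((STATES - g) & q) | (g & S(H))                       # H_{n+1} = (~g /\ q) \/ (g /\ S.H_n)
    if H_next == H:
        break
    H = H_next
    chain.append(H)

print([sorted(h) for h in chain])
# [[], [0], [0, 1], ..., [0, 1, 2, 3, 4, 5, 6]]: the loop's weakest precondition
# for q is the limit of the chain, which here holds in every state.
```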

22.4 Executable Statements

Program derivation can be seen as taking us from an initial specification through a series of intermediate steps to a final program. We have not until now stated what is an acceptable form for the final program, particularly if we assume that it is to be executed by a computer. For execution on a real computer, our final programs have to be written in a real programming language with a real compiler for the computer in question. For our purposes here, this is too demanding. Instead, we will look at how to put together a simple, yet very powerful, programming language from the constructs that we already have defined. The requirements for the statements in this language are that they be continuous and that there be a standard way of executing these statements on a computer.

We will enlarge the notion of execution of a program by considering not only the computer that executes the program but also the user of the program, who executes the program in order to achieve some specific goals. The user takes on the role of the angel, while the computer has the role of the demon.

Extended Guarded Commands Language

Dijkstra’s guarded commands language forms a suitable basis for our programming language. We define the basic guarded commands as statements built by the following syntax.

S ::= abort , (abortion)
  | skip , (empty statement)
  | x := e , (multiple assignment)
  | S1 ; S2 , (sequential composition)
  | if g1 → S1 [] ... [] gn → Sn fi , (conditional composition)
  | do g1 → S1 [] ... [] gn → Sn od . (iterative composition)

Here x = x1, . . . , xm is a list of program variables, e = e1, . . . , em a corresponding list of expressions, and g1, . . . , gn boolean expressions. We assume that the standard types of higher-order logic are available, in particular truth values, natural numbers, and integers, as well as records and arrays. Let us extend this simple language by permitting general assert statements, blocks with local variables, and procedure definitions and calls:

S ::= ...
  | {b} , (assert)
  | begin var x := e ; S end , (block)
  | proc P(var x, val y) = S1 in S2 , (procedure)
  | P(x, e) . (procedure call)

Here x and y are lists of program variables, and e is a list of expressions. The procedure declaration may be recursive.

The basic statements (abort, the assert statement, skip, and the assignment statement) are all conjunctive and continuous predicate transformers (and hence also monotonic and strict). Sequential composition preserves both conjunctivity and continuity, and so do conditional and iterative composition (Exercises 22.7 and 22.9) and blocks. Procedure calls are conjunctive and continuous if the procedure bodies are. We consider the extended guarded command language as a small executable programming language (assuming that the expressions e in assignments and the guards g in assertions, conditionals, and iterations can be evaluated). Hence, these statements form a suitable goal for a program derivation. The demonic nondeterminism inherent in the conditional and iteration constructs can be understood as underspecification and may in actual computations be implemented by a deterministic choice among the enabled alternatives. As before, the syntax for the extended guarded commands language is not intended to define a new syntactic category but serves only to identify a certain subclass of statements. Hence, we may mix guarded commands with arbitrary statements, e.g., using demonic updates or joins of guarded commands as subcomponents. These constructs all have a well-defined meaning as monotonic predicate transformers.
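To make the executable reading of these constructs tangible, here is a compact Python sketch of our own (not the book's notation and not a standard library): each construct is a combinator on weakest-precondition functions over a single variable x ranging over a small finite set, with guarded iteration computed through the fixed point of Lemma 21.9.

```python
# A compact sketch (our own construction) of the core guarded commands as
# weakest-precondition combinators over one variable x in {0, ..., N}.

N = 7
STATES = frozenset(range(N + 1))

skip = lambda q: q
abort = lambda q: frozenset()

def assert_(b):                          # {b}
    return lambda q: frozenset(s for s in STATES if b(s)) & q

def assign(e):                           # x := e(x); values outside the model act like abort
    return lambda q: frozenset(s for s in STATES if e(s) in q)

def seq(S1, S2):                         # S1 ; S2
    return lambda q: S1(S2(q))

def if_fi(*branches):                    # if g1 -> S1 [] ... [] gn -> Sn fi
    def wp(q):
        enabled = frozenset(s for s in STATES if any(g(s) for g, _ in branches))
        result = enabled
        for g, S in branches:
            gp = frozenset(s for s in STATES if g(s))
            result = result & ((STATES - gp) | S(q))
        return result
    return wp

def do_od(*branches):                    # guarded iteration, via the fixed point of Lemma 21.9
    def wp(q):
        g = frozenset(s for s in STATES if any(gi(s) for gi, _ in branches))
        body = if_fi(*branches)
        x = frozenset()
        while True:
            nxt = (g & body(x)) | ((STATES - g) & q)
            if nxt == x:
                return x
            x = nxt
    return wp

# Example: do x > 0 -> x := x - 1 od establishes x = 0 from every state in the model.
countdown = do_od((lambda s: s > 0, assign(lambda s: s - 1)))
print(sorted(countdown(frozenset({0}))))    # [0, 1, ..., 7]
```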

Executing Guarded Commands

The extended guarded commands describe a batch-oriented programming language in which the user can initiate a computation but thereafter can only wait to see what will be the result of the computation. The computer will do all the work and make all the choices. Consider first the execution of the abort statement. We assume that there is no information on how the computer executes this statement. Different computers may execute it in different ways, and the same computer need not execute it in the same way each time. As nothing is known about the execution of this statement, we can also say nothing about the effect that executing it has. In programming language terminology, abort is an undefined statement. It may have some effect, but we do not know what it is and we have no way of finding out what it does. It does not even have to terminate. Hence, there is no postcondition that we can be sure of achieving by executing the abort statement. The skip and assignment statements are executed by the computer as expected, the first having no effect at all on the present state, the other changing it by updating the state as indicated by the assignment. The sequential composition S1 ; S2 is also executed as expected, first executing S1 and then continuing by executing S2. The conditional statement is executed by first evaluating the guards and then choosing one of the statements for which the guard is true in the present state and executing this statement. If two or more guards are true in the state, then one of these is chosen, but we do not know which one. Different computers may choose in different ways. Thus, if we want to be certain to achieve some postcondition, we have to make certain that the postcondition is achieved no matter how the computer resolves the nondeterministic choice. The block statement is executed by allocating the local variables on a stack. Thus, the state space is enlarged so that it also supports the new program variables. The local variables are removed from the stack on block exit. The procedure call is also executed in the standard way. In all these cases, it is easy to see that the game interpretation and the standard implementation of these statements on a computer give the same result when one considers whether a certain postcondition can be reached with certainty in finite time.

Consider finally the iteration statement do g1 → S1 [] ··· [] gn → Sn od, assuming that the component statements are all continuous and conjunctive. This statement is equivalent to

(⊔ i ∈ Nat • Doi )

(we write a join rather than a limit here, to emphasize that we have an angelic choice). Here

Do0 = abort ,
Doi+1 = [g1] ; S1 ; Doi ⊓ ··· ⊓ [gn] ; Sn ; Doi ⊓ [¬ (g1 ∪ ··· ∪ gn)] .

In the game semantics, the angelic choice is executed by the angel. If there is an i ≥ 0 such that Doi .true holds in the current state, then we have that Dom = Doi for each m ≥ i. The angel chooses an alternative Doi for which Doi .true holds in the current state. By the equivalence, it does not matter which alternative satisfying this condition is chosen, as all have the same effect. If there is no alternative for which this condition holds, then the angel has to choose one alternative nonetheless, but will lose the game eventually because the demon can play the game so that the abort statement inside the chosen Doi is eventually reached. After the choice, the game continues with playing the selected Doi . The notion of iteration in the execution of the iteration statement arises from the way in which we suppose that the computer simulates the angel in choosing the right alternative. We assume that the computer scans through the alternatives Do0, Do1, Do2,... in this order, until it finds a statement Don that satisfies the required condition. If such a statement is found, then execution continues with it. If, however, there is no such statement, then the scanning of the alternatives never terminates, because there are infinitely many of these. Hence, we get an infinite execution. The angel, on the other hand, can with infinite foresight immediately pick the right number of iterations and so chooses the right approximation directly. The different ways in which the computer and the angel execute a loop are illustrated in Figure 22.1. The computer execution of iteration is closely related to the game interpretation. The computation can be infinite (nonterminating), but it is always finitely wide for continuous statements (i.e., there is never a choice of more than a finite number of alternatives). In the game interpretation, there are no infinite computations, but there may be infinitely wide choices. If these choices are scanned in some order one at a time, then the infinitely wide choice point becomes a possible infinite computation. Even though the user plays the role of the angel and in principle should choose the right number of iterations, this is not necessary, because there is at most one right choice. If there is a number of iterations n that is sufficient for termination, then the computer can choose the right alternative Don by scanning through the alternatives, and thus do the job of the user. If no n is sufficiently large, then the computer will


FIGURE 22.1. Iteration versus infinite angelic choice

be stuck in an infinite loop. The result is then the same; i.e., no postcondition is established.

22.5 Interactive Guarded Commands

The extended guarded commands defined above are essentially batch oriented, as there is no real place for user interaction. By adding angelic statements to the language, we can make it into an interactive programming language. Let us extend the language as follows:

S ::= ...
  | {x := x′ | b} , (input statement)
  | if g1 :: S1 [] ··· [] gn :: Sn fi , (conditional choice)
  | do g1 :: S1 [] ··· [] gn :: Sn od . (iterative choice)

The input statement is the angelic assignment statement. We interpret {x := x′ | b} as a request to the user to supply a new value x′ for the program variable x. This new value must satisfy the condition b in the current state. The conditional choice construct was already mentioned in the introduction:

if g1 :: S1 [] ··· [] gn :: Sn fi ≙ {g1} ; S1 ⊔ ··· ⊔ {gn} ; Sn .

The effect of the conditional choice is to let the user choose one of the alternatives Si for which the enabledness condition gi is true in the current state. The iterative choice was also mentioned in the introduction, as describing an event loop:

do g1 :: S1 [] ··· [] gn :: Sn od ≙ (µ X • {g1} ; S1 ; X ⊔ ··· ⊔ {gn} ; Sn ; X ⊔ skip) .

The effect of the iterative choice is to let the user repeatedly choose an alternative that is enabled and have it executed until the user decides to stop. Using distributivity, the iterative choice can be written in the form (µ X • S ; X ⊔ skip) (note that this is the dual construct of weak iteration). We have

(µ X • S ; X ⊔ skip).q = (µ x • S.x ∪ q)

by a similar argument as in the proof of Lemma 21.1. The interactive guarded commands are all continuous, but they need not be conjunctive. Interactive guarded commands extend the guarded commands language further, so that ongoing interaction between the user and the computer can be described. The user may, for instance, give the input data as needed but not all at once as a batch-oriented execution requires. The user may determine the next input based on the decisions the computer has taken before. Also, the user may use alternative commands as menu choices, picking the alternatives that will further the goals that he or she is presently pursuing. In particular, this means that the same interactive guarded command can be used for different purposes on different occasions. The following example illustrates the interactive iterative choice:

S = do z ≥ 2 :: z := z − 2 [] z ≥ 5 :: z := z − 5 od .

Intuitively, this is executed as follows. On every iteration, the user can choose to decrease z by 2 or 5, if possible, or to stop. The predicate S.(z = 0) holds in those states from which it is possible to make a sequence of choices such that z = 0 holds when we choose to stop. An intuitive argument indicates that we should get S.(z = 0) = (z = 0 ∨ z = 2 ∨ z ≥ 4). We have

S.(z = 0) = (µ x • ({z ≥ 2} ; z := z − 2 ⊔ {z ≥ 5} ; z := z − 5).x ∨ (z = 0)) .

When we compute approximations pi for this least fixed point, we find

p0 = false ,
p1 = (z = 0),
p2 = (z = 0 ∨ z = 2 ∨ z = 5),
p3 = (z = 0 ∨ z = 2 ∨ z = 4 ∨ z = 5 ∨ z = 7 ∨ z = 10),
p4 = (z = 0 ∨ z = 2 ∨ 4 ≤ z ≤ 7 ∨ z = 9 ∨ z = 10 ∨ z = 12 ∨ z = 15),

and so on. Every approximation pn holds in those states from which it is possible to reach z = 0 in fewer than n steps by making suitable choices. We show the calculation of p2:

  p2
= {definition of approximation}
  ({z ≥ 2} ; z := z − 2 ⊔ {z ≥ 5} ; z := z − 5).p1 ∨ (z = 0)
= {assumption p1 = (z = 0), statement definitions}
  (z ≥ 2 ∧ z − 2 = 0) ∨ (z ≥ 5 ∧ z − 5 = 0) ∨ (z = 0)
= {arithmetic}
  z = 2 ∨ z = 5 ∨ z = 0

Every new approximation pn+1 adds disjuncts z = z0 + 2 and z = z0 + 5 if z = z0 is a disjunct of pn. By induction and by computing limits, we find that (∪ i ∈ Nat • pi ) = (z = 0 ∨ z = 2 ∨ z ≥ 4). This can be seen as a (computational) proof of the fact that any natural number greater than or equal to 4 can be written in the form 2x + 5y for suitable natural numbers x and y.
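The approximations pn can also be computed mechanically. The following Python sketch (our own; the bound on z is an artefact of the finite model and does not affect membership below it) iterates the fixed point (µ x • S.x ∪ q) for the statement above and reproduces p2 and the limit.

```python
# A bounded computation (our own sketch; z restricted to 0..BOUND) of the
# approximations p_n above, using (mu x . S.x \/ q) with the statement's wp.

BOUND = 40
STATES = frozenset(range(BOUND + 1))
q = frozenset({0})                                    # postcondition z = 0

def angelic_step(p):
    # wp of  {z >= 2} ; z := z - 2  joined with  {z >= 5} ; z := z - 5
    return frozenset(z for z in STATES
                     if (z >= 2 and z - 2 in p) or (z >= 5 and z - 5 in p))

p = frozenset()
chain = [p]                                           # p_0 = false
while True:
    nxt = angelic_step(p) | q                         # p_{n+1} = S.p_n \/ (z = 0)
    if nxt == p:
        break
    p = nxt
    chain.append(p)

print(sorted(chain[2]))    # p_2: [0, 2, 5]
print(sorted(p))           # the limit: 0, 2, and every z >= 4 up to BOUND
```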

General Specification Language The extensions to the language of guarded commands that we have defined above essentially include all statement constructs we have defined earlier, except magic and guard statements, demonic assignment, and arbitrary indexed meets and joins. We could even add finite meets and arbitrary indexed joins, and the language would still be continuous. For practical purposes, the role of the former is handled by conditional composition and the role of the latter by recursion and iterative choice. Furthermore, arbitrary joins are not finitary, so we prefer not to consider them as executable statements. We could also include finite demonic assignment, but the problem there is that it may not be possible to decide whether an assignment is finite or not. This justifies excluding demonic assignments from the class of executable statements.

Guards, magic, and demonic assignments may be used in specification statements in program derivations. Let us extend our programming language with these constructs to get a specification language:

S ::= ...
  | magic , (magic)
  | [b] , (guard statement)
  | [x := x′ | b] . (demonic assignment)

Statements in the specification language are not necessarily continuous, but they are monotonic. Hence, we are free to replace any component statement with its refinement, preserving correctness of the whole statement. We have shown earlier how to use specification statements in program derivations making essential use of this monotonicity property. We will analyze specification statements in more detail in Chapter 27.

FIGURE 22.2. Guarded commands extensions: the extended guarded commands are contained in the interactive commands, which are contained in the specification commands, which in turn are contained in the class of all statements.

The languages defined here form a hierarchy as shown in Figure 22.2. Note that specifying programs by pre- and postconditions is not sufficient anymore for interactive guarded commands. For extended guarded commands they are sufficient, as will be shown in Chapter 26 when we study normal forms for different classes of statements. However, when the programs may involve nontrivial user interaction, then we need more expressive specifications for programs.

22.6 Summary and Discussion

In this chapter we have studied continuous statements more carefully. We have noted the central role that finiteness plays for continuity. We have also identified the homomorphism properties of continuous statements. Continuity as a general property of cpos and lattices is described in more detail in Davey and Priestley [50]. The connection between continuity and fixed points as countable limits has a long tradition in denotational semantics. Models of programs have often restricted attention to continuous semantics. Continuity has been used to model implementability; noncontinuous constructs can generally not be implemented on a computer. Continuity is also a requirement for the solution of recursive domain equations, which appear in most theories of denotational semantics (see Stoy [134] or Schmidt [125]). In Dijkstra’s original weakest precondition calculus, continuity was identified as one of the “healthiness conditions” that all programs were assumed to satisfy [53]. This made it possible to define the semantics of iteration using limits of chains indexed by the natural numbers. Back introduced a noncontinuous nondeterministic

assignment [6] and used infinitary logic to reason about programs [10]. A number of other authors also challenged Dijkstra's definition of the semantics of loops; they used either abstract fixed-point properties (Hehner [74]) or limits of ordinal-indexed chains (Boom [41]).

In general, one could argue that continuity is much overrated as a semantic restriction. Most arguments go through quite easily without the continuity assumption. In general, our view is that complete lattices with monotonic functions provide a much better foundation (both easier and more intuitive) for programming language semantics than cpos with continuous functions. It is interesting to note that Scott's original formulation of the fixed-point theory for program semantics did use complete lattices (and continuous functions). The complete lattices were dropped because a cpo was a weaker structure that was sufficient for the theory, and the miraculous top element was difficult to justify in the context of a semantics that was essentially concerned only with partial functional statements. In the more general framework of the refinement calculus, the complete lattices are well justified, and the mathematical framework is simpler and more regular. Continuity can be assumed whenever needed, as this chapter has tried to show, but there is no point in restricting the mathematical treatment to only continuous statements.

We have looked at the mathematical properties of continuous statements and have also defined a simple yet very expressive programming language that can be considered as executable. The extended guarded commands add some features to the traditional guarded commands, like blocks with local variables and procedures. The interactive guarded commands are a real strengthening of the expressive power of the language. They allow us to describe permitted user interaction with a computer and also to analyze what the user can achieve through this interaction. The interactive guarded commands are continuous but not conjunctive, and can be seen as executable if a wider interpretation of this term is used. The analogue here is computability with an oracle, as studied in computability theory.

22.7 Exercises

22.1 Show that if f is a monotonic function and K is a directed set, then im. f.K is also directed.

22.2 Show that limits preserve continuity; i.e., if F is a directed set of continuous functions on a cpo, then (⊔ f ∈ F • f ) is continuous.

22.3 Show that if the relation R is total, then [R].false = false.

22.4 Complete the proof of Theorem 22.6.

22.5 Continuity of a function f from a cpo (A, ⊑) to a cpo (B, ⊑) is sometimes defined using chains indexed by natural numbers (ω-chains) rather than directed sets. We say that f is ω-continuous if it preserves limits of ω-chains, i.e., if

  f.(⊔ n ∈ Nat • xn) = (⊔ n ∈ Nat • f.xn)

whenever x1 ⊑ x2 ⊑ ··· . Show that

(a) If f is continuous then it is ω-continuous.

(b) If f is strict and ω-continuous and A and B are countable sets, then f is continuous. Hint: a set A is countable if there exists a bijection (a function that is one-to-one and onto) from Nat to A.

22.6 Let f be the function on predicates over Nat defined by

  f.q = (λx : Nat • x = 0 ∨ q.(x − 1)) .

Show that f is continuous and compute µ. f . Furthermore, show that f is cocontinuous (in fact, f preserves arbitrary meets) and compute ν. f .

22.7 Show that nondeterministic conditional composition preserves continuity.

22.8 Fill in the details of Theorem 22.10.

22.9 Show that if S is continuous, then S^ω and while g do S od are continuous.

23. Working with Arrays

In this chapter we look at how to construct programs that work on more advanced data structures, concentrating in particular on arrays. No new theory needs to be introduced for this purpose, as all the tools needed have already been built. We illustrate array manipulation techniques by constructing some searching and sorting algorithms. The case studies in this chapter also give us the opportunity to show how to structure somewhat larger program derivations.

23.1 Resetting an Array

Let us start with a very simple problem: how to set all elements in an array A : array Nat of Nat to zero in the range m..n (recall that m..n is an abbreviation for the set { j | m ≤ j ≤ n}). This is done in the obvious way, by scanning through the elements in A from m to n and setting each element to zero. We assume that m ≤ n, so that the range in question is nonempty. We show how to derive this solution in full detail. The other algorithms will use the same pattern of reasoning, so it is good to look at this first one in a very simple context. Let us define

  allreset.A.m.n =∧ (∀i | m ≤ i ≤ n • A[i] = 0) .

Our task is to find an implementation for the specification

  [A := A' | allreset.A'.m.n] .

Note that if n < m, then this specification is equivalent to [A := A' | T], since allreset.A'.m.n is then trivially satisfied.

Introduce Loop

We start by introducing a new local variable that we will use as a loop counter:

  [A := A' | allreset.A'.m.n]
⊑ {introduce local variable j}
  begin var j : Nat ; j := m ;
    “{ j = m} ; [A, j := A', j' | allreset.A'.m.n]”
  end

Next, we introduce the loop itself. The loop invariant is in this case very simple. It states that the initial part of the array, up to j − 1, has already been set to zero. In addition, it must state that the termination function is bounded from below (in this case by zero). We choose n + 1 − j as the termination function. Let us introduce an abbreviation for the invariant:

  resetbelow.A. j.m.n =∧ (∀i | m ≤ i < j • A[i] = 0) ∧ j ≤ n + 1 .

We have that

  { j = m} ; [A, j := A', j' | allreset.A'.m.n]
⊑ {introduce loop, choose n + 1 − j as termination function}
  while j ≤ n do
    “{resetbelow.A. j.m.n ∧ j ≤ n} ;
     [A, j := A', j' | resetbelow.A'. j'.m.n ∧ n + 1 − j' < n + 1 − j]”
  od

We need to check that the conditions for loop introduction are satisfied:

(i) j = m ⇒ resetbelow.A. j.m.n ,
(ii) resetbelow.A. j.m.n ∧ j > n ⇒ allreset.A.m.n .

The initialization condition holds, because

  j = m
⊢ (∀i | m ≤ i < j • A[i] = 0) ∧ j ≤ n + 1
≡ {assumption}
  (∀i | m ≤ i < m • A[i] = 0) ∧ m ≤ n + 1
≡ {empty range, m ≤ n}
  T

The exit condition again holds, because

  (∀i | m ≤ i < j • A[i] = 0) ∧ j ≤ n + 1 ∧ j > n
⊢ (∀i | m ≤ i ≤ n • A[i] = 0)
≡ {assumption implies j = n + 1}
  (∀i | m ≤ i < j • A[i] = 0)
≡ {assumption}
  T

Introduce Assignment

We then implement the loop body. This is now done with a single assignment statement, as follows:

  {resetbelow.A. j.m.n ∧ j ≤ n} ;
  [A, j := A', j' | resetbelow.A'. j'.m.n ∧ n + 1 − j' < n + 1 − j]
⊑ {introduce assignment statement}
  A[ j], j := 0, j + 1

We have to show that the assignment introduction is correctly done:

  resetbelow.A. j.m.n ∧ j ≤ n ⇒
    resetbelow.A'. j'.m.n ∧ n + 1 − j' < n + 1 − j ,

where

  A' = A( j ← 0) ,  j' = j + 1 .

We have

  (∀i | m ≤ i < j • A[i] = 0) ∧ j ≤ n + 1 ∧ j ≤ n
⊢ (∀i | m ≤ i < j + 1 • A( j ← 0)[i] = 0) ∧ j + 1 ≤ n + 1 ∧
    n + 1 − ( j + 1) < n + 1 − j

Note how the array property is proved by splitting the range of the universal quantifier into two parts, one for the index being changed (here j) and the other for the other index values, which are not changed by the array assignment.

Final Implementation

Let us define the procedure

  proc Reset(var A, val m, n) =
    begin var j : Nat ;
      j := m ;
      while j ≤ n do A[ j], j := 0, j + 1 od
    end

This procedure is the solution to our programming problem. We have that

  [A := A' | allreset.A'.m.n] ⊑ Reset(A, m, n)

for any program variable A and any natural number expressions m and n.
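As a sanity check, the derived program can be transcribed directly into an executable language. The Python version below is our own transcription, not part of the derivation: a list models the array, the bounds m..n are inclusive as above, and the function name reset is ours.

def reset(A, m, n):
    # Transcription of Reset(var A, val m, n): set A[m..n] to zero
    # by scanning with the loop counter j, exactly as in the derived loop.
    j = m
    while j <= n:            # invariant: A[m..j-1] is all zeros and j <= n + 1
        A[j] = 0
        j = j + 1

A = [7, 3, 9, 4, 1, 6]
reset(A, 1, 4)
print(A)                     # [7, 0, 0, 0, 0, 6]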

23.2 Linear Search

Assume that we want to find an index i pointing to the smallest element in the array segment A[m..n], where m ≤ n (so the segment is nonempty). We do this with a linear search, where i is initially equal to m and is updated whenever a value smaller than A[i] is found. We consider A : array Nat of Nat and m, n : Nat to be constants with m ≤ n, while i : Nat is a global program variable. The local variable k is used to index array values during the search. We define minat.i.A.m.n to hold when the minimum value in A[m..n] is found at A[i]:

  minat.i.A.m.n =∧ m ≤ i ≤ n ∧ (∀ j | m ≤ j ≤ n • A[i] ≤ A[ j]) .

Then

  {m ≤ n} ; [i := i' | minat.i'.A.m.n]
⊑ {introduce local variable k}
  begin var k := m + 1 ;
    “{m ≤ n ∧ k = m + 1} ; [i, k := i', k' | minat.i'.A.m.n]”
  end
⊑ {add initial assignment, use assumption m ≤ n}

  ··· i := m ; “{i = m ∧ k = i + 1 ∧ i ≤ n} ;
        [i, k := i', k' | minat.i'.A.m.n]” ···
⊑ {introduce loop}
  ··· while k ≤ n do
        “{k ≤ n ∧ k ≤ n + 1 ∧ minat.i.A.m.(k − 1)} ;
         [i, k := i', k' | k' ≤ n + 1 ∧ minat.i'.A.m.(k' − 1) ∧
            n + 1 − k' < n + 1 − k]”
      od ···
⊑ {simplify condition in assertion}
  ··· {k ≤ n ∧ minat.i.A.m.(k − 1)} ;
      [i, k := i', k' | k' ≤ n + 1 ∧ minat.i'.A.m.(k' − 1) ∧
         n + 1 − k' < n + 1 − k] ···
⊑ {introduce conditional}
  if k ≤ n ∧ A[k] < A[i] →
       {A[k] < A[i] ∧ k ≤ n ∧ minat.i.A.m.(k − 1)} ;
       [i, k := i', k' | k' ≤ n + 1 ∧ minat.i'.A.m.(k' − 1) ∧
          n + 1 − k' < n + 1 − k]
  [] k ≤ n ∧ A[k] ≥ A[i] →
       {A[k] ≥ A[i] ∧ k ≤ n ∧ minat.i.A.m.(k − 1)} ;
       [i, k := i', k' | k' ≤ n + 1 ∧ minat.i'.A.m.(k' − 1) ∧
          n + 1 − k' < n + 1 − k]
  fi
⊑ {introduce assignments}
  if k ≤ n ∧ A[k] < A[i] → i, k := k, k + 1
  [] k ≤ n ∧ A[k] ≥ A[i] → k := k + 1
  fi
= {substitute back, switch to guarded iteration}
  begin var k := m + 1 ; i := m ;
    do k ≤ n ∧ A[k] < A[i] → i, k := k, k + 1
    [] k ≤ n ∧ A[k] ≥ A[i] → k := k + 1
    od
  end

We leave the details of the loop introduction and assignment introduction steps to the reader. The invariant and variant of the introduced loop are seen from the

body of the loop; the invariant is k ≤ n + 1 ∧ minat.i.A.m.(k − 1), while the variant is n + 1 − k.

We summarize the result of our derivation in a procedure definition. We define

  proc FindMin(var i, val A, m, n) =
    begin var k := m + 1 ; i := m ;
      do k ≤ n ∧ A[k] < A[i] → i, k := k, k + 1
      [] k ≤ n ∧ A[k] ≥ A[i] → k := k + 1
      od
    end

Then we know that

  {m ≤ n} ; [i := i' | minat.i'.A.m.n] ⊑ FindMin(i, A, m, n)

for arbitrary program variable i and expressions A, m, and n of appropriate types.
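The guarded iteration transcribes directly into executable code. The Python function below is our own rendering of FindMin; instead of a reference parameter it simply returns the index, and it assumes m ≤ n as in the specification.

def find_min(A, m, n):
    # Transcription of FindMin(var i, val A, m, n): index of the smallest
    # element of A[m..n], assuming m <= n. The local counter k scans the
    # segment; i is updated whenever a strictly smaller element is found.
    i, k = m, m + 1
    while k <= n:            # invariant: A[i] is minimal in A[m..k-1]
        if A[k] < A[i]:
            i = k
        k = k + 1
    return i

A = [5, 8, 2, 9, 2, 7]
print(find_min(A, 1, 4))     # 2  (A[2] = 2 is the smallest value in A[1..4])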

23.3 Selection Sort

We now derive a sorting program for the array A[0..n − 1], where we assume n > 0. The idea of the algorithm is to let i go from 0 to n − 1, so that A[0..i − 1] contains the i smallest elements sorted, while the rest of the elements are in A[i..n − 1]. At each step, we find the least element in A[i..n − 1] and swap this with A[i] before incrementing i. This method is known as selection sort. We introduce abbreviations as follows:

  perm.A.B.n =∧ permutation.A.(0..n − 1).B.(0..n − 1) ,
  sort.A.n =∧ sorted.A.(0..n − 1) ,
  part.A.i.n =∧ sorted.A.(0..i − 1) ∧ (∀h k | 0 ≤ h < i ≤ k < n • A[h] ≤ A[k]) .

Thus perm.A.B.n means that B[0..n − 1] is a permutation of A[0..n − 1], sort.A.n means that A[0..n − 1] is sorted, and part.A.i.n means that A is partitioned at i, so that A[0..i − 1] is sorted and all its elements are smaller than all elements in A[i..n − 1]. The specification that we want to implement is expressed in terms of these notions as

  [A := A' | sort.A'.n ∧ perm.A.A'.n] .

The derivation uses a specification constant A0 to stand for the initial value of the array A. A program implementing the given specification is derived as follows:

  {A = A0} ; [A := A' | sort.A'.n ∧ perm.A.A'.n]
⊑ {introduce local variable i}
  begin var i := 0 ;
    “{i = 0 ∧ A = A0} ; [A, i := A', i' | sort.A'.n ∧ perm.A.A'.n]”
  end
= {rewrite using context assertion}
  ··· {i = 0 ∧ A = A0} ; [A, i := A', i' | sort.A'.n ∧ perm.A0.A'.n] ···
⊑ {introduce loop}
  ··· while i < n − 1 do
        “{i < n − 1 ∧ 0 ≤ i < n ∧ perm.A0.A.n ∧ part.A.i.n} ;
         [A, i := A', i' | 0 ≤ i' < n ∧ perm.A0.A'.n ∧ part.A'.i'.n ∧
            n − i' < n − i]”
      od ···
⊑ {make assignment more deterministic, use assertion}
  ··· {0 ≤ i < n − 1 ∧ perm.A0.A.n ∧ part.A.i.n} ;
      “[A, i := A', i' | i' = i + 1 ∧ perm.A0.A'.n ∧ part.A'.(i + 1).n]” ···
⊑ {add trailing assignment}
  ··· [A := A' | perm.A0.A'.n ∧ part.A'.(i + 1).n] ; i := i + 1 ···
= {substitute back}
  begin var i := 0 ;
    while i < n − 1 do
      {0 ≤ i < n − 1 ∧ perm.A0.A.n ∧ part.A.i.n} ;
      [A := A' | perm.A0.A'.n ∧ part.A'.(i + 1).n] ; i := i + 1
    od
  end
⊑ {introduce local variable j}
  begin var i, j := 0, 0 ;
    while i < n − 1 do
      [ j := j' | minat. j'.A.i.(n − 1)] ;
      “{minat. j.A.i.(n − 1) ∧ 0 ≤ i < n − 1 ∧ perm.A0.A.n ∧ part.A.i.n} ;
       [A := A' | perm.A0.A'.n ∧ part.A'.(i + 1).n]” ;
      i := i + 1
    od
  end
⊑ {introduce assignment}
  begin var i, j := 0, 0 ;
    while i < n − 1 do
      [ j := j' | minat. j'.A.i.(n − 1)] ;
      A := swap.i. j.A ;
      i := i + 1
    od
  end
⊑ {procedure call rule}
  begin var i, j := 0, 0 ;
    while i < n − 1 do
      FindMin( j, A, i, n − 1) ;
      A := swap.i. j.A ;
      i := i + 1
    od
  end

We need to prove that the conditions for introducing the loop and introducing the assignment in the next-to-last step are satisfied. For the assignment introduction, the proof obligation is

  minat. j.A.i.(n − 1) ∧ 0 ≤ i < n − 1 ∧ perm.A0.A.n ∧ part.A.i.n ⇒
    perm.A0.A'.n ∧ part.A'.(i + 1).n ,

where

  A' = swap.i. j.A .

The first conjunct of the conclusion follows by the results of Exercise 10.11. The rest of this proof obligation and the proof for the loop introduction are left as an exercise (Exercise 23.2).
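For reference, the derived selection sort can also be transcribed into Python. This is our own transcription, not part of the derivation; it sorts A[0..n−1] in place and includes a copy of the find_min function from the previous section so that it is self-contained.

def find_min(A, m, n):
    # index of the smallest element of A[m..n], assuming m <= n
    i, k = m, m + 1
    while k <= n:
        if A[k] < A[i]:
            i = k
        k = k + 1
    return i

def selection_sort(A, n):
    # Derived structure: for i = 0 .. n-2, find the least element of
    # A[i..n-1], swap it into position i, and increment i. The condition
    # part.A.i.n holds at the start of every iteration.
    i = 0
    while i < n - 1:
        j = find_min(A, i, n - 1)      # minat.j.A.i.(n-1)
        A[i], A[j] = A[j], A[i]        # A := swap.i.j.A
        i = i + 1

A = [4, 1, 3, 9, 7, 2]
selection_sort(A, len(A))
print(A)                               # [1, 2, 3, 4, 7, 9]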

23.4 Counting Sort

We consider next another approach to sorting, which gives a sorting time that is linear in the number of items to be sorted and the range of the sorted values. We need to know beforehand that all values are integers within a given range. The price to be paid for the efficiency in sorting time is that space efficiency is rather poor: the algorithm needs additional space for sorting that is linear in the length of the integer range.

The Sorting Problem

We assume that m, n, l, and k are given natural numbers. The purpose of the program is to sort the elements in the array segment A[m..n], where A : array Nat of Nat. We assume that the array range is not empty and that the values in A are all in the range l..k:

  m ≤ n ∧ (∀i | m ≤ i ≤ n • l ≤ A[i] ≤ k) .

The array A will not be changed by the program. Instead, we assume that the sorted array is constructed in another array segment B[m..n]:

  B : array Nat of Nat .

We define a few abbreviations before we formulate the sorting problem in more precise terms. Assume that A is an array with the natural numbers as index set. A subarray of A is determined by an index range m..n. We describe the subarray as a triple (A, m, n). We will write such a triple more suggestively as A[m..n]. We write sort.X[a..b] for sorted.X.(a..b) and just sort.X for sort.X[m..n]. This indicates that the index range m..n is the one we are chiefly interested in. Similarly, we write occur.i.X[a..b] for occurrence.i.X.(a..b) (the number of occurrences of i in X[a..b]) and just occur.i.X for occur.i.X[m..n]. Finally, we write perm.X.Y for permutation.X.(m..n).Y.(m..n). With these abbreviations, we are ready to give the specification of the problem: construct an array B that is sorted and whose values form a permutation of the values in A, i.e.,

  [B := B' | sort.B' ∧ perm.A.B'] .

Here, A, m, n, k, and l are constants (meaning that we can take the restrictions on them as global assumptions rather than initial assertions), while B is the only global program variable.

A First Solution

The basic solution idea is the following. We introduce an auxiliary array

  C : array Nat of Nat

that contains an entry for each value in the range l..k. Thus we only use the array segment C[l..k], which is what makes the space complexity of the algorithm linear in the length of the range. We then record in C[i] the number of times the value i occurs in the array A. This step will establish the condition

  countedocc.C =∧ (∀i | l ≤ i ≤ k • C[i] = occur.i.A[m..n]) .

From this we can compute the sorted array B. One possible way is to scan through the value range i = l, l + 1, ..., k − 1, k, and successively insert the number C[i] of values i into B for each i. However, this would make our algorithm linear in the value range, which usually would be much larger than the size of the array A itself. We will therefore derive another method below, which will be linear in the size of the array A. Based on these considerations, we derive an initial solution to the programming problem:

  [B := B' | sort.B' ∧ perm.A.B']
⊑ {introduce local variable C}
  begin var C : array Nat of Nat ;
    “[B, C := B', C' | sort.B' ∧ perm.A.B']”
  end
⊑ {introduce sequential composition}
  begin var C : array Nat of Nat ;
    “[C := C' | countedocc.C']” ;
    {countedocc.C} ;
    [B, C := B', C' | sort.B' ∧ perm.A.B']
  end

Let us next consider how to implement the first step, counting the occurrences of the different values in array A. We decide to first reset the array elements in C to zero before starting the actual counting:

  [C := C' | countedocc.C']
⊑ {introduce sequential composition}
  Reset(C, l, k) ;
  {allreset.C.l.k} ;
  [C := C' | countedocc.C']

Here the first step establishes the condition allreset.C.l.k, which can then be assumed by the second step. Thus, we arrive at the following first implementation of our specification:

  [B := B' | sort.B' ∧ perm.A.B']
⊑ {above derivation}
  begin var C : array Nat of Nat ;
    Reset(C, l, k) ;
    {allreset.C.l.k} ;
    [C := C' | countedocc.C'] ;
    {countedocc.C} ;
    [B, C := B', C' | sort.B' ∧ perm.A.B']
  end

This reduces our task to implementing the following two steps: counting the value occurrences and then constructing the final sorted array.

Counting Occurrences

Our next task is to count the occurrences of the different values in A. This is simply done by scanning through the array A and updating the appropriate counter in C each time. We give the derivation first and then show that the individual steps are indeed correct.

  {allreset.C.l.k} ;
  [C := C' | countedocc.C']
⊑ {introduce local variable j}
  begin var j : Nat ; j := m ;
    “{allreset.C.l.k ∧ j = m} ;
     [C, j := C', j' | countedocc.C']”
  end
⊑ {introduce loop, choose n + 1 − j as termination function}
  ··· while j ≤ n do
        “{countedbelow.C. j ∧ j ≤ n} ;
         [C, j := C', j' | countedbelow.C'. j' ∧ n + 1 − j' < n + 1 − j]”
      od ···
⊑ {introduce assignment statement}
  ··· C[A[ j]], j := C[A[ j]] + 1, j + 1 ···
= {substitute back}
  begin var j : Nat ; j := m ;
    while j ≤ n do C[A[ j]], j := C[A[ j]] + 1, j + 1 od
  end

Here we have used the abbreviation countedbelow, defined as follows:

  countedbelow.C. j =∧ (∀i | l ≤ i ≤ k • C[i] = occur.i.A[m.. j − 1]) ∧ j ≤ n + 1 .
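The counting loop can be run directly to check it against the abbreviations above. The Python sketch below is our own transcription; a dictionary models the counter array C over the value range l..k, and all bounds are inclusive.

def count_occurrences(A, m, n, l, k):
    # C[i] will hold the number of occurrences of the value i in A[m..n];
    # C is indexed by the values l..k (a dict is used here for clarity).
    C = {i: 0 for i in range(l, k + 1)}      # allreset.C.l.k
    j = m
    while j <= n:                            # invariant: countedbelow.C.j
        C[A[j]] = C[A[j]] + 1
        j = j + 1
    return C                                 # countedocc.C

A = [3, 1, 3, 2, 1, 3]
print(count_occurrences(A, 0, 5, 1, 3))      # {1: 2, 2: 1, 3: 3}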

Loop Introduction Conditions

The derivation again leaves us with two conditions to be proved: loop introduction and assignment introduction. Consider first the loop introduction. We have to show that

(i) allreset.C.l.k ∧ j = m ⇒ countedbelow.C. j ,
(ii) countedbelow.C. j ∧ j > n ⇒ countedocc.C .

The invariant here states that the elements in the initial part of the array, up to j −1, have already been counted. The initialization condition holds, because

  (∀i | l ≤ i ≤ k • C[i] = 0) ∧ j = m
⊢ (∀i | l ≤ i ≤ k • C[i] = occur.i.A[m.. j − 1]) ∧ j ≤ n + 1
≡ {assumption j = m}
  (∀i | l ≤ i ≤ k • C[i] = occur.i.A[m..m − 1]) ∧ m ≤ n + 1
≡ {array has empty range, m ≤ n}
  (∀i | l ≤ i ≤ k • C[i] = 0) ∧ T
≡ {assumption}
  T

The exit condition again holds, because

  (∀i | l ≤ i ≤ k • C[i] = occur.i.A[m.. j − 1]) ∧ j ≤ n + 1 ∧ j > n
⊢ (∀i | l ≤ i ≤ k • C[i] = occur.i.A[m..n])
≡ {assumption implies j = n + 1}
  (∀i | l ≤ i ≤ k • C[i] = occur.i.A[m.. j − 1])
≡ {assumption}
  T

Assignment Introduction Conditions

Finally, we have to show that the assignment introduction is correctly done. This means that we have to show that

  countedbelow.C. j ∧ j ≤ n ⇒
    countedbelow.C'. j' ∧ n + 1 − j' < n + 1 − j ,

where

  C' = C(A[ j] ← C[A[ j]] + 1) ,  j' = j + 1 .

We have

  (∀i | l ≤ i ≤ k • C[i] = occur.i.A[m.. j − 1]) ∧ j ≤ n + 1 ∧ j ≤ n
⊢ (∀i | l ≤ i ≤ k • C(A[ j] ← C[A[ j]] + 1)[i] = occur.i.A[m.. j]) ∧
    j + 1 ≤ n + 1 ∧ n + 1 − ( j + 1) < n + 1 − j
≡ {assumption}
  T

The range split method thus works well also in the more general case where we have an array index that itself is an array access. In this way, we can work with indirect references in an array without any further complications.

Computing Cumulative Occurrences

Let us finally construct the sorted array B from the counter array C. The method we will use is as follows. Making use of the fact that C[i] records the number of occurrences of i in array A, we first change C so that C[i] records the number of occurrences of values less than or equal to i in A. Let us for this purpose define

  cumsum.i.X[a..b] =∧ (# j | a ≤ j ≤ b • X[ j] ≤ i) .

We write cumsum.i.X for cumsum.i.X[m..n]. The first step will thus be to establish the condition

  countedcum.C =∧ (∀i | l ≤ i ≤ k • C[i] = cumsum.i.A) .

After this step, we know what the index of each element A[ j] = i should be in the sorted array B. There are C[i] values in A that are less than or equal to i. If there is only one element i = A[ j] in A, then this element should obviously be placed in position C[i] in array B. If there are more elements with value i in A, then these other elements should be placed immediately below C[i] in array B. These considerations lead to the following derivation:

  {countedocc.C} ;
  [B, C := B', C' | sort.B' ∧ perm.A.B']
⊑ {introduce sequential composition}
  “{countedocc.C} ;
   [C := C' | countedcum.C']” ;
  {countedcum.C} ;
  [B, C := B', C' | sort.B' ∧ perm.A.B']

The implementation of the first step is as follows:

  {countedocc.C} ;
  [C := C' | countedcum.C']
⊑ {exercise}
  begin var j : Nat ; j := l + 1 ;
    while j ≤ k do C[ j], j := C[ j] + C[ j − 1], j + 1 od
  end

We leave this step as an exercise for the reader, and concentrate on the final, most difficult step in the last section.

Constructing the Sorted Array

We will here first give the derivation of the final step and then justify the steps in more detail. We have

  {countedcum.C} ;
  [B, C := B', C' | sort.B' ∧ perm.A.B']
⊑ {introduce local variable}
  begin var j : Nat ; j := n ;
    “{countedcum.C ∧ j = n} ;
     [B, C, j := B', C', j' | sort.B' ∧ perm.A.B']”
  end
⊑ {introduce loop}
  ··· while j ≥ m do
        “{movedpart. j.B.C ∧ j ≥ m} ;
         [B, C, j := B', C', j' | movedpart. j'.B'.C' ∧ j' < j]”
      od ···
⊑ {introduce assignment}
  ··· B[C[A[ j]]], C[A[ j]], j := A[ j], C[A[ j]] − 1, j − 1 ···
= {substitute back}
  begin var j : Nat ; j := n ;
    while j ≥ m do
      B[C[A[ j]]], C[A[ j]], j := A[ j], C[A[ j]] − 1, j − 1
    od
  end

Justifying the Loop Introduction

The invariant for the loop introduction describes the fact that the elements in A[ j + 1..n] have already been moved to B and that the counters in C have been adjusted appropriately, so that C[i] now counts the number of values smaller than or equal to i in the remaining array A[m.. j]:

  movedpart. j.B.C =∧
    (∀i | l ≤ i ≤ k • occur.i.A[m.. j] = C[i] − cumsum.(i − 1).A) ∧
    (∀i r | l ≤ i ≤ k ∧ C[i] < r ≤ cumsum.i.A • B[r] = i) ∧
    m − 1 ≤ j ≤ n

The termination function is simply j. As usual, we have to show that the invariant holds initially and that the exit condition holds when the loop terminates. For the entry condition, we have to show that

countedcum.C ∧ j = n ⇒ movedpart. j.B.C . We have

  (∀i | l ≤ i ≤ k • C[i] = cumsum.i.A) ∧ j = n
⊢ (∀i | l ≤ i ≤ k • occur.i.A[m.. j] = C[i] − cumsum.(i − 1).A) ∧
  (∀i r | l ≤ i ≤ k ∧ C[i] < r ≤ cumsum.i.A • B[r] = i) ∧
  m − 1 ≤ j ≤ n
≡ {assumption j = n}
  (∀i | l ≤ i ≤ k • occur.i.A[m..n] = C[i] − cumsum.(i − 1).A) ∧
  (∀i r | l ≤ i ≤ k ∧ C[i] < r ≤ cumsum.i.A • B[r] = i) ∧ T
≡ {assumption, abbreviation A for A[m..n]}
  (∀i | l ≤ i ≤ k • occur.i.A = cumsum.i.A − cumsum.(i − 1).A) ∧
  (∀i r | l ≤ i ≤ k ∧ cumsum.i.A < r ≤ cumsum.i.A • B[r] = i) ∧ T
≡ {definition of cumsum, empty range}
  T

For the exit condition, we have to show that

movedpart. j.B.C ∧ j < m ⇒ sort.B ∧ perm.A.B . This is shown as follows:

  (∀i | l ≤ i ≤ k • occur.i.A[m.. j] = C[i] − cumsum.(i − 1).A) ∧
  (∀i r | l ≤ i ≤ k ∧ C[i] < r ≤ cumsum.i.A • B[r] = i) ∧
  m − 1 ≤ j ≤ n ∧ j < m
⊢ T
⇒ {assumptions, j = m − 1}
  (∀i | l ≤ i ≤ k • occur.i.A[m..m − 1] = C[i] − cumsum.(i − 1).A) ∧
  (∀i r | l ≤ i ≤ k ∧ C[i] < r ≤ cumsum.i.A • B[r] = i)
⇒ {occur.i.A[m..m − 1] = 0}
  (∀i | l ≤ i ≤ k • C[i] = cumsum.(i − 1).A) ∧
  (∀i r | l ≤ i ≤ k ∧ cumsum.(i − 1).A < r ≤ cumsum.i.A • B[r] = i)
⇒ {cumsum.l.A ≤ cumsum.(l + 1).A ≤ ··· ≤ cumsum.k.A}
  (∀ j r | m ≤ j ≤ r ≤ n • B[ j] ≤ B[r]) ∧
  (∀i | l ≤ i ≤ k • occur.i.B = cumsum.i.A − cumsum.(i − 1).A)
≡ {definition of sorted and cumsum}
  sort.B ∧ (∀i | l ≤ i ≤ k • occur.i.B = occur.i.A)
≡ {definition of perm}
  sort.B ∧ perm.A.B

Justifying the Assignment Statement

Finally, we need to justify the assignment introduction. We need to show that

  movedpart. j.B.C ∧ j ≥ m ⇒ movedpart. j'.B'.C' ∧ j' < j ,

where

  j' = j − 1 ,  h = A[ j] ,  C' = C(h ← C[h] − 1) ,  B' = B(C[h] ← h) .

The proof is as follows:

  (∀i | l ≤ i ≤ k • occur.i.A[m.. j] = C[i] − cumsum.(i − 1).A) ∧
  (∀i r | l ≤ i ≤ k ∧ C[i] < r ≤ cumsum.i.A • B[r] = i) ∧
  m − 1 ≤ j ≤ n ∧ j ≥ m
⊢ (∀i | l ≤ i ≤ k • occur.i.A[m.. j − 1] = C'[i] − cumsum.(i − 1).A) ∧
  (∀i r | l ≤ i ≤ k ∧ C'[i] < r ≤ cumsum.i.A • B'[r] = i) ∧
  m − 1 ≤ j − 1 ≤ n ∧ j − 1 < j
≡ {assumption m ≤ j}
  “(∀i | l ≤ i ≤ k • occur.i.A[m.. j − 1] = C'[i] − cumsum.(i − 1).A)” ∧
  (∀i r | l ≤ i ≤ k ∧ C'[i] < r ≤ cumsum.i.A • B'[r] = i) ∧ T ∧ T
≡ {range split}
  ··· (∀i | l ≤ i ≤ k ∧ i ≠ h • occur.i.A[m.. j − 1] = C'[i] − cumsum.(i − 1).A) ∧
      occur.h.A[m.. j − 1] = C'[h] − cumsum.(h − 1).A ···
≡ {definition of C'}
  ··· (∀i | l ≤ i ≤ k ∧ i ≠ h • occur.i.A[m.. j − 1] = C[i] − cumsum.(i − 1).A) ∧
      occur.h.A[m.. j − 1] = C[h] − 1 − cumsum.(h − 1).A ···
≡ {h = A[ j]}
  ··· (∀i | l ≤ i ≤ k ∧ i ≠ h • occur.i.A[m.. j] = C[i] − cumsum.(i − 1).A) ∧
      occur.h.A[m.. j] = C[h] − cumsum.(h − 1).A ···
≡ {merge ranges}
  (∀i | l ≤ i ≤ k • occur.i.A[m.. j] = C[i] − cumsum.(i − 1).A) ∧
  (∀i r | l ≤ i ≤ k ∧ C'[i] < r ≤ cumsum.i.A • B'[r] = i)
≡ {assumption}
  T ∧ (∀i r | l ≤ i ≤ k ∧ C'[i] < r ≤ cumsum.i.A • B'[r] = i)
≡ {range split}
  (∀i r | l ≤ i ≤ k ∧ C'[i] < r ≤ cumsum.i.A ∧ i ≠ h • B'[r] = i) ∧
  (∀r | C'[h] < r ≤ cumsum.h.A • B'[r] = h)
≡ {definition of B' and C'}
  (∀i r | l ≤ i ≤ k ∧ C[i] < r ≤ cumsum.i.A ∧ i ≠ h • B(C[h] ← h)[r] = i) ∧
  (∀r | C[h] − 1 < r ≤ cumsum.h.A • B(C[h] ← h)[r] = h)
≡ {array indexing}
  (∀i r | l ≤ i ≤ k ∧ C[i] < r ≤ cumsum.i.A ∧ i ≠ h • B[r] = i) ∧
  (∀r | C[h] < r ≤ cumsum.h.A • B[r] = h)
≡ {merge ranges}
  (∀i r | l ≤ i ≤ k ∧ C[i] < r ≤ cumsum.i.A • B[r] = i)
≡ {assumption}
  T

Final Result

Combining the separate refinements, we get the following theorem:

  [B := B' | sort.B' ∧ perm.A.B']
⊑ {above derivation}

  begin var C : array Nat of Nat ;
    begin var j : Nat ; j := l ;
      while j ≤ k do C[ j], j := 0, j + 1 od
    end ;
    begin var j : Nat ; j := m ;
      while j ≤ n do C[A[ j]], j := C[A[ j]] + 1, j + 1 od
    end ;
    begin var j : Nat ; j := l + 1 ;
      while j ≤ k do C[ j], j := C[ j] + C[ j − 1], j + 1 od
    end ;
    begin var j : Nat ; j := n ;
      while j ≥ m do B[C[A[ j]]], C[A[ j]], j := A[ j], C[A[ j]] − 1, j − 1 od
    end
  end

Finally, we move the local variable j to the outermost level (we leave the proof of this to the reader as an exercise) and get the following final version of the program:

  [B := B' | sort.B' ∧ perm.A.B']
⊑ {merge blocks (Exercise 23.5)}
  begin var C : array Nat of Nat, j : Nat ;
    j := l ; while j ≤ k do C[ j], j := 0, j + 1 od ;
    j := m ; while j ≤ n do C[A[ j]], j := C[A[ j]] + 1, j + 1 od ;
    j := l + 1 ; while j ≤ k do C[ j], j := C[ j] + C[ j − 1], j + 1 od ;
    j := n ; while j ≥ m do B[C[A[ j]]], C[A[ j]], j := A[ j], C[A[ j]] − 1, j − 1 od
  end
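The final program transcribes into the following Python sketch (ours, not part of the derivation). Lists model A and B, a dictionary models C, and all index ranges are inclusive. One adjustment is ours: since the derivation fills positions 1, 2, ... of the result segment, we add the offset m − 1 to map them onto the 0-based list B.

def counting_sort(A, m, n, l, k):
    # A[m..n] holds values in l..k; the sorted result is returned in B[m..n].
    B = [0] * (n + 1)
    C = {}
    j = l                                  # reset the counters (Reset(C, l, k))
    while j <= k:
        C[j] = 0
        j += 1
    j = m                                  # count occurrences (countedocc.C)
    while j <= n:
        C[A[j]] += 1
        j += 1
    j = l + 1                              # cumulative occurrences (countedcum.C)
    while j <= k:
        C[j] = C[j] + C[j - 1]
        j += 1
    j = n                                  # move elements into place, right to left
    while j >= m:
        # C[A[j]] is the position within the segment; m - 1 is our offset
        # onto the 0-based list B.
        B[m + C[A[j]] - 1] = A[j]
        C[A[j]] -= 1
        j -= 1
    return B

A = [0, 5, 3, 5, 1, 3, 3]                  # segment A[1..6], values in 1..5
print(counting_sort(A, 1, 6, 1, 5)[1:])    # [1, 3, 3, 3, 5, 5]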

23.5 Summary and Discussion

We have shown in this chapter how to build programs that manipulate arrays, looking in particular at searching and sorting. The counting sort example shows that the methods introduced earlier can also handle quite tricky array manipulations. The importance of having a good program derivation environment also becomes much more evident in looking at a larger derivation. The ratio of invention to pure copying of text and checking that inference rules are correctly applied becomes quite small. Hence, there is a strong need for computer-supported tools for carrying out program refinements. The programming problems treated above are all classical, and we do not claim to contribute anything new to the algorithmic solution of these. A classical reference to sorting and searching algorithms is the book written by Knuth [96].

23.6 Exercises

23.1 Prove that the conditions for introducing a loop and for introducing an assignment are satisfied in the derivation of the linear search algorithm.

23.2 Prove that the conditions for introducing a loop and for introducing an assignment are satisfied in the derivation of the selection sort algorithm.

23.3 In the derivation of the selection sort program we assumed that n > 0. Show that the final program in fact gives the right result also when n = 0. In a situation like this, the following rule can be useful:

  {p} ; S ⊑ S'     {q} ; S ⊑ S'
  ------------------------------
       {p ∪ q} ; S ⊑ S'

Derive this rule.

23.4 Complete the derivation in the counting-sort case study by showing how to derive the computation of cumulative occurrences of values.

23.5 Prove the following rule for merging blocks:

  begin var y := e1 ; S1 end ; begin var y := e2 ; S2 end
  =
  begin var y := e1 ; S1 ; y := e2 ; S2 end ,

provided that y does not occur free in e2.

24. The N-Queens Problem

The n-queens problem is to place n queens (where n > 0) on an n-by-n chessboard so that no queen is threatened by another one. According to the rules of chess, this is equivalent to the requirement that no two queens be on the same row or the same column or on a common diagonal. For some values of n this is possible but for some values (for example, for n = 2) there is no solution. In this chapter we show how one solution for a particular value of n is found with a depth-first search. The program derivation illustrates both recursion and loop introduction in a nontrivial setting. It also illustrates how to handle data structures like sequences in a program derivation.

24.1 Analyzing the Problem

Let us start by analyzing the problem domain in somewhat more detail and finding out some basic facts. The number of queens (and the size of the board) is n : Nat. We think of the board as a matrix with rows and columns numbered 0, ..., n − 1. A pair of numbers (r, c) represents a position on the board, with r and c in 0, ..., n − 1. It is obvious that in a solution there must be one queen in each row. Thus we can represent a solution as a sequence of numbers c0, ..., cn−1, where ci is the column number of the queen in row i.

We allow sequences to be indexed; s[i] stands for the ith element in s (for i = 0, ..., len.s − 1, where len.s is the length of sequence s). However, indexing is not considered implementable, so it must not appear in the final program.

FIGURE 24.1. Queen threat on a board (a queen at row r = 2, column c = 5 threatens the row r = 2, the column c = 5, and the diagonals r = c − 3 and r = −c + 7)

First let us define what it means for two queens in positions (i, x) and ( j, y) not to threaten each other:

  nothreat.(i, x).( j, y) =∧ i ≠ j ∧ x ≠ y ∧ i + x ≠ j + y ∧ i + y ≠ j + x .

These four conditions are illustrated in Figure 24.1. They guarantee that the two queens are not on the same row, the same column, the same downward diagonal, or the same upward diagonal. Next, we define safecon.s.n to mean that the sequence s represents legally placed queens on an n-by-n board that do not threaten each other:

  safecon.s.n =∧ (∀i | i < len.s • s[i] < n) ∧
                 (∀i j | i < j < len.s • nothreat.(i, s[i]).( j, s[ j])) .

We can then define when a sequence q is a solution to the n-queens problem:

  solution.q.n =∧ len.q = n ∧ safecon.q.n .

Our idea is to solve the problem as follows. We start from an empty board. Then queens are added, one by one. If at some point it is impossible to place a queen, then we recursively backtrack to the previous queen and try a new place for this queen. Before starting the derivation, we define the following functions:

  s ≼ s' =∧ len.s ≤ len.s' ∧ (∀i | i < len.s • s[i] = s'[i]) ,
  safeadd.s.x =∧ (∀i | i < len.s • nothreat.(i, s[i]).(len.s, x)) ,
  qinit.s.n =∧ (∃q • solution.q.n ∧ s ≼ q) .

Thus s ≼ s' holds if s is a prefix of s'. Furthermore, safeadd.s.x means that if we place a queen in position (len.s, x), then it is not threatened by any queen in s. Finally, qinit.s.n means that the queens in s form the initial part of a solution. In particular, qinit.⟨⟩.n holds if and only if there is a solution for an n-by-n board. The following properties, which will be used in the derivation, can be proved from the definitions given above (Exercise 24.1):

  safecon.s.n ∧ x < n ∧ safeadd.s.x ≡ safecon.(s&x).n ,          (∗)
  qinit.s.n ∧ len.s < n ⇒ (∃x | x < n • qinit.(s&x).n) ,         (∗∗)
  ¬ safeadd.s.x ⇒ ¬ qinit.(s&x).n ,                              (∗∗∗)
  safecon.s.n ∧ len.s = n ⇒ qinit.s.n .                          (∗∗∗∗)

The n-queens problem can now be specified as follows:

  {n > 0 ∧ q = ⟨⟩} ;
  if qinit.⟨⟩.n → [q := q' | solution.q'.n]
  [] ¬ qinit.⟨⟩.n → skip
  fi .

Thus the sequence q is assumed to be empty initially. At the end, q contains a solution, or, if no solution exists, q is still empty.

We aim at a recursive procedure Queens(var q, val s, n). The reference parameter q will receive the result sequence, and the value parameter s is used to collect and pass on partial solutions. The value parameter n is the number of queens (the size of the board); it is always passed on unchanged. Thus the procedure will explicitly change the value only of q (and the value of s at the calls), and at every recursive call,

  n > 0 ∧ len.s ≤ n ∧ q = ⟨⟩ ∧ safecon.s.n

should hold. For every call we want one queen to be added to the board (i.e., s to be extended with one value), so the variant is n − len.s. Thus the starting point of our derivation of the recursive procedure is the following:

  {n > 0 ∧ len.s ≤ n ∧ q = ⟨⟩ ∧ safecon.s.n ∧ n − len.s = w} ;
  if qinit.s.n → [q := q' | s ≼ q' ∧ solution.q'.n]
  [] ¬ qinit.s.n → skip
  fi

(note that if s = ⟨⟩, then this reduces to the original specification of the problem). By rewriting the conditional specification as a basic one (Exercise 17.5) and using the initial assertion, we find that the specification can be rewritten as

  {n > 0 ∧ len.s ≤ n ∧ q = ⟨⟩ ∧ safecon.s.n ∧ n − len.s = w} ;
  [q := q' | if qinit.s.n then s ≼ q' ∧ solution.q'.n else q' = ⟨⟩ fi] .

Although this form may be harder to read, it is easier to use in formal manipulations. For brevity, we introduce the abbreviation S.s.q.n for the demonic assignment in this specification statement.

24.2 First Step: The Terminating Case

We begin by separating the recursive case from the terminating case. In the terminating case the solution has been collected in s, so we have only to move it to q.

  {n > 0 ∧ len.s ≤ n ∧ q = ⟨⟩ ∧ safecon.s.n ∧ n − len.s = w} ; S.s.q.n
= {introduce conditional, unfold abbreviation}
  if len.s = n →
       “{n > 0 ∧ len.s = n ∧ q = ⟨⟩ ∧ safecon.s.n ∧ n − len.s = w} ;
        [q := q' | if qinit.s.n then s ≼ q' ∧ solution.q'.n else q' = ⟨⟩ fi]”
  [] len.s < n →
       {n > 0 ∧ len.s < n ∧ q = ⟨⟩ ∧ safecon.s.n ∧ n − len.s = w} ; S.s.q.n
  fi
⊑ {use (∗∗∗∗) and assertion}
  if len.s = n →
       “{n > 0 ∧ len.s = n ∧ q = ⟨⟩ ∧ safecon.s.n ∧ n − len.s = w} ;
        [q := q' | s ≼ q' ∧ solution.q'.n]”
  [] len.s < n →
       {n > 0 ∧ len.s < n ∧ q = ⟨⟩ ∧ safecon.s.n ∧ n − len.s = w} ; S.s.q.n
  fi
⊑ {use definition of solution to make assignment deterministic}
  if len.s = n → q := s
  [] len.s < n →
       {n > 0 ∧ len.s < n ∧ q = ⟨⟩ ∧ safecon.s.n ∧ n − len.s = w} ; S.s.q.n
  fi

24.3 Second Step: Extending a Partial Solution

Now we find out how to extend the partial solution s in the recursive case. The idea is that if there is a solution, then we can extend s with one of the values in 0,...,n − 1. We can try them in order, one by one. We continue from the last version by unfolding the abbreviation:

= {unfold abbreviation}
  if len.s = n → q := s
  [] len.s < n →
       “{n > 0 ∧ len.s < n ∧ q = ⟨⟩ ∧ safecon.s.n ∧ n − len.s = w} ;
        [q := q' | if qinit.s.n then s ≼ q' ∧ solution.q'.n else q' = ⟨⟩ fi]”
  fi
⊑ {introduce local variable x with assignments}
  ··· begin var x := 0 ;
        “{x = 0 ∧ n > 0 ∧ len.s < n ∧ q = ⟨⟩ ∧ safecon.s.n ∧ n − len.s = w} ;
         [q, x := q', x' | if qinit.s.n then s ≼ q' ∧ solution.q'.n
                           else q' = ⟨⟩ fi]”
      end ···
⊑ {introduce loop}
  ··· while x < n ∧ q = ⟨⟩ do
        “{x < n ∧ len.s < n ∧ q = ⟨⟩ ∧ safecon.s.n ∧ n − len.s = w} ;
         [q := q' | if qinit.(s&x).n then (s&x) ≼ q' ∧ solution.q'.n
                    else q' = ⟨⟩ fi]” ;
        x := x + 1
      od ···
= {fold abbreviation}
  ··· {x < n ∧ len.s < n ∧ q = ⟨⟩ ∧ safecon.s.n ∧ n − len.s = w} ;
      S.(s&x).q.n ···

Loop Introduction

We now show the proof of the loop introduction in the derivation above, using Theorem 21.14. The idea of the loop introduction is to find a suitable placement x for the next queen. We let x count from 0 and stop either when qinit.(s&x).n holds (meaning that a suitable position has been found) or when x = n ∧ q = ⟨⟩ (meaning that no suitable position exists).

The invariant (let us call it J) is

  x ≤ n ∧ len.s < n ∧ safecon.s.n ∧ n − len.s = w ∧
  ((q = ⟨⟩ ∧ (∀y | y < x • ¬ qinit.(s&y).n)) ∨
   (x > 0 ∧ qinit.(s&(x − 1)).n ∧ (s&(x − 1)) ≼ q ∧ solution.q.n))

(in the second half of this invariant, the first disjunct holds when a position has not yet been found for the next queen, and the second disjunct holds when a position has been found). The variant is simply n − x.

To prove that the invariant holds initially, we have to prove

  x = 0 ∧ n > 0 ∧ len.s < n ∧ q = ⟨⟩ ∧ safecon.s.n ∧ n − len.s = w ⇒ J ,

which is not hard (select the first disjunct in J).

Next, we must show that the loop body preserves the invariant and decreases the variant. Since x is increased by the loop body, the variant is obviously decreased. For the invariant, we must prove

  x < n ∧ q = ⟨⟩ ∧ J ∧ qinit.(s&x).n ∧ (s&x) ≼ q' ∧ solution.q'.n ⇒
    J[q, x := q', x + 1]

and

  x < n ∧ q = ⟨⟩ ∧ J ∧ ¬ qinit.(s&x).n ∧ q' = ⟨⟩ ⇒
    J[q, x := q', x + 1] .

Both conditions can be proved using the properties of sequences given earlier.

Finally, to show that the postcondition holds on termination, we show

  (x ≥ n ∨ q ≠ ⟨⟩) ∧ J ∧ qinit.s.n ⇒ s ≼ q ∧ solution.q.n

and

  (x ≥ n ∨ q ≠ ⟨⟩) ∧ J ∧ ¬ qinit.s.n ⇒ q = ⟨⟩ .

All these conditions are most easily proved by case analysis on q = ⟨⟩.

24.4 Third Step: Completing for Recursion Introduction

The body of the loop now contains almost what we need for recursion introduction; the only problem is that the assertion should contain safecon.(s&x).n instead of safecon.s.n in order for the variant to decrease. To get this, we need to introduce a conditional, based on whether safeadd.s.x holds or not. We continue the derivation as follows:

= {introduce conditional, unfold abbreviation}
  ··· if safeadd.s.x →
           {safeadd.s.x ∧ x < n ∧ len.s < n ∧ q = ⟨⟩ ∧
            safecon.s.n ∧ n − len.s = w} ;
           S.(s&x).q.n
      [] ¬ safeadd.s.x →
           {¬ safeadd.s.x ∧ x < n ∧ len.s < n ∧ q = ⟨⟩ ∧
            safecon.s.n ∧ n − len.s = w} ;
           [q := q' | if qinit.(s&x).n then (s&x) ≼ q' ∧ solution.q'.n
                      else q' = ⟨⟩ fi]
      fi ···
⊑ {use (∗∗∗) and assertion to simplify}
  ··· if safeadd.s.x →
           “{safeadd.s.x ∧ x < n ∧ len.s < n ∧ q = ⟨⟩ ∧
             safecon.s.n ∧ n − len.s = w}” ;
           S.(s&x).q.n
      [] ¬ safeadd.s.x → skip
      fi ···
⊑ {weaken assertion, use (∗)}
  ··· {n > 0 ∧ len.(s&x) ≤ n ∧ q = ⟨⟩ ∧
       safecon.(s&x).n ∧ n − len.(s&x) < w} ;
      S.(s&x).q.n ···

= {substitute back}
  if len.s = n → q := s
  [] len.s < n →
       begin var x := 0 ;
         while x < n ∧ q = ⟨⟩ do
           if safeadd.s.x →
                {n > 0 ∧ len.(s&x) ≤ n ∧ q = ⟨⟩ ∧
                 safecon.(s&x).n ∧ n − len.(s&x) < w} ;
                S.(s&x).q.n
           [] ¬ safeadd.s.x → skip
           fi ;
           x := x + 1
         od
       end
  fi

24.5 Final Result

We now use the rule for procedure introduction and define

  Queens(var q, val s, n) =∧
    begin var x := 0 ;
      if len.s = n → q := s
      [] len.s < n →
           while x < n ∧ q = ⟨⟩ do
             if safeadd.s.x → Queens(q, s&x, n)
             [] ¬ safeadd.s.x → skip
             fi ;
             x := x + 1
           od
      fi
    end

The preceding derivations then allow us to conclude the following:

  {n > 0 ∧ q = ⟨⟩} ;
  if (∃q' • solution.q'.n) → [q := q' | solution.q'.n]
  [] ¬ (∃q' • solution.q'.n) → skip
  fi
⊑ Queens(q, ⟨⟩, n) ,

so the procedure Queens indeed gives us a solution to the n-queens problem.
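For experimentation, the procedure transcribes into Python as follows. The transcription is ours: the reference parameter q becomes a return value, the empty sequence is the empty list, s&x is s + [x], and the helper names no_threat, safe_add, and queens are our own.

def no_threat(i, x, j, y):
    # nothreat.(i, x).(j, y): different rows, columns and diagonals
    return i != j and x != y and i + x != j + y and i + y != j + x

def safe_add(s, x):
    # safeadd.s.x: a queen at (len(s), x) is not threatened by any queen in s
    return all(no_threat(i, s[i], len(s), x) for i in range(len(s)))

def queens(s, n):
    # Queens(var q, val s, n): returns a solution extending s, or [] if none.
    if len(s) == n:
        return s                      # terminating case: q := s
    q, x = [], 0
    while x < n and q == []:          # try column x for the next queen
        if safe_add(s, x):
            q = queens(s + [x], n)    # recursive call Queens(q, s&x, n)
        x = x + 1
    return q

print(queens([], 4))                  # e.g. [1, 3, 0, 2]
print(queens([], 2))                  # []  (no solution for n = 2)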

24.6 Summary and Discussion

This chapter has shown how to derive a somewhat more complicated recursive algorithm, using both recursion introduction and loop introduction. The solution illustrates the need for analyzing the problem domain carefully before starting the derivation, in order to capture the essential aspects of the problem and to find good data representations for the information that needs to be manipulated. The n-queens problem is a classical exercise in program derivation and has been treated systematically by Wirth, e.g., as an example of stepwise refinement [141].

24.7 Exercises

24.1 Derive properties (∗)–(∗∗∗∗) for sequences given in the text.

24.2 Complete the details of the proof for the loop introduction in the derivation of the Queens procedure.

25. Loops and Two-Person Games

We have looked at recursion and iteration in general in the preceding chapters. In this chapter, we will show that the refinement calculus framework is usable beyond its original area of application, by showing how to model and analyze two-person games in the calculus. We show how to use total correctness reasoning to show the existence of winning strategies for such games. Furthermore, we show how one can derive a program that actually plays a game against an opponent. We choose the game of nim as an example and study it within the refinement calculus framework.

25.1 Modeling Two-Person Games

A typical two-person game (such as chess, checkers, tic-tac-toe or nim) involves two players taking turns in making moves. In order to model such games as statements, we first note that the game board can be represented as a state space. A move is an action that changes the board, i.e., a nondeterministic statement whose nondeterminism is determined by the possible legal alternatives in the current situation. We call one contestant Player and the other Opponent; then Player’s moves are represented as angelic updates, while Opponent’s moves are demonic updates. A typical game can then be represented by an ordinary iteration:

  while g do S od .

An execution of this iteration corresponds to a run (a play) of the game. If the body S aborts at some point (i.e., Player is not able to move), then Player has lost the game. Dually, if S is miraculous at some point (i.e., Opponent is not able to move), then Player has won the game. The game is a draw if the loop terminates

nonmiraculously (i.e., if ¬ g holds after a normally terminating execution of the body). The game is also considered to be a draw if the execution is nonterminating (the game continues indefinitely).

In terms of the game semantics for statements, this means that we are only interested in a game where the postcondition is false. In this case, normal termination, nontermination, and abortion all imply that Player has lost, and only miraculous termination is a win for Player. As a simple example, consider the repetition

  while true do [x := x' | x' < x] ; {x := x' | x' > x} od

assuming program variable x : Nat. In this game, Opponent always decreases the value of x while Player always increases it. Opponent begins, and the two contestants alternate making moves (i.e., changing the value of x). It is easily seen that if the initial value of x is 0, then Player wins. This is because Opponent has no possible opening move (the demonic choice is empty). In any other initial state, the game goes on forever. Note that the guard true indicates that the game never ends in a draw in finite time.

Example Game: Nim

As a more interesting example, we consider a simple version of the game of nim, in which the contestants alternate removing one or two matches from a pile. Player is the first to move, and the one who removes the last match loses. The state has only one attribute, x, the number of matches, ranging over the natural numbers. Thus, we can consider the state space to be the type Nat. We introduce the abbreviation Move for the state relation (x := x' | x − 2 ≤ x' < x) (since we are talking about natural numbers, we follow the convention that m − n = 0 when m ≤ n). Then the game can be expressed in the following simple form:

  while true do [x ≠ 0] ; {Move} ; {x ≠ 0} ; [Move] od .

The game can be explained as follows. The loop guard true shows that the game never ends in a draw in finite time. The body shows the sequence of moves that is repeated until the game ends (or indefinitely). First, the guard statement [x ≠ 0] is a check to see whether Player has already won (Opponent has lost if the pile of matches is empty when it is Player’s move). If there are matches left (x ≠ 0), then Player moves according to the relation Move, i.e., removes one or two matches (decreases x by 1 or 2). Now the dual situation arises. The assert statement {x ≠ 0} is a check to see whether Player has lost. If there are matches left, then Opponent moves according to the relation Move, i.e., removes one or two matches.
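To make the game model concrete, the loop can be simulated directly. The Python sketch below is ours: the angelic and demonic updates are supplied as move functions, a win for Player is reported when the guard statement is found to be miraculous (x = 0 at Player's turn), and a loss when the assertion fails. The cutoff max_rounds is only there to make the simulation terminate.

def play_nim(x, player_move, opponent_move, max_rounds=1000):
    # One round per iteration of the body: [x != 0] ; {Move} ; {x != 0} ; [Move]
    for _ in range(max_rounds):
        if x == 0:
            return "Player wins"      # guard [x != 0] is miraculous
        x = player_move(x)            # angelic update {Move}: remove 1 or 2
        if x == 0:
            return "Player loses"     # assertion {x != 0} fails
        x = opponent_move(x)          # demonic update [Move]: remove 1 or 2
    return "cut off (a nonterminating play counts as a draw)"

# Any concrete move functions will do; these two are illustrative only.
take_one = lambda x: x - 1
take_two = lambda x: max(x - 2, 0)
print(play_nim(7, take_one, take_two))   # e.g. 'Player loses' with these choices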

Other two-person games can be represented by similar kinds of iterations. Typically, the iteration body contains guard and assert statements that detect situations in which either Player or Opponent has lost, and angelic and demonic updates that describe the permitted moves.

25.2 Winning Strategies

We have shown how games can be represented as iterations. The obvious question is now, what can we prove about a game?

Assume that the loop while g do S od represents a game and that the initial board is described by the predicate p. If the loop establishes postcondition false from precondition p, then from the semantics of statements we know that Player can make choices in the angelic updates in such a way as to be guaranteed to win, regardless of what choices Opponent makes in the demonic updates. Thus there exists a winning strategy for Player under precondition p if the following total correctness assertion holds:

p {| while g do S od |} false .

By instantiating in the correctness rule for loops, we get the following rule for proving the existence of winning strategies in two-person games.

Corollary 25.1  Player has a winning strategy for the game while g do S od under precondition p, provided that the following three conditions hold for some ranked collection of predicates {rw | w ∈ W}:

(i) p ⊆ r,

(ii) (∀w ∈ W • rw {| S |} (∪ v ∈ W | v < w • rv)) ,

(iii) r ⊆ g ,

where r = (∪ w ∈ W • rw).

Proof By specializing q to false in Theorem 21.13 we get the following three conditions:

(i) p ⊆ r ,
(ii) (∀w ∈ W • g ∩ rw {| S |} (∪ v ∈ W | v < w • rv)) ,
(iii) ¬ g ∩ r ⊆ false .

Straightforward simplification now shows that the third of these conditions is equivalent to r ⊆ g and also to (∀w ∈ W • rw ⊆ g). This can then be used to simplify the second condition. □

As before, the ranked predicate rw is usually described as a conjunction of an invariant I and a variant t, so that rw is I ∧ t = w. Note that in the case that g = true, the third condition in Corollary 25.1 is trivially satisfied.

Winning Strategy for Nim

We now return to the example game of nim. For Player to be assured of winning this game, it is necessary that, after each of his moves, the number x of matches remaining satisfies the condition x mod 3 = 1. The strategy consists in always removing a number of matches such that x mod 3 = 1 holds afterwards.

Assume that the precondition p is x mod 3 ≠ 1 (since Player moves first, this is necessary to guarantee that Player can actually establish x mod 3 = 1 in his first move). The invariant I is simply p, and the variant t is x.

Since the guard of the iteration that represents nim was true, the third condition of Corollary 25.1 is trivially satisfied. Furthermore, the choice of invariant is such that the first condition is also trivially satisfied. Thus, it remains only to prove the second condition, i.e., that

  x mod 3 ≠ 1 ∧ x = n {| S |} x mod 3 ≠ 1 ∧ x < n          (∗)

holds for an arbitrary natural number n, where

  S = [x > 0] ; {Move} ; {x > 0} ; [Move] .

We prove (∗) by calculating S.(x mod 3 ≠ 1 ∧ x < n) in two parts. First, we find the intermediate condition:

  ({x > 0} ; [Move]).(x mod 3 ≠ 1 ∧ x < n)
= {definitions}
  x > 0 ∧ (∀x' • x − 2 ≤ x' < x ⇒ x' mod 3 ≠ 1 ∧ x' < n)
= {reduction, arithmetic}
  x mod 3 = 1 ∧ x ≤ n

This is the condition that Player should establish on every move. Continuing, we find the precondition

  ([x > 0] ; {Move}).(x mod 3 = 1 ∧ x ≤ n)
= {definitions}
  x = 0 ∨ (∃x' • x − 2 ≤ x' < x ∧ x' mod 3 = 1 ∧ x' ≤ n)
⊇ {reduction, arithmetic}
  x mod 3 ≠ 1 ∧ x = n

Thus we have shown that (∗) holds, i.e., that the game has a winning strategy.

25.3 Extracting Winning Strategies

We have shown how to use loop reasoning with invariants to show that Player has a winning strategy in a two-person game. We shall now show how to extract a strategy for Player in the form of a nonangelic program statement (i.e., a program that plays the game, in the role of Player). The result of this operation is a new loop that still describes the same game as the old one, but with a more restricted set of alternatives for Player. We also consider refinements that improve Opponent’s strategy; this is done by removing "bad" alternatives from the demonic nondeterminism that describes Opponent’s moves.

First consider the simple case of the form while g do {Q} ; [R] od, in which every round consists of a move by Player followed by a move by Opponent. Furthermore, assume that we have proved that there exists a winning strategy for this game with respect to precondition p, i.e., that the following total correctness assertion holds:

  p {| while g do {Q} ; [R] od |} false .

If the role of Player were performed by an angel, then the game would always be won. However, if the role of Player is played by a human being (or a computer program), then the mere existence of a winning strategy is not enough, since the angelic update {Q} does not tell us which alternative should be taken in order to win the game. Thus, we would want to replace {Q} with a new angelic update {Q'} such that Q' only gives alternatives that are guaranteed to lead eventually to a win for Player. This replacement is not a refinement of the original game, since it means replacing Q with a Q' such that Q' ⊆ Q (and the angelic update is monotonic, not antimonotonic). However, we can still give a formal argument for the replacement.

Lemma 25.2  Assume that the three conditions of Corollary 25.1 hold for the game while g do {Q} ; [R] od with precondition p and ranked predicates {rw | w ∈ W}. Furthermore, assume that the relation Q' satisfies the following three conditions:

(i) Q' ⊆ Q,

(ii) (∀σ • r.σ ⇒ Q'.σ ≠ ∅), and

(iii) (∀w ∈ W • rw ⊆ ([Q'] ; [R]).(∪ v ∈ W | v < w • rv)).

Then the three conditions of Corollary 25.1 hold also for the game while g do {Q'} ; [R] od, with the same precondition and ranked predicates.

Proof  Condition (iii), written pointwise, states that

  rw.σ ⇒ (∀γ • Q'.σ.γ ⇒ [R].(∪ v ∈ W | v < w • rv).γ) .          (∗)

For an arbitrary state σ and w ∈ W we then have

  rw.σ
⇒ {second assumption and (∗)}
  Q'.σ ≠ ∅ ∧ (∀γ • Q'.σ.γ ⇒ [R].(∪ v ∈ W | v < w • rv).γ)
⇒ {definitions of angelic and demonic update}
  ({Q'} ; [R]).(∪ v ∈ W | v < w • rv).σ ,

which is condition (ii) of Corollary 25.1 for the new game. The remaining two conditions do not mention Q', so they hold as before. □

The three conditions in Lemma 25.2 can be interpreted as follows. The first condition implies that the new game while g do {Q'} ; [R] od is compatible with the old one, in the sense that the only difference is that Player has fewer alternatives to choose from. In other words, Opponent is not disadvantaged if we replace Q by Q' in the game. The second condition tells us that under Q', Player always has a move available. Thus, there is a proof of a winning strategy for the new game. Finally, the third condition implies that Q' only offers alternatives that preserve the invariant, i.e., Q' offers only winning alternatives.

Whenever the conditions of Lemma 25.2 hold, we say that {r} ; [Q'] is a specification for a program that plays the role of Player in the game. The justification for this is that whenever a ranked predicate rw holds (and the setup is such that a ranked predicate in fact always holds when Player is to move), any move permitted by the relation Q' takes Player closer to a win. Later, in Section 25.4, we consider the question of what Player should do in situations in which the invariant does not hold.

A Program for Nim

In order to apply the ideas above to the game of nim, we need a slight generalization. Consider the more general game while g do S od, where the body S has the form [q] ; {Q} ; S' for some statement S' containing only demonic updates and asserts. This means we allow a check of whether Player has won before Player moves, and we also allow Opponent’s move to be mixed with checks of whether Player has lost. The idea of Lemma 25.2 can be used in this situation also. The specification for Player’s move is then {r ∩ q} ; [Q'], where the conditions on the new relation Q' are

(i) Q' ⊆ Q,
(ii) (∀σ • r.σ ∧ q.σ ⇒ Q'.σ ≠ ∅), and
(iii) (∀w ∈ W • rw ∩ q ⊆ ([Q'] ; S').(∪ v ∈ W | v < w • rv))

(the proof of this follows the proof of Lemma 25.2 exactly). In the case of nim, we first compute the three conditions with Q as Move, q as x > 0, rn as x mod 3 ≠ 1 ∧ x = n, and S' as {x > 0} ; [Move]. After some simplifications, we get

(i) (∀x x' • Q'.x.x' ⇒ x − 2 ≤ x' < x),
(ii) (∀x • x mod 3 ≠ 1 ∧ x > 0 ⇒ (∃x' • Q'.x.x')), and
(iii) (∀x x' • x mod 3 ≠ 1 ∧ x > 0 ∧ Q'.x.x' ⇒ x' > 0 ∧ x' mod 3 = 1).

The third condition was arrived at as follows:

  (∀n • rn ∩ q ⊆ ([Q'] ; S').(∪ m | m < n • rm))
≡ {fill in current instances, pointwise extension}
  (∀n x • x mod 3 ≠ 1 ∧ x = n ∧ x > 0 ⇒
     ([Q'] ; S').(x mod 3 ≠ 1 ∧ x < n).x)
≡ {fill in current S', definitions}
  (∀n x • x mod 3 ≠ 1 ∧ x = n ∧ x > 0 ⇒
     (∀x' • Q'.x.x' ⇒ x' > 0 ∧
        (∀x'' • x' − 2 ≤ x'' < x' ⇒ x'' mod 3 ≠ 1 ∧ x'' < n)))
≡ {one-point rule, quantifier rules}
  (∀x x' • x mod 3 ≠ 1 ∧ x > 0 ∧ Q'.x.x' ⇒ x' > 0) ∧
  (∀x x' x'' • x mod 3 ≠ 1 ∧ x > 0 ∧ Q'.x.x' ∧ x' − 2 ≤ x'' < x' ⇒
     x'' mod 3 ≠ 1 ∧ x'' < x)
≡ {simplify second conjunct}
  (∀x x' • x mod 3 ≠ 1 ∧ x > 0 ∧ Q'.x.x' ⇒ x' > 0) ∧
  (∀x x' • x mod 3 ≠ 1 ∧ x > 0 ∧ Q'.x.x' ⇒ x' mod 3 = 1)

The easiest way to find a relation that satisfies these three conditions is to use the first and third conditions to find a candidate,

  Q'.x.x' ≡ x − 2 ≤ x' < x ∧ x' > 0 ∧ x' mod 3 = 1 ,

and to check that the second condition is satisfied:

  x mod 3 ≠ 1 ∧ x > 0 ⇒
    (∃x' • x − 2 ≤ x' < x ∧ x' > 0 ∧ x' mod 3 = 1)
≡ {property of mod, predicate calculus}
  (x mod 3 = 0 ∧ x > 0 ⇒
    (∃x' • x − 2 ≤ x' < x ∧ x' > 0 ∧ x' mod 3 = 1)) ∧
  (x mod 3 = 2 ∧ x > 0 ⇒
    (∃x' • x − 2 ≤ x' < x ∧ x' > 0 ∧ x' mod 3 = 1))
≡ {quantifier rules}
  (∃x' • x mod 3 = 0 ∧ x > 0 ⇒
    x − 2 ≤ x' < x ∧ x' > 0 ∧ x' mod 3 = 1) ∧
  (∃x' • x mod 3 = 2 ∧ x > 0 ⇒
    x − 2 ≤ x' < x ∧ x' > 0 ∧ x' mod 3 = 1)
⇐ {use x − 2 and x − 1 as existential witnesses}
  (x mod 3 = 0 ∧ x > 0 ⇒
    x − 2 < x ≤ x ∧ x − 2 > 0 ∧ (x − 2) mod 3 = 1) ∧
  (x mod 3 = 2 ∧ x > 0 ⇒
    x − 1 < x ≤ x + 1 ∧ x − 1 > 0 ∧ (x − 1) mod 3 = 1)
≡ {arithmetic simplification}
  T

Thus we have shown that a specification for a Player program is the following:

  {x > 0 ∧ x mod 3 ≠ 1} ;
  [x := x' | x − 2 ≤ x' < x ∧ x' > 0 ∧ x' mod 3 = 1] .

We can develop this further (Exercise 25.1) by refining it into a conditional:

  if x > 0 ∧ x mod 3 = 0 → x := x − 2
  [] x mod 3 = 2 → x := x − 1
  [] x mod 3 = 1 → abort
  fi .

This describes Player’s winning strategy as follows: if x mod 3 is zero, then remove two matches; if it is 2, then remove one match. If x mod 3 = 1, then the strategy says simply to give up (since the statement aborts in this case). Giving up in this way is reasonable when playing against a perfect opponent, but in practice it is not a very useful strategy. Instead, we would want a strategy that might work if an imperfect opponent makes a mistake. Below, we shall see how refinement and heuristics together can be used to improve the strategies of both contestants in such a way.
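In executable form, the derived strategy is the following Python function (ours); the aborting branch is rendered as a raised exception.

def player_strategy(x):
    # Derived winning strategy for Player: always leave x with x mod 3 == 1.
    if x > 0 and x % 3 == 0:
        return x - 2
    elif x % 3 == 2:
        return x - 1
    else:
        raise RuntimeError("abort: no winning move from x = %d" % x)

print(player_strategy(9))   # 7
print(player_strategy(8))   # 7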

25.4 Strategy Improvement

So far, we have considered two-person games as they would be played perfectly. We shall now consider how the notion of winning strategy and invariant can be used to describe strategy improvements that increase a contestant’s chances of winning against an imperfect strategy. We first consider Player and then Opponent. To make the argument simple, we assume that the game has the simple form while g do {Q} ;[R] od, with precondition p and ranked predicates {rw | w ∈ W}.

Improving Player’s Strategy

Assume that we have found a specification {r} ; [Q'] of Player’s winning strategy, where r = (∪ w ∈ W • rw). The initial assertion {r} indicates that unless the invariant holds (which is a requirement for the strategy to work against a perfect opponent), Player may as well give up. It is now straightforward to show that an equally good strategy is specified by

  if r then [Q'] else [Q] fi

(by choosing Q as the specification in the case when the invariant does not hold, we permit any legal move). We can now continue refining this strategy, making the else branch more deterministic. In particular, we may try to find permitted moves that could give an imperfect opponent a chance to establish the invariant by mistake. In the example of nim, a first improvement to Player’s strategy gives us

if x > 0 ∧ x mod 3 = 0 → x := x − 2 [] x mod 3 = 2 → x := x − 1 [] x mod 3 = 1 → [x := x' | x − 2 ≤ x' < x] fi .

Next, we note that Player cannot force Opponent to establish the invariant x mod 3 ≠ 1 unless it holds when Player is to move. Thus, we cannot essentially improve Player’s strategy further. It could be argued that the best way of implementing the nondeterministic choice in the third branch of the conditional is by randomly choosing to remove either one or two matches, since this makes it hard for an opponent to figure out what strategy is used.

Improving Opponent’s Strategy

Opponent’s strategy can be analyzed in the same way as Player’s. The demonic update [R] is a specification, and by making it more deterministic, Opponent can discard bad choices. In particular, Opponent should try to avoid establishing the invariant p, if possible. Thus, it is desirable to find a total relation R' ⊆ R such that r ⊆ [R'].(¬ p) for as weak a predicate r as possible. This guarantees that if Player makes a move that establishes r, then Opponent can break the invariant, at least temporarily.

In the example of nim, this idea is used as follows. The invariant p is x mod 3 ≠ 1, and the specification for Opponent’s move is [x := x' | x − 2 ≤ x' < x]. This can be made more deterministic by changing it to x := x − 1 (provided that x > 0) or x := x − 2 (provided that x > 1). The assignment x := x − 1 establishes ¬ p if x mod 3 = 2, and the assignment x := x − 2 establishes ¬ p if x mod 3 = 0. Thus an improved strategy for Opponent is

if x > 0 ∧ x mod 3 = 0 → x := x − 2 [] x mod 3 = 2 → x := x − 1 [] x mod 3 = 1 → [x := x' | x − 2 ≤ x' < x] fi .

It turns out that we get exactly the same strategy as for Player. This is not surprising, since nim is a symmetric game; the rules are the same for both contestants.
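As a small illustration (ours, not from the text), the improved strategy can be simulated directly in Python, resolving the nondeterministic third branch by a random legal move; since nim is symmetric, the same function serves both contestants.

import random

def improved_move(x):
    if x > 0 and x % 3 == 0:
        return x - 2
    if x % 3 == 2:
        return x - 1
    # x mod 3 = 1: no winning move exists; make an arbitrary legal move
    return x - random.choice([m for m in (1, 2) if m <= x])

def play(x):
    mover = "Player"
    while True:
        x = improved_move(x)
        if x == 0:                       # mover removed the last match and lost
            return "Opponent" if mover == "Player" else "Player"
        mover = "Opponent" if mover == "Player" else "Player"

random.seed(1)
for start in (9, 10, 12, 13):
    print(start, {play(start) for _ in range(200)})
    # start mod 3 /= 1: Player (who moves first) wins every game;
    # start mod 3 = 1: the symmetric Opponent strategy wins every game.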

25.5 Summary and Discussion

We have shown how to formalize two-person games as statements in the refinement calculus and how to use standard correctness-proof techniques, with invariants and termination functions, to prove the existence of a winning strategy in the game. Furthermore, we have shown how to use refinement techniques to derive an actual game-playing program from an original abstract description of the game.

The idea of using predicate-transformer semantics for describing two-person games was introduced by Back and von Wright in [32]. The correspondence between total correctness reasoning and winning strategy extends the applicability of the predicate transformer and refinement calculus framework quite far outside its original area of application.

25.6 Exercises

25.1 Prove the refinement equivalence

{x > 0 ∧ x mod 3 ≠ 1} ; [x := x' | x − 2 ≤ x' < x ∧ x' > 0 ∧ x' mod 3 = 1]
= if x > 0 ∧ x mod 3 = 0 → x := x − 2 [] x mod 3 = 2 → x := x − 1 [] x mod 3 = 1 → abort fi .

25.2 There is a dual version of nim in which the winner is the one who takes the last match. Describe this game using the statement notation (when Player can move first) and investigate under what precondition there is a winning strategy.

25.3 Tick-tack-toe (or noughts and crosses) on a 3 × 3 board can be represented using three sets A, B, and C of natural numbers 1, ..., 9. The squares on the board are numbered so that every line-of-three adds up to 15 (i.e., as in a magic square). The set A contains the numbers of the empty squares, B contains the numbers of the squares where Player has placed a marker, and C contains the numbers of the squares where Opponent has placed a marker. A move consists in removing one number from A and adding it to B (Player’s move) or C (Opponent’s move). The game ends when either A is empty (a draw) or when either B or C contains three numbers that add up to 15. Describe this game using statement syntax and outline a proof that if Player starts by placing a cross in the center square and Opponent then places one in a corner square, then Player has a winning strategy.

Part IV

Statement Subclasses

26. Statement Classes and Normal Forms

In Chapter 16, we showed how the homomorphism properties give rise to a number of subcategories of statements. In this chapter, we study the most interesting of these subcategories in more detail and show how they can be expressed with a subset of the statement constructors. We concentrate on universally conjunctive and conjunctive statements, on disjunctive statements, on (partial and total) functional statements, and on continuous statements.

For each of these subcategories, we also give a normal form for statements. We will later see that these normal forms correspond to specification statements in the different statement subclasses. Because they are normal forms, any requirement can be expressed as a specification statement of this kind. On the other hand, these normal forms contain a minimum of information about how the statement would actually be implemented.

26.1 Universally Conjunctive Predicate Transformers

The general structure of Ctran⊤, the universally conjunctive predicate transformers, is summarized in the following theorem, which we state without proof (the proof is straightforward).

Theorem 26.1 For arbitrary Σ and Γ, Ctran⊤(Σ, Γ) is a complete lattice. Meets in Ctran⊤(Σ, Γ) give the same result as in Mtran(Σ, Γ), but joins generally do not. The top element of Ctran⊤(Σ, Γ) is magic, and the bottom element is the arbitrary demonic update chaos = [True].

From Theorems 16.1 and 16.2 we see that guards, functional updates, and demonic updates are always universally conjunctive and that furthermore, sequential com- position and meets preserve universal conjunctivity. This means that if we restrict ourselves to these constructs, then we only generate predicate transformers that are universally conjunctive. Hence, any statement generated by the following syntax is universally conjunctive:

• S ::= [g] | ⟨f⟩ | [R] | S1 ; S2 | (⊓ i ∈ I • Si) .

Of the derived constructs, the deterministic conditional, ν-recursion, and weak iteration also preserve universal conjunctivity, so they can also be used in a syntax of universally conjunctive statements. As the normal form theorem below shows, any universally conjunctive predicate transformer that can be expressed in higher-order logic can be expressed as a universally conjunctive statement.
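As an illustration of this closure property (a sketch of ours, not part of the text), the basic constructs can be modeled over a small finite state space, with predicates as frozensets of states and predicate transformers as Python functions on predicates; universal conjunctivity can then be checked exhaustively.

from itertools import combinations

SIGMA = frozenset(range(4))                     # a tiny state space

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def guard(g):                                   # [g].q = (complement of g) union q
    return lambda q: (SIGMA - g) | q

def update(f):                                  # <f>.q = {s | f.s in q}
    return lambda q: frozenset(s for s in SIGMA if f(s) in q)

def demonic(R):                                 # [R].q = {s | R.s is a subset of q}
    return lambda q: frozenset(s for s in SIGMA if R(s) <= q)

def seq(S1, S2):                                # (S1 ; S2).q = S1.(S2.q)
    return lambda q: S1(S2(q))

def meet(S1, S2):                               # (S1 meet S2).q = S1.q intersect S2.q
    return lambda q: S1(q) & S2(q)

def universally_conjunctive(S):
    # On a finite space it suffices to check the empty meet (S.true = true)
    # and all binary meets.
    qs = powerset(SIGMA)
    return S(SIGMA) == SIGMA and \
           all(S(q1 & q2) == S(q1) & S(q2) for q1 in qs for q2 in qs)

S = meet(seq(guard(frozenset({1, 2, 3})), update(lambda s: (s + 1) % 4)),
         demonic(lambda s: frozenset({s, (s + 2) % 4})))
print(universally_conjunctive(S))               # True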

Normal Form The normal form theorem for Mtran (Theorem 13.10) showed that every monotonic predicate transformer can be written as a sequential composition of an angelic and  a demonic update. This result has a counterpart in Ctran :

Theorem 26.2 Let S be an arbitrary universally conjunctive predicate transformer. Then there exists a unique relation R such that S = [R] .

Proof We first note that for universally conjunctive S, the following holds for all predicates q and all states σ :

S.q.σ ≡ (∩ p | S.p.σ • p) ⊆ q .   (∗)

The proof of this is left to the reader as an exercise (Exercise 26.2). Now set R.σ = (∩ p | S.p.σ • p). Then

[R].q.σ
≡ {definition}
(∀γ • (∩ p | S.p.σ • p).γ ⇒ q.γ)
≡ {pointwise extension}
(∩ p | S.p.σ • p) ⊆ q
≡ {(∗) above}
S.q.σ

which shows that [R] = S. For uniqueness we have

[R] = [R']
≡ {antisymmetry}
[R] ⊑ [R'] ∧ [R'] ⊑ [R]
≡ {order embedding}
R ⊇ R' ∧ R ⊆ R'
≡ {antisymmetry}
R = R' □

Theorem 26.2 shows that if we want to prove some general property about uni- versally conjunctive predicate transformers, it is sufficient that we prove it for predicate transformers of the form [R], where R is an arbitrary relation. From the homomorphism properties of the demonic choice constructor, we see that the rela- tion space is in fact embedded inside the predicate transformer space as universally conjunctive predicate transformers.

26.2 Conjunctive Predicate Transformers

By adding the possibility of nontermination, we move from universally conjunc- tive predicate transformers to conjunctive ones. We shall now investigate the con- junctive predicate transformers Ctran, which form a framework for demonically nondeterministic statements that can be both miraculous and nonterminating.

Theorem 26.3 For arbitrary Σ and Γ, Ctran(Σ, Γ) is a complete lattice. Meets in Ctran(Σ, Γ) give the same result as in Mtran(Σ, Γ), but joins generally do not. The top element of Ctran(Σ, Γ) is magic, and the bottom element is abort. The assert, guard, functional update, and demonic update are conjunctive. Furthermore, sequential composition and meet preserve conjunctivity. Only angelic update and join break this property. The conjunctive basic statements are generated by the following syntax:

• S ::= {g} | [g] | ⟨f⟩ | [R] | S1 ; S2 | (⊓ i ∈ I • Si) .

Compared with the universally conjunctive statements, the only addition is the assert statement. Again, the normal form theorem below states that any conjunctive predicate transformer that can be expressed in higher-order logic can also be expressed as a conjunctive statement. Of the derived statements, conditionals, fixed points, iterations, and loops all preserve conjunctivity. This means that we can use a very broad range of predicate

transformer constructors and still stay within the realm of conjunctive predicate transformers.

Normal Form The normal form theorem for universally conjunctive predicate transformers can be generalized to Ctran:

Theorem 26.4 Let S be an arbitrary conjunctive predicate transformer. Then there exists a unique predicate p and a relation R such that

S ={p} ;[R] .

Proof Recall that the termination guard of S is t.S = S.true. It is easily seen that we must choose p = t.S:

S = {p} ; [R]
⇒ {congruence, definitions}
S.true = p ∩ [R].true
≡ {demonic update is terminating}
S.true = p ∩ true
≡ {predicate calculus, definition of t.S}
t.S = p

Then [t.S];S is universally conjunctive (Exercise 26.1), and we know from the proof of Theorem 26.2 that [t.S];S = [R] where R.σ = (∩ p | ([t.S];S).p.σ • p). Then

({t.S} ; [R]).q
= {definition}
({t.S} ; [t.S] ; S).q
= {definitions}
t.S ∩ (¬ t.S ∪ S.q)
= {general lattice property}
t.S ∩ S.q
= {definition of t.S}
S.true ∩ S.q
= {S monotonic}
S.q

and the proof is finished. □

Theorem 26.4 can in fact be strengthened somewhat; it is possible to prove that if we require that dom.R ⊆ p, then R is uniquely determined by S. Similarly, R is uniquely defined if we require that (∀σ γ • ¬ p.σ ⇒ R.σ.γ). In the first case, the condition means that the relation R must be undefined outside the domain of termination for S, while the second case implies that R must map any state outside the domain of termination for S to all possible final states. Theorem 26.4 shows that if we want to prove some general property about conjunctive predicate transformers, it is sufficient that we prove it for predicate transformers of the form {p} ; [R], where p and R are arbitrary. We showed earlier that normal forms like this can be taken as basic specification statements in a systematic program refinement technique.
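A small computational sketch (ours, assuming a finite state space; the helper names are hypothetical) illustrates Theorem 26.4: the precondition is recovered as t.S = S.true, the relation as R.σ = (∩ q | S.q.σ • q), and the pair reproduces S.

from itertools import combinations

SIGMA = frozenset(range(4))

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def S(q):
    # A conjunctive but not universally conjunctive example: abort on state 0,
    # otherwise demonically choose a successor in {s - 1, s}.
    return frozenset(s for s in SIGMA if s != 0 and frozenset({s - 1, s}) <= q)

p = S(SIGMA)                                    # the termination guard t.S

def R(s):                                       # R.s = intersection of {q | S.q.s}
    out = SIGMA
    for q in powerset(SIGMA):
        if s in S(q):
            out = out & q
    return out                                  # outside p this yields all of SIGMA

def normal_form(q):                             # ({p} ; [R]).q
    return frozenset(s for s in SIGMA if s in p and R(s) <= q)

assert all(S(q) == normal_form(q) for q in powerset(SIGMA))

With this construction, R maps every state outside p to all final states, which corresponds to the second normalization mentioned above.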

26.3 Disjunctive Predicate Transformers

Since disjunctivity is dual to conjunctivity, and universal disjunctivity is dual to universal conjunctivity, we can transform the above results immediately into corresponding results about disjunctive and universally disjunctive predicate transformers. Let us first consider Dtran⊥, the universally disjunctive predicate transformers. Duality and Theorem 26.1 then give the following result.

Corollary 26.5 For arbitrary Σ and Γ, Dtran⊥(Σ, Γ) is a complete lattice. Joins in Dtran⊥(Σ, Γ) give the same result as in Mtran(Σ, Γ), but meets generally do not. The bottom element of Dtran⊥(Σ, Γ) is abort, and the top element is the arbitrary angelic update choose = {True}.

The previous results show that assert, functional update, and angelic update are universally disjunctive, and that sequential composition and joins preserve universal disjunctivity. The basic universally disjunctive statements are thus defined by the following syntax:

• S ::= {g} | ⟨f⟩ | {R} | S1 ; S2 | (⊔ i ∈ I • Si) .

In addition to this, the deterministic conditional, (µ-)recursion, and while loops preserve universal disjunctivity, so they can also be used to construct universally disjunctive statements. The normal form theorem for universally disjunctive predicate transformers is dual to the one for universally conjunctive predicate transformers. It shows that the universally disjunctive statements are sufficient to express the universally disjunctive predicate transformers.

Corollary 26.6 Let S be an arbitrary universally disjunctive predicate transformer. Then there is a unique relation R such that S = {R} .

This shows that the angelic update is a normal form for universally disjunctive predicate transformers and that the relation space is embedded inside the predicate transformer space as universally disjunctive predicate transformers. Exactly as for conjunctivity, relaxing universal disjunctivity to plain disjunctivity means we get a slightly larger collection of predicate transformers, Dtran, which has properties dual to those of Ctran.

Corollary 26.7 For arbitrary Σ and Γ, Dtran(Σ, Γ) is a complete lattice. Joins in Dtran(Σ, Γ) give the same result as in Mtran(Σ, Γ), but meets generally do not. The bottom element of Dtran(Σ, Γ) is abort, and the top element is magic. Next, Theorem 26.4 immediately has the following dual:

Corollary 26.8 Let S be an arbitrary disjunctive predicate transformer. Then there exists a unique predicate p and a relation R such that

S = [p];{R} .

Thus, the form [p];{R} is a normal form for disjunctive predicate transformers. The disjunctive statements are defined in the obvious way, by adding guard statements to the syntax of universally disjunctive statements.

26.4 Functional Predicate Transformers

Recall that a predicate transformer that is both conjunctive and disjunctive is called functional (it could also be called deterministic, since the normal form theorems below show that if such a statement terminates normally when executed in some initial state, then it has a uniquely defined final state). We shall now briefly consider different kinds of functional predicate transformers.

Total Functional Predicate Transformers

Predicate transformers that are both universally conjunctive and universally disjunctive are called total functional. We call these TFtran = Ftran⊤⊥. The refinement ordering for TFtran reduces to equality.

Lemma 26.9 For arbitrary Σ and Γ, TFtran(Σ, Γ) is discretely ordered.

The fact that S ⊑ S' holds if and only if S = S' when S and S' are elements of TFtran is a direct consequence of the following normal form theorem. It justifies calling the predicate transformers in TFtran “total functional”.

Theorem 26.10 If the predicate transformer S is both universally conjunctive and universally disjunctive, then there is a unique state transformer f such that S = ⟨f⟩.

We do not prove this lemma separately; it is a consequence of Theorem 26.14 below. Note that Theorem 26.10 shows that the functional update is a normal form for total functional predicate transformers. Of the basic predicate transformer constructs, only functional update and sequential composition give us both universal conjunctivity and universal disjunctivity. In addition to this, conditional composition also preserves both universal conjunctivity and universal disjunctivity. Thus the following syntax generates a language of functional statements:

S ::= ⟨f⟩ | S1 ; S2 | if g then S1 else S2 fi .

Partial Functional Predicate Transformers

The collection TFtran is a very restricted class of predicate transformers. It permits no recursive or iterative construct. This is remedied by moving to a slightly larger class of predicate transformers. A predicate transformer that is conjunctive and universally disjunctive is called partial functional. We use the name PFtran for the collection of all partial functional predicate transformers (so PFtran = Ftran⊥).

Theorem 26.11 For arbitrary Σ and Γ, PFtran(Σ, Γ) is a complete meet semilattice with bottom element abort.

Exactly as for the classes of predicate transformers treated earlier, we can give a normal form for the partial functional case.

Theorem 26.12 If the predicate transformer S is conjunctive and universally disjunctive, then there is a unique predicate p and a state function f such that S = {p} ; ⟨f⟩.

We do not prove this theorem separately, since it is a consequence of Theorem 26.14 below. Theorem 26.12 shows that {p} ; ⟨f⟩ is a normal form for partial functional predicate transformers. From the homomorphism properties we see that partial functional predicate transformers are generated by asserts, functional updates, and sequential composition. In addition to this, deterministic conditionals and while loops also preserve conjunctivity and universal disjunctivity and can thus be used to build partial functional statements. Thus a language of partial functional statements is generated by the following syntax:

S ::= {g} | ⟨f⟩ | S1 ; S2 | if g then S1 else S2 fi | while g do S od .

Functional Predicate Transformers

We now consider Ftran, the collection of all conjunctive and disjunctive predicate transformers. As noted above, such predicate transformers are called functional.

Theorem 26.13 For arbitrary Σ and Γ, Ftran(Σ, Γ) is a complete lattice, but neither meets nor joins generally give the same result as in Mtran. The bottom element of Ftran(Σ, Γ) is abort, and the top element is magic.

The normal form for Ftran uses asserts, guards, and functional updates. This justifies calling these predicate transformers functional; they can be expressed without using nondeterministic constructs such as relational updates, meets, and joins. Theorem 26.14 Assume that S is a functional predicate transformer. Then there are predicates p and q and a state transformer f such that

S = [q] ; {p} ; ⟨f⟩ .

Proof Assume that S is both conjunctive and disjunctive. By earlier normal form theorems, we know that S can be written in the form {p} ; [P] but also in the form [q] ; {Q}, for predicates p and q and relations P and Q such that

(∀σ γ • Q.σ.γ ⇒ q.σ)   (∗)

(i.e., dom.Q ⊆ q). The equality {p} ; [P] = [q] ; {Q} can be written as follows:

(∀r σ • p.σ ∧ P.σ ⊆ r ≡ ¬ q.σ ∨ (Q.σ ∩ r ≠ ∅)) .   (∗∗)

We first show that Q must be deterministic:

Q.σ.γ ∧ Q.σ.γ'
⇒ {assumption (∗)}
q.σ ∧ Q.σ.γ ∧ Q.σ.γ'
⇒ {assumption (∗∗)}
q.σ ∧ p.σ ∧ P.σ ⊆ {γ} ∧ P.σ ⊆ {γ'}
⇒ {set theory}
q.σ ∧ p.σ ∧ (P.σ = ∅ ∨ γ = γ')
⇒ {q.σ ∧ p.σ ∧ P.σ = ∅ would contradict (∗∗) with r := ∅}
q.σ ∧ p.σ ∧ γ = γ'

One can now show that the determinism of Q implies that {Q} = {λσ • Q.σ ≠ ∅} ; ⟨λσ • εγ • Q.σ.γ⟩ (Exercise 26.5). Thus S can be rewritten in the desired form. □

It should be noted that the normal form in Theorem 26.14 is not the only possible one. We could as well use {p} ; [q] ; ⟨f⟩ as the normal form (this follows easily from the general rule for reversing the order of a guard and an assert: [q] ; {p} = {¬ q ∪ p} ; [q]). Note also that in the normal form [q] ; {p} ; ⟨f⟩, the predicate q is unique, q = ¬ S.false, while p and f are not. However, they are restricted so that S.true ∩ ¬ S.false ⊆ p ⊆ S.true and so that f is uniquely defined on S.true ∩ ¬ S.false. This completes our investigation of the statement subclasses characterized by the four basic lattice-theoretic homomorphism properties.

26.5 Continuous Predicate Transformers

Let us finally consider the subclass of monotonic predicate transformers that is formed by the continuity restriction. Continuity is also a kind of homomorphism property, as it requires that limits of directed sets be preserved by the predicate transformer.

We let Σ ↦c Γ as before stand for the continuous predicate transformers in Σ ↦ Γ. Let CNtranX denote the category with types in X as objects and the continuous predicate transformers as morphisms, i.e., CNtran(Σ, Γ) = Σ ↦c Γ. Basic properties of CNtran are collected in the following theorem:

Theorem 26.15 CNtran is a category; it is a subcategory of Mtran. For each pair of state spaces Σ and Γ, CNtran(Σ, Γ) is a complete lattice, but meets in CNtran(Σ, Γ) generally do not give the same results as in Mtran(Σ, Γ). CNtran(Σ, Γ) has bottom element abort and top element magic.

The normal form theorem for monotonic predicate transformers has a counterpart in CNtran, in the situation where the underlying state spaces are countable.

Theorem 26.16 Let S : Σ ↦ Γ be an arbitrary continuous predicate transformer, where Σ and Γ are countable state spaces. Then there exists a relation P and a total and image-finite relation Q such that S = {P} ; [Q].

Proof The proof rests on the fact that a continuous function f : P(A) → P(B) (where A and B are countable) is uniquely defined by its behavior on the finite subsets of A (the proof of this fact is beyond the scope of this book). Assume that a continuous S ∈ Ptran(Σ, Γ) is given. The construction is then almost as in the normal form theorem for monotonic predicate transformers (Theorem 13.10). However, now we consider an intermediate state space consisting of all finite nonempty subsets of Γ (to be really precise, we would have to construct a new type that is isomorphic to this set of subsets, but that is a complication without real significance). One can then define P and Q exactly as in the proof of Theorem 13.10 and show that S = {P} ; [Q] (see also the proof of Theorem 26.18 below). □

Thus {P} ; [Q], where Q is total and image-finite, is a normal form for continuous predicate transformers. Furthermore, for S : Σ ↦ Γ with Γ countably infinite it can be shown that both P and Q can be chosen to be relations over Γ (this is because the finite subsets of a countably infinite set can be mapped bijectively into the set itself).

Combining Continuity and Conjunctivity In addition to the continuous predicate transformers, we can consider subalgebras that satisfy a combination of continuity and some other homomorphism property. 436 26. Statement Classes and Normal Forms

A combination that is particularly useful is that of conjunctivity and continuity. We let CCNtran stand for the collection of all conjunctive and continuous predicate transformers.

Lemma 26.17 CCNtran is a category; it is a subcategory of Mtran. For each pair of state spaces Σ and Γ, CCNtran(Σ, Γ) is a meet semilattice, and it has bottom element abort.

In fact, any continuous and conjunctive predicate transformer can be written as a sequential composition of an assert and a continuous demonic update:

Theorem 26.18 Let S : Σ ↦ Γ be an arbitrary continuous conjunctive predicate transformer, where Σ and Γ are countable state spaces. Then there exists a unique predicate p and a total and image-finite relation R such that S = {p} ; [R].

Proof Assume that predicate transformer S is continuous and conjunctive. The construction is almost as in the normal form theorem for conjunctive predicate transformers (Theorem 26.4). We set p = t.S and

R.σ = (∩ q | S.q.σ • q)   if t.S.σ ,
R.σ = {εγ • T}            otherwise

(the set {εγ • T} is a singleton containing some unspecified state γ ∈ Γ). The same argument as in the proof of Theorem 26.4 now shows that S = {t.S} ; [R]. Thus, we have only to show that R.σ is nonempty and finite. This is obviously true if ¬ t.S.σ, since R.σ in that case is a singleton. Now assume t.S.σ. Then S.true.σ ≡ T, so the set {p | S.p.σ} is nonempty. Furthermore, we get

S.(R.σ).σ
≡ {S conjunctive, definition of R}
(∀q | S.q.σ • S.q.σ)
≡ {trivial quantification}
T

Since continuity implies S.false = false, this shows that R.σ must be nonempty. Finally, we prove that R.σ is finite by contradiction. Let C be the collection of all finite subsets (predicates) of R.σ. Note that C, ordered by set inclusion, is a directed set. Assume that R.σ is infinite. Then

(lim q ∈ C • S.q).σ
≡ {pointwise extension}
(∃q ∈ C • S.q.σ)
≡ {R.σ infinite implies S.q.σ ≡ F for all q ∈ C}
F

However, we have

S.(lim q ∈ C • q).σ
≡ {the limit of C is R.σ, because of countability}
S.(R.σ).σ
≡ {above}
T

which contradicts continuity of S. Thus R.σ must be finite. □

This theorem shows that {p} ;[R] (where R is total and image-finite) is a normal form for conjunctive continuous predicate transformers.

26.6 Homomorphic Choice Semantics

The choice semantics for statements that we described earlier showed how predicate transformers can be identified with generalized state transformers that map initial states to sets of postconditions. We shall now consider how the homomorphism properties of predicate transformers translate into properties of state transform- ers. Each homomorphism property of a basic predicate transformer construct can be translated into a property of the corresponding generalized state transformer. Thus to every subclass of statements corresponds a subclass of generalized state transformers. Our aim here is to show how traditional denotational semantics of subclasses of statements can be extracted from the choice semantics in a simple way, using the state transformer characterizations of homomorphism properties. In this way, we get a unification of programming logics, including both backward predicate transformer semantics and forward state transformer and relational semantics, and in which the more traditional semantics can be identified as special cases.

Homomorphism Properties

Recall the function that maps any predicate transformer S : Σ ↦ Γ to a generalized state transformer Ŝ : Σ → P(P(Γ)), defined by

Ŝ.σ = {q | S.q.σ} .

Thus Ŝ maps initial state σ to the set of all postconditions that S is guaranteed to establish. We shall now consider how homomorphism properties translate into generalized state transformer equivalents. We have already noted in Chapter 14 that monotonicity of S corresponds to upward closedness of Ŝ. We first consider strictness and termination.

Theorem 26.19 Let S be an arbitrary statement. Then

(a) S strict ≡ (∀σ • ∅ ∉ Ŝ.σ) .

(b) S terminating ≡ (∀σ • Ŝ.σ ≠ ∅) .

Proof We show the proof of part (b), which is the more difficult part. Let S be arbitrary. We have

S terminating
≡ {definition of termination}
(∀σ • S.true.σ)
≡ {definition of Ŝ}
(∀σ • true ∈ Ŝ.σ)
≡ {⇒ is trivial; ⇐ since Ŝ.σ is upward closed}
(∀σ • Ŝ.σ ≠ ∅)

and the proof is finished. □

Note how the three basic cases for initial state σ (abortion, normal termination, miracle) are reflected in the choice semantics. Statement S aborts if Ŝ.σ = ∅, it is miraculous if ∅ ∈ Ŝ.σ, and it terminates normally otherwise. We next show that conjunctivity corresponds to intersection closedness. Duality then gives a characterization of disjunctivity.
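These three cases can be read off directly from a finite model; the following Python sketch (ours, with an ad hoc example transformer) computes Ŝ.σ and classifies each initial state.

from itertools import combinations

SIGMA = frozenset(range(3))

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def S(q):
    # Example: abort on 0, behave miraculously on 2, move 1 to 1 or 2 demonically.
    return frozenset(s for s in SIGMA
                     if s == 2 or (s == 1 and frozenset({1, 2}) <= q))

def S_hat(s):
    return [q for q in powerset(SIGMA) if s in S(q)]

for s in sorted(SIGMA):
    post = S_hat(s)
    if not post:
        print(s, "aborts")
    elif frozenset() in post:
        print(s, "is miraculous")
    else:
        print(s, "terminates normally")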

Theorem 26.20 Predicate transformer S is conjunctive if and only if

(∀i ∈ I • pi ∈ Ŝ.σ) ≡ (∩ i ∈ I • pi) ∈ Ŝ.σ

holds for an arbitrary nonempty collection {pi | i ∈ I } of predicates and for all states σ. Dually, S is disjunctive if and only if

(∃i ∈ I • pi ∈ Ŝ.σ) ≡ (∪ i ∈ I • pi) ∈ Ŝ.σ

holds for an arbitrary nonempty collection {pi | i ∈ I } of predicates and for all states σ. Proof First assume that S is conjunctive and let the state σ and a nonempty collection {pi | i ∈ I } of predicates be given. Then S conjunctive, I =∅

•  (∩ i ∈ I pi ) ∈ S.σ ≡{definition of S}

• S.(∩ i ∈ I pi ).σ ≡{S conjunctive, pointwise extension}

• (∀i ∈ I S.pi .σ ) 26.6. Homomorphic Choice Semantics 439

≡{definition of S}

• (∀i ∈ I pi ∈ S.σ )

• • Now assume that (∩ i ∈ I pi ) ∈ S.σ ≡ (∀i ∈ I pi ∈ S.σ ) holds for an arbitrary state σ and an arbitrary nonempty collection {pi | i ∈ I } of predicates. Then, for I =∅,

• S.(∩ i ∈ I pi ).σ ≡{definition of S}

• (∩ i ∈ I pi ) ∈ S.σ ≡{assumption}

• (∀i ∈ I pi ∈ S.σ ) ≡{definition of S}

• (∀i ∈ I S.pi .σ ) ≡{pointwise extension}

• (∩ i ∈ I S.pi ).σ so S is conjunctive. The proof for disjunctivity is dual. 2

Choice Semantics for Universally Conjunctive Statements The choice semantics maps statement S into the rich structure Stran(, ).We will now show that for subclasses of Mtran it is possible to map S to an alternative S so that intuition behind the choice semantics is preserved but the structure that S is mapped into is simpler.

If S ∈ Mtran(Σ, Γ) is universally conjunctive, then for an arbitrary state σ the set (∩ p | S.p.σ • p) is the unique smallest element in Ŝ.σ (Theorem 26.19). Furthermore, upward closure guarantees that the intersection is nonempty, and Theorem 26.20 guarantees that it is an element of Ŝ.σ. Thus Ŝ.σ is completely characterized by this element. For a universally conjunctive statement S we define

S̃.σ ≜ (∩ p | S.p.σ • p) .

This makes S̃ a function in Σ → P(Γ). Note that this interprets universally conjunctive statements directly as relations; setting S = [R] we immediately have

This makes S a function in  → P(). Note that this interprets universally conjunctive statements directly as relations; setting S = [R] we immediately have

γ ∈ S̃.σ ≡ R.σ.γ .

Exactly as for the general choice semantics, we have operators corresponding to the different operators on predicate transformers. For example, demonic choice on Ctran⊤ corresponds to pointwise union, while refinement corresponds to pointwise reverse subset inclusion:

FIGURE 26.1. Semantics for universally conjunctive statements (a commuting square relating Mtran, Stran, Ctran⊤, and Σ → P(Γ) via the functions inj and cr).

(S1 ⊓ S2)˜.σ = S̃1.σ ∪ S̃2.σ ,
S ⊑ S' ≡ (∀σ • S̃'.σ ⊆ S̃.σ) .

The relationship between the semantic functions and the four domains involved (Mtran, Ctran⊤, Stran, and Σ → P(Γ)) can be described nicely if we first define the function cr that recovers Ŝ from S̃:

cr.T.σ = {q | T.σ ⊆ q}

for T : Σ → P(Γ), so that cr.(S̃) = Ŝ. The commuting diagram in Figure 26.1 then shows the situation, where inj is the injection function that maps any S ∈ Ctran⊤ to the same predicate transformer in Mtran.

The functions involved are also homomorphic. For example, for arbitrary S1 and S2 in Ctran⊤ we have

cr.(λσ • S̃1.σ ∪ S̃2.σ) = (λσ • (inj.S1)ˆ.σ ∩ (inj.S2)ˆ.σ)

(recall that (S1 ⊓ S2)ˆ = (λσ • Ŝ1.σ ∩ Ŝ2.σ)). Similar results can be shown for other operators; this is pursued in Exercise 26.7.
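A minimal Python sketch (ours, not from the text) of this simplified semantics: a universally conjunctive statement is just a map from initial states to sets of final states, demonic choice is pointwise union, and refinement is pointwise reverse inclusion.

S1 = {0: {1, 2}, 1: {1}, 2: {3}, 3: {0}}        # corresponds to [R1]
S2 = {0: {1}, 1: {1, 3}, 2: {3}, 3: {0, 1}}     # corresponds to [R2]

def meet(T, U):                                 # demonic choice: pointwise union
    return {s: T[s] | U[s] for s in T}

def refined_by(T, U):                           # T refined by U: pointwise reverse inclusion
    return all(U[s] <= T[s] for s in T)

M = meet(S1, S2)
print(refined_by(M, S1), refined_by(M, S2))     # True True: the meet is a lower bound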

Conjunctivity

If S is conjunctive but possibly nonterminating, we have only to define S̃ slightly differently, compared with the universally conjunctive case. Now S̃ is a function in Σ → {⊥} + P(Γ) (thus we have added an artificial bottom element ⊥ to the state space), and

S̃.σ ≜ ⊥ if Ŝ.σ = ∅, and S̃.σ ≜ (∩ p | S.p.σ • p) otherwise.

In fact, S̃ then gives the relational interpretation often used to justify weakest precondition semantics. Again, Ŝ can immediately be recovered from S̃, and the refinement relation has the following characterization:

S ⊑ S' ≡ (∀σ • S̃.σ = ⊥ ∨ S̃'.σ ⊆ S̃.σ) .

This is the Smyth powerdomain ordering on {⊥} + P(Γ). Exercise 26.8 elaborates further on the alternative choice semantics for conjunctive statements.

Disjunctivity

For disjunctive predicate transformers, the results are dual to those for the conjunctive case, so we outline only the universally disjunctive case here. From Theorems 26.20 and 26.19 it follows that Ŝ.σ is uniquely determined by its singleton elements, so we define S̃.σ to contain all members of singleton elements of Ŝ.σ:

S̃.σ ≜ {σ' | {σ'} ∈ Ŝ.σ} .

This makes S̃ a function in Σ → P(Γ). Nontermination here corresponds to S̃.σ being empty, and refinement corresponds to pointwise subset inclusion:

S ⊑ S' ≡ (∀σ • S̃.σ ⊆ S̃'.σ) .

Determinism

Determinism means that the final state of a normally terminating computation is always uniquely determined for a given initial state. For deterministic predicate transformers, we thus want to simplify the state transformer interpretation so that S̃.σ is the final state that is reached when S is executed in initial state σ. We first consider the special case S ∈ TFtran, when predicate transformer S is deterministic (i.e., both conjunctive and disjunctive) and also terminating and nonmiraculous. In this case, Theorem 26.10 shows that S = ⟨f⟩ for the state function f defined by

f.σ = (εγ • (∀p | S.p.σ • p.γ)) ,

and from Chapter 14 we know that

Ŝ.σ = {p | f.σ ∈ p} .

In other words, Ŝ.σ contains all those predicates that contain f.σ. Thus we lose no information if we give the simpler semantics

S̃ ≜ f

that identifies S with a function S̃ : Σ → Γ. Since TFtran is discretely ordered, refinement reduces to equality.

If S is partial functional, i.e., conjunctive, disjunctive, and strict, it follows from the above that given a state σ, either there is a unique state σ' such that Ŝ.σ = {q | σ' ∈ q}, or Ŝ.σ = ∅. Using the additional element ⊥ (standing for abortion), we can define a simpler state transformer semantics without losing information:

S̃.σ ≜ ⊥ if Ŝ.σ = ∅, and S̃.σ ≜ (εσ' • Ŝ.σ = {q | σ' ∈ q}) otherwise.

This definition makes S̃ a function in Σ → {⊥} + Γ. In this way, we recover traditional denotational semantics for partial functions, where ⊥ is the undefined state. In order to characterize refinement, we recall that the default ordering on a sum A + B is defined such that the orderings inside A and B are preserved but all elements of A are smaller than all elements of B. Refinement then corresponds to the pointwise extension of this ordering on {⊥} + Γ:

S ⊑ S' ≡ (∀σ • S̃.σ = ⊥ ∨ S̃.σ = S̃'.σ) .

The case when S is deterministic (i.e., conjunctive and disjunctive) is handled in a similar way, but the range of S̃ is {⊥} + Γ + {⊤}, where ⊤ stands for miraculous termination.

Finite Conjunctivity and Disjunctivity

In predicate transformer semantics, it is often assumed that (executable, implementable) programs are represented by predicate transformers that are finitely conjunctive and nonmiraculous. Recall that S is finitely conjunctive if S.(p ∩ q) = S.p ∩ S.q for arbitrary predicates p and q. Finite conjunctivity is slightly weaker than positive conjunctivity, but all examples of predicate transformers that are finitely conjunctive but not conjunctive that we have found are pathological and far removed from ordinary program specifications (see Exercise 26.9). It is interesting to note that S is a finitely conjunctive, strict statement if and only if for all states σ the following three conditions hold:

(a) ∅ ∉ Ŝ.σ ,
(b) p ∈ Ŝ.σ ∧ q ∈ Ŝ.σ ⇒ p ∩ q ∈ Ŝ.σ , and
(c) p ∈ Ŝ.σ ∧ p ⊆ q ⇒ q ∈ Ŝ.σ ;

i.e., Ŝ.σ is either empty or (in topological terminology) a filter. Furthermore, if we strengthen the conjunctivity condition, requiring S to be universally conjunctive and strict, then Ŝ.σ is a principal filter, i.e., a filter generated by a specific set of states.

26.7 Summary and Discussion

In this chapter we have studied the main subclasses of monotonic predicate transformers, identified by the homomorphism properties they satisfy. For each subclass, we have identified the normal form for predicate transformers in this class and also identified the collection of basic statement constructs that generate this subclass. Subclasses of predicate transformers have been investigated by Back and von Wright [27, 29]. Distributivity and homomorphism properties of conjunctive predicate transformers are the basis for Hesselink’s command algebras [80, 81]. Universally conjunctive predicate transformers are not very expressive for reasoning about programs, because they cannot express nontermination. However, it should be noted that if a language (with demonic nondeterminism) is given a weakest liberal precondition semantics, then its statements denote universally conjunctive predicate transformers. Thus Ctran⊤ is suitable for reasoning about partial correctness of demonically nondeterministic programs. The special cases of the general choice semantics described in this chapter yield semantics that have been used earlier for various programming languages and paradigms. The choice semantics for universally conjunctive statements corresponds to traditional relational semantics. For conjunctive statements, we have a version of the Smyth powerdomain ordering without the restriction to continuity [130]. These correspondences were noted already in [9]. The choice semantics for deterministic statements (with artificial bottom and top elements added to the state space) correspond to the ideas of Scott [126] that treat data types as lattices. Later mainstream denotational semantics worked with cpos and omitted the top element; see, e.g., [125]. This corresponds to our semantics for partial functional statements. The way in which we have identified statement subclasses should be seen as an alternative to the traditional approach in programming logics, where a particularly interesting class of program statements is singled out, and characterized by its properties. Statements in this class are given a syntax and a semantic meaning, and properties of statements are then studied within this framework. This approach is actually quite restrictive. First of all, it introduces an unnecessary element into the discussion, the choice of statement syntax. This is an area where people have strong preferences, and it usually only confuses the issue. Secondly, the analysis is then locked into the fixed framework, and comparison with alternative statement classes becomes difficult and can be carried out only on a semantic level. Our formalization of programming languages in higher-order logic avoids these problems, because we stay within a single very general framework, that of higher-order logic. We do not introduce a specific programming language syntax for our statement subclasses, but concentrate instead on the collection of operations that generate this subclass. This allows us to freely mix components from different

statement classes and derive properties for such new mixtures. Of course, our framework is also restricted, in the sense that we interpret our programs as predicate transformers, so we are essentially considering only the input–output behavior of programs.

26.8 Exercises

26.1 Show that if S is conjunctive, then [t.S] ; S is universally conjunctive.

26.2 Show that if S is universally conjunctive, then the following holds for all predicates q and all states σ:

S.q.σ ≡ (∩ p | S.p.σ • p) ⊆ q .

Hint: First show that (∧ x | B.x • B.x) = T for arbitrary B.

26.3 Show that the demonic update chaos = [True] is the bottom element of Ctran⊤, i.e., the least universally conjunctive predicate transformer.

26.4 Prove Theorem 26.11.

26.5 Show that if the relation Q is deterministic, then

{Q} = {λσ • Q.σ ≠ ∅} ; ⟨λσ • εγ • Q.σ.γ⟩ .

26.6 In this chapter, we defined the various homomorphic choice semantics by applying a suitable transformation to the original choice semantics. Another possibility would be to define the homomorphic choice semantics recursively over the syntax for the corresponding subclass of contract statements and then show that the two choice semantics are related as stated in the text. For the universally conjunctive case we define

ch′.[R].σ ≡ R.σ ,
ch′.(S1 ; S2).σ ≡ im.(ch′.S2).(ch′.S1.σ) ,
ch′.(⊓ i ∈ I • Si).σ ≡ (∪ i ∈ I • ch′.Si.σ) , where I ≠ ∅ .

Show that with this definition,

ch.S.σ ≡ {q | ch′.S.σ ⊆ q}

for S ∈ Ctran⊤.

26.7 Recall the definitions of S̃ and cr.S̃ for universally conjunctive S given in the text. Prove that the diagram in Figure 26.1 commutes and then prove the following homomorphism property for demonic choice:

cr.(λσ • S̃1.σ ∪ S̃2.σ) = (λσ • (inj.S1)ˆ.σ ∩ (inj.S2)ˆ.σ) .

Then formulate the corresponding homomorphism property for sequential composition and prove it.

26.8 For the conjunctive case we define the function ch′ recursively over the syntax of a subset of contract statements as follows:

ch′.{p}.σ = {σ} if p.σ, and ch′.{p}.σ = ⊥ otherwise;
ch′.[R].σ = R.σ ;
ch′.(S1 ; S2).σ = ⊥ if ch′.S1.σ = ⊥ or if ch′.S1.σ ≠ ⊥ ∧ (∃γ ∈ ch′.S1.σ • ch′.S2.γ = ⊥), and ch′.(S1 ; S2).σ = im.(ch′.S2).(ch′.S1.σ) otherwise;
ch′.(⊓ i ∈ I • Si).σ = ⊥ if (∃i ∈ I • ch′.Si.σ = ⊥), and ch′.(⊓ i ∈ I • Si).σ = (∪ i ∈ I • ch′.Si.σ) otherwise

(when I ≠ ∅). Show that this semantics is related to the original choice semantics as follows: ch′.S.σ = ⊥ if ch.S.σ = ∅, and ch′.S.σ = ∩ ch.S.σ otherwise.

26.9 A set is said to be cofinite if its complement is finite. Since predicates are sets of states, it makes sense to talk about cofinite predicates also. Let S be the predicate transformer in Mtran(Nat, Nat) defined as follows: S.q.σ ≡ q is cofinite. Show that S is finitely conjunctive, i.e., that S.(p ∩ q) = S.p ∩ S.q for arbitrary predicates p and q. S is not conjunctive, however. Show this by exhibiting a set of predicates {qi | i ∈ Nat} such that S.(∩ i ∈ Nat • qi) ≠ (∩ i ∈ Nat • S.qi). Hint: a set of natural numbers A is cofinite if and only if there is n ∈ Nat such that A contains all numbers that are greater than or equal to n.

27. Specification Statements

In this chapter we describe in more detail how to model program specifications as program statements. We give an exact characterization of what a specification statement is, and we study refinement of specification statements by other specifica- tion statements. We also look at methods for combining two or more (conjunctive) specifications into a single specification that retains the same information.

A stepwise refinement of a specification statement embodies some design deci- sion as to how the specification is to be implemented. The term top-down program development is often used to describe this classical technique for program develop- ment. One starts with an abstract specification statement that describes the intended behavior of the program and refines this step by step. At each step, a specification statement is replaced with some compound statement, possibly containing new specification statements as subcomponents.

We concentrate on refinement of conjunctive statements. We give the basic rules for refining conjunctive specification statements by a sequential composition or meet of other specification statements, and illustrate the top-down development method with a small example. We choose the Guarded commands language by Dijkstra to represent the programming language that we are programming into. We describe this language first, in particular to determine its homomorphism properties, which can then be used in program derivations.

We finally consider the problem of refining a specification statement by an arbitrary statement. We consider two kinds of specification statements: conjunctive specifications and universally disjunctive specifications. In both cases, we show how to reduce a refinement to a correctness assertion, which then can be proved using correctness rules from earlier chapters.

27.1 Specifications

In preceding chapters we have shown that subclasses of statements can be char- acterized by normal forms for statements. We shall now show that these normal forms really correspond to specifications of program behaviors. By a specification we mean a statement that indicates in a simple way what is to be computed without saying how it is computed. For example, the statement

x := factorial.x

specifies that the output value of x should be the factorial of the input value. If the programming language used has an explicit factorial operator, then this would also be an executable statement. By a sequence of refinement steps (a derivation), we can refine this to an implementation that, e.g., uses only multiplications and additions. We will not address here the question of what operations are implemented, as this differs from programming language to programming language. A programming language implementation of a statement is a refinement of the statement that is directly expressible in the chosen programming language and hence executable on the chosen computer.

We have shown in Chapter 13 that every monotonic predicate transformer term can be written in statement form {P} ;[Q]. We consider a predicate transformer of this form to be a general monotonic specification. Intuitively, such a specifica- tion describes a two-step game in which the angel chooses an intermediate value according to relation P, after which the demon chooses a final value according to Q. In a similar way, we also consider the other normal forms from Chapter 26 to be specifications.

Conjunctive Specifications A conjunctive specification is a predicate transformer of the form {p} ;[Q]. This is a normal form for conjunctive predicate transformers; i.e., every conjunctive predicate transformer term can be written in this form. Furthermore, it describes a computation in a concise form: the relation Q shows how the final state is related to the initial state, while the predicate p (the precondition) characterizes those initial states that we are interested in. An implementation of the specification {p} ;[Q] must behave as permitted by Q whenever p holds in the initial state. Arbitrary behavior (even nontermination) is acceptable in states where p does not hold.

In practice, we usually write the relation Q as a relational assignment. For example,

{x > 2} ; [y := y' | 1 < y' < x ∧ (∃a • a · y' = x)]

is a specification for a program that stores some proper factor of x in y if x is initially greater than 2. 27.2. Refining Specifications by Specifications 449

If the precondition is the predicate true, then the specification is in fact a universally conjunctive specification. We do not consider universally conjunctive specifications separately; we simply treat them as special cases of conjunctive ones. An example (with x, y : Nat) is

[y := y' | y'² ≤ x < (y' + 1)²] ,

which assigns to y the (natural number) square root of x. Note that even though the square root is uniquely defined for arbitrary x, it would be much more complicated to specify it as a function of x. This is a strong argument in favor of conjunctive specifications even when specifying a deterministic computation: we can specify by giving constraints rather than by giving an explicit function.

Other Specifications

Dually to the conjunctive specification, the statement [p] ; {Q} is a disjunctive specification and {Q} is a universally disjunctive specification. Similarly, {p} ; ⟨f⟩ is a partial function specification. An example of a universally disjunctive specification is

{x, e := x', e' | 0 < e' < x'} .

This specifies setting x and e to any (angelically chosen) nonnegative values with 0 < e < x. An example of a partial functional specification is

{x > 0} ; z := y/x ,

which specifies that z is assigned the result of the division y/x, with precondition x > 0. If x ≤ 0, then any behavior, even nontermination, is acceptable. The partial functional specification is easily seen to be a special case of the conjunctive specification, since any functional update can be rewritten as a demonic update. Even though the partial functional specification predicate transformer is sufficient to specify arbitrary partial functions, the natural number square root example above shows that it can be more convenient to specify a function using the more general conjunctive specifications.

27.2 Refining Specifications by Specifications

Specifications are canonical ways of expressing statements. It is therefore useful to know when one specification refines another one. Refinement between specifi- cations corresponds to refining the requirements of the program statement to be built, without going into the details of how the statement is to be constructed in terms of sequential composition, conditionals, and loops. 450 27. Specification Statements

Refinement Between Similar Specifications

The rules for refinement between similar specifications generalize the homomorphism properties given earlier. We consider seven basic forms of specifications: monotonic {P} ; [Q], conjunctive {p} ; [Q], universally conjunctive [Q], disjunctive [p] ; {Q}, universally disjunctive {Q}, partial functional {p} ; ⟨f⟩, and total functional ⟨f⟩.

Theorem 27.1 The following rules hold for arbitrary predicates p, p'; arbitrary state transformers f, f'; and arbitrary relations P, P', Q, and Q'.

(a) {P} ; [Q] ⊑ {P'} ; [Q'] ⇐ P ⊆ P' ∧ Q' ⊆ Q ,
(b) {p} ; [Q] ⊑ {p'} ; [Q'] ≡ p ⊆ p' ∧ |p| ; Q' ⊆ Q ,
(c) [Q] ⊑ [Q'] ≡ Q' ⊆ Q ,
(d) [p] ; {Q} ⊑ [p'] ; {Q'} ≡ p' ⊆ p ∧ |p'| ; Q ⊆ Q' ,
(e) {Q} ⊑ {Q'} ≡ Q ⊆ Q' ,
(f) {p} ; ⟨f⟩ ⊑ {p'} ; ⟨f'⟩ ≡ p ⊆ p' ∧ |p| ; |f'| ⊆ |f| ,
(g) ⟨f⟩ ⊑ ⟨f'⟩ ≡ f = f' .

Proof We first note that (a) holds by homomorphism and monotonicity properties. To prove (b) we have

{p} ; [Q] ⊑ {p'} ; [Q']
≡ {definitions}
(∀σ r • p.σ ∧ Q.σ ⊆ r ⇒ p'.σ ∧ Q'.σ ⊆ r)
≡ {⇒ by specialization (r := Q.σ), ⇐ by transitivity of ⊆}
(∀σ • p.σ ⇒ p'.σ ∧ Q'.σ ⊆ Q.σ)
≡ {calculus, definition}
(∀σ • p.σ ⇒ p'.σ) ∧ (∀σ σ' • p.σ ∧ Q'.σ.σ' ⇒ Q.σ.σ')
≡ {definitions}
p ⊆ p' ∧ |p| ; Q' ⊆ Q

This proves (b). Now (c) follows as a special case of (b) when p and p' are both true. The rule for disjunctive specifications (d) follows from (b) by dual reasoning (Exercise 27.1). Finally, (e) is a special case of (d), (f) is a special case of (b), and (g) is a special case of (f). □

Note that rules (b)–(g) in Theorem 27.1 are all equivalences, so they can be used for refinements in both directions. All the rules in Theorem 27.1 give conditions for refinement in terms of predicates, functions, and relations. This is important; it allows us to reduce refinement of statements to properties on lower levels in the predicate transformer hierarchy: to properties of functions, relations, and predicates. When the rules are applied to a concrete specification, the conditions can

easily be reduced further to first-order formulas. For example, the second conjunct on the right-hand side in (f) can be rewritten as

(∀σ • p.σ ⇒ f.σ = f'.σ) .

Theorem 27.1 may seem like an unnecessarily long list. In fact, as noted in the proof, three of the cases (c, e, and g) are direct special cases of the others. From now on, we will not consider these specifications (universally conjunctive, universally disjunctive, and total functional) separately.

Refinement Between Different Specifications

In many cases, it is possible to refine a specification of one form into a specification of another form. We concentrate on those situations in which a specification is refined by a more restricted kind of specification (where {P} ; [Q] is the most general and ⟨f⟩ the most restricted form). Furthermore, we concentrate on four basic forms of specifications: general, conjunctive, disjunctive, and partial functional.

Theorem 27.2 The following rules hold for arbitrary predicates p, p'; an arbitrary state transformer f'; and arbitrary relations P, Q, and Q'.

(a) {P} ; [Q] ⊑ {p'} ; [Q'] ≡ dom.P ⊆ p' ∧ P⁻¹ ; Q' ⊆ Q ,
(b) {p} ; [Q] ⊑ {p'} ; ⟨f'⟩ ≡ p ⊆ p' ∧ |p| ; |f'| ⊆ Q ,
(c) {P} ; [Q] ⊑ [p'] ; {Q'} ≡ |p'| ; P ⊆ Q' ; Q⁻¹ ,
(d) {Q} ⊑ {p'} ; ⟨f'⟩ ≡ dom.Q ⊆ p' ∧ Q ⊆ |f'| .

Proof We first prove (a). We have

{P} ; [Q] ⊑ {p'} ; [Q']
≡ {definitions}
(∀σ r • (∃γ • P.σ.γ ∧ Q.γ ⊆ r) ⇒ p'.σ ∧ Q'.σ ⊆ r)
≡ {quantifier rules}
(∀σ γ r • P.σ.γ ∧ Q.γ ⊆ r ⇒ p'.σ ∧ Q'.σ ⊆ r)
≡ {⇒ by specialization (r := Q.γ), ⇐ by transitivity of ⊆}
(∀σ γ • P.σ.γ ⇒ p'.σ ∧ Q'.σ ⊆ Q.γ)
≡ {predicate calculus}
(∀σ γ • P.σ.γ ⇒ p'.σ) ∧ (∀σ γ δ • P.σ.γ ∧ Q'.σ.δ ⇒ Q.γ.δ)
≡ {quantifier rules}
(∀σ • (∃γ • P.σ.γ) ⇒ p'.σ) ∧ (∀γ δ • (∃σ • P.σ.γ ∧ Q'.σ.δ) ⇒ Q.γ.δ)
≡ {definitions}
dom.P ⊆ p' ∧ P⁻¹ ; Q' ⊆ Q

Next, we find that (b) is in fact a special case of Theorem 27.1 (b), specializing Q' to |f'|. For (c) we have a derivation similar to the one for (a). Finally, (d) follows as a special case of (a) with P := Q, Q := Id, and Q' := |f'|, together with a small derivation to show Q⁻¹ ; |f'| ⊆ Id ≡ Q ⊆ |f'|. □

In the rule in Theorem 27.2 (d) it is sufficient to have the universally disjunctive specification {Q} on the left-hand side rather than the disjunctive specification [p];{Q} because the refinement

[p] ; {Q} ⊑ {p'} ; ⟨f'⟩

is possible only if p = true (Exercise 27.2). In some situations, we may also be interested in refining a specification into one of a more general form. Since the less general form can always be rewritten into the more general one, the problem here reduces to refinement between similar specifications. For example, the refinement

{p} ; ⟨f⟩ ⊑ {p} ; [Q]

is equivalent to

{p} ; [|f|] ⊑ {p} ; [Q]

and can be handled by the rule in Theorem 27.1 (b). Another similar example can be found in Exercise 27.3.

Example

As an example, we consider refining the conjunctive specification {x > 0} ; [x := x' | x' < x]. We propose to refine this by replacing the demonic assignment with a functional one. Theorem 27.2 (b) gives us a rule for proving the refinement

{x > 0} ; [x := x' | x' < x] ⊑ {x > 0} ; x := x − 1 .

Since both specifications have the same precondition, the proof obligations are simplified. We have

{x > 0} ; [x := x' | x' < x] ⊑ {x > 0} ; x := x − 1
≡ {Theorem 27.2 (b)}
|x > 0| ; |x := x − 1| ⊆ (x := x' | x' < x)
⇐ {reduction to boolean terms}
(∀x x' • x > 0 ∧ x' = x − 1 ⇒ x' < x)
≡ {arithmetic}
T
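The resulting condition can also be checked by brute force over a bounded range of states; the following Python sketch (ours, not from the text) does this for the example.

N = 100                                          # check states x = 0, ..., N

def pre(x):                                      # the common precondition
    return x > 0

def f(x):                                        # the proposed functional update
    return x - 1

def Q(x, y):                                     # the demonic relation x := x' | x' < x
    return y < x

print(all(Q(x, f(x)) for x in range(N + 1) if pre(x)))   # True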

27.3 Combining Specifications

There are many cases in which we do not start with a single specification, but rather with a collection of requirements that our program has to satisfy. Before starting a derivation, we first need to combine these different requirements into a single specification. We shall now give basic rules for this. We assume that the requirements are themselves given as specifications, and we concentrate on conjunctive statements. We first show that if S1 and S2 are conjunctive specifications, then S1 ; S2 and S1 ⊓ S2 can also be expressed as conjunctive specifications. The join S1 ⊔ S2 is the least common refinement of the two specifications, but it need not be conjunctive, since joins introduce angelic nondeterminism. But since the conjunctive predicate transformers Ctran(Σ, Γ) form a complete lattice, any set of conjunctive predicate transformers must also have a unique least conjunctive upper bound. We will show below how to derive this.

Sequential Composition and Meet Sequential composition and meet preserve conjunctivity. Hence, composing two conjunctive specifications with these operators gives us conjunctive statements. These can again be expressed as a conjunctive specification, as shown by the following result.

Theorem 27.3 Let predicates p and q and relations P and Q be arbitrary. Then

(a) {p} ;[P];{q} ;[Q] ={p ∩ [P].q} ;[P ; Q] .

(b) {p} ; [P] ⊓ {q} ; [Q] = {p ∩ q} ; [P ∪ Q] .

Proof We use the following assertion propagation rule for conjunctive S, which the reader can easily verify (Exercise 27.4):

S ; {p} = {S.p} ; S .

For (a), we have

{p} ; [P] ; {q} ; [Q]
= {assertion propagation}
{p} ; {[P].q} ; [P] ; [Q]
= {homomorphisms}
{p ∩ [P].q} ; [P ; Q]

and for (b),

({p} ; [P] ⊓ {q} ; [Q]).r.σ
≡ {definitions}

p.σ ∧ (∀γ • P.σ.γ ⇒ r.γ) ∧ q.σ ∧ (∀γ • Q.σ.γ ⇒ r.γ)
≡ {predicate calculus}
p.σ ∧ q.σ ∧ (∀γ • P.σ.γ ∨ Q.σ.γ ⇒ r.γ)
≡ {definitions}
({p ∩ q} ; [P ∪ Q]).r.σ

which finishes the proof. □

Least Conjunctive Refinement

Theorem 27.3 states that the meet (greatest lower bound) of specifications {p} ; [P] and {q} ; [Q] is {p ∩ q} ; [P ∪ Q]. Intuitively, it may seem that the rule for a least conjunctive upper bound should be dual. However, we cannot simply take the disjunction of the preconditions and the conjunction of the next-state relations (see Exercise 27.5). We will here need the relational quotient operator, defined as follows:

P\Q ≜ (λσ δ • ∀γ • P.σ.γ ⇒ Q.γ.δ) .   (relation quotient)

Intuitively, (P\Q).σ.δ means that whenever P relates σ to some γ, then Q must relate this γ to δ (the quotient can be seen as a dual to composition of relations). We assume that the quotient operator binds more tightly than meet but more weakly than composition. The following properties of this quotient operator show that it is related to composition similarly to how implication is related to meet.

Lemma 27.4 Assume that P, Q, and R are relations. Then

(a) P\Q =¬(P ; ¬ Q),

(b) P\(Q\R) = (P ; Q)\R ,

(c) P ⊆ Q\R ≡ Q−1 ; P ⊆ R ,

(d) (P\Q)−1 = (¬ Q−1)\(¬ P−1),

(e) Id\R = R . The proof is left as an exercise to the reader (Exercise 27.7). Lemma 27.4 (c) shows that the quotient can be given the following intuition: if we want to divide relation R into relations Q and P so that Q ; P ⊆ R, and Q is given, then for P we can take Q−1\R. The relational quotient can be used to give a simple characterization of least conjunctive upper bounds. 27.3. Combining Specifications 455

Theorem 27.5 The least conjunctive upper bound of two conjunctive specifications {p} ; [P] and {q} ; [Q] is given by

{p ∪ q} ; [ |p|\P ∩ |q|\Q ] .

Proof We develop the proof by finding what characterizes a specification {r} ; [R] that refines both {p} ; [P] and {q} ; [Q].

{p} ; [P] ⊑ {r} ; [R] ∧ {q} ; [Q] ⊑ {r} ; [R]
≡ {Theorem 27.1 (b)}
p ⊆ r ∧ |p| ; R ⊆ P ∧ q ⊆ r ∧ |q| ; R ⊆ Q
≡ {Lemma 27.4 (c), |p|⁻¹ = |p|}
p ⊆ r ∧ q ⊆ r ∧ R ⊆ |p|\P ∧ R ⊆ |q|\Q
≡ {lattice properties}
(p ∪ q) ⊆ r ∧ R ⊆ |p|\P ∩ |q|\Q

This shows that the least conjunctive upper bound is {p ∪ q} ; [ |p|\P ∩ |q|\Q ]. □
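For specifications over a small finite state space, the quotient and the least conjunctive refinement of Theorem 27.5 can be computed directly; the following Python sketch (ours, with arbitrarily chosen example relations) also checks that the result refines both components via Theorem 27.1 (b).

STATES = range(4)

p = {1, 2}                                                  # precondition of {p};[P]
P = {(s, t) for s in STATES for t in STATES if t <= s}
q = {2, 3}                                                  # precondition of {q};[Q]
Q = {(s, t) for s in STATES for t in STATES if t % 2 == s % 2}

def coerce(pred):                                           # |p| = {(s, s) | s in p}
    return {(s, s) for s in pred}

def quotient(A, B):                                         # (A \ B).s.t  iff  all u: A.s.u => B.u.t
    return {(s, t) for s in STATES for t in STATES
            if all((s, u) not in A or (u, t) in B for u in STATES)}

r = p | q
R = quotient(coerce(p), P) & quotient(coerce(q), Q)

def refines(p1, P1, p2, P2):                                # {p1};[P1] refined by {p2};[P2]
    return p1 <= p2 and all((s, t) in P1 for (s, t) in P2 if s in p1)

print(refines(p, P, r, R), refines(q, Q, r, R))             # True True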

We call the least conjunctive upper bound of S1 and S2 the least conjunctive refinement, written S1 ⊔̂ S2. The expression for S1 ⊔̂ S2 given in Theorem 27.5 is fairly complicated. However, if the conjunctive specification {p} ; [P] is normalized so that relation P is full wherever p does not hold (i.e., (∀σ • ¬ p.σ ⇒ (∀γ • P.σ.γ))), and similarly for {q} ; [Q], then the expression for the least conjunctive upper bound reduces to the simpler form {p ∪ q} ; [P ∩ Q] . Showing this is left to the reader as an exercise (Exercise 27.9). In the special case in which the two preconditions are the same, the least conjunctive refinement can also be simplified.

Theorem 27.6 Let p be a predicate and P and Q relations. Then

{p} ; [P] ⊔̂ {p} ; [Q] = {p} ; [P ∩ Q] .

Proof

  {p} ; [P] ⊔̂ {p} ; [Q]
= {Theorem 27.5}
  {p} ; [|p|\P ∩ |p|\Q]
= {definitions}
  {p} ; [λσ γ • (p.σ ⇒ P.σ.γ) ∧ (p.σ ⇒ Q.σ.γ)]
= {use assertion (Exercise 27.8)}
  {p} ; [λσ γ • p.σ ∧ (p.σ ⇒ P.σ.γ) ∧ (p.σ ⇒ Q.σ.γ)]
= {predicate calculus}
  {p} ; [λσ γ • p.σ ∧ P.σ.γ ∧ Q.σ.γ]
= {use assertion (Exercise 27.8)}
  {p} ; [λσ γ • P.σ.γ ∧ Q.σ.γ]
= {definitions}
  {p} ; [P ∩ Q]
□

The least conjunctive upper bound can be seen as a way of combining specifications. If we specify different aspects of a program separately, we want the implementation to satisfy (refine) each of the component specifications. Theorem 27.5 gives us a rule for computing the combined specification. Since the result is the least upper bound, we do not lose any refinement potential when combining the component specifications if we have decided that the refinement must be conjunctive.

Example: Merging Arrays

As an example, we specify a merging algorithm, intended to merge two sorted arrays A and B into an array C. We assume that only the array segments A[0..m − 1], B[0..n − 1], and C[0..m + n − 1] are used. Thus we use the abbreviation sorted.A for sorted.A.(λi • i < m) and occur.A.x for occur.A.(λi • i < m).x (and similarly for B and C, with bounds n and m + n). We require that the input and the output be sorted:

S1 = {sorted.A ∧ sorted.B} ; [C := C′ | sorted.C′] .

Furthermore, we require that all input elements occur in the output list:

S2 = {sorted.A ∧ sorted.B} ;
     [C := C′ | (∀i • occur.C′.i = occur.A.i + occur.B.i)] .

The result of combining the two specifications S1 and S2 is, by Theorem 27.6,

  {sorted.A ∧ sorted.B} ;
  [(C := C′ | sorted.C′) ∩
   (C := C′ | (∀i • occur.C′.i = occur.A.i + occur.B.i))]
= {combine relations}
  {sorted.A ∧ sorted.B} ;
  [C := C′ | sorted.C′ ∧ (∀i • occur.C′.i = occur.A.i + occur.B.i)]

Note that although both component specifications were nondeterministic, the result of combining them is deterministic.
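For illustration, a straightforward Python merge (our own sketch, not the derivation pursued here) satisfies the combined specification: the output is sorted and preserves occurrence counts.

# A sketch of a program satisfying the combined specification: C must be
# sorted and contain each input element as many times as it occurs in
# A[0..m-1] and B[0..n-1] together.
def merge(A, B):
    C, i, j = [], 0, 0
    while i < len(A) and j < len(B):
        if A[i] <= B[j]:
            C.append(A[i]); i += 1
        else:
            C.append(B[j]); j += 1
    C.extend(A[i:]); C.extend(B[j:])   # one of the two segments is already empty
    return C

A, B = [1, 3, 5], [2, 3, 6, 8]
C = merge(A, B)
assert C == sorted(C) and sorted(A + B) == C
print(C)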

It is not necessary to go directly to the least conjunctive refinement of the two statements, because the ordinary join of the two statements is also an acceptable refinement. If we take the join as the basis for further refinement, where the angelic choice is removed at one stage or another (or maybe left in the program as a user interaction point), then there is no reason why we should avoid the join.

27.4 Refining Conjunctive Specifications

Now consider the problem of finding rules that state exactly when a refinement of the form S ⊑ S1 ; S2 (and similarly for meet and join) is valid, where S, S1, and S2 are all specifications of the same form. Given such rules, we can later derive corresponding rules for other composition operations, such as conditionals and loops. We concentrate on conjunctive specifications. This means that we take sequential composition and meet as basic constructors. However, similar rules can be derived for other classes of specifications. For example, every rule for conjunctive specifications gives rules for universally conjunctive and partial functional specifications as special cases. We do not state all these special cases explicitly. Instead, a number of additional rules are given in the exercises.

Refining Specifications by Composed Specifications We give two basic rules that show how sequential composition and meet can be introduced in conjunctive specifications.

Theorem 27.7 Let predicates p and q and relations P, Q, and R be arbitrary. Then

(a) Refinement of a conjunctive specification with a sequential composition satisfies

{p} ; [P] ⊑ {p} ; [Q] ; {q} ; [R] ≡ p ⊆ [Q].q ∧ |p| ; Q ; R ⊆ P .

(b) Refinement of a conjunctive specification with a demonic choice satisfies

{p} ; [P] ⊑ {p} ; [Q] ⊓ {p} ; [R] ≡ |p| ; Q ⊆ P ∧ |p| ; R ⊆ P .

Proof For (a), we have

  {p} ; [P] ⊑ {p} ; [Q] ; {q} ; [R]
≡ {Theorem 27.3 (a)}
  {p} ; [P] ⊑ {p ∩ [Q].q} ; [Q ; R]
≡ {Theorem 27.1 (b)}
  p ⊆ p ∩ [Q].q ∧ |p| ; Q ; R ⊆ P
≡ {property of meet}
  p ⊆ [Q].q ∧ |p| ; Q ; R ⊆ P

The rule in (b) follows from Theorems 27.3 (b) and 27.1 (b). We leave it to the reader to verify this. □

In a similar way it is possible to derive rules for introducing sequential composition and join of disjunctive specifications. Note that the rule for adding a leading assignment given in Chapter 17 is a special case of the rule in Theorem 27.7 (a). The following derivation shows this for the case that b is a boolean expression in which y does not occur free and e is an expression in which neither y nor y′ occurs free.

  [x, y := x′, y′ | b]
⊑ {refine specification with composition}
    • true ⊆ [y := y′ | y′ = e].(y = e)
      ≡ {rule for demonic assignment}
        true ⊆ (∀y′ • y′ = e ⇒ (y = e)[y := y′])
      ≡ {one-point rule, y′ not free in e}
        true ⊆ ((y = e)[y := e])
      ≡ {reduction, y not free in e}
        T
    • (y := y′ | y′ = e) ; (x, y := x′, y′ | b) ⊆ (x, y := x′, y′ | b)
      ≡ {composition of assignments}
        (x, y := x′, y′ | (∃x″ y″ • x″ = x ∧ y″ = e ∧ b[x, y := x″, y″])) ⊆ (x, y := x′, y′ | b)
      ≡ {one-point rule}
        (x, y := x′, y′ | b[y := e]) ⊆ (x, y := x′, y′ | b)
      ≡ {y not free in b}
        T
  [y := y′ | y′ = e] ; {y = e} ; [x, y := x′, y′ | b]
⊑ {rewrite demonic assignment (coercion), drop assertion}
  y := e ; [x, y := x′, y′ | b]

27.5 General Refinement of Specifications

We now consider the situation in which a specification is refined by an arbitrary statement. The rules that we get are very general, but their verification conditions are more complicated than what we get in the more specific rules given above. We first consider refinements of the form {p} ; [Q] ⊑ S, where {p} ; [Q] is a given conjunctive specification. This expresses Theorem 17.6 in a more general form.

Theorem 27.8 Assume that p and q are predicates, Q a relation, and S a monotonic predicate transformer. Then

{p} ; [Q] ⊑ S ≡ (∀σ • p.σ ⇒ S.(Q.σ).σ) .

Proof We have

  {p} ; [Q] ⊑ S
≡ {definitions}
  (∀q σ • p.σ ∧ Q.σ ⊆ q ⇒ S.q.σ)
≡ {mutual implication; subderivations}
    • (∀q σ • p.σ ∧ Q.σ ⊆ q ⇒ S.q.σ)
      ⇒ {specialization q := Q.σ; simplification}
        (∀σ • p.σ ⇒ S.(Q.σ).σ)
    • (∀q σ • p.σ ∧ Q.σ ⊆ q ⇒ S.q.σ)
      ⇐ {S monotonic}
        (∀σ • p.σ ⇒ S.(Q.σ).σ)
  (∀σ • p.σ ⇒ S.(Q.σ).σ)
□
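Theorem 27.8 turns refinement of a conjunctive specification by an arbitrary monotonic statement into a pointwise check. The following Python sketch (ours; the state space, the relation Q, and the candidate statement S are invented for the example) performs this check on a finite state space.

STATES = range(4)

p = {1, 2}                                   # precondition
Q = {0: {1}, 1: {2, 3}, 2: {3}, 3: set()}    # demonic next-state relation

def S(q):                                    # candidate implementation "σ := σ + 1",
    return {s for s in STATES if s + 1 in q} # written as a predicate transformer

def refines_spec(p, Q, S):
    # Theorem 27.8: {p} ; [Q] ⊑ S  iff  p.σ implies S.(Q.σ).σ for every σ
    return all(s in S(Q[s]) for s in STATES if s in p)

print(refines_spec(p, Q, S))   # True: from 1 and 2 the result σ + 1 satisfies Q.σ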

Note that Lemma 17.4 follows as a special case of this, by choosing Q := (λσ • q). The reader can verify that Theorem 27.8 can be reformulated as follows:

{p} ; [Q] ⊑ S ≡ (∀σ0 • (λσ • σ = σ0) ∩ p {| S |} Q.σ0) .

This shows the similarity with Theorem 17.6 that was given in terms of program variables. We now derive a rule for refining a disjunctive specification. The rule is not dual to the rule for conjunctive specifications, since a dual rule would use the reverse refinement relation.

Theorem 27.9 Let S be a monotonic predicate transformer and Q a relation. Then

{Q} ⊑ S ≡ (∀σ0 σ1 • Q.σ0.σ1 ⇒ (λσ • σ = σ0) {| S |} (λσ • σ = σ1)) .

Proof Assume that S is monotonic. Then

  {Q} ⊑ S
≡ {definitions, rewriting implication}
  (∀q σ • (∃σ′ • Q.σ.σ′ ∧ q.σ′) ⇒ S.q.σ)
≡ {properties of ⇒, quantifiers}
  (∀σ σ′ q • q.σ′ ⇒ (Q.σ.σ′ ⇒ S.q.σ))
≡ {⇒ by specialization (q := (λσ • σ = σ′)),
   ⇐ by monotonicity (q.σ′ ⇒ (λσ • σ = σ′) ⊆ q)}
  (∀σ σ′ • Q.σ.σ′ ⇒ S.(λσ • σ = σ′).σ)
≡ {one-point rule}
  (∀σ σ′ σ0 • (σ = σ0) ⇒ (Q.σ0.σ′ ⇒ S.(λσ • σ = σ′).σ))
≡ {properties of ⇒, quantifiers}
  (∀σ′ σ0 • Q.σ0.σ′ ⇒ (∀σ • (σ = σ0) ⇒ S.(λσ • σ = σ′).σ))
≡ {pointwise order, α-renaming}
  (∀σ0 σ1 • Q.σ0.σ1 ⇒ (λσ • σ = σ0) {| S |} (λσ • σ = σ1))
□

Intuitively speaking, Theorem 27.9 shows that S refines the disjunctive specifica- tion {Q} if and only if S can choose (angelically) every computation path between states related by Q.

27.6 Summary and Discussion

This chapter has identified specification statements with the normal forms for statement subclasses that we gave previously. We have studied the refinement of specification statements by other specifications in detail and have given a collection of useful inference rules for this. We have also studied the problem of combining specification statements using the basic statement constructors: sequential composition, meets, and joins. We have in particular looked at the least conjunctive refinement of conjunctive specification statements. Specification statements were originally introduced in a predicate transformer framework as a way of expressing general input–output specifications within a programming language by Back [6], who called the specification statement a nondeterministic assignment. Morgan later introduced a variant of this, which he called a specification statement [104]. Back’s specification statement uses a single relation, while Morgan’s specification is a pre- and postcondition pair. However, both express specifications for conjunctive predicate transformers only. This chapter has shown how to generalize the notion of specification statements to other statement subclasses. The least conjunctive refinement is a good example of a case in which something that is quite simple on the relational level becomes much harder in the general case of predicate transformers. On the relational level, the least upper bound of

two demonic updates is just the demonic update for conjunction of the relations involved: [P] ⊓ [Q] = [P ∩ Q]. This has sometimes been taken as an argument for why a relational semantics for programs is superior to a predicate transformer approach [76, 89]. As we have tried to show here, this argument is not really valid, because relational semantics forms a subclass of predicate transformer semantics. Hence, we do not lose anything by choosing predicate transformer semantics, but rather we gain expressibility. A relational specification can be built, combined and analyzed within the relational domain, and once the different requirements have been combined into a good specification, refinement can continue on the statement/predicate transformer level. The problem of conjunctive upper bounds and ways of finding the least one has been studied by a number of authors, including Morgan [107] and Back and Butler [19]. The quotient operator on relations is very close to the weakest prespecification operator that Hoare and He use as a basis for a method of stepwise refinement [91] within a relational framework. We also gave a general rule for refining any (conjunctive or universally disjunctive) specification statement with an arbitrary statement. This shows that refinement between programs can always be reduced to a correctness formula in the weakest precondition calculus, and thus refinement provides a bridge in the other direction between refinement calculus and weakest precondition calculus for total correctness (the other direction being the definition of the refinement relation itself). This characterization of refinement was originally given by Back [7] in terms of strongest postconditions. A similar characterization was also given by Morgan [104].

27.7 Exercises

27.1 Show that Theorem 27.1 (d) follows from (b) by duality.

27.2 Show that the refinement

[p] ; {P} ⊑ {q} ; ⟨f⟩

is possible only if p = true.

27.3 Show that the following rule for refinement between different specifications is valid:

{p} ; ⟨f⟩ ⊑ [q] ; {Q} ≡ (∀σ • p.σ ∧ q.σ ⇒ Q.σ.(f.σ)) .

27.4 Show that the following equality always holds when S is a conjunctive predicate transformer:

S ; {p} = {S.p} ; S .

27.5 Show that {p ∪ q} ; [P ∩ Q] is always a refinement of both {p} ; [P] and {q} ; [Q]. This means that {p ∪ q} ; [P ∩ Q] is an upper bound of {p} ; [P] and {q} ; [Q]. Give a counterexample showing that {p ∪ q} ; [P ∩ Q] need not be the least conjunctive upper bound.

27.6 Prove that the following equivalence holds:

{p} ; [Q] ⊑ ⟨f⟩ ≡ (∀σ • p.σ ⇒ Q.σ.(f.σ)) .

27.7 Prove Lemma 27.4.

27.8 Show the following:

{p} ; [Q] = {p} ; [λσ γ • p.σ ∧ Q.σ.γ] .

27.9 Show that (∀σ • ¬ p.σ ⇒ (∀γ • P.σ.γ)) implies |p|\P = P.

27.10 Use Theorem 27.8 to derive necessary and sufficient conditions for when the refinement {p} ; [P] ⊑ {q} ; [Q] holds.

27.11 Use the rules for top-down program development to refine the specification

[x := x′ | x′² ≤ y < (x′ + 1)²]

to

x := 0 ; while (x + 1)² ≤ y do x := x + 1 od .

28

Refinement in Context

The refinement relation is very restrictive, as it requires that the refining statement preserve the correctness of the original statement no matter what pre- and post- conditions are assumed for the original statement. In practice, we are interested mostly in refining a statement that occurs as a component of a larger statement. This context will then restrict the pre- and postconditions that need to be preserved for the whole statement to be refined. We described in Section 17.4 how to carry out such refinements in context. In this chapter, we will look at this technique in more detail, in particular on the methods for propagating context information inside a program statement. We consider only conjunctive commands here, though a dual theory can be built up for disjunctive commands (see the exercises).

28.1 Taking the Context into Account

Let us first recall the method for using context information in a refinement as it was described in Section 17.4. We assume that C is a statement built from conjunctive components and with possible occurrences of a statement variable X. Then C[S] (i.e., C[X := S]) is a conjunctive statement with a specific occurrence of substatement S singled out. We refer to C as the context of S. Since the statement constructors are monotonic, we have that S ⊑ S′ ⇒ C[S] ⊑ C[S′]. We are thus permitted to replace S by a refinement S′ in C, getting as the result a refinement C[S′] of the original statement C[S]. In many situations, the requirement that S ⊑ S′ is, however, too strong, so the question is whether we might replace S by some statement S′ for which S ⊑ S′

does not hold, but for which C[S] ⊑ C[S′] would still hold. In other words, the replacement of S by S′ preserves the correctness of the whole statement. This can indeed be done in a systematic way, as follows: First find a statement S0 such that C[S] = C[S0]. Then find a refinement S′ of S0. The refinement C[S] ⊑ C[S′] then follows by monotonicity and transitivity. The less refined S0 is, the more freedom it gives us in the subsequent refinement.

Kernel Equivalence

Refinement in context can be described in terms of kernel equivalence. Given an arbitrary function f : Σ → Γ, we define =_f, the kernel equivalence of f, by

x =_f y ≜ f.x = f.y    (kernel equivalence)

(when applying kernel equivalence to context refinement, the function (λX • C) corresponds to f in this definition). It is easily seen that =_f is an equivalence relation.

Lemma 28.1 Assume that Σ and Γ are complete lattices and that the function f : Σ → Γ is meet homomorphic. Furthermore, let x : Σ be arbitrary. Then the set {y : Σ | x =_f y} has a smallest element.

Proof It suffices to show that {y : Σ | y =_f x} (i.e., the equivalence class of x under =_f) is nonempty and closed under nonempty meets. First, we always know x =_f x, so the set is nonempty. For an arbitrary set {xi : Σ | i ∈ I} we have

  (∀i ∈ I • xi =_f x), f : Σ → Γ meet homomorphic, I nonempty
  f.(⊓ i ∈ I • xi)
= {meet homomorphism}
  (⊓ i ∈ I • f.xi)
= {kernel equivalence}
  (⊓ i ∈ I • f.x)
= {I nonempty}
  f.x
□
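A small Python illustration of Lemma 28.1 (our own, on the powerset lattice with intersection as meet): for a meet-homomorphic f, the kernel-equivalence class of an element is closed under intersection, so its intersection is the least element of the class. The names UNIVERSE, A, and powerset are ours.

from itertools import combinations

UNIVERSE = frozenset({1, 2, 3})
A = frozenset({1, 2})
f = lambda X: X & A                  # f.X = X ∩ A is intersection homomorphic

def powerset(s):
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

X = frozenset({1, 3})
cls = [Y for Y in powerset(UNIVERSE) if f(Y) == f(X)]   # equivalence class of X
least = frozenset.intersection(*cls)                    # its meet
assert least in cls and f(least) == f(X)
print([sorted(Y) for Y in cls], "least element:", sorted(least))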

We can now show that the smallest element shown to exist in Lemma 28.1 has a desirable property.

Lemma 28.2 Assume that Σ and Γ are complete lattices and that the function f : Σ → Γ is meet homomorphic. Then the following holds, for arbitrary x, y : Σ:

(⊓ z | x =_f z • z) ⊑ y ⇒ f.x ⊑ f.y .

Proof Assume (⊓ z | z =_f x • z) ⊑ y. Then

  f.y
⊒ {assumption, ⊓-homomorphism implies monotonicity}
  f.(⊓ z | z =_f x • z)
= {⊓-homomorphism}
  (⊓ z | z =_f x • f.z)
= {kernel equivalence}
  (⊓ z | z =_f x • f.x)
= {meet of singleton set}
  f.x
□

For predicate transformers, the two lemmas above show the following. Consider a statement C[S] where C[·] is a meet homomorphic context. Then there always exists a smallest predicate transformer S0 such that C[S] = C[S0], and furthermore, a refinement C[S0] ⊑ C[S′] holds whenever S0 ⊑ S′ holds. We can build meet homomorphic contexts by using guards, asserts, demonic updates, sequential composition, meets, least and greatest fixed points, iterations, and all other constructs that are conjunctive or preserve conjunctivity.

Context Assertions

Assume that C[S] is given, with S a predicate transformer and C[·] a meet homomorphic context. It may not be feasible to find the least predicate transformer S0 such that C[S] = C[S0]. In practice, we can make considerable progress by finding a (reasonably strong) predicate p (a context assertion) such that C[S] = C[{p} ; S]. When we then refine {p} ; S, we are really refining S under the (contextual) knowledge that the predicate p can be assumed to hold in any initial state from which S is executed. For this to work we need rules for propagating context assertions. We can start with outermost assertions {true} (i.e., skip) and then propagate these inwards until the assertions meet at the subcomponent that we want to refine. The structure of the derivation is then

  C[S]
= {skip = {true}}
  {true} ; C[S] ; {true}
= {propagate context assertions}
  C[{p} ; {q} ; S]
= {merge assertions, {p} ; {q} = {p ∩ q}}
  C[{p ∩ q} ; S]
⊑ {prove {p ∩ q} ; S ⊑ S′, monotonicity}
  C[S′]

Note that we here have generalized the method for propagating context assertions, as compared to Section 17.4, by permitting both forward and backward propagation of assertions. In practice, forward propagation seems to be the most useful method. A restricted refinement puts weaker constraints on the refining statement:

{p} ; S ⊑ S′ ≡ (∀σ • p.σ ⇒ (∀q • S.q.σ ⇒ S′.q.σ)) .

Thus, we have to prove refinement between S and S′ only for those initial states that satisfy the condition p. This is weaker than S ⊑ S′, which requires refinement between S and S′ for every initial state. We have that S ⊑ S′ ⇒ {p} ; S ⊑ S′, but the converse need not hold.

Example

Consider, as an example, the following statement, where x and y are assumed to be program variables ranging over the natural numbers:

x := x + 1 ; y := x + 2 ; if y ≤ 3 then skip else abort fi .

Let us simplify this statement. The rules for assertion propagation to be given below permit us to construct the following derivation:

  x := x + 1 ; y := x + 2 ; if y ≤ 3 then skip else abort fi
= {add initial and final trivial assertions}
  {true} ; x := x + 1 ; y := x + 2 ; if y ≤ 3 then skip else abort fi ; {true}
= {propagate assertion forward}
  x := x + 1 ; {x ≥ 1} ; y := x + 2 ; if y ≤ 3 then skip else abort fi ; {true}
= {propagate assertion backwards}
  x := x + 1 ; {x ≥ 1} ; y := x + 2 ; {y ≤ 3} ; if y ≤ 3 then skip else abort fi
= {propagate assertion backwards}
  x := x + 1 ; {x ≥ 1} ; {x ≤ 1} ; y := x + 2 ; if y ≤ 3 then skip else abort fi
= {merge assertions}
  x := x + 1 ; {x = 1} ; y := x + 2 ; if y ≤ 3 then skip else abort fi
⊑ {refinement in context}
  x := x + 1 ; y := 3 ; if y ≤ 3 then skip else abort fi

This derivation illustrates the way in which statements preceding the component to be replaced, as well as statements following this component, can contribute to the context assertion that we can assume when refining the component.

Guards as Context Assumptions

Context assertions allow us to transport information from one place in a program text to another. The amount of information that is available in assertions is often huge. Of course, we can always weaken an assertion (since p ⊆ q ⇒ {p} ⊑ {q}), but it can be difficult to decide what information to discard. By using guard statements in addition to asserts, we can avoid this problem. Since any guard statement [p] refines skip, we can always introduce guards in a program text. Intuitively, introducing a guard means that we introduce a hypothetical assumption that will later need to be discharged (otherwise we are left with a miraculous component in our program). We propose the following method for working with context assumptions. Assume that we want to refine a subcomponent S of a statement C[S]. Furthermore, assume that we feel confident that it would be possible to propagate information that establishes an assertion {p} at S, and that this assertion would be useful in subsequent refinements. We can then introduce a guard–assert pair using the following general rule (which is straightforward to verify):

S ⊑ [p] ; {p} ; S .

Now we can refine S in context {p}. Furthermore, we need rules by which we can show that C[[p] ; {p} ; S] = C[{p} ; S]; i.e., the context assumption [p] can be discharged. The rules for propagating the guard [p] backwards in order to discard it are described below. If we have chosen the context assumption correctly, then the propagated guard statement is eventually reduced to skip and can then be removed. This will then establish that C[S] = C[{p} ; S], from which we then continue with a refinement in context as before. Restricted refinement can also be expressed in terms of context assumptions. From Lemma 13.12 we know that {p} ; S ⊑ S′ ≡ S ⊑ [p] ; S′, so we could first carry out refinements that add context assumptions, C[S] ⊑ C[[p] ; S′], and then show that the guard statement can actually be removed.

28.2 Rules for Propagating Context Assertions

Assume that a conjunctive statement is given. The two basic constructors used in conjunctive statements are sequential composition and meet. For sequential composition, we need rules that propagate assertions forward and backwards through a

statement. For meet, we need a rule that to one conjunct adds assertions induced by the other conjuncts (we refer to this as sideways propagation). Intuitively speaking, forward propagation carries information about the results of previous computations, while backward and sideways propagation carries information about what states can be treated arbitrarily, since they would lead to nontermination in the subsequent computation.

Rules for Sequential Composition and Meet The basic rules in the following theorem determine how assertions can be propa- gated forward, backwards and sideways.

Theorem 28.3 Let S and S′ be arbitrary conjunctive predicate transformers and p and q predicates. Then

(a) {p} ; S ={p} ; S ; {q} if and only if p ∩ S.true ⊆ S.q ;

(b) S ; {q}={p} ; S ; {q} if and only if S.q ⊆ p ;

(c) S ⊓ S′ = {p} ; S ⊓ S′ if and only if S.true ∩ S′.true ⊆ p .

Proof Assume that S and S′ are conjunctive. Since dropping an assertion is always a valid refinement, we need only consider left-to-right refinement in the proof. For (a), we have

  {p} ; S ⊑ {p} ; S ; {q}
≡ {definitions}
  (∀r • p ∩ S.r ⊆ p ∩ S.(q ∩ r))
≡ {S conjunctive}
  (∀r • p ∩ S.r ⊆ p ∩ S.q ∩ S.r)
≡ {lattice property x ⊑ x ⊓ y ≡ x ⊑ y}
  (∀r • p ∩ S.r ⊆ S.q)
≡ {⇒ by specialization (r := true), ⇐ by monotonicity (r ⊆ true)}
  p ∩ S.true ⊆ S.q

The proofs of (b) and (c) are similar. □

In general we are interested in adding the strongest (i.e., least with respect to the refinement ordering) assertions possible, since a stronger assertion carries more information than a weaker one. Theorem 28.3 (b) and (c) immediately give us rules for finding the strongest possible assertion to be added. In (a), we do not give a closed expression for the strongest forward propagated assertion. However, one can show that such an assertion always exists:

Lemma 28.4 Let conjunctive predicate transformer S : Σ ↦ Γ and predicate p over Σ be given. Then the strongest predicate q such that {p} ; S = {p} ; S ; {q} is the following:

(∩ q | p ∩ S.true ⊆ S.q • q).

Proof We give the predicate (∩ q | p ∩ S.true ⊆ S.q • q) the name q0. First we show that q0 satisfies the desired condition:

  {p} ; S = {p} ; S ; {q0}
≡ {Theorem 28.3 (a), definition of q0}
  p ∩ S.true ⊆ S.(∩ q | p ∩ S.true ⊆ S.q • q)
≡ {S conjunctive}
  p ∩ S.true ⊆ (∩ q | p ∩ S.true ⊆ S.q • S.q)
≡ {general rule (∀x ∈ A • y ⊑ x) ≡ y ⊑ ⊓ A}
  T

We then show that q0 is in fact the strongest of all predicates satisfying the condition in question:

  {p} ; S = {p} ; S ; {q}
≡ {Theorem 28.3 (a)}
  p ∩ S.true ⊆ S.q
≡ {rewrite}
  q ∈ {q | p ∩ S.true ⊆ S.q}
⇒ {general rule x ∈ A ⇒ ⊓ A ⊑ x}
  (∩ q | p ∩ S.true ⊆ S.q • q) ⊆ q
≡ {definition of q0}
  q0 ⊆ q

This finishes the proof. □

Propagation of context assertions is closely linked to correctness of statements. The following is a direct consequence of Theorem 28.3. Corollary 28.5 Assume that predicates p and q are given and that S is a conjunctive predicate transformer. Then

(a) p ∩ S.true {| S |} q ≡ {p} ; S ⊑ S ; {q} .

(b) p {| S |} q ⇒ {p} ; S ⊑ S ; {q} .

This shows that correctness assuming termination, or weak correctness, is equivalent to forward propagation of context assertions and that ordinary correctness is a sufficient condition for propagation.

Propagation Rules for Basic Statements The forward propagation rule of Theorem 28.3 is not as such very useful, since it does not give us a way of calculating the propagated assertion. Below, we give individual rules for forward propagation through different constructs. The proofs of these rules are left to the reader as an exercise (Exercise 28.1).

{p} ; {q} = {p} ; {q} ; {p ∩ q} ,
{p} ; [q] = {p} ; [q] ; {p ∩ q} ,
{p} ; ⟨f⟩ = {p} ; ⟨f⟩ ; {im.f.p} ,
{p} ; [Q] = {p} ; [Q] ; {im.Q.p} .

Recall here that im.f.p = {γ | (∃σ • p.σ ∧ γ = f.σ)} (the image of p under f), while im.Q.p = {γ | (∃σ • p.σ ∧ Q.σ.γ)} (the image of p under relation Q). For backward propagation, we immediately get the following rules for basic statements:

{q} ; {p} = {p ∩ q} ; {q} ; {p} ,
[q] ; {p} = {¬ q ∪ p} ; [q] ; {p} ,
⟨f⟩ ; {p} = {f⁻¹.p} ; ⟨f⟩ ; {p} ,
[Q] ; {p} = {λσ • Q.σ ⊆ p} ; [Q] ; {p} .
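On a finite state space these propagation rules are directly executable. The following Python sketch (ours; the particular f and Q are arbitrary examples) computes the forward-propagated assertions im.f.p and im.Q.p and the backward-propagated assertions f⁻¹.p and (λσ • Q.σ ⊆ p).

STATES = range(5)

def image_f(f, p):          # im.f.p, forward propagation through ⟨f⟩
    return {f(s) for s in p}

def image_Q(Q, p):          # im.Q.p, forward propagation through [Q]
    return {g for s in p for g in Q[s]}

def preimage_f(f, p):       # f⁻¹.p, backward propagation through ⟨f⟩
    return {s for s in STATES if f(s) in p}

def wp_Q(Q, p):             # λσ • Q.σ ⊆ p, backward propagation through [Q]
    return {s for s in STATES if Q[s] <= p}

f = lambda x: min(x + 1, 4)
Q = {s: {s, min(s + 1, 4)} for s in STATES}

print(image_f(f, {0, 1}))        # {1, 2}
print(preimage_f(f, {3}))        # {2}
print(image_Q(Q, {0}))           # {0, 1}
print(wp_Q(Q, {2, 3, 4}))        # {2, 3, 4}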

Example Derivation

The following derivation illustrates how one works with assertions. We start from the following statement over program variables x, y : Nat:

x := x + 1 ; (y := x + 1 ⊓ {x < 2} ; y := x) .

The main aim is to refine the substatements y := x + 1 and y := x.

  x := x + 1 ; (y := x + 1 ⊓ {x < 2} ; y := x)
= {add vacuous assertion}
  {true} ; x := x + 1 ; (y := x + 1 ⊓ {x < 2} ; y := x)
= {forward propagation for functional update, simplify}
  x := x + 1 ; {x > 0} ; (y := x + 1 ⊓ {x < 2} ; y := x)
= {sideways propagation (Exercise 28.2)}
  x := x + 1 ; {x > 0} ; ({x < 2} ; y := x + 1 ⊓ {x < 2} ; y := x)
= {distributivity, merge assertions (homomorphism property)}
  x := x + 1 ; ({x = 1} ; y := x + 1 ⊓ {x = 1} ; y := x)

Now we can focus inside the assignment y := x + 1. Our derivation continues as follows:

  x := x + 1 ; ({x = 1} ; y := “x + 1” ⊓ {x = 1} ; y := x)
= {arithmetic, use local assumption x = 1}
  x := x + 1 ; ({x = 1} ; y := 2 ⊓ {x = 1} ; y := “x”)
= {substitute, use local assumption x = 1}
  x := x + 1 ; ({x = 1} ; y := 2 ⊓ {x = 1} ; y := 1)
⊑ {drop assertions}
  x := x + 1 ; (y := 2 ⊓ y := 1)

The final step here was to drop all assertions, in order to make the final statement easier to read. We could also leave the assertions in the code, since they convey information that might be used if the result is to be refined further.

28.3 Propagation in Derived Statements

We have shown how to propagate context assertions into statements built out of basic statements using only sequential composition, meets, and joins. Below, we use these basic rules to derive propagation rules for derived statements. These are more useful in practical program development.

Conditionals The rules for sequential composition and meet can be used to derive rules for other constructors. For the deterministic conditional, we have the following rule:

Theorem 28.6 Assertions are propagated into and out of a deterministic conditional as follows:

(a) {p} ; if g then S else S′ fi ; {q} =
    {p} ; if g then {p ∩ g} ; S ; {q} else {p ∩ ¬ g} ; S′ ; {q} fi ; {q} .

(b) if g then {p} ; S ; {q} else {p′} ; S′ ; {q′} fi =
    {(g ∩ p) ∪ (¬ g ∩ p′)} ; if g then {p} ; S ; {q} else {p′} ; S′ ; {q′} fi ; {q ∪ q′} .

Proof We apply the rules for assertion propagation given above.

  {p} ; if g then S else S′ fi ; {q}
= {definition of deterministic conditional}
  {p} ; ([g] ; S ⊓ [¬ g] ; S′) ; {q}
= {distributivity}
  {p} ; [g] ; S ; {q} ⊓ {p} ; [¬ g] ; S′ ; {q}
= {forward propagation}
  {p} ; [g] ; {p ∩ g} ; S ; {q} ⊓ {p} ; [¬ g] ; {p ∩ ¬ g} ; S′ ; {q}
= {homomorphism property, q ∩ q = q}
  {p} ; [g] ; {p ∩ g} ; S ; {q} ; {q} ⊓ {p} ; [¬ g] ; {p ∩ ¬ g} ; S′ ; {q} ; {q}
= {distributivity}
  {p} ; ([g] ; {p ∩ g} ; S ; {q} ⊓ [¬ g] ; {p ∩ ¬ g} ; S′ ; {q}) ; {q}
= {definition of deterministic conditional}
  {p} ; if g then {p ∩ g} ; S ; {q} else {p ∩ ¬ g} ; S′ ; {q} fi ; {q}

The proof of (b) is similar. □

The generalization of Theorem 28.6 to guarded conditionals is straightforward. Thus, for example, the rule for propagation into a guarded conditional is

{p} ; if g1 → S1 [] ··· [] gn → Sn fi ; {q}=

{p} ; if g1 →{p ∩ g1} ; S1 ; {q} [] ··· [] gn →{p ∩ gn} ; Sn ; {q} fi ; {q} .

Loops For loops, we derive a rule for adding an invariant as a context assertion. In addition to the invariant, the guard of the loop gives information; it always holds when the body is executed, and it cannot hold after the loop has terminated.

Theorem 28.7 Assume that the following condition holds:

g ∩ p ∩ S.true ⊆ S.p . Then context assertions can be added to a while loop as follows:

{p} ; while g do S od ={p} ; while g do {p ∩ g} ; S ; {p} od ; {¬ g ∩ p} .

Proof We first note that the condition is equivalent to the condition

{g ∩ p} ; S ={g ∩ p} ; S ; {p} (use Corollary 28.5). Now define functions f and h by

f.X = if g then S ; X else skip fi , h.X = if g then {g ∩ p} ; S ; {p} ; X else skip fi . We show by ordinal induction that for an arbitrary ordinal α,

{p} ; (f^α.abort) = {p} ; (h^α.abort) ; {¬ g ∩ p} .    (∗)

The base case (α = 0) is

{p} ; abort = {p} ; abort ; {¬ g ∩ p} ,

which is obviously true. For the successor case, we assume that (∗) holds for α. Then

  {p} ; (f^(α+1).abort)
= {definition of f}
  {p} ; if g then S ; (f^α.abort) else skip fi
= {assertion propagation rule, skip = {true}}
  {p} ; if g then {g ∩ p} ; S ; (f^α.abort) else {¬ g ∩ p} fi
= {assumption of the lemma}
  {p} ; if g then {g ∩ p} ; S ; {p} ; (f^α.abort) else {¬ g ∩ p} fi
= {induction assumption}
  {p} ; if g then {g ∩ p} ; S ; {p} ; (h^α.abort) ; {¬ g ∩ p} else {¬ g ∩ p} fi
= {assertion propagation rule}
  {p} ; if g then {g ∩ p} ; S ; {p} ; (h^α.abort) else skip fi ; {¬ g ∩ p}
= {definition of h^(α+1)}
  {p} ; (h^(α+1).abort) ; {¬ g ∩ p}

Finally, we have the limit case, assuming that (∗) holds for all ordinals β < α:

  {p} ; (f^α.abort)
= {definition of f^α}
  {p} ; (⊔ β | β < α • f^β.abort)
= {distributivity (assert is disjunctive)}
  (⊔ β | β < α • {p} ; (f^β.abort))
= {induction assumption}
  (⊔ β | β < α • {p} ; (h^β.abort) ; {¬ g ∩ p})
= {distributivity}
  {p} ; (⊔ β | β < α • h^β.abort) ; {¬ g ∩ p}
= {definition of h^α}
  {p} ; (h^α.abort) ; {¬ g ∩ p}

Thus (∗) holds for an arbitrary ordinal, and the proof is finished. □

Note that the predicate p in Theorem 28.7 is a weak invariant: the body S has to preserve p only if S terminates. This is different from the strong invariant used in total correctness proofs of loops. Propagation of assertions in the loop does not require the loop to terminate. The generalization of Theorem 28.7 to guarded iteration is straightforward: if gi ∩ p ∩ Si.true ⊆ Si.p for i = 1, ..., n, then

{p} ; do g1 → S1 [] ··· [] gn → Sn od =
{p} ; do g1 → S1′ [] ··· [] gn → Sn′ od ; {¬ g ∩ p} ,

where Si′ = {p ∩ gi} ; Si ; {p} and g = g1 ∪ ··· ∪ gn.
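The side condition of Theorem 28.7 is also easy to check mechanically on a finite model. The following Python sketch is our own small example: the loop while x < 4 do x := x + 2 od with candidate weak invariant “x is even”.

STATES = set(range(8))

g = {s for s in STATES if s < 4}          # loop guard
p = {s for s in STATES if s % 2 == 0}     # candidate (weak) invariant

def S(q):                                 # body x := x + 2 as a predicate transformer
    return {s for s in STATES if s + 2 in q}

weak_invariant = g & p & S(STATES) <= S(p)   # g ∩ p ∩ S.true ⊆ S.p
print(weak_invariant)   # True, so the loop can be annotated as in Theorem 28.7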

Blocks

For blocks we need two rules: one for propagating an assertion into the block from the left and one to propagate an assertion out from the block on the right. For simplicity, we use the general demonic (nondeterministically initialized) block construct in the rule, since the other block constructs are special cases of this one.

Theorem 28.8 Assume that p, q, and r are boolean expressions such that y does not occur free in p or q. Furthermore, assume that p and p′ are similar, and that r and r′ are similar. Then

(a) {p} ; begin var y | q ; S end = {p} ; begin var y | q ; {p′ ∩ q} ; S end .

(b) begin var y | q ; S ; {r} end = begin var y | q ; S ; {r} end ; {∃y • r′} .

Proof Both proofs are straightforward, using the definition of the block. We show the crucial derivation for the second part. We have

  {r} ; end ⊑ end ; {∃y • r}
≡ {definitions}
  {λγ σ • r.γ ∧ σ = end.γ} ⊑ {λγ σ • σ = end.γ ∧ (∃y • r).σ}
⇐ {homomorphism}
  (∀γ σ • r.γ ∧ σ = end.γ ⇒ σ = end.γ ∧ (∃y • r).σ)
≡ {one-point rule}
  (∀γ • r.γ ⇒ (∃y • r).(end.γ))
≡ {pointwise extension}
  (∀γ • r.γ ⇒ (∃y • r.(end.γ)))
≡ {use y.γ as existential witness}
  T

and the rule now follows from the definition of the block. □

Similar rules can also be derived for propagating assertions backwards into and out of blocks (Exercise 28.4).

28.4 Context Assumptions

The rules for backward propagation of context assumptions (guards) are in many ways dual to those for forward propagation of asserts. We start with a general rule.

Theorem 28.9 Guard statements can be propagated backwards according to the following rule whenever S is conjunctive:

S ; [q] ⊑ [S.q] ; S .

Proof

  S ; [q] ⊑ [S.q] ; S
≡ {Lemma 13.12}
  {S.q} ; S ⊑ S ; {q}
≡ {Theorem 28.3 (b)}
  T
□

We can now state rules for individual program constructs that can be used in practical program derivations. Most of the rules follow directly from Theorem 28.9.

Theorem 28.10 The following rules are available for basic statements:

(a) [true] = skip .

(b) {p} ;[q] = [¬ p ∪ q];{p} .

(c) [p];[q] = [p ∩ q] .

(d) ⟨f⟩ ; [q] ⊑ [preim.f.q] ; ⟨f⟩ .

(e) [Q] ; [q] ⊑ [λσ • Q.σ ⊆ q] ; [Q] .

(f) (S1 ⊓ S2) ; [q] ⊑ S1 ; [q] ⊓ S2 ; [q] .

(g) [q1] ; S1 ⊓ [q2] ; S2 ⊑ [q1 ∩ q2] ; (S1 ⊓ S2) .

The lemma can then be used as a basis for proving the corresponding propagation rules for our derived constructs.

Theorem 28.11 The following rules are available for derived statements:

(a) if g then S else S′ fi ; [q] = if g then S ; [q] else S′ ; [q] fi .

(b) if g then [q] ; S else [q′] ; S′ fi = [(g ∩ q) ∪ (¬ g ∩ q′)] ; if g then S else S′ fi .

(c) begin var y | p ; S end ;[q] = begin var y | p ; S ;[q] end provided that y does not occur free in q.

(d) begin var y | p ;[q];S end = [∀y • p ⇒ q];begin var y | p ; S end .

Proof The rules for conditional and block are proved in the same way as the corresponding rules for propagating assertions. 2

The idea behind guard propagation is that we want to eliminate guards from our program. By propagating guards backwards we hope to weaken the predicate inside the guard statement, so that it eventually reduces to true and the whole guard statement can be removed. Note that there is no rule for loops; we assume that whenever a loop is introduced, sufficient information about the loop is added as an assertion after the loop. As an example, we show how context assumptions are used to refine a simple sequential composition. Here we introduce the pair [x = 0] ; {x = 0} at a point in the program where we (by looking at the preceding assignment) feel confident that x = 0 must hold.

  x, z := 0, z · z ; y := x + 1
⊑ {add context assumption pair; skip ⊑ [p] ; {p}}
  x, z := 0, z · z ; [x = 0] ; {x = 0} ; y := “x + 1”
= {use assertion for refinement in context}
  x, z := 0, z · z ; [x = 0] ; y := 1
= {propagate guard statement}
  [true] ; x, z := 0, z · z ; y := 1
= {[true] = skip}
  x, z := 0, z · z ; y := 1

To show that the context assumption approach allows propagating smaller amounts of information, we also show an alternative derivation using context assertion propagation only:

  x, z := 0, z · z ; y := x + 1
= {add initial assertion; skip = {true}}
  {true} ; x, z := 0, z · z ; y := x + 1
= {propagate assert}
  x, z := 0, z · z ; {x = 0 ∧ (∃z0 • z0 · z0 = z)} ; y := x + 1
= {use assertion for refinement in context}
  x, z := 0, z · z ; {x = 0 ∧ (∃z0 • z0 · z0 = z)} ; y := 1
⊑ {drop assert}
  x, z := 0, z · z ; y := 1

These two examples also show the duality between the two approaches; the justi- fications in the first derivation are very similar to those in the second derivation, but in reverse order.

Correctness and Propagation Propagation of context assumptions is also closely linked to correctness of state- ments, in a way similar to propagation of context assertions.

Theorem 28.12 Assume that predicates p and q are given and that S is a conjunctive predicate transformer. Then

(a) p ∩ S.true {| S |} q ≡ S ; [q] ⊑ [p] ; S .

(b) p {| S |} q ⇒ S ; [q] ⊑ [p] ; S .

Proof The result follows directly from Corollary 28.5 and Lemma 13.12. 2

Exactly as in the case of the corresponding result for assertions (Corollary 28.5), this result can be useful when we assume a certain correctness condition and want to use it in an algebraic manipulation of statements.

28.5 Summary and Discussion

The idea of using state assertions as part of the statement language was introduced in Back’s original formulation of the refinement calculus [6, 7]. The basic rules for propagating context assertions into subcomponents were also given there. The present treatment generalizes this considerably and also takes into account the dual notion of context assumptions (guard statements). This dual way of handling context information was introduced by Morgan [105]. The propagation rules for context assertions and context assumptions are also inter- esting in their own right. They give a general paradigm for transporting information relevant for total correctness around in a program statement. Although one might expect otherwise, these rules are not the same as one would get by considering a proof system for partial correctness. There are a number of places where the context assertions and assumptions propagate in a different way than partial correctness assertions, which essentially express reachability of states. 478 28. Refinement in Context

28.6 Exercises

28.1 Prove the following rules for forward propagation of context assertions through different statements:

{p} ; {q} = {p} ; {q} ; {p ∩ q} ,
{p} ; [q] = {p} ; [q] ; {p ∩ q} ,
{p} ; ⟨f⟩ = {p} ; ⟨f⟩ ; {im.f.p} ,
{p} ; [Q] = {p} ; [Q] ; {im.Q.p} .

28.2 Show that the following rule follows from Theorem 28.3 (c):

S ⊓ {p} ; T = {p} ; S ⊓ {p} ; T .

28.3 Give an example of a statement {p} ; S for which there is no predicate q satisfying

{p} ; S = S ; {q} .

Conclude that it is impossible to give a rule for forward propagation of assertions without loss of information.

28.4 Derive rules for propagating assertions backwards into and out of blocks.

28.5 In a language of disjunctive statements, assertions can also be used to indicate context information. In the disjunctive case, assertions have to be interpreted differently. For example, the assertion introduction S = S ; {p} now means that for all initial states there is a possible execution of S such that p holds in the final state. However, the principle of restricted refinement can still be used. Derive rules for introducing assertions before and after the angelic update command. Is there a rule for adding an assertion to one element of a join?

28.6 Complete the proof of Theorem 28.10.

28.7 Complete the proof of Corollary 28.5 (a); i.e., prove that

p ∩ S.true ⊆ S.q ≡ {p} ; S ⊑ S ; {q}

for conjunctive S.

28.8 Prove the following refinement with a detailed derivation:

x := max(x, y) ⊑ if x < y then x := y else skip fi .

29

Iteration of Conjunctive Statements

In our preceding treatment of iteration, we have considered the situation where the body of the iteration is either monotonic or continuous. In this chapter, we look at some of the additional properties that iterative statements have when the bodies are conjunctive. We show how weak and strong iteration are related in this case. We also prove two important properties that hold for iterations of conjunctive predicate transformers: the decomposition property, which shows how iteration of a meet can be rewritten in terms of iteration and sequential composition only, and the leapfrog property. The chapter also gives examples of program transformations that are used to modify the control structure of program statements.

29.1 Properties of Fixed Points

Before going in more detail into properties of conjunctive iterations, we need to establish a few technical results about fixed points. The first shows how to treat fixed points in multiple variables, while the second describes how to distribute function application into a fixed-point expression.

Fixed-Point Diagonalization The following diagonalization lemma gives a way of moving between fixed points over one variable and fixed points over multiple variables. Exactly as for λ- abstraction, the notation (µ xy• t) abbreviates (µ x • (µ y • t)).

Lemma 29.1 Assume that the type Σ is a complete lattice and that the function f : Σ → Σ → Σ is monotonic in both its arguments. Then

(µ xy• f.x.y) = (µ x • f.x.x).

Proof We prove mutual ordering. First:

  (µ x y • f.x.y) ⊑ (µ x • f.x.x)
⇐ {least fixed point induction with f := (λ x y • f.x.y)}
  (µ y • f.(µ x • f.x.x).y) ⊑ (µ x • f.x.x)
⇐ {least fixed point induction with f := (λ y • f.(µ x • f.x.x).y)}
  f.(µ x • f.x.x).(µ x • f.x.x) ⊑ (µ x • f.x.x)
≡ {unfold least fixed point}
  f.(µ x • f.x.x).(µ x • f.x.x) ⊑ f.(µ x • f.x.x).(µ x • f.x.x)
≡ {reflexivity}
  T

Then

  (µ x • f.x.x) ⊑ (µ x y • f.x.y)
⇐ {least fixed point induction with f := (λ x • f.x.x)}
  f.(µ x y • f.x.y).(µ x y • f.x.y) ⊑ (µ x y • f.x.y)
≡ {unfold least fixed point with f := (λ x • (µ y • f.x.y))}
  f.(µ x y • f.x.y).(µ x y • f.x.y) ⊑ (µ y • f.(µ x y • f.x.y).y)
≡ {unfold least fixed point with f := (λ y • f.(µ x y • f.x.y).y)}
  f.(µ x y • f.x.y).(µ x y • f.x.y) ⊑ f.(µ x y • f.x.y).(µ y • f.(µ x y • f.x.y).y)
≡ {fold least fixed point with f := (λ x • (µ y • f.x.y))}
  f.(µ x y • f.x.y).(µ x y • f.x.y) ⊑ f.(µ x y • f.x.y).(µ x y • f.x.y)
≡ {reflexivity}
  T
□

Distributivity into a Fixed Point Many proofs in the rest of this chapter rest on the following important lemma for distributing function application into a fixed-point expression, also known as the rolling rule.

Lemma 29.2 Assume that f and g are monotonic functions on a complete lattice. Then

f.(µ.(g ◦ f )) = µ.( f ◦ g), f.(ν.(g ◦ f )) = ν.( f ◦ g).

Proof We prove the first part; the proof for the greatest fixed point is similar. We first note that f ◦ g and g ◦ f are both monotonic, so the fixed points are well-defined. Now we prove mutual ordering:

  µ.(f ◦ g) ⊑ f.(µ.(g ◦ f))
≡ {definition of composition, change to binder notation for µ}
  (µ x • f.(g.x)) ⊑ f.(µ x • g.(f.x))
⇐ {least fixed point induction}
  f.(g.(f.(µ x • g.(f.x)))) ⊑ f.(µ x • g.(f.x))
≡ {fold least fixed point}
  f.(µ x • g.(f.x)) ⊑ f.(µ x • g.(f.x))
≡ {reflexivity}
  T

and

  f.(µ.(g ◦ f)) ⊑ µ.(f ◦ g)
≡ {definition of composition, change to binder notation for µ}
  f.(µ x • g.(f.x)) ⊑ (µ x • f.(g.x))
≡ {unfold least fixed point}
  f.(µ x • g.(f.x)) ⊑ f.(g.(µ x • f.(g.x)))
⇐ {f monotonic}
  (µ x • g.(f.x)) ⊑ g.(µ x • f.(g.x))
⇐ {least fixed point induction}
  g.(f.(g.(µ x • f.(g.x)))) ⊑ g.(µ x • f.(g.x))
≡ {fold least fixed point}
  g.(µ x • f.(g.x)) ⊑ g.(µ x • f.(g.x))
≡ {reflexivity}
  T
□
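On a finite lattice the rolling rule can be checked by computing least fixed points iteratively. The following Python sketch (ours; the particular monotonic functions are arbitrary) does so on the powerset of a five-element set.

UNIVERSE = frozenset(range(5))

def lfp(h):                        # least fixed point of a monotone h, by iteration
    x = frozenset()                # start from the bottom element ∅
    while h(x) != x:
        x = h(x)
    return x

f = lambda X: X | {0}              # both f and g are monotone on subsets of UNIVERSE
g = lambda X: frozenset(min(x + 1, 4) for x in X) | X

# rolling rule:  f.(µ.(g ∘ f)) = µ.(f ∘ g)
assert f(lfp(lambda X: g(f(X)))) == lfp(lambda X: f(g(X)))
print(sorted(f(lfp(lambda X: g(f(X))))))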

We immediately have two important consequences of this result. 482 29. Iteration of Conjunctive Statements

Theorem 29.3 Assume that h is a monotonic function on predicate transformers and S is a mono- tonic predicate transformer. Then (λX • h.(S ; X)) and (λX • S ; h.X) are also monotonic functions, and

S ; (µ X • h.(S ; X)) = (µ X • S ; h.X), S ; (ν X • h.(S ; X)) = (ν X • S ; h.X).

Proof Use Lemma 29.2 with f := (λX • S ; X) and g := h. 2

The other consequence is the following rolling property for least fixed points of a sequential composition:

Corollary 29.4 For arbitrary monotonic predicate transformers S and T ,

µ.(S ; T ) = S.(µ.(T ; S)) .

Proof Use Lemma 29.2 with f := S and g := T . 2

29.2 Relating the Iteration Statements

In the conjunctive case, the weak and strong iterations are related as follows.

Lemma 29.5 Let S be an arbitrary conjunctive predicate transformer. Then

(a) Sω ={µ.S} ; S∗ .

(b) Sω = S∗ ; {µ.S} .

Proof The first part is proved as follows:

  S conjunctive
  {µ.S} ; S∗ = Sω
≡ {definitions, β reduction}
  (λ p • {p} ; S∗).(µ.S) = (µ X • S ; X ⊓ skip)
⇐ {fusion theorem (Theorem 19.5)}
  (λ p • {p} ; S∗) ◦ S = (λ X • S ; X ⊓ skip) ◦ (λ p • {p} ; S∗)
≡ {definition of composition, pointwise extension, β reduction}
  (∀ p • {S.p} ; S∗ = S ; {p} ; S∗ ⊓ skip)
≡ {unfold least fixed point}
  (∀ p • {S.p} ; (S ; S∗ ⊓ skip) = S ; {p} ; S∗ ⊓ skip)
≡ {pointwise extension, statement definitions}
  (∀ p q • S.p ∩ S.(S∗.q) ∩ q = S.(p ∩ S∗.q) ∩ q)
≡ {S conjunctive}
  T

We leave it to the reader to show that (λ q • {q} ; S) is continuous for arbitrary monotonic S (this guarantees that the second step of the derivation is valid). The second part of the lemma can be proved in a similar way, showing {µ.S} ; S∗ = S∗ ; {µ.S} using the dual of fusion (with greatest fixed points) and relying on the fact that the function (λ S • {q} ; S) is cocontinuous (Exercise 29.2). □

Recall that µ.S characterizes those states from which repeated execution of S is guaranteed to terminate. Thus Lemma 29.5 shows how the strong iteration Sω can be divided into two parts; the assertion checks for possible nontermination, and the weak iteration performs the repeated execution of S. This is reminiscent of the tra- ditional decomposition of total correctness into termination and partial correctness. The same correspondence can also be expressed using infinite repetition:

Theorem 29.6 Assume that S is a conjunctive predicate transformer. Then Sω = S∗ ⊓ S∞ .

Proof For arbitrary q, we have

  Sω.q
= {Lemma 29.5}
  ({µ.S} ; S∗).q
= {definitions}
  µ.S ∩ S∗.q
= {S∞.q = µ.S for arbitrary q (Lemma 21.3)}
  S∞.q ∩ S∗.q
= {pointwise extension}
  (S∞ ⊓ S∗).q

which proves the lemma. □

Now we can prove the result that was announced in Chapter 21.

Theorem 29.7 Strong iterations, while loops, and guarded iterations preserve conjunctivity.

Proof We have the following derivation:

  S conjunctive, I nonempty
  Sω.(∩ i ∈ I • pi)
= {Theorem 29.6}
  S∗.(∩ i ∈ I • pi) ∩ S∞.(∩ i ∈ I • pi)
= {weak iteration preserves conjunctivity}
  (∩ i ∈ I • S∗.pi) ∩ S∞.(∩ i ∈ I • pi)
= {S∞.q = µ.S for arbitrary q}
  (∩ i ∈ I • S∗.pi) ∩ (∩ i ∈ I • S∞.pi)
= {distributivity}
  (∩ i ∈ I • S∗.pi ∩ S∞.pi)
= {Theorem 29.6}
  (∩ i ∈ I • Sω.pi)

The results for while loops and guarded iterations now follow directly, since they can be rewritten using strong iteration. □

29.3 Iteration of Meets

We shall now prove a number of results for weak and strong iterations. These results show clearly the usefulness of algebraic reasoning in general and fixed- point reasoning in particular. Although the general results are for weak and strong iteration only, they can be used for deriving properties of while loops as well, as shown later in this chapter.

Leapfrog Properties The following lemma states that conjunctive iterations have what is sometimes called the leapfrog property.

Lemma 29.8 Assume that predicate transformer S is conjunctive and T monotonic. Then

S ; (T ; S)ω = (S ; T )ω ; S , S ; (T ; S)∗ = (S ; T )∗ ; S .

Proof The second part of the lemma is proved as follows:

  S conjunctive, T monotonic
  S ; (T ; S)∗
= {definition}
  S ; (ν X • T ; S ; X ⊓ skip)
= {Theorem 29.3 with h := (λ X • T ; X ⊓ skip)}
  (ν X • S ; (T ; X ⊓ skip))
= {S conjunctive}
  (ν X • S ; T ; X ⊓ S)
= {Lemma 21.2}
  (S ; T)∗ ; S

The proof of the first part is similar. □

The proof of Lemma 29.8 rests heavily on the conjunctivity of S. If S is not assumed to be conjunctive, it is possible to show the following (Exercise 29.3):

S ; (T ; S)∗ ⊑ (S ; T)∗ ; S .

The leapfrog property can now be used to prove a rule for manipulating while loops.

Lemma 29.9 Assume that S and T are conjunctive and that g and g′ are predicates such that g′ {| S |} g and ¬ g′ {| S |} ¬ g. Then the following refinement holds:

S ; while g do T ; S od ⊑ while g′ do S ; T od ; S .

Proof The proof is a short derivation:

  S ; while g do T ; S od
= {Lemma 21.8}
  S ; ([g] ; T ; S)ω ; [¬ g]
= {leapfrog property for strong iteration}
  (S ; [g] ; T)ω ; S ; [¬ g]
⊑ {assumptions, Theorem 28.12}
  ([g′] ; S ; T)ω ; [¬ g′] ; S
= {Lemma 21.8}
  while g′ do S ; T od ; S
□

Decomposition Properties The following iteration decomposition lemma shows that it is possible to rewrite iterations of meets in a form that contains only sequential composition and iteration:

Lemma 29.10 Assume that predicate transformer S is conjunctive and T monotonic. Then

(S ⊓ T)∗ = S∗ ; (T ; S∗)∗ ,
(S ⊓ T)ω = Sω ; (T ; Sω)ω .

Proof We show the proof for strong iteration (the proof for weak iteration is similar).

  S conjunctive, T monotonic
  (S ⊓ T)ω
= {definition}
  (µ X • (S ⊓ T) ; X ⊓ skip)
= {distributivity}
  (µ X • S ; X ⊓ T ; X ⊓ skip)
= {diagonalization (Lemma 29.1)}
  (µ X • µ Y • S ; Y ⊓ T ; X ⊓ skip)
= {Lemma 21.2}
  (µ X • Sω ; (T ; X ⊓ skip))
= {Theorem 29.3}
  Sω ; (µ X • T ; Sω ; X ⊓ skip)
= {definition of strong iteration}
  Sω ; (T ; Sω)ω
□

Combining Lemma 29.10 with the leapfrog properties in Lemma 29.8 we imme- diately get the following corollary:

Corollary 29.11 Assume that predicate transformer S is conjunctive and T monotonic. Then

(S ⊓ T)∗ = (S∗ ; T)∗ ; S∗ ,
(S ⊓ T)ω = (Sω ; T)ω ; Sω .

29.4 Loop Decomposition

The decomposition rule for iterations can be used to prove a decomposition rule for while loops. It can be seen as a basic rule, useful for deriving more specific rules.

Theorem 29.12 Assume that statements S and T are conjunctive. Then

while g ∪ h do if g then S else T fi od =
while g do S od ; while h do T ; (while g do S od) od

for arbitrary predicates g and h.

Proof

  while g ∪ h do if g then S else T fi od
= {rewrite loop using iteration}
  ([g ∪ h] ; if g then S else T fi)ω ; [¬(g ∪ h)]
= {rewrite conditional}
  ([g ∪ h] ; ([g] ; S ⊓ [¬ g] ; T))ω ; [¬(g ∪ h)]
= {distributivity, homomorphism, lattice properties}
  ([g] ; S ⊓ [¬ g ∩ h] ; T)ω ; [¬ g ∩ ¬ h]
= {decomposition (Lemma 29.10)}
  (([g] ; S)ω ; [¬ g ∩ h] ; T)ω ; ([g] ; S)ω ; [¬ g ∩ ¬ h]
= {homomorphism}
  (([g] ; S)ω ; [¬ g] ; [h] ; T)ω ; ([g] ; S)ω ; [¬ g] ; [¬ h]
= {leapfrog (Lemma 29.8)}
  ([g] ; S)ω ; [¬ g] ; ([h] ; T ; ([g] ; S)ω ; [¬ g])ω ; [¬ h]
= {rewrite iterations into loops}
  while g do S od ; while h do T ; (while g do S od) od
□

Theorem 29.12 may not seem very useful at first sight, but if the inner loop on the right-hand side can be refined by a simple statement, then the right-hand side may be a more efficient loop than that on the left-hand side (it may involve a smaller number of repetitions).
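For deterministic bodies, Theorem 29.12 can be observed by simply running both loop structures. The following Python sketch (ours; S, T, g, and h are arbitrary choices that make both loops terminate) checks that the two programs compute the same final state.

def S(x): return x + 1          # body taken when g holds
def T(x): return x + 2          # body taken when ¬g ∧ h holds
def g(x): return x % 2 == 1     # g: x is odd
def h(x): return x < 40         # h

def lhs(x):                     # while g ∪ h do if g then S else T fi od
    while g(x) or h(x):
        x = S(x) if g(x) else T(x)
    return x

def rhs(x):                     # while g do S od ; while h do T ; (while g do S od) od
    while g(x):
        x = S(x)
    while h(x):
        x = T(x)
        while g(x):
            x = S(x)
    return x

assert all(lhs(x) == rhs(x) for x in range(20))
print(lhs(3), rhs(3))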

29.5 Other Loop Transformations

By setting T = S in Theorem 29.12 we immediately get the following rule:

Corollary 29.13 Assume that statement S is conjunctive. Then

while g ∪ h do S od = while h do S od ; while g do S ; (while h do S od) od for arbitrary predicates g and h. Next, we consider a rule that allows us to remove (or add) a redundant loop: Theorem 29.14 Assume that S is monotonic. Then

while g do S od ; while g do S od = while g do S od . Furthermore, if S is conjunctive, then 488 29. Iteration of Conjunctive Statements

while g do S od = while g ∩ h do S od ; while g do S od .

Proof We have

  while g do S od ; while g do S od
= {rewrite loop using iteration}
  ([g] ; S)ω ; [¬ g] ; ([g] ; S)ω ; [¬ g]
= {unfold iteration}
  ([g] ; S)ω ; [¬ g] ; ([g] ; S ; ([g] ; S)ω ⊓ skip) ; [¬ g]
= {distributivity}
  ([g] ; S)ω ; ([¬ g] ; [g] ; S ; ([g] ; S)ω ⊓ [¬ g] ; skip) ; [¬ g]
= {skip is unit, guard homomorphism}
  ([g] ; S)ω ; ([¬ g ∩ g] ; S ; ([g] ; S)ω ⊓ [¬ g]) ; [¬ g]
= {guard property}
  ([g] ; S)ω ; (magic ; S ; ([g] ; S)ω ⊓ [¬ g]) ; [¬ g]
= {magic properties}
  ([g] ; S)ω ; [¬ g] ; [¬ g]
= {guard property}
  ([g] ; S)ω ; [¬ g]
= {rewrite into loop}
  while g do S od

For the second half, we then have

  while g do S od
= {Theorem 29.12 with h := g ∩ h}
  while g ∩ h do S od ; while g do S ; (while g ∩ h do S od) od
= {first part of theorem}
  while g ∩ h do S od ; while g ∩ h do S od ; while g do S ; (while g ∩ h do S od) od
= {Theorem 29.12}
  while g ∩ h do S od ; while g do S od

□

29.6 Example: Finding the Period

We finish the chapter with a small case study, developing a program that finds the period of function iteration. Iterating a function f at a value x means constructing a sequence x, f.x, f².x, ... (recall that f⁰.x = x and f^(n+1).x = f.(f^n.x)). When constructing the iterated sequence, two things can happen. Either the iteration generates new values forever, or else at some point f^(k+n).x = f^k.x for some k and some n > 0. In the latter case the sequence has a period. Now assume that f : Nat → Nat is a function such that iteration of f from 0 has a period, i.e., assume

(∃k n • n > 0 ∧ f^(k+n).0 = f^k.0) .    (∗)

The problem is to find the minimal period p and a value x in the period. Using the definition

minper.f.p.x ≜ p > 0 ∧ f^p.x = x ∧ (∃m • x = f^m.0) ∧ (∀n • n > 0 ∧ f^n.x = x ⇒ p ≤ n)

we can specify the problem as the following relational assignment:

[x, p := x′, p′ | minper.f.p′.x′] .

First Solution

Let us give the name k0 to the smallest number of iterations that takes us into the period of f:

k0 ≜ min.{k | ∃n • n > 0 ∧ f^(k+n).0 = f^k.0} .

The assumption (∗) guarantees that the set is nonempty, so it has a well-defined smallest element (because the natural numbers are well-ordered). A simple way of finding the period would be to compute the iteration sequence (by executing x := f.x repeatedly, starting from x = 0) and at every step compare the new value with all the previous values. When the first repetition is found, we have reached the point where x = f^k0.0 and f^p.x = x, and we know both the period and a value in the period. However, this solution consumes too much of both time and space, so we want to find a more efficient solution. We still want to use the idea of executing x := f.x repeatedly, starting with x = 0, but without having to store the values generated so far. One way of doing this is to introduce an extra variable y and repeatedly execute x, y := f.x, f.(f.y), starting from x = 0 ∧ y = f.0. An informal argument for this is as follows:

(i) When x = f k0 .0, we know that y must also be a value inside the period, since y races ahead of x (and since both are then inside the period, y is also behind x). (ii) When we continue, we know that y moves inside the period one step closer to x every time x, y := f.x, f.( f.y) is executed.

This means that when y = x, we can be sure that x is inside the period. After this, we can find the length of the period separately. The idea described above is the basis for dividing the problem into subproblems:

  [x, p := x′, p′ | minper.f.p′.x′]
⊑ {introduce sequential composition}
  [x := x′ | (∃n • minper.f.n.x′)] ;
  {∃n • minper.f.n.x} ;
  [p := p′ | minper.f.p′.x]
⊑ {introduce block with local variable}
  begin var y : Nat ;
    “[x, y := x′, y′ | (∃n • minper.f.n.x′)]” ;
    {∃n • minper.f.n.x} ;
    [p, y := p′, y′ | minper.f.p′.x]
  end
⊑ {introduce sequential composition}
  begin var y : Nat ;
    [x, y := x′, y′ | x′ = f^k0.0 ∧ (∃m | m > 0 • y′ = f^m.x′)] ;
    {“x = f^k0.0 ∧ (∃m | m > 0 • y = f^m.x)”} ;
    [x, y := x′, y′ | (∃n • minper.f.n.x′)] ;
    {∃n • minper.f.n.x} ;
    [p, y := p′, y′ | minper.f.p′.x]
  end
= {rewrite assertion (Exercise 29.9)}
  begin var y : Nat ;
    [x, y := x′, y′ | x′ = f^k0.0 ∧ (∃m | m > 0 • y′ = f^m.x′)] ;
    {x = f^k0.0 ∧ (∃m | m > 0 • y = f^m.x) ∧ (∃n • x = f^n.y)} ;
    [x, y := x′, y′ | (∃n • minper.f.n.x′)] ;
    {∃n • minper.f.n.x} ;
    [p, y := p′, y′ | minper.f.p′.x]
  end

Here the first step divides the derivation into two parts: first we find a value x in the period, and then we find the length p of the period. The second step adds the local variable y, and the third step divides finding the right value for x into two parts (in accordance with the argument above): first we make x equal to f k0 .0, and then we continue until we can detect that x is in fact within the period. The last step rewrites the assertion from stating that y is ahead of x to stating that y is both ahead of and behind x. This information will be crucial in later loop introductions. The verification of the two sequential composition introductions and the block introduction is straightforward, so we leave them to the reader.

Introducing Loops

We are now in a position to introduce the repeated assignments to x and y that were announced earlier. We start from the first demonic assignment:

   [x, y := x′, y′ | x′ = f^{k0}.0 ∧ (∃m | m > 0 • y′ = f^m.x′)]
⊑ {add initial assignment}
   x, y := 0, f.0 ;
   {x = 0 ∧ y = f.0} ;
   [x, y := x′, y′ | x′ = f^{k0}.0 ∧ (∃m | m > 0 • y′ = f^m.x′)]
⊑ {introduce loop}
   x, y := 0, f.0 ;
   while x ≠ f^{k0}.0 do
      [x, y := x′, y′ | x′ = f.x ∧ (∃m | m > 0 • y′ = f^m.x′)]
   od

The loop invariant here is (∃k • k ≤ k0 ∧ x = f^k.0) ∧ (∃m | m > 0 • y = f^m.x), "x has not passed f^{k0}.0, and y is ahead of x". The variant is k0 − min.{k | x = f^k.0}, "the number of iterations remaining" (the minimum operator min is discussed in Exercise 29.11). Note that by initializing y to f.0 we make sure that y is initially ahead of x. The nondeterminism in the loop body allows y to be updated freely on every iteration, as long as it stays "ahead of" x. Next we can introduce a loop for the second part (an assertion followed by a demonic assignment):

   {x = f^{k0}.0 ∧ (∃m | m > 0 • y = f^m.x) ∧ (∃n • x = f^n.y)} ;
   [x, y := x′, y′ | (∃n • minper.f.n.x′)]
⊑ {introduce loop}
   while x ≠ y do x, y := f.x, f^2.y od

In this case the loop invariant is (∃m | m > 0 • y = f^m.x) ∧ (∃n • x = f^n.y), "y is both ahead of and behind x". The variant is min.{m | x = f^m.y}, the number of iterations remaining.

Finally, we introduce a loop for finding the period once x has its correct value (the final combination of an assertion and a demonic assignment):

   {∃n • minper.f.n.x} ;
   [p, y := p′, y′ | minper.f.p′.x]
⊑ {add initial assignment}
   p, y := 1, f.x ;
   "{p = 1 ∧ y = f.x ∧ (∃n • minper.f.n.x)} ;
   [p, y := p′, y′ | minper.f.p′.x]"
⊑ {introduce loop}
   ··· while x ≠ y do p, y := p + 1, f.y od ···

Here the loop invariant is (∃n • minper.f.n.x ∧ 0 < p ≤ n ∧ y = f^p.x), "p has not passed the value of the period". The variant is min.{n | n > 0 ∧ f^n.x = x} − p, the number of iterations remaining.
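Before turning to the correctness proofs, it may help to see the three loops side by side in executable form. The following Python sketch is an illustration only: the guard of the first loop mentions f^{k0}.0, which the program under derivation cannot compute, so the sketch finds that value by brute force first (the variable entry); the nondeterministic first loop body is resolved as x, y := f.x, f.(f.y), the same choice that the derivation itself makes below.

    def find_period_three_loops(f):
        """Illustrative rendering of the three derived loops (assumes (*))."""
        seen, v = set(), 0
        while v not in seen:            # brute force only: locate f^k0.0,
            seen.add(v)                 # the first value that repeats
            v = f(v)
        entry = v                       # entry = f^k0.0

        x, y = 0, f(0)
        while x != entry:               # loop 1: drive x to f^k0.0, y stays ahead of x
            x, y = f(x), f(f(y))
        while x != y:                   # loop 2: y is ahead of and behind x; close the gap
            x, y = f(x), f(f(y))
        p, y = 1, f(x)                  # loop 3: x is on the period; measure its length
        while x != y:
            p, y = p + 1, f(y)
        return p, x

    print(find_period_three_loops(lambda n: (n * n + 1) % 13))   # (4, 5)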

Loop Introduction Proofs

The second loop introduction is the most interesting one, so we show its proof in some detail. To show that the invariant holds initially, we need to prove

   x = f^{k0}.0 ∧ x ≠ y ∧ (∃m | m > 0 • y = f^m.x) ∧ (∃n • x = f^n.y)
      ⇒ (∃m | m > 0 • y = f^m.x) ∧ (∃n • x = f^n.y) ,

which is trivially true. We then show invariance and termination separately. The loop body must preserve the invariant:

   x ≠ y ∧ (∃m | m > 0 • y = f^m.x) ∧ (∃n • x = f^n.y)
      ⇒ (∃m | m > 0 • f^2.y = f^{m+1}.x) ∧ (∃n • f.x = f^{n+2}.y)
≡ {quantifier rules}
   (∀m n • x ≠ y ∧ m > 0 ∧ y = f^m.x ∧ x = f^n.y
      ⇒ (∃m | m > 0 • f^2.y = f^{m+1}.x) ∧ (∃n • f.x = f^{n+2}.y))

This is not hard; we choose m + 1 and n − 1 as existential witnesses (from x ≠ y it follows that n > 0). Next, the loop body must decrease the variant:

   (∀w • x ≠ y ∧ (∃m | m > 0 • y = f^m.x) ∧ (∃n • x = f^n.y) ∧ min.{m | x = f^m.y} = w
      ⇒ min.{m | f.x = f^{m+2}.y} < w)
≡ {quantifier rules}
   (∀m n • x ≠ y ∧ m > 0 ∧ y = f^m.x ∧ x = f^n.y
      ⇒ min.{m | f.x = f^{m+2}.y} < min.{m | x = f^m.y})

Here the argument is that for every element m in {m | x = f^m.y}, the number m − 1 must be in {m | f.x = f^{m+2}.y} (the antecedent implies that 0 is not in {m | x = f^m.y} and that the sets are nonempty). The full formal proof requires some properties of the min operator (see Exercise 29.11). Finally, the postcondition must hold on exit:

   x = y ∧ (∃m | m > 0 • y = f^m.x) ∧ (∃n • x = f^n.y) ⇒ (∃n • minper.f.n.x) .

This follows because (∃m | m > 0 • x = f^m.x) implies (∃n • minper.f.n.x) (the natural numbers are well-ordered, so the nonempty set of periods at x has a smallest element).
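The three proof obligations can also be exercised numerically on a small example (a sanity check only, not a substitute for the proof). The following Python sketch runs the second loop and asserts, at every iteration, that the invariant (∃m | m > 0 • y = f^m.x) ∧ (∃n • x = f^n.y) still holds and that the variant min.{m | x = f^m.y} has decreased; the search bound and the example function are illustrative assumptions.

    def check_second_loop(f, bound=1000):
        """Run the second loop from its precondition and check, at each step,
        that the invariant is preserved and the variant decreases."""
        def f_iter(n, x):
            for _ in range(n):
                x = f(x)
            return x
        def invariant(x, y):
            ahead = any(y == f_iter(m, x) for m in range(1, bound))
            behind = any(x == f_iter(n, y) for n in range(bound))
            return ahead and behind
        def variant(x, y):
            return min(m for m in range(bound) if x == f_iter(m, y))

        seen, v = set(), 0
        while v not in seen:          # establish the precondition by brute force:
            seen.add(v)               # v becomes f^k0.0, the first repeated value
            v = f(v)
        x, y = v, f(v)                # x = f^k0.0, y is one step ahead of x
        assert invariant(x, y)
        while x != y:
            w = variant(x, y)
            x, y = f(x), f(f(y))
            assert invariant(x, y) and variant(x, y) < w
        return x                      # x now lies inside the period

    print(check_second_loop(lambda n: (n * n + 1) % 13))   # 5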

Final Refinements

Collecting the loop introductions, we get the following version of the program:

   begin var y : Nat ;
      x, y := 0, f.0 ;
      "while x ≠ f^{k0}.0 do
          [x, y := x′, y′ | x′ = f.x ∧ (∃m | m > 0 • y′ = f^m.x′)]
       od ;
       while x ≠ y do x, y := f.x, f^2.y od" ;
      p, y := 1, f.x ;
      while x ≠ y do p, y := p + 1, f.y od
   end .

There is still a loop guard that mentions the unknown value k0. However, if the body of that loop can be made the same as the body of the following loop, then we can try to merge the two loops. To do that, we first make the first loop body more deterministic and then use the loop invariant of the first loop to rewrite the guard:

   while x ≠ f^{k0}.0 do
      "[x, y := x′, y′ | x′ = f.x ∧ (∃m | m > 0 • y′ = f^m.x′)]"
   od ;
   while x ≠ y do x, y := f.x, f^2.y od
⊑ {make loop body more deterministic}
   while "x ≠ f^{k0}.0" do x, y := f.x, f^2.y od ;
   while x ≠ y do x, y := f.x, f^2.y od
= {use loop invariant to rewrite guard}
   while x ≠ y ∧ x ≠ f^{k0}.0 do x, y := f.x, f^2.y od ;

   while x ≠ y do x, y := f.x, f^2.y od
= {Theorem 29.14}
   while x ≠ y do x, y := f.x, f^2.y od

When rewriting the guard of the first loop we used the following fact:

   (∃k • k ≤ k0 ∧ x = f^k.0) ∧ (∃m | m > 0 • y = f^m.x) ∧ x ≠ f^{k0}.0 ⇒ x ≠ y ;

i.e., "if x is not yet in the period and y is ahead of x, then x ≠ y" (Exercise 29.10). Thus we have reached the final version of the program; if the sequence f^0.0, f^1.0, ... has a period, then

   [x, p := x′, p′ | minper.f.p′.x′]
⊑ begin var y : Nat ;
      x, y := 0, f.0 ;
      while x ≠ y do x, y := f.x, f^2.y od ;
      p, y := 1, f.x ;
      while x ≠ y do p, y := p + 1, f.y od
   end .
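For comparison, the final program can be transcribed into Python as follows (an illustration only; the example function is arbitrary). The result is essentially Floyd's "tortoise and hare" cycle-detection algorithm, followed by a loop that walks once around the period to measure its length.

    def find_period(f):
        """Returns (p, x) with minper.f.p.x, assuming iteration of f from 0
        has a period (otherwise the first loop does not terminate)."""
        x, y = 0, f(0)
        while x != y:                   # x advances one step, y two steps
            x, y = f(x), f(f(y))
        p, y = 1, f(x)                  # x is now inside the period
        while x != y:                   # count the steps around the period
            p, y = p + 1, f(y)
        return p, x

    print(find_period(lambda n: (n * n + 1) % 13))   # (4, 5)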

29.7 Summary and Discussion

The properties of conjunctive iterations, as described here, are motivated by similar efforts by other researchers to handle the leapfrog property and related rules. The diagonalization lemma and the rolling lemma are "folk theorems". The decomposition and leapfrog properties have been proved using a fixed-point calculus by Backhouse et al. [34]. Both the decomposition and the leapfrog properties hold for regular languages and are essential in applying the algebra of regular languages to reasoning about programs [124]. Leapfrog rules for loops have been discussed in unpublished notes by Nelson and van de Snepscheut. The problem of finding the period for the iteration of a function was used as an example of a "nice algorithm" by van de Snepscheut [132].

29.8 Exercises

29.1 Prove that if predicate transformer S is conjunctive, then the following holds:

   S ; S∗ = S∗ ; S .

Compare this with Exercise 21.2.

29.2 The dual of the fusion theorem guarantees that if h is universally meet homomorphic, then

   h ◦ f = g ◦ h ⇒ h.(ν.f) = ν.g .

First show that the function (λS • {q} ; S) is universally meet homomorphic for an arbitrary predicate q. Then prove

   {µ.S} ; S∗ = S∗ ; {µ.S}

for conjunctive S (begin by writing the right-hand side as (νX • S ; X ⊓ {µ.S})).

29.3 Prove that if predicate transformers S and T are monotonic, then the following holds:

   S ; (T ; S)∗ ⊑ (S ; T)∗ ; S .

29.4 Investigate whether it is possible to use the leapfrog rule for strong iteration to get a refinement rule for loops in the opposite direction from that in Lemma 29.9, i.e., a rule of the form

   while g do S ; T od ; S ⊑ S ; while g do T ; S od .

29.5 Show the following rule for an arbitrary monotonic predicate transformer S:

   Sω ; Sω = Sω .

29.6 Prove the following loop elimination rule:

   {¬g} ; while g do S od ⊑ skip .

29.7 Prove the following variation on the loop decomposition rule, for conjunctive S and T:

   while g do if h then S else T fi od
      = while g ∩ h do S od ;
        while g do {¬h} ; T ; (while g ∩ h do S od) od

(the assertion can be useful in subsequent refinement steps).

29.8 Prove the following general facts about periodicity of function iteration:

   f^{k+p}.x = f^k.x ⇒ f^{k+2p}.x = f^k.x ,
   f^{k+p}.x = f^k.x ⇒ f^{k+m+p}.x = f^{k+m}.x .

29.9 Prove the following, under assumption (∗) and the definition of k0 given in the text.

   x = f^{k0}.0 ∧ (∃m | m > 0 • y = f^m.x) ⇒ (∃n • x = f^n.y) .

29.10 Prove the following, under assumption (∗) and the definition of k0 given in the text.

   (∃k • k ≤ k0 ∧ x = f^k.0) ∧ (∃m | m > 0 • y = f^m.x) ∧ x ≠ f^{k0}.0 ⇒ x ≠ y .

29.11 Define the minimum operator for sets of natural numbers as follows:

   min.A ≙ (εn ∈ A • (∀m ∈ A • n ≤ m)) .

Prove the following properties:

   A ≠ ∅ ⇒ (∀m ∈ A • min.A ≤ m) ,
   A ≠ ∅ ∧ (∀m ∈ A • (∃n ∈ B • n < m)) ⇒ min.B < min.A .

Appendix

A: Association and Precedence

The association rules for operators are as follows:

• Postfix operators associate to the left, so S◦◦ means (S◦)◦.
• Function application associates to the left, so f.x.y means (f.x).y.
• Prefix operators associate to the right, so ¬¬t means ¬(¬t).
• Infix operators associate to the left, so a − b − c means (a − b) − c.

Generally, the precedences are as follows (where 1 is highest):

1. Postfix operators (written as subscript or superscript)
2. Function application
3. Prefix operators
4. Infix operators (for details, see below)
5. Mixfix operators and binders (scope is indicated by a pair of delimiters)

Infix operators have precedences as follows (where 1 is highest):

1. · / div mod ×
2. + − →
3. ≤ < ≥ >
4. ; • ◦ \
5. ∩

6. ∪ ˆ
7. :=
8. ⊑ ⊏ ⊆ ⊂
9. = ∈
10. ∧
11. ∨
12. ⇒ ⇐
13. · {| · |} ·   · (| · |) ·
14. ≡

Negated operators have the same precedence as the original operator.

B: Homomorphism Tables

In these tables we use the following abbreviations for statement constructors:

   IfThen ≙ (λS1, S2 • if g then S1 else S2 fi)

   If ≙ (λS1, ..., Sn • if g1 → S1 [] ··· [] gn → Sn fi)
   While ≙ (λS • while g do S od)

   Do ≙ (λS1, ..., Sn • do g1 → S1 [] ··· [] gn → Sn od)

preserves      ⊑      ⊥      ⊤      ⊓      ⊔      lim     ¬
;              yes    yes    yes    yes    yes    yes     yes
⊓              yes    no     yes    yes    no     yes¹    no
⊔              yes    yes    no     no     yes    yes     no
lim            yes    yes    no     no     yes    yes     no
IfThen         yes    yes    yes    yes    yes    yes     no
If             yes    yes    no     yes    no     yes     no
µ              yes²   yes²   no     no     yes²   yes²    no
ν              yes²   no     yes²   yes²   no     no      no
(λS • Sω)      yes    yes    no     yes    no     yes     no
(λS • S∗)      yes    yes    yes    yes    no     no      no
(λS • S∞)      yes    yes    no     yes    yes    yes     no
While          yes    yes    no     yes    yes    yes     no
Do             yes    yes    no     yes    no     yes     no

¹ if the meet is finite
² when the argument is in Mtran(Σ, Γ) →m Mtran(Σ, Γ)

TABLE 30.1. Preservation of homomorphic properties

is             ⊑      ⊥      ⊤      ⊓      ⊔      lim     ¬
{p}            yes    yes    no     yes    yes    yes     no
[p]            yes    no     yes    yes    yes    no      no
⟨f⟩            yes    yes    yes    yes    yes    yes     yes
{R}            yes    yes    no     no     yes    yes     no
[R]            yes    no     yes    yes    no     yes¹    no

¹ if R is total and image-finite

TABLE 30.2. Homomorphic properties of basic statements

is                     ⊑      ⊥      ⊤      ⊓      ⊔      lim     ¬      1      ;
(λp • {p})             yes    yes    no     yes¹   yes    yes     no     yes    yes
(λp • [p])             yes◦   yes◦   no     yes◦¹  yes◦   yes◦    no     yes    yes
(λf • ⟨f⟩)             yes    –      –      –      –      –       –      yes    yes
(λR • {R})             yes    yes    no     no     yes    yes     no     yes    yes
(λR • [R])             yes◦   yes◦   no     no     yes◦   yes◦    no     yes    yes
(λS • S ; T)           yes    yes    yes    yes    yes    yes     yes    no     no
(λT • S ; T)           yes    yes²   yes³   yes⁴   yes⁵   yes⁶    yes⁷   no     no
(λS, T • S ; T)        yes    yes    yes    no     no     no      no     no     no
(λS • S ⊓ T)           yes    yes    no     yes    yes    yes     no     no     no
(λS, T • S ⊓ T)        yes    yes    yes    yes    no     no      no     no     yes◦
(λS • S ⊔ T)           yes    no     yes    yes    yes    yes     no     no     no
(λS, T • S ⊔ T)        yes    yes    yes    no     yes    yes     no     no     yes◦
(λS • ¬ S)             yes◦   yes◦   yes◦   yes◦   yes◦   yes◦    yes    no     no
(λS • S◦)              yes◦   yes◦   yes◦   yes◦   yes◦   yes◦    yes    yes    yes
(λS • S∗)              yes    yes    no     no     no     no      no     yes    no
(λS • Sω)              yes    yes    no     no     no     no      no     no     no
(λS • S∞)              yes    yes    no     no     no     no      no     no     no
IfThen                 yes    yes    yes    no     no     no      no     no     no
If                     yes    yes    no     no     no     no      no     no     no
While                  yes    no     no     no     no     no      no     no     no
Do                     yes    no     no     no     no     no      no     no     no

¹ when the meet is taken over nonempty sets
² if S is strict
³ if S is terminating
⁴ if S is conjunctive
⁵ if S is disjunctive
⁶ if S is continuous
⁷ if S is ¬-homomorphic

TABLE 30.3. Homomorphic properties of predicate transformers

For each operation on the left, the entry indicates whether the construct preserves (or has) the homomorphism property indicated in the column: ⊑ (monotonicity), ⊥ (bottom), ⊤ (top), ⊓ (meet), ⊔ (join), and ¬ (negation). Duality (yes◦) indicates that the property holds for the dual lattice.

References

[1]M. Abadi, L. Lamport, and P. Wolper. Realizable and unrealizable specifications of reactive systems. In G. Ausiello et al., editor, Proc. 16th ICALP, 1–17, Stresa, Italy, 1989. Springer-Verlag. [2]P. Aczel. Quantifiers, games and inductive definitions. In Proc. 3rd Scandinavian Logic Symposium. North-Holland, 1975. [3]P.B. Andrews. An Introduction to Mathematical Logic and Type Theory: To Truth through Proof. Academic Press, 1986. [4]K.R. Apt. Ten years of Hoare’s logic: A survey — part 1. ACM Transactions on Programming Languages and Systems, 3:431–483, 1981. [5]K.R. Apt and G.D. Plotkin. Countable nondeterminism and random assignment. Journal of the ACM, 33(4):724–767, October 1986. [6]R.J. Back. On the correctness of refinement in program development. Ph.D. thesis, Report A-1978-4, Department of Computer Science, University of Helsinki, 1978. [7]R.J. Back. Correctness Preserving Program Refinements: Proof Theory and Applica- tions, volume 131 of Mathematical Centre Tracts. Mathematical Centre, Amsterdam, 1980. [8]R.J. Back. Semantics of unbounded non-determinism. Proc. 7th Colloqium on Au- tomata, Languages and Programming (ICALP), volume 85 of , Lecture Notes in Computer Science, 51–63. Springer-Verlag, 1980. [9]R.J. Back. On correct refinement of programs. Journal of Computer and Systems Sciences, 23(1):49–68, August 1981. [10]R.J. Back. Proving total correctness of nondeterministic programs in infinitary logic. Acta Informatica, 15:233–250, 1981. [11]R.J. Back. Procedural abstraction in the refinement calculus. Reports on computer science and mathematics 55, Abo˚ Akademi, 1987. [12]R.J. Back. A calculus of refinements for program derivations. Acta Informatica, 25:593–624, 1988. 502 References

[13]R.J. Back. Changing data representation in the refinement calculus. In 21st Hawaii International Conference on System Sciences, January 1989. [14]R.J. Back. Refinement calculus, part II: Parallel and reactive programs. In REX Work- shop for Refinement of Distributed Systems, volume 430 of Lecture Notes in Computer Science, Nijmegen, the Netherlands, 1989. Springer-Verlag. [15]R.J. Back. Refining atomicity in parallel algorithms. In PARLE Conference on Par- allel Architectures and Languages Europe, Eindhoven, the Netherlands, June 1989. Springer-Verlag. [16]R.J. Back. Refinement diagrams. In J.M. Morris and R.C.F. Shaw, editors, Proc. 4th Refinement Workshop, Workshops in Computer Science, 125–137, Cambridge, England, 9–11 Jan. 1991. Springer-Verlag. [17]R.J. Back. Atomicity refinement in a refinement calculus framework. Reports on computer science and mathematics 141, Abo˚ Akademi, 1992. [18]R.J. Back. Refinement calculus, lattices and higher order logic. Technical report, Marktoberdorf Summer School on Programming Logics, 1992. [19]R.J. Back and M.J. Butler. Exploring summation and product operations in the refine- ment calculus. In Mathematics of Program Construction, volume 947 of Lecture Notes in Computer Science, 128–158, Kloster Irsee, Germany, July 1995. Springer-Verlag. [20]R.J. Back and R. Kurki-Suonio. Distributed co-operation with action systems. ACM Transactions on Programming Languages and Systems, 10:513–554, October 1988. [21]R.J. Back, A. Martin, and K. Sere. Specifying the Caltech asynchronous micropro- cessor. Science of , 26:79–97, 1996. [22]R.J. Back and K. Sere. Stepwise refinement of action systems. In Mathematics of Pro- gram Construction, volume 375 of Lecture Notes in Computer Science, Groningen, the Netherlands, June 1989. Springer-Verlag. [23]R.J. Back and K. Sere. Stepwise refinement of parallel algorithms. Science of Computer Programming, 13:133–180, 1990. [24]R.J. Back and K. Sere. Stepwise refinement of action systems. Structured Programming, 12:17–30, 1991. [25]R.J. Back and J. von Wright. Duality in specification languages: a lattice-theoretical approach. Reports on computer science and mathematics 77, Abo˚ Akademi, 1989. [26]R.J. Back and J. von Wright. A lattice-theoretical basis for a specification language. In Mathematics of Program Construction, volume 375 of Lecture Notes in Computer Science, Groningen, the Netherlands, June 1989. Springer-Verlag. [27]R.J. Back and J. von Wright. Duality in specification languages: a lattice-theoretical approach. Acta Informatica, 27:583–625, 1990. [28]R.J. Back and J. von Wright. Refinement concepts formalised in higher-order logic. Formal Aspects of Computing, 2:247–272, 1990. [29]R.J. Back and J. von Wright. Combining angels, demons and miracles in program specifications. Theoretical Computer Science, 100:365–383, 1992. [30]R.J. Back and J. von Wright. Predicate transformers and higher order logic. In J.W. de Bakker, W-P. de Roever, and G. Rozenberg, editors, REX Workshop on Semantics: Foundations and Applications, LNCS 666, 1–20, Beekbergen, the Netherlands, June 1992. Springer-Verlag. References 503

[31]R.J. Back and J. von Wright. Trace refinement of action systems. In B. Jonsson and J. Parrow, editors, Proc. CONCUR-94, Volume 836 of Lecture Notes in Computer Science, Uppsala, Sweden, 1994. Springer-Verlag. [32]R.J. Back and J. von Wright. Games and winning strategies. Information Processing Letters, 53(3):165–172, February 1995. [33]R. Backhouse and B. Carr´e. Regular algebra applied to path finding problems. J. Inst. Math. Appl., 15:161–186, 1975. [34]R. Backhouse et al. Fixpoint calculus. Information Processing Letters, 53(3), February 1995. [35]J.W. de Bakker. Semantics and termination of nondeterministic programs. In S. Michaelson and R. Milner, editors, Proc. 3th International Colloqium on Au- tomata, Languages and of Programming, 435–477, Edinburgh, Scotland, July 1976. Edinburgh University Press. [36]J.W. de Bakker. Mathematical Theory of Program Correctness. Prentice-Hall, 1980. [37]H. Barendregt. The Lambda calculus: Its Syntax and Semantics. North-Holland, 1984. [38]M. Barr and C. Wells. Category Theory for Computing Science. Prentice-Hall, 1990. [39]E. Best. Notes on predicate transformers and concurrent programs. Technical Report 145, University of Newcastle Computing Laboratory, 1980. [40]G. Birkhoff. Lattice Theory. American Mathematical Society, Providence, 1961. [41]H.J. Boom. A weaker precondition for loops. ACM Transactions on Programming Languages and Systems, 4(4):668–677, October 1982. [42]M. Broy. A theory for nondeterminism, parallellism, communications and concur- rency. Theoretical Computer Science, 46:1–61, 1986. [43]R.M. Burstall and J. Darlington. Some transformations for developing recursive programs. Journal of the ACM, 24(1):44–67, 1977. [44]M.J. Butler. A CSP Approach to Action Systems. Ph.D. thesis, Oxford University, Oxford, England, 1992. [45]M.J. Butler, J. Grundy, T. Langbacka,˚ R. Ruksˇenas,˙ and J. von Wright. The Refinement Calculator: Proof support for Program refinement. Proc. Pacific (FMP’97), Wellington, New Zealand. Springer Series in Discrete Mathematics and Theoretical Computer Science, 1997. [46]M.J. Butler and C.C. Morgan. Action systems, unbounded nondeterminism and infinite traces. Formal Aspects of Computing, 7(1):37–53, 1995. [47]D. Carrington, I.J. Hayes, R. Nickson, G. Watson, and J. Welsh. Refinement in Ergo. Technical Report 94-44, Department of Computer Science, University of Queensland, QLD 4072, Australia, Dec. 1994. [48]A. Church. A formulation of the simple theory of types. Journal of Symbolic Logic, 5:56–68, 1940. [49]T. Coquand and G. Huet. The Calculus of Constructions. Information and Computation, 76:95–120, 1988. [50]B.A. Davey and H.A. Priestley. Introduction to Lattices and Order. Cambridge University Press, 1990. [51]E.W. Dijkstra. Notes on structured programming. In O. Dahl, E.W. Dijkstra, and C.A.R. Hoare, editors, Structured Programming. Academic Press, 1971. [52]E.W. Dijkstra. Guarded commands, nondeterminacy and the formal derivation of programs. Communications of the ACM, 18:453–457, 1975. 504 References

[53]E.W. Dijkstra. A Discipline of Programming. Prentice-Hall International, 1976. [54]E.W. Dijkstra and C.S. Scholten. Predicate Calculus and Program Semantics. Springer-Verlag, 1990. [55]E.W. Dijkstra and A.J.M. van Gasteren. A simple fixpoint argument without the restriction to continuity. Acta Informatica, 23:1–7, 1986. [56]R. Floyd. Assigning meanings to programs. Proc. 19th ACM Symp. on Applied Mathematics, 19–32, 1967. [57]P.H. Gardiner and O. de Moer. An algebraic construction of predicate transformers. In Mathematics of Program Construction, volume 669 of Lecture Notes in Computer Science. Springer-Verlag, 1992. [58]P.H. Gardiner, C. Martin, and O. de Moer. An algebraic construction of predicate transformers. Science of Computer Programming, 22:21–44, 1994. [59]P.H. Gardiner and C.C. Morgan. Data refinement of predicate transformers. Theoretical Computer Science, 87(1):143–162, 1991. [60]A.J.M. van Gasteren. On the Shape of Mathematical Arguments. Volume 445 of Lecture Notes in Computer Science. Springer-Verlag, Berlin, 1990. [61]S.L. Gerhart. Correctness preserving program transformations. Proc. 2nd ACM Conference on Principles of Programming Languages, 1975. [62]M.J.C. Gordon and T.F. Melham. Introduction to HOL. Cambridge University Press, New York, 1993. [63]G. Gr¨atzer. General Lattice Theory. Birkh¨auser Verlag, Basel, 1978. [64]D. Gries. The Science of Programming. Springer-Verlag, New York, 1981. [65]D. Gries and D. Levin. Assignment and procedure call proof rules. ACM Transactions on Programming Languages and Systems, 2(4):564–579, 1980. [66]D. Gries and F. Schneider. A Logical Introduction to Discrete Mathematics. Springer- Verlag, 1993. [67]L. Groves and R. Nickson. A tactic driven refinement tool. In C.B. Jones et al., editors, Proc. 5th Refinement Workshop, London, Jan. 1992. Springer-Verlag. [68]J. Grundy. A window inference tool for refinement. In C.B. Jones et al., editors, Proc. 5th Refinement Workshop, London, Jan. 1992. Springer-Verlag. [69]J. Grundy. Transformational Hierarchical Reasoning. The Computer Journal, 39(4):291–302, 1996. [70]J. Grundy and T. Langbacka.˚ Recording HOL proofs in a structured browsable format. Technical Report 7, Turku Center for Computer Science, 1996. [71]P. Guerreiro. Another characterization of weakest preconditions. In volume 137 of Lecture Notes in Computer Science. Springer-Verlag, 1982. [72]D. Harel. And/or programs: A new approach to structured programming. ACM Transactions on Programming Languages and Systems, 2(1):1–17, 1980. [73]J. Harrison. Inductive definitions: automation and application. In P. Windley, T. Schu- bert, and J. Alves-Foss, editors, Higher Order Logic Theorem Proving and Its Appli- cations: Proc. 8th International Workshop, volume 971 of Lecture Notes in Computer Science, 200–213, Aspen Grove, Utah, USA, September 1995. Springer-Verlag. [74]E. Hehner. do considered od: A contribution to the programming calculus. Acta Informatica, 11:287–304, 1979. [75]E. Hehner. The Logic of Programming. Prentice-Hall, 1984. References 505

[76]E. Hehner. Predicative programming, part I. Communications of the ACM, 27(2):134– 143, 1984. [77]W.H. Hesselink. An algebraic calculus of commands. Report CS 8808, Department of Mathematics and Computer Science, University of Groningen, 1988. [78]W.H. Hesselink. Interpretations of recursion under unbounded nondeterminacy. Theoretical Computer Science, 59:211–234, 1988. [79]W.H. Hesselink. Predicate-transformer semantics of general recursion. Acta Infor- matica, 26:309–332, 1989. [80]W.H. Hesselink. Command algebras, recursion and program transformation. Formal Aspects of Computing, 2:60–104, 1990. [81]W.H. Hesselink. Programs, Recursion and Unbounded Choice. Cambridge Univer- sity Press, 1992. [82]W.H. Hesselink. Nondeterminism and recursion via stacks and games. Theoretical Computer Science, 124:273–295, 1994. [83]J. Hintikka. Language Games and information. Clarendon, London, 1972. [84]C.A.R. Hoare. An axiomatic basis for computer programming. Communications of the ACM, 12(10):576–583, 1969. [85]C.A.R. Hoare. Procedures and parameters: An axiomatic approach. In E. Engeler, editor, Symposium on Semantics of Algorithmic Languages, volume 188 of Lecture Notes in Mathematics, 102–116. Springer-Verlag, 1971. [86]C.A.R. Hoare. Proof of a program: Find. Communications of the ACM, 14:39–45, 1971. [87]C.A.R. Hoare. Proofs of correctness of data representation. Acta Informatica, 1(4):271–281, 1972. [88]C.A.R. Hoare. Communicating Sequential Processes. Prentice-Hall, 1985. [89]C.A.R. Hoare. Programs are predicates. In C.A.R. Hoare and J. Shepherdson, editors, Mathematical Logic and Programming Languages, 141–155. Prentice-Hall, 1985. [90]C.A.R. Hoare, I.J. Hayes, J. He, C.C. Morgan, A.W. Roscoe, J.W. Sanders, I.H. Sorensen, J.M. Spivey, and A. Sufrin. Laws of programming. Communications of the ACM, 30(8):672–686, August 1987. [91]C.A.R. Hoare and J. He. The weakest prespecification. Information Processing Letters, 24:127–132, 1987. [92]D. Jacobs and D. Gries. General correctness: A unification of partial and total correctness. Acta Informatica, 22:67–83, 1985. [93]P.T. Johnstone. Notes on logic and set theory. Cambridge University Press, 1987. [94]C.B. Jones. Systematic Software Development Using VDM. Prentice–Hall Interna- tional, 1986. [95]C.R. Karp. Languages with Expressions of Infinite Length. North-Holland, 1964. [96]D. Knuth. The Art of Computer Programming: Fundamental Algorithms. Addison- Wesley, 1973. [97]T. Langbacka,˚ R. Ruksˇenas,˙ and J. von Wright. TkWinHOL: A tool for doing win- dow inference in HOL. Proc. 1995 International Workshop on Higher Order Logic Theorem Proving and its Applications, Salt Lake City, Utah, USA. Volume 971 of Lecture Notes in Computer Science, Springer-Verlag, 1995. 506 References

[98]C.E. Martin. Preordered Categories and Predicate Transformers. Ph.D. thesis, Oxford University, 1991. [99]P. Martin-L¨of. A theory of types. Techn. Rep. 71-3, University of Stockholm, 1972. [100]R. Milner. The Standard ML core language. Internal Report CSR-168-84, , 1984. [101]R. Milner. Communication and Concurrency. Prentice-Hall, 1989. [102]C.C. Morgan. Data refinement by miracles. Information Processing Letters, 26:243– 246, January 1988. [103]C.C. Morgan. Procedures, parameters and abstraction: separate concerns. Science of Computer Programming, 11:27, 1988. [104]C.C. Morgan. The specification statement. ACM Transactions on Programming Languages and Systems, 10(3):403–419, July 1988. [105]C.C. Morgan. Types and invariants in the refinement calculus. In Mathematics of Pro- gram Construction, volume 375 of Lecture Notes in Computer Science, Groningen, the Netherlands, June 1989. Springer-Verlag. [106]C.C. Morgan. Programming from Specifications. Prentice-Hall, 1990. [107]C.C. Morgan. The cuppest capjunctive capping. In A.W. Roscoe, editor, A Classical Mind: Essays in Honour of C.A.R. Hoare. Prentice-Hall, 1994. [108]J.M. Morris. A theoretical basis for stepwise refinement and the programming calculus. Science of Computer Programming, 9:287–306, 1987. [109]Y.N. Moschovakis. The game quantifier. Proceedings of the American Mathematical Society, 31(1):245–250, January 1972. [110]Y.N. Moschovakis. A model of concurrency with fair merge and full recursion. Information and Computation, 114–171, 1991. [111]D.A. Naumann. Predicate transformer semantics of an Oberon-like language. In E.- R. Olderog, editor, Programming Concepts, Methods and Calculi, 460–480, San Miniato, Italy, 1994. IFIP. [112]D.A. Naumann. Two-Categories and Program Structure: Data Types, Refinement Calculi and Predicate Transformers. Ph.D. thesis, University of Texas at Austin, 1992. [113]G. Nelson. A generalization of Dijkstra’s calculus. ACM Transactions on Program- ming Languages and Systems, 11(4):517–562, October 1989. [114]D. Park. On the semantics of fair parallelism. In volume 86 of Lecture Notes in Computer Science, 504–526, Berlin, 1980. Springer-Verlag. [115]L.C. Paulson. Logic and Computation. Cambridge University Press, 1987. [116]B.C. Pierce. Basic Category Theory for Computer Scientists. MIT Press, 1991. [117]J. Plosila and K. Sere. Action systems in pipelined processor design. In Proc. 3rd International Symposium on Advanced Research in Asynchronous Circuits and Sys- tems (Async97), Eindhoven, the Netherlands, April 1997. IEEE Computer Society Press. [118]G.D. Plotkin. A power-domain construction. SIAM Journal of Computing, 5(3):452– 487, 1976. [119]G.D. Plotkin. Dijkstra’s weakest preconditions and Smyth’s powerdomains. In Ab- stract Software Specifications, volume 86 of Lecture Notes in Computer Science. Springer-Verlag, June 1979. References 507

[120]G.D. Plotkin. A structural approach to operational semantics. DAIMI FN 19, Computer Science Dept., Aarhus University, April 1981. [121]J.C. Reynolds. The Craft of Programming. Prentice-Hall, 1981. [122]P.J. Robinson and J. Staples. Formalising a hierarchical structure of practical mathematical reasoning. Journal of Logic and Computation, 3(1):47–61, 1993. [123]W.-P. de Roever. Dijkstra’s predicate transformer, non-determinism, recursion and termination. In Mathematical Foundationsof Computer Science, volume 45 ofLecture Notes in Computer Science, 472–481. Springer-Verlag, 1976. [124]S. R¨onn. On the Regularity Calculus and its Role in Distributed Programming. Ph.D. thesis, Helsinki University of technology, Helsinki, Finland, 1992. [125]D. Schmidt. Denotational Semantics: Methodology for Language Development. Allyn and Bacon, 1986. [126]D.S. Scott. Data types as lattices. SIAM Journal of Computing, 5:522–587, 1976. [127]D.S. Scott. Logic with denumerably long formulas and finite strings of quantifiers. In Symp. on the Theory of Models, 329–341. North-Holland, 1965. [128]D.S. Scott and C. Strachey. Towards a mathematical semantics for computer lan- guages. Tech. monograph PRG-6, Programming Research Group, University of Oxford, 1971. [129]E. Sekerinski. A type-theoretic basis for an object-oriented refinement calculus. In S. Goldsack and S. Kent, editors, Formal Methods and Object Technology. Springer- Verlag, 1996. [130]M.B. Smyth. Powerdomains. Journal of Computer and Systems Sciences, 16:23–36, 1978. [131]J.L.A. van de Snepscheut. Proxac: An editor for program transformation. Technical Report CS-TR-93-33, Caltech, Pasadena, California, USA, 1993. [132]J.L.A. van de Snepscheut. What Computing is All About. Springer-Verlag, 1994. [133]J.M. Spivey. The Z Notation. Prentice-Hall International, 1989. [134]J. Stoy. Denotational Semantics. Prentice-Hall International, 1976. [135]A. Tarski. On the calculus of relations. J. Symbolic Logic, 6:73–89, 1941. [136]A. Tarski. A lattice theoretical fixed point theorem and its applications. PacificJ. Mathematics, 5:285–309, 1955. [137]R.D. Tennent. The denotational semantics of programming languages. Communica- tions of the ACM, 19:437–452, 1976. [138]M. Utting and K. Robinson. Modular reasoning in an object-oriented refinement calculus. In R. Bird, C.C. Morgan, and J. Woodcock, editors, Mathematics of Program Construction, volume 669 of Lecture Notes in Computer Science, 344–367. Springer- Verlag, 1993. [139]N. Ward and I.J. Hayes. Applications of angelic nondeterminism. In P.A.C. Bailes, editor, Proc. 6th Australian Conference, 391–404, Sydney, Australia, 1991. [140]A. Whitehead and B. Russell. Principia Mathematica. Cambridge University Press, 1927. [141]N. Wirth. Program development by stepwise refinement. Communications of the ACM, 14:221–227, 1971. 508 References

[142]J. von Wright. Program inversion in the refinement calculus. Information Processing Letters, 37(2):95–100, January 1991. [143]J. von Wright. The lattice of data refinement. Acta Informatica, 31:105–135, 1994. [144]J. von Wright. Program refinement by theorem prover. In Proc. 6th Refinement Workshop, London, January 1994. Springer-Verlag.