
Science of Computer Programming Volume 74, Issue 8, 1 June 2009, Pages 629-653 Special Issue on Mathematics of Program Construction (MPC 2006)

The Shadow Knows: Refinement and security in sequential programs *

Carroll Morgan 1

Sch. Comp. Sci. & Eng., Uni. NSW, Sydney 2052 Australia

Abstract

Stepwise refinement is a crucial conceptual tool for system development, encouraging program construction via a number of separate correctness-preserving stages which ideally can be understood in isolation. A crucial conceptual component of security is an adversary’s ignorance of concealed information. We suggest a novel method of combining these two ideas. Our suggestion is based on a mathematical definition of “ignorance-preserving” refinement that extends classical refinement by limiting an adversary’s access to concealed information: moving from specification to implementation should never increase that access. The novelty is the way we achieve this in the context of sequential programs. Specifically we give an operational model (and detailed justification for it), a basic sequential programming language and its operational semantics in that model, a “logic of ignorance” interpreted over the same model, then a program-logical semantics bringing those together; and finally we use the logic to establish, via refinement, the correctness of a real (though small) protocol: Rivest’s Oblivious Transfer. A previous report * treated Chaum’s Dining Cryptographers similarly. In passing we solve the Refinement Paradox for sequential programs.

Key words: Security, privacy, Hoare logic, specification, implementation, logic of knowledge

* This is an extension of an earlier report [1].
Email address: [email protected] (Carroll Morgan).
1 We acknowledge the support of the Aust. Res. Council Grant DP034557.

Preprint submitted to SCP 22 September 2007

1 Introduction

Stepwise refinement is an idealised process whereby program- or system development is “considered as a sequence of design decisions concerning the decomposition of tasks into subtasks and of data into data structures” [2]. We say “idealised” because, as is well known, in practice it is almost never fully achieved: the realities of vague- and changing requirements, of efficiency etc. simply do not cooperate. Nevertheless as a principle to aspire to, a way of organising our thinking, its efficacy is universally recognised: it is an instance of separation of concerns.

A second impediment to refinement (beyond “reality” as above) is its scope. Refinement was originally formalised as a relation between sequential programs [3], based on a state-to-state operational model, with a corresponding logic of Hoare-triples {Φ} S {Ψ} [4] or equivalently weakest preconditions wp.S.Ψ [5]. As a relation, it generates an algebra of (in-)equations between program fragments [6,7].

Thus a specification S1 is said to be refined by an implementation S2, written S1 ⊑ S2, just when S2 preserves all logically expressible properties of S1. The scope of the refinement is determined by that expressivity: originally limited to sequential programs [3], it has in the decades since been extended in many ways (e.g. concurrency, real-time and probability).

Security is our concern of scope here. The extension of refinement in this article is based on identifying and reasoning about an adversarial observer’s “ignorance” of data we wish to keep secret, to be defined as his uncertainty about the parts of the program state he can’t see. Thus we consider a program of known source-text, with its state partitioned into a “visible” part v and a “hidden” part h, and we ask

    From the initial and final values of visible v, what can an adversary deduce about hidden h?        (1)

For example, if the program is v:= 0, then what he can deduce afterwards about h is only what he knew beforehand; but if it is v:= h mod 2, then he has learned h’s parity; and if it is v:= h then he has learned h’s value exactly.
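The three examples can be replayed with a few lines of Python. This is only an illustrative sketch: the helper name `consistent_hiddens`, the assumed range {0, 1, 2, 3} for h and the chosen actual value of h are ours, not part of the formal development.

```python
# Hypothetical helper: which values of hidden h remain possible once the
# adversary, who knows the program text, has seen the final value of v?
def consistent_hiddens(assign_v, observed_v, candidates):
    """Return the candidate h-values for which assign_v(h) equals observed_v."""
    return {h for h in candidates if assign_v(h) == observed_v}

candidates = set(range(4))   # assumed range of h, for illustration only
actual_h = 3                 # the value the adversary is trying to learn

programs = {
    "v:= 0":       lambda h: 0,       # reveals nothing
    "v:= h mod 2": lambda h: h % 2,   # reveals the parity of h
    "v:= h":       lambda h: h,       # reveals h exactly
}

for text, assign_v in programs.items():
    seen = assign_v(actual_h)
    print(text, "leaves possible:", sorted(consistent_hiddens(assign_v, seen, candidates)))
```

The fewer candidates that remain, the more the adversary has learned; that set of remaining candidates is, informally, what the shadow variable of Sec. 4.2 records.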

We assume initially (and uncontroversially) that the adversary has at least the abilities implied by (1) above, i.e. knowledge of v’s values before/after execution and knowledge of the source code. We see below, however, that if we make certain reasonable, practical assumptions about refinement –which we shall justify– then surprisingly we are forced to assume as well that the adversary can see both program flow and visible variables’ intermediate values: that is, there are increased adversarial capabilities wrt ignorance that accrue as a consequence of using refinement. 2 Thus refinement presents an increased security risk, a challenge for maintaining correctness: but it is so important, as a design tool, that the challenge is worth meeting.

The problem is in essence that classical refinement [3,6–8] is indeed insecure in the sense that it does not preserve ignorance [9]. If we assume v, h both to have type T, then “choose v from T” is refinable into “set v to h”; as such it is simply a reduction of demonic nondeterminism. But that refinement, which we write v:∈ T ⊑ v:= h, is called the “Refinement Paradox” (Sec. 7.3.2) precisely because it does not preserve ignorance: program v:∈ T tells us nothing about h, whereas v:= h tells us everything. To integrate refinement and security we must address this paradox at least.

Our first contribution is a collection of “refinement principles” that we claim any reasonable ignorance-refinement algebra [6,7] must adhere to if it is to be practical. Because basic principles are necessarily subjective, we give a detailed argument to support our belief that they are important (Sec. 2).

Our second, and main contribution is to realise those principles: our initial construction for doing so includes a description of the adversary (Sec. 3), a state-based model incorporating the adversary’s capabilities (Sec. 4.2), a small programming language (Fig. 1) with its operational semantics over that model (Fig. 2) and a thorough treatment of indicative examples (Figs. 3–5).

In support of our construction we give beforehand a detailed rationale for our design (Sec. 2), and an argument via abstraction that induces our model’s features from that rationale by factoring it through Kripke structures (Sec. 4.1).

Based on our construction we then give a logic for our proposed model (Sec. 6), a programming-logic based on that and the operational semantics (Secs. 7,8), and examples of derived algebraic techniques (Secs. 9,10).

Within the framework constructed as above we interpret refinement both operationally (Sec. 5.3) and logically (Secs. 7.2,8.1), which interpretations are shown to agree (again Sec. 7.2) and to satisfy the Principles (Sec. 8.4).

Ignorance-preserving refinement should be of great utility for developing zero-knowledge- or security-sensitive protocols (at least); and our final contribution (Sec. 11) is an example, a detailed refinement-based development of Rivest’s Oblivious Transfer Protocol [10].

2 In the contrapositive, what we will argue is that if we integrated refinement and ignorance without admitting the adversary’s increased capabilities, we would not get a reasonable theory.

2 Refinement Principles: desiderata for practicality

The general theme of the five refinement principles we now present is that local modifications of a program should require only local checking (plus if necessary accumulation of declarative context hierarchically): any wholesale search of the entire source code must be rejected out of hand.

RP0 Refinement is monotonic: If one program fragment is shown in isolation to refine another, then the refinement continues to hold in any context: robustness of local reasoning is of paramount importance for scaling up.

RP1 All classical “visible-variable only” refinements remain valid: It would be impractical to search the entire program (for hidden variables) in order to validate local reasoning (i.e. in which the hiddens elsewhere are not even mentioned).

RP2 All classical “structural” refinements remain valid: Associativity of sequential composition, distribution of code into branches of a conditional etc. are refinements (actually equalities) that do not depend on the actual code fragments affected: they are structurally valid, acting en bloc. It would be impractical to have to trawl through their interiors (including e.g. procedure calls) to validate such familiar rearrangements. (Indeed, the procedures’ interiors might be unavailable due to modularisation.)

RP3 Some classical “explicit-hidden” refinements become invalid: This is necessary because, for example, the Refinement Paradox is just such a refinement. Case-by-case reasoning must be justified by a model and a compatible logic.

RT Referential transparency: If two expressions are equal (in a declarative context) then they may be exchanged (in that context) without affecting the meaning of the program. This is a crucial component of any mathematical theory with equality.

3 Description of the adversary: gedanken experiments

We initially assume minimal, “weak” capabilities (1) of our adversary, but show via gedanken experiments based on our Principles (Sec. 2) that if we allow refinement then we must assume the “strong” capabilities defined below.

3.1 The weak adversary in fact has perfect recall

We ask Does program v:= h; v:= 0 reveal h to the weak adversary? According to our principles above, it must; we reason

     (v:= h; v:= 0); v:∈ T
  =  v:= h; (v:= 0; v:∈ T)        “RP2: associativity”
  =  v:= h; v:∈ T                 “RP1: visibles only”                (2)
  ⊑  v:= h; skip                  “RP1: visibles only; RP0”
  =  v:= h,                       “RP2: skip is the identity”

whence we conclude that, since the implementation (v:= h) fails to conceal h, so must the specification (v:= h; v:= 0; v:∈ T) have failed to do so. Our model must therefore have perfect recall [11], because escape of h into v is not “erased” by the v-overwriting v:= 0. That is what allows h to be “copied” by the final v:∈ T. 3

3.2 The weak adversary in fact can observe program flow

Now we ask: Does program h:= 0 ⊓ h:= 1 reveal h to the weak adversary? Again, according to our principles, it must: we reason

     (h:= 0 ⊓ h:= 1); v:∈ T
  =  (h:= 0; v:∈ T) ⊓ (h:= 1; v:∈ T)      “RP2: distribution ⊓ and ;”
  ⊑  (h:= 0; v:= 0) ⊓ (h:= 1; v:= 1)      “RP1; RP2”                          (3)
  =  (h:= 0; v:= h) ⊓ (h:= 1; v:= h)      “RT: have h=0 left, h=1 right”
  =  (h:= 0 ⊓ h:= 1); v:= h,              “reverse above steps”

and we see that we can, by refinement, introduce a statement that reveals h explicitly. Our model must therefore allow the adversary to observe the program flow, because that is the only way operationally he could have discovered h in this case. Similar reasoning shows that if E then skip else skip fi reveals the Boolean value E.

3.3 The strong adversary: a summary

In view of the above, we accept that our of-necessity strong adversary must be treated as if he knows at any time what program steps have occurred and what the visible variables’ values were after each one.

3 Adding the extra statement v:∈ T is our algebraic means of detecting information-flow leaks: if a value is accessible to the adversary, then it should be valid to refine v:∈ T so that the value is placed into v.

Concerning program flow, we note the distinction between composite nondeterminism, written e.g. as h:= 0 ⊓ h:= 1 and acting between syntactic atoms (or larger structures), and atomic nondeterminism, written e.g. as h:∈ {0, 1} and acting within atoms:

• in the composite case, afterwards the adversary knows which of atoms h:= 0 or h:= 1 was executed, and thus knows the value of h too; yet
• in the atomic case, afterwards he knows only that the effect was to set h to 0 or to 1, and thus knows only that h ∈ {0, 1}.

Thus h:= 0 ⊓ h:= 1 and h:∈ {0, 1} are different. (Regularity of syntax however allows v:= 0 ⊓ v:= 1 and v:∈ {0, 1} as well; but since v is visible, there is no semantic difference between those latter two fragments.)

4 A Kripke-based security model for sequential programs

Perfect recall and program flow suggest the Logic of Knowledge and its Kripke models as a suitable conceptual basis for what we want to achieve.

The seminal work on formal logic for knowledge is Hintikka’s [12], who used Kripke’s possible-worlds semantics for the model: he revived the discussion on a subject which had been a topic of interest for philosophers for millennia. It was first related to multi-agent computing by Halpern and Moses [13], and much work by many other researchers followed. Fagin et al. summarise the field in their definitive introduction [14].

The standard model for knowledge-based reasoning [12–14] is based on possible “runs” of a system and participating agents’ ignorance of how the runs have interleaved: although each agent knows the (totality of) the possible runs, a sort of “static” knowledge, he does not have direct “dynamic” knowledge of which run has been taken on any particular occasion. Thus he knows a fact in a given global state (of an actual run) iff that fact holds in all possible global states (allowed by other runs) that have the same local state as his.

For our purposes we severely specialise this view in three ways. The first is that we consider only sequential programs, with explicit demonic choice. As usual, such choice can represent both abstraction, that is freedom of an implementor to choose among alternatives (possible refinements), and ignorance, that is not knowing which environmental factors might influence run-time decisions.

Secondly, we consider only one agent: this is our adversary, whose local state is our system’s visible part and who is trying to learn about (what is for him) the non-local, hidden part.

Finally, we emphasise ignorance rather than knowledge (its dual).

4.1 The model as a Kripke structure

We assume a sequential program text, including a notion of atomicity: unless stated otherwise, each syntactic atom changes the program counter when it is executed; semantically, an atom is simply a relation between initial and final states. Demonic choice is either a (non-atomic) choice between two program fragments, thus S1 ⊓ S2, or an (atomic) selection of a variable’s new value from some set, thus x:∈ X. For simplicity we suppose we have just two (anonymously typed) variables, the visible v and the hidden h.

The global state of the system comprises both v, h variables’ current and all previous values, sequences v, h, and a history-sequence p of the program counter; from Sec. 3 we assume the adversary can see v, p but not h. For example, even after S1; (S2 ⊓ S3); S4 has completed he can use p to “remember” which of S2 or S3 was executed earlier, and he can use v to recall the visible variables’ values after whichever it was.

The possible runs of a system S are all sequences of global states that could be produced by the successive execution of atomic steps from some initial v0, h0, a tree structure with branching derived from demonic choice (both ⊓ and :∈).

If the current state is (v, h, p), then the set of possible states associated with it is the set of all (other) triples (v, h1, p) that S could (also) have produced from v0, h0. We write (v, h, p) ∼ (v, h1, p) for this (equivalence) relation of accessibility, which depends on S, v0, h0.

4.2 An operational model abstracted from the Kripke structure

Because of our very limited use of the Kripke structure, we can take a brutal abstraction of it: programs cannot refer to the full run-sequences directly; what they can refer to is just the current values of v, h, and that is all we need keep of them in the abstraction.

For the accessibility relation we introduce a “shadow” variable H, set-valued, which records the possible values of h in all (other) runs that the adversary considers ∼-equivalent to (cannot distinguish from) this one; the abstraction to (v, h, H) is thus

    v = last.v  ∧  h = last.h  ∧  H = {h′ | (v, h′, p) ∼ (v, h, p) · last.h′} . 4

From sequences v, h, p we retain only final values v, h and the induced H. 5

Identity            skip
Assignment          x:= E
Choose              x:∈ E
Demonic choice      S1 ⊓ S2
Composition         S1; S2
Conditional         if E then S1 else S2 fi
Declare visible     [| vis v · S |]
Declare hidden      [| hid h · S |]

Fig. 1. Programming language syntax
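The abstraction from runs to (v, h, H) can be sketched in Python as follows. The encoding of a run as a triple of tuples, the program-counter labels and the initial values 7 and 5 are assumptions made for illustration; only the rule itself (same visible history and same program-counter history means indistinguishable) is taken from the text above.

```python
# A run is modelled (hypothetically) as three equal-length tuples:
# the history of v, the history of h, and the history of the program counter.
def abstract(run, possible_runs):
    """Collapse a run (vs, hs, ps) to the triple (v, h, H)."""
    vs, hs, ps = run
    # The adversary cannot distinguish runs that share this run's visible
    # history and program-counter history; H collects their final h-values.
    shadow = frozenset(hs2[-1] for (vs2, hs2, ps2) in possible_runs
                       if vs2 == vs and ps2 == ps)
    return vs[-1], hs[-1], shadow

# Atomic choice h:∈ {0, 1}: one atom, so both runs share one pc history.
atomic_runs = [((7, 7), (5, 0), ("start", "choose")),
               ((7, 7), (5, 1), ("start", "choose"))]

# Composite choice h:= 0 ⊓ h:= 1: two atoms, so the pc histories differ.
composite_runs = [((7, 7), (5, 0), ("start", "left")),
                  ((7, 7), (5, 1), ("start", "right"))]

print(abstract(atomic_runs[0], atomic_runs))        # shadow holds both 0 and 1
print(abstract(composite_runs[0], composite_runs))  # shadow holds only 0
```

In the atomic case the shadow keeps both values; in the composite case the program-counter history separates the runs and the shadow collapses to the actual value of h.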

5 The Shadow Knows: an operational semantics

We now use our model to give an ignorance-sensitive operational interpretation of a simple sequential programming language including nondeterminism. To begin with, we continue to assume a state space with just two variables, the visible v and the hidden h. (In general of course there can be many visibles and hiddens.) Our semantics adds a third variable H called the shadow of the hidden variable h. The semantics will ensure that, in the sense of Sec. 4.2 above, the shadow H “knows” the set of values that h has potentially.

5.1 Syntax and semantics

The syntax of our example programming language is given in Fig. 1.

4 Read the last as “vary h′ such that (v, h′, p) ∼ (v, h, p) and take last.h′ for each”.
5 In fact the H-component makes h redundant –i.e. we can make do with just (v, H)– but this extra “compression” would complicate the presentation subsequently.

The operational semantics is given in Fig. 2, for which we provide the following commentary. In summary, we convert “ignorance-sensitive” (that is v, h-) programs to “ordinary” (that is v, h, H-) programs and then rely on the conventional relational semantics for those. 6 We comment on each case in turn.

• The identity skip changes no variables, hence has no effect.
• Assigning to a visible “shrinks” the shadow to just those values still consistent with the value the visible reveals: the adversary, knowing the outcome and the program code, concludes that the other values are no longer possible. Choosing a visible is a generalisation of that.
• Assigning to a hidden sets the shadow to all values that could have resulted from the current shadow; again, choosing a hidden is a generalisation.
• Demonic choice and sequential composition retain their usual definitions. Note in particular that the former induces nondeterminism in the shadow H as well.
• The conditional shrinks the shadow on each branch to just those values consistent with being on that branch, thus representing the adversary’s observing which branch was taken.

5.2 Examples of informal- and operational semantics

In Fig. 3 we give informal descriptions of the effects of a number of small program fragments; then in Fig. 4 we apply the semantics above to give the actual translations, showing how they support the informal descriptions.

We begin by noting that h:∈ {0, 1}, the simplest example (Fig. 3.2) of ignorance, leads us as usual to either of two states, one with h=0 and the other with h=1; but since the choice is atomic an adversary cannot tell which of those two states it is. This is reflected in the induced assignment of {0, 1} to the shadow H, on both branches, shown in the corresponding semantics (4.2).

In contrast, although the program h:= 0 ⊓ h:= 1 (3.3) again leads unpredictably to h=0 or h=1, in both cases the semantics (4.3) shows that we have H={h} finally, reflecting that the non-atomic program flow has revealed h’s value implicitly to the adversary. Thus an operational indication of (degrees of) ignorance is the size of H: the bigger it is, the less is known; and that is what distinguishes these two examples.

6 Our definitions are induced from the abstraction given in Sec. 4.2.

For an ignorance-sensitive program S we write [[S]] for its conversion into the shadowed form. In this simplified presentation we suppose only single variables v, h (ranging over a set D, say), so that the shadow H is simply a set of the potential values for h (thus ranging over the powerset PD).

On the right the classical semantics applies: in particular, use of :∈ merely indicates an ordinary nondeterministic choice from the set given. Variable e is fresh, just used for the exposition.

Identity            [[skip]]    ≙  skip
Assign to visible   [[v:= E]]   ≙  e:= E; H:= {h: H | e=E}; v:= e
Choose visible      [[v:∈ E]]   ≙  e:∈ E; H:= {h: H | e∈E}; v:= e
Assign to hidden    [[h:= E]]   ≙  h:= E; H:= {E | h: H}
Choose hidden       [[h:∈ E]]   ≙  h:∈ E; H:= ∪{E | h: H}
Demonic choice      [[S1 ⊓ S2]] ≙  [[S1]] ⊓ [[S2]]
Composition         [[S1; S2]]  ≙  [[S1]]; [[S2]]
Conditional         [[if E then S1 else S2 fi]]
                                ≙  if E then H:= {h: H | E}; [[S1]]
                                        else H:= {h: H | ¬E}; [[S2]] fi

We defer discussion of declarations and local variables until Sec. 5.4.

Fig. 2. Operational semantics
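The translations of Fig. 2 can be animated by a small interpreter. What follows is a minimal sketch under our own encoding assumptions, not part of the formal development: a state is a Python triple (v, h, H) with H a frozenset, a program is a function from a state to its set of possible final states, and an expression E is a Python function of (v, h). All function names are hypothetical.

```python
def skip():
    return lambda s: {s}

def assign_v(E):                      # [[v:= E]]
    def run(s):
        v, h, H = s
        e = E(v, h)
        # keep only shadow values that would have produced the same e
        return {(e, h, frozenset(x for x in H if E(v, x) == e))}
    return run

def choose_v(E):                      # [[v:∈ E]]; here E yields a set
    def run(s):
        v, h, H = s
        return {(e, h, frozenset(x for x in H if e in E(v, x)))
                for e in E(v, h)}
    return run

def assign_h(E):                      # [[h:= E]]
    def run(s):
        v, h, H = s
        return {(v, E(v, h), frozenset(E(v, x) for x in H))}
    return run

def choose_h(E):                      # [[h:∈ E]]; here E yields a set
    def run(s):
        v, h, H = s
        shadow = frozenset(y for x in H for y in E(v, x))
        return {(v, y, shadow) for y in E(v, h)}
    return run

def demonic(S1, S2):                  # [[S1 ⊓ S2]]
    return lambda s: S1(s) | S2(s)

def seq(S1, S2):                      # [[S1; S2]]
    return lambda s: {t for m in S1(s) for t in S2(m)}

def cond(E, S1, S2):                  # [[if E then S1 else S2 fi]]
    def run(s):
        v, h, H = s
        taken = bool(E(v, h))
        shrunk = frozenset(x for x in H if bool(E(v, x)) == taken)
        return (S1 if taken else S2)((v, h, shrunk))
    return run

# Initial state (v0, h0, {h0}) with assumed values v0 = 7, h0 = 5.
init = (7, 5, frozenset({5}))
atomic = choose_h(lambda v, h: {0, 1})                               # h:∈ {0, 1}
composite = demonic(assign_h(lambda v, h: 0), assign_h(lambda v, h: 1))  # h:= 0 ⊓ h:= 1
assert atomic(init) == {(7, 0, frozenset({0, 1})), (7, 1, frozenset({0, 1}))}
assert composite(init) == {(7, 0, frozenset({0})), (7, 1, frozenset({1}))}
```

The two assertions compare the atomic and the composite choice over h: both reach h=0 or h=1, but only the composite choice shrinks the shadow to a singleton.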

5.3 Operational definition of ignorance-preserving refinement

Given two states (v1,h1,H1) and (v2,h2,H2) we say that the first is refined by the second just when they agree on their v, h-components and ignorance is only increased in the H-component: that is we define

    (v1, h1, H1) ⊑ (v2, h2, H2)  ≙  v1=v2 ∧ h1=h2 ∧ H1⊆H2 .

We promote this to sets of (v, h, H)-states in the standard (Smyth powerdomain [15]) style, saying that we have refinement between two sets S1, S2 of states just when every state s2∈S2 is a refinement of some state s1∈S1; that is, we define

    S1 ⊑ S2  ≙  (∀s2: S2 · (∃s1: S1 · s1 ⊑ s2)) .

In each case we imagine that we are at the end of the program given, that the initial values were v0, h0, and that we are the adversary (so we write “we know” etc.)

Program / Informal commentary

3.1  both v:∈ {0, 1} and v:= 0 ⊓ v:= 1
     We can see the value of v, either 0 or 1. We know h still has its initial value h0, though we cannot see it.
3.2  h:∈ {0, 1}   (one atomic statement)
     We know that h is either 0 or 1, but we don’t know which; we see that v is v0.
3.3  h:= 0 ⊓ h:= 1   (two atomic statements)
     We know the value of h, because we know from the program flow which of atomic h:= 0 or h:= 1 was executed.
3.4  h:∈ {0, 1}; v:= 0 ⊓ v:= 1
     We don’t know whether h is 0 or it is 1: even the ⊓-demon cannot see the hidden variable.
3.5  h:∈ {0, 1}; v:∈ {h, 1−h}
     Though the choice of v refers to h it reveals no information, since the statement is atomic.
3.6  h:∈ {0, 1}; v:= h ⊓ v:= 1−h
     Here h is revealed, because we know which of the two atomic assignments to v was executed.
3.7  h:∈ {0, 1, 2, 3}; v:= h
     We see v; we deduce h since we can see v:= h in the program text.
3.8  h:∈ {0, 1, 2, 3}; v:= h mod 2
     We see v; from that either we deduce h is 0 or 2, or that h is 1 or 3.
3.9  h:∈ {0, 1, 2, 3}; v:= h mod 2; v:= 0
     We see v is 0; but our deductions about h are as for 3.8, because we saw v’s earlier value.

In 3.4 the “⊓-demon” could be an adversarial scheduler, free at runtime to choose v:= 0 or v:= 1; but it cannot use information about h to make that choice. We have assumed throughout that v, h are of type {0, 1} so that, for example, in 3.5 the choice h:∈ {0, 1} reveals nothing.

Fig. 3. Examples of ignorance, informally interpreted

We promote this a second time, now to (v, h, H)-programs, using the standard pointwise-lifting for functions in which S1 ⊑ S2 just when for each initial s the set of possible outcomes arising via S1 from that s is refined by the set of possible outcomes arising via S2 from that same s. Examples are given in Fig. 5.
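The state-level and set-level definitions above translate almost directly into Python. The sketch below uses hypothetical function names and represents states as triples (v, h, H) with H a frozenset; the outcome sets are those listed for programs 5.2, 5.3, 5.5 and 5.6 in Fig. 5, with an assumed initial visible value v0 = 7.

```python
def state_refines(s1, s2):
    """s1 ⊑ s2: same v and h, and ignorance (the shadow) only increases."""
    (v1, h1, H1), (v2, h2, H2) = s1, s2
    return v1 == v2 and h1 == h2 and H1 <= H2   # <= is subset on frozensets

def outcomes_refine(spec, impl):
    """Smyth-style lifting: every impl outcome refines some spec outcome."""
    return all(any(state_refines(s1, s2) for s1 in spec) for s2 in impl)

fs = frozenset
out_5_2 = {(7, 0, fs({0, 1})), (7, 1, fs({0, 1}))}          # h:∈ {0, 1}
out_5_3 = {(7, 0, fs({0})), (7, 1, fs({1}))}                # h:= 0 ⊓ h:= 1
out_5_5 = {(0, 0, fs({0, 1})), (1, 0, fs({0, 1})),
           (0, 1, fs({0, 1})), (1, 1, fs({0, 1}))}          # ...; v:∈ {h, 1−h}
out_5_6 = {(0, 0, fs({0})), (1, 0, fs({0})),
           (0, 1, fs({1})), (1, 1, fs({1}))}                # ...; v:= h ⊓ v:= 1−h

print(outcomes_refine(out_5_3, out_5_2))   # True:  (5.3) is refined by (5.2)
print(outcomes_refine(out_5_2, out_5_3))   # False: the refinement is strict
print(outcomes_refine(out_5_6, out_5_5))   # True:  (5.6) is refined by (5.5)
```

Lifting to programs then amounts to running this check once per initial state.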

      (v, h)-program S            (v, h, H)-program [[S]]

4.1a  v:∈ {0, 1}                  e:∈ {0, 1}; H:= {h: H | e∈{0, 1}}; v:= e
                                  and the rhs simplifies to v:∈ {0, 1}
4.1b  v:= 0 ⊓ v:= 1               e:= 0; H:= {h: H | e=0}; v:= e
                                  ⊓ e:= 1; H:= {h: H | e=1}; v:= e
                                  simplifies to v:∈ {0, 1} again
4.2   h:∈ {0, 1}                  h:∈ {0, 1}; H:= ∪{{0, 1} | h: H}
                                  simplifies to h:∈ {0, 1}; H:= {0, 1}
4.3   h:= 0 ⊓ h:= 1               h:= 0; H:= {0 | h: H}
                                  ⊓ h:= 1; H:= {1 | h: H}
                                  simplifies to h:∈ {0, 1}; H:= {h}
4.4   h:∈ {0, 1};                 h:∈ {0, 1}; H:= {0, 1};
      v:= 0 ⊓ v:= 1               v:∈ {0, 1}
                                  simplifies to h:∈ {0, 1}; v:∈ {0, 1}; H:= {0, 1}
4.5   h:∈ {0, 1};                 h:∈ {0, 1}; H:= {0, 1};
      v:∈ {h, 1−h}                v:∈ {h, 1−h}; H:= {h: H | v∈{h, 1−h}}
                                  which is the same as 4.4
4.6   h:∈ {0, 1};                 h:∈ {0, 1}; H:= {0, 1};
      v:= h ⊓ v:= 1−h             v, H:= h, {h} ⊓ v, H:= 1−h, {h}
                                  simplifies to h:∈ {0, 1}; v:∈ {0, 1}; H:= {h}
4.7   h:∈ {0, 1, 2, 3};           h:∈ {0, 1, 2, 3}; H:= {0, 1, 2, 3};
      v:= h                       v, H:= h, {h}
                                  simplifies to h:∈ {0, 1, 2, 3}; v:= h; H:= {h}
4.8   h:∈ {0, 1, 2, 3};           h:∈ {0, 1, 2, 3}; H:= {0, 1, 2, 3};
      v:= h mod 2                 v:= h mod 2; H:= {h: H | v = h mod 2}
                                  simplifies to (H:= {0, 2} ⊓ H:= {1, 3}); h:∈ H; v:= h mod 2
4.9   h:∈ {0, 1, 2, 3};           h:∈ {0, 1, 2, 3}; H:= {0, 1, 2, 3};
      v:= h mod 2;                v:= h mod 2; H:= {h: H | v = h mod 2};
      v:= 0                       v:= 0
                                  simplifies to (H:= {0, 2} ⊓ H:= {1, 3}); h:∈ H; v:= 0

The simplifications are made using classical semantics over v, h, H.

Fig. 4. Operational-semantics examples

The initial state is some (v0, h0, {h0}).

     Program                      Final states in the shadowed model

5.1  both v:∈ {0, 1}              (0, h0, {h0}), (1, h0, {h0})
     and v:= 0 ⊓ v:= 1
5.2  h:∈ {0, 1}                   (v0, 0, {0, 1}), (v0, 1, {0, 1})
5.3  h:= 0 ⊓ h:= 1                (v0, 0, {0}), (v0, 1, {1})
5.4  h:∈ {0, 1};                  (0, 0, {0, 1}), (0, 1, {0, 1}),
     v:= 0 ⊓ v:= 1                (1, 0, {0, 1}), (1, 1, {0, 1})
5.5  h:∈ {0, 1};                  (0, 0, {0, 1}), (1, 0, {0, 1}),      Thus this and
     v:∈ {h, 1−h}                 (0, 1, {0, 1}), (1, 1, {0, 1})       5.4 are equal.
5.6  h:∈ {0, 1};                  (0, 0, {0}), (1, 0, {0}),            But this one
     v:= h ⊓ v:= 1−h              (0, 1, {1}), (1, 1, {1})             differs.
5.7  h:∈ {0, 1, 2, 3};            (0, 0, {0}), (1, 1, {1}),
     v:= h                        (2, 2, {2}), (3, 3, {3})
5.8  h:∈ {0, 1, 2, 3};            (0, 0, {0, 2}), (1, 1, {1, 3}),
     v:= h mod 2                  (0, 2, {0, 2}), (1, 3, {1, 3})
5.9  h:∈ {0, 1, 2, 3};            (0, 0, {0, 2}), (0, 1, {1, 3}),      The final v:= 0
     v:= h mod 2;                 (0, 2, {0, 2}), (0, 3, {1, 3})       does not affect H.
     v:= 0

In (5.9) the first assignment to v is unpredictably 0 or 1, revealing h’s parity which indeed we did not know beforehand. Whichever occurs, however, that parity information remains even after the final assignment v:= 0, as shown by the H outcomes.

Based on the outcomes above, we have e.g. the strict refinements (5.3) ⊏ (5.2), (5.6) ⊏ (5.5) and ((5.7); v:= 0) ⊏ (5.9).

Fig. 5. Operational examples of outcomes and refinement

Finally, we say that two (v, h)-programs are related by refinement just when their (v, h, H)-meanings are: that is S1 ⊑ S2 ≙ [[S1]] ⊑ [[S2]].

In summary, we have ignorance-sensitive refinement S1 ⊑ S2 just when for each initial (v, h, H) every possible outcome (v2, h2, H2) of [[S2]] satisfies v1=v2 ∧ h1=h2 ∧ H1⊆H2 for some outcome (v1, h1, H1) of [[S1]].

5.4 Declarations and local variables

In the absence of procedure calls and recursion, we can treat multiple-variable programs over v1, ···, vm; h1, ···, hn, say, as operating over single tuple-valued variables v=(v1, ···, vm); h=(h1, ···, hn). A variable vi or hi in an expression is actually a projection from the corresponding tuple; as the target of an assignment it induces a component-wise update of the tuple.

Within blocks, visible- and hidden local variables have separate declarations vis v and hid h respectively. Note that scope does not affect visibility: (even) a global hidden variable cannot be seen by the adversary; (even) a local visible variable can.

Brackets [| · |] (for brevity) or equivalently begin · end (in this section only, for clarity) introduce a local scope that initially extends either v or h, H as appropriate for the declarations the brackets introduce, and finally projects away the local variables as the scope is exited.

In the operational semantics for visibles this treatment is the usual one: thus we have [[begin vis v · S end]] ≙ begin var v · [[S]] end. For hidden variables however we arrange for H to be implicitly extended by the declaration; thus we have

    [[begin hid h · S end]]  ≙  begin var h · H:= H×D;
                                              [[S]];
                                              H:= H↓
                                end

where D is the set over which our variables –the new h in this case– take their values (Fig. 2), and we invent the notation H↓ for the projection of the product H×D back to H.

6 Modal assertion logic for the shadow model

We now introduce an assertion logic for reasoning over the model of the previous section. Our language will be first-order predicate formulae Φ, interpreted conventionally over the variables of the program, but augmented with a “knows” modal operator [14, 3.7.2] so that KΦ holds in this state just when Φ itself holds in all (other) states the adversary considers compatible with this one. From our earlier discussion (Sec. 3) we understand the adversary’s notion of compatibility to be, for a given program text, “having followed the same path through that text and having generated the same visible-variable values along the way”; from our abstraction (Sec. 4.2), we know we can determine that compatibility based on H alone.

The dual modality “possibly” is written PΦ and defined ¬K(¬Φ); and it is the modality we will use in practice, as it expresses ignorance directly. (Because KΦ seems more easily grasped, however, we explain both.)

6.1 Interpretation of modal formulae

We give the language function- (including constant-) and relation symbols as needed, among which we distinguish the (program-variable) symbols visibles in some set V and hiddens in H; as well there are the usual (logical) variables in L over which we allow ∀, ∃ quantification. The visibles, hiddens and variables are collectively the scalars X ≙ V ∪ H ∪ L.

A structure comprises a non-empty domain D of values, together with functions and relations over it that interpret the function- and relation symbols mentioned above; within the structure we name the partial functions v, h that interpret visibles and hiddens respectively; we write their types V ⇸ D and H ⇸ D (where the “crossbar” indicates the potential partiality of the function).

A valuation is a partial function from scalars to D, thus typed X ⇸ D; one valuation w1 can override another w so that for scalar x we have (w ◁ w1).x is w1.x if w1 is defined at x and is w.x otherwise. The valuation ⟨x↦d⟩ is defined only at x, where it takes value d.

A state (v, h, H) comprises a visible- v, hidden- h and shadow- part H; the last, in P(H ⇸ D), is a set of valuations over hiddens only. 7 We require that h ∈ H.

We define truth of Φ at (v, h, H) under valuation w by induction in the usual style, writing (v, h, H), w ⊨ Φ. Let t be the term-valuation built inductively from the valuation v ◁ h ◁ w. Then we have the following [14, pp. 79,81]:

• (v, h, H), w ⊨ R.t1.···.tk for relation symbol R and terms t1 ··· tk iff the tuple (t.t1, ···, t.tk) is an element of the interpretation of R.
• (v, h, H), w ⊨ t1 = t2 iff t.t1 = t.t2.
• (v, h, H), w ⊨ ¬Φ iff (v, h, H), w ⊭ Φ.
• (v, h, H), w ⊨ Φ1 ∧ Φ2 iff (v, h, H), w ⊨ Φ1 and (v, h, H), w ⊨ Φ2.
• (v, h, H), w ⊨ (∀l · Φ) iff (v, h, H), w ◁ ⟨l↦d⟩ ⊨ Φ for all d in D.
• (v, h, H), w ⊨ KΦ iff (v, h1, H), w ⊨ Φ for all h1 in H.

We write just (v, h, H) ⊨ Φ when w is empty, and ⊨ Φ when (v, h, H) ⊨ Φ for all v, h, H with h ∈ H, and we take advantage of the usual “syntactic sugar” for other operators (including P as ¬K¬). Thus for example we can show ⊨ Φ ⇒ PΦ for all Φ, a fact which we use in Sec. 8.5. Similarly we can assume wlog that modalities are not nested, since we can remove nestings via the validity ⊨ PΦ ≡ (∃c · [h\c]Φ ∧ P(h=c)).

7 To allow for declarations of additional hidden variables, we must make H a set of valuations rather than simply a set of values. This is isomorphic to a set of tuples (Sec. 5.4), but is easier to use in the definition of the logic.
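For the single-variable case the two modalities can be prototyped directly. In this hedged sketch a formula is simply a Python predicate over (v, h, H), with H a frozenset of values rather than of valuations; logical variables and the valuation w are left out, and the names `K` and `P` merely mirror the modal operators.

```python
def K(phi):
    """KΦ holds at (v, h, H) iff Φ holds at (v, h1, H) for every h1 in H."""
    return lambda v, h, H: all(phi(v, h1, H) for h1 in H)

def P(phi):
    """PΦ is defined as ¬K(¬Φ)."""
    return lambda v, h, H: not K(lambda v, h, H: not phi(v, h, H))(v, h, H)

# One final state of h:∈ {0, 1, 2, 3}; v:= h mod 2, namely (0, 2, {0, 2}).
state = (0, 2, frozenset({0, 2}))

h_is_even = lambda v, h, H: h % 2 == 0
h_is_0 = lambda v, h, H: h == 0
h_is_1 = lambda v, h, H: h == 1
h_is_2 = lambda v, h, H: h == 2

print(K(h_is_even)(*state))   # True: the adversary knows the parity of h
print(P(h_is_0)(*state))      # True: h = 0 is still considered possible
print(K(h_is_2)(*state))      # False: the actual value is not known
print(P(h_is_1)(*state))      # False: an odd h has been ruled out
```

Because every state is required to satisfy h ∈ H, any Φ that holds at (v, h, H) makes PΦ hold there too, which is the validity ⊨ Φ ⇒ PΦ used later.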

     Program                      Valid Ψ                  Invalid Ψ
                                  (in these examples, for any Φ)

6.1  both v:∈ {0, 1}              v ∈ {0, 1}               v = 0
     and v:= 0 ⊓ v:= 1
6.2  h:∈ {0, 1}                   P(h=0)                   K(h=0)
6.3  h:= 0 ⊓ h:= 1                h ∈ {0, 1}               P(h=0)
6.4  h:∈ {0, 1};                  P(v=h)                   K(v≠h)
     v:= 0 ⊓ v:= 1
6.5  h:∈ {0, 1};                  P(h=0)                   P(v=0)
     v:∈ {h, 1−h}                 In fact Program 6.5 equals Program 6.4.
6.6  h:∈ {0, 1};                  v ∈ {0, 1}               P(h=0)
     v:= h ⊓ v:= 1−h              But Program 6.6 differs from Program 6.5.
6.7  h:∈ {0, 1, 2, 3};            K(v=h)                   P(v≠h)
     v:= h
6.8  h:∈ {0, 1, 2, 3};            v=0 ⇒ P(h∈{2, 4})        P(h=1) ∧ P(h=2)
     v:= h mod 2
6.9  h:∈ {0, 1, 2, 3};            P(h∈{1, 2})              v=0 ⇒ P(h∈{2, 4})
     v:= h mod 2; v:= 0

• In 6.3 the invalidity is because ⊓ might resolve to the right: then h=0 is impossible.
• In 6.6 the invalidity is because :∈ might choose 1 and the subsequent ⊓ choose v:= h, in which case v would be 1 and h=0 impossible.
• In 6.8 the validity is weak: we know h cannot be 4; yet still its membership of {2, 4} is possible. The invalidity is because the assignment v:= h mod 2 reveals h’s parity; the adversary cannot simultaneously consider both 1 and 2 to be possible.
• In 6.9 the invalidity is due to the fact that v’s value might have been 1 earlier; the assignment v:= 0 is an unsuccessful “cover up”.

Fig. 6. Examples of valid and invalid postconditions

7 A program logic of Hoare-triples

7.1 Pre-conditions and postconditions

We say that $\{\Phi\}\ S\ \{\Psi\}$ just when any initial state $(v,h,H) \models \Phi$ must lead via $S$ only to final states $(v',h',H') \models \Psi$; typically $\Phi$ is called the precondition and $\Psi$ is called the postcondition. Fig. 6 illustrates this proposed program logic using our earlier examples from Fig. 3. (Because the example postconditions $\Psi$ do not refer to initial values, their validities in this example are independent of the precondition $\Phi$.)
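The validity claims for Programs 6.8 and 6.9 can be checked by enumeration. The following is a minimal sketch, assuming a reading of the operational semantics inferred from the translations shown later in Sec. 8 (revealing a visible value filters the shadow H; choosing a hidden value from a constant set makes that set the new shadow). The helper names `assign_vis`, `choose_hid`, `seq`, `P` and `valid` are hypothetical and are not taken from the paper.

```python
# Sketch of the (v, h, H) model: v visible, h hidden, H the adversary's
# "shadow" set of hidden values still considered possible.
# ASSUMPTION: the command semantics below are inferred from the paper's
# translations in Sec. 8, not copied from its Fig. 2.

def assign_vis(expr):
    """v := expr(v, h); the revealed value shrinks the shadow H."""
    def run(state):
        v, h, H = state
        new_v = expr(v, h)
        return {(new_v, h, frozenset(x for x in H if expr(v, x) == new_v))}
    return run

def choose_hid(values):
    """h :∈ values (a constant set here); the shadow becomes that whole set."""
    def run(state):
        v, _, _ = state
        return {(v, x, frozenset(values)) for x in values}
    return run

def seq(*commands):
    """Sequential composition: feed every outcome into the next command."""
    def run(state):
        states = {state}
        for command in commands:
            states = {t for s in states for t in command(s)}
        return states
    return run

def P(pred):
    """P(pred): some hidden value in the shadow makes pred true."""
    return lambda v, h, H: any(pred(v, x) for x in H)

def valid(program, post, init=(0, 0, frozenset({0}))):
    """Does every final state reachable from init satisfy post?"""
    return all(post(*s) for s in program(init))

prog_6_8 = seq(choose_hid({0, 1, 2, 3}), assign_vis(lambda v, h: h % 2))
prog_6_9 = seq(prog_6_8, assign_vis(lambda v, h: 0))

def parity_post(v, h, H):
    # v=0 ⇒ P(h∈{2,4})
    return v != 0 or P(lambda v, x: x in {2, 4})(v, h, H)

def both_post(v, h, H):
    # P(h=1) ∧ P(h=2)
    return P(lambda v, x: x == 1)(v, h, H) and P(lambda v, x: x == 2)(v, h, H)

one_or_two = P(lambda v, x: x in {1, 2})   # P(h∈{1,2})

results = {
    "6.8 valid": valid(prog_6_8, parity_post),
    "6.8 invalid": valid(prog_6_8, both_post),
    "6.9 valid": valid(prog_6_9, one_or_two),
    "6.9 invalid": valid(prog_6_9, parity_post),
}
```

Under these assumptions the "valid" entries evaluate to true and the "invalid" ones to false, which agrees with the two rows of Fig. 6; a single initial state suffices because the programs overwrite h and v before the postcondition is evaluated.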

7.2 Logical definition of ignorance-preserving refinement

We saw in Sec. 5.3 a definition of refinement given in terms of the model directly. When the model is linked to a programming logic we expect the earlier definition of refinement to be induced by the logic, so that $S1 \sqsubseteq S2$ just when all logically expressible properties of S1 are satisfied by S2 also (Sec. 1). The key is of course “expressible.”

Our expressible properties will be traditional Hoare-style triples, but over formulae whose truth is preserved by increase of ignorance: those in which all modalities K occur negatively, and all modalities P occur positively. We say that such occurrences of modalities are ignorant; and a formula is ignorant just when all its modalities are. Thus in program-logical terms we say that $S1 \sqsubseteq S2$ just when for all ignorant formulae $\Phi, \Psi$ we have

that the property $\{\Phi\}\ [\![S1]\!]\ \{\Psi\}$ is valid
implies that the property $\{\Phi\}\ [\![S2]\!]\ \{\Psi\}$ is also valid. (4)

We link these two definitions of refinement in the following lemma.

Lemma 1 The definitions of ignorance-sensitive refinement given in Sec. 5.3 and in (4) above are compatible.

Proof: Fix S1 and S2 and assume for simplicity single variables v, h so that we can work with states (v, h, H) rather than valuations. (Working more generally with valuations, we would define $H_1 \subseteq H_2$ to mean $H_1.x \subseteq H_2.x$ for all hiddens $x$ where both valuations are defined; the proof's extension to this case is conceptually straightforward but notationally cumbersome.) We argue the contrapositive, for a contradiction, in both directions. 8

8 That is, we establish $A \Rightarrow B$ via $\neg B \land A \Rightarrow \mathit{false}$.

operational- (Sec. 5.3) implies logical refinement (Sec. 7.2)
Suppose we have ignorant modal formulae $\Phi, \Psi$ so that $\{\Phi\}\ [\![S1]\!]\ \{\Psi\}$ is valid but $\{\Phi\}\ [\![S2]\!]\ \{\Psi\}$ is invalid. There must therefore be initial $(v_0,h_0,H_0) \models \Phi$ such that final $(v_2,h_2,H_2)$ is possible via $[\![S2]\!]$ yet $(v_2,h_2,H_2) \not\models \Psi$. Because (we assume for a contradiction) S2 refines S1 operationally, we must have some $(v_1,h_1,H_1)$ via $[\![S1]\!]$ from $(v_0,h_0,H_0)$ with $H_1 \subseteq H_2$ and of course as well $v_1 = v_2$, and similarly $h_1 = h_2$; thus from $(v_0,h_0,H_0) \models \Phi$ and $\{\Phi\}\ [\![S1]\!]\ \{\Psi\}$ we conclude $(v_2,h_2,H_1) \models \Psi$.
But $(v_2,h_2,H_1) \models \Psi$ and $H_1 \subseteq H_2$ implies $(v_2,h_2,H_2) \models \Psi$ because $\Psi$ is ignorant, which gives us our contradiction.

logical- (Sec. 7.2) implies operational refinement (Sec. 5.3)
Suppose we have $(v_2,h_2,H_2)$ via $[\![S2]\!]$ from some $(v_0,h_0,H_0)$ but there is no $(v_2,h_2,H_1)$ with $H_1 \subseteq H_2$ via $[\![S1]\!]$; equivalently, for all $[\![S1]\!]$-outcomes $(v_2,h_2,H_1)$ we have $(\exists h_1{:}\ H_1 \mid h_1 \notin H_2)$.
For convenience let symbols $v_0$ etc. from the state-triples act as constants denoting their values. Define formulae

$\Phi \mathrel{\widehat{=}} v = v_0 \land h = h_0 \land (\forall x{:}\ H_0 \cdot \mathrm{P}(h = x))$
and $\Psi \mathrel{\widehat{=}} v = v_2 \land h = h_2 \Rightarrow \mathrm{P}(h \notin H_2)$.

Now $\{\Phi\}\ [\![S1]\!]\ \{\Psi\}$ because any initial state with $(v,h,H) \models \Phi$ must have $v = v_0 \land h = h_0 \land H \supseteq H_0$, and we observe that expanding the H-component of an initial state can only expand the H-component of outcomes. (This is a form of ignorance-monotonicity, established by inspection of Fig. 2.) That is, because all outcomes from $(v_0,h_0,H_0)$ itself via $[\![S1]\!]$ satisfy $\Psi$, so must all outcomes from any (other) initial such $(v_0,h_0,H) \models \Phi$ satisfy $\Psi$, since $\Psi$ is ignorant.
But (since we assume logical refinement, for a contradiction) we have that $\{\Phi\}\ [\![S1]\!]\ \{\Psi\}$ implies $\{\Phi\}\ [\![S2]\!]\ \{\Psi\}$, whence also $(v_2,h_2,H_2) \models \Psi$, which is our contradiction (by inspection of $\Psi$).

7.3 Two negative examples: excluded refinements

We now give two important examples of non-refinements: the first appears valid wrt ignorance, but is excluded classically; the second is the opposite, valid classically, but excluded wrt ignorance.

7.3.1 Postconditions refer to h even though the adversary cannot see it

We start with the observation that we do not have $h {:\in} \{0,1\} \sqsubseteq^{?} h {:\in} \{0,1,2\}$, even though the ignorance increases and h cannot be observed by the adversary: operationally the refinement fails due to the outcome $(\cdot, 2, \{0,1,2\})$ on the right for which there is no supporting triple $(\cdot, 2, ?)$ on the left; logically it fails because all left-hand outcomes satisfy $h \neq 2$ but some right-hand outcomes do not.

The example illustrates the importance of direct references to h in our postconditions, even though h cannot be observed by the adversary: for otherwise the above refinement would after all go through logically. The reason it must not is that, if it did, we would by monotonicity of refinement (RP0) have to admit $h {:\in} \{0,1\};\ v := h \sqsubseteq^{?} h {:\in} \{0,1,2\};\ v := h$ as well, clearly false: the outcome $v = 2$ is not possible on the left, excluding this refinement classically.

7.3.2 The Refinement Paradox

We recall that the Refinement Paradox [9] is an issue because classical refinement allows the “secure” $v {:\in} T$ to be refined to the “insecure” $v := h$ as an instance of reduction of demonic nondeterminism. We can now resolve it: it is excluded on ignorance grounds.

Operationally, we observe that $v {:\in} T \not\sqsubseteq v := h$ because from $(?, h_0, H_0)$ the former's final states are $\{(e, h_0, H_0) \mid e{:}\ T\}$; yet the latter's are just $\{(h_0, h_0, \{h_0\})\}$ and, even supposing $h_0 \in T$ so that $e = h_0$ is a possibility, still in general $H_0 \not\subseteq \{h_0\}$ as refinement would require.

Logically, we can prove (Fig. 9 below) that $\{\mathrm{P}(h=C)\}\ v {:\in} T\ \{\mathrm{P}(h=C)\}$ holds while $\{\mathrm{P}(h=C)\}\ v := h\ \{\mathrm{P}(h=C)\}$ does not.
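Both excluded refinements of this section can be reproduced with a small enumeration. This is a sketch only: it assumes that operational refinement means "every outcome (v, h, H2) of the candidate implementation is matched by an outcome (v, h, H1) of the specification with H1 ⊆ H2", as used in the proof of Lemma 1, and the function names are hypothetical.

```python
# Operational refinement check in the (v, h, H) model (a sketch).
# ASSUMPTION: s1 ⊑ s2 is read as "every s2-outcome (v, h, H2) is matched by
# an s1-outcome (v, h, H1) with H1 ⊆ H2", following the proof of Lemma 1.

def refines(s1, s2, initial_states):
    return all(
        any(v1 == v2 and h1 == h2 and H1 <= H2 for (v1, h1, H1) in s1(init))
        for init in initial_states
        for (v2, h2, H2) in s2(init)
    )

def choose_hid(values):
    """h :∈ values, a constant set: the shadow becomes that set."""
    return lambda s: {(s[0], x, frozenset(values)) for x in values}

def choose_vis(values):
    """v :∈ values, a constant set: nothing about h is revealed."""
    return lambda s: {(e, s[1], s[2]) for e in values}

def reveal_hidden(s):
    """v := h: the shadow collapses to the single actual value of h."""
    v, h, H = s
    return {(h, h, frozenset(x for x in H if x == h))}

inits = [(0, h, frozenset({0, 1})) for h in (0, 1)]

# Sec. 7.3.1: h:∈{0,1} is not refined by h:∈{0,1,2}; the outcome h=2 is unsupported.
widen_ok = refines(choose_hid({0, 1}), choose_hid({0, 1, 2}), inits)

# Sec. 7.3.2: v:∈T is not refined by v:=h; the shadow would have to shrink.
paradox_ok = refines(choose_vis({0, 1}), reveal_hidden, inits)

# Sanity check: reducing visible nondeterminism is still a refinement.
classical_ok = refines(choose_vis({0, 1}), choose_vis({0}), inits)
```

Here `widen_ok` and `paradox_ok` come out false while `classical_ok` is true, so the check rejects exactly the two non-refinements while still admitting an ordinary visible-only refinement.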

8 Weakest-precondition calculus for (v, h)-semantics directly

Our procedure so far for reasoning about ignorance has been to translate (v, h)-programs into (v, h, H)-programs via the operational semantics of Fig. 2, and then to reason there about refinement either operationally (Sec. 5.3) or logically (Sec. 7.2(4)); in view of Lem. 1 we can do either.

We now streamline this process by developing a weakest-precondition-style program logic on the (v, h)-level program text directly, consistent with the above but avoiding the need for translation to (v, h, H)-semantics. At (6) we give the induced (re-)formulation of refinement.

8.1 Weakest preconditions: a review

Given a postcondition $\Psi$ and a program $S$ there are likely to be many candidate preconditions $\Phi$ that will satisfy the Hoare-triple $\{\Phi\}\ S\ \{\Psi\}$; because it is a property of Hoare-triples that if $\Phi_1, \Phi_2$ are two such (for a given $S, \Psi$), then so is $\Phi_1 \lor \Phi_2$ (and this extends even to infinitely many), in fact there is a so-called weakest precondition which is the disjunction of them all: it is written $wp.S.\Psi$ and, by definition, it has the property

$\{\Phi\}\ S\ \{\Psi\}$ iff $\models \Phi \Rightarrow wp.S.\Psi$. (5)

Because the partially evaluated terms wp.S can be seen as functions from postconditions Ψ to preconditions Φ, those terms are sometimes called predicate transformers; and a systematic syntax-inductive definition of wp.S for every program S in some language is called predicate-transformer semantics. We give it for our language in Fig. 8; it is consistent with the operational semantics of Fig. 2.
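Property (5) can be illustrated in the purely classical setting by brute force over a finite state space. The sketch below is not the paper's construction: the state space, the example program `negate_or_skip` and the helper names are hypothetical, chosen only to show a weakest precondition computed as "all outcomes satisfy the postcondition".

```python
# Classical weakest precondition over a small finite state space (a sketch).
# A program maps a state to the set of its possible final states.

STATES = range(-3, 4)

def wp(program, post):
    """Set of initial states from which every outcome satisfies post."""
    return {s for s in STATES if all(post(t) for t in program(s))}

def triple(pre, program, post):
    """Hoare triple {pre} program {post}, checked by enumeration."""
    return all(post(t) for s in STATES if pre(s) for t in program(s))

def implies_wp(pre, program, post):
    """The right-hand side of (5): pre implies wp.program.post."""
    return {s for s in STATES if pre(s)} <= wp(program, post)

# Hypothetical example: demonically either negate x or leave it alone.
def negate_or_skip(x):
    return {x, -x}

def nonneg(x):
    return x >= 0

def is_zero(x):
    return x == 0

weakest = wp(negate_or_skip, nonneg)   # only x = 0 guarantees x >= 0 afterwards
```

For both candidate preconditions `is_zero` and `nonneg`, `triple` and `implies_wp` agree, as (5) requires: the first is a valid precondition and the second is not.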

Our predicate-transformer semantics for ignorance-sensitive programs is thus derived from the operational semantics and the interpretation in Sec. 6 of modal formulae. With it comes a wp-style definition of refinement

$S1 \sqsubseteq S2$ iff $\models wp.S1.\Psi \Rightarrow wp.S2.\Psi$ for all ignorant $\Psi$ (6)

which, directly from (5), is consistent with our other two definitions.

8.2 Predicate-transformer semantics for ignorance: approach

To derive a wp-semantics for the original (v, h)-space, we translate our modal (v, h)-formulae into equivalent classical formulae over (v, h, H), calculate the classical weakest preconditions with respect to the corresponding (v, h, H)-programs, and finally translate those back into modal formulae.

From Sec. 6 we can see that the modality $\mathrm{P}\Phi$ corresponds to the classical formula $(\exists (h_1, \cdots){:}\ H \cdot \Phi)$, where hidden variables $h_1, \cdots$ are all those in scope; for a general formula $\Psi$ we will write $[\![\Psi]\!]$ for the result of applying that translation to all modalities it contains; we continue to assume that modalities are not nested. We write wp for our new transformer semantics (over v, h), and in this section also for the classical transformer semantics (over v, h, H): the classical one is always the one applied to translated programs $[\![S]\!]$ and translated formulae $[\![\Psi]\!]$.

As an example of this procedure (recalling Fig. 6.8) we have the following:

$[\![v := h \bmod 2]\!] \ =\ v := h \bmod 2;\ H := \{h{:}\ H \mid v = h \bmod 2\}$
and $[\![v = 0 \Rightarrow \mathrm{P}(h \in \{2,4\})]\!] \ =\ v = 0 \Rightarrow (\exists h{:}\ H \mid h \in \{2,4\})$.

Then the classical wp-semantics [5] is used over the explicit (v, h, H)-program fragments as follows in this example:

$v = 0 \Rightarrow (\exists h{:}\ H \mid h \in \{2,4\})$
through $wp.(H := \{h{:}\ H \mid v = h \bmod 2\})$ gives
$v = 0 \Rightarrow (\exists h{:}\ \{h{:}\ H \mid v = h \bmod 2\} \mid h \in \{2,4\})$
$=\ v = 0 \Rightarrow (\exists h{:}\ \{h'{:}\ H \mid v = h' \bmod 2\} \mid h \in \{2,4\})$   “rename bound variable”
$=\ v = 0 \Rightarrow (\exists h{:}\ H \mid v = h \bmod 2 \land h \in \{2,4\})$   “flatten comprehensions”
$=\ v = 0 \Rightarrow (\exists h{:}\ H \mid v = 0 \land h \in \{2,4\})$   “arithmetic”
$=\ v = 0 \Rightarrow (\exists h{:}\ H \mid h \in \{2,4\})$   “predicate calculus”
through $wp.(v := h \bmod 2)$ gives
$h \bmod 2 = 0 \Rightarrow (\exists h{:}\ H \mid h \in \{2,4\})$

which, translated back into the modal P-form (i.e. “un-$[\![\cdot]\!]$'d”) gives the weakest precondition $h \bmod 2 = 0 \Rightarrow \mathrm{P}(h \in \{2,4\})$.
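The result of this calculation can be cross-checked by enumerating every state (v, h, H) over a small hidden domain. The sketch below assumes the translation of `v := h mod 2` quoted above and a hypothetical hidden domain 0..4; since the translated program is deterministic, its weakest precondition at a state is simply the postcondition evaluated at the unique successor state.

```python
from itertools import combinations

HIDDEN = range(5)   # hypothetical hidden domain 0..4

def run_assign(state):
    """[[v := h mod 2]]: set v, then shrink H to the hiddens of matching parity."""
    v, h, H = state
    new_v = h % 2
    return (new_v, h, frozenset(x for x in H if x % 2 == new_v))

def post(state):
    """v=0 ⇒ P(h∈{2,4})"""
    v, h, H = state
    return v != 0 or any(x in (2, 4) for x in H)

def claimed_wp(state):
    """h mod 2 = 0 ⇒ P(h∈{2,4})"""
    v, h, H = state
    return h % 2 != 0 or any(x in (2, 4) for x in H)

def all_states():
    """Every (v, h, H) with H a non-empty subset of HIDDEN and h in H."""
    for size in range(1, len(HIDDEN) + 1):
        for subset in combinations(HIDDEN, size):
            for h in subset:
                for v in (0, 1):
                    yield (v, h, frozenset(subset))

agrees = all(claimed_wp(s) == post(run_assign(s)) for s in all_states())
```

The flag `agrees` records whether the claimed precondition holds at exactly those initial states whose successor satisfies the postcondition.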

Our general tool for the transformers will be the familiar substitution $[e \backslash E]$ which replaces all occurrences of variable $e$ by the term $E$, introducing α-conversion as necessary to avoid capture by quantifiers. There are some special points to note, however, if we are applying the substitution over a modal formula $\Phi$ and wish to be consistent with the way the substitution would actually be carried out over $[\![\Phi]\!]$.

(1) Substitution into a modality for a hidden variable has no effect; the substitution is simply discarded. Thus $[h \backslash E]\mathrm{P}\Phi$ is just $\mathrm{P}\Phi$.
(2) If $x$ does not occur in $\Phi$ then again the substitution is just discarded. Thus $[x \backslash E]\mathrm{P}\Phi$ is just $\mathrm{P}\Phi$ if $x$ does not occur in $\Phi$.
(3) If $x, E$ contain no hiddens, then $[x \backslash E]\mathrm{P}\Phi$ is just $\mathrm{P}([x \backslash E]\Phi)$.
(4) Otherwise the substitution “stalls” at the modality (there is no reduction).

These effects are due to the fact that, although the modality P is implicitly a quantification over hidden variables, we have not listed variables after the “quantifier” as one normally would: so we cannot α-convert them if that is what is necessary to allow the substitution to go through. (This accounts for the stalling in Case 4 and is the price we pay for suppressing the clutter of H and its quantification.) Usually the body Φ of the modality can be manipulated until one of Cases 1–3 holds, and then the substitution goes through after all.

8.3 Example calculations of atomic commands’ semantics

We now calculate the wp-semantics for Identity, Assign to visible and Choose hidden. Occurrences of v, h in the rules may be vectors of visible or vectors of hidden variables, in which case substitutions such as $[h \backslash h']$ apply throughout the vector. We continue to assume no nesting of modalities.

8.3.1 (v, h)-transformer semantics for skip

We must define wp in this case so that for all modal formulae Ψ we have

$[\![wp.\mathbf{skip}.\Psi]\!] \ =\ wp.[\![\mathbf{skip}]\!].[\![\Psi]\!]$,

and that is clearly satisfied by the definition $wp.\mathbf{skip}.\Psi \mathrel{\widehat{=}} \Psi$.

8.3.2 (v, h)-transformer semantics for Assign to visible

Here we need

$[\![wp.(v := E).\Psi]\!]$
$=\ wp.[\![v := E]\!].[\![\Psi]\!]$   (from Fig. 2)
$=\ [e \backslash E]\,[H \backslash \{h{:}\ H \mid e = E\}]\,[v \backslash e]\,[\![\Psi]\!]$,

where the second equality comes from the operational-semantic definitions of Fig. 2 and classical wp. Note that we have three substitutions to perform in general.

The middle substitution $[H \backslash \{h{:}\ H \mid e = E\}]$ leads to the definition of what we will call a “technical transformer,” Shrink Shadow, and it is a feature of our carrying H around without actually referring to it; in our v, h semantics we will write it $[\Downarrow e = E]$, where the equality $e = E$ expresses the constraint (the “shrinking”) to be applied to the potential values for h as recorded in H.

The general $[\Downarrow \phi]$ is a simple substitution (too); but since it is a substitution for H it affects only the (translated) modal formulae, having no effect on classical atomic formulae (because they don't contain H). Thus for modal formulae (only) we have

$[H \backslash \{h{:}\ H \mid \phi\}]\,[\![\mathrm{P}\Phi]\!]$
$=\ [H \backslash \{h{:}\ H \mid \phi\}]\,(\exists h{:}\ H \cdot \Phi)$   “translate”
$=\ (\exists h{:}\ \{h'{:}\ H \mid \phi'\} \cdot \Phi)$   “let $\phi'$ be $[h \backslash h']\phi$ for clarity”
$=\ (\exists h{:}\ H \cdot \phi \land \Phi)$   “simplify”
$=\ [\![\mathrm{P}(\phi \land \Phi)]\!]$   “retranslate”
$=\ [\![\,[\Downarrow \phi]\mathrm{P}\Phi\,]\!]$.   “postulated definition Shrink Shadow”

Note that hidden variables in φ are not protected in Shrink Shadow from implicit capture by P.

Collecting this all together gives us

$wp.(v := E).\Psi \ \mathrel{\widehat{=}}\ [e \backslash E]\,[\Downarrow e = E]\,[v \backslash e]\,\Psi$,

which will be the definition we give in Fig. 8 below. We place the synthesised definition of Shrink Shadow in Fig. 7.

8.3.3 (v, h)-transformer semantics for Choose hidden

Here we need

$[\![wp.(h {:\in} E).\Psi]\!]$
$=\ wp.[\![h {:\in} E]\!].[\![\Psi]\!]$
$=\ (\forall h{:}\ E \cdot [H \backslash \cup\{E \mid h{:}\ H\}]\,[\![\Psi]\!])$,

where again we have referred to Fig. 2. The substitution $[H \backslash \cup\{E \mid h{:}\ H\}]$ we call Set Shadow; in our (v, h)-semantics we write it $[h \Leftarrow E]$.

Since Set Shadow is again a substitution for H it too affects only the (translated) modal formulae:

$[H \backslash \cup\{E \mid h{:}\ H\}]\,[\![\mathrm{P}\Phi]\!]$
$=\ [H \backslash \cup\{E \mid h{:}\ H\}]\,(\exists h{:}\ H \cdot \Phi)$   “translate”
$=\ [H \backslash \cup\{E \mid h{:}\ H\}]\,(\exists h'{:}\ H \cdot \Phi')$   “let $\Phi'$ be $[h \backslash h']\Phi$ for clarity”
$=\ (\exists h'{:}\ \cup\{E \mid h{:}\ H\} \cdot \Phi')$   “substitute”
$=\ (\exists h{:}\ H \cdot (\exists h' \cdot h' \in E \land \Phi'))$   “convert $\cup$ to $\exists$”
$=\ [\![\mathrm{P}(\exists h' \cdot h' \in E \land \Phi')]\!]$   “retranslate”
$=\ [\![\mathrm{P}(\exists h{:}\ E \cdot \Phi)]\!]$   “remove primes”
$=\ [\![\,[h \Leftarrow E]\mathrm{P}\Phi\,]\!]$.   “definition Set Shadow”

Note that h's in E are not captured by the quantifier $(\exists h \cdots)$: rather they are implicitly captured by the modality P. Collecting this together gives us

$wp.(h {:\in} E).\Psi \ \mathrel{\widehat{=}}\ (\forall h{:}\ E \cdot [h \Leftarrow E]\Psi)$

which again will be the definition we give in Fig. 8 below. The synthesised definition of Set Shadow joins the other technical transformers in Fig. 7.

8.4 Adherence to the principles

The wp-logic of Figs. 7,8 has the following significant features, some of which (1–4) bear directly on the principles we set out in Sec. 2, and some of which (5–7) are additional properties desirable in their own right.

“Meta-” modality: “M” stands for both K and P.

Substitute $[e \backslash E]$: Replaces $e$ by $E$, with alpha-conversion as necessary if distributing through $\forall, \exists$. Distribution through M however is affected by the modalities' implicit quantification over hidden variables: if $e$ is a hidden variable, then $[e \backslash E]\mathrm{M}\Phi$ is just $\mathrm{M}\Phi$; and if $E$ contains hidden variables, the substitution does not distribute into $\mathrm{M}\Phi$ at all (which therefore requires simplification by other means).

Shrink Shadow $[\Downarrow E]$: Distributes through all classical operators, with renaming; has no effect on classical atomic formulae. We have $[\Downarrow E]\mathrm{P}\Phi \mathrel{\widehat{=}} \mathrm{P}(E \land \Phi)$ and $[\Downarrow E]\mathrm{K}\Phi \mathrel{\widehat{=}} \mathrm{K}(E \Rightarrow \Phi)$; in both cases, hidden variables in $E$ are not renamed.

Set hidden $[h \leftarrow E]$: Distributes through all operators, including M, with renaming as necessary to avoid capture by $\forall, \exists$ (but not M). Replaces $h$ by $E$.

Set Shadow $[h \Leftarrow E]$: Distributes through all classical operators, with renaming; has no effect on classical atomic formulae. For modal formulae $[h \Leftarrow E]\mathrm{P}\Phi \mathrel{\widehat{=}} \mathrm{P}(\exists h{:}\ E \cdot \Phi)$; and $[h \Leftarrow E]\mathrm{K}\Phi \mathrel{\widehat{=}} \mathrm{K}(\forall h{:}\ E \cdot \Phi)$; note that h's in $E$ (if any) are not captured by the introduced quantifier.

Fig. 7. Technical predicate transformers

Identity: $wp.\mathbf{skip}.\Psi \mathrel{\widehat{=}} \Psi$
Assign to visible: $wp.(v := E).\Psi \mathrel{\widehat{=}} [e \backslash E]\,[\Downarrow e = E]\,[v \backslash e]\,\Psi$
Choose visible: $wp.(v {:\in} E).\Psi \mathrel{\widehat{=}} (\forall e{:}\ E \cdot [\Downarrow e \in E]\,[v \backslash e]\,\Psi)$
Assign to hidden: $wp.(h := E).\Psi \mathrel{\widehat{=}} [h \leftarrow E]\,\Psi$
Choose hidden: $wp.(h {:\in} E).\Psi \mathrel{\widehat{=}} (\forall h{:}\ E \cdot [h \Leftarrow E]\,\Psi)$
Demonic choice: $wp.(S1 \sqcap S2).\Psi \mathrel{\widehat{=}} wp.S1.\Psi \land wp.S2.\Psi$
Composition: $wp.(S1; S2).\Psi \mathrel{\widehat{=}} wp.S1.(wp.S2.\Psi)$
Conditional: $wp.(\mathbf{if}\ E\ \mathbf{then}\ S1\ \mathbf{else}\ S2\ \mathbf{fi}).\Psi \mathrel{\widehat{=}} (E \Rightarrow [\Downarrow E]\,wp.S1.\Psi) \land (\neg E \Rightarrow [\Downarrow \neg E]\,wp.S2.\Psi)$
Declare visible: $wp.(\mathbf{vis}\ v).\Psi \mathrel{\widehat{=}} (\forall e \cdot [v \backslash e]\,\Psi)$
Declare hidden: $wp.(\mathbf{hid}\ h).\Psi \mathrel{\widehat{=}} (\forall e \cdot [h \leftarrow e]\,\Psi)$

In the two declaration rules, note that both substitutions propagate within modalities in $\Psi$. Logical variable $e$ is fresh.

The assign to visible rule has two components conceptually. The first is of course an assignment of $E$ to $v$, although this is split into two sections $[e \backslash E] \cdots [v \backslash e]$ so that v's initial and final values are distinguished in between, necessary should $v$ occur in $E$. The second is the “collapse” of ignorance caused by E's value being revealed: the effect of this is inserted by $[\Downarrow e = E]$, into the body of modalities.

Fig. 8. Weakest-precondition modal semantics

(1) Refinement is monotonic (RP0). In view of (6) this requires only a check of Figs. 7,8 for the transformers' monotonicity with respect to $\Rightarrow$.
(2) All visible-variable-only refinements remain valid (RP1). Reference to Fig. 2 confirms that for any visible-only program fragment $S$ its ignorance-sensitive conversion $[\![S]\!]$ has exactly the conventional effect on variable v and no effect on the shadow H (or indeed on h).
(3) All structural refinements remain valid (RP2). At this point we define structural to mean “involving only Demonic choice, Composition, Identity and right-hand distributive properties of Conditional.” Such properties can be proved using the operational definitions in Fig. 2, which in the first three cases above are exactly the same as their classical counterparts, merely distributing the relevant operator outwards: that is why their program-algebraic properties are the same. In the fourth case, Conditional, distribution from the left is not considered structural as it requires inspection of the condition $E$; distribution from the right however remains as in the classical case. Thus we retain for example

$(\mathbf{if}\ E\ \mathbf{then}\ S1\ \mathbf{else}\ S2\ \mathbf{fi});\ S \ =\ \mathbf{if}\ E\ \mathbf{then}\ (S1; S)\ \mathbf{else}\ (S2; S)\ \mathbf{fi}$.

(4) Referential transparency remains valid (RT). The operational definitions show that the effect of a program $S$, if classically interpreted over its variables v, h, is the same as the effect of $[\![S]\!]$ on those variables (i.e. ignoring H). Thus two v, h-expressions being equal carries over from $S$ to $[\![S]\!]$, where we can appeal to classical referential transparency in the v, h, H semantics in order to replace equals by equals.
(5) The ignorance-sensitive predicate transformers distribute conjunction, as classical transformers do [5]. Thus complicated postconditions can be treated piecewise. This can be seen by inspection of Fig. 8.
(6) Non-modal postconditions can be treated using classical semantics [4,5], even if the program contains hidden variables. This follows from inspection of Figs. 7,8 to show that the transformers reduce to their classical definitions when there are no modalities.
(7) Modal semantics can be restricted to modal conjuncts. From (5) we can treat each conjunct separately; from (6) we can use classical semantics on the non-modal ones. Thus the modal semantics is required only for the others.

Left-hand calculation:

$wp.(v {:\in} T).(\mathrm{P}(h=C))$
$\equiv\ (\forall e{:}\ T \cdot [\Downarrow e \in T]\,[v \backslash e]\,\mathrm{P}(h=C))$   “Choose visible”
$\equiv\ (\forall e{:}\ T \cdot [\Downarrow e \in T]\,\mathrm{P}(h=C))$   “v not free”
$\equiv\ (\forall e{:}\ T \cdot \mathrm{P}(e \in T \land h=C))$   “Shrink Shadow”
$\equiv\ (\forall e{:}\ T \cdot e \in T \land \mathrm{P}(h=C))$   “h not free in $e \in T$”
$\equiv\ \mathrm{P}(h=C)$   “e not free in $\mathrm{P}(h=C)$”

Right-hand calculation:

$wp.(v := h).(\mathrm{P}(h=C))$
$\equiv\ [e \backslash h]\,[\Downarrow e = h]\,[v \backslash e]\,\mathrm{P}(h=C)$   “Assign to visible”
$\equiv\ [e \backslash h]\,\mathrm{P}(e = h \land h=C)$   “v not free; Shrink Shadow”
$\equiv\ [e \backslash h]\,\mathrm{P}(e = C \land h=C)$   “h=C”
$\equiv\ [e \backslash h]\,(e = C \land \mathrm{P}(h=C))$   “h not free in $e = C$”
$\equiv\ h = C \land \mathrm{P}(h=C)$   “e not free”
$\equiv\ h = C$   “$\models \Phi \Rightarrow \mathrm{P}\Phi$”

We exploit that $\models \mathrm{P}(\Phi \land \Psi) \equiv \Phi \land \mathrm{P}\Psi$ when $\Phi$ contains no hidden variables.
The right-hand calculation shows that $h=C$ is the weakest precondition $\Phi$ such that $\{\Phi\}\ v := h\ \{\mathrm{P}(h=C)\}$, yet $(v,h,H) \models \mathrm{P}(h=C) \Rightarrow (h=C)$ for all $C$ only when $H = \{h\}$. Thus, when the expression $T$ contains no h, the fragment $v {:\in} T$ can be replaced by $v := h$ only if we know h already.

Fig. 9. Avoiding the Refinement Paradox via wp-calculations

8.5 Example of wp-reasoning

Fig. 9 uses weakest preconditions to support our earlier claim (Sec. 7.3.2) that the Refinement Paradox is resolved logically: if $\not\models wp.S1.\Psi \Rightarrow wp.S2.\Psi$, then from (5) we know $\{wp.S1.\Psi\}\ S1\ \{\Psi\}$ holds (perforce) but $\{wp.S1.\Psi\}\ S2\ \{\Psi\}$ does not, and that is the situation in Fig. 9.
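The two weakest preconditions of Fig. 9 can also be confirmed by enumeration over a small state space. This sketch assumes a hypothetical hidden domain {0, 1, 2} and a constant set T = {0, 1}, and it reads P(h=C) as "C is in the shadow H"; none of these names or values come from the paper.

```python
from itertools import combinations

HIDDEN = (0, 1, 2)   # hypothetical hidden domain
T = (0, 1)           # hypothetical constant set for v:∈T

def choose_visible(state):
    """v :∈ T with T constant: the shadow H is left unchanged."""
    v, h, H = state
    return [(e, h, H) for e in T]

def assign_hidden_to_visible(state):
    """v := h: the shadow collapses to the actual value of h."""
    v, h, H = state
    return [(h, h, frozenset(x for x in H if x == h))]

def possibly(c):
    """P(h=c): the adversary still considers c a possible value of h."""
    return lambda state: c in state[2]

def wp(program, post, state):
    """Weakest precondition at one state: every outcome satisfies post."""
    return all(post(t) for t in program(state))

def all_states():
    for size in range(1, len(HIDDEN) + 1):
        for subset in combinations(HIDDEN, size):
            for h in subset:
                for v in T:
                    yield (v, h, frozenset(subset))

# Left column of Fig. 9: wp.(v:∈T).P(h=C) is P(h=C).
left_matches = all(
    wp(choose_visible, possibly(c), s) == possibly(c)(s)
    for s in all_states() for c in HIDDEN
)

# Right column of Fig. 9: wp.(v:=h).P(h=C) is h=C.
right_matches = all(
    wp(assign_hidden_to_visible, possibly(c), s) == (s[1] == c)
    for s in all_states() for c in HIDDEN
)

# States where the first precondition holds but the second does not.
gap = [(s, c) for s in all_states() for c in HIDDEN
       if possibly(c)(s) and s[1] != c]
```

Any element of `gap` is a witness that the implication between the two weakest preconditions fails, which is the situation described for Fig. 9.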

9 Examples of further techniques

Beyond the general identities introduced above, a number of specific techniques are suggested by the examples and case studies that we address below. Here we introduce some of them, in preparation for the Encryption Lemma (Sec. 10) and the Oblivious Transfer Protocol (Sec. 11).

9.1 Explicit atomicity

If we know that only our initial and final states are observable, why must we assume a “strong” adversary, with perfect recall and access to program flow?

The answer is that if we do not make those assumptions then we cannot soundly develop our program by stepwise refinement. But if we are prepared to give up stepwise refinement in a portion of our program, then we can for that portion assume our adversary is “weak,” seeing only initial and final values.

We formalise that via special brackets $\{\!\{\cdot\}\!\}$ whose interior will have a dual meaning: at run time, a weak adversary is assumed there (an advantage); but, at development time, refinement is not allowed there (a disadvantage). Thus it is a trade-off. Intuitively we say that $\{\!\{S\}\!\}$ interprets $S$ atomically even when $S$ is compound: it cannot be internally observed; but neither can it be internally refined.

Thus for example $\{\!\{h := 0 \sqcap h := 1\}\!\}$ is equal to the atomic $h {:\in} \{0,1\}$: in the former, the atomicity brackets prevent an observer's using the program flow at runtime to determine the value of h, and so the $\{\!\{\cdot\}\!\}$'d choice-point $\sqcap$ is not the security leak it would normally be. But the price we consequentially pay is that we are not allowed to use refinement there at development-time: for example, although the refinement $(h := 0 \sqcap h := 1) \sqsubseteq h := 0$ is trivial, nevertheless we cannot apply that refinement within the brackets $\{\!\{\cdot\}\!\}$ to conclude

$\{\!\{h := 0 \sqcap h := 1\}\!\} \ \sqsubseteq^{?}\ \{\!\{h := 0\}\!\}$

since doing that is equivalent to asserting the falsehood $h {:\in} \{0,1\} \sqsubseteq^{?} h := 0$. (The lhs achieves postcondition $\mathrm{P}(h=1)$; the rhs does not.)

We do not give the formal definition of $\{\!\{\cdot\}\!\}$ here; 9 but we note several useful and reasonable properties of it.

9 The details are available in an addendum [16]. In particular Prop. 1 is important as a statement of consistency, since our definitions in Fig. 8 should be equally valid if presented as lemmas proved directly from the definition of $\{\!\{\cdot\}\!\}$.

Proposition 1 If program $S$ is syntactically atomic then $\{\!\{S\}\!\} = S$. ✷

Proposition 2 For all programs $S$ we have $S \sqsubseteq \{\!\{S\}\!\}$. ✷

Proposition 3 If either of the programs $S1, S2$ does not assign to any visible variable, or (more generally) program S1's final visibles are completely determined by its initial visibles, or program S2's initial visibles are completely determined by its final visibles, then $\{\!\{S1; S2\}\!\} = \{\!\{S1\}\!\};\ \{\!\{S2\}\!\}$. ✷

For Prop. 3, informally, we note that allowing intrusion at a semicolon is harmless when there's nothing (new) for the run-time attacker to see: for example, if S1 changes no visible variable, then any visible values revealed at the semicolon were known at the beginning of the whole statement already; if S2 does not, then they will be known at the end.

9.2 Referential transparency and classical reasoning

A classical use of referential transparency is to reason that

• A program fragment S1 say establishes some (classical) formula φ over the program state;
• That φ implies the equality of two expressions E and F; and that therefore
• Expression E can be replaced by expression F in some fragment S2 that immediately follows S1.

The same holds true in our extended context, provided we realise that S1 must establish φ in all possible states: in effect we demand Kφ rather than just φ. In this section we check that this happens automatically, even when visibles and hiddens are mixed, and that thus the classical use of RT continues to apply.

Because Kφ is itself not ignorant, we use it negatively by appealing to “coercions”: a coercion is written [Φ] for some formula Φ and is defined 10

Coerce to Φ: $wp.[\Phi].\Psi \ \mathrel{\widehat{=}}\ \Phi \Rightarrow \Psi$. (7)

The principal use of a coercion is to formalise the reasoning pattern above, i.e. the establishing of some φ in all potential states (by S1) and its use as a context for RT (in S2). We begin with using a coercion as a context.

10 Coercions are dual to the more familiar assertions that allow abnormal termination if they fail (i.e. abort); in contrast, coercions cause backtracking if they fail (i.e. magic) [6].

Lemma 2 For any variable $x$, whether visible or hidden, and classical formula φ such that $\phi \Rightarrow E = F$, we have

$[\mathrm{K}\phi];\ x {:\in} E \ =\ [\mathrm{K}\phi];\ x {:\in} F$.

Proof: Appendix A. ✷

Note that the result specialises to assignments, as those are just singleton choices, and that we can view the lemma as formalising RT : the truth of Kφ establishes the context in which E=F , allowing their use interchangeably.

We now address how coercions are introduced, moved and finally removed. Introduction and removal are easily dealt with, as the following lemma shows that any coercion can be introduced at any point, up to refinement. The coercion [true] can always be removed:

Lemma 3 We have $[\mathit{true}] = \mathbf{skip}$ and $\mathbf{skip} \sqsubseteq [\Phi]$ for any $\Phi$.

Proof: Trivial. ✷

The only problem thus is removing a coercion that is not just [true] — and that is done by moving it forward through statements whose combined effect is to establish it, i.e. to “make” it true operationally:

Lemma 4 For classical formulae φ, ψ and program S, we have

$\models \phi \Rightarrow wp.S.\psi$ implies $S;\ [\mathrm{K}\psi] \ \sqsubseteq\ [\mathrm{K}\phi];\ S$.

Proof: Appendix A. ✷

The importance of Lem. 4 is that its assumption $\models \phi \Rightarrow wp.S.\psi$ can usually be established by classical reasoning (recall Sec. 8(6)). In the special case where φ and ψ are the same, we say that S preserves φ and, as a result, can think of “pushing [Kφ] to the left, through S” as being a refinement.

In summary, the above lemmas work together to provide the formal justifica- tion for carrying information from one part of a program to another, as in the following example. Suppose we have a linear fragment

S; SS; x:= E

where, informally, program S establishes some classical property φ, that prop- erty can be shown by classical reasoning to be preserved by SS, and it implies that E=F . Then to change E to F in the final statement, we would reason as follows:

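$S;\ SS;\ x := E$
$\sqsubseteq\ S;\ SS;\ ([\mathrm{K}\phi];\ x := E)$   “Lem. 3”
$=\ S;\ SS;\ ([\mathrm{K}\phi];\ x := F)$   “assumption; Lem. 2”
$=\ S;\ (SS;\ [\mathrm{K}\phi]);\ x := F$   “associativity”
$\sqsubseteq\ S;\ ([\mathrm{K}\phi];\ SS);\ x := F$   “SS preserves φ”
$=\ (S;\ [\mathrm{K}\phi]);\ SS;\ x := F$   “associativity”
$\sqsubseteq\ [\mathit{true}];\ S;\ SS;\ x := F$   “S establishes φ”
$=\ S;\ SS;\ x := F$.   “Lem. 3”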

Thus the general pattern is to assume the property φ of the state necessary to replace E by F , and then to move it forward, preserved, through the program SS to a point S at which it can be shown to have been established. An example of this is given in Sec. 10(5).

We have thus checked that the “establish context, then use it” paradigm continues to operate in the normal way, even with mixed visibles and hiddens; therefore we will usually appeal to the above results only informally.

9.3 Hiding leads to refinement

We show that changing a variable from visible to hidden, leaving all else as it is, has the effect of a refinement: informally, that is because the nondeterminism has not been changed while the ignorance can only have increased. More precisely, we show that

$|[\ \mathbf{vis}\ v \cdot S(v)\ ]| \ \sqsubseteq\ |[\ \mathbf{hid}\ h \cdot S(h)\ ]|$

for any program S(v) possibly containing references to variable v. This is a syntactic transformation, as we can see by comparing the two fragments

$|[\ \mathbf{vis}\ v \cdot (v := 0 \sqcap v := 1);\ h' := v\ ]|$ and $|[\ \mathbf{vis}\ v \cdot v {:\in} \{0,1\};\ h' := v\ ]|$ (8)

in the context of global hidden $h'$; they are both equivalent semantically to $h' := 0 \sqcap h' := 1$ since the assignment to v is visible in each case. However, transforming visible v to hidden h gives respectively

$|[\ \mathbf{hid}\ h \cdot (h := 0 \sqcap h := 1);\ h' := h\ ]|$ and $|[\ \mathbf{hid}\ h \cdot h {:\in} \{0,1\};\ h' := h\ ]|$, (9)

which differ semantically: the lhs remains $h' := 0 \sqcap h' := 1$, since the resolution of $\sqcap$ is visible (even though h itself is not); but the rhs is now $h' {:\in} \{0,1\}$. Note that (9,rhs) is a refinement of (8,rhs) nevertheless, which is precisely what we seek to prove in general.
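The comparison of fragments (8) and (9) can be replayed mechanically. The following sketch models the hidden state as a pair (h, h′) and the shadow as a set of such pairs, then projects onto the global h′ when the local block ends. The command semantics are an assumption inferred from Sec. 8 rather than copied from Fig. 2, local variables are simply started at 0, and all helper names are hypothetical.

```python
# Hidden state is a pair (h, hp), where hp stands for the global hidden h′.
# The shadow H is the set of pairs the adversary still considers possible.
# ASSUMPTION: command semantics inferred from Sec. 8, not copied from Fig. 2.

def vis_assign(f):
    """v := f(v, hid); the revealed value filters the shadow."""
    def run(state):
        v, hid, H = state
        new_v = f(v, hid)
        return {(new_v, hid, frozenset(x for x in H if f(v, x) == new_v))}
    return run

def vis_choose(values):
    """v :∈ values (constant set): reveals nothing about the hiddens."""
    return lambda state: {(e, state[1], state[2]) for e in values}

def hid_update(g):
    """hid := g(v, hid); the adversary applies g to every shadow element."""
    def run(state):
        v, hid, H = state
        return {(v, g(v, hid), frozenset(g(v, x) for x in H))}
    return run

def hid_choose(G):
    """hid :∈ G(v, hid); the shadow becomes the union of all possible choices."""
    def run(state):
        v, hid, H = state
        shadow = frozenset(y for x in H for y in G(v, x))
        return {(v, y, shadow) for y in G(v, hid)}
    return run

def choice(p, q):
    """Demonic choice whose resolution the adversary can see."""
    return lambda state: p(state) | q(state)

def seq(*commands):
    def run(state):
        states = {state}
        for command in commands:
            states = {t for s in states for t in command(s)}
        return states
    return run

def outcomes(program, hp0=0):
    """Final (h′, shadow of h′) pairs once the local v and h go out of scope."""
    shadow0 = frozenset((0, a) for a in (0, 1))
    finals = program((0, (0, hp0), shadow0))
    return {(hid[1], frozenset(x[1] for x in H)) for (_, hid, H) in finals}

def refines(s1, s2):
    """s1 ⊑ s2: each s2-outcome is matched by an s1-outcome with a smaller shadow."""
    return all(
        any(a == b and H1 <= H2 for (a, H1) in outcomes(s1, hp0))
        for hp0 in (0, 1)
        for (b, H2) in outcomes(s2, hp0)
    )

store_v = hid_update(lambda v, x: (x[0], v))        # h′ := v
copy_h = hid_update(lambda v, x: (x[0], x[0]))      # h′ := h
set_h = lambda c: hid_update(lambda v, x: (c, x[1]))                 # h := c
pick_h = hid_choose(lambda v, x: {(0, x[1]), (1, x[1])})             # h :∈ {0,1}

eight_lhs = seq(choice(vis_assign(lambda v, x: 0), vis_assign(lambda v, x: 1)), store_v)
eight_rhs = seq(vis_choose({0, 1}), store_v)
nine_lhs = seq(choice(set_h(0), set_h(1)), copy_h)
nine_rhs = seq(pick_h, copy_h)
```

In this model both fragments of (8) and the left fragment of (9) leave the adversary knowing h′ exactly, whereas the right fragment of (9) leaves both values possible; `refines(eight_rhs, nine_rhs)` holds and the converse does not.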

Lemma 5 If program S2 is obtained from program S1 by a syntactic transformation in which some local visible variable is changed to be hidden, then $S1 \sqsubseteq S2$.

Proof: Routine applications of definitions, as shown in Appendix A. ✷

10 The Encryption Lemma

In this section we see our first substantial proof of refinement; and we prepare for our treatment of the OTP (Oblivious Transfer Protocol).

When a hidden secret is encrypted with a hidden key and the result published as a visible message, the intention is that adversaries ignorant of the key cannot use the message to deduce the secret, even if they know the encryption method. A special case of this occurs in the OTP, where a secret is encrypted (via exclusive-or) with a key (a hidden Boolean) and becomes a message (is published). We examine this simple situation in the ignorance logic.

Lemma 6 Let $s{:}\ S$ be a secret, let $k{:}\ K$ be a key, and let $\odot$ be an encryption method so that $s \odot k$ is the encryption of $s$. Then in a context including $\mathbf{hid}\ s$ we have the refinement

$\mathbf{skip} \ \sqsubseteq\ |[\ \mathbf{vis}\ m;\ \mathbf{hid}\ k \cdot k {:\in} K;\ m := s \odot k\ ]|$,

which expresses that publishing the encryption $s \odot k$ as a visible message $m$ reveals nothing about the secret $s$, provided the key $k$ is hidden and the following Key-Complete Condition is satisfied: 11

KCC: $(\exists\ \mathrm{set}\ M \cdot (\forall s{:}\ S \cdot s \odot K = M))$. (10)

By $s \odot K$ we mean $\{s \odot k \mid k{:}\ K\}$, that is the set formed by $\odot$-ing the secret $s$ with all possible keys in $K$. Informally KCC states that the set of possible encryptions is the same for all secrets: it is some fixed set $M$.
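For finite sets the Key-Complete Condition is easy to test directly. The sketch below uses hypothetical names (`key_complete`, `two_bit`) and takes bitwise exclusive-or on small integers as one example of an encryption method; it is an illustration of condition (10), not part of the paper's development.

```python
def key_complete(secret_set, key_set, enc):
    """KCC: the set {enc(s, k) | k in key_set} is one fixed set M for every s."""
    images = {frozenset(enc(s, k) for k in key_set) for s in secret_set}
    return len(images) <= 1   # vacuously true when there are no secrets

def xor(s, k):
    return s ^ k

two_bit = range(4)   # hypothetical secret and key space: 2-bit strings as ints

kcc_full_keys = key_complete(two_bit, two_bit, xor)    # every 2-bit key available
kcc_single_key = key_complete(two_bit, [0], xor)       # only the all-zero key
kcc_and = key_complete([0, 1], [0, 1], lambda s, k: s & k)   # AND instead of XOR
```

With the full key space every secret has the same set of possible encryptions, so the condition holds; with a single key, or with AND as the "encryption", the images differ between secrets and it fails.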

Proof: Assume a context containing the declaration $\mathbf{hid}\ s{:}\ S$. Then we reason 12

$\mathbf{skip}$
$=\ |[\ \mathbf{vis}\ m \cdot m {:\in} M\ ]|$   “for any fixed non-empty set M” → (1)
$=\ |[\ \mathbf{vis}\ m, v \cdot v {:\in} M;\ m := v\ ]|$   “visible-only reasoning”
$\sqsubseteq\ |[\ \mathbf{vis}\ m;\ \mathbf{hid}\ h \cdot h {:\in} M;\ m := h\ ]|$   “Lem. 5 and renaming” → (2)
$=\ |[\ \mathbf{vis}\ m;\ \mathbf{hid}\ h \cdot h {:\in} s \odot K;\ m := h\ ]|$   “$s \in S$ and $(\forall s{:}\ S \cdot s \odot K = M)$” → (3)
$\sqsubseteq\ |[\ \mathbf{vis}\ m;\ \mathbf{hid}\ h \cdot |[\ \mathbf{hid}\ k \cdot k {:\in} K;\ h := s \odot k\ ]|;\ m := h\ ]|$   “using atomicity-style reasoning” → (4)
$=\ |[\ \mathbf{vis}\ m;\ \mathbf{hid}\ h, k \cdot k {:\in} K;\ h := s \odot k;\ m := h\ ]|$   “rearrange scopes”
$=\ |[\ \mathbf{vis}\ m;\ \mathbf{hid}\ h, k \cdot k {:\in} K;\ h := s \odot k;\ m := s \odot k\ ]|$   “RT” → (5)
$=\ |[\ \mathbf{vis}\ m;\ \mathbf{hid}\ k \cdot k {:\in} K;\ m := s \odot k\ ]|$.   “h now superfluous” → (6)

11 The Key-Complete Condition is very strong, requiring as many potential keys as messages; yet it applies to the OTP.

12 This can be done by direct wp-calculation also [1]; we thank Lambert Meertens for suggesting it be done in the program algebra.

Our only assumption was the KCC at Step (3). ✷
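The content of the lemma can be illustrated by computing what an adversary could still believe about the secret after seeing the published message. This is a sketch under stated assumptions: the adversary is taken to know the sets S and K and the encryption function, and to consider every secret possible beforehand; the function names are hypothetical.

```python
def shadow_of_secret(secret_set, key_set, enc, message):
    """Secrets the adversary still considers possible after seeing the message."""
    return {s for s in secret_set for k in key_set if enc(s, k) == message}

def ignorance_preserved(secret_set, key_set, enc):
    """True when no published message ever rules out any secret."""
    everything = set(secret_set)
    return all(
        shadow_of_secret(secret_set, key_set, enc, enc(s, k)) == everything
        for s in secret_set
        for k in key_set
    )

def xor(s, k):
    return s ^ k

two_bit = range(4)   # hypothetical 2-bit secrets and keys

with_full_keys = ignorance_preserved(two_bit, two_bit, xor)
with_single_key = ignorance_preserved(two_bit, [0], xor)
```

When the keys range over all 2-bit strings (so that KCC holds) every message is consistent with every secret, which matches the refinement of skip; with only the zero key the message equals the secret and the shadow collapses to a single value.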

As a commentary on the proof steps we provide the following:

(1) This and the next step are classical, visible-only reasoning. (2) The choice from M was visible; now it is hidden. (3) From s’s membership of its type S and the KCC we thus know that s K and M are equal, whatever value the secret s might have, and so RT￿ applies. This is the crucial condition, and it does not depend on M directly — all that is required is that there be such an M. (4) To justify this step in detail we introduce atomicity briefly, to simplify the reasoning; then we remove it again. We have h: s K = [ hid∈ ￿k k: K ]; h: s K “introduced statement is skip” = |[ hid k · k:∈ K; h| : ∈s ￿K ] “adjust scopes” |[ hid k · k∈: K; h∈: ￿s K| ] “Prop. 2” =￿ |[ hid k · {{ k:∈ K; h:=∈ s￿k;}}k: | K ] “classical reasoning” = |[ hid k · k{{ : ∈K; h:= s ￿k; k: ∈K ]}} | “Prop. 3; Prop. 1” = |[ hid k · k:∈ K; h:= s￿k ] .∈ | “k local: discard final assignment” | · ∈ ￿ | (5) That (h:= s k; m:= h)=(h:= s k; m:= s k) cannot be shown by di- rect appeal to￿ classical reasoning,￿ simple as￿ it appears, because it mixes hidden (h, s, k) and visible (m) variables. Nevertheless, it is a simple ex- ample of the coercion reasoning from Sec. 9.2; in detail we have h:= s k; m:= h h:= s￿k; [K(h = s k)]; m:= h “Lem. 3” =￿ h:= s￿k; [K(h = s￿k)]; m:= s k “Lem. 2” = h:= s￿k; [K(h = s ￿k)]; m:= s￿k “associativity” [true￿]h:= s k; m￿:= s k ￿ “Lem. 4” =￿ h:= s k; m￿:= s k. ￿ “Lem. 3” ￿ ￿ Alternatively we could see it as an application of RT, in the spirit of the comment following Lem. 2.

(6) This is justified by the general equality

       |[ hid h · h:∈ E ]|  =  skip ,

    which can be proved by direct wp-reasoning. Note however that for visibles |[ vis v · v:∈ E ]| ≠ skip in general: let E be {h} for example.
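The Key-Complete Condition used at Step (3) can be checked mechanically for small types. The following is a minimal sketch, assuming bit-tuples as the concrete representation of secrets and keys; the names `key_complete` and `xor_bits` and the width `N = 3` are illustrative choices, not taken from the paper.

```python
from itertools import product

def xor_bits(s, k):
    """Bitwise exclusive-or of two equal-length bit tuples."""
    return tuple(a ^ b for a, b in zip(s, k))

def key_complete(secrets, keys, encrypt):
    """True if the image {encrypt(s, k) | k in keys} is the same set M for every secret s."""
    images = [frozenset(encrypt(s, k) for k in keys) for s in secrets]
    return all(image == images[0] for image in images)

N = 3  # illustrative width (an assumption)
bit_strings = list(product((0, 1), repeat=N))

# Exclusive-or with an arbitrary N-bit key: every secret has the whole of B^N as image.
kcc_xor = key_complete(bit_strings, bit_strings, xor_bits)

# Bitwise "and" fails the condition: the all-zero secret has a one-element image.
kcc_and = key_complete(
    bit_strings, bit_strings,
    lambda s, k: tuple(a & b for a, b in zip(s, k)),
)
```

Under these assumptions `kcc_xor` holds and `kcc_and` does not, which is why an "and"-style encryption could not be substituted at Step (3).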

11 Deriving the Oblivious Transfer Protocol

Rivest’s Oblivious Transfer Protocol is an example of ignorance preservation [10].¹³ From Rivest’s description (paraphrased) the specification is as below; we use ⊕ for exclusive-or throughout.

Specification  We assume that Alice has two messages m0, m1: B^N, where we write B = {0, 1} for the Booleans as bits. The protocol is to ensure that Bob will obtain m_c for his choice of c: B. But Alice is to have no idea which message Bob chose, and Bob is to learn nothing about Alice’s other message m_{1⊕c}.

Rivest’s solution introduces a trusted intermediary Ted, who is active only at the beginning of the protocol and, in particular, knows neither Alice’s messages m0,1 nor Bob’s choice c. He participates as follows:

Setup  Ted privately gives Alice two random N-bit strings r0, r1: B^N; and he flips a bit d, then giving both d and r_d privately to Bob. Ted is now done, and can go home.

Once Ted has gone, the protocol continues between Alice and Bob:

Request  Bob chooses privately a bit c: B; he wants to obtain m_c from Alice. He sends Alice the bit e = c ⊕ d.

Reply  Alice sends Bob the values f0, f1 = m0 ⊕ r_e, m1 ⊕ r_{1⊕e}.

Conclusion  Bob now computes m = f_c ⊕ r_d.

The whole protocol is summarised in Fig. 10.
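The Setup, Request, Reply and Conclusion steps can be sketched as an executable function. This is only an illustration of functional correctness, assuming bit-tuples for the N-bit strings; the function name `oblivious_transfer` and the exhaustive check over N = 2 are assumptions made for the example, and the randomness supplied by Ted is passed in as explicit arguments.

```python
from itertools import product

def xor_bits(a, b):
    """Bitwise exclusive-or of two equal-length bit tuples."""
    return tuple(x ^ y for x, y in zip(a, b))

def oblivious_transfer(m0, m1, c, r0, r1, d):
    """One run of the protocol; returns (e, f0, f1, m)."""
    r = (r0, r1)
    r_bob = r[d]                        # Setup: Ted gives d and r_d to Bob
    e = c ^ d                           # Request: Bob sends e to Alice
    f0 = xor_bits(m0, r[e])             # Reply: Alice sends f0 and f1 to Bob
    f1 = xor_bits(m1, r[1 ^ e])
    m = xor_bits((f0, f1)[c], r_bob)    # Conclusion: Bob decodes his chosen message
    return e, f0, f1, m

# Exhaustive check for N = 2: Bob always ends up with m_c.
strings = list(product((0, 1), repeat=2))
always_correct = all(
    oblivious_transfer(m0, m1, c, r0, r1, d)[3] == (m0, m1)[c]
    for m0, m1, r0, r1 in product(strings, repeat=4)
    for c, d in product((0, 1), repeat=2)
)
```

The check says nothing about what is leaked; it confirms only the classical property that m = m_c on termination.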

    var m0, m1: B^N; c: B ·                        -- Alice has strings m0,1; Bob wants m_c.
    |[ var r, r0, r1, f0, f1: B^N; d, e: B ·
       r0, r1:∈ B^N, B^N;                          -- Ted gives random r-strings to Alice.
       d:∈ B; r:= r_d;                             -- Ted gives random d, and r_d, to Bob.
       ···                                         -- Ted goes home.
       e:= c ⊕ d;                                  -- Bob sends request e to Alice.
       f0, f1:= m0 ⊕ r_e, m1 ⊕ r_{1⊕e};            -- Alice sends encrypted strings to Bob.
       m:= f_c ⊕ r ]|                              -- Bob decodes the answer.

The classical correctness of the protocol is easily established by backwards reasoning through the equalities

       m
    =  f_c ⊕ r                          “m:= f_c ⊕ r”
    =  (m_c ⊕ r_{c⊕e}) ⊕ r              “f_c:= m_c ⊕ r_{c⊕e} for c = 0, 1”
    =  m_c ⊕ r_{c⊕(c⊕d)} ⊕ r            “e:= c ⊕ d”
    =  m_c ⊕ r_d ⊕ r                    “c ⊕ c = 0”
    =  m_c ⊕ r_d ⊕ r_d                  “r:= r_d”
    =  m_c .                            “r_d ⊕ r_d = 0”

What we must address here, however, is the more interesting question of to what extent the values m0,1 and c might be “leaked”.

Fig. 10. Rivest’s Oblivious Transfer Protocol

13 Earlier [1] we derived Chaum’s Dining Cryptographers’ Protocol [17] instead.

Given that there are three principals, we have three potential points of view. We will take Bob’s for this example, and the declarations of Fig. 10 therefore become the context for our derivation:

    Globals:  hid m0, m1: B^N;                 -- Bob cannot see m0,1.
              vis c: B ·
                                                                        (11)
    Locals:   hid r0, r1: B^N;                 -- Bob cannot see r0,1.
              vis r, f0, f1: B^N; d, e: B ·

The types are unchanged but the visibility attributes are altered à la Bob.¹⁴

The significance of e.g. declaring m0 to be global while r0 is local is that local variables are exempted from refinement: a simple classical example in the global context var g is the refinement

    |[ var l1 · l1:= 1; g:= g + l1 ]|   ⊑   |[ var l2 · l2:= −1; g:= g − l2 ]| ,

where the two fragments are in refinement (indeed both are equal to g:= g+1) in spite of the fact that the lhs does not allow manipulation of l2 and the rhs does not implement l1:= 1. Thus the variables we make local are those in which we have no interest wrt refinement.

14 For comparison, we note that Alice’s point of view is

    vis m0, m1: B^N; hid c: B ·                               -- Alice cannot see c.
    vis r0, r1, f0, f1: B^N; e: B; hid r: B^N; d: B ·         -- Alice cannot see r, d.

It is not a simple inversion, since the local variables f0,1 and e, that is the messages passed between Alice and Bob, are visible to both principals.

Rather than prove directly that Rivest’s protocol meets some logical pre/post-style specification, instead

We use program algebra to manipulate a specification-as-program whose correctness is obvious.

In this case, the specification is just m:= m_c, and we think it is obvious that it reveals nothing about m_{1⊕c} to Bob: indeed that variable is not even mentioned.¹⁵ Under the declarations of (11), we will show it can be refined to the code of Fig. 10, which is sufficient because (following Mantel [18])

An implementation reached via ignorance-preserving refinement steps requires no further proof of ignorance-preservation.

An immediate benefit of such an approach is that we finesse issues like “if it is known that m0 = m1 and Bob asks for m0, then he has learned m1; isn’t that incorrect?” A pre/post-style direct proof of the implementation would require in the precondition an assertion that m0 and m1 were independent, were not related in any way, in order to conclude that nothing about m_{1⊕c} is revealed. But a refinement-algebraic approach does not require that: the specification itself makes obvious the role that independence plays in the protocol, and the derivation respects that automatically and implicitly due to its semantic soundness.

15 When a specification is not so simple, one can elucidate its properties with logical experiments, i.e. attempted proofs of properties one believes or hopes it has; our previous report [1] shows that for Chaum’s Dining Cryptographers [17].

We now give the derivation: in the style of Sec. 10, a commentary on the proof steps follows; in some cases arrows (→) in the left margin indicate a point of interest. Within Bob’s context (11), that is hid m0, m1: B^N; vis c: B, we reason

       m:= m_c                                                 “specification”

  =    skip; m:= m_c                                           “Identity”

→ =    |[ vis d, e: B · d:∈ B; e:= c ⊕ d ]|;                   “refine skip: visible-only reasoning (7)”
       m:= m_c

  =    |[ vis d, e: B · d:∈ B; e:= c ⊕ d;                      “adjust scopes; introduce skip again”
          skip; m:= m_c ]|

→ ⊑    |[ vis d, e: B · d:∈ B; e:= c ⊕ d;                      “encryption lemma (8)”
→         |[ vis f0, f1: B^N; hid r0, r1: B^N ·
→            r0, r1:∈ B^N, B^N;
→            f0, f1:= m0 ⊕ r_e, m1 ⊕ r_{1⊕e} ]|;
          m:= m_c ]|

  =    |[ vis d, e: B; f0, f1: B^N; hid r0, r1: B^N ·          “adjust scopes”
          d:∈ B; e:= c ⊕ d;
          r0, r1:∈ B^N, B^N;
          f0, f1:= m0 ⊕ r_e, m1 ⊕ r_{1⊕e};
          m:= m_c ]|

  =                                                            “introduce r: visible-only reasoning”
→      |[ vis d, e: B; f0, f1, r: B^N; hid r0, r1: B^N ·
          d:∈ B; e:= c ⊕ d;
          r0, r1:∈ B^N, B^N;
          f0, f1:= m0 ⊕ r_e, m1 ⊕ r_{1⊕e};
→         m:= m_c; r:= f_c ⊕ m ]|

→ =    |[ vis d, e: B; f0, f1, r: B^N; hid r0, r1: B^N ·       “classical reasoning (9)”
          d:∈ B; e:= c ⊕ d;
          r0, r1:∈ B^N, B^N;
          f0, f1:= m0 ⊕ r_e, m1 ⊕ r_{1⊕e};
→         m:= m_c; r:= r_d ]|

→ =    |[ vis d, e: B; f0, f1, r: B^N; hid r0, r1: B^N ·       “reordering (10)”
          r0, r1:∈ B^N, B^N;
          d:∈ B;
          r:= r_d; e:= c ⊕ d;
          f0, f1:= m0 ⊕ r_e, m1 ⊕ r_{1⊕e};
          m:= m_c ]|

→ =    |[ vis d, e: B; f0, f1, r: B^N; hid r0, r1: B^N ·       “classical reasoning (11)”
          r0, r1:∈ B^N, B^N;
          d:∈ B;
          r:= r_d; e:= c ⊕ d;
          f0, f1:= m0 ⊕ r_e, m1 ⊕ r_{1⊕e};
→         m:= f_c ⊕ r ]| ,

which concludes the derivation. Given our context (11), we have established that Rivest’s protocol reveals no more about the (hidden) variables m0, m1 than does the simple assignment m:= m_c that was our specification.
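The conclusion just stated can be illustrated, though not proved, by brute force over a small instance. The sketch below is an assumption-laden experiment rather than the paper's wp-logic: it fixes N = 2, enumerates every run, groups the final states by what one principal can see, and records which hidden values remain possible. Treating m as visible to Bob and hidden from Alice is an assumption, since neither (11) nor footnote 14 declares m; the names `run` and `possible_values` are hypothetical.

```python
from collections import defaultdict
from itertools import product

N = 2  # illustrative width (an assumption)
STRINGS = list(product((0, 1), repeat=N))

def xor_bits(a, b):
    return tuple(x ^ y for x, y in zip(a, b))

def run(m0, m1, c, r0, r1, d):
    """Final values of all protocol variables for one run."""
    r = (r0, r1)
    e = c ^ d
    f0, f1 = xor_bits(m0, r[e]), xor_bits(m1, r[1 ^ e])
    return dict(m0=m0, m1=m1, c=c, r0=r0, r1=r1, d=d, r=r[d], e=e,
                f0=f0, f1=f1, m=xor_bits((f0, f1)[c], r[d]))

def possible_values(visible, hidden_expr):
    """Map each observation of `visible` to the hidden values consistent with it."""
    table = defaultdict(set)
    for m0, m1, r0, r1 in product(STRINGS, repeat=4):
        for c, d in product((0, 1), repeat=2):
            state = run(m0, m1, c, r0, r1, d)
            view = tuple(state[name] for name in visible)
            table[view].add(hidden_expr(state))
    return table

BOB = ("c", "d", "r", "e", "f0", "f1", "m")            # assumed view of Bob
ALICE = ("m0", "m1", "r0", "r1", "e", "f0", "f1")      # assumed view of Alice

# What Bob can conclude about the message he did not choose.
bob_table = possible_values(BOB, lambda s: (s["m0"], s["m1"])[1 ^ s["c"]])
# What Alice can conclude about Bob's choice.
alice_table = possible_values(ALICE, lambda s: s["c"])

bob_learns_nothing = all(len(v) == len(STRINGS) for v in bob_table.values())
alice_learns_nothing = all(v == {0, 1} for v in alice_table.values())
```

For this instance every view of Bob leaves all 2^N values of the other message possible, and every view of Alice leaves both values of c possible.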

A technical point is that our manipulation of r (introduced “late” then moved “early”) illustrates a feature which seems to be typical: although r’s operational role in the protocol is early (it is used by Ted), its conceptual role is late: in order to manipulate the rhs of the assignment to it, we require facts established by the earlier part of the program. Once that is done, it can be moved into its proper operational place. Similarly, the statement m:= m_c, from our specification, was carried through the whole derivation until the program text lying before it was sufficiently developed to justify via RT an equals-for-equals substitution in its rhs.

The commentary on the derivation is as follows:

(7) The order in which the variables are introduced is not necessarily their final “execution order”; rather it is so that the right variables are in scope for subsequent steps. In this case our use of the encryption lemma refers to e, which thus must be introduced first.

(8) The visible variable (m) is the 2N-bit pair f0, f1; strictly speaking there should be a single variable f on which a subscript operation ·_i is defined, so that the syntax f_i is actually an expression involving two separate variables f and i. Similarly, the key (k) is the 2N-bit pair r0, r1; and the encryption operation (⊙) is exclusive-or (⊕) between 2N-bit strings, which satisfies KCC.

(9) The classical reasoning referred to shows that r_d = f_c ⊕ m at this point in the program, allowing an equals-for-equals substitution in the rhs of the assignment to r as described in Sec. 9.2. We could not introduce r:= r_d directly in the previous step because that reasoning would not have been visible-only: the expression r_d on the right refers to the hiddens r0,1. (See the comment (8) that f_i is an abbreviation; the same is true of r and m.)

(10) We have used the principle that (x:= E; y:= F) = (y:= F; x:= E), given the usual conditions that E contains no y and F no x, to shuffle the assignments around into their final order. (Recall (7) above for why they are out of order.) It holds even when some of the variables are hidden.

(11) The justification here is, as for (9), that classical reasoning (as in Fig. 10) establishes an equality at this point operationally, that m_c = f_c ⊕ r, and then RT applies.

We conclude this example with some general comments on this style of development. The specification m:= m_c of the OTP is unsatisfactory as a direct implementation only because the rhs mixes variables of Alice and Bob, two principals who are physically separated: we are following the convention that statements x_A:= E_A, where the A indicates “uses only variables located with Alice”, describe computations that Alice undertakes on her own; on the other hand, statements x_A:= E_B describe a message E_B constructed by Bob and sent to Alice, received by her in x_A. Thus the difficulty with direct use of the specification is that the expression m_c is neither an E_A nor an E_B, and so there is no single “place” in which it can be evaluated.

Thus we refine “only” in order to solve the locality problem (but open a Pandora’s Box in doing so). It could readily be avoided in a Bob’s-view implementation such as

    |[ vis m0′, m1′: B^N ·                -- Local variables of Bob.
       m0′, m1′:= m0, m1;                 -- Send both messages to Bob.            (12)
       m:= m′_c ]| ,                      -- Bob takes chosen message.

trivially a valid classical refinement and now locality-valid too, since both m′0,1 and c are Bob’s variables. But then it is just as trivially invalid for ignorance preservation since Bob can deduce m_{1⊕c} by examining m′_{1⊕c}. Fortunately the rules of ignorance refinement prevent us from making this mistake, since in general we have skip ⋢ |[ vis v · v:= h ]|, thus invalidating the very step that introduced the m′0,1 variables.
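The leak in (12) can be made concrete with a small enumeration. This is a sketch under assumptions: N = 2, Bob's view taken to be (c, m0′, m1′, m), and the helper name `naive_views` is hypothetical.

```python
from itertools import product

N = 2  # illustrative width (an assumption)
STRINGS = list(product((0, 1), repeat=N))

def naive_views():
    """For implementation (12): map Bob's view to the values of the
    unchosen message that are consistent with it."""
    views = {}
    for m0, m1 in product(STRINGS, repeat=2):
        for c in (0, 1):
            copy0, copy1 = m0, m1            # m0', m1' := m0, m1
            m = (copy0, copy1)[c]            # m := m'_c
            view = (c, copy0, copy1, m)
            views.setdefault(view, set()).add((m0, m1)[1 ^ c])
    return views

# Every view pins the unchosen message down to a single value.
unchosen_message_revealed = all(len(candidates) == 1
                                for candidates in naive_views().values())
```

Each of Bob's views leaves exactly one candidate for the unchosen message, which is the situation the specification m:= m_c never allows.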

12 Contributions, inspirations, comparisons and conclusions

Our contribution is to have altered the rules for refinement of sequential programs, just enough, so that ignorance of hidden variables is preserved: we can still derive correct protocols (Sec. 11), but can no longer mistakenly propose incorrect ones (Sec. 7.3).

Thus we allow Stepwise Refinement (Sec. 1) to be used for the development of security-sensitive sequential programs, avoiding the usual “paradoxes” that in the past have attended this (Sec. 7.3.2). But we must then adopt a pessimistic stance in which we treat even a “weak” adversary as if he were a “strong” one: refinement comes at a price.

We argued that this price is “non-negotiable” in the sense that any reasonable definition of refinement must pay it (Sec. 3); but it can be mitigated somewhat by indicating explicitly any portions of the program in which we are prepared to forego the ability to refine: declaring them “atomic” (Sec. 9.1) protects them from adversarial intrusion (good), but also disallows Stepwise Refinement within them (bad).

Part of the inspiration for our approach was work by Van der Meyden, Engelhardt and Moses who earlier treated the OTP, and Chaum’s Dining Cryptographers [17,1], via a refinement-calculus of knowledge and ignorance [19]. That approach (together with advice from them) is the direct inspiration for what is here.

Compared to the work of Halpern and O’Neill, who apply the Logic of Knowledge to secrecy [11] and anonymity [20], ours is a very restricted special case: we allow just one agent; our (v, h, H) model allows only h to vary in the Kripke model [14]; and our programs are not concurrent. What we add back, having specialised away so much, is reasoning in the wp-based assertional/sequential style, thus exploiting the specialisation to preserve traditional reasoning patterns where they can apply.

Comparison with non-interference [21] comes from regarding hidden variables as “high-security” and visible variables as “low-security”, and concentrating on program semantics rather than e.g. extra syntactic annotations: thus we take the extensional view [22] of non-interference where security properties are deduced directly from the semantics of a program [23, III-A]. Recent examples of this include elegant work by Leino et al. [24] and Sabelfeld et al. [25].

Again we have specialised severely: we do not consider lattices, nor loops (and thus possible divergence), nor concurrency, nor probability. However our “agenda” of Refinement, the Logic of Knowledge, and Program Algebra, has induced five interesting differences from the usual approaches:

(1) We do not prove “absolute” security of a program. Rather we show that it is no less secure than another; this is induced by our refinement agenda. After all, the OTP specification is not secure in the first place: it reveals one of Alice’s messages to Bob. (To attempt to prove the OTP implementation absolutely secure is therefore pointless.)

    Thus there is similarity of aims with delimited information release approaches. In this example borrowed from Sabelfeld and Myers [26, Average salary] we suppose two hidden variables declared hid h1, h2 whose sum we are allowed to release, but not their actual values. Our specification would be the “procedure-” or “method” call¹⁶

        (Avg)           |[ vis v; hid p1, p2 · p1, p2:= h1, h2; v:= p1 + p2 ]| ,

    that reveals that sum. The suggested attack is

        (Avg-attack)    |[ vis v; hid p1, p2 · p1, p2:= h1, h1; v:= p1 + p2 ]| ,

    which clearly reveals h1 by using the parameters in an unexpected way. Indeed we have (Avg) ⋢ (Avg-attack), so we do not allow this either.

    Thus our approach to delimited security is to write a specification in which the partial release of information is allowed: refinement then ensures that no more than that is released in any implementation.

(2) We concentrate on final- rather than initial hidden values. This is induced by the Kripke structure of the Logic of Knowledge approach (Sec. 4), which models what other states are possible “now” (rather than “then”).

    The usual approach relates instead to hidden initial values, so that h:= 0 would be secure and v:= h; h:∈ T insecure; for us just the opposite holds. Nevertheless, we could achieve the same effect by operating on a local hidden copy, thrown away at the end of the block. Thus |[ hid h′: {h} · h′:= 0 ]| is secure (for both interpretations), and |[ hid h′: {h} · v:= h′; h′:∈ T ]| is insecure.

(3) A direct comparison with non-interference considers the relational semantics R of a program over v, h: T; the refinement v:∈ T ⊑ v:∈ R.v.h then expresses absolute security for the rhs with respect to h’s initial (and final) value.

    We can then reason operationally, as follows. Fig. 2 shows that from initial v, h, H the possible outcomes on the left are (t, h, H) for all t: T, and that they are (t′, h, {h: H | t′ ∈ R.v.h}) on the right for all t′: R.v.h. For refinement from that initial state we must therefore have t′ ∈ R.v.h ⇒ t′ ∈ T, which is just type-correctness; but also

        t′ ∈ R.v.h  ⇒  H ⊆ {h: H | t′ ∈ R.v.h}                    (13)

    must hold, both conditions coming from the operational definition of refinement (Sec. 5.3). Since (13) constrains all initial states v, h, H, and all t′, we close it universally over those variables to give a formula which via predicate calculus and elementary set-theory reduces to the non-interference condition (∀v, h, h′: T · R.v.h = R.v.h′) as usually given for demonic programs [24,25].

(4) We insist on perfect recall. This is induced by our algebraic principles (Sec. 3.1), and thus we consider v:= h to have revealed h’s value at that point, no matter what follows. The usual semantic approach allows instead a subsequent v:= 0 to “erase” the information leak.

    Perfect recall is also a side-effect of concurrency [11], [23, IV-B], but has different causes. We are concerned with ignorance-preservation during program development; the concurrency-induces-perfect-recall problem occurs during program execution. We commented on the link between the two in our discussion of atomicity (Sec. 9.1).

    The “label creep” [23, II-E] caused by perfect recall, where the build-up of un-erasable leaks makes the program eventually useless, is mitigated because our knowledge of the current hidden values can decrease (via h:∈ T for example), even though knowledge of initial- (or even previous) values cannot.

(5) We do not require “low-view determinism” [23, IV-B]. This is induced by our explicit policy of retaining abstraction, and of determining exactly when we can “refine it away” and when we cannot. Roscoe and others instead require low-level behaviour to be deterministic [27].

16 Here we are simulating a function call Avg(h1, h2) with formal parameters p1, p2 and formal return value v. These three variables are local because we are not interested in them directly (they “sit on the stack”): we care only about their effect on what we know about h1, h2.
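The contrast between (Avg) and (Avg-attack) in item (1) can be illustrated by tabulating, for each visible result, the hidden values that could have produced it. This is a minimal sketch assuming a small integer domain for h1 and h2; the helper names `knowledge` and `h1_candidates` are hypothetical, and the comparison is an informal reading of ignorance rather than the formal refinement check.

```python
from collections import defaultdict
from itertools import product

def knowledge(program, domain):
    """Map each visible result v to the set of hidden pairs (h1, h2)
    that could have produced it."""
    table = defaultdict(set)
    for h1, h2 in product(domain, repeat=2):
        table[program(h1, h2)].add((h1, h2))
    return table

def avg(h1, h2):
    p1, p2 = h1, h2      # p1, p2 := h1, h2
    return p1 + p2       # v := p1 + p2

def avg_attack(h1, h2):
    p1, p2 = h1, h1      # p1, p2 := h1, h1
    return p1 + p2       # v := p1 + p2

DOMAIN = range(4)        # illustrative domain (an assumption)
spec = knowledge(avg, DOMAIN)
attack = knowledge(avg_attack, DOMAIN)

def h1_candidates(table, v):
    """Values of h1 still possible after observing v."""
    return {h1 for h1, _ in table[v]}

# The specification can leave h1 wide open; the attack always determines it.
spec_hides_h1 = h1_candidates(spec, 3) == set(DOMAIN)
attack_reveals_h1 = all(len(h1_candidates(attack, v)) == 1 for v in attack)
```

Observing the sum 3 under the specification leaves every value of h1 possible, whereas each result of the attack is consistent with only one.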

Our emphasis on refinement continues the spirit of Mantel’s work [28,18], but is different in its framework. Like Bossi et al. [29], Mantel takes an unwinding-style approach [21] in which the focus is on events and their possible occurrences, with relations between possible traces that express what can or cannot be deduced, about a trace that includes high-security events, based on projections of it from which the high-security events have been erased.

We focus instead primarily on variables’ values, and our interest in the traces is not intrinsic but rather is induced temporarily by the refinement principles and their consequences (Sec. 3). While traces feature in our underlying model (Sec. 4.1), we abstract almost completely from that (Sec. 4.2) into a variable (the Shadow variable H) which we can then “blend in” to a semi-conventional program-logic formulation of correctness: thus our case-study example (Sec. 11) does not refer to traces at all. Nevertheless other authors’ trace-style abstract criteria for refinement (i.e. preservation of properties), and ours, agree with each other and with the general literature of refinement and its issues [2–9,30].

We conclude that ignorance refinement is able to handle examples of simple design, at least — even though their significance may be far from simple. Because wp-logic for ignorance retains most structural features of classical wp, we expect that loops and their invariants, divergence, and concurrency via e.g. action systems [31] could all be feasible extensions.

Adding probability via modal “expectation transformers” [30] is a longer-term goal, but will require a satisfactory treatment of conditional probabilities (the probabilistic version of Shrink Shadow) in that context.

Acknowledgements

Thanks to Yoram Moses, Kai Engelhardt and Ron van der Meyden for the problem, the right approach and the historical background, and to Anindya Banerjee, Annabelle McIver, Jeff Sanders, Mike Spivey, Susan Stepney, members of IFIP WG2.1 and delegates to Dagstuhl Seminars 06191 (organisers Abrial, Glässer) and 06281 (Broy, Cousot, Misra, O’Hearn) for suggestions.

Michael Clarkson and Chenyi Zhang supplied useful references. Yoram Moses contributed material to Sec. 4.

Finally I would like to thank the referees of this journal version, and of the earlier report [1], for their very thorough advice.

References

[1] C. Morgan, The Shadow Knows: Refinement of ignorance in sequential programs, in: T. Uustalu (Ed.), Math Prog Construction, Vol. 4014 of LNCS, Springer, 2006, pp. 359–78.

[2] N. Wirth, Program development by stepwise refinement, Comm ACM 14 (4) (1971) 221–7. URL http://www.acm.org/classics/dec95/

[3] R.-J. Back, On the correctness of refinement steps in program development, Report A-1978-4, Dept Comp Sci, Univ Helsinki (1978).

[4] C. Hoare, An axiomatic basis for computer programming, Comm ACM 12 (10) (1969) 576–80, 583.

[5] E. Dijkstra, A Discipline of Programming, Prentice-Hall, 1976.

[6] C. Morgan, Programming from Specifications, 2nd Edition, Prentice-Hall, 1994, web.comlab.ox.ac.uk/oucl/publications/books/PfS/.

[7] R.-J. Back, J. von Wright, Refinement Calculus: A Systematic Introduction, Springer, 1998.

[8] C. Morgan, T. Vickers (Eds.), On the Refinement Calculus, FACIT Series in , Springer, Berlin, 1994.

[9] J. Jacob, Security specifications, in: IEEE Symposium on Security and Privacy, 1988, pp. 14–23.

[10] R. Rivest, Unconditionally secure commitment and oblivious transfer schemes using private channels and a trusted initialiser, Tech. rep., M.I.T., //theory.lcs.mit.edu/~rivest/Rivest-commitment.pdf (1999).

[11] J. Halpern, K. O’Neill, Secrecy in multiagent systems, in: Proc 15th IEEE Computer Security Foundations Workshop, 2002, pp. 32–46.

[12] J. Hintikka, Knowledge and Belief: an Introduction to the Logic of the Two Notions, Cornell University Press, 1962, available in a new edition, Hendricks and Symonds, Kings College Publications, 2005.

[13] J. Y. Halpern, Y. Moses, Knowledge and common knowledge in a distributed environment, Jnl ACM 37 (3) (1990) 549–87, earlier in Proc PoDC, 1984.

[14] R. Fagin, J. Halpern, Y. Moses, M. Vardi, Reasoning about Knowledge, MIT Press, 1995.

[15] M. Smyth, Power domains, Jnl Comp Sys Sci 16 (1978) 23–36.

[16] C. Morgan, Unpublished notes “060320 atomicity”, www.cse.unsw.edu.au/~carrollm/Notes (2006).

[17] D. Chaum, The Dining Cryptographers problem: Unconditional sender and recipient untraceability, J. Cryptol. 1 (1) (1988) 65–75.

[18] H. Mantel, Preserving information flow properties under refinement, in: Proc IEEE Symp Security and Privacy, 2001, pp. 78–91.

[19] K. Engelhardt, Y. Moses, R. van der Meyden, Unpublished report, Univ NSW (2005).

[20] J. Halpern, K. O’Neill, Anonymity and information hiding in multiagent systems, in: Proc 16th IEEE Computer Security Foundations Workshop, 2003, pp. 75–88.

[21] J. Goguen, J. Meseguer, Unwinding and inference control, in: Proc IEEE Symp on Security and Privacy, 1984, pp. 75–86.

[22] E. Cohen, Information transmission in sequential programs, ACM SIGOPS Operating Systems Review 11 (5) (1977) 133–9.

[23] A. Sabelfeld, A. Myers, Language-based information-flow security, IEEE Jnl Selected Areas Comm. 21 (1) (2003) 5–19.

[24] K. Leino, R. Joshi, A semantic approach to secure information flow, Science of Computer Programming 37 (1–3) (2000) 113–38.

[25] A. Sabelfeld, D. Sands, A PER model of secure information flow, Higher-Order and Symbolic Computation 14 (1) (2001) 59–91.

[26] A. Sabelfeld, A. Myers, A model for delimited information release, in: Proc ISSS, Vol. 3233 of LNCS, Springer, 2004, pp. 174–91.

[27] A. Roscoe, J. Woodcock, L. Wulf, Non-interference through determinism., Journal of Computer Security 4 (1) (1996) 27–54.

[28] H. Mantel, Possibilistic definitions of security — an assembly kit, in: Proc CSFW ’00, IEEE Computer Society, 2000, pp. 185–99.

[29] A. Bossi, R. Focardi, C. Piazza, S. Rossi, Refinement operators and information flow security, in: SEFM, IEEE, 2003, pp. 44–53.

[30] A. McIver, C. Morgan, Abstraction, Refinement and Proof for Probabilistic Systems, Tech Mono Comp Sci, Springer, New York, 2005. URL http://www.cse.unsw.edu.au/~carrollm/arp/

[31] R.-J. Back, R. Kurki-Suonio, Decentralisation of process nets with centralised control, in: 2nd ACM SIGACT-SIGOPS Symp PoDC, 1983, pp. 131–42.

A Proofs of certain lemmas (Sec. 9)

Proof of Lem. 2  When x is visible v, say, we have for any Ψ that

       wp.([Kφ]; v:∈ E).Ψ
    =  Kφ ⇒ (∀e: E · [⇓ e∈E][v\e]Ψ)                 “definitions”
    =  (∀e: E · Kφ ⇒ [⇓ e∈E][v\e]Ψ)                 “e fresh”
    =  (∀e: E · Kφ ⇒ [⇓ φ][⇓ e∈E][v\e]Ψ)            “Kφ ⇒ (Pψ ⇔ P(φ ∧ ψ))”
    =  (∀e: E · Kφ ⇒ [⇓ φ ∧ e∈E][v\e]Ψ)             “merge Shrink Shadows”
    =  (∀e: E · Kφ ⇒ [⇓ φ ∧ e∈F][v\e]Ψ)             “assumption”
    =  wp.([Kφ]; v:∈ F).Ψ .                          “symmetric argument”

When x is hidden h, say, we have for any Ψ that

       wp.([Kφ]; h:∈ E).Ψ
    =  Kφ ⇒ (∀h: E · [h⇐E]Ψ)                        “definitions”
    =  (∀h: E · Kφ ⇒ [h⇐E]Ψ)                        “h not free in Kφ”
    =  (∀h: E · Kφ ⇒ [h⇐F]Ψ)                        “Kφ ⇒ (Pψ ⇔ P(φ ∧ ψ))”
    =  wp.([Kφ]; h:∈ F).Ψ .                          “symmetric argument”

Proof of Lem. 4  The proof is by structural induction over programs; we give the base cases here, using Ψ for a general (non-classical) ignorant postcondition formula. For skip we have

       wp.(skip; [Kψ]).Ψ
    =  Kψ ⇒ Ψ                                        “(7)”
    ⇒  Kφ ⇒ Ψ                                        “assumption φ ⇒ wp.skip.ψ”
    =  wp.([Kφ]; skip).Ψ .

For Choose visible (of which Assign to visible is a special case), we have

       wp.(v:∈ E; [Kψ]).Ψ
    =  (∀e: E · [⇓ e∈E][v\e](Kψ ⇒ Ψ))               “Choose visible; (7)”
    =  (∀e: E · K(e∈E ⇒ [v\e]ψ) ⇒ [⇓ e∈E][v\e]Ψ)    “Fig. 7”
    ⇒  (∀e: E · Kφ ⇒ [⇓ e∈E][v\e]Ψ)                 “assumption φ ⇒ wp.(v:∈ E).ψ”
    =  Kφ ⇒ (∀e: E · [⇓ e∈E][v\e]Ψ)                 “e is fresh”
    =  wp.([Kφ]; v:∈ E).Ψ .                          “Choose visible; (7)”

For Choose hidden we have

       wp.(h:∈ E; [Kψ]).Ψ
    =  (∀h: E · [h⇐E](Kψ ⇒ Ψ))                      “Choose hidden; (7)”
    =  (∀h: E · K(∀h: E · ψ) ⇒ [h⇐E]Ψ)              “Fig. 7”
    ⇒  (∀h: E · Kφ ⇒ [h⇐E]Ψ)                        “assumption φ ⇒ wp.(h:∈ E).ψ”
    =  Kφ ⇒ (∀h: E · [h⇐E]Ψ)                        “e is fresh”
    =  wp.([Kφ]; h:∈ E).Ψ .                          “Choose hidden; (7)”

Proof of Lem. 5  The proof is by structural induction. Let S(x) be a program possibly containing references to a free variable x. We show for all ignorant postconditions Φ that

    ⊨  wp^v_x.S(x).Φ  ⇒  wp^h_x.S(x).Φ ,

where wp^v_x treats x as visible and wp^h_x treats it as hidden, working through the cases of Fig. 8.

Identity  Trivial.

Assign to visible/hidden  Special case of Choose visible/hidden.

Choose visible  If x occurs in E then [⇓ e∈E]PΨ will differ in the two cases: when x is visible, it will be free in [⇓ e∈E]PΨ for any PΨ occurring (positively) within Φ; when it is hidden, however, it will be captured by the P-modality. Examination of the logical interpretation (Sec. 6) shows however that the former implies the latter, and this implication will propagate unchanged to the whole of Φ, since PΨ occurs positively within it.

Choose hidden  By similar reasoning to Choose visible, different treatment of [h⇐E] can only result in an implication overall.

Choose x  We must compare (∀e: E · [⇓ e∈E][x\e] Φ), for when x is visible, with (∀x: E · [x⇐E] Φ) for when x is hidden. Because both [⇓ e∈E] and [x⇐E] have no effect on the classical part of Φ, we can concentrate on the case PΨ; thus we compare

    (∀e: E · ··· [⇓ e∈E][x\e]PΨ)    and    (∀x: E · ··· [x⇐E]PΨ)

which, from Fig. 7, is in fact a comparison between

    (∀e: E · ··· P(e∈E ∧ [x\e] Ψ))    and    ··· P(∃e: E · [x\e] Ψ) ,

where the (∀x: E ···) is dropped in the rhs because the body contains no free x, and we have introduced the renaming to make it more like the lhs. Sec. 6 now shows, again, that the lhs implies the rhs.

Demonic choice  Trivial (inductively).

Sequential composition  Trivial (inductively).

Conditional  This too is trivial, because if E ··· can be rewritten using a fresh visible as |[ vis b · b:= E; if b ··· ]| and it is then handled by the other cases.

Local variables  Trivial since the substitution simply distributes through in both cases.
