The Shadow Knows: Refinement and Security in Sequential Programs *
Science of Computer Programming, Volume 74, Issue 8, 1 June 2009, Pages 629-653
Special Issue on Mathematics of Program Construction (MPC 2006)

The Shadow Knows: Refinement and security in sequential programs

Carroll Morgan 1
Sch. Comp. Sci. & Eng., Uni. NSW, Sydney 2052, Australia

Abstract

Stepwise refinement is a crucial conceptual tool for system development, encouraging program construction via a number of separate correctness-preserving stages which ideally can be understood in isolation. A crucial conceptual component of security is an adversary's ignorance of concealed information. We suggest a novel method of combining these two ideas.

Our suggestion is based on a mathematical definition of "ignorance-preserving" refinement that extends classical refinement by limiting an adversary's access to concealed information: moving from specification to implementation should never increase that access. The novelty is the way we achieve this in the context of sequential programs. Specifically we give an operational model (and detailed justification for it), a basic sequential programming language and its operational semantics in that model, a "logic of ignorance" interpreted over the same model, then a program-logical semantics bringing those together; and finally we use the logic to establish, via refinement, the correctness of a real (though small) protocol: Rivest's Oblivious Transfer. A previous report treated Chaum's Dining Cryptographers similarly. In passing we solve the Refinement Paradox for sequential programs.

Key words: Security, privacy, Hoare logic, specification, implementation, logic of knowledge

* This is an extension of an earlier report [1].
Email address: [email protected] (Carroll Morgan).
1 We acknowledge the support of the Aust. Res. Council Grant DP034557.
Preprint submitted to SCP, 22 September 2007

1 Introduction

Stepwise refinement is an idealised process whereby program- or system development is "considered as a sequence of design decisions concerning the decomposition of tasks into subtasks and of data into data structures" [2]. We say "idealised" because, as is well known, in practice it is almost never fully achieved: the realities of vague and changing requirements, of efficiency etc. simply do not cooperate. Nevertheless as a principle to aspire to, a way of organising our thinking, its efficacy is universally recognised: it is an instance of separation of concerns.

A second impediment to refinement (beyond "reality" as above) is its scope. Refinement was originally formalised as a relation between sequential programs [3], based on a state-to-state operational model, with a corresponding logic of Hoare triples {Φ} S {Ψ} [4] or equivalently weakest preconditions wp.S.Ψ [5]. As a relation, it generates an algebra of (in-)equations between program fragments [6,7]. Thus a specification S1 is said to be refined by an implementation S2, written S1 ⊑ S2, just when S2 preserves all logically expressible properties of S1. The scope of the refinement is determined by that expressivity: originally limited to sequential programs [3], it has in the decades since been extended in many ways (e.g. concurrency, real-time and probability).

Security is our concern of scope here. The extension of refinement in this article is based on identifying and reasoning about an adversarial observer's "ignorance" of data we wish to keep secret, to be defined as his uncertainty about the parts of the program state he can't see. Thus we consider a program of known source-text, with its state partitioned into a "visible" part v and a "hidden" part h, and we ask

    From the initial and final values of visible v, what can an
    adversary deduce about hidden h?                                 (1)
For example, if the program is v:= 0, then what he can deduce afterwards about h is only what he knew beforehand; but if it is v:= h mod 2, then he has learned h's parity; and if it is v:= h then he has learned h's value exactly.

We assume initially (and uncontroversially) that the adversary has at least the abilities implied by (1) above, i.e. knowledge of v's values before/after execution and knowledge of the source code. We see below, however, that if we make certain reasonable, practical assumptions about refinement (which we shall justify) then surprisingly we are forced to assume as well that the adversary can see both program flow and visible variables' intermediate values: that is, there are increased adversarial capabilities wrt ignorance that accrue as a consequence of using refinement.² Thus refinement presents an increased security risk, a challenge for maintaining correctness: but it is so important, as a design tool, that the challenge is worth meeting.

The problem is in essence that classical refinement [3,6-8] is indeed insecure in the sense that it does not preserve ignorance [9]. If we assume v, h both to have type T, then "choose v from T" is refinable into "set v to h"; as such it is simply a reduction of demonic nondeterminism. But that refinement, which we write v:∈ T ⊑ v:= h, is called the "Refinement Paradox" (Sec. 7.3.2) precisely because it does not preserve ignorance: program v:∈ T tells us nothing about h, whereas v:= h tells us everything. To integrate refinement and security we must address this paradox at least.

Our first contribution is a collection of "refinement principles" that we claim any reasonable ignorance-refinement algebra [6,7] must adhere to if it is to be practical. Because basic principles are necessarily subjective, we give a detailed argument to support our belief that they are important (Sec. 2).
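The three example programs above can be contrasted in a small sketch (ours, not the paper's: the toy type T and the function knowledge are illustrative names). The adversary's deduction is modelled as the set of hidden values consistent with the final visible value he observes.

```python
# Illustrative sketch: the adversary's posterior knowledge of h is the set
# of hidden values consistent with the observed final value of v.
# (T, knowledge and the program names are our own, not the paper's notation.)

T = range(8)  # a small finite type shared by v and h

def knowledge(program, observed_v):
    """Hidden values h that could have produced the observed final v."""
    return {h for h in T if program(h) == observed_v}

prog_const  = lambda h: 0        # v:= 0
prog_parity = lambda h: h % 2    # v:= h mod 2
prog_leak   = lambda h: h        # v:= h

# Suppose the true hidden value is h = 5:
print(knowledge(prog_const,  prog_const(5)))   # all of T: nothing learned
print(knowledge(prog_parity, prog_parity(5)))  # the odd values: parity leaked
print(knowledge(prog_leak,   prog_leak(5)))    # {5}: h revealed exactly
```

Note how the first program leaves the knowledge set at all of T (full ignorance) while the last collapses it to a singleton: this gap is exactly what the Refinement Paradox fails to respect, since classically v:∈ T may still be refined to v:= h.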
Our second, and main, contribution is to realise those principles: our initial construction for doing so includes a description of the adversary (Sec. 3), a state-based model incorporating the adversary's capabilities (Sec. 4.2), a small programming language (Fig. 1) with its operational semantics over that model (Fig. 2) and a thorough treatment of indicative examples (Figs. 3-5). In support of our construction we give beforehand a detailed rationale for our design (Sec. 2), and an argument via abstraction that induces our model's features from that rationale by factoring it through Kripke structures (Sec. 4.1).

Based on our construction we then give a logic for our proposed model (Sec. 6), a programming logic based on that and the operational semantics (Secs. 7,8), and examples of derived algebraic techniques (Secs. 9,10).

Within the framework constructed as above we interpret refinement both operationally (Sec. 5.3) and logically (Secs. 7.2,8.1); the two interpretations are shown to agree (again Sec. 7.2) and to satisfy the Principles (Sec. 8.4).

Ignorance-preserving refinement should be of great utility for developing zero-knowledge- or security-sensitive protocols (at least); and our final contribution (Sec. 11) is an example, a detailed refinement-based development of Rivest's Oblivious Transfer Protocol [10].

² In the contrapositive, what we will argue is that if we integrated refinement and ignorance without admitting the adversary's increased capabilities, we would not get a reasonable theory.

2 Refinement Principles: desiderata for practicality

The general theme of the five refinement principles we now present is that local modifications of a program should require only local checking (plus if necessary accumulation of declarative context hierarchically): any wholesale search of the entire source code must be rejected out of hand.
RP0 Refinement is monotonic: If one program fragment is shown in isolation to refine another, then the refinement continues to hold in any context: robustness of local reasoning is of paramount importance for scaling up.

RP1 All classical "visible-variable only" refinements remain valid: It would be impractical to search the entire program (for hidden variables) in order to validate local reasoning (i.e. in which the hiddens elsewhere are not even mentioned).

RP2 All classical "structural" refinements remain valid: Associativity of sequential composition, distribution of code into branches of a conditional etc. are refinements (actually equalities) that do not depend on the actual code fragments affected: they are structurally valid, acting en bloc. It would be impractical to have to trawl through their interiors (including e.g. procedure calls) to validate such familiar rearrangements. (Indeed, the procedures' interiors might be unavailable due to modularisation.)

RP3 Some classical "explicit-hidden" refinements become invalid: This is necessary because, for example, the Refinement Paradox is just such a refinement. Case-by-case reasoning must be justified by a model and a compatible logic.

RT Referential transparency: If two expressions are equal (in a declarative context) then they may be exchanged (in that context) without affecting the meaning of the program. This is a crucial component of any mathematical theory with equality.

3 Description of the adversary: gedanken experiments

We initially assume minimal, "weak" capabilities (1) of our adversary, but show via gedanken experiments based on our Principles (Sec. 2) that if we allow refinement then we must assume the "strong" capabilities defined below.

3.1 The weak adversary in fact has perfect recall

We ask: Does program v:= h; v:= 0 reveal h to the weak adversary?
According to our principles above, it must; we reason

    (v:= h; v:= 0); v:∈ T
    =  v:= h; (v:= 0; v:∈ T)     "RP2: associativity"
    =  v:= h; v:∈ T              "RP1: visibles only"              (2)
    ⊑  v:= h; skip               "RP1: visibles only; RP0"
    =  v:= h,                    "RP2: skip is the identity"

whence we conclude that, since the implementation (v:= h) fails to conceal h, so must the specification (v:= h; v:= 0; v:∈ T) have failed to do so. Our model must therefore have perfect recall [11], because escape of h into v is not "erased" by the v-overwriting v:= 0.
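The conclusion of (2) can be made concrete with a small sketch (ours, under our own assumptions: the toy type, trace and the two knowledge functions are illustrative names, not the paper's model). An adversary with perfect recall conditions on the whole trace of visible values, not just the final one, so v:= h; v:= 0 leaks h even though v always finishes at 0.

```python
# Illustrative sketch of perfect recall, not the paper's formal semantics:
# compare what the adversary deduces from the final v alone with what he
# deduces from every intermediate visible value of  v:= h; v:= 0.

T = range(4)  # a toy finite type for v and h

def trace(h):
    """Visible values of v observed while running  v:= h; v:= 0."""
    return [h, 0]  # value of v after each assignment

def knowledge_final(observed_final):
    """Deduction from the final value of v only (no recall)."""
    return {h for h in T if trace(h)[-1] == observed_final}

def knowledge_recall(observed_trace):
    """Deduction with perfect recall of the whole visible trace."""
    return {h for h in T if trace(h) == observed_trace}

# Suppose the true hidden value is h = 3:
print(knowledge_final(0))        # all of T: the final v = 0 conceals h
print(knowledge_recall([3, 0]))  # {3}: the intermediate v reveals h exactly
```

Without recall the knowledge set stays at all of T, agreeing with the intuition that v:= 0 "erases" the leak; with recall it collapses to a singleton, which is why the operational model must retain intermediate visible values.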