2013 IEEE 26th Computer Security Foundations Symposium

Gradual Security Typing with References

Luminous Fennell, Peter Thiemann
Dept. of Computing Science, University of Freiburg, Freiburg, Germany
Email: {fennell, thiemann}@informatik.uni-freiburg.de

Abstract—Type systems for information-flow control (IFC) are often inflexible and too conservative. On the other hand, dynamic run-time monitoring of information flow is flexible and permissive, but it is difficult to guarantee robust behavior of a program. Gradual typing for IFC enables the programmer to choose between permissive dynamic checking and predictable, conservative static checking, where needed.

We propose ML-GS, a monomorphic ML core language with references and higher-order functions that implements gradual typing for IFC. This language contains security casts, which enable the programmer to transition back and forth between static and dynamic checking.

In particular, ML-GS enables non-trivial casts on reference types so that a reference can be safely used everywhere in a program regardless of whether it was created in a dynamically or statically checked part of the program. The reference can be shared between dynamically and statically checked parts.

We prove the soundness of the gradual security type system along with termination insensitive non-interference.

Keywords-gradual typing; security typing; ML; references

I. INTRODUCTION

Language-based analysis and control of information flow in software systems has been studied by numerous authors [13], [23]. Many studies propose type-based, static program analyses where the non-interference property is guaranteed for well-typed programs. Other studies concentrate on dynamic monitoring of program execution where potentially interfering behavior of a program is detected and prevented at run time. Hybrid approaches (e.g., [12], [22]) combine both kinds of analysis.

Static approaches are advantageous for the specification of security policies that are known up-front and where the program can be built to suit the analysis. The analysis guarantees that no security mismatches happen at run time. However, security policies are often formulated as an afterthought, when a large part of a system is already implemented, or they may evolve as the implications of a system design become understood. Unfortunately, a program that was not built with the static analysis in mind can be difficult to modify so that it is accepted by the analysis even though the program respects the desired security policies.

Dynamic approaches, on the other hand, enforce a safe approximation to the non-interference property. They do not give static guarantees, but are amenable to changes in the security policy without requiring a rewrite of the code.

The typing community studies a similar interplay between static and dynamic checking. Gradual typing [25], [26] is an approach where explicit cast operations mediate between coexisting statically and dynamically typed program fragments. The type of a cast operation reflects the outcome of the run-time test performed by the cast: The return type of the cast is the compile-time manifestation of the positive test; a negative outcome of the test causes a run-time exception.

Disney and Flanagan propose an integration of security types with dynamic monitoring via gradual typing [11]. Such an integration enables the stepwise introduction of checked security policies into a software system. The programmer inserts checks for these policies as run-time casts at strategic points in the code. The type system statically guarantees adherence to the policies “on the static side” of a cast, whereas the run-time system checks the policies “on the dynamic side”.

This procedure creates statically checked regions in a dynamically checked environment. These regions can be enlarged by rewriting code and moving casts to make programs more efficient and to avoid potential run-time errors.

Alternatively, a programmer may impose a static security typing discipline on a program and revert to dynamic checking by inserting casts demarcating the regions where the static checker fails. This approach leads to dynamically checked regions and the programmer should strive to restrict them to places where the static analysis would be too conservative or where the code contains language features not supported by the analysis. The programmer may also restrict debugging efforts that chase policy violations to the dynamically checked parts.

A. Contributions

Disney and Flanagan [11] consider a pure lambda calculus. Building on their ideas, we study a monomorphic ML core language with references. The consideration of references introduces significant complications, as references enable flow-sensitivity attacks [22]. The underlying type system is inspired by Pottier and Simonet’s work on a security type system for CoreML [21]. We extend this static system with casts and suitable run-time representations of security levels on the dynamic side of a cast. Our casts are very powerful because they are able to change the security

type of the content of a reference. This choice enables our calculus to safely perform casts between security types that are not related by the subtype relation induced by the ordering on security levels: A sound subtype relation must treat references invariantly. The extended calculus uses a no-sensitive-upgrade strategy (NSU, [3]) for information flow control (IFC) on the dynamic side and we prove termination insensitive non-interference for the combined system.

© 2013, Luminous Fennell. Under license to IEEE. DOI 10.1109/CSF.2013.22

B. Terminology

Security levels are drawn from a two-element lattice with elements $L$ for low-security, public information and $H$ for high-security, secret information. They are ordered by $L \sqsubset H$ and the operator $\sqcup$ denotes the least upper bound. Our results generalize to arbitrary discrete security lattices as outlined for CoreML [21]. In the remainder of the paper, we will use a capital “$B$” as a meta-variable for the security levels of values and a capital “$PC$” for the security context. A security context is the security level of the program counter at run time.

We write $\mathsf{int}^L$ for the type of low-security integers and $\mathsf{ref}^L\ \mathsf{int}^H$ for the type of low-security pointers that point to high-security integers in memory. The type $(\mathsf{int}^H \xrightarrow{L} \mathsf{int}^H)^L$ specifies a low-security function that takes a high-security argument and yields a high-security result. The annotation on the arrow restricts the effect of the function, i.e., the function allocating or modifying a memory cell, to cells of at least the level indicated by the annotation. The annotation $L$ means that all cells may be modified, whereas $H$ would restrict modification to high-security memory cells.

Upgrading a memory cell means to overwrite its low-security content with a high-security value. Because upgrades that occur in high-security contexts may leak confidential information through implicit flows, they have to be treated specially by dynamic IFC techniques [5]. Security type systems typically forbid upgrades altogether.

We will employ the no-sensitive-upgrade (NSU) strategy of Austin and Flanagan [3] as a dynamic IFC technique. In an NSU semantics, assignments fail if they would upgrade a memory cell under a high-security context.

II. MOTIVATION

The following examples illustrate the creation of dynamically checked regions in otherwise statically checked programs and vice versa. The motivation for the examples is to work around restrictions imposed by the underlying security type system, for example the dynamic, manual administering of access rights where the code checks a run-time representation of the accessor’s security level.

To start off, consider a report processing application that is statically verified against a standard security type system like that of FlowCaml [21]. It contains numerous security-typed procedures that process reports by reference without requiring any run-time checks. Here are two examples of such procedures with their type signatures:¹

    sendToManager  : ref Report^H →^H unit
    sendToFacebook : ref Report^L →^L unit

¹ Type signatures are simplified by omitting some L annotations. The annotation on the unit type does not matter because it cannot convey information in our setting.

The sendToManager function takes a confidential report and guarantees that it is not leaked to the public. The H on the arrow indicates the lowest security level that is modified. The sendToFacebook function takes a public report and publishes it to an open Internet message board, as indicated by the low-security effect →^L.

    (* Some privileged information *)
    let infoH : ref Report^H = ...

    (* Optionally enhance a report
     * with privileged information *)
    addPrivileged isPrivileged worker report =
      if isPrivileged
      then report := report + !infoH;
      worker report

Listing 1. Example function with manual security enforcement.

Listing 1 illustrates a proposed extension of the report processing application. It contains a utility function addPrivileged that adds privileged information to a report before passing it to a procedure like sendToManager or sendToFacebook. The flag isPrivileged indicates if the worker argument is sufficiently trusted to handle a privileged report. If isPrivileged is true, then the privileged information is retrieved through a global reference infoH and added to report. Otherwise, worker is called with an unmodified report.

Listing 1 type checks in FlowCaml and also the term

    addPrivileged true sendToManager

is accepted with type ref Report^H →^H unit, as sendToManager is safe for consuming high-security information. In contrast, the term

    addPrivileged false sendToFacebook

is rejected by the type checker, even though it is semantically safe. No privileged information is leaked, but the type checker does not track the correspondence between the security level of the worker and the isPrivileged flag.

Our proposal for gradual security typing allows us to embed a (security-) untyped code fragment in a typed setting. A cast expression asserts the required security type signature for the untyped code and the dynamic semantics guarantees adherence to the type by checking for violations of the type signature using dynamic IFC techniques. In our

type language, the annotation “?” stands for a single security level, unknown from the static point of view, for objects that will be dynamically checked. With this understanding, the utility function addPrivileged can be given the type

    bool^? →^? (ref Report^? →^? unit) →^? ref Report^? →^? unit

where all security levels (except the implicit L on the references) are checked at run time using dynamic IFC techniques, as advertised by the “?”. A programmer must insert casts to use the function addPrivileged with sendToManager and sendToFacebook as shown in the following safe code fragment.²

    let w = {ref Report^? →^? unit} sendToFacebook in
    {ref Report^L →^L unit}(addPrivileged false w)

² In a cast {t}e, the t is the target type of the cast and e is a term. In the formal system (Sec. III), a cast also mentions the source type.

The first cast {ref Report^? →^? unit} wraps the statically checked sendToFacebook function such that the resulting function w expects a report with a run-time tag that indicates its security level. The run-time system also passes a run-time value that indicates the lower bound on the effect of w. The cast operation checks that the run-time tag on the report is L and strips it off, conceptually speaking. It further tests that the allowed effect is L, as expected by sendToFacebook.

The cast {ref Report^L →^L unit} in this example applies to a function type and is thus delayed to the point where the function is applied. Then the argument is cast according to ref Report^L and the context is also cast to L so that all effects are allowed. This cast reifies static information by attaching or modifying run-time tags on values.³

³ The actual situation is more complicated because of our liberal handling of reference casts.

In contrast, the following code fragment shows an example where a mismatch of the isPrivileged flag with the actual security labels causes a run-time exception.

    {ref Report^L →^L unit}(addPrivileged true w)

In many cases, the code can be transformed to make it amenable to a FlowCaml-style type checker or to use simpler casts (i.e., no casts on reference types). To make the addPrivileged example work, two specialized copies of addPrivileged are needed, one where isPrivileged is true and another where it is false. It is one of our points that the use of gradual security typing often avoids such code duplication, thus improving software quality.

As another example, consider the program fragment in Listing 2, which declares two statically type checked procedures in lines 5 and 6. Suppose that they must share a single buffer buf because of resource constraints. A programmer who has to avoid information leaks between wL and wH interposes the function wrap to clear the buffer before passing it to a procedure. While the function wrap is not acceptable in a type system like that of FlowCaml, our gradual system accepts it with a suitable dynamic type.

     1  wrap f buf =
     2    buf := 0^L; f buf
     3
     4  let buf : ref Buf^L = new^L 0^L in
     5  let wL : ref Buf^L →^L unit = ... in
     6  let wH : ref Buf^H →^H unit = ... in
     7  ...
     8  let wLd = {ref Buf^? →^? unit} wL in
     9  let wHd = {ref Buf^? →^? unit} wH in
    10  ...
    11  wrap wLd buf
    12  wrap wHd buf
    13  ...

Listing 2. Example function with manual security enforcement.

Using the definitions of Listing 2 it is easily possible to provoke a security exception:

    let h : bool^H = ... in
    ...
    if h then wrap wLd buf

As the execution of wLd depends on the high-security value h, all low-security effects executed by wLd are potential security violations. Our dynamic semantics prevents such security leaks by aborting execution and issuing a security exception.

Although the error that we just provoked occurs during the execution of the typed body of wL, a gradually typed system has significant advantages over a purely dynamic one during development. Suppose that the unsafe code above is the result of a programming error that occurred during the integration of the shared buffer in the system. As wL and wH are fully typed, the developers can restrict their debugging efforts to the dynamic fragment of the system.

The examples up to now demonstrate uses of statically typed procedures in a dynamic context. The converse is also possible. Consider a procedure that formats a document (Listing 3). This procedure should work alike on high and low security documents without leaking. For that reason it is defined on a dynamic document type. Using appropriate casts, the dynamic formatter can be applied without affecting the static security guarantees for the documents. Both casts arising in lines 6 and 7 are reference casts that modify the pointer to dereference to a document with dynamic security level. If the formatting depends on further parameters of unclear security status, then these parameters could be made dynamic in the same way, thus leaving the detection of problems to testing.

    1  let format : ref Doc^? →^? unit = ...
    2
    3  let docL : ref Doc^L = ...
    4  let docH : ref Doc^H = ...
    5
    6  format ({ref Doc^?} docL);
    7  format ({ref Doc^?} docH);

Listing 3. Formatting low and high security documents.

III. AN ML CORE LANGUAGE WITH GRADUAL SECURITY

Fig. 1 defines the syntax of ML-GS, an ML core language with gradual security. Types are part of the expression syntax because they appear in casts. A type $t$ consists of a raw type $s$ and a type annotation $b$, which approximates a run-time security level. A type annotation is either a security level, $B$, or dynamic, $?$. The $?$ is a static security level that represents objects that will be dynamically checked. It is treated as a new top element in static checking such that $H \sqsubset\ ?$.

$$
\begin{array}{llcl}
\text{Security Levels} & B, PC & ::= & L \mid H \\
\text{Type Annot.} & b, pc & ::= & B \mid\ ? \\
\text{Raw Types} & s & ::= & \mathsf{int} \mid \mathsf{unit} \mid t \xrightarrow{pc} t \mid \mathsf{ref}\ t \\
\text{Types} & t & ::= & s^{b} \\
\text{Blame Id} & P & ::= & \text{(unique identifiers)} \\
\text{Blame Labels} & p, q & ::= & P \mid p, P \\
\text{Variables} & x & ::= & \text{(unique identifiers)} \\
\text{Constants} & k & ::= & 0 \mid 1 \mid 2 \mid \ldots \\
\text{Raw Values} & w & ::= & \lambda x.\, e \mid k \mid () \mid l^{(t,p)} \\
\text{Values} & v & ::= & w^{B} \\
\text{Expressions} & e & ::= & v \mid x \mid e\ e \mid \mathsf{new}^{t,B}\, e \mid\ !\,e \mid e := e \\
 & & \mid & \{t \Leftarrow t\}^{p}\, e \mid \mathsf{prot}^{B}\, e \mid \{[\,]^{b} \Leftarrow [\,]^{b}\}^{p}\, e \\
 & & \mid & \{\mathsf{ref}\ pc \Leftarrow pc\}^{p}\, e \mid \{* \xrightarrow{pc \Leftarrow pc} *\}^{p}\, e
\end{array}
$$

Figure 1. Syntactic domains of ML-GS.

A raw type is either the integer type $\mathsf{int}$, a function type $t \xrightarrow{pc} t'$, the unit type $\mathsf{unit}$, or a reference type $\mathsf{ref}\ t$. The program counter security level, $pc$, on a function type indicates the minimum permitted security level for the function’s effects. The significance of $pc$ for typing and execution of ML-GS programs is explained in Sec. IV and Sec. V.

A value is a raw value labeled with its run-time security level $B$, which starts off as the level of its originating principal and which can be checked during execution. Raw values are lambda abstractions $\lambda x.\, e$, integer constants $k$, unit values $()$, and pointers $l^{(t,p)}$. Pointers do not appear in source programs and are explained at the end of this section. For readability, we write $\lambda^{B} x.\, e$ instead of $(\lambda x.\, e)^{B}$. Let expressions and sequencing are defined as syntactic sugar: $\mathsf{let}\ x = e_1\ \mathsf{in}\ e_2$ desugars to $(\lambda^{L} x.\, e_2)\ e_1$ and $(e_1; e_2)$ to $\mathsf{let}\ z = e_1\ \mathsf{in}\ e_2$ where $z$ is a fresh variable.

Further expressions are function application $e_1\ e_2$, allocation and initialization of a new heap cell $\mathsf{new}^{t,B}\, e$ (where $B$ is the security level of the pointer and the type $t$ represents the initial type of the cell’s content), dereference of a pointer $!\,e$, and assignment of a new value to a cell $e := e$.

The expression $\{t' \Leftarrow t\}^{p}\, e$ casts the type of $e$ from source type $t$ to target type $t'$. Only the target type is needed for the run-time safety checks, but we include the source type to distinguish safe casts from unsafe ones (Sec. VI). A cast carries a blame label, a non-empty set $p$ of blame ids $P$. As in work on contracts [1], [29], a blame id corresponds to the location of a cast in the source code. If the cast leads to a run-time failure at a boundary between static and dynamic checking, then its blame id (i.e., source location) is part of the blame label reported by the failure. Initially, each cast carries its source location as a singleton blame label $P$. During execution, casts with joined blame labels arise. We write $pq$ for the join of two blame labels $p$ and $q$.

The remaining expressions are all introduced by reductions and do not occur in source programs. The protect expression $\mathsf{prot}^{B}\, e$ ensures dynamically that the security level of the result of $e$ is at least $B$ and that no effect occurs at a level less than $B$. Protect expressions from the literature often do not place a restriction on the effect [14].

The guard casts $\{[\,]^{pc} \Leftarrow [\,]^{pc'}\}^{p}\, e$, function guard casts $\{* \xrightarrow{pc \Leftarrow pc'} *\}^{p}\, e$, and pointer casts $\{\mathsf{ref}\ pc \Leftarrow pc'\}^{p}\, e$ are technical devices that ensure type safety during the reduction of casts. They are explained along with their typing and reduction rules in Sec. IV.

The raw pointer value $l^{(t,p)}$ consists of an address $l$, an access type $t$, and a blame label $p$. The access type $t$ is used for typing the dereference operation. Its annotations may be different from the current type of the value stored at address $l$. A reference cast changes the access type of a pointer without affecting the current type and a subsequent assignment through this pointer synchronizes the current type with the access type. The blame label $p$ tracks the blame ids of such casts. A dereference operation on a pointer will only succeed if its access type is at least as secure as the current type of the referenced memory cell. As a short-hand, we write $l^{(t,p),B}$ for $(l^{(t,p)})^{B}$.

IV. GRADUAL SECURITY TYPING

Most work on gradual typing is geared towards dynamic languages where casts serve to discover the concealed structure of run-time values and manifest it in the types. In our work, the erasure of security levels in types results in a simply typed program. Only the run-time security

properties of the values may be concealed in the types by using the dynamic annotation $?$. The system also supports standard security subtyping [14], [21] which allows low-security information to be implicitly promoted to a high security level and functions with high-security $pc$ to be implicitly weakened to functions with low-security $pc$. The dynamic annotation becomes the most general security level; it subsumes statically low- and high-security annotations.

$$
\begin{array}{llcl}
\text{Variable Env.} & \Gamma & ::= & \text{-} \mid \Gamma, x : t \\
\text{Address Env.} & M & ::= & \text{-} \mid M, l : t
\end{array}
$$

Typing judgment: $pc;\ \Gamma;\ M \vdash e : t$

$$
\text{T-Var}\ \dfrac{}{pc;\ \Gamma, x : t;\ M \vdash x : t}
\qquad
\text{T-Int}\ \dfrac{}{pc;\ \Gamma;\ M \vdash k^{B} : \mathsf{int}^{B}}
\qquad
\text{T-Unit}\ \dfrac{}{pc;\ \Gamma;\ M \vdash ()^{B} : \mathsf{unit}^{B}}
$$

$$
\text{T-Addr}\ \dfrac{t \sim t'}{pc;\ \Gamma;\ M, l : t \vdash l^{(t',p),B} : \mathsf{ref}^{B}\ t'}
\qquad
\text{T-Protect}\ \dfrac{pc \sqcup B;\ \Gamma;\ M \vdash e : s^{b}}{pc;\ \Gamma;\ M \vdash \mathsf{prot}^{B}\, e : s^{b \sqcup B}}
$$

$$
\text{T-Abs}\ \dfrac{pc';\ \Gamma, x : t;\ M \vdash e : t'}{pc;\ \Gamma;\ M \vdash \lambda^{B} x.\, e : (t \xrightarrow{pc'} t')^{B}}
$$

$$
\text{T-App}\ \dfrac{pc;\ \Gamma;\ M \vdash e_1 : (t \xrightarrow{pc \sqcup b} s^{b'})^{b} \qquad pc;\ \Gamma;\ M \vdash e_2 : t}{pc;\ \Gamma;\ M \vdash e_1\ e_2 : s^{b' \sqcup b}}
$$

$$
\text{T-New}\ \dfrac{pc;\ \Gamma;\ M \vdash e : s^{b} \qquad pc \sqsubseteq b}{pc;\ \Gamma;\ M \vdash \mathsf{new}^{s^{b},B}\, e : \mathsf{ref}^{B}\ s^{b}}
\qquad
\text{T-Deref}\ \dfrac{pc;\ \Gamma;\ M \vdash e : \mathsf{ref}^{b}\ s^{b'}}{pc;\ \Gamma;\ M \vdash\ !\,e : s^{b' \sqcup b}}
$$

$$
\text{T-Asgn}\ \dfrac{pc \sqcup b \sqsubseteq b' \qquad pc;\ \Gamma;\ M \vdash e_1 : \mathsf{ref}^{b}\ s^{b'} \qquad pc;\ \Gamma;\ M \vdash e_2 : s^{b'}}{pc;\ \Gamma;\ M \vdash e_1 := e_2 : \mathsf{unit}^{b'}}
$$

$$
\text{T-Sub}\ \dfrac{t \prec t' \qquad pc' \sqsubseteq pc \qquad pc;\ \Gamma;\ M \vdash e : t}{pc';\ \Gamma;\ M \vdash e : t'}
\qquad
\text{T-Cast}\ \dfrac{pc;\ \Gamma;\ M \vdash e : t \qquad t \sim t'}{pc;\ \Gamma;\ M \vdash \{t' \Leftarrow t\}^{p}\, e : t'}
$$

Figure 2. Typing rules.

Subtyping, $t \prec t'$:

$$
\dfrac{b \sqsubseteq b'}{\mathsf{int}^{b} \prec \mathsf{int}^{b'}}
\qquad
\dfrac{b \sqsubseteq b'}{\mathsf{unit}^{b} \prec \mathsf{unit}^{b'}}
\qquad
\dfrac{b \sqsubseteq b'}{\mathsf{ref}^{b}\ t \prec \mathsf{ref}^{b'}\ t}
$$

$$
\dfrac{b \sqsubseteq b' \qquad pc' \sqsubseteq pc \qquad t_1' \prec t_1 \qquad t_2 \prec t_2'}{(t_1 \xrightarrow{pc} t_2)^{b} \prec (t_1' \xrightarrow{pc'} t_2')^{b'}}
$$

Compatibility, $t \sim t'$:

$$
\dfrac{}{\mathsf{int}^{b} \sim \mathsf{int}^{b'}}
\qquad
\dfrac{}{\mathsf{unit}^{b} \sim \mathsf{unit}^{b'}}
\qquad
\dfrac{t \sim t'}{\mathsf{ref}^{b}\ t \sim \mathsf{ref}^{b'}\ t'}
\qquad
\dfrac{t_1 \sim t_1' \qquad t_2 \sim t_2'}{(t_1 \xrightarrow{pc} t_2)^{b} \sim (t_1' \xrightarrow{pc'} t_2')^{b'}}
$$

Figure 3. Subtyping and compatibility.

A type environment $\Gamma$ associates variables with types whereas an address environment $M$ provides types for heap addresses (see Fig. 2). Fig. 2 also defines the rules for the typing judgment $pc;\ \Gamma;\ M \vdash e : t$ for all source expressions (Fig. 8 contains rules for some special expressions that occur only at run time). It relates a $pc$, a type environment, an address environment and an expression with a type. The program counter security level $pc$ restricts the write effects of a typed expression $e$: The execution of $e$ may only modify memory that is at least as secure as $pc$.

The rules are based on the principles of secure information flow [14], [21]: the security level of the result of a computation must be greater than the levels of the arguments and the security level of information that escapes via side effects must not be less than the $pc$.

The typing rule for variables, T-Var, is standard. By rule T-Int, an integer constant has the underlying type $\mathsf{int}$ and a type annotation matching the value. Similarly, the unit value has a type annotation matching its security level. The rule T-Addr types a pointer $l^{(t',p),B}$ according to its security level $B$ and its access type $t'$. Furthermore, the address environment associates the address $l$ with a type $t$ that is compatible with $t'$. The compatibility requirement $t \sim t'$ forces $t$ and $t'$ to have the same structure, but with potentially different annotations (see Fig. 3).

The protect expression $\mathsf{prot}^{B}\, e$ ensures a minimum security level for its argument $e : s^{b}$ by joining its type’s level $b$ with $B$. In particular, if any of the annotations is dynamic, the result is also dynamic. The T-Protect rule also enforces a lower bound on the $pc$ of the protected expression $e$.

The unannotated part of rule T-Abs is standard. The

function type annotation is derived from the security level, $B$, of the abstraction. Furthermore, the arrow carries a program counter security level, $pc'$, indicating the maximum $pc$ under which the function can be called. As abstractions are values, the program counter security level $pc$, which the abstraction is typed under, is irrelevant.

The rule for function application T-App checks for a function type and a matching type for the argument as usual. The join of the function’s security annotation, $b$, and that of its result type, $b'$, is sufficient to protect the application’s result. The function’s $pc$ has to respect the join of $b$ and the current $pc$.

The rules for side-effecting expressions are mostly standard. The security level of the pointer in which information is written is protected by the $pc$ (rules T-New and T-Asgn). Reading from memory requires the result to subsume the security level of the pointer (rule T-Deref). The rule T-Asgn requires that the updated memory content is more secure than the current $pc$ and the pointer’s annotation.

Rule T-Sub enables standard security subtyping. The subtyping relation $\prec$ (see Fig. 3) lifts the ordering on type annotations to types, with contravariant function parameter types and invariant reference content types, as usual. The program counter security level on functions is contravariant with respect to the security lattice: If an expression can execute under a high-security $pc$, it can also execute in an unrestricted environment.

The rule T-Cast converts a value of type $t$ to a value of type $t'$ if these types are compatible. Compatibility is an equivalence relation that identifies types solely by their structure, ignoring security annotations; any two types that only differ in their security annotations can be cast into each other. In particular, compatibility subsumes the subtyping relation. For example, $\mathsf{int}^L \sim \mathsf{int}^H$ and $\mathsf{int}^H \sim \mathsf{int}^L$ and therefore $\{\mathsf{int}^L \Leftarrow \mathsf{int}^H\}^{p}\, e$ and $\{\mathsf{int}^H \Leftarrow \mathsf{int}^L\}^{p}\, e$ are both valid casts (although the former will raise an error at run time). The subtyping relation only holds for $\mathsf{int}^L \prec \mathsf{int}^H$. In contrast, the cast $\{\mathsf{int}^H \Leftarrow \mathsf{ref}^L\ \mathsf{int}^H\}^{p}\, e$ is invalid because references are always incompatible with integers.

The typing rules for guard casts and pointer casts are discussed in Sec. V in the context of cast reduction, where these expressions arise.

V. SEMANTICS

Before delving into the details of the formal semantics, we give an overview of the execution model. The operational semantics keeps track of security levels exactly as indicated in the syntax: each value carries its security level at run time and non-interference is checked according to the NSU policy all the time, regardless of the guarantees given by the type system.

For the fragment of the calculus that disallows casts on reference types, it would be possible to define an erasure semantics that executes the statically typed parts without the security level annotations at run time. In this fragment, casts would erase or add these annotations as appropriate.

However, reference casts that modify the type of the stored value complicate such an erasure transformation because a number of mutually compatible types may be associated with a single memory address. Directly after allocation, there is one pointer to a fresh address, and the current type and the access type of this address coincide. Next, a cast may change the access type without updating the value so that a subsequent dereference operation expects a differently typed value. At this point, run-time information is needed to check that the value is acceptable at the new access type. The operational semantics inserts a cast at this point.

If we consider this scenario in the context of a hypothetical erasure semantics, the problem comes up immediately. In particular, the initial allocation may happen in a fully statically checked setting so that the stored value carries no annotations. Then the reference content is cast to dynamic. (In our semantics, this cast only affects the type of the reference, but not the stored value. After all, there might be a static alias of the pointer.) The subsequent dereference operation needs to produce a fully annotated value, but the annotations are not available due to erasure.

The presence of references also makes it hard to give a clear boundary between statically and dynamically checked code. In principle, run-time errors due to security mismatches may happen at downcasts, e.g. from $?$ to $L$, and at assignments that fail the NSU check. While the assignments may be flagged according to the dynamic annotation of their pointer, the downcasts may not be present in the program from the start. As just discussed, they may arise from dereference operations where the semantics also inserts casts. Thus, any dereference operation that may dereference a cast pointer needs to be checked dynamically.

Thus, as a first approximation, we adopt a fully annotated execution model, knowing that the outcome of a significant fraction of the run-time checks is statically determined by the type system.

$$
\begin{array}{llcl}
\text{Ev. Ctx} & E & ::= & [\,]\ e \mid v\ [\,] \mid \{t \Leftarrow t\}^{p}\, [\,] \mid \mathsf{new}^{t,B}\, [\,] \mid [\,] := e \mid v := [\,] \mid\ !\,[\,] \\
 & & \mid & \{* \xrightarrow{pc \Leftarrow pc} *\}^{p}\, [\,] \mid \{\mathsf{ref}\ pc \Leftarrow pc\}^{p}\, [\,] \\
\text{Heaps} & \mu & ::= & \text{-} \mid \mu, (l \mapsto \{t\}^{p}\, v) \\
\text{Results} & r & ::= & e\ /\ \mu \mid\ \Uparrow p
\end{array}
$$

Figure 4. Semantic domains of ML-GS.

A. Configurations

The operational semantics of ML-GS is defined by a reduction relation $e\ /\ \mu \xrightarrow{PC} r$ between configurations $e\ /\ \mu$, which are pairs of an expression and a heap, and results $r$, which are either configurations or blame exceptions, indexed

by a $PC$, which indicates the current security context. It is the dynamic counterpart to the program counter security level $pc$ in the typing rules. The heap $\mu$ maps an address $l$ to a combination of a value, a blame label, and a type, written $\{t\}^{p}\, v$. Here, $t$ is the current type of the value $v$ stored in the cell. The blame label $p$ comes into play when the cell is dereferenced with an access type which is not a subtype of the current type. This subtype relation is checked with a cast labeled with $p$.

A blame exception $\Uparrow p$ flags the violation of a typing assumption. The blame label $p$ contains the blame id of the responsible cast.

B. Lambda calculus fragment

Figure 4 formally defines results, heaps, and evaluation contexts $E$ that guide the search for redexes in a standard call-by-value evaluation step.

Reduction relation: $e\ /\ \mu \xrightarrow{PC} r$

$$
\text{R-Ctxt}\ \dfrac{e\ /\ \mu \xrightarrow{PC} e'\ /\ \mu'}{E[e]\ /\ \mu \xrightarrow{PC} E[e']\ /\ \mu'}
\qquad
\text{R-App}\ \dfrac{}{(\lambda^{B} x.\, e)\ v\ /\ \mu \xrightarrow{PC} \mathsf{prot}^{B}(e[x \mapsto v])\ /\ \mu}
$$

$$
\text{R-Protect}\ \dfrac{}{\mathsf{prot}^{B}\, w^{B'}\ /\ \mu \xrightarrow{PC} w^{B' \sqcup B}\ /\ \mu}
\qquad
\text{R-Ctxt-Protect}\ \dfrac{e\ /\ \mu \xrightarrow{PC \sqcup B} e'\ /\ \mu'}{\mathsf{prot}^{B}\, e\ /\ \mu \xrightarrow{PC} \mathsf{prot}^{B}\, e'\ /\ \mu'}
$$

$$
\text{R-New}\ \dfrac{l \notin \mathrm{dom}(\mu) \qquad p\ \text{fresh} \qquad \mu' = \mu, (l \mapsto \{t\}^{p}\, w^{B' \sqcup PC})}{\mathsf{new}^{t,B}\, w^{B'}\ /\ \mu \xrightarrow{PC} l^{(t,p),B}\ /\ \mu'}
$$

$$
\text{R-Deref}\ \dfrac{\mu = \mu', (l \mapsto \{t_2\}^{p}\, v)}{!\,l^{(t_2',q),B}\ /\ \mu \xrightarrow{PC} \mathsf{prot}^{B}\, \{t_2' \Leftarrow t_2\}^{pq}\, v\ /\ \mu}
$$

$$
\text{R-Asgn}\ \dfrac{PC \sqcup B \sqsubseteq B_1 \qquad \mu = \mu'', (l \mapsto \{s_1^{b_1}\}^{p}\, w_1^{B_1}) \qquad \mu' = \mu'', (l \mapsto \{s^{b_2 \sqcup PC \sqcup B}\}^{pq}\, w^{B_2 \sqcup PC \sqcup B})}{l^{(s^{b_2},q),B} := w^{B_2}\ /\ \mu \xrightarrow{PC} ()^{PC}\ /\ \mu'}
$$

Figure 5. Semantics.

Figures 5, 6, and 9 contain the reduction rules of ML-GS. The rules in Fig. 5 cover non-cast expressions. The context rule R-Ctxt is standard. The application of a lambda abstraction to a value (rule R-App) is reduced to the protected function body with the argument value substituted for the formal parameter, using standard, capture-free substitution. The resulting $\mathsf{prot}^{B}$ expression reduces according to R-Protect as soon as its argument becomes a value. If $\mathsf{prot}^{B}$ appears in the context of a reduction, rule R-Ctxt-Protect raises the $pc$ of the execution.

In the interplay of the rules R-Protect and R-App, the context $\mathsf{prot}^{B}$ reflects the security constraint of the T-App typing rule for the result of the application ($s^{b' \sqcup b}$, cf. Fig. 2). In a typed, cast-free program, the $\mathsf{prot}^{B}$ context could be omitted because the typing rules ensure the necessary protection of the result. In ML-GS, $\mathsf{prot}^{B}$ is needed in a dynamic subcomputation which is not restricted by typing.

C. References

The rules R-New, R-Deref, and R-Asgn allocate a new memory cell, dereference a pointer, and assign to a pointer. Additionally they pursue two further objectives: On the one hand they implement dynamic information flow control by checking the value annotations against the current dynamic $pc$. On the other hand, they manipulate the current types of heap values to discover inconsistencies that may be introduced by reference casts. In the following, we discuss both aspects in turn.

The constraints on security levels in the reduction rules reflect the constraints on type annotations enforced by the respective typing rules. Allocating a value protects it with the current $pc$. A value that is read from the heap needs to be protected with the pointer’s security level.

A heap update (rule R-Asgn) has to pass the NSU check: the security level of the value on the heap that is updated, $B_1$, has to subsume that of the security context, $PC$, and that of the pointer, $B$. The updated heap stores the raw value $w$ with a security level that is sufficiently high to respect the context $PC$ and $B$. The cast failure rules, which are described in Sec. V-, cover the case when the update value is more secure than the original one and the NSU check fails.

Next we turn to the tracking and checking of reference casts. For a newly allocated reference, the access type and the current type coincide. At this point, the initial blame label $p$ does not matter because it is not associated with a type change. A cast may change the access type (leaving its blame label at the modified pointer) and an assignment may change the current type. In rule R-Deref, the access type, $t_2'$, of the pointer, which is used in typing the pointer, may differ from the current type $t_2$ of the stored value $v$. Thus, $v$ is retrieved from the heap and cast from its current type $t_2$ to the expected type $t_2'$. The blame labels on access and current type are joined on this cast, because either a preceding assignment or a preceding cast may have caused the mismatch between the access type and the current type. The R-Asgn rule allows updates that change the type annotations of the stored value. It implements NSU and refuses to overwrite a

value in a high context. If the assignment succeeds, then the blame labels are joined because either the current update or a preceding one may be responsible for a later mismatch.

Judgment: $e / \mu \xrightarrow{PC} r$ (continued)

\[ \textsc{R-Ctxt-Fail}\quad \frac{e / \mu \xrightarrow{PC} \Uparrow p}{E[e] / \mu \xrightarrow{PC} \Uparrow p} \qquad \textsc{R-Ctxt-Protect-Fail}\quad \frac{e / \mu \xrightarrow{PC \sqcup B} \Uparrow p}{\mathsf{prot}^{B} e / \mu \xrightarrow{PC} \Uparrow p} \]

\[ \textsc{R-Cast-Sub-Fail}\quad \frac{B \not\sqsubseteq b_1}{\{s_1^{b_1} \Leftarrow s_2^{b_2}\}^{p}\, w^{B} / \mu \xrightarrow{PC} \Uparrow p} \]

\[ \textsc{R-Asgn-NSU-Fail}\quad \frac{PC \sqcup B \not\sqsubseteq B' \qquad \mu = \mu', (l \mapsto \{t_2\}^{p}\, w^{B'})}{l^{(t_2',q),B} := v / \mu \xrightarrow{PC} \Uparrow pq} \]

\[ \textsc{R-Cast-Sub}\quad \frac{w_2 \xrightarrow{s_1 \Leftarrow s_2}_{p} w_1 \qquad B \sqsubseteq B_1}{\{s_1^{B_1} \Leftarrow s_2^{b_2}\}^{p}\, w_2^{B} / \mu \xrightarrow{PC} w_1^{B \sqcup B_1} / \mu} \]

\[ \textsc{R-Cast-To-Dyn}\quad \frac{w_2 \xrightarrow{s_1 \Leftarrow s_2}_{p} w_1}{\{s_1^{?} \Leftarrow s_2^{b_2}\}^{p}\, w_2^{B} / \mu \xrightarrow{PC} w_1^{B} / \mu} \]

Figure 6. Semantics: Casts, failure, and propagation.

D. Casts

The reduction rules in Fig. 6 define failed reductions and the reduction of successful casts. The rules R-Ctxt-Fail and R-Ctxt-Protect-Fail propagate security violations through evaluation contexts all the way to the top level, thus assuring that program execution stops.

Failures originate from attempts to expose secure information by downgrading the security level of a value directly or by overwriting a low security value under a high pc. Rule R-Cast-Sub-Fail covers the former case by detecting the mismatch of the cast's target type and the run-time security level on its value argument. Rule R-Asgn-NSU-Fail avoids illegal upgrades on the heap. This rule enforces the NSU semantics [3]. It checks the current pc against the security level of the memory cell's current content that is to be overwritten. The blame labels are joined.

A cast can be successfully reduced if the top-level annotation of its target type admits the run-time security level of its argument value (rules R-Cast-Sub and R-Cast-To-Dyn). Additionally there may be sub-casts to consider, which are propagated to the sub-components of the result value. Figure 7 defines the cast propagation $w \xrightarrow{s' \Leftarrow s}_{p} w'$. It rewrites a raw value $w$ that has raw type $s$ to a raw value $w'$ such that it admits raw type $s'$. The blame label $p$ indicates the cast that initiated the propagation (cf. rules R-Cast-Sub and R-Cast-To-Dyn).

Nothing happens to an integer constant or a unit value. To reflect the cast of a reference's content type from $t_1$ to $t_1'$, the propagation updates the access type of a pointer accordingly. The most interesting case of cast propagation is the function cast, where a cast lambda abstraction is rewritten to admit the desired target function type. The handling of conversions for parameter and result types is analogous to that employed by other systems with gradual typing [29]: The body is wrapped in an abstraction and applied to the argument after casting the target argument type to the source argument type. The result of the application is then cast to the target result type. As a minor adjustment for the current security setting, the inner lambda abstraction obtains security level $L$ such that it can be typed under any pc.

It remains to consider the conversion of the program counter security level of the abstraction's body $e$. The target $pc'$ may not admit $e$, as it was typed under an unrelated $pc$. The cast propagation rule for functions of Fig. 7 maintains type safety in such a situation by replacing the original body $e$ with the guard cast expression $\{[\,]pc' \Leftarrow [\,]pc\}^{p}\, e$.

Example 1. To illustrate the purpose of guard casts consider the following well-typed cast:

\[ \{(s^{L} \xrightarrow{H} \mathsf{unit}^{L})^{L} \Leftarrow (s^{L} \xrightarrow{L} \mathsf{unit}^{L})^{L}\}^{p}\; (\lambda^{L} x.\; l^{(s^{L},q),L} := x) \]

By the rules of Fig. 3, the types $(s^{L} \xrightarrow{H} \mathsf{unit}^{L})^{L}$ and $(s^{L} \xrightarrow{L} \mathsf{unit}^{L})^{L}$ are compatible. However, the function's body $l^{(s^{L},q),L} := x$ is a low-security assignment and cannot be typed under the high-security target pc. In this example type safety under the high-security pc can be restored by modifying the assignment in the body to have a high security write effect: $l^{(s^{H},q),L} := x$. A guard cast performs such type preserving adjustments (see footnote 4).

The guard cast related typing rules, given in Fig. 8, are designed to ensure type preservation until the guard cast can perform the adjustments. Rule T-GuardCast converts freely between pcs in a typing derivation. It allows the cast propagation rule for functions in Fig. 7 to preserve typing under the target pc. Rule T-FunCast casts the pc of a function like a regular cast, but without having to consider parameter and result types. The rule for pointer casts, T-RefCast, converts the top-level type annotation of the content of a reference. The additional constraint $pc_2 \sqsubseteq b$ is a technical restriction that helps to maintain an invariant needed in a preservation proof (Sec. VI).

Footnote 4. In principle such adjustments to function bodies could be performed by a meta function in the rule for function propagation. To streamline the proofs, we choose to integrate the adjustments into the reduction relation.

Fig. 9 shows the interesting cases for reductions related to guard casts. Fig. 8 contains the corresponding typing

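Cast propagation: $w \xrightarrow{s' \Leftarrow s}_{p} w'$

\[ k \xrightarrow{\mathsf{int} \Leftarrow \mathsf{int}}_{p} k \qquad () \xrightarrow{\mathsf{unit} \Leftarrow \mathsf{unit}}_{p} () \qquad l^{(t_1,q)} \xrightarrow{\mathsf{ref}\, t_1' \Leftarrow \mathsf{ref}\, t_1}_{p} l^{(t_1',q)} \]

\[ \lambda x.\, e \xrightarrow{\,t_1' \xrightarrow{pc'} t_2' \;\Leftarrow\; t_1 \xrightarrow{pc} t_2\,}_{p} \lambda x.\, \{t_2' \Leftarrow t_2\}^{p}\, \bigl( (\lambda^{L} x.\, \{[\,]pc' \Leftarrow [\,]pc\}^{p}\, e)\; (\{t_1 \Leftarrow t_1'\}^{p}\, x) \bigr) \]

Figure 7. Semantics: Cast propagation.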
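The function case of Figure 7 can be illustrated with a small Python sketch. It is not the paper's implementation: the names `Ty`, `Fun`, `Val`, `cast`, and `propagate` are hypothetical, only the two levels L and H (plus the dynamic annotation `?`) are modeled, pointers are left out, and the guard cast on the function body (the pc conversion) is deliberately ignored. What remains is the wrapping of a function so that its argument is cast from the target to the source argument type and its result from the source to the target result type, together with the level checks of R-Cast-Sub, R-Cast-Sub-Fail, and R-Cast-To-Dyn.

```python
from dataclasses import dataclass

L, H, DYN = "L", "H", "?"          # two-point lattice plus the dynamic annotation


def leq(a, b):
    """Lattice order: L is below H."""
    return a == b or (a == L and b == H)


def join(a, b):
    return H if H in (a, b) else L


class Blame(Exception):
    """Blame exception carrying a blame label (a set of blame ids)."""

    def __init__(self, label):
        super().__init__(f"blame {sorted(label)}")
        self.label = frozenset(label)


@dataclass(frozen=True)
class Fun:                          # raw function type t1 -> t2 (pc omitted)
    arg: "Ty"
    res: "Ty"


@dataclass(frozen=True)
class Ty:                           # annotated type s^b
    raw: object                     # "int", "unit", or a Fun
    level: str


@dataclass(frozen=True)
class Val:                          # annotated value w^B
    raw: object
    level: str


def propagate(target_raw, source_raw, p, w):
    """Cast propagation on raw values (function case of Fig. 7, no guard cast)."""
    if isinstance(target_raw, Fun):
        def wrapped(x):
            # cast the argument from the target to the source argument type
            arg = cast(source_raw.arg, target_raw.arg, p, x)
            # cast the result from the source to the target result type
            return cast(target_raw.res, source_raw.res, p, w(arg))
        return wrapped
    return w                        # integers and unit are left unchanged


def cast(target, source, p, v):
    """Top-level check of a cast {target <= source}^p applied to v."""
    if target.level == DYN:         # R-Cast-To-Dyn: keep the run-time level
        return Val(propagate(target.raw, source.raw, p, v.raw), v.level)
    if not leq(v.level, target.level):
        raise Blame(p)              # R-Cast-Sub-Fail
    return Val(propagate(target.raw, source.raw, p, v.raw),
               join(v.level, target.level))  # R-Cast-Sub


# Usage: cast a low-security increment function to a high-security function type.
inc = Val(lambda x: Val(x.raw + 1, x.level), L)
low_fun = Ty(Fun(Ty("int", L), Ty("int", L)), L)
high_fun = Ty(Fun(Ty("int", H), Ty("int", H)), L)
g = cast(high_fun, low_fun, {"p"}, inc)
```

Applying `g.raw` to a low-security integer succeeds and returns a high-security result, whereas applying it to a high-security integer raises `Blame` with label `{"p"}`, because the wrapped function first casts its argument back to the low-security source argument type.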

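\[ \textsc{R-GuardCast-New}\quad \{[\,]pc' \Leftarrow [\,]pc\}^{p}\, \mathsf{new}^{s^{b},B}\, e / \mu \xrightarrow{PC} \mathsf{new}^{s^{b \sqcup pc'},B}\, \{[\,]pc' \Leftarrow [\,]pc\}^{p}\, e / \mu \]

\[ \textsc{R-GuardCast-Asgn}\quad \{[\,]pc' \Leftarrow [\,]pc\}^{p}\, (e_1 := e_2) / \mu \xrightarrow{PC} (\{\mathsf{ref}\; pc' \Leftarrow pc\}^{p}\, \{[\,]pc' \Leftarrow [\,]pc\}^{p}\, e_1) := (\{[\,]pc' \Leftarrow [\,]pc\}^{p}\, e_2) / \mu \]

\[ \textsc{R-GuardCast-App}\quad \{[\,]pc' \Leftarrow [\,]pc\}^{p}\, (e_1\; e_2) / \mu \xrightarrow{PC} (\{* \xrightarrow{pc' \Leftarrow pc} *\}^{p}\, \{[\,]pc' \Leftarrow [\,]pc\}^{p}\, e_1)\; (\{[\,]pc' \Leftarrow [\,]pc\}^{p}\, e_2) / \mu \]

\[ \textsc{R-GuardCast-GuardCast}\quad \{[\,]pc_1 \Leftarrow [\,]pc_2\}^{p}\, \{[\,]pc_1' \Leftarrow [\,]pc_2'\}^{q}\, e / \mu \xrightarrow{PC} \{[\,]pc_1 \Leftarrow [\,]pc_2'\}^{pq}\, e / \mu \]

\[ \textsc{R-FunCast}\quad \{* \xrightarrow{pc_1 \Leftarrow pc_2} *\}^{p}\, \lambda^{B} x.\, e / \mu \xrightarrow{PC} \lambda^{B} x.\, \{[\,]pc_1 \Leftarrow [\,]pc_2\}^{p}\, e / \mu \]

\[ \textsc{R-RefCast}\quad \{\mathsf{ref}\; pc_1 \Leftarrow pc_2\}^{p}\, l^{(s^{b},q),B} / \mu \xrightarrow{PC} l^{(s^{b \sqcup pc_1},pq),B} / \mu \]

Figure 9. Reduction rules for guard casts (interesting cases).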
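As a rough illustration of how the rules of Figure 9 push a guard cast into an expression, the following Python sketch performs one such step on a tuple-encoded syntax tree. The encoding is an assumption made for this example only: `("guard", tgt, src, p, e)` stands for $\{[\,]tgt \Leftarrow [\,]src\}^{p}\, e$, `("new", b, e)` keeps only the content level $b$ of the allocation type, and `"refcast"` and `"funcast"` stand for the pointer cast and the function guard cast. The cases not shown in Figure 9 are not modeled.

```python
L, H = "L", "H"


def join(a, b):
    return H if H in (a, b) else L


def push_guard(expr):
    """One reduction step for a guard cast, following the cases of Fig. 9."""
    tag, tgt, src, p, inner = expr
    assert tag == "guard"

    def guard(e):
        return ("guard", tgt, src, p, e)

    kind = inner[0]
    if kind == "new":               # R-GuardCast-New: raise the allocation level
        _, b, e = inner
        return ("new", join(b, tgt), guard(e))
    if kind == "asgn":              # R-GuardCast-Asgn: pointer cast on the target
        _, e1, e2 = inner
        return ("asgn", ("refcast", tgt, src, p, guard(e1)), guard(e2))
    if kind == "app":               # R-GuardCast-App: function guard cast
        _, e1, e2 = inner
        return ("app", ("funcast", tgt, src, p, guard(e1)), guard(e2))
    if kind == "guard":             # R-GuardCast-GuardCast: join both casts
        _, _inner_tgt, inner_src, q, e = inner
        return ("guard", tgt, inner_src, p | q, e)
    raise NotImplementedError(f"case not covered by this sketch: {kind}")


p, q = frozenset({"p"}), frozenset({"q"})
assignment = ("guard", H, L, p, ("asgn", ("var", "l"), ("var", "x")))
nested = ("guard", H, L, p, ("guard", L, H, q, ("var", "e")))
```

Calling `push_guard(assignment)` wraps the assigned reference in a pointer cast, mirroring the adjustment of Example 1, and `push_guard(nested)` collapses the two guard casts into one whose blame label is the union of both labels.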

rules. The cases not shown in Fig. 9 either drop the guard cast at values or propagate it to the subexpressions. Rule R-GuardCast-New augments the allocation type of a new expression, allowing it to allocate a reference at a sufficiently high security level to be permitted by the target pc. Rule R-GuardCast-Asgn applies a pointer cast to the updated reference, making the result well-typed under the target pc by typing rules T-RefCast and T-Asgn. Once the reference is fully evaluated, rule R-RefCast adjusts the pointer's access type analogously to Example 1. The situation for function application is similar: rule R-GuardCast-App needs to introduce a function guard cast to the function being applied. Once the function is evaluated, it converts the pc of the function's body to a sufficiently high level with rule R-FunCast. If two guard casts collide they are joined by rule R-GuardCast-GuardCast. For well-typed expressions, the target pc of the first cast, $pc_1'$, and the source pc of the second cast, $pc_2$, are related contravariantly, i.e. $pc_2 \sqsubseteq pc_1'$, due to rules T-GuardCast and T-Sub.

\[ \textsc{T-GuardCast}\quad \frac{pc; \Gamma; M \vdash e : t}{pc'; \Gamma; M \vdash \{[\,]pc' \Leftarrow [\,]pc\}^{p}\, e : t} \]

\[ \textsc{T-FunCast}\quad \frac{pc; \Gamma; M \vdash e : (t_1 \xrightarrow{pc_2} t_2)^{b}}{pc; \Gamma; M \vdash \{* \xrightarrow{pc_1 \Leftarrow pc_2} *\}^{p}\, e : (t_1 \xrightarrow{pc_1} t_2)^{b}} \]

\[ \textsc{T-RefCast}\quad \frac{pc'; \Gamma; M \vdash e : (\mathsf{ref}\; s^{b})^{b'} \qquad pc_2 \sqsubseteq b}{pc'; \Gamma; M \vdash \{\mathsf{ref}\; pc_1 \Leftarrow pc_2\}^{p}\, e : (\mathsf{ref}\; s^{b \sqcup pc_1})^{b'}} \]

Figure 8. Typing rules for guard casts and pointer casts.

VI. TYPE SOUNDNESS

Type soundness for ML-GS is established in the usual way by proving type preservation and progress. As these results need to be proven for configurations, typing must be augmented with a heap typing judgment $M \vdash \mu$ defined in Fig. 10. It characterizes well-typed heaps using the address environment $M$. Its single rule, T-Heap, checks that the typing information in the cast values is consistent. The auxiliary judgment $M; l \vdash v : t$ requires that the value $v$ is typed with the current type $t$ and that $t$ is compatible with the type that $M$ binds to $l$. Requiring compatibility ensures, together with the R-Loc typing rule, that reduction

rule R-Deref constructs a valid cast when dereferencing a pointer.

\[ \textsc{T-Heap-Aux}\quad \frac{t \sim M(l) \qquad L; -; M \vdash v : t}{M; l \vdash v : t} \]

\[ \textsc{T-Heap}\quad \frac{\mathrm{dom}(M) = \mathrm{dom}(\mu) \qquad \forall (l \mapsto \{t\}^{q}\, v) \in \mu.\; M; l \vdash v : t}{M \vdash \mu} \]

Figure 10. Heap typing.

Theorem 1 (Preservation). Let $PC$ be a run-time program counter security level and $pc$ a typing program counter security level such that $PC \sqsubseteq pc$. If $pc; -; M \vdash e : t$ and $M \vdash \mu$ and $e / \mu \xrightarrow{PC} e' / \mu'$ then there exists $M'$ such that $pc; -; M, M' \vdash e' : t$ and $M, M' \vdash \mu'$.

Proof: By induction on the derivation of $e / \mu \xrightarrow{PC} e' / \mu'$. The proof relies on the usual substitution and weakening results for typing contexts. As in Pottier and Simonet's work [21], the pc of a typing derivation can be decreased arbitrarily. Further details are given in Appendix A.

Progress characterizes the possible outcomes of the one-step evaluation of a typed program.

Theorem 2 (Progress). If $pc; -; M \vdash e : t$ and $M \vdash \mu$ then, for all $PC \sqsubseteq pc$ either (i) $e$ is a value, (ii) there exists $p$ such that $e / \mu \xrightarrow{PC} \Uparrow p$, or (iii) there exist $\mu'$ and $e'$ such that $e / \mu \xrightarrow{PC} e' / \mu'$.

Proof: By induction on the derivation of $pc; -; M \vdash e : t$. Details are given in Appendix B.

The statement of progress shows that even a well-typed program may fail with an unsafe cast. The following definition of an unsafe program classifies a subset of configurations for which evaluation can result in a failure. The complement of this subset are safe configurations that cannot fail.

Definition 1. Let $p$ be a blame label and $M$ an address environment. Configuration $e / \mu$ is unsafe for $p$ under $M$ iff
• there is a cast $\{t \Leftarrow t'\}^{q}\, e$ that occurs in $e$ or $\mu$ where $q \subseteq p$ and $t' \not\prec t$,
• there is a function guard cast $\{* \xrightarrow{pc \Leftarrow pc'} *\}^{q}\, e'$, pointer cast $\{\mathsf{ref}\; pc \Leftarrow pc'\}^{q}\, e'$, or guard cast $\{[\,]pc \Leftarrow [\,]pc'\}^{q}\, e'$ that occurs in $e$ or $\mu$ where $q \subseteq p$ and $pc \not\sqsubseteq pc'$,
• there is a pointer $l^{(t,q),b}$ that occurs in $e$ or $\mu$ where $q \subseteq p$ and $t \neq M(l)$, or
• there is a heap binding $(l \mapsto \{t'\}^{q}\, v)$ that occurs in $\mu$ where $q \subseteq p$ and $t' \neq M(l)$.

Thus, for a cast to be unsafe it has to violate the subtyping relation, which means it performs a conversion that is not allowed by the static fragment of the type system. For pointers and heap bindings to be unsafe, their current types and access types must differ from the type that is assumed by the static address environment $M$.

Well-typed programs reduce to unsafe configurations only if they were originally unsafe.

Theorem 3. Let $p$ be a blame label, $M$ and $M'$ address environments, $e / \mu$ and $e' / \mu'$ configurations. Let furthermore $e / \mu$ be well-typed under address environment $M$ and $e' / \mu'$ well-typed under address environment $M, M'$. It holds that if $e' / \mu'$ is unsafe for $p$ under $M, M'$ and $e / \mu \xrightarrow{PC} e' / \mu'$, then $e / \mu$ is unsafe for $p$ under $M$.

Proof: By induction on the derivation of $e / \mu \xrightarrow{PC} e' / \mu'$. In the R-Cast-∗ cases, check that cast propagation decomposes casts in accordance with the subtyping rules; a configuration that is not unsafe cannot produce an unsafe one. The other interesting cases are R-New, R-Deref, R-Asgn, and R-RefCast. In case R-New, where the heap is extended, check that the resulting fresh reference is never unsafe for any blame label. In case R-Deref, if the cast $\{t_2' \Leftarrow t_2\}^{pq}\, v$ is the cause for unsafety, it holds that $t_2 \not\prec t_2'$ and in particular $t_2' \neq t_2$. It follows that the access type of the pointer and the current type of the heap binding can never agree on a common type $M(l)$ and therefore the heap binding or the pointer makes the original configuration unsafe. In case R-Asgn, when the updated heap binding is the cause of unsafety, we have $B_1 = B_1 \sqcup PC \sqcup B$ because of the NSU check. There are two possible cases: In case $b_1 \neq b_2 \sqcup PC \sqcup B$ and thus $b_1 \sqcup PC \sqcup B \neq b_2 \sqcup PC \sqcup B$, we have $b_1 \neq b_2$ and either the original heap binding or the pointer does not agree with $M$. In case $b_1 = b_2 \sqcup PC \sqcup B$ the original heap binding is already unsafe. In the case of rule R-RefCast the technical constraint $pc_2 \sqsubseteq b$ in the corresponding typing rule T-RefCast ensures, together with the assumed safety of the cast expression, that $pc_1 \sqsubseteq pc_2 \sqsubseteq b$. Therefore $b \sqcup pc_1 = b$ and it follows that an unsafe pointer in the result expression is also present in the original configuration.

The cause for a failing reduction is always an unsafe configuration:

Theorem 4. Let $p$ be a blame label, $M$ an address environment, and $e / \mu$ a configuration. Further, let $pc; -; M \vdash e : t$. If $e / \mu \xrightarrow{PC} \Uparrow p$ then $e / \mu$ is unsafe for $p$.

Proof: By induction on the derivation of $e / \mu \xrightarrow{PC} \Uparrow p$.

From Theorems 3 and 4 it follows that evaluating a safe configuration will not fail.

Corollary 1. Let $e / \mu$ be a closed, well-typed configuration under $M$ and $p$ a blame label. If $e / \mu$ is safe for all $q \subseteq p$

under $M$ then $e / \mu \not\xrightarrow{PC} \Uparrow p$.

It is easy to realize that with this result and with Theorems 1 and 2 the safe subset of ML-GS enjoys the usual type soundness property.

Raw Values $w_L ::= \ldots \mid H$
El. Contexts $V ::= [\,]\; v \mid \mathsf{new}^{t,B}\, [\,] \mid\; ![\,] \mid [\,] := v$

\[ \textsc{R-High-Elim}\quad V[H^{L}] / \mu \xrightarrow{L} H^{L} / \mu \]

\[ \textsc{R-High-Deref}\quad \frac{l \notin \mathrm{dom}(\mu)}{!\, l^{(t_2',q),B} / \mu \xrightarrow{L} H^{L} / \mu} \]

\[ \textsc{R-High-Asgn}\quad \frac{l \notin \mathrm{dom}(\mu) \qquad \mu' = \mu, (l \mapsto \{t_2'\}^{q}\, w^{L})}{l^{(t_2',q),L} := w^{L} / \mu \xrightarrow{L} ()^{L} / \mu'} \]

Figure 11. Extensions for projection syntax and semantics.

Type projection $\lceil t \rceil_L$:

\[ \lceil \mathsf{int}^{b} \rceil_L = \mathsf{int}^{L} \qquad \lceil \mathsf{unit}^{b} \rceil_L = \mathsf{unit}^{L} \qquad \lceil (\mathsf{ref}\; t)^{b} \rceil_L = (\mathsf{ref}\; \lceil t \rceil_L)^{L} \qquad \lceil (t_1 \xrightarrow{pc} t_2)^{b} \rceil_L = (\lceil t_1 \rceil_L \xrightarrow{L} \lceil t_2 \rceil_L)^{L} \]

Projection $L(e)$, $L(\mu)$:

\[ \begin{array}{ll}
L(w^{H}) = H^{L} & L(()^{L}) = ()^{L} \\
L(\mathsf{prot}^{H} e) = H^{L} & L(k^{L}) = k^{L} \\
L(\mathsf{new}^{t,H}\, e) = H^{L} & L(\lambda^{L} x.\, e) = \lambda^{L} x.\, L(e) \\
L(\{s^{H} \Leftarrow t_2\}^{p}\, e) = H^{L} & L(l^{(t,p),L}) = l^{(\lceil t \rceil_L,p),L} \\
L(x) = x & L(\mathsf{new}^{t,L}\, e) = \mathsf{new}^{\lceil t \rceil_L,L}\, L(e) \\
L(\{s^{b} \Leftarrow t_2\}^{p}\, e) = \{\lceil s^{b} \rceil_L \Leftarrow \lceil t_2 \rceil_L\}^{p}\, L(e) \text{ if } b \neq H & L(\{[\,]pc' \Leftarrow [\,]pc\}^{p}\, e) = L(e) \\
L(\{* \xrightarrow{pc' \Leftarrow pc} *\}^{p}\, e) = L(e) & L(\{\mathsf{ref}\; pc' \Leftarrow pc\}^{p}\, e) = L(e) \\
L(-) = - & \\
L(\mu, (l \mapsto \{t_2\}^{p}\, w^{H})) = L(\mu) & L(\mu, (l \mapsto \{t_2\}^{p}\, w^{L})) = L(\mu), (l \mapsto \{\lceil t_2 \rceil_L\}^{p}\, w^{L})
\end{array} \]

Figure 12. Projections (interesting cases).

VII. NON-INTERFERENCE

We prove non-interference for ML-GS by adapting a proof technique due to Li and Zdancewic [19]. A projection function first removes all security features of a program and replaces high-security values with an opaque placeholder value, $H$. The semantics is extended with elimination rules for $H$ values that simply result in $H$, thus hiding high-security computations. Then we show the projection theorem: all ML-GS executions have corresponding projected executions. Non-interference is a corollary of the projection theorem together with type soundness.

A. Projections

The extensions for the projected language, here called ML-GS$_L$, are given in Fig. 11. We write $e_L$, $v_L$, and $w_L$ to distinguish expressions, values and raw values of ML-GS$_L$ from the respective ML-GS forms. Raw values are extended with the value $H$. The additional reduction rules ensure that the flow of high-security information is approximated correctly. Rule R-High-Elim reduces to $H$ on any attempt to eliminate an $H$ value. Rules R-High-Deref and R-High-Asgn cover the cases where the projection removed high-security values from the heap whose addresses were in scope of the projected expression.

The projection function $L(\cdot)$ transforms an ML-GS expression into an ML-GS$_L$ expression, thereby abstracting all high-security information to $H$ and removing all security features. Fig. 12 defines the interesting cases of $L(\cdot)$; the remaining cases propagate $L(\cdot)$ homomorphically. The affected expressions are high-security values, protection by a high security level, high-security memory allocation, and casts to a high-security type. The accompanying type projection $\lceil \cdot \rceil_L$ replaces all type annotations by $L$ and thus transforms all other casts into identities.

On heaps, $L(\cdot)$ removes all high-security cells, which reflects our assumption that a difference in the size of the high-security part of the heap is not a security leak. Our non-interference proof relies on the removal, as ML-GS cannot guarantee that all possible high-security dependent executions allocate the same amount of memory.

In summary, as the R-High-Elim rule ensures the correct flow of hidden high-security information, an expression $L(e)$ can be executed without security concerns.

B. Non-Interference Proof

The projection theorem is the main stepping stone towards proving non-interference. We need some lemmas. First, check that heap updates under high-security pcs are invisible under projection.

Lemma 1. If $e / \mu \xrightarrow{H} e' / \mu'$ then $L(\mu) = L(\mu')$.

Proof: By induction on the derivation of $e / \mu \xrightarrow{H} e' / \mu'$. In the case R-New, a high-security binding is created on the heap due to the high-security pc; we rely on the fact that projection throws away all high-security bindings.

Using this lemma the projection theorem follows:

Lemma 2. If $e / \mu \xrightarrow{L} e' / \mu'$ then $L(e) / L(\mu) \xrightarrow{L}{}^{*} L(e') / L(\mu')$.

Proof: By induction on the derivation of $e / \mu \xrightarrow{L} e' / \mu'$. Use Lemma 1 in case R-Ctxt-Protect.

Theorem 5 (Projection). If $e / \mu \xrightarrow{L}{}^{*} e' / \mu'$ then $L(e) / L(\mu) \xrightarrow{L}{}^{*} L(e') / L(\mu')$.

With the projection theorem and type soundness, non-interference follows. We restrict ourselves here to integer results; pointers work similarly and non-interference for units is trivial. Non-interference of function values can be shown, similarly as in Austin and Flanagan's work [5], using an equivalence relation based on projection.

Theorem 6 (Non-Interference). Let $v_i$, $i \in \{1, 2\}$, be two arbitrary, high-security values with respective typings $L; -; M \vdash v_i : s^{H}$. If $L; x : s^{H}; M \vdash e : \mathsf{int}^{L}$ and $e[x \mapsto v_i] / \mu \xrightarrow{L}{}^{*} v_i' / \mu_i'$ then $v_1' = v_2'$.

Proof: By the projection theorem it holds that $L(e[x \mapsto v_i]) / L(\mu) \xrightarrow{L}{}^{*} L(v_i') / L(\mu_i')$. By type soundness, $v_i' = k_i^{L}$ are low-security integers. With $L(e[x \mapsto v_1]) = L(e[x \mapsto v_2])$ and deterministic evaluation for ML-GS$_L$ it follows that $L(k_1) = L(k_2)$. Projection is injective for low-security values and therefore the result follows.

VIII. DISCUSSION

The blame labels of ML-GS are sets of blame ids. In a concrete implementation, a programmer (or rather the IDE or compiler) would label all casts with distinct singleton sets of blame ids. The blame label attached to a blame exception gives the programmer a hint where to look for the bug that caused the security leak. This hint should be as precise as possible. However, currently the blame labels of ML-GS exceptions are very imprecise: Corollary 1 only guarantees that a non-empty subset of the blame label signaled by the exception is responsible for the failure.

The cause of this imprecision are the unconditional joins of blame labels in rules like R-Deref. With the type information that is currently available at run time it is not possible to determine which of the blame labels, that on the access type of the pointer or that on the current type of the heap binding, is the real culprit for a potential blame exception later on. As an example, consider the following configuration:

\[ !\, l^{(s^{L},p),L} / (l \mapsto \{s^{H}\}^{q}\, v) \tag{1} \]

The dereferencing would result in a blame exception $\Uparrow pq$ due to the low-security access of the high-security content of the address $l$. The following two programs both reduce to configuration (1).

    p1 =
      let l = new^L v' in
      ({s^H <= s^L}[q] l) := v;
      !{s^L <= s^L}[p] l

    p2 =
      let l = new^H v' in
      ({s^H <= s^H}[q] l) := v;
      !{s^L <= s^H}[p] l

In program p1 the cast q is at fault whereas the cast p is the culprit in program p2. This distinction cannot be reconstructed in (1) with the current run-time annotations. However, annotating heap bindings and pointers with the original, allocation-time type of the pointer allows us to recover the distinction.

Values $v ::= \ldots \mid l^{(t \Leftarrow t,p),B}$
Heaps $\mu ::= \cdot \mid (l \mapsto \{t \Leftarrow t\}^{p}\, v)$

The left of the type annotations in the new pointer and heap syntax are the access and current type respectively, as before. The right annotation is the static type, which always coincides with the type in the address environment. The static type is fixed at allocation time and, for a particular address, the static type is the same for the heap binding and all pointers. With a straightforward extension of the semantics, p1 now reduces to

\[ !\, l^{(s^{L} \Leftarrow s^{L},p),L} / (l \mapsto \{s^{H} \Leftarrow s^{L}\}^{q}\, v) \]

where the static type is $s^{L}$ due to the low-security initialization. It is now easy to decide that cast q is to blame, as static type and current type differ, and it would suffice to track label q from here on.

IX. RELATED WORK

There are many proposals for static program analysis of information flow. Most are inspired by the lattice-based model of the Dennings [8], [9], which categorizes information according to security levels. On that basis, Volpano, Smith, and Irvine [28] were the first to construct a type system that analyzes confidentiality for a simple imperative language, followed by a flurry of subsequent work. Later work, relevant to our paper, extends their principles to integrity checking and to higher-order languages [14], [21]. Sabelfeld and Myers give a comprehensive overview [23].

The approaches based on type systems augment types with security annotations that over-approximate the confidentiality level of the information contained in a value. These systems rule out programs where a high-confidential input may leak to a low-confidential output.

As an alternative for static analysis, a number of authors have examined dynamic information flow analysis. The basic approach extends the run-time system with a monitor or augments values with security levels so that potential security violations can be detected during the execution of a program [24]. The monitoring modifies the observable output of the program to guarantee non-interference.
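Returning to the blame imprecision discussed in Sec. VIII, the effect of the unconditional joins can be replayed with a small Python sketch of the two programs p1 and p2. The sketch rests on several simplifying assumptions that are not part of the paper: the run-time pc and the pointer level are fixed to L (so the NSU check always succeeds), types are reduced to the security level of the reference content, a fresh cell starts with an empty blame label, a reference cast leaves its own label on the pointer, and the assigned value v is high-security. The names `Heap`, `Ptr`, and `ref_cast` are hypothetical.

```python
from dataclasses import dataclass

L, H = "L", "H"


def leq(a, b):
    return a == b or (a, b) == (L, H)


class Blame(Exception):
    def __init__(self, label):
        super().__init__(f"blame {sorted(label)}")
        self.label = frozenset(label)


@dataclass(frozen=True)
class Ptr:
    addr: int
    access: str                     # level of the access type
    label: frozenset                # blame label on the access type


class Heap:
    def __init__(self):
        self.cells = {}             # addr -> (current level, blame label, value)

    def new(self, level, value):
        addr = len(self.cells)
        self.cells[addr] = (level, frozenset(), value)
        return Ptr(addr, level, frozenset())

    def assign(self, ptr, value):
        # as in R-Asgn: the current type becomes the access type, labels are joined
        _, old_label, _ = self.cells[ptr.addr]
        self.cells[ptr.addr] = (ptr.access, old_label | ptr.label, value)

    def deref(self, ptr):
        # as in R-Deref: cast from current to access type under the joined label
        _, label, (raw, level) = self.cells[ptr.addr]
        if not leq(level, ptr.access):
            raise Blame(label | ptr.label)
        return (raw, level)


def ref_cast(target, blame_id, ptr):
    """Reference cast: changes the access type and leaves its label behind."""
    return Ptr(ptr.addr, target, frozenset({blame_id}))


def p1():
    heap = Heap()
    l = heap.new(L, (0, L))
    heap.assign(ref_cast(H, "q", l), (42, H))
    return heap.deref(ref_cast(L, "p", l))


def p2():
    heap = Heap()
    l = heap.new(H, (0, H))
    heap.assign(ref_cast(H, "q", l), (42, H))
    return heap.deref(ref_cast(L, "p", l))


def blamed(program):
    try:
        program()
    except Blame as b:
        return b.label
    return None
```

Under these assumptions `blamed(p1)` and `blamed(p2)` return the same joined label containing both p and q, so the exception alone does not reveal which cast is at fault. Recording the allocation-time level in each cell, as proposed above, would let the two cases be told apart.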

The main difficulty of dynamic analysis is the taming of implicit flows, which was deemed infeasible for some time and gave rise to hybrid approaches [12]. Recent advances have developed a range of techniques with increasingly good results. Sabelfeld and Russo [24] demonstrate a run-time monitor that guarantees termination-insensitive non-interference (TINI) and is more permissive than a flow-insensitive analysis. Austin and Flanagan proposed the no-sensitive-upgrade policy [3], improved that to the permissive upgrade policy [4], and subsumed both by faceted execution [5] (all guaranteeing TINI), which approximates secure multi-execution (SME), where a program is executed multiple times, once for each security level [10]. When SME executes a level, the information that is confidential for that level is overridden with a default value.

Russo and Sabelfeld [22] compare flow-sensitive static security analysis with an axiomatically described family of dynamic analyses and prove that their results are incomparable. More accurately, they prove that a dynamic analysis which is strictly stronger than the flow-sensitive analysis of Hunt and Sands [17] is not possible. Russo and Sabelfeld further suggest a hybrid analysis that processes a high-security conditional by executing one branch and statically analyzing the other [12]. This analysis is more permissive than the flow-sensitive static analysis. Their results support the need for combining static and dynamic analysis like our proposal. As our base analysis is flow-insensitive, Sabelfeld and Russo's earlier result [24] shows that our dynamic analysis is strictly more permissive than our static analysis, so our combination is useful. Furthermore, we believe our approach is complementary because even a flow-sensitive static analysis cannot cope with our example from Sec. II.

TINI, as established for our system, does not provide perfect security. Askarov and coworkers [2] generally criticize the notion of termination insensitive non-interference and suggest alternative definitions. Non-interference is also insufficient in the presence of timing attacks as discussed by Kashyap and coworkers [18].

Gradual typing [25], [26] originates from the desire to execute dynamically typed programs efficiently and builds on earlier work on dynamic typing [15] and soft typing [7]. Cast expressions information locally so that more efficient, untagged data representations can be employed. This approach has also proven useful in the integration of dynamically typed scripting languages with typed languages [6], [20], [30], where the primary goal is improved maintainability and interoperability. Wadler and Findler [29] characterize the interaction precisely with their blame theorem, which identifies safe parts of a program that never give rise to type errors.

Gradual typing for mutable data has only been considered by two papers [16], [27]. Both papers employ reference coercions with two components, one for reading and one for writing, and they perform coercion simplification that can fail early, before the coercion is applied. In contrast, our reference coercions have one component and behave differently: we can always write to the reference according to its current type, even if that type is the result of a cast. Read operations may lead to failure. In the other approaches, read and write operations may fail. We do not check casts for early failure because even a cast from ref H to ref L is acceptable, if the next operation on the reference is a write, which is enforced by our semantics. A variant of gradual security typing that behaves like the two papers is feasible, but we believe that our semantics fits better with patterns used in secure software construction.

Siek and coworkers [27] emphasize precise blame tracking whereas we deliberately keep that part of the system simple. Furthermore, their system does not keep track of effects, which is indispensable for handling implicit flows in our system.

The combination of gradual typing and security analysis has been considered [11], but only in the context of the pure lambda calculus, whereas we consider an ML core language with references. Our approach to modeling the calculus and to proving non-interference are quite different and are extensible to a realistic language with arbitrary effects.

X. CONCLUSION

We apply the ideas of gradual typing to an annotated type system for information flow control. The construction of the gradual system is straightforward in a pure lambda calculus setting, but poses significant challenges when performed for an ML core language with references as we do. While the statically typed part can be adapted from previous work [21], the design of the cast operators and the integration of dynamic information flow techniques in the operational semantics are novel to our work. The particular challenge in our system is the design of casts on references. They are not restricted by the subtyping relation, they never fail, and they always result in a reference that is ready for writing at the target type of the cast. We further demonstrate that the gradual type system is independent from the enforcement strategy used in the untyped part of a program. The formalization in the paper uses the NSU strategy, but we have also worked out the details for the FE strategy, which is only sketched here.

We envisage ML-GS to be useful in contexts where security requirements change or where language features prohibit the use of static analysis throughout. For example, ML-GS can integrate manually security-checked code in a typed setting, thus creating safe, but dynamically checked, regions inside of statically checked code. Likewise, the integration of statically checked regions in dynamically checked code is also possible. These regions can be enlarged or shrunk according to security and robustness requirements.

An extension of the blame theorem [29] can be stated and proved for ML-GS, but it is omitted to conserve space.

Future work considers an inference for placing cast expressions and the addition of ML-polymorphism along the lines of Pottier and Simonet. It should also be possible to combine gradual security typing with plain gradual typing, but the focus of the present work is on the security aspect.

ACKNOWLEDGMENT

Thanks to Joshua Guttman for his extensive, thoughtful comments on draft versions of this paper, which helped to improve the presentation considerably.

REFERENCES

[1] A. Ahmed, R. B. Findler, J. Matthews, and P. Wadler. Blame for all. In Proceedings for the 1st workshop on Script to Program Evolution, pages 1–13, Genova, Italy, 2009. ACM.

[2] A. Askarov, S. Hunt, A. Sabelfeld, and D. Sands. Termination-insensitive noninterference leaks more than just a bit. In Proceedings of the 13th European Symposium on Research in Computer Security: Computer Security, ESORICS '08, pages 333–348, Berlin, Heidelberg, 2008. Springer-Verlag.

[3] T. H. Austin and C. Flanagan. Efficient purely-dynamic information flow analysis. In S. Chong and D. A. Naumann, editors, PLAS, pages 113–124, Dublin, Ireland, June 2009. ACM.

[4] T. H. Austin and C. Flanagan. Permissive dynamic information flow analysis. In Proceedings of the 5th ACM SIGPLAN Workshop on Programming Languages and Analysis for Security, PLAS '10, pages 3:1–3:12, New York, NY, USA, 2010. ACM.

[5] T. H. Austin and C. Flanagan. Multiple facets for dynamic information flow. In Proc. 39th ACM Symp. POPL, pages 165–178, Philadelphia, USA, Jan. 2012. ACM Press.

[6] G. M. Bierman, E. Meijer, and M. Torgersen. Adding dynamic types to C#. In T. D'Hondt, editor, ECOOP, volume 6183 of LNCS, pages 76–100, Maribor, Slovenia, 2010. Springer.

[7] R. Cartwright and M. Fagan. Soft typing. In Proc. PLDI '91, pages 278–292, Toronto, Canada, June 1991. ACM.

[8] D. Denning. A lattice model of secure information flow. Comm. ACM, 19(5):236–242, 1976.

[9] D. Denning and P. Denning. Certification of programs for secure information flow. Comm. ACM, 20(7):504–513, 1977.

[10] D. Devriese and F. Piessens. Noninterference through secure multi-execution. In IEEE Symposium on Security and Privacy, pages 109–124, Berkeley/Oakland, California, USA, May 2010. IEEE Computer Society.

[11] T. Disney and C. Flanagan. Gradual information flow typing. In STOP 2011, 2011.

[12] G. L. Guernic, A. Banerjee, T. P. Jensen, and D. A. Schmidt. Automata-based confidentiality monitoring. In M. Okada and I. Satoh, editors, ASIAN, volume 4435 of LNCS, pages 75–89. Springer, 2006.

[13] D. Hedin and A. Sabelfeld. A perspective on information-flow control. In 2011 Marktoberdorf Summer School. IOS Press, 2011.

[14] N. Heintze and J. G. Riecke. The SLam calculus: Programming with security and integrity. In L. Cardelli, editor, Proc. 25th ACM Symp. POPL, pages 365–377, San Diego, CA, USA, Jan. 1998. ACM Press.

[15] F. Henglein. Dynamic typing: Syntax and proof theory. Science of Computer Programming, 22:197–230, 1994.

[16] D. Herman, A. Tomb, and C. Flanagan. Space-efficient gradual typing. In Trends in Functional Programming (TFP), 2007.

[17] S. Hunt and D. Sands. On flow-sensitive security types. In S. Peyton Jones, editor, Proc. 33rd ACM Symp. POPL, pages 79–90, Charleston, South Carolina, USA, Jan. 2006. ACM Press.

[18] V. Kashyap, B. Wiedermann, and B. Hardekopf. Timing- and termination-sensitive secure information flow: Exploring a new approach. In Proceedings of the 2011 IEEE Symposium on Security and Privacy, SP '11, pages 413–428, Washington, DC, USA, 2011. IEEE Computer Society.

[19] P. Li and S. Zdancewic. Arrows for secure information flow. Theoretical Computer Science, 411(19):1974–1994, 2010.

[20] J. Matthews and R. B. Findler. Operational semantics for multi-language programs. ACM TOPLAS, 31:12:1–12:44, Apr. 2009.

[21] F. Pottier and V. Simonet. Information flow inference for ML. ACM TOPLAS, 25(1):117–158, Jan. 2003.

[22] A. Russo and A. Sabelfeld. Dynamic vs. static flow-sensitive security analysis. In CSF, pages 186–199. IEEE Computer Society, 2010.

[23] A. Sabelfeld and A. C. Myers. Language-based information-flow security. IEEE J. Selected Areas in Communications, 21(1):5–19, Jan. 2003.

[24] A. Sabelfeld and A. Russo. From dynamic to static and back: Riding the roller coaster of information-flow control research. In A. Pnueli, I. Virbitskaite, and A. Voronkov, editors, Ershov Memorial Conference, volume 5947 of Lecture Notes in Computer Science, pages 352–365. Springer, 2009.

[25] J. Siek and W. Taha. Gradual typing for objects. In E. Ernst, editor, 21st ECOOP, volume 4609 of LNCS, pages 2–27, Berlin, Germany, July 2007. Springer.

[26] J. G. Siek and W. Taha. Gradual typing for functional languages. In Scheme and Functional Programming Workshop, Sept. 2006.

[27] J. G. Siek, M. M. Vitousek, and S. Bharadwaj. Gradual typing for mutable objects. http://ecee.colorado.edu/~siek/gtmo.pdf, Dec. 2012.

[28] D. Volpano, G. Smith, and C. Irvine. A sound type system for secure flow analysis. Journal of Computer Security, 4(3):1–21, 1996.

[29] P. Wadler and R. B. Findler. Well-typed programs can’t be blamed. In G. Castagna, editor, Proc. 18th ESOP, volume 5502 of LNCS, pages 1–16, York, UK, Mar. 2009. Springer-Verlag.

[30] T. Wrigstad, F. Z. Nardelli, S. Lebresne, J. Östlund, and J. Vitek. Integrating typed and untyped code in a scripting language. In J. Palsberg, editor, Proc. 37th ACM Symp. POPL, pages 377–388, Madrid, Spain, Jan. 2010. ACM Press.

APPENDIX A. PROOF OF TYPE PRESERVATION

A. Auxiliary Lemmas

It is straightforward to check that the subtyping relation is well-behaved, in that each well-typed value can be typed by applying the subsumption rule first, followed by the canonical rule for that value (e.g., T-Abs is the canonical rule for function abstractions). We will use this fact implicitly in the following.

Lemma 3 (Substitution). If $pc; x : t; M \vdash e : t'$ and $pc; -; M \vdash v : t$ then $pc; -; M \vdash e[x \mapsto v] : t'$.

Proof: By induction on the derivation of $pc; x : t; M \vdash e : t'$.

Lemma 4. If $pc; -; M \vdash e : t$ then for all $pc' \sqsubseteq pc$ it holds that $pc'; -; M \vdash e : t$.

Proof: By induction on the derivation of $pc; -; M \vdash e : t$.

Lemma 5. If $pc; -; M \vdash e : t$ then for all $M'$ where $dom(M)$ and $dom(M')$ are disjoint, it holds that $pc; -; M, M' \vdash e : t$.

Lemma 6. If $pc; -; M \vdash v : t$ then for all $pc'$ it holds that $pc'; -; M \vdash v : t$.

Proof: By induction on the derivation of $pc; -; M \vdash v : t$.

Lemma 7. If $pc; -; M \vdash E[e] : t$ and $pc; -; M \vdash e : t'$ and $pc; -; M \vdash e' : t'$ then $pc; -; M \vdash E[e'] : t$.

Proof: By induction on the derivation of $pc; -; M \vdash E[e] : t$.

Lemma 8. If $pc; -; M \vdash w^{B} : s^{b}$ and $w \xrightarrow{s_1 \Leftarrow s}_{p} w_1$ and $s^{b} \sim s_1^{b}$ then $pc; -; M \vdash w_1^{B} : s_1^{b}$.

Proof: By examining the cases of $w \xrightarrow{s_1 \Leftarrow s}_{p} w_1$ and straightforward type reconstruction, using lemma 4.

B. Proof

Let $PC$ be a run-time security guard and $pc$ a typing security guard such that $PC \sqsubseteq pc$. If $pc; -; M \vdash e : t$ and $M \vdash \mu$ and $e \,/\, \mu \xrightarrow{PC} e' \,/\, \mu'$ then there exists $M'$ such that $pc; -; M, M' \vdash e' : t$ and $M, M' \vdash \mu'$. The proof is by induction on the derivation of $e \,/\, \mu \xrightarrow{PC} e' \,/\, \mu'$.

1) Case R-App: Assumptions:

    $pc; -; M \vdash (\lambda^{B} x.\, e)\ v : s^{b \sqcup b'}$    (2)

The heap typing result is immediate.
By (2) and subtyping:

    $pc'; x : t_2'; M \vdash e : s_1^{b_1}$ where $pc \sqcup b \sqsubseteq pc'$ and $s_1^{b_1} \prec s^{b \sqcup b'}$    (3)
    $pc; -; M \vdash v : t_2''$ where $t_2'' \prec t_2'$    (4)

By (3), (4), and substitution:

    $pc'; -; M \vdash e[x \mapsto v] : s_1^{b_1}$    (5)

By (5), (3), subtyping and typing rule T-Protect:

    $pc; -; M \vdash \mathsf{prot}^{B}\, e[x \mapsto v] : s^{b \sqcup b'}$    (6)

which is the desired result.

2) Case R-Protect: Assumptions:

    $pc; -; M \vdash \mathsf{prot}^{B}\, w^{B_1} : s^{B \sqcup b_1}$    (7)

The heap typing result is immediate.
By subtyping and (7):

    $pc \sqcup b; -; M \vdash w^{B_1} : s_2^{B_1}$ and $s_2^{B_1} \prec s^{b_1}$    (8)

By (8) and lemma 6:

    $pc; -; M \vdash w^{B_1} : s_2^{B_1}$    (9)

By (9), (8) and subtyping:

    $pc; -; M \vdash w^{B_1 \sqcup B} : s^{B \sqcup b_1}$    (10)

which is the desired result.

3) Case R-New: Assumptions:

    $pc; -; M \vdash \mathsf{new}^{s^{b_1},B}\, w^{B_1} : \mathsf{ref}^{b}\, s^{b_1}$    (11)
    $M \vdash \mu$    (12)

By (11):

    $pc; -; M \vdash w^{B_1} : s^{b_1}$ where $PC \sqsubseteq b_1$    (13)

By (13):

    $pc; -; M \vdash w^{B_1 \sqcup PC} : s^{b_1}$    (14)

By (14), (12):

    $M, l : s^{b_1} \vdash \mu, (l \mapsto \{t\}^{p}\, w^{B_1 \sqcup PC})$    (15)

which is the desired heap typing result.
By (13) and rule T-Addr:

    $pc; -; M, l : s^{b_1} \vdash l^{(s^{b_1},p),B} : \mathsf{ref}^{b}\, s^{b_1}$    (16)

which is the desired expression typing result.

4) Case R-Deref: Assumptions:

    $pc; -; M, l : t_1 \vdash\ ! l^{(s^{b},p),B} : s^{b \sqcup b'}$    (17)
    $M, l : t_1 \vdash \mu, (l \mapsto \{t_1'\}^{q}\, v)$    (18)

The heap typing result is immediate.
By (17):

    $pc; -; M, l : t_1 \vdash l^{(s^{b},p),B} : \mathsf{ref}^{b}\, s^{b'}$ where $B \sqsubseteq b$ and $s^{b} \sim t_1$    (19)

By (18) and lemma 6:

    $pc \sqcup B; -; M, l : t_1 \vdash v : t_1'$ where $t_1' \sim t_1$    (20)

By (19), (20), typing rules T-Cast and T-Protect, and subtyping:

    $pc; -; M, l : t_1 \vdash \mathsf{prot}^{B}\, \{s^{b} \Leftarrow t_1'\}^{pq}\, v : s^{b}$ where $t_1' \sim t_1$    (21)

which is the desired result.

5) Case R-Asgn: Assumptions:

    $pc; -; M, l : t_1 \vdash l^{(s^{b_2},p),B} := w^{B_2} : \mathsf{unit}^{b_2}$    (22)
    $M, l : t_1 \vdash \mu, (l \mapsto \{t_1'\}^{q}\, v)$    (23)

By (22) and lemma 6:

    $pc; -; M, l : t_1 \vdash l^{(s^{b_2},p),B} : \mathsf{ref}^{b}\, s^{b_2}$ where $s^{b_2} \sim t_1$ and $B \sqsubseteq b$ and $b \sqsubseteq b_2$ and $pc \sqsubseteq b_2$    (24)
    $pc; -; M, l : t_1 \vdash w^{B_2 \sqcup PC \sqcup B} : s_1^{B_2 \sqcup PC \sqcup B}$ where $s_1^{B_2 \sqcup PC \sqcup B} \prec s^{b_2 \sqcup PC \sqcup B}$    (25)

By (23), (24), and (25):

    $M, l : t_1 \vdash \mu, (l \mapsto \{s^{b_2 \sqcup PC \sqcup B}\}^{pq}\, w^{B_2 \sqcup PC \sqcup B})$    (26)

which is the desired heap typing result. The expression typing follows immediately by subtyping and $pc \sqsubseteq b_2$ (24).

6) Cases R-Ctxt, R-Protect-Ctxt: The result follows by lemma 7, lemma 5 and the induction assumption.

7) Cases R-Cast-∗: The result follows straightforwardly with lemma 8.

8) Cases R-GuardCast-∗: Using lemma 4 in case R-GuardCast-Protect and lemma 6 in case R-GuardCast-Value, the result follows straightforwardly.

APPENDIX B. PROOF OF PROGRESS

If $pc; -; M \vdash e : t$ and $M \vdash \mu$ then, for all $PC \sqsubseteq pc$, either (i) $e$ is a value, (ii) there exists $p$ such that $e \,/\, \mu \xrightarrow{PC} \Uparrow p$, or (iii) there exist $\mu'$ and $e'$ such that $e \,/\, \mu \xrightarrow{PC} e' \,/\, \mu'$.

Proof by induction on the derivation of $pc; -; M \vdash e : t$.

The cases T-Var, T-Int, T-Unit, and T-Addr are trivial.

T-Protect
Case: $e$ is not a value. By the induction assumption, either rule R-Ctxt-Protect or rule R-Ctxt-Protect-Fail applies.
Case: $e$ is a value. Rule R-Protect applies.

T-App
Case: $e_1$ or $e_2$ is not a value. By the induction assumption, either rule R-Ctxt or rule R-Ctxt-Fail applies.
Case: $e_1$ and $e_2$ are values. Rule R-App applies.

The cases for T-New, T-GuardCast-∗, T-FunCast-∗, and T-RefCast work similarly to case T-App.

T-Sub
The result follows by the induction assumption.

T-Deref
Case: $e$ is not a value. By the induction assumption, either rule R-Ctxt or rule R-Ctxt-Fail applies.
Case: $e$ is a value. The typing assumption yields

    $\mu = \mu', (l \mapsto \{t'\}^{p}\, v)$    (27)

With this result, rule R-Deref applies.

T-Asgn
Case: $e_1$ or $e_2$ is not a value. By the induction assumption, either rule R-Ctxt or rule R-Ctxt-Fail applies.
Case: $e_1$ and $e_2$ are values. The typing assumption yields

    $\mu = \mu', (l \mapsto \{t'\}^{p}\, w^{B})$    (28)
    $pc; -; M \vdash l^{(t,q),B_1} := v : \mathsf{unit}^{b}$    (29)

• Case $PC \sqcup B_1 \sqsubseteq B$: R-Asgn applies.
• Case $PC \sqcup B_1 \not\sqsubseteq B$: R-Asgn-NSU-Fail applies.

T-Cast
Case: $e$ is not a value. By the induction assumption, either rule R-Ctxt or rule R-Ctxt-Fail applies.
Case: $e$ is a value $w^{B}$ with type $s^{b}$. The compatibility requirement of the typing assumptions yields

    $w \xrightarrow{s_1 \Leftarrow s}_{p} w_1$    (30)

where $s_1^{b_1}$ is the destination type of the cast.
• Case $B \not\sqsubseteq b_1$: Rule R-Cast-Sub-Fail applies.
• Case $b_1 = {?}$: Rule R-Cast-To-Dyn applies.
• Case $B \sqsubseteq b_1$, $b_1 = B_1$: Rule R-Cast-Sub applies.
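The label comparisons behind the case splits for T-Asgn and T-Cast can be mimicked in a few lines. The following is a minimal sketch, assuming a two-point lattice LOW ⊑ HIGH and a marker `DYN` standing for the dynamic label `?`. The names `Label`, `leq`, `join`, `assign_step`, and `cast_step` are hypothetical and not part of ML-GS; each function only returns the name of the reduction rule that the progress argument selects and does not model the heap or the typing judgment.

```python
from enum import IntEnum


class Label(IntEnum):
    """Two-point security lattice, assumed for illustration: LOW is below HIGH."""
    LOW = 0
    HIGH = 1


# Hypothetical marker for the dynamic label `?` in a cast's destination type.
DYN = "?"


def leq(a: Label, b: Label) -> bool:
    """Lattice order: a flows to b."""
    return a <= b


def join(a: Label, b: Label) -> Label:
    """Least upper bound of two labels."""
    return Label(max(a, b))


def assign_step(pc: Label, addr_label: Label, stored_label: Label) -> str:
    """Case split of T-Asgn: compare PC joined with B1 against B.

    pc is the run-time guard PC, addr_label is the address label B1,
    stored_label is the label B of the value currently in the heap cell.
    """
    if leq(join(pc, addr_label), stored_label):
        return "R-Asgn"
    return "R-Asgn-NSU-Fail"


def cast_step(value_label: Label, dest_label) -> str:
    """Case split of T-Cast on the destination label b1 and the value label B."""
    if dest_label == DYN:
        return "R-Cast-To-Dyn"
    if leq(value_label, dest_label):
        return "R-Cast-Sub"
    return "R-Cast-Sub-Fail"


if __name__ == "__main__":
    print(assign_step(Label.HIGH, Label.LOW, Label.LOW))
    print(cast_step(Label.LOW, Label.HIGH))
```

In this sketch the check for `DYN` comes first so that the order comparison is only evaluated on static labels; this ordering is a choice of the sketch, since the bullets above list the three cases without fixing an evaluation order.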
