Affine Contracts for Affine Types

Jesse A. Tov Riccardo Pucella Northeastern University {tov,riccardo}@ccs.neu.edu

Abstract ML-like language illustrates why conventional type systems are Affine and other substructural type systems offer a range of expres- inadequate to the task of typing strong updates: siveness and performance benefits but cannot easily interact with fun tricky (r1: (int → int) ref, r2: (int → int) ref): int = components written in non-affine languages in a safe manner. We let val f = !r1 propose a technique for regulating the interaction between code val = r1 := 0 written in an affine language and code written in a conventional val g = !r2 typed language by means of contracts. in f (g 2) end We formalize our approach via a typed calculus with both affine-typed and conventionally typed modules. An affine type If we invoke tricky(r, r), where r is a reference cell containing an system is used for affine modules and a conventional int → int function, then g in the code above will be an integer rather for conventional modules, and neither type system is aware of the than a function, resulting in a run-time fault. The problem occurs other. We show how to preserve the properties guaranteed by the only if the two arguments to tricky are aliases for the same reference two type systems despite modules being able to call each other and cell, which a linear or affine type system can prevent. exchange values arbitrarily, and we establish the correctness of our Beyond strong updates, substructural types have been used approach through a syntactic type soundness theorem. for memory management (Jim et al. 2002), for optimization of lazy languages (Turner et al. 1995), and to handle effects in pure Categories and Subject Descriptors D.2.4 [Software Engineer- languages (Barendsen and Smetsers 1996). Typestate (Strom and ing]: Software/Program Verification—Programming by contract; Yemini 1986) and session types (Gay and Hole 1999) can also be F.4.1 [Mathematical Logic and Formal Languages]: Mathematical understood as substructural type systems, although they are not Logic—Affine Logic always presented as such. Given the range of language features that substructural types can General Terms Languages, Reliability express, a programmer may wish to take advantage of these features in writing real-world programs. Writing real systems, however, Keywords Contracts, affine types, substructural logic, multi- often requires access to comprehensive libraries, which mainstream language systems programming languages generally provide but prototype imple- mentations often do not. The prospect of rewriting libraries to work 1. Introduction in a substructural language strikes these authors as unappealing. This suggests allowing conventional and substructural lan- Substructural type systems augment conventional type systems guages to interoperate. We envision the following scenarios: with the ability to control the number and order of uses of a data structure or operation (Walker 2005). The most common • The programmer wishes to import a legacy library into a sub- substructural type systems are linear type systems (Wadler 1990; structural language. Unfortunately, if the library is unaware Plotkin 1993; Benton 1995; Ahmed et al. 2004). Roughly speaking, of the substructural conditions then it may duplicate values a linear type system ensures that values with linear type cannot received from the substructural language. be duplicated or dropped: a value of linear type must be elimi- • The programmer wishes to write a library in a substructural nated exactly once. Other substructural type systems refine these language but provide access to the library from a conventional constraints. Affine type systems, for instance, enforce that values language. The naive client may duplicate values it receives cannot be duplicated, but allow them to be dropped: a value of from a substructural library and resubmit them, causing aliasing affine type may be used once or not at all. that the substructural library could not produce by itself and A common use case for substructural type systems is to type bypassing the guarantees of the substructural type system. check reference cells that support strong updates, that is, where the type of a value stored in a cell can be different at different points Our Contributions. The core contribution of this paper is a novel during the execution of a program. The following example in an approach to regulating the interaction between an affine language and a conventionally typed language. We present a calculus having several notable features: • The non-affine language may gain access to affine values, in- cluding strongly-updatable references, and may apply affine- language functions. • The non-affine type system has no awareness of the affine type system. [Copyright notice will appear here once ’preprint’ option is removed.] • And yet, the resulting system enjoys type soundness.

1 2009/3/2 We expect our technique to be efficiently implementable on real variables x, y hardware. module names f, g Our calculus consists of two sublanguages with different type integers z systems: a simply-typed λ calculus with a completely standard type system and a λ calculus with an affine type system. We have chosen programs P ::= M e declarations M ::= • | M m an affine rather than linear type system because what it means for u an affine invariant to be violated is significantly clearer than what it modules m ::= module f : σ = v means for a linear invariant to be violated. (We revisit this point in expressions e ::= v | x | f | e e | if0 e e e §6.) Our affine language provides strong updates as a simple stand- | he, ei | let hx, xi = e in e in for the variety of features such as typestate that they subsume. values v ::= λx:σ.e | c | hv, vi A program is a collection of modules and a main expression, constants c ::= dze | new | swap | · · · and each module may be written in either of the two sublanguages. a u types σ ::= σ | σ Modules in each language have access to modules written in the u u unlimited types σu ::= int | σ ∞ σ | σ ⊗ σ other language, though they view foreign types through a transla- ( a a affine types σa ::= σ ref | σ 1 σ | σ ⊗ σ | σ ⊗ σ tion into the native type system. Affine modules are checked by ( arrows ∗ ::= ∞ | 1 an affine type system, and non-affine modules are checked by the ( ( ( conventional type system. Notably, the conventional type system has no knowledge of affine types and their properties. Figure 1. A syntax To allow modules to invoke each other and exchange values while preserving the safety properties guaranteed by the individual type systems, we need to perform runtime checks in cases where affine sublanguage in §3. In §4 we introduce our multi-language the non-affine type system is too weak to express the affine type calculus and attempt to justify some of its design. The impatient system’s invariants for values that flow between the languages. reader may wish to skip to §4.3 on page 7 for examples. We explain For instance, the affine type system guarantees that an affine value our soundness criterion, state a standard type soundness theorem, created in an affine module will not be duplicated within the affine and discuss highlights of the proof in §5. We explore alternate sublanguage. If the value flows into a non-affine module, however, design ideas, extensions, and future research possibilities in §6 and static bets are off. In that case, we resort to a dynamic check that conclude in §7. prevents the value from flowing back into an affine context—where To distinguish the two languages of our calculus, we typeset our it can be used—more than once. Since our calculus is higher-order, affine language A in a sans-serif font and our non-affine language we use a form of higher-order contract (Findler and Felleisen 2002) C in a bold serif font. to keep track of each module’s obligations toward maintaining the affine invariants. 2. The Affine Sublanguage In Findler and Felleisen’s formulation, a software contract is A an agreement between two software components, or parties, about We begin with our affine sublanguage, which we call A (for some property of a value. The positive party produces a value, “affine,” typeset in sans-serif). Figure 1 presents the syntax of types which must satisfy the specified property. The negative party con- and terms. sumes the value and is held responsible to treat it appropriately. In preparation for adding contracts, A includes modules, which Contracts are concerned with catching violations of the property act as contractual parties. For simplicity, a module declares only and blaming the guilty party, which may help locate the source of a one name bound to one value. A program comprises a mutually bug. For first-order values, the contract may be immediately check- recursive collection of modules M and a main expression e. able, but for functional values, the property is likely undecidable, Expressions are mostly conventional: values, which include λ so the check must wait until the negative party applies the function, expressions, several constants, pairs of values, and dze for integer at which point the negative party is responsible for providing a literals; variables; application; if expressions; pair construction; suitable argument and the positive party for producing a suitable and pair elimination. Less conventionally, expressions also include result. Thus, for higher-order functions, checks are delayed until module names (f), which reduce to the value of the named module. first-order values are reached. We define the free variables of an expression in the usual way, but Our approach to integrating affine and conventional types note that this includes only regular variables (e.g., y), not module borrows heavily from recent literature on multi-language inter- names (e.g., g), which we assume are distinguished syntactically. operability, in particular work that uses contracts to mediate Types are syntactically partitioned into two sorts, the unlimited between typed and untyped code. A variety of approaches to partial types σu, which permit value duplication, and the affine types typing appear in the literature, including hybrid types (Flanagan σa, which do not.1 In particular, the type system forbids aliasing 2006) and gradual types (Siek and Taha 2006). We follow the of affine-typed variables, which is crucial for supporting strong approaches of Typed Scheme and its predecessor (Tobin-Hochstadt updates. The unlimited types include integers, unlimited functions ∞ and Felleisen 2008, 2006) and of Matthews and Findler (2007), (σ ( σ), and products of unlimited types. The affine types include both of which handle multi-language programs that combine an references, one-shot functions (σ 1 σ), and products with at least ( 1 untyped, Scheme-like language with a typed language. Both use one affine component. While a one-shot function σ ( σ can only contracts to regulate the inter-language boundary. Unlike these be applied once and cannot be duplicated, an unlimited function other systems, both of our languages are typed. Our affine type σ ∞ σ can be duplicated and thus used freely. (Our grammar also ( ∗ system is strictly stronger than our non-affine type system, in the defines a non-terminal (, which is useful when discussing both sense that the former is capable of enforcing all of the latter’s one-shot and unlimited functions.) invariants. The typed/untyped systems have this property in a trivial sense only, since the untyped (or trivially typed, rather) side 1 This calculus is designed to be simple, but the multi-language system enforces no representation invariants. described in this paper should work equally well with a “low level” affine calculus having explicit !-suspension and duplication, with a “high level” Road Map. In §2, we give the syntax and semantics of our affine polymorphic calculus, or with λURAL-style (Ahmed et al. 2005) qualifier sublanguage, followed by a very brief presentation of our non- polymorphism.

2 2009/3/2 locations ` vent programs from going wrong. For tricky to work, it is not values v ::= · · · | ` enough that it be given two references containing the right sort of value—it must be given two different references. Consider what evaluation contexts E ::= [ ]A | E e | v E | if0 E e e | hE, ei | hv, Ei | let hx, xi = E in e happens if we apply tricky to the same reference for both argu- ments:2 stores s ::= ∅ | s ]{` 7→ v} configurations C ::= (s, e) 1 let r :(int ( int) ref = new (λx:int. x + d1e) in tricky r r C 7−→M C The first swap writes d0e to the same location that the second swap reads, which means that this program gets stuck trying to (A-δ) (s, c v) 7−→ δA (s, c, v) M apply an integer: (A-βv) (s, (λx:σ. e) v) 7−→ (s, e[v/x]) M ({` 7→ d1e}, (λx:int. x + d1e)(d0e d2e)) (A-LET) (s, let hx1, x2i = hv1, v2i in e) 7−→ (s, e[v2/x2][v1/x1]) For the strong update to be safe, it is necessary but not sufficient M to ensure that the two references passed to tricky are unaliased, (A-IF0) (s, if0d0e et ef ) 7−→ (s, et) M either to each other or elsewhere. A similar problem occurs if tricky (A-IFZ) (s, if0dze et ef ) 7−→ (s, ef ) z 6= 0 M is partially applied and the resulting function is then applied twice: (A-MOD) (s, f) 7−→ (s, v) 1 1 M let f :(int ( int) ref ( int = tricky (new (λx:int. x + d1e)) in module f : σu = v ∈ M f (new (λx:int. x + d2e)) + f (new (λx:int. x + d3e)) (s, e) 7−→ (s0, e0) When we partially apply tricky to some new reference `, it closes M over the reference. The first call to f then stores d0e at `, which the (A-CTXT) 0 0 (s, E[e]A ) 7−→ (s , E[e ]A ) second call to f reads out and tries to apply to d4e, thereby getting M stuck. Thus, it is not enough to avoid aliasing references. We must also avoid aliasing values, such as closures, that contain references. δA (s, c, v) = (s, v) 2.2 Static Semantics δA (s, new, v) = (s ]{` 7→ v}, `) (` fresh) M Expressions in A are typed by a judgment Γ `A e : σ, in figure 3, δA (s ]{` 7→ v1}, swap, h`, v2i) = (s ]{` 7→ v2}, h`, v1i) where Γ is an environment mapping variables to types. As in other substructural type systems, our type system pays special attention Figure 2. A operational semantics to managing the type environment. For example, in TA-APP, the rule for application, the environment Γ is split between the operator and operand rather than being wholly available to both. 2.1 Operational Semantics As in Girard’s (1987) linear logic, we do without the structural rule for contraction (duplication), but only for some types. Envi- Figure 2 presents the operational semantics of A . Evaluation ronment splitting is defined so that variables with affine type are contexts are wholly conventional as are stores, which map lo- available in only one subexpression or the other: cations to values. The reduction relation 7−→M is defined over configurations—pairs of a store and an expression—in the context Γ1  Γ2 = Γ3 Γ1  Γ2 = Γ3 a a a a of module declarations M. We say that a program M e evaluates Γ1 Γ2, x : σ = Γ3, x : σ Γ1, x : σ Γ2 = Γ3, x : σ ∗ 0 0   to value v if (∅, e) 7−→M (s , v) for some final store s . At run Unlimited variables, on the other hand, are subject to contraction, time, values may include locations (`). so they are available on both sides of the split: We use a function δA to interpret the two operations on ref- erences. The constant new allocates a new reference cell, and the Γ1  Γ2 = Γ3 u u u constant swap performs a update on a reference. Given a pair of Γ1, x : σ  Γ2, x : σ = Γ3, x : σ a reference ` and a value v , where the store maps ` to value 2 Of course, the empty environment splits into itself: v1, swap h`, v2i stores v2 at ` and returns the pair h`, v1i. As is frequently observed (Ahmed et al. 2004), freely reading from a •  • = • reference containing an affine value is not necessarily sound, as We implicitly allow exchange (reordering) of Γ. Our affine lan- that may result in aliasing the value. While writing is all right in guage, unlike a linear language, does implicitly permit weakening an affine system, we provide only swap, which can express both (discarding variables), since the base-case rules TA-CON, TA- reading and writing when it is safe to do so. VAR, and TA-MOD allow for ignoring an arbitrary Γ. This calculus supports strong updates to references. For exam- Of note are two rules for typing λ expressions, TA-LAMPROM ple, consider the function tricky from §1 written as an A module. 1 and TA-LAMAFF. The former applies only in cases where the free It takes two references to int ( int functions, reads them from the variables of the term do not include any affine (σa) types, and the references (replacing them with integers), and returns the composi- latter applies only in cases where the free variables do include affine tion of the functions applied to d2e: types; thus, at most one of these rules applies in any case. In the 1 ∞ 1 1 former case, it is safe to use the function an unlimited number of module tricky:(int ( int) ref ( (int ( int) ref ( int = 1 times, so we give it an unlimited (∞ ) type. In the latter case, the λr1:(int ( int) ref. ( 1 function closes over some affine values, so it should be used at most λr2:(int ( int) ref. 0 once, and we give it an affine ( 1 ) type. let hr1, x1i = swap hr1, d0ei in ( 0 The affine sublanguage also admits a simple form of let hr2, x2i = swap hr2, d1ei in x1 (x2 d2e) wherein an unlimited function may be used where a one-shot func- tion is expected. The reverse, as we have seen, must be avoided. We Because the types of the references change when they are up- 2 dated, a simple, fully-structural type system is insufficient to pre- We define let x : σ = e1 in e2 as an abbreviation for ((λx:σ. e2) e1).

3 2009/3/2 M Γ `A e : σ

(TA-SUBSUME) (TA-CON) (TA-VAR) (TA-MOD) M 0 Γ `A e : σ σ <: σ module f : σ = v ∈ M M 0 M M M Γ `A e : σ Γ `A c : tyA (c) Γ, x : σ `A x : σ Γ `A f : σ

(TA-LAMPROM) (TA-LAMAFF) M 0 a M 0 a Γ, x : σ `A e : σ @σ ∈ Γ(FV(λx:σ. e)) Γ, x : σ `A e : σ ∃σ ∈ Γ(FV(λx:σ. e)) M ∞ 0 M 1 0 Γ `A λx:σ. e : σ ( σ Γ `A λx:σ. e : σ ( σ

(TA-APP) (TA-IF0) M 0 ∗ M 0 M M M Γ1 `A e1 : σ ( σ Γ2 `A e2 : σ Γ1 `A e1 : int Γ2 `A e2 : σ Γ2 `A e3 : σ M M Γ1  Γ2 `A e1 e2 : σ Γ1  Γ2 `A if0 e1 e2 e3 : σ

(TA-PAIR) (TA-LET) M M M M Γ1 `A e1 : σ1 Γ2 `A e2 : σ2 Γ1 `A e1 : σx ⊗ σy Γ2, x : σx, y : σy `A e2 : σ M M Γ1  Γ2 `A he1, e2i : σ1 ⊗ σ2 Γ1  Γ2 `A let hx, yi = e1 in e2 : σ

tyA (c) = σ

int new ∞ ref swap ref ∞ ref tyA (dze) = tyA ( ) = σ ( σ tyA ( ) = σ1 ⊗ σ2 ( σ2 ⊗ σ1 ···

Figure 3. A static semantics (expressions)

σ <: σ ` P : σ (T-PROGA) M M (O-DERELICT) (O-REFL) (O-TRANS) ∀m ∈ M, ` m okay • `A e : σ σ1 <: σ2 σ2 <: σ3 ` M e : σ ∞ 1 σ1 ( σ2 <: σ1 ( σ2 σ <: σ σ1 <: σ3 `M m okay (O-PROD) (O-ARROW) (TM-A) 0 0 0 0 M u σ1 <: σ1 σ2 <: σ2 σ1 <: σ1 σ2 <: σ2 • `A v : σ 0 0 ∗ 0 ∗ 0 M u σ1 ⊗ σ2 <: σ1 ⊗ σ2 σ1 ( σ2 <: σ1 ( σ2 ` module f : σ = v okay

Figure 4. A static semantics (subtyping) Figure 5. A static semantics (programs and modules) define the subtyping relation (<:) in figure 4. The subtyping rela- variables x, y tion is reflexive and transitive, covariant in products and function module names f, g results, and contravariant in function arguments, as usual. Given modules m ::= module f : ττττ = v this subtyping relation, the subsumption rule TA-SUBSUME states expressions e ::= v | x | f | e e that any expression may be used at a more restrictive (higher) type. values v ::= λλλλx:ττττ. e | c To type a whole program (figure 5), we check that each mod- constants c ::= ddddzeeee | · · · ule’s value has the declared type; each module is checked in the types ττττ ::= int | ττττ → ττττ context of all the modules, which means that all modules must have u evaluation contexts E ::= [[[[]]]] | E e | v E an unlimited type σ . Then, the main expression must type in the C context of all the modules. configurations C ::= (s, e)

Figure 6. C syntax 3. The Non-Affine Sublanguage C Our calculus includes a second, non-affine sublanguage, which we call C (for “conventional”). Sublanguage C is a completely stan- 4. Mixing It Up dard simply-typed λ calculus augmented with modules as in A . Our goal in this work is to construct (type-safe) programs by mix- Figure 6 presents the syntax of types and terms, and the semantics ing modules written in an affine language and modules written in appear in appendix A. We typeset C terms in a bold serif font to a non-affine language, and to have them interoperate as seamlessly distinguish them visually from A . as possible. This might represent an affine program calling into a This language omits features such as references and if expres- library written in a legacy language, or calling into a library written sions that are inessential to the story we are telling. Such features in an affine language from a program written in a conventional would not be difficult to add. language, or both. In either case, we must ensure that the non-affine

4 2009/3/2 programs P ::= M e ` P : ττττ declarations M ::= • | M m (T-PROG) u M M modules m ::= m | m | interface f :> σ = g ∀m ∈ M, ` m okay • `C e : ττττ g A expressions e ::= · · · | f ` M e : ττττ C expressions e ::= · · · | fg o opaque A types σ ::= σ ref | σ ⊗ σ M o ` m okay C types τ ::= · · · | σ aff (TM-I) (module g :(σu)C = v) ∈ M Figure 7. Mixed syntax `M (interface f :> σu = g) okay (σ)C = ττττ (int)C = int M Γ `C e : ττττ (TC-MODA) ∗ C C C (module f : σu = v) ∈ M (σ1 ( σ2) = (σ1) → (σ2) M g u o C o C (σ ) = σ aff Γ `C f :(σ )

M Γ `A e : σ A (ττττ) = σ (int)A = int (TA-MODC) (TA-MODI) A A ∞ A u 0 (ττττ 1 → ττττ 2) = (ττττ 1) ( (ττττ 2) (module f : ττττ = v) ∈ M (interface f :> σ = f ) ∈ M o A o M g A M g u (σ aff) = σ Γ `A f :(ττττ) Γ `A f : σ

Figure 8. Mixed static semantics (type conversions) Figure 9. Mixed static semantics (type judgments) portions of the program cannot break the affine portions’ invariants. an integer, returns a one-shot function. By the conversion rule, a C We accomplish this via run-time checks in the style of higher-order expression that refers to g will see it with C type int → int → contracts (Findler and Felleisen 2002). int, which means that C ’s type system cannot enforce that the The additional syntax for mixed programs is in figure 7. The functional result of partially applying g be used at most once. We main expression in a mixed program is in the C language, though will need to insert a dynamic check. this decision is arbitrary, and A would work just as well. Modules For C modules used in A expressions, we use rule TA-MODC. now include A modules, C modules, and interface modules, which Given a C type ττττ, any A type σ such that (σ)C = ττττ may be are used to ascribe an A type to a C module. We discuss interface a reasonable view from A . The type conversion function used by modules in more detail in §4.1. rule TA-MODC, (·)A , is a conservative choice: We add a production to each language’s expressions allowing it g ∞ to refer to modules from the other language. We decorate each such • If f : int → int in C , then f : int ( int is the right type in A . module name with the name of the module in which it statically There is no reason to limit f to a one-shot function type because appears (e.g., f g for a reference to C module f from A module subsumption means we can use it where a one-shot function is g) and use this name as the negative party in contracts regulating allowed if necessary. g ∞ ∞ the inter-language interaction. This is not necessary for soundness, • If f :(int → int) → int in C , then f :(int ( int) ( int will but we use it to assign blame. In an implementation this annotation allow the imported function to be passed unlimited functions would not appear in source programs, as it is easily computed by a but not one-shot functions. This is a safe choice, because C ’s compiler. type system cannot tell whether it is safe to pass a one-shot We define a new syntactic class of types for the A language: function to f. Opaque types (σo) do not translate into existing types in the C language. Rather, values of opaque type appear as opaque values In the latter case, where f :(int → int) → int in C , what if they flow into C , and all C may do is shuffle them around or if the programmer knows, perhaps by inspecting its code, that return them back into A . To C we add a new type constructor aff function f applies its argument at most once? For example, it will that is used to refer to opaque A types in C . not cause a problem to pass a one-shot function to the C function λλλλx:int → int. x dddd0eeee, but A ’s subtyping cannot safely allow this in general. 4.1 Static Semantics To this end, we introduce interface modules (figure 7) to allow The type system for the mixed calculus is the union of the type explicit assertions that a C value conforms to a particular A type. systems for A and C , along with additional typing rules that A module interface f :> σu = g means that f is defined to describe how to type A module invocations in C expressions and have the same value as g, but additionally promises to behave as u C module invocations in A expressions. if it had A type σ . By rule TA-MODI, A expressions then see Rule TC-MODA (figure 9) is used to type A modules used in invocations of the f at its asserted type σu. We do not take its C expressions. The rule uses a function (·)C , defined in figure 8, word for it, however. Statically, rule TM-I checks that the declared to convert the A module’s type into the C type for the expression. A type converts to the actual C type. This ensures that we get Intuitively, A types are richer than C types; in particular, A has integers where integers are expected and functions where functions two constructors for function types whereas C has only one. As a are expected. This of course does not suffice—we also need to consequence, for any A type σ, the only reasonable view from C insert a run-time check that the C function actually behaves as the is (σ)C . interface claims. For example, suppose that some A module g has the type In an indication that A is in some sense stronger than C , we ∞ 1 int ( int ( int; that is, g is an unlimited-use function that, given observe that C types embedded into A return faithfully to C :

5 2009/3/2 A expressions e ::= · · · | σ AC (e) C expressions e ::= · · · | CA σ(e) | blame f f f f f C values v ::= · · · | CA[`]σ(v) f f A evaluation contexts E ::= · · · | σ AC (E) C evaluation contexts E ::= · · · | CA σ(E) f f f f

stores s ::= ∅ | s ]{` 7→ v} | s ]{` 7→ v} answers a ::= (s, v) | (s, E[[[[ blame f]]]]C )

C 7−→M C (s, e) 7−→ (s0, e0) 0 M (A-CTXT ) 0 0 (s, E[e]A ) 7−→ (s , E[e ]A ) M g (ττττ)A (A-MODC) (s, f ) 7−→ (s, AC (f)) module f : ττττ = v ∈ M M g f g σu ` 0´ u 0 (A-MODI) (s, f ) 7−→ (s, AC f ) interface f :> σ = f ∈ M M g f int (A-INT) (s, AC (ddddzeeee)) 7−→ (s, dze) f g M ∗ „ « σ1 σ2 σ2 σ1 (A-η) (s, ( AC (v)) 7−→ (s, λx:σ1. AC v CA (x) ) x fresh f g M f g g f „ « σo0 σo (A-OPEN) (s ]{` 7→ BLSSD}, AC CA[`] (v) ) 7−→ (s ]{` 7→ DFNCT}, v) g0 f 0 f g M „ « σo0 σo σo0 (A-REOPEN) (s, AC CA[`] (v) ) 7−→ (s, AC (blame f)) {` 7→ BLSSD} s g0 f 0 f g M g0 f 0 *

C 7−→M C g σ (C-MODA) (s, f ) 7−→ (s, CA (f)) module f : σ = v ∈ M M g f int (C-INT) (s, CA (dze)) 7−→ (s, ddddzeeee) f g M ∞ „ « σ1 σ2 C σ2 σ1 (C-η) (s, CA ( (v)) 7−→ (s, λλλλx:(σ1) . CA v AC (x) ) x fresh f g M f g g f σa σa (C-BLESS) (s, CA (v)) 7−→ (s ]{` 7→ BLSSD}, CA[`] (v)) ` fresh f g M f g 1 „ « σ1 σ2 σ2 σ1 (C-SPLIT) (s ]{` 7→ BLSSD}, CA[`] ( (v1) v2) 7−→ (s ]{` 7→ DFNCT}, CA v1 AC (v2) ) f g M f g g f 1 σ1 σ2 (C-RESPLIT) (s, CA[`] ( (v1) v2) 7−→ (s, blame f) {` 7→ BLSSD} s f g M *

Figure 10. Mixed operational semantics

Lemma 4.1 (Type Conversion). For any A type ττττ, ((ττττ)A )C = ττττ. represents a contract between the two modules that gave rise to C the nested expression. Some contracts, for example int, are fully The other direction does not hold, since (·) is not injective. 1 enforced by the type systems. Other contracts, such as int ( int, 4.2 Operational Semantics require dynamic checks. The type system statically ensures that the function receives and returns only integers, but this type also To express the intermediate states reached by our reduction se- imposes an obligation on the caller to apply the function at most mantics, we extend the syntax of our language to include several once. This is not checked statically in the C language. new forms (figure 10). This is required because, while our program syntax segregates the two sublanguages into separate modules, a The right subscript of a boundary—a module name in the inner module invocation should reduce to the body of the module, which language—is the positive party: It promises that if the enclosed leads expressions of both sublanguages to nest arbitrarily at run subexpression reduces to a value, then the value will obey the time. Rather than simply allow A expressions to appear directly contract σ. The left subscript is the negative party, which promises in C expressions, and vice versa, we need a way to cordon off to treat the resulting value properly. In particular, if the contract is expressions in one sublanguage embedded in the other, and to make affine, then the negative party promises to use the resulting value sure the interaction between them is well-behaved. We call these at most once. If an affine type appears in a negative portion of the new expression forms boundaries. Whereas Matthews and Findler contract, then this imposes an obligation on the positive party. (2007) allow nested expressions in source programs and avoid modules, Tobin-Hochstadt and Felleisen (2006) segregate their two Inter-language boundaries first arise when a module in one languages by module as we do, and boundaries arise only at run language refers to a module in the other language. When the name time. While this is not necessary for soundness in their system, it is of a C module appears in an A term, A-MODC wraps the module necessary in ours. name with an AC boundary, with the A -conversion of the module’s The new run-time syntax includes both boundary expressions ττττ type as the contract. The (outer) A module is the negative party σ f ACg(e) for embedding C expressions in A and boundary ex- and the (inner) C module is the positive party. When the name σ pressions f CAg (e) for embedding A expressions in C . Each of of an interface module appears, the contract is as declared by the these forms has a superscript σ, written on the A side, which interface, and the name of the interface is the positive party (A-

6 2009/3/2 MODI). From the other direction, an A module name in a C 4.3 Examples expression is wrapped by a CA boundary by rule C-MODA. A Type-Correct, Blame-Free Program. The following small pro- We add evaluation contexts which allow reduction under inter- gram consists of two A modules and a main expression (in C ) that language boundaries. This structure means that it is now possible σ invokes one of them: to construct a C evaluation context with an A hole: f CAg ([ ]A ). 0 1 ∞ 1 Therefore, we add the rule A-CTXT for reducing an A subexpres- module ap :(int int) int int = 1 ( ( ( sion in a C configuration. λf: int ( int. Eventually, the expression under a boundary may reduce to a λx: int. value. If that value is an integer, it passes through the boundary f x ∞ transparently (A-INT and C-INT). If it is a C function in an AC module inc : int ( int = boundary, then we η expand it (rule A-η) in the standard fashion λy: int. ap (λz: int. z + d1e) y for higher-order contracts (Findler and Felleisen 2002). Likewise, incmain dddd5eeee an unlimited A function may be applied multiple times by the C Function ap takes a one-shot function and an integer and applies language, so it also η expands (C-η). Affine values, however, must the function to the integer; this types in . By rule TA-LAMPROM, be treated specially. A σ ap is given an unlimited function type, since it, like all well-typed The new value form CA[`] (v) is called a sealed boundary. f g module bodies, has no free variables. The inner function, on the When the contract on a CA boundary specifies a reference or a one- other hand, closes over the affine argument f, which is why the shot function, we must treat the resulting value gingerly. Rather function returned by ap is given an affine type by TA-LAMAFF. than convert or η expand it, rule C-BLESS blesses the boundary Function inc takes an int, and then applies ap to an unlimited by sealing it with a location, which is used to track whether the ∞ int int function and the given integer. By TA-SUBSUME, the wrapped value has been used. We assume two distinguished values ( int ∞ int function may be used at the stronger affine type int 1 int, for seals: BLSSD and DFNCT. These values are merely tags, so ( ( so inc types as well. all that matters is that the reduction relation can distinguish them. Finally, the main expression (given party name main), applies Initially the location contains the value BLSSD, which indicates inc to an integer. a value that has yet to be used. This blessed value is a regular This program produces the value dddd6eeee. During reduction, the C value, and may be freely aliased. At some, point, however, a integer argument to inc passes transparently through an AC bound- dynamic check will be required: ary, the two A modules compute the result entirely within A , and finally the result passes back through a CA boundary in the last • For a wrapped one-shot function, the check occurs when an reduction step. attempt is made to apply the function. An Ill-Typed Module. Consider now a slight modification to If the seal location on the wrapper still contains BLSSD, then A module inc: the function has not yet been applied. Rule C-SPLIT then ∞ performs the application. It updates the location to contain module inc2 : int ( int = the value DFNCT, rendering any other blessed boundaries λy: int. 1 sealed by the same location defunct. let g: int ( int = ap (λz: int. z + d1e) in If the seal location contains DFNCT, that means that the g (g y) wrapped function has already been applied once, and this Unlike the previous case, the new program does not type, because boundary is known to be defunct. Rule C-RESPLIT signals A cannot type inc2. In particular, because the affine variable g is an error, blaming the negative party, which is responsible applied twice, we have a type error: There is no type σ such that for protecting the one-shot function assigned to its care. the judgment • For wrapped opaque values (references and pairs), cannot do M C y : int, g : int 1 int ` g (g y): σ anything interesting with them but hand them back to A . When ( A an opaque CA value appears directly in an AC boundary, the is provable in A , because there is no way to split the type environ- boundaries should cancel out and the value appear unwrapped ment in order to type g in both subexpressions of the application in A —but only once! expression. If the value has not been unwrapped before, as indicated by A Blameworthy C Module. On the other hand, if we rewrite inc2 its seal, rule A-OPEN unwraps the value and updates the in C , the type system has no problem with it: seal. module ap :(int 1 int) ∞ int 1 int = Other copies of the value may remain sealed on the side, 1 ( ( ( C λf: int int. but this is not a problem, as they are marked defunct. Rule ( λx: int. A-REOPEN signals an error rather than allow such a value f x to become unwrapped a second time. module inc2 : int → int = λλλλy: int. The new C expression blame f is used to signal that party f let g: int → int = apinc2 (λλλλz: int. z + dddd1eeee) in has violated a contract. An answer configuration (a) is a store g (g y) paired with either a value or an evaluation context with a blame inc2 dddd5eeee expression in the hole. Our type soundness theorem (§5) says that a well-typed program diverges or reaches an answer. Syntactically, This program types because we are only able to assign blame to modules, which means that, 1 ∞ 1 C ((int int) int int)C = (int → int) → int → int, as a trivial corollary to our soundness theorem, provided that we ( ( ( assign blame reasonably, only C modules can fail to live up to their which is the type that C sees for ap. While C ’s type system cannot contractual obligations. Intuitively, contracts in our calculus are A detect the problematic reuse of g, the second application of g is types, which are enforced completely by A ’s type system. prevented at run time by the contract on ap, which assigns blame

7 2009/3/2 (∅, inc2 dddd5eeee) 7−→M (C-MOD) inc2 (∅, (λλλλy: int. let g: int → int = ap (λλλλz:int. z + dddd1eeee) in g (g y))dddd5eeee) 7−→M (C-βv) inc2 (∅, let g: int → int = ap (λλλλz:int. z + dddd1eeee) in g (g dddd5eeee)) 7−→M (C-MODA) 1 ∞ 1 (int(int)(int(int (∅, let g: int → int = inc2CAap (ap)(λλλλz:int. z + dddd1eeee) in g (g dddd5eeee)) 7−→M (A-MOD) 1 ∞ 1 (int(int)(int(int 1 (∅, let g: int → int = inc2CAap (λf: int ( int. λx: int. f x)(λλλλz:int. z + dddd1eeee) in g (g dddd5eeee)) 7−→M (C-η) 1 1 int int“ 1 int int ” (∅, let g: int → int = (λλλλy: int → int. inc2CAap( (λf: int ( int. λx: int. f x) ap( ACinc2(y) )(λλλλz:int. z + dddd1eeee) in g (g dddd5eeee)) 7−→M (C-βv) 1 1 int int“ 1 int int ” (∅, let g: int → int = inc2CAap( (λf: int ( int. λx: int. f x) ap( ACinc2(λλλλz:int. z + dddd1eeee) in g (g dddd5eeee)) 7−→M (A-η) 1 int int` 1 int ` int ´ ´ (∅, let g: int → int = inc2CAap( (λf: int ( int. λx: int. f x)(λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) in g (g dddd5eeee)) 7−→M (A-βv) 1 int int` int ` int ´ ´ (∅, let g: int → int = inc2CAap( λx: int. (λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) x in g (g dddd5eeee)) 7−→M (C-BLESS) 1 int int` int ` int ´ ´ ({` 7→ BLSSD}, let g: int → int = inc2CA[`]ap( λx: int. (λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) x in g (g dddd5eeee)) 7−→M (C-βv) 1 ({` 7→ BLSSD}, CA[`]int(int`λx: int. (λw: int. int AC `(λλλλz:int. z + dddd1eeee) CAint (w)´) x´) inc2 ap 1 ap inc2 inc2 ap int int` int ` int ´ ´ (inc2CA[`]ap( λx: int. (λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) x dddd5eeee) 7−→M (C-SPLIT) 1 int int` int ` int ´ ´ ({` 7→ DFNCT}, inc2CA[`]ap( λx: int. (λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) x ) int ` int ` int ´ int ´ (inc2CAap (λx: int. (λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) x) apACinc2(dddd5eeee) ) 7−→M (A-INT) 1 int int` int ` int ´ ´ ({` 7→ DFNCT}, inc2CA[`]ap( λx: int. (λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) x ) int ` int ` int ´ ´ (inc2CAap (λx: int. (λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) x) d5e ) 7−→M (A-βv) 1 int int` int ` int ´ ´ ({` 7→ DFNCT}, inc2CA[`]ap( λx: int. (λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) x ) int ` int ` int ´ ´ (inc2CAap (λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) d5e ) 7−→M (A-βv) 1 int int` int ` int ´ ´ ({` 7→ DFNCT}, inc2CA[`]ap( λx: int. (λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) x ) int `int ` int ´´ (inc2CAap apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(d5e) ) 7−→M (C-INT) 1 int int` int ` int ´ ´ ({` 7→ DFNCT}, inc2CA[`]ap( λx: int. (λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) x ) int `int ´ (inc2CAap apACinc2((λλλλz:int. z + dddd1eeee) dddd5eeee) ) 7−→M (C-βv) 1 int int` int ` int ´ ´ ({` 7→ DFNCT}, inc2CA[`]ap( λx: int. (λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) x ) int `int ´ (inc2CAap apACinc2(dddd5eeee + dddd1eeee) ) 7−→M (C-δ) 1 int int` int ` int ´ ´ int `int ´ ({` 7→ DFNCT}, inc2CA[`]ap( λx: int. (λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) x (inc2CAap apACinc2(dddd6eeee) )) 7−→M (A-INT) 1 int int` int ` int ´ ´ int ({` 7→ DFNCT}, inc2CA[`]ap( λx: int. (λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) x (inc2CAap(d6e))) 7−→M (C-INT) 1 int int` int ` int ´ ´ ({` 7→ DFNCT}, inc2CA[`]ap( λx: int. (λw: int. apACinc2 (λλλλz:int. z + dddd1eeee) inc2CAap(w) ) x dddd6eeee) 7−→M (C-RESPLIT) ({` 7→ DFNCT}, blame inc2)

Figure 11. Mixed language reduction sequence to the negative party, inc2. See figure 11 for an example reduction λλλλy: int. sequence for this program. let g: int → int = apinc (λλλλz: int. z + dddd1eeee) in let h: int → int = apinc (λλλλz: int. z + dddd1eeee) in The C Module, Corrected. Our run of the program revealed a bug h (g y) in inc2, so we know that we need to repair it: Now the program runs without attempting to apply a one-shot module inc2 : int → int = function twice, and thus it computes dddd7eeee with no trouble.

8 2009/3/2 A Call from A into C . We have seen how C code can call into either (s, e) diverges, or it reduces to an answer of the same type A code, but the reverse is also true: as M e. In order to prove a Wright-Felleisen–style type soundness the- module ap :(int → int) → int → int = orem (1994), it is necessary to identify precisely what property is λλλλf: int → int. preserved in a preservation lemma. Because we reduce configura- λλλλx: int. tions rather than programs, we need a type judgment for configura- f x tions, and because our expressions now contain locations, we need module inc : int ∞ int = ( a store environment (Σ) mapping locations to types. Augmenting λy: int. apinc (λz: int. z + d1e) y main our existing expression type rules with a store environment is not inc dddd5eeee enough, however. The expression rules for C , naively extended, This program types, and in particular, inc types, because the func- are too weak to express the property that we need. Consider, for ∞ example, a hypothetical typing judgment: tion it passes to ap has unlimited type int ( int rather than an affine type. 1 1 Σ; h : int ( int ( int An Ill-Typed A -to-C Call. However, if we explicitly give that 1 „ 1 « M int int int int function an affine type (by subsumption), then A ’s type system `C CA ( (h d2e) CA ( (h d3e) dddd4eeee : int. rejects the module: f g f g If we use TC-APP, which duplicates the type environment to both module inc : int ∞ int = ( subterms, there might be a derivation of this judgment. We do not λy: int. want this to be the case. The term being typed above is liable to let g: int 1 int = λz: int. z + d1e in ( go wrong, as both applications under CA boundaries will evaluate inc g y ap before any contract has an opportunity to say otherwise. Such The problem above is that A sees ap at type a configuration is not reachable from any well-typed program, A ∞ ∞ ∞ so preservation requires type judgments for run-time terms that ((int → int) → int → int) = (int ( int) ( int ( int, demonstrate that fact. and A does not allow passing a one-shot function where an unlim- To this end, our soundness theorem relies on an auxiliary type ited function is required. system for configurations with mixed terms. The auxiliary type M system has judgments of the form Σ; Γ .C e : ττττ, and it is careful An Interface Module Intervenes. This is where interfaces are use- about environment splitting even in rules for C terms: ful, since they allow us to assert that ap actually conforms to the (RTC-APP) desired A type: M 0 M Σ1;Γ1 .C e1 : ττττ → ττττ Σ2;Γ2 .C e1 : ττττ module ap :(int → int) → int → int = M λλλλf: int → int. Σ1  Σ2;Γ1  Γ2 .C e1 e2 : ττττ λλλλx: int. We emphasize that this auxiliary type system is merely a proof f x technique that generalizes our type system enough to state our 1 ∞ 1 interface iap :> (int int) int int = ap preservation invariant. In no way does it require that typing C ∞ ( ( ( module inc : int ( int = code in source programs be aware of context splitting and other λy: int. substructural trivia. It does, however, indicate why programs in our 1 let g: int ( int = λz: int. z + d1e in language require language boundaries between modules rather than iapinc g y arbitrarily nested expressions as in Matthews and Findler (2007): incmain dddd5eeee Arbitrary nesting means that environments with bindings for A may need to be managed by the type system for C , which forces C When type checking inc, we provisionally believe the contract to become as complex as A . Selected rules from the auxiliary type applied to iap, so its use type checks. In this case, ap indeed system may be found in appendix B. behaves as promised. If, however, we modify ap to apply f twice, Additional machinery is required to deal with blessed bound- then blame accrues to iap at run time. aries, which allow locations to be duplicated temporarily and scat- tered through a C term. 5. Type Soundness Our full set of run-time type rules in hand, we prove a lemma The presence of strong updates in our calculus makes it easy to that well-typed programs produce well-typed configurations: write programs, such as the example with tricky in §2.1, that get Lemma 5.1 (Programs to Configurations). stuck, in the sense that they reach a state that is not an answer • M M configuration but cannot take a step. Our soundness criterion is that If Γ `C e : ττττ then •; Γ .C e : ττττ. no such program can be assigned a type. • If Γ `M e : σ then •; Γ .M e : σ. 3 A A Some similar linear systems, such as L , use a semantic notion • If ` M e : ττττ then .M ( , e): ττττ. of soundness that involves the interpretation of types as sets of pairs ∅ of store fragments and values that have exclusive access to those Configurations now enjoy standard progress and preservation fragments (Ahmed et al. 2004). In their successor language λURAL, lemmas under the run-time type judgments, which allows us to state Ahmed et al. (2005) instrument values in order to detect when and prove a syntactic type soundness theorem. relevant and linear values would be implicitly discarded. Since our Definition 5.2 (Divergence). A configuration C diverges in M if affine language does not purport to protect against discarding, that 0 ∗ 0 00 for all C such that C 7−→M C , there exists a C such that kind of machinery is unnecessary here. Because strong updates are C0 7−→ C00. sufficiently capable of leading to stuck configurations, we are sat- M isfied with a syntactic soundness theorem that well-typed programs Theorem 5.3 (Type Soundness). If ` M e : ττττ, then either (s, e) ∗ do not reach such configurations. Our syntactic soundness theorem, diverges in M, or there is some answer a such that (s, e) 7−→M a formalized below, says that if a program M e type checks, then and .M a : ττττ.

9 2009/3/2 int 6. Discussion and Future Work Cf x g = x J K The work presented here is part of an ongoing program to inves- ∞ σ1(σ2 C σ1 σ2 tigate practical aspects of substructural type systems. Our answers Cf x g = λλλλy:(σ1) . let z = x Ag y f in Cf z g J K J K J K naturally raise more questions. 1 σ1(σ2 Cf x g = let u = new dddd0eeee in J K Exceptions. In a production language with a contract system, C contract violations should not always terminate the program. Real λλλλy:(σ1) . programs may catch an exception and either try to mitigate the if0 (!u) condition that caused it, try something easier instead, or report σ1 σ2 (u := dddd1eeee; let z = x Ag y f in Cf z g ) an error and go on with some other task. To ensure soundness, it J K J K suffices to prevent the questionable actions from occurring. (blame f) On one hand, we believe that ML-style exceptions should not σo provide too much difficulty in an affine setting. We expect that Cf x g = let u = new dddd0eeee in J K handle expressions should be multiplicative, in the sense that the λλλλy : unit. type environment needs to be split between an expression and its exception handler, not given in whole to both. if0 (!u) On the other hand, we do not know how exceptions or any (u := dddd1eeee; x) sort of blame might work in a linear setting—this is one reason why we chose an affine rather than linear calculus. Terminating (blame f) the program is dodgy because of the implicit discarding of linear x int = x values, but catching an exception once values on the stack have Ag f J K been lost seems even worse. Exceptions in linear languages remain ∗ x σ1(σ2 = λy:σ . let z = x y σ1 in z σ2 a very open question. Ag f 1 Cf g Ag f J K J K J K σo Linearity. Our work emphasizes contract-based interaction with Ag x f = x hi affine type systems rather than linear type systems because it J K remains unclear to us what linear contracts ought to mean. We Figure 12. Sketch of a type-directed translation may want a conventional language to interoperate with a language that (at least sometimes) prohibits discarding values. However, as Ahmed et al. (2005) point out, detecting illicit discarding may Implementation. We believe that our technique admits a reason- require special instrumentation. Unlike affine guarantees, which able implementation strategy, given an existing language imple- correspond to safety properties, relevance guarantees—that a value mentation standing in for C and a similar but affine language for is used at some point in the future—are a form of liveness property A . Significantly, the C implementation need not be modified, and (Alpern and Schneider 1987), a violation of which may result only A does not require access to the sources of C modules in a mixed in long-term resource leaks. program. We require that C support references in order to track Conventional contracts regulate only sins of commission: when blessed versus defunct values, and that it have some mechanism for a party agrees to a conventional contract, it essentially agrees not signaling blame. to do something, such as not passing a non-integer to a particular We type check the whole program in the mixed type system, function, or not returning anything other than a string.3 If the and then translate the A modules to C , renaming and wrapping so party ever does what it agreed not to do, that constitutes a contact that A modules access each other directly, but only communicate violation. Likewise, for affine contracts, a party agrees not to use with C modules through wrappers implementing the necessary some value more than once; if it ever does, then a violation is contracts. Finally, we compile the whole program with an existing reported. When exactly should we report a violation when a party C compiler. takes on the positive obligation to do something? Matthews and Findler (2007) show how to implement the con- Perhaps the most conservative answer is this: when a party tracts in their multi-language system as checks and error signal- assumes responsibility to use a value, the party is obliged not to ing written in their object language. For us, the key is replacing allow the program to terminate before using the value. This may the C-BLESS reduction rule with an appropriate translation for seem rather pointless: The program is about to exit, but we detect a boundaries. A sketch of a potential type-directed translation, which contract violation and thus terminate the program instead! While it assumes several additional but uncontroversial features in C , may be found in figure 12. When an A module g with type σ appears is too late to prevent a resource leak, it is not too late to file a bug σ report. in a C module f, we translate it as Cf g g . Likewise, when a C module f having A -type σ appears in anJ AK module g, we translate With such an interpretation of linear contracts, we could rea- σ sonably consider the contract to be violated if at any point we can it as Ag f f . determine that the contract necessarily will be violated. Detecting We conjectureJ K that a translated program faithfully simulates the the violation of such a liveness property is undecidable for any original program. nontrivial language, but tracing garbage collection may be used to Affine Contracts without Affine Types. The dynamic properties detect violations of linearity approximately and somewhat late. In enforced by affine contracts may be useful even in a language an idealized semantics, we might garbage collect the store after that lacks affine types. Affine contracts can provide a conventional each reduction step and signal a violation if the seal location of a typed language or an untyped language a way to specify formally not-yet-used linear value has become unreachable. In a real imple- and then check resource usage invariants. We believe an implemen- mentation, finalizers on linear values could detect sins of omission. tation along the lines of the above sketch could easily be added to Even if we detect a violation early, however, it is not clear that we some existing contract system such as PLT Scheme’s (Findler and can do anything to prevent it. Felleisen 2002; Flatt 2007). Our calculus can express affine contracts directly between C 3 For conventional contracts, non-termination is an acceptable outcome. modules by a simple translation. Assume we have a C module,

10 2009/3/2 ∞ module f : ττττ = v, to which we wish to add the contract σ1 σ2, A. Semantics of C ∞ C ( where (σ1 ( σ2) = ττττ. We simply rename f and wrap it with an interface having the contract, and wrap that in turn with an A C 7−→M C module having the original name: (C-βv) (s, (λλλλx:ττττ. e) v) 7−→ (s, e[v/x]) M module fu : ττττ = v ∞ (C-MOD) (s, f) 7−→ (s, v) module fi :> σ1 ( σ2 = fu M f module f : σ1 → σ2 = λx:σ1. fi x module f : ττττ = v ∈ M 0 The A module is necessary here only because our system does (s, e) 7−→ (s, e ) M not place contract boundaries between same-language modules; (C-CTXT) (s, E[[[[e]]]] ) 7−→ (s, E[[[[e0]]]] ) it is not difficult to lift this restriction and provide a more direct C M C implementation of affine contracts in a conventional language. Dynamic Promotion Several linear type systems offer some way `M (module f : ττττ = v) okay for linear values to become temporarily unlimited. In L3, rather roughly, reference cells may be frozen, which makes them aliasable (TM-C) but allows only weak updates; provided a suitable witness that M • `C v : ττττ a frozen cell is not aliased, it may be thawed to provide strong M updates again (Ahmed et al. 2004).4 Wadler (1990) introduces ` (module f : ττττ = v) okay let!, which temporarily treats a single-threaded, mutable value as M unlimited but read-only. Γ `C e : ττττ Affine contracts point to another way to loosen the duplication restriction. We allow one-shot functions to be provisionally pro- (TC-MOD) (TC-INT) moted to unlimited functions: module f : ττττ = v ∈ M M 1 M M Γ `A e : σ1 ( σ2 Γ `C f : ττττ Γ `C ddddzeeee : int M ∞ Γ `A promote e : σ1 σ2 ( (TC-VAR) (TC-LAM) At run time, promote must wrap its argument in a contract that M 0 Γ, x : ττττ `C e : ττττ ensures it is actually applied at most once. Γ, x : ττττ `M x : ττττ Γ `M λλλλx:ττττ. e : ττττ → ττττ 0 In fact, promote (for a given σ1 and σ2) is definable in our C C calculus: (TC-APP) 0 1 C ∞ C M 0 M 0 module promote :(σ1 σ2) → (σ1 σ2) ( ( Γ `C e1 : ττττ → ττττ Γ `C e2 : ττττ 1 C = λλλλx:(σ1 ( σ2) . x M 1 ∞ ∞ Γ `C e1 e2 : ττττ module promote :> (σ1 ( σ2) ( (σ1 ( σ2) = promote0 B. The Auxiliary Type System 7. Conclusion As explained in §5, we use an auxiliary type system for proving This paper describes one step in our research program to study the soundness theorem. We present selected portions of that system substructural type systems for practical programming. Here, we here. have focused on the problem of interaction between substructural A store type maps locations to C types, A types, and sealed A and non-substructural code, each governed by its own type system. types, which are used to type locations that appear under blessed Our solution synthesizes two previously disparate technologies, boundaries. 0 substructural type systems and higher-order contracts. We have store types Σ ::= • | Σ, ` : ττττ | Σ, ` : σ | Σ, ` :[σ]` illustrated our solution through a calculus that combines modules written in an affine-typed language and modules written in a con- Environment splitting (Γ  Γ = Γ) is extended to treat C types ventionally typed language. The two languages call one another and as duplicable, and store type splitting (Σ  Σ = Σ) is defined to exchange values freely, using contracts to prevent the conventional duplicate C types and sealed A types but not ordinary A types. language from breaking the affine language’s stronger invariants. A To type the store, we type each element in it. Locations holding soundness theorem validates the correctness of our approach. A values may be given either sealed or ordinary A types. Our work suggests that adding the ability to write substructural libraries in a conventional programming language such as ML, Σ .M s :Σ Haskell, or Scheme does not require a particularly complicated implementation, and our results yield a realistic contract-based (TH-EMPTY) (TH-CLOC) M 0 M implementation design. Σ1 . s :Σ Σ2; • .C v : ττττ M M 0 Σ . • : • Σ1 Σ2 . (s, ` 7→ v) : (Σ , ` : ττττ) Acknowledgments  We greatly appreciate Sam Tobin-Hochstadt’s help in thinking (TH-ALOC) M 0 M about contracts and inter-language systems. We wish to thank Σ1 . s :Σ Σ2; • .A v : σ Daniel Brown, Ryan Culpepper, Jed Davis, Alec Heller, and Aaron M 0 Σ1 Σ2 . (s, ` 7→ v) : (Σ , ` : σ) Turon for their helpful comments and corrections.  (TH-ALOCPROT) 4 3 In reality, L distinguishes reference cells from capabilities to access Σ .M s :Σ0 Σ ; • .M v : σ those cells. Only the capabilities are linear, but our characterization applies 1 2 A 0 if we treat cells and their capabilities as a single entity. M 0 ` Σ1  Σ2 . (s, ` 7→ v) : (Σ , ` :[σ] )

11 2009/3/2 To type a configuration, all the modules must be okay; then we type A. Ahmed, M. Fluet, and G. Morrisett. A step-indexed model of the store, and whatever part of the store type is not used in typing substructural state. In Proc. 10th ACM SIGPLAN International the store itself (as locations may contain other locations), we use in Conference on Functional Programming (ICFP’05), pages 78–91. typing the main expression: ACM Press, 2005. B. Alpern and F. B. Schneider. Recognizing safety and liveness. .M C : ττττ Distributed Computing, 2:117–126, 1987. E. Barendsen and S. Smetsers. Uniqueness typing for functional languages (T-CONF) with graph rewriting semantics. Mathematical Structures in Computer ∀m ∈ M. `M m okay Science, 6(6):579–612, 1996. M M P. N. Benton. A mixed linear and non-linear logic: Proofs, terms and Σ1 . s :Σ1  Σ2 Σ2; • .C e : ττττ M models. In Proc. Computer Science Logic (CSL’94), number 933 in . (s, e): ττττ Lecture Notes in Computer Science, pages 121–135. Springer-Verlag, 1995. M M Σ; Γ .A e : σ and Σ; Γ .C e : ττττ R. B. Findler and M. Felleisen. Contracts for higher-order functions. In Proc. 7th ACM SIGPLAN International Conference on Functional (RTA-LOC) (RTC-BLAME) Programming (ICFP’02), pages 48–59. ACM Press, 2002.

M M C. Flanagan. Hybrid type checking. In Proc. 33rd Annual ACM Σ, ` : σ;Γ .A ` : σ ref Σ; Γ .C blame f : ττττ Symposium on Principles of Programming Languages (POPL’06), volume 41, pages 245–256. ACM Press, 2006. (RTA-BOUNDARY) (RTC-BOUNDARY) M C M M. Flatt. PLT MzScheme: Language manual. Reference Manual Σ; Γ .C e :(σ) Σ; Γ .A e : σ PLT-TR2007-1-v372, PLT Scheme Inc., December 2007. M σ M σ C Σ; Γ .A AC (e): σ Σ; Γ .C CA (e):(σ) S. J. Gay and M. J. Hole. Types and subtypes for client-server interactions. f g f g In Proc. 8th European Symposium on Programming (ESOP’99), volume 1576 of Lecture Notes in Computer Science, pages 74–90. (RTC-BLESSED) Springer-Verlag, 1999. 0 M a 0 ` Σ ; • .A v : σ [Σ ] = Σ J.-Y. Girard. Linear logic. Theoretical Computer Science, 50:1–102, 1987. a M σ a C T. Jim, G. Morrisett, D. Grossman, M. Hicks, J. Cheney, and Y. Wang. Σ, ` : tyC (BLSSD); Γ .C CA[`] (v):(σ ) f g Cyclone: A safe dialect of C. In Proc. USENIX Annual Technical Conference, pages 275–288, 2002. (RTC-DEFUNCT) 0 ` J. Matthews and R. B. Findler. Operational semantics for multi-language [Σ ] = Σ programs. In Proc. 34th Annual ACM Symposium on Principles of M σa a C Programming Languages (POPL’07), volume 42, pages 3–10. ACM Σ, ` : ty (DFNCT); Γ .C CA[`] (v):(σ ) C f g Press, 2007. G. Plotkin. Type theory and recursion. Proc. 8th Annual IEEE Symposium (RTC-LAM) on Logic in Computer Science (LICS’93), 1993. M 0 Σ; Γ, x : ττττ.C e : ττττ Σ; Γ .C λλλλx:ττττ. e worthy J. G. Siek and W. Taha. for functional languages. In Σ; Γ .M λλλλx:ττττ. e : ττττ → ττττ 0 Workshop on Scheme and Functional Programming, pages 81–92. C ACM Press, 2006. (RTA-LAMPROM) R. Strom and S. Yemini. Typestate: A programming language concept for Σ; Γ, x : σ .M e : σ0 Σ; Γ . λx:σ. e worthy enhancing software reliability. IEEE Transactions on Software A A Engineering, 12(1):157–171, 1986. M ∞ 0 Σ; Γ .A λx:σ. e : σ ( σ S. Tobin-Hochstadt and M. Felleisen. Interlanguage migration: From scripts to programs. In Companion to the 21st ACM SIGPLAN (RTA-LAMAFF) Conference on Object-Oriented Programming Systems (OOPSLA’06), M 0 Σ; Γ, x : σ .A e : σ Σ; Γ 6 .A λx:σ. e worthy pages 964–974. ACM Press, 2006. Σ; Γ .M λx:σ. e : σ 1 σ0 S. Tobin-Hochstadt and M. Felleisen. The design and implementation of A ( Typed Scheme. In Proc. 34th Annual ACM Symposium on Principles of Σ; Γ . e worthy and Σ; Γ . e worthy Programming Languages (POPL’07), pages 395–406. ACM Press, A C 2008. (WORTHY-A) (WORTHY-C) D. N. Turner, P. Wadler, and C. Mossin. Once upon a type. In Proc. 7th a a @σ ∈ Γ(FV(e)) @σ ∈ Γ(FV(e)) International Conference on Functional Programming Languages and Computer Architecture (FPCA’95), pages 1–11. ACM Press, 1995. @σ ∈ Σ(FL(e)) @σ ∈ Σ(FL(e)) Σ; Γ . e worthy Σ; Γ . e worthy P. Wadler. Linear types can change the world. In Programming Concepts A C and Methods, pages 347–359. North Holland, 1990.

` D. Walker. Substructural type systems. In B. C. Pierce, editor, Advanced [Σ] = Σ Topics in Types and Programming Languages, chapter 1, pages 3–44. MIT Press, 2005. [•]` = • [Σ, `0 : ττττ]` = [Σ]`, `0 : ττττ A. K. Wright and M. Felleisen. A syntactic approach to type soundness. Information and Computation, 115(1):38–94, 1994.

00 ` 00 [Σ, `0 : σ]` = [Σ]`, `0 :[σ]` [Σ, `0 :[σ]` ] = [Σ]`, `0 :[σ]`

References A PLT Redex model of our calculus, including an algorithmic variant of the A. Ahmed, M. Fluet, and G. Morrisett. L3: A linear language with type system, may be found at http:// www.ccs.neu.edu/ ∼tov/ substructural/ locations. Technical Report TR-24-04, Harvard University, 2004. affine-contracts.ss.

12 2009/3/2