A Separate Compilation Extension to Standard ML

David Swasey Tom Murphy VII Karl Crary Robert Harper Carnegie Mellon {swasey,tom7,crary,rwh}@cs.cmu.edu

Abstract The goal of this work is to consolidate and synthesize previ- We present an extension to Standard ML, called SMLSC, to support ous work on compilation management for ML into a formally de- separate compilation. The system gives meaning to individual pro- fined extension to the Standard ML language. The extension itself gram fragments, called units. Units may depend on one another in a is syntactically and conceptually very simple. A unit is a series of way specified by the . A dependency may be mediated Standard ML top-level declarations, given a name. To the current by an interface (the type of a unit); if so, the units can be compiled top-level declarations such as structure and functor we add an separately. Otherwise, they must be compiled in sequence. We also import declaration that is to units what the open declaration is to propose a methodology for programming in SMLSC that reflects structures. An import declaration may optionally specify an inter- code development practice and avoids syntactic repetition of inter- face for the unit, in which case we are able to compile separately faces. The language is given a formal semantics, and we argue that against that dependency; with no interface we must compile incre- this semantics is implementable in a variety of compilers. mentally. Compatibility with existing compilers, including whole- program compilers, is assured by making no commitment to the Categories and Subject Descriptors .3.3 [Programming Lan- precise meaning of “compile” and “link”—a compiler is free to guages]: Language Constructs and Features—Modules, pack- limit compilation to elaboration and type checking, and to perform ages; D.3.1 [Programming Languages]: Formal Definitions and code generation as part of linking. Theory—Syntax Sections 1 and 2 summarize our main design principles, and provide an overview of the system. In Section 2 we give a small General Terms Languages example of its use. The semantics, formulated in the framework of the Harper-Stone semantics of ML [12, 13], is given in Section 3.1 Keywords standard ml, separate compilation, incremental compi- Some implementation issues are discussed in Section 4. We con- lation, types clude with a discussion of related work in Section 6.

Introduction 1. Design Principles We propose an extension to Standard ML called SMLSC. SMLSC A language, not a tool. We propose an extension to the Standard supports separate compilation in the sense that it gives a static se- ML language to support separate compilation, rather than a tool to mantics to individual program fragments, which we call units.A implement it. The extension is defined by a semantics that extends unit may depend on other units, and can be type-checked indepen- the semantics of Standard ML to provide a declarative description dently of those units by specifying what it expects of them. These of the meanings of the language constructs. The semantics provides expectations are given in the form of interfaces for those other a clear correctness criterion for implementations to ensure source- units. When unit A is checked against another unit B via a medi- level compatibility among them. ating interface, we need not have access to B at all. Therefore we Flexibility. A compilation unit consists of any sequence of top- say that A is separately compiled (SC) against B. 2 It is also useful to allow unit A to depend on another unit level declarations, including signature and functor declarations. B without specifying an interface for B. In this case, the only However, since Standard ML lacks syntactically expressible signa- way to derive the context necessary to check A is to first check B tures, some units cannot be separately compiled from one another. and read off its actual interface. In this scenario we say that A is We therefore support incremental, as well as separate, compilation incrementally compiled (IC) against B. for any unit. This means that the interface of a unit can either be in- Units may be compiled and then linked together to satisfy de- ferred from its source (incremental compilation) or explicitly spec- pendencies. The compiled form of a unit or of linked units is ified (separate compilation) at the programmer’s discretion. called a linkset. A linkset may be further linked with other linksets. Simplicity. The design provides only the minimum functionality If a linkset has no remaining dependencies, then it can be trans- of a separate compilation system. It omits any form of compilation formed into an executable program. parameters, conditional compilation directives, or compiler direc- tives. We leave for future work the specification of such additional machinery. Conservativity. The semantics of Standard ML should not be Permission to make digital or hard copies of all or part of this work for personal or changed by the introduction of separate compilation. In particu- classroom use is granted without fee provided that copies are not made or distributed lar, we do not permit “circular dependencies” or similar concepts for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute 1 to lists, requires prior specific permission and/or a fee. We give a semantics in the framework of The Definition of Stan- ML’06 September 16, 2006, Portland, Oregon, USA. dard ML [16] in the companion technical report [21]. Copyright c 2006 ACM 1-59593-483-9/06/0009. . . $5.00. 2 Consequently, units cannot be identified with Standard ML structures. srcunit ::= unit unitid = top topdec end unit declaration topdec ::= import impexp open units strdec structure-level declaration sigdec signature declaration fundec functor declaration local topdec1 in topdec2 end local declaration topdec1 topdec2 impexp ::= unitid h: intexpi open unit unitid impexp1 impexp2 intexp ::= intf topspec end interface expression topspec ::= spec structure-level specification functor funspec functor specification topspec1 topspec2 funspec ::= funid(strid : sigexp): sigexp0 hand funspeci

Figure 1. SMLSC concrete syntax that are not otherwise expressible in Standard ML. This ensures one unit on another is mediated by an interface, the type of a unit. that compilers should not be disturbed by the proposed extension The interface of an imported unit can be specified in one of two beyond what is required to implement the extension itself. ways, either implicitly or explicitly, corresponding to incremental or separate compilation. Explicit dependencies. The dependencies among units are ex- An import declaration of the form import unitid : intexp plicitly specified, not inferred. The chief reason for this is that de- specifies an explicit interface for the imported unit. This permits the pendencies among units may not be syntactically evident—for ex- importing unit to be compiled independently of the implementation ample, the side effects of one unit may influence the behavior of of unitid, relying only on the specified interface. This is called another. There are in general many ways to order effects consis- separate compilation, or SC for short. An import declaration of tently with syntactic dependencies, and these orderings need not be the form import unitid specifies that the interface for unitid is equivalent. A lesser reason is that supporting dependency inference to be inferred from its source code. This is called incremental requires restrictions on compilation units that are not semantically compilation, or IC for short. necessary, reducing flexibility. The concrete syntax for units and interfaces in SMLSC is given Environment independence. The separate compilation system is in Figure 1. We extend topdecs to add import and local. The defined independently of any environment in which it might be import declaration (like open) allows multiple units to be simulta- implemented. The design speaks in terms of linguistic and semantic neously imported. Interfaces are topspecs; this is the syntactic class entities, rather than implementation-specific concepts such as files spec of Standard ML, with the addition of a specification form for or directories. functors. The local declaration limits the scope of imports, just as the structure declaration of the same name. Separation of units from modules. The separate compilation sys- tem is designed as a proper extension to Standard ML so as to en- sure backward compatibility of source code. It is tempting to iden- Projects and Linksets tify compilation units with modules, but to do so would require that A linkset consists of several compiled units, called its exports, functors, signatures, and fixity declarations be permitted as com- together with the names and interfaces of its imports, the units ponents of modules. Permitting such an extension is not entirely on which it depends (following Cardelli [5]). A project consists straightforward; for example, permiting signature declarations in of a linearly ordered sequence of source units and linksets. The modules and their types can lead to undecidability of type check- ordering of the components in a project is significant, both because ing [10]. it specifies the order of identifier resolution, and because it specifies the order of computational effects when executed. Compilation of 2. Overview a project consists of processing the source units in the specified order to obtain linksets, and then knitting them together to resolve Units and Interfaces dependencies. The SMLSC extension is organized around the concept of a unit.A Linking consists of resolving inter-unit dependencies by bind- unit consists of top-level declarations, which include declarations ing exports to imports among linksets. When all references have of signatures, structures, and functors. Each unit is given a name been resolved, the resulting linkset can be completed to form an by which the unit is known throughout the program. One unit may executable. refer to the components of another using an import declaration, We do not give a concrete syntax for linksets, as we do not in- which records the dependency of the importing unit on the imported tend for to write them, nor do we expect compati- unit, and opens it for use within the importing unit. This is the bility of linksets across implementations. Rather, they are left as only means by which one unit may refer to another; we do not implementation-specific concepts (such as object files), which are support “dot notation” for accessing the components of a unit. An modeled here by the abstract semantic objects described in Sec- import declaration is a new form of top-level declaration. (This is tion 3. the only modification that we make to an existing syntactic category of Standard ML.) The compilation context for a unit is entirely determined by its Examples imports. That is, all dependencies of a unit on another unit must be We begin with a few simple examples to illustrate the features of explicitly indicated using import declarations. The dependency of the system. Suppose that we have a library of data structures whose name This allows Scheduler2 and Collections to be compiled sepa- is Collections. It is natural to place this library in a unit. Let’s rately: assume it contains only the queue : L0 = link(Scheduler2) unit Collections = L1 = link(Collections) top L2 = link(L1,L0) signature QUEUE = However, writing the SC import this way forces an undesirable sig repetition of code. If more than one client uses Collections— type ’a queue which we would expect—each client repeats the interface for its val empty : ’a queue import of the unit. A further problem is that this style asks the client val push : ’a * ’a queue -> ’a to supply the interface of the library, but the interface of a library is val pop : ’a queue -> ’a * ’a queue usually provided by the library author, not the client. Fortunately, end a combination of SC and IC allows us to use the system in a much structure Queue :> QUEUE = cleaner way. struct (* ··· *) end end Handoff Units A client of the Collections library can import it using IC easily: A programmer who wishes his code to be available for separate compilation can provide a handoff unit which supplies the interface. unit Scheduler = Starting from scratch, the handoff unit contains an SC import: top import Collections unit Collections = structure Sched = top struct signature QUEUE = type job = (* ··· *) sig val readyqueue = type ’a queue ref Queue.empty : job Queue.queue ref val empty : ’a queue (* ··· *) val push : ’a * ’a queue -> ’a end val pop : ’a queue -> ’a * ’a queue end end import CollectionsImpl : In these examples we use link to stand for the semantic operation intf of compiling and linking a list of source units and linksets. We can structure Queue : QUEUE compile and link this program as end L = link(Collections, Scheduler) end The implementation of the collections library is moved to the or we can compile the library and then the client unit CollectionsImpl. Because the import declaration opens

L0 = link(Collections) the imported unit, all of the contents of CollectionsImpl are L1 = link(L0, Scheduler) available in the Collections unit. (Note that signature declara- tions do not appear in interfaces, but we can write these before the but the Scheduler unit may not be compiled on its own. import declaration and then use the signature bindings in the inter- Incremental compilation is convenient when we have source or face for the same effect.) Clients wishing to make use of the library a compiled linkset for the Collections unit. We may prefer to use simply import the handoff unit using IC: separate compilation, or may be forced to because the implementa- tion for Collections is not available. A client with an SC import unit Scheduler3 = looks like this: top import Collections unit Scheduler2 = structure Sched = (* ··· *) top end import Collections : intf This additionally has the benefit that the clients only need to structure Queue : know the name of the handoff unit, not the implementation unit. A sig few such clients can be linked with the handoff unit: type ’a queue L0 = link(Collections, Scheduler3, OtherClient) val empty : ’a queue val push : ’a * ’a queue -> ’a The result can later be linked with the implementation of the Col- val pop : ’a queue -> ’a * ’a queue lections library: end L1 = link(CollectionsImpl, L0) end Definite References structure Sched = In the terminology of Harper and Pierce [11] an import of one unit struct in another is interpreted as a definite reference—that is, as a free type job = (* ··· *) variable that refers to single, specific unit through an interface for val readyqueue = it, either inferred or specified. This ensures that if two separate units ref Queue.empty : job Queue.queue ref import a common unit, such as a well-known library, these units (* ··· *) share a common understanding of the abstract types exported by end that unit. No additional sharing specifications are required to use end the separate compilation system. This is in sharp contrast to the so-called fully functorized style of Judgement... Meaning... use of the ML modules system, in which a functor is λ-abstracted ` decs ok decs is well-formed over the modules on which it depends. In this formulation refer- decs ` sdecs ok sdecs is well-formed ences are interpreted as indefinite, in that the functors involved may decs ` sig : Sig sig is well-formed be applied to any modules satisfying those interfaces, not necessar- decs ` sig ≡ sig 0 : Sig signature equivalence ily in a coherent manner. Different functors with a parameter refer- decs ` sbnds : sdecs sbnds has declaration list sdecs ring to “the” shared library might be applied to different instances decs ` mod : sig mod has signature sig of it. To ensure coherence in the presence of indefinite references, one must resort to explicit type sharing specifications, which are Figure 2. TSIL judgements (summary) potentially burdensome. The handoff methodology in SMLSC facilitates programming with definite references. Two pieces of code that SC import the We can systematically rename the bound variables t and x.A path same unit may only be linked if they import that unit at equivalent is a module variable followed by a list of labels, serving a role interfaces. The imports are then consolidated into a single import, similar to SML long identifiers. The paths m.T (a constructor) and ensuring that type equations hold. When the same handoff unit m.X (an expression) refer to m’s components. is used to create the two imports, these interfaces will always be A bnd binds a variable to an expression (var=exp), constructor equivalent. In corner cases such as skew between versions of a (var=con), or module (var=mod). A structure binding list sbnds library’s handoff unit, the programmer may manually consolidate has the form two imports. We discuss this further in Section 5. lab1Bbnd 1,..., labnBbnd n. A structure is written [sbnds]. The module syntax is closed under 3. Semantics the formation of functors: dependently typed functions from mod- We give a semantics to SMLSC by extending Harper and Stone’s ules to modules. Typed Semantics (TS) for Standard ML [13].3 At a high level the We shall use the TSIL judgements given in Figure 2. These typed semantics consists of an elaboration relation from an exter- judgements have the following meaning. nal language, called TSEL, into an internal language, called TSIL. • decs ` sdecs ok. No label is used twice and every declaration The external language is a slight extension of the abstract syntax of is well-formed. For example, Standard ML. The internal language is a typed λ-calculus based on the Harper-Lillibridge type theory for modules [10]. Elaboration ` T Bt:Ω,XBx:t ok. comprises type inference, pattern compilation, equality compila- • decs ` sig : Sig. The signature sig is well-formed. tion, identifier resolution, and insertions of coercions for signature matching. The result of elaboration is a well-formed program in • decs ` sig ≡ sig 0 : Sig. The signatures sig and sig 0 declare the TSIL, to which a dynamic semantics is given to provide an ex- the same components, in the same order, with the same labels, ecution model. The semantics of SMLSC is an extension of the and corresponding type components are equivalent. Harper-Stone semantics that elaborates units into linksets that can • decs ` sbnds : sdecs. The structure binding list sbnds be completed for execution. matches the structure declaration list sdecs. Corresponding The TSIL. We begin with a brief review of the structure of the labels must agree and each bound expression, constructor, or TSIL. The TSIL consists of a core level and a module level. The module in sbnds must match its declaration in sdecs. For ex- core level includes expressions exp, constructors con, and kinds ample, the judgement knd. Kinds classify constructors. Constructors of kind Ω are types; decs ` (labBvar=mod, sbnds):(labBvar:sig, sdecs) they classify expressions. The module level includes modules mod and signatures sig, which classify modules. We write {} to denote holds if decs ` mod : sig and decs, var:sig ` sbnds : sdecs. the empty record, and mod.lab to denote the projection of a com- • decs ` mod : sig. The module mod has signature sig. The ponent named lab from the structure mod. The semantics works signature sig may or may not be fully transparent. For example, mainly with modules, ultimately elaborating units to TSIL struc- we may derive both tures. m :[T t:Ω,X x:t] ` m :[T t:Ω=m.T, X x:t] Declaration lists serve as contexts in the TSIL static semantics. B B B B A declaration list decs = dec1,..., decn declares expression and (var:con), constructor (var:kndh=coni), and module (var:sig) m :[T t:Ω,X x:t] ` m :[T t:Ω,X x:t]. variables. A structure declaration list sdecs has the form B B B B The former signature is said to be selfified with respect to the lab dec ,..., lab dec 1B 1 nB n variable m. associating a label with each declaration. The structure declara- TS elaboration. Harper and Stone give a semantics to Standard tion list lab dec, sdecs binds the variable declared by dec with B ML by elaboration of TSEL into TSIL. Elaboration is performed scope sdecs. We write [sdecs] to denote the signature of a structure in a context Γ consisting of a structure declaration list (sdecs) that, containing fields described by sdecs. Variables express dependen- due to shadowing, may have duplicate labels. We shall use the TS cies between components in a structure signature and may be freely elaboration judgements given in Figure 3. These judgements have alpha-varied. Labels name components for external reference and the following meaning: may not be renamed without changing the meaning of the signature. Consider the declaration of a structure m containing an opaque type • Γ ` strdec sbnds : sdecs. Elaborate the TSEL structure component T and value component X of that type: declaration strdec to the structure binding list sbnds : sdecs. Since the TSEL permits functors within structures, this includes m :[T t:Ω,X x:t]. B B elaboration of functor declarations. 3 In the companion technical report [21], we also give a semantics by • Γ ` sigexp sig : Sig. Elaborate the TSEL signature extending The Definition of Standard ML [16]. expression sigexp to the signature sig. The TSEL does not Judgement... Meaning... Judgement... Meaning... Γ ` strdec sbnds : sdecs structure declaration elaboration decs ` L ok L is well-formed Γ ` sigexp sig : Sig signature elaboration decs ` S ok S is well-formed Γ ` spec sdecs specification elaboration L exp : {} L completes to exp 0 00 0 00 Γ `ctx labs path : class context lookup decs ` L++L L L and L merge to L 0 decs `sub path : sig 0  sig mod : sig coercion compilation Figure 4. Linking judgements

Figure 3. TS elaboration judgements (summary) L ::= sdecs0 → sbnds : sdecs; S linkset S ::= · include signature declarations; we treat them as abbreviations S, sigid = sig top-level 0 for TSIL signatures, recording them in linksets and expanding S, unitid = S declared by unitid them during elaboration. • Γ ` spec sdecs. Elaborate the TSEL specification spec to Figure 5. Linkset syntax the structure declaration list sdecs. This includes elaboration of functor specifications. Imports specify assumptions to be satisfied by linking. A linkset • Γ `ctx labs path : class. Perform identifier resolution in with imports sdecsAB assumes structure B binds (at least) a the context Γ. The input is a list of labels, which is derived value Y : a.T but can be linked with (a linkset exporting) a from an SML long identifier; the output is a path classified by structure B providing more components. the type, kind, or signature class. Some labels in the context are annotated with a star, indicating that they are “open” (in • The exports sbnds : sdecs are the TSIL code associated with the sense of the SML open declaration). Identifier resolution the linkset. They may make reference to the linkset’s imports. searches Γ from right to left, descending into structures with Continuing our example, the exports starred labels. For example, we may derive sbndsZR : sdecsZR = ZBz=b.Y, RBr=a.T : Z z:a.T, R r:Ω=a.T T Bt1:Ω,T Bt2:Ω={} ` T t2 : Ω={} B B and reference the imports sdecsAB to bind an expression Z of the imported type and an equivalent type R. X x :{}, 1? m:[T t:Ω,X x :t] ` X m.X : m.T. B 1 B B B 2 • The signature abbreviations S are used during elaboration. • 0 They may make reference to the linkset’s imports and exports. decs `sub path : sig 0  sig mod : sig . Perform transparent signature ascription. The inputs are a signature sig0, Continuing our example, the signature abbreviations a path having that signature, and a target signature sig. The SSIG = SIG=[MBm:Ω=r] output is a module mod : sig 0, where sig 0 has the same shape as sig but is fully transparent relative to path. specify that elaboration should treat the signature identifier SIG as an abbreviation for a TSIL signature referencing the exported Elaboration maps TSEL identifiers to TSIL labels using a func- type R. tion · . To implement identifier “shadowing,” elaboration employs a function sbnds++sbnds0 : sdecs++sdecs0 that concatenates The dynamic semantics for SMLSC is very simple. The com- sbnds : sdecs and sbnds0 : sdecs0, renaming labels in the left pletion judgment L exp : {} translates a linkset hand sides that appear in the right hand sides. The function chooses · → sbnds : sdecs; S fresh labels that do not correspond to TSEL identifiers. For exam- ple, if with no imports to a TSIL expression [sbnds, lab var={}].lab : {} sbnds : sdecs = T Bt1={} : T Bt1:Ω={} B 0 0 sbnds : sdecs = T Bt2=Int : T Bt2:Ω=Int, where lab and var are fresh. Under the TSIL dynamic semantics, the resulting expression evaluates the linkset’s exports from left to then sbnds++sbnds0 : sdecs++sdecs0 might be right for their side-effects. Evaluation terminates when an uncaught (labBt1={}, T Bt2=Int):(labBt1:Ω={}, T Bt2:Ω=Int) exception is raised or when every export has been evaluated. We give the full syntax for linksets in Figure 5 and the rules in where lab is not in the range of the · function. Appendix A. The remainder of this section explains the rules for 3.1 Linking linkset merge. We define linking for the TSIL by giving rules for deriving the Notation. We write decs, sdecs to extend a context decs, implic- linking judgements in Figure 4. A linkset itly dropping the labels in sdecs. We define the domain of a struc- ture declaration list, dom(sdecs), by sdecs0 → sbnds : sdecs; S dom(lab1Bdec1,..., labnBdecn) = {lab1,..., labn}. comprises imports sdecs0, exports sbnds : sdecs, and signature abbreviations S. We write {var/var 0}L for the capture-free substitution of var for free occurrences of var 0 in L.4 • The imports sdecs0 describe the TSIL structures on which the linkset depends; they must be well-formed in the ambient Linkset merge. The rules for linkset merge decs ` L1++L2 context. For example, the imports L3 combine L1 and L2 to produce L3. The rules presuppose that L1 is well-formed with respect to decs but permit L2 to make sdecsAB = ABa:[T Bt:Ω,XBx:t], B b:[Y y:a.T ] B B 4 Linkset bound variables and scopes are discussed in Appendix A. express dependency on structures labelled A and B. reference (via free TSIL variables) not only to decs but to the Judgement... Meaning... imports and exports of L1. Formally, the rules satisfy the following project L project elaboration 5 property. Γ ` srcunit L unit elaboration Γ ` topdec L top-level declaration elaboration If L = sdecs → sbnds : sdecs 1 1 Γ ` impexp L import expression elaboration and decs ` L ok 1 Γ ` sigbind S signature binding elaboration and decs, sdecs1, sdecs ` L2 ok, and decs ` L1++L2 L3, Γ `ctx sigid sig : Sig signature lookup then decs ` L3 ok. Γ `ctx unitid S If a linkset is well-formed, then it neither imports nor exports Γ ok Γ is well-formed the same label twice (although it may both import and export a particular label). Figure 6. Elaboration judgements The rules process the imports in L2 from left to right. If L2 has no imports, then the following rule applies. project ::= · empty L = sdecs0 → sbnds : sdecs project, srcunit source unit 0 0 project,L compiled unit(s) decs ` L++(· → sbnds : sdecs ) 0 0 srcunit ::= unit unitid = topdec unit declaration sdecs0 → sbnds++sbnds : sdecs++sdecs topdec ::= import impexp open units L3 imports what L1 does and exports what L1 and L2 do. To strdec ensure that L3 is well-formed, the rule uses the TS function ++ signature sigbind to concatenate the exports in L1 with the exports in L2, renaming local topdec1 in topdec2 end labels exported by L1 that are also exported by L2. topdec1 topdec2 Otherwise, the rules examine the first import labBvar:sig in impexp ::= unitid hintf spec endi open unitid L2 and distinguish three mutually exclusive cases: impexp1 impexp2 sigbind ::= sigid = sigexp hand sigbindi • L1 exports lab. 00 0 0 000 sdecs = sdecs , labBvar :sig , sdecs 0 0 00 Figure 7. SMLSC abstract syntax decs, sdecs0, sdecs `sub var :sig  sig mod:sig 00 sbnd := 1Bvar=mod sdec := 1Bvar:sig L := sdecs0 → sbnds++sbnd : sdecs++sdec • L1 neither imports nor exports lab. 0 0 00 decs ` L++(sdecs1 → sbnds : sdecs ) L lab 6∈ dom(sdecs) ∪ dom(sdecs0) 0 decs ` (sdecs0 → sbnds : sdecs)++ decs, sdecs0, sdecs ` sig ≡ sig : Sig 0 0 00 0 (lab var:sig, sdecs1 → sbnds : sdecs ) L decs, sdecs0 ` sig : Sig B 0 L := sdecs0, labBvar:sig → sbnds : sdecs The first premise picks out the L export lab var 0:sig 0 for 0 0 00 1 B decs ` L++(sdecs1 → sbnds : sdecs ) L lab; there can be at most one since L1 is well-formed. The decs ` (sdecs0 → sbnds : sdecs)++ second premise calls the TS coercion compiler to match the 0 0 00 export var 0:sig 0 to the import signature sig. Linking fails if (labBvar:sig, sdecs1 → sbnds : sdecs ) L 00 no match is possible; otherwise, sig has the same “shape” as The first premise ensures that L1 neither imports nor exports sig, but is fully transparent relative to the variable var 0. The lab. The next two premises choose a signature sig 0 equivalent structure binding sbnd : sdec is constructed using the coercion to sig but well-formed without reference to the exports of L1. module mod at the signature sig 00, maximizing type sharing. Linking fails if no such signature exists—when opaque types The linkset L has the same imports as L1, and exports those of exported by L1 occur in sig. Otherwise, L is constructed by L1 plus the result of the preceding coercion. To ensure that L adding a new import to the imports in L1. is well-formed—in particular, that it exports nothing more than once—the rule uses ++ to construct its exports. 3.2 Elaboration

• L1 imports lab but does not export it. We define a semantics for SMLSC by giving rules for the elab- oration judgements in Figure 6. We give the abstract syntax for lab 6∈ dom(sdecs) SMLSC in Figure 7. The elaboration rules appear in Appendix B. 00 0 0 000 sdecs0 = sdecs , labBvar :sig , sdecs These judgements have the following meaning. 0 decs, sdecs0, sdecs ` sig ≡ sig : Sig 0 0 0 0 • project L. Elaborate project, using linkset merge to accu- L := {var /var}(sdecs1 → sbnds : sdecs ) 0 00 mulate a resulting linkset L. A source unit is elaborated in a decs ` (sdecs0 → sbnds : sdecs)++L L context Γ that declares the imports and exports in L. decs ` (sdecs → sbnds : sdecs)++ 0 • (lab var:sig, sdecs → sbnds0 : sdecs0) L00 Γ ` srcunit L. Elaborate the topdec in srcunit to the B 1 linkset

The first premise ensures L1 does not export lab. The second sdecs0 → sbnds : sdecs; S. 0 0 premise picks out the L1 import labBvar :sig . Linking fails if 0 0 The imports sdecs0 arise from the import declarations in sig and sig are not equivalent; otherwise, L is constructed by topdec. The exports sbnds : sdecs arise from the structure dec- changing references in the remainder of L2 to use the import in larations in topdec. The signature abbreviations S arise from L1. the signature declarations in topdec. The result, L, exports a single module 5 In this description of linkset merge, we suppress all details related to signature abbreviations. unitidBvar=[sbnds]: unitidBvar:[sdecs]. • Γ ` topdec L. Elaborate topdec using linkset merge and ology of definite references by introducing “views” of the same identifier resolution. underlying unit. For example, suppose that two linksets L1 and L2 import the • Γ ` impexp L. Elaborate impexp using identifier resolu- tion and spec elaboration. same unit MathLib at disparate interfaces I1 and I2. This may hap- pen because the developers of L1 and L2 compiled using different • Γ ` sigbind S. Elaborate sigbind using signature elabora- versions of the handoff unit for MathLib, or because the developers tion. wrote their import interfaces by hand. The link link(L ,L ) 4. Implementation 1 2 fails because the linksets are required to agree on the interfaces of The semantics of SMLSC avoids commitment to the meaning of their common imports. Aside from recompiling the two linksets “compilation,” “linking,” and “completion” to ensure compatibility to use the same interface, the programmer has several options with various implementation strategies. These phases may be im- for resolving this situation. First, she can satisfy the imports by plemented using classical methods (code generation during com- providing the implementation of MathLib: pilation, object code weaving during linking, and writing an ex- 0 ecutable for completion), or in other, more novel, ways (such as L1 = link(MathLib, L1) 0 type checking during compilation, and code generation during link- L = link(L1,L2) ing). The design is, as far as we know, implementable in all current The first step satisfies the SC import of MathLib in L , as Standard ML compilers without requiring radical changes to their 1 long as the actual interface of MathLib matches the import in- infrastructure. 0 terface I1. The result L1 does not import MathLib, so it does 0 Parallel Build. The purpose of separate compilation is to permit not conflict with the import of MathLib in L2. L1 does ex- a client unit to be compiled independently of its implementation. port MathLib, so if the actual interface of MathLib matches I2, A compiler can exploit this by permitting clients of a separately then the second link succeeds. Because linking is left-associative, compiled unit to be compiled in parallel with one another in order L = link(MathLib,L1,L2) accomplishes the same thing. to speed up system build times. The TILT compiler, which imple- Any implementation of MathLib that satisfies both I1 and I2 ments an earlier version of the present extension, implements such will suffice. Because we do not require unit names to be glob- a strategy. Moreover, it also implements cut-off incremental recom- ally unique, this implementation of MathLib might even import pilation [1], where it is able to interrupt the normal cascade of re- MathLib (again) and then contain some glue code to make it com- compilation when a source change does not cause a unit’s interface patible with the two given interfaces I1 and I2 (Figure 8). We ex- to change. pect such cases to be uncommon, the preferred methodology being to use a single handoff unit for all clients. Parsing. This presentation of SMLSC provides concrete and ab- stract syntax, but does not formalize parsing. The only issue that entangles separate compilation and parsing is fixity declarations. 6. Related Work To support fixity declarations at parse-time, we include a pars- There are several closely related systems that influenced the design ing context in the concrete representation of linksets (object files). of SMLSC. A source unit that is incrementally compiled against a linkset is The notion of linkset in SMLSC comes from Cardelli’s investi- parsed using that linkset’s included parsing context. We do not per- gation of separate compilation and type-safe linking in the simply- mit fixity specifications in user-specified interfaces, and therefore typed λ-calculus [5]. Our formalization of linking extends these they do not affect interface matching or any other part of the se- ideas to support the Standard ML module system including signa- mantics. ture subtyping, abstract types, and module and type definitions in Note that a library may specify fixity information by placing ap- structures. propriate declarations in the handoff unit. For example, to describe Harper and Pierce [11] discuss language design for module sys- a matrix library that supplies an infix ** operator for multiplication, tems, including separate compilation. Particularly relevant to the we may write the following handoff unit: current work is their discussion of sharing of abstract types. They unit Matrices = describe the use of definite references to avoid the coherence prob- top lems (and excess sharing specifications) that arise from aliasing. import MatricesImpl : The notion of a handoff unit bears some resemblance to the use intf of .h files in C. The presence of function prototypes in a .h file type matrix provides an interface for application code that includes that header val ** : matrix * matrix -> matrix file. Code that references a prototyped function triggers a link-time (* ··· *) demand for that function. The degree of link-time type-checking end varies accross C implementations. Usually, type correctness is as- infix ** sured by programming conventions. end

5. Multiple Interfaces for the Same Import In Section 2 we presented the programming methodology of hand- off units. As long as two linksets that import the same unit identifier do so by using the same handoff unit, they will always agree on the interface for that unit and so can be linked together. However, in some situations it may be useful to permit two clients to import the Figure 8. The linkset Lglue imports MathLib at interface I and same unit, each with a different interface. Since interface matching, then exports it to satisfy the imports in L1 and L2 at disparate like signature matching, is coercive, this complicates the method- interfaces I1 and I2. Glew and Morrisett [8] describe separate compilation for Typed ML Basis. The MLton compiler [18] and ML Kit [17] implement Assembly Language [19]. Their language, MTAL, permits type a language called ML Basis. A “basis” in their terminology is what definitions, abstract types, and polymorphic types in interfaces and we call a unit. An ML Basis program is a series of declarations, supports recursive linking. including a binding construct for bases and an open construct for Jim [14] describes a λ-calculus P2 with rank 2 intersection basis identifiers. These are analogous to SMLSC’s unit declaration types that has principal typings. The principal typings property and IC import declaration. Like SMLSC, the order of compila- means that from a term M, one can infer both Γ and τ such that any tion entities is explicit, and thus each program has unambiguous typing derivation Γ0 ` M : τ 0 is an instance of Γ ` M : τ. In a meaning. ML Basis is given a formal semantics [6] in terms of The system with principal typings, program fragments can be separately Definition of Standard ML. The implementation of ML Basis in the compiled without context information, meaning that SC imports ML Kit supports cut-off incremental recompilation based on Els- need not even specify interfaces. Standard ML, however, does not man’s thesis work [7]. Like CM, ML Basis does not provide a way have principal typings. It remains an open problem to design a type for programmers to write down interfaces or separately compile system that supports principal types for features such as abstract against unimplemented bases. and recursive types, and type definitions in modules. Objective Caml. The separate compilation system of Objective 7. Conclusion Caml (O’Caml) [15] is similar in many regards to SMLSC. The We have presented an extension to Standard ML for separate com- declaration of a unit U is an O’Caml module stored within a file pilation called SMLSC. Its focus is the unit, a program fragment called U.ml. The interface for U may optionally be given in a file that can depend on other program fragments through either separate called U.mli. If the interface is present, other units depending on or incremental compilation. Via the programming idiom of handoff U can compile even if the implementation is not available, just as units—that uses both separate and incremental compilation—we in SMLSC. Because the filename of an interface indicates the unit limit the number and complexity of linguistic mechanisms while that it describes, O’Caml interfaces play the role of handoff units supporting a convenient programming style. Our formal and ab- in SMLSC. Additionally, O’Caml’s use of the filesystem to provide stract definition of the language ensures that it is unambiguously a canonical location for each unit and interface means that all unit specified, and admits a variety of implementation strategies. references are definite. On the other hand, O’Caml’s dependence on the filesystem means that the language is not independent from its environment. Acknowledgments For instance, unit names are limited to valid filenames on the host We thank the anonymous referees for their constructive comments. system, and restructuring a project on disk may force changes to the code. Another significant difference is that O’Caml conflates the References notions of units and modules. This earns O’Caml some conceptual economy, but it makes it impossible to separate the notions of top- [1] Rolf Adams, Walter Tichy, and Annette Weinert. The cost of selective level declarations and structure components. This makes it neces- recompilation and environment processing. ACM Transactions on sary to support signature and functor definitions within structures, Software Engineering and Methodology, 3(1):3–28, January 1994. so such a choice would not be compatible with our design princi- [2] Matthias Blume. Dependency analysis for Standard ML. ACM ple of conservativity over Standard ML. Finally, unlike SMLSC, Transactions on Programming Languages and Systems, 21(4):790– O’Caml and its separate compilation system are defined informally 812, 1999. in terms of their implementation. [3] Matthias Blume. CM: The SML/NJ compilation and library manager (for SML/NJ version 110.40 and later) user manual, 2002. Moscow ML. The Moscow ML [20] compiler for Standard ML http://www.smlnj.org/doc/CM/new.pdf. supports a separate compilation system nearly identical to Objec- tive Caml’s. Moscow ML extends the Standard ML module system [4] Matthias Blume and Andrew W. Appel. Hierarchical modularity. ACM Transactions on Programming Languages and Systems, to allow (among other things) functor and signature declara- 21(4):813–847, 1999. tions in structures and specifications for them in signatures. Then, like O’Caml, units are structures. In contrast, SMLSC does not re- [5] Luca Cardelli. Program fragments, linking, and modularization. quire any changes to the Standard ML module language. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 266–277. ACM Press, 1997. Other Standard ML implementations include mechanisms for breaking programs up into compilation units. None support sepa- [6] Henry Cejtin, Matthew Fluet, Suresh Jagannathan, and Stephen Weeks. Formal specification of the ML Basis system, January 2005. rate compilation in the sense we use it here; they use the term to http://mlton.org/MLBasis. mean cut-off incremental recompilation (recall Section 4). [7] Martin Elsman. Program Modules, Separate Compilation, and SML/NJ CM. The Compilation Manager for Standard ML of Intermodule Optimisation. PhD thesis, Department of Computer New Jersey (CM) [3] is a tool for compiling Standard ML pro- Science, University of Copenhagen, January 1999. grams spread across many source files. CM permits a program to [8] Neal Glew and Greg Morrisett. Type-safe linking and modular be divided into a hierarchy of libraries [4]. A library comprises a assembly language. In Proceedings of the ACM SIGPLAN-SIGACT list of imported libraries, Standard ML source files, and a list of Symposium on Principles of Programming Languages, pages 250– symbols exported by the library. Dependencies between libraries 261. ACM Press, 1999. are explicit but dependencies among the source files in a library are [9] Robert Harper, Peter Lee, Frank Pfenning, and Eugene Rollins. inferred [2, 9]. CM provides control over the identifiers visible to Incremental recompilation for Standard ML of New Jersey. Technical a source file, and supports conditional compilation, parallel compi- Report CMU-CS-94-116, School of Computer Science, Carnegie lation, and cut-off incremental recompilation. CM provides no way Mellon University, February 1994. for the programmer to write interfaces nor to compile against unim- [10] Robert Harper and Mark Lillibridge. A type-theoretic approach plemented units. SMLSC is not a replacement for CM; we believe to higher-order modules with sharing. In Proceedings of the that dependency analysis and recompilation tools are useful, and ACM SIGPLAN-SIGACT Symposium on Principles of Programming that SMLSC provides a good linguistic target for such tools. Languages, pages 123–137. ACM Press, 1994. [11] Robert Harper and Benjamin C. Pierce. Design considerations for We assume that its range includes neither the distinguished ML-style module systems. In Benjamin C. Pierce, editor, Advanced label basis, nor the labels chosen fresh in the rules. Topics in Types and Programming Languages, chapter 8, pages 293– • 346. MIT Press, 2005. Structure declaration lists sdecs, signature abbreviations S, and so on specify lists of elements. We adopt the following notation [12] Robert Harper and Christopher Stone. An interpretation of Standard for lists. ML in type theory. Technical Report CMU-CS-97-147, School of Computer Science, Carnegie Mellon University, June 1997. We denote by (·, ·) the operation of syntactic concatenation; 0 [13] Robert Harper and Christopher Stone. A type-theoretic interpretation for example, S, S . of Standard ML. In Gordon Plotkin, Colin Stirling, and Mads Tofte, We sometimes use pattern matching at the left end of a list, editors, Proof, Language, and Interaction: Essays in Honour of Robin writing sigid=sig,S to match the first binding in the list. Milner. MIT Press, 2000. We usually omit the initial ·; for example, [14] Trevor Jim. What are principal typings and what are they good for? In Proceedings of the ACM SIGPLAN-SIGACT Symposium on sigid 1=sig 1,..., sigid 1=sig 1. Principles of Programming Languages. ACM Press, 1996. • [15] Xavier Leroy, Damien Doligez, Jacques Garrigue, Didier Remy,´ We define the domain of a signature abbreviation, dom(S), by and Jer´ omeˆ Vouillon. The Objective Caml system release 3.09: dom(·) = ∅ Documentation and user’s manual, 2005. http://caml.inria. dom(S, sigid=sig) = dom(S) ∪ {sigid} fr/pub/docs/manual-ocaml/index.html. dom(S, unitid=Su) = dom(S) ∪ {unitid}. [16] Robin Milner, Mads Tofte, Robert Harper, and David MacQueen. The Definition of Standard ML (Revised). MIT Press, 1997. • We define the function S++S0 by [17] MLKit web site. http://www.itu.dk/research/mlkit/index. (·++S0) = S0 php/Main Page. ((sigid=sig,S)++S0) = [18] MLton web site. http://mlton.org/.  sigid=sig,S00 if sigid 6∈ dom(S00) 00 [19] Greg Morrisett, David Walker, Karl Crary, and Neal Glew. From S otherwise system F to typed assembly language. ACM Transactions on where S00 = S++S0 0 Programming Languages and Systems, 21(3):527–568, 1999. ((unitid=Su,S)++S ) =  00 00 [20] Sergei Romanenko, Claudio Russo, and Peter Sestoft. Moscow ML unitid=Su,S if unitid 6∈ dom(S ) owner’s manual version 2.00, June 2000. http://www.dina.kvl. S00 otherwise dk/∼sestoft/mosml/manual.pdf. where S00 = S++S0. [21] David Swasey, Tom Murphy, VII, Karl Crary, and Robert Harper. A It concatenates S and S0, making the result well-formed by separate compilation extension to Standard ML (extended technical 0 report). Technical Report CMU-CS-06-133, School of Computer dropping signature abbreviations if dom(S) ∩ dom(S ) 6= ∅. Science, Carnegie Mellon University, 2006. • We write both “=” and “:=” in side-conditions. Interpreting the rules algorithmically, the former pattern-matches inputs, and A. Linking Rules the latter specifies an output.

The typed semantics defines a closed structure mod basis :sig basis decs ` L ok serving as an initial basis for the TSIL. The elaborator assumes Γ declares basis:sig , which includes components such as the basis decs ` sdecs ok built-in Match exception. This basis structure is introduced in 0 decs, sdecs ` sbnds : sdecs Rule 6 for completion and Rule 12 for elaboration of source units 0 decs, sdecs , sdecs ` S ok in projects. 0 (1) sdecs = lab var :[sdecs ],..., lab var :[sdecs ] We use the following definitions and notation. 0 1B 1 1 nB n n decs ` sdecs0 → sbnds : sdecs; S ok • Writing BV(dec) for the variable declared by dec, we define the bound variables of a structure declaration list, BV(sdecs), Rule 1: Imports are restricted to structures. The elaborator in Ap- by pendix B needs nothing else.

BV(lab1Bdec1,..., labnBdecn) = {BV(dec1),..., decs ` S ok BV(decn)}. ` decs ok A linkset (2) decs ` · ok L = sdecs0 → sbnds : sdecs; S binds variables BV(sdecs ) with scope sbnds : sdecs; S decs ` sig : Sig decs ` S ok sigid 6∈ dom(S) 0 (3) and variables BV(sdecs) with scope S. We write BV(L) for decs ` sigid=sig,S ok BV(sdecs0) ∪ BV(sdecs). decs ` S0 ok decs ` S ok unitid 6∈ dom(S) • For readability, we sometimes elide variables in structure bind- S0 = (sigid =sig ,..., sigid =sig ) ings and declarations. It should be immediately obvious how to 1 1 n n (4) consistently restore these with fresh variables. decs ` unitid=S0,S ok • We assume that unit identifiers are disjoint from all other iden- L exp : {} tifier classes. • We assume that the TS overbar injection · maps identifiers of lab 6∈ dom(sdecs) different classes to different labels and that there are infinitely (5) many labels not in its range. · → sbnds : sdecs; S [sbnds, lab={}].lab : {} • We define a function U(sdecs) that drops the labels in sdecs: Lbasis := · → basis=mod basis : basis:sig basis ; · 0 0 U(lab1Bdec1,..., labnBdecn) = dec1,..., decn. ` Lbasis ++L L L exp : {} (6) • L exp : {} When an elaboration context Γ = sdecs; S appears in a judge- ment requiring an IL context decs, we implicitly coerce Γ to 0 00 U(sdecs). decs ` L++L L • We define the substitution function σ(var, sdecs,S) by L = sdecs0 → sbnds : sdecs; S σ(var, ·,S) = S 0 0 0 decs ` L++(· → sbnds : sdecs ; S ) (7) σ(var, (labBdec, sdecs),S) = 0 0 0 sdecs0 → sbnds++sbnds : sdecs++sdecs ; S++S {var.lab/BV(dec)}σ(var, sdecs,S) where {path/var}S denotes the capture-free substitution of 00 0 0 000 sdecs = sdecs , labBvar :sig , sdecs path for free occurrences of var in S. Rule 14 uses σ to 0 0 00 decs, sdecs0, sdecs `sub var :sig  sig mod:sig elaborate source units. 00 sbnd := 1Bvar=mod sdec := 1Bvar:sig project L L := sdecs0 → sbnds++sbnd : sdecs++sdec; S (8) 0 0 0 00 decs ` L++(sdecs1 → sbnds : sdecs ; S ) L (11) decs ` (sdecs0 → sbnds : sdecs; S)++ · (· → · : ·; ·) 0 0 0 00 (labBvar:sig, sdecs1 → sbnds : sdecs ; S ) L project L basis 6∈ BV(L) lab 6∈ dom(sdecs) L = sdecs0 → sbnds : sdecs; S 00 0 0 000 sdecs0 = sdecs , labBvar :sig , sdecs Γ := basis:sig basis ,R(sdecs0), sdecs; S 0 0 0 decs, sdecs0, sdecs ` sig ≡ sig : Sig Γ ` srcunit L var 6∈ BV(L ) 0 0 0 0 0 0 0 0 0 0 (12) L := {var /var}(sdecs1 → sbnds : sdecs ; S ) (9) sdecs1 → sbnds : sdecs ; S := {var/basis}L 0 00 0 decs ` (sdecs0 → sbnds : sdecs; S)++L L sdecs1 := basisBvar:sig basis , sdecs1 0 0 0 00 ` L++(sdecs1 → sbnds : sdecs ; S ) L decs ` (sdecs0 → sbnds : sdecs; S)++ 0 0 0 00 00 (labBvar:sig, sdecs1 → sbnds : sdecs ; S ) L project, srcunit L Rule 12: The side-condition basis 6∈ BV(L) can always be lab 6∈ dom(sdecs) ∪ dom(sdecs0) 0 achieved by renaming bound variables in L. decs, sdecs0, sdecs ` sig ≡ sig : Sig 0 decs, sdecs0 ` sig : Sig 0 project L L := sdecs0, labBvar:sig → sbnds : sdecs; S (10) 0 0 0 00 0 0 0 00 BV(L) ∩ BV(L ) = ∅ ` L ok ` L++L L decs ` L++(sdecs1 → sbnds : sdecs ; S ) L (13) project,L0 L00 decs ` (sdecs0 → sbnds : sdecs; S)++ 0 0 0 00 (labBvar:sig, sdecs1 → sbnds : sdecs ; S ) L Γ ` srcunit L B. Elaboration Rules Γ ` topdec L We change the TS elaborator to expand signature abbreviations. L = sdecs0 → sbnds : sdecs; S First, we modify every TS elaboration judgement and rule us- var 6∈ BV(Γ) ∪ BV(L) 0 ing a TS elaboration context sdecs to use sdecs; S. A context sbnds := unitidBvar=[sbnds] sdec, sdecs; S binds the variable BV(sdec) with scope sdecs; S 0 (14) sdecs := unitidBvar:[sdecs] and a context sdecs; S binds variables BV(sdecs) with scope S. S0 := unitid=σ(var, sdecs,S) We define BV(Γ) by BV(sdecs). Second, we extend the syntax Γ ` unit unitid = topdec for TSEL signature expressions: 0 0 0 sdecs0 → sbnds : sdecs ; S sigexp ::= ... sigid signature identifier Γ ` topdec L Finally, we extend the TS judgement Γ ` sigexp sig : Sig, adding the rule Γ ` impexp L (15) Γ ` import impexp L Γ `ctx sigid sig : Sig Γ ` sigid sig : Sig Γ ` strdec sbnds : sdecs (16) to elaborate signature identifiers. Γ ` strdec · → sbnds : sdecs; · We use the following definitions and notation. Γ ` sigbind S (17) • To extend an elaboration context Γ = sdecs; S, we write Γ ` signature sigbind · → · : ·; S Γ, dec for sdecs, 1Bdec; S, 0 0 Γ ` topdec sdecs0 → sbnds : sdecs; S Γ, sdecs for sdecs, sdecs ; S, and 0 0 var 6∈ BV(Γ) ∪ BV(sdecs0) Γ,S for sdecs; S, S . ? 0 0 Γ,R(sdecs0), 1 Bvar:[sdecs],S ` topdec L (18) We also define a function R(sdecs) that renames the labels in L := sdecs0 → 1Bvar=[sbnds] : 1Bvar:[sdecs]; · 0 00 sdecs to make them inaccessible to identifier resolution: Γ ` L++L L 0 00 R(lab1Bdec1,..., labnBdecn) = 1Bdec1,..., 1Bdecn. Γ ` local topdec in topdec end L Γ ` topdec L sdecs; S ok sdecs ` sig : Sig L = sdecs0 → sbnds : sdecs; S (31) 0 0 sdecs; S, sigid=sig ok Γ,R(sdecs0), sdecs,S ` topdec L (19) 0 00 Γ ` L++L L sdecs; S ok sdecs ` S0 ok 0 00 S0 = (sigid =sig ,..., sigid =sig ) Γ ` topdec topdec L 1 1 n n (32) sdecs; S, unitid=S0 ok Γ ` impexp L

Γ `ctx unitid var : sig 0 Γ `ctx unitid S var 6∈ BV(Γ) 0 ? 0 ? (20) L := unitidBvar :sig → 1 =var : 1 :sig; S Γ ` unitid L Rule 20: Rules 12, 18, and 19 use R(·) to hide imported units from IR imports. The signature sig should be fully selfified.

0 Γ ` spec sdecs var 6∈ BV(Γ) Γ, var 0 :[sdecs] ` var 0 : sig 0 ? 0 ? (21) L := unitidBvar :[sdecs] → 1 =var : 1 :sig; · Γ ` unitid : intf spec end L Rule 21: The signature sig should be fully selfified.

0 0 Γ ` impexp L Γ ` impexp L 0 0 00 BV(L) ∩ BV(L ) = ∅ Γ ` L++L L (22) 0 00 Γ ` impexp impexp L

Γ ` sigbind S

Γ ` sigexp sig : Sig S := sigid=sig 0 0 hΓ ` sigbind S sigid 6∈ dom(S )i (23) 0 Γ ` sigid = sigexp hand sigbindi Sh,S i Rule 23: Either all optional elements or none must be present.

Γ `ctx sigid sig : Sig

(24) sdecs; S, sigid=sig ` sigid sig : Sig sigid 0 6= sigid sdecs; S ` sigid sig : Sig (25) 0 0 sdecs; S, sigid =sig ` sigid sig : Sig sdecs; S ` sigid sig : Sig 0 (26) sdecs; S, unitid=S ` sigid sig : Sig

Γ `ctx unitid S

0 0 (27) sdecs; S, unitid=S ` unitid S unitid 0 6= unitid 00 sdecs; S ` unitid S (28) 0 0 00 sdecs; S, unitid =S ` unitid S 0 sdecs; S ` unitid S 0 (29) sdecs; S, sigid=sig ` unitid S Γ ok

` U(sdecs) ok (30) sdecs; · ok