Abstract Types Have Existential Type
Total Page:16
File Type:pdf, Size:1020Kb
Abstract Types Have Existential Type JOHN C. MITCHELL Stanford University AND GORDON D. PLOTKIN University of Edinburgh Abstract data type declarations appear in typed programming languages like Ada, Alphard, CLU and ML. This form of declaration binds a list of identifiers to a type with associated operations, a composite “value” we call a data algebra. We use a second-order typed lambda calculus SOL to show how data algebras may be given types, passed as parameters, and returned as results of function calls. In the process, we discuss the semantics of abstract data type declarations and review a connection between typed programming languages and constructive logic. Categories and Subject Descriptors: D.3 [Software]: Programming Languages; D.3.2 [Program- ming Languages]: Language Classifications-applicative languages; D.3.3 [Programming Lan- guages]: Language Constructs--abstract data types; F.3 [Theory of Conmputation]: Logics and Meanings of Programs; F.3.2 [Logics and Meanings of Programs]: Semantics of Programming Languages-denotational semantics, operational semantics; F.3.3 [Logics and Meanings of Pro- grams]: Studies of Program Constructs-type structure General Terms: Languages, Theory, Verification Additional Key Words and Phrases: Abstract data types, lambda calculus, polymorphism, program- ming languages, types 1. INTRODUCTION Ada packages [17], Alphard forms [66, 711, CLU clusters [41, 421, and abstype declarations in ML [23] all bind identifiers to values. Although there are minor variations among these constructs, each allows a list of names to be bound to a composite value consisting of “private” type and one or more operations. For example, the ML declaration abstype complex = real # real with create = . and pius = . and re = . andim= ..- An earlier version of this paper appeared in the Proceedings of the 22thACM Symposium on Principles of Programming Languages (New Orleans, La., Jan. 14-16). ACM, New York, 1985. Authors’ addresses: J. C. Mitchell, Department of Computer Science, Stanford University, Stanford, CA 94305; G. D. Plotkin, Department of Computer Science, University of Edinburgh, Edinburgh, Scotland EH9 352. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. 0 1988 ACM 0164-0925/88/0700-0470 $01.50 ACM Transactions on Programming Languages and Systems, Vol. 10, No. 3, July 1988, Pages 470-502. Abstract Types Have Existential Type l 471 binds the identifiers complex, create, plus, re, and im to the components of an implementation of complex numbers. The implementation consists of the collec- tion defined by the ML expression real # real, meaning the type of pairs of reals, and the functions denoted by the code for create, plus, and so on. An important aspect of this construct is that access to the representation is limited. We cannot apply arbitrary operations on pairs of reals to elements of type complex; only the explicitly declared operations may be used. We will call a composite value constructed from a set and one or more operations, packaged up in a way that limits access, a data algebra. We will discuss the typing rules associated with the formation and the use of data algebras and observe that data algebras themselves may be given types in a straightforward manner. This will allow us to devise a typed programming notation in which implementations of abstract data types may be passed as parameters or returned as the results of function calls. The phrase “abstract data type” sometimes refers to a class of algebras (or perhaps an initial algebra) satisfying some specification. For example, the ab- stract type stack is sometimes regarded as the class of all algebras satisfying the familiar logical formulas axiomatizing push and pop. Associated with this view is the tenet that a program must rely only on the data type specification, as opposed to properties of a particular implementation. Although this is a valuable guiding principle, most programming languages do not contain assertions or their proofs, and without this information it is impossible for a compiler to guarantee that a program depends only on a data type specification. Since we are primarily concerned with properties of the abstract data type declarations used in common programming languages, we will focus on the limited form of information hiding or “abstraction” provided by conventional type checking rules. We can be more specific about how data algebras are defined by considering the declaration of complex numbers in more detail. Using an explicitly typed ML-like notation, the declaration sketched earlier looks something like this: abstype complex = real # real with create: real + real + complex = Xx: real. Xy: real. ( X, y ) and plus: complex --, complex = Xz:real # real. Xw:real # real. ( fst(z) + fst(w), snd(z) + snd(w)) and re: complex + real = Xz:real # real.fst(z) and im: complex + real = Xz:real # real. snd(z) The identifiers complex, create, plus, re, and im are bound to a data algebra whose elements are represented as pairs of reals, as specified by the type expression real # real. The operations of the data algebra are given by the function expres- sions to the right of the equals signs1 Notice that the declared types of the operations differ from the types of the implementing functions. For example, re is declared to have type complex + real, but the implementing expression has type real # real + real. This is because operations are defined using the concrete representation of values, but the representation is hidden outside the declaration. In the next section, we will discuss the type checking rules associated with abstract data type declarations, which are designed to make complex numbers 1 In most programming languages, function definitions have the form “create(x:real, y:real) = . .” In the example above, we have used explicit lambda abstraction to move the formal parameters from the left- to the right-hand sides of the equals signs. ACM Transactions on Programming Languages and Systems, Vol. 10, No. 3, July 1988. 472 9 J. C. Mitchell and G. D. Plotkin “abstract” outside the data algebra definition. In the process, we will give types to data algebras. These will be existential types, which were originally developed in constructive logic and are closely related to infinite sums (as in category theory, for example). In Section 3, we describe a statically typed language SOL. This language is a notational variant of Girard’s system F, developed in the analysis of constructive logic [21,22], and an extension of Reynolds’ polymorphic lambda calculus [62]. An operational semantics of SOL, based on the work of Girard and Reynolds, is presented using reduction rules. However, we do not address a variety of practical implementation issues. Although the basic calculus we use has been known for some time, we believe that the analysis of data abstraction using existential types originates with this paper. (A preliminary version appeared as [56].) The use of SOL as a proof-theoretic tool is based on an analogy between types and constructive logic. This analogy gives rise to a large family of typed languages and suggests that our analysis of abstract data types applies to more expressive languages involving specifications. Since the connection between constructive proofs and typed programs does not seem to be well known in the programming language community (at least at present), our brief discussion of specifications will follow a review of the general analogy in Section 4. Additional SOL program- ming examples are given in Section 5. The design of SOL suggests new programming languages along the lines of Ada, Alphard, CLU, and ML but with richer and more flexible type structures. In addition, SOL seems to be a natural “kernel language” for studying the semantics of languages with polymorphic functions and abstract data type declarations. For this reason, we expect SOL to be useful in future studies of current languages. It is clear that SOL provides greater flexibility in the use of abstract data types than previous languages, since data algebras may be passed as parameters and returned as results. We believe that this is accomplished without any compromise in “type security.” However, since we do not have a precise characterization of type security, we are unable to show rigorously that SOL is secure.’ Some languages that are similar to SOL in scope and intent are Pebble [7], designed to capture some essential features of Cedar (an extension of Mesa [57]), and Kernel Russell, KR, of [28], based on Russell [14, 15, 161. Martin-Lof’s constructive type theory [46] and the calculus of constructions [ll] are farther from programming language syntax but share many properties of SOL. Some features of Martin-Lof’s system have been incorporated into the Standard ML module design [44, 541, which was formulated after the work described here was completed. We will compare SOL with some of these languages in Section 3.8. 2. TYPING RULES FOR ABSTRACT DATA TYPE DECLARATIONS The basic typing rules associated with abstract data type declarations do not differ much from language to language. To avoid the unnecessary complication of discussing a variety of syntactic forms, we describe abstract data types using the syntax we will adopt in SOL. Although this syntax was chosen to resemble ’ Research begun after this paper was written has shed some light on the type security of SOL. See [52] and [55] for further discussion. ACM Transactions on Programming Languages and Systems, Vol. 10, No. 3, July 1988. Abstract Types Have Existential Type 473 common languages, there is one novel aspect that leads to additional flexibility: We separate the names bound by a declaration from the data algebra they come to denote.