First-Class Labels for Extensible Rows Technical Report: UU-CS-2004-051

Daan Leijen
Institute of Information and Computing Sciences, Utrecht University
P.O.Box 80.089, 3508 TB Utrecht, The Netherlands

Abstract

This paper describes a type system for extensible records and variants with first-class labels; labels are polymorphic and can be passed as arguments. This increases the expressiveness of conventional record calculi significantly, and we show how we can encode intersection types, closed-world overloading, type case, label selective calculi, and first-class messages. We formally motivate the need for row equality predicates to express type constraints in the presence of polymorphic labels. This naturally leads to an orthogonal treatment of unrestricted row polymorphism that can be used to express first-class patterns.

Based on the theory of qualified types, we present an effective type inference algorithm and efficient compilation method. The type inference algorithm, including the discussed extensions, is fully implemented in the experimental Morrow compiler.

1 Introduction

Records and variants provide a convenient way to construct data types. Furthermore, record calculi can be used as a foundation for features such as objects and module systems. Over the years, many aspects of record calculi have been studied, including polymorphic extension, record concatenation, type directed compilation, and even label-less calculi.

The holy grail [6] of polymorphic extensible record calculi are first-class labels, where labels are polymorphic and can be passed as arguments. First-class labels increase the expressiveness of conventional record calculi significantly, and we will show how we can encode many interesting programming idioms, including intersection types, closed-world overloading, type case, label selective calculi, and first-class messages.

One of the earliest works on first-class labels was done by Gaster and Jones [5, 4]. They describe a polymorphic type system for extensible records and variants based on the theory of qualified types. Unfortunately, as we show later in this article, type inference for their proposed extension is incomplete. Sulzmann acknowledges this problem and describes a record calculus with first-class labels as an instance of HM(X) [27, 28, 29], but this calculus is not extensible and lacks a general compilation method. Shields and Meijer introduce an intriguing label-less calculus, λ^TIR, where records with first-class labels can be encoded with opaque types [25, 24]. However, due to general type equality constraints, λ^TIR is a very complicated system. We wanted to explore an alternative point in the design space with a simpler calculus where labels are explicit; indeed, we can express many of the motivating examples of λ^TIR without having to work from first principles.

We base our calculus directly on the original record system of Gaster and Jones, with two important technical differences: we introduce row equality predicates to express required type constraints in the presence of polymorphic labels, and we move the language of rows into the predicate language to ensure completeness of type inference. Furthermore, our system is the first to naturally describe row polymorphic operations, and it completely avoids the complex unification problems, and resulting restriction to non-empty rows, as described by Gaster [4]. Being able to use unrestricted row polymorphism, we can express first-class extensible patterns and views [30].

We have fully implemented our type inference algorithm, including discussed extensions like row polymorphic operations, in the experimental Morrow compiler [12]. Indeed, Morrow can infer all the types of the examples in this paper. However, at the time of writing, the code generator for Morrow is not yet complete.

In our calculus, row terms are no longer part of the type language but are only present in the predicate language, to ensure completeness of type inference with polymorphic labels. By not folding labels back into the language of types, we also reduce the complexity of improvement with respect to λ^TIR. Unfortunately, the improvement algorithm, and thus the test for satisfiability, ambiguity, and entailment, is still exponential. However, just like normal type inference, we believe that the worst case is unlikely to occur in practical programs. This intuition is largely confirmed by practical experience with the Morrow compiler.

In Section 2 we give an overview of conventional extensible records and variants à la Gaster and Jones [5]. In Section 3 we extend this calculus with first-class polymorphic labels, and we discuss many interesting examples in the following section. Section 5 discusses first-class patterns. Sections 6 and 7 formally define our calculus, and give the typing rules and an inference algorithm. Section 8 describes simplification and improvement, and discusses complexity issues. We finish with a discussion of related work and the conclusion.

2 Records and variants

There are two dual concepts to describe the structure of data types: products group data items together, while sums describe a choice between alternatives. Here are two examples of both operations:

    type Point = Int × Int
    type Event = Char + Point

In most programming languages, we can name the components of a product and sum with a label. A labeled product is called a record (or struct), while a labeled sum is called a variant (or union, or data type). We describe records with curly braces {}, and variants with angled brackets ⟨⟩:

    type Point = {x :: Int, y :: Int}
    type Event = ⟨key :: Char, mouse :: Point⟩

We can see that both records and variants are described by a sequence of labeled types, which we call a row. In this article, we enclose rows in banana brackets (||). For convenience, we leave these out when the row is directly enclosed by record or variant brackets. For example, the unabbreviated type of a Point is {(| x :: Int, y :: Int |)}.

2.1 Extensible rows

Following Gaster and Jones [5], we consider an extensible row calculus where a row is either empty or an extension of a row. The empty row is written as (||) and an extension of a row r with a label l and type τ is written as (| l :: τ | r |). Here is an example of a row that can be used to describe coordinates:

    (| x :: Int | (| y :: Int | (||) |) |)

To reduce the number of brackets, we use the following abbreviations:

    (| l1 :: τ1, ..., ln :: τn |)     ≡ (| l1 :: τ1 | ... | (| ln :: τn | (||) |) ... |)
    (| l1 :: τ1, ..., ln :: τn | r |) ≡ (| l1 :: τ1 | ... | (| ln :: τn | r |) ... |)

The fields of a row are distinguished by their label and not by their position, and we consider rows equal up to permutation of their fields:

    (| l :: a, m :: b | r |) = (| m :: b, l :: a | r |)

2.2 Lacks constraints

For the purposes of this paper, we restrict ourselves to rows without duplicate labels. Without this constraint, fields cannot be addressed unambiguously and, as a result, some programs cannot be assigned a principal type [31]. A particularly elegant approach to enforce uniqueness of labels are lacks (or insertion) constraints [5, 25]. A predicate (r\l) restricts r to rows that do not contain a label l. In general, a row extension (| l :: a | r |) is only valid when the predicate (r\l) holds. For clarity, we always write the lacks constraints explicitly in this paper, but a practical system can normally infer them from row expressions automatically, which simplifies the type signatures a lot.

2.3 Record operations

A record interprets a row as a product of types. The empty record is the only ground value with an empty record type:

    {} :: {}

Furthermore, there are three basic operations that can be performed on records, namely selection, restriction, and extension:

    (_.l)       :: ∀r a. (r\l) ⇒ {l :: a | r} → a
    (_ − l)     :: ∀r a. (r\l) ⇒ {l :: a | r} → {r}
    {l = _ | _} :: ∀r a. (r\l) ⇒ a → {r} → {l :: a | r}

Note that we assume a distfix notation where argument positions are written as "_". Furthermore, we explicitly quantify all types in this paper, but practical systems can normally use implicit quantification. The three basic record operations are not arbitrary but based on the corresponding primitive operations on products in category theory or logic. Note that all type schemes contain a predicate (r\l) to ensure the validity of row extension. For repeated record extension on terms, we apply the same abbreviations as for row extension on types. The basic operations can be used to implement a number of other common operations like update and rename:

    {l := x | r} ≡ {l = x | r − l}
    {l ← m | r}  ≡ {m = r.l | r − l}

2.4 Variant operations

A variant interprets a row as a sum of types. Dual to records, the basic operations are based on the corresponding operations on sums in category theory. The primitives consist of the empty variant, injection, embedding, and decomposition:

    ⟨⟩      :: ⟨⟩
    ⟨l = _⟩ :: ∀r a. (r\l) ⇒ a → ⟨l :: a | r⟩

[...]

Note that we write r[i] to select the ith field of a memory block. The offset 1 is the evidence for the predicate (| age :: Int, name :: String |)\male, that resulted from selecting the male field. It is resolved to 1 as male is larger than age but smaller than name. Gaster and Jones describe the formal translation from lacks predicates to evidence [5] and we will not repeat that here. Note that common transformations like inlining can be used to optimize this expression.
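
As a rough illustration of how offset evidence for a lacks predicate can drive the record operations of Section 2.3, the following Python sketch models a record as a tuple of field values stored in label order. It is not the translation of Gaster and Jones [5] and not Morrow's implementation: the function names, the use of lexicographic string comparison to order labels, and the field values 31, True and "Ann" are all assumptions made for the example.

```python
from bisect import bisect_left


def lacks_evidence(row_labels, label):
    """Offset evidence for the predicate "row lacks label".

    Assumption: fields are laid out in lexicographic label order, so the
    evidence is the position at which `label` would be inserted.
    """
    ordered = sorted(row_labels)
    offset = bisect_left(ordered, label)
    if offset < len(ordered) and ordered[offset] == label:
        # The predicate does not hold: the row already has this label.
        raise ValueError(f"row already contains label {label!r}")
    return offset


def extend(offset, value, record):
    """{l = value | record}: insert the new field at its offset."""
    return record[:offset] + (value,) + record[offset:]


def select(offset, record):
    """record.l: read the field at the offset (r[i] in the text)."""
    return record[offset]


def restrict(offset, record):
    """record - l: drop the field at the offset."""
    return record[:offset] + record[offset + 1:]


def update(offset, value, record):
    """{l := value | record}, defined as {l = value | record - l}."""
    return extend(offset, value, restrict(offset, record))


def rename(offset_l, offset_m, record):
    """{l <- m | record}, defined as {m = record.l | record - l}."""
    rest = restrict(offset_l, record)
    return extend(offset_m, select(offset_l, record), rest)


# The row (| age :: Int, name :: String |) lacks male; the evidence is 1.
rest_labels = ["age", "name"]
male = lacks_evidence(rest_labels, "male")

# Hypothetical record {age = 31, male = True, name = "Ann"} as a memory block.
person = extend(male, True, (31, "Ann"))
is_male = select(male, person)
```

Every operation receives the offset as an ordinary argument and never inspects a label at run time, which is the point of treating the lacks predicate as evidence. The rename operation needs two offsets, one for each label, both computed against the row that remains after the old field is removed.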
