Set-Theoretic Foundation of Parametric Polymorphism and Subtyping

Set-theoretic Foundation of Parametric Polymorphism and Subtyping Giuseppe Castagna1 Zhiwu Xu1;2 ∗ 1CNRS, Laboratoire Preuves, Programmes et Systemes,` Univ Paris Diderot, Sorbonne Paris Cite,´ Paris, France. 2State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Science, Beijing, China Abstract. We define and study parametric polymorphism for a needed, as shown by our practice of Ocsigen [1], a framework to type system with recursive, product, union, intersection, negation, develop dynamic web sites where web-site paths (uri) are associ- and function types. We first recall why the definition of such a sys- ated to functions that take the uri parameters —the so-called “query tem was considered hard—when not impossible—and then present strings”[7]— and return a Xhtml page. The core of the dynamic the main ideas at the basis of our solution. In particular, we intro- part of Ocsigen system is the function register new service duce the notion of “convexity” on which our solution is built up and whose (moral) type is: discuss its connections with parametricity as defined by Reynolds 8X<:Params:(Path × (X ! Xhtml)) ! unit to whose study our work sheds new light. That is, it is a function that registers the association of its two Categories and Subject Descriptors D.3.3 [Programming Lan- parameters: a path in the site hierarchy and a function that fed with guages]: Language Constructs and Features—Polymorphism; Data a query string that matches the description X (Params being the types and structure; F.3.3 [Logics and Meanings of Programs]: XML type of all possible query strings) returns an Xhtml page. Studies of Program Constructs—type structure Unfortunately, this kind of polymorphism is not available and the General Terms Theory, Languages current implementation of register new service must bypass the type system (of OCaml, OCamlDuce [9], and/or CDuce [2]), Keywords Types, subtyping, polymorphism, parametricity, XML losing all the advantages of static verification. So why despite all this interest and motivations does no satis- 1. Introduction factory solution exist yet? The crux of the problem is that, despite several efforts (eg, [15, 20]), it was not known how to define —and The standard approach to defining subtyping is to define a collec- a fortiori how to decide— the subtyping relation for XML types in tion of syntax-driven subtyping rules that relate types with similar the presence of type variables. Actually, a purely semantic subtyp- syntactic structure. However the presence of logical operators, such ing approach to polymorphism was believed to be impossible. as unions and intersections, makes a syntactic characterization dif- In this paper we focus on this problem and —though, hence- ficult, which is why a semantic approach is used instead: type t is a forth we will seldom mention XML— solve it. The solution — subtype of type s if the set of values denoted by t is a subset of the which is more broadly applicable than just to an XML processing set of values denoted by s. The goal of this study is to extend this setting— is based on some strong intuitions and a lot of technical approach to parametric types. Since such type systems are at the work. We follow this dichotomy for our presentation and organize heart of functional languages manipulating XML data, our study it in two parts: an informal description of the main ideas and in- directly applies to them. Parametric polymorphism has repeatedly tuitions underlying the approach and the formal description of the been requested to and discussed in various working groups of stan- technical development. More precisely, in Section 2 we first ex- dards (eg, RELAX NG [5] and XQuery [6]) since it would bring amine why this problem is deemed unfeasible or unpractical and not only the well-known advantages already demonstrated in ex- simple solutions do not work (x2.1-2.3). Then we present the in- isting functional languages, but also new usages peculiar to XML. tuition underlying our solution (x2.4) and outline, in an informal A typical example is SOAP [21] that provides XML “envelopes” way, the main properties that make the definition of subtyping pos- to wrap generic content. Functions manipulating SOAP envelopes sible (x2.5) as well as the key technical details of the algorithm that are thus working on polymorphically typed objects encapsulated decides it (x2.6). We conclude this first part by giving some ex- in XML structures. Polymorphic higher-order functions are also amples of the subtyping relation (x2.7) and discussing related work (x2.8). In Section 3 we present the key steps of the formal treatment ∗ This author was partially supported by the National Natural Science Foun- to support the claims of Section 2 in particular the soundness and dation of China under Grant No. 61070038 and by a joint training program completeness of the algorithm and the decidability of the subtyp- by CNRS and the Chinese Academy of Science. ing relation. We conclude our presentation with Section 4 where we discuss at length the connections between parametricity and our solution, as well as the new perspectives of research that our work opens. It may seem odd that we focus on a subtyping relation Permission to make digital or hard copies of all or part of this work for personal or without defining any calculus to use it. Defining the polymorphic classroom use is granted without fee provided that copies are not made or distributed subtyping relation and defining a polymorphic calculus are distinct for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute problems (though the latter depends on the former), and there is no to lists, requires prior specific permission and/or a fee. doubt that the former is the very harder one. Once the subtyping ICFP’11, September 19–21, 2011, Tokyo, Japan. relation is defined, it can be immediately applied to simple poly- Copyright c 2011 ACM 978-1-4503-0865-6/11/09. $10.00 morphic calculi (eg, an extension with higher-order functions of the language in [15]) but the definition of calculi that fully exploit the these are the sets of pairs in which it is not true that the first pro- expressiveness of these types and of algorithms that (even partially) jection belongs to t1 and the second does not belong to t2 ). infer type annotations are different and very difficult problems that But here the problemJ K is not circularity but cardinality, sinceJ thisK we leave for future work. would require D to contain P(D2), which is impossible. The solution to both problems is given by the theory of semantic 2. The key ideas subtyping [10], and relies on the observation that in order to use 2.1 Regular types types in a programming language we do not need to know what types are, but just how they are related (by subtyping). In other XML types are essentially regular tree languages: an XML type terms, we do not require the semantic interpretation to map arrow is the set of all XML documents that match the type. As such types into the set in (1), but just to map them into sets that induce they can be encoded by product types (for concatenation), union the same subtyping relation as (1) does. Roughly speaking, this types (for regexp unions) and recursive types (for Kleene’s star). turns out to require that for all s1, s2, t1, t2, the function : satisfies To type higher order functions we need arrow types and we also the property: J K need intersection and negation types since in the presence of arrows they can no longer be encoded. Therefore, studying polymorphism s1!s2 ⊆ t1!t2 () P( s1 × s2 )⊆P( t1 × t2 ) (2) for XML types is equivalent to studying it for the types that are J K J K J K J K J K J K coinductively (for recursion) produced by the following grammar: whatever the sets denoted by s1!s2 and t1!t2 are. Equation (2) above covers only the case in which we compare two single arrow t ::= b j t × t j t ! t j t _ t j t ^ t j :t j 0 j 1 types. But, of course, a similar restriction must be imposed also where b ranges over basic types (eg, Bool, Real, Int, . ), and when comparing arbitrary Boolean combinations of arrows. For- 0 and 1 respectively denote the empty (ie, that contains no value) mally, this can be enforced as follows. Let : be a mapping from J K and top (ie, that contains all values) types. In other terms, types T to P(D), we define a new mapping E : as follows (henceforth J K are nothing but a propositional logic (with standard logical con- we omit the : subscript from E : ): J K J K nectives: ^; _; :) whose atoms are 0, 1, basic, product, and arrow E(0)=? E(1)=D types. We use T to denote the set of all types. E(:t)=D n E(t) E(b)= b In order to preserve the semantics of XML types as sets of doc- J K uments (but also to give programmers a very intuitive interpreta- E(t1 _ t2)=E(t1) [ E(t2) E(t1 × t2)= t1 × t2 J K J K tion of types) it is advisable to interpret a type as the set of all E(t1 ^ t2)=E(t1) \ E(t2) E(t1 ! t2)=P( t1 × t2 ) values that have that type. Accordingly, Int is interpreted as the J K J K Then : and D form a set-theoretic model of types if, besides set that contains the values 0, 1, -1, 2,...; Bool is interpreted J K as the set the contains the values true and false; and so on.

Set-Theoretic Foundation of Parametric Polymorphism and Subtyping

An Extended Theory of Primitive Objects: First Order System Luigi Liquori

Metaobject Protocols: Why We Want Them and What Else They Can Do

On Model Subtyping ClÉment Guy, Benoit Combemale, Steven Derrien, James Steel, Jean-Marc JÉzÉquel

250P: Computer Systems Architecture Lecture 10: Caches

Chapter 2 Basics of Scanning And

BALANCING the Eulisp METAOBJECT PROTOCOL

Third Party Channels How to Set Them up – Pros and Cons

Abstract Data Types

Design and Implementation of Generics for the .NET Common Language Runtime

POINTER (IN C/C++) What Is a Pointer?

Subtyping Recursive Types

Parametric Polymorphism Parametric Polymorphism