Subtyping: Overview and Implementation CS5218: Principles of Program Analysis


Chen Jingwen
National University of Singapore

1 ABSTRACT

Type systems have been a topic of heavy research in programming language theory for decades, with implementations in different languages across paradigms. In object-oriented languages, class-based type systems describe a hierarchy between objects using a concept known as subtyping. Subtyping describes relations between types, allowing terms of a type to be used safely in places where terms of another type are expected. This increases the flexibility of existing type systems by increasing expressiveness while maintaining well-typedness. In this technical report, we will give an overview of the motivation and formalism of subtyping systems. We will also provide a Haskell implementation of record datatype subtyping in a functional toy language to demonstrate a typechecking algorithm. Lastly, we will explore related work in the field in the decades following its conception.

2 INTRODUCTION

Type systems are a central feature of programming languages and they come in different manifestations, stemming from decades of research in programming language theory. Despite implementation differences, their common goal is to prevent entire classes of erroneous program behaviours from happening.

Cardelli's 1996 paper, Type Systems [8], provides a high-level overview of the subject, ranging from first order type systems (à la Simply Typed Lambda Calculus) to System F<:. A collection of type system flavours is summarised in the following list:

(1) First order (System F1): Pascal, Algol68
(2) Second order (System F2): ML, Haskell, Modula-3
(3) Object-oriented (OO), class-based: Java, C++
(4) Dependent types: Idris, Agda
(5) Dynamic: Smalltalk, Ruby

Zooming in on typed object-oriented systems, the dominating feature is the type hierarchy via subclasses, which describes a relation of attribute and behaviour inheritance between objects. However, the variety of implementations and optimizations of subclassing systems makes understanding and formalising this relation a difficult endeavour. Subtyping is a popular approach to formalising such relations, and it was introduced by Cardelli in his seminal paper "A Semantics of Multiple Inheritance" [5].

The purpose of this report is to present a notion of typing and subtyping, an exploration of the evolution of research around subtyping in the past decades, and an implementation of a static typechecker for a language with a subtyping system. In the next section, we will describe what subtyping is and the benefits that it brings to a type system. Next, we will formalize the inference rules for a language with subtyping, alongside a concrete implementation of a typechecker in a toy programming language with subtyping. Finally, we will explore the related work in the field of subtyping.

3 SUBTYPING

3.1 Overview

Traditionally, type systems rigidly restrict the usage of typed terms in locations based on the idea of type equality. This results in programs where it is not obvious why typechecking fails, even though the program seems natural in both syntax and semantics. For example, the function application

    (fn (x: float) => x + 2) (2 :: int)

looks as if it should be well-typed, since the + operation permits operands of both int and float types. However, because int and float are not equal under classical type systems, the expression is ill-typed.

Subtyping is the formalisation of relations between types with the goal of allowing terms of a type to be used in locations where terms of other types are expected. For example, we want to allow operations that are defined on reals, ℝ, to work with integers, ℤ, by forming a subtype relation between ℤ and ℝ. Pierce described this as the principle of safe substitution in his book, Types and Programming Languages [26], where terms of one type can be safely substituted in place of a term of another type without incurring runtime type errors. This is also known as Liskov's substitution principle, where safety is asserted on behavioural properties in subtype relations [20].

As a core concept utilised in object-oriented (OO) languages for the hierarchical organization of objects, subtyping is often implemented in the form of subclassing. Classes, the templates that objects are instantiated from, are ordered hierarchically in the form of superclasses and subclasses.

In "A Semantics of Multiple Inheritance", Cardelli analysed popular OO languages (Simula, Lisp & Smalltalk) in their representations of objects [5], and decided to focus on representing objects as records, which are essentially Cartesian product types with labels. The main advantage of using records (with functional components) was simplicity: the fields of a record can be determined statically at compile-time, which removes any need for dynamic analysis at runtime to detect invalid field accesses. He then presented the grammar and semantics of a static, strongly typed functional language with multiple inheritance, and record and variant subtyping. He also provided sound type inference and type checking algorithms on the semantics.

In a later paper, Type Systems [8], he described subtyping as a form of set inclusion, where terms of a type can be understood as elements belonging to a set, and the subtyping relation is understood as the subset relation between these sets of elements.

3.2 Implementation of FunSub

Before diving into an overview of the fundamental subtyping concepts, we will first describe a Haskell implementation of a toy language, FunSub, and a static typechecker for checking well-typedness in the presence of subtypes.

The purpose of this toy language is to provide a concrete realisation of the formal definitions that we will explore in the later sections. We will attach snippets of implementations alongside the definitions where applicable.

FunSub is a superset of Fun [10], which is in turn an implementation of Church's Simply Typed Lambda Calculus [13] that includes the record datatype and a modified style of explicit typing (also known as ascriptions).

The source code for FunSub is available online [12].

3.2.1 Usage. Assuming Haskell is installed and the user is in the project directory, running the following command will invoke the typechecker:

    runhaskell Main.hs examples/typed_expressions.fun

This produces an output similar to the following:

    ..
    [Expression]: (fn x :: (Int -> Int) => 2)
    [Typecheck][OK]: (Int -> Int)

    [Expression]: (fn x :: (Int -> Int) => true)
    [Typecheck][FAIL]: Type mismatch: expected Int, got Bool
    ..

3.2.2 Architecture. The language comprises a REPL/reader, lexer, parser and typechecker. Some examples of FunSub expressions are stored in the examples/ folder.

Main.hs is the entry point that takes in the filename for a file containing FunSub expressions. The lexer (Lexer.hs) defines reserved tokens and lexemes, and the parser (Parser.hs) uses Parsec on the tokens to generate an abstract syntax tree (AST) defined in Syntax.hs. Lastly, the AST is checked for well-typedness with rules defined in the typechecker (Typecheck.hs).

3.2.3 Syntax.

    c ∈ Const                                constants
    a, x ∈ Ident                             alphanumeric identifiers

    t ::= Int                                integers
        | Bool                               booleans
        | t1 -> t2                           arrows
        | { a1 : t1, ..., an : tn }          records

    e ::= c                                  constant
        | x                                  variable
        | (fn x :: (t) => e)                 function abstraction
        | e1 e2                              function application
        | { a1 = e1, ..., an = en } :: t     record
        | e.a                                record projection

3.2.4 Records. Records are data structures in the form of finite and unordered associations from labels to values, with each pair described as a field. To extract a value from a field, we use the projection notation e.a, where a is the label of a field in the record e.

The purpose of having records in our toy language is to demonstrate subtyping relations on a composite data structure, unlike the base types Int and Bool.

3.2.5 Explicit types. To simplify implementation, we chose to use an explicit typing notation (expr :: type) to assert the types for functions and records, in place of a type inference algorithm.

3.2.6 Example. The following is a valid expression in the FunSub syntax:

    (fn x :: ({ a: Int, b: Int } -> { a: Int }) => x)
    { a = 2, b = 2, c = true } :: { a: Int, b: Int, c: Bool }

In type systems without subtyping, this function application will not typecheck for two reasons. First, the record argument type does not match the parameter type. Second, the type of the function body (which is the parameter type, since this is an identity function) does not match the declared result type. However, in our subtyping implementation, we are able to use concepts such as variance (§3.5) and width and depth record subtyping (§3.4) to make this expression well-typed.

3.2.7 Implementation: AST. The AST is a direct translation of the syntax defined in §3.2.3.

    -- Syntax.hs
    data Ty = IntTy
            | BoolTy
            | ArrowTy Ty Ty
            | RcdTy [(String, Ty)]
            deriving (Show, Eq)

    data Expr = I Int Ty
              | B Bool Ty
              | Var String
              | Fn String Expr Ty
              | FApp Expr Expr
              | Rcd [(String, Expr)] Ty
              | RcdProj Expr Expr
              deriving (Show, Eq)

3.2.8 Implementation: Typechecker. The two main functions of the typechecker are typecheck and isSubtype. Their type signatures are defined as follows:

    -- Typecheck.hs
    newtype TypeEnv = TypeEnv (Map String Ty)

    isSubtype :: Ty -> Ty -> Bool
    typecheck :: TypeEnv -> Expr -> Either String Ty

isSubtype takes in two types and recursively determines if the first type is a subtype of the second, using the inference rules defined in §3.3.

typecheck takes in a type environment (a mapping of variables to types) and an AST, and recursively determines if the expression is well-typed. If it is ill-typed, an error message describing the issue is bubbled up and handled in Main.hs.