Towards Size-Dependent Types for Array Programming

Troels Henriksen, DIKU, University of Copenhagen, Copenhagen, Denmark
Martin Elsman, DIKU, University of Copenhagen, Copenhagen, Denmark

Abstract

We present a type system for expressing size constraints on array types in an ML-style type system. The goal is to detect shape mismatches at compile-time, while being simpler than full dependent types. The main restriction is that the only terms that can occur in types are array sizes, and syntactically they must be variables or constants. For those programs where this is not sufficient, we support a form of existential types, with the type system automatically managing the requisite book-keeping. We formalise a large subset of the type system in a small core language, which we prove sound. We also present an integration of the type system in the high-performance parallel functional language Futhark, and show on a collection of 44 representative programs that the restrictions in the type system are not too problematic in practice.

CCS Concepts: • Computing methodologies → Parallel programming languages.

Keywords: type systems, parallel programming, functional programming

ACM Reference Format:
Troels Henriksen and Martin Elsman. 2021. Towards Size-Dependent Types for Array Programming. In Proceedings of the 7th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming (ARRAY ’21), June 21, 2021, Virtual, Canada. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3460944.3464310

© 2021 Copyright held by the owner/author(s). Publication rights licensed to ACM. ACM ISBN 978-1-4503-8466-7/21/06...$15.00

1 Introduction

Array programming often involves functions whose arguments must satisfy certain shape constraints. For example, when multiplying two matrices, the column count of the former must match the row count of the latter. Most of today’s programming languages do not support compile-time checking of such constraints, instead deferring them to run-time. As with all dynamic type checking, this can result in frustrating crashes.

Dependently typed languages allow us to encode the sizes of arrays and have the type checker enforce the shape constraints. From the perspective of array programming, current dependently typed languages have two problems:

1. They are typically not very fast, whereas array programs are often performance-critical.
2. Full dependent types are complicated, and array programmers may not need or want all of their capabilities. In particular, working with existential sizes, as produced by operations such as filtering, requires deeper understanding of type theory.

In this paper, we investigate the foundations of a type system that supports size-dependent types. Our goal is to construct a programming language in the ML family that allows a “conventional” style of programming, while still verifying shape constraints statically. We have the following concrete design goals for size-dependent types:

1. It must be possible to write functions that operate on multi-dimensional arrays, where we can express equality constraints on the dimensions of these arrays. When a function is applied, the type checker must verify that the arguments satisfy the constraints.
2. Any book-keeping involved in handling existential sizes should be done automatically.
3. Multidimensional arrays should be handled as “arrays of arrays”, while still guaranteeing that all arrays are regular.¹ This permits parametric polymorphism, which allows functions such as map to have their conventional behaviour.
4. The type system should not impede a high-performance implementation of the language. In particular, it must not require complicated value representations or maintenance of complex runtime structures.

¹ A regular multidimensional array is one where all rows have the same shape. These are sometimes also called rectangular arrays, and are in opposition to jagged arrays, which we do not support.

We value simplicity over completeness. As we shall see, we place limitations on the form of sizes, which means some programs will require dynamic shape checking, although such checks will always be syntactically explicit size coercions that are inserted by the programmer when needed. An intentional non-goal is that we are not trying to statically eliminate out-of-bounds indexing.

We begin in Section 2 by defining $F$, a small core language that we use to illustrate the mechanics of the type system. In Section 3 we prove soundness for $F$, which implies the absence of shape errors at runtime. $F$ is a simplified language, and does not yet fulfill all our goals; in particular, it is monomorphic. However, Section 5 demonstrates how we have added size types to the data parallel programming language Futhark [13], which has full support for parametric polymorphism. This not only shows that the type system, despite its simplicity, is expressive enough to handle real programs, but also that it does not provide obstacles to high-performance implementation. Section 6 discusses our experience with using size types in practice, in particular whether they are flexible enough to handle a wide variety of parallel benchmark problems. Finally, Section 7 discusses how our approach relates to prior work, with a particular focus on how it fits into the array language tradition.

2 A Language with Size-Dependent Types

In this section, we present a language $F$ with support for size types. The language is simplistic in a variety of ways. First of all, it does not support type polymorphism, which we shall treat later in Section 4. Second, the language assumes that expressions are flattened so that expressions that allocate existentially sized arrays are bound by explicit let-constructs. Thus, if we assume the availability of a type-monomorphic (but size-polymorphic) map-function and a function iota that takes an integer $n$ and returns an index-array of size $n$, we also assume that an expression

\[ \text{map}\ (\lambda x.\, x + 1)\ (\text{iota}\ 5) \]

has been converted to the form

\[ \text{let}\ a = \text{iota}\ 5\ \text{in}\ \text{map}\ (\lambda x.\, x + 1)\ a. \]

This conversion simplifies the framework and can easily be implemented in the frontend of a compiler. Although the type system supports an implicit let-binding construct, as used in the map example, an explicit version is available, which eases type inference and which brings existential sizes into scope at expression level as int variables. In the explicit version of the let construct, the expression becomes

\[ \text{let}\ [n]\ a : [n]\text{int} = \text{iota}\ 5\ \text{in}\ \text{map}\ (\lambda x.\, x + n)\ a. \]

Whereas the type of this entire expression will be $\exists x.[x]\text{int}$, where $x$ is an existentially bound size variable, inside the scope of the let-construct it will be known that $a$ and the result of the map application have the same size.

\[
\begin{array}{rcll}
d & ::= & & \text{Size sorts} \\
  & \mid & n & \text{constant} \\
  & \mid & x & \text{variable} \\[4pt]
\tau & ::= & & \text{Basic types} \\
  & \mid & \text{int} & \text{integer} \\
  & \mid & \text{bool} & \text{boolean} \\
  & \mid & (\tau, \tau') & \text{pair} \\
  & \mid & [d]\tau & \text{array} \\
  & \mid & (x : \tau) \to \mu & \text{function} \\[4pt]
\mu & ::= & & \text{Return types} \\
  & \mid & \exists x.\mu & \text{existential size} \\
  & \mid & \tau & \text{basic type} \\[4pt]
\sigma & ::= & & \text{Type schemes} \\
  & \mid & \tau & \text{basic type} \\
  & \mid & \text{¢} & \text{abstract size type} \\[4pt]
e & ::= & & \text{Expressions} \\
  & \mid & n & \text{constant integer} \\
  & \mid & \text{true} \mid \text{false} & \text{constant boolean} \\
  & \mid & x & \text{variable} \\
  & \mid & \lambda(x : \tau).e & \text{function} \\
  & \mid & e\ e & \text{application} \\
  & \mid & [e, \cdots, e] & \text{array} \\
  & \mid & \text{iota}\ e & \text{index array} \\
  & \mid & e[e] & \text{array index} \\
  & \mid & (e, e) & \text{pair} \\
  & \mid & \text{fst}\ e \mid \text{snd}\ e & \text{projection} \\
  & \mid & \text{if}\ e\ \text{then}\ e\ \text{else}\ e & \text{conditional} \\
  & \mid & e \triangleright \tau & \text{dynamic type coercion} \\
  & \mid & \text{let}\ p = e\ \text{in}\ e & \text{let} \\[4pt]
p & ::= & x & \\
  & \mid & [\bar{x}]\ x : \tau &
\end{array}
\]

Figure 1. Grammar for $F$.

2.1 Types and Expressions

We assume a denumerably infinite set of program variables, ranged over by $x$, $y$, and $f$. The grammars for size sorts ($d$), basic types ($\tau$), and return types ($\mu$) are defined in Figure 1. Size sorts appear in array types of the form $[d]\tau$, which denote arrays of size $d$ containing elements of type $\tau$. A function type $(x : \tau) \to \mu$ allows for $x$ to appear in $\mu$, which is useful for expressing a dependency between a parameter of type int and the size of an array in the result.

For basic types of the form $(x : \tau) \to \mu$, we consider $x$ bound in $\mu$. For return types of the form $\exists x.\mu$, we consider $x$ bound in $\mu$. Basic types and return types are each considered identical up to renaming of bound variables. Return types are also considered identical up to removal of bound existential variables that do not occur free in the body.

Contexts, ranged over by $\Gamma$, map program variables to type schemes, which, in the simple setting that lacks parametric size- and type-polymorphism, are either a type $\tau$ or a type scheme of the form ¢, which denotes an abstract size. When $\bar{x} = x_1, \cdots, x_n$ and $\bar{\sigma} = \sigma_1, \cdots, \sigma_n$, we write $\bar{x} : \bar{\sigma}$ for specifying the type assumption $x_1 : \sigma_1, \cdots, x_n : \sigma_n$.

Size sorts, $\Gamma \vdash d\ \text{ok}$:

\[
\frac{}{\Gamma \vdash n\ \text{ok}}
\qquad
\frac{\Gamma(x) = \text{int}}{\Gamma \vdash x\ \text{ok}}
\qquad
\frac{\Gamma(x) = \text{¢}}{\Gamma \vdash x\ \text{ok}}
\]

Types, $\Gamma \vdash \tau\ \text{ok}$:

\[
\frac{}{\Gamma \vdash \text{int}\ \text{ok}}
\qquad
\frac{}{\Gamma \vdash \text{bool}\ \text{ok}}
\qquad
\frac{\Gamma \vdash d\ \text{ok} \quad \Gamma \vdash \tau\ \text{ok}}{\Gamma \vdash [d]\tau\ \text{ok}}
\qquad
\frac{\Gamma \vdash \tau\ \text{ok} \quad \Gamma \vdash \tau'\ \text{ok}}{\Gamma \vdash (\tau, \tau')\ \text{ok}}
\]
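To make the well-formedness judgments for size sorts and types more concrete, the following is a minimal Python sketch of a checker for the rules listed so far (integer, boolean, array, and pair types). It is an illustration only, not the authors' implementation: the class names, the `ABSTRACT_SIZE` marker, and the encoding of a context as a dictionary are all assumptions made for this example, and function types are deliberately left out because their rule is not among those listed.

```python
from dataclasses import dataclass
from typing import Dict, Union

# Hypothetical marker for the abstract size type scheme (an assumption of
# this sketch, not notation from the paper).
ABSTRACT_SIZE = "abstract-size"


@dataclass(frozen=True)
class Int:
    pass


@dataclass(frozen=True)
class Bool:
    pass


@dataclass(frozen=True)
class Pair:
    fst: "Type"
    snd: "Type"


@dataclass(frozen=True)
class Array:
    size: Union[int, str]  # size sort: integer constant or variable name
    elem: "Type"


Type = Union[Int, Bool, Pair, Array]
Context = Dict[str, object]  # variable name -> Type or ABSTRACT_SIZE


def size_ok(ctx: Context, d: object) -> bool:
    """Size sort check: constants are always ok; a variable is ok only if
    the context binds it to int or to the abstract size scheme."""
    if isinstance(d, bool):  # exclude bool, which Python treats as an int
        return False
    if isinstance(d, int):
        return True
    if isinstance(d, str):
        bound = ctx.get(d)
        return bound == Int() or bound == ABSTRACT_SIZE
    return False


def type_ok(ctx: Context, t: object) -> bool:
    """Type check for int, bool, array, and pair types."""
    if isinstance(t, (Int, Bool)):
        return True
    if isinstance(t, Array):
        return size_ok(ctx, t.size) and type_ok(ctx, t.elem)
    if isinstance(t, Pair):
        return type_ok(ctx, t.fst) and type_ok(ctx, t.snd)
    return False


if __name__ == "__main__":
    ctx = {"n": Int(), "m": ABSTRACT_SIZE, "b": Bool()}
    print(type_ok(ctx, Array("n", Array(3, Int()))))       # True
    print(type_ok(ctx, Array("m", Bool())))                # True
    print(type_ok(ctx, Array("b", Int())))                 # False: b is a bool
    print(type_ok(ctx, Pair(Int(), Array("k", Bool()))))   # False: k is unbound
```

The sketch mirrors the restriction that a size is syntactically either a constant or a variable: an array type is rejected as soon as its size variable is unbound or bound to something other than int or an abstract size.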
