Open Data Types and Open Functions
Total Page:16
File Type:pdf, Size:1020Kb
Open Data Types and Open Functions Andres Loh¨ Ralf Hinze Institut fur¨ Informatik III, Universitat¨ Bonn Romerstraße¨ 164, 53117 Bonn, Germany {loeh,ralf}@informatik.uni-bonn.de Abstract Preferably, such extensions should be possible without modify- The problem of supporting the modular extensibility of both data ing the previously written code, because we want to organize differ- and functions in one programming language at the same time is ent aspects separately, and the existing code may have been written known as the expression problem. Functional languages tradi- by a different person and exist in a closed library. tionally make it easy to add new functions, but extending data Alas, Haskell natively supports only one of the two directions (adding new data constructors) requires modifying existing code. of extension. We can easily add a new function, such as We present a semantically and syntactically lightweight variant of toString :: Expr → String open data types and open functions as a solution to the expression toString (Num n) = show n problem in the Haskell language. Constructors of open data types and equations of open functions may appear scattered throughout but we cannot grow our language of expressions: in Haskell, all a program with several modules. The intended semantics is as fol- constructors of the Expr data type must be defined the very moment lows: the program should behave as if the data types and functions when Expr is introduced. We therefore cannot add new constructors were closed, defined in one place. The order of function equations to the Expr data type without touching the original definition. is determined by best-fit pattern matching, where a specific pattern The above problem of supporting the modular extensibility of takes precedence over an unspecific one. We show that our solution both data and functions in one programming language at the same is applicable to the expression problem, generic programming, and time has been formulated by Wadler [1] and is called the expression exceptions. We sketch two implementations: a direct implementa- problem. As we have just witnessed, functional languages make it tion of the semantics, and a scheme based on mutually recursive easy to add new functions, but extending data (adding new data modules that permits separate compilation. constructors) requires modifying existing code. In object-oriented languages, extension of data is directly supported by defining new Categories and Subject Descriptors Programming Lan- D.3.3 [ classes, but adding new functions to work on that data requires guages ]: Language Constructs and Features changing the class definitions. Using the Visitor design pattern, one General Terms Languages, Design can simulate the situation of functional programming in an object- oriented language: one gains extensibility for functions, but loses Keywords expression problem, extensible data types, extensible extensibility of data at the same time. functions, functional programming, Haskell, generic programming, Many partial and proper solutions to the expression problem extensible exceptions, mutually recursive modules have been proposed over the years. However, much of this work 1. Introduction suffers from one or more of the following disadvantages: • Suppose we have a very simple language of expressions, which It is focused on object-oriented programming, and cannot di- consists only of integer constants: rectly be translated to a functional language. • data Expr = Num Int It introduces rather complex extensions to the type system, such as multi-methods [2, 3] or mixins [4]. We intend to grow the language, and add new features as the need • Open entities have their own special syntax or severe limi- for them arises. Here is an interpreter for expressions: tations, thereby forcing a programmer to decide in advance eval :: Expr → Int whether an entity should be open or not, because there is no eval (Num n) = n easy way to switch later. There are two possible directions to extend the program: first, we Closest to our proposal is the work of Millstein et al. [3], who de- can add new functions, such as a conversion function from expres- scribe the addition of hierarchical classes and extensible functions sions to strings; second, we can add new forms of expressions by to ML. A discussion of this and other related work can be found in adding new constructors. Section 6. In this paper, we present a semantically and syntactically lightweight variant of open data types and open functions as a solution to the expression problem in the functional programming Permission to make digital or hard copies of all or part of this work for personal or language Haskell. By declaring Expr (from the example above) as classroom use is granted without fee provided that copies are not made or distributed an open data type, we can add new constructors at any time and at for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute any point in the program. If we want to evaluate or print the newly to lists, requires prior specific permission and/or a fee. introduced expression forms, we have to adapt eval and toString. PPDP’06 July 10–12, 2006, Venice, Italy. If these functions are declared as open functions, we can add new Copyright c 2006 ACM 1-59593-388-3/06/0007. $5.00. defining equations for the functions at a later point in the program. open eval :: Expr → Int The intended semantics is as follows: the program should behave as if the data types and functions were closed, defined in one place. Note that only top-level functions can be marked as open. The The main design goal of our proposal is simplicity: open data defining equations for an open function can be provided at any types and open functions are easy to add to the language, and their point of the program where the name of the function is in scope, semantics is simple to explain as a source-to-source transformation. using the normal syntax for function definitions: In particular, the type system and the operational semantics of eval (Num n) = n the original language are unaffected by the addition. Interestingly, we can implement our proposal in such a way that it supports If we leave it at that, we have a program consisting of an Expr separate compilation: we employ Haskell’s support for mutually data type with only one constructor, and an eval function consisting recursive modules to tie the recursive knot of open functions (see of only one equation, just like in the introduction. Section 5.2). In addition to the expression problem, we present two other ap- 2.3 The expression problem plications of open data types and open functions: generic program- Open data types and open functions support both directions of ming and extensible exceptions. extension. A new function can be added as usual, except that we The rest of the paper is structured as follows: In Section 2, also mark it as open: we present the syntax and informal semantics of open data types and open functions. We use the expression problem as a running open toString :: Expr → String example and show how it is solved using the new features. We then toString (Num n) = show n look at other applications (Section 3). A formal semantics is given It is now easy to add a new constructor for Expr. All we have to do in Section 4. We investigate possible implementations in Section 5. is provide its type signature: We then discuss related work (Section 6), before we conclude in Section 7. Plus :: Expr → Expr → Expr Of course, if we do not adapt eval and toString and call it on 2. Open data types and open functions expressions constructed with Plus, we get a pattern match failure. In this section, we describe how to declare open data types and open However, since eval and toString are open functions, we can just functions, and how to add new constructors to open data types and provide the missing equations: new equations to open functions. We also explain the semantics of eval (Plus e e ) = eval e + eval e these constructs, and we discuss the problem of pattern matching 1 2 1 2 for open functions. toString (Plus e1 e2) = "(" ++ toString e1 ++ "+" We strive for a simple and pragmatic solution to the problem ++ toString e2 ++ ")" of open data types. Therefore, we draw inspiration from a well- Note that we have extended our program without modifying exist- known and widely used Haskell feature that is extensible: type ing code. classes. Instances for type classes can be defined at any point in the program, but type classes also have a number of restrictions 2.4 Semantics that make this possible: Type classes have a global, program-wide meaning. Type class and instance declarations may only appear at Let us now turn to the semantics of open data types and open the top level of a module. All instances are valid for the entire functions. The driving goal of our proposal is to be as simple program, and instance declarations cannot be hidden. as possible: a program should behave as if all open data types We use a similar approach for open data types and functions: and open functions were closed, defined in one place. Apart from they have a global, program-wide meaning. Open entities can only the syntactic peculiarity that they are defined in several places, be defined and extended at the top level of a module. All construc- open data types and open functions define ordinary data types and tors and function equations are valid for the entire program, and functions.