The History of Standard ML
Total Page:16
File Type:pdf, Size:1020Kb
The History of Standard ML DAVID MACQUEEN, University of Chicago, USA ROBERT HARPER, Carnegie Mellon University, USA JOHN REPPY, University of Chicago, USA Shepherd: Kim Bruce, Pomona College, USA The ML family of strict functional languages, which includes F#, OCaml, and Standard ML, evolved from the Meta Language of the LCF theorem proving system developed by Robin Milner and his research group at the University of Edinburgh in the 1970s. This paper focuses on the history of Standard ML, which plays a central rôle in this family of languages, as it was the first to include the complete set of features that we now associate with the name “ML” (i.e., polymorphic type inference, datatypes with pattern matching, modules, exceptions, and mutable state). Standard ML, and the ML family of languages, have had enormous influence on the world of programming language design and theory. ML is the foremost exemplar of a functional programming language with strict evaluation (call-by-value) and static typing. The use of parametric polymorphism in its type system, together with the automatic inference of such types, has influenced a wide variety of modern languages (where polymorphism is often referred to as generics). It has popularized the idea of datatypes with associated case analysis by pattern matching. The module system of Standard ML extends the notion of type-level parameterization to large-scale programming with the notion of parametric modules, or functors. Standard ML also set a precedent by being a language whose design included a formal definition with an associated metatheory of mathematical proofs (such as soundness of the type system). A formal definition was one of the explicit goals from the beginning of the project. While some previous languages had rigorous definitions, these definitions were not integral to the design process, and the formal part was limitedtothe language syntax and possibly dynamic semantics or static semantics, but not both. The paper covers the early history of ML, the subsequent efforts to define a standard ML language, and the development of its major features and its formal definition. We also review the impact that the language had on programming-language research. CCS Concepts: • Software and its engineering ! Functional languages; Formal language definitions; Polymorphism; Data types and structures; Modules / packages; Abstract data types. Additional Key Words and Phrases: Standard ML, Language design, Operational semantics, Type checking ACM Reference Format: David MacQueen, Robert Harper, and John Reppy. 2020. The History of Standard ML. Proc. ACM Program. Lang. 4, HOPL, Article 86 (June 2020), 100 pages. https://doi.org/10.1145/3386336 86 Authors’ addresses: David MacQueen, Computer Science, University of Chicago, 5730 S. Ellis Avenue, Chicago, IL, 60637, USA, [email protected]; Robert Harper, Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA, 15213, USA, [email protected]; John Reppy, Computer Science, University of Chicago, 5730 S. Ellis Avenue, Chicago, IL, 60637, USA, [email protected]. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all otheruses, contact the owner/author(s). © 2020 Copyright held by the owner/author(s). 2475-1421/2020/6-ART86 https://doi.org/10.1145/3386336 Proc. ACM Program. Lang., Vol. 4, No. HOPL, Article 86. Publication date: June 2020. 86:2 David MacQueen, Robert Harper, and John Reppy Contents Abstract 1 Contents 2 1 Introduction: Why Is Standard ML important?4 2 Background6 2.1 LCF/ML — The Original ML Embedded in the LCF Theorem Prover7 2.1.1 Control Structures8 2.1.2 Polymorphism and Assignment8 2.1.3 Failure and Failure Trapping8 2.1.4 Types and Type Operators9 2.2 HOPE9 2.3 Cardelli ML9 2.3.1 Labelled Records and Unions 11 2.3.2 The ref Type 12 2.3.3 Declaration Combinators 12 2.3.4 Modules 13 3 An Overview of the History of Standard ML 13 3.1 Early Meetings 13 3.1.1 November 1982 13 3.1.2 April 1983 14 3.1.3 June 1983 14 3.1.4 June 1984 15 3.1.5 May 1985 15 3.2 The Formal Definition 15 3.3 Standard ML Evolution 16 3.3.1 ML 2000 16 3.3.2 The Revised Definition (Standard ML ’97) 17 4 The SML Type System 18 4.1 Early History of Type Theory 18 4.2 Milner’s Type Inference Algorithm 22 4.3 Type Constructors 25 4.3.1 Primitive Type Constructors 25 4.3.2 Type Abbreviations 26 4.3.3 Datatypes 27 4.4 The Problem of Effects 29 4.5 Equality Types 32 4.6 Overloading 33 4.7 Understanding Type Errors 33 5 Modules 34 5.1 The Basic Design 36 5.2 The Evolution of the Design of Modules 41 5.3 Modules and the Definition 46 5.4 Module Innovations in Version 0.93 of SML/NJ 46 5.5 Module Changes in Definition (Revised) 46 5.6 Some Problems and Anomalies in the Module System 47 5.7 Modules and Separate Compilation 48 6 The Definition of Standard ML 49 Proc. ACM Program. Lang., Vol. 4, No. HOPL, Article 86. Publication date: June 2020. The History of Standard ML 86:3 6.1 Early Work on Language Formalization 49 6.2 Overview of the Definition 51 6.3 Early Work on the Semantics 52 6.4 The Definition 54 6.5 The Revised Definition 58 7 Type Theory and New Definitions of Standard ML 59 7.1 Reformulating the Definition Using Types 61 7.2 A Mechanized Definition 62 7.3 Lessons of Mechanization 64 8 The SML Basis Library 64 8.1 A Brief History of the Basis Library 64 8.2 Design Philosophy 66 8.3 Interaction with the REPL 67 8.4 Numeric Types 67 8.4.1 Supporting Multiple Precisions 67 8.4.2 Conversions Between Types 68 8.4.3 Real Numbers 69 8.5 Input/Output 69 8.6 Sockets 70 8.7 Slices 72 8.8 Internationalization and Unicode 72 8.9 Iterators 73 8.10 Publication and Future Evolution 73 8.11 Retrospective 74 8.12 Participants 74 9 Conclusion 75 9.1 Mistakes in the Design of Standard ML 75 9.2 The Impact of Standard ML 75 9.2.1 Applications of SML 75 9.2.2 Language Design 76 9.2.3 Language Implementation 77 9.2.4 Concurrency and Parallelism 78 9.3 The Non-evolution of SML 79 9.4 Summary 80 Acknowledgments 80 A Standard ML Implementations 80 A.1 Alice 80 A.2 Edinburgh ML 81 A.3 HaMLeT 81 A.4 The ML Kit 81 A.5 MLton 81 A.6 MLWorks 81 A.7 Moscow ML 81 A.8 Poly ML 82 A.9 Poplog/SML 82 A.10 RML 82 A.11 MLj and SML.NET 82 A.12 Standard ML of New Jersey 82 Proc. ACM Program. Lang., Vol. 4, No. HOPL, Article 86. Publication date: June 2020. 86:4 David MacQueen, Robert Harper, and John Reppy A.13 SML# 83 A.14 TIL and TILT 83 References 83 1 INTRODUCTION: WHY IS STANDARD ML IMPORTANT? The ML family of languages started with the meta language (hence ML) of the LCF theorem proving system developed by Robin Milner and his research group at Edinburgh University from 1973 through 1978 [Gordon et al. 1978b].1 Its design was based on experience with an earlier version of the LCF system that Milner had worked on at Stanford University [Milner 1972]. Here is how Milner succinctly describes ML, the LCF metalanguage, in the introduction to the book Edinburgh LCF [Gordon et al. 1979]: ::: ML is a general purpose programming language. It is derived in different aspects from ISWIM, POP2 and GEDANKEN, and contains perhaps two new features. First, it has an escape and escape trapping mechanism, well-adapted to programming strategies which may be (in fact usually are) inapplicable to certain goals. Second, it has a polymorphic type discipline which combines the flexibility of programming in a typeless language with the security of compile-time type checking (as in other languages, you may also define your own types, which may be abstract and/or recursive); this is what ensures that a well-typed program cannot perform faulty proofs. This original ML was an embedded language within the interactive theorem proving system LCF, serving as a logically-secure scripting language. Its type system enforced the logical validity of proved theorems and it enabled the partial automation of the proof process through the definition of proof tactics. It also served as the interactive command language of LCF. The basic framework of the language design was inspired by Peter Landin’s ISWIM language from the mid 1960s [1966b], which was, in turn, a sugared syntax for Church’s lambda calculus. The ISWIM framework provided higher-order functions, which were used to express proof tactics and their composition. To ISWIM were added a static type system that could guarantee that programs that purport to produce theorems in the object language actually produce logically valid theorems and an exception mechanism for working with proof tactics that could fail. ML followed ISWIM in using strict evaluation and allowing impure features, such as assignment and exceptions. Initially the ML language was only available to users of the LCF system, but its features soon evoked interest in a wider community as a potentially general-purpose programming language.