An Overview of Standard ML M
Total Page:16
File Type:pdf, Size:1020Kb
CMSC 22100/32100: Programming Languages An Overview of Standard ML M. Blume October 2, 2008 Contents 1 What is SML 2 2 Integers, reals, booleans 2 3 Variables 3 4 Constructing new types 4 4.1 Option type . 4 4.2 List type . 4 4.3 Tuples . 5 4.4 Records . 6 4.5 Data Types . 6 5 Functions 9 5.1 Function values and function definitions . 9 5.2 Function application . 10 5.3 Curried function definitions . 10 6 Other types 11 6.1 The unit type ............................... 11 6.2 References|the ref type ......................... 11 6.3 Text|the types char and string ..................... 11 7 Block structure 12 7.1 Simultaneous bindings and mutual recursion . 13 8 Patterns and pattern matching 14 8.1 Examples . 16 9 Other language features 18 9.1 Polymorphism . 18 9.2 Modules . 18 9.3 Exceptions . 21 9.4 Infix declarations . 21 10 Using Files 21 10.1 Function use ................................ 21 10.2 A note on polymorphism and type inference . 22 10.3 Generativity of datatype definitions . 22 10.4 CM | the SML/NJ compilation manager . 23 1 A Reserved words 23 B Pre-defined infix operators 24 C Other restricted identifiers 24 1 What is SML Standard ML is a strongly typed, impure, strict functional language. Strongly typed: Every value, expression in the language has a type (int, real, bool etc.). The compiler rejects a program that does not conform to the type system of the language. Functional: Each expression evaluates to a value. Some of these values are functions. In fact, every function in ML is a value. Like other values, functions can be bound to variables, passed as arguments to other func- tions, returned as values from function calls, and stored in data structures. Impure: Unlike in other functional languages such as Haskell, the evaluation of expressions in ML can incur side-effects, e.g., assignment to locations within mutable data structures or I/O. Strict: Unlike \lazy" languages such as Haskell, arguments to ML functions are evaluated before the function call is performed. This means that if one of the arguments loops forever, then so will the entire program, regardless of whether or not the function actually needed that argument. Similarly, all side-effects caused by the evaluation of the arguments occur before any side-effects caused by the evaluation of the function body. In this class, we will use the SML/NJ compiler. SML/NJ generates reason- ably fast executable code, although some other compilers, e.g., MLton, outper- form it on a regular basis. However, SML/NJ can be used interactively and comes with a mature programming environment. (Some people use SML/NJ for development and use MLton to speed up the final version. However, in this class, runtime performance will not be an issue.) 2 Integers, reals, booleans Among the built-in data types are the following: int, real, bool, 'a option, etc. Standard ML calls the type of floating points numbers real.1 Here are some examples. $ sml Standard ML of New Jersey v110.68 [built: Mon Sep 8 13:47:59 2008] - (* Integers: *) 1This is a misnomer; they are really finite precision numbers. 2 - 1; val it = 1 : int - 2; val it = 2 : int - 1+2; val it = 3 : int - - (* Reals: *) - 1.0; val it = 1.0 : real - 2.0; val it = 2.0 : real - Math.sqrt(3.0*3.0+4.0*4.0); [autoloading] [library $SMLNJ-BASIS/basis.cm is stable] [autoloading done] val it = 5.0 : real - (* Booleans: *) - false; val it = false : bool - true; val it = true : bool - if true then 1 else 2; val it = 1 : int Some notes: • Comments in Standard ML are enclosed within comment brackets: (* ... *). Comments are nestable. • Many library functions such as Math.sqrt are automatically loaded by the SML/NJ interactive system on an on-demand basis. This is what accounts for the [autoloading ... done] sequence of messages. 3 Variables Variables in Standard ML are identifiers that name values. Once a binding for a variable is established, the variable continues to name the same value until it goes out of scope. In other words, Standard ML variables are immutable. Imperative programming with assignment, while possible using reference val- ues (see below), it is not encouraged and cannot be done using variables alone. Assignment to variables can usually be avoided by writing iterative algo- rithms in recursive style. In this style, instead of changing the value of a vari- able one simply establishes a new binding. The new binding can use the same identifier, in which case the old binding will be shadowed and go out of scope. As an example, consider the following iterative version of the factorial func- tion, written in tail-recursive style: 3 - fun fac_loop (n, f) = if n = 0 then f else fac_loop (n-1, f*n); val fac_loop = fn : int * int -> int - fun fac n = fac_loop (n, 1); val fac = fn : int -> int - fac 10; val it = 3628800 : int 4 Constructing new types ML has a variety of built-in type constructors, and programmers can add to that variety by defining their own. A type constructor maps a sequence of types to another type. The length of the argument sequence is called the arity of the type constructor. The names of \ordinary" types, e.g., int, bool, real, and so on are actually type constructors of arity 0. When forming types, the type arguments precede the name of the constructor.2 4.1 Option type One of the predefined type constructors in Standard ML is option. It gives rise to the family of option types. Example: Values of type int option come in two varieties: There is the singleton value NONE or the set of values of the form SOME i where i an integer. In general, types of the form t option are used to describe situations where there may or may not be a value of type t. For example, the library function Int.fromString takes a string argument and returns an int option, using NONE to indicate that the argument could not be parsed as an integer. - NONE; val it = NONE : 'a option - SOME 1; val it = SOME 1 : int option - SOME 1.0; val it = SOME 1.0 : real option - SOME false; val it = SOME false : bool option The option type is really a datatype (see below) defined by: datatype 'a option = NONE | SOME of 'a 4.2 List type Another predefined type constructor in Standard ML is list. A type t list contains values that are finite lists whose elements are values of type t. The 2If there is more than one, the arguments are given as a comma-separated list, enclosed in parentheses. 4 empty list is called nil. The keyword nil is special in that it is interchangeable with the notation []. A non-empty list consists of a head element and a tail list (of the same type). If h is the head and t is the tail, then the entire list is h :: t. The infix constructor :: is sometimes pronounced \cons." It associates to the right. A k-element list x1 :: ··· :: xk :: nil can alternatively be written [x1, :::, xk]. - nil; val it = [] : 'a list - 1 :: nil; val it = [1] : int list - true :: false :: nil; val it = [true,false] : bool list - [1.0,2.0,4.0]; val it = [1.0,2.0,4.0] : real list The list type is essentially a datatype (see below) defined by: datatype 'a list = nil | :: of 'a * 'a list However, to be precise, one must also declare :: as an infix operator and estab- lish the alternative syntax using square brackets. The former would be possible using a Standard ML infix declaration (actually, since we need right associativ- ity, it would be an infixr declaration), but the latter is built-in syntax|which is why the list type itself must be built-in and cannot be defined from first principles. 4.3 Tuples The built-in type constructor *, which is written in infix notation and may be repeated as in int * real * bool to form arbitrary n-ary tuple type con- structors, produces types that correspond to cartesian products of other types. The values inhabiting these types are called tuples and are written as comma- separated sequences enclosed in round parentheses. - (1, 2); val it = (1,2) : int * int - (1.0, 2.0); val it = (1.0,2.0) : real * real - (1, 2.0, true, NONE); val it = (1,2.0,true,NONE) : int * real * bool * 'a option The notation #i where i is an integer literal greater or equal than 1, speci- fying a position, can be used to project from tuples. However, it is often more convenient to get at the elements of a tuple by pattern matching (see below). 5 - #1 (1, 2.0); val it = 1 : int - #2 (1, 2.0); val it = 2.0 : real 4.4 Records Records generalize tuples by making it possible to give names (labels) to fields instead of relying on positional information. (In fact, Standard ML defines tuples to be special cases of records where the labels are numeric and form an uninterrupted sequence from 1 to some natural number n.) Projection from records uses the notation #l where l is the label name. - val r = { a = 1.0, b = true, c = {a = 3.0}, d = fn x => x}; val r = {a=1.0,b=true,c={a=3.0},d=fn} : {a:real, b:bool, c:{a:real}, d:'a -> 'a} - #a (#c r); val it = 3.0 : real - (#d r) 1; val it = 1 : int As shown in the examples above, record types have the form fl1:t1; : : : ;ln:tng where the li are the labels and the ti are the corresponding types of individual record fields.