
IA010: Principles of Programming Languages

3. Types

Jan Obdržálek obdrzalek@fi.muni.cz

Faculty of Informatics, Masaryk University, Brno

Data types

A data type is a collection of data values and a set of predefined operations on these values.

Why use types
• error detection – improves reliability: "IQ" + 160
• implicit context for many operations – improves writability: a + b, new p
• code documentation – improves readability

A type system consists of
1 a mechanism to define types and associate them with certain language constructs
2 a set of rules for type equivalence, type compatibility and type inference

Outline

Primitive data types

Type checking

Composite data types
• Array types
• Record types
• Union types
• List types

Pointer and reference types

Type inference

Basic type taxonomy

Primitive types
• boolean type
• numeric types
• character type
• character string types
• enumeration types, subrange types

Composite types
• record types
• union types
• array types
• list types
• set types
• pointer and reference types

What about primitive and composite data types?

Primitive data types

Primitive and composite types

primitive data types
two meanings, may coincide
1 types with support built into the language – also called built-in types
2 building blocks for composite types – also called [...] types

composite data types
created by
• applying a type constructor (record, array, set ...)
• to one or more simpler types (either primitive or composite)

The distinction is not always clear and may depend on the language.

Numeric types

• historically oldest, typically reflect hardware
• integers and floats, complex numbers
• range can be implementation-dependent (problem: portability)

Integer types
• different lengths: signed char, short, int, long, long long (using at least 1, 2, 2, 4 and 8 bytes, respectively)
• may be signed or unsigned (typical implementation: two's complement)
• arbitrary precision: string representation "12354654231654L" (Python – long integer)
  • typical for scripting languages
  • performance penalty

Numeric types II

floating-point types
• model real numbers
• single (float/real) or double (double) precision
• standard IEEE 754
• cannot precisely express all real numbers:
  1 irrational numbers: e, π, ...
  2 even short numbers in decimal notation: 0.1 (base 10) = 0.0001100110011... (base 2)

decimal types
• fixed number of decimal digits; each digit uses the same number of bits (usually 4 or 8)
• use – business applications, precise decimal number representation (0.1)
• especially useful if available in hardware (BCD)
• examples: COBOL, C#, F#

Non-numeric primitive types

boolean type
• just two values (true, false)
• missing in C89, where an arbitrary numeric type can be used (zero/non-zero)
• usually implemented using more than a single bit

character type
• used to store a single character
• size of the representation depends on the encoding used (ASCII; modern languages: Unicode, UTF-xx) and may vary for different characters (e.g. in UTF-8)
• sometimes missing from the language (Python: strings of length 1)
• may even be handled like a numeric type (C)
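Returning to the floating-point and decimal types above, the 0.1 example can be checked directly. The following is a small sketch in Python, using the standard `fractions` and `decimal` modules as a stand-in for a decimal type; the variable names are chosen here for illustration only.

```python
from decimal import Decimal
from fractions import Fraction

# 0.1 has no finite binary expansion, so the stored double is only close to 1/10.
stored = Fraction(0.1)      # exact value of the IEEE 754 double nearest to 0.1
exact = Fraction(1, 10)

# Binary floating point: the rounding errors of 0.1 and 0.2 do not cancel.
float_sum_ok = (0.1 + 0.2 == 0.3)

# Decimal type: 0.1 is represented precisely, so the sum is exact.
decimal_sum_ok = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

print(stored == exact)   # False
print(float_sum_ok)      # False
print(decimal_sum_ok)    # True
```

Note that the decimal values are built from strings; `Decimal(0.1)` would inherit the binary rounding error of the float literal.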

Character string types

• string – a sequence of characters
• strings according to length
  • static length – Python, Java (String), C#
  • limited dynamic length – C (upper bound on the length)
  • dynamic length – JavaScript, standard C++ library
• strings according to implementation
  • special case of an array of characters – C, C++ (terminated by the \0 character)
  • primitive data type – Python
  • class – Java, F#
• supported operations
  • concatenation, comparison, substring selection
  • pattern matching – Perl, JavaScript, Ruby, PHP

Ordinal (discrete) types

An ordinal type is a type which can be mapped to a range of integers.

1 primitive ordinal types
  • provided in the language
  • e.g. Java: integer, char, boolean
2 user-defined ordinal types
  • enumeration types
  • subranges

User-defined ordinal types

Enumeration types

• the values, called enumeration constants, are enumerated in the definition

  enum days {Mon, Tue, Wed, Thu, Fri, Sat, Sun};

• typical implementation – implicit numerical value
• the value can often be given explicitly (Fri=2)
• advantages over named constants: type checking!

Important aspects
• can one enumeration constant name be used in multiple types?
• are enumeration constants coerced to integers? (C: yes; Java 5.0, C#, F#: no)
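The coercion question can be illustrated in Python, whose standard `enum` module happens to offer both behaviours. This is only a sketch: the classes `Day` and `CDay` and the helper `add_one` are hypothetical names introduced for the example.

```python
from enum import Enum, IntEnum

class Day(Enum):          # constants are NOT coerced to integers
    MON = 1
    FRI = 5

class CDay(IntEnum):      # constants behave like integers, as C enums do
    MON = 1
    FRI = 5

def add_one(value):
    """Return value + 1, or None if the type does not support integer addition."""
    try:
        return value + 1
    except TypeError:
        return None

strict_result = add_one(Day.FRI)     # None: Day.FRI + 1 raises TypeError
coerced_result = add_one(CDay.FRI)   # 6: IntEnum members are ints
```

With the non-coercing variant, mixing up a day and a plain number is reported as an error, which is the type-checking advantage over named constants.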

User-defined ordinal types

Subrange types

• contiguous subsequence of an ordinal type
• Pascal, Ada

Example (Ada)

  type Days is (Mon, Tue, Wed, Thu, Fri, Sat, Sun);
  subtype Weekdays is Days range Mon..Fri;
  subtype Index is Integer range 1..100;

• operations of the parent type are preserved (as long as the result stays in range)
• require run-time type checking
• advantages: readability, range checks
• can be simulated by asserts
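The last point, simulating a subrange by asserts, might look as follows. This is a minimal sketch in Python, assuming a hypothetical wrapper class `Index` that mirrors the Ada subtype `Index` from the example; it is not how any particular language implements subranges.

```python
class Index:
    """Hypothetical stand-in for Ada's `subtype Index is Integer range 1..100`."""
    LOW, HIGH = 1, 100

    def __init__(self, value):
        # The run-time range check that a subrange type would perform.
        assert self.LOW <= value <= self.HIGH, f"{value} not in {self.LOW}..{self.HIGH}"
        self.value = value

    def __add__(self, other):
        # The parent-type operation is preserved; the result is re-checked.
        return Index(self.value + int(other))

    def __int__(self):
        return self.value

i = Index(99) + 1          # fine: 100 is still in range
try:
    i + 1                  # 101 violates the range
    out_of_range_detected = False
except AssertionError:
    out_of_range_detected = True
```

As with any assert-based check, the test disappears when assertions are disabled (in Python, when running with `-O`), whereas a real subrange type keeps its run-time check.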

Type checking

topics
• type equivalence
• type compatibility
• type conversion (cast)
• type coercion
• nonconverting type cast
• type inference (later)

Type checking

Ensuring that the operands of an operation are of compatible types.

A language can be
• strongly typed – an operation cannot be applied to any object which does not support the operation
• statically typed – checking can be performed at compile-time
• dynamically typed – checking performed at run-time (a form of late binding) (languages with dynamic scoping)

Examples
• Ada, Java, C# – strongly typed (except for the explicit cast)
• Pascal – almost strongly statically typed (except for the untagged variant records)
• C89 – weak typing (unions, pointers, arrays, ...)
• Scheme, Lisp, ML, F# – strongly typed
• Python, Ruby – strongly dynamically typed

Type equivalence

When are two types equivalent?

Nontrivial in a language which allows the definition of new types (records, arrays, ...).

Two variables have the same type, if
• they were defined in the same declaration, or in a declaration using the same type name
  name equivalence: Pascal, Ada, Java, C#
• their types are identical as structures
  structural equivalence: Algol, Modula, C, ML
• combination of both approaches, e.g. C:
  • name equivalence for struct, union, enum
  • structural equivalence otherwise

Structural equivalence issues

Are the following types the same?

  type T1 = record
    a, b : integer
  end;

  type T2 = record
    a : integer;
    b : integer;
  end;

  type T3 = record
    b : integer;
    a : integer;
  end;

  type s = array [1..10] of char;
  type t = array [0..9] of char;

• T1 and T2: yes
• T2 and T3: no (most languages), yes (ML)
• s and t: no (most languages), yes (Ada)

Deciding type equivalence

Structural equivalence
• type names are (recursively) replaced by their definitions
• resulting strings are simply compared
• obstacles (surmountable): recursive types, pointers

Name equivalence
• straightforward name comparison
• assumption: if the programmer gave two definitions of the "same" type, he probably had a particular use in mind
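The two decision procedures can be sketched in Python. The encoding of types as nested tuples, the table `DEFS` and all function names are assumptions made for this illustration; the sketch also does not handle recursive types, which are the obstacle mentioned above.

```python
# Named types. A definition is either a primitive name or a tuple:
# ("record", (field, type), ...) or ("array", low, high, element_type).
DEFS = {
    "T1": ("record", ("a", "integer"), ("b", "integer")),
    "T2": ("record", ("a", "integer"), ("b", "integer")),
    "T3": ("record", ("b", "integer"), ("a", "integer")),
    "celsius": "integer",
}

def expand(t):
    """Recursively replace type names by their definitions."""
    if isinstance(t, str):
        return expand(DEFS[t]) if t in DEFS else t   # primitive names stay as-is
    if t[0] == "record":
        return ("record",) + tuple((name, expand(ft)) for name, ft in t[1:])
    if t[0] == "array":
        return ("array", t[1], t[2], expand(t[3]))
    raise ValueError(f"unknown type constructor: {t[0]}")

def structurally_equivalent(t1, t2):
    # Compare the fully expanded structures.
    return expand(t1) == expand(t2)

def name_equivalent(t1, t2):
    # Straightforward name comparison.
    return t1 == t2
```

In this sketch the field order is significant, so T2 and T3 come out as different, which matches the "most languages" answer; a language such as ML would compare the fields as an unordered collection instead.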

Name equivalence

  TYPE new_type = old_type; (* Modula-2 *)

• old_type and new_type are aliases
• two types, or two names for the same type?
  • two types: strict name equivalence
  • same type: loose name equivalence (Pascal)

Example: strict name equivalence

  TYPE imperial_distance = REAL;
       metric_distance = REAL;
  VAR i : imperial_distance;
      m : metric_distance;
  ...
  m := i; (* this should probably be an error *)

Name equivalence in Ada

• a restrictive version of name equivalence
• subtype – a type equivalent to the parent type

  subtype new_int is integer;
  subtype small_int is integer range 1..100;

• derived type – a new type

  type imperial_distance is new float;
  type metric_distance is new float;

• note the difference:

  type derived_small_int is new integer range 1..100;
  subtype subrange_small_int is integer range 1..100;

Type conversion (cast)

Change of types, explicitly stated in the program code.

Implementation – three principal cases:
1 structurally equivalent types (same internal representation)
  (no code executed – the conversion is "for free")
2 different types, same representation (e.g. subtypes)
  (run-time check, value can be used if successful)
3 different types with related values (e.g. int vs float)
  (the specified conversion is performed)
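To make case 3 concrete, and to contrast it with a nonconverting reinterpretation of the same bits, here is a small sketch in Python. It uses the standard `struct` module to view the bytes of a single-precision float as an unsigned integer; the variable names are illustrative.

```python
import struct

n = 1
converted = float(n)     # converting cast: a new bit pattern representing 1.0

# Nonconverting view: pack the float into 4 bytes, then read the very same
# bytes back as an unsigned 32-bit integer.
bits = struct.unpack("<I", struct.pack("<f", 1.0))[0]

sign = bits >> 31
exponent = ((bits >> 23) & 0xFF) - 127   # remove the IEEE 754 single-precision bias
significand = bits & 0x7FFFFF            # the 23 stored fraction bits
```

The integer 1 and the float 1.0 denote related values but have entirely different bit patterns (1 versus 0x3F800000), which is why the conversion needs run-time code while the reinterpretation does not.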

Nonconverting type cast
• no conversion is performed, the stored value is only interpreted as a value of the new type
• uses: systems programming, significand/exponent extraction, ...

Type conversion examples

  type test_score is new integer range 0..100;
  type celsius_temp is new integer;
  ...
  n : integer;       -- assume 32 bits
  r : real;          -- assume IEEE double-precision
  t : test_score;
  c : celsius_temp;
  ...
  t := test_score(n);     -- run-time semantic check required
  n := integer(t);        -- no check req.; every test_score is an int
  r := real(n);           -- requires run-time conversion
  n := integer(r);        -- requires run-time conversion and check
  n := integer(c);        -- no run-time code required
  c := celsius_temp(n);   -- no run-time code required

Type compatibility

• of prime importance to the programmer
• full type equivalence is not always needed
• we often need only compatible types:
  • addition – two numeric type operands
  • assignment – target type compatible with the source type
  • subroutine call – formal parameters compatible with arguments
• the definition of compatibility differs significantly among various languages

Coercion
• implicit type conversion; for compatible types
• implementation similar to type cast (explicit type conversion)

Coercion

Coercion causes significant weakening of the type system!

• trends: less (or no) coercion, but ...
• improves writability, supports abstraction
• today: scripting languages, C++

  short int s;
  unsigned long int l;
  char c;    /* may be signed or unsigned */
  float f;   /* usually IEEE single-precision */
  double d;  /* usually IEEE double-precision */
  ...
  s = l;  /* low bits are interpreted as a signed number */
  l = s;  /* sign-extended, then interpreted as unsigned */
  s = c;  /* either sign-extended or zero-extended */
  f = l;  /* precision may be lost */
  d = f;  /* no precision lost */
  f = d;  /* precision may be lost, undefined possible */

Composite data types

Array types

• the most used and most important composite data type
• homogeneous aggregate of data elements
• elements are identified by their relative position
• semantically finite mappings: array_name(index) → element
• design decisions:
  • which types can be used for indexing?
  • are the bounds checked on access?
  • when are the bounds fixed?
  • when does array allocation take place?
  • rectangular or ragged multidimensional arrays?
  • initialization when allocated?
  • what kind of slices are allowed, if any?

Arrays – indexing

Which type can be used for indexing?
• integer types (Fortran, C, ...)
• any ordinal type (Ada)
• user-defined keys (associative arrays – e.g. Python)

Bound checking
• "expensive" operation, historically usually omitted (C)
• however common in modern languages (Java, C#, ML)

Lower bounds
• fixed: C (0) and its successors
• user defined: Ada, Fortran95+ (1 by default)

Multidimensional arrays

Language
• pure multidimensional array (access: [2,3])

  mat: array (1..10,1..10) of real; -- Ada

• array of arrays (access: [2][3])

  VAR mat : ARRAY [1..10] OF ARRAY [1..10] OF REAL;

Array shape
• rectangular array
  • all rows of the same length
• jagged array
  • length may differ between the rows (C, Java)
  • typical for the "array of arrays" approach

Arrays – slices

• a slice is some substructure of an array
• trivially a single row/column
• many other options (Fortran 90):

[Figure: Fortran 90 array slices, from Scott]

Array bounds and storage bindings

type                 bounds       allocation   storage
static               static       static
fixed stack-dynamic  static       elaboration  stack
stack-dynamic        elaboration  elaboration  stack
fixed heap-dynamic   execution    execution    heap
heap-dynamic         dynamic      dynamic      heap

elaboration – when the declaration is elaborated
execution – when the program actually requests the array
dynamic – execution + can change during run-time

C-family languages:
• C89: fixed stack-dynamic, static (static), fixed heap-dynamic (malloc/free)
• Java: fixed heap-dynamic
• C#: fixed heap-dynamic, heap-dynamic (List class)

Array type examples

fixed stack-dynamic array (C89)

  void foo() {
      int fixed_stack_dynamic_array[7];
      /* ... */
  }

stack-dynamic array (C99)

  void foo(int n) {
      int stack_dynamic_array[n];
      /* ... */
  }

fixed heap-dynamic array (C89)

  int *fixed_heap_dynamic_array = malloc(7 * sizeof(int));

Array implementation – storage

Two basic choices:
1 a block of adjacent memory cells
  • advantages: simple addressing
  • mapping to one dimension
    • by rows (row major) – almost all languages
    • by columns (column major) – Fortran
2 array of arrays
  • cons: may need more space
  • pros: jagged arrays, rows can be shared

Some languages support both approaches – e.g. C.
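The mapping of a two-dimensional array to one dimension (choice 1 above) can be written down explicitly. The following Python sketch uses 0-based indices; the function names and the small test matrix are chosen here for illustration.

```python
def row_major_offset(i, j, rows, cols):
    """Offset of element (i, j), 0-based, when rows are stored one after another."""
    return i * cols + j

def column_major_offset(i, j, rows, cols):
    """Offset of element (i, j), 0-based, when columns are stored one after another (Fortran)."""
    return j * rows + i

ROWS, COLS = 3, 4
matrix = [[10 * r + c for c in range(COLS)] for r in range(ROWS)]

# Flatten the same matrix both ways.
by_rows = [matrix[r][c] for r in range(ROWS) for c in range(COLS)]
by_cols = [matrix[r][c] for c in range(COLS) for r in range(ROWS)]
```

For example, element (2, 1) of the 3 × 4 matrix sits at offset 2·4 + 1 = 9 in the row-major block and at offset 1·3 + 2 = 5 in the column-major block.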

Arrays in C

[Scott]

Record types

• heterogeneous (unlike arrays)
• model collections of related data
• correspond to cartesian products

  struct rpg_character {
      char name[20];
      int strength, stamina, dexterity, inteligence;
      _Bool male;
      ...
  };

• individual elements are often called fields
• access – usually using the dot notation: frodo.strength
• nesting – usually allowed

C – struct, C++ – special version of class
Java – ordinary classes used instead

Record type implementation

• usually consecutive memory cells
• may contain "holes" (to align with word-length)

  struct element {
      char name[2];
      int atomic_number;
      double atomic_weight;
      _Bool metallic;
  };

Likely layout on a 32-bit machine [Scott]

Record packing and reordering

record packing

• usually explicitly requested by the programmer (Pascal)
• space/speed trade-off (breaks alignment)
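The effect of holes and of packing can be observed from Python with the standard `ctypes` module, which lays out structures the way the platform's C compiler would. This is a sketch only: the class names are hypothetical, the field list mirrors the `struct element` example, and the aligned size depends on the platform.

```python
import ctypes

FIELDS = [
    ("name", ctypes.c_char * 2),
    ("atomic_number", ctypes.c_int),
    ("atomic_weight", ctypes.c_double),
    ("metallic", ctypes.c_bool),
]

class Element(ctypes.Structure):        # default (aligned) layout
    _fields_ = FIELDS

class PackedElement(ctypes.Structure):  # packed: no padding between fields
    _pack_ = 1
    _fields_ = FIELDS

aligned_size = ctypes.sizeof(Element)          # platform-dependent, includes holes
packed_size = ctypes.sizeof(PackedElement)     # 2 + 4 + 8 + 1 bytes
number_offset = Element.atomic_number.offset   # where the int lands after the hole
```

On common platforms with a 4-byte `int`, the packed record takes 15 bytes, while the aligned one is larger because of a 2-byte hole after `name` and trailing padding after `metallic`.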

record reordering

• to make records both space- and speed-efficient
• may be problematic (systems programming, FFI)
  (solution: nonstandard alignment can be specified (Ada, C++))

Tuple types

• similar to record types, but elements are not named
• use: functions returning more values
• Python
  • immutable type, can be converted to an array and back
  • elements accessed using array syntax
  • of any number of elements (even 0)

  myTuple = (42, 2.7, 'mtb')
  myTuple[1]

• F#
  • pairs can be addressed using fst and snd
  • otherwise using tuple patterns

  let tup = (42, 50, 1729);
  let a, b, c = tup;

Union (variant) types

Allow a variable to store different type values at different times during program execution. (Correspond to set unions.)

  union flexType {
      int intEl;
      float floatEl;
  };

• storage – allocated for the largest variant
• uses
  • system programming (non-converting type cast)
  • representing alternatives in a record
• problem: free unions are not type checked:

  union flexType el1;
  float x;
  ...
  el1.intEl = 27;
  x = el1.floatEl;

Unions are often missing in modern languages (e.g. Java, C#).

Discriminated (tagged) unions

• each union variable keeps an additional piece of information – the tag/discriminant – saying which variant is currently in use
• support type checking
• common in functional languages (ML, Haskell, F#)

  type intReal =    // F#
      | IntValue of int
      | RealValue of float;

  let printType value =
      match value with
      | IntValue value -> printfn "It is an integer"
      | RealValue value -> printfn "It is a float";

  let ir2 = RealValue 3.4;
  printType ir2;
  It is a float
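For comparison, the same tagged union can be approximated in Python, where the run-time class of an object plays the role of the tag. This is a sketch under that assumption; the names `IntValue`, `RealValue` and `describe` simply mirror the F# example.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class IntValue:
    value: int

@dataclass
class RealValue:
    value: float

IntReal = Union[IntValue, RealValue]   # the tag is the run-time class of the object

def describe(v: IntReal) -> str:
    # Dispatch on the tag, as the F# match expression does.
    if isinstance(v, IntValue):
        return "It is an integer"
    if isinstance(v, RealValue):
        return "It is a float"
    raise TypeError(f"not an IntReal variant: {v!r}")

ir2 = RealValue(3.4)
message = describe(ir2)   # "It is a float"
```

Unlike the free union in C, there is no way to store an `IntValue` and then read it back as a float: the tag is always checked before the payload is used.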

Variant records

  type shapeKind = (square, rectangle, circle);  (* Pascal *)
       shape = record
         centerx : integer;
         centery : integer;
         case kind : shapeKind of
           square : (side : integer);
           rectangle : (length, height : integer);
           circle : (radius : integer);
       end;

Variant records in C/C++

  #include <assert.h>

  enum ShapeKind { Square, Rectangle, Circle };

  struct Shape {
      int centerx;
      int centery;
      enum ShapeKind kind;
      union {
          struct { int side; };            /* Square */
          struct { int length, height; };  /* Rectangle */
          struct { int radius; };          /* Circle */
      };
  };

  int getSquareSide(struct Shape* s) {
      assert(s->kind == Square);
      return s->side;
  }

  void setSquareSide(struct Shape* s, int side) {
      s->kind = Square;
      s->side = side;
  }

Lists

• defined recursively: a list is
  • either an empty list, or
  • a pair of an object and another (shorter) list
• particularly useful in functional languages (which use recursion and higher order functions)
• common in imperative scripting languages (Python)
• can be modelled using records and pointers
• two main kinds
  • homogeneous (every element of the same type – ML)
  • heterogeneous (any object can be placed in the list – Lisp)
• terminology: head – the first element, tail – the remainder of the list
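The recursive definition translates almost literally into code. Below is a minimal sketch in Python that models a list as nested pairs; the names `NIL`, `cons`, `head`, `tail`, `length` and `to_python_list` are chosen for this illustration and are not part of Python's built-in list type.

```python
NIL = None                      # the empty list

def cons(head, tail):
    """A non-empty list is a pair of an object and another list."""
    return (head, tail)

def head(lst):
    return lst[0]               # the first element

def tail(lst):
    return lst[1]               # the remainder of the list

def length(lst):
    # The recursion follows the recursive definition of the type.
    return 0 if lst is NIL else 1 + length(tail(lst))

def to_python_list(lst):
    out = []
    while lst is not NIL:
        out.append(head(lst))
        lst = tail(lst)
    return out

abc = cons("a", cons("b", cons("c", NIL)))
```

Since nothing constrains the element types, this models the heterogeneous (Lisp-like) kind of list.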

Lists – Lisp and ML

Lisp
• program is a list (can be modified during execution!)
• quote prevents evaluation: '(a b c d), (quote (a b c d))
• implementation: chain of cons cells (a pair of pointers)
  (printed as dotted pairs: (cons 1 2) => (1 . 2))
• pointer names: car (head) and cdr (tail)

  (a b c d)                     ;; list syntax
  (a . (b . (c . (d . null))))  ;; proper list
  (a . (b . (c . d)))           ;; improper list
  (a (b c) d)                   ;; list nesting

ML
• implementation: chain of blocks [(object, value) pairs]
• operations hd (head) and tl (tail)

  [a, b, c, d]

List examples

Lisp

  (cons 'a '(b))          => (a b)
  (car '(a b))            => a
  (car nil)               ; either nil or error
  (cdr '(a b c))          => (b c)
  (cdr '(a))              => nil
  (cdr nil)               ; either nil or error
  (append '(a b) '(c d))  => (a b c d)

ML

  a :: [b]          => [a, b]
  hd [a, b]         => a
  hd []             (* run-time exception *)
  tl [a, b, c]      => [b, c]
  tl [a]            => nil
  tl []             (* run-time exception *)
  [a, b] @ [c, d]   => [a, b, c, d]

List comprehensions

• so-called generator notation
• create "list from lists"
• based on traditional mathematical notation (set comprehensions)
• Miranda, Haskell, Python, F#

{i × i | i ∈ {1,..., 100} ∧ i mod 2 = 1}

Haskell: [i*i | i <- [1..100], i `mod` 2 == 1]

Python: [i*i for i in range(1, 101) if i % 2 == 1]

F#: [for i in 1..100 do if i % 2 = 1 then yield i*i]
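To see what the generator notation abbreviates, the comprehension can be compared with an explicit loop. A short Python sketch (the variable names are illustrative):

```python
squares_of_odds = [i * i for i in range(1, 101) if i % 2 == 1]

# The same list built with an explicit loop.
explicit = []
for i in range(1, 101):        # range(1, 101) covers 1..100 inclusive
    if i % 2 == 1:
        explicit.append(i * i)
```

Both forms produce the 50 squares 1, 9, 25, ..., 9801. Because 100 is even, an upper bound of `range(1, 100)` would give the same list here, but `range(1, 101)` matches the set {1,..., 100} exactly.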

Pointer and reference types

Pointer types

Values are memory addresses and a special value nil.

Uses
1 indirect addressing
2 a way to manage dynamic storage
  • heap-dynamic variables often do not have an associated name (anonymous variables)
  • accessible only through a pointer or a reference

Pointers are not
• structured types (even though usually defined using a type operator)
• scalar types (values are not data, but references to variables)

Pointer operations

Basic operations
1 assignment (sets value to an address)
  • use of an operator for objects outside the heap
2 dereferencing (value of the variable pointed to)
  • implicit (Fortran95)
  • explicit (C, the * operator)

Accessing record fields
• (*p).age / p->age (C, C++)
• p.age (Ada, implicit dereference)

Heap management
• explicit allocation required – malloc (C), new (C++)
  (in languages using pointers for heap management)

Problem 1: Dangling pointers

The pointer contains an address of a deallocated variable.

• Why is it a problem?
  • a new variable can be allocated at the same address
  • heap management can use the "empty" memory
• Creating a dangling pointer:
  1 new variable is allocated on the heap, pointed to by p1
  2 p2:=p1
  3 the variable is deallocated through p1 (p2 is now dangling)

  int *arrayPtr1; // C++
  int *arrayPtr2 = new int[100];
  arrayPtr1 = arrayPtr2;
  delete [] arrayPtr2;

In C++ both arrayPtr1 and arrayPtr2 are now dangling!

Solution: prohibit deallocation

Problem 2: Lost variables (garbage)

There is a variable on the heap, which is no longer accessible.

• Creating a lost variable:
  1 new variable, pointed to by p1, is allocated on the heap
  2 some other address is assigned to p1
• consequence: memory leak
• solution: garbage collection

Pointers in C/C++

• typed
• can point anywhere (as in assembly languages)
• extremely flexible, extra caution necessary
• operations: * – dereference, & – address of a variable

Pointer arithmetic
• ptr + index = ptr plus index * sizeof(*ptr)

  int list[10];
  int *ptr;
  ptr = list;

The following holds:
• *(ptr + 1) is the same as list[1]
• *(ptr + index) is the same as list[index]
• ptr[index] is the same as list[index]

Pointers in C/C++ (2)

pointers pointing to functions
• used to pass functions as parameters

pointers of type void *
• can point to values of any type (generic pointers)
• cannot be dereferenced (so the type checker would not complain)
• use: parameters/results of functions which operate on memory (e.g. malloc)

Reference types

A variable of a reference type refers to an object or a value in memory, not to a memory address.

• no point in doing arithmetic
• C++
  • constant pointer, always implicitly dereferenced
  • uses: parameter passing – two-way communication
    (advantage over pointers: no need to dereference)
• Java
  • non-constant, can point to any instance of the same class
  • used for referencing class instances
  • no explicit deallocation (no dangling references)

  String str1; // value: null
  ...
  str1 = "This is a Java literal string";

Reference types (2)

• C#
  • both (C-style) pointers and (Java-style) references
  • use of pointers is strongly discouraged
    (unsafe modifier for subprograms using pointers)
• Python, Ruby
  • all variables are references
  • always implicitly dereferenced

pointers vs references

Their (pointers) introduction into high-level languages has been a step backward from which we may never recover.
C. A. R. Hoare

References provide some of the flexibility and capabilities of pointers, without their hazards.

Type inference

ML type inference

Is it necessary to always specify a type?

Example 1

  val s : string = "Arthur Dent"
  val n : int = 42

The type is obvious from the syntax.

Example 2

  fun twice x = 2 * x

We know that
• 2 is of type int, and
• * is of type int -> (int -> int).
Therefore twice is of type int -> int.

Example 3

  fun add [] = 0
    | add (a :: L) = a + add L

We know that
• 0 is of type int, and
• [] and a::L are of type list
• + is of type int -> (int -> int).

Therefore add is of type int list -> int.

In the ML language, it is always possible to infer the type (for any correct program).

Type inference

• makes programs more readable
• supports abstraction (guarantees the most general type)
• present in many modern languages (ML, Haskell, F#, C# 3.0, C++11, VisualBasic 9.0, ...)
• based on the Hindley-Milner (Damas-Hindley-Milner) algorithm

History
1958 Curry, Feys: simply typed λ-calculus
1969 Hindley: extended, the most general type (proof)
1978 Milner: an equivalent algorithm (the algorithm W)
1982 Damas: proof of completeness

Three stages of type inference

  fun add [] = 0
    | add (a :: L) = a + add L

1 each (sub)expression is assigned a new type
  • add : α, a : β, L : γ, ...
2 the inference rules are applied
  • built-in expressions: 0 : int, [] : δ list
  • add is a function, applied to a list, therefore α = ι → κ and ι = δ list
  • ...
3 the system of equations (constraints) created in step 2 is then solved (using type unification)

Type inference – possible results

1 there is exactly one solution
2 the constraint system cannot be solved (e.g. x is required to be of type int and string): type error
3 there are multiple solutions
  a) polymorphism: the result contains parametric types (i.e. includes type variables)
  b) ambiguity: Is fun f(x, y) = x + y of type
     • (int * int) -> int, or
     • (string * string) -> string?

ML type expressions

(simplified)

• type expression syntax
  • primitive types: int, bool
  • type variables: 'a, 'b, 'c, ...
  • type constructor list: T list (where T is a type expression)
  • type constructor n-tuple: T1*T2*T3 (where T1, T2, T3 are type expressions)
  • function type: T1 -> T2 (where T1 and T2 are type expressions)
• our problem will be formulated as a system of type equalities between pairs of type expressions
• to get a solution we need to find substitutions for type variables

Finding a substitution

Simple cases:
• 'a list = int list
    solution: 'a = int
• 'a list = 'b list list, 'b list = int list
    solution: 'a = int list, 'b = int
• 'a list = 'b -> 'b
    does not have a solution
• 'a list = 'b, 'b list = 'a
    does not have a finite solution

What about the following?
• 'a list = 'b list list
    'a = int list; 'b = int
    'a = (int -> int) list; 'b = int -> int
    'a = (int list) list; 'b = int list
    ...

The most general solution

'a list = 'b list list

• problem: infinitely many solutions
• we aim to find the most general solution
• observation:
  • for 'a we need to substitute some suitable list
  • 'b must be of the "element of a list of type 'a" type
  • no other constraints
• solution: 'a = 'b list, where 'b is a free type variable

The solutions of type inference will be sets of equalities 'ai = Ti, where:
• Ti are type expressions
• no 'ai appears in any Ti

Finding the most-general solution

Unification of (two) type expressions
Finding substitutions for type variables, so the expressions are identical after performing the substitutions.

• the resulting set of substitutions is called the unifier
• substitutions are represented by a binding between the type variable 'a and a type expression τ('a)
• at the beginning each variable is free (not bound)
• we define

  τ′(T) = T′   if T = 'a ∧ τ('a) = T′
  τ′(T) = T    otherwise

The unification algorithm

  Unify(T1, T2):
    T1 := τ′(T1); T2 := τ′(T2)
    if T1 = T2 then
      return true
    else if T1 = 'a ∧ ('a does not appear in T2) then
      τ('a) := T2; return true
    else if T2 = 'b ∧ ('b does not appear in T1) then
      τ('b) := T1; return true
    else if T1 = T1′ list ∧ T2 = T2′ list then
      return Unify(T1′, T2′)
    else if T1 = D1->C1 ∧ T2 = D2->C2 then
      return Unify(D1, D2) && Unify(C1, C2)
    else
      return false
  end

As a side-effect the algorithm produces the substitution τ.

Unification – example

'b list = 'a list; 'a -> 'b = 'c; 'c -> bool = (bool -> bool) -> bool

Calls:

  Unify('b list, 'a list)
    Unify('b, 'a)
  Unify('a->'b, 'c)
  Unify('c->bool, (bool->bool)->bool)
    Unify('c, bool->bool)
      Unify('a->'b, bool->bool)
        Unify('a, bool)
        Unify('b, bool)
          Unify(bool, bool)
    Unify(bool, bool)

Resulting bindings:

  'a: bool
  'b: 'a        (i.e. bool)
  'c: 'a->'b    (i.e. bool->bool)
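The example can be replayed with a small executable version of the algorithm. The following Python sketch makes several assumptions of its own: type variables are strings starting with an apostrophe, `T list` is encoded as `("list", T)`, `D -> C` as `("->", D, C)`, and the helper names (`is_var`, `resolve`, `occurs`, `apply`) are invented here. Unlike the one-step τ′, `resolve` follows a whole chain of bindings, which is a common variation rather than a transcription of the pseudocode.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("'")

def resolve(t, tau):
    """Follow the bindings of a type variable until a non-variable or a free variable."""
    while is_var(t) and t in tau:
        t = tau[t]
    return t

def occurs(var, t, tau):
    """Does the type variable var appear in t (under the current bindings)?"""
    t = resolve(t, tau)
    if t == var:
        return True
    return isinstance(t, tuple) and any(occurs(var, arg, tau) for arg in t[1:])

def unify(t1, t2, tau):
    t1, t2 = resolve(t1, tau), resolve(t2, tau)
    if t1 == t2:
        return True
    if is_var(t1) and not occurs(t1, t2, tau):
        tau[t1] = t2
        return True
    if is_var(t2) and not occurs(t2, t1, tau):
        tau[t2] = t1
        return True
    # Same type constructor ("list" or "->"): unify the arguments pairwise.
    if isinstance(t1, tuple) and isinstance(t2, tuple) \
            and t1[0] == t2[0] and len(t1) == len(t2):
        return all(unify(a, b, tau) for a, b in zip(t1[1:], t2[1:]))
    return False

def apply(t, tau):
    """Fully substitute the bindings in tau into t."""
    t = resolve(t, tau)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(arg, tau) for arg in t[1:])
    return t

# The three constraints of the example, in order.
tau = {}
constraints = [
    (("list", "'b"), ("list", "'a")),
    (("->", "'a", "'b"), "'c"),
    (("->", "'c", "bool"), ("->", ("->", "bool", "bool"), "bool")),
]
solved = all(unify(lhs, rhs, tau) for lhs, rhs in constraints)
result = {v: apply(v, tau) for v in ("'a", "'b", "'c")}
```

After the run, `result` maps both `'a` and `'b` to `bool` and `'c` to `bool -> bool`, in agreement with the bindings listed above. The occurs check is what rejects systems without a finite solution, such as `'a list = 'b` together with `'b list = 'a`.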

Constraint generation

Selected type rules

  expression              type      constraints
  1,2,3,...               int
  []                      'a list
  hd(L)                   'a        L:'a list
  tl(L)                   'a list   L:'a list
  E1::E2                  'a list   E1:'a, E2:'a list
  E1+E2                   int       E1:int, E2:int
  E1*E2                   int       E1:int, E2:int
  E1=E2                   bool      E1:'a, E2:'a
  if E1 then E2 else E3   'a        E1:bool, E2:'a, E3:'a
  E1 E2                   'b        E1:'a -> 'b, E2:'a
  fun f x1 .. xn = E                x1:'a1,..,xn:'an, E:'b,
                                    f:'a1->...->'an->'b

Example

  fun f x L = if L = [] then []
              else if x <> (hd L) then f x (tl L)
              else x :: f x (tl L)

• introduce new type variables 'f, 'x, 'L, ...
• generate constraints using rules

  'f = 'a0 -> 'a1 -> 'a2   (* fun *)
  'L = 'a3 list            (* "=" and "[]" *)
  'L = 'a4 list            (* hd *)
  'x = 'a4                 (* "<>" *)
  'x = 'a0                 (* application *)
  'L = 'a5 list            (* tl *)
  'a1 = 'a5 list           (* tl, application *)
  ...

Type checking

Direct:

  fun f g = g 2
  fun not x = if x then false else true
  f not

  Error: operator and operand don't agree [tycon mismatch]
    operator domain: int -> 'Z
    operand:         bool -> bool
    in expression:
      f not

Indirect:

  fun reverse [] = []
    | reverse (x::xs) = reverse xs

  val reverse = fn : 'a list -> 'b list

• changes the type of the list – something is wrong

Conclusion

• type inference computes the types of expressions
  • type declarations are not needed
  • we look for the most general type
  • solving a system of constraints, using unification
  • leads to polymorphism
• type checking
  • possible errors are discovered statically
  • sometimes the error can be deduced from the expression type
• disadvantages
  • makes it harder to find the origin of a type error