Type Systems
. Pierce Ch. 3, 8, 11, 15

A Simple Language
. Simple untyped expressions
. Natural numbers encoded as succ … succ 0
  . E.g. succ succ succ 0 represents 3
. term: a string from this language
. To improve readability, we will sometimes write parentheses: e.g. iszero (pred (succ 0))
CSE 6341 1 CSE 6341 2
Semantics (informally)
. A term evaluates to a value
  . Values are terms themselves
  . Boolean constants: true and false
  . Natural numbers: 0, succ 0, succ (succ 0), …
. Given a program (i.e., a term), the result of “running” this program is a boolean value or a natural number
  . if false then 0 else succ 0 evaluates to succ 0
  . iszero (pred (succ 0)) evaluates to true
. Problematic: succ true or if 0 then 0 else 0

Equivalent Ways to Define the Syntax
. Inductive definition: the smallest set S s.t.
  . { true, false, 0 } ⊆ S
  . if t1 ∈ S, then { succ t1, pred t1, iszero t1 } ⊆ S
  . if t1, t2, t3 ∈ S, then if t1 then t2 else t3 ∈ S
. Same thing, written as inference rules

$$\mathtt{true} \in S \qquad \mathtt{false} \in S \qquad 0 \in S \qquad \text{(axioms: no premises)}$$

$$\frac{t_1 \in S}{\mathtt{succ}\ t_1 \in S} \qquad \frac{t_1 \in S}{\mathtt{pred}\ t_1 \in S} \qquad \frac{t_1 \in S}{\mathtt{iszero}\ t_1 \in S} \qquad \frac{t_1 \in S \quad t_2 \in S \quad t_3 \in S}{\mathtt{if}\ t_1\ \mathtt{then}\ t_2\ \mathtt{else}\ t_3 \in S}$$

. If we have established the premises (above the line), we can derive the conclusion (below the line)
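The inductive definition of S can be read directly as a recursive membership test. The following is a minimal sketch, assuming an encoding of terms as nested Python tuples; the encoding and the names `in_S` and `show` are choices made for this illustration and are not part of the slides.

```python
# Assumed encoding (hypothetical, for illustration only):
#   "true", "false", "0"                      constants
#   ("succ", t), ("pred", t), ("iszero", t)   unary forms
#   ("if", t1, t2, t3)                        conditional
CONSTANTS = {"true", "false", "0"}
UNARY = {"succ", "pred", "iszero"}


def in_S(t):
    """Return True iff t is derivable from the inference rules for S."""
    if isinstance(t, str):
        return t in CONSTANTS                    # axioms: no premises
    if isinstance(t, tuple) and len(t) == 2 and t[0] in UNARY:
        return in_S(t[1])                        # one premise: t1 in S
    if isinstance(t, tuple) and len(t) == 4 and t[0] == "if":
        return all(in_S(s) for s in t[1:])       # three premises
    return False


def show(t):
    """Render a term in the concrete syntax, parenthesizing nested arguments."""
    if isinstance(t, str):
        return t
    if t[0] == "if":
        return f"if {show(t[1])} then {show(t[2])} else {show(t[3])}"
    arg = show(t[1])
    return f"{t[0]} {arg}" if isinstance(t[1], str) else f"{t[0]} ({arg})"
```

Note that `in_S(("succ", "true"))` is true: the term is syntactically valid even though it is semantically problematic.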
Why Does This Matter?
. Key property: for any t ∈ S, one of three things must be true:
  . It is a constant (i.e., derived from an axiom)
  . It is of the form succ t1, pred t1, or iszero t1 where t1 is some smaller term
  . It is of the form if t1 then t2 else t3 where t1, t2, and t3 are some smaller terms
. The inference rules make this explicit, and make it easy for us to have
  . Inductive definitions of functions over S
  . Inductive proofs of properties of S

Inductive Proofs
. Structural induction: used very often
. Suppose P is a predicate over terms (i.e., a function mapping elements of S to truth values)
  . When P(t) is true, we will just write P(t)
. For each term t, let ti be its immediate subterms. Suppose we can prove that
  . Whenever P(ti) for all ti, we also have P(t)
  . For terms without subterms, P(t) holds
. This means that P(t) for all terms in S
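As an illustration of inductive definitions of functions over S, the sketch below defines `size` and `consts` by recursion on the immediate subterms, using an assumed nested-tuple encoding of terms. The predicate `P` (a term has no more distinct constants than nodes) is an example chosen here, not one taken from the slides. Checking it over all small terms is only evidence; the actual proof is the structural induction described above.

```python
from itertools import product


def subterms(t):
    """Immediate subterms of a term (empty for constants)."""
    return () if isinstance(t, str) else t[1:]


def size(t):
    """Number of nodes in the term, defined by structural recursion."""
    return 1 + sum(size(s) for s in subterms(t))


def consts(t):
    """Set of constants that appear in the term."""
    if isinstance(t, str):
        return {t}
    return set().union(*(consts(s) for s in subterms(t)))


def terms(n):
    """All terms of nesting depth at most n."""
    if n == 0:
        return {"true", "false", "0"}
    prev = terms(n - 1)
    out = set(prev)
    out |= {(op, t) for op in ("succ", "pred", "iszero") for t in prev}
    out |= {("if", a, b, c) for a, b, c in product(prev, repeat=3)}
    return out


def P(t):
    """Example predicate: no more distinct constants than nodes."""
    return len(consts(t)) <= size(t)
```

For instance, `all(P(t) for t in terms(1))` checks the predicate on the 39 terms of depth at most 1.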
Semantics: Why?
. We need to define the semantics before we can discuss type systems
  . The semantics defines the difference between “good” and “bad” programs
  . A type system can help us prove that certain programs are “good”, for all possible inputs
. Safety (a.k.a. soundness) of a type system: if a program is well-typed, it will not “go wrong”
  . But only for certain bad behaviors: e.g. a type system typically cannot assure the absence of “division by zero” or “array index out of bounds”

Semantics: How?
. Operational semantics in the general sense: imagine an abstract machine
  . Some notion of the state of this machine
  . Transition function: given the current state, what is the next state?
  . It is possible that the machine gets “stuck”: there is no valid transition
. The semantics we will define for this simple language is a specific form of “small-step” operational semantics
  . state = term; transition = term simplification
  . Later will discuss “big-step” semantics
Semantics: How?
. Initial state: the term whose meaning we are trying to determine

Semantics (formally)
. The domain of values (a subset of the terms)
Evaluation Relation: Booleans
. Relation → ⊆ S × S defined with inference rules
  . Just a way of writing an inductive definition

$$\mathtt{if}\ \mathtt{true}\ \mathtt{then}\ t_2\ \mathtt{else}\ t_3 \rightarrow t_2 \qquad \mathtt{if}\ \mathtt{false}\ \mathtt{then}\ t_2\ \mathtt{else}\ t_3 \rightarrow t_3$$

$$\frac{t_1 \rightarrow t_1'}{\mathtt{if}\ t_1\ \mathtt{then}\ t_2\ \mathtt{else}\ t_3 \rightarrow \mathtt{if}\ t_1'\ \mathtt{then}\ t_2\ \mathtt{else}\ t_3}$$

. These rules get instantiated with concrete terms, to get rule instances

Example
if true then (if (if false then false else false) then true else false) else true → ? (value, i.e. term that is true or false)
Step 1: ... → if (if false then false else false) then true else false
Step 2: if false then false else false → false
Step 3: if (if false then false else false) then true else false → if false then true else false
Step 4: if false then true else false → false
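The three evaluation rules for booleans translate almost line by line into a one-step function. This is a minimal sketch, assuming terms are encoded as nested tuples such as `("if", t1, t2, t3)`; the names `step` and `trace` are hypothetical helpers introduced here.

```python
def step(t):
    """One evaluation step for the boolean fragment, or None if no rule applies."""
    if isinstance(t, tuple) and t[0] == "if":
        _, t1, t2, t3 = t
        if t1 == "true":
            return t2                       # if true then t2 else t3 -> t2
        if t1 == "false":
            return t3                       # if false then t2 else t3 -> t3
        t1_next = step(t1)                  # premise: t1 -> t1'
        if t1_next is not None:
            return ("if", t1_next, t2, t3)
    return None


def trace(t):
    """List of terms visited, starting from t, until no rule applies."""
    seq = [t]
    while (nxt := step(seq[-1])) is not None:
        seq.append(nxt)
    return seq
```

Applied to the example term, `trace` visits the same terms as the steps above: the outer conditional, then `if (if false then false else false) then true else false`, then `if false then true else false`, then `false`.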
More on the Evaluation Relation
. We can generalize to the natural numbers by adding more inference rules
  . Will not go into these details here
. A key issue: what if we reach a term that cannot be evaluated anymore (no inference rule applies), but the term is not a semantic value?
  . Examples: if 0 then 0 else 0 and pred false
  . There is no inference rule that can be used to make “the next step”
. We get “stuck”, i.e. have a run-time error: the program has reached a meaningless state

Typed Expressions
. Goal: without evaluating a term, can we guarantee that it will not get stuck?
. Idea: define types, and establish a relationship between terms and types
. For our simple example:
  . Type Bool, which is the set of all terms that evaluate to a boolean value
  . Type Nat, which is the set of all terms that evaluate to a numeric value
. To determine that a term t has type T (i.e., t ∈ T), we will only look at the structure of t (i.e., will do a compile-time analysis)
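The notion of a stuck term described above can be made concrete with a small interpreter. The slides do not list the evaluation rules for natural numbers, so the sketch below assumes the standard ones from Pierce Ch. 3 (pred 0 → 0, pred (succ nv) → nv, iszero 0 → true, iszero (succ nv) → false, plus rules that evaluate inside succ, pred, and iszero). The tuple encoding of terms and the function names are likewise assumptions of this sketch.

```python
def is_numeric(t):
    """Numeric values: 0, succ 0, succ (succ 0), ..."""
    return t == "0" or (isinstance(t, tuple) and t[0] == "succ" and is_numeric(t[1]))


def is_value(t):
    return t in ("true", "false") or is_numeric(t)


def step(t):
    """One small step, or None if no rule applies (rules assumed from Pierce Ch. 3)."""
    if not isinstance(t, tuple):
        return None
    if t[0] == "if":
        _, t1, t2, t3 = t
        if t1 in ("true", "false"):
            return t2 if t1 == "true" else t3
        nxt = step(t1)
        return None if nxt is None else ("if", nxt, t2, t3)
    op, arg = t
    if op in ("pred", "iszero") and is_numeric(arg):
        if op == "pred":
            return "0" if arg == "0" else arg[1]
        return "true" if arg == "0" else "false"
    nxt = step(arg)                          # evaluate inside succ/pred/iszero
    return None if nxt is None else (op, nxt)


def run(t):
    """Evaluate until no rule applies; report whether the result is a value or stuck."""
    while (nxt := step(t)) is not None:
        t = nxt
    return t, ("value" if is_value(t) else "stuck")
```

With this sketch, `run(("if", "0", "0", "0"))` and `run(("pred", "false"))` both report "stuck", while `run(("iszero", ("pred", ("succ", "0"))))` reaches the value `true`.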
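Typing Relation
. Relation : ⊆ S × { Bool, Nat }
. t : T is the same as t ∈ T

$$\mathtt{true} : \mathtt{Bool} \qquad \mathtt{false} : \mathtt{Bool} \qquad 0 : \mathtt{Nat}$$

$$\frac{t_1 : \mathtt{Bool} \quad t_2 : T \quad t_3 : T}{\mathtt{if}\ t_1\ \mathtt{then}\ t_2\ \mathtt{else}\ t_3 : T}$$

$$\frac{t_1 : \mathtt{Nat}}{\mathtt{succ}\ t_1 : \mathtt{Nat}} \qquad \frac{t_1 : \mathtt{Nat}}{\mathtt{pred}\ t_1 : \mathtt{Nat}} \qquad \frac{t_1 : \mathtt{Nat}}{\mathtt{iszero}\ t_1 : \mathtt{Bool}}$$

Example: Typing Derivation
. if (iszero 0) then 0 else (succ 0) : ?

$$\frac{\dfrac{0 : \mathtt{Nat}}{\mathtt{iszero}\ 0 : \mathtt{Bool}} \qquad 0 : \mathtt{Nat} \qquad \dfrac{0 : \mathtt{Nat}}{\mathtt{succ}\ 0 : \mathtt{Nat}}}{\mathtt{if}\ (\mathtt{iszero}\ 0)\ \mathtt{then}\ 0\ \mathtt{else}\ (\mathtt{succ}\ 0) : \mathtt{Nat}}$$

. This structure is a derivation tree: the leaves are instances of axioms, the inner nodes are instances of inference rules with premises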
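Because each typing rule is determined by the outermost form of the term, the typing relation can be computed by a recursive function that mirrors the derivation tree. The following is a minimal sketch, assuming terms encoded as nested tuples; `typeof` is a hypothetical name, and returning `None` for untypable terms is a convention of this sketch.

```python
def typeof(t):
    """Return "Bool" or "Nat" if t is typable, otherwise None."""
    if t in ("true", "false"):
        return "Bool"                        # axioms
    if t == "0":
        return "Nat"                         # axiom
    if not isinstance(t, tuple):
        return None
    if t[0] == "if":
        _, t1, t2, t3 = t
        ty2 = typeof(t2)
        # premises: t1 : Bool, and both branches have the same type T
        if typeof(t1) == "Bool" and ty2 is not None and ty2 == typeof(t3):
            return ty2
        return None
    if t[0] in ("succ", "pred", "iszero") and typeof(t[1]) == "Nat":
        return "Bool" if t[0] == "iszero" else "Nat"
    return None
```

The recursive calls correspond to the premises of each rule, so a successful call traces out the derivation tree shown above; for the example term the result is `"Nat"`.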
More on the Typing Relation
. A term t is typable (or well typed) if there is some T such that t : T
. In this particular simple type system, each term has at most one type
  . In general, a term may have multiple types (e.g. when the type system has subtypes)
. The safety property (a well-typed term will not get stuck) does not work in the other direction: a term which is not well typed may or may not get stuck (conservative analysis)
  . if (iszero 0) then 0 else false
  . if true then 0 else false

More on the Typing Relation
. Safety = Progress + Preservation
. Safety (a.k.a. soundness) of a type system: if a program is well-typed, it will not “go wrong”
. For this type system: a well-typed term t : T will not get stuck
  . And will evaluate to a value of type T
. Progress: A well-typed term will not be stuck: it either is a value, or it can take a step according to the evaluation rules
. Preservation: If a well-typed term takes a step of evaluation, the result is also well typed
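Progress and preservation can be checked exhaustively on small terms, which is a useful sanity test though not a proof. The sketch below assumes a nested-tuple encoding of terms and the standard evaluation rules for natural numbers from Pierce Ch. 3 (the slides omit them); all function names are hypothetical.

```python
from itertools import product


def is_numeric(t):
    return t == "0" or (isinstance(t, tuple) and t[0] == "succ" and is_numeric(t[1]))


def is_value(t):
    return t in ("true", "false") or is_numeric(t)


def step(t):
    """One small step, or None if no rule applies (rules assumed from Pierce Ch. 3)."""
    if not isinstance(t, tuple):
        return None
    if t[0] == "if":
        _, t1, t2, t3 = t
        if t1 in ("true", "false"):
            return t2 if t1 == "true" else t3
        nxt = step(t1)
        return None if nxt is None else ("if", nxt, t2, t3)
    op, arg = t
    if op in ("pred", "iszero") and is_numeric(arg):
        if op == "pred":
            return "0" if arg == "0" else arg[1]
        return "true" if arg == "0" else "false"
    nxt = step(arg)
    return None if nxt is None else (op, nxt)


def typeof(t):
    """Return "Bool" or "Nat" if t is typable, otherwise None."""
    if t in ("true", "false"):
        return "Bool"
    if t == "0":
        return "Nat"
    if t[0] == "if":
        ty2 = typeof(t[2])
        ok = typeof(t[1]) == "Bool" and ty2 is not None and ty2 == typeof(t[3])
        return ty2 if ok else None
    if typeof(t[1]) == "Nat":
        return "Bool" if t[0] == "iszero" else "Nat"
    return None


def terms(n):
    """All terms of nesting depth at most n."""
    if n == 0:
        return {"true", "false", "0"}
    prev = terms(n - 1)
    out = {(op, t) for op in ("succ", "pred", "iszero") for t in prev}
    out |= {("if", a, b, c) for a, b, c in product(prev, repeat=3)}
    return out | prev


def check(n):
    """Count violations of (progress, preservation) among terms of depth <= n."""
    bad_progress = bad_preservation = 0
    for t in terms(n):
        ty = typeof(t)
        if ty is None:
            continue                          # the theorems say nothing here
        nxt = step(t)
        if nxt is None:
            bad_progress += not is_value(t)   # well typed, not a value, no step
        elif typeof(nxt) != ty:
            bad_preservation += 1             # step changed or lost the type
    return bad_progress, bad_preservation


def run(t):
    """Evaluate until no rule applies."""
    while (nxt := step(t)) is not None:
        t = nxt
    return t
```

The two ill-typed examples above illustrate the conservative direction: `typeof` rejects both, yet `run` evaluates each of them to the value `0`.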
An Extended Simple Language

Typing Relation Again
Records
. In the semantics, introduce record values
. In the type system, introduce record types { l1:T1 , l2:T2 ,…, ln:Tn }
  . E.g. { sum:Nat , overdraft:Bool }

Typing Relation

$$\frac{t_1 : \{ l_1{:}T_1, l_2{:}T_2, \ldots, l_n{:}T_n \}}{t_1.l_k : T_k}$$

. {sum=succ 0,overdraft=true}.sum : ?
  . { … } : { sum:Nat , overdraft:Bool }
  . { … }.sum : Nat
Ordering of Labels
. Consider { sum=succ 0 , overdraft=true } and

Lists
Typing Relation

$$\frac{t_1 : \mathtt{List}\ T_1}{\mathtt{isnil}[T_1]\ t_1 : \mathtt{Bool}}$$

. Example 1: cons[Bool] (isnil[Nat→Bool] nil[Nat→Bool]) (cons[Bool] false nil[Bool])
. Example 2: cons[Bool] false true
. Example 3: isnil[Bool] nil[Nat→Bool]

Let Bindings
. Γ: sequence of (name,type) pairs
  . Γ, x:T means “Γ appended with the pair (x:T)”
  . Name x should not already be bound by Γ
. Ternary typing relation: Γ ⊢ t : T
  . “Term t has type T under the bindings in Γ”
Typing Relation

$$\frac{\Gamma \vdash t_1 : T_1 \qquad \Gamma, x{:}T_1 \vdash t_2 : T_2}{\Gamma \vdash \mathtt{let}\ x = t_1\ \mathtt{in}\ t_2 : T_2} \qquad \frac{x{:}T \in \Gamma}{\Gamma \vdash x : T}$$

. ∅ ⊢ let z=true in cons[Bool] z (cons[Bool] z nil[Bool]) : ?
. ∅ ⊢ true : Bool
. z:Bool ⊢ cons[Bool] z (cons[Bool] z nil[Bool]) : ?
. z:Bool ⊢ z : Bool and z:Bool ⊢ nil[Bool] : List Bool
. z:Bool ⊢ cons[Bool] z nil[Bool] : List Bool
. z:Bool ⊢ cons[Bool] z (cons[Bool] z nil[Bool]) : List Bool
. ∅ ⊢ let z=true in cons[Bool] z (cons[Bool] z nil[Bool]) : List Bool

Extended Typing Relation
. Need to include Γ in all rules; e.g.

$$\Gamma \vdash \mathtt{true} : \mathtt{Bool} \qquad \frac{\Gamma \vdash t_1 : T_1 \qquad \Gamma \vdash t_2 : \mathtt{List}\ T_1}{\Gamma \vdash \mathtt{cons}[T_1]\ t_1\ t_2 : \mathtt{List}\ T_1}$$

. Γ also needed for functions and function applications (function body should be evaluated under bindings for the function parameters)
  . But, we have no time for this discussion
. In this generalized type system, as before, each term has at most one type, and a well-typed term will not get stuck (safety)
. Note: ∅ ⊢ t : T is typically written simply as t : T
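The ternary typing relation can be sketched as a function that takes the bindings as an extra argument. In the sketch below, the bindings are a Python dict from names to types (the slides describe a sequence of pairs, so the dict is an assumption of this sketch), terms are nested tuples such as `("let", x, t1, t2)` and `("cons", T, t1, t2)`, and the type Nat→Bool is encoded as the opaque tuple `("Fun", "Nat", "Bool")`. All of these encodings are hypothetical.

```python
def typeof(ctx, t):
    """Type of term t under bindings ctx, or None if t is not typable."""
    if t in ("true", "false"):
        return "Bool"
    if t == "0":
        return "Nat"
    tag = t[0]
    if tag == "var":
        return ctx.get(t[1])                     # premise: x:T is in the bindings
    if tag == "let":
        _, x, t1, t2 = t
        ty1 = typeof(ctx, t1)
        if ty1 is None or x in ctx:              # x must not already be bound
            return None
        return typeof({**ctx, x: ty1}, t2)       # type the body under ctx, x:T1
    if tag == "nil":
        return ("List", t[1])
    if tag == "cons":
        _, elem, head, tail = t
        if typeof(ctx, head) == elem and typeof(ctx, tail) == ("List", elem):
            return ("List", elem)
        return None
    if tag == "isnil":
        _, elem, arg = t
        return "Bool" if typeof(ctx, arg) == ("List", elem) else None
    return None


# let z=true in cons[Bool] z (cons[Bool] z nil[Bool])
z = ("var", "z")
example = ("let", "z", "true",
           ("cons", "Bool", z, ("cons", "Bool", z, ("nil", "Bool"))))
```

Calling `typeof({}, example)` follows the same steps as the derivation above: it types `true`, extends the empty bindings with z:Bool, and then types the two nested `cons` terms, giving `("List", "Bool")`.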
Subtypes
. Subtypes play an important role in many languages (e.g. object-oriented ones)
. S is a subtype of T, written S <: T, if any term of type S can be safely used in any situation where a term of type T is expected
  . Principle of safe substitution

$$\frac{t : S \qquad S <: T}{t : T} \quad \text{subsumption rule}$$

. Simple interpretation is that the elements of S form a subset of the elements of T
. We will define the subtype relation with the help of inference rules

Subtype Relation

$$S <: S \ \ \text{reflexivity} \qquad \frac{S <: U \qquad U <: T}{S <: T} \ \ \text{transitivity} \qquad S <: \mathtt{Top} \ \ \text{top type}$$

$$\frac{S_1 <: T_1 \qquad S_2 <: T_2 \qquad \ldots \qquad S_n <: T_n}{\{ l_1{:}S_1, l_2{:}S_2, \ldots, l_n{:}S_n \} <: \{ l_1{:}T_1, l_2{:}T_2, \ldots, l_n{:}T_n \}} \quad \text{depth subtyping for records}$$

$$\{ l_1{:}T_1, l_2{:}T_2, \ldots, l_n{:}T_n, l_{n+1}{:}T_{n+1} \} <: \{ l_1{:}T_1, l_2{:}T_2, \ldots, l_n{:}T_n \} \quad \text{width subtyping for records}$$

Example: {x:Nat} is the set of all records that have a field x:Nat, and some other fields. {x:Nat,y:Bool} is the set of all records that have a field x:Nat, a field y:Bool, and some other fields. Thus, {x:Nat,y:Bool} <: {x:Nat}
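The subtype rules above can be combined into a single recursive check. This is a sketch under stated assumptions: record types are encoded as `("Rec", ((label, type), ...))` with labels in a fixed order, and width, depth, and transitivity are folded into one test (the supertype's fields must be a prefix of the subtype's fields, compared field by field). The encoding and the name `is_subtype` are hypothetical.

```python
def is_subtype(s, t):
    """Decide S <: T for base types, Top, and ordered record types."""
    if s == t or t == "Top":
        return True                              # reflexivity, top type
    if isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0] == "Rec":
        s_fields, t_fields = s[1], t[1]
        if len(s_fields) < len(t_fields):
            return False                         # a subtype may only add fields
        # width: extra trailing fields of s are ignored by zip
        # depth: each shared field must itself be a subtype
        return all(ls == lt and is_subtype(ts, tt)
                   for (ls, ts), (lt, tt) in zip(s_fields, t_fields))
    return False


X = ("Rec", (("x", "Nat"),))
XY = ("Rec", (("x", "Nat"), ("y", "Bool")))
```

With these definitions, `is_subtype(XY, X)` holds, matching the example {x:Nat,y:Bool} <: {x:Nat}, while `is_subtype(X, XY)` does not.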
Should the Order of Labels Matter?

$$\frac{\{ k_1{:}S_1, \ldots, k_n{:}S_n \}\ \text{is a permutation of}\ \{ l_1{:}T_1, \ldots, l_n{:}T_n \}}{\{ k_1{:}S_1, \ldots, k_n{:}S_n \} <: \{ l_1{:}T_1, \ldots, l_n{:}T_n \}}$$

. The rule says that the order of labels (fields) in a record does not matter: e.g. {x:Nat,y:Bool} is a subtype of {y:Bool,x:Nat} and vice versa
. Problem: this is bad for run-time performance
  . If we fix the order at compile time, we would know, at compile time, the offset of the field with label ln: allows efficient access for t.ln
  . But with permutation, at run time need to “search” in memory for the actual location of ln

Functions and Subtypes
. Function types: T1 → T2
  . For a term of type T1, the result of applying the function on this term is of type T2
. Subtyping: contravariant for the parameter, covariant for the result

$$\frac{T_1 <: S_1 \qquad S_2 <: T_2}{S_1 \rightarrow S_2 <: T_1 \rightarrow T_2}$$

. Function f of type S1 → S2 accepts an argument of S1, so it should be OK with an argument of T1. Returns a value of S2, so f(…) can be used anywhere where T2 is expected. So, f is also of type T1 → T2
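The function rule and the permutation rule can be added to a subtype check as sketched below. The encodings `("Fun", S1, S2)` for S1 → S2 and `("Rec", ((label, type), ...))` for records are assumptions of this sketch. Looking fields up by label makes the order of labels irrelevant, which corresponds to allowing permutation.

```python
def is_subtype(s, t):
    """Decide S <: T with function types and order-insensitive records."""
    if s == t or t == "Top":
        return True
    if not (isinstance(s, tuple) and isinstance(t, tuple)) or s[0] != t[0]:
        return False
    if s[0] == "Fun":
        _, s1, s2 = s
        _, t1, t2 = t
        # contravariant in the parameter, covariant in the result
        return is_subtype(t1, s1) and is_subtype(s2, t2)
    if s[0] == "Rec":
        s_fields = dict(s[1])                    # lookup by label: order ignored
        return all(label in s_fields and is_subtype(s_fields[label], ty)
                   for label, ty in t[1])
    return False


X = ("Rec", (("x", "Nat"),))
XY = ("Rec", (("x", "Nat"), ("y", "Bool")))
YX = ("Rec", (("y", "Bool"), ("x", "Nat")))
```

For example, a function of type {x:Nat} → {x:Nat,y:Bool} can stand in for one of type {x:Nat,y:Bool} → {x:Nat}, so `is_subtype(("Fun", X, XY), ("Fun", XY, X))` holds, while the reverse direction does not.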
Tuples and Lists
. n-tuples can be thought of as a special case of records with labels 1, 2, …, n
  . Essentially, same typing rules
. Lists

$$\frac{S_1 <: T_1}{\mathtt{List}\ S_1 <: \mathtt{List}\ T_1}$$

  . Allows the creation of heterogeneous lists: e.g. cons[{x:Nat}] {x=0} (cons[{x:Nat,y:Bool}] {x=0,y=true} nil[{x:Nat,y:Bool}])
  . For the inner expression: cons … : List {x:Nat,y:Bool}
  . Subsumption rule: give it type List {x:Nat}
  . Only then we can type the outer cons …

Casting
. (T) t in Java and C++
. Up-cast: a term is “forced” to a supertype of the type the typechecker would choose for it

$$\frac{t : T}{(T)\ t : T}$$

  . If t : S and S <: T, use this and the subsumption rule to derive (T) t : T
. Down-cast: force a type that cannot be determined statically

$$\frac{t : S}{(T)\ t : T}$$

  . The programmer says to the typechecker: “I know this will be the type; trust me”
  . “trust but verify” e.g. run-time checks in Java
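The heterogeneous list example can be checked mechanically if the typechecker applies the subsumption rule at each `cons`. The sketch below is one possible algorithmic reading, not the formulation from the slides: `typeof` computes one type per term and calls `is_subtype` wherever a declared element type is expected. Record terms are encoded as `("rec", ((label, term), ...))` and types as `("Rec", ...)` and `("List", T)`; all encodings and names are hypothetical.

```python
def is_subtype(s, t):
    """Decide S <: T for base types, Top, records, and (covariant) lists."""
    if s == t or t == "Top":
        return True
    if not (isinstance(s, tuple) and isinstance(t, tuple)) or s[0] != t[0]:
        return False
    if s[0] == "List":
        return is_subtype(s[1], t[1])            # List S1 <: List T1 if S1 <: T1
    if s[0] == "Rec":
        fields = dict(s[1])
        return all(l in fields and is_subtype(fields[l], ty) for l, ty in t[1])
    return False


def typeof(t):
    """A type for t, using subsumption at cons; None if t is not typable."""
    if t in ("true", "false"):
        return "Bool"
    if t == "0":
        return "Nat"
    if not isinstance(t, tuple):
        return None
    tag = t[0]
    if tag == "rec":
        fields = tuple((label, typeof(v)) for label, v in t[1])
        return None if any(ty is None for _, ty in fields) else ("Rec", fields)
    if tag == "nil":
        return ("List", t[1])
    if tag == "cons":
        _, elem, head, tail = t
        h, tl = typeof(head), typeof(tail)
        if (h is not None and tl is not None
                and is_subtype(h, elem) and is_subtype(tl, ("List", elem))):
            return ("List", elem)
    return None


X = ("Rec", (("x", "Nat"),))
XY = ("Rec", (("x", "Nat"), ("y", "Bool")))
inner = ("cons", XY, ("rec", (("x", "0"), ("y", "true"))), ("nil", XY))
outer = ("cons", X, ("rec", (("x", "0"),)), inner)
```

Here `typeof(inner)` is List {x:Nat,y:Bool}, and the outer `cons` is accepted only because that type is a subtype of List {x:Nat}.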
Polymorphism
. Poly = many, morph = form
  . A piece of code has multiple types
. Example 1: subtype polymorphism
  . Subsumption rule: a term has multiple types
  . Typical for object-oriented languages
. Example 2: parametric polymorphism
  . E.g. f(x)=x has types Bool→Bool, Nat→Nat, …
  . Use a type parameter T and type T→T
  . Examples: generics in C++ and Java, ML-style polymorphism in functional languages
. Example 3: ad hoc polymorphism, e.g. overloading

Terminology
. Statically typed language: compile-time analyses
  . Prove the absence of certain type-related bad run-time behaviors (C, C++, Java, ML, Haskell,…)
  . Type safety: all bad behaviors of certain kinds are excluded (e.g. Java, but not C)
. Dynamically typed language: run-time checks to catch bad behaviors (e.g. Lisp, Scheme, Perl)
. Language safety: cannot “break” the fundamental abstractions (type-related and otherwise); e.g. no buffer overflows, seg faults, return address overriding, garbage values due to type errors, etc.
  . C: unsafe; Java: safe, static+dynamic checking; Lisp: safe, dynamic checking