Xi Type System Specification

CS4120/4121/5120/5121 Spring 2019
Cornell University
Version of March 10, 2019

0 Changes

• 3/10: Clarified how ARRAYDECL works.

1 Types

The Xi type system uses a somewhat bigger set of types than can be expressed explicitly in the source language:

$$
\begin{aligned}
\tau &::= \mathsf{int} \mid \mathsf{bool} \mid \tau[\,] \\
T &::= \tau \mid \mathsf{unit} \mid (\tau_1, \tau_2, \ldots, \tau_n) \quad (n \ge 2) \\
R &::= \mathsf{unit} \mid \mathsf{void} \\
\sigma &::= \mathsf{var}\ \tau \mid \mathsf{ret}\ T \mid \mathsf{fn}\ T \to T'
\end{aligned}
$$

Ordinary types expressible in the language are denoted by the metavariable $\tau$, which can be $\mathsf{int}$, $\mathsf{bool}$, or an array type.

The metavariable $T$ denotes an expanded notion of type that represents possible types in procedures, functions, and multiple assignments. It may be an ordinary type, $\mathsf{unit}$, or a tuple type. The type $\mathsf{unit}$ does not appear explicitly in the source language. This type is given to an element on the left-hand side of a multiple assignment that uses the `_` placeholder, allowing its handling to be integrated directly into the type system. The $\mathsf{unit}$ type is also used to represent the result type of procedures. Tuple types represent the parameter types of procedures and functions that take multiple arguments, or the return types of functions that return multiple results.

The metavariable $R$ represents the outcome of evaluating a statement, which can be either $\mathsf{unit}$ or $\mathsf{void}$. The $\mathsf{unit}$ type is the type of statements that might complete normally and permit the following statement to execute. The type $\mathsf{void}$ is the type of statements such as `return` that never pass control to the following statement. The $\mathsf{void}$ type should not be confused with the C/Java type `void`, which is actually closer to $\mathsf{unit}$.

The set $\sigma$ is used to represent typing-environment entries, which can be normal variables (bound to $\mathsf{var}\ \tau$ for some type $\tau$), return types (bound to $\mathsf{ret}\ T$), functions (bound to $\mathsf{fn}\ T \to T'$ where $T' \ne \mathsf{unit}$), or procedures (bound to $\mathsf{fn}\ T \to \mathsf{unit}$), where the "result type" $\mathsf{unit}$ indicates that the procedure result contains no information other than that the procedure call terminated.
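The grammar above can be pictured as a small family of data types. The following Python sketch is one possible encoding, not part of the specification: the class names (`Int`, `Bool`, `Array`, `Unit`, `Void`, `TupleT`, `Var`, `Ret`, `Fn`) and the helper `is_procedure` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Ordinary types (tau): int, bool, or an array of an ordinary type.
@dataclass(frozen=True)
class Int:
    pass


@dataclass(frozen=True)
class Bool:
    pass


@dataclass(frozen=True)
class Array:
    elem: "Tau"


Tau = Union[Int, Bool, Array]


# Expanded types (T): an ordinary type, unit, or a tuple of n >= 2 ordinary types.
@dataclass(frozen=True)
class Unit:
    pass


@dataclass(frozen=True)
class TupleT:
    elems: Tuple[Tau, ...]

    def __post_init__(self):
        # The grammar only admits tuples with at least two components.
        if len(self.elems) < 2:
            raise ValueError("tuple types need at least two components")


T = Union[Int, Bool, Array, Unit, TupleT]


# Statement outcomes (R) are unit or void.
@dataclass(frozen=True)
class Void:
    pass


# Typing-environment entries (sigma).
@dataclass(frozen=True)
class Var:
    type: Tau


@dataclass(frozen=True)
class Ret:
    type: T


@dataclass(frozen=True)
class Fn:
    params: T
    result: T


def is_procedure(entry) -> bool:
    """A procedure is an fn entry whose result type is unit."""
    return isinstance(entry, Fn) and entry.result == Unit()
```

Frozen dataclasses compare structurally, so two separately built values of `Array(Int())` are equal, which is the kind of type equality the rules in the later sections rely on.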
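2 Subtyping

The subtyping relation on $T$ is the least partial order consistent with this rule:

$$\tau \le \mathsf{unit}$$

For now, however, there is no subsumption rule, so subtyping matters only where it appears explicitly.

3 Type-checking expressions

To type-check expressions, we need to know what bound variables and functions are in scope; this is represented by the typing context $\Gamma$, which maps names $x$ to types $\sigma$.

The judgment $\Gamma \vdash e : T$ is the rule for the type of an expression; it states that with bindings $\Gamma$ we can conclude that $e$ has the type $T$.

We use the metavariable symbols $x$ or $f$ to represent arbitrary identifiers, $n$ to represent an integer literal constant, $\mathit{string}$ to represent a string literal constant, and $\mathit{char}$ to represent a character literal constant. Using these conventions, the expression typing rules are:

$$
\dfrac{}{\Gamma \vdash n : \mathsf{int}}
\qquad
\dfrac{}{\Gamma \vdash \texttt{true} : \mathsf{bool}}
\qquad
\dfrac{}{\Gamma \vdash \texttt{false} : \mathsf{bool}}
\qquad
\dfrac{}{\Gamma \vdash \mathit{string} : \mathsf{int}[\,]}
\qquad
\dfrac{}{\Gamma \vdash \mathit{char} : \mathsf{int}}
$$

$$
\dfrac{\Gamma(x) = \mathsf{var}\ \tau}{\Gamma \vdash x : \tau}
\qquad
\dfrac{\Gamma \vdash e_1 : \mathsf{int} \quad \Gamma \vdash e_2 : \mathsf{int} \quad \oplus \in \{\texttt{+}, \texttt{-}, \texttt{*}, \texttt{*>>}, \texttt{/}, \texttt{\%}\}}{\Gamma \vdash e_1 \oplus e_2 : \mathsf{int}}
$$

$$
\dfrac{\Gamma \vdash e : \mathsf{int}}{\Gamma \vdash \texttt{-}e : \mathsf{int}}
\qquad
\dfrac{\Gamma \vdash e_1 : \mathsf{int} \quad \Gamma \vdash e_2 : \mathsf{int} \quad \ominus \in \{\texttt{==}, \texttt{!=}, \texttt{<}, \texttt{<=}, \texttt{>}, \texttt{>=}\}}{\Gamma \vdash e_1 \ominus e_2 : \mathsf{bool}}
$$

$$
\dfrac{\Gamma \vdash e : \mathsf{bool}}{\Gamma \vdash \texttt{!}e : \mathsf{bool}}
\qquad
\dfrac{\Gamma \vdash e_1 : \mathsf{bool} \quad \Gamma \vdash e_2 : \mathsf{bool} \quad \ominus \in \{\texttt{==}, \texttt{!=}, \texttt{\&}, \texttt{|}\}}{\Gamma \vdash e_1 \ominus e_2 : \mathsf{bool}}
$$

$$
\dfrac{\Gamma \vdash e : \tau[\,]}{\Gamma \vdash \texttt{length(}e\texttt{)} : \mathsf{int}}
\qquad
\dfrac{\Gamma \vdash e_1 : \tau[\,] \quad \Gamma \vdash e_2 : \tau[\,] \quad \ominus \in \{\texttt{==}, \texttt{!=}\}}{\Gamma \vdash e_1 \ominus e_2 : \mathsf{bool}}
$$

$$
\dfrac{\Gamma \vdash e_1 : \tau \quad \cdots \quad \Gamma \vdash e_n : \tau \quad n \ge 0}{\Gamma \vdash \texttt{\{}e_1, \ldots, e_n\texttt{\}} : \tau[\,]}
\qquad
\dfrac{\Gamma \vdash e_1 : \tau[\,] \quad \Gamma \vdash e_2 : \mathsf{int}}{\Gamma \vdash e_1\texttt{[}e_2\texttt{]} : \tau}
\qquad
\dfrac{\Gamma \vdash e_1 : \tau[\,] \quad \Gamma \vdash e_2 : \tau[\,]}{\Gamma \vdash e_1 \texttt{+} e_2 : \tau[\,]}
$$

$$
\dfrac{\Gamma(f) = \mathsf{fn}\ \mathsf{unit} \to T' \quad T' \ne \mathsf{unit}}{\Gamma \vdash f\texttt{()} : T'}
\qquad
\dfrac{\Gamma(f) = \mathsf{fn}\ \tau \to T' \quad T' \ne \mathsf{unit} \quad \Gamma \vdash e : \tau}{\Gamma \vdash f\texttt{(}e\texttt{)} : T'}
$$

$$
\dfrac{\Gamma(f) = \mathsf{fn}\ (\tau_1, \ldots, \tau_n) \to T' \quad T' \ne \mathsf{unit} \quad \Gamma \vdash e_i : \tau_i \ (\forall i \in 1..n) \quad n \ge 2}{\Gamma \vdash f\texttt{(}e_1, \ldots, e_n\texttt{)} : T'}
$$

4 Type-checking statements

To type-check statements, we need all the information used to type-check expressions, plus the types of procedures, which are included in $\Gamma$. In addition, we extend the domain of $\Gamma$ a little to include a special symbol $\rho$.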
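One way to picture such a context is as a finite map from names to entries, with a reserved key standing in for the special symbol. The Python sketch below makes that concrete and checks a handful of the expression rules from Section 3 against it. It is an illustration under stated assumptions, not the course's reference implementation: the tuple encodings of types and expressions, the key `RHO`, the exception `XiTypeError`, and the function `type_of` are all hypothetical names, and only literals, variables, binary operators, and indexing are covered.

```python
# Assumed encodings for this sketch:
#   types:        "int", "bool", ("array", t)
#   entries:      ("var", t), ("ret", T), ("fn", T_in, T_out)
#   expressions:  ("int", n), ("bool", b), ("var", x),
#                 ("binop", op, e1, e2), ("index", e1, e2)
INT, BOOL = "int", "bool"
RHO = "$rho"  # reserved key for the special symbol; assumed not to be a legal identifier

ARITH = {"+", "-", "*", "*>>", "/", "%"}
COMPARE = {"==", "!=", "<", "<=", ">", ">="}
LOGIC = {"==", "!=", "&", "|"}


class XiTypeError(Exception):
    """Raised when no typing rule applies."""


def array(t):
    return ("array", t)


def is_array(t):
    return isinstance(t, tuple) and t[0] == "array"


def type_of(ctx, e):
    """Return the type of expression e under context ctx, or raise XiTypeError."""
    kind = e[0]
    if kind == "int":
        return INT
    if kind == "bool":
        return BOOL
    if kind == "var":
        entry = ctx.get(e[1])
        # Only entries of the form ("var", t) may be used as variables.
        if entry is None or entry[0] != "var":
            raise XiTypeError(f"{e[1]} is not a variable in scope")
        return entry[1]
    if kind == "binop":
        _, op, e1, e2 = e
        t1, t2 = type_of(ctx, e1), type_of(ctx, e2)
        if t1 == t2 == INT:
            if op in ARITH:
                return INT
            if op in COMPARE:
                return BOOL
        if t1 == t2 == BOOL and op in LOGIC:
            return BOOL
        if t1 == t2 and is_array(t1):
            if op == "+":            # array concatenation
                return t1
            if op in {"==", "!="}:   # array equality
                return BOOL
        raise XiTypeError(f"operator {op} does not apply to {t1} and {t2}")
    if kind == "index":
        _, e1, e2 = e
        t1 = type_of(ctx, e1)
        if is_array(t1) and type_of(ctx, e2) == INT:
            return t1[1]
        raise XiTypeError("indexing needs an array and an int index")
    raise XiTypeError(f"unsupported expression form: {kind}")
```

For example, with `ctx = {"x": ("var", INT), "a": ("var", array(INT)), RHO: ("ret", INT)}`, calling `type_of(ctx, ("index", ("var", "a"), ("var", "x")))` yields `"int"`. The entry stored under `RHO` is never consulted by expression checking; it is carried along for the statement rules.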
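To check the return statement we need to know what the return type of the current function is or if it is a procedure. Let this be denoted by $\Gamma(\rho) = \mathsf{ret}\ T$, where $T \ne \mathsf{unit}$ if the statement is part of a function, or $T = \mathsf{unit}$ if the statement is part of a procedure.

Since statements include declarations, they can also produce new variable bindings, resulting in an updated typing context which we will denote as $\Gamma'$. To update typing contexts, we write $\Gamma[x \mapsto \mathsf{var}\ \tau]$, which is an environment exactly like $\Gamma$ except that it maps $x$ to $\mathsf{var}\ \tau$.

We use the metavariable $s$ to denote a statement, so the main typing judgment for statements has the form $\Gamma \vdash s : R, \Gamma'$.

Most of the statements are fairly straightforward and do not change $\Gamma$:

$$
\dfrac{\Gamma \vdash s_1 : \mathsf{unit}, \Gamma_1 \quad \Gamma_1 \vdash s_2 : \mathsf{unit}, \Gamma_2 \quad \cdots \quad \Gamma_{n-1} \vdash s_n : R, \Gamma_n}{\Gamma \vdash \texttt{\{}s_1\ s_2\ \ldots\ s_n\texttt{\}} : R, \Gamma}\ \text{(SEQ)}
\qquad
\dfrac{}{\Gamma \vdash \texttt{\{\}} : \mathsf{unit}, \Gamma}\ \text{(EMPTY)}
$$

$$
\dfrac{\Gamma \vdash e : \mathsf{bool} \quad \Gamma \vdash s : R, \Gamma'}{\Gamma \vdash \texttt{if (}e\texttt{)}\ s : \mathsf{unit}, \Gamma}\ \text{(IF)}
\qquad
\dfrac{\Gamma \vdash e : \mathsf{bool} \quad \Gamma \vdash s_1 : R_1, \Gamma' \quad \Gamma \vdash s_2 : R_2, \Gamma''}{\Gamma \vdash \texttt{if (}e\texttt{)}\ s_1\ \texttt{else}\ s_2 : \mathit{lub}(R_1, R_2), \Gamma}\ \text{(IFELSE)}
$$

$$
\dfrac{\Gamma \vdash e : \mathsf{bool} \quad \Gamma \vdash s : R, \Gamma'}{\Gamma \vdash \texttt{while (}e\texttt{)}\ s : \mathsf{unit}, \Gamma}\ \text{(WHILE)}
$$

$$
\dfrac{\Gamma(f) = \mathsf{fn}\ \mathsf{unit} \to \mathsf{unit}}{\Gamma \vdash f\texttt{()} : \mathsf{unit}, \Gamma}\ \text{(PRCALLUNIT)}
\qquad
\dfrac{\Gamma(f) = \mathsf{fn}\ \tau \to \mathsf{unit} \quad \Gamma \vdash e : \tau}{\Gamma \vdash f\texttt{(}e\texttt{)} : \mathsf{unit}, \Gamma}\ \text{(PRCALL)}
$$

$$
\dfrac{\Gamma(f) = \mathsf{fn}\ (\tau_1, \ldots, \tau_n) \to \mathsf{unit} \quad \Gamma \vdash e_i : \tau_i \ (\forall i \in 1..n) \quad n \ge 2}{\Gamma \vdash f\texttt{(}e_1, \ldots, e_n\texttt{)} : \mathsf{unit}, \Gamma}\ \text{(PRCALLMULTI)}
$$

$$
\dfrac{\Gamma(\rho) = \mathsf{ret}\ \mathsf{unit}}{\Gamma \vdash \texttt{return} : \mathsf{void}, \Gamma}\ \text{(RETURN)}
\qquad
\dfrac{\Gamma(\rho) = \mathsf{ret}\ \tau \quad \Gamma \vdash e : \tau}{\Gamma \vdash \texttt{return}\ e : \mathsf{void}, \Gamma}\ \text{(RETVAL)}
$$

$$
\dfrac{\Gamma(\rho) = \mathsf{ret}\ (\tau_1, \tau_2, \ldots, \tau_n) \quad \Gamma \vdash e_i : \tau_i \ (\forall i \in 1..n) \quad n \ge 2}{\Gamma \vdash \texttt{return}\ e_1, e_2, \ldots, e_n : \mathsf{void}, \Gamma}\ \text{(RETMULTI)}
$$

The function $\mathit{lub}$ is defined as follows:

$$
\mathit{lub}(R, R) = R
\qquad
\mathit{lub}(\mathsf{unit}, R) = \mathit{lub}(R, \mathsf{unit}) = \mathsf{unit}
$$

Therefore, the type of an `if` is $\mathsf{void}$ only if all branches have that type.

Assignments require checking the left-hand side to make sure it is assignable:

$$
\dfrac{\Gamma(x) = \mathsf{var}\ \tau \quad \Gamma \vdash e : \tau}{\Gamma \vdash x\ \texttt{=}\ e : \mathsf{unit}, \Gamma}\ \text{(ASSIGN)}
\qquad
\dfrac{\Gamma \vdash e_1 : \tau[\,] \quad \Gamma \vdash e_2 : \mathsf{int} \quad \Gamma \vdash e_3 : \tau}{\Gamma \vdash e_1\texttt{[}e_2\texttt{]}\ \texttt{=}\ e_3 : \mathsf{unit}, \Gamma}\ \text{(ARRASSIGN)}
$$

Declarations are the source of new bindings.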
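To see how such bindings travel from one statement to the next while the outcome types are combined, the rules SEQ, EMPTY, IF, IFELSE, WHILE, and the definition of lub can be sketched in Python as follows. This is a simplified illustration with several assumptions: statements are encoded as tuples, guard expressions and return values are taken to have been checked elsewhere and are omitted, a plain variable declaration stands in for all binding forms, and the names `check`, `lub`, and `XiTypeError` are hypothetical.

```python
UNIT, VOID = "unit", "void"


class XiTypeError(Exception):
    """Raised when no statement rule applies."""


def lub(r1, r2):
    """lub(R, R) = R; any combination involving unit gives unit."""
    return r1 if r1 == r2 else UNIT


def check(ctx, s):
    """Return (R, ctx') for statement s under context ctx.

    Assumed statement encodings: ("return",), ("decl", x, t), ("if", s),
    ("ifelse", s1, s2), ("while", s), ("block", [s1, ..., sn]).
    """
    kind = s[0]
    if kind == "return":
        # The check against the declared return type is omitted in this sketch.
        return VOID, ctx
    if kind == "decl":
        _, x, t = s
        if x in ctx:
            raise XiTypeError(f"{x} is already declared")
        # Build an extended copy; the incoming context is left unchanged.
        return UNIT, {**ctx, x: ("var", t)}
    if kind == "if":
        check(ctx, s[1])              # the body must type-check; its outcome is discarded
        return UNIT, ctx
    if kind == "ifelse":
        r1, _ = check(ctx, s[1])
        r2, _ = check(ctx, s[2])
        return lub(r1, r2), ctx
    if kind == "while":
        check(ctx, s[1])
        return UNIT, ctx
    if kind == "block":
        stmts = s[1]
        if not stmts:                 # EMPTY
            return UNIT, ctx
        cur = ctx
        for st in stmts[:-1]:
            r, cur = check(cur, st)   # thread the updated context forward
            if r != UNIT:
                # SEQ requires every statement except the last to have outcome unit.
                raise XiTypeError("statement after a non-completing statement")
        r, _ = check(cur, stmts[-1])
        return r, ctx                 # bindings made inside the block do not escape
    raise XiTypeError(f"unsupported statement form: {kind}")
```

Under this encoding, `check({}, ("ifelse", ("return",), ("block", [])))` gives outcome `"unit"`, while the same statement with `("return",)` in both branches gives `"void"`, matching the remark that an `if` is void only when every branch is.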
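Three kinds of declarations can appear in the source language: variable declarations, multiple assignments, and function/procedure declarations. We are only concerned with the first two kinds within a function body. To handle multiple assignments, we define a declaration $d$ that can appear within a multiple assignment:

$$d ::= x{:}\tau \mid \texttt{\_}$$

and define functions $\mathit{typeof}(d)$ and $\mathit{varsof}(d)$ as follows:

$$
\begin{aligned}
\mathit{typeof}(x{:}\tau) &= \tau & \mathit{typeof}(\texttt{\_}) &= \mathsf{unit} \\
\mathit{varsof}(x{:}\tau) &= \{x\} & \mathit{varsof}(\texttt{\_}) &= \emptyset
\end{aligned}
$$

Using these notations, we have the following rules:

$$
\dfrac{x \notin \mathrm{dom}(\Gamma)}{\Gamma \vdash x{:}\tau : \mathsf{unit}, \Gamma[x \mapsto \mathsf{var}\ \tau]}\ \text{(VARDECL)}
\qquad
\dfrac{x \notin \mathrm{dom}(\Gamma) \quad \Gamma \vdash e : \tau}{\Gamma \vdash x{:}\tau\ \texttt{=}\ e : \mathsf{unit}, \Gamma[x \mapsto \mathsf{var}\ \tau]}\ \text{(VARINIT)}
$$

$$
\dfrac{\Gamma \vdash e : \tau}{\Gamma \vdash \texttt{\_}\ \texttt{=}\ e : \mathsf{unit}, \Gamma}\ \text{(EXPRSTMT)}
$$

$$
\dfrac{x \notin \mathrm{dom}(\Gamma) \quad \Gamma \vdash e_i : \mathsf{int} \ (\forall i \in 1..n) \quad n \ge 1 \quad m \ge 0}{\Gamma \vdash x{:}\tau\texttt{[}e_1\texttt{]} \ldots \texttt{[}e_n\texttt{]}\underbrace{\texttt{[]} \ldots \texttt{[]}}_{m} : \mathsf{unit}, \Gamma[x \mapsto \mathsf{var}\ \tau\underbrace{[\,] \ldots [\,]}_{n+m}]}\ \text{(ARRAYDECL)}
$$

$$
\dfrac{
\begin{array}{c}
\Gamma \vdash e : (\tau_1, \ldots, \tau_n) \qquad \tau_i \le \mathit{typeof}(d_i) \ (\forall i \in 1..n) \\
\mathrm{dom}(\Gamma) \cap \mathit{varsof}(d_i) = \emptyset \ (\forall i \in 1..n) \qquad \mathit{varsof}(d_i) \cap \mathit{varsof}(d_j) = \emptyset \ (\forall i, j \in 1..n.\ j \ne i)
\end{array}
}{\Gamma \vdash d_1, \ldots, d_n\ \texttt{=}\ e : \mathsf{unit}, \Gamma[x_i \mapsto \mathsf{var}\ \mathit{typeof}(d_i)]\ (\forall i \in 1..n\ \forall x_i.\ \mathit{varsof}(d_i) = \{x_i\})}\ \text{(MULTIASSIGN)}
$$

The final premise in rule MULTIASSIGN prevents shadowing by ensuring that $\mathrm{dom}(\Gamma)$ and all of the $\mathit{varsof}(d_i)$'s are disjoint from each other.

With respect to (ARRAYDECL), note that the case of declaring an array with no dimension sizes specified ($n = 0$) is already covered by (VARDECL).

5 Checking top-level declarations

At the top level of the program, we need to figure out the types of procedures and functions, and make sure their bodies are well-typed. Since mutual recursion is supported, this needs to be done in two passes. First, we use the judgment $\Gamma \vdash \mathit{fd} : \Gamma'$ to state that the function or procedure declaration $\mathit{fd}$ extends top-level bindings $\Gamma$ to $\Gamma'$:

$$
\dfrac{f \notin \mathrm{dom}(\Gamma)}{\Gamma \vdash f\texttt{()}\ s : \Gamma[f \mapsto \mathsf{fn}\ \mathsf{unit} \to \mathsf{unit}]}
\qquad
\dfrac{f \notin \mathrm{dom}(\Gamma)}{\Gamma \vdash f\texttt{(}x{:}\tau\texttt{)}\ s : \Gamma[f \mapsto \mathsf{fn}\ \tau \to \mathsf{unit}]}
$$

$$
\dfrac{f \notin \mathrm{dom}(\Gamma) \quad \Gamma' = \Gamma[f \mapsto \mathsf{fn}\ (\tau_1, \ldots, \tau_n) \to \mathsf{unit}] \quad n \ge 2}{\Gamma \vdash f\texttt{(}x_1{:}\tau_1, \ldots, x_n{:}\tau_n\texttt{)}\ s : \Gamma'}
$$

$$
\dfrac{f \notin \mathrm{dom}(\Gamma)}{\Gamma \vdash f\texttt{()}{:}\tau'\ s : \Gamma[f \mapsto \mathsf{fn}\ \mathsf{unit} \to \tau']}
\qquad
\dfrac{f \notin \mathrm{dom}(\Gamma)}{\Gamma \vdash f\texttt{(}x{:}\tau\texttt{)}{:}\tau'\ s : \Gamma[f \mapsto \mathsf{fn}\ \tau \to \tau']}
$$

$$
\dfrac{f \notin \mathrm{dom}(\Gamma) \quad \Gamma' = \Gamma[f \mapsto \mathsf{fn}\ (\tau_1, \ldots, \tau_n) \to \tau'] \quad n \ge 2}{\Gamma \vdash f\texttt{(}x_1{:}\tau_1, \ldots, x_n{:}\tau_n\texttt{)}{:}\tau'\ s : \Gamma'}
$$

$$
\dfrac{f \notin \mathrm{dom}(\Gamma) \quad \Gamma' = \Gamma[f \mapsto \mathsf{fn}\ \mathsf{unit} \to (\tau'_1, \ldots, \tau'_m)] \quad m \ge 2}{\Gamma \vdash f\texttt{()}{:}\tau'_1, \ldots, \tau'_m\ s : \Gamma'}
$$

$$
\dfrac{f \notin \mathrm{dom}(\Gamma) \quad \Gamma' = \Gamma[f \mapsto \mathsf{fn}\ \tau \to (\tau'_1, \ldots, \tau'_m)] \quad m \ge 2}{\Gamma \vdash f\texttt{(}x{:}\tau\texttt{)}{:}\tau'_1, \ldots, \tau'_m\ s : \Gamma'}
$$

$$
\dfrac{f \notin \mathrm{dom}(\Gamma) \quad \Gamma' = \Gamma[f \mapsto \mathsf{fn}\ (\tau_1, \ldots, \tau_n) \to (\tau'_1, \ldots, \tau'_m)] \quad n \ge 2 \quad m \ge 2}{\Gamma \vdash f\texttt{(}x_1{:}\tau_1, ..}
$$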
