Qualifier Type Inference
Total Page:16
File Type:pdf, Size:1020Kb
Qualifier Type Inference We present the full qualifier inference system in this section. annotates the expression e with the qualifier Q. The expression Our system extends the flow-sensitive analysis of Foster et al. check(e; Q) requires the top-level qualifier of e to be at most Q. [1]. In particular, we consider pair types (and more generally We automatically insert the check expressions through a simple records) and present their corresponding type inference rules. program transformation. Specifically, we consider two types of Providing separate qualifiers for the elements of pairs is use as security critical: pointer dereferences and conditional important in our problem domain, as records (C structs) branches. To detect UBI, we insert a check(e; init) statement are used extensively in the Linux kernel. More importantly, before every statement where e is dereferenced or is used as pointers to records are often passed between functions and the predicate of a conditional branch. whether a field of a record is or is not initialized is independent of the other fields of the record. We present a type qualifier B. Types and Type Stores inference system to infer a qualifier (either init or uninit) for We now define the qualified types. each expression of the program. τ := Q σ A. Syntax Q := κ j init j uninit 0 0 σ := int j ref (ρ) j (C; τ) ! (C ; τ ) j hτ1; τ2i Our qualifier inference is performed after alias analysis. The C := j Alloc(C; ρ) j Assign(C; ρ: τ) alias analysis results are used to decorate aliased references with j Merge(C; C0;L) j Filter(C; L) the same abstract locations ρ. This can be the line number of η := 0 j 1 j ! an object allocation statement. In the input programs, reference creation expressions are decorated with abstract locations and The qualified types τ can have qualifiers at different levels. functions are decorated with effects (i.e., the set of abstract Q can be a qualifier variable κ or a constant qualifier init locations that they access). The abstract syntax is defined as or uninit. The flow-sensitive analysis associates a ground follows: store C to each program point that is a vector that associates abstract locations to qualified types. Thus, function types are e := x j n j λL x: t: e j e e j refρ e j !e 1 2 now extended to (C; τ) ! (C0; τ 0) where C is the store that j e := e j he ; e i j fst(e) j snd(e) 1 2 1 2 the function is invoked in and C0 is the store when the function j fst(e ) := e j snd(e ) := e j 1 2 1 2 returns. j assert(e; Q) j check(e; Q) Each location in a store C also has an associated linearity t := α j int j ref (ρ) j t !L t0 j ht ; t i 1 2 η that can take three values: 0 for unallocated locations, L := fρ, ::; ρg 1 for linear locations, and ! for non-linear locations. An An expression e can be a variable x, a constant integer n, a abstract location is linear if the type system can prove that it function λL x: t: e with argument x of type t, effect set L and corresponds to a single concrete location in every execution. An body e. The effect set L is the set of abstract locations ρ that update that changes the qualifier of a location is called a strong the function accesses. A type t is either a type variable α, an update; otherwise, it is called a weak update. Strong updates integer type int, a reference ref (ρ) (to the abstract location can be applied to only linear locations. The three linearities ρ), a function type t !L t0 (that is decorated with its effects form a lattice 0 < 1 < !. Addition on linearities is as follows: L) or a pair type ht1; t2i. The analysis will involve a store 0 + x = x, 1 + 1 = !, and ! + x = !. The type inference C that maps abstract locations ρ to types. The expression system tracks the linearity of locations to allow strong updates e1 e2 is the application of function e1 to argument e2. The for only the linear locations. ρ reference creation expression ref e (decorated with the abstract Since a store C maps from each abstract location ρi to a location ρ) allocates memory with the value e. The expression type τi and a linearity ηi, we write C(ρ) as the type of ρ !e dereferences the reference e. The expression e1 := e2 in C and Clin (ρ) as the linearity of ρ in C. Store variables assigns the value of e2 to the location e1 points to. The are denoted as . We use the following store constructors to expression he1; e2i is the pair of e1 and e2. The expressions represent the store after an expression as a function of the store fst(e) and snd(e) are the first and second elements of the pair e before it. Alloc(C; ρ) returns the same store as C except for respectively. The expressions fst(e1) := e2 and snd(e1) := e2 the location ρ. Allocating ρ does not affect the types in the assign the value of e2 to the first element and second elements store; however, as ρ is allocated once more, the linearity of ρ 0 of the location e1 points to respectively. is increased by one. Merge(C; C ;L) returns the combination We use explicit qualifiers to both annotate and check the of stores C and C0; for a location ρ, if ρ 2 L, then its type and initialization status of expressions. The expression assert(e; Q) linearity are taken from C, otherwise from C0. Filter(C; L) ρ Alloc(C; ρ0)(ρ) =C(ρ) type of in the post-store. The qualifier of the new location n1 + C (ρ) if ρ = ρ0 is initialized. The rule DEREF checks that the dereferenced Alloc(C; ρ0) (ρ) = lin lin C(ρ) otherwise expression is of a reference type ref (ρ) and retrieves the type nC(ρ) if ρ 2 L Merge(C; C0;L)(ρ) = C(ρ0) otherwise of the value stored at the location ρ from the store. Qualifiers nC (ρ) if ρ 2 L are checked by the single check expression described before Merge(C; C0;L) (ρ) = lin lin C0 (ρ) lin otherwise (and not when references are dereferenced). The rule ASSIGN Filter(C; L)(ρ) =C(ρ) ρ 2 L nC (ρ) if ρ 2 L checks that the left-hand side expression is of a reference type Filter(C; L) (ρ) = lin lin 0 otherwise and checks that the type of the right-hand side is a subtype of 0 0 0 τ where τ τ if ρ = ρ ^ Clin (ρ) 6= ! 0 0 the type of the value that the reference stores. It also checks Assign(C; ρ : τ)(ρ) = τ t C(ρ) if ρ = ρ ^ Clin (ρ) = ! C(ρ) otherwise that the right-hand side can be assigned to the left-hand side 0 Assign(C; ρ : τ)lin (ρ)=Clin (ρ) considering the linearity and type of the left-hand side reference and the type of the right-hand side expression (as described in restricts the domain of C to L. Assign(C; ρ: τ) overrides C the definition of Assign above). The rule LAM type-checks the by mapping ρ to a type τ 0 such that τ τ 0. The condition function body e in a fresh initial store and with the parameter τ τ 0 allows assigning a subtype τ of resulting type τ 0 to ρ. bound to a type with fresh qualifier variables. The resulting If ρ is linear then its type in Assign(C; ρ : τ) is τ 0; otherwise post-store of the function body C0 should be a subtype of its type is conservatively the least-upper bound of τ and its the post-store of the function 0. This step essentially creates previous type C(ρ). a function summary, which has been explained in the paper The type inference system generates subtyping constraints section 4.3. We use the function sp(t) to decorate a standard between stores. We define store subtyping in Figure 1. type t with fresh qualifier and store variables: sp(α) = κ α κ fresh INT REF Q Q0 Q Q0 sp(int) = κ int κ fresh 0 0 sp(ref (ρ)) = κ ref (ρ) κ fresh Q int Q int Q ref (ρ) Q ref (ρ) L 0 L 0 0 0 FUN sp(t ! t ) = κ (, sp(t)) ! ( ; sp(t )) κ, , fresh 0 0 0 0 0 0 0 Q Q τ2 τ1 τ1 τ2 C2 C1 C1 C2 sp(ht; t i) = κ hsp(t); sp(t )i κ fresh L 0 0 0 L 0 0 Q (C1; τ1) ! (C1; τ1) Q (C2; τ2) ! (C2; τ2) The rule APP checks that the type of e2 is a subtype STORE 0 0 of the parameter type of e1, Further, with the condition τi τi ηi ηi i = 1::n Filter(C; L) , it checks that state of the locations that 0 0 η1 ηn η1 0 ηn 0 00 fρ1 : τ1; :::; ρn : τ1g fρ1 : τ1; :::; ρ1 : τng e1 uses (captured by its effect set L) in the post-store C of e2 PAIR are compatible with the store that the function e expects. The 0 0 0 1 Q Q τ1 τ1 τ1 τ2 resulting store Merge(0;C00;L) joins the store C00 before the 0 0 0 0 Q hτ1; τ2i Q hτ1; τ2i function call with the result store of the function.