A Theory of Type Qualifiers
Total Page:16
File Type:pdf, Size:1020Kb
A Theory of Type Qualifiers∗ Jeffrey S. Foster Manuel F¨ahndrich Alexander Aiken [email protected] [email protected] [email protected] EECS Department University of California, Berkeley Berkeley, CA 94720-1776 Abstract Purify [Pur]), which test for properties in a particular pro- gram execution. We describe a framework for adding type qualifiers to a lan- A canonical example of a type qualifier from the C world guage. Type qualifiers encode a simple but highly useful is the ANSI C qualifier const. A variable with a type an- form of subtyping. Our framework extends standard type notated with const can be initialized but not updated.1 A rules to model the flow of qualifiers through a program, primary use of const is annotating pointer-valued function where each qualifier or set of qualifiers comes with addi- parameters as not being updated by the function. Not only tional rules that capture its semantics. Our framework al- is this information useful to a caller of the function, but it lows types to be polymorphic in the type qualifiers. We is automatically verified by the compiler (up to casting). present a const-inference system for C as an example appli- Another example is Evans's lclint [Eva96], which intro- cation of the framework. We show that for a set of real C duces a large number of additional qualifier-like annotations programs, many more consts can be used than are actually to C as an aid to debugging memory usage errors. One present in the original code. such annotation is nonnull, which indicates that a pointer value must not be null. Evans found that adding such an- 1 Introduction notations greatly increased compile-time detection of null pointer dereferences [Eva96]. Although it is not a type- Programmers know strong invariants about their programs, based system, we believe that annotations like lclint's can and it is widely accepted by practitioners that such invari- be expressed naturally as type qualifiers in our framework. ants should be automatically, statically checked to the ex- Yet another example of type qualifiers comes from tent possible [Mag93]. However, except for static type sys- binding-time analysis, which is used in partial evaluation tems, modern programming languages provide little or no systems [Hen91, DHM95]. Binding-time analysis infers support for expressing such invariants. In our view, the whether values are known at compile time (the qualifier problem is not a lack of proposals for expressing invariants; static) or may not be known until run time (the quali- the research community, and especially the verification com- fier dynamic) by specializing the program with respect to an munity, has proposed many mechanisms for specifying and initial input. proving properties of programs. Rather, the problem lies in There are also many other examples of type qualifiers in gaining widespread acceptance in practice. A central issue the literature. Each of the cited examples adds particular is what sort of invariants programmers would be willing to type qualifiers for a specific application. This paper presents write down. a framework for adding new, user-specified type qualifiers to In this paper we consciously seek a conservative frame- a language in a general way. Our framework also extends the work that minimizes the unfamiliar machinery programmers standard type system to perform qualifier inference, which must learn while still allowing interesting program invariants propagates programmer-supplied annotations through the to be expressed and checked. One kind of programming an- program and checks them. Such a system gives the pro- notation that is widely used is a type qualifier. grammer more complete information about qualifiers and Type qualifiers are easy to understand, yet they can ex- makes qualifiers more convenient to use than a pure check- press strong invariants. The type system guarantees that in ing system. every program execution the semantic properties captured The main contributions of the paper are by the qualifier annotations are maintained. This is in con- trast to dynamic invariant checking (e.g., assert macros or We show that it is straightforward to parameterize a • language by a set of type qualifiers and inference rules ∗This research was supported in part by the National Science Foundation Young Investigator Award No. CCR-9457812, NASA for checking conditions on those qualifiers. In particu- Contract No. NAG2-1210, and an NDSEG fellowship. lar, the changes to the lexing, parsing, and type check- ing (see below) phases of a compiler are minimal. We In Proceedings of the ACM SIGPLAN '99 Conference believe it would be realistic to incorporate our proposal on Programming Language Design and Implementation into software engineering tools for any typed language. (PLDI), Atlanta, Georgia, May 1999. 1C allows type casts to remove constness, but the result is imple- mentation dependent [KR88]. 1 We show that the handling of type qualifiers can be well-formedness condition comes from binding time analysis: • separated from the standard type system of the lan- Nothing dynamic may appear within a value that is static. guage. That is, while the augmented type system in- Thus, a type such as static (dynamic α dynamic β) is cludes rules for manipulating and checking qualifiers, not well-formed.3 ! in fact the computation of qualifiers can be isolated in Because our framework is parameterized by the set of a separate phase after standard typechecking has been qualifiers, we must extend not only the types, but also performed. This factorization is similar to that of re- the source language. We add both qualifier annotations, gion inference [TT94]. which introduce qualifiers into types, and qualifier asser- tions, which enforce checks on the qualifiers of a qualified We introduce a natural notion of qualifier polymor- • type. These extensions allow the programmer to express the phism that allows types to be polymorphic in their invariants that are to be checked by the qualifier inference qualifiers. We present examples from existing C pro- rules. grams to show that qualifier polymorphism is useful We conclude this section with a brief illustration of the and in fact necessary in some situations. need for qualifier polymorphism. Qualifier polymorphism We present experimental evidence from a prototype solves a problem with const familiar to C and C++ pro- • qualifier inference system. For this study, we exam- grammers. One of the more awkward consequences of the standard (monomorphic) C++ type system appears in the ine the use of the qualifier const on a set of C bench- marks. We show that even in cases where programmers Standard Template Library (STL) [MSS96] for C++. STL have apparently tried to systematically mark variables must always explicitly provide two sets of operations, one for constant data structures and one for non-constant data as const, monomorphic qualifier inference is able to in- structures. For illustration, consider the following pair of C fer many additional variables as const. Furthermore, functions: polymorphic qualifier inference finds more const vari- ables than monomorphic inference. This study shows typedef const int ci; both that qualifier inference is practical and useful, int *id1(int *x) { return x; } even for existing qualifiers and programs. ci *id2(ci *x) { return x; } The technical observation behind our framework is that a C programmers would like to have only one copy of this type qualifier q introduces a simple form of subtyping: For function, since both versions behave identically and in fact all types τ, either τ q τ or q τ τ. Here, as through compile to the same code. Unfortunately we need both. A the rest of the paper, we write qualifiers in prefix notation, pointer to a constant cannot be passed to id1 without a so q τ represents standard type τ qualified by q. We illus- cast. A pointer to a non-constant can be passed to id2, but trate the subtyping relationship using the examples given then the return value will be const. In the language of type above. In C, non-const l-values can be promoted to const theory, this difficulty occurs because the identity function l-values, but not vice-versa. We capture this formally by has type κ α κ α, with qualifier set κ appearing both co- saying that τ const τ for any type τ. In Evans's system, and contravariantly.! the set of non-null pointers is a subset of the set of all point- In part because of the lack of const polymorphism in ers, which is expressed as nonnull τ τ. In binding time C and C++, const is often either not used, or function analysis values may be promoted fromstatic to dynamic. results are deliberately cast to non-const. For example, the Since static and dynamic are dual notions, we can choose standard library function strchr takes a const char *s as to write static τ τ or τ dynamic τ, depending on a parameter but returns a char * pointing somewhere in s. which qualifier name we regard as the canonical one. By adding polymorphism, we allow const to be used more Our framework extends a language with a set of standard easily without resorting to casting. types and standard type rules to a qualified type system as The rest of this paper is organized as follows. Section 2 2 follows. The user defines a set of n type qualifiers q1; : : : ; qn describes our framework in detail, including the rules for and indicates the subtyping relation for each (whether qi τ const. Section 3 discusses type inference, qualifier poly- τ or τ qi τ for any standard type τ). Each level of a morphism, and soundness. Section 4 describes our const- standard type may be annotated with a set of qualifiers, inference system.