Type Qualifiers As Composable Language Extensions
Total Page:16
File Type:pdf, Size:1020Kb
Type Qualifiers as Composable Language Extensions Travis Carlson Eric Van Wyk Computer Science and Engineering Computer Science and Engineering University of Minnesota University of Minnesota Minneapolis, MN, USA Minneapolis, MN, USA [email protected] [email protected] Abstract 1 Introduction and Motivation This paper reformulates type qualifiers as language exten- Type qualifiers in C and C++, such as const or restrict, sions that can be automatically and reliably composed. Type enable the programmer to disallow certain operations on qualifiers annotate type expressions to introduce new sub- qualified types and to indicate simple subtype relationships typing relations and are powerful enough to detect many between types. For example, only initializing assignments kinds of errors. Type qualifiers, as illustrated in our ableC are allowed on a variable declared with a const qualifier, extensible language framework for C, can introduce rich and a function with an argument declaration int *x cannot forms of concrete syntax, can generate dynamic checks on be passed a value of type const int * as this would allow data when static checks are infeasible or not appropriate, and changes to the const int via the pointer. The type int * inject code that affects the program’s behavior, for example is considered to be a subtype of const int *. for conversions of data or logging. In their seminal paper “A Theory of Type Qualifiers”, Fos- ableC language extensions to C are implemented as at- ter et al.[12] formalize this subtyping relationship and show tribute grammar fragments and provide an expressive mech- how user-defined type qualifiers can be added to a language anism for type qualifier implementations to check for ad- to perform additional kinds of checking. One example is a ditional errors, e.g. dereferences to pointers not qualified nonnull qualifier for pointer types to indicate that the value by a “nonnull” qualifier, and report custom error messages. of the pointer is not null. A pointer p declared as Our approach distinguishes language extension users from int * nonnull p = &v; developers and provides modular analyses to developers to can be passed to a function as a parameter declared to have ensure that when users select a set of extensions to use, they type int *, but the reverse of passing an int * value as a will automatically compose to form a working compiler. parameter with type int * nonnull is disallowed. Further- more, dereferences to pointers whose type is not qualified CCS Concepts • Software and its engineering → Ex- as nonnull raise errors. On the other hand, the qualifier tensible languages; Data types and structures; Transla- tainted induces a different subtype relationship in which tor writing systems and compiler generators; the type char * is a subtype of tainted char *, thus pre- Keywords type qualifiers, type systems, pluggable types, venting a value with this qualified type to be used where extensible languages, dimensional analysis an unqualified one is expected. These subtype relationships, which form a lattice of qualified types, are described in more ACM Reference Format: detail in Section2. In their paper, the qualifiers are called Travis Carlson and Eric Van Wyk. 2017. Type Qualifiers as Com- “user-defined” and it is a programmer that specifies anew posable Language Extensions. In Proceedings of 16th ACM SIG- qualifier, the form of subtype relationship that it induces, PLAN International Conference on Generative Programming: Con- and the operations that it disallows. cepts and Experiences (GPCE’17). ACM, New York, NY, USA, 13 pages. In this paper we reformulate these and other type qual- https://doi.org/10.1145/3136040.3136055 ifiers as language extensions in the ableC extensible lan- guage framework for C [17]. In the approach, we make a Permission to make digital or hard copies of all or part of this work for clear distinction between the independent parties that may personal or classroom use is granted without fee provided that copies develop various language extensions and the extension users are not made or distributed for profit or commercial advantage and that (programmers) that may select the extensions that address copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must their task at hand. These extensions may add new domain- be honored. Abstracting with credit is permitted. To copy otherwise, or specific syntax (notations) or semantic analyses to their host republish, to post on servers or to redistribute to lists, requires prior specific language, in this case C. While extension developers must permission and/or a fee. Request permissions from [email protected]. understand the underlying language implementation mech- GPCE’17, October 23–24, 2017, Vancouver, Canada anisms used in ableC, the guarantees of composability pro- © 2017 Copyright held by the owner/author(s). Publication rights licensed vided by ableC and its supporting tools ensure that the users to Association for Computing Machinery. ACM ISBN 978-1-4503-5524-7/17/10...$15.00 of extensions do not. https://doi.org/10.1145/3136040.3136055 91 GPCE’17, October 23–24, 2017, Vancouver, Canada Travis Carlson and Eric Van Wyk typedef datatype Expr Expr; Use of watch type qualifier: datatype Expr { watch int sum = 0; Add (Expr * nonnull, Expr * nonnull); for (int i=1; i < 4; ++i) { sum = sum + i; } Mul (Expr * nonnull, Expr * nonnull); Translation to C of assignment to sum: Const (int); sum = ({ int tmp = sum + i; }; printf("example.xc:3: (sum) = %d\n", tmp); tmp; }); int value (Expr * nonnull e) { match (*e) { Output: Add (e1, e2) -> { return value(e1) + value(e2); } example.xc:1: (sum) = 0 Mul (e1, e2) -> { return value(e1) * value(e2); } example.xc:3: (sum) = 1 Const (v) -> { return v; } example.xc:3: (sum) = 3 } example.xc:3: (sum) = 6 } Figure 2. Code insertion by watch type qualifier. Figure 1. Algebraic datatype and nonnull extensions. or appropriate, or computations to perform data conversion or generate print statements to monitor a changing variable Language extension specifications for ableC are written, as is done with the watch type qualifier shown in Figure2. as described in Section3, as context free grammars (for con- crete syntax) and attribute grammars (for semantic analy- Contributions: This paper reformulates type qualifier work sis and code generation). The underlying tools that process of Foster et al.[12] in an extensible setting with guarantees these specifications provide modular analyses of language of composability for independently-developed type qualifier extension specifications that an extension developer uses to extensions, extended with a more expressive mechanism for ensure that their extension will automatically and reliably specifying new static checks based on type qualifiers and compose with other independently-developed extensions handling of parameterized type qualifiers (Section3). that also pass these analyses. This relieves the extension We describe a refactoring and extension of previously users (programmers) of needing to know about the underly- ad-hoc handling of type qualifiers in ableC to support: ing tools and techniques and thus frees extension designers • Mechanisms for distinguishing behavior of type quali- to write expressive language extensions that introduce new fier checking on code written by a programmer versus syntax and semantic analysis with the knowledge that their code generated by a language extension or included extension will be easily used by programmers as long as it by the C preprocessor ( Section 4.1). passes the modular analyses. • A technique for automatically combining type quali- While this requires more sophistication of the extension fier annotations to library headers from multiple ex- developer, this approach does provide them with the tools tensions to alleviate a manual process in Cqal, Foster for writing more syntactically and semantically expressive et al.’s implementation of user-defined type qualifiers type qualifiers than possible in other approaches. Syntacti- in C (Section 4.2). cally, type qualifiers can define their own sub-language, e.g. • Mechanisms for type qualifiers to be specific to a type units(kg*m^2) which specifies a SI measurement in new- (and generate errors when applied to other types) and tons. Another example is the sample program in Figure1 for type qualifiers to be independent of the type they that uses two language extensions: one introducing alge- qualify (Section5). braic datatypes similar to those in Standard ML and Haskell, • Qualifiers that add dynamic checks of data when static and another introducing a nonnull qualifier like the one de- checks are not feasible or appropriate (Section6). scribed above. Here the algebraic datatype extension is used • Qualifiers that perform runtime conversion of data or to declare a datatype for simple arithmetic expressions and otherwise insert additional code (Section7). to write a function to compute their value. This value func- We demonstrate several type qualifiers such as nonnull tion takes pointers to expressions qualified as nonnull. The and qualifiers to indicate functions as being pure and associa- programmer writing this code would have imported both tive, qualifiers with richer syntax such as one for dimension of these independently-developed language extensions into analysis to check that physical measurement units (e.g. me- ableC in order to use both extensions in the same program. ters, seconds, etc.) are used correctly, extensions that insert Type qualifiers specified as ableC extensions can perform code such as dynamic array bound checks, the watch quali- the sort of analysis exemplified by a nonnull qualifier but fier, and data conversion (e.g. scaling values in millimeters they can also introduce code into the C program that is to meters in the dimension analysis extension). generated from the extended ableC program. These can be Section8 discusses related work before Section9 discusses dynamic checks of data when static checks are not possible some future work and concludes.