A Type System for Certified Runtime Type Analysis
Total Page:16
File Type:pdf, Size:1020Kb
A Type System For Certified Runtime Type Analysis A Dissertation Presented to the Faculty of the Graduate School of Yale University in Candidacy for the Degree of Doctor of Philosophy by Bratin Saha Dissertation Director: Professor Zhong Shao December 2002 Abstract A Type System For Certified Runtime Type Analysis Bratin Saha 2002 Modern programming paradigms are increasingly giving rise to applications that require type information at runtime. For example, services like garbage collection, marshalling, and serializa- tion need to analyze type information at runtime. When compiling code which uses runtime type inspections, most existing compilers use untyped intermediate languages and discard type informa- tion at an early stage. Unfortunately, such an approach is incompatible with type-based certifying compilation. A certifying compiler generates not only the code but also a proof that the code obeys a security policy. Therefore, one need not trust the correctness of a compiler generating certified code, instead one can verify the correctness of the generated code. This allows a code consumer to accept code from untrusted sources which is specially advantageous in a networked environment. In practice, most certifying compilers use a type system for generating and encoding the proofs. These systems are called type-based certifying compilers. This dissertation describes a type system that can be used for supporting runtime type analysis in type-based certifying compilers. The type system has two novel features. First, type analysis can be applied to the type of any runtime value. In particular quantified types such as polymorphic and existential types can also be analyzed, yet type-checking remains decidable. This allows the system to be used for applications such as a copying garbage collector. Type analysis plays the key role in formalizing the contract between the mutator and the collector. Second, the system integrates runtime type analysis with the explicit representation of proofs and propositions. Essentially, it integrates an entire proof language into the type system for a com- piler intermediate language. Existing certifying compilers have focussed only on simple memory and control-flow safety. This system can be used for certifying more advanced properties while still retaining decidable type-checking. Copyright c 2002 by Bratin Saha All rights reserved. i Acknowledgements First and foremost, I want to thank my advisor Professor Zhong Shao for his encouragement and support during my graduate study. He was a constant source of ideas, sound advice, and enthu- siasm. He exposed me to a wealth of knowledge in programming languages, and has taught me by example the skills of a good researcher. In my first summer at Yale, he put in a huge effort in helping me get up to speed with the SML/NJ compiler. I am particularly grateful for his efforts on our “Optimal Type Lifting” paper to improve my writing. I am indebted to him for all the other things that he has taught me: giving talks, writing research proposals, and much more. This work would not have been possible without the tremendous effort that he put into guiding me. I was privileged to be his student. I would like to thank my collaborators in the FLINT group. Valery Trifonov helped out a great deal with the work on intensional type analysis. I worked with Stefan Monnier on the application of intensional type analysis to type-safe garbage collectors. Stefan was always there to lend a hand with system administration issues. I would like to thank Chris League for many useful discussions and specially for Latex related help. I would like to thank Carsten Schurmann, Arvind Krishnamurthy, John Peterson, and Dana Angluin in the Yale Computer Science department. All of them have helped me in various ways during my graduate study and have taken an active interest in my work. Arvind and John served on my thesis committee. Carsten helped me refine my thoughts through stimulating discussions. I would also like to thank all the people in the administrative offices at Yale – specially Judy Smith and Linda Dobb for helping me out with all the paperwork. I want to thank Andrew Appel and Robert Harper for their help in various stages of my graduate study. Andrew served on my thesis committee and also helped me in other ways. Robert Harper shaped my thoughts through several stimulating discussions. His papers are an education in type- theory. ii Finally, my deepest thanks to my parents without whom I would not be here. They have provided me with loving support throughout my childhood and adulthood, and have been a great source of inspiration. I am very grateful to my brother-in-law Prasantada, my sister Gargi, Buni and Bublu for their constant love and support. I am indebted to Atrayee for her love and encouragement. Graduating during an economic depression can be particularly depressing. She was always there to cheer me up. The later part of graduate study can get particularly stressful specially in trying to tie up all the loose ends; she was always there to lend me a hand. Without Atrayee’s cheerful and constant support this work would have been impossible. I also wish to thank all my friends in Yale and outside – specially Narayan and Kishnan. This work was sponsored in part by the Defense Advanced Research Projects Agency under contract F30602-99-1-0519 and by the National Science Foundation under contracts CCR 9633390, CCR 9901011, CCR 0081590. I am grateful to both these organizations. iii Contents 1 Introduction 1 1.1 Runtime type analysis . 1 1.2 Certifying compilation . 2 1.3 Fully reflexive type analysis . 3 1.4 Integrating type analysis with a proof system . 4 1.5 Summary of contributions . 4 1.6 Outline of this dissertation . 5 2 Overview Of The Typed Lambda Calculi 7 2.1 The simply typed lambda calculus . 7 2.2 The polymorphically typed lambda calculus . 10 2.3 The third order lambda calculus . 12 2.4 The ! order lambda calculus . 13 2.5 Pure type systems . 14 3 A Language For Runtime Type Analysis 18 3.1 Introduction . 18 3.2 Background . 19 3.3 Analyzing quantified types . 24 3.4 Type erasure semantics . 36 ! ! 3.5 Translation from λi to λR . 44 3.6 The untyped language . 48 3.7 Related work . 50 iv 4 Applying Runtime Type Analysis: Garbage Collection 52 4.1 Introduction and motivation . 52 4.2 Source language λCLOS . 55 4.3 Target language λGC . 56 4.4 Translating λCLOS to λGC . 63 4.5 Summary and related work . 66 5 Integrating Runtime Type Analysis With a Proof System 67 5.1 Introduction . 67 5.2 Approach . 68 i 5.3 The type language λCC . 72 i 5.4 Formalization of λCC . 76 5.5 The computation language λH . 81 5.6 Summary and related work . 89 6 Implementation 91 6.1 Implementing a pure type system . 94 6.2 Rationale for design decisions . 98 6.3 Implementation Details . 99 6.4 Representing the base kind . 102 6.5 Performance measurement . 106 6.6 Related Work . 109 7 Conclusions and Future Work 110 7.1 Summary of contributions . 111 7.2 Future work . 112 Appendix 113 ! A Formal Properties of λi 114 ! A.1 Soundness of λi . 114 A.2 Strong normalization . 120 A.3 Confluence . 137 v B Formal Properties Of λGC 145 i C Formal Properties of λCC 154 C.1 Subject reduction . 154 C.2 Strong normalization . 168 C.3 Church-Rosser property . 201 C.4 Consistency . 204 Bibliography 204 vi Chapter 1 Introduction This dissertation describes a type system for supporting runtime type analysis in a certifying com- piler. Why is this important ? Runtime type analysis plays a crucial role in many applications, and modern language implementations are progressively targeting the generation of certified code. The following sections show how runtime type analysis plays a crucial role, and why it is important to support it in a certifying framework. 1.1 Runtime type analysis Modern programming paradigms are increasingly giving rise to applications that rely critically on type information at runtime. For example: • Java adopts dynamic linking as a key feature, and to ensure safe linking, an external module must be dynamically verified to satisfy the expected interface type. • A garbage collector must keep track of all live heap objects, and for that type information must be kept at runtime to allow traversal of data structures. • In a distributed computing environment, code and data on one machine may need to be pickled for transmission to a different machine, where the unpickler reconstructs the data structures from the bit stream. If the type of the data is not statically known at the destination (as is the case for the environment components of function closures), the unpickler must use type information, encoded in the bit stream, to correctly interpret the encoded value. 1 • Type-safe persistent programming requires language support for dynamic typing: the pro- gram must ensure that data read from a persistent store is of the expected type. • Finally, in polymorphic languages like ML, the type of a value may not be known statically; therefore, compilers have traditionally used inefficient, uniformly boxed data representation. To avoid this, several modern compilers use runtime type information to support unboxed data representation. Existing compilers can not type-check the code that involves runtime type inspections. When compiling such code they use untyped intermediate languages, and reify runtime types into values at some early stage. However, discarding the type information during compilation makes this approach incompatible with certifying compilation. 1.2 Certifying compilation A certifying compiler [Nec98] generates not only the object code, but also a machine-checkable proof that the code satisfies a given security policy.