Formalising, Improving, and Reusing the Java Module System

Formalising, improving, and reusing the Java Module System Rok Strnisaˇ St. John’s College This dissertation is submitted for the degree of Doctor of Philosophy Abstract Java has no module system. Its packages only subdivide the class namespace, allowing only a very limited form of component-level information hiding. A Java Community Process has started designing the Java Module System, a module system for Java 7, the next version of Java. The extensive draft of the design is written only in natural language documents, which inevitably contain many ambiguities. We design and formalise LJAM, a core of the module system. Where the informal documents are complete, we follow them closely; elsewhere, we make reasonable choices. We define the syntax, the type system, and the operational semantics of this core, defining these rigorously in the Isabelle/HOL automated proof assistant. We highlight the underly- ing design decisions, and discuss several alternatives and their benefits. Through analysis of our formalisation, we identify two major deficiencies of the module system: (a) its class resolution is unintuitive, insufficiently expressive, and fragile against incremental interface evolution; and (b) only a single instance of each module is permitted, which forces sharing of data and types, and so makes it difficult to reason about module invariants. We propose modest changes to the module language, and to the semantics of the class resolution, which together allow the module system to handle more scenarios in a clean and predictable manner. To develop confidence, both theoretical and practical, in our proposals, we (a) formalise in Ott the improved module system, iJAM, (b) prove in Isabelle/HOL mechanised type soundness results, and (c) give a proof-of-concept implementation in Java that closely follows the formalisation. Both of the formalisations, LJAM and iJAM, are based on Lightweight Java (LJ), our minimal imperative fragment of Java. LJ has been a good base language, allowing a high reuse of the definitions and proof scripts, which made it possible to carry out this development relatively quickly, on the timescale of the language evolution process. Finally, we develop a module system for Thorn, an emerging, Java-like language aimed at distributed environments. We find that local aliasing and module-prefixed type references remove the need for boundary renaming, and that in the presence of multiple module instances, care is required to avoid ambiguities at de-serialisation. We conclude with a high-level overview of the interactions between the desired properties and the language features considered, and discuss possible future directions. Declaration This thesis: • is the result of my own work and includes nothing which is the outcome of work done in collaboration except where specifically indicated in the text; • is not substantially the same as any that I have submitted for a degree or diploma or other qualification at any other university; and • does not exceed the prescribed limit of 60,000 words. Rok Strnisaˇ May 2010 Acknowledgements First, I would like to thank my family for all their moral and financial support throughout my studies. Without them, achieving a Ph.D. at University of Cambridge would have been much harder, if not impossible. Peter Sewell and Matthew Parkinson, my Ph.D. supervisors, have provided me with all the attention, knowledge and advice a Ph.D. student could ask for. They helped me craft the big picture, while often also giving key insights and ideas for solving specific problems I faced. I thank Jan Vitek, Doug Lea, Alex Buckley and Stanley Ho for introducing me to the topic of module systems for Java, for providing me with many documents related to the topic, and for extensive discussion. The Thorn team, whom I have had the pleasure of collaborating with, has been most helpful with the design of the draft module system for the language. At the time, the team included John Field, Jan Vitek, Tobias Wrigstad, Johan Ostlund,¨ Bard Bloom, Nate Nystrom, and Gregor Richards. I also thank Sriram Srinivasan, Silvia Breu, Tom Ridge, Alisdair Wren, Viktor Vafei- adis, Nobuko Yoshida, Simon Peyton Jones, John Billings, Sam Staton, and Jat Singh for useful comments on this work. I would like to thank my examiners, Gavin Bierman and Sophia Drossopoulou, for thorough analysis of my work, and for many valuable comments and suggestions. Finally, I acknowledge funding from two EPSRC grants: GR/T11715/01 and DTA- RG44132. Contents Abstract3 Declaration5 Acknowledgements7 Contents9 List of figures 13 List of symbols 15 1 Introduction 23 1.1 The Java Module System.......................... 25 1.1.1 The WebCalendar example..................... 26 1.1.2 Component-level information hiding................ 26 1.1.3 Dealing with JAR hell....................... 27 1.1.4 Using the module system...................... 29 1.1.5 A short summary of JMS’s features................ 30 1.2 Desirable properties of a module system.................. 31 1.3 Thesis.................................... 34 1.4 Contribution................................. 35 1.5 Collaboration................................ 37 1.6 Preliminaries................................ 37 1.6.1 A brief introduction to Ott..................... 38 1.6.2 A brief introduction to Isabelle/HOL................ 39 2 Related work 41 2.1 Verified language formalisms........................ 41 2.2 Module systems............................... 42 2.2.1 A short overview of JMS...................... 42 9 2.2.2 OSGi................................ 42 2.2.3 .NET................................ 47 2.2.4 OCaml............................... 49 2.2.5 Jiazzi................................ 50 2.2.6 General overview.......................... 53 3 Lightweight Java (LJ) 57 3.1 Example program.............................. 58 3.2 Syntax.................................... 59 3.3 Operational semantics............................ 60 3.3.1 Configuration (config)....................... 60 3.3.2 Lookup functions.......................... 61 3.3.3 Statement reduction (config −! config 0)............. 66 0 3.3.4 Variable translation (θ ` s s )................. 67 3.4 Type system................................. 67 3.4.1 Type (τ)............................... 67 3.4.2 Type environment (Γ)....................... 68 3.4.3 Subtyping (P ` τ ≺ τ 0)...................... 69 3.4.4 Valid type (P ` τ)......................... 69 3.4.5 Type reflexivity........................... 71 3.4.6 Type transitivity.......................... 71 3.5 Type checking................................ 71 3.5.1 Lookup functions.......................... 71 3.5.2 Well-formedness rules....................... 73 3.6 Proof of type soundness........................... 76 3.6.1 Configuration well-formedness (Γ ` config)........... 77 3.6.2 Helper lemmas........................... 78 3.6.3 Progress............................... 79 3.6.4 Type preservation.......................... 81 3.7 Conclusion................................. 84 4 Lightweight Java Module System (LJAM) 85 4.1 An informal description........................... 87 4.2 Syntax.................................... 88 4.2.1 Compile-time code vs. runtime code................ 88 4.2.2 LJAM’s context (ctx)........................ 89 4.2.3 User syntax............................. 90 4.2.4 Inner syntax............................. 91 4.3 Operational semantics............................ 91 10 4.3.1 Lookup functions.......................... 92 4.3.2 Administrator actions........................ 98 4.3.3 Context insertion.......................... 100 4.4 Type system................................. 101 4.4.1 Type (τ)............................... 101 4.4.2 Subtyping (P ` τ ≺ τ 0)...................... 101 4.4.3 Type reflexivity........................... 102 4.4.4 Type transitivity.......................... 102 4.5 Type checking................................ 103 4.6 Proof of type soundness........................... 106 4.6.1 Progress............................... 106 4.6.2 Type preservation.......................... 107 4.7 Recent changes to the Java Module System................ 109 4.8 Conclusion................................. 109 5 Problems with the Java Module System 111 5.1 Class resolution............................... 111 5.1.1 Unintuitive class resolution..................... 112 5.1.2 Inexpressive class resolution.................... 113 5.2 Inflexible module instantiation....................... 115 5.3 Shallow validation............................. 118 5.4 A stronger form of information hiding................... 118 5.5 Conclusion................................. 119 6 Improved Java Module System (iJAM) 121 6.1 Syntax.................................... 121 6.1.1 User syntax............................. 121 6.1.2 Inner syntax............................. 122 6.2 Operational semantics............................ 123 6.2.1 Adapted class resolution...................... 123 6.2.2 Replication policies......................... 125 6.3 Type system................................. 127 6.4 Type checking................................ 127 6.5 Proof of type soundness........................... 128 6.5.1 Well-formedness for boundary renaming............. 129 6.6 Reuse within the definitions and proof scripts............... 130 7 Implementation 131 7.1 Overview.................................. 131 11 7.2 Creation of module instances........................ 132 7.3 Class resolution............................... 132 7.4 Making the JVM use our code....................... 134 7.5 Example

Formalising, Improving, and Reusing the Java Module System

Jpype Documentation Release 0.7.0

Chargeurs De Classes Java (Classloader)

How Java Works JIT, Just in Time Compiler Loading .Class Files The

An Empirical Evaluation of Osgi Dependencies Best Practices in the Eclipse IDE Lina Ochoa, Thomas Degueule, Jurgen Vinju

Kawa - Compiling Dynamic Languages to the Java VM

Components: Concepts and Techniques

Towards Integrating Modeling and Programming Languages: the Case of UML and Java⋆

Analysing Class Loading Conflicts on Weblogic

Understanding the Java Classloader Skill Level: Introductory

Supporting Resource Awareness in Managed Runtime Environment Inti Yulien Gonzalez Herrera

CICS JVM Server Application Isolation

How Java Works