Toba: Java for Applications a Way Ahead of Time (WAT) Compiler

Total Page:16

File Type:pdf, Size:1020Kb

Toba: Java for Applications a Way Ahead of Time (WAT) Compiler Proceedings of the Third Conference on Object-Oriented Technologies and Systems (COOTS '97) 1 Toba: Java For Applications A Way Ahead of Time (WAT) Compiler Todd A. Proebsting Gregg Townsend Patrick Bridges John H. Hartman Tim Newsham Scott A. Watterson The University of Arizona Abstract machine code once. For example, a compiled C pro- gram runs 1.5-2.2 times faster than the equivalent JIT- Toba is a system for generating ef®cient standalone Java compiled Java program, and 2.6-4.2 times faster than an applications. Toba includes a Java-bytecode-to-C com- interpreted Java program. piler, a garbage collector, a threads package, and Java These performance penalties are especially bother- API support. Toba-compiled Java applications execute some in non-mobile applications that are run many times 1.5±4.2 times faster than interpreted and Just-In-Time without change. To combat these inherent performance compiled applications. penalties we have developed a Java system that pre- compiles Java class ®les into machine code. Our system, 1 Introduction Toba,1 ®rst translates Java class ®les into C code, then compiles the C into machine code. The resulting object Java [GYT96] is an object-oriented language designed ®les are linked with the Toba run-time system to create by Sun Microsystems that supports mobile code, i.e., ex- traditional executable ®les. To distinguish our technique ecutable code that runs on a variety of platforms. Al- from JIT compilation, we have (somewhat facetiously) though the language is interesting in its own right, Java's coined the phrase Way-Ahead-of-Time (WAT) compiler popularity stems from its promise of ªwrite once, run to describe Toba. Toba compiles Java programs into ma- anywhere.º Mobile code proponents envision a future of chine code during program development, eliminating the location-independentcode moving about the Internet and need for interpretation or JIT compilation of bytecodes. running on any platform. Although we forfeit Java's architecture-neutral distribu- Java's mobility is achieved by compiling its object tion, Toba-generated executables are 1.5-4.4 times faster classes into a distribution format called a class ®le.A than alternative JVM implementations. class ®le contains information about the Java class, in- Toba has several advantages over interpretation or cluding bytecodes, an architecturally-neutral representa- JIT-compilation. First, because Toba runs way-ahead- tion of the instructions associated with the class's meth- of-time, rather than just-in-time, the resulting machine ods. A class ®le can execute on any computer supporting code can be more heavily optimized to yield more ef- the Java Virtual Machine (JVM). Java's code mobility, ®cient executables. Second, because Toba creates a C- therefore, depends on both architecture-neutral class ®les equivalent to the Java program, the standard C debug- and the implicit assumption that the JVM is supported on ging and pro®ling tools can operate on Toba-generated every client machine. executables. Third, because Toba executables include all Most JVM implementations execute bytecodes via class ®les used by the application, there is no possibility interpretation or Just-In-Time (JIT) compilation, which of an application suddenly ceasing to execute because of compiles the bytecodes into machine code at run time. a change in available class ®les. For these reasons we be- Thus, Java's mobility comes at a price, exacted by the lieve that WAT-compilation is valuable for the develop- cost of interpreting or JIT-compiling the bytecodes every ment and distribution of ef®cient Java programs. time the program is executed. These systems incur mod- Toba consists of many components: a bytecode-to-C est to severe performance penalties compared to more translator, a garbage collector, a threads package, a run- traditional systems that compile source code directly to time library, and native routines implementing the Java API. Toba is a surprisingly small system: the transla- Address: Department of Computer Science, University of Ari- 1 zona, Tucson, AZ 85721; Email: ftodd, gmt, bridges, jhh, newsham, Lake Toba is a prominent feature on Sumatra, the island just west [email protected]. of Java. Proceedings of the Third Conference on Object-Oriented Technologies and Systems (COOTS '97) 2 tor is only 5000 lines of Java; the garbage collector is a 3.1 Naming modestly-altered version of the Boehm-Demers-Weiser Toba attempts to preserve Java names in the C it pro- conservative collector [BW88]; the threads package is duces, although this isn't always possible. Java names builton top of Solaristhreads; the run-timelibraryis only may draw from thousands of different Unicode charac- 6500 lines of C; and the API routines are simply transla- ters whereas C names are limited to just 63 ASCII char- tions of Sun's API class ®les. Except for dynamic link- acters. Furthermore, some legal Java names such as ing, Toba provides a complete Java execution environ- enum and setjmp have special meaning in C. When ment. a Java name cannot be used directly as a C name, Toba discards non-C characters, adds a hash-code suf®x, and 2 The Java Virtual Machine additionally adds a pre®x character if the resulting name begins with a digit or other illegal character. The Java Virtual Machine (JVM) de®nes a stack-based Java method names always require hash-code suf- virtual machine that executes Java class ®les [LY97]. ®xes. Toba translates each Java method into a C function, Each Java class compiles into a separate class ®le con- and these functions share a global namespace. Because taining information describing the class's inheritance, Java methods may be overloaded among and within ®elds, methods, etc., as well as nearly all of the compile- classes, a hash-code suf®x is added to distinguish the time type information. The Java bytecodes form the methods. The suf®x encodes the class name, the method JVM's instruction set, and combine simple arithmetic name, and the method signature. and control-¯ow operators with operators speci®c to the Java language's object model. Powerful object-level in- structions include those to access static and instance vari- 3.2 Data Layout ables, and those to invoke static, virtual, nonvirtual and Java includes eight primitivetypes: byte, short, int, long, interface functions. The JVM also includes an exception boolean, char, ¯oat, and double. Each translates into a mechanism for handling abnormal conditions that arise primitive C type. (Note that Java's ªcharº type repre- during execution. sents a 16-bit Unicode value.) The JVM also provides facilities for managing ob- All other Java types are reference types that subclass jects and concurrency. The JVM implements a garbage- the root class, java.lang.Object. All reference collected object allocation model, with facilities for ini- types are translated into a C pointer type. Each reference tializing and ®nalizing objects. Concurrency is provided points to an object instance, and all instances of a par- through a thread abstraction. Threads are pre-emptive ticular class contain a class-pointer to a common class and scheduled according to priority. A monitor facility structure. Java has two different kinds of objects: array provides mutual exclusion on critical sections as well as objects and ordinary objects. The Toba structure for or- thread scheduling through wait/notify primitives. Moni- dinary objects appears in Figure 1. An ordinary object's tors are recursive, allowing a single thread to acquire the class descriptor includes the instance size and a ¯ag that same monitor lock multiple times without deadlocking. indicates it is not an array. The Toba structure for array objects appears in Figure 2. An array's class descriptor 3 Toba's Run-Time Data Structures includes the element size and its ¯ag indicates that it rep- resents an array. Array instances contain both a length Java's rich object model requires run-time data struc- ®eld and a vector of elements. tures to describe each object's type and methods. We de- Each per-class run-time structure has three parts: veloped our data structures with both performance and general informationthat is needed for all classes (e.g., su- simplicity in mind. They differ in many respects from perclass information), a method table that contains point- those of Sun's implementation of Java. For instance, ers to virtual functions, and a table of class variables. Sun's implementation requires that all object references Figure 3 summarizes run-time class-level information go through a handle, which represents an extra level of common to all classes. indirection, an added inef®ciency, and an extra compli- The method table is simply a vector of function point- cation. Toba accesses objects directly. The differences ers and unique method identi®ers. The method identi- are invisible to Java programmers but important to au- ®ers are used when invoking interface functions, which thors of native methods. must be found at run-time. The structure of the method table is typical of statically-bound object-oriented lan- guages likeOberon-2 [MW91] and C++ [Str86]. Method tables include inherited methods as well as functions de- ®ned by the class itself. Proceedings of the Third Conference on Object-Oriented Technologies and Systems (COOTS '97) 3 Object Pointer Object Instance Class Struct - - Array Bit = 0 Instance Variables Instance Size . Figure 1: Ordinary Object Structure Object Pointer Array Instance Class Struct - - Length Array Bit = 1 Element Size = 1,2,4,8 . Figure
Recommended publications
  • Compilers & Translator Writing Systems
    Compilers & Translators Compilers & Translator Writing Systems Prof. R. Eigenmann ECE573, Fall 2005 http://www.ece.purdue.edu/~eigenman/ECE573 ECE573, Fall 2005 1 Compilers are Translators Fortran Machine code C Virtual machine code C++ Transformed source code Java translate Augmented source Text processing language code Low-level commands Command Language Semantic components Natural language ECE573, Fall 2005 2 ECE573, Fall 2005, R. Eigenmann 1 Compilers & Translators Compilers are Increasingly Important Specification languages Increasingly high level user interfaces for ↑ specifying a computer problem/solution High-level languages ↑ Assembly languages The compiler is the translator between these two diverging ends Non-pipelined processors Pipelined processors Increasingly complex machines Speculative processors Worldwide “Grid” ECE573, Fall 2005 3 Assembly code and Assemblers assembly machine Compiler code Assembler code Assemblers are often used at the compiler back-end. Assemblers are low-level translators. They are machine-specific, and perform mostly 1:1 translation between mnemonics and machine code, except: – symbolic names for storage locations • program locations (branch, subroutine calls) • variable names – macros ECE573, Fall 2005 4 ECE573, Fall 2005, R. Eigenmann 2 Compilers & Translators Interpreters “Execute” the source language directly. Interpreters directly produce the result of a computation, whereas compilers produce executable code that can produce this result. Each language construct executes by invoking a subroutine of the interpreter, rather than a machine instruction. Examples of interpreters? ECE573, Fall 2005 5 Properties of Interpreters “execution” is immediate elaborate error checking is possible bookkeeping is possible. E.g. for garbage collection can change program on-the-fly. E.g., switch libraries, dynamic change of data types machine independence.
    [Show full text]
  • SJC – Small Java Compiler
    SJC – Small Java Compiler User Manual Stefan Frenz Translation: Jonathan Brase original version: 2011/06/08 latest document update: 2011/08/24 Compiler Version: 182 Java is a registered trademark of Oracle/Sun Table of Contents 1 Quick introduction.............................................................................................................................4 1.1 Windows...........................................................................................................................................4 1.2 Linux.................................................................................................................................................5 1.3 JVM Environment.............................................................................................................................5 1.4 QEmu................................................................................................................................................5 2 Hello World.......................................................................................................................................6 2.1 Source Code......................................................................................................................................6 2.2 Compilation and Emulation...............................................................................................................8 2.3 Details...............................................................................................................................................9
    [Show full text]
  • Source-To-Source Translation and Software Engineering
    Journal of Software Engineering and Applications, 2013, 6, 30-40 http://dx.doi.org/10.4236/jsea.2013.64A005 Published Online April 2013 (http://www.scirp.org/journal/jsea) Source-to-Source Translation and Software Engineering David A. Plaisted Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, USA. Email: [email protected] Received February 5th, 2013; revised March 7th, 2013; accepted March 15th, 2013 Copyright © 2013 David A. Plaisted. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. ABSTRACT Source-to-source translation of programs from one high level language to another has been shown to be an effective aid to programming in many cases. By the use of this approach, it is sometimes possible to produce software more cheaply and reliably. However, the full potential of this technique has not yet been realized. It is proposed to make source- to-source translation more effective by the use of abstract languages, which are imperative languages with a simple syntax and semantics that facilitate their translation into many different languages. By the use of such abstract lan- guages and by translating only often-used fragments of programs rather than whole programs, the need to avoid writing the same program or algorithm over and over again in different languages can be reduced. It is further proposed that programmers be encouraged to write often-used algorithms and program fragments in such abstract languages. Libraries of such abstract programs and program fragments can then be constructed, and programmers can be encouraged to make use of such libraries by translating their abstract programs into application languages and adding code to join things together when coding in various application languages.
    [Show full text]
  • Tdb: a Source-Level Debugger for Dynamically Translated Programs
    Tdb: A Source-level Debugger for Dynamically Translated Programs Naveen Kumar†, Bruce R. Childers†, and Mary Lou Soffa‡ †Department of Computer Science ‡Department of Computer Science University of Pittsburgh University of Virginia Pittsburgh, Pennsylvania 15260 Charlottesville, Virginia 22904 {naveen, childers}@cs.pitt.edu [email protected] Abstract single stepping through program execution, watching for par- ticular conditions and requests to add and remove breakpoints. Debugging techniques have evolved over the years in response In order to respond, the debugger has to map the values and to changes in programming languages, implementation tech- statements that the user expects using the source program niques, and user needs. A new type of implementation vehicle viewpoint, to the actual values and locations of the statements for software has emerged that, once again, requires new as found in the executable program. debugging techniques. Software dynamic translation (SDT) As programming languages have evolved, new debugging has received much attention due to compelling applications of techniques have been developed. For example, checkpointing the technology, including software security checking, binary and time stamping techniques have been developed for lan- translation, and dynamic optimization. Using SDT, program guages with concurrent constructs [6,19,30]. The pervasive code changes dynamically, and thus, debugging techniques use of code optimizations to improve performance has necessi- developed for statically generated code cannot be used to tated techniques that can respond to queries even though the debug these applications. In this paper, we describe a new optimization may have changed the number of statement debug architecture for applications executing with SDT sys- instances and the order of execution [15,25,29].
    [Show full text]
  • Java (Programming Langua a (Programming Language)
    Java (programming language) From Wikipedia, the free encyclopedialopedia "Java language" redirects here. For the natural language from the Indonesian island of Java, see Javanese language. Not to be confused with JavaScript. Java multi-paradigm: object-oriented, structured, imperative, Paradigm(s) functional, generic, reflective, concurrent James Gosling and Designed by Sun Microsystems Developer Oracle Corporation Appeared in 1995[1] Java Standard Edition 8 Update Stable release 5 (1.8.0_5) / April 15, 2014; 2 months ago Static, strong, safe, nominative, Typing discipline manifest Major OpenJDK, many others implementations Dialects Generic Java, Pizza Ada 83, C++, C#,[2] Eiffel,[3] Generic Java, Mesa,[4] Modula- Influenced by 3,[5] Oberon,[6] Objective-C,[7] UCSD Pascal,[8][9] Smalltalk Ada 2005, BeanShell, C#, Clojure, D, ECMAScript, Influenced Groovy, J#, JavaScript, Kotlin, PHP, Python, Scala, Seed7, Vala Implementation C and C++ language OS Cross-platform (multi-platform) GNU General Public License, License Java CommuniCommunity Process Filename .java , .class, .jar extension(s) Website For Java Developers Java Programming at Wikibooks Java is a computer programming language that is concurrent, class-based, object-oriented, and specifically designed to have as few impimplementation dependencies as possible.ble. It is intended to let application developers "write once, run ananywhere" (WORA), meaning that code that runs on one platform does not need to be recompiled to rurun on another. Java applications ns are typically compiled to bytecode (class file) that can run on anany Java virtual machine (JVM)) regardless of computer architecture. Java is, as of 2014, one of tthe most popular programming ng languages in use, particularly for client-server web applications, witwith a reported 9 million developers.[10][11] Java was originallyy developed by James Gosling at Sun Microsystems (which has since merged into Oracle Corporation) and released in 1995 as a core component of Sun Microsystems'Micros Java platform.
    [Show full text]
  • The Java® Language Specification Java SE 8 Edition
    The Java® Language Specification Java SE 8 Edition James Gosling Bill Joy Guy Steele Gilad Bracha Alex Buckley 2015-02-13 Specification: JSR-337 Java® SE 8 Release Contents ("Specification") Version: 8 Status: Maintenance Release Release: March 2015 Copyright © 1997, 2015, Oracle America, Inc. and/or its affiliates. 500 Oracle Parkway, Redwood City, California 94065, U.S.A. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. The Specification provided herein is provided to you only under the Limited License Grant included herein as Appendix A. Please see Appendix A, Limited License Grant. To Maurizio, with deepest thanks. Table of Contents Preface to the Java SE 8 Edition xix 1 Introduction 1 1.1 Organization of the Specification 2 1.2 Example Programs 6 1.3 Notation 6 1.4 Relationship to Predefined Classes and Interfaces 7 1.5 Feedback 7 1.6 References 7 2 Grammars 9 2.1 Context-Free Grammars 9 2.2 The Lexical Grammar 9 2.3 The Syntactic Grammar 10 2.4 Grammar Notation 10 3 Lexical Structure 15 3.1 Unicode 15 3.2 Lexical Translations 16 3.3 Unicode Escapes 17 3.4 Line Terminators 19 3.5 Input Elements and Tokens 19 3.6 White Space 20 3.7 Comments 21 3.8 Identifiers 22 3.9 Keywords 24 3.10 Literals 24 3.10.1 Integer Literals 25 3.10.2 Floating-Point Literals 31 3.10.3 Boolean Literals 34 3.10.4 Character Literals 34 3.10.5 String Literals 35 3.10.6 Escape Sequences for Character and String Literals 37 3.10.7 The Null Literal 38 3.11 Separators
    [Show full text]
  • Virtual Machine Part II: Program Control
    Virtual Machine Part II: Program Control Building a Modern Computer From First Principles www.nand2tetris.org Elements of Computing Systems, Nisan & Schocken, MIT Press, www.nand2tetris.org , Chapter 8: Virtual Machine, Part II slide 1 Where we are at: Human Abstract design Software abstract interface Thought Chapters 9, 12 hierarchy H.L. Language Compiler & abstract interface Chapters 10 - 11 Operating Sys. Virtual VM Translator abstract interface Machine Chapters 7 - 8 Assembly Language Assembler Chapter 6 abstract interface Computer Machine Architecture abstract interface Language Chapters 4 - 5 Hardware Gate Logic abstract interface Platform Chapters 1 - 3 Electrical Chips & Engineering Hardware Physics hierarchy Logic Gates Elements of Computing Systems, Nisan & Schocken, MIT Press, www.nand2tetris.org , Chapter 8: Virtual Machine, Part II slide 2 The big picture Some . Some Other . Jack language language language Chapters Some Jack compiler Some Other 9-13 compiler compiler Implemented in VM language Projects 7-8 VM implementation VM imp. VM imp. VM over the Hack Chapters over CISC over RISC emulator platforms platforms platform 7-8 A Java-based emulator CISC RISC is included in the course written in Hack software suite machine machine . a high-level machine language language language language Chapters . 1-6 CISC RISC other digital platforms, each equipped Any Hack machine machine with its VM implementation computer computer Elements of Computing Systems, Nisan & Schocken, MIT Press, www.nand2tetris.org , Chapter 8: Virtual Machine,
    [Show full text]
  • A Brief History of Just-In-Time Compilation
    A Brief History of Just-In-Time JOHN AYCOCK University of Calgary Software systems have been using “just-in-time” compilation (JIT) techniques since the 1960s. Broadly, JIT compilation includes any translation performed dynamically, after a program has started execution. We examine the motivation behind JIT compilation and constraints imposed on JIT compilation systems, and present a classification scheme for such systems. This classification emerges as we survey forty years of JIT work, from 1960–2000. Categories and Subject Descriptors: D.3.4 [Programming Languages]: Processors; K.2 [History of Computing]: Software General Terms: Languages, Performance Additional Key Words and Phrases: Just-in-time compilation, dynamic compilation 1. INTRODUCTION into a form that is executable on a target platform. Those who cannot remember the past are con- What is translated? The scope and na- demned to repeat it. ture of programming languages that re- George Santayana, 1863–1952 [Bartlett 1992] quire translation into executable form covers a wide spectrum. Traditional pro- This oft-quoted line is all too applicable gramming languages like Ada, C, and in computer science. Ideas are generated, Java are included, as well as little lan- explored, set aside—only to be reinvented guages [Bentley 1988] such as regular years later. Such is the case with what expressions. is now called “just-in-time” (JIT) or dy- Traditionally, there are two approaches namic compilation, which refers to trans- to translation: compilation and interpreta- lation that occurs after a program begins tion. Compilation translates one language execution. into another—C to assembly language, for Strictly speaking, JIT compilation sys- example—with the implication that the tems (“JIT systems” for short) are com- translated form will be more amenable pletely unnecessary.
    [Show full text]
  • Building Useful Program Analysis Tools Using an Extensible Java Compiler
    Building Useful Program Analysis Tools Using an Extensible Java Compiler Edward Aftandilian, Raluca Sauciuc Siddharth Priya, Sundaresan Krishnan Google, Inc. Google, Inc. Mountain View, CA, USA Hyderabad, India feaftan, [email protected] fsiddharth, [email protected] Abstract—Large software companies need customized tools a specific task, but they fail for several reasons. First, ad- to manage their source code. These tools are often built in hoc program analysis tools are often brittle and break on an ad-hoc fashion, using brittle technologies such as regular uncommon-but-valid code patterns. Second, simple ad-hoc expressions and home-grown parsers. Changes in the language cause the tools to break. More importantly, these ad-hoc tools tools don’t provide sufficient information to perform many often do not support uncommon-but-valid code code patterns. non-trivial analyses, including refactorings. Type and symbol We report our experiences building source-code analysis information is especially useful, but amounts to writing a tools at Google on top of a third-party, open-source, extensible type-checker. Finally, more sophisticated program analysis compiler. We describe three tools in use on our Java codebase. tools are expensive to create and maintain, especially as the The first, Strict Java Dependencies, enforces our dependency target language evolves. policy in order to reduce JAR file sizes and testing load. The second, error-prone, adds new error checks to the compilation In this paper, we present our experience building special- process and automates repair of those errors at a whole- purpose tools on top of the the piece of software in our codebase scale.
    [Show full text]
  • Language Translators
    Student Notes Theory LANGUAGE TRANSLATORS A. HIGH AND LOW LEVEL LANGUAGES Programming languages Low – Level Languages High-Level Languages Example: Assembly Language Example: Pascal, Basic, Java Characteristics of LOW Level Languages: They are machine oriented : an assembly language program written for one machine will not work on any other type of machine unless they happen to use the same processor chip. Each assembly language statement generally translates into one machine code instruction, therefore the program becomes long and time-consuming to create. Example: 10100101 01110001 LDA &71 01101001 00000001 ADD #&01 10000101 01110001 STA &71 Characteristics of HIGH Level Languages: They are not machine oriented: in theory they are portable , meaning that a program written for one machine will run on any other machine for which the appropriate compiler or interpreter is available. They are problem oriented: most high level languages have structures and facilities appropriate to a particular use or type of problem. For example, FORTRAN was developed for use in solving mathematical problems. Some languages, such as PASCAL were developed as general-purpose languages. Statements in high-level languages usually resemble English sentences or mathematical expressions and these languages tend to be easier to learn and understand than assembly language. Each statement in a high level language will be translated into several machine code instructions. Example: number:= number + 1; 10100101 01110001 01101001 00000001 10000101 01110001 B. GENERATIONS OF PROGRAMMING LANGUAGES 4th generation 4GLs 3rd generation High Level Languages 2nd generation Low-level Languages 1st generation Machine Code Page 1 of 5 K Aquilina Student Notes Theory 1. MACHINE LANGUAGE – 1ST GENERATION In the early days of computer programming all programs had to be written in machine code.
    [Show full text]
  • SUBJECT-COMPUTER CLASS-12 CHAPTER 9 – Compiling and Running Java Programs
    SUBJECT-COMPUTER CLASS-12 CHAPTER 9 – Compiling and Running Java Programs Introduction to Java programming JAVA was developed by Sun Microsystems Inc in 1991, later acquired by Oracle Corporation. It was developed by James Gosling and Patrick Naughton. It is a simple programming language. Writing, compiling and debugging a program is easy in java. It helps to create modular programs and reusable code. Bytecode javac compiler of JDK compiles the java source code into bytecode so that it can be executed by JVM. The bytecode is saved in a .class file by compiler. Java Virtual Machine (JVM) This is generally referred as JVM. Before, we discuss about JVM lets see the phases of program execution. Phases are as follows: we write the program, then we compile the program and at last we run the program. 1) Writing of the program is of course done by java programmer. 2) Compilation of program is done by javac compiler, javac is the primary java compiler included in java development kit (JDK). It takes java program as input and generates java bytecode as output. 3) In third phase, JVM executes the bytecode generated by compiler. This is called program run phase. So, now that we understood that the primary function of JVM is to execute the bytecode produced by compiler. Characteristics of Java Simple Java is very easy to learn, and its syntax is simple, clean and easy to understand. Multi-threaded Multithreading capabilities come built right into the Java language. This means it is possible to build highly interactive and responsive apps with a number of concurrent threads of activity.
    [Show full text]
  • Coqjvm: an Executable Specification of the Java Virtual Machine Using
    CoqJVM: An Executable Specification of the Java Virtual Machine using Dependent Types Robert Atkey LFCS, School of Informatics, University of Edinburgh Mayfield Rd, Edinburgh EH9 3JZ, UK [email protected] Abstract. We describe an executable specification of the Java Virtual Machine (JVM) within the Coq proof assistant. The principal features of the development are that it is executable, meaning that it can be tested against a real JVM to gain confidence in the correctness of the specification; and that it has been written with heavy use of dependent types, this is both to structure the model in a useful way, and to constrain the model to prevent spurious partiality. We describe the structure of the formalisation and the way in which we have used dependent types. 1 Introduction Large scale formalisations of programming languages and systems in mechanised theorem provers have recently become popular [4–6, 9]. In this paper, we describe a formalisation of the Java virtual machine (JVM) [8] in the Coq proof assistant [11]. The principal features of this formalisation are that it is executable, meaning that a purely functional JVM can be extracted from the Coq development and – with some O’Caml glue code – executed on real Java bytecode output from the Java compiler; and that it is structured using dependent types. The motivation for this development is to act as a basis for certified consumer- side Proof-Carrying Code (PCC) [12]. We aim to prove the soundness of program logics and correctness of proof checkers against the model, and extract the proof checkers to produce certified stand-alone tools.
    [Show full text]