Implementing Higher-Level Languages

Implementing Higher-Level Languages Quick tour of programming language implementation techniques. From the Java level to the C level. Ahead-of-time compiler compile time C source x86 assembly x86 x86 machine code C compiler code assembler code x86 machine run time code x86 computer Output Data Figures for compilers/runtime systems adapted from slides by Steve Freund. 2 Typical Compiler Source Lexical Analyzer Program Syntax Analyzer Semantic Analyzer Analysis Intermediate Code Synthesis Generator Code Optimizer Code Generator Target Program 3 Interpreter Source Program Interpreter = virtual machine Output Data 4 Compilers... that target interpreters Java source Java code Java Compiler bytecod e Java bytecode Java Virtual Output Machine Data 6 Interpreters... that use compilers. Source Program Compiler Target Program Virtual Output Machine Data 8 JIT Compilers and Optimization Java source code just-in-time compiler x86 javac machine code bytecode Output Java Performance bytecode Monitor bytecode interpreter • HotSpot JVM Data • Jikes RVM • SpiderMonkey JVM • v8 • Transmeta • ... 9 Data in Java Arrays Every element initialized to 0 or null Immutable length field Since it has this info, what can it do? int array[5]: C ?? ?? ?? ?? ?? 0 4 20 24 Java 5 00 00 00 00 00 Data Representation in Java Data in Java Arrays Every element initialized to 0 or null Immutable length field Bounds-check every access. Bounds-checking sounds slow, but: int array[5]: 1. Length is likely in cache. 2. Compiler may store length in register C ?? ?? ?? ?? ?? for loops. 3. Compiler may prove that some checks 0 4 20 24 are redundant. Java 5 00 00 00 00 00 Data Representation in Java Data in Java Characters and strings 16-bit Unicode Explicit length, no null terminator the string ‘CS 240’: C: ASCII 43 53 20 32 34 30 \0 0 1 4 7 16 Java: Unicode 6 00 43 00 53 00 20 00 32 00 34 00 30 Data Representation in Java Data structures (objects) in Java C: programmer controls layout, inline vs. pointer. Java: objects always stored by reference, never stored inline. C struct rec { Java class Rec { int i; int i; int a[3]; int[] a = new int[3]; struct rec *p; Rec p; }; … } struct rec *r = malloc(...); r = new Rec(); struct rec r2; r2 = new Rec(); r->i = val; r.i = val; r->a[2] = val; r.a[2] = val; r->p = &r2; r.p = r2; i a p i a p 3 int[3] 0 4 16 24 0 4 8 16 Data Representation in Java 0 4 16 Pointers/References Pointers in C can point to any memory address References in Java can only point to [the starts of] objects And can only be dereferenced to access a field or element of that object C struct rec { Java class Rec { int i; int i; int a[3]; int[] a = new int[3]; struct rec *p; Rec p; }; } struct rec* r = malloc(…); Rec r = new Rec(); some_fn(&(r.a[1])) //ptr some_fn(r.a, 1) // ref, index r r x i a p i a p 3 int[3] 0 4 16 24 0 4 8 16 0 4 16 Data Representation in Java Java objects class Point { fields int x; int y; constructor Point() { x = 0; y = 0; } methods boolean samePlace(Point p) { return (x == p.x) && (y == p.y); } String toString() { return "(" + x + "," + y + ")"; } } 21 Java objects code:Point() Point object Point class vtable code:Point.samePlace() p vtable pointer x code:Point.toString() y object Point vtable pointer x q y For each class, compiler maps: field signature à offset (index) vtable pointer : points to per-class virtual method table (vtable) For each class, compiler maps: method signature à index samePlace: 0 toString: 1 23 Implementing dynamic dispatch code:Point() Point object Point class vtable code:Point.samePlace() p vtable pointer x code:Point.toString() y object Point vtable pointer q x q y what happens (pseudo code): Java: Point* p = calloc(1,sizeof(Point)); Point p = new Point(); p->header = ...; p->vtable = Point_vtable; Point_constructor(p); return p.samePlace(q); return p->vtable[0](this=p, q); Subclassing class ColorPoint extends Point{ String color; boolean getColor() { return color; } String toString() { return super.toString() + "[" + color + "]"; } } How do we access superclass pieces? fields inherited methods Where do we put extensions? new field new method overriding method dynamic (method) dispatch Java: Point p = ???; what happens (pseudo code): return p.toString(); return p.vtable[1](p); Point vtable Point object vtable x code:Point.samePlace() y code:Point.toString() ColorPoint object ColorPoint vtable vtable x code:ColorPoint.toString() y color code:ColorPoint.getColor().

Implementing Higher-Level Languages

The Architecture of Decentvm

Dynamically Loaded Classes As Shared Libraries: an Approach to Improving Virtual Machine Scalability

Proceedings of the Third Virtual Machine Research and Technology Symposium

Design and Implementation of a Scala Compiler Backend Targeting the Low Level Virtual Machine Geoffrey Reedy

Java and C CSE351, Summer 2018

G22.2110-003 Programming Languages - Fall 2012 Lecture 12

Strict Protection for Virtual Function Calls in COTS C++ Binaries

Note to Users

PDF Version of Thesis

Artdroid: a Virtual-Method Hooking Framework on Android ART Runtime

18 Advanced HLA Programming

Buffer Overflow and Other Memory Corruption Attacks