Harry Xu May 2012 Complex, concurrent software
Precision (no false positives) Find real bugs in real executions Need to modify JVM (e.g., object layout, GC, or ISA-level code)
Need to demonstrate realism (usually performance) Otherwise use RoadRunner, BCEL, Pin, LLVM, … Keeping track of stuff as the program executes?
Change application behavior (add instrumentation) Store per-object/per-field metadata Piggyback on GC
Keeping track of stuff as the program executes?
JVM written in Java?! Change application behavior (add instrumentation) Store per-object/per-field metadata Piggyback on GC Uninterruptible code Guide Jikes RVM Research Archive { Research mailing list Guide Jikes RVM Research Archive { Research mailing list Jikes RVM source code
Boot image writer
Dynamic compilers Jikes RVM source code
Boot image Run with writer another JVM
Dynamic compilers Jikes RVM source code
Boot image Run with writer another JVM
Dynamic compilers Boot image (native code + initial heap space) Build configurations: Jikes RVM source code BaseBase BaseAdaptive FullAdaptive Boot image Run with FastAdaptive writer another JVM
Dynamic compilers Boot image (native code + initial heap space) Build configurations: Jikes RVM source code BaseBase (prototype) BaseAdaptive (prototype-opt) FullAdaptive (development) Boot image Run with FastAdaptive (production) writer another JVM
Dynamic compilers Boot image (native code + initial heap space) Build configurations: BaseBase Jikes RVM source code Testing BaseAdaptive FullAdaptive Boot image Run with FastAdaptive writer another JVM
Dynamic compilers Boot image (native code + initial heap space) Build configurations: Jikes RVM source code BaseBase Faster builds BaseAdaptive FullAdaptive Boot image Run with FastAdaptive writer another JVM
Dynamic compilers Boot image (native code + initial heap space) Build configurations: Jikes RVM source code BaseBase BaseAdaptive Faster runs FullAdaptive Boot image Run with FastAdaptive writer another JVM
Dynamic compilers Boot image (native code + initial heap space) Build configurations: Jikes RVM source code BaseBase BaseAdaptive FullAdaptive Performance Boot image Run with FastAdaptive writer another JVM
Dynamic compilers Boot image (native code + initial heap space) Jikes RVM source code
Boot image Run with writer another JVM
Dynamic compilers Boot image (native code + initial heap space) Jikes RVM source code Edit with Eclipse (see Guide) Boot image Run with writer another JVM
Dynamic compilers Boot image (native code + initial heap space) Keeping track of stuff as the program executes?
Change application behavior (add instrumentation) Store per-object/per-field metadata Piggyback on GC Baseline compiler Bytecode Native code Baseline compiler Bytecode Native code
Each bytecode several x86 instructions
(BaselineCompilerImpl.java) Baseline compiler Bytecode Native code
Each bytecode several x86 instructions
(BaselineCompilerImpl.java)
Baseline compiler Bytecode Native code Profiling
Adaptive optimization system Baseline compiler Bytecode Native code Profiling
(Faster) Optimizing native code compiler
Adaptive optimization system Baseline compiler Bytecode Native code Profiling
(Faster) Optimizing native code compiler
Adaptive optimization system Bytecode
(Faster) Optimizing native code compiler Bytecode Resembles bytecode HIR
(Faster) native code Resembles typical compiler IR LIR (3-address code)
MIR Resembles assembly code Bytecode
HIR
(Faster) native code LIR Opt levels: 0, 1, 2
(Even faster) MIR native code Bytecode
HIR
ExpandRuntimeServices.java (Faster) Add instrumentation at native code reads, writes, allocation, LIR synchronization
MIR
Keeping track of stuff as the program executes?
Change application behavior (add instrumentation) Store per-object/per-field metadata Piggyback on GC header field0 field1 field2
Low address High address Object reference
type info locking & field0 field1 field2 block GC Object reference
type info locking & field0 field1 field2 block GC
Object reference
type info locking & Array elem0 elem1 block GC length Object reference
type info locking & field0 field1 field2 block GC
Steal bits Object reference
type info locking & misc field0 field1 field2 block GC
MiscHeader.java Object reference
type info locking & counter field0 field1 field2 block GC Object reference
type info locking & counter field0 field1 field2 block GC
Magic! Compiles down to three x86 instructions Object reference
type info locking & counter field0 field1 field2 block GC
2
Gotcha: can’t actually use LSB of leftmost word Object reference
type info locking & counter field0 field1 field2 block GC
2
What’s the problem with this code? Object reference
type info locking & counter field0 field1 field2 block GC
2 Object reference
type info locking & not used field0 field1 field2 block GC Object reference
type info locking & not used field0 field1 field2 block GC
Compiles down to three x86 instructions Object reference
type info locking & misc field0 field1 field2 block GC Object reference
type info locking & misc field0 field1 field2 block GC
What if GC moves object? What if GC collects object? Keeping track of stuff as the program executes?
Change application behavior (add instrumentation) Store per-object/per-field metadata Piggyback on GC Object reference
type info locking & field0 field1 field2 block GC
// Initially worklist populated with roots while worklist has elements Object obj = worklist.pop() foreach reference field obj.f obj.f = markAndPossiblyCopy(obj.f) worklist.push(obj.f) Object reference
type info locking & field0 field1 field2 block GC
// Initially worklist populated with roots while worklist has elements Object obj = worklist.pop() foreach reference field obj.f obj.f = markAndPossiblyCopy(obj.f) worklist.push(obj.f) Object reference
type info locking & field0 field1 field2 block GC
// Initially worklist populated with roots while worklist has elements Object obj = worklist.pop() foreach reference field obj.f obj.f = markAndPossiblyCopy(obj.f) worklist.push(obj.f) Object reference
type info locking & misc field0 field1 field2 block GC
// Initially worklist populated with roots while worklist has elements Object obj = worklist.pop() foreach reference field obj.f obj.f = markAndPossiblyCopy(obj.f) worklist.push(obj.f) obj.misc = markAndPossiblyCopy(obj.f) worklist.push(obj.misc) Object reference
type info locking & misc field0 field1 field2 block GC
// Initially worklist populated with roots while worklist has elements Object obj = worklist.pop() foreach reference field obj.f obj.f = markAndPossiblyCopy(obj.f) worklist.push(obj.f) obj.misc = markAndPossiblyCopy(obj.f) worklist.push(obj.misc) Object reference
type info locking & misc field0 field1 field2 block GC
// Initially worklist populated with roots while worklist has elements Object obj = worklist.pop() foreach reference field obj.f obj.f = markAndPossiblyCopy(obj.f) worklist.push(obj.f) obj.misc = markAndPossiblyCopy(obj.f) worklist.push(obj.misc) TraceLocal.scanObject() Object reference
type info locking & field0 field1 field2 block GC
// Initially worklist populated with roots while worklist has elements Object obj = worklist.pop() foreach reference field obj.f obj.f = markAndPossiblyCopy(obj.f) worklist.push(obj.f) Object reference
type info locking & field0 field1 field2 block GC
// Initially worklist populated with roots while worklist has elements Object obj = worklist.pop() foreach reference field obj.f obj.f = markAndPossiblyCopy(obj.f) worklist.push(obj.f) TraceLocal.processNode() Keeping track of stuff as the program executes?
Change application behavior (add instrumentation) Store per-object/per-field metadata Piggyback on GC Uninterruptible code Normal application code can be interrupted . Allocation GC . Synchronization & yield points join a GC Some VM code shouldn’t be interrupted . Heap etc. in inconsistent state Most instrumentation can’t be interrupted . Reads & writes aren’t GC-safe points @Uninterruptible static void myMethod(Object o) {
// No allocation or synchronization
// No calls to interruptible methods
} @Uninterruptible static void myMethod(Object o) {
currentThread.deferGC = true; Metadata m = new Metadata(); currentThread.deferGC = false;
setMiscHeader(o, offset, m); } Need to modify JVM internals Need to demonstrate realism
Guide Overview of other tasks & components
Jikes RVM Research Archive Dynamic analysis examples
Help (especially Research mailing list for novices) Object layout . Extra bits or words in header . Stealing bits from references . Discuss magic here Adding instrumentation . Baseline & optimizing compilers . Allocation sites; reads & writes . Inlining instrumentation Garbage collection . Piggybacking on GC . New spaces Low-level stuff . Uninterruptible code . Walking the stack Concurrency . Atomic stores . Thread-local data