A Song of JIT and GC

Game of Performance
A Song of JIT and GC
QCon London 2016
Monica Beckwith
[email protected]; @mon_beck
https://www.linkedin.com/in/monicabeckwith
www.codekaram.com

About Me
• Java/JVM/GC Performance Engineer/Consultant
• Worked at AMD, Sun, Oracle…
• Worked with HotSpot JVM for more than a decade
• JVM heuristics, JIT compiler, GCs: Parallel(Old) GC, G1 GC, CMS GC
©2016 CodeKaram

Many Thanks
• Vladimir Kozlov (Hotspot JIT Team) - for clearing my understanding of tiered compilation, escape analysis and nuances of dynamic deoptimizations.
• Jon Masamitsu (Hotspot GC Team) - for keeping me on track with JDK8 changes that related to GC.

Agenda
• The Helpers within the JVM
  • JIT & Runtime
    • Adaptive Optimization
      • CompileThreshold
      • Inlining
      • Dynamic Deoptimization
    • Tiered Compilation
    • Intrinsics
    • Escape Analysis
    • Compressed Oops
    • Compressed Class Pointers
  • Garbage Collection
    • Allocation
      • TLAB
      • NUMA Aware Allocator
    • Reclamation
      • Mark-Sweep-Compact (Serial + Parallel)
      • Mark-Sweep
      • Scavenging
    • String Interning and Deduplication

Interfacing The Metal
[Diagram, top to bottom: Java Application / Class loader, Java Runtime, GC, APIs, JIT Compiler / Java VM / OS + Hardware. A second frame labels the JRE.]

The Helpers
[Diagram: Runtime, GC, JIT Compiler within the Java VM]

The Helpers - JIT And Runtime.

Advanced JIT & Runtime Optimizations - Adaptive Optimization

JIT & Runtime
• Startup - Interpreter
• Adaptive optimization - Performance critical methods
• Compilation:
  • CompileThreshold
  • Identify root of compilation
  • Method Compilation or On-stack replacement (Loop)?
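To make the "method compilation or on-stack replacement" question concrete, here is a minimal sketch, not taken from the talk. The class name `OsrDemo` and the method `sumTo` are hypothetical. The method is invoked only once but contains a long-running loop, which is the situation where the JIT would need to swap in compiled code while the loop is still executing rather than wait for the next invocation.

```java
public class OsrDemo {
    // Hypothetical workload: called once, but the loop runs many iterations,
    // so an invocation counter alone would never make this method "hot".
    static long sumTo(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i; // loop back-edges are what can trigger an OSR compile
        }
        return sum;
    }

    public static void main(String[] args) {
        // 50 million iterations; the result fits comfortably in a long.
        System.out.println(sumTo(50_000_000));
    }
}
```

Running this with `java -XX:+PrintCompilation OsrDemo` and looking for a `sumTo` entry carrying the `%` flag is one way to check for an OSR compilation. Whether and when that entry appears depends on the JVM version and its thresholds.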

PrintCompilation
Output format:
  timestamp compilation-id flags tiered-compilation-level Method <@ osr_bci> code-size <deoptimization>
Flags:
  %: is_osr_method
  s: is_synchronized
  !: has_exception_handler
  b: is_blocking
  n: is_native
Example:
  567 693 % ! 3 org.h2.command.dml.Insert::insertRows @ 76 (513 bytes)
  656 797 n 0 java.lang.Object::clone (native)
  779 835 s 4 java.lang.StringBuffer::append (13 bytes)

JIT & Runtime
• Adaptive optimization - Performance critical methods
• Inlining:
  • MinInliningThreshold, MaxFreqInlineSize, InlineSmallCode, MaxInlineSize, MaxInlineLevel, DesiredMethodLimit …

PrintInlining*
  @ 76 java.util.zip.Inflater::setInput (74 bytes) too big
  @ 80 java.io.BufferedInputStream::getBufIfOpen (21 bytes) inline (hot)
  @ 91 java.lang.System::arraycopy (0 bytes) (intrinsic)
  @ 2 java.lang.ClassLoader::checkName (43 bytes) callee is too large
* needs -XX:+UnlockDiagnosticVMOptions

JIT & Runtime
• Adaptive optimization - Performance critical methods
• Dynamic de-optimization
  • dependencies invalidation
  • classes unloading and redefinition
  • uncommon path in compiled code

PrintCompilation
  573 704 2 org.h2.table.Table::fireAfterRow (17 bytes)
  7963 2223 4 org.h2.table.Table::fireAfterRow (17 bytes)
  7964 704 2 org.h2.table.Table::fireAfterRow (17 bytes) made not entrant
  33547 704 2 org.h2.table.Table::fireAfterRow (17 bytes) made zombie

Advanced JIT & Runtime Optimizations - Tiered Compilation

Tiered Compilation
• Start in interpreter
• Tiered optimization with client compiler
• Code profiled information
• Enable server compiler

Tiered Compilation - Effect on Code Cache
• C1 compilation threshold for tiered is about 100 invocations
• Tiered Compilation has a lot more profiled information for C1 compiled methods
• CodeCache needs to be 5x larger than non-tiered
• JDK8 default: 240MB when tiered is enabled (vs 48MB non-tiered)
• Need more? Use -XX:ReservedCodeCacheSize

Other JIT Compiler Optimizations
• Fast dynamic type tests for type safety
• Range check elimination
• Loop unrolling
• Profile data guided optimizations
• Escape Analysis
• Intrinsics
• Vectorization

Advanced JIT & Runtime Optimizations - Intrinsics

Without Intrinsics
[Diagram: Java Method -> JIT Compilation -> Execute Generated Code]

Intrinsics
[Diagram: Java Method -> JIT Compilation -> Call Hand-Optimized Assembly Code -> Execute Optimized Code]

PrintInlining*
  @ 76 java.util.zip.Inflater::setInput (74 bytes) too big
  @ 80 java.io.BufferedInputStream::getBufIfOpen (21 bytes) inline (hot)
  @ 91 java.lang.System::arraycopy (0 bytes) (intrinsic)
  @ 2 java.lang.ClassLoader::checkName (43 bytes) callee is too large
* needs -XX:+UnlockDiagnosticVMOptions

Advanced JIT & Runtime Optimizations - Escape Analysis

Escape Analysis
• Entire IR graph
• escaping allocations?
  • not stored to a static field or non-static field of an external object,
  • not returned from method,
  • not passed as parameter to another method where it escapes.

Escape Analysis
  allocated object doesn’t escape the compiled method
  + allocated object not passed as a parameter
  = remove allocation and keep field values in registers

Escape Analysis
  allocated object doesn’t escape the compiled method
  + allocated object is passed as a parameter
  = remove locks associated with object and use optimized compare instructions

Advanced JIT & Runtime Optimizations - Compressed Oops

A Java Object
[Diagram: Header (Mark Word, Klass, Array Length) followed by Body]

Objects, Fields & Alignment
• Objects are 8 byte aligned (default).
• Fields:
  • are aligned by their type.
  • can fill a gap that may be required for alignment.
  • are accessed using offset from the start of the object

ILP32 vs. LP64 Field Sizes
  Field                                     ILP32     LP64
  Mark Word                                 32 bit    64 bit
  Klass                                     32 bit    64 bit
  Array Length                              32 bit    32 bit
  boolean, byte, char, float, int, short    32 bit    32 bit
  double, long                              64 bit    64 bit

Compressed OOPs
  <wide-oop> = <narrow-oop-base> + (<narrow-oop> << 3) + <field-offset>

Compressed OOPs
  Heap size           Mode                           <wide-oop> =
  <4 GB               no encoding/decoding needed    <narrow-oop>
  >4 GB; <28 GB       zero-based                     <narrow-oop> << 3
  >28 GB; <32 GB      regular                        <narrow-oop-base> + (<narrow-oop> << 3) + <field-offset>
  >32 GB; <64 GB *    change alignment               <narrow-oop-base> + (<narrow-oop> << 4) + <field-offset>

Compressed OOPs
  Heap size            <4 GB       >4 GB; <28 GB    <32 GB     <64 GB
  Object alignment     8 bytes     8 bytes          8 bytes    16 bytes
  Offset required?     No          No               Yes        Yes
  Shift by             No shift    3                3          4

Compressed Class Pointers
• JDK 8 -> Perm Gen Removal -> Class Data outside of heap
• Compressed class pointer space
  • contains class metadata
  • is a part of Metaspace

Compressed Class Pointers
[Figure credit: PermGen Removal Overview by Coleen Phillimore + Jon Masamitsu @JavaOne 2013]

The Helpers - Garbage Collection.

Garbage Collection
Allocation + Reclamation

Garbage Collection - Allocation.

Generational Heap
[Diagram: Young Generation (Eden, Survivors) and Old Generation]

Fast Path Allocation - Thread Local Allocation Buffers (TLABs).
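
As a rough illustration of fast-path allocation in a thread-local buffer, the toy model below uses plain `long` offsets in place of real memory. The names `TlabSketch`, `Eden`, `Tlab`, `carve` and `allocate` are hypothetical and do not come from HotSpot. Each thread would own one `Tlab` and bump its private pointer without synchronization. The shared `Eden` pointer is touched only when a new buffer has to be carved out.

```java
import java.util.concurrent.atomic.AtomicLong;

public class TlabSketch {
    // Toy stand-in for Eden: threads coordinate here only to obtain a new buffer.
    static final class Eden {
        private final long end;
        private final AtomicLong top = new AtomicLong(0);

        Eden(long capacity) { this.end = capacity; }

        // Returns the start offset of a fresh chunk, or -1 if Eden is exhausted.
        long carve(long size) {
            while (true) {
                long start = top.get();
                long newTop = start + size;
                if (newTop > end) return -1;
                if (top.compareAndSet(start, newTop)) return start;
            }
        }
    }

    // Thread-private buffer: meant to be used by exactly one thread.
    static final class Tlab {
        private final Eden eden;
        private final long tlabSize;
        private long top = 0; // next free offset in the current buffer
        private long end = 0; // end of the current buffer (0 = no buffer yet)
        int refills = 0;

        Tlab(Eden eden, long tlabSize) {
            this.eden = eden;
            this.tlabSize = tlabSize;
        }

        // Returns the offset of the new object, or -1 if this sketch cannot serve it.
        long allocate(long size) {
            if (size > tlabSize) return -1; // allocation outside the buffer is not modeled
            if (top + size > end) {
                // Slow path: abandon the unused tail and ask Eden for a new buffer.
                long start = eden.carve(tlabSize);
                if (start < 0) return -1; // a real VM would trigger a collection here
                top = start;
                end = start + tlabSize;
                refills++;
            }
            long addr = top; // fast path: bump the private pointer, no locking
            top += size;
            return addr;
        }
    }

    public static void main(String[] args) {
        Eden eden = new Eden(1024);
        Tlab tlab = new Tlab(eden, 256);
        System.out.println(tlab.allocate(64));  // served from the first buffer
        System.out.println(tlab.allocate(64));  // bump only
        System.out.println(tlab.allocate(200)); // does not fit, forces a refill
        System.out.println("refills = " + tlab.refills);
    }
}
```

The sketch simply discards whatever is left in a buffer when it refills. Sizing and waste policy are deliberately left out.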

Garbage Collection - Allocation
Fast Path
Most Allocations -> Eden Space
TLABs per thread:
• allocate lock-free
• bump-(your-own)-pointer allocation
• only co-ordinate when needing a new TLAB

TLAB
[Diagram: Eden divided into TLABs, one each for Thread 0 through Thread 4]

TLAB flags: (Min)TLABSize, ResizeTLAB, TLABWasteTargetPercent, PrintTLAB

TLAB
[Diagram sequence: TLABs within Eden, annotated in turn with TLABWasteTargetPercent (default = 1%), (Min)TLABSize, and ResizeTLAB (default = TRUE)]

Allocation - Non Uniform Memory Access (NUMA) Aware Allocator.

NUMA
[Diagram: four processing nodes (Node 0 to Node 3), each with its own memory controller and DRAM bank]

UseNUMA
[Diagram: Eden split into an area for Node 0 and an area for Node 1, each node with its memory controller and DRAM bank; Thread 0, Thread 1 and Thread 2 shown alongside]

NUMA flags: UseNUMA, UseNUMAInterleaving, UseAdaptiveNUMAChunkSizing, NUMAStats

Garbage Collection - Reclamation.

Garbage Collection - Reclamation via (Serial) Mark-Sweep-Compact
[Diagram: Root Set, Young Generation, Old Generation]

Garbage Collection - Reclamation via Parallel Mark-Compact
[Diagram sequence: Old Generation; later frames label a destination region and one, then two, source regions]
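
For readers who want the mark and compact phases spelled out, here is a single-threaded toy sketch. It is not the region-based parallel scheme pictured in the slides, and it is not HotSpot code. The heap is modeled as an array of slots, references are slot indices, and the names `MarkCompactSketch`, `Obj` and `collect` are hypothetical. It assumes every root and every reference points at an occupied slot.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class MarkCompactSketch {
    static final class Obj {
        final String name;
        final int[] refs; // heap indices of the objects this one references
        boolean marked;

        Obj(String name, int... refs) {
            this.name = name;
            this.refs = refs;
        }
    }

    // Compacts the heap in place and returns the number of live objects.
    // The roots array is rewritten to the new slot indices.
    static int collect(Obj[] heap, int[] roots) {
        // Phase 1: mark everything reachable from the root set.
        Deque<Integer> work = new ArrayDeque<>();
        for (int r : roots) work.push(r);
        while (!work.isEmpty()) {
            Obj o = heap[work.pop()];
            if (o == null || o.marked) continue;
            o.marked = true;
            for (int ref : o.refs) work.push(ref);
        }

        // Phase 2: compute a forwarding index for every live object.
        int[] forward = new int[heap.length];
        int live = 0;
        for (int i = 0; i < heap.length; i++) {
            forward[i] = (heap[i] != null && heap[i].marked) ? live++ : -1;
        }

        // Phase 3: rewrite references and roots to the forwarding indices.
        for (Obj o : heap) {
            if (o != null && o.marked) {
                for (int j = 0; j < o.refs.length; j++) o.refs[j] = forward[o.refs[j]];
            }
        }
        for (int j = 0; j < roots.length; j++) roots[j] = forward[roots[j]];

        // Phase 4: slide live objects toward index 0. Since forward[i] <= i,
        // a move never overwrites a slot that has not been visited yet.
        for (int i = 0; i < heap.length; i++) {
            Obj o = heap[i];
            heap[i] = null;
            if (o != null && o.marked) {
                o.marked = false;
                heap[forward[i]] = o;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        // A -> C is reachable from the root; B and D are garbage.
        Obj[] heap = { new Obj("A", 2), new Obj("B"), new Obj("C"), new Obj("D", 1) };
        int[] roots = { 0 };
        int live = collect(heap, roots);
        System.out.println("live objects: " + live);
        for (int i = 0; i < live; i++) System.out.println(i + ": " + heap[i].name);
    }
}
```

Sliding keeps the surviving objects in their original order and leaves the free space as one contiguous block at the end of the array.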
