Incremental Garbage Collection Using Method Specialisation Final Report
Luke Terry
Supervisor: Dr Tony Field
June 15, 2010

Abstract

This report documents the development of a prototype incremental garbage collector in the Jikes RVM 3.1 using a novel alternative to standard read barriers known as specialised self-scavenging. Self-scavenging collectors encode scavenge state on a per-object level, which allows individual objects to conditionally activate scavenge code during collection, eliminating the always-on nature of traditional barriers.

We develop and evaluate an incremental collector framework coupled with the first traditional incremental Baker-style garbage collector for Jikes. We then modify the Baker collector to use self-scavenging, where objects encode their scavenge state in a header word. We finally eliminate the cost of explicitly encoding state in the object header by using specialised method variants, introduced through an updated method specialisation framework from the Jikes RVM 3.0.1. This results in a collector that has no conditional state checks when the collector is off or when an object has been scavenged.

Acknowledgements

I would like to thank my beautiful and understanding fiancée for putting up with me over the last four years, I know at times this was not easy! I extend immense gratitude and respect to my supervisor Tony Field for remaining confident in my ability, for ideas and inspiration as to how to attack the inherent problems I encountered and for proof reading everything an uncountable number of times! I also want to thank Andrew Cheadle and Will Deacon for imparting their knowledge of garbage collection, Jikes and method specialisation to me and for providing invaluable support in the early hours. Finally I want to acknowledge the unrelenting support, wisdom and understanding of my parents who have guided, fed and shaped me into who I am today.
Contents

1 Introduction
2 Background
  2.1 Garbage Collection
    2.1.1 Copying Collectors
      2.1.1.1 Semi-Space Collector
      2.1.1.2 Generational Collectors
    2.1.2 Real-Time Collectors
      2.1.2.1 Incremental Collectors
      2.1.2.2 The Tri-Colour Abstraction
      2.1.2.3 Scheduling
      2.1.2.4 State-of-the-Art
    2.1.3 Baker’s Incremental Garbage Collection
  2.2 Jikes RVM
    2.2.1 MMTk
      2.2.1.1 Spaces
      2.2.1.2 Allocation Policies
      2.2.1.3 Plans
      2.2.1.4 Collection Policies
      2.2.1.5 Concurrency
      2.2.1.6 Testing and Visualisation
      2.2.1.7 Semi-Space Collector
      2.2.1.8 Concurrent Mark-Sweep Collector
  2.3 Method Specialisation
    2.3.1 Class Transformations
    2.3.2 Beanification
  2.4 Related Work
3 Design and Implementation
  3.1 MMTk Incremental Framework
    3.1.1 Incremental Closure
    3.1.2 Work-Based Triggers
      3.1.2.1 Threading Issues
    3.1.3 Mutator Object Tracing
    3.1.4 High-Level Plan
  3.2 Incremental Baker Collector Implementation
    3.2.1 Incremental Trigger
    3.2.2 Forwarding Pointers
    3.2.3 Read Barrier
    3.2.4 Cross-Space Write Barrier
    3.2.5 Calculating Required Incremental Work
    3.2.6 Missing Objects
      3.2.6.1 Sanity Checker Confirmation
      3.2.6.2 Instrumentation
      3.2.6.3 Missed Barriers
      3.2.6.4 Race Conditions
      3.2.6.5 Context Switching
    3.2.7 Aside: Execution Issues
  3.3 Method Specialisation
    3.3.1 Method Specialisation Framework
    3.3.2 Garbage Collection Specialisation
    3.3.3 Extending The Beanification Transform
    3.3.4 Fixing Method Specialisation
      3.3.4.1 Obsolete Methods
      3.3.4.2 Specialised Superclass Methods
      3.3.4.3 Interface Methods
      3.3.4.4 Final Classes
  3.4 ‘Free’ Read Barrier
    3.4.1 Cross-Space Write Barriers
    3.4.2 Global Beanification
    3.4.3 Transforming Methods
    3.4.4 Explicit Self-Scavenging
      3.4.4.1 First Implementation
      3.4.4.2 Revised Implementation
      3.4.4.3 Lazy vs Eager Self-Scavenging
      3.4.4.4 Early Scavenging
      3.4.4.5 Problem Classes
      3.4.4.6 Array Handling
    3.4.5 Implicit Self-Scavenging
      3.4.5.1 Flipping Specialisations
      3.4.5.2 Specialised Self-Scavenge Transform
      3.4.5.3 Interacting With Collection
    3.4.6 Issues
4 Evaluation
  4.1 Beanification
  4.2 Incremental Baker Collector
  4.3 Explicit Self-Scavenging
  4.4 ‘Free’ Read Barrier
5 Conclusion
  5.1 Future Work
    5.1.1 Baker Collector Improvements
    5.1.2 ‘Free’ Read Barrier
    5.1.3 Path Specialising Collector
    5.1.4 Instrumentation
    5.1.5 Specialising JNI and Reflection

1 Introduction

All programming language run-times need to support the allocation and reclamation of memory. Manually managing memory can be a cumbersome task for a programmer, which inspired the development of automatic memory reclamation policies (garbage collection). The problem with garbage collection is that the user application (the mutator) cannot perform useful work while collection is in progress, which manifests as periods of unresponsiveness. This is most apparent when a collection cycle runs uninterrupted to completion with the mutator paused, which is the most commonly implemented type of collection algorithm.

Concurrent garbage collectors have been developed to bound mutator pauses for real-time applications and to minimise them for interactive applications. These collectors run either in parallel with the mutator (pure concurrent collection) or in short interleaved bursts (incremental collection).

Concurrent collection introduces the possibility that the mutator gains access to an object that has been marked for collection. Allowing access to a possibly collected object may result in erroneous behaviour. For example, the memory may have been returned to the operating system, in which case an attempted access would violate memory protection. To solve this problem, barriers are introduced; these intercept object reads or writes and ensure that they target objects that will survive collection. However, these barriers, especially read barriers, are invoked frequently and can prove expensive.
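As a rough illustration of the read barrier idea, the following Java sketch models a copying collector in which every reference load is routed through a check that forwards the target to a surviving copy. It is a toy model under stated assumptions, not Jikes RVM or MMTk code: the names `ReadBarrierSketch`, `Obj`, `evacuate`, `readField` and the `collecting` flag are all hypothetical, and each object is assumed to have a single reference field.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a conditional read barrier; hypothetical, not Jikes RVM code. */
final class ReadBarrierSketch {

    /** A heap object with one reference field and a forwarding pointer. */
    static final class Obj {
        final String name;
        final boolean survivor;   // true if this copy lives in the surviving region
        Obj field;                // the only reference-type field in this model
        Obj forwardedTo;          // set once the object has been copied

        Obj(String name, boolean survivor) {
            this.name = name;
            this.survivor = survivor;
        }
    }

    /** True while a collection cycle is in progress. */
    static boolean collecting = false;

    /** The region that will hold only objects surviving this cycle. */
    static final List<Obj> survivors = new ArrayList<>();

    /** Copies an object into the surviving region, or returns its existing copy. */
    static Obj evacuate(Obj o) {
        if (o == null || o.survivor) {
            return o;
        }
        if (o.forwardedTo == null) {
            Obj copy = new Obj(o.name, true);
            copy.field = o.field;   // may still reference an old object; fixed on read
            o.forwardedTo = copy;
            survivors.add(copy);
        }
        return o.forwardedTo;
    }

    /** Read barrier: every reference load pays for the test-and-branch below. */
    static Obj readField(Obj holder) {
        Obj target = holder.field;
        if (collecting) {
            target = evacuate(target);
            holder.field = target;  // update the slot so it now refers to the copy
        }
        return target;
    }

    public static void main(String[] args) {
        Obj b = new Obj("b", false);
        Obj a = new Obj("a", false);
        a.field = b;

        collecting = true;
        Obj root = evacuate(a);       // the root is copied when collection starts
        Obj seen = readField(root);   // the barrier copies b on first access

        System.out.println(seen.survivor);            // prints true
        System.out.println(seen == readField(root));  // prints true
    }
}
```

Note that the `if (collecting)` test runs on every load, even when no collection is in progress; this is the always-on cost that motivates the rest of the report.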
To reduce their processing cost, they must be implemented efficiently.

The first incremental garbage collection algorithm, the classical Baker collector, was designed for list processing languages and is a pure copying collector [7]. The algorithm interleaves mutator work with increments of object copying (evacuation) from an area of memory that contains garbage (from-space) to an area that will contain only objects surviving this collection cycle (to-space). Using a read barrier, the algorithm tests whether the referenced object has already been copied and, if not, copies it. This ensures that the mutator only ‘sees’ objects in to-space. Although this approach reduces pause times, the test-and-branch in the read barrier often incurs a very high cost.

This project details the development of an incremental garbage collector called SSB (Self-Scavenging Baker). SSB is based on an incremental Baker collector and uses method specialisation [18, 17, 22] in place of the expensive read barrier usually associated with Baker’s scheme. Method specialisation supports transparent switching between multiple variants of the same virtual method. We aim to remove the cost of the expensive read barrier test, resulting in a “free” read barrier.

The key to achieving a “free” read barrier is to virtualise all field accesses by generating getter and setter methods, a process known as beanification. This means every access to an object’s fields requires a virtual method call. We can then implement a variant of each method that first evacuates all of the object’s reference-type fields (scavenging) before “flipping” to the default variant. We only execute the self-scavenging variant when the object has been copied but has not yet been scavenged. This means we do not have the expensive