Quantifying and Improving the Performance of Garbage Collection

QUANTIFYING AND IMPROVING THE PERFORMANCE OF GARBAGE COLLECTION A Dissertation Presented by MATTHEW HERTZ Submitted to the Graduate School of the University of Massachusetts Amherst in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY September 2006 Computer Science °c Copyright by Matthew Hertz 2006 All Rights Reserved QUANTIFYING AND IMPROVING THE PERFORMANCE OF GARBAGE COLLECTION A Dissertation Presented by MATTHEW HERTZ Approved as to style and content by: Emery D. Berger, Chair J. Eliot B. Moss , Member Kathryn S. McKinley, Member Scott F. Kaplan, Member Andras Csaba Moritz, Member W. Bruce Croft, Department Chair Computer Science To Sarah for reminding me of everything I can do and to Shoshanna for inspiring me to do more. ACKNOWLEDGMENTS I am most grateful to my advisor, Emery Berger, for everything he has done throughout this thesis. I appreciate his guidance, suggestions, and inspiration. I feel especially fortu- nate for the patience he has shown with me throughout all the twists and turns my life took getting through this dissertation. I must also thank Eliot Moss and Kathryn McKinley for their leadership and support. I will be forever grateful that they took a chance on a student with a less-than-stellar aca- demic record and provided me with a fertile, inspiring research environment. They are both very knowledgeable and I benefited from our discussions in a myriad of ways. They have also served as members of my committee and I appreciate their helpful comments and suggestions. Thanks also to Scott Kaplan, another member of my committee, for his advice and feedback. I would also like to thank all of the members of the ALI and PLASMA research groups over the years, especially Steve Blackburn, Alistair Dundas, Yi (Eric) Feng, Chris Hoff- mann, Asjad Khan, and Zhenlin Wang, who always lent an ear when I needed to talk out my latest idea or blow off steam. I also owe thanks to Darko Stefanovic´ and Amer Diwan. The thoughts and insights these gentlemen shared with me have helped in many ways, both big and small. I am also indebted to my parents for their love and support over the years. They taught me the value of my ideas and the importance of staying true to my convictions. The love and support I received from my brother was also important. I could not have done this without them. v Finally, I want to thank Sarah and Shoshanna for everything they have done. They have shown an unbounded amount of love and support. I could not have asked for a more wonderful family. vi ABSTRACT QUANTIFYING AND IMPROVING THE PERFORMANCE OF GARBAGE COLLECTION SEPTEMBER 2006 MATTHEW HERTZ B.A., CARLETON COLLEGE M.Sc., UNIVERSITY OF MASSACHUSETTS AMHERST Ph.D., UNIVERSITY OF MASSACHUSETTS AMHERST Directed by: Professor Emery D. Berger Researchers have devoted decades towards improving garbage collection performance. A key question is: what opportunity remains for performance improvement? Because the perceived performance disadvantage of garbage collection is often cited as a barrier to its adoption, we propose using explicit memory management as the touchstone for evaluating garbage collection performance. Unfortunately, programmers using garbage-collected languages do not specify when to deallocate memory, precluding a direct comparison with explicit memory management. We observe that we can simulate the effect of explicit memory management by profil- ing applications and generating object reachability information and lifetimes to serve as zero-cost oracles to guide explicit memory reclamation. We call this approach oracular memory management. However, existing methods of generating exact object reachability are prohibitively expensive. We first present the object reachability time algorithm, which vii we call Merlin. We show how Merlin runs in time proportional to the running time of the application and can be performed either on-line or off-line. We then present empirical results showing that Merlin requires orders of magnitude less time than brute force trace generation. We next present simulation results from experiments with the oracular memory manager. These results show that when memory is plentiful, garbage collection performance is competitive with explicit memory management, occasionally increasing program through- put by 10%. This performance comes at a price. To match the average performance of explicit memory management, the best performing garbage collector requires an average heap size that is two-and-a-half or five times larger, depending on how aggressively the explicit memory manager frees objects. This performance gap is especially large when the heap exceeds physical memory. We show the garbage-collector causes poor paging performance. During a full heap collection, the collector scans all reachable objects. Which a reachable object resides on a non-resident page, the virtual memory manager reloads that page and evicts another. If the evicted page also contains reachable objects, ultimately it too will be accessed by the collector and reloaded into physical memory. This paging is unavoidable without cooperation between the garbage collector and the virtual memory manager. We then present a cooperative garbage collection algorithm that we call bookmarking collection (BC). BC works with the virtual memory manager to decide which pages to evict. By providing available empty pages and, when paging is inevitable, processing pages prior to their eviction, BC avoids triggering page faults during garbage collection. When paging, BC reduces execution time over the next best collector by up to a factor of 3 and reduces maximum pause times by over an order of magnitude versus GenMS. This thesis begins by asking how well garbage collection performs currently and which areas remain for it to improve. We develop oracular memory management to help answer this question by enabling us to quantify garbage collection’s performance. Our results show viii that while garbage collection can perform competitively, this good performance assumes the heap fits in available memory. Our attention thus focused onto this brittleness, we de- sign and implement the bookmarking collector. While performing competitively when not paging, our new collector’s paging performance helps make garbage collection’s performance must more robust. ix TABLE OF CONTENTS Page ACKNOWLEDGMENTS ................................................... v ABSTRACT .............................................................. vii LIST OF TABLES........................................................ xiv LIST OF FIGURES ...................................................... xvi CHAPTER 1. INTRODUCTION ...................................................... 1 1.1 Contributions . 2 1.2 Summary of Contributions . 5 2. RELATED WORK ...................................................... 7 2.1 Object Reachability . 7 2.1.1 Reachability Approximation. 7 2.1.2 Reference Counting . 8 2.2 Analyzing Memory Manager Performance . 8 2.2.1 Garbage Collection Overheads . 9 2.2.2 Explicit Memory Management Overheads . 9 2.2.3 Comparisons Using Explicit Memory Manager Performance Estimates . 10 2.2.4 Comparisons Using Program Heap Models . 10 2.2.5 Comparisons With Conservative Garbage Collection . 11 2.3 Improving Garbage Collection Paging Behavior . 12 2.3.1 VM-Sensitive Garbage Collection . 12 2.3.2 VM-Cooperative Garbage Collection . 14 x 3. COMPUTING OBJECT REACHABILITY ............................... 16 3.1 Objects Becoming Unreachable. 17 3.2 Forward Phase . 18 3.2.1 Instrumented Pointer Stores . 18 3.2.2 Uninstrumented Pointer Stores . 18 3.3 Forward Phase Limitations . 19 3.4 Backward Phase . 20 3.5 Algorithmic Details . 23 3.5.1 Algorithmic Requirements . 23 3.5.2 Merlin Assumptions . 23 3.6 Implementing Merlin . 24 3.6.1 On-Line Implementations . 24 3.6.2 Using Merlin Off-line . 25 3.7 Granularity of Object Reachability . 25 3.8 Evaluation of Merlin . 26 3.8.1 Experimental Methodology . 26 3.8.2 Computing Allocation-Accurate Results . 27 3.8.3 Computing Granulated Results . 28 3.9 Conclusion . 30 4. QUANTITATIVE ANALYSIS OF AUTOMATIC AND EXPLICIT MEMORY MANAGEMENT ......................................... 31 4.1 Oracular Memory Management . 32 4.1.1 Step One: Data Collection and Processing . 34 4.1.1.1 Repeatable Runs . 34 4.1.1.2 Tracing for the Liveness-Based Oracle . 35 4.1.1.3 Tracing for the Reachability-Based Oracle . 36 4.1.2 Step Two: Simulating Explicit Memory Management. 37 4.1.3 Validation: Live Oracular Memory Management . 37 4.2 Discussion . 38 4.2.1 Reachability versus Liveness . 39 4.2.2 malloc Overhead . 40 xi 4.2.3 Multithreaded versus Single-threaded . 40 4.2.4 Smart Pointers . 41 4.2.5 Program Structure . 42 4.3 Experimental Methodology . 43 4.4 Experimental Results . 47 4.4.1 Run-time and Memory Consumption . 47 4.4.1.1 Reported Heap Size Versus Heap Footprint. 49 4.4.1.2 Comparing Simulation to the Live Oracle . 60 4.4.1.3 Comparing the Liveness and Reachability Oracles . 62 4.4.2 Page-level locality . 63 4.5 Conclusion . 68 5. BOOKMARKING COLLECTOR ....................................... 70 5.1 Bookmarking Collector Overview . 71 5.2 Bookmarking Collector Details . 73 5.2.1 Managing Remembered Sets . 74 5.2.2 Compacting Collection . 74 5.3 Cooperation with the Virtual Memory Manager . 75 5.3.1 Reducing System Call Overhead . 76 5.3.2 Discarding Empty Pages . 76 5.3.3 Keeping the Heap Memory-Resident. 77 5.3.4 Utilizing Available Memory . 77 5.4 Bookmarking . 78 5.4.1 Collection After Bookmarking . ..

Quantifying and Improving the Performance of Garbage Collection

Quantifying the Performance of Garbage Collection Vs. Explicit Memory Management

ACDC: Towards a Universal Mutator for Benchmarking Heap Management Systems

Linux Multicore Performance Analysis and Optimization in a Nutshell

Predicting Execution Times with Partial Simulations in Virtual Memory Research: Why and How

Efficient Intra-Operating System Protection Against Harmful Dmas

Heapshield: Library-Based Heap Overflow Protection for Free

Vysok´E Uˇcení Technick´Ev Brnˇe

Bilal Shirazi 98107317 CS499 Research Paper: an Analysis of Concurrent Memory Allocators Presented To: Dr

0Sim: Preparing System Software for a World with Terabyte-Scale Memories

Hoard: a Scalable Memory Allocator For

Copyright by Emery D. Berger 2002

Topic : an Overview Multi-Core Processors - Architecture, Prog