
Automatic Memory Management Techniques for the Go Programming Language

Matthew Ryan Davis

Submitted in total fulfilment of the requirements of the degree of Doctor of Philosophy

September 2015

Department of Computer Science and Software Engineering
The University of Melbourne

Produced on archival quality paper.

Abstract

Memory management is a complicated task. Many programming languages expose such complexities directly to the programmer. For instance, languages such as C or C++ require the programmer to explicitly allocate and reclaim dynamic memory. This opens the door to many software bugs (e.g., memory leaks and dangling-pointer dereferences) which can cause a program to crash. Automated techniques of memory management were introduced to relieve programmers from managing such complicated aspects. Two automated techniques are garbage collection and region-based memory management. The more common technique, garbage collection, is primarily driven by a runtime analysis (e.g., scanning live memory and reclaiming the bits that are no longer reachable from the program), whereas the less common region-based technique performs a static analysis during compilation and determines program points where the compiler can insert memory reclaim operations. Each option has its drawbacks. In the case of garbage collection it can be computationally expensive to scan memory at runtime, often requiring the program to halt execution during this stage. In contrast, region-based methods often require objects to remain resident in memory longer than garbage collection does, resulting in a less than optimal use of a system's resources.

This thesis investigates the less common form of automated memory management (region-based) within the context of the relatively new concurrent language Go. We also investigate combining both techniques, in a new way, with the hope of achieving the benefits of a combined system without the drawbacks that each automated technique has alone. We conclude this work by applying our region-based system to a concurrent processing environment.

Declaration

This is to certify that:

• the thesis comprises only my original work towards the PhD except where indicated in the Preface,

• due acknowledgement has been made in the text to all other material used,

• the thesis is fewer than 100,000 words in length, exclusive of tables, maps, bibliographies and appendices.

Matthew Ryan Davis

Preface

This thesis is the result of over three years of work and collaboration with some of the brightest people I have ever worked with, my supervisors. I have benefited greatly from our weekly discussions, impromptu brainstorming sessions, and our collaborative paper writing. Chapter 3 and Chapter 5 are based on publications [17, 18] written together with my supervisors. I am listed as the primary author on both of these works, having done the implementation, experimentation, and original drafts. These two chapters derive from our publications at the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness in 2012 and 2013 respectively. Chapter 6 extends a discussion that was started in Chapter 3 [17].

Acknowledgements

I would like to thank my supervisors: Peter Schachte, Zoltan Somogyi, and Harald Søndergaard for all of their hard work, genius ideas, brainstorming sessions, and impeccable pedantic reasoning towards correctness. I have never worked with such a smart team in my life. I would also like to thank the University for its great bandwidth and procurement of freeze-dried caffeine that kept me trudging through the hours of finger-physical labor that this degree required. This research process has been an incredibly rewarding experience, and I have learned quite a lot through this collaborative effort. Not only have I broadened my knowledge of compilers and memory management, but I believe that I have also become a stronger developer and critical thinker. I was awarded scholarship funding from both NICTA and the University, and am forever grateful for their financial support. I would also like to thank my good friend, Julien Ridoux, for his friendship. Julien allowed me to work part-time with his and Darryl Veitch's team researching high precision BSD/Linux kernel and network timing as well as network measurement. I was listed on two of their publications [19, 81] that are based on the part-time work and collaboration with their team. I would also like to thank the Ruxcon team for their wonderful friendships, letting me work as a staff member at Ruxcon 2012, as a presenter on GCC plugins in 2011, and as one of their monthly lecturers. The plugin talk derives from the knowledge I gathered while working on this thesis. I would also like to thank Linux Journal and Linux Weekly News for publishing articles [14, 16, 15] that I wrote while undertaking this PhD. The latter two publications derive from the knowledge I gained while working on this thesis. I also thank the Linux Users of Victoria for

allowing me to speak about writing and debugging Linux kernel drivers, and about Linux. Lastly, I would like to thank the Søndergaards; they are much more than friends, they have been like an adopted family to me. I am forever indebted to them.

Contents

Abstract

Declaration

Preface

Acknowledgements

1 Introduction
  1.1 The Programmer's Burden
  1.2 Our Goals and Solution

2 Memory Management
  2.1 Introduction
  2.2 Background
    2.2.1 System Memory
    2.2.2 Virtual Memory
    2.2.3 Process Memory Layout
    2.2.4 Stack Data and Stack Frames
    2.2.5 Dynamic Memory
    2.2.6 Dynamic Allocation Problems
  2.3 Automatic Memory Management
    2.3.1 Region-Based Memory Management
    2.3.2 Garbage Collection
    2.3.3 Memory Footprints
    2.3.4 Language Influences on Automatic Memory Management
    2.3.5 Static versus Runtime Analysis
    2.3.6 Object Relationships
    2.3.7 Implementation Difficulties

3 Implementing RBMM for the Go Programming Language
  3.1 Introduction
  3.2 A Distilled Go/GIMPLE Syntax
  3.3 Design
  3.4 Region Types
  3.5 Runtime Support
  3.6 Region Inference
    3.6.1 Preventing Dangling Pointers
    3.6.2 Intraprocedural and Interprocedural Analyses
    3.6.3 Scalability
  3.7 Transformation
    3.7.1 Region-Based Allocation
    3.7.2 Function Calls and Declarations
    3.7.3 Region Creation and Removal
  3.8 Region Protection Counting
  3.9 Higher-Order Functions
  3.10 Interface Types
  3.11 Map Datatype
  3.12 Returning Local Variables
  3.13 Region Merging
  3.14 Multiple Specialization
  3.15 Go Infrastructure
  3.16 Evaluation
  3.17 Summary

4 Correctness of the RBMM Transformations
  4.1 Memory Management Correctness
  4.2 Semantic Language
  4.3 Transformation Correctness
    4.3.1 Region Creation and Removal
    4.3.2 Function Definition and Application
    4.3.3 Region Protection Counting
    4.3.4 Region-Based Allocation
    4.3.5 Region Merging
  4.4 Conclusion

5 Combining RBMM and GC
  5.1 Introduction
  5.2 Enhancing the Analysis
    5.2.1 Increasing Conservatism for Higher-Order Functions
  5.3 Combining Regions and Garbage Collection
  5.4 Managing Ordinary Structures
    5.4.1 Creating a Region
    5.4.2 Allocating from a Region
    5.4.3 Reclaiming a Region
    5.4.4 Finding Type Infos
    5.4.5 Managing Redirections
    5.4.6 Collecting Garbage
  5.5 Managing Arrays and Slices
    5.5.1 Finding the Start of an Array
    5.5.2 Preserving Only the Used Parts of Arrays
  5.6 Handling Other Go Constructs
  5.7 Evaluation
    5.7.1 Methodology
    5.7.2 Results
  5.8 Summary

6 RBMM in a Concurrent Environment
  6.1 Introduction
  6.2 Go's Approach
  6.3 Design
    6.3.1 How a Go-routine Executes
    6.3.2 Concurrency for Region Operations
    6.3.3 Transformation
    6.3.4 Producer and Consumer Example
  6.4 Region-Aware Garbage Collection
  6.5 Evaluation
    6.5.1 Methodology
    6.5.2 Results
  6.6 Summary

7 Related Work
  7.1 Garbage Collection
  7.2 Region-Based Memory Management

8 Conclusion
  8.1 Summary
  8.2 Future Work
    8.2.1 Optimizing our Garbage Collector
    8.2.2 Applying our Research to Other Languages

A Go Programming Language
  A.1 Introduction
  A.2 Go Syntax
    A.2.1 Declarations and Assignments
  A.3 Modules and Imports
  A.4 Types
    A.4.1 Common Primitives
    A.4.2 Pointers
  A.5 User Defined Types
    A.5.1 Structure Types
    A.5.2 Interfaces
  A.6 Container Types
    A.6.1 Arrays
    A.6.2 Slices
    A.6.3 Maps
  A.7 Memory Management
    A.7.1 New and Make Allocators
  A.8 Control Flow
    A.8.1 If-Then-Else
    A.8.2 Switch
    A.8.3 Loops
  A.9 Higher-Order Functions
  A.10 Defer
  A.11 Concurrency
  A.12 Compilers
    A.12.1 Google
    A.12.2 GCC

Bibliography

List of Figures

2.1 Three-level cache
2.2 Linux process in memory
2.3 Memory occupancy of a hypothetical process with ideal program analysis
2.4 Memory occupancy of a hypothetical process with imprecise program analysis
2.5 Memory unreachability: static vs dynamic reasoning
2.6 Memory use: static vs dynamic reasoning

3.1 General Go/GIMPLE syntax
3.2 Region constraint generation
3.3 Creating a linked list in Go
3.4 Creating a linked list in Go with region annotations
3.5 Runtime merging due to higher-order functions
3.6 Runtime merging case

4.1 Distinguishing object sharing
4.2 "Memory-facing" fragment of the language
4.3 The semantics of the memory-facing statements

5.1 Original constraint rules
5.2 Modified constraint rules
5.3 Added constraint rule for closures
5.4 Region data structure
5.5 Example: copying two items to to-space
5.6 Pointer into the middle of an array
5.7 Duplicating data after collection
5.8 Copying only the previously reachable parts of arrays

6.1 Added rules for handling Go's concurrency primitives
6.2 Go-routine transformation
6.3 Producer/consumer example
6.4 Producer/consumer example with region annotations

Chapter 1

Introduction

Science is the belief in the ignorance of experts.

Richard Feynman

This thesis investigates the implementation and performance of an automatic form of memory management, region-based memory management (RBMM), for programming languages. To better understand this concept, this thesis provides a design and implementation of an RBMM system for the Go programming language, where regions are statically inferred based on an object points-to relationship. Our design does not rely on a stack of regions to ensure memory reclaim safety; rather, we introduce the concept of a protection counter for this purpose. Our analysis is of a flow-free and context-free design, which can reduce the amount of rework necessary when recompiling a source file. Go provides an ideal platform for studying RBMM since it uses garbage collection (GC) to manage memory. This interaction between our RBMM and a garbage collector provides a unique area of study. We later investigate an augmented RBMM design that introduces a region-aware garbage collector capable of reclaiming memory from a region.

Developing software is a challenging task. Sloppy coding, the stress of fast development (deadlines), language choice, and dealing with legacy code are all challenging aspects of the software development process. Further complicating matters is simply the fact that programmers are human, and thus are flawed and can easily make mistakes when writing programs. Such is especially the case when the language they are coding in requires the programmer to manage additional resources that are not directly related to

the problem at hand. One resource that programmers often find themselves having to manage is the use of the system's memory. This task often requires the programmer to keep track of memory allocations, data types, and memory reclamation. Keeping track of such information presents an additional burden to the programmer, and increases the probability of bugs in the resulting program.

Programs can use a variety of data structures during program execution. When writing a program, the programmer cannot always determine how many instances of a certain data structure will be needed. For instance, a program might require additional data structures based on user input. Through the use of dynamic memory, the programmer can tell the runtime system that a new data structure instance is needed. The runtime system is responsible for fulfilling this memory request. This memory is where the object will reside. Certain languages also insist that the programmer reclaim memory for those objects which are no longer needed. This creation and reclamation of memory is known as memory management. If memory is not reclaimed, the system can run out and the program will prematurely terminate. Similarly, reusing memory that has been reclaimed can also be problematic, since that memory might be used by another aspect of the program or runtime system.

If memory is improperly managed, the software can create a variety of problems. Inefficient use of memory can create leaks, and put a performance strain on the system. Similarly, invalid use of memory can lead to the software crashing or producing invalid output, compromising the integrity of the software solution. While it is annoying to have a program such as a text editor or web browser crash, in mission critical environments (e.g., healthcare or aviation) a software failure can have serious consequences. For instance, on January 22, 2004 NASA's Mars rover, Spirit, suddenly became non-responsive to its operators' commands [82]. This was the result of a memory limitation. The system responsible for manipulating the rover ran out of storage (128MB of RAM), leaving Spirit not properly responding to

NASA's commands. Spirit's mission manager, Mark Adler, later wrote on his blog that NASA corrected the problem by "effectively reformatting" the rover's flash memory [2]. If improperly managed memory can affect NASA, surely it can have an impact within other mission critical contexts.

This thesis explores region-based memory management (RBMM), an automated form of memory management that relieves the programmer of the complicated aspects of memory management. We study this memory management strategy in the context of the Go programming language. Go makes for an interesting platform to study since it is garbage collected by default, and our goal is to study an alternative memory strategy, RBMM. This thesis tries to reduce the amount of work that the runtime-heavy garbage collected version of the language induces by introducing an RBMM system into the language's compiler and runtime system.

RBMM is not as common a feature in programming languages as garbage collection (GC), which also relieves the programmer of having to manage memory, but at the cost of higher runtime overhead than manual memory management [45]. The trade-off between these two solutions is runtime performance versus efficient use of memory (keeping as little memory resident as possible). GC performs its work at runtime by periodically scanning memory and reclaiming any allocated data that is no longer needed during execution. In contrast, RBMM is the result of a compile-time analysis, whereby the compiler inserts memory reclamation operations into the program. RBMM offers the promise of easing the programmer's burden while also offloading the memory management analysis to the compiler. It has the potential to reduce the need for an otherwise expensive runtime task.

At best, both of these solutions approximate the state of the program. GC conservatively approximates runtime state by scanning the memory it has allocated for unreachable items. RBMM conservatively approximates what memory can be reclaimed at each program point. The latter is of interest to this thesis, which explores not just an RBMM solution, but a

solution that also incorporates the use of a garbage collector to manage items whose lifetime cannot be safely approximated at compile time.

1.1 The Programmer’s Burden

How a programmer manages memory depends on the programming language being used. Many languages (such as C and C++) require the programmer to request memory and also reclaim that memory at a later time. But humans are notoriously bad at this, and the result is unstable programs that either crash or exhaust the system's memory resources via memory leaks. One way memory considerations complicate a programmer's job is that it can be difficult to establish the lifetime of an allocated memory block. Often memory is requested in one function for use by another function. Moreover, such an allocation can be reclaimed in a completely different function. This disjunction between allocation and reclamation sites means that the programmer must employ global reasoning when thinking about memory. This idea contrasts with local reasoning, whereby the programmer can assume that memory is not needed outside of the function it has been allocated within. Global reasoning can be more taxing for the programmer, which can result in error-prone programs. The programmer must be fully aware of all functions his or her program calls (directly or indirectly), and whether they require any memory associated with returned values to be reclaimed. This means that a function not written by the programmer must explicitly state, in documentation, the need to reclaim any memory that the function might allocate. Global reasoning is required when the programmer makes explicit requests for memory, also known as dynamically allocated memory, and then passes that memory to other functions. Dynamic memory can be returned by the requesting function and passed as data to other functions. In other words, dynamic memory can escape the function it was created within. Therefore, dynamic memory can be accessed from any point in the

program; it has global context. This idea contrasts with an easier form of memory management known as local memory. This form of memory is often implemented as a stack in the process's memory space, which grows and shrinks during execution. Such memory requires the programmer to apply simple local reasoning. Stack memory should never escape the function it is allocated within (it is local), and is almost always managed implicitly by the compiler when the program is being compiled. Since stack memory is only valid for the lifetime of the function (allocated at function prologue and reclaimed at function epilogue), any use of it after function exit can result in the program processing invalid data and producing bad results or crashing. In contrast, since dynamic memory is not always localized, programmers must determine when data associated with a reference r is valid and when it should be reclaimed. Once memory is reclaimed, the programmer must also ensure that it will never be accessed again via r, as that memory might subsequently have been reused for a different purpose.
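To make this distinction concrete, here is a minimal Go sketch (ours, not from the original text; the names newBuffer and consume are hypothetical) of dynamic memory escaping its allocating function:

    func newBuffer() *[64]byte {
        var b [64]byte // escapes: the compiler moves it to the heap
        return &b      // the allocation outlives this call
    }

    func consume(p *[64]byte) {
        p[0] = 1 // the allocation is still live here, far from where it was made
    }

    func main() {
        p := newBuffer()
        consume(p)
        // In a manually managed language someone must now decide where to
        // free p, which requires global reasoning about every function that
        // may still hold a pointer to it. In Go, the collector decides.
    }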

Automatic memory management aims to reduce the programmer's burden and increase software reliability. An automated memory system will try to reclaim allocated memory without the intervention of the programmer. For instance, the Java programming language does not require the programmer to release memory. Instead, the language and its accompanying runtime environment use GC to detect when an object is no longer needed. GC is a runtime operation which works by scanning the program's memory to identify allocated objects that can be reached from the live objects (the roots). Any allocated object that can be reached from a root must not be reclaimed, as this would remove an object that might be used later during the program's execution. However, unreachable objects will have no role to play in the rest of the program's execution and therefore can have their memory reclaimed by the garbage collector. This runtime analysis is not always an ideal solution, as it can increase a program's execution time.

Due to the additional cost of running a garbage collector, some people argue that this often non-deterministic task of memory management via

GC is not ideal for system level programming, including real-time systems. System level languages (e.g., C, C++) offer much programming power to the developer, at the cost of reduced program safety (e.g., they present the programmer with resource management tasks). These languages are commonly used for kernel development, driver writing, and other low-level computation tasks. Arguably, these are also the same uses that need the highest level of safety, including memory safety.

1.2 Our Goals and Solution

Surely we can do better than this. A pertinent question then is: Can we have a system level language that is both memory safe and efficient? In 2009, Google introduced the Go programming language, which aimed at being a memory safe system level programming language. Similar to Java, Go provides a garbage collected runtime environment. Thus, a large portion of the memory safety of Go arises from the fact that the language relieves the programmer of the burden of making decisions on the lifetime of memory allocations, which is handled by the collector. However, this comes at the price of a runtime-based solution for memory management.

But there is an alternative to GC, called RBMM. Instead of waiting until runtime to locate unreachable allocations to reclaim, an RBMM system performs a static analysis of the program's source code and inserts memory management operations during compilation. This approach avoids the overhead generated by the runtime memory scans that garbage collectors require. This thesis investigates RBMM as a solution for efficiently managing a program's memory while eliminating the faults induced by manual memory management. We strive to answer the following questions in the context of the Go programming language environment:

• How can we transform a Go program into one that supports RBMM without requiring any additional programmer annotations?

• What analyses do we need to implement RBMM?

• What runtime operations do we need to implement to support RBMM?

• Can we create an automatic memory management system that avoids the negatives of RBMM and GC while attaining the benefits of both systems?

• What is required to implement an RBMM system for a concurrent language?

This thesis contributes an investigation of incorporating a fully-automatic RBMM system into the existing Go language, one that is also capable of operating within a concurrent environment. For the reader not familiar with such topics, language and memory system background information is provided in Appendix A and Chapter 2 respectively. Our RBMM design for a first-order subset of Go is presented in Chapter 3. Chapter 4 provides a formal semantics to reason about the transformations that our RBMM design requires. Chapter 5 introduces our design for a combined RBMM and GC system, which does not have to perform a complete scan of all of the variables in the program. Chapter 6 investigates incorporating RBMM within a concurrent context. Chapter 7 provides a literature review of related work. Chapter 8 concludes with a discussion of future improvements.

Chapter 2

Memory Management

A clear conscience is the sure sign of a bad memory.

Mark Twain

Managing a system's use of memory is a common task in software engineering. Having a basic understanding of the various types of memory in a system is useful for understanding how a program can influence the system's performance via this managed resource. Since memory is a limited resource, properly managing it is important for developing safe and efficient programs. System crashes can occur if memory is not managed correctly. Improperly managing this resource can reduce the amount of memory available for other processes to use. This chapter introduces some basic concepts of memory management, as well as region-based memory management (RBMM).

2.1 Introduction

Memory is a limited resource on computing platforms. Even on modern systems with gigabytes of memory, poor performance can result from abusing these resources. While systems are increasing in computing power and gaining larger main memory stores (increasing random-access memory (RAM) size), more processes are being executed on these machines in parallel. With more processes being run and consuming additional memory, applications need to be written in a way that will not "hog" all of the system's memory. Inefficient use of memory can have a negative performance impact on other

simultaneously executing applications. In other words, the fact that a machine has more memory does not permit programmers to manage memory as if it were an unlimited resource. Application portability is another factor for programmers to consider when writing their programs. Having a single application that executes in multiple environments is very useful, since it only requires sharing an existing binary or recompiling the code to run on a different architecture. While one machine might have vast quantities of memory, a smaller, possibly embedded, device can have tighter memory constraints. Therefore, sensible use of resources is necessary so that both systems can execute the program without being severely constrained on memory.

2.2 Background

2.2.1 System Memory

A computer is composed of multiple types of memory: disk, system, cache, and registers. This section introduces these various types of memory stores.

External Memory

External memory is the slowest data store within a system. Traditionally this store was called “disk” memory; however, solid-state (no spinning disks) drives are becoming common on modern systems. Typically, data stored on an external drive are non-volatile. This means that the information is preserved until the user asks the operating system to remove it (or until a malfunction occurs). External memory is located furthest away from the CPU, such as on a hard drive or an externally connected store (e.g., USB drive). Not only is this memory the furthest from the CPU, but it is the slowest for accessing a particular piece of information. Given a hard disk, the drive must first move its read/write head to a specific location and then transfer that information across a data bus to the CPU. The amount of

time it takes for the drive to locate the data is called seek time. The benefits of an external drive are that it is non-volatile, typically the lowest in price, and the most abundant in capacity. Binaries, text files, and media all reside on these stores.

System Memory

When the operating system (OS) loads a program to execute, it must first locate the binary on the non-volatile external store. That program is then copied into system (also known as "main") memory. This memory is the RAM, which is volatile, solid-state, and is faster and closer to the CPU than external memory. Volatility means that the information stored is only temporary. Upon system reboot or power-down, the state of the memory is erased, and all of the information is lost. Solid-state is an important trait. Unlike a traditional hard drive, there is virtually no seek time since there is no read/write head that has to move in order to locate data. In contrast to external memory, RAM is more expensive and has less capacity. When a system becomes low on system memory, data can be exchanged with disk to make room for the currently executing program(s). This process is known as paging or swapping. Exchanging data between the solid-state RAM and disk (which can also be solid-state) is a slow process. Therefore, having memory-efficient programs is necessary for optimizing system performance. It is important to mention that while solid-state hard drives are becoming more common, they are still slower to access than RAM, due to data having to traverse a longer bus path to reach the CPU.

Caches

The next tier of memory is the cache. This memory is used to store recently accessed data, so that future accesses will be faster than having to go to RAM or disk. The cache is solid-state and is located on the CPU die, therefore it is faster for the CPU to access than the RAM. In contrast to the system memory, the cache is more costly and of smaller size.

Figure 2.1: Three-level cache

This thesis is not concerned with the details of caches; however, data locality influences system performance via cache behavior. Locality refers to how closely a piece of data is stored in relation to other data. A system will operate more efficiently if the data it is looking for lies within the system's cache. If the data is not located in the cache, then the RAM, and possibly disk, must be searched. This can be costly. The results of our memory management research do impact cache and overall system performance, therefore having a basic understanding of caching is useful when trying to make sense of our research.

Caches are typically implemented in multiple levels. In this section we will consider a three-level cache, which is common on modern Nehalem Intel CPUs; however, other architectures implement caches differently. Figure 2.1 illustrates this design.¹ The lowest level of cache, level 3, holds both data and instructions and is shared between CPU cores on a multi-core system. The level 2 cache is not shared between CPU cores and also contains both data and instructions. Even closer to the CPU is the level 1 cache, which is divided into separate instruction and data caches. The most recently accessed information is stored here. When the level 1 cache gets filled, the older/least-recently-used data is evicted and stored into a lower level cache. When a program requests data and that data already exists in cache, a

¹ The concept for this graphic was based on the image displayed in [1].

hit occurs. The system is most efficient when the cache is hot, whereby a majority of data requests can be fulfilled by the CPU quickly obtaining the requested information from cache. If the data is not in cache, a miss occurs. The CPU must then access the RAM for the data. If the data cannot be found there, it must be obtained from the external store.

Registers

When needed in computations, data must reside directly in the CPU registers. This temporary store is the fastest of the types discussed above, but is also the most limited in capacity. Since modern CPUs are capable of executing multiple programs in parallel, the register state, which pertains to a specific execution, must be swapped out (to the cache), allowing the CPU to restore the state of another concurrently executing program and perform computations on that program's behalf.

2.2.2 Virtual Memory

Virtual memory is an abstraction of the available memory that a process can have. The OS presents each process with a seemingly contiguous chunk of memory. In fact, it is often the case that the OS will present the process with more memory than the system memory actually contains. From the perspective of the process, the memory is contiguous. However, the reality is that the OS might have presented the process with chunks from non-contiguous address spaces, but the process does not know any better. Virtual memory works because of virtual addressing. All processes execute within the same contiguous virtual address range. The address range that is seen by each process is not unique, and is the same for all processes. The OS is responsible for mapping the process's virtual addresses to physical/logical system memory addresses. The OS is also responsible for the safety/partitioning of memory, to prevent processes from accessing physical addresses that belong to other processes. Virtual memory will be discussed in more detail in Section 2.2.3.

Paging

As previously mentioned, a process operates within a virtual address space. What happens if a process tries to access an address or needs more memory than the system can provide? In this case, the OS is still responsible for giving the process more memory. Since the external memory can be used as a giant temporary store, the OS can obtain additional memory from a swap file or partition located there. Data on the external store can be swapped with system memory to provide additional system capacity and allow the process to continue its execution. A page fault occurs when a page of memory is not located in system memory and the OS has to obtain the data from the external store. Page faults are expensive, since retrieval and copying to/from external memory is slow (especially if the external store is a disk drive).

2.2.3 Process Memory Layout

A program exists as an executable file on a computer’s external store. Before program execution can begin, that file must be copied from the store into the system memory. It is the responsibility of the OS’s program loader to map an executable file from the external store into the system memory. Before we look at how memory is requested by the programmer, it is useful to envision how a program looks at runtime after it has been loaded into memory.

Process Virtual Memory

On modern machines, each program is given its own portion of system memory to execute within. This memory contains the program's instructions and variables, as well as a segment of memory from which dynamic allocations can be produced. The latter segment is known as the heap. On a Linux system, all processes operate within a virtual address space. This range is identical for all processes, so that each concurrently running process will appear to utilize the same address range. This range is architecture specific,

but for x86 CPUs running Linux within a 32-bit address space, each process is given a virtual address space of 2³² − 1 bytes (or 4 GiB) [83]. These addresses are virtual. Even though the addresses might be the same for each process, the underlying OS memory addresses are completely different and do not overlap. The OS is responsible for translating between a process's virtual address and a physical address in the system. Virtual addressing allows all processes to operate on a seemingly contiguous (linear) span of memory, even though that might not actually be the case. The physical system memory is divided into pages. When a process needs additional memory, any available page is mapped into the needy process. While the physical memory might be out-of-order, or even fragmented, the process which operates on virtual addresses does not witness this fragmentation. In fact, the system might not even have the total amount of memory that is represented by the virtual address space. As more system memory becomes available, which can occur when processes terminate, that memory can be reused for other processes. External memory can be used for paging, which provides additional storage when the system memory becomes low. When paging occurs, portions of the main memory are saved/copied to the external store for later use. The addresses of the "paged" data in system memory can be reused to store new data. Paging from system to external memories can be slow, since the data has to be copied to a slower store located further from the CPU.

Figure 2.2 illustrates what a process looks like in main memory on a Linux system. It shows the various memory segments that comprise the virtual address space of a single process running in a Linux environment. All concurrently running processes look similar, and their address space has the same range of virtual addresses; however, the sizes of the individual segments might vary. The following describes the memory segments of a process, beginning at the text segment and working towards the higher addresses.

text segment: This segment contains read-only object code which represents the instructions for the program to execute.

Figure 2.2: Linux process in memory [24]

data segment: This segment contains the globally declared and defined variables.

bss segment: The block started by symbol (bss) segment contains all static global and local variables declared in the program. Their initial values are all zeroed at the time program execution begins.

heap segment: This segment grows toward the stack (towards the higher memory addresses) and is where dynamically allocated data are kept. Typically a pointer to this data is the result when the programmer asks the operating system for memory at runtime (e.g., malloc).

memory mapping segment: This segment contains memory that can be used for creating a custom memory allocator, or can be used to map

files from disk into system memory. The latter use reduces disk reads when accessing data from disk, since that portion of the disk is copied into system memory.

stack: On an x86 architecture the stack grows downwards. That is, as items are pushed onto the top of the stack, the stack will extend towards the lower memory addresses of the process, approaching the heap. Data for local variables and formal parameters are stored here. Data stored on the stack are temporary and last for the duration of the function call that is being executed.

kernel space: This is a protected memory segment used by the OS to map in code and data for use only by the kernel. For the 32-bit x86 Linux kernel, this space actually contains a copy of the entire kernel, which improves cache performance [83].

2.2.4 Stack Data and Stack Frames

Many portions of a program utilize an amount of memory that can be determined statically at compile time. Variable declarations are a programmer's way of expressing to the compiler that the program needs a specific amount of memory. Variables declared within a function are often termed automatic, or local, meaning that they are only accessible while that particular function is being executed. The formal parameters of a function are also included in this set of variables. When a compiler processes a function, it can calculate the number and sizes of the variables that make up that function. Therefore, the compiler knows how much memory will be needed by the system to execute any function. The memory for these variables will be produced from the stack segment of the process. When a program begins execution, the memory needed for local variables in each function must be obtained from somewhere. When a function is called, the stack segment associated with the process is expanded to hold enough memory for the locally allocated variables and formal parameters

of the function. This chunk of space is known as the activation record, or stack frame, for the call. Hence, the stack segment of a process consists of a sequence of different sized frames. When a function calls another function, the stack frame for the callee will be added on top of the stack. As functions are called, the stack will grow. The stack frame for the currently executing function will always be at the top of the process's stack. When a function completes, the stack will shrink, effectively popping the frame for that function (its local variables and formal parameters) from the stack. Execution will then resume on the lower frame (the caller). This lower frame becomes the new top-most frame.

2.2.5 Dynamic Memory

When writing a program, it is common for the programmer not to know how much memory will be needed to accomplish a specific task. In these cases, memory can be requested from the OS via an allocation call, such as new or make in Go, or malloc in C. The memory returned from such allocation calls is produced from the heap segment² in the process's memory space. Memory allocated from this segment is not reclaimed once the function terminates, and can live across multiple function calls and returns. The memory may live throughout the lifetime of the program, although this is often the sign of a memory leak (memory that is allocated but never reclaimed, even after its last use). Suppose a program takes an arbitrary integer value as input, and that the program makes use of this value by creating a list structure of data based on the value supplied. In this case a programmer does not know the amount of memory needed to construct the list. Instead, the programmer must write code to create the list dynamically at runtime. Consider the following piece of Go code:

² Depending on how a memory allocator is designed, the memory might come from the memory mapped segment instead of the heap. We refer to both the heap and memory mapped segments interchangeably as segments which can be used to produce dynamically allocated memory.

    type Node struct {
        id   int
        next *Node
    }

    func buildList(num int) *Node {
        var head *Node
        for i := 0; i < num; i++ {
            n := new(Node) // allocate a new node dynamically
            n.id = i
            n.next = head  // prepend it to the front of the list
            head = n
        }
        return head
    }

The routine above creates a linked list of an arbitrary length based on some input value num. While the use of a slice in Go would be more appropriate here, the example above illustrates a common way in other languages to create a linked-list structure of an unknown length. In languages that do not manage the reclamation of memory automatically, such as C, the memory for each allocation call would have to be explicitly reclaimed once the list is no longer needed. If the memory were never reclaimed it would cause a leak and the program could later run out of memory. In this case Go's garbage collector is capable of reclaiming this memory after the function returns, if head is not reachable from any variable in the program at the time a collection takes place. If a reachable variable, even if it is unused, points to head then the collector cannot reclaim the list's memory. When analyzing the lifetime of objects associated with an allocation, it is important to consider that such allocations can occur from separate program libraries or modules (object files). To obtain a fully accurate picture of the lifetime of an allocated object, a whole program analysis must be performed.

Often, the source code for a particular library is not available, therefore a compiler cannot perform a complete lifetime analysis. What this means is that a programmer who is responsible for memory reclamation must also be aware of any allocated memory that results from calling external libraries. In such a case, a compiler cannot produce warnings, as it does not have a complete representation of the program. A complete representation would require some metadata or source code for any of the external functions being called. Dealing with external libraries complicates the programmer's job of writing resource-safe applications.
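To make the reachability point concrete, consider this minimal sketch (ours; it reuses the Node and buildList definitions above, and the package-level variable keep is hypothetical):

    var keep *Node // a package-level root, reachable for the whole program

    func demo() {
        head := buildList(100)
        keep = head
        // Even after the last real use of head, all 100 nodes remain
        // reachable from keep, so the garbage collector must retain them
        // until keep is overwritten or set to nil.
    }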

2.2.6 Dynamic Allocation Problems

The manual management of memory is a complicated task for programmers. The allocated memory can be reachable across multiple function calls, program modules, and libraries. Thus the programmer must be aware of when the memory was allocated, where it was allocated from, and when it can safely be reclaimed. Even if the language automatically reclaims memory, the programmer should still be aware of other problems that dynamic memory allocation can cause, such as leaking memory, dangling pointers, and referencing invalid or out-of-bounds memory.

Fragmentation and Locality

A memory allocator must return a contiguous chunk of memory to the process. This is because the compiler, when generating code, will assume that each individual object or array is placed in a contiguous memory space. If not, the data for a single object (or array) would be distributed throughout the process and access to an element or member of an object would not produce the proper value. Consider the case of a 32-bit integer that is split between two disjoint (non-contiguous) memory blocks. A reference to the integer would only produce the first half of the integer and not the second half.

During program execution, subsequent allocation and deallocation of memory can produce "gaps" within the memory space of the process. These gaps represent freely-available blocks of memory that can be recycled by the memory allocator and given back to the program to fulfill later allocation requests. Fragmentation is a property of the underlying allocation routine (e.g., malloc) and can result from the process returning unused memory in a different order from when it was allocated. Similarly, a fragmented memory space can arise from the program requesting a lot of memory. For instance, a highly fragmented process will have non-contiguous gaps of freed memory within its memory space. If a subsequent allocation is issued and none of those free gaps are of adequate size (and contiguous) to fulfill the request, then the allocator must request more memory from the OS.

A contiguous memory space is ideal for system performance. For instance, an optimal memory layout would have complex objects (a structure and all of its fields) located within close memory proximity. Optimal locality reduces the likelihood of the system experiencing cache misses, or even page faults, when the program references an address. Optimal locality can also mean that a process is using its memory space efficiently. The latter would avoid the need for the OS to obtain additional memory to fulfill an allocation request for a contiguous block of memory, even if the sum of the (non-contiguous) free blocks could otherwise have met the request.

Memory Leaks

Memory leaks occur when a process requests memory and never returns it to the system, even after the memory is no longer needed. Leaks drain a system of its available resources, which can impact the performance of other concurrently executing programs. Leaks commonly occur in languages which require the programmer to manage the reclamation of allocated memory manually. If the memory is never reclaimed, then it will be unavailable for other uses. Complicating matters is the fact that allocated memory can escape the

function it was allocated in. Variables that point to allocated memory can be passed as data to other functions, therefore it is not easy for the programmer to determine the last use of an allocated object for reclamation purposes. System memory is a global resource used by all executing programs. Even though OSs can prevent the sharing of memory between processes, the memory itself is still a limited resource that the OS is responsible for distributing to all processes. Any excessive use of memory by the programmer can greatly impact the system as a whole. Preventing memory leaks is necessary for maintaining a stable system.
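Leaks of a kind are possible even under garbage collection: memory that remains reachable cannot be reclaimed, no matter how long ago it was last used. The following minimal Go sketch (ours; the names cache and handleRequest are hypothetical) illustrates this retention-style leak:

    var cache [][]byte // package-level, so reachable for the program's lifetime

    func handleRequest(n int) {
        buf := make([]byte, n)
        // ... use buf ...
        cache = append(cache, buf) // buf stays reachable forever: a leak
    }

    func leak() {
        for i := 0; i < 1000000; i++ {
            handleRequest(1024)
        }
        // Roughly 1 GiB is now pinned by cache, even though none of it
        // will ever be read again.
    }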

Invalid Memory Access

Invalid memory access occurs when the program attempts to make use of information stored at an address in memory that is out of bounds to the process, or if the program interprets data at an address incorrectly. These cases occur in languages that permit pointer variables, such as Go. These are well-known bugs, often resulting from poor programming, such as attempting to access array data outside the bounds of the array. The following Go example illustrates this case.

    func foo() {
        var values [100]int
        println(values[1000])
    }

There are two problems here. First, the address containing the value at index 1000 of the values array might be in a different memory segment or even outside of the memory allocated for the process. The second problem is that reading such data is incorrect. In this example, that value will be interpreted as an integer, although it could be anything. Further, that value does not belong to the values array. The Go language attempts to reduce these invalid accesses by implementing bounds checking for arrays and slices. In this case, the compiler

will statically check whether 1000 is larger than the defined array size of 100. The compiler will issue an error and not produce a faulty executable. Go also performs runtime bounds checks for dynamically sized arrays (i.e., slices), and if an access violation occurs, the program safely terminates; a small sketch of this appears below.

Pointer arithmetic is also dangerous, since it makes creating invalid memory accesses all too easy for the programmer. Go does not permit pointer arithmetic, as adding/subtracting offsets to a base address can result in an invalid memory access.
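Returning to the runtime bounds check on slices mentioned above, here is a minimal sketch (ours, not from the original text):

    func bar() {
        values := make([]int, 100) // a slice: its length is a runtime property
        i := 1000
        println(values[i]) // panics at runtime: index out of range
    }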

Dangling Pointers

Dangling pointers occur when a pointer variable or a value in a data structure contains the address of dynamically allocated data that has been reclaimed. In such a case the pointer might contain the address of another object (possibly of a different data type) or a bit-pattern that is not an address (e.g., a value contained in a primitive). If the pointer variable is never updated to reflect this change, it will point to memory that is no longer relevant. The OS will issue a segmentation fault when a variable containing an address located outside the process or within a protected memory segment is dereferenced. If the fault is not handled properly the program can prematurely terminate. A dangling pointer is only problematic if it is dereferenced. Dangling pointers are possible in languages that require the programmer to manually manage memory and also in languages that do not guarantee initialized allocated memory. Languages that automatically manage memory reclamation (e.g., garbage collected and some region-based memory management languages) do not suffer from this problem; however, the language and its runtime system must be carefully designed to prevent these cases [26]. Since automated memory management systems ensure that all reachable variables can only reference other reachable data, they are immune from the problems caused by dangling pointers. The following example, in the manually managed C language, illustrates

the access of a dangling pointer:

    void update(Node *n) {
        /* Make use of n ... */
        free(n);
    }

    void make_node(void) {
        Node *n = malloc(sizeof(Node));
        update(n);
        n->id = 42; /* Problem */
    }

In this example the variable n is allocated enough memory to represent a single instance of a Node. n is not the object itself, but a variable that points to a Node instance in the heap. The update function uses n and then calls free to release the memory referenced by n back to the memory allocator. n then becomes a dangling pointer and any access to data via n is a memory violation.

While languages with automatic memory management are without the problems of dangling pointers, they are not devoid of the problems caused by nil value references. Some languages (such as Go) have a nil value which can be used to set/initialize pointer variables to a value of nothing (they point to the address 0). Since these pointers point to nothing, reading or writing through nil is a memory violation. Setting a pointer to nil is also a hint to the garbage collector that the data previously pointed to may no longer be accessible to the program. If nothing points to an allocated object, then the memory for that object is no longer needed and can be reused for later allocations.

The following Go example illustrates the access of a nil pointer:

    func setDefaultID(node *Node) {
        node.id = 42 /* Problem */
    }

    func main() {
        n := new(Node) // Create a pointer to a Node object
        n = nil
        setDefaultID(n)
    }

The Go language does not have an explicit memory reclamation operation, as it is a garbage collected language; however, programmers can influence the results of GC by "zeroing" or "nulling" a pointer variable. In the example above, n is passed to a function that updates the Node instance. However, n was "nulled" and thus accessing the id field will be accessing an invalid address. When execution reaches the body of setDefaultID, node is nil (0). This function will try to reference the id field, which is just an offset from the base of node. In this case, the access of the id field from an address of 0 will result in a segmentation fault.

2.3 Automatic Memory Management

The problems discussed above strengthen the argument that memory management can be a complicated task for programmers to accomplish. With that thought in mind, there are a few solutions that aim to remove the need to explicitly manage memory from the programmer's role, and consequently reduce the probability of creating bugs. We now discuss solutions to the memory management problems presented earlier.

2.3.1 Region-Based Memory Management

Region-based memory management is a combination of static analysis and runtime instrumentation. The seminal research in automated RBMM (involving compiler-driven region inference) is typically associated with work done in the 1990s by Mads Tofte and Jean-Pierre Talpin [78, 79]. Through a process of region inference, a compiler can determine how long objects live and which objects have similar lifetimes. The compiler can then group related objects together such that all of their memory is allocated from the same (often contiguous) chunk of memory. This group of objects is said to be allocated from the same region. Since these objects live together, they can also die together. Based on a "region inference" static analysis, the compiler can transform the program by inserting function calls to region operations. These operations are responsible for managing the creation and removal of regions, and the allocation of data from regions. Region operations form the basis of an RBMM runtime system. The following is a list of region operations that a typical RBMM system implements in its accompanying runtime system; a sketch of how a transformed program might use them follows the list.

CreateRegion() Create an empty region from which data structures can be allocated.

AllocateFromRegion(r, n) Allocate n bytes from region r.

RemoveRegion(r) Reclaim all of the memory from region r so that it can be reused later.
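As an illustration only (this toy runtime is ours, not the implementation described later in this thesis), here is roughly how the buildList example from Section 2.2.5 might look after a region transformation, with the region threaded through as an extra parameter:

    type Region struct {
        nodes []*Node // a toy arena recording every object allocated from it
    }

    func CreateRegion() *Region { return &Region{} }

    func AllocateNodeFromRegion(r *Region) *Node {
        n := &Node{}
        r.nodes = append(r.nodes, n) // the object lives and dies with its region
        return n
    }

    func RemoveRegion(r *Region) {
        r.nodes = nil // drop every object in the region at once
    }

    func buildList(r *Region, num int) *Node { // region parameter added by the compiler
        var head *Node
        for i := 0; i < num; i++ {
            n := AllocateNodeFromRegion(r)
            n.id = i
            n.next = head
            head = n
        }
        return head
    }

    func useList() {
        r := CreateRegion() // inserted by the compiler
        list := buildList(r, 10)
        println(list.id)    // ... last use of the list ...
        RemoveRegion(r)     // inserted after the region's last use
    }

In a real RBMM runtime the allocation primitive is type-agnostic, handing out n bytes of the region's memory as in AllocateFromRegion(r, n) above, and RemoveRegion returns the region's memory for reuse; in this GC-managed sketch, dropping the references is the closest safe analogue.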

RBMM Benefits

One benefit of RBMM is that it can significantly reduce the execution time needed to reclaim the memory associated with objects. The region containing the memory for all of the related objects can be reclaimed all at once. This can be much faster than visiting each object and reclaiming its memory individually.

Another benefit of RBMM is that it can enhance cache locality. Since a relation is established at compile-time, which determines what objects belong to which regions, an access to any one of the region's objects can indirectly cause the cache to page in the other objects from that region. This means that the related objects, which also might be accessed at a similar time, are likely to already be in the fast system cache, and will not have to be retrieved from RAM or disk.

RBMM Limitations

While these benefits make RBMM sound like a great choice for languages to utilize, RBMM is not without its drawbacks. Since a region is reclaimed all at once, all of the objects in that region must no longer be needed (e.g., no other reachable objects reference the objects in the region that is to be reclaimed). This constraint can create the situation where a majority of a region's objects are no longer needed, but a few (or even one) remain reachable. The RBMM runtime system cannot remove this region until all of its objects are no longer used. This is what we call region bloat. An ideal RBMM system will avoid this situation as best it can, to lower the memory pressure of the system. RBMM is based on a static analysis, therefore there are times when this analysis cannot distinguish the lifetimes of all items. This limitation can result in most of the memory allocated by the program being allocated from a single giant region, which cannot be released until the end of the program's execution. In such cases RBMM does not reduce the program's memory requirement at all. Figure 2.4 illustrates this kind of memory leak. Each kind of behavior is observed in practice [69]. RBMM systems can also yield larger binaries, due to the inclusion of primitive region operations. Besides the time needed by these calls, the increased size can affect instruction cache performance. Lifetime analysis poses another challenge for RBMM. For instance, if a dynamically created item is reachable from a global variable, the compiler

cannot determine the lifetime of that item, and must therefore conservatively assume that its lifetime is the lifetime of the program. Global variables can therefore create memory leaks. Another problem is that in the quest to avoid having too many small regions, the RBMM system may put into the same region items that in fact have different lifetimes. The problem is that while some items in a region are reachable, no part of the memory of that region may be reclaimed. This means that an RBMM system cannot reclaim the memory of a dead item in a live region, which results in region bloat as discussed above. This problem does not exist in GC systems.

Automatic and Manual RBMM

RBMM can be implemented as either a manual or a fully-automatic system. In the former, the programmer is responsible for inserting the region operations, which reduces the need for a static analysis. In addition, manual systems require the programmer to determine which objects are allocated from which regions. Manual systems are similar to traditional manual memory management. Such systems add complexity for the programmer, who must be aware of when groups of objects are no longer needed and when no other objects refer to objects in a reclaimed region. In automatic RBMM systems, which are the primary focus of this thesis, the compiler determines all of the object relationships, safely approximates lifetimes, and transforms the program accordingly. Berger, Zorn, and McKinley [7] provide a thorough investigation into the world of manual memory management, including regions, in their 2002 research. While a fully-automated RBMM system requires no programmer intervention, it may favor certain programming styles. A programmer's style can influence the compiler to make certain decisions about how regions are managed [79]. For instance, a programmer can write a program such that global variables point to other variables within the program. Global variables can complicate static analysis. For instance, global variables have a

lifetime that lasts the duration of program execution. Therefore, objects that are pointed to by a global variable must also live for the lifetime of the program, or until that variable no longer points to them. Any regions that those objects belong to must also remain alive, causing region bloat. Yet a fully-automated system saves the programmer from having to remember when to reclaim objects.

Region Allocation Strategies

When designing an RBMM system, a key decision that has to be made is how the runtime system manages the set of all regions. Certain RBMM systems are implemented in ways that permit pointers between regions. While this can reduce the overall size of a region, and potentially lead to faster region reclamation, care must be taken to prevent dangling pointers [26]. For instance, if a region is reclaimed while live data references objects from that region, then those references become dangling pointers and refer to invalid memory. A region cannot be reclaimed until none of its objects are pointed to by objects from external regions. This constraint imposes a lifetime on the regions. One solution to such extra-region pointers is to allocate regions as a stack [78, 63, 12]. In the stack-of-regions approach, regions are created and pushed onto a stack. The top-most region on this stack is the youngest, and the oldest is at the bottom. This system can impose a one-way direction of pointers: pointers can only point to objects within the same or older regions [58]. In such a system, a region is reclaimed as a stack pop operation, which ensures that a region will never contain dangling pointers [26] (see the sketch below). Another approach to region management is to nest regions in a tree hierarchy. Regions at a higher level, the parents, cannot be reclaimed until their child regions have been reclaimed. This solution has been utilized for RBMM systems that function in a concurrent environment, whereby the parent node holds a concurrency-lock on its children [30]. Access to a child region can only occur if the parent has unlocked the child.
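The stack discipline itself is simple to express. The sketch below reuses the illustrative Region type from the earlier sketch; the push/pop method names are ours, chosen only to show how last-in-first-out reclamation falls out of the representation.

// A stack of regions: pointers may only lead from younger regions into
// the same or older regions, so popping the youngest region can never
// create a dangling pointer.
type RegionStack struct {
    regions []*Region
}

// PushRegion creates a region younger than every existing one.
func (s *RegionStack) PushRegion() *Region {
    r := CreateRegion()
    s.regions = append(s.regions, r)
    return r
}

// PopRegion reclaims the youngest region; older regions are untouched.
func (s *RegionStack) PopRegion() {
    n := len(s.regions)
    RemoveRegion(s.regions[n-1])
    s.regions = s.regions[:n-1]
}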

The solution we present in Chapter 3 is based on the data types of allocations. However, our analysis differs from Tofte and Talpin's in that our region inference is based on a points-to relationship between objects. Their system is based on a type and effect system over expressions, which determines region inference. Their design results in a stack of regions, whereas our solution generates regions that contain all of the objects that can access each other.

2.3.2 Garbage Collection

A common system of automatic memory management is garbage collection (GC) [51]. GC is primarily a runtime operation which traces all reachable pointers and reclaims the memory of objects that cannot be reached. These systems incur more runtime expense than RBMM systems since their analysis is performed during program execution. In contrast to RBMM, GC systems do not suffer the problems of objects that do not have statically-decided lifetimes (e.g., objects referenced by global variables). The concept of GC can be attributed to John McCarthy who, in the 1950s, discussed its use within the context of the LISP programming language [60].

Root Set

A garbage collector begins its operation by scanning a set of addresses which are reachable in memory. The starting point, the root set, consists of stack variables, registers, and global variables. At any time during program execution, these data are accessible. When a collection cycle begins, the garbage collector transitively follows non-nil pointers starting from the roots, stopping along each path when it reaches a nil pointer or an already-visited object. Any objects not visited by the collector cannot be reached by any live variables in the program; therefore, the memory associated with those non-reachable objects can be reclaimed.
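The traversal can be pictured with a deliberately simplified object model (ours, not Go's actual runtime representation): every object carries a mark bit and a set of outgoing pointers, and collection marks everything transitively reachable from the roots.

// Object is a simplified heap item: a mark bit plus outgoing pointers.
type Object struct {
    marked bool
    refs   []*Object // pointers to other heap objects (may be nil)
}

// Mark transitively follows pointers from the root set, marking every
// object it visits; unvisited objects are unreachable and reclaimable.
func Mark(roots []*Object) {
    for _, r := range roots {
        markFrom(r)
    }
}

func markFrom(o *Object) {
    if o == nil || o.marked {
        return // nil pointer, or this object was already visited
    }
    o.marked = true
    for _, child := range o.refs {
        markFrom(child)
    }
}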

Non-Moving and Moving Collectors

There are two primary approaches to reclaiming items that are determined to be garbage. A non-moving collector passes over all garbage items, linking them together in a freelist. Future allocations are then taken from this list. In contrast, a moving collector consolidates all the non-garbage items, typically into a small number of contiguous regions; the remaining memory is then free to be allocated later. Moving collectors have several advantages. First, they allow memory to be quickly allocated by simply advancing a pointer. Second, they give greater locality. Third, they naturally defragment free memory as they consolidate in-use items; non-moving collectors must explicitly do this as a separate operation. Finally, the time taken to consolidate non-garbage items is proportional to the amount of non-garbage, while the time to link together the garbage items is proportional to the amount of garbage. In a well-designed GC system, the amount of non-garbage will usually be small compared to the amount of garbage. A copying garbage collector can be viewed as a moving collector. Copying collectors divide the memory of the process into spaces called semispaces [27]. We consider two spaces for the discussion here, but other collectors, such as generational ones, can divide the memory further. These semispaces are called the to and from spaces [5]. As a program executes, all of the allocations produce memory from the from space. During a GC, the allocated items reachable from a root node are copied from the from space into the to space. Once the collection is complete, the to space becomes the new from space. Any item not copied during the collection cycle must have been unreachable from a root node, and therefore the item's memory will be recycled for future allocations.
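A toy version of the copying step, again using an invented object representation, shows the role of forwarding pointers: once an item is evacuated to the to space, its forwarding pointer ensures every remaining reference is redirected to the single new copy. (A real Cheney-style collector would scan the to space iteratively rather than recurse, as here.)

// Item is a simplified from-space object. forward is non-nil once the
// item has been copied into to-space.
type Item struct {
    forward *Item
    refs    []*Item
}

// evacuate copies o into to-space and returns the new copy, reusing the
// forwarding pointer if o has already been moved (this also handles cycles).
func evacuate(o *Item, toSpace *[]*Item) *Item {
    if o == nil {
        return nil
    }
    if o.forward != nil {
        return o.forward // already evacuated: reuse the single new copy
    }
    c := &Item{refs: make([]*Item, len(o.refs))}
    o.forward = c
    *toSpace = append(*toSpace, c)
    for i, r := range o.refs {
        c.refs[i] = evacuate(r, toSpace) // references now point into to-space
    }
    return c
}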

Conservative and Type-Accurate Collectors

As it scans memory items, the GC system must determine which values are pointers to memory items and which are something else (such as primitive

values). A conservative collector makes this decision by looking at the value. Since a bit pattern that represents an address in a part of the heap managed by the GC system could be a pointer, conservative collectors treat it as a pointer, and keep alive the item it points to, even if the value is actually an integer or other primitive. A type-accurate collector maintains type information about every variable and every structure type, and uses this to decide which values are pointers. Conservative collectors are generally simpler, since they do not need to consider types. Therefore, they do not need the cooperation of a compiler to provide type information. They are also applicable to weakly typed languages, such as C, in which (due to typecasts) values present at runtime may not reflect the declared types of the variables holding those values. However, conservative collectors can mistake integers or other non-pointer values for pointers. Such mistakes can accidentally preserve an item, and all the other items reachable from it, which may collectively represent a large amount of memory. Since these collectors cannot be certain whether a value (bit pattern) they treat as a pointer actually is a pointer, they cannot update the value, which means that they cannot be moving collectors.

Stop-the-World and Concurrent Collectors

One issue with GC is that a collector must not mutate values that are concurrently being read from or written to by the mutator (the non-garbage-collector portion of the program). This same issue complicates concurrent programming. The simplest solution is for the collector to pause program execution (stop-the-world), including all of the process' threads that might be concurrently executing, and then perform the actual collection. This method can considerably slow down program execution, as all mutator threads must wait until the collector has finished before they can resume execution. Parallel and concurrent collectors are designed to avoid halting the mutator for significant amounts of time. Parallel collectors work by scanning the mutator's memory in multiple threads, effectively distributing the

processing of the GC task. In contrast, concurrent collectors perform their work while the mutator is simultaneously executing. If the GC can guarantee that certain data will never be written to (and possibly never read) by the mutator during a collection cycle, then it can safely process that data. Concurrent collectors which do not modify the data of objects (non-moving collectors) can permit the mutator to read those objects [51]. However, write access to an object must be locked to prevent the mutator from overwriting data written by the garbage collector, or vice versa. In a multi-threaded program, where a thread only has access to its own data (e.g., thread-local storage) that is not being written to by another thread, the memory associated with that single thread can be collected while the rest of the program concurrently continues to execute. Concurrent collectors can also be implemented in a way that permits the GC to use multiple threads in parallel (even if the GC is implemented as stop-the-world), for instance, as a concurrent mark-sweep style collector [22, 53]. Traditionally, a mark-sweep collector passes over memory twice during a single collection cycle. The first pass locates and marks all reachable allocated objects in the program. The second pass then scans the entire memory area, reclaiming the data of all objects that were never marked. The mark and sweep passes can be parallelized such that separate threads mark and collect different portions of the memory space.

Reference Counting

Reference counting is a technique used by some GC algorithms to determine when an object is no longer reachable. Each object has a counter; each time a new reference to the object is created during execution (e.g., by assignment), the counter is incremented, and each time a reference is removed, the counter is decremented. When the count reaches zero, the object is no longer reachable and its memory can be reclaimed. It must be noted that reference counting is not exclusive to GC. For instance, Gay and Aiken implemented their RBMM system using reference counters. They found that maintaining

a counter can have a large performance impact on certain applications [28].
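The mechanism is small enough to sketch directly; the Retain/Release names below are illustrative conventions of ours, not Gay and Aiken's API.

// Counted pairs an allocation with the number of live references to it.
type Counted struct {
    count int
    data  []byte
}

// Retain records that a new reference to o now exists.
func Retain(o *Counted) { o.count++ }

// Release removes one reference; when the count hits zero the object is
// unreachable and its memory can be reclaimed immediately.
func Release(o *Counted) {
    o.count--
    if o.count == 0 {
        o.data = nil // reclaim the object's memory
    }
}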

Generational Collecting

In contrast to reference counting, which introduces the overhead of incrementing and decrementing a counter associated with each allocated object whenever it is referenced, a generational collector groups objects based on how many collections they survive. Memory associated with younger objects is scanned for garbage more frequently than that of older objects. When an object survives a number of collection cycles, it is promoted to an older generation, which is scanned less frequently. The copying performed during promotion compacts the memory space and enhances data locality.

2.3.3 Memory Footprints

Figures 2.3 and 2.4 compare the memory footprint of a hypothetical program using GC and RBMM. Figure 2.3 illustrates the difference in memory occupancy, assuming RBMM is based on an ideal program analysis. In this ideal case, the region inference analysis has produced transformations that allocate memory items into regions of sufficient size as to not generate region bloat, while also reclaiming memory at the earliest possible time. In contrast, Figure 2.4 shows that a more conservative RBMM analysis can result in a worst-case scenario for regions, whereby a region becomes increasingly large and acts as a memory leak. Our motivation is to compare the performance (both space and time) of Go programs using the existing Go GC system versus our RBMM system. An important research question is whether it is possible to achieve a more predictable and consistent footprint, as presented in Figure 2.3, for most programs.

These figures are based on performance results in [69]. Since Go has structured data types (objects), we generically refer to all values resulting from an allocation as an item. An item can be a pointer to a primitive (e.g., var p *int) or a complex structure/object/aggregate.

Figure 2.3: Memory occupancy (memory vs. time) of a hypothetical process with ideal program analysis, comparing garbage collection with regions.

Figure 2.4: Memory occupancy (memory vs. time) of a hypothetical process with imprecise program analysis, comparing garbage collection with regions.

2.3.4 Language Influences on Automatic Memory Management

Automatic memory management offers the benefits of safety and convenience to the programmer by reducing their need to manage a program's memory resources. For such an enhancement to be practical for common programming languages to adopt, the performance of the automated management features must not be significantly worse than that of a manually managed system. In addition, the design of a programming language can greatly influence the effectiveness of an automated memory system. Different languages pose different challenges for RBMM; for example, logic programming languages require RBMM to work in the presence of backtracking. A prominent feature of Go is its "go-routines": independently scheduled threads. Go-routines may share memory, and they may communicate via named channels, à la Hoare's Communicating Sequential Processes (CSP) [49]. We defer our discussion of implementing RBMM in the presence

of Go's concurrency features until Chapter 6; up until that point we focus on a sequential fragment of Go. The sequential part is essentially a safer C extended with many modern features, including higher-order functions, interface types, a map type, array slices, and a novel escape mechanism in the form of "deferred" functions. These features are described in Appendix A.

Typing Influence

How a language exposes data typing to the programmer can influence the overall effectiveness of a system's management of memory. Data typing is a means for the programmer to tell the compiler what type a specific variable is (e.g., int, char), by providing declarations of program variables. In garbage collected languages, this can have an important influence on how a particular object is collected. The type tells the collector the shape of the object, so at collection-time the collector knows which fields of the object are pointers and which are not (e.g., integers). Consider a weakly typed language, such as C. Such a language permits opaque pointers: a pointer to a specific blob of memory which can be cast into an object of another type. This makes handling GC tricky, as the GC is not always aware of the shape of an object and thus cannot traverse it properly for collection. While C is not typically garbage collected, there do exist collectors which provide such capabilities [8]. Go is strongly typed; the compiler knows the type of each variable. Since Go is garbage collected, this typing property can prove beneficial in creating a precise (type-accurate) collector. Go does permit type conversion, where values of one type can be converted to another type if the two have the same underlying type. The details of this feature are not necessary for this discussion, but a simple example is converting between two numerical types (e.g., an integer being represented as a float64). Go also permits empty interfaces, which allow a function to accept a value of any data type as an argument. The runtime system can handle this properly because additional metadata is passed to the function, telling it what type is assigned to each argument.
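The two features just mentioned look like this in ordinary Go (a small, self-contained example of ours):

package main

import "fmt"

// Describe accepts a value of any type via the empty interface. The
// runtime passes type metadata alongside the value, so the concrete
// type can be recovered with a type switch.
func Describe(v interface{}) {
    switch x := v.(type) {
    case int:
        fmt.Println("an int:", x)
    case float64:
        fmt.Println("a float64:", x)
    default:
        fmt.Println("some other type:", x)
    }
}

func main() {
    i := 3
    f := float64(i) // type conversion between two numeric types
    Describe(i)
    Describe(f)
}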

Aside from GC, typing also provides the information necessary for the compiler to inform the programmer of any type-related errors detected at compile-time. Such errors arise when the program tries to assign values to variables of non-equivalent types, or tries to pass arguments to a function that do not match the function definition. Languages that are strongly typed can prevent memory access errors. Since an object has a known type, the compiler can issue a warning or error if the object is being accessed in a manner that is not sound. Such a case arises if the program tries to access a field of an object that does not exist. Typing also influences the lifetime of a variable. Pointers can escape functions and thus permit dynamically allocated memory to be passed around and exist for an amount of time that cannot be known at compile time. The common primitive types, such as integers, have a known size, and are typically copied between function calls. Go is pass-by-value; all data passed between function calls is copied, including arrays and their contents. Pointer types only have their addresses copied, and can contain addresses of objects containing additional pointers. Since Go's new allocator returns an address, these pointers might point to dynamically allocated data.

Object Lifetime Influence

An object's lifetime specifies how long its contents can be used. The lifetime of an object extends from the first to the last program point where its content can be used. The definition of object lifetime differs from that of variable lifetime. A variable is considered live if there is a program path along which the variable will be accessed. Liveness is a key property that will be discussed in Section 2.3.5, where we compare static and dynamic analysis for determining when an object can be reclaimed. Normally, objects declared locally within a function reside on the stack and have a lifetime contained within that function. When the function terminates and the stack frame is popped, those stack-allocated objects are no longer accessible and therefore become dead. Go is an exception to this rule. In cases where

a stack object is returned, the Go compiler will promote that object to the heap. This promotion can extend the lifetime of an allocation beyond the routine in which the object was declared and allocated. It is important to draw a distinction between an object's liveness and its reachability. An object is reachable if its contents can be accessed from some variable. Dead objects can be reachable from a live object. For instance, a dangling pointer within a linked list might still refer to older, no-longer-relevant list nodes (possibly reclaimed), while the head of the list might still be relevant and live. Objects that are not reachable are considered dead; however, dead objects do not have to be unreachable. Dynamically allocated data can have a lifetime that extends beyond the function from which it was allocated. This heap-allocated memory can be passed between functions and can be accessed via globally declared variables. Since the stack does not manage the space for these allocations, they must be handled by the programmer or the language's runtime system. In languages that require the programmer to manually manage memory, the allocated data must be explicitly freed to return the unneeded data back to the system. Otherwise, the memory would leak. In languages with automatically managed memory (e.g., GC or some RBMM implementations), the system will automatically determine when the memory can be reclaimed. This reclamation point can occur at a later time during program execution than what a good programmer would have otherwise decided. Reclaiming sooner, rather than later, can reduce the total in-use memory space (footprint) of a program, providing the maximum amount of memory resources for the program to utilize. Global objects can also point to heap allocated memory. Since globals are always accessible, their lifetime is that of the entire program.
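Go's heap promotion can be seen in a two-line function (our example): returning the address of a local would be a dangling-pointer bug in C, but the Go compiler detects the escape and allocates the variable on the heap instead, extending its lifetime past the call.

package main

// NewCounter returns the address of its local variable n. Escape
// analysis promotes n to the heap, so the pointer remains valid after
// the function's stack frame is popped.
func NewCounter() *int {
    n := 0
    return &n
}

func main() {
    c := NewCounter()
    *c = *c + 1 // still valid: n now lives on the heap
    println(*c)
}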

2.3.5 Static versus Runtime Analysis

RBMM and GC are both automated solutions that differ in ways that can be used to complement each other. Understanding the differences between

the two systems is useful for better understanding the comparison we formulate in Chapter 4, as well as the reasoning behind combining the two, which is discussed in Chapter 5.

According to Rice's theorem, any non-trivial semantic property of a program, such as deciding an object's last use, is undecidable [66]. An object's last use is an important property since it is what an RBMM inference analysis statically approximates. It is important to realize that reachability (what a garbage collector can decide) does not imply that an object will be used. Therefore, an RBMM static analysis can reclaim memory that a garbage collector might not. To demonstrate that last use is an undecidable problem, we sketch a reduction from the halting problem. Consider the halting problem as being represented as ⟨P, i⟩ where P is a program and i is an input to P. The question asked is this: will P halt on given input i? We can represent an instance of the last use problem as ⟨P, i, o, p⟩, where P represents a program, i the input to P, o is an object, and p is a program point in P. What we want to know is the following: will there ever be a use of o after p in P? Without loss of generality, assume P has a single exit point (if it does not, a simple transformation will achieve this). We now introduce a function T which takes as input a halting problem instance ⟨P, i⟩ and transforms it into a last use problem instance ⟨P′, i, o, p⟩. P′ is identical to P, except that it has as its first statement an allocation of the freshly named object o, and a single use of o just before the exit point of P′. Clearly P will halt on i if and only if the exit point of P′ is reached, that is, if and only if o is used after, say, the entry program point. Thus T reduces the halting problem to the last use problem, and we conclude that the latter is undecidable.
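The shape of P′ in this reduction can be written down directly. In the hypothetical sketch below, runP stands for the body of P specialized to input i; whether the final use of o is ever reached is exactly the question of whether P halts on i.

// PPrime is P': allocate o first, run P on i, then use o once just
// before the single exit point. o is used after the entry point iff
// runP(i) terminates, so deciding "last use" decides halting.
func PPrime(runP func(input string), i string) {
    o := new(int) // first statement: allocate the fresh object o
    runP(i)       // the body of P; may or may not terminate
    _ = *o        // the single use of o, just before the exit
}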

Because of this undecidability, a static analysis, such as what RBMM requires, cannot determine exactly how long an object will be needed. It can, however, approximate lifetime. On the other hand, a runtime (dynamic) analysis, as used by GC, can only make decisions about what objects are reachable, based on the current state of the program during execution. This

is a different approximation to lifetime. In either case, static or runtime reasoning, an approximation to lifetime is involved.

Conservative approximations, however, are still useful. They can be used to decide, at compile-time, where memory for a variable can be allocated and reclaimed in a safe manner. An RBMM solution can only reason statically about a variable at the program points of a program. GC, on the other hand, can only make decisions about which allocated objects are reachable after it scans its graph of nodes, starting from the root set. RBMM approximates reachability by calculating which objects point to which others. This calculation results in a points-to graph, providing the compiler with information about what objects can be reached from what program points. Similarly, a garbage collector will scan memory during runtime and follow pointers to other reachable nodes for the given program state. This scan can only decide what can be reached; it cannot determine which objects will actually be accessed in the future. This is an important point. Just because an object can be reached does not mean that its contents will be used. Objects that are never used can safely be reclaimed, even if they are reachable.

Figure 2.5 conveys this observation: memory reachability can be reasoned about both statically and at runtime, but a static analysis cannot do any better than the runtime analysis. In Figures 2.3 and 2.4, the spikes represent unreachable objects that can be reclaimed. Memory belonging to unreachable objects can be safely reused. A runtime solution can maximize the discovery of unreachable memory, but not necessarily of memory that will no longer be used in execution. A compile-time solution is not as effective at discovering what will be unreachable; however, it can be more effective at determining the relevance (further use) of memory. A static analysis can determine when memory will never be used, even if it is reachable. Figure 2.6 illustrates this point. Reclaiming reachable but never used (non-relevant) memory can be an advantage for minimizing the amount of memory in use during execution.

Figure 2.5: Memory unreachability: static vs dynamic reasoning. (The figure relates all memory, the memory that will not be used again, and the unreachable memory as determined at compile time and at runtime.)

Khedker et al. [52] propose a static analysis that takes advantage of this idea. Their analysis sets a pointer to null if it points to an object that is reachable but no longer used, allowing the garbage collector to reclaim the dead item's memory.

Being able to decide when, and whether, memory will be used at a later program point is one of the benefits of RBMM. A garbage collector cannot predict what can be reclaimed later; it can only make decisions about what can be reclaimed when a collection occurs. Additionally, an RBMM solution has the added benefit of knowing that something might be reachable but never used, something a garbage collector cannot know.

Figure 2.6: Memory use: static vs dynamic reasoning. (The figure relates all memory, the memory that will not be used again, the memory not used again as determined at compile time, and the unreachable memory as determined at runtime.)

2.3.6 Object Relationships

The responsibility of region inference is to statically determine which objects are allocated from which regions. Properties of objects and their relationships are used to establish which regions they should be allocated from. Knowing these relationships can also benefit GC.

Object Points-to Relation

The region inference portion of an RBMM analysis can define regions as sets of objects based on a points-to relationship between these objects. The reachability of one object from another can be seen as the transitive closure of this points-to relationship. The benefit of this approach is that regions

can be created such that no objects point into other regions. This means that a region can be safely removed without having to wait for any objects from external regions to die. This also prevents dangling pointers from being created by pointers referencing data in other regions. However, this relationship can create large regions. Having more objects in a region increases the probability of region bloat, due to a few reachable objects keeping alive a region containing many dead objects. This connectivity association has also been utilized in connectivity-based garbage collectors to improve collection time and memory usage [47].

Object Lifetime Relation

To maximize memory utilization, a program must allocate memory for an object at the latest possible time before it is first referenced, and the memory must be reclaimed at the earliest possible time after it is last referenced. Knowing the lifetimes of the objects within a region allows regions to be constructed from objects that can die together. This relationship can allow for inter-region pointers. Caution must be taken by the region inference algorithm to ensure that reclaiming a region will not affect any pointers into the region being reclaimed. This relationship can also reduce the region bloat problem, since regions are associated not by connectivity but by lifetime similarity. A lifetime analysis can produce regions that are created later and die sooner. By placing objects with similar lifetimes into the same region, the analysis reduces the potential for long-lived objects to keep a region with shorter-lived objects alive. Past research has shown that object connectivity (points-to) is useful in predicting object lifetime, and such information can benefit automatic memory management [48]. GC systems can be built that make use of object lifetime information. Such systems can reduce the cost of GC by frequently processing objects that are assumed to be garbage, and less frequently processing those that are assumed to have longer lifetimes. The weak generational hypothesis, or infant mortality, is the observation that the most recently allocated objects

have a high probability of becoming garbage the soonest [80, 50, 51, 43]. This hypothesis is not attributed to one person, but is commonly referenced in the GC literature. It forms the basis for generational garbage collectors, whereby pools (called generations) of memory are set aside for objects of different lifetimes. The young, or most recently allocated, objects reside in a generation that is collected more frequently than that of older objects. Objects get promoted to the older generation if they survive multiple collections. Generational collectors can reduce the amount of work that a collector has to perform, while also reducing memory footprint [58].

2.3.7 Implementation Difficulties

From a programmer's perspective, using a memory management system appears easy, and this can be seen as a danger. Confident programmers, assuming that their code is safe, might ignore the proper avenues for testing and verifying the integrity of their programs. Such systems can be quite tricky to use correctly without introducing bugs. With that nugget of information in mind, consider writing a memory management system and all of the pitfalls that can occur. As one can imagine, implementing these systems can be riddled with the same complications as using them. Garbage collectors are difficult to program, and we can attest to the fact that they are additionally complicated and time-consuming to debug. A collector manipulates a running program's data in a concealed way, such that the running program is not aware that its data is being manipulated. This concealment makes it difficult to pinpoint the exact spot at which an error is created. This is especially true for copying collectors, which require that all reachable references to a copied (or relocated) item have their pointers updated. When the program regains control from the garbage collector, an error introduced by the collector will not always manifest itself immediately. In fact, many collections might pass before a problem introduced in an earlier collection surfaces. The problem itself could be caused by numerous GC errors: incorrectly tracing an item, not copying all or the correct

pieces of an item, or not updating a pointer properly. In Chapter 5 we introduce additional complexity by combining RBMM and our region-aware GC, which means that we have two memory management systems to develop and combine correctly. Sometimes there is a trade-off between simplicity (and consequently stability) and performance. This thesis composes two memory management systems that I have written under the guidance of my advisers. The first is an RBMM system, which requires that the program be analyzed and transformed correctly so that the user's program does not crash by fault of the memory system. Any incorrect assumption or miscalculation during code analysis or code generation can result in a faulty program execution. The other system I have implemented is a copying garbage collector. The collector must be absolutely certain of the data type of a specific object before it copies it and updates the object's pointers. Any bad copying, or data misinterpretation, can crash the running program or generate incorrect results. Our region-aware garbage collector is a proof of concept allowing us to explore the interaction with RBMM. Neither of our systems is without bugs; however, they do permit a handful of test cases to be executed, allowing us to measure the performance of our systems.

Chapter 3

Implementing RBMM for the Go Programming Language

If we knew what it was we were doing, it would not be called research, would it?

-Albert Einstein

3.1 Introduction

In this chapter we introduce RBMM as an automatic memory management solution that can co-exist with Go's existing garbage collector. Our goal is to achieve the benefits of an automatic memory management system, whereby the user is removed from making (potentially incorrect) decisions on when to reclaim memory. Additionally, we aim to eliminate, as much as possible, the overhead required by a garbage collector. To do this, we implemented an RBMM system that co-exists with the Go programming language's runtime garbage collector. The implementation and design are covered in this chapter. We provide a formal correctness statement for our design in the following chapter, Chapter 4. This separation eases the understanding of our approach. Our design aims both to increase the speed of object reclamation and to decrease the overall footprint of the program. To accomplish fast reclamation, our solution groups objects together into regions; this procedure is known as region inference. During runtime, when all of the objects in a region are no longer needed, a fast region reclamation operation is

executed that makes the memory of the entire region available to fulfill future allocation requests. However, choosing when to reclaim a region is not trivial, specifically because it relies on object lifetime analysis, which is undecidable [66]. Our proof-of-concept uses a static analysis to first analyze the program, and then to insert runtime calls to our region operations. Since a static analysis cannot decide how long all allocations will be used (consider a global pointer to allocated data), our analysis must make decisions such that the runtime system can reclaim allocated objects when it is safe to do so. Our context-insensitive analysis passes regions from callers to callees. If a region is needed later in the caller, the caller will first increment a counter in the region so that the region will not be prematurely reclaimed in the callee, or in any other functions the callee executes. The context-insensitive approach of our static analysis increases the scalability of our design (see Section 3.6.3). In order to measure the differences between RBMM and GC within the context of Go, we must both modify a Go compiler and introduce our RBMM runtime system. The process of transforming a Go program into one that is RBMM-aware starts with a static analysis of the program's source code. When the analysis completes, our compiler inserts region operations into the program. These region operations are function calls into a shared object file that are executed during program execution. We call this object file our runtime library. When the transformed Go program executes, it will call the RBMM routines in our runtime system to perform any RBMM-related memory management. This solution provides a design which combines static analysis, to guide region creation, and runtime bookkeeping, to help control memory reclamation. The novelty, and main advantage, of our approach is that it greatly limits the amount of re-work that must be done after each change to the program's source code, making our design more practical (scalable) than existing RBMM systems (see Section 3.6.3). The latter is the result of our flow-insensitive [10, 56, 65] and context-insensitive static analysis.

Our solution also introduces a novel concept, "protection counters," which prevent a region from being prematurely removed. These runtime counters are an efficient alternative to the more computationally expensive reference counters. Reference counters have been studied in RBMM systems before, such as Gay and Aiken's C@ and RC [28, 29]. We have implemented our RBMM solution as an extension to the gccgo compiler. Our program analyses and transformations deal with GIMPLE, GCC's intermediate language, but to make our presentation more accessible, we discuss our methods as if they apply to a Go/GIMPLE hybrid whose syntax we present in Section 3.2. Our prototype implementation discussed in this chapter handles almost all of the first-order sequential fragment of Go. This initial exploration also addresses higher-order functions, which are described later in this chapter. We defer our discussion of handling Go's concurrency primitives until Chapter 6. While our modifications are implemented for the gccgo Go compiler, the techniques and information presented in this chapter should be applicable to any statically typed language, with modifications as necessary. In Appendix A we discuss the syntax and functionality of the Go language. Go programmers are required to explicitly request dynamic memory via the new routine or its variant make, which are like malloc in C. Unlike other languages, Go allows functions to return references to local variables. To avoid dangling references, the Go compiler automatically detects such occurrences and transforms the function to explicitly allocate storage on the heap for the variable. Memory is never explicitly freed by the programmer; instead, current Go implementations use GC.
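As a rough illustration of the protection-counter idea (the names and representation here are ours; the actual primitives are presented later in this chapter), a protection counter simply makes region removal conditional:

// A region with a protection counter. While the counter is non-zero,
// removal requests are ignored, so a callee's removal attempt cannot
// reclaim a region that its caller still needs.
type CountedRegion struct {
    pages      *page // allocation state, as in the earlier sketch
    protection int
}

// Protect is executed by a caller before passing the region to a
// callee, whenever the caller still uses the region after the call.
func Protect(r *CountedRegion) { r.protection++ }

// Unprotect reverses Protect once the caller is done with the region.
func Unprotect(r *CountedRegion) { r.protection-- }

// TryRemove reclaims the region only when nobody is protecting it.
func TryRemove(r *CountedRegion) {
    if r.protection == 0 {
        r.pages = nil // return all pages at once
    }
}

Unlike a reference count, which must be updated on every pointer assignment, this counter is touched only at call boundaries, which is why it is cheaper.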

3.2 A Distilled Go/GIMPLE Syntax

Since our analysis and code transformations operate on GCC’s intermediate GIMPLE representation, we must reason about Go in both languages (Go and GIMPLE). To simplify our system’s description, we will use the distilled

Prog → Func∗
Func → func Fname ( Var∗ ) { Stmt∗ return Var }
Stmt → Var = Var
     | Var = ∗ Var
     | ∗ Var = Var
     | Var = Var . Sel
     | Var . Sel = Var
     | Var = Var [ Var ]
     | Var [ Var ] = Var
     | Var = Const
     | Var = Var Op Var
     | Var = new Type
     | Var = Fname ( Var∗ )
     | Var = go Fname ( Var∗ )
     | Var = recv from Var
     | send Var on Var
     | if Var { Stmt∗ } else { Stmt∗ }
     | loop { Stmt∗ }
     | break

Figure 3.1: General Go/GIMPLE syntax

syntax presented in Figure 3.1. This syntax is general enough to cover the interesting features of Go without requiring the reader to be familiar with how the GCC compiler translates Go syntax into GIMPLE. Prog represents a Go program in our distilled syntax. A Go program consists of a collection of zero or more functions from the syntactic category Func. A function definition consists of a name, Fname, and a collection of statements from the syntactic category Stmt. Data types are from the syntactic category Type. Var and Const are the syntactic categories of program variables and constants respectively. A selector, Sel, represents a field within a structure. This simplified hybrid syntax reflects the fact that we deal with three-

address code when analyzing and transforming a Go program. We have normalized the syntax in obvious ways, requiring, for example, that selectors, indexing, and binary operations are applied to variables, rather than to arbitrary expressions.

3.3 Design

RBMM systems must annotate every memory allocation operation with the identity of the region that should supply the memory. This permits the allocator to produce memory from the particular region that a variable is associated with. To facilitate, at runtime, the allocation and reclamation of memory from a particular region, the RBMM system must also insert into the program calls to the functions implementing the primitive operations on regions (see Section 2.3.1). An allocation cannot come from a nonexistent region. Since we want to minimize the lifetime of each region, we want to insert code to create a region just before the first allocation operation that refers to that region, and we want to insert code to remove a region just after the last reference to any memory item stored in that region. Figuring out which allocation sites should (or must) use the same regions requires analysis. Every region must be created and removed. The time taken by these operations is overhead. To reduce these overheads, we want to amortize the cost of the operations on a region over a significant number of items. Having each item stored in its own region would impose unacceptably high overheads, though it would also ensure that its storage is reclaimed as soon as possible. Reclamation of a region simply means returning its list of pages to the freelist. Having all items stored in a single giant region would minimize overheads, but in most cases, it would also ensure that no storage is recovered until the program exits. We aim for a happy medium: many regions, with each region containing many items. The hardest task of the program analysis needed for RBMM is figuring

out the dependencies between regions. If an item A may contain a pointer to an item B, then the region containing item B cannot be reclaimed until the region containing item A is reclaimed, because doing otherwise would leave those pointers dangling. The set of regions will typically form a directed acyclic graph. In principle, it could form a cyclic graph, but any cycle in the graph represents a set of regions in which no region can be reclaimed before any of the others. Since all the pages in those regions would be reclaimed at the same time, merging the regions into one will yield a program with less overhead. We discuss our solution to the dangling pointer problem in Section 3.6.1. The compiler uses the analysis results to transform the program by inserting region operations. These function calls, or region annotations, are responsible for creating and reclaiming regions, as well as allocating memory from a region at runtime.

3.4 Region Types

The regions that our static analysis infers from the input source code are of two types: non-global and global. Non-global regions are created and passed as data down to functions. The Global Region holds data for which our analysis cannot deduce a lifetime, such as data pointed to via global pointers. There is only ever a single instance of this global region per program. Recall that it would be incorrect for our system to try to remove data when its lifetime is undecided, since such a removal might reclaim data that is needed later in execution. In the implementation of our RBMM system discussed in this chapter, we use Go’s mark-sweep garbage collector to manage items with undecided lifetimes (items allocated from the Global Region). The garbage collector can safely reclaim items using a runtime analysis. Since GC is a runtime feature, it can slow down program execution. Naturally, we aim to place as much data as possible into non-global regions.

3.5 Runtime Support

The entire purpose of our static analysis is to transform the program by inserting calls to region operations which will be encountered at runtime. These operations manipulate the regions from which allocated memory is produced; the calls are discussed in detail in Section 3.7. We now introduce some concepts that help explain our runtime support for regions. A region flexipage is a contiguous chunk of memory. Standard region pages have a fixed size, but for allocations that are larger than a standard region page, we round up the allocation size to the next multiple of the standard page size (hence the flexi prefix in our terminology); therefore our regions can consist of pages of varying sizes. The default page size we chose for our RBMM system is 4KB, which matches the size used by our development machine's OS. Since all memory items must be wholly contained within a single region page, handling memory allocations of unbounded size requires the ability to allocate region pages of unbounded size. Therefore RBMM systems in practice must support multiple page sizes. A region page has a header containing a link field so that pages can be chained into a linked list. The page header also contains the next free memory word on that page. Since regions manage the memory of their pages, having a pointer in a page to the next free word on that page is redundant. In our updated system, discussed in Chapter 5, we move this free memory word into the region header, which reduces the size of region pages by one word. From the perspective of the runtime system, a region is a linked list of pages. The region header contains bookkeeping information about the region, such as the most recent page that it can allocate memory from. As we explain later, it also includes a protection counter. Region headers are not located on any of the pages used for data. Instead, all region headers are managed on a separate series of pages dedicated to containing region headers. This design decision was simple to implement; however, it makes regions disjoint from their data. This can negatively impact cache performance (e.g., cache misses) when the header and its corresponding data are not

both already in the cache. The address of a region's header is the region handle, through which the region is known to the rest of the system. We refer to a variable that holds a region handle as a region variable. Regions are passed as arguments to functions which might allocate memory for an object created in that function. Our runtime system maintains a freelist of unused region pages. A newly created region contains a single page of the default size (4KB). As allocations are made from a particular region, the region will be extended as needed, taking pages from the freelist if possible, and chaining them onto the region's list of pages. Quick reclamation is one key benefit of RBMM over manual and garbage collected systems.
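Continuing the earlier illustrative Region sketch, a page freelist makes both extension and reclamation cheap: removal is a single list splice, regardless of how many items the region holds, which is what makes region reclamation fast.

// freelist holds standard-size pages recovered from removed regions.
var freelist *page

// newPage prefers recycling a page from the freelist over allocating
// a fresh one.
func newPage() *page {
    if freelist != nil {
        p := freelist
        freelist = p.next
        p.next, p.used = nil, 0
        return p
    }
    return &page{}
}

// reclaimPages returns every page of region r to the freelist with one
// pointer splice.
func reclaimPages(r *Region) {
    if r.pages == nil {
        return
    }
    last := r.pages
    for last.next != nil {
        last = last.next
    }
    last.next = freelist
    freelist = r.pages
    r.pages = nil
}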

3.6 Region Inference

Besides a runtime system implementing the basic region operations, our implementation consists of an analysis that decides which region each pointer-valued variable should be allocated in, and a program transformation that introduces calls to RBMM primitives to put those decisions into effect. This region inference analysis is implemented as two passes: an intraprocedural pass followed by an interprocedural pass. The intraprocedural analysis is concerned only with assignment statements that involve pointers (including memory allocations), while the interprocedural analysis propagates this information across function calls, and is repeated until a fixed point is reached. Unlike the Tofte and Talpin solution [78], our analysis does not result in a stack of regions. Instead, our region analysis results in sets of regions consisting of objects that can only reach each other. This prevents dangling pointers caused by objects pointing into other regions and those regions being prematurely reclaimed. Our system disallows pointers into other regions. The goal of our region inference is to generate a set of analysis items for each function within the program. These items are used to guide our

program transformation later on. Each item is a pair consisting of a program variable identifier and an integer value representing which region the variable belongs to. The latter is called the region-id. The variable, in this case, represents an allocated item within the user's program. Since region inference is a static analysis, and regions will not be created until runtime, the region-id is used to group the allocated items together so that our transformation can generate code that will allocate all variables with the same region-id from the same region at runtime.

3.6.1 Preventing Dangling Pointers

Our region inference is designed to prevent dangling references. As we discussed earlier, if an object A points to an object B, and B's region is reclaimed before A's, then any access to B via A will result in an invalid memory access, since A's pointer to B would be a dangling reference. We solve the dangling pointer problem by unifying objects based on a points-to analysis. This flow-insensitive analysis forms the basis of our variable unification (equivalence constraint) generation, which infers regions without dependencies between regions. In other words, a region cannot contain data that points to data located in another region. This restriction is simple and safe to implement, but it also means that regions can grow relatively large. We remove this restriction and introduce a field-sensitive analysis in Chapter 5. Cherem and Rugina's approach [10] is similar to Steensgaard's [72] flow-insensitive solution for unifying variables, since they based their unification constraints on the latter. These constraints form the basis of their points-to graph construction. Our points-to unification analysis is cruder than Steensgaard's [72]. Our solution places an object, its fields, and any other objects that it points to into the same set (a region), whereas Steensgaard's solution generates a graph of pointers and what they can point to. In Chapter 5 we relax our solution and permit the fields of an object to belong to separate regions.

Lattner also utilized a unification-based analysis in his pool-allocation research [56]. Our design differs from both Cherem's and Lattner's in that it is context-insensitive. Our solution unifies objects intraprocedurally; our interprocedural analysis is concerned only with unifying the actual and formal variables at call sites. This design eliminates the need to reanalyze the callee due to constraints generated in the caller.

3.6.2 Intraprocedural and Interprocedural Analyses

Our intraprocedural and interprocedural analyses form the basis of our unification algorithm. By determining which variables are associated with which other variables (via a points-to relationship), our analysis can discover groups of variables that we call a region. Our analysis associates with each variable v in the program (or program variable) its own region variable, which we denote R(v). If v1 is a pointer variable, then R(v1) = r1 means that throughout its lifetime, from its initialization until it goes out of scope, whenever v1's value is not null, v1 will always point into region r1. We even associate a region variable with non-pointer-valued variables. If v2 holds a structure or array that contains pointers, then R(v2) = r2 means that all the pointers in v2 will always point into region r2 when they are not null. If v3 is a structure or array that does not contain pointers, or if it is a variable of a non-pointer primitive type such as an integer, then R(v3) = r3 means nothing (it evaluates to the true constraint), and affects no decisions. Equalities of this last type are redundant, and our implementation does not generate them, but it is easier to explain our algorithms without the tests required to avoid generating them. Our analyses build sets of equivalence constraints on these region variables. These sets are the result of unifying program variables based on a points-to (assignment) association. This unification merely associates the right-hand side of an assignment with the same set as the left-hand side. For example, the assignment a = b would cause our analysis to generate the constraint R(a) = R(b). If the final constraint set built by our analysis does

not require R(v1) = R(v2), then we can and will arrange for the memory allocations building the data structures referred to by v1 and v2 to come from different regions. Our analyses require every variable to have a globally unique name, so we rename all the variables in the program as needed before beginning the analysis. For convenience, we also rename all the parameters of functions so that parameter i of function f is named fi. If the function returns a value, we generate a new variable named f0 to represent it, and modify all return statements to assign the value to f0 before returning it. Figure 3.2 defines the functions we use to generate region constraints. The top of this figure gives the types of these functions. In these types, EqConstrs is the set of equivalence constraints on region variables (each constraint is itself a conjunction of primitive equivalences), and Map is the set of mappings from function names to sets of these constraints. S, F, and P generate constraints for statements, function definitions, and programs respectively. The semantic function S produces a set of equivalence constraints from a given statement and Map. F is a semantic function producing a new Map from a given function and Map. P is a semantic function producing a Map for all functions in the program. For most kinds of Go/GIMPLE statements, the constraints we generate depend only on the statement. The most primitive statements are assignments, and since Go/GIMPLE is a form of three-address code, each assignment performs at most one operation, and the operands of operations are all variables.

After the assignment v1 = v2, where v1 and v2 are pointers or structures containing pointers, v1 and v2 can refer to (alias) the same memory. In this case we constrain the variables to obtain their memory from the same region. If they are not pointers, this is harmless.

After the assignment v1 = ∗v2, the pointer now held by v1 was fetched from memory inside the region that v2 points into. Since v2 can point only into R(v2), a copy of v1's value is stored in R(v2). The region that v1 points into, that is, R(v1), can thus be reached from R(v2).

Map = Func → EqConstrs

S : Stmt → Map → EqConstrs
F : Func → Map → Map
P : Prog → Map

S[[v1 = v2]]ρ = (R(v1) = R(v2))
S[[v1 = ∗v2]]ρ = S[[∗v1 = v2]]ρ = (R(v1) = R(v2))
S[[v1 = v2.s]]ρ = S[[v1.s = v2]]ρ = (R(v1) = R(v2))
S[[v1 = v2[v3]]]ρ = S[[v1[v3] = v2]]ρ = (R(v1) = R(v2))
S[[v = const]]ρ = true
S[[v1 = v2 op v3]]ρ = true
S[[v = new t]]ρ = true
S[[if v then { s1 ... sn } else { t1 ... tm }]]ρ = (⋀ i=1..n S[[si]]ρ) ∧ (⋀ i=1..m S[[ti]]ρ)
S[[loop { s1 ... sn }]]ρ = ⋀ i=1..n S[[si]]ρ
S[[break]]ρ = true
S[[v0 = f(v1 ... vn)]]ρ = θ(π f0...fn (ρ(f)))   where θ = {f0 ↦ v0, ..., fn ↦ vn}

F[[func f (f1 ... fn) { s1 ... sm; return f0 }]]ρ = [f ↦ ⋀ i=1..m S[[si]]ρ]

P[[d1 ... dn]] = fix(⨆ i=1..n F[[di]])

Figure 3.2: Region constraint generation

Many RBMM systems handle such assignments by establishing a dependence between R(v1) and R(v2), requiring R(v2) to be reclaimed before R(v1) (if R(v1) were reclaimed while R(v2) is in use, some pointers in R(v2) could be left dangling). Such a restriction arises when imposing a stack-of-regions ordering on region allocation (see Section 2.3.1) [78, 12]. Instead, we simply unify v1 and v2, resulting in both of their data being stored in the same region. This is safe, but overly conservative. We handle all assignments involving pointer dereferencing, field accesses, and array indexing the same way, for the same reason. Assignments involving constants obviously generate no constraints. Since Go does not support pointer arithmetic, assignments involving arithmetic operations have no implications for memory management. Assignments that allocate new memory also do not impose any new constraints: the region in which the allocation will take place is dictated by the constraints on the target variable, not by any property of the allocation operation itself. To process a sequence of statements (whether in a function body, in an if-then-else branch, or in a loop body), we simply conjoin the constraints from each statement. We also conjoin the constraints we get from the then-parts and else-parts of if-then-elses. In Go/GIMPLE, all loops look like infinite loops with break statements inside if-then-elses. The break statement generates no new constraints. All these rules say that the constraints imposed by the primitive statements must all hold, regardless of how those primitives are composed into bigger pieces of code. The most interesting statements for our analysis are function calls. (They may or may not return a value; if they do not, we treat them as returning a dummy value, which is ignored.) A function call is the only construct whose processing requires looking at ρ, which maps the names of functions to the set of constraints we have generated so far for the named function's body. That function body may require some of the function's formal parameters to be in the same region, and when processing the call, we need to impose corresponding constraints on the corresponding actual parameters.

The rule for function calls starts by looking up the name of the callee in ρ (this is what ρ(f) does), which yields a constraint. The rule then uses π to project that constraint onto the formal parameters of the callee (f0 ... fn), including the one representing the return value (f0). π_{f0...fn} represents the existential quantification of all the variables in f except the formal parameters f0 ... fn. Thus π acts as a constraint filter: it discards all the primitive constraints involving variables other than formal parameters, but keeps their implications. For example, given the constraints R(f1) = R(v5) ∧ R(v5) = R(f2), the projection π_{f1,f2} yields R(f1) = R(f2). The rule then renames the variables in these constraints to refer to the actual parameters in the caller rather than the formal parameters in the callee. For example, if the call had v8 and v9 in the first two argument positions, this renaming would yield R(v8) = R(v9).

This process depends on ρ containing the correct constraint for every function in the program. Calculating the constraints for all functions in a program is iterative: since functions can be mutually recursive, our analysis may terminate only when no further updates to the constraint set can be made. The fix function computes the least fixed point of ⨆_{i=1..n} F⟦di⟧. Each function definition di contributes its constraints to ρ, and ⨆ joins the individual contributions together in ρ.

We begin the analysis with ρ mapping the name of every function to true, reflecting that we do not yet have any constraints about any of the program's functions. We then compute a new ρ reflecting the constraints each function would impose if none of the functions it calls imposed constraints (our intraprocedural analysis). For our interprocedural analysis, we repeat this computation, beginning each iteration with the ρ just computed, until the analysis reaches a least fixed point (when the resulting ρ is the same as in the previous iteration); a code sketch of this loop appears after the example below.

Figure 3.3 shows an example program from which our analysis produces the following constraints (some additional constraints arise for temporary variables introduced in the GIMPLE code, but we ignore those here):

package main

type Node struct { id int; next *Node }

func CreateNode(id int) *Node {
	n := new(Node)
	n.id = id
	return n
}

func BuildList(head *Node, num int) {
	n := head
	for i := 0; i < num; i++ {
		n.next = CreateNode(i)
		n = n.next
	}
}

func main() {
	head := new(Node)
	n := head
	BuildList(n, 8)
}

Figure 3.3: Creating a linked list in Go


• CreateNode: R(CreateNode0) = R(n)

• BuildList: R(n) = R(BuildList1) ∧ R(CreateNode0) = R(n)

• main: R(n) = R(head)
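The iterative computation performed by P and fix can be sketched in code. The following is a minimal illustration, not the plugin's implementation: constraints are reduced to sets of canonical equality strings, and analyze (which would conjoin the S rules over a function body) is supplied by the caller; all names here are invented.

// Constraint is a set of primitive region equalities, each encoded as a
// canonically ordered string such as "R(f1)=R(f2)".
type Constraint map[string]bool

// Rho maps each function name to the constraints known for its body.
type Rho map[string]Constraint

// equal reports whether two constraint sets contain the same equalities.
func equal(a, b Constraint) bool {
	if len(a) != len(b) {
		return false
	}
	for k := range a {
		if !b[k] {
			return false
		}
	}
	return true
}

// fixpoint mirrors P⟦d1 ... dn⟧ = fix(⨆ F⟦di⟧): it starts every function
// at "true" (no constraints) and reanalyzes all functions until no
// function's constraint set changes.
func fixpoint(funcs []string, analyze func(f string, rho Rho) Constraint) Rho {
	rho := Rho{}
	for _, f := range funcs {
		rho[f] = Constraint{}
	}
	for changed := true; changed; {
		changed = false
		for _, f := range funcs {
			if c := analyze(f, rho); !equal(c, rho[f]) {
				rho[f] = c
				changed = true
			}
		}
	}
	return rho
}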

package main

type Node struct { id int; next *Node }

func CreateNode(id int, reg *Region) *Node {
	n := AllocFromRegion(reg, sizeof(Node))
	n.id = id
	RemoveRegion(reg)
	return n
}

func BuildList(head *Node, num int, reg *Region) {
	n := head
	for i := 0; i < num; i++ {
		IncrProtection(reg)
		n.next = CreateNode(i, reg)
		DecrProtection(reg)
		n = n.next
	}
	RemoveRegion(reg)
}

func main() {
	reg := CreateRegion()
	head := AllocFromRegion(reg, sizeof(Node))
	n := head
	BuildList(n, 8, reg)
}

Figure 3.4: Creating a linked list in Go with region annotations

3.6.3 Scalability

This is inherently a whole-program analysis, which threatens to make it impractical for real use. We have therefore designed our analysis to permit a practical implementation. First, the analysis is flow- and path-insensitive: neither the order in which the statements of a function body are executed, nor which branch of a conditional is taken, is significant. This helps make the analysis scalable. More importantly, and contrary to most existing RBMM implementations that we know of, our analysis is context (or call) insensitive: the analysis of a function depends only on the functions it calls, not on the functions that call it. When program transformations depend upon a whole-program context-sensitive analysis, a change anywhere may require reanalyzing and recompiling any part of the program. With a context-insensitive analysis, only modules that import a changed module need to be reanalyzed and recompiled, and only when the analysis result for an exported function has actually changed. We believe this reduces the need for reanalysis and recompilation to the point that the approach is practical.

3.7 Transformation

Once the program analysis is complete, we transform the program to use region-based primitives for memory management. This involves replacing calls to Go's memory allocation primitives with calls to our RBMM memory allocator, and inserting calls to create and remove regions. To support this functionality, we must also transform functions to take regions as input arguments; the resulting functions are region polymorphic [78, 30]. The following region primitives are inserted into the program to implement RBMM:

• CreateRegion(): Create an empty region from which memory items can be allocated. This operation is inserted when our analysis determines that a future allocation will happen and a region for that allocation is needed; it must occur before any call to AllocFromRegion(r, n) for that region.

• AllocFromRegion(r, n): Allocate n bytes from region r. Calls to this function replace calls to Go's new and make functions.

• RemoveRegion(r): Reclaim the memory of region r so that it can be reused later, provided the region's protection count and thread reference count are both zero. Our analysis examines program points intraprocedurally to determine where to attempt to reclaim the memory managed by r; ideally, this point is the earliest one at which the objects in r are no longer reachable.

• IncrProtection(r): Increment the region's protection count, ensuring that calls to RemoveRegion(r) do not actually reclaim r until after DecrProtection(r) is called. We explain the role of this operation in Section 3.8.

• DecrProtection(r): Decrement the region's protection count.
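To make this interface concrete, here is a minimal sketch of what a runtime support library for these primitives could look like. Only the operation names and the two counters come from the text; the page-list layout, the page size, and all field names are our assumptions.

// A region is a linked list of fixed-size pages, bump-allocated in turn.
type page struct {
	buf  []byte
	used int
	next *page
}

type Region struct {
	pages      *page // most recently added page first
	protection int   // stack frames still needing this region (Section 3.8)
	threadRefs int   // threads still referencing this region
}

const pageSize = 4096

// CreateRegion returns an empty region from which items can be allocated.
func CreateRegion() *Region {
	return &Region{pages: &page{buf: make([]byte, pageSize)}}
}

// AllocFromRegion hands out n bytes from r, adding a page when needed.
// Pages are never moved, so previously returned slices stay valid.
func AllocFromRegion(r *Region, n int) []byte {
	p := r.pages
	if p.used+n > len(p.buf) {
		size := pageSize
		if n > size {
			size = n
		}
		p = &page{buf: make([]byte, size), next: r.pages}
		r.pages = p
	}
	b := p.buf[p.used : p.used+n]
	p.used += n
	return b
}

// RemoveRegion reclaims r's memory only if nothing still needs it.
func RemoveRegion(r *Region) {
	if r.protection == 0 && r.threadRefs == 0 {
		r.pages = nil // the pages become available for reuse
	}
}

func IncrProtection(r *Region) { r.protection++ }
func DecrProtection(r *Region) { r.protection-- }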

As discussed in Section 3.6, our analysis only summarizes the region equivalence constraints imposed by each function and the functions it calls; it does not collect the region constraints imposed by the callers of each function. This means that some callers of a function may require a certain region parameter to survive the call, while others do not. To minimize memory usage, we want to reclaim each region at the earliest possible point in the program; this point might be inside a callee, once the caller no longer needs the region. We therefore introduce region protection counts, and distinguish between reclaiming a region, which actually deallocates its storage, and removing a region, which reclaims the region if and only if its protection count is zero. Each function is expected to remove the regions associated with its input parameters (but not those associated with its return value) as soon as it is finished with them. When a region passed to a function is needed after the function call, we increment the protection counter for the region before the call, and decrement it after the call. This small runtime overhead is the price we pay for limiting ourselves to a context-insensitive program analysis. Figure 3.4 shows the automatically transformed version of the code in Figure 3.3.

We present the transformation of program fragment Syn1 into Syn2 using the notation:

    Syn1 ⟹ Syn2

Transformations may be applied in any order, and we apply them repeatedly as long as any of them are applicable.

We use a few auxiliary functions to access the analysis results for a program P. compress_f⟨r0, r1, ... rn⟩ is the list of regions ⟨r0, ... rn⟩ with duplicates removed, as implied by the region equivalence constraints on f's formal parameters (f1, ... fn) and return value (f0). reg(f) is the set of all distinct regions needed by the definition of function f, as determined by P(P)(f). ir(f) is the set of distinct regions of the parameters of function f, that is, ir(f) = compress_f⟨R(f0), R(f1), ... R(fn)⟩; since these regions are given to f by its caller, they are f's input regions. used(S1; ... Sn) is the set of regions used by any of the statements S1; ... Sn. nonlocal(S) is the set of regions used within statement S for variables other than those scoped to S or to some statement within S; that is, it is the set of regions used within S that are also used by a later statement and so must outlive S.

3.7.1 Region-Based Allocation

We must replace all uses of Go's new or make primitives with calls to our region allocator, AllocFromRegion(r, n), which requests n bytes of dynamic data from region r:

    v := new(t)   ⟹   v := AllocFromRegion(R(v), size(t))

Recall that objects with undetermined lifetimes are stored in the global region and are thus managed by Go's garbage collector.

3.7.2 Function Calls and Declarations

Every function that takes pointers (or structures containing pointers) as input, or returns them as output, must be transformed to also expect region arguments. Recall that region argument r0 represents allocations made for the return value of the callee. We indicate the region arguments of a function by enclosing them in angle brackets after the ordinary function arguments:

    f(a1, ... am)⟨r0, r1, ... rn⟩

We use this notation for clarity; our implementation handles region arguments the same way as other arguments.

The transformation must add a region parameter for each function parameter and for the return value, if they are pointers or structures containing pointers. However, if the analysis has determined that the regions of two or more parameters must be equal, only the first is added.

If our analysis were a total global analysis, our system would need to pass region variables only in the cases where we know a region is required. Instead, our analysis passes a region for a variable if the variable is ever passed down in a further call inside the callee (including an allocation), or used as a return value. This strategy is more efficient than always passing a region even when the callee does not use it, but less efficient than what a total global analysis could provide.

This permits us to transform function definitions to introduce region parameters:

    func f(f1, ... fn) {             func f(f1, ... fn)⟨r0, r1, ... rp⟩ {
        S1; ... Sm;           ⟹          S1; ... Sm;
        return f0;                        return f0;
    }                                 }

    where ⟨r0, ... rp⟩ = ir(f)

This adds a region parameter for each function parameter, excluding any that the analysis has determined must be equal to the region of a parameter appearing earlier in the parameter list. A corresponding transformation introduces region arguments into function calls:

    v = f(v1, ... vn)   ⟹   v = f(v1, ... vn)⟨r0, r1, ... rp⟩

    where ⟨r0, ... rp⟩ = compress_f⟨R(v), R(v1), ... R(vn)⟩

This transformation adds a region argument for each function argument, using the analysis of the callee to compress out (remove) redundant regions. The appropriate region to pass for each argument, and for the return value, is determined by the analysis.

3.7.3 Region Creation and Removal

Any region that is created within a function F and not associated with F's return value or formal parameters is considered local; in that case F is responsible for reclaiming the region's memory before returning to its caller. The transformation pass tries to create regions at the latest possible time, and to remove them as early as possible. There are two ways a function may obtain a region: it may receive the region from its callers, or it may create the region itself. Conversely, there are three ways a function may dispose of a region: it may explicitly remove the region, it may pass the region to a function that is responsible for removing it, or, in the case of the region associated with the function's return value, it may allow the region to remain after the function completes execution. This is handled by the following transformations.

The simplest case is when the function creates a region but does not pass it down to any callee. Intuitively we call such regions local, as they never escape the function that created them.

    func f(f1, ... fn) {             func f(f1, ... fn) {
        S1; ... Sm;           ⟹          C; S1; ... Sm; R;
        return f0;                        return f0;
    }                                 }

    where C = {r = CreateRegion(); | r ∈ reg(f) \ ir(f)}
          R = {RemoveRegion(r); | r ∈ reg(f) \ {R(f0)}}

This places all the needed region creations at the beginning of each function body, and all the required region removals at the end. The next two transformations migrate these primitives to better locations in the function body. We currently insert region creation operations at the program point just before the first allocation from the region. (Even though the implementation discussed in this chapter does not migrate the region creation call, we still discuss potential transformations that can benefit overall performance.)

    r = CreateRegion();              S1; ... Sm;
    S1; ... Sm;              ⟹       r = CreateRegion();
    Sm+1; ... Sn;                     Sm+1; ... Sn;

    where r ∉ used(S1; ... Sm)

    S1; ... Sm;                       S1; ... Sm;
    Sm+1; ... Sn;            ⟹        RemoveRegion(r);
    RemoveRegion(r);                  Sm+1; ... Sn;

    where r ∉ used(Sm+1; ... Sn)

For convenience, our implementation actually places the removal at the end of the basic block that contains the statement of last use for that region.

Two more transformations allow region creation and removal to migrate into loops and conditionals. Moving region creation and removal into a loop adds runtime overhead, but by reclaiming memory earlier, it may significantly reduce peak memory consumption. Since the compiler cannot determine whether the amount of memory allocated across a loop might lead to out-of-memory errors, region creation and removal can be pushed (as a pair) into loops where possible. We can likewise push region creation and removal into conditionals where possible, as this too can reduce peak memory use.

Our current system does move region removal calls that occur inside a loop. If the region's creation is outside the loop, the removal is moved to just after the loop; this prevents the dangling case where a later iteration would reuse a region removed in a previous iteration. If the region's creation is inside the loop, the removal is placed as the final statement of the basic block from which the loop jumps back to the loop start, allowing the region's memory to be reclaimed on every iteration.

    r = CreateRegion();              loop {
    loop {                   ⟹           r = CreateRegion();
        S1; ... Sm;                       S1; ... Sm;
    }                                     RemoveRegion(r);
    RemoveRegion(r);                  }

    where r ∉ nonlocal(loop { S1; ... Sm; })
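As a hedged illustration of this rule, consider a loop whose iterations each need some scratch memory; process and its body are invented, and the region primitives are those of Section 3.7. After the transformation, each iteration's region is reclaimed before the next begins, capping peak region memory at one iteration's worth:

// After the transformation: creation and removal have been pushed into
// the loop as a pair, so r never outlives a single iteration.
func process(items []int) {
	for _, item := range items {
		r := CreateRegion()
		buf := AllocFromRegion(r, 1024) // per-iteration scratch space
		buf[0] = byte(item)             // stand-in for real work on buf
		RemoveRegion(r)                 // reclaimed before the next iteration
	}
}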

The next case applies when a region is used only within the then branch of a conditional. Earlier transformations will have moved the region creation before the conditional statement and the removal after it.

    r = CreateRegion();              if t {
    if t {                   ⟹           r = CreateRegion();
        S1; ... Sm;                       S1; ... Sm;
    } else {                              RemoveRegion(r);
        Sm+1; ... Sn;                 } else {
    }                                     Sm+1; ... Sn;
    RemoveRegion(r);                  }

    where r ∉ used(Sm+1; ... Sn)

The same situation can occur for the else branch of a conditional. If a region r is used within a branch, an initial conservative approach is to place the creation before the conditional; if r is not used after the conditional, the RemoveRegion() operation can be placed immediately after it. In the example above, if r is used only within the then branch, then the CreateRegion() and RemoveRegion() operations can both be placed in that branch. Were this optimization not applied, and the then branch rarely executed, every execution taking the else branch would waste cycles allocating a region that is never used.

    r = CreateRegion();              if t {
    if t {                   ⟹           S1; ... Sm;
        S1; ... Sm;                   } else {
    } else {                              r = CreateRegion();
        Sm+1; ... Sn;                     Sm+1; ... Sn;
    }                                     RemoveRegion(r);
    RemoveRegion(r);                  }

    where r ∉ used(S1; ... Sm)

Similarly, if r is used only within the else branch of the conditional, both the CreateRegion() and RemoveRegion() operations can be pushed into that branch. As with the previous optimization, this serves to eliminate unnecessary region operations. Moreover, moving CreateRegion() to the latest program point before the region is first used, and RemoveRegion() to the earliest point at which the region is no longer used, reduces the program's memory footprint.

Another optimization applies when the analysis produces a CreateRegion() followed immediately by a RemoveRegion(); we can remove both calls:

    S1; ... Sm;                       S1; ... Sm;
    r = CreateRegion();       ⟹       Sm+1; ... Sn;
    RemoveRegion(r);
    Sm+1; ... Sn;

    where r ∉ used(Sm+1; ... Sn)

3.8 Region Protection Counting

We believe that our introduction of a per-region protection counter distinguishes our RBMM solution from other RBMM systems. To remove each region at the earliest possible time, we must place a call to RemoveRegion(r) immediately after the last use of any object stored in region r. Determining even a conservative approximation of the earliest place each region can be removed requires a global analysis of the program. This is difficult to implement, and doubly so to implement incrementally (so that after a small change to a program, only the functions that need to be reanalyzed are). Instead, we opted for a simpler analysis.

Our analysis processes the modules of the program, and the functions in each module, bottom-up, analyzing callees before callers and analyzing mutually recursive functions together. This is simple to implement and allows efficient compilation, but it does not permit the code generated for a function to be influenced by call contexts. When compiling a function, we cannot know whether or not it should remove the regions it uses; that depends on the call path to that function (that is, the call stack at the time the function is called).

The ideal way to let the caller determine which regions are removed would be to have a specialized version of each function for each combination of regions it should free. However, this can generate exponentially many versions of each function, and may greatly increase the size of the executable, reducing instruction cache effectiveness. Another alternative would be for each function to remove only the regions that all its callers agree should be removed, and for callers that require any other region to be removed to remove it themselves after the call. By delaying region removal, however, this may increase peak memory consumption, possibly to an unacceptable level.

We have implemented a third approach: dynamic protection counts. With this approach, each region maintains a protection count: the number of frames on the call stack that need the region to still exist when they continue execution. We transform each function to remove all regions passed to it as arguments, except the region for the return value, provided their protection counts are zero. We also transform the function body so that for each region r passed in a function call, if any variable v with R(v) = r is needed after the call, we invoke IncrProtection(r) before the call and DecrProtection(r) after it:

    S1; ... Sm;                       S1; ... Sm;
    v = f(...)⟨..., r, ...⟩   ⟹       IncrProtection(r);
    Sm+1; ... Sn;                     v = f(...)⟨..., r, ...⟩;
                                      DecrProtection(r);
                                      Sm+1; ... Sn;

    where r ∈ used(Sm+1; ... Sn)
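A hedged Go rendering of this transformation may help; caller and g are invented, and the primitives are those sketched in Section 3.7. Because x (and hence r) is needed after the call to g, the caller protects r across the call; g, compiled separately, still removes its region argument, but the nonzero protection count keeps that removal from reclaiming r:

func g(buf []byte, r *Region) {
	_ = buf[0]
	RemoveRegion(r) // callees remove their input regions; a no-op while protected
}

func caller(r *Region) {
	x := AllocFromRegion(r, 64)
	IncrProtection(r) // r must survive the call below
	g(x, r)
	DecrProtection(r)
	x[0] = 1        // last use of data stored in r
	RemoveRegion(r) // this frame is finished with r; reclaim if unprotected
}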

If r is not needed after the call, we do not apply this transformation. This ensures that if a function f is called with region r in a state that would allow r to be removed, and the last use of r in f is in a call to g, then g is also called in a state that allows r to be removed. One potential future enhancement is an additional transformation that removes unnecessary calls to IncrProtection(r) and DecrProtection(r), leaving only the first increment and the last decrement:

    S1; ... Sm;                       S1; ... Sm;
    DecrProtection(r);                Sm+1; ... Sn;
    Sm+1; ... Sn;             ⟹       Sn+1; ... Sq;
    IncrProtection(r);
    Sn+1; ... Sq;

    where r ∈ used(Sn+1; ... Sq)

Another potential future enhancement is an additional analysis pass that collects, for each call to each function, information about the protection state of each region involved in the call: specifically, whether the region's maximum protection count at the time of the call is zero, and whether its minimum protection count is at least one. Given this information about all calls to a function, we can optimize away either the function's remove operations on a region (if all the callers need the region after the call) or the test of the protection count inside those remove operations (if none of the callers need the region after the call). If the calls disagree about whether they need a region after the call, we can also create specialized versions of the function for some call sites, preferably the performance-critical ones.

It is important to note that a region's protection count indicates the number of stack frames that refer to the region. We modify this counter only twice per function call: once to increment it and once to decrement it. This contrasts with reference counts, which count the number of individual pointers to an item or region; in RC [29], a region-based dialect of C, for example, reference counts must be updated on every pointer assignment. To our knowledge, protection counting is unique to our approach. It avoids the overhead of incrementing and decrementing a per-item counter, as well as the space required to store each item's counter.

3.9 Higher-Order Functions

Up to this point our implementation has focused on the first-order subset of Go. We now consider the higher-order aspects of the language in terms of our proof of concept. Go supports passing functions as arguments to other functions; in such cases the compiler generates code that passes around the addresses of the functions. This higher-order feature forms the basis of Go's support for closures; for instance, go-routines and deferred functions can be thought of, more formally, as closures.

Recall that our analysis and transformations operate on the GIMPLE intermediate representation. By the time our analysis begins, GCC will already have transformed defer and go statements into special cases of higher-order functions (closures and function pointers). Closures are presented to our analysis as calls to a function that has access to state information. The GIMPLE representation thus allows our analysis to process higher-order constructs in a generalized way, rather than dealing with defer and go statements specifically.

Higher-order function calls present our transformation with a tricky case: in general we cannot determine which function will actually be called, so we cannot immediately determine which regions should be passed in the transformed call. For instance, consider a program with many functions that each take a single formal parameter of pointer type. Our analysis will transform a subset of these functions, because their input arguments are associated with regions. Now the function signatures differ for the functions that have been transformed:

    f(v1) ⟹ f(v1)⟨r1⟩

In this example, the signature remains the same for the unmodified functions:

    f(v1) ⟹ f(v1)⟨⟩

In other words, some functions might be unaltered and still take a single input argument, while others now take two. Some single-pointer-argument functions thus no longer match the signatures of the transformed functions, and our static analysis cannot always decide which version of a function will be assigned to a variable: the original version, or the version transformed to take region arguments.

Our analysis currently handles such cases by locating, during the interprocedural analysis pass, the points where a function is assigned to a variable. If that function (a closure) requires region arguments, we insert a trampoline function at the assignment site instead of assigning the original closure. We use the term "trampoline" to refer to compile-time-created functions that are responsible for mapping arguments and their associated region arguments to a closure call. This occurs both for function call arguments and for assignments to function variables:

    g := f(v1) ⟹ g := tr⟨r1⟩{f}

The trampoline tr takes exactly one region argument for each argument of the original closure whose type could possibly need an associated region (for example, a pointer type). The trampoline also takes another region argument for the return value of the closure being replaced, if the return type might be associated with a variable that needs a region. The body of the trampoline wraps the original closure that would have been assigned, and maps the input arguments of the trampoline to the input arguments of the original closure.
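The following is a minimal sketch of the trampoline idea, not the GIMPLE our plugin generates; all names are invented. The wrapper has the original, region-free closure signature and binds the region argument at the assignment site:

type Thing struct{ next *Thing }

// fRegion stands for the transformed, region-polymorphic version of f.
func fRegion(v1 *Thing, r1 *Region) {
	_ = v1 // allocations building v1's data structure would come from r1
	_ = r1
}

// trampoline maps the closure's arguments, plus the captured region, onto
// the region-aware version; g := trampoline(r1) replaces g := f.
func trampoline(r1 *Region) func(*Thing) {
	return func(v1 *Thing) { fRegion(v1, r1) }
}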

3.10 Interface Types

For a datatype in Go to implement an interface type, the datatype must have a set of methods matching the function prototypes declared by that interface's definition. This functionality is implemented by the GCC Go front-end through an interface method table: each object belonging to an interface type contains a pointer to a table holding a pointer to each method satisfying the interface. The implementation discussed in this chapter allocates all objects of interface types from the global region.
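For illustration (the types here are invented): a call through an interface value is dispatched via the method table just described, so the concrete method body cannot in general be known statically; values used this way are the ones this chapter's implementation leaves in the global region.

type Shape interface{ Area() float64 }

type Square struct{ side float64 }

func (s *Square) Area() float64 { return s.side * s.side }

// total calls Area through each value's interface method table.
func total(shapes []Shape) float64 {
	sum := 0.0
	for _, s := range shapes {
		sum += s.Area()
	}
	return sum
}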

3.11 Map Datatype

The map datatype was originally managed by a set of runtime functions provided by Go. These functions know nothing about the regions our system inserts, from which a map might have been allocated. Since maps are a built-in datatype, we copied the original map functions from the Go runtime system and modified them to be region-aware. During code transformation, calls to the original map functions are replaced with the region-aware versions. This allows map values to be allocated from specific regions rather than from the global region.

3.12 Returning Local Variables

The Go front-end handles the return of local variables by allocating memory dynamically and returning the address safely. Since the front-end has already transformed such cases, our analysis handles this situation trivially: we simply replace the allocation call with our region allocator and transform the function as we would any other function that returns dynamically allocated data.

3.13 Region Merging

Combining two regions into a single region, known as region merging, can be both an optimization and a necessity. Runtime merging is necessary in the case of higher-order functions. For instance, the example in Figure 3.5 requires that R(a) ≡ R(b). In this example our static analysis cannot determine which function will be called, since fn could represent many functions; the decision to merge must therefore happen at runtime. fn might point to foo, which requires R(a) ≡ R(b), while for other closures passed as the argument it is possible that R(a) ≠ R(b). We must therefore dynamically merge R(a) and R(b) at runtime if fn is a pointer to foo; were we not to do this, a dangling pointer could arise if R(b) were removed before R(a). Our analysis performs this merge within the trampoline function that fn calls (trampolines are described in Section 3.9).

func foo(a, b *Thing) {
	a.next = b
}

func doStuff(fn func(*Thing, *Thing)) {
	a := new(Thing)
	b := new(Thing)

	// 'fn' is a closure; assume it is 'foo'
	fn(a, b)

	// 'a' is no longer needed, but 'b' should remain
	bar(b)
}

Figure 3.5: Runtime merging due to higher-order functions

One problem with runtime merging is that a program might have two regions that must be merged where one of them is the global region. Merging anything with the global region will not make the program incorrect, but it is inefficient, since it puts memory allocated by our RBMM region allocator into a region whose data are garbage collected.

Since the garbage collector never allocated the region's data, it cannot reclaim it; any region merged with the global region therefore becomes a memory leak whose memory is never reclaimed.

Merging is also a necessity in certain cases at compile time. These merges occur when our static analysis detects that two formal parameters of a function become associated, as when a member of one is assigned to a member of the other (region aliasing). In such a case, our analysis combines the two regions into one at compile time, avoiding the overhead of a runtime call to combine them. Our transformation modifies the function to take one single region for both parameters, so the merge happens seamlessly and without any runtime overhead:

    f(e1, e2) ⟹ f(e1, e2)⟨r1,2⟩

Merging can also be used as an optimization. Figure 3.6 shows a case where the analysis can detect at compile time that a merge might occur: a function call appears within a branching statement. Since the branch might be taken infrequently or never, we want to avoid generating one large region from multiple smaller ones, so in this case we can defer the decision to merge until runtime. In Figure 3.6, the merge operation should be inserted in the then branch of the if conditional, just before the call to bar. This is less conservative than the compile-time merge, which would generate a merged region no matter what. We detect the conditional case in the same way as the compile-time merge described above: a merge function, executed at runtime, is inserted into the branch just before the call that requires multiple arguments to come from the same region. Unfortunately, runtime merging is not without performance overhead: it requires additional checks at region removal, as well as the actual merge function responsible for associating the regions with each other.

Our concept of merging is safe, although not optimal. Arbitrarily merging two regions is safe, since the memory of the merged region's objects can never be reclaimed until none of the objects are needed. A program retains the same semantics if two regions unnecessarily become one; however, its memory utilization will be worse.
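A minimal standalone sketch of how runtime merging could be realized, exploiting the same identifier indirection that the RegionMap of Chapter 4 formalizes; the type and field names are invented, and this is not the thesis implementation. Because region identifiers are pointer cells, redirecting one at the other's contents makes every holder of either identifier see the single merged region:

// mergedRegion holds a region's contents: its occupants and its
// protection count (Section 3.8).
type mergedRegion struct {
	occupants  []interface{}
	protection int
}

// RegionID is the level of indirection: user code holds identifiers,
// never regions directly.
type RegionID struct{ r *mergedRegion }

// MergeRegion splices r2's contents into r1 and redirects r2, so that
// later operations through either identifier act on one region.
func MergeRegion(r1, r2 *RegionID) {
	if r1.r == r2.r {
		return // already the same region
	}
	r1.r.occupants = append(r1.r.occupants, r2.r.occupants...)
	r1.r.protection += r2.r.protection
	r2.r = r1.r
}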

3.14 Multiple Specialization

Caution must be taken when analyzing calls to functions that belong to libraries not compiled by a region-aware compiler. Since our modified compiler (a gccgo plugin) might not have access to the library source code, we cannot recompile the library to make it region-enabled, and any allocations in such external libraries will come from Go's garbage collector. Consider the case where an RBMM-allocated structure is passed by pointer to an external function in a non-region-aware library. If this library assigns a field allocated by Go's collector to this argument, then the field's memory might be reclaimed during Go's next GC cycle: the only access to the field might be from the RBMM-allocated structure, which lies outside the memory space of Go's garbage collector. The collector will never be able to reach the allocated field from a root-set variable during a memory scan, and will incorrectly conclude that its memory can be reclaimed. To avoid this, any arguments we pass to an external library, or results we receive from one, must be allocated from the global region (which uses Go's garbage collector to allocate data). In addition, we cannot pass region arguments to external libraries, since we cannot transform or specialize any functions in these pre-compiled binaries.

func bar(x, y *T) {
	x.next = y
}

func foo(val int) *T {
	a := new(T)
	b := new(T)

	if val == 42 {
		bar(a, b)
	}

	return a
}

Figure 3.6: Runtime merging case

Now consider the case where a closure is passed to a function that is part of another Go library or module not compiled with a region-aware compiler. If this external library does not have region-aware code, our transformation cannot pass it a region-specific closure or datatype: such cases would require the functions in the external library to be aware of region semantics and to know how to pass region data to the closure argument, information these non-region-aware object files do not have. To avoid passing region-aware data to non-region-aware code, we specialize the closure argument. The specialized copy of the closure is never transformed by our region analyses, so all of its allocations come from the garbage collector. If the external object file was compiled with a region-aware compiler, we can pass region-specific data, such as the trampoline for the closure, as described in Section 3.9.

3.15 Go Infrastructure

Go programs can consist of multiple object files, which form the basis of Go's libraries, or packages. These files may or may not have been compiled with a region-aware compiler. When a foreign package was not compiled with a region-aware compiler, any information that might carry region data, such as functions or instances of complex objects (structures or interface instances), cannot be passed to the external object file. The reason is twofold. For closures, the non-region-aware library will not pass region arguments to a closure whose body expects them. For complex objects with pointer members, the non-region-aware library might assign garbage-collector-allocated data to a member; the collector might not "see" the encompassing structure, since it might have been allocated from our region memory, leaving nothing that makes the member reachable from the collector's perspective. A subsequent GC might then reclaim this memory early, regardless of the fact that a region expects the memory associated with that member to remain valid.

When we do compile these packages and create our own Go object file, our transformation can pass region-aware data. During our transformation, our analysis data is added as an extra ELF section in the object file that the gccgo compiler creates. This additional information records which functions have been region-modified and which arguments and region arguments are associated with each. During the interprocedural analysis pass, we can query this information and pass arguments and their associated regions to functions in the external object file.

3.16 Evaluation

To test the effectiveness of our implementation, we benchmarked a suite of small Go programs. (We cannot yet test larger programs due to our as-yet incomplete coverage of Go.) The benchmark machine was a Dell Optiplex 990 PC with a quad-core 3.4 GHz Intel i7-2600 CPU and 8 GB of RAM, running Ubuntu 11.10 with Linux kernel version 3.0.0-17-generic. We used GCC 4.6.3 to run our plugin and compile the benchmarks, but linked with the GCC 4.6.1 libraries supplied with the operating system. Table 3.1 gives background information about our benchmark programs. Some of these are adaptations of Debian's "Computer Language Benchmarks Game" programs provided by the GCC 4.6.0 Go test suite and aimed at measuring language performance (binary-tree, binary-tree-freelist, meteor-contest).

                                        GC                           RBMM
Name                  LOC  Repeat   Alloc   Mem  Collections    Regions  Alloc%   Mem%
binary-tree-freelist   84       1     270  227M            3          1      0%     0%
gocask                110    100k     56M  3.8G          97k    700,001    0.5%   0.1%
password hash          47      1k    160M   13G         145k      5,001     ~0%    ~0%
pbkdf2                 95      1k    115M    8G          92k     12,001      0%     0%
blas d                336     10k      6M  890M          11k    570,001    9.2%   9.1%
blas s                374     100     49K    5M           58      5,001   10.1%  21.0%
binary-tree            52       1    607M   19G          282  2,796,195   ~100%  ~100%
matmul v1              55       1      6K   72M           10          4   96.0%  99.9%
meteor-contest        482      1k      3M  165M           2k  3,459,001   ~100%  99.9%
sudoku v1             149       1     40K   12M          110     40,003   98.8%  99.2%

Table 3.1: Information about our benchmark programs

                          MaxRSS (megabytes)           Time (secs)
Name                      GC       RBMM                GC     RBMM
binary-tree-freelist    891.84   892.01 (100.0%)      12.4   12.2 (98.4%)
gocask                   27.45    27.63 (100.7%)      71.6   69.7 (97.3%)
password hash            26.60    26.80 (100.7%)     119.0  119.1 (100.1%)
pbkdf2                   26.37    26.58 (100.8%)      71.4   71.6 (100.3%)
blas d                   25.87    26.14 (101.0%)       5.4    5.4 (100.0%)
blas s                   26.05    26.29 (100.9%)      12.2   12.1 (99.2%)
binary-tree            1323.74  1196.51 (90.4%)       79.2   14.7 (18.6%)
matmul v1               313.03   307.87 (98.4%)       11.7   11.7 (100.0%)
meteor-contest           27.41    27.11 (98.9%)       11.0   11.0 (100.0%)
sudoku v1                26.96    26.65 (98.8%)       15.6   16.5 (105.8%)

Table 3.2: Benchmark results

The matmul v1 and sudoku v1 applications are from Heng Li's "Programming Language Benchmarks" [57], and the remaining programs are from libraries: Michal Derkacz's blas d and blas s [20], Dmitry Chestnykh's password hash and pbkdf2 [11], and Andre Moraes' gocask [61]. The Name and LOC columns of the table give the name of each benchmark and its size in lines of code.

The inputs provided by the GCC suite for some of the programs are so small that they lead to execution times that, due to clock granularity, are too small to measure reliably. We gave some of these benchmarks larger inputs than the ones in the GCC suite; where this was impossible or insufficient, we modified the program to repeat its work many times, and the Repeat column shows how many. The Alloc and Mem columns give, respectively, the number of objects allocated by each iteration of the program and the amount of memory requested. These numbers were measured on the original version of each benchmark, which used Go's usual garbage collector. The Collections column gives the number of collections in each iteration. (For the gocask benchmark, different runs of the program perform different numbers of collections, due to the use of parallelism by a library.)

The last column group describes the results of our region analysis and its effects. These numbers come from a version of each benchmark compiled to use our RBMM system. The Regions column gives the number of regions allocated during the run of the program; the global region counts as one of these. The Alloc% column gives the percentage of the allocations made by the program at runtime that come from a non-global region and are therefore handled by our system (the rest, the allocations from the global region, are handled by Go's usual garbage collector). The Mem% column gives the percentage of the bytes allocated by the program at runtime that come from a non-global region.

Table 3.2 contains our main performance data. Both column groups in this table compare the performance of each benchmark when compiled to use Go's usual garbage collector (the columns labeled GC) and when compiled with our experimental RBMM system (the columns labeled RBMM, which also show the ratio between the GC and RBMM results). The column group named MaxRSS reports the maximum size, in megabytes, of the resident set of the program at termination, as reported by the GNU time command. Likewise, the column group named Time reports the wallclock (elapsed)

execution time of each benchmark in seconds. We generated the two versions of each benchmark by compiling them with gccgo without any command line options beyond those selecting GC or RBMM, so all the programs were built at the default optimization level. To avoid measuring OS overheads, we disabled all output from the benchmarks during the benchmark runs. To eliminate the effects of any background load, both the MaxRSS and Time results are averages over 30 trials.

The gccgo runtime system in Ubuntu's libgo0 4.6.1 provides a stop-the-world, mark-sweep, non-generational garbage collector. As usual, collections occur when the program runs out of heap at the current heap size; after each collection, the system multiplies the heap size by a constant factor, regardless of how much garbage was collected. We measured the base memory overhead of our RBMM system by compiling a Go program with an empty main function. This produces a binary of 13 KB; using GNU time, we find that the base MaxRSS value is 25 MB.

The Collections column in Table 3.1 shows the number of times Go's garbage collector was woken up. The RBMM implementation performs GC only for data allocated from the global region, mostly by libraries compiled without RBMM. These counts were obtained from a single run of each benchmark, and do not reflect an arithmetic mean.

We used the numbers in the Alloc% and Mem% columns to cluster the benchmarks into three groups; the benchmarks in each group are sorted by name. For the programs in the first group, our system does virtually all memory allocation from the global region, handing responsibility for memory management back to Go's garbage collector. For the programs in the second group, we do some allocations from non-global regions. For the programs in the third group, we do virtually all allocations from non-global regions, hardly using the garbage collector at all.

The benchmarks in the first two groups typically need more memory with RBMM than with GC, but the difference is small, and does not depend on how much memory the program allocates. This MaxRSS difference has

two sources. The first is code size. The RBMM versions of the benchmarks have more code than the GC versions, for two reasons: the library implementing the RBMM operations is included in the RBMM versions of the benchmarks but not the GC versions, and the transformations of Section 3.7 only ever increase code size. (The first effect is constant at 72 KB, while the second scales with the size of the program.) Since even a Go program that does nothing has a MaxRSS of 25.48 MB, due to the size of all the shared objects (such as libc) linked into every Go program, the benchmarks that report a MaxRSS around 26 or 27 MB in fact use only about 1 or 2 MB of data. For these programs, code size differences are therefore a large part of the overall difference in MaxRSS (the maximum such difference is only 270 KB). The second source of difference in MaxRSS is that the RBMM versions need to allocate region pages, and since these programs do relatively few allocations from regions, not all the memory in these pages is used.

The MaxRSS results for the benchmarks in the third group show that if a program makes enough use of region allocation, the RBMM system can deliver an overall saving in memory usage. On all of these programs, the savings we achieve by freeing regions right after they become dead outweigh the extra costs of increased code size and additional internal fragmentation. For one of these benchmarks, binary-tree, the saving is significant. For the other three the overall saving is more modest, but for meteor-contest and sudoku v1 the saving in the part of the RSS we control, the part above the 25.48 MB RSS of a program that does nothing, is in fact quite significant.

With respect to timing, we get a big win on binary-tree, a program designed as a stress test for garbage collectors. It allocates many small nodes, which the GC system must scan repeatedly. The RBMM version can put all the nodes in regions whose memory can be reclaimed without any scanning. This makes the RBMM version more than five times as fast as the GC version, while using about 10% less memory.

Another version of this program, binary-tree-freelist, has its own built-in allocator based on a freelist: when a memory block is no longer needed, the program puts it onto a freelist stored in a global variable, and later allocations take blocks from the freelist when possible. This ensures that all memory blocks ever allocated are not just reachable but also potentially used throughout the program's entire lifetime, which makes this a worst case for any automatic memory management system. Our region analysis detects that all this data is always reachable, so it puts all the data allocated by this benchmark into the global region, which is handled by Go's garbage collector. In this case the RBMM and GC versions therefore do the same work and consume the same memory. However, the exact instruction sequences they execute differ slightly, so their timing results differ too, probably due to cache effects. The results on this benchmark tell us that, in this benchmarking setup, a speed difference of 1.6% is in the noise and is not meaningful.

We get a slightly higher speedup, 2.7%, for gocask. Since this program does allocate some memory from a non-global region, this speedup could conceivably result from region allocation, but since the program does very few region allocations, the figure is also very likely to be noise. The same is true for all the deviations from 100% for the other programs in the first two groups.

In the third group, one program, binary-tree, gets a spectacular, more-than-five-fold speedup; two show no change in speed; and the fourth program, sudoku v1, gets a slowdown. The original, GC version of binary-tree allocates a lot of relatively long-lived memory: it has the biggest MaxRSS of all our benchmarks, and each GC pass has to scan all this memory. The RBMM version allocates all these nodes in regions, whose memory can be recovered without scanning their contents. Since the GC version spends most of its time in this scanning, avoiding these scans gives the RBMM version its huge speedup.

The next program in this group, matmul v1, has very few allocations and

very few collections: apparently, most of the few blocks it allocates are long-lived. Because of this behavior, the GC version spends a negligible fraction of its runtime scanning the heap and freeing blocks, so the effect on the program's overall runtime would be negligible even if the RBMM version sped up this part of the program by a large factor.

The meteor-contest program does about three and a half million allocations. In the RBMM version, each of these allocations has its own private region, so this version of the program does three and a half million region creations and removals; it recovers the memory of every block one by one, just like the GC version. The fact that we suffer no slowdown on this benchmark shows that our region creation and removal functions are efficient.

The sudoku v1 benchmark puts almost all of its memory in regions, and this allows it to use less memory than the GC version. Nevertheless, the RBMM version of this benchmark is slower. We believe this is because the benchmark makes many function calls that involve regions, and the extra time spent by the RBMM version reflects the cost of passing the extra region parameters.

3.17 Summary

In this chapter we have introduced a novel approach to fully automatic memory management for the Go programming language, employing region-based storage. It is based on a combination of static analysis, to guide region creation, and lightweight runtime bookkeeping, to help control reclamation. Traditional region analysis algorithms propagate region information from callees to callers and vice versa. This means that any change to the program source code may require reanalysis of many parts of the program; if some of these reanalyses yield changed results, the changes must be propagated through the program's call graph, and reanalysis can end only when it reaches a fixed point. In contrast, our analysis is context-insensitive, allowing our system to propagate information only from callees to callers. After a change to a function definition, we therefore need to reanalyze only the functions in the call chain(s) leading down to it. We also introduced the novel concept of region protection counting, which acts as an efficient alternative to reference counting for preventing premature region removal. Chapter 5 extends the concepts presented in this chapter.

Chapter 4

Correctness of the RBMM Transformations

We adore chaos because we love to produce order.

M.C. Escher

This chapter presents a rigorous argument for the correctness of the first-order subset of the RBMM transformations explained in Chapter 3. To facilitate this argument we introduce a semantics that focuses on the relevant aspects of memory management while abstracting away the concepts that are less important for proving correctness.

4.1 Memory Management Correctness

Before introducing a semantics, it is necessary to define precisely what is meant by a memory management system being "correct." Much of the work on the correctness of memory management concerns GC, where correctness is defined in terms of what can be reached from the root set. As we saw in Chapter 2, reachability gives a conservative approximation of memory safety. If we wish to discuss the correctness of non-GC dynamic memory management, we need a model of memory operations that is more precise than the one used for GC, but still simple. In this chapter we provide such a model in the form of a denotational definition. As Morrisett et al. point out, discussing memory safety beyond reachability requires a model that precisely captures the behavior of the memory operations [62].

For a program to uphold any reasonable degree of integrity, the safety of its memory management must hold. In the case of the RBMM transformations discussed in Chapter 3, the transformed program must behave "similarly" to the original program: it should produce output identical to that of the unmodified program.

We are initially concerned with the question: what does it mean for a memory management system to be correct? Broadly, a correct system produces memory when requested and recycles memory that is no longer needed. That description, however, is much too broad: it does not consider where the memory comes from, or how the objects are associated with each other. We need a correctness statement that is sensitive to the relationships between the objects managed by the memory system.

What we will show is that our system preserves the semantics of the original program, unmodified by our analysis and transformations. This assumes that the unmodified system (the original compiler and runtime system) is correct. We state this idea of correctness by showing that our system preserves the behavior of the original program even after our transformations are applied. The only difference between programs generated by the original system and by ours is how memory is managed, which should have no impact on the semantics of the original, unmodified program.

To demonstrate correctness we abstract away all aspects that are extraneous to the proof. We want to show that all objects bound to dynamic memory maintain the same shape and object relations as the corresponding objects produced by the original system; in other words, our transformations must preserve the points-to relationships between objects. This comes down to maintaining a certain equivalence between objects in the modified and unmodified memory systems, as explained below.

The following example illustrates the necessity for preserving both shape and value:


Figure 4.1: Distinguishing object sharing

type Thing struct { id int }

func shape_and_value() {
	// Fragment 1
	bar := new([2]*Thing)
	b := Thing{42}
	c := Thing{42}
	bar[0] = &b
	bar[1] = &c

	// Fragment 2
	foo := new([2]*Thing)
	a := Thing{42}
	foo[0] = &a
	foo[1] = &a
}

The function shape_and_value contains two fragments of code. In the first, bar points to two separate objects that happen to have the same value. In the second, foo[0] and foo[1] both point to the same object a, which has an integer member with value 42. The important point is that while foo and bar contain similar values, their shapes are completely different, as illustrated in Figure 4.1: foo has two pointers referring to the same object, while bar points to two separate objects. The transformations must preserve both shape and value for the program to behave as the original.

We abstract away memory addresses, since these are not needed to show that our system maintains a bijection between the objects of the two systems. We also assume unlimited memory, as a memory limit is immaterial for this exercise. Which allocator produces an object does not matter, as long as the object is not used after it has been reclaimed. If we can demonstrate a bijection between the objects created in the two systems, and show that our transformations do not alter the behavior of the program, then we can conclude that the behavior of our system is equivalent to that of the unmodified system. This bijection captures our idea of correctness: the semantic equivalence between the two systems holds.

We also do not consider when objects are collected. RBMM and GC systems can reclaim objects at different times, and comparing when each system performs this reclamation is not necessary to prove correctness. Collecting an object violates correctness only if the object is accessed after it has been collected.

4.2 Semantic Language

To formalize our correctness argument we introduce a formal semantics sufficient to describe both our memory system and Go's. The rules and constructs covered here are explained in detail when we present our transformation rules later. The semantics omits features of Go and GIMPLE (GCC's intermediate representation, on which we build our RBMM system) that are not relevant to the correctness argument. The transformations are source-to-source and are represented by the function T. The following disjoint sets are used to reason about specific types of data in our system; we use the $ character to represent the return value of a function call.

Stmt ::= ident = ident
      |  ident = ident.selector
      |  ident.selector = ident
      |  ident = ident(ident∗)
      |  ident = new()
      |  return ident
      |  ident = CreateRegion()
      |  RemoveRegion(ident)
      |  ident = AllocFromRegion(ident)
      |  IncrProtection(ident)
      |  DecrProtection(ident)
      |  MergeRegion(ident, ident)

Figure 4.2: "Memory-facing" fragment of the language

Figure 4.2: “Memory-facing” fragment of the language a function call. These are the sets used:

These are the sets used:

ident    = the set of all identifiers, together with $
prim     = the set of all primitive values, including ℤ
object   = the (denumerable) set of all object identities
region   = the (denumerable) set of all regions
regionid = the (denumerable) set of all region identifiers

A grammar for the simplified language is given in Figure 4.2, and the semantic domains are defined in Table 4.1. A selector is used to obtain the value located at a specified field within an aggregate data structure, or to retrieve a list item at a specified index; for example, a selector may be the field name next or the array index 2. Selectors abstract away the notions of memory addresses, offsets of structure fields, and array indices. We assume the program is well-typed and that no array elements are accessed out of bounds. This abstraction permits our formalization to obtain the value from any scalar or aggregate data structure.

selector    = ℕ ∪ ident
value       = prim ∪ object ∪ region ∪ {default}
Environment = ident → value
Occupants   = region → (P(object) × ℕ)
Store       = object → selector → value
RegionMap   = regionid → region
State       = Store × Environment × Occupants × P(object ∪ region) × RegionMap

Table 4.1: Semantic domains
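Purely as an illustration of how an implementation might realize these domains, the following Go rendering uses maps for the mathematical functions and integers for object and region identities; none of these type names come from the thesis implementation.

type (
	Ident    string
	Selector string // abstracts field names and array indices
	ObjectID int    // an object identity; addresses are abstracted away
	RegionID int    // a region identifier (the regionid set)
)

// Value is prim ∪ object ∪ region ∪ {default}.
type Value interface{}

type (
	SemStore map[ObjectID]map[Selector]Value // object → selector → value
	SemEnv   map[Ident]Value                 // ident → value
)

// regionContents pairs a region's occupants with its protection count,
// i.e. P(object) × ℕ.
type regionContents struct {
	occupants  map[ObjectID]bool
	protection int
}

// SemState bundles (Store × Environment × Occupants × free set × RegionMap);
// the regions map plays the roles of both Occupants and RegionMap, since
// each RegionID points indirectly at a region's contents.
type SemState struct {
	store   SemStore
	env     SemEnv
	regions map[RegionID]*regionContents
	free    map[ObjectID]bool // ϕ: objects not yet allocated
}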

A value is our abstract representation of the items managed by the language, including instances of allocated items. default represents the initial value of an item before any user code has had a chance to modify it: a value of 0, or nil in the case of a pointer, as specified by Go semantics. The definitions of the Store and Environment functions are similar. The Store models a program's heap, or dynamic memory area, consisting of memory that is not automatically allocated on the stack; stack and register memory is modeled by the Environment. Data can live in the Store, but the values of those data are only realized for computation when they are reachable from the Environment. This distinction between two seemingly similar memory areas allows us to represent memory relevant only to a particular function call (i.e., stack frames).

The RegionMap introduces a level of indirection: a region represents the contents of a region, while a regionid represents a pointer to the region, which we call a region identifier. RegionMap is used to obtain the value pointed to by a given regionid. Region identifiers are generated by the compiler during analysis and are dynamically allocated at runtime. This indirection is an important concept whose purpose becomes apparent when region merging is discussed later in this chapter. All of the region operations take region identifiers as arguments.

We use σ to represent the Store, ρ the Environment, τ the Occupants function, ϕ the set of free (unallocated) objects, and ψ the region mapping. A region is a tuple (O, n) consisting of allocated values (occupants) and a protection counter value. The Occupants function is used to obtain the occupants O and protection counter value n of a specific region. A State is represented as (σ, ρ, τ, ϕ, ψ), where (a small transcription into Go follows the list):

• σ: Store

• ρ: Environment

• τ: Occupants function

• ϕ: set of free objects or unused regions

• ψ: region mapping
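To make the domains concrete, here is a small transcription into Go; the type names are ours and purely illustrative, not part of the formalism or of our implementation. For simplicity it folds τ and ψ into a single map from region identifiers to regions:

    package semantics

    // Illustrative transcription of Table 4.1. Object and RegionID model
    // the denumerable sets; Value stands for prim ∪ object ∪ region ∪ {default}.
    type (
        Object   int
        RegionID int
        Selector string // abstracts field names and array indices (N ∪ ident)
        Value    interface{}

        Environment map[string]Value              // ident → value
        Store       map[Object]map[Selector]Value // object → selector → value
    )

    // Region models the tuple (O, n): occupants plus a protection counter.
    type Region struct {
        Occupants map[Object]bool // O
        Protect   int             // n
    }

    // State bundles (σ, ρ, τ, ϕ, ψ); Free models ϕ, the unallocated pool.
    type State struct {
        Sigma Store
        Rho   Environment
        Psi   map[RegionID]*Region // τ and ψ folded together
        Free  map[Object]bool
    }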

The semantic functions have the following format: the input to a semantic function is a statement and the current State, and the result of evaluating the function is an updated State. The ∗ character is the Kleene closure, representing zero or more occurrences.

B :: Stmt∗ → State → State
S :: Stmt → State → State

The semantic function B for blocks is defined:

B⟦⟧ = id
B⟦s; ss⟧ = B⟦ss⟧ ◦ S⟦s⟧

The semantic function S for statements is defined in Figure 4.3.

4.3 Transformation Correctness

We now show that the RBMM transformations, as presented in Chapter 3, preserve the semantics of the program being transformed. Of course, not all

S⟦x = y⟧(σ, ρ, τ, ϕ, ψ) = (σ, ρ[x ↦ ρy], τ, ϕ, ψ)

S⟦x = y.s⟧(σ, ρ, τ, ϕ, ψ) = (σ, ρ[x ↦ σ(ρy)s], τ, ϕ, ψ)

S⟦x.s = y⟧(σ, ρ, τ, ϕ, ψ) = (σ[o ↦ (σo)[s ↦ ρy]], ρ, τ, ϕ, ψ)
    where o = ρx

S⟦return y⟧(σ, ρ, τ, ϕ, ψ) = (σ, ρ[$ ↦ ρy], τ, ϕ, ψ)

S⟦v = f(a)⟧(σ, ρ, τ, ϕ, ψ) = (σ′, ρ[v ↦ ρ′($)], τ′, ϕ′, ψ′)
    where (p, s) = lookup(f)
          (σ′, ρ′, τ′, ϕ′, ψ′) = B⟦s⟧(σ, {p ↦ ρ(a)}, τ, ϕ, ψ)

S⟦v = AllocFromRegion(r)⟧(σ, ρ, τ, ϕ, ψ) =
    (σ[y ↦ λs.default], ρ[v ↦ y], τ[ψr ↦ (O ∪ {y}, n)], ϕ \ {y}, ψ)
    where y = select(object ∩ ϕ)
          (O, n) = τ(ψr)

S⟦v = new()⟧(σ, ρ, τ, ϕ, ψ) =
    (σ[y ↦ λs.default], ρ[v ↦ y], τ[ψG ↦ (O ∪ {y}, n)], ϕ \ {y}, ψ)
    where y = select(object ∩ ϕ)
          (O, n) = τ(ψG)

S⟦r = CreateRegion()⟧(σ, ρ, τ, ϕ, ψ) =
    (σ[r ↦ x], ρ, τ[y ↦ (∅, 0)], ϕ \ {x, y}, ψ[x ↦ y])
    where y = select(region ∩ ϕ)
          x = select(regionid ∩ ϕ)

S⟦RemoveRegion(r)⟧(σ, ρ, τ, ϕ, ψ) =
    if n ≠ 0 then (σ, ρ, τ, ϕ, ψ)
    else (σ[v ↦ ⊥ | v ∈ O], ρ, τ, O ∪ {r, ψr} ∪ ϕ, ψ[r ↦ ⊥])
    where (O, n) = τ(ψr)

S⟦IncrProtection(r)⟧(σ, ρ, τ, ϕ, ψ) = (σ, ρ, τ[ψr ↦ (O, n + 1)], ϕ, ψ)
    where (O, n) = τ(ψr)

S⟦DecrProtection(r)⟧(σ, ρ, τ, ϕ, ψ) = (σ, ρ, τ[ψr ↦ (O, n − 1)], ϕ, ψ)
    where (O, n) = τ(ψr)

S⟦MergeRegion(r1, r2)⟧(σ, ρ, τ, ϕ, ψ) = (σ, ρ, τ[ψr1 ↦ r̂], ϕ, ψ[r2 ↦ ψr1])
    where (O1, n1) = τ(ψr1)
          (O2, n2) = τ(ψr2)
          r̂ = (O1 ∪ O2, n1 + n2)

Figure 4.3: The semantics of the memory-facing statements

aspects of the systems will need to be the same. For instance, the source of object allocations and the free object sets will differ between the RBMM system and an unmodified system. In fact, we do not even model reclamation in the unmodified system, since all we are concerned with are allocated objects. What is important is that there exists a bijection between the sets of reachable objects and values of both systems. We denote the domain of a function f by dom(f): the set of values x for which f(x) is defined. With dom defined we now provide two definitions and an invariant that holds across all of the semantic functions defined above. This invariant must hold for our system to be correct. Let o be an object and σ a Store. We define the set of objects reachable from o as:

Definition 1.

Reachable(σ, o) = {o} ∪ ⋃_{s ∈ dom(σo)} Reachable(σ, σos)

Let (σ, ρ, τ, ϕ, ψ) be a runtime state. We can now define the set of objects reachable from the root set as:

Definition 2.

RR(σ, ρ, τ, ϕ, ψ) = object ∩ ⋃_{i ∈ dom(ρ)} Reachable(σ, ρi)

Invariant 1. The system will never reclaim a region if any of the objects belonging to it can be accessed at a later program point. Formally: let (σ, ρ, τ, ϕ, ψ) be a runtime state on arriving at a RemoveRegion(r) instruction; then ψr ∉ ϕ ∧ (τ(ψr) = (O, 0)) ⇒ O ∩ RR(σ, ρ, τ, ϕ, ψ) = ∅.

In other words, when a RemoveRegion operation is executed, and the region can be successfully reclaimed (a protection count of zero), then none of the objects in the region are reachable from the root set.

To define semantic equivalence between two states we first introduce a convenience function, e, which captures pairs of corresponding objects and values from both states. We decorate items from the transformed system with a prime.

e(σ, v, σ′, v′) = {(v, v′)} ∪ ⋃_{s ∈ dom(σv) ∪ dom(σ′v′)} e(σ, σvs, σ′, σ′v′s)

This routine recursively traces an object v in the Store σ by following the fields (selectors) s within v. The result is a set of pairs of related values from the two systems σ and σ′. The following predicate aids the definition of semantic equivalence:

bij(v, v′, p) ⟺ ∀(w, w′) ∈ p. (v = w ⟺ v′ = w′)
    where p = e(σ, v, σ′, v′)

This predicate takes a pair of objects v and v′ and a set of pairs of values p from e. The predicate bij is true only if v and v′ are paired with no objects other than each other in the specified pairs. We can now formally define semantic equivalence between the values v and v′ and the object associations of the stores σ and σ′:

(σ, v) ≡ (σ′, v′) ⟺ ∀(x, x′) ∈ p. { bij(x, x′, p)  if {x, x′} ⊆ object
                                     x = x′         otherwise }
    where p = e(σ, v, σ′, v′)

This statement establishes an equivalence between object shapes and values via a recursive traversal of each pair of objects. The predicate bij guarantees only that the shapes of the two tested objects are the same; the equivalence statement above ensures that the object shapes and their values are both identical.

The states of both systems are equivalent if the values of all the variables in the environment are equivalent:

(σ, ρ, τ, ϕ, ψ) ≡ (σ′, ρ′, τ′, ϕ′, ψ′) ⟺ ∀i ∈ dom(ρ). (σ, ρi) ≡ (σ′, ρ′i)

Given this definition of state equivalence, we can now state the following theorem of semantic equivalence. It states that the transformations applied by our RBMM system do not alter the semantics of the original program.

Theorem 1. For all function bodies B and states s, B⟦B⟧s ≡ B⟦B′⟧s, where B′ = T(B) and T :: Stmt → Stmt∗ is the RBMM transformation function.

4.3.1 Region Creation and Removal

Any program not transformed by our system trivially satisfies Theorem 1, since no code transformations have been performed on the analyzed program. We now present a few simple cases to demonstrate the correctness of our RBMM transformations in terms of Theorem 1. Recall that all of the transformations are performed automatically at compile-time and require no annotations by the programmer. The first transformations we describe below are the region creation and removal operations. These are the most important transformations, since they are the routines that produce and remove regions at runtime; no other region operation can occur without first having a region to manipulate. When the creation and removal operations are added to the identity transform T, Theorem 1 holds trivially, since the regions will never be used and the program semantics remain unchanged. Additionally, if a region that is never used is removed, the semantics of the program also remain unchanged. We introduce three invariants that pertain to the creation and removal operations:

1. CreateRegion maintains the invariant that a region will always be created for an object before that object is first used. This ensures that an allocation for that object will be produced from a region that exists.

2. RemoveRegion maintains the invariant that a region will always be removed if its objects are no longer used in the program and do not escape the function. If removal is attempted on a region whose objects are referenced later, then RemoveRegion will return early and the region will not be reclaimed.

3. All functions will try to reclaim the regions of each of the variables passed to them, and of any variables allocated within the function that might not be accessed later. Objects returned from a function are created from a region that is passed to that function by the caller. It is safe for the callee to try to remove that region, since the caller will have incremented the region's protection counter if it needs the returned object later.

func f(f1,...,fn) {              func f(f1,...,fn) {
    S1; ... Sm;           ⟹          C; S1; ... Sm; R;
    return f0;                       return f0;
}                                }

where C = { r = CreateRegion(); | r ∈ reg(f) \ ir(f) }
      R = { RemoveRegion(r);    | r ∈ reg(f) \ {R(f0)} }

The above transformation is the simplest implementation of region creation and removal: a function is transformed by adding region creation operations to the function prologue, and any removals are added to the function's epilogue. By placing the creation operations in the function's prologue we guarantee that the region will be available for the function to use,

since all uses of regions occur after the function prologue. Additionally, by placing the removal operations in the function epilogue we guarantee that the region will not be prematurely reclaimed. Again, if a region that is never used is removed, the semantics of the program are not altered. In this case, since all regions are created before any statements of the original function are executed, we guarantee with the semantic rule for CreateRegion (copied below) that every region creation will produce a region before that region is needed.

S⟦r = CreateRegion()⟧(σ, ρ, τ, ϕ, ψ) =
    (σ[r ↦ x], ρ, τ[y ↦ (∅, 0)], ϕ \ {x, y}, ψ[x ↦ y])
    where y = select(region ∩ ϕ)
          x = select(regionid ∩ ϕ)

All regions will be available before they are needed, and a program will never try to access a region that does not exist. CreateRegion defines y as being obtained via select from the free regions in ϕ. y is initialized with an empty occupants set O and a protection counter starting at 0. ψ establishes the mapping from the region identifier x to the contents y. The Environment remains unchanged, and the Store maps the local region variable r to the identifier x, which ψ in turn maps to y. Both y and the identifier x (which points to y) are drawn from the set of free objects ϕ. Only a region is produced here; objects can subsequently be allocated from it.

It is also safe to place a RemoveRegion operation at the end of the function. Recall the semantic rule for RemoveRegion:

S⟦RemoveRegion(r)⟧(σ, ρ, τ, ϕ, ψ) =
    if n ≠ 0 then (σ, ρ, τ, ϕ, ψ)
    else (σ[v ↦ ⊥ | v ∈ O], ρ, τ, O ∪ {r, ψr} ∪ ϕ, ψ[r ↦ ⊥])
    where (O, n) = τ(ψr)

When a RemoveRegion operation is encountered, two things can happen: either the protection counter is non-zero and the region is not reclaimed, or the protection counter is zero and the region's memory is reclaimed. The rule states that if the protection counter n is zero, then the occupants of the region are set to ⊥ and their memory is returned to the free store ϕ. Similarly, the identifier r for the region is mapped to ⊥ and is also returned to the free store ϕ. Our notion of protection counters places the responsibility for region removal with the caller. If a function call is made, our analysis first checks whether any variables belonging to any regions passed to the callee are used later. If so, the protection counter of each such region is incremented. A non-zero protection counter ensures that the callee cannot remove the region of a variable that might be used later in the caller; in this case the rule preserves Invariant 1 trivially, since it does not reclaim any memory. Once the protection counter returns to zero, a subsequent RemoveRegion call will return the region to the set of free/unused objects ϕ. Since our system only reclaims memory at the latest program point in the function where all items are no longer accessed, we can guarantee that Invariant 1 is not violated. Additionally, if a region is passed to a function, and that region contains objects that will outlive the function call, then the protection counter will be non-zero, and thus the callee will not successfully reclaim the region if it calls RemoveRegion.
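As an illustration only (the real runtime manipulates pages, not abstract states), the RemoveRegion rule can be transcribed directly against the State sketch given after Table 4.1:

    // RemoveRegion transcribes the semantic rule: a non-zero protection
    // counter leaves the state unchanged; otherwise every occupant is set
    // to ⊥ (deleted from σ) and rejoins ϕ, and ψ[r] is unmapped.
    // (This continues the illustrative semantics package shown earlier.)
    func (s *State) RemoveRegion(rid RegionID) {
        r, ok := s.Psi[rid]
        if !ok || r.Protect != 0 {
            return // region already gone, or protected by a caller
        }
        for o := range r.Occupants {
            delete(s.Sigma, o) // σ[v ↦ ⊥ | v ∈ O]
            s.Free[o] = true   // occupants rejoin the free store ϕ
        }
        delete(s.Psi, rid) // ψ[r ↦ ⊥]; identifier and region rejoin ϕ
    }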

The CreateRegion transformation does not modify the semantics of the program. It does not modify any variables of the original program, and the only objects manipulated are those created on behalf of the RBMM system. We claim this transformation maintains semantics equivalent to those of the unmodified system and is correct, since the bijection is not violated. This reasoning satisfies Theorem 1.

Building on the previous transformation, we now introduce the following, in which region creation is moved from the prologue to a later point in the function:

r = CreateRegion();          S1; ... Sm;
S1; ... Sm;           ⟹     r = CreateRegion();
Sm+1; ... Sn;                Sm+1; ... Sn;

where r ∉ used(S1; ... Sm)

It is safe to place a creation operation anywhere before the first use of that region. Logically this is safe because the transformation moves a creation down to the first statement that requires the region. From the perspective of a Go program, this is the program point where the programmer issues a call to Go's new or make operations. new and make are replaced by our AllocFromRegion operation, and therefore a region must be available to allocate from at that program point. Placing a region creation before the first use of the region ensures that the region will be available for allocation. This transformation preserves Theorem 1 because the program semantics remain unchanged.

We now introduce another transformation for handling region removal:

S1; ... Sm;                  S1; ... Sm;
Sm+1; ... Sn;         ⟹     RemoveRegion(r);
RemoveRegion(r);             Sm+1; ... Sn;

where r ∉ used(Sm+1; ... Sn)

A region reclaim operation can be moved to the point in a function at which all occupants allocated from the region are no longer needed: a program point after which none of the region's occupants will be accessed. Our static analysis keeps track of which occupants are allocated from which regions, and at which program points each occupant is referenced. The reclaim operation is inserted at the earliest program point where the region's occupants will no longer be referenced. The protection counter (discussed later) prevents a region from being removed prematurely (e.g., while any of its occupants are still needed). In fact, it is safe to place multiple reclaim operations for the same region throughout the function. This transformation satisfies Theorem 1 since the program semantics remain unchanged: region removal will only occur when the region's occupants are no longer used.

4.3.2 Function Definition and Application

The next set of transformations guides the conversion of a traditional Go function prototype into one whose formal parameters accept region arguments.

func f(f1,...,fn) {              func f(f1,...,fn)⟨r0,r1,...,rp⟩ {
    S1; ... Sm;           ⟹          S1; ... Sm;
    return f0;                       return f0;
}                                }

where ⟨r0,...,rp⟩ = ir(f)

The above transformation does not modify any objects. The state of the program remains unmodified; therefore this transformation is immediately correct and satisfies Theorem 1. The next transformation represents function application when regions can be passed as arguments and/or returned.

v = f(v1,...,vn)    ⟹    v = f(v1,...,vn)⟨r0,r1,...,rp⟩

where ⟨r0,...,rp⟩ = compressf ⟨R(v0), R(v1),..., R(vn)⟩

The above transformation is guided by the semantic rule for function application:

S⟦v = f(a)⟧(σ, ρ, τ, ϕ, ψ) = (σ′, ρ[v ↦ ρ′($)], τ′, ϕ′, ψ′)
    where (p, s) = lookup(f)
          (σ′, ρ′, τ′, ϕ′, ψ′) = B⟦s⟧(σ, {p ↦ ρ(a)}, τ, ϕ, ψ)

The lookup function produces the parameters and statements of a function f. The tuple produced by lookup is used to generate the State for f by mapping each parameter p of f in the Environment. The second statement in the where clause of this rule defines f's state as the resulting state after executing all statements in f. The parameters p passed to f are mapped to the input arguments a. The regions described in the first transformation of this section, where the function definition is transformed, are now passed at the function application in the second transformation: the same regions present in the caller are those passed down to the callee. Note that this transformation does not allocate from a region and that no reclaim occurs. Invariant 1 is preserved by this transformation. If our analysis determines that any objects belonging to any of the region arguments passed to the callee are needed later, then an IncrProtection operation is inserted just before the call site, and a DecrProtection operation is inserted immediately after the call site. This guarantees that Invariant 1 will hold. Function application is correct in our system and satisfies Theorem 1, since it maintains the same state semantics as the unmodified system.

The only difference is the addition of region parameters, which act just like any other parameters and have no immediate effect on the state. Since no memory allocations or reclamations occur by the mere act of function application, the transformation preserves correctness.

4.3.3 Region Protection Counting

The following transformation describes where our analysis places the increment and decrement protection counter operations. IncrProtection is placed before, and DecrProtection after, a call site where any objects belonging to a region can be accessed later in the caller:

S1; ... Sm;                      S1; ... Sm;
v = f(...)⟨...,r,...⟩     ⟹     IncrProtection(r);
Sm+1; ... Sn;                    v = f(...)⟨...,r,...⟩
                                 DecrProtection(r);
                                 Sm+1; ... Sn;

where r ∈ used(Sm+1; ... Sn)

The relevant semantic rules are:

S⟦IncrProtection(r)⟧(σ, ρ, τ, ϕ, ψ) = (σ, ρ, τ[ψr ↦ (O, n + 1)], ϕ, ψ)
    where (O, n) = τ(ψr)

S⟦DecrProtection(r)⟧(σ, ρ, τ, ϕ, ψ) = (σ, ρ, τ[ψr ↦ (O, n − 1)], ϕ, ψ)
    where (O, n) = τ(ψr)

After this transformation a region will have a protection counter value greater than zero whenever any of its occupants can be accessed later. This transformation preserves Invariant 1: no user objects are modified, and no objects are allocated or reclaimed by incrementing or decrementing a protection counter. Additionally, since the semantics of the program are

unchanged, Theorem 1 holds.

4.3.4 Region-Based Allocation

We now prove the correctness of region allocation by showing the equivalence of program states between the original program and the transformed program. This is the most interesting of the transformations described, because it associates objects allocated in the user's program with regions. The following transformation shows the result of our compile-time analysis converting the original allocation code into our RBMM equivalent: our static analysis locates a new statement in Go and rewrites it as follows:

v := new(t)    ⟹    v := AllocFromRegion(R(v))

This transformation is responsible for producing a piece of freely available dynamic memory from the region R(v) and binding the result to the variable v. The transformation moves all allocations in the program into regions; consequently, an object cannot be removed until the region it was produced from has a protection counter value of zero. The semantic rule for AllocFromRegion shows what happens to the program's state when the allocation routine is called, just as the rule for new shows what happens when a Go program calls its allocation routine. The Store, which contains the memory that v is bound to, is initially mapped to the result of an abstract function that initializes the memory assigned to y to a default value. y is the Store's actual memory that will be bound to v. The where clause of this rule defines y as an object produced from ϕ (the set of free objects). The Environment is updated such that v maps to y in the Store. The Occupants function shows that y joins the set of objects O in r, and that the protection count n for r is not modified. This transformation rule guarantees that a value will be produced by AllocFromRegion. The allocated memory is produced from free memory

that is associated with the Store and managed by a region r. Since our abstract system has unbounded memory, the case of running out does not occur. Memory produced from the free area is removed from that area, which prevents live memory from being accidentally (and incorrectly) allocated twice. Invariant 1 holds after this transformation, since the transformed program only produces memory from ϕ and does not perform any reclamation. The bijection holds for this transformation, as can be seen from the semantically equivalent states of new and AllocFromRegion; this also means that Theorem 1 holds. The two rules are copied below:

S⟦v = AllocFromRegion(r)⟧(σ, ρ, τ, ϕ, ψ) =
    (σ[y ↦ λs.default], ρ[v ↦ y], τ[ψr ↦ (O ∪ {y}, n)], ϕ \ {y}, ψ)
    where y = select(object ∩ ϕ)
          (O, n) = τ(ψr)

S⟦v = new()⟧(σ, ρ, τ, ϕ, ψ) =
    (σ[y ↦ λs.default], ρ[v ↦ y], τ[ψG ↦ (O ∪ {y}, n)], ϕ \ {y}, ψ)
    where y = select(object ∩ ϕ)
          (O, n) = τ(ψG)

The only difference between these rules is where the allocated memory y comes from, which is why the τ components differ between the systems. In Go's system the memory y is produced from the global region G, whereas in our RBMM system the memory is produced from the specified region r. From the perspective of an abstracted system, both r and G are sources of memory, and thus the detail of where allocated memory comes from does not affect the semantic equivalence, since the values themselves are not modified. The result is the same: both functions return a piece of memory from the Store and bind it to a variable, v or v′, in the Environment. Our system protects the contents of a region via protection counters. If an object allocated from a region is reachable later, and that region is passed to a callee, the callee must somehow be told that it cannot reclaim that region. The purpose of a region's protection counter

is to tell the callee that the region is needed later by a caller higher in the call chain, and that the callee therefore cannot reclaim it. We insert IncrProtection and DecrProtection in pairs: the former is placed before a function call and the latter just after. This ensures that if an object from a region is needed after the call, then the region's counter will be non-zero when entering the callee; hence the callee is unable to reclaim the region. After the callee completes, the counter is decremented and thus has the same value it had before the call. If a region is needed for an object, then that region is passed along with the object to the callee; that is just what our function application transformation accomplishes. Thus the allocated object travels alongside its region and cannot be removed by the callee unless the region's protection counter is zero. Therefore, the object will not be reclaimed if it, or any of the other objects from its region, is accessed after the callee returns. Our unification analysis associates objects that can be accessed from other objects with the same region. Therefore, the region of any object referred to directly or indirectly cannot be reclaimed while any of the region's occupants can be reached later. This unification ensures that when a region can be reclaimed, the objects belonging to it are no longer needed. The transformation preserves Invariant 1, since our system will not remove an allocated object that is needed later. Theorem 1 holds because the semantics of the program are unchanged: all allocations remain for the program to use as long as they can be accessed later. The only way an allocation will be removed is if the protection counter of its region is zero, meaning that the region's occupants will not be accessed later.
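The following self-contained Go sketch illustrates this discipline. The names are hypothetical, and the increment/decrement calls are written out by hand here, whereas in our system the compiler inserts them around the call site:

    package main

    import "fmt"

    // region is a minimal stand-in: just a protection counter and a flag.
    type region struct {
        protect int
        live    bool
    }

    func incrProtection(r *region) { r.protect++ }
    func decrProtection(r *region) { r.protect-- }

    // removeRegion mirrors the semantic rule: a protected region survives.
    func removeRegion(r *region) {
        if r.protect == 0 {
            r.live = false
        }
    }

    // callee tries to reclaim every region passed to it (invariant 3).
    func callee(r *region) {
        removeRegion(r) // no-op while the caller holds a protection
    }

    func main() {
        r := &region{live: true}
        incrProtection(r) // inserted: r's occupants are used after the call
        callee(r)
        decrProtection(r)
        fmt.Println("live after call:", r.live) // true: callee could not reclaim
        removeRegion(r)                         // now succeeds
        fmt.Println("live at end:", r.live)     // false
    }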

4.3.5 Region Merging

Region merging is an optimization we implemented in our original proof-of-concept described in Chapter 3. There are cases, as mentioned in Section 3.13, whereby two regions become associated at runtime and must

be considered the same. The relevant semantic rule is:

S⟦MergeRegion(r1, r2)⟧(σ, ρ, τ, ϕ, ψ) = (σ, ρ, τ[ψr1 ↦ r̂], ϕ, ψ[r2 ↦ ψr1])
    where (O1, n1) = τ(ψr1)
          (O2, n2) = τ(ψr2)
          r̂ = (O1 ∪ O2, n1 + n2)

The MergeRegion operation takes two region identifiers and performs a union of their contents. In this transformation the contents of r1 remain, and the contents of r2 are copied (unioned) into r1; r̂ represents the union of both regions. Once r2 has been copied, its contents are reclaimed, and r2 is updated to point to the same data as r1, namely r̂. This rule is why we must distinguish between a region header and region contents. Because both r1 and r2 now point to the same region, any subsequent region operation on either will manipulate r̂, to which they both refer. The protection counting scheme remains correct: for every increment operation, IncrProtection, there is a matching decrement operation, DecrProtection. Recall that a counter is incremented before a function call f if the region is needed after that call; if a counter is incremented, then it has a matching decrement immediately following the call to f. In the case of a merged region, since both of the identifiers r1 and r2 still remain, all of the decrement operations that would have been encountered will still be visited. Since these identifiers now refer to r̂, which carries the summed protection counters of r1 and r2, our scheme of matched increment and decrement operations still holds, and thus r̂ will be removed when none of the occupants O of r̂ are reachable any longer. Additionally, MergeRegion preserves our correctness statements, as no objects are allocated and no regions are removed. The merged regions still adhere to the semantics of the previous transformations, which we have shown to satisfy

Theorem 1.

4.4 Conclusion

We have shown that the memory-facing part of the language affected by the RBMM transformations preserves Invariant 1. None of the transformations changes the values or shapes of the program's objects. Because Invariant 1 is preserved, no reachable objects are reclaimed. The bijection shows that the values and shapes of objects are preserved; therefore, the semantics of programs under the two different memory systems are equivalent. This shows that our system is correct with respect to Theorem 1.

Chapter 5

Combining RBMM and GC

But wait! There’s more!

Billy Mays

5.1 Introduction

RBMM is not without its limitations. In Chapter 2 we discussed the region bloat problem, whereby a region might contain numerous objects, the majority of which are dead (never used again). In such a case, the memory footprint of the process is larger than it needs to be. Since a region cannot be reclaimed until the program point where none of its objects is referenced again, a region might be kept alive by a single object that is referenced at a later program point. In this chapter we introduce several enhancements to the RBMM system described in Chapter 3. First, we modify our unification algorithm to allow a single object to belong to multiple regions if the object contains pointers to other data types. This enhancement opens up the possibility of smaller regions that do not have to contain the encapsulating data structure. Secondly, we improve a process's memory footprint by addressing the memory bloat and global region problems. For the latter we introduce a garbage collector that is capable of reclaiming dead objects from regions. This combination of RBMM and GC aims to achieve the advantages of both systems while avoiding the disadvantages. Along the way we present three additional contributions:

• We propose a new way of combining GC and RBMM, with less overhead than similar systems [38]. Our design treats each region as being divided into a to and a from semi-space, and uses the Big Bag of Pages [71] concept for managing the type information of allocated objects.

• Our algorithms support partial collections: recovering memory from some regions, but not all.

• We enable the collection of segments of arrays. In languages that support slices, it can happen that some elements of an array are reachable while others are not. We show how to recover the memory occupied by the dead elements at a low cost.

Before we discuss the design of our region-aware GC, we first mention some changes made to our RBMM system since Chapter 3. These changes are discussed in the next section, with our combined RBMM and GC design immediately following in Section 5.3.

5.2 Enhancing the Analysis

Our RBMM system uses a unification method that places items in the same region if there is a points-to association between them. This means that any item containing a pointer field will also share a region with the data that field points to. While this field-insensitive unification is simple to implement, it can be made more precise. Recall that we want to free items as soon as possible to produce a smaller memory footprint. In the case of a structure that contains pointers to other items, we want to allocate the data for its pointer fields from separate regions. This produces a situation where a structure can be reclaimed in pieces. For example, consider a linked list: if we can separate its skeleton from its data, then the skeleton can be reclaimed separately from the data. To accomplish this enhancement we modify two analysis rules from Chapter 3. We repeat those rules in Figure 5.1; our modified rules are shown in

Figure 5.2. The rule change itself is small; however, the cost of implementing the modification was considerable. For instance, functions no longer require a single region per argument; instead, they may require multiple regions per argument (including any regions that might be needed for the function's return value).
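As a hypothetical Go illustration of the gain: under the original rules the payload reached through data would share a region with the node that points to it, while the modified rules let R(n.data) and R(n) differ, so a list's skeleton and its payloads can be reclaimed separately:

    // Hypothetical types only; the region assignments in the comments
    // are what the modified constraint rules of Figure 5.2 would permit.
    type payload struct {
        bytes [64]byte
    }

    type node struct {
        data *payload // R(n.data): payload region, reclaimable on its own
        next *node    // R(n.next): skeleton region shared by the spine
    }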

5.2.1 Increasing Conservatism for Higher-Order Func- tions

In our original implementation discussed in Section 3.9 we inserted trampo- line functions that would dynamically merge two regions at runtime if any of the variables between regions became associated to each other. We noticed that runtime merging was harming the system’s performance. Additionally complicating matters is the fact that a non-global region might get merged with the global region. This is dangerous, as it means that all data from the non-global region can never be reclaimed, not even by Go’s existing garbage collector. This occurs because the memory has been allocated from our RBMM allocator and then gets merged and becomes managed by the global region. Since the global region is managed by Go’s existing garbage collector, and it never allocated these addresses, it cannot reclaim them. Therefore, runtime merging can effectively introduce memory leaks into a program. To eliminate these problems we relaxed our semantics such that all vari- ables that are passed as arguments to closures belong to the global region. Figure 5.3 describes this change in terms of our semantics. This rule applies only to closures. This is overly conservative but means that Go’s garbage

S⟦v1 = v2.s⟧ρ = (R(v1) = R(v2))
S⟦v1.s = v2⟧ρ = (R(v1) = R(v2))

Figure 5.1: Original constraint rules

S⟦v1 = v2.s⟧ρ = (R(v1) = R(v2.s))
S⟦v1.s = v2⟧ρ = (R(v1.s) = R(v2))

Figure 5.2: Modified constraint rules

S⟦v0 = f(v1 ... vn)⟧ρ = ⋀_{i=0..n} (R(vi) = G)

Figure 5.3: Added constraint rule for closures

collector can be used to allocate and manage their memory (reducing memory leaks). It also means that we eliminate the overhead of runtime merging, effectively replacing that overhead with the cost of a garbage collector. In this chapter we introduce a region-aware garbage collector, so as to enable collection of data from the global region and avoid relying on Go's existing collector for such matters.
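For example, in the hypothetical Go fragment below, the call through the function value f falls under the rule of Figure 5.3, so the result and every argument are constrained to the global region G:

    // A call through a function value (a closure): under Figure 5.3 the
    // result v0 and all arguments v1..vn are constrained to the global
    // region, i.e. R(sum) = R(xs) = G, so their memory is managed by the
    // garbage collector rather than by a local region.
    func apply(f func([]int) int, xs []int) int {
        sum := f(xs)
        return sum
    }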

5.3 Combining Regions and Garbage Collection

Recall that automated memory management can be implemented using either RBMM or GC. Since an RBMM system does not require a scan of the program's memory at runtime, it can result in a faster-running executable than a GC system. However, RBMM suffers from the region bloat problem discussed in Chapter 2; in other words, there is a time-space trade-off between the systems. The situation is, however, more complicated than that. Ideal RBMM should not only produce smaller execution times, but should also have the potential to use less memory than GC. This potential is due to two facts:

• RBMM needs considerably less memory for its own bookkeeping, and

• RBMM can decide what memory to free based on what the program will need in the future, rather than on what it can currently access.

In principle, RBMM should be able to reclaim memory in relatively small chunks, resulting in a flatter memory footprint. However, there will always be programs exhibiting behaviors that favor GC. As discussed in Section 2.3.5, object lifetime is an undecidable property of a program [54, 66]. Our analysis relies on program points to determine whether an allocated object will be used later. Since allocations are associated with pointer objects, and the lifetime of a memory item is in general undecidable, region analysis must conservatively approximate the item's lifetime. In some cases the approximation is too conservative, creating long-lived regions in which many items are no longer needed. In particular, items referred to by global variables are placed in a region which is kept until the program exits. Several previous studies [37, 7] have shown that such long-lived regions can accumulate large numbers of now-dead objects beside some live ones, increasing the program's memory footprint significantly, and in some cases beyond the limit of acceptability. In Chapter 3 we treated the region containing global variables specially by managing it with Go's existing built-in GC system, but this approach has two shortcomings. First, it does nothing to reclaim unused items in other long-lived regions; second, it leaves us with two non-interoperable memory management systems. The second point matters because it prevents us from implementing a potentially important optimization. Recall the region merging discussion in Section 3.13: in certain cases, one code path may permit two regions to be kept separate, while a less common code path may require the analysis to consider them the same region. Keeping them separate may permit one region to be reclaimed much earlier than the other, so we would prefer to do this; however, we cannot do so unless we can merge the two regions at runtime. Therefore, we cannot merge regions when a subset of the allocations depends on Go's existing garbage collector. In this chapter we modify our RBMM system to allow the contents of regions (especially long-lived regions) to be garbage collected. We still expect

that most regions will be short-lived, and that most items will be recovered without GC when the regions containing them are removed. Since we expect GC operations to be the exception and not the rule, we want region operations (the creation and destruction of regions, and allocation of memory from regions) to be as fast as they are in our RBMM system presented in Chapter 3. The existence of the GC system should not have a significant impact on the performance characteristics of regions that are never garbage collected. A secondary goal is to make our GC system a moving collector (see Section 2.3.2), to improve locality of reference and to reduce internal fragmentation.

Figure 5.4: Region data structure (the region header contains the zone headers; each zone header points to a typeinfo and to a list of flexipages)

5.4 Managing Ordinary Structures

This section describes changes to our RBMM design that facilitate the introduction of a region-aware garbage collector. We want our garbage collector to be a moving collector. Such a collector needs to know which parts of each item are pointers and which are not. The

simplest way to give it this information is to include type information next to every item in every region [38]. However, adding a type description next to every item would significantly increase memory consumption. Since each region will contain values of only a limited set of types, we can greatly reduce the space overhead of type information by storing the description of each type that can occur in the region just once, and associating all values of that type in the region with that description. In other words, we split each region into a set of zones, with one zone per type that can appear in the region. If N types can appear in a region, then this scheme costs us the memory occupied by N zone headers, each of which is 72 bytes in size (the zone header's fields are discussed below). Typical values of N (from our test cases) range from 1 to 25, so the typical overhead ranges from 72 to 1,800 bytes per region. This is a fixed cost. On the other hand, the memory savings that this scheme allows scales with the amount of data in the region. For example, if a region contains 100,000 8-byte items (800 KB total) and 200,000 24-byte items (4.8 MB total), then, by not having to identify the type of each item with an 8-byte pointer to type information, we save 300,000 × 8 bytes = 2.4 MB, over 40% of the 5.6 MB of item data. In other cases the percentage will differ, but it should be clear that our design saves not just significant amounts of memory, but also the time needed to fill in that memory. Region inference can give us, for each of the regions it creates, the set of types whose values may appear in the region. In fact, it is guaranteed to do so, unless the program uses language constructs (such as interfaces) that introduce polymorphism and thus hide the actual types of some values from the compiler. Section 5.6 discusses how to deal with interfaces; until then we assume interfaces are absent, and that we do know the set of types in each region. Figure 5.4 shows the effect of splitting a region into a set of zones, one for each type in the region, each zone holding all the items of that type in

the region. Each zone consists of a list of flexipages, and has a header that contains the following slots:

• A pointer to the header for the whole region.

• A pointer to the next free byte of data on a page within the zone which can be used to fulfill the next allocation request. This is a small optimization eliminating the need for each page to have a pointer to its next free byte.

• A pointer to a description of the type of the items in the zone, which we call a typeinfo. Typeinfos are read-only data structures created by the compiler. Each typeinfo records the size of the type, whether the type is a pointer, and a flag indicating whether the type is a special built-in (e.g., a slice). Three other values in the typeinfo are useful for managing structures: the number of fields, the offset of each field, and an index into the typeinfo table referring to the next field in the structure.

• The number of items of this type that fit in a single page, i.e. in a flexipage of the minimum size. This is calculated from the page size, the size of flexipage headers, and the size of each item. We will show the exact formula later.

• A pointer to the start of the most recently allocated flexipage in the zone. Note that we also maintain a pointer to the last page in the zone’s flexipage list. This allows us to quickly return this list to the global freelist of pages when we reclaim the zone. To aid cache locality, we treat a zone’s flexipage list like a stack. The most recently used page is at the top of the stack, and after zone reclamation the freelist will have that most recently used page at the top of its stack. Upon the next request for a free page, the top of the freelist will be returned which happens to be the most recently used freed page.

• During a collection, the pointer above defines the list of flexipages that act as the from-space. We also have a corresponding pointer that serves to define the to-space; this second pointer is used only during collections. As with the former, we also maintain a pointer to the last page in the to-space, for the reasons discussed in the previous bullet.

• A flag telling us whether the zone contains a special data structure that we garbage collect differently from all other types (e.g., slices). Note that this field is also in the typeinfo, so we should be able to optimize it away.

A zone in this modified approach therefore acts much as a region did in Chapter 3, except that all items in a zone are of the same type. The region becomes just a container of zones, and the memory distributed via region allocation comes from a flexipage located in the appropriate zone of the region. The region header contains the following (a combined sketch of both headers follows the list):

• The number of zones in the whole region. This allows our runtime system to manage a region and all of its zones properly. For instance, when a region is deleted, our runtime system must know how many zones a region has so that each zone in a region’s array of zone headers can be visited and removed.

• The region's protection counter, which prevents premature region removal.

• Two bits that are needed only during collections. These bits permit our GC algorithm to selectively collect from a subset of all regions within the program. The RegionBeingCollected bit is set iff the collection is attempting to recover memory from the region, while the RegionNeedsTracing bit is set iff the GC algorithm needs to traverse the contents of the region in order to find reachable items in the regions being collected. Note that the implementation experimented

with in this chapter uses just a single boolean flag to determine whether a region has been or is being garbage collected. When we build our system without our GC, this flag is absent from the resulting build.

• An array of the headers of the region’s zones.

• An identifier which is used both as a sanity check and to tell whether the region is the global region. When a region is removed, this value is set to a known constant so that our system can be sure it is not allocating from a region that has already been removed (this would be an error); we also do not attempt to remove a region that has already been removed. In addition, our system must never try to remove the global region, as it lasts the duration of program execution. Our analysis cannot always tell at compile-time whether a region will be the global region, so we require this sanity check.

• A pointer to the next free region. This helps our runtime system maintain a list of free regions that can be reused. This value is NULL for regions that are in-use.

• A size which reflects how big the region and its array of zone pointers are. This, together with the previous item, helps our runtime system manage the list of free regions that can be reused when a CreateRegion() operation is called.
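To summarize the two headers, here is a sketch in Go; the field names and types are ours (the actual implementation lives in the runtime's C code), but the slots correspond one-for-one to the lists above:

    package rbmm

    import "unsafe"

    // Flexipage and TypeInfo stand in for the structures described in
    // Sections 5.4.4 and 5.4.5; only the fields needed here are shown.
    type Flexipage struct{ next *Flexipage }
    type TypeInfo struct{}

    // ZoneHeader mirrors the zone-header slots listed above.
    type ZoneHeader struct {
        Region   *RegionHeader  // back-pointer to the whole region
        NextFree unsafe.Pointer // next free byte for the next allocation
        Info     *TypeInfo      // compiler-generated, read-only typeinfo
        PerPage  int            // items per minimum-size flexipage
        From     *Flexipage     // most recently allocated page (from-space)
        FromLast *Flexipage     // last page, for O(1) release to the freelist
        To       *Flexipage     // to-space head, used only during GC
        ToLast   *Flexipage     // last to-space page
        Special  bool           // zone holds a specially collected type
    }

    // RegionHeader mirrors the region-header slots listed above.
    type RegionHeader struct {
        NumZones             int
        Protect              int           // protection counter
        RegionBeingCollected bool          // GC-only bits
        RegionNeedsTracing   bool
        ID                   uint64        // sanity check / global-region marker
        NextFree             *RegionHeader // freelist link; nil while in use
        Size                 uintptr       // header plus zone-array size
        Zones                []ZoneHeader  // the region's zone headers
    }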

Our original proof-of-concept discussed in Chapter 3 placed a pointer to the next free byte of a flexipage in the flexipage header. This pointer is redundant: our system needs only one next-free pointer per zone, stored in the zone header as described above. The modification we have implemented therefore eliminates one pointer from each flexipage header. While this is not a grand saving of memory, every little bit helps reduce our system's overall footprint.

5.4.1 Creating a Region

One of the jobs of region transformation is to insert code to create regions just before the points in the program where the region analysis determines that those regions are first needed. The tasks of the code to be inserted are:

• to allocate memory for the header of the new region,

• to initialize all the components of the region header, and

• to return the address of the header.

As shown in Figure 5.4, the size of the region header is a simple function of the number of zones in the region. In contrast to at least one other RBMM system (Mercury's), we cannot put the region header at the start of the first flexipage of the region [63], because a region with n zones effectively has n "first" flexipages. We could pick one, but we would have to treat it differently from the others (for example, that flexipage would have room for fewer items than all other flexipages in its zone). We sidestep these problems by allocating region headers from a memory pool (HeaderPool) that is separate from the pool that supplies the flexipages for zones (PagePool). When a call to a region creation operation is inserted into the program at compile-time, the number of zones that the region must contain is passed as a static argument; when the region is created at runtime, this value is used to allocate the proper number of zones. To reduce memory overhead, all zones are created with NULL pointers to their flexipages. When the first request for memory from a zone is made, the zone will request data from our runtime system or from a free page in our allocator's free-page list (freelist). The zone must also contain the type information (typeinfo) for the allocations it makes; such information provides the size of the data type and the offsets of its structure fields. Our static analysis generates a typeinfo table that is inserted into the binary and is available at runtime. During zone

initialization at runtime, the zone's typeinfo is set to point to the proper typeinfo in the table. The following pseudocode outlines our new CreateRegion() function:

    CreateRegion(int n_zones) {
        reg *r = malloc(sizeof(Region) + sizeof(Zone) * n_zones);
        UpdateRegion(r);  /* Set up the region's fields */
        for (int i = 0; i < n_zones; i++) {
            r->zones[i] = NewZone(i);
        }
        return r;  /* hand the new region back to the caller */
    }

5.4.2 Allocating from a Region

In traditional RBMM systems, each allocation (the equivalent of a call to malloc) specifies the region from which the new item should be allocated by providing a pointer to the header of that region. In our system, the parameter list of the allocation function includes not just a pointer to the region header, but also the zone number that corresponds to the allocated item's type in that region. From that, the allocation function can look up the zone's header, and the typeinfo for the type, which gives the size of the item and thus the number of bytes to be allocated. The allocation function takes this number of bytes from the last allocated flexipage of the zone if it has room; if it does not, or if the zone has no allocated flexipages yet, it first allocates a new flexipage and adds it to the zone. Determining which zones each region must have is an added responsibility of the compile-time region analysis. Note that this must be a global analysis, since different modules may require the inclusion of different types in each region. Further complicating matters, each allocation must specify a single offset in the region structure to find the appropriate zone for that allocation. This offset must be correct for every region that may be used for that allocation, so the offset for each type allocated in a function must

be consistent among all regions that may be used for that allocation in that function. Ensuring this, while minimizing the number of zones in each region, is a complex optimization problem. The implementation discussed in this chapter sidesteps this complexity by requiring every region to contain a zone for every type that can belong to a region. This is a gross approximation, as it creates many regions with unused zones; however, the approach is easy to implement and allows us to explore the combination of RBMM and GC.

5.4.3 Reclaiming a Region

Reclaiming a region is simple: we release every flexipage in every one of the region’s zones back to PagePool, and we release the region header back to HeaderPool.

5.4.4 Finding TypeInfos

Since we want to use a type-accurate (non-conservative) collector, we need to be able to find the type of an item from its address. To this end, we maintain a data structure we call the zone-finder, which is a variant of the BIBOP or big bag of pages idea [71]:

• Every item the collector needs to trace on the heap is stored in a flexipage of a zone of a region.

• Both the size and the starting address of every flexipage is an integer multiple of the standard page size. (That is, flexipages are aligned on page boundaries.)

• Conceptually, PagePool, the pool from which flexipages are allocated, is a contiguous sequence of pages.

• We pair every page in PagePool with a shadow word in a new pool, ZoneFinderPool, which is the zone-finder.

– If a page in PagePool is not currently in use, then the shadow word corresponding to it will be NULL.

– If a specific page in PagePool is currently in use as the first page of a flexipage in zone z of region r, then its shadow word will point to the zone header for z. From there, we can reach both the header of region r and the typeinfo describing the type of the items stored in zone z of region r.

– If a specific page in PagePool is currently in use as a non-first page of a flexipage in zone z of region r, then its shadow word will be a pointer to the shadow word corresponding to the first page of that flexipage, tagged to indicate that it points to a shadow word rather than a zone header. This is necessary because our algorithm for address-to-zone mapping treats the high bits of the address as an index into the map, and the index calculation assumes all pages are of uniform size (currently 4 KB). Recall that flexipages may be allocated in sizes larger than 4 KB if the program requests an extra-large allocation; our mapping is aware of this, and sets the tag bit in the map as mentioned above, allowing us to locate the true page start rather than landing in the middle of an extra-large flexipage.

Since zone headers and shadow words are both stored at aligned addresses, we use the least significant bit as a tag to distinguish between the last two cases.
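A minimal sketch of the zone-finder lookup in Go, under the assumptions that pages are 4 KB, that the shadow words live in one flat slice, and that poolBase is the start address of PagePool; all of these names are ours:

    package rbmm

    import "unsafe"

    const pageSize = 4096

    var (
        poolBase   uintptr   // start address of PagePool (assumed)
        zoneFinder []uintptr // one shadow word per PagePool page
    )

    // findZoneHeader maps an item address to its zone header using the
    // shadow words: a zero word means the page is unused; a word with the
    // low tag bit set points at the shadow word of the flexipage's first
    // page; otherwise the word is the zone header address itself.
    func findZoneHeader(addr uintptr) uintptr {
        w := zoneFinder[(addr-poolBase)/pageSize]
        if w == 0 {
            return 0 // page not currently in use
        }
        if w&1 != 0 { // tagged: follow to the first page's shadow word
            w = *(*uintptr)(unsafe.Pointer(w &^ 1))
        }
        return w // address of the zone header
    }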

Conceptually, PagePool and ZoneFinderPool are arrays with corresponding elements. However, if we want the pools to grow beyond their initially allocated sizes, we must allow them to be stored non-contiguously. For our purposes, almost any of the many possible ways of simulating contiguous memory will do. Our implementation represents PagePool as a sequence of pages, and ZoneFinderPool as a large statically allocated array. The latter is fine for experimental purposes, but is a limitation that

should be lifted to make our system more flexible for real-world use.

5.4.5 Managing Redirections

We garbage collect each zone using a semispace algorithm [27]; that is, we copy every reachable item out of the flexipages currently allocated to the zone (the from-space) into a fresh set of flexipages (the to-space). When this traversal of reachable items arrives at an item, it needs to know whether that item has been copied to the to-space yet. (Copying a reachable item to the to-space several times would change the aliasing between items, which would be incorrect.) We need one bit per item for this information. These bits are required only during GC, and could thus be kept in temporary data structures, but the management of those data structures would take extra time. To avoid this and to keep the algorithm simple, we reserve space for these bits in each flexipage. The space cost is usually quite small, 1% or less: one bit per item, where each item is virtually always at least 64 bits in size, and most often 128 bits or more. The structure of each flexipage is therefore:

• a fixed size flexipage header, which includes the size of the data portion of the page excluding the flexipage header size,

• an array of n redirection bits, one bit per item,

• any padding required to align the following items, and

• an array of n items.

The formulas for computing n, and the number bi of padding/alignment bytes before the first item, are:

n  = ⌊ ((bytes per flexipage − bytes per header) × 8) / (1 + bytes per item × 8) ⌋

bi = ⌈ (bytes per header + ⌈n/8⌉) / alignment ⌉ × alignment

where all items begin at an address divisible by alignment.
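The same computation, as a small Go sketch (parameter names follow the formulas; the function is illustrative, not taken from our runtime):

    // flexipageLayout computes n, the number of items per flexipage, and
    // bi, the byte offset of the first item. Each item costs its size in
    // bits plus one redirection bit; the header and bits area are then
    // rounded up to the item alignment.
    func flexipageLayout(pageBytes, headerBytes, itemBytes, alignment int) (n, bi int) {
        n = (pageBytes - headerBytes) * 8 / (1 + itemBytes*8)
        bitBytes := (n + 7) / 8 // ⌈n/8⌉ bytes of redirection bits
        bi = (headerBytes + bitBytes + alignment - 1) / alignment * alignment
        return n, bi
    }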

Algorithm 1 Preserve data in an item
Require: base: the address of the start of the item to preserve
Require: type: the type of that item
Require: fpp: points to the flexipage containing that item
Require: zhp: points to the header of the zone containing that item

function Preserve(base, type, fpp, zhp)
    size ← SizeOf(type)
    newbase ← AllocFrom(ToSpace(zhp), size)
    CopyMemory(newbase, base, size)
    RedirectBit(fpp, base) ← True
    ∗base ← newbase                  ⊲ Set redirect pointer
    return newbase

Given the start address of a flexipage, address arithmetic can compute the location of the RedirectBit for an item in that flexipage, and vice versa. Between two GC cycles, each RedirectBit in each flexipage contains the value zero. When the traversal encounters a reachable item whose RedirectBit is zero, it copies the item to the to-space and sets the item's RedirectBit to one. To let later parts of the traversal know not just that the item has been copied but also where it has been copied to, the traversal also records the address of the item's copy in to-space in the first word of the item. (It is okay to overwrite any part of the user data stored in the old copy, since it will not be referred to anymore.) All this is shown in Algorithm 1. Of course, this assumes that all items are big enough to hold a pointer; this is why our system allocates a word (the size of a pointer) even for requests that ask for less memory than that. It is not alone in this: virtually all other memory management systems do the same, including the usual implementations of malloc. When a GC cycle is complete, all the pages of all the flexipages of the collected regions are returned to the freelist of PagePool. Before any flexipage is reused, all its bits are set to zero, including its redirection bits.

5.4.6 Collecting Garbage

Our region-aware garbage collector is set up to perform a collection once a specific number of bytes has been allocated. Since our RBMM system manages memory, we know when a program has allocated this much. This memory limit is an increasing value, initially set to twice the default page size; we define the default page size as 4096 bytes, a common page size for a 64-bit Linux kernel. Once a GC completes, the upper bound is doubled. For example, the first GC commences once 8192 total bytes have been requested; the next GC triggers at double that value, in this case 16384 bytes, and so on. This perhaps conservative strategy allows us to better study the impact of our region-aware garbage collector; studying alternative collection-trigger strategies could prove useful, but is reserved for future implementations. The top level of our GC algorithm is shown in Algorithm 2. This algorithm is the main driver of our GC and is responsible for determining whether a particular region may be collected from; it is also responsible for swapping the to- and from-space pointers of the zones that are collected. The first parameter to the GC algorithm is the root set, i.e. the set of all the registers, stack slots and global variables that may contain pointers to items in regions. (We start by making copies of the original register values in memory, and copy the possibly-redirected values back to the registers when we are done.) The second parameter specifies the set of regions from which this invocation of the collector should recover memory. Currently we collect from all regions; however, we explain below a more advanced algorithm which permits collection from only a subset of the regions. This set need not be the set of all regions: if the runtime system responsible for controlling the collection process expects that some regions have very little garbage, it can omit them from GC regions. A region left out of GC regions will still be traversed (traced) by our algorithm if such traversal may lead to reachable

items in regions which are in GC regions, but

• the collector will not need space to store copies of all the reachable items in those regions, reducing memory requirements when those requirements are otherwise at their peak, and

• the collector will not need to spend any time copying all the reachable items in the regions to the to-space, and updating all the pointers to the moved items.

The algorithm starts by recording, in each region header, whether the region is being collected in this collection, and whether it needs to be traced. After that, Algorithm 2 finds all items in the collected regions that are reachable from the roots, using Algorithms 3 and 4, which we discuss below. Together these algorithms preserve each reachable item in a collected region by copying it from its original location in a from-space flexipage of one of the region's zones to the to-space of that zone, which consists of its own list of flexipages. Once all reachable items have been copied, and the pointers to them updated to point to the new copies, the algorithm releases the memory occupied by the zones' original flexipages. In other words, when collection is complete the from-space pages are reclaimed by returning them to the free-page list, so that they can be reused for other zones as needed. Algorithm 3 locates reachable items in the regions being collected and copies them to the to-space of their zone. The algorithm maintains the invariant that any traced item that has its redirect bit set does not need to be traced again; this prevents tracing a cyclic structure forever. Another invariant is that all items that are collected will have their redirect bit set, which ensures that after GC completes, all pointers to collected items refer to the copies in the to-space. The PreserveAndTrace function is invoked not with the address of the item it is to preserve and trace, but with a pointer to that address, so

Algorithm 2 Garbage collect from regions
Require: Roots: The set of root variables
Require: GC regions: The set of regions to collect
function GC(Roots, GC regions)
    for all rhp ∈ all regions do
        RegionBeingCollected(rhp) ← rhp ∈ GC regions
        RegionNeedsTracing(rhp) ← some region in GC regions is reachable from rhp
    for all root ∈ Roots do
        PreserveAndTrace(root, True)
    for all rhp ∈ GC regions do
        for all zhp ∈ ZonesOf(rhp) do
            Free(FromSpace(zhp))
            FromSpace(zhp) ← ToSpace(zhp)
            ToSpace(zhp) ← nil

that if and when it needs to move the item, it can update the address that pointed to it. When it is invoked, addrptr will point either to a root (such as a global variable or a stack slot containing a pointer), or to a part of the heap that itself contains a pointer. The pointers to roots supplied by Algorithm 2 always point to the start of a root item, as promised by toplevel = True; pointers supplied by tracing may point inside (i.e. not at the start of) items, as allowed by toplevel = False. For example, a pointer-valued field within a structure need not be at the start of the structure; it could be somewhere in its middle. Since we do not have per-field redirect bits, our system can only detect whether the whole object has been redirected or not.

The value of addr may or may not point into the heap, which in our case means "into one of the regions." If it does, then we can use the data structures described in Section 5.4.4, represented here by the function LookupHeap, to find out the address of the flexipage containing the item at addr. From that, the function can use address arithmetic to compute base, the address of the start of the item (addr may point into the middle

Algorithm 3 Preserve an item and everything it can reach
Require: addrptr: Pointer to the address of an item
Require: toplevel: Is the call coming from Algorithm 2?
function PreserveAndTrace(addrptr, toplevel)
    addr ← ∗addrptr
    if addr is in the heap then
        ⟨fpp, base, zhp⟩ ← LookupHeap(addr)
        type ← TypeIn(zhp)
        offset ← addr − base
        rhp ← ContainingRegion(zhp)
        if ¬RegionBeingCollected(rhp) then
            newbase ← base                  ⊲ Item is not moved
            needstrace ← RegionNeedsTracing(rhp)
        else if ¬RedirectBit(fpp, base) then
            newbase ← Preserve(base, type, fpp, zhp)
            ∗addrptr ← newbase + offset
            needstrace ← RegionNeedsTracing(rhp)
        else
            newbase ← ∗base                 ⊲ Get redirect pointer
            ∗addrptr ← newbase + offset
            needstrace ← False              ⊲ Has been traced already
    else if addr is not null then
        ⊲ if addr is not in the heap, it must refer to a root
        if toplevel then
            newbase ← addr                  ⊲ Top level refs point to the start
            type ← TypeOf(addr)
            needstrace ← True
        else
            needstrace ← False              ⊲ A top level call will trace it
    if needstrace then
        Trace(newbase, type)

of the item). If s is the size of the items in the flexipage, and bi is the offset of the first item within the flexipage, then

    base = fpp + bi + s ∗ ⌊(addr − (fpp + bi)) / s⌋

We need to know base because if we copy the item, we must copy all of it. If the item ends up moved, the updated pointer must point to the same offset within the item as it did before.
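In code, this computation is a single line. The sketch below assumes the three quantities fpp, bi and s have already been obtained from the flexipage and zone data structures; the function name is ours.

    package rbmm

    // itemBase maps an interior pointer to the start of its item.
    // Integer division supplies the floor in the formula above.
    func itemBase(addr, fpp, bi, s uintptr) uintptr {
        return fpp + bi + s*((addr-(fpp+bi))/s)
    }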

Given the flexipage pointer, LookupHeap can also use the zone-finder to find the identity of the zone containing the flexipage. We can then follow the pointer in the zone header to the header of the region containing it. If this region is not being collected, then the item will survive the collection, at its current address, without us doing anything (though we may still need to trace any pointers inside the item). If this region is being collected, then Algorithm 2 will free all the flexipages of all the zones of the region, and we must copy the item to the corresponding to-space, unless this has already been done. If the redirect bit says that it has not yet been done, then we call Preserve, the function in Algorithm 1, to copy it to the zone’s to-space. Preserve returns the new address of the item, and we set the original pointer to the item to refer to the original offset from this new address. Preserve also records both the fact that the copying has been done (by setting the redirect bit corresponding to this item in its flexipage) and the address of the new home of the item (in the first word of the item). So the next time the traversal reaches this item, the redirect bit will tell us that we do not need to copy the item again, and that we can instead pick up the new address of the item from the first word in its old copy. In this case, the traversal will also have traced all the pointers inside the item, so we need not process them again. We can similarly skip the processing of the pointers inside the item if the item is in a region from which no region being collected can be reached either directly or indirectly.

Since we do not garbage collect the places that may contain roots, i.e. the stack, the global variables, and the registers, we need not concern ourselves with protecting any item that is not in the heap against being moved. Since Algorithm 2 will eventually invoke Algorithm 3 on every root, we need not trace roots when we reach them by following pointers in items. This is just as well, since those pointers may point inside roots that are structures, and

Algorithm 4 Trace an item and preserve all items it can reach
Require: base: Pointer to the start of the item
Require: type: The type of the item located at base
function Trace(base, type)
    for all aioff ∈ AddrsInside(type) do
        aiaddr ← base + aioff
        PreserveAndTrace(aiaddr, False)

finding the starts of those structures would be far from trivial. The last step of Algorithm 3 is to invoke Algorithm 4 on roots and items on the heap that (a) may contain pointers that lead, directly or indirectly, to reachable items in the regions being collected, and (b) are not known to have been traced before. The traced items may be pointers, for which the AddrsInside function should return the offset 0. Or they may be structures containing pointers, for which it should return the offsets of all the pointer-valued fields inside the structure, whether they are fields of the structure itself or of its parts. We then traverse the items all these pointers point to. Note that we trace the pointers in the version of the item in the to-space, not the from-space. That is because in the from-space, the first word of the item will have been overwritten by the redirect pointer (also called the forwarding pointer). Figure 5.5 shows an example illustrating these algorithms. Figure 5.5(a) shows part of the memory as GC begins: the stack contains two pointers, xptr and yptr, that point to two structures in the same zone, which has one flexipage. These structures each contain one non-pointer, whose contents are irrelevant here, and one pointer; in this case the pointers point to each other, so this is a cyclic data structure. Note that the flexipage contains two garbage structs, and the redirect bits are all 0. After Algorithm 2 determines which regions to collect, we process the root set. We start by calling PreserveAndTrace with xptr. The struct xptr points to ("item 1") is on the heap, in a region to be collected, and the redirect bit for it is clear, so we Preserve it: we copy it to to-space, set

Figure 5.5: Example: copying two items to to-space. (a) before copying either item; (b) after copying item 1; (c) after copying item 2.

its redirect bit in from-space, and overwrite the first word of the struct in from-space with a pointer to the new copy in to-space. We then update xptr to point to the new copy as well. This is shown in Figure 5.5(b). Next we call PreserveAndTrace on the sole pointer in the freshly copied struct. Again this points to a struct ("item 2") on the heap, in a region to be collected, whose redirect bit is clear, so we Preserve it, and update the pointer we followed (in item 1) to point to the new copy. Next we Trace the newly preserved struct, but this time the sole pointer points to a struct (item 1 again) whose redirect bit is now set. In this case we do not preserve or trace the struct; we just look up its new location in to-space, so we can use it to overwrite the pointer in item 2 (which used to point to item 1 in from-space). When Algorithm 2 calls PreserveAndTrace on the second root, yptr, it finds that the struct it points to, item 2, already has its redirect bit set. It will therefore update yptr to point to the new location of item 2 in to-space, but will not trace item 2 again. This leaves the state shown in Figure 5.5(c). As this example shows, copying collection can eliminate internal zone fragmentation: the garbage that existed in the from-space flexipage has been eliminated, and only reachable data exist, contiguously, in the to-space.

5.5 Managing Arrays and Slices

The Go language provides arrays, pointers to arrays, and slices. Arrays are fixed-size contiguous collections, and array pointers refer to fixed-size collections as well, since the type of an array pointer includes the number of elements in the array as well as the type of the elements. On the other hand, while the compiler knows the type of the elements of a slice, it does not know their number; the size of a slice is dynamic. The Go implementation represents each slice as a structure holding a reference to some element of an array, as well as a capacity (the number of elements in the slice, starting at the pointed-to element), and a count (the number of initial elements in the

Figure 5.6: Pointer into the middle of an array: p points at element 50 of the 100-element array a.

slice that are meaningful). Thus, slices are simply Go structures comprising three members, and, except in the optimization we describe in Section 5.5.2, we treat them as such.
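The three-member representation can be pictured as the following struct; the field names here are illustrative, not gccgo's internal names.

    package rbmm

    import "unsafe"

    type sliceHeader struct {
        data  unsafe.Pointer // reference to some element of a backing array
        count int            // number of meaningful initial elements
        cap   int            // elements available from data to the array's end
    }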

5.5.1 Finding the Start of an Array

Some slices will point to elements in the middle of the target array. The PreserveAndTrace function needs to know the start address of the item to be preserved. With scalar items, once we know the start address of a flexipage, we can use address arithmetic to convert the address of any part of an item into the address of the start of the item. We want to prevent duplication of data when collecting from arrays. If the data has already been collected (copied), there is a chance that a pointer into the middle of the array exists and that copying this middle element of the array would result in duplicated data. This occurs because the array was copied as a single contiguous item and not as a collection of individual elements. Therefore, no redirect bits for the individual elements exist, and duplicate data could result. As we mentioned earlier, duplicates are dangerous, since pointers to any of the duplicates would not be guaranteed to have an accurate picture of the state of the system. For instance, if one of the duplicates were modified, potentially a subset of the pointers would see this change, but not all of them. Consider the array and pointer into it depicted in Figure 5.6. Since our collector does not specify the order in which root variables are traced, it may trace p first and then a, or a first and then p.

Figure 5.7: Duplicating data after collection: p now points at a duplicate copy 50′ of element 50, which is no longer shared with a.

Suppose our collector first traces a and then p. Since a points to the head of the array, it can access all elements; therefore the collector must preserve the entire array. The collector will eventually scan p. Since p points at an element located within the array, our collector should not copy that element, as it was previously copied when a was traced. If the data at p were copied, it would result in a duplicate, as illustrated in Figure 5.7. In this figure, if the data at p is modified, the change would not be seen by any variables accessing element 50 through a.

We handle arrays specially, by placing all arrays of a given type (regardless of length) into a single zone dedicated to arrays of that type. However, this also means that we cannot find the start of an array by address calculation. Our solution is to prefix each array with a small header containing just its size. (Most memory management systems do this for every item; we do it only for arrays.) When tracing a pointer to or into an array, we can look up its address in the zone finder, which will give us a pointer to the start of the flexipage containing the array item. The first item in the flexipage starts just after the flexipage header. Given the start address of an item, i.e. the address of its array header, the size allows us to calculate the address of the start of the next item in the flexipage, if there is one. So we can find the start address of the array that a pointer points into by traversing through the array items on the flexipage. The address we want is the last item start address we encounter in this traversal that is no greater than the pointer's value.
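A sketch of this traversal appears below, under the assumption that each array item begins with a one-word size header and that the first item starts immediately after the flexipage header; the constant and function names are ours, not the implementation's.

    package rbmm

    import "unsafe"

    const arrayHeaderBytes = 8 // assumed size of the per-array size header

    // findArrayStart returns the start of the array item that ptr points
    // to or into; firstItem is the address just past the flexipage header.
    func findArrayStart(firstItem, ptr uintptr) uintptr {
        item := firstItem
        for {
            size := *(*uintptr)(unsafe.Pointer(item)) // item size in bytes
            next := item + arrayHeaderBytes + size    // start of the next item
            if next > ptr {
                return item // the last item start no greater than ptr
            }
            item = next
        }
    }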

5.5.2 Preserving Only the Used Parts of Arrays

Our proof-of-concept, evaluated later, only preserves entire slices and arrays, and does not consider the individual elements. However, we now present a design for garbage collecting a subset of an array, which might prove helpful to future implementers. Our PreserveAndTrace algorithm treats any reference to any part of an item as a reference to the entire item. A reachable variable whose type is an array or array pointer keeps all the elements alive. For slices, however, we can do better, provided live slices refer only to a part of the array that the slices were derived from, and there are no reachable references to the whole array. In such situations, we can reclaim the unneeded elements of such arrays by modifying the algorithm we use to trace slice headers. First we present how we handle slices; later we return to discuss how array and array-pointer-valued variables fit into our scheme. The Go language semantics requires that if an array and a slice, or two slices, shared the memory of some elements before a collection, they must also share those same elements after the collection. We therefore cannot copy reachable array elements individually; we must ensure that contiguous sequences of reachable array elements are copied to a new (possibly smaller) array in which they are still contiguous. When tracing arrives at a slice header, we know that the array elements referred to by the slice are reachable. Unfortunately, we cannot know at that time whether the elements before and after these in the array are reachable or dead: it is possible that they have not yet been visited by the collector, but will be visited later. In general, we can know which elements of an array are reachable and which are dead only once a GC has finished tracing all reachable data. We could add an extra pass to the end of every collection, and defer the copying of array slices until this pass. However, this extra pass would add significant overhead, because not preserving an array when tracing it would reduce locality, and because we would need extra data structures to

keep track of the deferred work. These data structures would occupy space during each collection, exactly when free space is scarcest. We therefore choose to use a conservative approximation: when we get to an array, we copy to to-space the set of array elements that were reachable at the end of the last collection. This means that an array element that becomes dead will survive one collection, but not two. This optimization needs more information attached to each array item than we would need in its absence, which we just described in Section 5.5.1. This information consists of the following (a sketch of the corresponding data structure follows the list):

• ArrayNumBytes, the number of bytes occupied by the item (as before).

• ArrayNumElts, the number of elements in the array; redundant, as it could be computed from ArrayNumBytes, but storing it avoids unnecessary recomputation.

• SeenPrev, an array of ArrayNumElts bits. The bit is 1 iff the corresponding element was reachable at the end of the last collection. Initialized to 1 when the array is first created.

• SeenCurr, an array of ArrayNumElts bits. The bit is 1 iff the corresponding element is reachable during the current collection. Always initialized to 0; meaningful only during a collection.

• Elements, the elements of the array themselves.
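As mentioned before the list, the following sketch models this metadata. The real header and bit vectors are packed in memory, whereas here they are ordinary Go fields for clarity; the accessor names are ours.

    package rbmm

    type arrayItem struct {
        ArrayNumBytes uintptr // bytes occupied by the item (as before)
        ArrayNumElts  int     // element count, cached to avoid recomputation
        SeenPrev      []byte  // one bit per element: reachable at end of last GC
        SeenCurr      []byte  // one bit per element: reachable in current GC
        // the elements themselves follow this header in memory
    }

    // Bit-vector accessors used when tracing slices.
    func getBit(bits []byte, i int) bool { return bits[i/8]&(1<<(i%8)) != 0 }
    func setBit(bits []byte, i int)      { bits[i/8] |= 1 << (i % 8) }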

Our optimization modifies Algorithm 3 so that when the collector traces a slice, it invokes Algorithm 5 instead of Algorithm 4. This algorithm has a chance to avoid preserving unneeded elements of the array holding the slice's elements, but only if the array is stored on the heap. If it is stored in the executable's .rodata, .data or .bss sections, then the array must be a global variable. This means that we need not take action to preserve its storage, and since all global variables are roots, that root either has already

been or will be traced later, as an array (not as a slice). The array cannot be on the stack, because the Go compiler performs escape analysis, and this changes the storage class of any function-local array that a slice may ever refer to, converting it from stack allocated to heap allocated. If the array holding the slice's data is on the heap, we use the algorithms of Section 5.5.1 (represented by the function LookupHeapArray) to find the address of the flexipage storing the item, and from that, the start of the array item, and the zone containing that flexipage. From the addresses of the first elements of the slice and of the array, we can calculate fse, the index of the first slice element in the array. From that and the capacity of the slice, we can calculate lse, the index of the last slice element in the array.

Figure 5.8: Copying only the previously reachable parts of arrays. The array holds elements e0 . . . e8, whose SeenPrev bits are 0 1 1 1 1 1 0 1 1; slice headers S1–S4 refer to various subranges of the array.

Consider the situation when Algorithm 5 is invoked on slice header S1 in the example in Figure 5.8. This slice has a capacity and count of 2, and its data pointer points to the element at index 3 in the array. We will thus set fse to 3 and lse to 4. However, we cannot copy to to-space just the subarray containing only elements 3 and 4. For example, if we later see a reference to slice S2, which also has a capacity and count of 2, but its data pointer

points to the element at index 2 in the array, the copies of the two slices must share the element corresponding to index 3 in the original array. The sequence of elements we may need to have contiguous in the copy is restricted to the neighboring elements that were reachable during the last collection. In our example, the SeenPrev bit vector for the array has a 1 in every position except the ones at indexes 0 and 6, so the first bit in the contiguous sequence of 1 bits that includes the 1 bits at positions 3 and 4 is at index fce = 1, and the last bit in that contiguous sequence of 1 bits is at index lce = 5. This is why we want to make sure that there is a copy in to-space of the subarray consisting of elements 1 to 5. If all of the bits in SeenCurr from fce to lce are 0s, then the subarray has not yet been copied to to-space, so we do the copying then and there. After figuring out the amount of memory needed for the new array item, we reserve memory for it in to-space. We then fill in the array item's header, its SeenPrev and SeenCurr arrays, and finally the elements, which we copy from the original array. The SeenCurr array has all its bits set to 0: those bits will be meaningful in the next collection, not this one. We set the SeenPrev bits in the to-space copy only for the array elements that this slice refers to, since so far these are the only elements that we know are reachable. After we copy all the elements of the subarray from from-space to to-space, we overwrite the first word of the first element copied in from-space with the address of the copy in to-space. This ensures that later calls to TraceSlice arriving at this subarray (e.g., when TraceSlice is invoked with S2) will know where the copy is. Once we have preserved the data in the contiguous elements, we need to trace any pointers in the meaningful part of the slice. So we iterate over all those elements, tracing pointers in elements we have not traced before. Note that we set the SeenPrev bit corresponding to such items even if this call to TraceSlice did not copy the subarray. In our example, this will happen when tracing the copy of element 2 during the invocation of

TraceSlice for slice S2. This will tell the next invocation of the collector that the elements at indexes 1, 2 and 3 are reachable in the copied subarray; these correspond to indexes 2, 3 and 4 in the original array. The elements in slices that correspond to the difference between the Count and the Capacity (if there is one) do not contain data that the program may use, but they must be there, contiguous with the earlier elements, in case the program expands the slice. If there is a slice S3 that has a count of 2 but a capacity of 4, and its data pointer points to the element at index 2 in the array, then we must mark index 4 in the copy (index 5 in the original) as seen, because if we did not, the collection after the next one would recover its memory, which would prevent the correct operation of any expansion operations on S3. Since the body of the function after the initial test can rely on the capacity of the slice being at least one, one of the loops that together iterate i from 0 to cap will set the SeenCurr bit for fse. Since fse is guaranteed to be in the range fce..lce, all later invocations of TraceSlice on a slice that fits in that range will know that the subarray in that range has already been copied to to-space. Just as our optimization must ensure that we call TraceSlice instead of Trace when tracing a slice header, we must handle two other cases specially as well. The first is arrays on the heap, or pointers to them. For these, we need to invoke a version of TraceSlice that acts as if it were tracing a slice whose count and capacity are both the array size (in Go, this is available as part of the type of both arrays and pointers to arrays), the only differences being that (1) the capacity and count come from somewhere else, and (2) relocation must be reflected by an assignment to something other than Data(shp). The second case is pointers to values that happen to point to or inside an array element. We can handle these as if we were looking at a one-element slice, though recording the relocation must be done differently yet again. Since array elements can be of any type, the criterion that tells

Algorithm 3 that it should call not Trace but TraceSlice (or its equivalents for arrays and array pointers) should not be the type of the item being traced, but a property of the zone that contains it. The obvious property to test is "does this zone contain array data." However, the approach we described in this section adds both space and time overheads. If arrays in a zone typically die all at once, then we would not want to incur these overheads, because they would not pay for themselves through the earlier recovery of the memory of array elements. If either the programmer or a profiling system can predict which zones fall into which category, they can control whether the algorithms of this section are applied to each zone. This is done by including a bit in the header of each zone containing arrays; the bit tells the algorithms operating on the zone's flexipages, including Algorithm 3, which item representation the zone uses, and therefore whether they should call TraceSlice, or just a version of Trace adapted to the simpler data structures described in Section 5.5.1.

5.6 Handling Other Go Constructs

Go provides several features we have not yet discussed. We defer the discussion of go-routines until Chapter 6. While we have not modified our collector to handle the constructs in this section, we do discuss how they can be dealt with in future implementations. Interface types in Go might be expected to present something of a problem, since values declared in a function as having an interface type actually have some other type which is not known when the function is compiled. However, our scheme handles interface types without adaptation. User code that deals with an item without knowing its actual type, knowing only what interface it implements, never needs to know what zone the item is stored in. However, when an instance of an interface type is created, its actual type must be known, so it will naturally be placed in the correct zone. When tracing an interface type item, our functions will find the flexipage and the zone it occurs in, and will determine its actual type from that before preserving

Algorithm 5 Trace a slice header, and preserve and trace its slice
Require: shp: Address of the slice header
function TraceSlice(shp)
    if Capacity(shp) = 0 then
        return
    slicestart ← Data(shp)
    if slicestart is in the heap then
        ⟨base, zhp⟩ ← LookupHeapArray(slicestart)
        type ← ElementType(TypeIn(zhp))
        es ← SizeOf(type)               ⊲ Element size
        cap ← Capacity(shp)
        fse ← (slicestart − &Elements(base, 0))/es
        lse ← fse + cap − 1
        fce ← fse
        while 0 ≤ fce − 1 ∧ SeenPrev(base, fce − 1) do
            fce ← fce − 1
        lce ← lse
        while lce + 1 < cap ∧ SeenPrev(base, lce + 1) do
            lce ← lce + 1
        copybase ← PreserveElements(slicestart, fce, lce, fse, lse, es)
        Data(shp) ← &Elements(copybase, fse − fce)
        count ← Count(shp)
        for i ← 0 to count − 1 do
            if SeenCurr(base, fse + i) = 0 then
                SeenCurr(base, fse + i) ← 1
                SeenPrev(copybase, fse − fce + i) ← 1
                eltbase ← &Elements(copybase, fse − fce + i)
                for all aioff ∈ AddrsInside(type) do
                    aiaddr ← eltbase + aioff
                    PreserveAndTrace(aiaddr, False)
        for i ← count to cap − 1 do
            SeenCurr(base, fse + i) ← 1
            SeenPrev(copybase, fse − fce + i) ← 1

Algorithm 6 Preserve slice elements
Require: slicestart: Starting address of slice data
Require: fce: Index of the first contiguous element
Require: lce: Index of the last contiguous element
Require: fse: Index of the first slice element
Require: lse: Index of the last slice element
Require: es: The size of each element
function PreserveElements(slicestart, fce, lce, fse, lse, es)
    ⟨base, zhp⟩ ← LookupHeapArray(slicestart)
    if ∀i ∈ fce..lce . ¬SeenCurr(base, i) then
        numelts ← lce − fce + 1
        copybytes ← ⌈(headerbytes + ⌈(2 ∗ numelts)/8⌉)/alignment⌉ ∗ alignment + numelts ∗ es
        copybase ← AllocFrom(ToSpace(zhp), copybytes)
        ArrayNumBytes(copybase) ← copybytes
        ArrayNumElts(copybase) ← numelts
        for i ∈ 0..numelts − 1 do
            SeenPrev(copybase, i) ← fse ≤ fce + i ≤ lse
            SeenCurr(copybase, i) ← 0
        CopyMemory(&Elements(copybase, 0), slicestart, numelts ∗ es)
        ∗(&Elements(base, 0) + fce ∗ es) ← copybase
    else
        copybase ← ∗(&Elements(base, 0) + fce ∗ es)
    return copybase

and tracing it. An interface type in effect stands for all the types that implement all the methods of that interface. In some cases, this may lead to regions with many zones, one for each actual type that is passed to a function or a set of functions expecting an interface type, with many of these zones containing very few or no items at all. In such cases, much space will be wasted on flexipages with few inhabitants. An alternative approach would have the region inference algorithm put items that are used as values of interface types into a special zone in each relevant region, a zone in which each item

contains a tag. This would trade slower GC and higher per-item memory overhead for a lower per-actual-type overhead. Determining which of these two approaches is better, and under what circumstances, is a matter for future work. There are aspects of the Go implementation, such as strings and maps, that use specialized data representations. The algorithms that we have presented in this chapter need minor adaptations to handle these representations.

5.7 Evaluation

In order to measure how well our RBMM system performs with our region-aware garbage collector, we executed a series of tests to measure time and memory usage. Most of the benchmarks are derivatives of those mentioned in Chapter 3, with some parameters adjusted in order to increase runtime. Due to a buggy implementation, we were only able to run a subset of the tests from the Debian Language Shootout benchmark suite provided in the GCC test suite for Go. Further limiting our test set is the fact that our newer modification does not support multi-module Go programs. Therefore, only single-module tests have been considered. It happens that all of the Shootout tests are single module, but also very small in terms of lines of code. Once again, we chose programs that did not require the use of go-routines. With the exception of matmul,1 we only show results for benchmarks whose output we can verify. The additional benchmark programs we added were also from the Debian Language Shootout and range from approximately 58 to 85 lines of code. These programs are fannkuch (an integer manipulation program), mandelbrot (a fractal generator), and pidigits (a pi digit calculator). Benchmarking tests were conducted on the same machine as our original

1This benchmark provides interesting information even though it failed verification when using our region-aware garbage collector.

series of evaluations (see Chapter 3). However, given the time that passed between the work of that chapter and the current one, some software on the test machine has been updated. For instance, the OS running the tests has been upgraded to Ubuntu 13.04. Similarly, the GCC used to compile both our modifications (our plug-in and runtime system) and the tests themselves is version 4.7.2 (which supports the Go version 1.0 specification). The GCC Go libraries used during runtime were provided by the same GCC 4.7.2 and are from Go version 1.0.1. Some of our modified runtime system is based on code from GCC's libgo versions 4.7.1 and 4.7.0.

5.7.1 Methodology

The results reported here involved running each test program with and without output. We verified that our modifications do not alter the output from that which the unmodified program produces. The performance results for each test were gathered by executing each benchmark with its output disabled. This allows us to measure execution time more precisely, without the additional OS overhead of printing values to the display. Both the high water mark (HWM), which represents the maximum resident set size of the process, and the elapsed time are reported as an average over ten runs for each benchmark. The Inuse Pages and Free Pages columns represent the number of flexipages our system has allocated, and whether they are currently being used to produce allocations from, or sit on a free list and are available when a new page is needed. All dynamic memory provided to the program and to our runtime system is reflected in these page metrics.

5.7.2 Results

Table 5.1 contains our benchmarking results. Of interest are both the time and space usage of the benchmarks. Unmodified tests are labeled as (plain). At the time of experimenting, Go's existing collector was a mark-sweep collector.

Test                       HWM (MB)   Elapsed (s)   Inuse Pages           Free Pages
                                                    N        KB           N        KB
binary-tree (plain)          87.55       11.41      NA       NA           NA       NA
binary-tree (rbmm)           63.71        2.87      9,364    37,456.0     6,242    24,968.0
binary-tree (rbmm+gc)       513.29        4.19      9,420    37,680.0     6,965    27,860.0
fannkuch (plain)              8.37        7.76      NA       NA           NA       NA
fannkuch (rbmm)               8.39        7.89      4        16.0         1        4.0
fannkuch (rbmm+gc)            8.43        7.59      6        24.0         1        4.0
mandelbrot (plain)            8.43        4.66      NA       NA           NA       NA
mandelbrot (rbmm)             8.45        4.72      0        0.0          0        0.0
mandelbrot (rbmm+gc)          8.48        4.66      0        0.0          0        0.0
matmul (plain)               84.92       14.44      NA       NA           NA       NA
matmul (rbmm)                78.72       11.99      6,005    72,100.0     4,503    54,072.0
matmul (rbmm+gc)                 —           —      —        —            —        —
meteor-contest (plain)        8.60        0.18      NA       NA           NA       NA
meteor-contest (rbmm)         8.67        0.12      3        12.0         1        4.0
meteor-contest (rbmm+gc)     12.52        0.12      5        20.0         1        4.0
pidigits (plain)              9.56       44.74      NA       NA           NA       NA
pidigits (rbmm)               9.56       44.67      0        0.0          0        0.0
pidigits (rbmm+gc)            9.57       44.68      0        0.0          0        0.0

Table 5.1: Performance results

Tests using RBMM with Go's garbage collector are labeled as (rbmm), and tests utilizing RBMM and our region-aware garbage collector are labeled as (rbmm+gc). We have not optimized our garbage collector. Much time was spent debugging and correcting issues so that we could run this small benchmark set. Building such a system is not trivial; however, we do feel that we have enough in place to perform measurements which can be helpful for future researchers. An interesting measure to consider when reading these results is the additional size required to support the garbage collector. Such "bookkeeping" data includes the typeinfo, global variable, stack, and register information tables that our collector uses to scan allocated items and to traverse the call stack. Both of our RBMM systems create regions dynamically at runtime.

Test                       Executable Size (KB)
binary-tree (plain)          66
binary-tree (rbmm)          147
binary-tree (rbmm+gc)       211
fannkuch (plain)             65
fannkuch (rbmm)             145
fannkuch (rbmm+gc)          204
mandelbrot (plain)           65
mandelbrot (rbmm)           143
mandelbrot (rbmm+gc)        208
matmul (plain)               66
matmul (rbmm)               146
matmul (rbmm+gc)            215
meteor-contest (plain)       78
meteor-contest (rbmm)       157
meteor-contest (rbmm+gc)    247
pidigits (plain)             65
pidigits (rbmm)             144
pidigits (rbmm+gc)          207

Table 5.2: Binary sizes of the benchmarks

Test               Plain       RBMM        RBMM+GC     RBMM+GC
                   (Go's GC)   (Go's GC)   (Go's GC)   (Our GC)
binary-tree          202          1           0           17
fannkuch               1          0           0            2
mandelbrot             1          1           1            1
matmul                11          1           —            —
meteor-contest         7          1           0            9
pidigits           4,499      4,462       4,458            0

Table 5.3: Number of garbage collections

For the rbmm+gc tests, our system allocates additional memory upon each region creation to contain its zone header information. Recall that our zones are only useful for GC; therefore, in the rbmm tests, each region contains just a single type-agnostic zone. Table 5.2 lists the binary sizes for all three versions of each program

described above, without the statistics-gathering code compiled in. These numbers reflect the additional memory a test requires when the test is initially loaded into main memory. This size will impact the minimum bound on the HWM values depicted in Table 5.1. Of course, not all of a binary is loaded into memory; metadata stored in the file will be used by the OS to properly load the executable into memory. RBMM alone contributes an average of 79.50 KB to the binary's size. Adding in our region-aware garbage collector contributes, on average, an additional 68.33 KB over that of RBMM alone. Note that these values for rbmm+gc also reflect additional debugging information for aiding GC debugging. While these data are not used during runtime, they were not removed due to complications.

Our system never returns memory to the OS. Instead, the system recycles memory for later allocations. When looking at the HWM values, the Inuse Pages and Free Pages metrics should also be taken into consideration. These values show the data that only our runtime systems (rbmm and rbmm+gc) know of. The runtime memory footprint results, as seen in the HWM values, show that our collector does not generate much additional memory for test cases that utilize small amounts of dynamic memory (fannkuch, mandelbrot, and pidigits). Of these, fannkuch performs the worst for memory, but only by 61.44 KB of additional space as compared to the plain version. For mandelbrot and pidigits, no dynamic memory is requested in the source file that we analyze and transform. Therefore, these tests provide an interesting case for evaluating how performance is affected by a relatively idle rbmm and rbmm+gc system. Potentially, the libraries that mandelbrot and pidigits call might request dynamic memory; however, those libraries were never compiled with a region-aware compiler and are unaware of RBMM. Therefore, all (potential) memory requests in those libraries are handled by Go's existing collector. (This is the reason why we cannot fully disable the need for Go's existing garbage collector.) In other tests (meteor-contest and binary-tree), where dynamic memory is more stressed, we find that the former performs comparably to the

unmodified test for rbmm and slightly worse for rbmm+gc. We see that the latter performs quite well when not running with our garbage collector, but uses significantly more memory with our collector. We believe this to be the result of our collector's design; each region instance contains a list of zone header structures from which the region can allocate memory. This test generates numerous regions, and we find that a majority of the zone headers are never used. Recall that each region allocates a zone header for every type in the program, which simplifies our design at the cost of additional runtime memory. The optimal choice would be to instantiate a region with zone headers only for the types that our analysis can establish at compile time. When a type is allocated at runtime that does not have a corresponding zone in the region, a new zone should be allocated dynamically and added to the region's list of zone headers. It would take quite a bit of extra work to implement this optimization, and even longer to ensure reliability. We find a nice memory improvement using rbmm over the unmodified version of the matmul benchmark. However, the rbmm+gc case also shows us that our garbage collector needs work, as we could not run it to completion during testing. Once again, memory management is hard for programmers to get correct, and the same goes for memory management designers. matmul relies on slices, and our garbage collector's slice handling has been riddled with bugs. Much time has been spent trying to smooth things out, with little success. As expected, our collector does utilize more pages than our rbmm system alone, as seen in the Inuse Pages metric. Recall that a region will need to allocate more pages during copy-collection for copying objects from from-space into to-space. Table 5.4 shows the number of regions that our RBMM system creates for each benchmark. Fewer regions mean that our system can run faster, with less overhead of region creation and reclamation, but also mean that our system is performing less work. However, even for the binary-tree case, which utilizes a relatively high region count, we still perform better,

154 Test Go’s GC Our GC (RBMM+GC) binary-tree 699,043 699,043 fannkuch 4 4 mandelbrot 1 1 matmul 3 — meteor-contest 3,460 3,460 pidigits 1 1

with respect to both time and space, than the base case. The main drawback is that our collector does not seem to significantly reduce the memory footprint of applications, which is its primary goal. Regions can also adversely affect the HWM value for our tests. We find that our memory usage is high (larger than the unmodified programs) when we utilize our garbage collector. However, when we rely on Go's garbage collector, we find a comparable, or better, memory usage than the unmodified version. Even though binary-tree and meteor-contest utilize many regions, we see that they are not compromising program performance. It is only when we add in our region-aware garbage collector that we see considerably worse performance. We believe the issue to be related to how our implementation utilizes region headers. Thus, the problem is not solely the handling of the program's data, but also (if not nearly entirely) the handling of bookkeeping data internal to our system. In addition, we have not optimized our collector, and thus it is still primitive at best. Both of our modified systems perform comparably in execution time on all of the tests we measured. Since our time differences are often quite minor and the times so small, we cannot exclude the possibility of OS noise and overhead from other concurrently running processes on the test machine. As with the results seen in Chapter 3, our binary-tree performance improved considerably over that of the unmodified version. However, it is also beneficial to know that the collector does not seem to pose significant time overhead for the low-memory tests.

Time is dependent on both application runtime and GC performance. Both GC systems, our garbage collector and Go's existing garbage collector, are stop-the-world, and halt execution of the mutator to perform their GC duties. In addition, our RBMM systems affect runtime performance due to our static analysis inserting region operations during compilation, which increases program size.

Table 5.3 shows the number of GCs that each test performed, and Table 5.4 the number of regions created. Note that we ran this data collection only once to obtain the GC values. One would expect the same collection counts on benchmarks which do not make any time-based or pseudo-random decisions. In fact, having benchmarks with repeatable output is desirable, as it allows for multiple identical independent runs of the same executable. However, we noticed that Go's garbage collector had an interesting property: non-determinism. We can run a test with deterministic output (no apparent rand() calls or time-based decisions) numerous times and get different collection counts reported by the Go collector. We cannot guarantee what happens in external library calls, but we did not see any of the benchmarks making calls to the random or time Go packages. It turns out that the Go runtime system provided by our test GCC versions 4.7.2 and 4.6.3 enables memory profiling by default. We assume versions between the two aforementioned GCC versions also permit memory profiling by default. The Go memory profiler samples memory based on a random value. This sampling can increase or decrease the number of times their collector executes. The collection counts are important, as they allow us to decide whether our speedup is due to our GC frequency, RBMM efficiency, or possibly some combination of both. However, we report just the value from a single run of the default Go execution. We believe that such a number will still allow us to measure our speedup. The collection count values for our region-aware collector are only available on the tests that have our collector enabled (rbmm+gc); all other cases have these values labeled as NA.

We expected that our garbage collector would slow down the performance of our RBMM system. Surprisingly, both fannkuch and mandelbrot perform slightly better than rbmm, but only by 0.3 seconds in the case of fannkuch. We should note that the aforementioned tests only attempt to collect garbage twice and once, respectively (as seen in Table 5.3). Further, the difference in times between rbmm and rbmm+gc is well within the window of noise that could be caused by OS overhead (e.g., process scheduling).

Our region-aware GC (rbmm+gc) did appear to slow down runtime relative to rbmm for binary-tree. This is what we expect, but the difference caused by our system's 17 collection attempts was only 1.32 seconds. We also note that our rbmm system alone reduces Go's collections from 202 to just 1. Thus, we witness a 3.97-fold speedup for rbmm and a 2.7-fold speedup for rbmm+gc in comparison with the unmodified version. We note that a majority of the speedup in the latter is due to less GC.

pidigits is interesting, as this test case triggers no GCs from our collector, which suggests that this test makes a majority of its memory requests in non-region-aware libraries.

For matmul it seems that nearly all allocations are handled in the source file, which means our RBMM system is fully utilized. Our rbmm system with Go's GC does perform slightly better for space than the unmodified version. In this rbmm case, Go's collector was only utilized once, which shows that our RBMM system without GC, in cases where it can be fully utilized, can perform well.

We conclude that our base RBMM system without our garbage collector does not compromise runtime performance. We cannot say the same for our system with our region-aware garbage collector.

5.8 Summary

In this chapter we have described an augmentation to the design of our previous automatic memory management system, combining the fast allocation and deallocation of memory under RBMM with the relatively low peak memory usage of a garbage collector. However, our results show that more work is needed to accomplish the goal of reducing the peak memory. As with our initial incarnation, our system still requires no programmer annotations, which we view as a benefit. To improve locality of reference, we use a copying collector, so we need to know the type of each item. Instead of attaching type tags to individual memory items, as done by Hallenberg, Elsman and Tofte [38], we attach them to pages, similar to the Big Bag of Pages approach to GC. Our solution is more closely related to Elsman's later work [25], in which he utilized the Big Bag of Pages idea for the Tofte and Talpin typed RBMM system. Our GC design can decide which region or regions to garbage collect based on runtime memory usage; however, the implementation we developed collects from all regions at once, as such a system is easier to implement. The design described in this chapter can be modified to use compile-time region points-to information, to avoid tracing regions that cannot point to regions being collected. This can reduce GC times. In the future, our design can be extended based on the design of the Garbage-First collector [21]. The latter is a parallel and concurrent collector which can collect from a subset of regions. Unlike our solution, their design is not static; regions are based on temporal properties rather than object type. We have also presented a design for garbage collecting unused elements in arrays. These can arise when slices and slices of slices are taken, old arrays and slices go out of use, and new ones continue to be used. Our design reclaims unneeded elements over two successive GCs, with each GC retaining the elements that were reachable at the time of the previous GC, and determining the elements needed during the current GC. This capability costs two bits per slice element and increases runtime overhead slightly, but

can potentially reclaim a substantial amount of additional memory. Once implemented, this feature can be switched off selectively in zones where it proves ineffective.

Chapter 6

RBMM in a Concurrent Environment

Scientific inquiry shouldn’t stop just because a reasonable explanation has apparently been found.

Neil deGrasse Tyson

Multi-core processing has become a common way of improving computational performance as single-core systems meet physical limitations. Parallelizing computations across multiple CPUs is a relatively simple way to achieve computational performance, and has become the norm for even consumer-grade hardware. It is not difficult to understand that if you give a powerful CPU to a research student, they can find a way to maximize its performance relatively quickly. The same goes for the consumer world. In 1965, Gordon Moore established the notion that every two years the number of transistors on an integrated circuit doubles. This observation became known as "Moore's Law," and can more generally be reduced to the idea that computer systems will roughly double their CPU power every two years. As physical limitations of CPU architectures are being approached, chip manufacturers are looking at other ways to continue providing consumers with more computational power. One solution is to merely provide more CPU execution cores for processing (multi-core systems). However, this leads to a different challenge: if a program is to achieve the best performance, then

programmers must write programs that take advantage of the increase in CPU parallelism. Up to this point in this thesis, we have discussed RBMM for a sequential subset of the Go programming language. In this chapter we introduce a design that allows RBMM to work within the co-routine environment provided by the Go programming language. To our knowledge this is a unique contribution that introduces RBMM into a language whose concurrency is designed around the paradigm of Hoare's Communicating Sequential Processes [49].

6.1 Introduction

Often the terms parallel and concurrent are confused. The topic of concurrency versus parallelism is a seemingly popular one within the Go community. Go developer Andrew Gerrand summarizes the differences as: "Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once" [33]. To clarify, a concurrent system is one that can execute multiple logical threads all at once. These threads are considered logical and can be of a quantity greater than the number of physical threads (system threads) and execution cores provided by the CPU. These thread executions may be interleaved in time (possibly on just a single CPU) and scheduled via the OS. In fact, a program can make use of threads and thereby operate as many smaller processes. This can enhance efficiency and allow a program to perform multiple tasks all seemingly at the same time. If programmers want to squeeze the most performance out of their application, they should consider ways to thread their application in order to take advantage of such concurrency. In contrast, parallel programming makes use of multiple execution cores or system threads so that the threads of a single program can execute simultaneously (in parallel). For the remainder of this chapter we will interchangeably refer to the threading of a program and allowing it to execute in a parallel context as concurrent

or parallel programming. The differences in meaning are not necessary to explain our modifications. While hardware seems to be increasing in performance, easy solutions to concurrent programming are still primitive. Programmers must be aware of the dangers of deadlock and race conditions, and try to avoid them as best as possible, using whatever programming language features are available. Deadlock occurs when multiple threads are trying to access a shared (critical) piece of data, and each must wait on the progress of the other thread(s). In such a case, the program will appear to have paused. Race conditions occur when the access to a piece of shared data is dependent on a particular sequence of events. A programmer must prevent one thread from reading or writing data that another thread might be manipulating at the same exact time; otherwise one of the threads (or possibly both) will have an incorrect picture of the program's state. We call such invalid data access a data access contingency. As mentioned, no simple solutions have become common to avoid such situations. Mostly, a programmer must employ complicated locking constructs to restrict access to a critical piece of data (a critical section), permitting only one thread to read/write it at a given time. This is hard! Rather than being able to focus on solving their primary computational problem, programmers may also have to manage resources (e.g., memory management), all while trying to accomplish these tasks in concurrent contexts.

6.2 Go’s Approach

The Go programming language aims to reduce the complexity of concurrent programming by encouraging a practice of sharing memory by communication [32]. To do this, Go relies on co-routines and channels as introduced in Tony Hoare's 1978 paper, Communicating Sequential Processes [49]. Such a system eliminates the need for a programmer to explicitly lock data shared among threads. Instead, a program can be written using channels and co-

routines (in Go terminology these are called go-routines) that permit atomic access to data. Instead of having multiple threads look at a single piece of data (potentially at the same time), Go encourages programmers to pass data between go-routines using channels. A go-routine can listen for data on a channel, and will block if that data is not available. Once a routine executing on a thread gains access to the data, it can manipulate it. If the program is designed properly, the need to lock the data is handled implicitly by the blocking that occurs while the routine is waiting to receive data from the channel. Thus, it is easy for a Go programmer to write a parallel program and share data atomically, avoiding some of the parallel programming pitfalls.
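A small illustration of this style (our example): the value is handed to the go-routine over one channel and handed back over another, with the blocking channel operations providing the synchronization that would otherwise require explicit locks.

    package main

    import "fmt"

    // worker owns the value between the receive and the send; no other
    // go-routine can touch it during that window, so no lock is needed.
    func worker(in <-chan int, out chan<- int) {
        v := <-in    // blocks until the parent sends the value
        out <- v * 2 // pass the (modified) data back
    }

    func main() {
        in, out := make(chan int), make(chan int)
        go worker(in, out)
        in <- 21           // hand the data to the worker
        fmt.Println(<-out) // prints 42; blocks until the worker replies
    }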

6.3 Design

While Go programmers might have an easy means of applying concurrent models to their software, our implementation of parallel-safe RBMM must still rely on more traditional forms of handling concurrent data access. In fact, our RBMM-aware Go runtime system is written in C, and we rely on traditional (complicated) constructs to prevent data access contingencies. In this section we present a design for managing regions that can be used in concurrent Go programs. Any function in Go can execute concurrently by simply prefixing the function call with the keyword go. The new function invocation will then execute in a new independently-scheduled thread, which will terminate when the call returns.1 Since the new thread can execute in parallel with its parent, operations on any regions passed from the parent thread to the new thread will need synchronization. To support go-routines we mark regions passed in such calls, and generate calls to modified versions of the region creation, allocation, and removal operations.

1The Go language and runtime system is designed to handle thread creation cheaply. Instead of spawning off multiple system-threads, which can be resource expensive, the language permits go-routines to execute (interleaved) on a single system thread.

To better understand our design we provide the following additions, in Figure 6.1, to the analysis constraints originally described in Section 3.6. Both send and recv require channel variables. Since channels are allocated

S[[v1 = recv from v2]]ρ = (R(v1) = R(v2))

S[[send v1 on v2]]ρ = (R(v1) = R(v2))

S[[go f(v1 ...vn)]]ρ = θ(πf1...fn (ρ(f)))

where θ = {f1 7→ v1,...,fn 7→ vn}

Figure 6.1: Added rules for handling Go's concurrency primitives.

with new, they are associated with regions. Go's go statements take a single function call as input. Our semantics maps the regions of the actual parameters of the caller to the formal parameters of the callee. This mapping generates a set of regions for all of the input arguments. Notice that we ignore return values. Any function can be made parallel via the go keyword, even if it returns values. However, Go will not permit the assignment of a return value from a go-routine, thus syntactically forcing the programmer to make use of channels. In Section 6.3.3 we justify the requirement that messages being sent and received share the same region as the channel they use.
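To make the rules concrete, consider the following fragment (an illustrative example of ours; the R(·) annotations refer to the region analysis above):

    package main

    type msg struct{ payload int }

    func consumer(ch <-chan *msg) { _ = <-ch } // recv rule: R(received) = R(ch)

    func main() {
        ch := make(chan *msg) // the channel lives in some region R(ch)
        v := &msg{payload: 1}
        go consumer(ch) // go rule: regions of actuals map to the formals
        ch <- v         // send rule: forces R(v) = R(ch)
    }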

6.3.1 How a Go-routine Executes

To understand the following sections, it is necessary to understand what happens at both compile time and runtime when a go-routine is created. The compiler will first replace the go-routine call with a special helper function that is responsible for creating a new thread and then calling the function that is to be made concurrent. This helper function takes a closure and a

set of arguments. The closure is that of the original function that is to be made concurrent, and the set of arguments are the original arguments passed to the function. At runtime, the helper function is called, a new thread is created, and a wrapper function is called. The wrapper executes on the go-routine; it passes the proper arguments to the closure, executes the closure, and then exits. Once the wrapper returns, the go-routine can be thought of as terminated.
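A Go-level sketch of this lowering follows; the helper and wrapper names are ours, not the actual gccgo entry points, and the real transformation happens inside the compiler on C-level closures.

    package main

    import "sync"

    // goHelper stands in for the compiler-inserted helper: it receives the
    // closure and its argument set, and starts the new thread of execution.
    func goHelper(closure func(args []int), args []int, wg *sync.WaitGroup) {
        wg.Add(1)
        go func() { // the wrapper, executing on the new go-routine
            defer wg.Done()
            closure(args) // pass the original arguments and run the closure
        }() // when the wrapper returns, the go-routine has terminated
    }

    func main() {
        var wg sync.WaitGroup
        goHelper(func(args []int) { println(args[0] + args[1]) }, []int{40, 2}, &wg)
        wg.Wait()
    }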

6.3.2 Concurrency for Region Operations

As with our semantics, we must also reconsider the meanings of our region operations, which we originally described in Section 3.5. The important consideration is how to prevent any data access contingencies when accessing the data that these operations manipulate (region metadata). Such accesses must be atomic; if they are not, a data access contingency can arise.

Protection Counters and Concurrency

Our protection counters have been made thread-safe via the use of atomic operations. This ensures that no two threads can manipulate a single counter at the same time. Our RBMM system must take care during region removal. It must prevent a thread from prematurely removing a region that might be accessed via other regions (and might also be shared by a multitude of threads). In Section 3.8 we introduced the concept of a protection counter, which prevented a function from prematurely removing a region that was needed by a calling function. This idea needs refinement in a concurrent context. Consider the case where one go-routine has completed its use of a region, and then decrements the region's protection counter and attempts to remove that region. Potentially another go-routine might still be using that region, and if it suddenly gets removed, then the resulting program is incorrect (and can crash). To prevent such cases, our protection counter takes on a slightly

different meaning in the context of go-routines: a region's counter reflects the number of caller functions and go-routines that need the region alive. Consider another example:

func main() {
    reg := CreateRegion()
    val := AllocFromRegion(reg, sizeof(int))
    *val = 42

    // More code...

    go processSomeData(val)

    // More code...
}

In the protection counting scheme used so far, we would increment the counter just prior to calling processSomeData and decrement it just after that call. A go-routine call spawns a new lightweight thread of execution, and the current thread proceeds to the next statement irrespective of how long the go-routine'd function takes to execute. If we were following our old scheme, we would now remove the region associated with val, oblivious to the fact that the go-routine, which concurrently executes processSomeData, might access val. This could result in a program crash. To prevent this, we increment the protection counter, as usual, just prior to making the go-routine call. The adjustment we introduce here is that we decrement the counter once the go-routine completes. To understand the reasoning behind this adjustment, it is necessary to understand how a go-routine operates at the low level, as explained in Section 6.3.1. Our solution is to insert the decrement and region removal operations in the wrapper that is responsible for executing the call at runtime.
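A minimal sketch of this discipline follows, assuming an atomically updated counter field; the field name and the use of Go's sync/atomic package are assumptions of the sketch, not our runtime's actual layout.

import "sync/atomic"

// Region stands in for the runtime's region header.
type Region struct {
    protection int32 // callers and go-routines that need the region alive
}

// IncrProtection runs in the parent thread, just before the go statement.
func IncrProtection(r *Region) { atomic.AddInt32(&r.protection, 1) }

// DecrProtection runs in the wrapper, after the go-routine's work is done.
func DecrProtection(r *Region) { atomic.AddInt32(&r.protection, -1) }

For the example above, the placement is: IncrProtection(reg) in main just before the go call (redirected to the compiler-generated wrapper of Section 6.3.1), and DecrProtection(reg) followed by an attempted region removal inside that wrapper once processSomeData returns.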

Region Header and Concurrency

We add a lock variable to our region header data structure. This variable is used by our runtime system to prevent multiple threads from removing, or allocating from, a region simultaneously. It is important to note that we lock just the region and not the entire allocator, allowing unlocked regions to be allocated from or removed regardless of what is happening to other regions.
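A sketch of the extended header is below; the field names, and the choice of a Go sync.Mutex to model the lock, are assumptions of the sketch rather than our runtime's exact layout.

import "sync"

// One lock per region: locking this region never blocks allocation from,
// or removal of, any other region.
type Region struct {
    lock       sync.Mutex // serializes allocation and removal on this region
    protection int32      // protection counter (Section 3.8, as refined above)
    // ... zone headers and free-space bookkeeping elided ...
}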

CreateRegion() and Concurrency

For region creation operations, our parallel modification allocates space for, and initializes, the additional lock field in the region header.

AllocFromRegion() and Concurrency

For region allocation operations, our modification turns the usual code of the operation into a critical section protected by the lock field in the region header. This extra synchronization can be optimized away for allocation operations in the main thread that execute before the first go-routine call involving the region.
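A bump-pointer sketch of the locked allocation path, under simplifying assumptions (a single pre-allocated page per region; overflow and page-growth handling elided):

import (
    "sync"
    "unsafe"
)

type Region struct {
    lock sync.Mutex
    next uintptr // next free offset in the region's current page
    page []byte
}

// AllocFromRegion: the whole allocation is one critical section guarded by
// this region's own lock, so allocations from other regions proceed
// unimpeded.
func AllocFromRegion(r *Region, size uintptr) unsafe.Pointer {
    r.lock.Lock()
    defer r.lock.Unlock()
    p := unsafe.Pointer(&r.page[r.next])
    r.next += size
    return p
}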

RemoveRegion() and Concurrency

For region removal operations, our modification operates, under mutual exclusion, on the lock field in the region header. When the region is mentioned as an argument in a go-routine call, we increment its protection counter. This signifies that another thread has access to the region and that the region should not be removed until no other threads have access to it. When no other threads need to access the region's contents, the protection counter will have a value of zero, signifying that the next region removal operation can safely reclaim the region's memory.

Just before a thread executes an operation to remove the region, at the point where it has no further references to the region, we decrement its protection counter. This decrement occurs in the wrapper discussed in Section 6.3.1. If the region's counter is still positive, some other thread must still be using the region, or a variable in the region is still needed later in the function's execution, so the removal operation will not be able to reclaim the region's memory. This runtime test is necessary because, while a static analysis can determine which program point in the body of each thread makes the last reference to a region in that thread, the question of which of these per-thread last references will actually be executed last at runtime may depend not only on the input to the program but also on the effects of thread scheduling, and thus in general cannot be decided statically.
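Putting these pieces together, removal might be realized as in the following sketch; releasePages is a hypothetical helper, and the Region fields mirror the earlier sketches.

import (
    "sync"
    "sync/atomic"
)

type Region struct {
    lock       sync.Mutex
    protection int32
    // ... page bookkeeping elided ...
}

// releasePages is hypothetical: it returns the region's pages to the allocator.
func releasePages(r *Region) {}

// RemoveRegion reclaims the region's memory only when no caller or
// go-routine still needs it; the lock keeps removal from racing with
// allocation on this region's metadata.
func RemoveRegion(r *Region) {
    r.lock.Lock()
    defer r.lock.Unlock()
    if atomic.LoadInt32(&r.protection) > 0 {
        return // still needed; a later removal attempt may succeed
    }
    releasePages(r)
}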

6.3.3 Transformation

The overall transformation is shown in Figure 6.2. Note that the function f has been replaced in the transformed version with a wrapper function, f', as discussed in Section 6.3.1.

go f(v1, ..., vn)⟨r1, ..., rp⟩;

becomes

IncrProtection(r1);
...
IncrProtection(rp);
go f'(v1, ..., vn)⟨r1, ..., rp⟩;

where func f'(f1, ..., fn)⟨r1, ..., rp⟩ {
    f(f1, ..., fn)⟨r1, ..., rp⟩;
    DecrProtection(r1); ... DecrProtection(rp);
    RemoveRegion(r1); ... RemoveRegion(rp);
}

Figure 6.2: Go-routine transformation

Our wrapper also ignores the return value of f, if it has one. Since a call to a go-routine completes immediately, and the execution of the go-routine might be delayed for low-level scheduling reasons, there is no return value that the caller can use. Instead, channels should be used to communicate data between go-routines (a single-threaded application is simply one go-routine). Like main, when f' exits, its thread will not have any remaining references to the regions it handles; but unlike main, it gets some regions from its parent thread, and does not have to create them all by itself. Note that the IncrProtection operations must be performed in the parent thread; if they were performed in the child thread in f', the parent thread could delete a region before the child thread got a chance to perform the increment that would prevent the deletion.

We can optimize the above code in some cases. For example, we can sometimes guarantee that a per-thread last reference cannot be the last reference globally. If two threads communicate using an unbuffered channel, meaning that the writing thread will block until the reading thread is ready to read, and if the last reference to a region in the reading thread is before the read while the last reference to that region in the writing thread is after the write, then we know that the last reference to the region in the reading thread cannot be the overall last reference to the region. In that case, we can optimize away the call to RemoveRegion after the call to DecrProtection in the reading thread.

When a thread t1 sends a message to another thread t2, with a statement such as send v1 on v2, the code executed by t1 effectively decides what region supplies the memory for the message: it will be R(v1). When t2 receives the message, it will do so with a statement such as v3 = recv from v4. After this statement, t2 will believe the message to be in region R(v3). We need this to be the same as R(v1), since otherwise the two threads will disagree about when the region of the message can be reclaimed. We ensure this by imposing the chain of equalities R(v1) = R(v2) = R(v4) = R(v3). The first equality is from the analysis rule for send statements; the third is from the rule for recv statements; and the second follows from the fact that, for the message to be transmitted, v4 must refer to the same channel, and thus the same region, as v2.

There are two ways that two threads can communicate. One way is for both to be given a reference to the same channel by a common ancestor

(which may be one of the threads themselves). In this case, a variable representing the channel will be an argument in a go-routine call, and therefore, after our transformations, the region of that channel will be passed along with it. The other way is for one or both of the threads to receive the channel in a message. The design presented in Chapter 3 stores all parts of a data structure in the same region. In Chapter 5 we relaxed this constraint, permitting a single data structure to have multiple regions associated with itself and its members. Our analysis must place channels and the data they refer to in the same region. Likewise, our analysis must also place any message used for passing channels in the same region as that channel. This implies that (a) a channel in a message is stored in the same region as the message, while the rule for send operations says that (b) a message is stored in the same region as the channel it is sent through. Together, (a) and (b) imply that, if a channel c2 is sent in a message on channel c1, then R(c1) = R(c2).

This means that even if t1 and t2 communicate on channels sent in messages, those channels use only regions whose identities are passed between threads at go-routine calls. Our system of equating the regions of messages and channels allows the region of a message to be reclaimed while the message is in a channel only if the channel itself is being reclaimed. This can happen if, after a message is sent on a channel, all references to the channel become dead. If that happens, no thread can ever receive the message, so recovering its memory is safe.

6.3.4 Producer and Consumer Example

The example in Figure 6.3 illustrates the design presented in this chapter. We use a common concurrency pattern, the producer/consumer model. In this example the main function acts as the consumer and is fed wookies from a farm of wookie producers. Each producer is simply a function called producer that executes in its own go-routine. These producers generate wookies for the consumer to manage. The example

generates 10 producers and terminates once the consumer has devoured a mass of 1000.0 WookieTons of wookie. The producers and consumer communicate via a shared (unbuffered) channel, ch. The main thread of execution creates the shared channel and passes it to all of the producers. Each producer continuously generates a Wookie object, populates its fields, and passes the new wookie back to the consumer. The producers only stop once the program terminates, which is a decision made by the consumer.

After our RBMM transformation, the code will look similar to that presented in Figure 6.4. Our transformation establishes a region from which the wookie instances are allocated. Since the consumer needs the wookie instances it consumes, it must also create the region for the wookies it will receive. Since the consumer communicates with the producers via a shared channel, and the wookie instances are sent down this channel back to the consumer, our analysis unifies the regions for the wookies and the channel. This unification results in a single shared region consisting of the channel and the wookies that will be sent down it.

Our transformation also generates a new function, producer_wrapper. This new function is the result of analyzing the go statement during compilation. The go statement creates a separate thread of execution by copying producer_wrapper, and the original arguments to producer, onto a new thread of execution. This wrapper is responsible for calling the original function (producer) with the proper arguments and then decrementing the region's protection counter as appropriate. When this program is executed, our system will try to remove the region in the wrapper, but since the consumer incremented the region's counter, this removal will never succeed. Instead the program will exit, before all go-routines complete, upon meeting the desired consumed wookie mass. In this case the consumer will never remove the region. The region's protection counter will always be greater than zero due to the constantly running producers utilizing the channel that belongs to the region. This is correct, since our system should not remove a region for a channel when its data is needed later.

package main

import "math/rand"

type Wookie struct {
    producer_id, wookie_id int
    weight                 float64
}

func producer(producer_id int, ch chan *Wookie) {
    counter := 0
    for {
        counter++
        w := new(Wookie)
        w.producer_id = producer_id
        w.wookie_id = counter
        w.weight = rand.Float64() + 42.0
        ch <- w
    }
}

func main() {
    ch := make(chan *Wookie)

    for i := 0; i < 10; i++ {
        go producer(i, ch)
    }

    mass := 0.0
    for {
        wookie := <-ch
        mass += wookie.weight

        println("Wookie", wookie.wookie_id,
            "from", wookie.producer_id,
            "weighs", wookie.weight, "WookieTons")

        if mass > 1000.0 {
            break
        }
    }
    println("Consumed", mass, "WookieTons of food")
}

Figure 6.3: Producer/consumer example


func producer_wrapper(producer_id int, ch chan *Wookie, reg *Region) {
    producer(producer_id, ch, reg)
    DecrProtection(reg)
    RemoveRegion(reg)
}

func producer(producer_id int, ch chan *Wookie, reg *Region) {
    counter := 0
    for {
        counter++
        w := AllocFromRegion(reg, sizeof(Wookie))
        w.producer_id = producer_id
        w.wookie_id = counter
        w.weight = rand.Float64() + 42.0
        send w on ch
    }
}

func main() {
    reg1 := CreateRegion()
    ch := AllocFromRegion(reg1, sizeof(chan *Wookie))

    for i := 0; i < 10; i++ {
        IncrProtection(reg1)
        go producer_wrapper(i, ch, reg1)
    }

    mass := 0.0
    for {
        wookie := recv from ch
        mass += wookie.weight
        if mass > 1000.0 {
            break
        }
    }

    RemoveRegion(reg1)
}

Figure 6.4: Producer/consumer example with region annotations

6.4 Region-Aware Garbage Collection

Chapter 5 discussed our region-aware garbage collector. The solution discussed in the current chapter should be able to operate with that collector, so long as the threads that might use or manipulate a region being collected are paused for the duration of GC. Threads that cannot access any region being collected may be allowed to continue during GC. Our implemented system for handling go-routines does not use our region-aware collector, because that collector is still buggy and inefficient.

6.5 Evaluation

To test the effectiveness of our RBMM transformations in supporting Go's concurrency capabilities, we have modified our GCC plugin to transform code as discussed in this chapter. We look at both time and space metrics to gauge the performance of our modifications.

6.5.1 Methodology

As with our previous experiments, we rely primarily on benchmarks from Debian's “Computer Language Benchmarks Game,” provided with the gccgo compiler. The Go team has modified a subset of these tests to support concurrency via go-routines, and it is these tests on which we focus the following evaluation. Some of these tests we have not examined before: k-nucleotide-parallel, regex-dna-parallel, and threadring. The first two take an input file representing a DNA sequence and perform operations such as counting DNA sequences or looking for a particular sequence match. threadring spawns numerous go-routines which share data via communication between “adjacent” threads in a ring of threads. The shared data that is communicated and manipulated is merely an integer that is decremented, an inexpensive operation.

We validated the output of our modification (rbmm) against the output of the unmodified versions (plain) to ensure our results match those of the unmodified tests. During test execution, we disabled the benchmarks' output. This helps eliminate some OS overhead which has little to do with a test's actual performance. Each benchmark was run 10 times to obtain average high water mark (HWM) and elapsed time values. We also disabled our garbage collector, as we are less confident of its implementation. Instead, we relied on Go's existing garbage collector, as we did in Chapter 3.

The region count values are only available for the modified tests, and since these numbers are deterministic, they were gathered once rather than sampled 10 times. threadring intentionally terminates early by calling os.Exit(), and our statistical counters are only displayed upon return from main. For the threadring test, we therefore had to run the application in the GNU debugger (GDB) and set a breakpoint on the os.Exit call. This breakpoint allowed us to inspect the statistical counter representing the number of regions created. Traditionally, in sequentially executing Go programs, we would validate our system during development by looking at our statistical counters: the numbers of regions created and removed should differ by exactly 1 at the end of execution. The 1-region difference represents the global region, which we never remove. For tests using go-routines, we cannot rely on this delta. For instance, a program can terminate before all go-routines complete. Often, the parent go-routine and its child go-routine(s) are naturally synchronized via channels. The wrapper that wraps the function call declared in the go statement will try to remove regions after the child has responded to its parent via channel communication. If the parent's execution is quick enough, the program might terminate before the child go-routine and wrapper complete. This can result in a delta greater than 1 between the numbers of regions created and removed. Therefore, we cannot rely on this statistic for an accurate measurement of our system's correctness. However, the number of regions created does reflect that our system is doing something,

and this value can help us gauge the performance and potential overhead that our RBMM system might impose during program execution.

Our experiments were conducted on the same hardware as our previous two evaluations (see Chapter 3). That system is now running Ubuntu 13.04. We used GCC version 4.7.2 (which supports the Go version 1.0 specification) to compile both our modifications (our plug-in and runtime) and the tests. The GCC Go libraries used at runtime were provided by the same GCC 4.7.2 and are from Go version 1.0.1. Some of our modified runtime is based on code from GCC's libgo versions 4.7.1 and 4.7.0.

6.5.2 Results

Table 6.1 shows our performance results for this experiment. Our timing results show performance comparable with that of the unmodified tests. Even though we do incur some region-management overhead, we still perform well in most of these cases, and worse on only one test, spectral-norm-parallel. However, our tests were short-lived, and the timings might reflect some OS overhead that is not directly related to the tests themselves, such as the scheduling of other processes. We cannot say conclusively that we are better or worse than the unmodified version with respect to time, but we appear to be comparable for programs with short execution times.

Our memory performance is comparable as well. In most cases we increased the memory footprint of the process. We witnessed a 20.48 KB increase in the high water mark (HWM) for fannkuch-parallel, and a 409.60 KB increase for regex-dna-parallel. We also had a larger negative impact on memory, on the order of 4.01 MB for k-nucleotide and 2.00 MB for threadring. An increase in memory is not unheard of for an RBMM system, and can be the result of the region-bloat problem, where a region keeps unused data around until the program point where our static analysis found a place to safely reclaim the region. We note that adding our go-routine-capable RBMM system adds on average 83 KB to the binary file size of the benchmark.

Test                             HWM (MB)   Elapsed (seconds)   Regions   Executable Size (KB)
fannkuch-parallel (plain)           8.55        4.32               NA            84
fannkuch-parallel (rbmm)            8.57        4.28                6           168
k-nucleotide-parallel (plain)      39.02        1.23               NA            87
k-nucleotide-parallel (rbmm)       43.03        1.29               15           170
regex-dna-parallel (plain)         25.54        2.46               NA            76
regex-dna-parallel (rbmm)          25.94        2.47               11           157
spectral-norm-parallel (plain)      8.75        0.84               NA            73
spectral-norm-parallel (rbmm)       8.61        0.80              182           159
threadring (plain)                 13.69        6.18               NA            65
threadring (rbmm)                  15.69        6.15              506           146

Table 6.1: Go-routine performance with and without RBMM

Our main concern in these tests is k-nucleotide-parallel's relatively large HWM. Recall that the HWM value reflects the maximum resident set size used by the process. It seems that this program in particular does not reuse many regions, and thus increases the HWM more than it would if the regions were more frequently reused. Since our allocator never releases memory back to the OS, it makes sense that we might have a larger HWM than the unmodified test.

Ultimately, we find that our system does not change running times significantly, even though we have added concurrency safety measures (locks and compare-and-swap operations) to prevent data races in our runtime. We also note that we have not spent much time optimizing our modified system.

6.6 Summary

In this chapter we introduced a design to handle concurrency for RBMM in an imperative language with CSP-styled parallelism. Our design is based around the concept of the protection counter which we believe to be a unique

contribution of our research. This counter's semantics has been updated to represent both the number of functions that need the region at a later time and the number of concurrent processes that need a specific region alive. Removal of such a region cannot occur until the counter has reached zero, signifying that no thread needs the data allocated from that region any more. As with the rest of our work, our focus has been on the Go programming language, but with some modification the design can be adapted to fit other languages, especially those with a similar CSP-styled concurrency system.

Chapter 7 Related Work

From the front to the back as pages turn; reading is a very fresh way to learn.

Run-D.M.C.

Automatic memory management is an established research area of computer science that had its start in 1959 with the implementation of a garbage collector for the Lisp programming language [60]. It is well understood that managing memory is not a trivial task for a programmer, especially when the task is orthogonal to the problem to be solved. A language's runtime system and compiler can maintain a higher degree of safety if the (error-prone) human programmer is relieved of the burden of memory management. This chapter looks at various implementations of automatic memory management and related concepts which further the understanding of this thesis: the combining of RBMM and GC within an existing language.

7.1 Garbage Collection

Early in computing history it was realized that permitting the system to manage memory automatically eases the job of the programmer. In 1959, John McCarthy and his team at MIT’s Research Laboratory of Electronics introduced the concept of a garbage collector into the Lisp programming language. The original implementation was a mark-sweep collector, which works in two phases. When the system runs out of free memory, the collector begins its marking phase, in which it scans memory, starting at the base

system registers, and tracing pointers. All visited objects have their sign-bit set, which signifies that they are reachable via one of the base system registers. In the second phase, the collector sweeps the entire memory space, returning any unused (unmarked) memory to a freelist, which can be used for later allocations. Since GC is “entirely automatic” it is “more convenient for the programmer” to let the system “keep track of and erase unwanted lists.” The authors found that this system can be time-expensive, as “the reclamation process requires several seconds to execute” [60].

With more and more languages adopting GC as a means of automatic memory management, research into improving the performance of GC has greatly increased. In order to improve collection speeds, it is necessary to get a better understanding of the properties of allocated objects. The weak generational hypothesis of GC is the observation that the most frequently collected objects tend to be those most recently allocated (the youngest objects) [80]. One method of reducing the overhead of GC is to scan only a well-chosen subset of the entire memory space for collection. Generational garbage collectors exploit this observation. Ungar's generational scanner [80], for Berkeley Smalltalk, showed a notable improvement over past GC techniques, such as reference counting (a 33% improvement over deferred reference counting). Ungar observed that copying the surviving objects to a newer generation has lower overhead than scanning the dead/unreachable objects.

Lieberman and Hewitt introduced a real-time GC algorithm for partitioning the memory space into regions based on object lifetimes [58]. Regions are arranged in generations and are scanned at a frequency based on the lifetime of the objects they contain. Generations that contain regions of younger objects are scanned more frequently than generations containing regions of older objects. Any reachable objects in a region that is being collected (scavenged) are moved into a newer region. The memory for the scavenged region can then be reclaimed. Lieberman and Hewitt's algorithm makes use of the fact that in certain systems (e.g., Lisp) pointers more

commonly point backwards in time. In other words, objects are composed of pointers that were created earlier during program execution. The authors note that shorter-lived objects account for a higher proportion of the memory space, and therefore it is useful to scan them for garbage more frequently than older objects.

GC performance can be improved by offloading some analysis to compile time. This static analysis can be used to reduce the cost of executing collection cycles at runtime. As early as 1977, Barth investigated the use of static analysis to benefit GC [6]. Barth's approach groups allocations of objects into classes. Reference counting is then performed on the entire class and not per object, which reduces the overhead of per-object counting. This concept is similar to our region-aware garbage collector as described in Chapter 5. Our regions are composed of zones, with one zone per data type in the region. In contrast to Barth's work, our regions are reference counted (protection counted) as a whole. This count is not per object or per data type in the region as Barth's was. Instead, our counter reflects the fact that the region will be needed later and should not be removed by a callee function.

In 1988, Ruggieri and Murtagh introduced lifetime analysis [67]. This static analysis technique can obtain an upper bound on how long any particular object lives. With this knowledge, an object with an approximately bounded lifetime can be allocated at function entry and deallocated at function exit, reducing the need for GC and the associated overhead. Our region inference analysis is based on object lifetime. A region has an upper bound on its lifetime based on the life of the longest-living item in the region. However, we do not try to remove a region at function exit. Instead, we aim to reduce memory pressure and remove a region as soon as possible.

Barry Hayes showed that the weak generational hypothesis holds; young objects have short lifetimes [42]. However, he found that waiting longer to perform reclamation is not useful. His empirical studies suggested that

while lifetime is a reliable predictor of reclaimability for young objects, this is less so for older objects. Instead, as objects age into a particular generation (surviving multiple collections), another trigger for collection is more attractive. Scanning a potentially mature and heavily populated older generation is “unattractive” and can be a waste of computation time. Hayes' system makes use of his observation that clusters of objects allocated at the same time often have similar lifetimes. His system marks one object of such a cluster as a “key.” When that key object becomes unreachable, the cluster should be scanned. This method reduces the amount of time spent collecting generations of older objects.

Hicks used a static lifetime analysis to verify the correctness of object deallocation [46]. This information can be used to insert deallocation calls into the program at compile time. Hicks found that his implementation could deallocate 80-100% of the storage space allocated by the test programs he measured. The result applies to certain classes of programs and is not representative of all programs. Nevertheless, this finding strengthens the motivation for combining static analysis with a runtime garbage collector. This is similar to RBMM, whereby a static analysis decides at what program points a region of objects can be reclaimed.

Aside from lifetime, object connectivity is another property of dynamically allocated data that can be used to optimize GC. In 2003, Hirzel et al. introduced a garbage collector that is based on an object points-to relationship. This collector is based on information from their earlier finding: objects that are connected tend to die together [48]. The authors found that partitioning the memory space based on object connectivity reduces the need for the garbage collector to scan the entire program's memory space. Their results demonstrate a benefit over generational collectors with respect to mutator pause times and memory space utilization [47].

Khedker et al. demonstrated that a static analysis can be used to improve the time and space properties of GC [52]. The authors performed heap reference analysis to generate “access graphs.” These graphs are constructed from

“access paths” deduced statically at compile time, and form an abstracted view of the heap. Such information can be used to better understand the lifetime and connectivity of objects within the heap. Unlike previous works, the authors looked not just at allocation sites, but at every program point which contains an object reference. By combining this information with dataflow information, the authors built an abstract picture of the heap. This information can be used to nullify references that the compiler knows will not be used, which in turn may allow the garbage collector to collect the referenced objects earlier. The authors found this reduced heap size and led to the system spending less time garbage collecting. They also found that less data had to be copied during collections.

Controlling the size of the heap can also reduce the amount of time spent in GC. Arjom and Li showed this to be the case for their threshold algorithm [4]. These thresholds act as a dynamically changing limit on the heap size: a GC occurs once the threshold is met. If the collection reclaims a certain, pre-defined, amount of memory, then the threshold remains. Otherwise, the threshold is adjusted until the amount of reclaimed memory meets it. This approach was designed to reduce paging by increasing the GC frequency when the heap reaches a particular size, instead of allowing the heap to grow continually. The authors showed that their heap threshold improved runtimes, over those of the Boehm-Demers-Weiser conservative mark-sweep collector [8], on a series of Java programs.

Understanding the connectivity and liveness properties of heap data is not the only means of creating more efficient garbage collectors. Other design choices can heavily influence the amount of time spent recovering unused resources. For instance, the interaction between the mutator and garbage collector threads can be designed in such a way that system pause times are reduced. It is possible to create a garbage collector that exploits the parallelism of a system, permitting the collector to operate concurrently with the mutator threads. In addition, garbage collectors can be made parallel, whereby multiple collection threads execute simultaneously during

a collection cycle [51]. Halstead showed that Concurrent Multilisp, a concurrent implementation of Lisp, can make use of a multiprocessor system to perform parallel copying GC. This implementation places heap memory spaces (semispaces) on each processor, from which objects can be allocated. This eliminates contention between threads. The only point of synchronization occurs when the copying phase has completed and the memory spaces are swapped [39]. The collector is based on Baker's copying collector [5], which divides the memory space into semispaces (sometimes called from-space and to-space) [27]. Baker's algorithm works by dividing the memory into two semispaces and then coloring the objects in those spaces: white objects are those in the from-space, black objects are those that have been traced and copied, and grey objects are those that have been copied to to-space but have not been traced [5]. Baker mentions that, even though his copying collector compacts free space and helps to reduce memory fragmentation, it can require excessive memory for some programs. This is an inherent property of semispace copying collectors; they require at least enough memory to support worst-case copying, which occurs when all allocated objects survive a GC cycle and must be copied from from-space to to-space. A copying collector is an ideal choice for our region-aware garbage collector. The purpose of our collector is to reclaim memory from objects within a region. Since we want to reduce region size as much as possible, and remove unreachable objects (possibly in the middle of a region), the compaction that a copying collector provides is an ideal accompaniment to our RBMM design.

In 1993, Doligez and Leroy implemented a parallel GC for Concurrent Caml Light (a derivative of ML) [23]. In their system, each thread has its own heap. The heap of one thread can be undergoing GC while the other mutator threads are executing. Their system is a generational collector, whereby the younger generation is collected via an asynchronous copying algorithm and the older generation is collected via a concurrent mark-sweep

algorithm. The young generation exists per thread and the older generation is accessible by all threads. As is common in generational collectors, all newly allocated objects are produced from the heap associated with the young generation. In this system, pointers are restricted to only point from the private (young-generation) heaps into the shared (older-generation) heap.

The Garbage-First garbage collector is a concurrent and parallel collector that can meet a soft real-time goal for Java [21]. This collector operates on regions of allocations. Object marking occurs in multiple phases, some of which operate concurrently with the mutator threads and others of which require the mutator threads to be paused. The concurrent marking is made possible by using a snapshot of the program state, which is used to guide the marking phase. The actual collection (evacuation) occurs in parallel, with the mutator threads paused. Unlike our RBMM + GC design, which forms regions based on object connectivity, the regions of the Garbage-First collector are temporally based. Their design permits a subset of regions to be collected; the name Garbage-First refers to which regions the collector chooses to collect from.

7.2 Region-Based Memory Management

While automatic memory management relieves the programmer of the burden of manually managing memory, it can also be a computationally expensive process. As we have seen, understanding how the heap is organized can provide clues that can be used to manage memory more efficiently. Even understanding the use of manual memory managers can provide additional insight to benefit automatic memory management. Berger, Zorn, and McKinley provided a thorough investigation of manual memory management, looking at both traditional and custom general-purpose memory allocators, with some of the custom allocators being based on regions [7]. They also presented a new system, called a reap allocator, which acts as a combination of both general-purpose and region allocators. A reap

begins life as a region. When a request to free an item from the region occurs, the reclaimed memory for that item goes onto the region's freelist. Once this freelist is fully utilized by the region, the region continues allocating from its end. Berger, Zorn, and McKinley show that, on both memory consumption and execution time, the custom region allocators beat the Doug Lea allocator on three of five benchmarks, and the reap allocator on four of five benchmarks. These results are encouraging, even though the general-purpose and reap allocators can recover individual objects and the custom region-based allocators cannot. On the other hand, automatic RBMM systems, such as the one presented in this thesis, may not be able to replicate the performance of these manually tuned region allocators.

A vast amount of RBMM research has been conducted by Tofte, Talpin, Hallenberg, Birkedal, and Elsman with the practical aim of improving the memory system of the Standard ML programming language [77]. The initial motivation for creating an RBMM system was to reduce memory footprint and provide predictable runtimes for programs. This is in contrast to the garbage collector already provided by the language. In 1992, Talpin and Jouvelot presented a static analysis for inferring sets of referenced data (regions) in an extension of Core ML, and mentioned that such a system could be used for managing memory [75]. Soon after this research was published, Tofte and Talpin introduced RBMM for Standard ML [78]. Their seminal idea was to allocate data, based on lifetimes, into stacks of regions. They found that the maximum resident memory size tended to be better with the RBMM approach, while GC was often faster. The authors later proved the soundness of their method [79]. In 1998, Tofte and Birkedal published their region inference algorithm for ML and proved its soundness. The authors mentioned that, in certain cases, a region that has pointers into it can still be removed if the system can prove statically that the pointers are never live at the point of region reclamation. In contrast, a garbage collector that traces references must conservatively assume that all reachable objects are live [76]. Our RBMM system does not impose a stack approach to

regions.

While RBMM was popularized through its application to ML, its roots can also be found in work on imperative programming languages. In 1990, David Hanson published an article in Software: Practice and Experience about his arena memory management system [41]. This system is for C and groups allocations based on object lifetimes. An object's lifetime specifies which arena its memory can be allocated from. When an arena is deallocated, all objects in that arena are reclaimed at once, thereby attaining the fast reclamation that RBMM systems provide. Hanson found that his system was faster than a quick-fit heap-based allocator and within a factor of two of the speed of a stack-based allocator. Stack allocation introduces nearly zero runtime overhead, and can be thought of as a best case (in speed) for memory management.

Aiken, Fähndrich and Levien observed that a stackless RBMM system can use less memory than Tofte and Talpin's stack-based RBMM system [3]. The authors liberated region lifetimes from having to coincide with lexical scope, that is, from the stack discipline. Instead, their constraint-based static analysis transforms the input program by inserting region creation operations as late as possible and region reclamation operations as soon as possible.

Henglein, Makholm and Niss also explored implementing an RBMM system for ML based on reference counting [44]. This choice permits faster region reclamation than the stack-of-regions approach. In 2000, Makholm introduced a Tofte-Talpin-inspired RBMM system for a subset of the Prolog programming language [59]. In this study, Makholm found that RBMM is a reasonable alternative to GC on the benchmarks he measured. On three of five benchmarks, he found that RBMM is faster than, or equal to, the time overhead of a copying garbage collector. In the other two cases he found that his RBMM system caused the benchmarks to run 5-10% slower than GC. He suggested that this was the result of RBMM performing operations whose results were never used.

Hallenberg, Elsman and Tofte extended a stack-based RBMM system with a copying garbage collector using Cheney's algorithm [38]. Unlike our approach, their GC requires adding a one-word tag to each memory item. Their testing showed that adding tags increased memory usage by as much as 61%, and slowed their RBMM-only system by up to 30%. They found that adding RBMM to a GC system with tag words improved execution speed by up to 42%. This improvement is the result of being able to free some of the memory without the overhead of GC. Elsman investigated the combined RBMM and copying-collector system implemented for Standard ML [26, 38]. The combined system allows pointers between regions to exist. A dangling pointer can be created when a reference between two regions exists: when the newer region is popped off the region stack, the older region will contain a (dangling) pointer that references newly-reclaimed memory. The author introduced pointer safety, and proved the soundness of the implementation, by eliminating dangling pointers in the region typing system.

In 2002, Elsman introduced a partially tag-free garbage collector for a Tofte-and-Talpin-typed RBMM system implemented in Standard ML [25]. This concept is similar to our approach, as it uses a “Big Bag of Pages” solution for tagging, or applying type information to, a particular allocated item. With such a technique, type information does not have to be stored alongside every allocation; instead, the region an allocation belongs to can aid type inference. Fewer tags reduce the amount of memory necessary to represent the type information of an allocated item. The GC that we introduce can determine the type of every allocated variable: the type is known at compile time and encoded in the address, based on the region from which the allocation was produced.

In 1998, Christiansen and Velschow investigated adding region semantics to a Java-like language they designed called “RegJava” [13]. Their system was inspired by the Tofte approach for ML, thus RegJava uses stack-based region allocation. However, their system does not implement region inference

statically, but relies on programmer annotations for managing regions. In many cases, their system is able to improve memory use over that of GC. Their system also has predictable memory use, which is common for RBMM systems, since allocation and deallocation are explicit; deallocation is not dependent on a GC strategy that might take an unpredictable amount of time.

Predictable and real-time languages are necessary for mission-critical and/or embedded devices. In contrast to GC, RBMM is by its very nature deterministic with respect to memory allocation and deallocation, since the compiler can insert deallocation calls based on a static analysis. Knowing when a variable escapes a function is critical for calculating how long an object can live. Salagnac et al. [68] implemented a “fast” static analysis in aid of their goal of implementing RBMM in an embedded Java framework. This static analysis can be used to implement a semi-automated region inference algorithm [69]. Their system detects memory leaks and reports them to the developer, in order to prevent what the authors call the “region explosion syndrome.” The latter is a problem of RBMM systems whereby all objects are allocated from a single region that lives for the lifetime of program execution. This is analogous to a memory leak, and has also been noted in the Tofte approach [76]. We call this the region bloat problem. The authors found that their benchmarks produce regions with short lifetimes. This is ideal, since memory is constantly being recycled in a deterministic manner. The contrast is longer-lived regions, which can occupy a large portion of the memory space and become candidates for GC (along with the execution cycles that GC requires). Salagnac et al. pointed out that “for most programming patterns” their memory consumption was on par with that of GC. However, they did find a class of programs which performed worse and acted as if they had memory leaks, exposing a limitation of their analysis.

Another Java implementation using RBMM is that of Cherem and Rugina [10]. Like that of Salagnac et al., Cherem and Rugina's approach is based on a

points-to analysis, and also suffers from the memory leak problem mentioned earlier [69]. The authors found that short-lived regions benefit memory utilization, and that many allocations could be placed on the program's stack, which reduces the need for heap allocation.

Boyapati, Salcianu, Beebee, and Rinard [9] also investigated implementing RBMM in a real-time Java framework capable of handling multi-threaded applications. Their approach combines regions and ownership types. All objects are allocated from a region and have an owner (either another object or a region). The ownership types associated with objects are used to generate an ownership hierarchy-tree during analysis, whereby an object can only be accessed via its owner. Their system, like the Gerakios et al. approach [30], forms a hierarchy of regions and is safe from dangling pointers. This system also allows multiple processes to share region data.

Chin et al. [12] also implemented a system of region inference for Java which is based on the Real-Time Java specification (RTSJ). Their approach uses a stack of regions and never creates dangling references. The authors compared their fully automated RBMM system with a manually annotated region system, and found that the contrasting manual approach to memory management “may represent a sizeable mental effort for a programmer with only a region type checker.” For the set of applications measured, their automatically annotated system performs just as well as the manually annotated versions of the applications.

The Real-Time Java specification defines a lexically scoped region system to reduce the amount of time spent garbage collecting and to improve system predictability [40]. This system relies on programmer annotations to define the scopes within a program. Scopes impose a lifetime on objects allocated within them. As with regions, a scope cannot be freed until all of its objects are no longer reachable. RTSJ scopes rely on reference counting to prevent premature scope reclamation. Dangling pointers are eliminated by preventing older objects from referencing objects with shorter lifetimes. Nested scopes increase runtime due to the system checking references between

the scopes; reference checks are expensive. Hamza and Counsell found that “most of the benchmark applications used less heap space when using regions rather than garbage collection.” The authors also pointed out that reference counting imposes additional time overhead, and they found a higher space overhead when allocating short-lived objects inside longer-lived scopes. This is the same as the region bloat problem discussed in Chapter 2.

Stoutamire [73] researched a similar RBMM concept for the Sather programming language to improve memory locality. His model introduced the concept of zones (regions) which map objects and threads onto hardware. Zones are organized in a tree structure and can be individually garbage collected. Stoutamire's results from a partial implementation of his zone model favor zones for speed, in most cases, over a non-zone model. The author noted that more study needs to be conducted on his model to conclusively say that a zone-based system is ideal for practical applications. To clarify, Stoutamire's zone model is different from the design we present in this thesis. First, our system is not designed to map directly to hardware. Second, our zone design is used to partition a region into memory per data type. Stoutamire's zones can hold any data type and contain two different memory allocators: one for objects that do not contain pointers and another for objects that do. In contrast, our allocator allocates from the zone in a region for the requested data type. Our zone model is only used when we compile our system to use our region-aware garbage collector.

Gay and Aiken [28] found that their manually annotated RBMM system for C, using their C@ library, can have performance comparable to manual memory management in both space and time, while also being better (in many cases) than a conservative garbage collector. Their approach relies on reference counting to prevent premature region reclamation. This count reflects the number of other regions and variables that point into the region in question. The authors found that maintaining these counts was expensive and could comprise anywhere from “negligible to 17% of runtime.”

The authors later improved upon C@ and its reference counting overhead in their subsequent research on RC (a region compiler replacing C@) [29]. In contrast, our protection counter is a per-region counter. Our system prevents premature region removal by incrementing this counter only when a function call is made and the caller needs a particular region at a later time.

The safe dialect of C, Cyclone, implements a semi-automatic RBMM system, requiring programmers of multi-module programs (programs consisting of multiple translation units) to manually insert some region annotations [37]. Cyclone does provide automatic default annotations. The system also contains a garbage-collected region for managing the reclamation of memory allocated by traditional manual allocations (e.g., malloc). The system's region and control flow analysis is designed such that a dangling pointer access is a compile-time error. All of their results had slightly longer running times than the C versions. This finding is expected, since adding bounds and null-pointer checks produces additional runtime overhead. This overhead was not significant for non-compute-intensive applications (e.g., an HTTP client); however, it was significant (from 2.07 to 2.85 times slower) for some compute-intensive benchmarks.

Lattner and Adve [55, 56] presented an automatic C-based implementation of RBMM for their LLVM system. They found that 25 of the 27 benchmarks they tested ran faster using RBMM, or “pooled allocation”, compared to using malloc.

The logic programming language Mercury was given an automated RBMM system, in contrast to its existing garbage collector [64]. The system's stackless design was inspired by Cherem and Rugina's Java RBMM implementation, to permit shorter-lived regions [63]. On a set of benchmarks the system ran 25% faster using RBMM than with the Boehm GC [65].

Parallel processes and threads can complicate how data is accessed, and consequently pose challenges for RBMM. When there is only one process, the order in which a program accesses memory is trivial and predictable.

Therefore, a compiler can analyze and safely deduce when memory accesses can occur. With parallel computations, there is not necessarily a deterministic means of knowing when a process might access the same piece of data that another process is using. Here the term “access” refers to memory reads and writes. Gay and Aiken [28] mention that implementing RBMM in a parallel system should be trivial, as locking (to prevent concurrent data mutation) only needs to occur during the region creation and removal operations. Seidl and Vojdani [70] use region analysis, not for memory management, but to enhance an interprocedural static analyzer, GobLint. The latter detects data races in C programs, and has been used to detect data races in portions of the Linux kernel.

Gerakios, Papaspyrou, and Sagonas discuss the capability of using a tree-based hierarchy of regions to facilitate concurrent programming and parallelization for Cyclone. In their system each region contains locks, which are common in traditional non-region-based parallel programming [30, 31]. If a process holds a lock on a memory resource, or region, no other competing process can access that data until the lock is released. The results of their hierarchical region locking show a mix in favor of both the unmodified and modified executables. The runtimes nearly doubled in three of the five benchmarks measured. The authors attribute the penalty in one of their benchmarks to using multiple locks on a single data structure and its elements. Ultimately, there is room for improvement; the key point is that they implemented a safe parallel system using RBMM.

Lattner and Adve [55] mention parallelization for their C/LLVM implementation of RBMM. They note that if the compiler can determine that two data structures are disjoint and have no shared references, it is possible to parallelize writes to these structures.

Chapter 8 Conclusion

In this chapter we conclude our exploration into automatic memory management by looking at some future work that could augment our Go-RBMM implementation. First we look at optimizations that could provide a more time- and space-efficient system than the one we have already presented. Then we review what this thesis has covered, and the work we have accomplished.

8.1 Summary

This thesis has explored the capabilities of RBMM and a novel combination of RBMM with GC for the Go programming language. We have discussed the language syntax and the concepts of memory management, and provided an exploration into using RBMM as a method of automatic memory management. The system we have developed requires no programmer code annotations aside from what the language already provides (new and make). Thus, we maintain the simplicity for the programmer that Go provides; the programmer is saved from spending much time reasoning (and possibly making poor decisions) about memory management. We have shown that we can extend an existing language to support RBMM via a GCC plugin and a modified runtime library, which can be linked with the Go program being compiled.

In Chapter 3 we introduced RBMM concepts and our design. We based our presentation on a simplified Go/GIMPLE syntax large enough to cover the interesting aspects of Go. Our goal was to use an RBMM system to

reduce time and space overhead compared to Go's existing GC. Our evaluation suggests that using RBMM is comparable to running the unmodified Go benchmarks with their garbage collector. For one case in particular, binary-tree, we significantly improve execution time by using regions to reduce the time spent in GC. As expected, RBMM can increase the size of the binary, as a result of our code transformations inserting region operations into the program at compile time. However, we have seen that region maintenance at runtime produces little overhead. In fact, it seems that numerous small regions can be beneficial to program execution time. This can limit a program's memory usage, since regions with fewer items have a higher probability of being collected earlier than regions with many items.

Chapter 4 introduced a formal semantics for reasoning about memory management correctness. We used these semantics to prove the equivalence of our RBMM system to the original Go system.

Although we are not the first to combine RBMM and GC, as far as we know we are the first to use a region-specific garbage collector that operates directly on regions [18]. Our hope was to eliminate the region-bloat problem by designing a region-aware copying garbage collector that removes unreachable items from within regions. We presented such a design, and a partial implementation, in Chapter 5. With further improvement, our collector can be extended to collect subsets of a program's memory space, that is, to act as an incremental collector operating on a subset of the program's regions. This modification should reduce the runtime overhead of our region-aware garbage collector. We also provided a design to support reclamation of sequences of items within Go slices.

We concluded our exploration by introducing and implementing a design to support RBMM within the CSP-style parallel context of Go. We were able to show comparable time and space measurements on the short-lived benchmarks we evaluated.

Memory management provides a large research domain filled with exciting areas to explore. The ideas we have presented in this thesis can be

applied not only to Go, but to other languages as well. We hope that future researchers can benefit from our research and will continue to explore this topic area in other unique ways.

8.2 Future Work

We have discussed an implementation of RBMM for the Go programming language. We now look at a few optimizations that could improve the system we have designed, but were left out due to implementation complexity and time constraints.

8.2.1 Optimizing our Garbage Collector

Garbage collectors are notoriously difficult to program, and we can attest to the fact that they are complicated and time-consuming to debug. In Section 2.3.7 we discuss the complications of creating a memory management system. Further complicating matters is the fact that our region-aware GC system is a combination of two automatic memory management techniques, RBMM and GC. We have left many optimizations out of our system in the hope of producing a more stable environment, which we can later optimize.

Zone Header Optimization

One optimization that we expect to be beneficial is to improve how our runtime system handles the creation of zone headers. Currently, when our runtime system creates a new region, it allocates enough room in that region to store a zone header for every type in the program. This is overly conservative, but it made our analysis quite simple: every time our analysis discovers a memory allocation, it can simply specify an index into the region's zone header table to allocate from, and has no need to resolve where the zone header resides at runtime.

Since all regions have all zone types, the index will always be the same irrespective of the region used. Our analysis cannot always determine at compile-time which region will be allocated from; letting all regions have headers for all types simplifies things considerably. Unfortunately, this also means that many regions will have unused zone headers, which can take up quite a bit of memory. A better solution would be to have a mapping from type information to zone header. This table would have to be unique per region and used at runtime. The lookup should not be computationally expensive, and can be keyed on the type information, which we know at compile-time for every allocation. Another possible solution to the aforementioned zone header resolution problem is to make the region's zone header table hold pointers to zone headers that are lazily instantiated. This pointer table would have a pointer for each possible zone type that can exist in the program (similar to what we do now, but using pointers instead of populated objects). Instead of allocating memory for the entire zone header up front, this optimization would only allocate the header's memory when the program makes an allocation request for a type and the corresponding pointer in the region's zone header table is NULL. This approach still suffers from having unused data (the unused zone pointers), but the cost is reduced to one word per unused zone type.
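To make the lazy variant concrete, the following sketch shows one possible shape for it. The ZoneHeader and Region types and the zoneFor helper are our own illustrative assumptions; our actual runtime system is not structured exactly this way.

type ZoneHeader struct {
    free  uintptr // bump pointer into the zone's current memory
    limit uintptr // end of the zone's current memory
}

type Region struct {
    // One slot per possible zone type in the program; nil until the
    // first allocation of that type is made in this region.
    zones []*ZoneHeader
}

// zoneFor returns the zone header for a compile-time zone index,
// creating it on first use. An unused zone costs only one nil pointer.
func (r *Region) zoneFor(zoneID int) *ZoneHeader {
    if r.zones[zoneID] == nil {
        r.zones[zoneID] = new(ZoneHeader)
    }
    return r.zones[zoneID]
}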

Incremental Collection in Regions

The algorithms discussed in Chapter 5 describe a garbage collector that is capable of collecting from a subset of all regions. Such an incremental design can reduce collection times, resulting in a faster running program. Incremental collection can work because an object's address can be used to determine which region it belongs to. Therefore, a region is capable of knowing which other regions its objects point into. With this information, a garbage collector can be made to scan only a subset of the regions in the process. For simplicity, our current implementation garbage collects all regions.

We believe that our collector can be optimized to scan only those regions that hold pointers (directly or indirectly) to data in the regions being collected. Optimal GC performance requires a highly tuned collector, and we have not spent much research time exploring when to run our collector. We have not considered it a major focus of our work, and it would take a considerable amount of testing and research to find the optimal choice of when and what to collect.
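To illustrate why recovering a region from an object address can be cheap, the sketch below assumes that regions are carved from blocks aligned to a fixed power-of-two size. The regionShift constant and blockOwner table are hypothetical (the Region type is the one from the previous sketch); our implementation does not necessarily use this exact layout.

// Assumed: every region acquires 1 MiB-aligned blocks, and each block
// is recorded in blockOwner when the region acquires it.
const regionShift = 20

var blockOwner = make(map[uintptr]*Region)

// regionOf recovers the owning region of any object by truncating the
// object's address to its enclosing block.
func regionOf(addr uintptr) *Region {
    return blockOwner[addr>>regionShift]
}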

Slice Sequence Garbage Collection

When our collector visits a slice item, it preserves the entire slice and does not attempt the more complicated task of preserving series of items from the slice, as described in Section 5.5. Handling slices has been one of the more challenging aspects of writing a region-aware garbage collector, and we were not confident optimizing things any further without having a stable baseline to work against. The obvious optimization here would be to allow sub-sequences of a slice's items to be collected. Whether this proves to be much of a benefit would be program specific.

8.2.2 Applying our Research to Other Languages

Our implementation exists as a plugin for GCC with an accompanying object file (runtime system). Our plugin's analysis and transformation passes operate on GCC's middle-end intermediate representation languages (GIMPLE and RTL). Since all front-end language parsers for GCC produce a parse tree that GCC converts to GIMPLE, our system can be modified to handle other languages that GCC supports. Of course, the runtime system will also need to be modified to work specifically with the source language's runtime system.

Appendix A

Go Programming Language

Simplicity is a great virtue but it requires hard work to achieve it and education to appreciate it. And to make matters worse: complexity sells better.

Edsger W. Dijkstra

This chapter introduces the Go programming language, which is used throughout this thesis as a vehicle for exploring the concepts of memory management. The language is not covered in its entirety; however, enough background is provided in this chapter so that the examples and language-specific topics discussed later make sense. This thesis uses code that is compatible with version 1 of the language. We assume that the reader is familiar with the syntax of languages such as C/C++. Our RBMM proof-of-concept does not transform Go code directly; instead it transforms GCC's intermediate representation, GIMPLE, of a Go program. This abstracts away many of the complexities of the higher-level language and presents our analysis with a three-address-code representation in static single assignment form that our system can analyze and transform. Section 3.2 introduces a distilled syntax that we will use throughout the remainder of this thesis. This syntax covers the features of Go that are interesting from the point of view of memory management. The reader who is already familiar with Go should skip to Chapter 2.

A.1 Introduction

Go is an imperatively-styled, statically typed, general purpose, system-level programming language that provides concurrency primitives, interfaces, and higher-order functions. The language was created at Google around 2007, and released to the public in November 2009 under an open source license [74]. It is a procedural language, similar to C, which facilitates quick development and eases analysis [36]. Its strong, static type system does not require cumbersome code annotations, and allows programmers to create their own data types and interfaces. The Go runtime system utilizes a garbage collector to aid memory safety. The collector relieves programmers of the burdensome task of managing the reclamation of memory, a task which, done improperly through mistakes such as memory leaks, would compromise program safety. Programs written in Go can utilize simple and safe parallelization concepts to enhance application scalability. Functions are first-class objects, which can be used for exception handling, or for post-function cleanup (e.g., closing file handles if a function returns early). The Go language comes with a suite of libraries providing programmers with a variety of utilities, such as cryptographic, networking, and http/web related functions.

A.2 Go Syntax

The syntax of Go is very similar to that of traditional imperative languages like C and C++. The syntax is unambiguous and unobtrusive, while still providing enough information for the compiler to perform strong type checking of the source code.

A.2.1 Declarations and Assignments

The variable and function declaration syntax is similar to C's, but with the type and identifier names reversed. This order can ease analysis for compilers and other parsing utilities, as well as remove declaration ambiguity. For instance, the following block of code illustrates a basic function in Go.

func AddTheUniverse(x int) int {
    universe := 42
    return x + universe
}

This example declares a function AddTheUniverse that has one formal parameter, x, and returns an integer value. If the declaration is read aloud left to right, the meaning becomes clear to the reader: "Function AddTheUniverse takes x as an integer for input and returns an integer." In the body of this function we declare a variable universe which has the value of 42. This is a mutable variable. We could also have declared universe as var universe int. Since the compiler can infer the type of the variable universe, we can use the shorter syntax, :=, which combines variable declaration and assignment. The latter is useful for declaring temporary variables that only require scope within a block of a function, such as iterators in loop constructs:

for i := 0; i < 100; i++ {
    // Do stuff
}

In the latter example, i only has scope within the loop; use of i outside the loop will result in a compile-time error. To further illustrate the simplicity of the declaration syntax, consider a more complicated case:

var x []*Thing

Again, if this line is read left-to-right the meaning becomes clear: "variable x is a slice containing pointer-to-Thing objects."

A.3 Modules and Imports

Go comes with a large suite of libraries, also called modules. There are no source include files; instead, the programmer expresses that they want a module included in their program via the import syntax. This speeds up compilation, since there is no need to transitively parse a header file and all of the header file's included headers.

The compiler only permits functions, types, and global variables declared public to be accessible via import. To declare something public, the global variable identifier, type, or function name must begin with a capitalized first character; identifiers beginning with a lower-case character are private to their module.
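The following snippet illustrates the rule; the package and identifier names are our own examples.

package zoo

var Count int  // public: importers refer to it as zoo.Count
var count int  // private: visible only inside package zoo

func Feed() {} // public
func nap() {}  // private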

Modules are used in the program by prefixing the global variable, type, or function that the module provides with the module name.

package main

import "os"

func main() {
    file, err := os.Open("myfile.txt")
    _, _ = file, err // quiet Go's unused-variable check in this snippet
}

This example imports the os module and calls the public routine it provides, Open. Modules are created by using the package statement as the preamble of the source file. In the case of the os module, all of the source files that make up the module begin with the statement package os. The main package is used for creating an executable rather than a library/module.

A.4 Types

A.4.1 Common Primitives

Go has a set of primitive data types to represent boolean, integer, floating point, and complex number information. Except for bools, these types can be suffixed with their bit size (8, 16, 32, or 64). For instance, int8 represents a 1-byte integer, uint16 represents a 16-bit unsigned integer, float64 represents a double-width (64-bit) floating point variable, and complex128 represents a complex number with real and imaginary parts consisting of 64 bits each. There is no char data type to represent a 1-byte generic value; that is what the byte data type is for. While the language developers avoided ambiguity between float and double sizes, by not defining a double type and instead forcing the programmer to use a more meaningful type name that includes the bit size (float32 or float64), they did not avoid this ambiguity for integers. While integer types can be defined with a size suffix, as described above, they can also exist without one, as int and uint. These types are either 32 or 64 bits in size, based on the machine's architecture. The specification does not explicitly say that their lengths depend on the architecture, only that an int is the same size as a uint and that the latter is either 32 or 64 bits in size [34]. The uintptr type represents an unsigned integer long enough to store a pointer (address).
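A few declarations illustrating these sized types (the variable names are our own):

var small int8 = -128     // 1-byte signed integer
var port uint16 = 8080    // 16-bit unsigned integer
var ratio float64 = 0.25  // 64-bit floating point
var z complex128 = 3 + 4i // 64-bit real and imaginary parts
var addr uintptr          // wide enough to hold an address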

A.4.2 Pointers

The Go language includes pointer variables but does not support pointer arithmetic. Pointers can alias the same piece of data; however, a pointer's value cannot be manipulated with algebraic operators. This restriction provides type safety, preventing a pointer from accessing data that might be of another type. Manipulating data that is not of the type the pointer was declared with can crash programs or produce incorrect results.

x := 42
y := &x
println("The value of what x points to is: ", *y)

The above example declares x as a variable holding the integer value 42. It then declares y as a pointer to x. The * operator is used to dereference y and obtain the value that it points to, 42. In C the programmer could manipulate what y points to via pointer arithmetic such as y = y + 1;. This is not permitted in Go, as it would compromise type safety.

A.5 User Defined Types

The type keyword is used for creating structures, creating interfaces, or aliasing a type. Go does not provide classes or inheritance; rather, the programmer can simply create structures and utilize interfaces. If a structure type has all of the methods defined by an interface, then that structure type is said to support that interface. To begin with, the type keyword can be used to redefine a type.

type point [2]int

The example above introduces a type synonym, point, which is an array of two integers.

A.5.1 Structure Types

Structures consist of just field declarations. Methods for the type are declared and defined as their own functions, and are not mentioned in the body of the struct definition. A leading capital letter in user-defined type names, method names, field names, and global variables denotes public access to the data. This is used to support information hiding in a module. There is only one selector, ".", used for accessing fields.

Unlike C, the indirection operator "->" is not used to reference a value through a pointer; instead "." is always used.

var x Thing
var y *Thing

y = &x
total := x.someField + y.someField

In this case the someField field of x and y is obtained. The compiler knows whether or not the variables are pointers and will automatically insert the dereference operation required by y's type. This relieves the programmer from having to remember such information. The following example defines a public Cat type that has an age field, which is private.

type Cat struct {
    age int // private age field
}

We extend the capability of our Cat type by defining a method for it named MatingCall:

func (c *Cat) MatingCall() {
    println("Purr")
}

func main() {
    garfield := new(Cat)
    var morris Cat

    garfield.MatingCall()
    morris.MatingCall()
}

This MatingCall method can be called on both pointer and non-pointer instances of the Cat type. When the method is invoked, it receives a pointer to the Cat instance, called c.

The compiler is smart enough to know that even though morris is not a pointer, its address will still be passed to the MatingCall method when the morris instance invokes it. Similarly, if the method were declared as func (c Cat) MatingCall(), with c not being of pointer type, the compiler would produce code having the same effect as a call-by-value function call.

A.5.2 Interfaces

Interfaces provide a duck-typed syntax allowing multiple types to be treated as a more generic type. An interface declaration just specifies the function prototypes that a member of the interface must fulfill. If a type has all of the methods defined by an interface, then that type is a member of the interface. In other words, if it looks like a duck and walks like a duck, it is probably a duck.

type Cat struct {
    age int // private age field
}

type Duck struct {
    name string // private name field
}

type Animal interface {
    MatingCall()
}

func (d *Duck) MatingCall() {
    println("Quack!")
}

func (c *Cat) MatingCall() {
    println("Purrr!")
}

func Wild(a Animal) {
    a.MatingCall()
}

func main() {
    var thing Animal

    thing = new(Cat)
    Wild(thing)

    thing = new(Duck)
    Wild(thing)
}

Both the Cat and Duck types are members of the Animal interface. Therefore, functions and variables can refer generically to an Animal instance instead of a specific type. The Wild function operates on any object that satisfies the Animal interface, calling the interface-defined MatingCall method on that object. The result of running this example is "Purrr!" followed by "Quack!". The empty interface, interface{}, is the most generic means of working with types in Go. All primitive and user-defined types fulfill the empty interface type, and this is how the println built-in routine is defined: it can accept any type it is passed; however, the compiler will reject types it does not know how to print. The term built-in refers to a language-provided programming feature, such as a datatype or routine.
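As a small illustration of the empty interface, the describe function below (our own example, not a standard routine) uses a type switch to recover the dynamic type of its argument:

func describe(v interface{}) {
    switch v.(type) { // inspect the dynamic type held by v
    case int:
        println("an int")
    case string:
        println("a string")
    default:
        println("something else")
    }
}

func main() {
    describe(42)      // prints "an int"
    describe("hello") // prints "a string"
    describe(3.14)    // prints "something else"
}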

A.6 Container Types

Go provides a series of built-in container types for use by the programmer. This section discusses these types.

A.6.1 Arrays

An array represents a series of objects or primitives in Go. Interfaces, including the empty interface, can make up the elements of an array. Arrays are declared statically and their length must be a constant.

Go performs bounds checking on arrays at both compile time and run time. Since Go is call-by-value, arrays can impart quite a memory overhead: passing an array to a function requires the program to copy the entire contents of the array to the formal parameter in the callee.

func processArray(vals [55]int) {
    // Use the vals array
}

func main() {
    var myarray [55]int
    processArray(myarray)
}

In this example 55 integer values are copied from the caller, main, to the callee, processArray. Any mutations to the elements of the array by the callee will never be seen by the caller. For somebody familiar with C, where arrays are passed by address, this may come as a surprise. Passing an entire array by value is both computationally and memory expensive, since the CPU must duplicate the contents of the array. This can be avoided by passing a pointer to the array, or by using slices, which are passed by reference.
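For instance, this variant of the example (our own) passes the array's address, so only one word is copied and mutations are visible to the caller:

func processArray(vals *[55]int) {
    vals[3] = 7 // indexing through the pointer mutates the caller's array
}

func main() {
    var myarray [55]int
    processArray(&myarray) // only an address is copied
    println(myarray[3])    // prints 7
}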

A.6.2 Slices

Unlike arrays, slices are instantiated dynamically, either through the make keyword or by converting an array to a slice. A slice can be thought of as a vector: an array which can shrink or grow at runtime.

func processSlice(vals []int) {
    // Use the vals slice
}

func main() {
    myslice := make([]int, 55)
    processSlice(myslice)
}

This example declares a slice containing 55 integer elements. Since slices are passed by reference, their associated overhead is low compared to the copy-by-value that an array would cause. Portions of slices are still considered a slice and are specified via the ":" operator, where the value to the left of the operator gives the index of the first element in the slice, and the value to its right gives the index of the first element after the slice. In other words, the lower bound is inclusive, but the upper bound is exclusive: s[m:n] represents the sub-sequence of the values in array or slice s starting at index m and ending at index n-1.

var myarray [55]int
myslice := make([]int, 55)

firstFive := myslice[:5]
skipFirstSix := myslice[6:]
middle := myslice[10:20]
arrayToSlice := myarray[:]

In this example, all assignments create slices. Passing empty operands to the ":" operator creates a slice covering all of the array's contents. Omitting the lower bound defaults it to the first element (index 0), and omitting the upper bound defaults it to the length of the array, so the slice extends through the last element. Updates to the slice contents will update the array from which the slice was derived.

A.6.3 Maps

Maps are another built-in data container type provided by Go; they map keys to values. The keys and values can be of any type: primitive, user-defined type, or interface. Like slices, maps are automatically passed by reference, so there is no need to prefix a map-type argument with the address-of operator "&".

var mymap map[int]string
greetings := map[int]string{1: "Hello", 2: "Yo"}
colors := make(map[float32]string, 100)

The first line in the example above declares a map from integers to strings: the map has keys of type int and values of type string. The second line declares and defines a map, filling it in with two key-value pairs. The third line shows the creation of a map using the make allocator, producing an initial map that can hold 100 keys and their values. Except for the map in the first line, the runtime system will automatically extend a map if a value is added under a key that was not previously stored in the map. This is because the compiler will actually create both greetings and colors dynamically, even though the programmer never created greetings via make. Since greetings is initialized, it can be extended. On the other hand, mymap is never dynamically allocated: the compiler creates it as a nil map, so it holds no initial data and cannot be extended. If mymap is set to alias greetings via the assignment mymap = greetings, then it can be extended following that assignment.
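Continuing the example above, extension behaves as follows (our own illustration):

greetings[3] = "Howdy" // fine: greetings was initialized, so it grows
// mymap[1] = "Oops"   // would panic: assignment to entry in nil map
mymap = greetings      // mymap now aliases greetings ...
mymap[4] = "Aloha"     // ... and so it can be extended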

A.7 Memory Management

This section covers the dynamic memory allocation routines provided by the Go language.

A.7.1 New and Make Allocators

Go provides two keywords for allocating dynamic memory (similar to malloc in C), namely new and make. To create a pointer to allocated memory representing a particular data type, the new keyword is used.

mypointer := new(Thing)

The above declaration creates a pointer to an allocated item of type Thing. Go zero-initializes all allocations, so there is no need to clear the returned allocated data. The make keyword is an allocation built-in function that is used for dynamically allocating instances of slices, maps, or channels. We will discuss channels in Section A.11. The make routine has three parameters. The first parameter specifies the data type to create, the second specifies a length, the number of elements the value initially contains, and the third specifies a capacity, reserving additional room for elements. The length parameter is required for slices, and the capacity is optional in all cases; the capacity does not limit the bound of the data type. However, specifying a capacity can reduce the overhead of expanding a slice or map at a later time. Both the length and capacity parameters are optional for channel types. Values of container types that are allocated with make can grow past their initial capacity to hold more data.
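For example (our own illustration):

s := make([]int, 10, 100) // slice: length 10, room for 100 elements
m := make(map[string]int) // an empty, extendable map
c := make(chan int, 8)    // a channel buffering up to 8 values

println(len(s), cap(s))   // prints 10 100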

Garbage Collection

The Go 1.0 release provides a stop-the-world, parallel, mark-sweep garbage-collected runtime environment. Programmers do not need to worry about deallocating memory after calling new or make, since the language implementation automatically releases memory that is no longer used by the program. It should be mentioned that the Go development team is working on a more efficient collector, as the one provided by the 1.0 release is relatively basic.

A.8 Control Flow

Go provides a variety of mechanisms to alter the control flow of a program. Here we list the standard mechanisms. Section A.10 discusses a Go-specific escape mechanism.

A.8.1 If-Then-Else

The if-then-else constructs in Go are similar to C's.

if x == 42 {
    println("This sentence is not exciting!")
} else if x == 43 {
    println("This sentence is false")
} else {
    println("This is a sentence")
}

A.8.2 Switch

Switch statements in Go provide a way to branch across multiple conditions. Unlike C, the case statements in a switch can be expressions. The body of the first case-expression that evaluates to true is executed; if multiple case-expressions satisfy the switch statement, only the first is executed.

switch x := getUniverseValue(); {
case x > 42:
    println("The universe is expanding")
case x < 42:
    println("The universe is contracting")
case x == 42:
    println("Just right!")
    fallthrough
default:

    println("The universe is a hologram.")
}

Also unlike C, cases in a switch do not require a break: by default, cases do not fall through. Instead, if fall-through behaviour is desired, the programmer must use the fallthrough keyword.

A.8.3 Loops

The only looping construct, aside from using goto or recursion, is the for loop. The range keyword produces two values, an index number and a value. This provides a convenient way of iterating across the values of the built-in container types (slices and maps).

for i, v := range myslice {
    // Do stuff
}

The i variable is the iteration number, which can be used as an index into the slice, and the v variable is a copy of the value contained in the slice at index i. For loops in Go can also behave like while loops in C.

for i < 1000 {
    // Perform magic
}

A for loop in Go can also take the traditional C form, with initialization, termination condition, and reinitialization:

for i := 0; i < 1000; i++ {
    // Perform magic
}

A for statement without a terminating condition acts as an infinite loop.

As in C, a break statement escapes the enclosing loop, and a continue statement returns execution to the beginning of the enclosing loop.
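The following loop, our own example, shows both statements:

for i := 0; ; i++ { // no termination condition: loops until break
    if i%2 == 0 {
        continue // skip the rest of this iteration
    }
    if i > 9 {
        break // escape the loop entirely
    }
    println(i) // prints 1 3 5 7 9
}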

A.9 Higher-Order Functions

Functions in Go are first-class objects. They can be passed as arguments to functions, stored in variables (including maps, slices, and arrays), and returned from functions.

func calculate(f func(a, b int) int, vals []int) int {
    total := 0
    for i := 0; i < len(vals); i++ {
        total = f(total, vals[i])
    }

    return total
}

func main() {
    add := func(a, b int) int { return a + b }
    vals := []int{200, 300, 400}
    calculate(add, vals)
}

In the previous example, the calculate function takes another function as input. That input function takes two int values (a, b) and returns an int. The function main creates a function add which sums two values, and constructs a slice containing three values. It then calls the calculate routine to compute a running sum over the values in the slice using the function it was passed, add. Anonymous functions with state (closures) can be created as well. Examples of such appear in the next section, on the defer statement.
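As a brief taste of closures before then, the counter below (our own example) returns an anonymous function that captures and updates the variable n:

func makeCounter() func() int {
    n := 0 // captured by the returned function
    return func() int {
        n++
        return n
    }
}

func main() {
    next := makeCounter()
    println(next(), next(), next()) // prints 1 2 3
}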

A.10 Defer

A defer statement registers a closure that is executed just before the function in which it is defined returns, but after the return expression has been evaluated. Deferred statements are always executed, even if the function returns via an explicit return statement rather than by reaching the end of the function. These statements are similar to C++ destructors or Java finalizers, and permit the program to reclaim resources when an object goes out of scope. However, in Go, defer operates at the function level, not the object level. This feature can be handy in cases such as file input/output where a function opens a handle, but returns early without closing the handle. Deferred statements can also modify the function's return value.

func loadFile(name string) {
    file, err := os.Open(name)
    if err != nil {
        os.Exit(-1)
    }

    defer file.Close()

    if maybeFail() {
        return
    }

    file.Close()
}

In this example, file.Close() is registered as a deferred function, which will be executed upon function return. If the maybeFail routine returns true, the file handle will still be closed, since file.Close() was registered as a deferred function. If maybeFail returns false and the function completes, then Close will be called twice; however, that is not a problem. If loadFile calls os.Exit(), then any deferred functions will not be executed.

Also important is the order of evaluation of the arguments and variables used by a deferred statement: these values are evaluated at the time the deferred function is registered.

func foo() {
    x := 1
    defer println(x)
    x = 2
}

Even though the body of the defer executes after the last statement in the function body, where x contains the value 2, the deferred println prints 1. This occurs because x was evaluated as 1 when the deferred function was registered. In effect, the defer statement creates a closure containing a snapshot of the then-current values of the relevant variables, and is permitted to modify the enclosing function's return value. A function can have multiple defer statements. These are stored on a stack, so the most recently registered is executed first. The following example illustrates this stack discipline, while also showing that deferred statements can be defined as anonymous functions.

func foo() {
    defer func() { println("Three") }()
    defer func() { println("Two") }()
    defer func() { println("One") }()
}

This example would print the strings “One”, “Two”, “Three”.

A.11 Concurrency

Parallel computation is an inherently tricky task.

To squeeze the most performance out of a modern machine with multiple threads of execution and CPU cores, programmers must share data between all the threads of a program, a notoriously complicated feat. Most languages are designed with a single thread of execution in mind, that is, no concurrency. However, operating systems and additional libraries can be used to parallelize computations and speed up execution. This typically requires synchronizing access to data that are shared across multiple threads of execution. To prevent non-deterministic access, such as reading data that are in the process of being mutated by another thread, programmers must place locks. As with memory management, humans often make bad judgments, and introduce bugs which compromise the integrity of their programs. The Go language uses a simple and safe method of concurrent computation based on C.A.R. Hoare's idea of Communicating Sequential Processes (co-routining) [49]. Similar to Erlang and Occam, concurrent data exchange and communication between threads is accomplished via named channels. This method of synchronization reduces the programmer's need to make explicit synchronization calls.

Go processes are referred to as go-routines. Go-routines are functions that execute concurrently with other go-routines (including the main thread of execution). They are light-weight threads: cheap to create and cheap in terms of memory usage. A single operating system thread can execute multiple go-routines concurrently, and if one thread blocks, the go-routines on other system threads still continue their execution [35].

The go keyword can prefix any function call, causing it to run as a light-weight thread in parallel with other threads, including the main thread of execution.

func greeting() {
    println("Hello!")
}

func main() {
    go greeting()
}

In the example above, the call to greeting is executed in a separate go-routine from main. Since main will probably terminate before greeting has time to call println, the output will never be seen. This is not guaranteed, but chances are that main terminates before the call to println. If a delay of sufficient time were introduced just after the go statement, the probability that greeting completes would increase; however, this is non-deterministic. To block further execution until the data/function has been processed, a channel can be used. Channels are used to transfer data between concurrent computations, such as between main and greeting in this example. Synchronization can be facilitated through the use of channels.

func greeting(ch chan int) {
    println("Hello!")
    ch <- 1
    println("World!")
}

func main() {
    ch := make(chan int)
    go greeting(ch)
    <-ch
}

This code expands on the previous example by introducing a channel variable, which establishes communication between the main thread and the one executing greeting. The "<-" operator is used to send or receive data on a channel.

In the modified example, a channel is first created in main via the make built-in function. That channel is passed to greeting, and main waits until data is sent back down the channel. Notice that greeting has no return value. The sending of data down a channel does not mean that the go-routine has completed, but it does mean that the routine is in a state whereby it has data ready for any other go-routine listening on the channel. In this example main continues executing as soon as it receives data (which it ignores). What greeting sends down the channel is arbitrary: the data could just as well have been any other int value, as that was the type associated with the channel when it was declared. Since main blocks until something is sent down the channel, then immediately terminates, it is unlikely that the "World!" greeting will be displayed. In other words, "Hello!" will be seen, but as soon as main continues in its separate thread, it will terminate, possibly before "World!" is displayed. As mentioned, make is used to construct a channel. If no count parameter is passed to make, then the channel is unbuffered. However, a buffer can be specified by giving a count of the number of items that can be stored in the buffer:

ch := make(chan bool, 100)

The line above allocates a channel that can store 100 boolean values. If the channel is buffered and 100 items have been sent down it while no receiving go-routine has read any of the data, then the sender blocks until there is room in the buffer. Buffers are first-in-first-out (FIFO) queues, so the receiver is able to process all the information it is passed from the sending go-routine, in order, irrespective of how slow the receiver might be.
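The producer/consumer pair below, our own example, shows a buffered channel letting a sender run ahead of a slow receiver:

func produce(ch chan int) {
    for i := 0; i < 5; i++ {
        ch <- i // blocks only when the buffer is full
    }
    close(ch) // signal that no more values are coming
}

func main() {
    ch := make(chan int, 3) // buffer up to three values
    go produce(ch)
    for v := range ch { // drains the channel in FIFO order
        println(v)
    }
}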

A.12 Compilers

There are two primary compilers following the Go 1 specification of the language: the Google compiler and GCC (gccgo).

The Google compiler comes with a suite of utilities, all accessible via the go tool. This utility not only builds Go programs, but eliminates the need for Makefiles. The command assumes a predefined directory layout where it can locate and build source code, and it assembles modules if a project consists of multiple modules. This utility also formats source code, can download, build, and install external libraries, and can run benchmarks and tests. The go tool can be directed to use the GCC compiler if desired.

A.12.1 Google

The Google compiler was originally based on the Plan 9 C compiler. This compiler was designed to build programs fast while also making binaries portable via static linking. The downside of fast compilation is that the compiler performs fewer optimizations, resulting in an executable that does not always run as fast as what other, optimizing, compilers (e.g., gccgo) might produce.

A.12.2 GCC

While not as fast in compile time as the Google compiler, gccgo can produce smaller and more highly optimized binaries. The gccgo front-end for GCC reads Go source code and translates it into GCC's intermediate language, GIMPLE. This three-address-code generic representation is then used in a series of optimization passes. The result is converted to the desired machine architecture and output as assembly. We use GCC to test the RBMM implementation described later in this thesis. Since GCC supports a plugin feature, it is relatively simple to analyze and transform the GIMPLE representation of a program, without the need to recompile the compiler. Since we analyze and transform the GIMPLE intermediate language (and in some parts the machine intermediate language, RTL), our concepts can be extended to other languages that GCC can input (e.g., C, C++, Fortran, etc.).

Both the Google and gccgo compilers use the same runtime system provided by the Go language. This thesis makes use of the GCC compiler, and its plugin feature, to explore the internals of the Go language, analyze source code, and transform the input program.

Bibliography

[1] Fedy Abi-Chahla. Intel Core i7 (Nehalem): Architecture by AMD? http://www.tomshardware.com/reviews/Intel-i7-nehalem-cpu,2041-10.html, October 2008.

[2] Mark Adler. Spirit sol 18 anomaly. http://web.archive.org/web/20110605095126/http://www.planetary.org/blog/article/00000702/, September 2006.

[3] Alex Aiken, Manuel Fähndrich, and Raph Levien. Better static memory management: Improving region-based analysis of higher-order languages. In Proceedings of the ACM SIGPLAN 1995 Conference on Programming Language Design and Implementation, pages 174–185. ACM Press, 1995.

[4] Eshrat Arjomandi and Chang Li. Controlling garbage collection and heap growth to reduce execution time of Java applications. In ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2001), 2001.

[5] Henry G. Baker, Jr. List processing in real time on a serial computer. Communications of the ACM, 21(4):280–294, April 1978.

[6] Jeffrey M. Barth. Shifting garbage collection overhead to compile time. Communications of the ACM, 20(7):513–518, July 1977.

[7] Emery D. Berger, Benjamin G. Zorn, and Kathryn S. McKinley. Reconsidering custom memory allocation. In Proceedings of the Conference on Object-Oriented Programming: Systems, Languages, and Applications (OOPSLA 2002), pages 1–12. ACM Press, 2002.

[8] Hans Boehm. A garbage collector for C++. http://www.hpl.hp.com/personal/Hans_Boehm/gc/.

[9] Chandrasekhar Boyapati, Alexandru Salcianu, William Beebee Jr., and Martin Rinard. Ownership types for safe region-based memory management in real-time Java. In Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation, pages 324–337. ACM Press, 2003.

[10] Sigmund Cherem and Radu Rugina. Region analysis and transformation for Java programs. In Proceedings of the 4th International Symposium on Memory Management, pages 85–96. ACM Press, 2004.

[11] Dmitry Chestnykh. Passwordhash and PBKDF2 Go libraries. https://github.com/dhcest/passwordhash.

[12] Wei-Ngan Chin, Florin Craciun, Shengchao Qin, and Martin Rinard. Region inference for an object-oriented language. SIGPLAN Notices, 39(6):243–254, 2004.

[13] Morten V. Christiansen and Per Velschow. Region-based memory management in Java. Technical report, DIKU, University of Copenhagen, 1998.

[14] Matt Davis. Creating a vDSO: The colonel’s other chicken. Linux Journal, 2011(211):6, 2011.

[15] Matt Davis. An introduction to creating GCC plugins. Linux Weekly News, September 2011.

[16] Matt Davis. Sacrifice a canary upon the stack of the gods: On canaries, coal mines and stack sanity. Linux Journal, 2012(222), October 2012.

[17] Matthew Davis, Peter Schachte, Zoltan Somogyi, and Harald Søndergaard. Towards region-based memory management for Go. In Proceedings of the 2012 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, MSPC ’12, pages 58–67, New York, NY, USA, 2012. ACM Press.

[18] Matthew Davis, Peter Schachte, Zoltan Somogyi, and Harald Søndergaard. A low overhead method for recovering unused memory inside regions. In Proceedings of the 2013 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, MSPC ’13, New York, NY, USA, 2013. ACM Press.

[19] Matthew Davis, Benjamin Villain, Julien Ridoux, Anne-Cecile Orgerie, and Darryl Veitch. An IEEE-1588 compatible RADclock. In 2012 International IEEE Symposium on Precision Clock Synchronization for Measurement Control and Communication (ISPCS), pages 1–6, 2012.

[20] Michal Derkacz. BLAS: Basic linear algebra subprograms for Go. https://github.com/ziutek/blas.

[21] David Detlefs, Christine Flood, Steve Heller, and Tony Printezis. Garbage-first garbage collection. In Proceedings of the Fourth International Symposium on Memory Management (ISMM’04), pages 37–48, New York, NY, USA, 2004. ACM Press.

[22] Edsger W. Dijkstra, Leslie Lamport, A. J. Martin, C. S. Scholten, and E. F. M. Steffens. On-the-fly garbage collection: An exercise in cooperation. Communications of the ACM, 21(11):966–975, November 1978.

[23] Damien Doligez and Xavier Leroy. A concurrent, generational garbage collector for a multithreaded implementation of ML. In Proceedings of the 20th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’93, pages 113–123. ACM Press, 1993.

[24] Gustavo Duarte. Anatomy of a program in memory. http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in-memory, January 2009.

[25] Martin Elsman. Typed regions for tag-free garbage collection. Technical report, IT University of Copenhagen, October 2002.

[26] Martin Elsman. Garbage collection safety for region-based memory management. In Proceedings of the 2003 ACM SIGPLAN International Workshop on Types in Language Design and Implementation, pages 123–134. ACM Press, 2003.

[27] Robert R. Fenichel and Jerome C. Yochelson. A LISP garbage-collector for virtual-memory computer systems. Communications of the ACM, 12(11):611–612, November 1969.

[28] David Gay and Alex Aiken. Memory management with explicit regions. In Proceedings of the ACM SIGPLAN 1998 Conference on Programming Language Design and Implementation, pages 313–323. ACM Press, 1998.

[29] David Gay and Alex Aiken. Language support for regions. In Proceedings of the ACM SIGPLAN 2001 Conference on Programming Language Design and Implementation, pages 70–80. ACM Press, 2001.

[30] Prodromos Gerakios, Nikolaos Papaspyrou, and Konstantinos Sagonas. A concurrent language with a uniform treatment of regions and locks. In Electronic Proceedings in Theoretical Computer Science, pages 79–93, 2010.

[31] Prodromos Gerakios, Nikolaos Papaspyrou, and Konstantinos Sagonas. Race-free and memory-safe multithreading: Design and implementation in Cyclone. In Types in Languages Design and Implementation, pages 15–26, 2010.

[32] Andrew Gerrand. Share memory by communicating. http://blog.golang.org/share-memory-by-communicating, 2010.

[33] Andrew Gerrand. Concurrency is not parallelism, January 2013. https://blog.golang.org/concurrency-is-not-parallelism.

[34] Google. The Go programming language specification. http://golang.org/ref/spec, September 2012.

[35] Google. Effective Go. http://golang.org/doc/effective_go.html, February 2013.

[36] Google. The Go programming language FAQ 1.0.3. http://golang.org/doc/faq, February 2013.

[37] Dan Grossman, Greg Morrisett, Trevor Jim, Michael Hicks, Yanling Wang, and James Cheney. Region-based memory management in Cyclone. In Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation, pages 282–293. ACM Press, 2002.

[38] Niels Hallenberg, Martin Elsman, and Mads Tofte. Combining region inference and garbage collection. In Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation, pages 141–152. ACM Press, 2002.

[39] Robert H. Halstead, Jr. Implementation of Multilisp: Lisp on a multiprocessor. In Proceedings of the 1984 ACM Symposium on LISP and Functional Programming, pages 9–17. ACM Press, 1984.

[40] H. Hamza and S. Counsell. RTSJ scoped memory management: State of the art. Science of Computer Programming, 77(5):644–659, 2012.

[41] D. R. Hanson. Fast allocation and deallocation of memory based on object lifetimes. Software Practice and Experience, 20(1):5–12, 1990.

[42] Barry Hayes. Using key object opportunism to collect old objects. In Proceedings of the Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA ’91), pages 33–46. ACM Press, 1991.

[43] Barry Hayes. Finalization in the collector interface. In Y. Bekkers and J. Cohen, editors, Memory Management, volume 637 of Lecture Notes in Computer Science, pages 277–298. Springer, 1992.

[44] Fritz Henglein, Henning Makholm, and Henning Niss. A direct approach to control-flow sensitive region-based memory management. In Proceedings of the 3rd International ACM SIGPLAN Conference on Principles and Practice of Declarative Programming, pages 175–186. ACM Press, 2001.

[45] Matthew Hertz and Emery D. Berger. Quantifying the performance of garbage collection vs. explicit memory management. In Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA ’05, pages 313–326, New York, NY, USA, 2005. ACM Press.

[46] James Edward Hicks Jr. Compiler-Directed Storage Reclamation Using Object Lifetime Analysis. PhD thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 1992.

[47] Martin Hirzel, Amer Diwan, and Matthew Hertz. Connectivity-based garbage collection. In Proceedings of the 18th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, pages 359–373. ACM Press, 2003.

[48] Martin Hirzel, Johannes Henkel, Amer Diwan, and Michael Hind. Understanding the connectivity of heap objects. ACM SIGPLAN Notices, 38:36–49, 2002.

[49] C. A. R. Hoare. Communicating sequential processes. Communications of the ACM, 21(8):666–677, August 1978.

[50] Richard Jones. The memory management glossary. http://www.memorymanagement.com/glossary, 2001.

[51] Richard Jones, Anthony Hosking, and Eliot Moss. The Garbage Collection Handbook: The Art of Automatic Memory Management. CRC Press, 2012.

[52] Uday P. Khedker, Amitabha Sanyal, and Amey Karkare. Heap reference analysis using access graphs. ACM Transactions on Programming Languages and Systems, 30(1):Article 1, 2007.

[53] Leslie Lamport. Garbage collection with multiple processes: An exercise in parallelism. In Proceedings of the 1976 International Conference on Parallel Processing, pages 50–54, 1976.

[54] William Landi. Undecidability of static analysis. ACM Letters on Programming Languages and Systems, 1(4):323–337, December 1992.

[55] Chris Lattner and Vikram Adve. Automatic pool allocation for disjoint data structures. SIGPLAN Notices, 38:13–24, 2003.

[56] Chris Lattner and Vikram Adve. Automatic pool allocation: Improving performance by controlling data structure layout in the heap. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 129–142. ACM Press, 2005.

[57] Heng Li. Programming language benchmarks. http://attractivechaos.github.com/plb/.

[58] Henry Lieberman and Carl Hewitt. A real-time garbage collector based on the lifetimes of objects. Communications of the ACM, 26(6):419–429, June 1983.

[59] Henning Makholm. A region-based memory manager for Prolog. In Proceedings of the 2nd International Symposium on Memory Management, pages 25–34. ACM Press, 2000.

[60] John McCarthy et al. Artificial intelligence. Technical report, Research Laboratory of Electronics, Massachusetts Institute of Technology, April 1959.

[61] Andre Moraes. Gocask library. http://code.google.com/p/gocask.

[62] Greg Morrisett, Matthias Felleisen, and Robert Harper. Abstract models of memory management. In Proceedings of the 7th International Conference on Functional Programming Languages and Computer Architecture, pages 66–77. ACM Press, 1995.

[63] Quan Phan. Region-Based Memory Management for the Logic Programming Language Mercury. PhD thesis, Catholic University of Leuven, Belgium, 2009.

[64] Quan Phan and Gerda Janssens. Towards region-based memory management for Mercury programs. In H.-F. Guo and E. Pontelli, editors, Proceedings of the 6th International Colloquium on Implementation of Constraint and Logic Programming (CICLOPS), 2006.

[65] Quan Phan, Zoltan Somogyi, and Gerda Janssens. Runtime support for region-based memory management in Mercury. In Proceedings of the 7th International Symposium on Memory Management, pages 61–70. ACM Press, 2008.

[66] H. G. Rice. Classes of recursively enumerable sets and their decision problems. Transactions of the American Mathematical Society, 74(2):358–366, 1953.

[67] C. Ruggieri and T. P. Murtagh. Lifetime analysis of dynamically allocated objects. In POPL ’88: Proceedings of the 15th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 285–293. ACM Press, 1988.

[68] G. Salagnac, S. Yovine, and D. Garbervetsky. Fast escape analysis for region-based memory management. Electronic Notes in Theoretical Computer Science, 131:99–110, May 2005.

[69] Guillaume Salagnac, Christophe Rippert, and Sergio Yovine. Semi-automatic region-based memory management for real-time Java embedded systems. In Proceedings of the 13th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, pages 73–80. IEEE Computer Society, 2007.

[70] Helmut Seidl and Vesal Vojdani. Region analysis for race detection. In Jens Palsberg and Zhendong Su, editors, Static Analysis, volume 5673 of Lecture Notes in Computer Science, pages 171–187. Springer, 2009.

[71] G. L. J. Steele. Data Representations in PDP-10 MacLISP. AI Memo. Artificial Intelligence Laboratory, MIT, 1977.

[72] Bjarne Steensgaard. Points-to analysis in almost linear time. In Proceedings of the 23rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’96, pages 32–41, New York, NY, USA, 1996. ACM Press.

[73] David Stoutamire. Portable, Modular Expression of Locality. PhD the- sis, University of California, Berkeley, 1997.

[74] Mark Summerfield. Programming in Go: Creating Applications for the 21st Century. Addison-Wesley Professional, 2012.

[75] Jean-Pierre Talpin and Pierre Jouvelot. Polymorphic type, region and effect inference. Journal of Functional Programming, 2:245–271, 1992.

[76] Mads Tofte and Lars Birkedal. A region inference algorithm. ACM Transactions on Programming Languages and Systems, 20(4):724–767, 1998.

[77] Mads Tofte, Lars Birkedal, Martin Elsman, and Niels Hallenberg. A retrospective on region-based memory management. Higher-Order and Symbolic Computation, 17(3):245–265, 2004.

[78] Mads Tofte and Jean-Pierre Talpin. Implementation of the typed call-by-value lambda-calculus using a stack of regions. In Proceedings of the 21st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 188–201. ACM Press, 1994.

[79] Mads Tofte and Jean-Pierre Talpin. Region-based memory management. Information and Computation, 132(2):109–176, 1997.

[80] David Ungar. Generation scavenging: A non-disruptive high performance storage reclamation algorithm. In Proceedings of the First ACM SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments, SDE 1, pages 157–167, New York, NY, USA, 1984. ACM Press.

[81] Benjamin Villain, Matthew Davis, Julien Ridoux, Darryl Veitch, and Nicolas Normand. Probing the latencies of software timestamping. In 2012 International IEEE Symposium on Precision Clock Synchronization for Measurement Control and Communication (ISPCS), pages 1–6, 2012.

[82] Todd R. Weiss. Out-of-memory problem causes Mars rover’s glitch. http://www.computerworld.com/s/article/89829/Out_of_memory_problem_caused_Mars_rover_s_glitch, February 2004.

[83] Peter Zijlstra. High memory handling. Linux Kernel 3.8.4 Source Code Documentation, 2013.
