Uniprocessor Garbage Collection Techniques
Total Page:16
File Type:pdf, Size:1020Kb
Unipro cessor Garbage Collection Techniques Submitted to ACM Computing Surveys Paul R Wilson Abstract Incremental Tracing Collectors Coherence and Conservatism We survey basic garbage collection algorithms and Tricolor Marking variations such as incremental and generational collec Incremental approaches tion we then discuss lowlevel implementation consid Write Barrier Algorithms erations and the relationships b etween storage man Snapshotatb eginning agement systems languages and compilers Through Algorithms out we attempt to present a unied view based on Incremental Up date WriteBar abstract traversal strategies addressing issues of con rier Algorithms servatism opp ortunism and immediacy of reclama Bakers Read Barrier Algorithms tion we also p ointoutavariety of implementation Incremental Copying details that are likely to have a signicant impact on ers Incremental Noncopy Bak p erformance ing AlgorithmThe Treadmill Conservatism of Bakers Read Barrier Contents Variations on the Read Barrier Replication Copying Collection Automatic Storage Reclamation Coherence and Conservatism Revisited Motivation Coherence and Conservatism in The TwoPhase Abstraction Noncopying collection Ob ject Representations Coherence and Conservatism in Copying Collection Overview of the Pap er Radical Collection and Op Basic Garbage Collection Techniques p ortunistic Tracing Comparing Incremental Techniques Reference Counting Realtime Tracing Collection The Problem with Cycles Ro ot Set Scanning The Eciency Problem Guaranteeing Sucient Progress Deferred Reference Counting Trading worstcase p erformance V ariations on Reference Counting for exp ected p erformance MarkSweep Collection Discussion MarkCompact Collection Cho osing an Incremental Algorithm Copying Garbage Collection A Simple Copying Collector Generational Garbage Collection StopandCopy Using Semi Multiple Subheaps with Varying Col spaces lection Frequencies Eciency of Copying Collection AdvancementPolicies NonCopying Implicit Collection Heap Organization Cho osing Among Basic Tracing Tech Subareas in copying schemes niques Generations in Noncopying Problems with Simple Tracing Collectors Schemes Conservatism in Garbage Collection Discussion Tracking Intergenerational References Automatic Storage Reclama Indirection Tables tion Ungars Remembered Sets Garbage col lection is the automatic reclamation of Page Marking computer storage Knu Coh App While in Word marking many systems programmers must explicitly reclaim Card Marking heap memory at some p oint in the program byus Store Lists ing a free or disp ose statement garbage collected Discussion systems free the programmer from this burden The The Generational Principle Revisited garbage collectors function is to nd data ob jects Pitfalls of Generational Collection that are no longer in use and make their space avail The Pig in the Snake Problem able for reuse by the the running program An ob ject Small Heapallo cated Ob jects is considered garbage and sub ject to reclamation if it is not reachable by the running program via any path Large Ro ot Sets of p ointer traversals Live p otentially reachable ob Realtime Generational Collection jects are preserved by the collector ensuring that the program can never traverse a dangling p ointer into Lo cality Considerations a deallo cated ob ject Varieties of Lo cality Eects This pap er surveys basic and advanced techniques Lo cality of Allo cation and Shortlived in unipro cessor garbage collectors esp ecially those de ob jects velop ed in the last decade For a more thorough Lo calityofTracing Traversals treatment of older techniques see Knu Coh Clustering of LongerLived Ob jects While it do es not cover parallel or distributed col Static Grouping lection it presents a unied taxonomy of incremen Dynamic Reorganization tal techniques whichlays the groundwork for under standing parallel and distributed collection Our fo Co ordination with Paging cus is on garbage collection for pro cedural and ob ject oriented languages but much of the information here Lowlevel Implementation Issues will serveasbackground for understanding garbage Pointer Tags and Ob ject Headers collection of other kinds of systems such as functional ConservativePointer Finding or logic programming languages For further reading Linguistic Supp ort and Smart Pointers on various advanced topics in garbage collection the Compiler Co op eration and Optimizations pap ers collected in BC are a go o d starting p oint GCAnytime vs SafePoints Collection Motivation Partitioned Register Sets vs Variable Rep Garbage collection is necessary for fully mo dular pro resentation Recording gramming to avoid intro ducing unnecessary inter mo dule dep endencies Asoftware routine op erating Optimization of Garbage Col on a data structure should not havetodependwhat lection Itself Free Storage Management We use the term heap in the simple sense of a storage managementtechnique whichallows any dynamically allo cated Compact Representations of Heap Data ob ject to b e freed at any timethis is not to b e confused with heap data structures which maintain ordering constraints GCrelated Language Features We use the term ob ject lo osely to include anykindof structured data record suchasPascal records or C structs as Weak Pointers well as fulledged ob jects with encapsulation and inheritance Finalization in the sense of ob jectoriented programming Multiple DifferentlyManaged Heaps There is also a rep ository of pap ers in PostScript for mat available for anonymous Internet FTP from our FTP host csutexasedupubgarbage Among other things this Overall Cost of Garbage Collection rep ository contains collected pap ers from several garbage col lection workshops held in conjunction with ACM OOPSLA conferences Conclusions and Areas for Research other routines may b e op erating on the same struc grammers nd the source of leaked ob jects in lan ture unless there is some go o d reason to co ordinate guages with explicit deallo cation HJ and these can their activities If ob jects must b e deallo cated explic b e extremely valuable Unfortunately these to ols only itly some mo dule must b e resp onsible for knowing nd actual leaks during particular program runs not when other mo dules are not interested in a particular p ossible leaks due to uncommon execution patterns ob ject Finding the source of a leaked ob ject do es not always solve the problem either the programmer still must Since livenessisaglobal prop erty this intro duces b e able to determine a p oint where the ob ject should nonlo cal b o okkeeping into routines that might other b e deallo catedif one exists If one do esnt exist the wise b e lo cally understandable and exibly comp os program must b e restructured This kind of gar able This b o okkeeping inhibits abstraction and re bage debugging is b etter than nothing but it is very duces extensibility b ecause when new functionalityis fallible and it must b e rep eated whenever programs implemented the b o okkeeping co de must b e up dated change it is desirable to actually eliminate leaks in The runtime cost of the b o okkeeping itself may b e sig general rather than certain detectable leaks in partic nicant and in some cases it mayintro duce the need ular for additional synchronization in concurrent applica tions Explicit allo cation and reclamation lead to program The unnecessary complications and subtle interac errors in more subtle ways as well It is common for tions created by explicit storage allo cation are esp e programmers to allo cate a mo derate number of ob cially troublesome b ecause programming mistakes of jects statically so that it is unnecessary to allo cate ten break the basic abstractions of the programming them on the heap and decide when and where to re language making errors hard to diagnose Failing to claim them This leads to xed limitations on pro reclaim memory at the prop er p ointmayleadtoslow grams making them fail when those limitations are memory leaks with unreclaimed memory gradually ac exceeded p ossibly years later when computer memo cumulating until the pro cess terminates or swap space ries and data sets are much larger This brittleness is exhausted Reclaiming memory to o so on can lead to makes co de much less reusable b ecause the undo cu very strange b ehavior b ecause an ob jects space may mented limits cause it to fail even if its b eing used in b e reused to store a completely dierent ob ject while away consistent with its abstractions For example an old p ointer still exists The same memory may many compilers fail when faced with automatically therefore b e interpreted as two dierent ob jects simul generated programs that violate assumptions ab out taneously with up dates to one causing unpredictable normal programming practices mutations of the other These problems lead many applications program These programming errors are particularly dan mers to implement some form of applicationsp ecic gerous b ecause they often fail to show up rep eat garbage collection within a large software system to ably making debugging very dicultthey maynever avoid most of the headaches of explicit storage man show up at all until the program is stressed in an