Uniprocessor Garbage Collection Techniques

Paul R. Wilson

University of Texas

Austin, Texas, USA

(wilson@cs.utexas.edu)

Abstract. We survey basic garbage collection algorithms, and variations such as incremental and generational collection. The basic algorithms include mark-sweep, mark-compact, copying, and treadmill collection. Incremental techniques can keep garbage collection pause times short, by interleaving small amounts of collection work with program execution. Generational schemes improve efficiency and locality by garbage collecting a smaller area more often, while exploiting typical lifetime characteristics to avoid undue overhead from long-lived objects.

Automatic Storage Reclamation

Garbage collection is the automatic reclamation of computer storage [Knu, Coh, App]. While in many systems programmers must explicitly reclaim heap memory at some point in the program, by using a "free" or "dispose" statement, garbage collected systems free the programmer from this burden. The garbage collector's function is to find data objects that are no longer in use and make their space available for reuse by the running program. An object is considered garbage (and subject to reclamation) if it is not reachable by the running program via any path of pointer traversals. Live (potentially reachable) objects are preserved by the collector, ensuring that the program can never traverse a "dangling pointer" into a deallocated object.

This paper is intended to be an introductory survey of garbage collectors for uniprocessors, especially those developed in the last decade. For a more thorough treatment of older techniques, see [Knu, Coh].

Motivation

Garbage collection is necessary for fully modular programming, to avoid introducing unnecessary inter-module dependencies. A routine operating on a data structure should not have to know what other routines may be operating on the same structure, unless there is some good reason to coordinate their activities. If objects must be deallocated explicitly, some module must be responsible for knowing when other modules are not interested in a particular object.

Since liveness is a global property, this introduces nonlocal bookkeeping into routines that might otherwise be orthogonal, composable, and reusable. This bookkeeping can reduce extensibility, because when new functionality is implemented, the bookkeeping code must be updated.

The unnecessary complications created by explicit storage allocation are especially troublesome because programming mistakes often introduce erroneous behavior that breaks the basic abstractions of the programming language, making errors hard to diagnose.

Failing to reclaim memory at the proper point may lead to slow memory leaks, with unreclaimed memory gradually accumulating until the process terminates or swap space is exhausted. Reclaiming memory too soon can lead to very strange behavior, because an object's space may be reused to store a completely different object while an old pointer still exists. The same memory may therefore be interpreted as two different objects simultaneously, with updates to one causing unpredictable mutations of the other.

Footnote: This paper will appear in the proceedings of the International Workshop on St. Malo, France, September, in the Springer-Verlag Lecture Notes in Computer Science series.

Footnote: We use the term "object" loosely, to include any kind of structured data record, such as Pascal records or structs, as well as full-fledged objects with encapsulation and inheritance, in the sense of object-oriented programming.

These bugs are particularly dangerous because they often fail to show up repeatably, making debugging very difficult; they may never show up at all until the program is stressed in an unusual way. If the allocator happens not to reuse a particular object's space, a dangling pointer may not cause a problem. Later, in the field, the application may crash when it makes a different set of memory demands, or is linked with a different allocation routine. A slow leak may not be noticeable while a program is being used in normal ways (perhaps for many years) because the program terminates before too much extra space is used. But if the code is incorporated into a long-running server program, the server will eventually exhaust its swap space, and crash.

Explicit allocation and reclamation lead to program errors in more subtle ways as well. It is common for programmers to statically allocate a moderate number of objects, so that it is unnecessary to allocate them on the heap and decide when and where to reclaim them. This leads to fixed limitations on software, making it fail when those limitations are exceeded, possibly years later when memories (and data sets) are much larger. This "brittleness" makes code much less reusable, because the undocumented limits cause it to fail, even if it is being used in a way consistent with its abstractions. (For example, many compilers fail when faced with automatically-generated programs that violate assumptions about "normal" programming practices.)

These problems lead many applications programmers to implement some form of application-specific garbage collection within a large software system, to avoid most of the headaches of explicit storage management. Many large programs have their own data types that implement reference counting, for example. Unfortunately, these collectors are often both incomplete and buggy, because they are coded up for a one-shot application. The garbage collectors themselves are therefore often unreliable, as well as being hard to use because they are not integrated into the programming language. The fact that such kludges exist despite these problems is a testimony to the value of garbage collection, and it suggests that garbage collection should be part of programming language implementations.

In the rest of this paper, we focus on garbage collectors that are built into a language implementation. The usual arrangement is that the allocation routines of the language (or imported from a library) perform special actions to reclaim space, as necessary, when a memory request is not easily satisfied. (That is, calls to the "deallocator" are unnecessary because they are implicit in calls to the allocator.)

Most collectors require some cooperation from the compiler (or interpreter), as well: object formats must be recognizable by the garbage collector, and certain invariants must be preserved by the running code. Depending on the details of the garbage collector, this may require slight changes to the code generator, to emit certain extra information at compile time, and perhaps execute different instruction sequences at run time [Boe, WH, DMH]. Contrary to widespread misconceptions, there is no conflict between using a compiled language and garbage collection; state-of-the-art implementations of garbage-collected languages use sophisticated optimizing compilers.

The Two-Phase Abstraction

Garbage collection automatically reclaims the space occupied by data objects that the running program can never access again. Such data objects are referred to as garbage. The basic functioning of a garbage collector consists, abstractly speaking, of two parts:

1. Distinguishing the live objects from the garbage in some way, or garbage detection, and

2. Reclaiming the garbage objects' storage, so that the running program can use it.

In practice, these two phases may be functionally or temporally interleaved, and the reclamation technique is strongly dependent on the garbage detection technique.

In general, garbage collectors use a "liveness" criterion that is somewhat more conservative than those used by other systems. In an optimizing compiler, a value may be considered dead at the point that it can never be used again by the running program, as determined by control flow and data flow analysis. A garbage collector typically uses a simpler, less dynamic criterion, defined in terms of a root set and reachability from these roots. At the point when garbage collection occurs, all globally visible variables of active procedures are considered live, and so are the local variables of any active procedures. The root set therefore consists of the global variables, local variables in the activation stack, and any registers used by active procedures. Heap objects directly reachable from any of these variables could be accessed by the running program, so they must be preserved. In addition, since the program might traverse pointers from those objects to reach other objects, any object reachable from a live object is also live. Thus the set of live objects is simply the set of objects on any directed path of pointers from the roots.

Footnote: Typically, garbage collection occurs when allocation of an object has been attempted by the running program, but there is not sufficient free memory to satisfy the request. The allocation routine calls a garbage collection routine to free up space, then allocates the requested object.

Any object that is not reachable from the root set is garbage, i.e., useless, because there is no legal sequence of program actions that would allow the program to reach that object. Garbage objects therefore can't affect the future course of the computation, and their space may be safely reclaimed.
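The reachability criterion can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the heap is a dictionary mapping hypothetical object names to the names they point to; `live_objects` and the sample data are not from the paper.

```python
def live_objects(roots, heap):
    """Return the set of objects reachable from the root set.

    heap maps each object name to the list of names it points to
    (an illustrative stand-in for real pointer fields).
    """
    live = set()
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if obj in live:
            continue            # already visited
        live.add(obj)
        pending.extend(heap[obj])
    return live


heap = {
    "a": ["b"],
    "b": ["c"],
    "c": [],
    "d": ["e"],   # d and e point at each other, but no root reaches them
    "e": ["d"],
}
roots = ["a"]
live = live_objects(roots, heap)
garbage = set(heap) - live
```

Everything not in `live` is garbage by the definition above, regardless of whether other garbage objects still point to it.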

Object Representations

Throughout this paper, we make the simplifying assumption that heap objects are self-identifying, i.e., that it is easy to determine the type of an object at run time. Implementations of statically-typed garbage collected languages typically have hidden "header" fields on heap objects, i.e., an extra field containing type information, which can be used to decode the format of the object itself. (This is especially useful for finding pointers to other objects.)

Dynamically-typed languages such as Lisp and Smalltalk usually use tagged pointers; a slightly shortened representation of the hardware address is used, with a small type-identifying field in place of the missing address bits. This also allows short immutable objects (in particular, small integers) to be represented as unique bit patterns stored directly in the "address" part of the field, rather than actually referred to by an address. This tagged representation supports polymorphic fields which may contain either one of these "immediate" objects or a pointer to an object on the heap. Usually, these short tags are augmented by additional information in heap-allocated objects' headers.
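A tagged word can be sketched as follows. The two-bit tag width, the tag values, and the choice to keep the tag in the low-order bits (which works when heap addresses are aligned so those bits are always zero) are assumptions made for illustration; real systems differ in all three.

```python
TAG_BITS = 2                       # assumed tag width
TAG_MASK = (1 << TAG_BITS) - 1
TAG_FIXNUM = 0b00                  # assumed tag for immediate small integers
TAG_POINTER = 0b01                 # assumed tag for heap pointers


def make_fixnum(n):
    """Store a small integer directly in the word, above the tag bits."""
    return (n << TAG_BITS) | TAG_FIXNUM


def make_pointer(address):
    """Tag an aligned heap address; its low bits must be free for the tag."""
    assert address & TAG_MASK == 0, "address must be aligned"
    return address | TAG_POINTER


def is_pointer(word):
    return word & TAG_MASK == TAG_POINTER


def fixnum_value(word):
    return word >> TAG_BITS


def pointer_address(word):
    return word & ~TAG_MASK


word_int = make_fixnum(42)
word_ptr = make_pointer(0x1000)
```

A collector scanning a polymorphic field only needs `is_pointer` to decide whether to follow it.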

For a purely statically-typed language, no per-object runtime type information is actually necessary, except the types of the root set variables. Once those are known, the types of their referents are known, and their fields can be decoded [Appa, Gol]. This process continues transitively, allowing types to be determined at every pointer traversal. Despite this, headers are often used for statically-typed languages, because it simplifies implementations at little cost. (Conventional (explicit) heap management systems often use object headers for similar reasons.)

Basic Garbage Collection Techniques

Given the basic two-part operation of a garbage collector, many variations are possible. The first part, distinguishing live objects from garbage, may be done in several ways: by reference counting, marking, or copying. Because each scheme has a major influence on the second part (reclamation) and on reuse techniques, we will introduce reclamation methods as we go.

Reference Counting

In a reference counting system [Col], each object has an associated count of the references (pointers) to it. Each time a reference to the object is created, e.g., when a pointer is copied from one place to another by an assignment, the object's count is incremented. When an existing reference to an object is eliminated, the count is decremented. (See the figure captioned "Reference counting".) The memory occupied by an object may be reclaimed when the object's count equals zero, since that indicates that no pointers to the object exist and the running program could not reach it.



Footnote: Conservative garbage collectors [BW, Wen, BDS, WH] are usable with little or no cooperation from the compiler (not even the types of named variables), but we will not discuss them here.

Footnote: Some authors use the term garbage collection in a narrower sense, which excludes reference counting and/or copy collection systems; we prefer the more inclusive sense because of its popular usage, and because it is less awkward than "automatic storage reclamation."

[Figure: a ROOT SET with pointers into HEAP SPACE; each heap object is labeled with its reference count.]

Fig. Reference counting.

In a straightforward reference counting system, each object typically has a header field of information describing the object, which includes a subfield for the reference count. Like other header information, the reference count is generally not visible at the language level.

When the object is reclaimed, its pointer fields are examined, and any objects it holds pointers to also have their reference counts decremented, since references from a garbage object don't count in determining liveness. Reclaiming one object may therefore lead to the transitive decrementing of reference counts and reclaiming many other objects. For example, if the only pointer into some large data structure becomes garbage, all of the reference counts of the objects in that structure typically become zero, and all of the objects are reclaimed.
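A minimal sketch of this behavior in Python follows. The `Obj` class, the `freed` log, and the helper names are illustrative assumptions, not an interface from the paper; reclamation is simulated by recording the object's name.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.count = 0      # header subfield: references to this object
        self.fields = []    # pointers held by this object


freed = []  # names of reclaimed objects, in reclamation order


def add_ref(target):
    target.count += 1


def drop_ref(target):
    target.count -= 1
    if target.count == 0:
        reclaim(target)


def reclaim(obj):
    freed.append(obj.name)
    # References from a garbage object no longer count, so decrement
    # the referents; this may reclaim them transitively.
    for child in obj.fields:
        drop_ref(child)
    obj.fields = []


def store(holder, target):
    """Create a pointer from holder to target."""
    holder.fields.append(target)
    add_ref(target)


a, b, c = Obj("a"), Obj("b"), Obj("c")
add_ref(a)      # a root variable points to a
store(a, b)
store(b, c)
drop_ref(a)     # the root variable is overwritten: a, b and c all die
```

Dropping the single root reference empties the whole chain, which is the transitive effect described above. Note that `reclaim` here does an unbounded amount of work in one call.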

In terms of the abstract two-phase garbage collection, the adjustment and checking of reference counts implements the first phase, and the reclamation phase occurs when reference counts hit zero. These operations are both interleaved with the execution of the program, because they may occur whenever a pointer is created or destroyed.

One advantage of reference counting is this incremental nature of most of its operation: garbage collection work (updating reference counts) is interleaved closely with the running program's own execution. It can easily be made completely incremental and real time; that is, performing at most a small and bounded amount of work per unit of program execution.

Clearly, the normal reference count adjustments are intrinsically incremental, never involving more than a few operations for any given operation that the program executes. The transitive reclamation of whole data structures can be deferred, and also done a little at a time, by keeping a list of freed objects whose reference counts have become zero but which haven't yet been processed.
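One way to picture the deferred, piecemeal reclamation is the sketch below. The `pending` queue, the `budget` parameter, and the other names are hypothetical; the point is only that each call does a bounded number of object reclamations.

```python
from collections import deque


class Obj:
    def __init__(self, name):
        self.name = name
        self.count = 0
        self.fields = []


pending = deque()   # zero-count objects not yet processed
free_list = []      # names of objects whose space is available again


def drop_ref(obj):
    obj.count -= 1
    if obj.count == 0:
        pending.append(obj)     # defer the transitive work


def reclaim_some(budget):
    """Process at most `budget` zero-count objects; return how many."""
    done = 0
    while pending and done < budget:
        obj = pending.popleft()
        for child in obj.fields:
            drop_ref(child)     # may queue more objects for later
        obj.fields = []
        free_list.append(obj.name)
        done += 1
    return done


# Build a chain o0 -> o1 -> o2 -> o3 -> o4 with one root reference to o0.
chain = [Obj("o%d" % i) for i in range(5)]
for holder, target in zip(chain, chain[1:]):
    holder.fields.append(target)
    target.count += 1
chain[0].count = 1

drop_ref(chain[0])              # the root reference goes away
first = reclaim_some(2)         # bounded step: only two objects processed
after_first = list(free_list)
second = reclaim_some(10)       # a later step finishes the job
```

The allocator could call something like `reclaim_some` on each allocation, so the cost of freeing a large structure is spread over many program operations.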

This incremental collection can easily satisfy real time requirements, guaranteeing that memory management operations never halt the executing program for more than a very brief period. This can support real-time applications in which guaranteed response time is critical; incremental collection ensures that the program is allowed to perform a significant, though perhaps appreciably reduced, amount of work in any significant amount of time. (A target criterion might be that no more than one millisecond out of every two-millisecond period would be spent on storage reclamation operations, leaving the other millisecond for "useful work" to satisfy the program's real-time purpose.)

There are two major problems with reference counting garbage collectors: they are difficult to make efficient, and they are not always effective.

The Problem with Cycles. The effectiveness problem is that reference counting fails to reclaim circular structures. If the pointers in a group of objects create a (directed) cycle, the objects' reference counts are never reduced to zero, even if there is no path to the objects from the root set [McB].

The figure captioned "Reference counting with unreclaimable cycle" illustrates this problem. Consider the isolated pair of objects on the right. Each holds a pointer to the other, and therefore each has a reference count of one. Since no path from a root leads to either, however, the program can never reach them again.
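The isolated pair can be reproduced in a few lines of Python. The `Obj` class and the `point`, `unpoint`, and `reachable` helpers are illustrative names chosen for this sketch.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.count = 0
        self.fields = []


def point(holder, target):
    holder.fields.append(target)
    target.count += 1


def unpoint(holder, target):
    holder.fields.remove(target)
    target.count -= 1


def reachable(root):
    """Names of all objects on some pointer path from root."""
    seen, pending = set(), [root]
    while pending:
        obj = pending.pop()
        if obj.name not in seen:
            seen.add(obj.name)
            pending.extend(obj.fields)
    return seen


root, a, b = Obj("root"), Obj("a"), Obj("b")
point(root, a)
point(a, b)
point(b, a)         # a and b now form a cycle
unpoint(root, a)    # the only path from the root is removed
```

After the last line, `a` and `b` are unreachable, yet each still has a count of one, so a pure reference counting scheme never reclaims them.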

Conceptually speaking, the problem here is that reference counting really only determines a conservative approximation of true liveness. If an object is not pointed to by any variable or other object, it is clearly garbage, but the converse is often not true.

It may seem that circular structures would be very unusual, but they are not. While most data structures are acyclic, it is not uncommon for normal programs to create some cycles, and a few programs create very many of them. For example, nodes in trees may have "backpointers" to their parents, to facilitate certain operations. More complex cycles are sometimes formed by the use of hybrid data structures which combine advantages of simpler data structures, and the like.

Systems using reference counting garbage collectors therefore usually include some other kind of garbage collector as well, so that if too much uncollectable cyclic garbage accumulates, the other method can be used to reclaim it.

Many programmers who use reference-counting systems (such as Interlisp and early versions of Smalltalk) have modified their programming style to avoid the creation of cyclic garbage, or to break cycles before they become a nuisance. This has a negative impact on program structure, and many programs still have storage "leaks" that accumulate cyclic garbage which must be reclaimed by some other means. These leaks, in turn, can compromise the real-time nature of the algorithm, because the system may have to fall back to the use of a non-real-time collector at a critical moment.

The Efficiency Problem. The efficiency problem with reference counting is that its cost is generally proportional to the amount of work done by the running program, with a fairly large constant of proportionality.



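Footnote: [Bob] describes modifications to reference counting to allow it to handle some special cases of cyclic structures, but this restricts the programmer to certain stereotyped patterns.

[Figure: a ROOT SET with pointers into HEAP SPACE; each heap object is labeled with its reference count, and an isolated pair of objects at the right keeps nonzero counts.]

Fig. Reference counting with unreclaimable cycle.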

One cost is that when a pointer is created or destroyed, its referent's count must be adjusted. If a variable's value is changed from one pointer to another, two objects' counts must be adjusted: one object's reference count must be incremented, the other's decremented and then checked to see if it has reached zero.

Short-lived stack variables can incur a great deal of overhead in a simple reference-counting scheme. When an argument is passed, for example, a new pointer appears on the stack, and usually disappears almost immediately because most procedure activations (near the leaves of the call graph) return very shortly after they are called. In these cases, reference counts are incremented, and then decremented back to their original value very soon. It is desirable to optimize away most of these increments and decrements that cancel each other out.

Deferred Reference Counting. Much of this cost can be optimized away by special treatment of local variables [DB]. Rather than always adjusting reference counts and reclaiming objects whose counts become zero, references from the local variables are not included in this bookkeeping most of the time. Usually, reference counts are only adjusted to reflect pointers from one heap object to another. This means that reference counts may not be accurate, because pointers from the stack may be created or destroyed without being accounted for; that, in turn, means that objects whose count drops to zero may not actually be reclaimable. Garbage collection can only be done when references from the stack are taken into account.

Every now and then, the reference counts are brought up to date by scanning the stack for pointers to heap objects. Then any objects whose reference counts are still zero may be safely reclaimed. The interval between these phases is generally chosen to be short enough that garbage is reclaimed often and quickly, yet still long enough that the cost of periodically updating counts (for stack references) is not high.

This deferred reference counting [DB] avoids adjusting reference counts for most short-lived pointers from the stack, and greatly reduces the overhead of reference counting. When pointers from one heap object to another are created or destroyed, however, the reference counts must still be adjusted. This cost is still roughly proportional to the amount of work done by the running program in most systems, but with a lower constant of proportionality.
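The following Python sketch shows one possible shape of deferred reference counting. It assumes a table of zero-count objects (called `zero_count_table` here) and a `reconcile` step that scans a simulated stack; these names and the exact bookkeeping are illustrative assumptions rather than the scheme of [DB] in detail.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.heap_count = 0   # counts heap-to-heap references only
        self.fields = []


zero_count_table = set()  # objects with no heap references (maybe stack ones)
stack = []                # simulated local variables; never counted
reclaimed = []


def allocate(name):
    obj = Obj(name)
    zero_count_table.add(obj)     # new objects start with a zero heap count
    return obj


def heap_store(holder, target):
    holder.fields.append(target)
    target.heap_count += 1
    zero_count_table.discard(target)


def reconcile():
    """Scan the stack, then reclaim zero-count objects it does not reference."""
    on_stack = set(stack)
    work = [o for o in zero_count_table if o not in on_stack]
    while work:
        obj = work.pop()
        zero_count_table.discard(obj)
        reclaimed.append(obj.name)
        for child in obj.fields:
            child.heap_count -= 1
            if child.heap_count == 0:
                if child in on_stack:
                    zero_count_table.add(child)   # still live via the stack
                else:
                    work.append(child)
        obj.fields = []


a = allocate("a"); stack.append(a)
b = allocate("b"); stack.append(b)
heap_store(a, b)                      # the only counted reference so far
stack.pop()                           # b leaves the stack: no count update
t = allocate("t"); stack.append(t); stack.pop()   # short-lived temporary
reconcile()
after_first = list(reclaimed)
stack.pop()                           # a leaves the stack
reconcile()
```

Pushing and popping the stack never touches a count; only `heap_store` and `reconcile` do, which is where the saving comes from.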

There is another cost of reference-counting collection that is harder to escape. When objects' counts go to zero and they are reclaimed, some bookkeeping must be done to make them available to the running program. Typically this involves linking the freed objects into one or more "free lists" of reusable objects, out of which the program's allocation requests are satisfied.

It is difficult to make these reclamation operations take less than several instructions per object, and the cost is therefore proportional to the number of objects allocated by the running program.

These costs of reference counting collection have combined with its failure to reclaim circular structures to make it unattractive to most implementors in recent years. As we will explain below, other techniques are usually more efficient and reliable.

This is not to say that reference counting is a dead technique. It still has advantages in terms of the immediacy with which it reclaims most garbage, and corresponding beneficial effects on locality of reference; a reference counting system may perform with little degradation when almost all of the heap space is occupied by live objects, while other collectors rely on trading more space for higher efficiency. Reference counts themselves may be valuable in some systems. For example, they may support optimizations in functional language implementations by allowing destructive modification of uniquely-referenced objects. Distributed garbage collection is often done with reference-counting between nodes of a distributed system, combined with mark-sweep or copying collection within a node. Future systems may find other uses for reference counting, perhaps in hybrid collectors also involving other techniques, or when augmented by specialized hardware. Nonetheless, reference counting is generally not considered attractive as the primary garbage collection technique on conventional uniprocessor hardware.

For most high-performance general-purpose systems, reference counting has been abandoned in favor of tracing garbage collectors, which actually traverse (trace out) the graph of live objects, distinguishing them from the unreachable (garbage) objects which can then be reclaimed.

Mark-Sweep Collection

Mark-sweep garbage collectors [McC] are named for the two phases that implement the abstract garbage collection algorithm we described earlier:



Footnote: The immediacy of reference counting can be useful for finalization, that is, performing "clean-up" actions (like closing files) when objects die [Rov].

Footnote: DeTreville [DeT] argues that the locality characteristics of reference-counting may be superior to those of other collection techniques, based on experience with the Topaz system. However, as [WLM] shows, generational techniques can recapture some of this locality.

1. Distinguish the live objects from the garbage. This is done by tracing: starting at the root set and actually traversing the graph of pointer relationships, usually by either a depth-first or breadth-first traversal. The objects that are reached are marked in some way, either by altering bits within the objects, or perhaps by recording them in a bitmap or some other kind of table.

2. Reclaim the garbage. Once the live objects have been made distinguishable from the garbage objects, memory is swept, that is, exhaustively examined, to find all of the unmarked (garbage) objects and reclaim their space. Traditionally, as with reference counting, these reclaimed objects are linked onto one or more free lists so that they are accessible to the allocation routines.
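The two phases can be sketched in Python as follows. The heap is modeled as a list of objects with a `marked` bit each, and the free list as a plain list; these representations and all names are simplifying assumptions.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.marked = False   # mark bit kept inside the object
        self.fields = []


def mark(roots):
    """Phase 1: depth-first trace from the roots, setting mark bits."""
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if obj.marked:
            continue
        obj.marked = True
        pending.extend(obj.fields)


def sweep(heap, free_list):
    """Phase 2: examine every object; free the unmarked, unmark the rest."""
    survivors = []
    for obj in heap:
        if obj.marked:
            obj.marked = False        # reset for the next collection
            survivors.append(obj)
        else:
            free_list.append(obj)     # link onto the free list
    return survivors


def collect(roots, heap, free_list):
    mark(roots)
    return sweep(heap, free_list)


a, b, c, d = (Obj(n) for n in "abcd")
a.fields = [b]
c.fields = [d]
d.fields = [c]                        # an unreachable cycle
free_list = []
heap = collect([a], [a, b, c, d], free_list)
```

Unlike reference counting, the unreachable cycle of `c` and `d` is reclaimed, because liveness is decided by the trace alone. The sweep visits every object, live or not.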

There are three major problems with traditional mark-sweep garbage collectors. First, it is difficult to handle objects of varying sizes without fragmentation of the available memory. The garbage objects whose space is reclaimed are interspersed with live objects, so allocation of large objects may be difficult; several small garbage objects may not add up to a large contiguous space. This can be mitigated somewhat by keeping separate free lists for objects of varying sizes, and merging adjacent free spaces together, but difficulties remain. (The system must choose whether to allocate more memory as needed to create small data objects, or to divide up large contiguous hunks of free memory and risk permanently fragmenting them. This fragmentation problem is not unique to mark-sweep; it occurs in reference counting as well, and in most explicit heap management schemes.)

The second problem with mark-sweep collection is that the cost of a collection is proportional to the size of the heap, including both live and garbage objects. All live objects must be marked, and all garbage objects must be collected, imposing a fundamental limitation on any possible improvement in efficiency.

The third problem involves locality of reference. Since objects are never moved, the live objects remain in place after a collection, interspersed with free space. Then new objects are allocated in these spaces; the result is that objects of very different ages become interleaved in memory. This has negative implications for locality of reference, and simple mark-sweep collectors are often considered unsuitable for most applications. (It is possible for the "working set" of active objects to be scattered across many virtual memory pages, so that those pages are frequently swapped in and out of main memory.) This problem may not be as bad as many have thought, because objects are often created in clusters that are typically active at the same time. Fragmentation and locality problems are unavoidable in the general case, however, and a potential problem for some programs.

It should be noted that these problems may not be insurmountable, with sufficiently clever implementation techniques. For example, if a bitmap is used for mark bits, a whole machine word of mark bits can be checked at once with a single integer ALU operation and conditional branch. If live objects tend to survive in clusters in memory, as they apparently often do, this can greatly diminish the constant of proportionality of the sweep phase cost; the theoretical linear dependence on heap size may not be as troublesome as it seems at first glance. As a result, the dominant cost may be the marking phase, which is proportional to the amount of live data that must be traversed, not the total amount of memory allocated. The clever use of bitmaps can also reduce the cost of allocation, by allowing fast allocation from contiguous unmarked areas, rather than using free lists.
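A word-at-a-time sweep over a mark bitmap might look like the sketch below. It assumes fixed-size heap slots with one mark bit per slot and a 32-bit word; both are assumptions made for illustration, as are the function and variable names.

```python
WORD_BITS = 32                      # assumed word size
ALL_SET = (1 << WORD_BITS) - 1


def sweep_bitmap(mark_words):
    """Return (free_slots, bit_tests) for a list of mark-bit words.

    A word that is all ones (every slot live) or all zeros (every slot
    free) is handled with one comparison; only mixed words are examined
    bit by bit.
    """
    free_slots = []
    bit_tests = 0
    for index, word in enumerate(mark_words):
        base = index * WORD_BITS
        if word == ALL_SET:
            continue                                  # whole word is live
        if word == 0:
            free_slots.extend(range(base, base + WORD_BITS))
            continue                                  # whole word is free
        for bit in range(WORD_BITS):
            bit_tests += 1
            if not (word >> bit) & 1:
                free_slots.append(base + bit)
    return free_slots, bit_tests


# Three words: fully live, fully dead, and one with only slot 64 live.
free_slots, bit_tests = sweep_bitmap([ALL_SET, 0, 0b1])
```

When survivors are clustered, most words fall into the two fast cases, which is the effect on the sweep cost described above. A run of all-zero words also identifies a contiguous free area without building a free list.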

The clustered survival of objects may also mitigate the locality problems of reallocating space amid live objects; if objects tend to survive or die in groups in memory [Hay], the interspersing of objects used by different program phases may not be a major consideration.

At this point, the technology of mark-sweep collectors (and related hybrids) is rapidly evolving. As will be noted later, this makes them resemble copying collectors in some ways; at this point we do not claim to be able to pick a winner between high-tech mark-sweep and copy collectors.

Mark-Compact Collection

Mark-compact collectors remedy the fragmentation and allocation problems of mark-sweep collectors. As with mark-sweep, a marking phase traverses and marks the reachable objects. Then objects are compacted, moving most of the live objects until all of the live objects are contiguous. This leaves the rest of memory as a single contiguous free space. This is often done by a linear scan through memory, finding live objects and "sliding" them down to be adjacent to the previous object. Eventually, all of the live objects have been slid down to be adjacent to a live neighbor. This leaves one contiguous occupied area at one end of heap memory, and implicitly moves all of the "holes" to the contiguous area at the other end.

This sliding compaction has several interesting properties. The contiguous free area eliminates fragmentation problems, so that allocating objects of various sizes is simple. Allocation can be implemented as the incrementing of a pointer into a contiguous area of memory, in much the way that different-sized objects can be allocated on a stack. In addition, the garbage spaces are simply "squeezed out," without disturbing the original ordering of objects in memory. This can ameliorate locality problems, because the allocation ordering is usually more similar to subsequent access orderings than an arbitrary ordering imposed by a garbage collector [CG, Cla].

While the locality that results from sliding compaction is advantageous, the collection process itself shares mark-sweep's unfortunate property that several passes over the data are required. After the initial marking phase, sliding compactors make two or three more passes over the live objects [CN]. One pass computes the new locations that objects will be moved to; subsequent passes must update pointers to refer to objects' new locations, and actually move the objects. These algorithms may therefore be significantly slower than mark-sweep if a large percentage of data survives to be compacted.
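The passes of a sliding compactor can be sketched as below. Objects are modeled as dictionaries with hypothetical `addr`, `size`, `marked`, and `fields` entries (fields hold addresses of other objects), listed in address order and assumed to be already marked; this is an illustrative model, not a specific published algorithm.

```python
def slide_compact(objects, roots):
    """Compact marked objects toward address 0, preserving their order.

    Returns (live_objects, new_roots, free_pointer).
    """
    # Pass 1: compute the new address of every live object.
    forward = {}
    free = 0
    for obj in objects:
        if obj["marked"]:
            forward[obj["addr"]] = free
            free += obj["size"]

    # Pass 2: update pointers (in roots and live objects) to new addresses.
    new_roots = [forward[r] for r in roots]
    for obj in objects:
        if obj["marked"]:
            obj["fields"] = [forward[f] for f in obj["fields"]]

    # Pass 3: "move" the objects and clear their mark bits.
    live = []
    for obj in objects:
        if obj["marked"]:
            obj["addr"] = forward[obj["addr"]]
            obj["marked"] = False
            live.append(obj)
    return live, new_roots, free


objects = [
    {"name": "A", "addr": 0,  "size": 2, "marked": True,  "fields": [6]},
    {"name": "B", "addr": 2,  "size": 4, "marked": False, "fields": []},
    {"name": "C", "addr": 6,  "size": 3, "marked": True,  "fields": [0]},
    {"name": "D", "addr": 9,  "size": 1, "marked": False, "fields": []},
    {"name": "E", "addr": 10, "size": 2, "marked": True,  "fields": []},
]
live, new_roots, free_pointer = slide_compact(objects, roots=[0, 10])
```

The survivors A, C, E keep their relative order, and everything from `free_pointer` upward is one contiguous free area.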

An alternative approach is to use a two-pointer algorithm, which scans inward from both ends of a heap space to find opportunities for compaction. One pointer scans downward from the top of the heap, looking for live objects, and the other scans upward from the bottom, looking for a hole to put it in. (Many variations of this algorithm are possible, to deal with multiple areas holding different-sized objects, and to avoid intermingling objects from widely-dispersed areas.) For a more complete treatment of compacting algorithms, see [Knu, CN].
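For a single area of equal-sized objects, the two-pointer idea can be sketched as follows. The heap is modeled as a Python list in which `None` stands for a hole; the function name and the returned forwarding map are illustrative assumptions, and the pointer-update pass that would use the map is omitted.

```python
def two_pointer_compact(cells):
    """Compact a list of equal-sized cells in place.

    None marks a hole; any other value is a live object.
    Returns a map from old index to new index for each moved object.
    """
    forward = {}
    low, high = 0, len(cells) - 1
    while True:
        while low < high and cells[low] is not None:
            low += 1            # scan upward for a hole
        while low < high and cells[high] is None:
            high -= 1           # scan downward for a live object
        if low >= high:
            break
        cells[low] = cells[high]    # move the object into the hole
        cells[high] = None
        forward[high] = low
    return forward


cells = ["a", None, "b", None, "c", "d"]
forward = two_pointer_compact(cells)
```

Unlike sliding compaction, this does not preserve the original ordering of objects: here `d` ends up between `a` and `b`.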

Copying Garbage Collection

Like mark-compact (but unlike mark-sweep), copying garbage collection does not really "collect" garbage. Rather, it moves all of the live objects into one area, and the rest of the heap is then known to be available because it contains only garbage. "Garbage collection" in these systems is thus only implicit, and some researchers avoid applying that term to the process.

Copying collectors, like marking-and-compacting collectors, move the objects that are reached by the traversal to a contiguous area. While compacting collectors use a separate marking phase that traverses the live data, copying collectors integrate the traversal of the data and the copying process, so that most objects need only be traversed once. Objects are moved to the contiguous destination area as they are reached by the traversal. The work needed is proportional to the amount of live data (all of which must be copied).

The term scavenging is applied to the copying traversal, because it consists of picking out the worthwhile objects amid the garbage, and taking them away.

A Simple Copying Collector: "Stop-and-Copy" Using Semispaces. A very common kind of copying garbage collector is the semispace collector [FY] using the Cheney algorithm for the copying traversal [Che]. We will use this collector as a reference model for much of this paper.

In this scheme, the space devoted to the heap is subdivided into two contiguous semispaces. During normal program execution, only one of these semispaces is in use, as shown in the figure captioned "A simple semispace garbage collector before garbage collection". Memory is allocated linearly upward through this "current" semispace as demanded by the executing program. This is much like allocation from a stack, or in a sliding compacting collector, and is similarly fast; there is no fragmentation problem when allocating objects of various sizes.
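Linear allocation within a semispace amounts to a bounds check and a pointer increment, as in this sketch. The `Semispace` class, its field names, and the use of `None` to signal that a collection is needed are illustrative assumptions.

```python
class Semispace:
    def __init__(self, base, size):
        self.base = base
        self.limit = base + size
        self.free = base          # next unallocated address

    def allocate(self, nbytes):
        """Return the address of a new object, or None if it does not fit."""
        if self.free + nbytes > self.limit:
            return None           # caller would invoke the collector here
        address = self.free
        self.free += nbytes       # bump the free pointer
        return address


space = Semispace(base=0, size=16)
first = space.allocate(8)
second = space.allocate(4)
too_big = space.allocate(8)       # only 4 bytes remain
third = space.allocate(4)
```

Objects of different sizes are handled uniformly, with no free list search.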

When the running program demands an allocation that will not fit in the unused area of the current semispace, the program is stopped and the copying garbage collector is called to reclaim space (hence the term "stop-and-copy"). All of the live data are copied from the current semispace (fromspace) to the other semispace (tospace). Once the copying is completed, the tospace semispace is made the "current" semispace, and program execution is resumed. Thus the roles of the two spaces are reversed each time the garbage collector is invoked. (See the figure captioned "Semispace collector after garbage collection".)

Footnote: As a historical note, the first copying collector was Minsky's collector for Lisp [Min]. Rather than copying data from one area of memory to another, a single heap space was used. The live data were copied out to a file, and then read back in, in a contiguous area of the heap space. On modern machines this would be unbearably slow, because file operations (writing and reading every live object) are now many orders of magnitude slower than memory operations.

[Figure: ROOT SET, FROMSPACE, and TOSPACE.]

Fig. A simple semispace garbage collector before garbage collection.

Perhaps the simplest form of copying traversal is the Cheney algorithm [Che]. The immediately-reachable objects form the initial queue of objects for a breadth-first traversal. A "scan" pointer is advanced through the first object, location by location. Each time a pointer into fromspace is encountered, the referred-to object is transported to the end of the queue, and the pointer to the object is updated to refer to the new copy. The "free" pointer is then advanced and the scan continues. This effects the "node expansion" for the breadth-first traversal, reaching (and copying) all of the descendants of that node. (See Fig.) Reachable data structures in fromspace are shown at the top of the figure, followed by the first several states of tospace as the collection proceeds; tospace is shown in linear address order to emphasize the linear scanning and copying.

Fig. Semispace collector after garbage collection. [Diagram: the root set, fromspace, and tospace.]

Rather than stopping at the end of the first object, the scanning process simply continues through subsequent objects, finding their offspring and copying them as well. A continuous scan from the beginning of the queue has the effect of removing consecutive nodes and finding all of their offspring. The offspring are copied to the end of the queue. Eventually the scan reaches the end of the queue, signifying that all of the objects that have been reached (and copied) have also been scanned for descendants. This means that there are no more reachable objects to be copied, and the scavenging process is finished.

Fig. The Cheney algorithm of breadth-first copying. [Diagram: reachable objects A through F in fromspace, followed by five successive states of tospace, i) through v), each with its Scan and Free pointers; tospace holds A B, then A B C, then A B C D, then A B C D E, and finally A B C D E F.]

Actually, a slightly more complex process is needed, so that objects that are reached by multiple paths are not copied to tospace multiple times. When an object is transported to tospace, a forwarding pointer is installed in the old version of the object. The forwarding pointer signifies that the old object is obsolete and indicates where to find the new copy of the object. When the scanning process finds a pointer into fromspace, the object it refers to is checked for a forwarding pointer. If it has one, it has already been moved to tospace, so the pointer it has been reached by is simply updated to point to its new location. This ensures that each live object is transported exactly once, and that all pointers to the object are updated to refer to the new copy.
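The traversal with forwarding pointers can be sketched as below. This is a simplified model under stated assumptions rather than a real collector: objects are Python instances instead of raw memory, the list `tospace` stands in for the destination semispace (its end plays the role of the free pointer), and the names `Obj`, `transport`, and `cheney_collect` are hypothetical.

```python
class Obj:
    def __init__(self, name, fields=None):
        self.name = name
        self.fields = fields if fields is not None else []  # pointers to other Obj
        self.forward = None   # forwarding pointer, set once the object is copied


def cheney_collect(roots):
    """Copy everything reachable from `roots`; return (new_roots, tospace)."""
    tospace = []              # the end of this list acts as the free pointer

    def transport(old):
        if old.forward is None:                   # not yet copied
            new = Obj(old.name, list(old.fields))
            old.forward = new                     # install the forwarding pointer
            tospace.append(new)                   # "advance the free pointer"
        return old.forward                        # address of the (single) copy

    new_roots = [transport(r) for r in roots]     # the initial queue
    scan = 0
    while scan < len(tospace):                    # scan has not caught up with free
        obj = tospace[scan]
        obj.fields = [transport(child) for child in obj.fields]
        scan += 1
    return new_roots, tospace
```

The unscanned tail of `tospace` is the breadth-first queue, so no separate stack or queue structure is needed; objects that are shared or lie on cycles are copied once because the second visit finds the forwarding pointer.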

Efficiency of Copying Collection. A copying garbage collector can be made arbitrarily efficient if sufficient memory is available [Lar, App]. The work done at each collection is proportional to the amount of live data at the time of garbage collection. Assuming that approximately the same amount of data is live at any given time during the program's execution, decreasing the frequency of garbage collections will decrease the total amount of garbage collection effort.

A simple way to decrease the frequency of garbage collections is to increase the amount of memory in the heap. If each semispace is bigger, the program will run longer before filling it. Another way of looking at this is that by decreasing the frequency of garbage collections, we are increasing the average age of objects at garbage collection time. Objects that become garbage before a garbage collection needn't be copied, so the chance that an object will never have to be copied is increased.

Suppose, for example, that during a program run twenty megabytes of memory are allocated, but only one megabyte is live at any given time. If we have two three-megabyte semispaces, garbage will be collected about ten times. (Since the current semispace is one third full after a collection, that leaves two megabytes that can be allocated before the next collection.) This means that the system will copy about half as much data as it allocates, as shown in the top part of Fig. Arrows represent copying of live objects between semispaces at garbage collections.

On the other hand, if the size of the semispaces is doubled, 5 megabytes of free space will be available after each collection. This will force garbage collections a third as often, or about 3 or 4 times during the run. This straightforwardly reduces the cost of garbage collection by more than half, as shown in the bottom part of Fig.
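The arithmetic of this example can be checked with a small model. It assumes, as the text does, a constant amount of live data, and it charges each collection exactly the live data it copies; the function name `copy_cost` and its parameters are our own.

```python
def copy_cost(total_alloc_mb, live_mb, semispace_mb):
    """Return (number of collections, total megabytes copied) over a run."""
    free_after_gc = semispace_mb - live_mb         # space available between collections
    collections = total_alloc_mb / free_after_gc   # how often the semispace fills
    return collections, collections * live_mb      # each collection copies the live data
```

Under these assumptions, `copy_cost(20, 1, 3)` gives ten collections and ten megabytes copied, while doubling the semispaces, `copy_cost(20, 1, 6)`, gives four collections and four megabytes copied.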

Non-Copying Implicit Collection

Recently, Baker [Bak] has proposed a new kind of non-copying collector with some of the efficiency advantages of a copying scheme. Baker's insight is that in a copying collector, the "spaces" of the collector are really just a particular implementation of sets. Another implementation of sets could do just as well, provided that it has similar performance characteristics. In particular, given a pointer to an object, it must be easy to determine which set it is a member of; in addition, it must be easy to switch the roles of the sets, just as fromspace and tospace roles are exchanged in a copy collector.

Baker's non-copying system adds two pointer fields and a "color" field to each object. These fields are invisible to the application programmer, and serve to link each hunk of storage into a doubly-linked list that serves as a set. The color field indicates which set an object belongs to.

The operation of this collector is simple, and isomorphic to the copy collector's operation. Chunks of free space are initially linked to form a doubly-linked list, and are allocated simply by incrementing a pointer into this list. The allocation pointer serves to divide the list into the part that has been allocated and the remaining "free" part. Allocation is fast because it only requires advancing this pointer to point at the next element of the free list. (Unlike the copying scheme, this does not eliminate fragmentation problems; supporting variable-sized objects requires multiple free lists and may result in fragmentation of the available space.)

When the free list is exhausted, the collector traverses the live objects and "moves" them from the allocated set (which we could call the fromset) to another set (the toset). This is implemented by unlinking the object from the doubly-linked fromset list, toggling its color field, and linking it into the toset's doubly-linked list.

Fig. Memory usage in a semispace GC, with 3 MB (top) and 6 MB (bottom) semispaces.

Just as in a copy collector, space reclamation is implicit. When all of the reachable objects have been traversed and moved from the fromset to the toset, the fromset is known to contain only garbage. It is therefore a list of free space, which can immediately be put to use as a free list. (As we will explain in a later section, Baker's scheme is actually somewhat more complex, because his collector is incremental.) The cost of the collection is proportional to the number of live objects, and the garbage ones are all reclaimed in small constant time.
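A minimal sketch of this set-based "moving" follows. It is not Baker's implementation: Python `set` objects stand in for the doubly-linked lists, the collection is done all at once rather than incrementally, and the names `Cell` and `collect` are hypothetical.

```python
class Cell:
    def __init__(self, name):
        self.name = name
        self.children = []   # pointers to other cells
        self.color = 0       # records which set the cell currently belongs to


def collect(roots, fromset, live_color):
    """Move reachable cells out of `fromset`; return (toset, remaining free cells)."""
    toset = set()
    worklist = []

    def move(cell):
        if cell in fromset:          # not yet reached by the traversal
            fromset.remove(cell)     # unlink from the fromset
            cell.color = live_color  # toggle the color field
            toset.add(cell)          # link into the toset
            worklist.append(cell)

    for r in roots:
        move(r)
    while worklist:
        for child in worklist.pop().children:
            move(child)
    return toset, fromset            # whatever remains in fromset is garbage
```

Note that no cell changes address, so no pointer between cells is ever rewritten; only set membership and the color field change.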

This scheme has both advantages and disadvantages compared to a copy collector. On the minus side, the per-object constants are probably a little bit higher, and fragmentation problems are still possible. On the plus side, the tracing cost for large objects is not as high. As with a mark-sweep collector, the whole object needn't be copied; if it can't contain pointers, it needn't be scanned either. Perhaps more importantly for many applications, this scheme does not require the actual language-level pointers between objects to be changed, and this imposes fewer constraints on compilers. As we'll explain later, this is particularly important for parallel and real-time incremental collectors.

Choosing Among Basic Techniques

Treatments of garbage collection algorithms in textbooks often stress asymptotic complexity, but all basic algorithms have roughly similar costs, especially when we view garbage collection as part of the overall free storage management scheme. Allocation and garbage collection are two sides of the basic memory reuse coin, and any algorithm incurs costs at allocation time, if only to initialize the fields of new objects.

Any of the efficient collection schemes therefore has three basic cost components, which are (1) the initial work required at each collection, such as root set scanning, (2) the work done per unit of allocation (proportional to the amount of allocation, or the number of objects allocated), and (3) the work done during garbage detection (e.g., tracing).

The latter two costs are usually similar, in that the amount of live data is usually some significant percentage of the amount of garbage. Thus algorithms whose cost is proportional to the amount of allocation (e.g., mark-sweep) may be competitive with those whose cost is proportional to the amount of live data traced (e.g., copying).

For example, suppose that 10 percent of all allocated data survive a collection, and 90 percent never need to be traced. In deciding which algorithm is more efficient, the asymptotic complexity is less important than the associated constants. If the cost of sweeping an object is ten times less than the cost of copying it, the mark-sweep collector costs about the same as a copy collector. (If a mark-sweep collector's sweeping cost is billed to the allocator, and it is small relative to the cost of initializing the objects, then it becomes obvious that the sweep phase is just not terribly expensive.) While current copying collectors appear to be more efficient than current mark-sweep collectors, the difference is not large for state-of-the-art implementations.

Further, real high-performance systems often use hybrid techniques to adjust tradeoffs for different categories of objects. Many high-performance copy collectors use a separate large object area [CWB, UJ], to avoid copying large objects from space to space. The large objects are kept "off to the side" and usually managed in-place by some variety of marking traversal and free list technique.

A major point in favor of in-place collectors, such as mark-sweep and treadmill schemes, is the ability to make them conservative with respect to data values that may be pointers or may not. This allows them to be used for languages like C, or off-the-shelf optimizing compilers [BW, Bar, BDS], which can make it difficult or impossible to unambiguously identify all pointers at run time. A non-moving collector can be conservative because any object that appears to be pointed to can be left where it is, and the (possible) pointer to it doesn't need to be changed. In contrast, a copying collector must know whether a value is a pointer, and whether to move the object and update the pointer. (For example, if presumed pointers were updated, and some were actually integers, the program would break because the integers would be mysteriously changed by the garbage collector.)
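To make the idea of conservatism concrete, here is a toy sketch of conservative marking. It assumes a heap modeled as a dictionary from object addresses to the words each object contains, and it only recognizes words that equal the start address of an object; both simplifications, and the name `conservative_mark`, are ours.

```python
def conservative_mark(stack_words, heap):
    """Return the addresses of all objects that might be reachable.

    `heap` maps an object's address to the list of words it contains.
    Any word equal to an object address is treated as a possible pointer.
    Objects are only marked, never moved, so no word is ever rewritten.
    """
    marked = set()
    pending = [w for w in stack_words if w in heap]
    while pending:
        addr = pending.pop()
        if addr not in marked:
            marked.add(addr)
            pending.extend(w for w in heap[addr] if w in heap)
    return marked
```

An integer that happens to equal an object address merely causes that object to be retained; since nothing is moved, the integer itself is left untouched.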

Problems with a Simple Garbage Collector

It is widely known that the asymptotic complexity of copying garbage collection is excellent: the copying cost approaches zero as memory becomes very large. Treadmill collection shares this property, but other collectors can be similarly efficient if the constants associated with memory reclamation and reallocation are small enough. In that case, garbage detection is the major cost.

Unfortunately, it is difficult in practice to achieve high efficiency in a simple garbage collector, because large amounts of memory are too expensive. If virtual memory is used, the poor locality of the allocation and reclamation cycle will generally cause excessive paging. (Every location in the heap is used before any location's space is reclaimed and reused.) Simply paging out the recently-allocated data is expensive for a high-speed processor [Ung], and the paging caused by the copying collection itself may be tremendous, since all live data must be touched in the process.

It therefore doesn't generally pay to make the heap area larger than the available main memory. (For a mathematical treatment of this tradeoff, see [Lar].) Even as main memory becomes steadily cheaper, locality within cache memory becomes increasingly important, so the problem is simply shifted to a different level of the memory hierarchy [WLM].

In general, we can't achieve the potential efficiency of simple garbage collection; increasing the size of memory to postpone or avoid collections can only be taken so far before increased paging time negates any advantage.

It is important to realize that this problem is not unique to copying collectors. All garbage collection strategies involve similar space-time tradeoffs: garbage collections are postponed so that garbage detection work is done less often, and that means that space is not reclaimed as quickly. On average, that increases the amount of memory wasted due to unreclaimed garbage.

While copying collectors were originally designed to improve locality, in their simple versions this improvement is not large, and their locality can in fact be worse than that of non-compacting collectors. These systems may improve the locality of reference to long-lived data objects, which have been compacted into a contiguous area. However, this effect is swamped by the pattern of references due to allocation. Large amounts of memory are touched between collections, and this alone makes them unsuitable for a virtual memory environment.

The major locality problem is not with the locality of compacted data, or with the locality of the garbage collection process itself. The problem is an indirect result of the use of garbage collection: by the time space is reclaimed and reused, it's likely to have been paged out, simply because too many other pages have been allocated in between. Compaction is helpful, but the help is generally too little, too late. With a simple semispace copy collector, locality is likely to be worse than that of a mark-sweep collector, simply because the copy collector uses more total memory: only half the memory can be used between collections. Fragmentation of live data is not as detrimental as the regular reuse of two spaces.

The only way to have good locality is to ensure that memory is large enough to hold the regularly-reused area. (Another approach would be to rely on optimizations such as prefetching, but this is not feasible at the level of virtual memory: disks simply can't keep up with the rate of allocation because of the enormous speed differential between RAM and disk.) Generational collectors address this problem by reusing a smaller amount of memory more often; they will be discussed in a later section. (For historical reasons, it is widely believed that only copying collectors can be made generational, but this is not the case. Generational mark-sweep collectors are somewhat harder to construct, but they do exist and are quite practical [DWH].)



Deferred reference counting, like tracing collection, also trades space for time: in giving up continual incremental reclamation to avoid spending CPU cycles in adjusting reference counts, one gives up space for objects that become garbage and are not immediately reclaimed. At the time scale on which memory is reused, the resulting locality characteristics must share basic performance tradeoff characteristics with generational collectors of the copying or mark-sweep varieties, which will be discussed later.



Slightly more complicated copying schemes appear to avoid this problem [Ung, WM], but [WLM] demonstrates that cyclic memory reuse patterns can fare poorly in hierarchical memories because of recency-based (e.g., LRU) replacement policies. This suggests that freed memory should be reused in a LIFO fashion (i.e., in the opposite order of its previous allocation), if the entire reuse pattern can't be kept in memory.

Finally, the temporal distribution of a simple tracing collector's work is also troublesome in an interactive programming environment; it can be very disruptive to a user's work to suddenly have the system become unresponsive and spend several seconds garbage collecting, as is common in such systems. For large heaps, the pauses may be on the order of seconds, or even minutes if a large amount of data is dispersed through virtual memory. Generational collectors alleviate this problem, because most garbage collections only operate on a subset of memory. Eventually they must garbage collect larger areas, however, and the pauses may be considerably longer. For real-time applications, this may not be acceptable.

Incremental Tracing Collectors

For truly real-time applications, fine-grained incremental garbage collection appears to be necessary. Garbage collection cannot be carried out as one atomic action while the program is halted, so small units of garbage collection must be interleaved with small units of program execution. As we said earlier, it is relatively easy to make reference counting collectors incremental. Reference counting's problems with efficiency and effectiveness discourage its use, however, and it is therefore desirable to make tracing (copying or marking) collectors incremental.

In most of the following discussion, the difference between copying and mark-sweep collectors is not particularly important. The incremental tracing for garbage detection is more interesting than the incremental reclamation of detected garbage.

The difficulty with incremental tracing is that while the collector is tracing out the graph of reachable data structures, the graph may change: the running program may mutate the graph while the collector "isn't looking." For this reason, discussions of incremental collectors typically refer to the running program as the mutator [DLM]. (From the garbage collector's point of view, the actual application is merely a coroutine or concurrent process with an unfortunate tendency to modify data structures that the collector is attempting to traverse.) An incremental scheme must have some way of keeping track of the changes to the graph of reachable objects, perhaps recomputing parts of its traversal in the face of those changes.

An important characteristic of incremental techniques is their degree of conservatism with respect to changes made by the mutator during garbage collection. If the mutator changes the graph of reachable objects, freed objects may or may not be reclaimed by the garbage collector. Some floating garbage may go unreclaimed because the collector has already categorized the object as live before the mutator frees it. This garbage is guaranteed to be collected at the next cycle, however, because it will be garbage at the beginning of the next collection.

Tricolor Marking

The abstraction of tricolor marking is helpful in understanding incremental garbage collection. Garbage collection algorithms can be described as a process of traversing the graph of reachable objects and coloring them. The objects subject to garbage collection are conceptually colored white, and by the end of collection, those that will be retained must be colored black. When there are no reachable nodes left to blacken, the traversal of live data structures is finished.

In a simple mark-sweep collector, this coloring is directly implemented by setting mark bits: objects whose bit is set are black. In a copy collector, this is the process of moving objects from fromspace to tospace; unreached objects in fromspace are considered white, and objects moved to tospace are considered black. The abstraction of coloring is orthogonal to the distinction between marking and copying collectors, and is important for understanding the basic differences between incremental collectors.

In incremental collectors, the intermediate states of the coloring traversal are important, because of ongoing mutator activity: the mutator can't be allowed to change things "behind the collector's back" in such a way that the collector will fail to find all reachable objects.

To understand and prevent such interactions between the mutator and the collector, it is useful to introduce a third color, grey, to signify that an object has been reached by the traversal, but that its descendants may not have been. That is, as the traversal proceeds outward from the roots, objects are initially colored grey. When they are scanned and pointers to their offspring are traversed, they are blackened and the offspring are colored grey.

In a copying collector, the grey objects are the objects in the unscanned area of tospace (the ones between the scan and free pointers). Objects that have been passed by the scan pointer are black. In a mark-sweep collector, the grey objects correspond to the stack or queue of objects used to control the marking traversal, and the black objects are the ones that have been removed from the queue. In both cases, objects that have not been reached yet are white.

Intuitively, the traversal proceeds in a wavefront of grey objects, which separates the white (unreached) objects from the black objects that have been passed by the wave; that is, there are no pointers directly from black objects to white ones. This abstracts away from the particulars of the traversal algorithm: it may be depth-first, breadth-first, or just about any kind of exhaustive traversal. It is only important that a well-defined grey fringe be identifiable, and that the mutator preserve the invariant that no black object hold a pointer directly to a white object.
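The coloring abstraction and its invariant can be sketched for a non-incremental traversal as follows. The representation (a dictionary mapping each node to its list of children) and the names `tricolor_mark` and `invariant_holds` are assumptions made for illustration; the grey fringe here is a stack, so the traversal happens to be depth-first.

```python
WHITE, GREY, BLACK = "white", "grey", "black"


def tricolor_mark(graph, roots):
    """graph maps each node to a list of its children; returns the final coloring."""
    color = {node: WHITE for node in graph}
    grey = []                              # the grey fringe
    for r in roots:
        if color[r] == WHITE:
            color[r] = GREY
            grey.append(r)
    while grey:
        node = grey.pop()
        for child in graph[node]:
            if color[child] == WHITE:      # first time this child is reached
                color[child] = GREY
                grey.append(child)
        color[node] = BLACK                # all of its children are grey or black
        assert invariant_holds(graph, color)
    return color


def invariant_holds(graph, color):
    """True if no black node points directly to a white node."""
    return not any(color[n] == BLACK and color[c] == WHITE
                   for n in graph for c in graph[n])
```

When the fringe is empty, every node that is still white is unreachable from the roots and can be reclaimed.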

The importance of this invariant is that the collector must be able to assume that it is "finished with" black objects, and can continue to traverse grey objects and move the wavefront forward. If the mutator creates a pointer from a black object to a white one, it must somehow coordinate with the collector, to ensure that the collector's bookkeeping is brought up to date.

Figure demonstrates this need for coordination. Suppose the object A has been completely scanned (and therefore blackened); its descendants have been reached and greyed. Now suppose that the mutator swaps the pointer from A to C with the pointer from B to D. The only pointer to D is now in a field of A, which the collector has already scanned. If the traversal continues without any coordination, C will be reached again (from B), and D will never be reached at all.
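The scenario can be replayed in a few lines. This sketch assumes the same dictionary-based graph model as a simple marking traversal would use, with A pointing to B and C and B pointing to D; the helper name `finish_marking` is hypothetical.

```python
def finish_marking(graph, color, grey):
    """Continue a marking traversal from its current intermediate state."""
    while grey:
        node = grey.pop()
        for child in graph[node]:
            if color[child] == "white":
                color[child] = "grey"
                grey.append(child)
        color[node] = "black"
    return color


# Intermediate state: A has been scanned (black); B and C were reached (grey).
graph = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
color = {"A": "black", "B": "grey", "C": "grey", "D": "white"}
grey = ["B", "C"]

# The mutator swaps A's pointer to C with B's pointer to D, with no coordination.
graph["A"][1], graph["B"][0] = graph["B"][0], graph["A"][1]

final = finish_marking(graph, color, grey)
# D is still white, although it is reachable from A.
```

At the end of the traversal D is the only white object, so a collector that reclaims white objects would free an object the program can still reach.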

Incremental approaches. There are two basic approaches to coordinating the collector with the mutator. One is to use a read barrier, which detects when the mutator attempts to access a pointer to a white object, and immediately colors the object grey; since the mutator can't read pointers to white objects, it can't install them in black objects. The other approach is more direct, and involves a write barrier: when the program attempts to write a pointer into an object, the write is trapped or recorded.

Write barrier approaches, in turn, fall into two different categories, depending on which aspect of the problem they address. To foil the garbage collector's marking traversal, it is necessary for the mutator to (1) write a pointer to a white object into a black object and (2) destroy the original pointer before the collector sees it.

If the first condition (writing the pointer into a black object) does not hold, no special action is needed: if there are other pointers to the white object from grey objects, it will be retained, and if not, it is garbage and needn't be retained anyway. If the second condition (obliterating the original path to the object) does not hold, the object will be reached via the original pointer and retained. The two write-barrier approaches focus on these two aspects of the problem.

Snapshot-at-beginning collectors ensure that the second condition cannot happen: rather than allowing pointers to be simply overwritten, they are first saved so that the collector can find them. Thus no paths to white objects can be broken without providing another path to the object for the garbage collector.

Incremental update collectors are still more direct in dealing with these troublesome pointers. Rather than saving copies of all pointers that are overwritten (because they might have already been copied into black objects), they actually record pointers stored into black objects, and catch the troublesome pointers at their destination, rather than their source. That is, if a pointer to a white object is copied into a black object, that new copy of the pointer will be found. Conceptually, the black object (or part of it) is reverted to grey when the mutator "undoes" the collector's traversal. (Alternatively, the pointed-to object may be greyed immediately.) This ensures that the traversal is updated in the face of mutator changes.
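The two kinds of write barrier can be compared on the pointer-swap scenario (A black, B and C grey, D white). The sketch below is a simplification under our own assumptions: "saving" an overwritten pointer is modeled by greying its old referent, the incremental-update variant greys the newly stored referent immediately, and the names `Heap`, `write`, `shade`, and `run_swap` are hypothetical.

```python
class Heap:
    def __init__(self, graph, color, grey, barrier):
        self.graph, self.color, self.grey = graph, color, grey
        self.barrier = barrier        # "snapshot" or "incremental-update"

    def shade(self, node):
        if self.color[node] == "white":
            self.color[node] = "grey"
            self.grey.append(node)

    def write(self, obj, index, new_target):
        """Mutator store obj[index] = new_target, performed through the barrier."""
        if self.barrier == "snapshot":
            self.shade(self.graph[obj][index])   # save the pointer being overwritten
        elif self.color[obj] == "black":         # incremental update
            self.shade(new_target)               # catch the pointer at its destination
        self.graph[obj][index] = new_target

    def finish_marking(self):
        while self.grey:
            node = self.grey.pop()
            for child in self.graph[node]:
                self.shade(child)
            self.color[node] = "black"
        return self.color


def run_swap(barrier):
    heap = Heap({"A": ["B", "C"], "B": ["D"], "C": [], "D": []},
                {"A": "black", "B": "grey", "C": "grey", "D": "white"},
                ["B", "C"], barrier)
    heap.write("A", 1, "D")   # A's second field now points to D
    heap.write("B", 0, "C")   # B's field now points to C
    return heap.finish_marking()
```

With either barrier, D ends up black: the snapshot barrier greys D when B's pointer to it is overwritten, and the incremental-update barrier greys D when a pointer to it is stored into the black object A.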

Baker's Incremental Copying

The best-known real-time garbage collector is Baker's incremental copying scheme [Bak]. It is an adaptation of the simple copy collection scheme described earlier, and uses a read barrier for coordination with the mutator. For the most part, the copying of data proceeds in the Cheney (breadth-first) fashion, by advancing the scan pointer through the unscanned area of tospace and moving any referred-to objects from fromspace. This background scavenging is interleaved with mutator operation, however.

Fig. A violation of the coloring invariant. [Diagram: objects A, B, C, and D, shown before and after the mutator's pointer swap.]

An important feature of Baker's scheme is its treatment of objects allocated by the mutator during incremental collection. These objects are allocated in tospace and are treated as though they had already been scanned, i.e., they are assumed to be live. In terms of tricolor marking, new objects are black when allocated, and none of them can be reclaimed; they are never reclaimed until the next garbage collection cycle.



Baker suggests copying old (live) objects into one end of tospace, and allocating new objects in the other end. The

In order to ensure that the scavenger finds all of the live data and copies it to tospace before the free area in newspace is exhausted, the rate of copy collection work is tied to the rate of allocation. Each time an object is allocated, an increment of scanning and copying is done.

In terms of tricolor marking, the scanned area of tospace contains black objects, and the copied but unscanned objects (between the scan and free pointers) are grey. As-yet unreached objects in fromspace are white. The scanning of objects (and copying of their offspring) moves the wavefront forward.

In addition to the background scavenging, other objects may be copied to tospace as needed to ensure that the basic invariant is not violated: pointers into fromspace must not be stored into objects that have already been scanned, undoing the collector's work.

Baker's approach is to couple the collector's copying traversal with the mutator's traversal of data structures. The mutator is never allowed to see pointers into fromspace, i.e., pointers to white objects. Whenever the mutator reads a (potential) pointer from the heap, it immediately checks to see if it is a pointer into fromspace; if so, the referent is copied to tospace, i.e., its color is changed from white to grey. In effect, this advances the wavefront of greying just ahead of the actual references by the mutator, keeping the mutator inside the wavefront.

It should be noted that Baker's collector itself changes the graph of reachable objects, in the process of copying. The read barrier does not just inform the collector of changes by the mutator, to ensure that objects aren't lost; it also shields the mutator from viewing temporary inconsistencies created by the collector. If this were not done, the mutator might encounter two different pointers to versions of the same object, one of them obsolete.

This shielding of the mutator from white objects has come to be called a read barrier, because it prevents pointers to white objects from being read by the program at all.
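The interplay of background scanning, allocation-driven work and the read barrier can be sketched as a small simulation. This is a minimal sketch, not Baker's implementation: the names Obj, BakerCollector, copy, scan_step, read and allocate are hypothetical, tospace is modeled as a Python list rather than a memory area, and newly allocated objects are kept in a separate list standing in for the other end of tospace.

```python
class Obj:
    """A heap object; `forward` is set once it has been copied to tospace."""
    def __init__(self, name, fields=(), space="from"):
        self.name = name
        self.fields = list(fields)   # pointers to other Obj instances (or None)
        self.space = space           # "from" (white) or "to" (grey or black)
        self.forward = None


class BakerCollector:
    def __init__(self):
        self.tospace = []      # copied objects, in copy order
        self.scan = 0          # tospace[:scan] is black, tospace[scan:] is grey
        self.new_objects = []  # allocated during collection, treated as black

    def copy(self, obj):
        """Return the tospace version of obj, copying it (white -> grey) if needed."""
        if obj is None or obj.space == "to":
            return obj
        if obj.forward is None:
            obj.forward = Obj(obj.name, obj.fields, space="to")
            self.tospace.append(obj.forward)
        return obj.forward

    def scan_step(self):
        """Background scavenging: blacken one grey object. False when none remain."""
        if self.scan == len(self.tospace):
            return False
        obj = self.tospace[self.scan]
        obj.fields = [self.copy(f) for f in obj.fields]
        self.scan += 1
        return True

    def read(self, obj, i):
        """Read barrier: never hand the mutator a pointer into fromspace."""
        obj.fields[i] = self.copy(obj.fields[i])
        return obj.fields[i]

    def allocate(self, name, fields=(), work=1):
        """Do `work` increments of scanning, then allocate a black object."""
        for _ in range(work):
            self.scan_step()
        new = Obj(name, fields, space="to")
        self.new_objects.append(new)
        return new
```

Because the mutator only ever obtains pointers through read, any pointer it stores into a new object already refers to tospace, which is why new objects can safely be treated as black in this sketch.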

The read barrier may be implemented in software, by preceding each read (of a potential pointer from the heap) with a check and a conditional call to the copying-and-updating routine. Compiled code thus contains extra instructions to implement the read barrier. Alternatively, it may be implemented with specialized hardware checks and/or microcoded routines.

The read barrier is quite expensive on stock hardware, because in the general case, any load of a pointer must check to see if the pointer points to a fromspace (white) object; if so, it must execute code to move the object to tospace and update the pointer. The cost of these checks is high on conventional hardware, because they occur very frequently. Lisp Machines have special-purpose hardware to detect pointers into fromspace and trap to a handler [Gre, Moo, Joh], but on conventional machines the checking overhead is in the tens of percent for a high-performance system.

Brooks has proposed a variation on Baker's scheme, where objects are always referred to via an indirection field embedded in the object itself [Bro]. If an object is valid, its indirection field points to itself. If it's an obsolete version in fromspace, its indirection pointer points to the new version. Unconditionally indirecting is cheaper than checking for indirections, but would still incur overheads in the tens of percent for a high-performance system. A variant of this approach has been used by North and Reppy in a concurrent garbage collector [NR]. Zorn takes a different approach to reducing the read barrier overhead, using knowledge of important special cases and special compiler techniques. Still, the time overheads are on the order of twenty percent [Zor].
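The indirection-field idea can be illustrated in a few lines. This is only a sketch under assumed names (Obj, deref and relocate are hypothetical, not taken from Brooks' paper); it shows that every access follows the field unconditionally, so no per-read check is needed.

```python
class Obj:
    def __init__(self, value):
        self.value = value
        self.indirection = self      # a valid object points to itself


def deref(ref):
    """Every access goes through the indirection field, with no conditional check."""
    return ref.indirection


def relocate(old):
    """Copy `old` and leave its indirection field pointing at the new version."""
    new = Obj(old.value)
    old.indirection = new
    return new
```

After relocate(o), both the stale reference o and the new copy dereference to the same current version, which is what keeps the mutator from acting on the obsolete one.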

The Treadmill

Recently, Baker has proposed a noncopying version of his scheme, which uses doubly-linked lists (and per-object color fields) to implement the sets of objects of each color, rather than separate memory areas. By avoiding the actual moving of objects and updating of pointers, the scheme puts fewer restrictions on other aspects of language implementation.

two occupied areas of tospace thus grow toward each other.

Nilsen's variant of Baker's algorithm updates the pointers without actually copying the objects: the copying is lazy, and space in tospace is simply reserved for the object before the pointer is updated [Nil]. This makes it easier to provide smaller bounds on the time taken by list operations, and to gear collector work to the amount of allocation, including guaranteeing shorter pauses when smaller objects are allocated.

Fig. Treadmill collector during collection (labels: Allocation, Free, New, From, To, Scanning).

This noncopying scheme preserves the essential efficiency advantage of copy collection, by reclaiming space implicitly. As described in an earlier section, unreached objects on the allocated list can be reclaimed by appending the remainder of that list to the free list. The real-time version of this scheme links the various lists into a cyclic structure, as shown in the figure of the treadmill collector. This cyclic structure is divided into four sections.

The new list is where allocation of new objects occurs during garbage collection; it is contiguous with the free list, and allocation occurs by advancing the pointer that separates them. At the beginning of garbage collection, the new segment is empty.



In particular, it is possible to deal with compilers that do not unambiguously identify pointer variables in the stack, making it impossible to use simple copy collection.

The from list holds objects that were allocated before garbage collection began, and which are subject to garbage collection. As the collector and mutator traverse data structures, objects are moved from the from list to the to list. The to list is initially empty, but grows as objects are "unsnapped" (unlinked) from the from list (and snapped into the to list) during collection.

The new list contains new objects, which are allocated black. The to list contains both black objects (which have been completely scanned) and grey ones (which have been reached but not scanned). Note the isomorphism with the copying algorithm: even an analogue of the Cheney algorithm can be used. It is only necessary to have a scan pointer into the to list and advance it through the grey objects.

Eventually, all of the reachable objects in the from list have been moved to the to list, and scanned for offspring. When no more offspring are reachable, all of the objects in the to list are black, and the remaining objects in the from list are known to be garbage. At this point, the garbage collection is complete. The from list is now available, and can simply be merged with the free list. The to list and the new list both hold objects that were preserved, and they can be merged to form the new to list at the next collection.

The state is very similar to the beginning of the previous cycle, except that the segments have moved partway around the cycle, hence the name "treadmill."

Baker describes this algorithm as being isomorphic to his original incremental copying algorithm, presumably including the close coupling between the mutator and the collector, i.e., the read barrier.
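The movement of objects between the four segments can be sketched as follows. This is a simplified illustration, not Baker's design: Cell and Treadmill are hypothetical names, and plain Python lists stand in for the doubly-linked cyclic structure, so unsnapping here is a linear-time list removal, whereas a real treadmill unlinks a cell in constant time.

```python
class Cell:
    def __init__(self, name):
        self.name = name
        self.fields = []


class Treadmill:
    def __init__(self, allocated, free):
        self.from_list = list(allocated)  # white: subject to collection
        self.to_list = []                 # black (scanned) then grey (unscanned)
        self.new_list = []                # allocated during collection, black
        self.free_list = list(free)
        self.scan = 0                     # to_list[:scan] is black

    def reach(self, cell):
        """Unsnap a white cell from the from list and snap it into the to list."""
        if cell in self.from_list:
            self.from_list.remove(cell)
            self.to_list.append(cell)

    def scan_step(self):
        """Advance the scan pointer over one grey cell. False when none remain."""
        if self.scan == len(self.to_list):
            return False
        for child in self.to_list[self.scan].fields:
            self.reach(child)
        self.scan += 1
        return True

    def allocate(self, name):
        """Take a cell from the free list; it joins the new list (black)."""
        cell = self.free_list.pop()
        cell.name, cell.fields = name, []
        self.new_list.append(cell)
        return cell

    def finish(self):
        """Collection is complete: whatever is left on the from list is garbage."""
        self.free_list.extend(self.from_list)
        self.from_list = []
        self.to_list = self.to_list + self.new_list   # the preserved objects
        self.new_list = []
        self.scan = len(self.to_list)
```

Note that no object is ever copied or has its address changed; only list membership changes, and garbage is reclaimed implicitly by merging the leftover from list into the free list.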

Conservatism in Baker's scheme. Baker's garbage collector uses a somewhat conservative approximation of true liveness in two ways. The most obvious one is that objects allocated during collection are assumed to be live, even if they die before the collection is finished. The second is that pre-existing objects may become garbage after having been reached by the collector's traversal, and they will not be reclaimed: once an object has been greyed, it will be considered live until the next garbage collection cycle. On the other hand, if objects become garbage during collection, and all paths to those objects are destroyed before being traversed, then they will be reclaimed. That is, the mutator may overwrite a pointer from a grey object, destroying the only path to one or more white objects and ensuring that the collector will not find them. Thus Baker's incremental scheme incrementally updates the reachability graph of pre-existing objects, only when grey objects have pointers overwritten. Overwriting pointers from black objects has no effect, however, because their referents are already grey. The degree of conservatism (and floating garbage) thus depends on the details of the collector's traversal and of the program's actions.

Snapshot-at-Beginning Write-Barrier Algorithms

If a non-copying collector is used, the use of a read barrier is an unnecessary expense; there is no need to protect the mutator from seeing an invalid version of a pointer. Write barrier techniques are cheaper, because heap writes are several times less common than heap reads. Snapshot-at-beginning algorithms use a write barrier to ensure that no objects ever become inaccessible to the garbage collector while collection is in progress. Conceptually, at the beginning of garbage collection, a copy-on-write virtual copy of the graph of reachable data structures is made. That is, the graph of reachable objects is fixed at the moment garbage collection starts, even though the actual traversal proceeds incrementally.

Perhaps the simplest and best-known snapshot collection algorithm is Yuasa's [Yua]. If a location is written to, the overwritten value is first saved and pushed on a marking stack for later examination. This guarantees that no objects will become unreachable to the garbage collector traversal: all objects live at the beginning of garbage collection will be reached, even if the pointers to them are overwritten. In the example shown in the figure of the coloring invariant violation, the pointer from B to D is pushed onto the stack when it is overwritten with the pointer to C.
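The snapshot write barrier just described can be sketched as a small marking simulation that replays the B, C, D example. This is an illustrative sketch, not Yuasa's code: Obj, SnapshotMarker, write and mark_step are hypothetical names, and the sketch assumes a collection is already in progress whenever write is called.

```python
class Obj:
    def __init__(self, name, fields=()):
        self.name = name
        self.fields = list(fields)
        self.marked = False


class SnapshotMarker:
    def __init__(self, roots):
        self.stack = list(roots)          # marking stack of reached objects

    def write(self, obj, i, value):
        """Write barrier: save the overwritten pointer before the store."""
        old = obj.fields[i]
        if old is not None and not old.marked:
            self.stack.append(old)
        obj.fields[i] = value

    def mark_step(self):
        """One increment of marking. False when the stack is empty."""
        if not self.stack:
            return False
        obj = self.stack.pop()
        if not obj.marked:
            obj.marked = True
            self.stack.extend(f for f in obj.fields if f is not None)
        return True


# A has been scanned; B and C are reached but unscanned; D is unreached.
d = Obj("D")
c = Obj("C")
b = Obj("B", [d])
a = Obj("A", [b, c, None])
marker = SnapshotMarker([a])
marker.mark_step()              # scans A, pushing B and C
marker.write(a, 2, d)           # mutator copies the pointer to D into A
marker.write(b, 0, c)           # overwriting B's pointer to D pushes D
while marker.mark_step():
    pass
```

Without the push in write, D would be referenced only from the already-scanned A and would never be marked.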



This discussion is a bit oversimplified; Baker uses four colors, and whole lists can have their colors changed instantaneously by changing the sense of the bit patterns, rather than the patterns themselves.



This kind of conservatism is not to be confused with the conservative treatment of pointers that cannot be unambiguously identified. (For a more complete and formal discussion of various kinds of conservatism in garbage collection, see [DWH].)

Yuasa's scheme has a large advantage over Baker's on stock hardware, because only heap pointer writes must be treated specially to preserve the garbage collector invariants. Normal pointer dereferencing and comparison do not incur any extra overhead.

On the other hand, Yuasa's scheme is more conservative than Baker's. Not only are all objects allocated during collection retained, but no objects can be freed during collection: all of the overwritten pointers are preserved and traversed. These objects are reclaimed at the next garbage collection cycle.

Incremental Update Write-Barrier Algorithms

While both are write-barrier algorithms, snapshot-at-beginning and incremental update algorithms are quite different. Unfortunately, incremental update algorithms have generally been cast in terms of parallel systems, rather than as incremental schemes for serial processing; perhaps due to this, they have been largely overlooked by implementors targeting uniprocessors.

Perhaps the best known of these algorithms is due to Dijkstra et al. [DLM]. (This is similar to the scheme developed independently by Steele [Ste], but simpler because it does not deal with compactification.) Rather than retaining everything that's in a snapshot of the graph at the beginning of garbage collection, it heuristically (and somewhat conservatively) attempts to retain the objects that are live at the end of garbage collection. Objects that die during garbage collection, and before being reached by the marking traversal, are not traversed and marked.

To avoid the problem of pointers escaping into reachable objects that have already been scanned, such copied pointers are caught at their destination, rather than their source. Rather than noticing when a pointer escapes from a location that hasn't been traversed, it notices when the pointer escapes into an object that has already been traversed. If a pointer is overwritten without being copied elsewhere, so much the better: the object is garbage, so it might as well not get marked.

If the pointer is installed into an object already determined to be live, that pointer must be taken into account: it has now been incorporated into the graph of reachable data structures. Such pointer stores are recorded by the write barrier: the collector is notified which black objects may hold pointers to white objects, in effect reverting those objects to grey. Those formerly black objects will be scanned again before the garbage collection is complete, to find any live objects that would otherwise escape. This process may iterate, because more black objects may be reverted while the collector is in the process of traversing them. (The traversal is guaranteed to complete, however, and the collector eventually catches up with the mutator.)
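The barrier as described in the preceding paragraphs (a store of a white pointer into a black object reverts that object to grey) can be sketched as below. The names are hypothetical and the code is a minimal illustration of the description above, not a transcription of the Dijkstra et al. or Steele algorithms.

```python
WHITE, GREY, BLACK = "white", "grey", "black"


class Obj:
    def __init__(self, name, fields=()):
        self.name = name
        self.fields = list(fields)
        self.color = WHITE


class IncrementalUpdateMarker:
    def __init__(self, roots):
        self.grey = []                    # reached but not yet scanned
        for root in roots:
            self.shade(root)

    def shade(self, obj):
        if obj is not None and obj.color == WHITE:
            obj.color = GREY
            self.grey.append(obj)

    def write(self, obj, i, value):
        """Write barrier: catch pointers escaping into an already-scanned object."""
        obj.fields[i] = value
        if obj.color == BLACK and value is not None and value.color == WHITE:
            obj.color = GREY              # revert, so it will be scanned again
            self.grey.append(obj)

    def mark_step(self):
        """Scan one grey object. False when the traversal is complete."""
        if not self.grey:
            return False
        obj = self.grey.pop()
        for child in obj.fields:
            self.shade(child)
        obj.color = BLACK
        return True
```

In this sketch a white object whose last pointer is overwritten before the traversal reaches it simply stays white, so it can be reclaimed at the end of the same collection.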

Objects that become garbage during garbage collection may be reclaimed at the end of that garbage collection, not the next one. This is similar to Baker's read-barrier algorithm in its treatment of pre-existing objects: they are not preserved if they become garbage before being reached by the collector.

It is less conservative than Baker's and Yuasa's algorithms in its treatment of objects allocated by the mutator during collection, however. Baker's and Yuasa's schemes assume such newly-created objects are live, because pointers to them may get installed into objects that have already been reached by the collector's traversal. In terms of tricolor marking, objects are allocated "black", rather than white: they are conservatively assumed to be part of the graph of reachable objects. (In Baker's algorithm, there is no write barrier to detect whether they have been incorporated into the graph or not.)

In the Dijkstra et al. scheme, objects are assumed not to be reachable when they're allocated. In terms of tricolor marking, objects are allocated white, rather than black. At some point, the stack must be traversed and the objects that are reachable at that time are marked and therefore preserved.

We believe that this has a potentially significant advantage over Baker's or Yuasa's schemes. Most objects are short-lived, so if the collector doesn't reach those objects early in its traversal, they're likely never to be reached, and instead to be reclaimed very promptly. Compared to Baker's or Yuasa's scheme, there's an extra computational cost: by assuming that all objects allocated during collection are reachable, those schemes avoid the cost of traversing and marking those that actually are reachable. On the other hand, there's a space benefit with the incremental update scheme: the majority of those objects can be reclaimed at the end of a collection, which is likely to make it worth traversing the others. In Steele's algorithm, some objects are allocated white and some are not, depending on the colors of their referents [Ste]. This heuristic attempts to allocate short-lived objects white to reclaim their space quickly, while treating other objects conservatively to avoid traversing them. The cost of this technique is not quantified, and its benefits are unknown.

Choosing Among Incremental Techniques

In choosing an incremental collection design, it is instructive to keep in mind the abstraction of tricolor marking, as distinct from mechanisms such as mark-sweep or copy collection. For example, Brooks' collector [Bro] is actually a write barrier algorithm, even though Brooks describes it as an optimization of Baker's scheme. Similarly, Dawson's [Daw] copy collection scheme is cast as a variant of Baker's, but it is actually an incremental update scheme, similar to Dijkstra et al.'s; objects are allocated in fromspace, i.e., white.

The choice of a read- or write-barrier scheme is likely to be made on the basis of the available hardware. Without specialized hardware support, a write barrier appears to be easier to implement efficiently, because heap pointer writes are much less common than pointer traversals.

Appel, Ellis and Li [AEL] use virtual memory (pagewise) access protection facilities as a coarse approximation of Baker's read barrier [AEL, AL, Wil]. Rather than checking each load to see if a pointer to fromspace is being loaded, the mutator is simply not allowed to see any page that might contain such a pointer. Objects in the scanned area of tospace are guaranteed to contain only pointers into tospace. Any pointers from tospace to fromspace must be from the unscanned area, so the collector simply access-protects the unscanned area, i.e., the grey objects. When the mutator accesses a protected page, a trap handler immediately scans the whole page, fixing up all the pointers (i.e., blackening all of the objects in the page); referents in fromspace are relocated to tospace (i.e., greyed) and access-protected.
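The pagewise barrier can be imitated in a simulation where a set of page numbers plays the role of the hardware protection bits. Everything here is an assumption made for illustration: PAGE_SIZE, Obj, PageBarrierCollector and access are hypothetical, a "page" is just a run of list slots, and a real implementation would rely on virtual memory protection and a trap handler rather than an explicit check in access.

```python
PAGE_SIZE = 2   # objects per tospace page (illustrative assumption)


class Obj:
    def __init__(self, name, fields=(), space="from"):
        self.name = name
        self.fields = list(fields)
        self.space = space
        self.forward = None
        self.addr = None            # slot index in tospace once copied


class PageBarrierCollector:
    def __init__(self):
        self.tospace = []
        self.protected = set()      # pages that may hold fromspace pointers

    def copy(self, obj):
        """Relocate a fromspace object (greying it) and protect its page."""
        if obj is None or obj.space == "to":
            return obj
        if obj.forward is None:
            new = Obj(obj.name, obj.fields, space="to")
            new.addr = len(self.tospace)
            self.tospace.append(new)
            self.protected.add(new.addr // PAGE_SIZE)
            obj.forward = new
        return obj.forward

    def scan_page(self, page):
        """Trap handler: blacken every object on the page, then unprotect it."""
        i = page * PAGE_SIZE
        # The bound is re-evaluated so objects copied onto this page
        # during the scan are scanned as well.
        while i < min((page + 1) * PAGE_SIZE, len(self.tospace)):
            obj = self.tospace[i]
            obj.fields = [self.copy(f) for f in obj.fields]
            i += 1
        self.protected.discard(page)

    def access(self, obj):
        """Mutator touches a tospace object; a protected page is scanned first."""
        page = obj.addr // PAGE_SIZE
        if page in self.protected:
            self.scan_page(page)
        return obj
```

The sketch makes the coarseness visible: a single access can trigger the scanning of a whole page, and each scanned page may in turn protect further pages.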

Unfortunately, this scheme fails to provide meaningful real-time guarantees in the general case. (It does support concurrent collection, however, and greatly reduces the cost of the read barrier.) In the worst case, each pointer traversal may cause the scanning of a page of tospace until the whole garbage collection is complete.

Of write barrier schemes, incremental update appears to be more effective than snapshot approaches, because most short-lived objects are reclaimed quickly, but with an extra cost in traversing newly-allocated live objects. This cost might be reduced by carefully choosing the ordering of root traversal, traversing the most stable structures first to avoid having the collector's work undone by mutator changes.

Careful attention should be paid to write barrier implementation. Boehm, Demers and Shenker's [BDS, Boe] incremental update algorithm uses virtual memory dirty bits as a coarse pagewise write barrier. All black objects in a page must be rescanned if the page is dirtied again before the end of a collection. As with Appel, Ellis and Li's copy collector, this coarseness sacrifices real-time guarantees, while supporting parallelism. It also allows the use of off-the-shelf compilers that don't emit write barrier instructions along with heap writes.

In a system with compiler support for garbage collection, a list of stored-into locations can be kept, or dirty bits can be maintained (in software) for small areas of memory, to reduce scanning costs and bound the time spent updating the marking traversal. This has been done for other reasons in generational garbage collectors, as we will discuss in the section on generational garbage collection.

Generational Garbage Collection

Given a realistic amount of memory, efficiency of simple copying garbage collection is limited by the fact that the system must copy all live data at a collection. In most programs in a variety of languages, most objects live a very short time, while a small percentage of them live much longer [LH, Ung, Sha, Zor, DeT, Hay]. While figures vary from language to language and program to program, usually between and percent of all newly-allocated objects die within a few million instructions, or before another megabyte has been allocated; the majority of objects die even more quickly, within tens of kilobytes of allocation.

The use of uniform indirections may be viewed as avoiding the need for a Baker-style read barrier: the indirections isolate the collector from changes made by the mutator, allowing them to be decoupled. The actual coordination, in terms of tricolor marking, is through a write barrier.

Ralph Johnson has improved on this scheme by incorporating lazier copying of objects to tospace [Joh]. This decreases the maximum latency, but in the (very unlikely) worst case, a page may still be scanned at each pointer traversal until a whole garbage collection has been done "the hard way."

Heap allocation is often used as a measure of program execution, rather than wall clock time, for two reasons. One is that it's independent of machine and implementation speed: it varies appropriately with the speed at which the program executes, which wall clock time does not; this avoids the need to continually cite hardware speeds. It is also appropriate to speak in terms of amounts allocated for garbage collection studies because the time between garbage collections is largely determined by the amount of memory available. Future improvements in compiler technology may reduce rates of heap allocation by putting more "heap" objects on the stack; this is not yet much of a problem for experimental studies, because most current state-of-the-art compilers don't do much of this kind of lifetime analysis.

Even if garbage collections are fairly close together, separated by only a few kilobytes of allocation, most objects die before a collection and never need to be copied. Of the ones that do survive to be copied once, however, a large fraction survive through many collections. These objects are copied at every scavenge, over and over, and the garbage collector spends most of its time copying the same old objects repeatedly. This is the major source of inefficiency in simple garbage collectors.

Generational collection [LH] avoids much of this repeated copying by segregating objects into multiple areas by age, and scavenging areas containing older objects less often than the younger ones. Once objects have survived a small number of scavenges, they are moved to a less frequently scavenged area. Areas containing younger objects are scavenged quite frequently, because most objects there will generally die quickly, freeing up space; copying the few that survive doesn't cost much. These survivors are advanced to older status after a few scavenges, to keep copying costs down.

For historical reasons and simplicity of explanation, we will focus on generational copying collectors. The choice of copying or marking collection is essentially orthogonal to the issue of generational collection, however [DWH].

Multiple Subheaps with Varying Scavenge Frequencies

Consider a generational garbage collector based on the semispace organization: memory is divided into areas that will hold objects of different approximate ages, or generations; each generation's memory is further divided into semispaces. The figure "A generational copying garbage collector before garbage collection" shows a simple generational scheme with just two age groups, a New generation and an Old generation. Objects are allocated in the New generation, until its current semispace is full. Then the New generation (only) is scavenged, copying its live data into the other semispace, as shown in the figure "Generational collector after garbage collection."

If an object survives long enough to be considered old, it can be copied out of the new generation and into the old, rather than back into the other semispace. This removes it from consideration by single-generation scavenges, so that it is no longer copied at every scavenge. Since relatively few objects live this long, old memory will fill much more slowly than new. Eventually, old memory will fill up and have to be garbage collected as well. The figure on memory use in a generational copy collector shows the general pattern of memory use in this simple generational scheme. (Note the figure is not to scale: the younger generation is typically several times smaller than the older one.)

The number of generations may be greater than two, with each successive generation holding older objects and being scavenged considerably less often. (Tektronix Smalltalk is such a generational system, using semispaces for each of eight generations [CWB].)
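The two-generation scheme can be sketched as a toy heap in which survivors are advanced after a fixed number of scavenges. The threshold PROMOTE_AFTER, the class names and the list-based "generations" are all assumptions made for illustration. For simplicity the sketch finds pointers from Old objects into the New generation by scanning every Old object; a practical collector would avoid that scan, as the discussion of intergenerational references explains.

```python
PROMOTE_AFTER = 2    # scavenges survived before advancement (assumed value)


class Obj:
    def __init__(self, name, fields=()):
        self.name = name
        self.fields = list(fields)
        self.age = 0
        self.old = False


class TwoGenerationHeap:
    def __init__(self):
        self.new, self.old = [], []

    def allocate(self, name, fields=()):
        obj = Obj(name, fields)
        self.new.append(obj)
        return obj

    def scavenge_new(self, roots):
        """Scavenge only the New generation, advancing long-lived survivors."""
        # Roots: the ordinary roots plus every pointer held in an Old object.
        pending = list(roots) + [f for o in self.old for f in o.fields]
        live = []
        while pending:
            obj = pending.pop()
            if obj.old or obj in live:
                continue                  # Old objects are not traversed
            live.append(obj)
            pending.extend(obj.fields)
        survivors = []
        for obj in live:
            obj.age += 1
            if obj.age >= PROMOTE_AFTER:
                obj.old = True            # advanced to the Old generation
                self.old.append(obj)
            else:
                survivors.append(obj)
        self.new = survivors              # unreached New objects are reclaimed
```

An object that dies before its first scavenge costs nothing to reclaim, while one that survives PROMOTE_AFTER scavenges stops being traced on every subsequent New-generation collection.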

Detecting Intergenerational References

In order for this scheme to work, it must be possible to scavenge the younger generation(s) without scavenging the older one(s). Since liveness of data is a global property, however, old-memory data must be taken into account. For example, if there is a pointer from old memory to new memory, that pointer must be found at scavenge time and used as one of the roots of the traversal. (Otherwise, an object that is live may not be preserved by the garbage collector, or the pointer may simply not be updated appropriately when the object is moved. Either event destroys the integrity and consistency of data structures in the heap.)

One must be careful, however, not to interpret it as the ideal abstract measure. For example, rates of heap allocation are somewhat higher in Lisp and Smalltalk, because more control information and/or intermediate data of computations may be passed as pointers to heap objects, rather than as structures on the stack.

Allocation-relative measures are still not the absolute bottom-line measure of garbage collector efficiency, though, because decreasing work per unit of allocation is not nearly as important if programs don't allocate much; conversely, smaller percentage changes in garbage collection work mean more for programs whose memory demands are higher.

Fig. A generational copying garbage collector before garbage collection (labels: Younger Generation, Older Generation, ROOT SET).

Fig. Generational collector after garbage collection (labels: Younger Generation, Older Generation, ROOT SET).

Fig. Memory use in a generational copy collector with semispaces for each generation (labels: First (New) Generation Memory, Second Generation Memory).

In the original generational collection scheme [LH], no pointer in old memory may point directly to an object in new memory; instead it must point to a cell in an indirection table, which is used as part of the root set. Such indirections are transparent to the user program. This technique was implemented on Lisp machines such as the MIT machines [Gre] and Texas Instruments Explorer [Cou]. (There are minor differences between the two, but the principles are the same.)

Note that other techniques are often more appropriate, especially on stock hardware. Using indirection tables introduces overhead similar to that of Baker's read barrier. A pointer recording technique can be used instead. Rather than indirecting pointers from old objects to young ones, normal (direct) pointers are allowed, but the locations of such pointers are noted so that they can be found at scavenge time. This requires something like a write barrier [Ung, Moo]; that is, the running program cannot freely modify the reachability graph by storing pointers into objects in an older generation.

The write barrier may do checking at each store, or it may be as simple as maintaining dirty bits and scanning dirty areas at collection time [Sha, Sob, WM, Wil, HMS]; the same mechanism might support real-time incremental collection as well.

The important point is that all references from old to new memory must be located at scavenge time, and used as roots for the copying traversal.

Using these intergenerational pointers as roots ensures that all reachable objects in the younger generation are actually reached by the collector; in the case of a copy collector, it ensures that all pointers to moved objects are appropriately updated.
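Pointer recording can be sketched with a write barrier that notes the locations of old-to-new stores and a scavenge that adds those locations to its roots. RememberedSet, scavenge_roots and reachable_young are hypothetical names chosen for this illustration, and the scavenge is reduced to a reachability computation so that the role of the recorded locations stands out.

```python
class Obj:
    def __init__(self, name, nfields=1, old=False):
        self.name = name
        self.fields = [None] * nfields
        self.old = old


class RememberedSet:
    def __init__(self):
        self.locations = []     # (old object, field index) pairs

    def write(self, obj, i, value):
        """Write barrier: note stores that create old-to-new pointers."""
        obj.fields[i] = value
        if obj.old and value is not None and not value.old:
            self.locations.append((obj, i))

    def scavenge_roots(self, roots):
        """Ordinary roots plus whatever the recorded locations hold now."""
        return list(roots) + [obj.fields[i] for obj, i in self.locations]


def reachable_young(roots):
    """Young objects reachable from the roots, without traversing old memory."""
    live, pending = [], list(roots)
    while pending:
        obj = pending.pop()
        if obj is None or obj.old or obj in live:
            continue
        live.append(obj)
        pending.extend(obj.fields)
    return live
```

Reading the recorded location at scavenge time, rather than saving the stored value, means a location that was later overwritten contributes only its current contents as a root.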

As in an incremental collector, this use of a write barrier results in a conservative approximation of true liveness; any pointers from old to new memory are used as roots, but not all of these roots are necessarily live themselves. An object in old memory may already have died, but that fact is unknown until the next time old memory is scavenged. Thus some garbage objects may be preserved because they are referred to from objects that are floating, undetected garbage. This appears not to be a problem in practice [Ung, UJ].

It would also be possible to track all pointers from new memory into old memory, allowing old memory to be scavenged independently of new memory. This is more costly, however, because there are typically many more pointers from new to old than from old to new. This is a consequence of the way references are typically created: by creating a new object that refers to other objects which already exist. Sometimes a pointer to a new object is installed in an old object, but this is considerably less common. This asymmetrical treatment allows object-creating code (like Lisp's frequently-used cons operation) to skip the recording of intergenerational pointers. Only non-initializing stores into objects must be checked for intergenerational references; writes that initialize objects in the youngest generation can't create pointers into younger ones.

Even if new-to-old pointers are not recorded, it may still be feasible to scavenge a generation without scavenging newer ones. In this case, all data in the newer generation(s) may be considered possible roots, and they may simply be scanned for pointers [LH]. While this scanning consumes time proportional to the amount of data in the newer generations, each generation is usually considerably smaller than the next, and the cost may be small relative to the cost of actually scavenging the older generation. (Scanning the data in the newer generation may be preferable to scavenging both generations, because scanning is generally faster than copying; it may also have better locality.)

The cost of recording intergenerational pointers is typically proportional to the rate of program execution, i.e., it's not particularly tied to the rate of object creation. For some programs, it may be the major cost of garbage collection, because several instructions must be executed for every potential pointer store into the heap. This may slow program execution down by several percent. (It is interesting to note that this pointer recording is essentially the same as that required for a write barrier incremental scheme; the same cost may serve both purposes.)

The main difference is that the original scheme used per-generation entry tables, indirecting and isolating the pointers into a generation. The Explorer used exit tables, indirecting the pointers out of each generation; for each generation, there is a separate exit table for pointers into each younger generation [Cou].

Ungar and Chambers' improvement [Cha] of our "card marking" scheme [WM, Wil] decreases the cost per heap write by using whole bytes as dirty "bits". Given the byte write instructions available on common architectures, the overhead is only three instructions per potential pointer store, at an increase in bitmap size and per-garbage-collection scanning cost.
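The byte-per-card form of dirty marking mentioned in the card marking footnote can be sketched as a table indexed by shifted address. CARD_SHIFT, CardTable and its methods are assumptions for illustration (the card size is arbitrary here), and the Python code only models the bookkeeping, not the instruction-level cost discussed above.

```python
CARD_SHIFT = 7   # 128-byte cards; an arbitrary size chosen for this sketch


class CardTable:
    def __init__(self, heap_size):
        # One whole byte per card, so marking is a plain byte store
        # with no read-modify-write of a shared bitmap word.
        self.cards = bytearray((heap_size >> CARD_SHIFT) + 1)

    def record_store(self, address):
        """Write barrier: mark the card containing the stored-into address."""
        self.cards[address >> CARD_SHIFT] = 1

    def dirty_ranges(self):
        """Address ranges the collector must scan for intergenerational pointers."""
        return [(i << CARD_SHIFT, (i + 1) << CARD_SHIFT)
                for i, dirty in enumerate(self.cards) if dirty]

    def clear(self):
        for i in range(len(self.cards)):
            self.cards[i] = 0
```

The barrier does not test what was stored; the cost of deciding whether a dirty card really contains an old-to-new pointer is deferred to scanning at collection time.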

Within the framework of the generational strategy weve outlined several imp ortant questions remain

Advancement policy Howlongmust an ob ject survive in one generation b efore it is advanced to the next

Ung WM

Heap organization. How should storage space be divided and used between generations, and within a
generation? Moo Ung Sha WM How does the resulting reuse pattern affect locality at the virtual
memory level Ung Zor WM, and at the level of high-speed cache memories Zor WLM?

Traversal algorithms. In a tracing collector, the traversal of live objects may have an important impact
on locality. In a copying collector, objects are also reordered in memory as they are reached by the copy
collector. What effect does this have on locality, and what traversal yields the best results? Bla Sta
And WLM

Collection scheduling. For a non-incremental collector, how might we avoid or mitigate the effect of
disruptive pauses, especially in interactive applications? Ung WM Can we improve efficiency by careful
opportunistic scheduling? WM Hay Can this be adapted to incremental schemes to reduce floating
garbage?

Intergenerational references. Since it must be possible to scavenge younger generations without scavenging
the older ones, we must be able to find the live pointers from older generations into the ones we're
scavenging. What is the best way to do this? WM BDS Appb Wil
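To make the advancement-policy question above concrete, here is a minimal sketch of one simple possibility: an object is advanced once it has survived a fixed number of scavenges. This is an illustrative assumption, not a policy attributed to any of the cited systems; the threshold `ADVANCE_AFTER`, the `Obj` class, and the `scavenge` function are all hypothetical names.

```python
from dataclasses import dataclass

ADVANCE_AFTER = 2   # hypothetical threshold: scavenges survived before advancement

@dataclass
class Obj:
    name: str
    age: int = 0    # number of scavenges survived in the young generation

def scavenge(young, old, is_live):
    """Scavenge the young generation once; return the new (young, old) lists."""
    survivors = []
    for obj in young:
        if not is_live(obj):
            continue                 # garbage: its space is simply reclaimed
        obj.age += 1
        if obj.age >= ADVANCE_AFTER:
            old.append(obj)          # advanced to the older generation
        else:
            survivors.append(obj)    # stays young for at least one more scavenge
    return survivors, old

# Usage: "c" dies before the first scavenge, "b" before the second.
young, old = [Obj("a"), Obj("b"), Obj("c")], []
live = {"a", "b"}
young, old = scavenge(young, old, lambda o: o.name in live)
live = {"a"}
young, old = scavenge(young, old, lambda o: o.name in live)
print([o.name for o in old])   # prints ['a']
```

A lower threshold advances objects sooner and risks filling the older generation with data that dies shortly afterwards; a higher one copies long-lived objects more times before advancing them.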

Conclusions

Recent advances in garbage collection technology make automatic storage reclamation affordable for use in
high-performance systems. Even relatively simple garbage collectors' performance is often competitive with
conventional explicit storage management App Zor. Generational techniques reduce the basic costs and
disruptiveness of collection by exploiting the empirically observed tendency of objects to die young. Stock
hardware incremental techniques may even make this relatively inexpensive for hard real-time systems.

We have discussed the basic operation of several kinds of garbage collectors, to provide a framework for
understanding current research in the field. A key point is that standard textbook analyses of garbage
collection algorithms usually miss the most important characteristics of collectors, namely the constant factors
associated with the various costs, including locality effects. These factors require garbage collection designers
to take detailed implementation issues into account, and be very careful in their choices of features.

Features also interact in important ways. Fine-grained incremental collection is unnecessary in most systems
without hard real-time constraints. Coarser incremental techniques may be sufficient, because the modest pause
times are acceptable AEL BDS, and the usually-short pauses of a stop-and-collect generational system
may be acceptable enough for many systems Ung WM. On the other hand, the write barrier support
for generational garbage collection could also support an incremental update scheme for incremental collection;
if this recording is cheap and precise enough, it might support fine-grained real-time collection at little cost.

In this introductory survey, we have not addressed the increasingly important areas of parallel Ste
KS DLM NR AEL SS and distributed LQP RMA JJ PS collection; we have also
given insufficient coverage of conservative collectors, which can be used with systems not originally designed
for garbage collection BW Bar Ede Wen WH. These developments have considerable promise
for making garbage collection widely available and practical; we hope that we've laid a proper foundation for
discussing them, by clarifying the basic issues.

Acknowledgments

I am grateful to innumerable people for enlightening discussions of heap management over the last few years,
including David Ungar, Eliot Moss, Henry Baker, Andrew Appel, Urs Holzle, Mike Lam, Tom Moher, Henry
Lieberman, Patrick Sobalvarro, Doug Johnson, Bob Courts, Ben Zorn, Mark Johnstone, and David Chase.
Special thanks to Hans Boehm, Joel Bartlett, David Moon, Barry Hayes, and especially to Janet Swisher, for
help in the preparation of this paper.

References

AEL Andrew W. Appel, John R. Ellis, and Kai Li. Real-time concurrent garbage collection on stock
multiprocessors. In Proceedings of SIGPLAN Conference on Programming Language Design and Implementation,
pages, SIGPLAN ACM Press, June, Atlanta, Georgia.

AL Andrew W. Appel and Kai Li. Virtual memory primitives for user programs. In Proceedings of the Fourth
International Conference on Architectural Support for Programming Languages and Operating Systems
(ASPLOS IV), pages, April, Santa Clara, CA.

And David L. Andre. Paging in Lisp programs. Master's thesis, University of Maryland, College Park, Maryland.

App Andrew W. Appel. Garbage collection can be faster than stack allocation. Information Processing Letters,
June.

Appa Andrew W. Appel. Runtime tags aren't necessary. Lisp and Symbolic Computation.

Appb Andrew W. Appel. Simple generational garbage collection and fast allocation. Software Practice and
Experience, February.

App Andrew W. Appel. Garbage collection. In Peter Lee, editor, Topics in Advanced Language Implementation,
pages, MIT Press, Cambridge, Massachusetts.

Bak Henry G. Baker, Jr. List processing in real time on a serial computer. Communications of the ACM,
April.

Bak Henry G. Baker, Jr. The Treadmill: Real-time garbage collection without motion sickness. ACM SIGPLAN
Notices, March.

Bar Joel F. Bartlett. Compacting garbage collection with ambiguous roots. Technical Report, Digital
Equipment Corporation Western Research Laboratory, Palo Alto, California, February.

BDS Hans-J. Boehm, Alan J. Demers, and Scott Shenker. Mostly parallel garbage collection. In SIGPLAN
Symposium on Programming Language Design and Implementation, pages, June, Toronto,
Ontario, Canada.

Bla Ricki Blau. Paging on an object-oriented personal computer for Smalltalk. In Proceedings of the ACM
SIGMETRICS Conference on Measurement and Modeling of Computer Systems, August, Minneapolis,
MN. Also appears as Technical Report UCB/CSD, University of California at Berkeley, Computer
Science Division (EECS), Berkeley, California, August.

Bob Daniel G. Bobrow. Managing reentrant structures using reference counts. ACM Transactions on
Programming Languages and Systems, July.

Boe Hans-Juergen Boehm. Hardware and operating system support for conservative garbage collection. In
International Workshop on Memory Management, pages, Palo Alto, California, October. IEEE
Press.

Bro Rodney A. Brooks. Trading data space for reduced time and code space in real-time collection on stock
hardware. In Proceedings of the ACM Symposium on Lisp and Functional Programming, pages,
August, Austin, Texas.

BW Hans-Juergen Boehm and Mark Weiser. Garbage collection in an uncooperative environment. Software
Practice and Experience, September.

CG Douglas W. Clark and C. Cordell Green. An empirical study of list structure in LISP. Communications of
the ACM, February.

Cha Craig Chambers. The Design and Implementation of the SELF Compiler, an Optimizing Compiler for an
Object-Oriented Programming Language. PhD thesis, Stanford University, March.

Che C. J. Cheney. A nonrecursive list compacting algorithm. Communications of the ACM,
November.

Cla Douglas W. Clark. Measurements of dynamic list structure use in Lisp. IEEE Transactions on Software
Engineering, January.

CN Jacques Cohen and Alexandru Nicolau. Comparison of compacting algorithms for garbage collection. ACM
Transactions on Programming Languages and Systems, October.

Coh Jacques Cohen. Garbage collection of linked data structures. Computing Surveys, September.

Col George E. Collins. A method for overlapping and erasure of lists. Communications of the ACM,
December.

Cou Robert Courts. Improving locality of reference in a garbage-collecting memory management system.
Communications of the ACM, September.

CWB Patrick J. Caudill and Allen Wirfs-Brock. A third-generation Smalltalk implementation. In Norman
Meyrowitz, editor, ACM SIGPLAN Conference on Object Oriented Programming Systems, Languages
and Applications (OOPSLA), pages, September. Also published as ACM SIGPLAN
Notices, November.

Daw Jeffrey L. Dawson. Improved effectiveness from a real-time LISP garbage collector. In SIGPLAN
Symposium on LISP and Functional Programming, pages, August.

DB L. Peter Deutsch and Daniel G. Bobrow. An efficient incremental automatic garbage collector.
Communications of the ACM, September.

DeT John DeTreville. Experience with concurrent garbage collectors for Modula. Technical Report,
Digital Equipment Corporation Systems Research Center, Palo Alto, California, August.

DLM Edsger W. Dijkstra, Leslie Lamport, A. J. Martin, C. S. Scholten, and E. F. M. Steffens. On-the-fly garbage
collection: An exercise in cooperation. Communications of the ACM, November.

DMH Amer Diwan, Eliot Moss, and Richard Hudson. Compiler support for garbage collection in a statically
typed language. In SIGPLAN Symposium on Programming Language Design and Implementation, pages,
San Francisco, California, June.

DWH Alan Demers, Mark Weiser, Barry Hayes, Daniel Bobrow, and Scott Shenker. Combining generational
and conservative garbage collection: Framework and implementations. In Conf. Record of the Seventeenth
Annual ACM Symposium on Principles of Programming Languages, pages, January, Las
Vegas, Nevada.

Ede Daniel Ross Edelson. Dynamic storage reclamation in C. Technical Report UCSC-CRL,
University of California at Santa Cruz, June.

FY Robert R. Fenichel and Jerome C. Yochelson. A LISP garbage-collector for virtual-memory computer
systems. Communications of the ACM, November.

Gol Benjamin Goldberg. Tag-free garbage collection for strongly-typed programming languages. In SIGPLAN
Symposium on Programming Language Design and Implementation, pages, June, Toronto,
Ontario, Canada.

Gre Richard Greenblatt. The LISP machine. In D. R. Barstow, H. E. Shrobe, and E. Sandewall, editors,
Interactive Programming Environments. McGraw Hill.

Hay Barry Hayes. Using key object opportunism to collect old objects. In ACM SIGPLAN Conference on
Object Oriented Programming Systems, Languages and Applications (OOPSLA), pages, Phoenix,
Arizona, October. ACM Press.

HMS Antony L. Hosking, J. Eliot B. Moss, and Darko Stefanovic. A comparative performance evaluation of write
barrier implementations. In ACM SIGPLAN Conference on Object Oriented Programming Systems,
Languages and Applications (OOPSLA), Vancouver, British Columbia, Canada, October. To
appear.

JJ Niels Christian Juul and Eric Jul. Comprehensive and robust garbage collection in a distributed system.
In International Workshop on Memory Management, St. Malo, France, September. Springer-Verlag
Lecture Notes in Computer Science series.

Joh Douglas Johnson. The case for a read barrier. In Fourth International Conference on Architectural Support
for Programming Languages and Operating Systems (ASPLOS IV), pages, Santa Clara, California,
April.

Joh Ralph E. Johnson. Reducing the latency of a real-time garbage collector. ACM Letters on Programming
Languages and Systems, March.

Knu Donald E. Knuth. The Art of Computer Programming, Volume, Fundamental Algorithms, chapter,
pages, Addison-Wesley, Reading, Massachusetts.

KS H. T. Kung and S. W. Song. An efficient parallel garbage collection system and its correctness proof. In
IEEE Symposium on Foundations of Computer Science, pages, Providence, Rhode Island, October.

Lar R. G. Larson. Minimizing garbage collection as a function of region size. SIAM Journal on Computing,
December.

LH Henry Lieberman and Carl Hewitt. A real-time garbage collector based on the lifetimes of objects.
Communications of the ACM, June.

LQP Bernard Lang, Christian Queinnec, and Jose Piquer. Garbage collecting the world. In ACM Symposium on
Principles of Programming Languages, pages, Albuquerque, New Mexico, January.

McB J. Harold McBeth. On the reference counter method. Communications of the ACM, September.

McC John McCarthy. Recursive functions of symbolic expressions and their computation by machine.
Communications of the ACM, April.

Min Marvin Minsky. A LISP garbage collector algorithm using serial secondary storage. AI Memo, Project
MAC, MIT, Cambridge, Massachusetts.

Moo David Moon. Garbage collection in a large Lisp system. In Conference Record of the ACM Symposium
on Lisp and Functional Programming, pages, Austin, Texas, August.

Nil Kelvin Nilsen. Garbage collection of strings and linked data structures in real time. Software Practice and
Experience, July.

NR S. C. North and J. H. Reppy. Concurrent Garbage Collection on Stock Hardware, pages. Number
in Lecture Notes in Computer Science. Springer-Verlag, September.

PS David Plainfosse and Marc Shapiro. Experience with fault tolerant garbage collection in a distributed Lisp
system. In International Workshop on Memory Management, St. Malo, France, September.
Springer-Verlag Lecture Notes in Computer Science series.

RMA G. Ringwood, E. Miranda, and S. Abdullahi. Distributed garbage collection. In International Workshop
on Memory Management, St. Malo, France, September. Springer-Verlag Lecture Notes in Computer
Science series.

Rov Paul Rovner. On adding garbage collection and runtime types to a strongly-typed, statically checked,
concurrent language. Technical Report CSL, Xerox Palo Alto Research Center, Palo Alto, California,
July.

Sha Robert A. Shaw. Empirical Analysis of a Lisp System. PhD thesis, Stanford University, Stanford, California,
February. Also appears as Technical Report CSL-TR, Stanford University Computer Systems
Laboratory.

Sob Patrick G. Sobalvarro. A lifetime-based garbage collector for LISP systems on general-purpose computers.
B.S. thesis, Massachusetts Institute of Technology, Electrical Engineering and Computer Science
Department, Cambridge, Massachusetts.

SS Ravi Sharma and Mary Lou Soffa. Parallel generational garbage collection. In ACM SIGPLAN
Conference on Object Oriented Programming Systems, Languages and Applications (OOPSLA), pages,
Phoenix, Arizona, October.

Sta James William Stamos. Static grouping of small objects to enhance performance of a paged virtual memory.
ACM Transactions on Programming Languages and Systems, May.

Ste Guy L. Steele Jr. Multiprocessing compactifying garbage collection. Communications of the ACM,
September.

UJ David Ungar and Frank Jackson. Tenuring policies for generation-based storage reclamation. In ACM
SIGPLAN Conference on Object Oriented Programming Systems, Languages and Applications (OOPSLA),
pages, San Diego, California, September. ACM. Also published as ACM SIGPLAN Notices,
November.

Ung David M. Ungar. Generation scavenging: A non-disruptive high-performance storage reclamation
algorithm. In ACM SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development
Environments, pages, Pittsburgh, Pennsylvania, April. Also distributed as ACM SIGPLAN
Notices, May.

Wen E. P. Wentworth. Pitfalls of conservative garbage collection. Software Practice and Experience,
July.

WH Paul R. Wilson and Barry Hayes. The OOPSLA Workshop on Garbage Collection in Object Oriented
Systems (organizers' report). In Addendum to the proceedings of OOPSLA, Phoenix, Arizona.

Wil Paul R. Wilson. Some issues and strategies in heap management and memory hierarchies. In
OOPSLA/ECOOP Workshop on Garbage Collection in Object-Oriented Systems, Ottawa, Ontario, Canada,
October. Also in SIGPLAN Notices, January.

Wil Paul R. Wilson. Operating system support for small objects. In International Workshop on Object
Orientation in Operating Systems, Palo Alto, California, October. IEEE Press. Revised version to appear
in Computing Systems.

WLM Paul R. Wilson, Michael S. Lam, and Thomas G. Moher. Effective static-graph reorganization to improve
locality in garbage-collected systems. In SIGPLAN Symposium on Programming Language Design and
Implementation, pages, Toronto, Canada, June.

WLM Paul R. Wilson, Michael S. Lam, and Thomas G. Moher. Caching considerations for generational garbage
collection. In SIGPLAN Symposium on LISP and Functional Programming, San Francisco, California, June.

WM Paul R. Wilson and Thomas G. Moher. Design of the opportunistic garbage collector. In ACM SIGPLAN
Conference on Object Oriented Programming Systems, Languages and Applications (OOPSLA),
pages, New Orleans, Louisiana, October.

Yua Taichi Yuasa. Real-time garbage collection on general-purpose machines. Journal of Systems and Software.

Zor Benjamin Zorn. Comparative Performance Evaluation of Garbage Collection Algorithms. PhD thesis,
University of California at Berkeley, Electrical Engineering and Computer Science Department, Berkeley,
California, December. Also appears as Technical Report UCB/CSD, University of California
at Berkeley.

Zor Benjamin Zorn. Comparing mark-and-sweep and stop-and-copy garbage collection. In ACM
Conference on Lisp and Functional Programming, pages, Nice, France, June.

Zor Benjamin Zorn. The effect of garbage collection on cache performance. Technical Report CU-CS,
University of Colorado at Boulder, Dept. of Computer Science, Boulder, Colorado, May.

Zor Benjamin Zorn. The measured cost of conservative garbage collection. Technical report, University of
Colorado at Boulder, Dept. of Computer Science, Boulder, Colorado.

This article was processed using the LaTeX macro package with LLNCS style.