<<

Unipro cessor Garbage Collection Techniques

Submitted to ACM Computing Surveys

Paul R Wilson

Abstract Incremental Tracing Collectors

Coherence and Conservatism

We survey basic garbage collection and

Tricolor Marking

variations such as incremental and generational collec

Incremental approaches

tion we then discuss lowlevel implementation consid

Write Barrier Algorithms

erations and the relationships b etween storage man

Snapshotatb eginning

agement systems languages and compilers Through

Algorithms

out we attempt to present a unied view based on

Incremental Up date WriteBar

abstract traversal strategies addressing issues of con

rier Algorithms

servatism opp ortunism and immediacy of reclama

Bakers Read Barrier Algorithms

tion we also p ointoutavariety of implementation

Incremental Copying

details that are likely to have a signicant impact on

ers Incremental Noncopy Bak

p erformance

ing The Treadmill

Conservatism of Bakers Read

Barrier

Contents

Variations on the Read Barrier

Replication Copying Collection

Automatic Storage Reclamation

Coherence and Conservatism Revisited

Motivation

Coherence and Conservatism in

The TwoPhase Abstraction

Noncopying collection

Ob ject Representations

Coherence and Conservatism in

Copying Collection

Overview of the Pap er

Radical Collection and Op

Basic Garbage Collection Techniques

p ortunistic Tracing

Comparing Incremental Techniques

Realtime Tracing Collection

The Problem with Cycles

Ro ot Set Scanning

The Eciency Problem

Guaranteeing Sucient Progress

Deferred Reference Counting

Trading worstcase p erformance

V ariations on Reference Counting

for exp ected p erformance

MarkSweep Collection

Discussion

MarkCompact Collection

Cho osing an Incremental Algorithm

Copying Garbage Collection

A Simple Copying Collector

Generational Garbage Collection

StopandCopy Using Semi

Multiple Subheaps with Varying Col

spaces

lection Frequencies

Eciency of Copying Collection

AdvancementPolicies

NonCopying Implicit Collection

Heap Organization

Cho osing Among Basic Tracing Tech

Subareas in copying schemes

niques

Generations in Noncopying

Problems with Simple Tracing Collectors

Schemes

Conservatism in Garbage Collection Discussion

Tracking Intergenerational References Automatic Storage Reclama

Indirection Tables

tion

Ungars Remembered Sets

Garbage col lection is the automatic reclamation of

Page Marking

computer storage Knu Coh App While in

Word marking

many systems programmers must explicitly reclaim

Card Marking

heap memory at some p oint in the program byus

Store Lists

ing a free or disp ose statement garbage collected

Discussion

systems free the programmer from this burden The

The Generational Principle Revisited



garbage collectors function is to nd data ob jects

Pitfalls of Generational Collection

that are no longer in use and make their space avail

The Pig in the Snake Problem

able for reuse by the the running program An ob ject

Small Heapallo cated Ob jects

is considered garbage and sub ject to reclamation if it

is not reachable by the running program via any path

Large Ro ot Sets

of p ointer traversals Live p otentially reachable ob

Realtime Generational Collection

jects are preserved by the collector ensuring that the

program can never traverse a dangling p ointer into

Lo cality Considerations

a deallo cated ob ject

Varieties of Lo cality Eects

This pap er surveys basic and advanced techniques

Lo cality of Allo cation and Shortlived

in unipro cessor garbage collectors esp ecially those de

ob jects

velop ed in the last decade For a more thorough

Lo calityofTracing Traversals

treatment of older techniques see Knu Coh

Clustering of LongerLived Ob jects

While it do es not cover parallel or distributed col

Static Grouping

lection it presents a unied taxonomy of incremen

Dynamic Reorganization

tal techniques whichlays the groundwork for under

standing parallel and distributed collection Our fo

Co ordination with Paging

cus is on garbage collection for pro cedural and ob ject

oriented languages but much of the information here

Lowlevel Implementation Issues

will serveasbackground for understanding garbage

Pointer Tags and Ob ject Headers

collection of other kinds of systems such as functional

ConservativePointer Finding

or logic programming languages For further reading

Linguistic Supp ort and Smart Pointers

on various advanced topics in garbage collection the

Compiler Co op eration and Optimizations



pap ers collected in BC are a go o d starting p oint

GCAnytime vs SafePoints

Collection

Motivation

Partitioned

Register Sets vs Variable Rep

Garbage collection is necessary for fully mo dular pro

resentation Recording

gramming to avoid intro ducing unnecessary inter

mo dule dep endencies Asoftware routine op erating

Optimization of Garbage Col

on a should not havetodependwhat

lection Itself

Free Storage Management

We use the term heap in the simple sense of a storage

managementtechnique whichallows any dynamically allo cated

Compact Representations of Heap Data

ob ject to b e freed at any timethis is not to b e confused with

heap data structures which maintain ordering constraints

GCrelated Language Features



We use the term ob ject lo osely to include anykindof

structured data record suchasPascal records or structs as

Weak Pointers

well as fulledged ob jects with encapsulation and inheritance

Finalization

in the sense of ob jectoriented programming



Multiple DifferentlyManaged Heaps

There is also a rep ository of pap ers in PostScript for

mat available for anonymous Internet FTP from our FTP

host csutexasedupubgarbage Among other things this

Overall Cost of Garbage Collection

rep ository contains collected pap ers from several garbage col

lection workshops held in conjunction with ACM OOPSLA

conferences Conclusions and Areas for Research

other routines may b e op erating on the same struc grammers nd the source of leaked ob jects in lan

ture unless there is some go o d reason to co ordinate guages with explicit deallo cation HJ and these can

their activities If ob jects must b e deallo cated explic b e extremely valuable Unfortunately these to ols only

itly some mo dule must b e resp onsible for knowing nd actual leaks during particular program runs not

when other mo dules are not interested in a particular p ossible leaks due to uncommon execution patterns

ob ject Finding the source of a leaked ob ject do es not always

solve the problem either the programmer still must

Since livenessisaglobal prop erty this intro duces

b e able to determine a p oint where the ob ject should

nonlo cal b o okkeeping into routines that might other

b e deallo catedif one exists If one do esnt exist the

wise b e lo cally understandable and exibly comp os

program must b e restructured This kind of gar

able This b o okkeeping inhibits abstraction and re

bage debugging is b etter than nothing but it is very

duces extensibility b ecause when new functionalityis

fallible and it must b e rep eated whenever programs

implemented the b o okkeeping co de must b e up dated

change it is desirable to actually eliminate leaks in

The runtime cost of the b o okkeeping itself may b e sig

general rather than certain detectable leaks in partic

nicant and in some cases it mayintro duce the need

ular

for additional synchronization in concurrent applica

tions

Explicit allo cation and reclamation lead to program

The unnecessary complications and subtle interac

errors in more subtle ways as well It is common for

tions created by explicit storage allo cation are esp e

programmers to allo cate a mo derate number of ob

cially troublesome b ecause programming mistakes of

jects statically so that it is unnecessary to allo cate

ten break the basic abstractions of the programming

them on the heap and decide when and where to re

language making errors hard to diagnose Failing to

claim them This leads to xed limitations on pro

reclaim memory at the prop er p ointmayleadtoslow

grams making them fail when those limitations are

memory leaks with unreclaimed memory gradually ac

exceeded p ossibly years later when computer memo

cumulating until the pro cess terminates or swap space

ries and data sets are much larger This brittleness

is exhausted Reclaiming memory to o so on can lead to

makes co de much less reusable b ecause the undo cu

very strange b ehavior b ecause an ob jects space may

mented limits cause it to fail even if its b eing used in

b e reused to store a completely dierent ob ject while

away consistent with its abstractions For example

an old p ointer still exists The same memory may

many compilers fail when faced with automatically

therefore b e interpreted as two dierent ob jects simul

generated programs that violate assumptions ab out

taneously with up dates to one causing unpredictable

normal programming practices

mutations of the other

These problems lead many applications program

These programming errors are particularly dan

mers to implement some form of applicationsp ecic

gerous b ecause they often fail to show up rep eat

garbage collection within a large software system to

ably making debugging very dicultthey maynever

avoid most of the headaches of explicit storage man

show up at all until the program is stressed in an un

agement Many large programs have their own data

usual wa y If the allo cator happ ens not to reuse a

typ es that implement reference counting for example

particular ob jects space a dangling p ointer maynot

Because they are co ded up for a oneshot application

cause a problem Later after delivery the application

these collectors are often b oth incomplete and buggy

may crash when it makes a dierent set of memory de

The garbage collectors themselves are therefore often

mands or is linked with a dierent allo cation routine

unreliable as well as b eing hard to use b ecause they

A slow leak may not b e noticeable while a program is

are not in tegrated into the

b eing used in normal waysp erhaps for manyyears

The fact that such kludges exist despite these prob

b ecause the program terminates b efore to o much extra

lems is a testimony to the value of garbage collection

space is used But if the co de is incorp orated into a

and it suggests that garbage collection should b e part

longrunning server program the server will eventu

of programming language implementations



ally exhaust the available memory and crash

It is widely b elieved that garbage collection is quite

Recentlytoolshave b ecome available to help pro

exp ensive relative to explicit heap management but



Longrunning server programs are also esp ecially vulnerable

several studies have shown that garbage collection is

to leaks due to exception handling Exception handling co de

sometimes cheap er App than explicit deallo cation

may fail to deallo cate all of the ob jects allo cated by an ab orted

and is usually comp etitive with it Zor As wewill

op eration and these o ccasional failures may cause a leak that

is extremely hard to diagnose explain later a wellimplemented garbage collector

should slow running programs down byvery roughly oriented programming language Smalltalk incorp o

p ercent relative to explicit heap deallo cation for rates garbage collection more recently garbage collec



a highp erformance system A signicantnumber of tion has b een incorp orated into many generalpurp ose

programmers regard such a cost as unacceptable but languages such as Eiel Self and Dylan including

many others b elieve it to b e a price for the b en those designed in part for lowlevel systems program

ets in convenience development time and reliability ming such as Mo dula and Ob eron Several addon

packages also exist to retrot C and C with gar

Reliable cost comparisons are dicult however

bage collection

partly b ecause the use of explicit deallo cation aects

In the rest of this pap er we fo cus on garbage col

the structure of programs in ways that may themselves

lectors that are built into a language implementation

b e exp ensive either directly or by their impact on the

or grafted onto a language by imp orting routines from

software development pro cess

a library The usual arrangement is that the heap al

For example explicit heap management often moti

lo cation routines p erform sp ecial actions to reclaim

vates extra copying of ob jects so that deallo cation de

space as necessary when a memory request is not

cisions can b e made lo callyie eachmodulemakes

easily satised Explicit calls to the deallo cator

its own copy of a piece of information and can deal

are unnecessary b ecause calls to the collector are im

lo cate it when it is nished with it This not only in

plicit in calls to the allo catorthe allo cator invokes

curs extra heap allo cation but undermines an ob ject

the garbage collector as necessary to free up the space

oriented design strategy where the identities of ob

it needs

jects may b e as imp ortantasthevalues they store

Most collectors require some co op eration from the

The eciency cost of this extra copyingishardto

compiler or interpreter as well ob ject formats must

measure b ecause you cant fairly compare the same

b e recognizable by the garbage collector and certain

program with and without garbage collection the pro

invariants must b e preserved by the running co de De

gram would have b een written dierently if garbage

p ending on the details of the garbage collector this

collection were assumed

may require slightchanges to the compilers co de gen

In the long run p o or program structure may incur

erator to emit certain extra information at compile

extra development and maintenance costs and may

time and p erhaps execute dierent instruction se

cause programmer time to b e sp ent on maintaining in

quences at run time Bo e WH BC DMH

elegant co de rather than optimizing timecritical parts

Contrary to widespread misconceptions there is no

of applications even if garbage collection costs more

conict b etween using a compiled language and gar

than explicit deallo cation the savings in human re

bage collection stateofthe art implementations of

sources maypay for themselves in increased attention



garbagecollected languages use sophisticated optimi

to other asp ects of the system

zing compilers

For these reasons garbagecollected languages ha ve

long b een used for the programming of sophisticated

algorithms using complex data structures Many

The TwoPhase Abstraction

garbagecollected languages such as Lisp and Pro

Garbage collection automatically reclaims the space

log were originally p opular for articial intelligence

o ccupied by data ob jects that the running program

programming but have b een found useful for general

can never access again Such data ob jects are referred

purp ose programmingFunctional and logic program

to as garbage The basic functioning of a garbage col

ming languages generally incorp orate garbage col

lector consists abstractly sp eaking of two parts

lection b ecause their unpredictable execution pat

terns make it esp ecially dicult to explicitly pro

Distinguishing the live ob jects from the garbage

gram storage deallo cation The inuential ob ject

in some waygarbage detection and



This is an estimate on our part and in principle wethink

Reclaiming the garbage ob jects storage so that

garbage collection p erformance could b e somewhat b etter in

the running program can use it garbage reclama

practice it is sometimes worse Reasons for and limitations

of such an estimate will b e discussed in Sect One practical

tion

problem is that stateoftheart garbage collectors havenotgen

erally b een available for most highp erformance programming

In practice these two phases may b e functionally

systems

or temp orally interleaved and the reclamation tech



For example Rovner rep orts that an estimated of de

nique is strongly dep endent on the garbage detection

velop er eort in the Mesa system was sp ent dealing with dicult

storage management issues Rov technique

In general garbage collectors use a liveness cri an address This tagged representation supp orts p oly

terion that is somewhat more conservativethanthose morphic elds whichmaycontain either one of these

used by other systems In an immediate ob jects or a p ointer to an ob ject on the

avalue may b e considered dead at the p ointthatit heap Usually these short tags are augmented byad

can never b e used again by the running program as ditional information in heapallo cated ob jects head

determined bycontrol ow and data ow analysis A ers

garbage collector typically uses a simpler less dynamic

For a purely staticallytyp ed language no p er

criterion dened in terms of a root set and reachability

ob ject runtime typ e information is actually necessary

from these ro ots

except the typ es of the ro ot set variables This will b e

At the moment the garbage collector is invoked the

discussed in Sect Despite this headers are often

activevariables are considered live Typically this in

used for staticallytyp ed languages b ecause it sim

cludes staticallyallo cated global or mo dule variables

plies implementations at little cost Conventional

as well as lo cal variables in activation records on the

explicit heap management systems often use ob ject

activation stacks and anyvariables currently in reg

headers for similar reasons

isters These variables form the root set for the traver

Garbage collectors using conservative pointer nd

sal Heap ob jects directly reachable from anyofthese

ing BW are usable with little or no co op era

variables could b e accessed by the running program

en the typ es of named tion from the compilernot ev

so they must b e preserved In addition since the pro

variablesbut we will defer discussion of these collec

gram might traverse p ointers from those ob jects to

tors until Sect

reach other ob jects any ob ject reachable from a live

ob ject is also live Thus the set of live ob jects is sim

Overview of the Pap er

ply the set of ob jects on any directed path of p ointers

from the ro ots

The remainder of this pap er will discuss basic and

Any ob ject that is not reachable from the ro ot set

advanced topics in garbage collection

is garbage ie useless b ecause there is no legal se

The basic algorithms include reference count

quence of program actions that would allow the pro

ing marksweep markcompact copying and non

gram to reach that ob ject Garbage ob jects therefore

copying implicit collection these are discussed in

cant aect the future course of the computation and

Sect

their space may b e safely reclaimed

Incremental techniques Sect allow garbage col

lection to pro ceed piecemeal while applications are

Ob ject Representations

running These techniques can reduce the disruptive

ness of garbage collection and mayeven provide real

In most of this pap er wemake the simplifying as

time guarantees They can also b e generalized into

sumption that heap ob jects are selfidentifying ie

concurrent collections which pro ceed on another pro

that it is easy to determine the typeofanobjectatrun

cessor in parallel with actual program execution

time Implementations of staticallytyp ed garbage col

Generational schemes Sect improve eciency

lected languages typically have hidden header elds

andor lo calityby garbage collecting a smaller area

on heap ob jects ie an extra eld containing typ e in

more often while exploiting typical lifetime character

formation which can b e used to deco de the format of

istics to avoid undue overhead from longlived ob jects

the ob ject itself This is esp ecially useful for nding

Because most collections are of a small area typical

p ointers to other ob jects Such information can eas

pause times are also short and for many applications

ily b e generated by the compiler whichmust havethe

this is an acceptable alternative to incremental collec

information to generate correct co de for references to

tion

ob jects elds

Section discusses lo cality prop erties of garbage Dynamicallytyp ed languages such as Lisp and

collected systems which are rather dierentfrom Smalltalk usually use tagged pointers a slightly short

thoseofconventional systems Section explores low ened representation of the hardware address is used

level implementation considerations such as ob ject with a small typ eidentifying eld in place of the miss

formats and compiler co op eration Section describ es ing address bits This also allows short immutable ob

languagelevel constraints and features for garbage jects in particular small integers to b e represented

collected systems Section presents the basic con as unique bit patterns stored directly in the address

clusions of the pap er and sketc part of the eld rather than actually referred to by hes researchissuesin

garbage collection of parallel distributed and p ersis t systems

ten HEAP SPACE

Basic Garbage Collection

1

Techniques

2

en the basic twopart op eration of a garbage collec Giv ROOT

SET

yvariations are p ossible The rst part dis

tor man 1 1

e ob jects from garbage maybedonein

tinguishing liv 1

twoways by referencecountingorby tracingThe

1 2 1 general term tracing used to include b oth marking

and copying techniques is taken from LD Ref

erence counting garbage collectors maintain counts of

the number of pointers to each ob ject and this count

Figure Reference counting

is used as a lo cal approximation of determining true

liveness Tracing collectors determine liveness more

directlyby actually traversing the p ointers that the

have their reference counts decremented since refer

program could traverse to nd all of the ob jects the

ences from a garbage ob ject dont count in determin

program might reach There are several varieties of

ing liveness Reclaiming one ob ject may therefore lead

tracing collection marksweep markcompact copy

to the transitive decrementing of reference counts and



ing and noncopying implicit reclamation Because

reclaiming many other ob jects For example if the

each garbage detection scheme has a ma jor inuence

only p ointer into some large data structure b ecomes

on reclamation and on reuse techniques we will intro

garbage all of the reference counts of the ob jects in

duce reclamation metho ds as we go

that structure typically b ecome zero and all of the

ob jects are reclaimed

In terms of the abstract twophase garbage collec

Reference Counting

tion the adjustment and checking of reference counts

In a reference counting system Col eachobject

implements the rst phase and the reclamation phase

has an asso ciated count of the references p ointers to

o ccurs when reference counts hit zero These op era

it Each time a reference to the ob ject is created eg

tions are b oth interleaved with the execution of the

when a p ointer is copied from one place to another

program b ecause they may o ccur whenever a p ointer

by an assignment the p ointedto ob jects count is in

is created or destroyed

cremented When an existing reference to an ob ject

One advantage of reference counting is this incre

is eliminated the count is decremented See Fig

mental nature of most of its op erationgarbage col

The memory o ccupied by an ob ject may b e reclaimed

lection work up dating reference counts is interleaved

when the ob jects count equals zero since that in

closely with the running programs own execution It

dicates that no p ointers to the ob ject exist and the

can easily b e made completely incremental and real

running program cannot reach it

time that is p erforming at most a small and b ounded

In a straightforward reference counting system each

amountofwork p er unit of program execution

ob ject typically has a header eld of information de

Clearly the normal reference count adjustments are

scribing the ob ject which includes a subeld for the

intrinsically incremental never involving more than a

reference count Like other header information the

few op erations for any given op eration that the pro

reference count is generally not visible at the language

gram executes The transitive reclamation of whole

level

data structures can b e deferred and also done a lit

When the ob ject is reclaimed its p ointer elds are

tle at a time bykeeping a list of freed ob jects whose

examined and any ob jects it holds p ointers to also

reference counts have b ecome zero but whichha vent



Some authors use the term garbage collection in a nar

yet b een pro cessed

rower sense which excludes reference counting andor copycol

This incremental collection can easily satisfy real

lection systems we prefer the more inclusive sense b ecause of its

time requirements guaranteeing that memory man

p opular usage and b ecause its less awkward than automatic

storage reclamation agement op erations never halt the executing program

for more than a very brief p erio d This can supp ort

h guaranteed resp onse time is crit

applications in whic HEAP SPACE

ical incremental collection ensures that the program

is allowed to p erform a signicant though p erhaps ap

1

preciably reduced amountofwork in any signicant

amount of time Subtleties of realtime requirements

will b e discussed in the context of tracing collection 1 in Sect ROOT

SET ting systems

One minor problem with reference coun 1 1

ts themselves take up space

is that the reference coun 1

In some systems a whole machine word is used for

h ob jects reference count eld actually allowing

eac 1 2 1

it to representanynumberofpointers that mightac

tually exist in the whole system In other systems a

shorter eld is used with a provision for overowif

Figure Reference counting with unreclaimable cy

the reference count reaches the maximum that can b e

cle

represented by the eld size its count is xed at that

maximum value and the ob ject cannot b e reclaimed

Such ob jects and other ob jects reachable from them

times formed by the use of hybrid data structures

must b e reclaimed by another mechanism typically by

whichcombine advantages of simpler data structures

a tracing collector that is run o ccasionally as wewill

as well as when the applicationdomain semantics of

explain b elow such a fallback reclamation strategy is

data are most naturally expressed with cycles

usually required anyway

Systems using reference counting garbage collectors

There are two ma jor problems with reference count

therefore usually include some other kind of garbage

ing garbage collectors they are not always eective

collector as well so that if to o much uncollectable

and they are dicult to make ecient

cyclic garbage accumulates the other metho d can b e

used to reclaim it

Many programmers who use referencecounting sys

The Problem with Cycles

tems suchasInterlisp and early versions of Smalltalk

The eectiveness problem is that reference counting

have mo died their programming style to avoid the

fails to reclaim circular structures If the p ointers

creation of cyclic garbage or to break cycles b efore

in a group of ob jects create a directed cycle the

they b ecome a nuisance This has a negative impact

ob jects reference counts are never reduced to zero

on program structure and many programs still have

even if thereisnopath to the objects from the root set

storage leaks that accumulate cyclic garbage which



McB

must b e reclaimed by some other means These leaks

Figure illustrates this problem Consider the iso

in turn can compromise the realtime nature of the al

lated pair of ob jects on the right Each holds a p ointer

gorithm b ecause the system mayhavetofallbackto

to the other and therefore each has a reference count

the use of a nonrealtime collector at a critical mo

of one Since no path from a ro ot leads to either

ment

however the program can never reach them again

Conceptually sp eaking the problem here is that ref

The Eciency Problem

erence counting really only determines a conservative

The eciency problem with reference counting is that

approximation of true liveness If an ob ject is not

its cost is generally prop ortional to the amountof

p ointed to byanyvariable or other ob ject it is clearly

work done by the running program with a fairly large

garbage but the converse is often not true

constant of prop ortionality One cost is that when

It may seem that circular structures would b e very

a p ointer is created or destroyed its referents count

unusual but they are not While most data struc

must b e adjusted If a variables value is changed from

tures are acyclic it is not uncommon for normal pro

one p ointer to another two ob jects counts must b e

grams to create some cycles and a few programs cre

ate very many of them For example no des in trees



Bob describ es mo dications to reference counting to al

mayhave backp ointers to their parents to facilitate

low it to handle some sp ecial cases of cyclic structures but this

certain op erations More complex cycles are some restricts the programmer to certain stereotyp ed patterns

adjustedone ob jects reference countmust b e incre Variations on Reference Counting

mented the others decremented and then checked to

Another optimization of reference counting is to use

see if it has reached zero

avery small count eld p erhaps only a single bit

Shortlived stackvariables can incur a great deal

to avoid the need for a large eld p er ob ject WF

of overhead in a simple referencecounting scheme

Given that deferred reference counting avoids the need

When an argument is passed for example a new

to continually representthecount of p ointers from the

p ointer app ears on the stack and usually disapp ears

stack a single bit is sucient for most ob jects the

almost immediately b ecause most pro cedure activa

minority of ob jects whose reference counts are not zero

tions near the leaves of the call graph return very

or one cannot b e reclaimed by the reference counting

shortly after they are called In these cases reference

system but are caughtby a fallback tracing collector

counts are incremented and then decremented back

A onebit reference count can also b e represented in

to their original value very so on It is desirable to op

eachpointer to an ob ject if there is an unused address

timize away most of these increments and decrements

bit rather than requiring a header eld SCN

that cancel each other out

There is another cost of referencecounting collec

tion that is harder to escap e When ob jects counts

go to zero and they are reclaimed some b o okkeeping

must b e done to make them available to the running

Deferred Reference Counting

program Typically this involves linking the freed ob

jects into one or more free lists of reusable ob jects

Much of this cost can b e optimized awayby sp ecial

from which the programs allo cation requests are sat

treatment of lo cal variables DB Bakb Rather

ised Other strategies will b e discussed in the con

than always adjusting reference counts and reclaiming

text of marksweep collection in Sect Ob jects

ob jects whose counts b ecome zero references from the

p ointer elds must also b e examined so that their ref

lo cal variables are not included in this b o okkeeping

erents can b e freed

most of the time Usually reference counts are only

It is dicult to make these reclamation op erations

adjusted to reect p ointers from one heap ob ject to

take less than a few tens of instructions p er ob ject

another This means that reference counts maynotbe

and the cost is therefore prop ortional to the number

accurate b ecause p ointers from the stackmaybecre

of ob jects allo cated by the running program

ated or destroyed without b eing accounted for that

These costs of reference counting collection have

in turn means that ob jects whose count drops to zero

combined with its failure to reclaim circular structures

may not actually b e reclaimable Garbage collection

to make it unattractive to most implementors in re

can only b e done when references from the stack are

centyears As we will explain b elow other techniques

taken into accountaswell

are usually more ecient and reliable Still refer

ence counting has its advantages The immediacy of

Every no w and then the reference counts are

reclamation can have advantages for overall memory

broughtuptodateby scanning the stackforpointers

usage and for lo cality of reference DeT a refer

to heap ob jects Then any ob jects whose reference

ence counting system may p erform with little degra

counts are still zero may b e safely reclaimed The

dation when almost all of the heap space is o ccupied

interval b etween these phases is generally chosen to

by live ob jects while other collectors rely on trading

b e short enough that garbage is reclaimed often and

more space for higher eciency It can also b e useful

quicklyyet still long enough that the cost of p eri

for nalization that is p erforming cleanup actions

o dically up dating counts for stack references is not

e closing les when ob jects die Rov this will lik

high

b e discussed in Sect

This deferredreferencecounting DBavoids ad

The inability to reclaim cyclic structures is not a

justing reference counts for most shortlived p ointers

problem in some languages which do not allowthecon

from the stack and greatly reduces the overhead of

struction of cyclic data structures at all eg purely

reference counting When p ointers from one heap ob

functional languages Similarly the relatively high

ject to another are created or destroyed however the

cost of sideeecting p ointers b etween heap ob jects is

reference counts must still b e adjusted This cost is

not a problem in languages with few sideeects Ref

still roughly prop ortional to the amountofwork done

by the running program in most systems but with a

As WLMshows generational techniques can recapture

lower constant of prop ortionality some of this lo cality but not all of it

erence counts themselves maybevaluable in some sys There are three ma jor problems with traditional

tems For example they may supp ort optimizations in marksweep garbage collectors First it is dicult to

functional language implementations by allowing de handle ob jects of varying sizes without fragmentation

structive mo dication of uniquelyreferenced ob jects of the available memory The garbage ob jects whose

Distributed garbage collection can b enet from the space is reclaimed are intersp ersed with live ob jects

lo cal nature of garbage collection compared to global so allo cation of large ob jects may b e dicult several

tracing In some congurations the cost of reference small garbage ob jects may not add up to a large con

counting is only incurred for p ointers to ob jects on tiguous space This can b e mitigated somewhat by

other no des tracing collection is used within a no de keeping separate free lists for ob jects of varying sizes

and to compute changes to reference counts b etween and merging adjacent free spaces together but dif

no des Future systems may nd other uses for ref culties remain The system must cho ose whether

erence counting p erhaps in hybrid collectors also in to allo cate more memory as needed to create small

volving other techniques or when augmented byspe data ob jects or to divide up large contiguous hunks of

cialized hardware PS Wis GCtokeep CPU free memory and risk p ermanently fragmenting them

costs down This fragmentation problem is not unique to mark

While reference counting is out of vogue for high sweepit o ccurs in reference counting as well and in

p erformance implementations of generalpurp ose pro most explicit heap managementschemes

gramming languages it is quite common in other ap

The second problem with marksweep collection is

plications where acyclic data structures are common

that the cost of a collection is prop ortional to the size

Most le systems use reference counting to manage

of the heap including b oth live and garbage ob jects

les andor disk blo cks Because of its simplicity sim

All live ob jects must b e marked and all garbage ob

ple reference counting is often used in various software

jects must b e collected imp osing a fundamental limi

packages including simple interpretive languages and

tation on any p ossible improvement in eciency

graphical to olkits Despite its weakness in the area of

The third problem involves lo cality of reference

reclaiming cycles reference counting is common even

Since ob jects are never moved the live ob jects re

in systems where cycles may o ccur

main in place after a collection intersp ersed with free

space Then new ob jects are allo cated in these spaces

MarkSweep Collection

the result is that ob jects of very dierent ages b e

come interleaved in memory This has negativeim

Marksweep garbage collectors McC are named for

plications for lo cality of reference and simple non

the two phases that implement the abstract garbage

generational marksweep collectors are often consid

collection algorithm we describ ed earlier

ered unsuitable for most applications

Distinguish the live objects from the garbage

It is p ossible for the working set of activeobjects

This is done by tracingstarting at the ro ot

to b e scattered across many virtual memory pages so

set and actually traversing the graph of p ointer

that those pages are frequently swapp ed in and out

relationshipsusually by either a depthrst or

of main memory This problem may not b e as bad

breadthrst traversal The ob jects that are

as manyhave thought b ecause ob jects are often cre

reached are marked in some way either by alter

ated in clusters that are typically active at the same

ing bits within the ob jects or p erhaps by record

time Fragmentation and lo cality problems are is un

ing them in a bitmap or some other kind of

avoidable in the general case however and a p otential

table

problem for some programs

It should b e noted that these problems maynotbe

Reclaim the garbage Once the live ob jects have

b een made distinguishable from the garbage ob insurmountable with suciently clever implementa

jects memory is swept that is exhaustively ex tion techniques For example if a bitmap is used for

amined to nd all of the unmarked garbage ob mark bits bits can b e checked at once with a bit

jects and reclaim their space Traditionallyas integer ALU op eration and conditional branch If live

with reference counting these reclaimed ob jects ob jects tend to survive in clusters in memory as they

apparently often do this can greatly diminish the con

are linked onto one or more free lists so that they

are accessible to the allo cation routines stant of prop ortionalityofthesweep phase cost the

theoretical linear dep endence on heap size maynotbe

More detailed descriptions of traversal and marking

as troublesome as it seems at rst glance The clus rithms can b e found in Knu and Coh

tered survival of ob jects may also mitigate the lo cal for compaction One p ointer scans downward from the

ity problems of reallo cating space amid live ob jects top of the heap lo oking for live ob jects and the other

if ob jects tend to survive or die in groups in memory scans upward from the b ottom lo oking for holes to

Hay the intersp ersing of ob jects used by dierent put them in Manyvariations of this algorithm are

program phases may not b e a ma jor consideration p ossible to deal with multiple areas holding dierent

sized ob jects and to avoid intermingling ob jects from

widelydisp ersed areas Foramorecompletetreat

MarkCompact Collection

ment of compacting algorithms see CN

Markcompact collectors remedy the fragmentation

and allo cation problems of marksweep collectors

Copying Garbage Collection

As with marksweep a marking phase traverses and

Like markcompact but unlike marksweep copying

marks the reachable ob jects Then ob jects are com

garbage collection do es not really collect garbage

pactedmoving most of the live ob jects until all of

Rather it moves all of the live ob jects into one area

the live ob jects are contiguous This leaves the rest

and the rest of the heap is then known to b e available

of memory as a single contiguous free space This is

b ecause it contains only garbage Garbage collec

often done by a linear scan through memory nding

tion in these systems is thus only implicit and some

live ob jects and sliding them down to b e adjacentto

researchers avoid applying that term to the pro cess

the previous ob ject Eventually all of the live ob jects

Copying collectors like markingandcompacting

have b een slid down to b e adjacenttoaliveneighbor

collectors move the ob jects that are reached by the

This leaves one contiguous o ccupied area at one end of

traversal to a contiguous area While markcompact

heap memory and implicitly moving all of the holes

collectors use a separate marking phase that traverses

to the contiguous area at the other end

the live data copying collectors integrate the traversal

This sliding compaction has several interesting

of the data and the copying pro cess so that most ob

prop erties The contiguous free area eliminates frag

jects need only b e traversed once Ob jects are moved

mentation problems so that allo cating ob jects of vari

to the contiguous destination area as they are reached

ous sizes is simple Allo cation can b e implemented as

by the traversal The work needed is prop ortional to

the incrementing of a p ointer into a contiguous area of

the amount of live data all of whichmust b e copied

memoryinmuch the way that dierentsized ob jects

scavenging is applied to the copying The term

can b e allo cated on a stack In addition the garbage

traversal b ecause it consists of picking out the worth

spaces are simply squeezed out without disturb

while ob jects amid the garbage and taking them away

ing the original ordering of ob jects in memoryThis

can ameliorate lo cality problems b ecause the allo ca

tion ordering is usually more similar to subsequent

A Simple Copying Collector Stop

access orderings than an arbitrary ordering imp osed

andCopy Using Semispaces

by a copying garbage collector CGCla

Avery common kind of copying garbage collector is

While the lo cality that results from sliding com

the semispace collector FY using the Cheney algo

paction is advantageous the collection pro cess itself

rithm for the copying traversal Che We will use

shares the marksweeps unfortunate prop ertythat

this collector as a reference mo del for much of this

several passes over the data are required After the



pap er

initial marking phase sliding compactors maketwoor

In this scheme the space devoted to the heap is

three more passes over the live ob jects CN One

sub divided into two contiguous semispaces During

pass computes the new lo cations that ob jects will b e

normal program execution only one of these semi

moved to subsequentpassesmust up date p ointers to

spaces is in use as shown in Fig Memory is allo ca

refer to ob jects new lo cations and actually movethe

ted linearly upward through this current semispace

ob jects These algorithms may b e therefore b e signif



icantly slower than marksweep if a large p ercentage

As a historical note the rst copying collector was Min

skys collector for Lisp Min Rather than copying data

of data survives to b e compacted

from one area of memory to another a single heap space was

An alternative approach is to use Daniel J Ed

used The live data were copied out to a le on disk and

wards twopointer algorithm which scans inward

then read back in in a contiguous area of the heap space

from b oth ends of a heap space to nd opp ortunities

On mo dern machines this would b e unb earably slow b ecause

le op erationswriting and reading every live ob jectare now

many times slower than memory op erations Describ ed in an exercise on page of Knu

Perhaps the simplest form of copying traversal is

the Cheney algorithm Che The immediately

ROOT reachable ob jects form the initial queue of ob jects

SET

for a breadthrst traversal A scan p ointer is ad

vanced through the rst ob ject lo cation by lo cation

Each time a p ointer into fromspace is encountered

the referredtoob ject is transp orted to the end of the

queue and the p ointer to the ob ject is up dated to re

fer to the new copy The free p ointer is then advanced

and the scan continues This eects the no de ex

pansion for the breadthrst traversal reaching and

copying all of the descendants of that no de See

Fig Reachable data structures in fromspace are

FROMSPACE TOSPACE

ed by the rst shown at the top of the gure follow

several states of tospace as the collection pro ceeds

Figure A simple semispace garbage collector b efore

tospace is shown in linear address order to emphasize

garbage collection

the linear scanning and copying

Rather than stopping at the end of the rst ob ject

the scanning pro cess simply continues through sub

sequent ob jects nding their ospring and copying

ROOT

ell A continuous scan from the b eginning

SET them as w

of the queue has the eect of removing consecutive

no des and nding all of their ospring The ospring

are copied to the end of the queue Eventually the

scan reaches the end of the queue signifying that all

of the ob jects that havebeenreached and copied

have also b een scanned for descendants This means

that there are no more reachable ob jects to b e copied

and the scavenging pro cess is nished

Actually a slightly more complex pro cess is needed

hed bymultiple paths are

FROMSPACE TOSPACE so that ob jects that are reac

not copied to tospace multiple times When an ob

ject is transp orted to tospace a forwarding pointer is

Figure Semispace collector after garbage collection

installed in the old version of the ob ject The for

warding p ointer signies that the old ob ject is obso

as demanded by the executing program As with a

lete and indicates where to nd the new copy of the

markcompact collector the ability to allo cate from

ob ject When the scanning pro cess nds a p ointer

a large contiguous free space makes allo cation sim

into fromspace the ob ject it refers to is checked for

ple and fast muchlike allo cating on a stack there is

a forwarding p ointer If it has one it has already

no fragmentation problem when allo cating ob jects of

b een moved to tospace so the p ointer by whichitwas

various sizes

reached is simply up dated to p oint to its new lo ca

When the running program demands an allo cation

tion This ensures that eachlive ob ject is transp orted

that will not t in the unused area of the current semis

exactly once and that all p ointers to the ob ject are

pace the program is stopp ed and the copying garbage

up dated to refer to the new copy

collector is called to reclaim space hence the term

stopandcopy All of the live data are copied from

Eciency of Copying Collection

the current semispace fromspace to the other semis

A copying garbage collector can b e made arbitrarily ef pace tospace Once the copying is completed the

cient if sucien tospace semispace is made the current semispace t memory is available Lar App

and program execution is resumed Thus the roles The work done at each collection is prop ortional to

of the two spaces are reversed each time the garbage the amount of live data at the time of garbage collec

collector is invoked See Fig tion Assuming that approximately the same amount B F

ROOT A SET E

C D

AAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAA

AAAA AAAAAAAA

i) AAAA

AAAAAAAAAAAAAAAA A B

Scan Free

AAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAA

AAAA AAAAAAAAAAAAAAAA

ii) AAAA

AAAAAAAAAAAAAAAAAAAAAAAA A B C

Scan Free

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAA AAAAAAAAAAAAAAAAAAAAAAAA

iii) AAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA A B C D

Scan Free

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA iv) AAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA A B C D E

Scan Free

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

v) AAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA A B C D EF

Scan Free

Figure The Cheney algorithms breadthrst copying

of data is liveatany given time during the programs to an ob ject it must b e easy to determine which set it

execution decreasing the frequency of garbage collec isamemb er of in addition it must b e easy to switch

tions will decrease the total amount of garbage collec the roles of the sets just as fromspace and tospace

tion eort roles are exchanged in a copy collector In a copying

A simple way to decrease the frequency of garbage collector the set is an area of memory but in a non

copying collector it can b e any kind of set of chunks

collections is to increase the amount of memory in the

of memory that formerly held liveobjects

heap If each semispace is bigger the program will run

longer b efore lling it Another way of lo oking at this

The noncopying system adds two p ointer elds and

is that by decreasing the frequency of garbage collec

a color eld to each ob ject These elds are invisible

tions we are increasing the average age of ob jects at

to the application programmer and servetolinkeach

garbage collection time Ob jects that b ecome garbage

hunk of storage into a doublylinked list that serves

b efore a garbage collection neednt b e copied so the

as a set The color eld indicates which set an ob ject

chance that an ob ject will never havetobecopiedis

b elongs to

increased

The op eration of this collector is simple and iso

Supp ose for example that during a program run

morphic to the copy collectors op eration Wang

twenty megabytes of memory are allo cated but only

therefore refers to this as a fake copying collector

one megabyte is liveatanygiven time If wehave

Chunks of free space are initially linked to form a

two threemegabyte semispaces garbage will b e col

doublylinked list while chunks holding allo cated ob

lected ab out ten times Since the current semispace

jects are linked together into another list

is one third full after a collection that leaves two

When the free list is exhausted the collector tra

megabytes that can b e allo cated b efore the next col

verses the live ob jects and moves them from the allo

lection This means that the system will copy ab out

cated set whichwe could call the fromset to another

half as much data as it allo cates as shown in the top

set the toset This is implemented by unlinking the

part of Fig Arrows representcopying of live ob

ob ject from the doublylinked fromset list toggling its

jects b etween semispaces at garbage collections

color eld and linking it into the tosets doublylinked

On the other hand if the size of the semispaces is

list

doubled megabytes of free space will b e available af

Just as in a copy collector space reclamation is im

ter each collection This will force garbage collections

plicit When all of the reachable ob jects havebeen

a third as often or ab out or times during the run

traversed and moved from the fromset to the toset

This straightforwardly reduces the cost of garbage col

the fromset is known to contain only garbage It is

lection by more than half as shown in the b ottom part

therefore a list of free space which can immediately

of Fig For the moment we ignore virtual memory

b e put to use as a free list As we will explain in sec

paging costs assuming that the larger heap area can

tion Bakers scheme is actually somewhat more

b e cached in RAM rather than paged to disk As we

complex b ecause his collector is incremental The

will explain in Sect paging costs maymakethe

cost of the collection is prop ortional to the number of

use of a larger heap area impractical if there is not a

live ob jects and the garbage ob jects are all reclaimed

corresp ondingly large amountofRAM

in small constant time

This scheme can b e optimized in ways that are anal

NonCopying Implicit Collection

ogous to those used in a copying collectorallo cation

can b e fast b ecause the allo cated and free lists can

RecentlyWang Wan and Baker Bakbhave in

b e contiguous and separated only by an allo cation

dep endently prop osed a new kind of noncopying col

p ointer Rather than actually unlinking ob jects from

lector with some of the eciency advantages of a copy

one list and linking them into another the allo cator

ing scheme Their insight is that in a copying collector

can simply advance a p ointer whichpoints into the list

the spaces of the collector are really just a particular

and divides the allo cated segment from the free seg

implementation of sets The tracing pro cess removes

ment Similarly a Cheneystyle breadthrst traversal

ob jects from the set sub ject to garbage collection and

can b e implemented with only a pair of p ointers and

when tracing is complete anything remaining in the

the scanned and free lists can b e contiguous so that

set is known to b e garbage so the set can b e reclaimed

advancing the scan p ointer only requires advancing

in its entirety Another implementation of sets could

the p ointer that separates them

do just as well provided that it has similar p erfor

mance characteristics In particular given a p ointer This scheme has b oth advantages and disadvantages

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

A A A AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAAA AAA

AAAA AAAA AAAA AAAA AAA AAAA

AAAA AAAA AAAA AAAA AAA AAAA

AAAA AAAA AAAA AAAA AAA AAAA

AAAA AAAA AAAA AAAA AAA AAAA

AAAA AAAA AAAA AAAA AAA AAAA

AAAA AAAA AAAA AAAA AAA AAAA

AAAA AAAA AAAA AAAA AAA AAAA

AAAA AAAA AAAA AAAA AAA AAAA

Figure Memory usage in a semispace GC with MB top and MB b ottom semispaces

compared to a copy collector On the minus side the The latter two costs are usually similar in that the

p erob ject constants are probably a little bit higher amount of live data traced is usually some signicant

and fragmentation problems are still p ossible On the p ercentage of the amount of allo cated memoryThus

plus side the tracing cost for large ob jects is not as algorithms whose cost is prop ortional to the amount

high As with a marksweep collector the whole ob of allo cation eg marksweep may b e comp etitive

ject neednt b e copied if it cant contain p ointers with those whose cost is prop ortional to the amount

it neednt b e scanned either Perhaps more imp or of live data traced eg copying

tantly for many applications this scheme do es not

For example supp ose that p ercent of all allo

require the actual languagelevel p ointers b etween ob

cated data survive a collection and p ercentnever

jects to b e changed and this imp oses fewer constraints

need to b e traced In deciding which algorithm is more

on compilers As well explain later this is particu

ecient the asymptotic complexity is less imp ortant

larly imp ortant for parallel and realtime incremental

than the asso ciated constants If the cost of sweeping

collectors

an ob ject is ten times less than the cost of copying it

The space costs of this technique are usually roughly

the marksweep collector costs ab out the same as as

comparable to those of a copying collector Two

copy collector If a marksweep collectors sweeping

p ointer elds are required p er ob ject but live ob jects

cost is billed to the allo cator and it is small relative

b eing traced do not require space for b oth fromspace

to the cost of initializing the ob jects then it b ecomes

and tospace versions In most cases this app ears to

obvious that the sweep phase is just not terribly ex

make the space cost smaller than that of a copying

p ensive While current copying collectors app ear to

collector but in some cases fragmentation costs due

b e more ecient than current marksweep collectors

to the inability to compact data may outweigh those

the dierence is not large for stateofthe art imple

savings

mentations

In systems where memory is not much larger than

the exp ected amount of live data nonmoving collec

Cho osing Among Basic Tracing

tage over copying collectors in tors haveananadvan

Techniques

that they dont need space for twoversions of each live

ob ject the from and to versions When space

Treatments of garbage collection algorithms in text

is very tight reference counting collectors are partic

b o oks often stress asymptotic complexity but all basic

ularly attractive b ecause their p erformance is essen

algorithms have roughly similar costs esp ecially when

tially indep endent of the ratio of live data to total

we view garbage collection as part of the overall free

storage

storage managementscheme Allo cation and garbage

Further real highp erformance systems often use

collection are two sides of the basic memory reuse coin

hybrid techniques to adjust tradeos for dierent cate

and any algorithm incurs costs at allo cation time if

gories of ob jects Many highp erformance copy collec

only to initialize the elds of new ob jects A common

tors use a separate large object area CWB UJ

criterion for high p erformance garbage collection is

to avoid copying large ob jects from space to space

that the cost of garbage collecting ob jects b e compa

The large ob jects are kept o to the side and usually

rable on average to the cost of allo cating ob jects

managed inplace bysomevariety of marking traversal

Any ecient tracing collection scheme therefore has

and free list technique Other hybrids may use non

three basic cost comp onents which are the initial

copying techniques most of the time but o ccasionally

work required at each collection suc h as ro ot set scan

compact some of the data using copying techniques to

ning the work done at allo cation prop ortional to

avoid p ermanent fragmentation eg LD

the amount of allo cation or the numb er of ob jects

allo cated and the work done during garbage de

Amajorpointinfavor of inplace collectors is the

tection eg tracing

ability to makethemconservative with resp ect to data

The initial work is usually relatively xed for a par values that mayormay not b e p ointers This allows

ticular program by the size of the ro ot set The them to b e used for languages like C or otheshelf

work done at allo cation is generally prop ortional to optimizing compilers BW Bar BDS which

the numb er of ob jects allo cated plus an initialization can make it dicult or imp ossible to unambiguously

cost prop ortional to their sizes The garbage detec identify all p ointers at run time A nonmoving col

tion cost is prop ortional to the amount of live data lector can b e conservative b ecause anything that lo oks

that must b e traced likeapointer ob ject can b e left where it is and the

p ossible p ointer to it do esnt need to b e changed It is imp ortant to realize that this problem is

In contrast a copying collector must know whether not unique to copying collectors All ecientgar

avalue is a p ointerand whether to move the ob bage collection strategies involve similar spacetime

ject and up date the p ointer Conservative p ointer tradeosgarbage collections are p ostp oned so that

nding techniques will b e discussed in more detail in garbage detection work is done less often and that

Sect means that space is not reclaimed as quicklyOnav

erage that increases the amountofmemorywasted

Similarly the choice of a nonmoving collector can

due to unreclaimed garbage

greatly simplify the interfaces b etween mo dules writ

ten in dierent languages and compiled using dierent Deferred reference counting like tracing collection

compilers It is p ossible to pass p ointers to garbage also trades space for timein giving up continual in

collectible ob jects as arguments to foreign routines cremental reclamation to avoid sp ending CPU cycles

that were not written or compiled with garbage col

in adjusting reference counts one gives up space for

lection in mind This is not practical with a copying

ob jects that b ecome garbage and are not immedi

collector b ecause the p ointers that escap e into for ately reclaimed At the time scale on which memory

eign routines would have to b e found and up dated is reused the resulting lo calitycharacteristics must

when their referents moved share basic p erformance tradeo characteristics with

generational collectors of the copying or marksweep

varieties which will b e discussed later

Problems with Simple Tracing Col

While copying collectors were originally designed to

lectors

in their simple versions this improve improve lo cality

ment is not large and their lo cality can in fact b e

It is widely known that the asymptotic complexityof

worse than that of noncompacting collectors These

copying garbage collection is excellentthe copying

systems may improve the lo cality of reference to long

cost approaches zero as memory b ecomes very large

lived data ob jects whichhave b een compacted into

Treadmill collection shares this prop erty but other

a contiguous area However this eect is typically

collectors can b e similarly ecient if the constants

swamp ed by the eects of references due to allo ca

asso ciated with memory reclamation and reallo cation

tion Large amounts of memory are touched between

are small enough In that case garbage detection is

collections and this alone makes them unsuitable for

the ma jor cost

a virtual memory environment

Unfortunately it is dicult in practice to achieve

The ma jor lo cality problem is not with the lo cality

high eciency in a simple garbage collector b ecause

of compacted data or with the lo cality of the garbage

large amounts of memory are to o exp ensive If virtual

collection pro cess itself The problem is an indirect

memory is used the p o or lo cality of the allo cation

result of the use of garbage collectionby the time

and reclamation cycle will generally cause excessive

space is reclaimed and reused its likely to havebeen

paging Every lo cation in the heap is used b efore

paged out simply b ecause to o many other pages have

any lo cations space is reclaimed and reused Simply

b een allo cated in b etween Compaction is helpful but

paging out the recentlyallo cated data is exp ensivefor

thehelpisgenerallytoo little toolate With a simple

a highsp eed pro cessor Ung and the paging caused

semispace copy collector lo cality is likely to b e worse

by the copying collection itself may b e tremendous

than that of a marksweep collector b ecause the copy

since all liv edatamust b e touched in the pro cess

collector uses more total memoryonly half the mem

It therefore do esnt generally paytomaketheheap

ory can b e used b etween collections Fragmentation

area larger than the available main memoryFor a

of live data is not as detrimental as the regular reuse

mathematical treatment of this tradeo see Lar



of two spaces

Even as main memory b ecomes steadily cheap er lo

The only waytohave go o d lo cality is to ensure that

cality within cache memory b ecomes increasingly im

memory is large enough to hold the regularlyreused

p ortant so the problem is partly shifted to a dierent

level of the memory hierarchy WLM



Slightly more complicated copying schemes app ear to avoid

In general we cant achieve the p otential eciency

this problem Ung WM but WLM demonstratesthat

cyclic memory reuse patterns can fare p o orly in hierarchical

of simple garbage collection increasing the size of

memories b ecause of recencybased eg LRU replacement

memory to p ostp one or avoid collections can only b e

p olicies This suggests that freed memory should b e reused

taken so far b efore increased paging time negates any

in a LIFO fashion ie in the opp osite order of its previous

allo cation if the entire reuse pattern cant b e kept in memory advantage

area Another approachwould b e to rely on opti collectors view of the reachability graph A compilers

mizations such as prefetching but this is not feasi data and control ow analysis may detect dead values

ble at the level of virtual memorydisks simply cant and optimize them awayentirely Compiler optimiza

keep up with the rate of allo cation b ecause of the enor tions may also extend the eective lifetime of vari

mous sp eed dierential b etween RAM and disk Gen ables causing extra garbage to b e retained but this

erational collectors address this problem by reusing is not typically a problem in practice

a smaller amount of memory more often they will

Tracing collectors intro duce a ma jor temporal form

b e discussed in Sect For historical reasons is

of conservatism simply by allowing garbage to go un

widely b elieved that only copying collectors can b e

collected b etween collection cycles Reference count

made generational but this is not the case Gener

ing collectors are conservative top ologically failing to

ational noncopying collectors are slightly harder to

distinguish b etween dierent paths that share an edge

construct but they do exist and are quite practical

in the graph of p ointer relationships

DWH WJ

As the remainder of this survey will show there are

Finally the temp oral distribution of a simple trac

many p ossible kinds and degrees of conservatism with

ing collectors work is also troublesome in an inter

dierent p erformance tradeos

active programming environment it can b e very dis

ruptive to a users work to suddenly have the system

b ecome unresp onsive and sp end several seconds gar

Incremental Tracing Collec

bage collecting as is common in such systems For

tors

large heaps the pauses may b e on the order of sec

onds or even minutes if a large amount of data is

For truly realtime applications negrained incre

disp ersed through virtual memory Generational col

mental garbage collection app ears to b e necessary

lectors alleviate this problem b ecause most garbage

Garbage collection cannot b e carried out as one atomic

collections only op erate on a subset of memoryEven

action while the program is halted so small units

tually they must garbage collect larger areas however

of garbage collection must b e interleaved with small

and the pauses may b e considerably longer For real

units of program execution As we said earlier it is

time applications this may not b e acceptable

relatively easy to make reference counting collectors

incremental Reference countings problems with ef

ciency and eectiveness discourage its use however

Conservatism in Garbage Collec

and it is therefore desirable to make tracing copying

tion

or marking collectors incremental

In much of the following discussion the dierence

An ideal garbage collector would b e able to reclaim

between copying and marksweep collectors is not par

every ob jects space just after the last use of the ob

ticularly imp ortant The incremental tracing for gar

ject Such an ob ject is not implementable in practice

bage detection is more interesting than the reclama

of course b ecause it cannot in general b e determined

tion of detected garbage

when the last use o ccurs Real garbage collectors can

only provide a reasonable approximation of this b e

The diculty with incremental tracing is that while

havior using conservative approximations of this om

the collector is tracing out the graph of reachable data

niscience The art of ecien t garbage collector design

structures the graph maychangethe running pro

is largely one of intro ducing small degrees of conser

gram may mutate the graph while the collector isnt

vatism which signicantly reduce the workdonein

lo oking For this reason discussions of incremen

detecting garbage This notion of conservatism is

tal collectors typically refer to the running program

very general and should not b e confused with the

as the mutator DLM From the garbage collec

sp ecic p ointeridentication techniques used by so

tors p oint of view the actual application is merely a

called conservative garbage collectors All garbage

coroutine or concurrent pro cess with an unfortunate

collectors are conservative in one or more ways

tendency to mo dify data structures that the collec

tor is attempting to traverse An incremental scheme

The rst conservative assumption most collectors

must have some wayofkeeping track of the changes to

make is that anyvariable in the stack globals or reg

the graph of reachable ob jects p erhaps recomputing

isters is live even though the variable may actually

parts of its traversal in the face of those changes

never b e referenced again There maybeinteractions

An imp ortantc between the compilers optimizations and the garbage haracteristic of incremental tech

niques is their degree of conservatism with resp ect to do esnt view reachable ob jects as unreachable and

changes made by the mutator during garbage collec erroneously reclaim their space Typicallysomegar

tion If the mutator changes the graph of reachable bage ob jects go unreclaimed for a while usually these

ob jects freed ob jects mayormay not b e reclaimed are ob jects that b ecome garbage after b eing reached

by the garbage collector Some oating garbage may versal Such oating garbage is by the collectors tra

go unreclaimed b ecause the collector has already cat usually reclaimed at the next garbage collection cy

egorized the ob ject as live b efore the mutator frees cle since they will b e garbage at the beginning of

it This garbage is guaranteed to b e collected at the that collection the tracing pro cess will not conser

next cycle however b ecause it will b e garbage at the vatively view them as live The inability to reclaim

beginning of the next collection oating garbage immediately is unfortunate but may

b e essential to avoiding very exp ensive co ordination

between the mutator and collector

Coherence and Conservatism

The kind of relaxed consistency usedand the

Incremental marking traversals must takeinto ac

corresp onding coherence features of the collection

countchanges to the reachability graph made bythe

schemeare closely intertwined with the notion of

mutator during the collectors traversal Incremen

conservatism In general the more we relax the consis

tal copying collectors p ose more severe co ordination

tency b etween the mutators and the collectors views

problemsthe mutator must also b e protected from

of the reachability graph the more conservative our

changes made by the garbage collector

collection b ecomes and the more oating garbage we

It may b e enlightening to view these issues as a vari

must accept On the p ositive side the more relaxed

etyofcoherence problemshaving multiple pro cesses

our notion of consistency the more exibilitywehave

attempt to share changing data while maintaining

in the details of the traversal algorithm In parallel

some kind of consistent view NOPH Readers un

and distributed garbage collection a relaxed consis

familiar with coherence problems in parallel systems

tency mo del also allows more parallelism andor less

should not worry to o much ab out this terminology

synchronization but that is b eyond the scop e of this

the issues should b ecome apparentaswegoalong

survey

An incremental marksweep traversal p oses a multi

ple readers single writer coherence problemthe col

Tricolor Marking

lectors traversal must resp ond to changes but only

The abstraction of tricolor marking is helpful in under the mutator can change the graph of ob jects Simi

standing incremental garbage collection DLM larly only the traversal can change the mark bits each

pro cess can up date values but any eld is writable by

Garbage collection algorithms can b e describ ed as a

pro cess of traversing the graph of reachable ob jects only one pro cess Only the mutator writes to p ointer

and coloring them The ob jects sub ject to garbage elds and only the collector writes to mark elds

collection are conceptually colored white and by the Copying collectors p ose a more dicult problema

end of collection those that will b e retained must b e multiple readers multiple writers problem Both the

colored black When there are no reachable no des left mutator and the collector may mo dify p ointer elds

to blacken the traversal of live data structures is n and eachmust b e protected from inconsistencies in

ished tro duced by the other

In a simple marksweep collector this coloring is Garbage collectors can eciently solve these prob

directly implemented b y setting mark bitsob jects lems by taking advantage of the semantics of garbage

whose bit is set are black In a copy collector this collection and using forms of relaxedconsistency

is the pro cess of moving ob jects from fromspace to that is the pro cesses neednt always have a consistent

tospaceunreached ob jects in fromspace are consid view of the data structures as long as the dierences

between their views dont matter to the correctness ered white and ob jects moved to tospace are consid

ered black The abstraction of coloring is orthogonal

of the algorithm

to the distinction b etween marking and copying col

In particular the garbage collectors view of the

lectors and is imp ortant for understanding the basic

reachability graph is typically not identical to the ac

tual reachability graph visible to the mutator It is dierences b etween incremental collectors

only a safe conservative approximation of the true In incremental collectors the intermediate states of

reachability graphthe garbage collector may view the coloring traversal are imp ortant b ecause of on

some unreachable ob jects as reachable as long as it going mutator activitythe mutator cant b e allowed

to change things b ehind the collectors back in such

ay that the collector will fail to nd all reachable

aw A A

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

ob jects

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

To understand and preventsuchinteractions b e

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

tween the mutator and the collector it is useful to

tro duce a third color gray to signify that an ob ject

in B C B C

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

has b een reached by the traversal but that its descen

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

dants may not have been That is as the traversal

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

pro ceeds outward from the ro ots ob jects are initially

colored gray When they are scanned and p ointers to

versed they are blackened and

their ospring are tra D D

AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA

the ospring are colored gray

AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA

In a copying collector the gray ob jects are the ob

jects in the unscanned area of tospaceif a Cheney Before After

breadthrst traversal is used thats the ob jects b e

tween the scan and free p ointers In a marksweep

Figure A violation of the coloring invariant

collector the gray ob jects corresp ond to the stackor

queue of ob jects used to control the marking traver

Incremental approaches

sal and the black ob jects are the ones that have b een

removed from the queue In b oth cases ob jects that

There are two basic approaches to co ordinating the

have not b een reached yet are white

collector with the mutator One is to use a read bar

Intuitively the traversal pro ceeds in a wavefrontof

rierwhich detects when the mutator attempts to ac

gray ob jects which separates the white unreached

cess a p ointer to a white ob ject and immediately col

ob jects from the black ob jects that have b een passed

ors the ob ject gray since the mutator cant read p oin

by the wavethat is there are no p ointers directly

ters to white ob jects it cant install them in black

from black ob jects to white ones This abstracts away

ob jects The other approach is more direct and in

from the particulars of the traversal algorithmit may

volves a write barrierwhen the program attempts to

b e depthrst breadthrst or just ab out any kind

write a p ointer into an ob ject the write is trapp ed or

of exhaustivetraversal It is only imp ortantthata

recorded

welldened gray fringe b e identiable and that the

Write barrier approaches in turn fall into two dif

mutator preserve the invariant that no blackobject

ferent categories dep ending on which asp ect of the

hold a p ointer directly to a white ob ject

problem they address To foil the garbage collectors

The imp ortance of this invariant is that the collector

marking traversal it is necessary for the mutator to

must b e able to assume that it is nished with black

write a p ointer to a white ob ject into a black ob ject

ob jects and can continue to traverse gray ob jects and

and destroy the original p ointer b efore the collector

movethewavefrontforward If the mutator creates

sees it

a p ointer from a black ob ject to a white one it must

If the rst condition writing the p ointer into a black

somehow notify the collector that its assumption has

ob ject do es not hold no sp ecial action is neededif

b een violated This ensures that the collectors b o ok

thereare other pointers to the white object from gray

keeping is broughtuptodate

objects it wil l beretained and if not it is garbage and

neednt beretained anyway If the second condition

Figure demonstrates this need for co ordination

obliterating the original path to the ob ject do es not

Supp ose the ob ject A has b een completely scanned

hold the ob ject will b e reached via the original p ointer

and therefore blackened its descendants have b een

and retained The two writebarrier approaches fo cus

reached and grayed Now supp ose that the mutator

wo asp ects of the problem on these t

swaps the p ointer from A to C with the p ointer from B

to D The only p ointer to D is now in a eld of A which Snapshotatbeginning collectors ensure that the sec

the collector has already scanned If the traversal con ond condition cannot happ enrather than allowing

tinues without any co ordination B will b e blackened p ointers to b e simply overwritten they are rst saved

C will b e reached again from B and D will never b e in a data structure o to the side so that the col

reac hed at all and hence will b e erroneously deemed lector can nd them Thus no path to a white ob ject

garbage and reclaimed can b e broken without providing another path to the

ob ject for the garbage collector able ob jects is xed at the moment garbage collection

starts even though the actual traversal pro ceeds in

Incremental update collectors are more direct in

crementally

dealing with these troublesome p ointers Rather than

saving copies of all p ointers that are overwritten b e

The rst snapshotatb eginning algorithm was ap

cause they might have already b een copied into black

parently that of Abrahamson and Patel which used

ob jects they actually record p ointers stored into black

virtual memory copyonwrite techniques AP but

ob jects and catch the troublesome copies where they

the same general eect can b e achieved straightfor

are stored rather than noticing if the original is de

wardly and fairly eciently with a simple software

stroyed That is if a p ointer to a white ob ject is copied

write barrier

into a black ob ject that new copy of the p ointer will

Perhaps the simplest and b estknown snapshot col

b e found Conceptually the black ob ject or part of it

lection algorithm is Yuasas Yuab If a lo cation

is reverted to graywhenthemutator undo es the col

is written to the overwritten value is rst saved and

lectors traversal Ste Alternatively the p ointed

pushed on a marking stack for later examination This

to ob ject maybegrayed immediately DLM

guarantees that no ob jects will b ecome unreachable

This ensures that the traversal is up dated in the face

to the garbage collector traversalall ob jects which

of mutator changes

are live at the b eginning of garbage collection will b e

Read barriers and write barriers are conceptually

ters to them are overwritten reached even if the p oin

synchronization op erationsb efore the mutator can

In the example shown in Fig the p ointer from B to

p erform certain op erations it must activate the gar

Dissaved on a stack whenitisoverwritten with the

bage collector to p erform some action In practice

p ointertoC

this invo cation of the garbage collector only requires

Snapshotatb eginning schemes are very conserva

a relatively simple action and the compiler can simply

tive b ecause they actually allow the tricolor invari

emit the necessary additional instructions as part of

ant to b e broken temp orarily during incremental

the mutators own machine co de Each p ointer read

tracing Rather than preventing the creation of p oin

or write dep ending on the incremental strategy is

ters from black ob jects to white ones a more global

accompanied by a few extra instructions that p erform

and conservative strategy prevents the loss of such

the collectors op erations Dep ending on the complex

white ob jects the original path to the ob ject cant b e

ity of the read or write barrier the entire barrier action

lost b ecause al l overwritten p ointer values are saved

may b e compiled inline alternatively the barrier may

and traversed

simply b e a hidden outofline pro cedure call accom

This implies that no ob jects can b e freed during col

panying eachpointer read or write Other strategies

lection b ecause a p ointer to any white ob ject might

are p ossible relying less on additional instructions in

have b een stored into a reachable ob ject This in

compiled co de and more on assistance from sp ecial

cludes ob jects that are created while the collection is

ized hardware or virtual memory features

in progress Newlyallo cated ob jects are therefore con

sidered to b e black as though they had already b een

Write Barrier Algorithms

traversed This shortcircuits the traversal of new ob

jects whichwould fail to free anyofthemanyway

If a noncopying collector is used the use of a read

The collectors view of the reachability graph is thus

barrier is an unnecessary exp ense there is no need to

the set union of the graph at the b eginning of garbage

protect the mutator from seeing an invalid version of a

collection plus all of those that are allo cated during

p ointer Write barrier techniques are cheap er b ecause

tracing

heap writes are several times less common than heap

An imp ortant feature to notice ab out snapshotat

reads

b eginning algorithms is that since dont actually pre

serve Dijkstras tricolor invariant grey ob jects havea

Snapshotatb eginning Algorithms

subtle role Rather than guaranteeing that each path

from a blac Snapshotatbeginning algorithms use a write barrier k ob ject to a white ob ject must go through

to ensure that no ob jects ever b ecome inaccessible to a grey ob ject it is only guaranteed that for eachsuch

the garbage collector while collection is in progress reachable white ob ject there will b e at least one path

Conceptually at the b eginning of garbage collection to the ob ject from a grey ob ject A grey ob ject there

a copyonwrite virtual copy of the graph of reachable fore do es not just represent the lo cal part of the collec

data structures is made That is the graph of reach tors traversal wavefrontit may also represent p oin

ters elsewhere in the reachability graph which cross b etterthe ob ject is garbage so it mightaswell not



the wavefront unnoticed get marked

If the p ointer is installed into an ob ject already de

termined to b e live that p ointer must b e taken into

Incremental Up date WriteBarrier Al

accountit has now b een incorp orated into the graph

gorithms

of reachable data structures Those formerlyblackob

While b oth are writebarrier algorithms snapshot jects will b e scanned again b efore the garbage collec

atb eginning and incremental update algorithms are tion is complete to nd any live ob jects that would

otherwise escap e This pro cess may iterate b ecause

quite dierent Unfortunately incremental up date al

more black ob jects may b e reverted while the collec

gorithms have generally b een cast in terms of parallel

tor is in the pro cess of traversing them The traversal

systems rather than as incremental schemes for serial

pro cessing p erhaps due to this they have b een largely is guaranteed to complete however and the collector

 

overlo oked by implementors targeting unipro cessors eventually catches up with the mutator

Perhaps the b est known of these algorithms is due Several variations of this incremental up date algo

to Dijkstra et al DLM This is similar to the rithm are p ossible with dierent implementations of

scheme develop ed by Steele Ste but simpler b e

the write barrier and dierent treatments of ob jects

cause it do es not deal with compaction Rather than

allo cated during collection

retaining everything thats in a snapshot of the graph

In the incremental up date scheme of Dijkstra et al

at the beginning of garbage collection it heuristically

DLM ob jects are optimistically assumed to b e

and somewhat conservatively attempts to retain the

unreachable when theyre allo cated In terms of tri

ob jects that are live at the end of garbage collection

color marking ob jects are allo cated white rather than

Ob jects that die during garbage collectionand b e

black At some p oint the stackmust b e traversed

fore b eing reached by the marking traversalare not

and the ob jects that are reachable at that time are

traversed and marked More precisely an ob ject will

marked and therefore preserved In contrast snap

not b e reached by the collector if all paths to it are

shot schemes must assume that such newlycreated

broken at a p oint that the garbage collector has not

ob jects are live b ecause p ointers to them mightget

yet reached If a p ointer is obliterated after being

installed into ob jects that have already b een reached

reached by the collector it is to o late Eg if the

by the collectors traversal without b eing detected

head of a list has already b een reached and grayed

Dijkstra also cho oses to allo cate new ob jects white

and then b ecomes garbage the rest of the list will

on the assumption that new ob jects are likely to b e

still b e trav ersed

shortlived and quickly reclaimed

Toavoid the problem of p ointers b eing hidden in

We b elieve that this has a p otentially signicant

reachable ob jects that have already b een scanned

advantage over schemes that allo cate black Most

such copied p ointers are caught when they are stored

ob jects are shortlived so if the collector do esnt

into the scanned ob jects Rather than noticing when

reach those ob jects early in its traversal theyre likely

a p ointer escap es from a lo cation that hasnt b een tra

never to b e reached and instead to b e reclaimed

versed it notices when the p ointer hides in an ob ject

very promptly Compared to the snapshot scheme or

that has already b een traversed If a p ointerisover

Bakers describ ed b elow theres an extra computa

written without b eing copied elsewhere so muchthe

tional costthe newlycreated ob jects that are still

live at the end of collection must b e traversed and also



This nonlo cal constraint p oses signicant problems for op

any that b ecame garbage to o late to b e reclaimed b e

timization of the garbage collection pro cess particularlywhen

cause the traversal had already started along a path to

trying to makea hierarchical generational or distributed version

of a snapshot algorithm where multiple garbage collections of

them As we will explain later whether this is worth

dierent scop es pro ceed concurrentlyWJ

while may dep end on several factors such as the rel



Another probable reason is that the early pap ers on concur

ative imp ortance of average case eciency and hard

rent garbage collection addressed dierent concerns than those

realtime resp onse Steele prop oses a heuristic that

facing most language implementors DLM stressed ele

gance of correctness pro ofs at the exp ense of eciency and

allo cates some ob jects white and other ob jects black

readers mayhave missed the fact that trivial changes to the al

attempting to reclaim the shortlive ob jects quickly

gorithm would makeitvastly more practical Stepresented

while avoiding traversal of most other ob jects Ste

a complex algorithm with an optional incremental compaction

phase many readers doubtless failed to recognize that the in



cremental up date strategy was itself simple and orthogonal to The algorithm of DLM actually uses a somewhat more

the other features conservative technique as we will explain shortly

The eectiveness of this heuristic is unproven and it and the collection cycle completes b efore memory is

app ears to b e dicult to implement eciently on stan exhausted

dard hardware

An imp ortant feature of Bakers scheme is its treat

Dijkstras incremental up date algorithm DLM

ment of ob jects allo cated by the mutator during in

which apparently predates Steeles slightly actu

cremental collection These ob jects are allo cated in

ally preserves the tricolor invariantby blackening the

tospace and are treated as though they had already

p ointedto white ob ject rather than reverting the

b een scannedie they are assumed to b e live In

storedinto blackobjecttogray Intuitivelythis

terms of tricolor marking new ob jects are black when

pushes the graywavefront outward to preserve the tri

allo cated and none of them can b e reclaimed they

color invariant rather than pushing it back This is

are never reclaimed until the next garbage collection



more conservative than Steeles strategy b ecause the

cycle

p ointer might later b e overwritten freeing the ob ject

In order to ensure that the collector nds all of the

On the other hand it app ears to b e simpler and faster

live data and copies it to tospace b efore the free area in

in practice it also makes it slightly easier to provehe

newspace is exhausted the rate of copy collection work

correctness of the algorithm b ecause there is an obvi

is tied to the rate of allo cation Each time an ob ject

ous guarantee of forward progress

is allo cated an increment of scanning and copying is

done

In terms of tricolor marking the scanned area of

Bakers Read Barrier Algorithms

tospace contains black ob jects and the copied but un

The b estknown realtime garbage collector is Bakers

scanned ob jects b etween the scan and free p ointer

incremental copying scheme Bak It is an adap

are gray Asyet unreached ob jects in fromspace are

tation of the simple copy collection scheme describ ed

white The scanning of ob jects and copying of their

in Sect and uses a read barrier for co ordination

ospring moves the wavefront forward

with the mutator More recentlyBaker has prop osed

In addition to the background tracing other ob

a noncopying version of this algorithm which shares

jects may b e copied to tospace as needed to ensure

many prop erties with the cop ying version Bakb

that the basic invariant is not violatedp ointers into

fromspace must not b e stored into ob jects that have

already b een scanned undoing the collectors work

Incremental Copying

Bakers approach is to couple the collectors copying

Bakers original copying algorithm was an adaptation

traversal with the mutators traversal of data struc

of the Cheney algorithm For the most part the copy

tures The mutator is never allowed to see p ointers

ing of data pro ceeds in the Cheney breadthrst

into fromspace ie p ointers to white ob jects When

fashion by advancing the scan p ointer through the

ever the mutator reads a p otential p ointer from the

unscanned area of tospace and moving any referredto

heap it immediately checks to see if it is a p ointer into

ob jects from fromspace This background scavenging

fromspace if so the referent iscopiedtotospaceie

is interleaved with mutator op eration however and

its color is changed from white to gray In eect this

mutator activity can also trigger copying as needed

advances the wavefrontofgraying just ahead of the

to ensure that the mutators view of data structures

actual references bythemutator keeping the mutator

is always consistent



inside the wavefront The preservation of the tri

In Bakers system a garbage collection cycle b egins

color invariant is therefore indirectrather than actu

with an atomic ip which conceptually invalidates all

ally checking to see whether p ointers to white ob jects

ob jects in fromspace and copies to tospace all ob

are stored into black ones the read barrier ensures

jects directly reachable from the ro ot set Then the



mutator is allowed to resume Any fromspace ob ject

Baker suggests copying old live ob jects into one end of

tospace and allo cating new ob jects in the other end The two

that is accessed bythemutator mustrstbecopied

o ccupied areas of tospace thus growtoward each other and

to tospace and this copyingondemand is enforced by

older ob jects arent intersp ersed with new ones

 the read barrier The read barrier is typically imple

Nilsens variantofBakers algorithm up dates the p ointers

mented as a few instructions emitted by the compiler

without actually copying the ob jectsthe copying is lazyand

space in tospace is simply reserved for the ob ject b efore the

forming a wrapp er around p ointerdereferencing read

pointer is up dated Nil This makes it easier to provide

instructions The background scavenging pro cess is

smaller b ounds on the time taken by list op erations and to gear

also interleaved with normal program execution to

collector work to the amount of allo cationincluding guaran

teeing shorter pauses when smaller ob jects are allo cated ensure that all reachable data are copied to tospace

that the mutator cant see suchpointers in the rst place

Allocation

It should b e noted that Bakers collector itself

hanges the graph of reachable ob jects in the pro cess

c Free

of copying The read barrier do es not just inform the

hanges bythemutator to ensure that

collector of c New

ob jects arent lost it also shields the mutator from

viewing temp orary inconsistencies created by the col

ere not done the mutator mighten

lector If this w From

ter two dierent p ointers to versions of the same

coun To ob ject one of them obsolete

Scanning

The read barrier may b e implemented in software

by preceding each read of a p otential p ointer from

the heap with a check and a conditional call to the

copyingandup dating routine Compiled co de thus

Figure Treadmill collector during collection

contains extra instructions to implement the read bar

rier Alternativelyitmay b e implemented with sp e

Bakers Incremental Noncopying Algo

cialized hardware checks andor micro co ded routines

rithmThe Treadmill

The read barrier is exp ensive on sto ckhardware

Recently Baker has prop osed a noncopying version

b ecause in the general case anyloadofapointer

of his scheme which uses doublylinked lists and

must check to see if the p ointer p oints to a fromspace

p erob ject color elds to implement the sets of ob

white ob ject if so extra co de must b e executed to

jects of each color rather than separate memory ar

move the ob ject to tospace and up date the p ointer

eas By avoiding the actual moving of ob jects and

The cost of these checks is high on conventional hard

up dating of p ointers the scheme puts fewer restric

ware b ecause they o ccur very frequently Lisp Ma

tions on other asp ects of language implementation

chines have sp ecial purp ose hardware to detect p oin

Bakb WJ

ters into fromspace and trap to a handler Gre

This noncopying scheme preserves the essential ef

Mo o Joh but on conv entional machines the

ciency advantage of copy collection by reclaiming

checking overhead is in the tens of p ercent for a high

space implicitly As describ ed in Sect unreached

p erformance system Zor

ob jects in the white set can b e reclaimed in constant

time by app ending the remainder of that list to the

free list The realtime version of this scheme links

Bro oks has prop osed a variation on Bakers scheme

the various lists into a cyclic structure as shown in

where ob jects are always referred to via an indirection

Fig This cyclic structure is divided into four sec

eld emb edded in the ob ject itself Bro If an ob

tions

ject is valid its indirection eld p oints to itself If

The new list is where allo cation of new ob jects o c

its an obsolete version in fromspace its indirection

curs during garbage collectionit is contiguous with

p ointer p oints to the new version Unconditionally

the freelist and allo cation o ccurs byadvancing the

indirecting is cheap er than checking for indirections

p ointer that separates them At the b eginning of gar

but could still incur overheads in the tens of p ercent

bage collection the new segment is empty

for a highp erformance system Ung A variantof

The from list holds ob jects that were allo cated b e

this approach has b een used by North and Reppyina

fore garbage collection b egan and which are sub ject

concurrent garbage collector NR another variant

to garbage collection As the collector and mutator

exploits immutablevalues in ML to allow reading of

traverse data structures ob jects are moved from the

some data from fromspace HL Zorn takes a dif

fromlist to the tolist The tolist is initially empty

ferent approach to reducing the read barrier overhead

using knowledge of imp ortant sp ecial cases and sp e

In particular it is p ossible to deal with compilers that do

cial compiler techniques Still the time overheads are

not unambiguously identify p ointer variables in the stack mak

on the order of twenty p ercent Zor ing it imp ossible to use simple copy collection

but grows as ob jects are unsnapp ed unlinked from p ointer from a gray ob ject destroying the only path to

the fromlist and snapp ed into the tolist during col one or more white ob jects and ensuring that the col

lection lector will not nd them Thus Bakers incremental

scheme incrementally up dates the reachabilitygraph

The newlist contains new ob jects which are allo

cated black The tolist contains b oth black ob jects of preexisting ob jects but only when grayobjects

havepointers overwritten Overwriting p ointers from

whichhave b een completely scanned and grayones

black ob jects has no eect on conservatism b ecause

whichhave b een reached but not scanned Note

their referents are already gray The degree of con

the isomorphism with the copying algorithmeven an

servatism and oating garbage thus dep ends on the

analogue of the Cheney algorithm can b e used It is

only necessary to have a scan p ointer into the tolist details of the collectors traversal and of the programs



actions

and advance it through the gray ob jects

Eventually all of the reachable ob jects in the from

list have b een moved to the to list and scanned for

Variations on the Read Barrier

ospring When no more ospring are reachable all

Several garbage collectors have used slightvariations

of the ob jects in the tolist are black and the remain

of Bakers read barrier where the mutator is only al

ing ob jects in the from list are known to b e garbage

lowed to see black ie completely scanned ob jects

At this p oint the garbage collection is complete The

Recall that Bakers read barrier copies an ob ject to

fromlist is nowavailable and can simply b e merged

tospace as so on as the mutator encounters a p ointer

with the freelist The tolist and the newlist b oth

to the ob ject This may b e inecient b ecause the

hold ob jects that were preserved and they can b e



h reference in the checking costs are incurred at eac

merged to form the new tolist at the next collection

general case and b ecause it costs something to trap

The state is very similar to the b eginning of the pre

to the scanningandcopying routine typicallya con

vious cycle except that the segments havemoved

ditional branch and a subroutine call

part way around the cyclehence the name tread

It may therefore b e preferable to scan an entire ob

mill

ject when it is rst touched bythemutator and up

Baker describ es this algorithm as b eing isomorphic

date all of the ob jects p ointer elds This maybe

to his original incremental copying algorithm presum

cheap er than calling the scanningandcopying routine

ably including the close coupling b etween the mutator

each time a eld is rst referenced the compiler may

and the collector ie the read barrier

also b e able to optimize away redundantchecks for

multiple references to elds of the same ob ject Juul

Conservatism of Bakers Read Barrier

and Juls distributed garbage collector JJ uses such

Bakers garbage collectors use a somewhat conserva

an ob jectwise scanning technique and combines some

tive approximation of true liveness in twow ays The

of the garbage collectors checking costs with those

most obvious one is that ob jects allo cated during col

incurred for negrained ob ject migration

lection are assumed to b e live even if they die b e

Such a read barrier is coarser and more conserva

fore the collection is nished The second is that

tive than Bakers original read barrier It enforces a

preexisting ob jects may b ecome garbage after having

stronger constraintnot only is the mutator not al

b een reached by the collectors traversal and they will

lowed to see white ob jects it is only allowed to see

not b e reclaimedonce an ob ject has b een grayed it

black ob jects Since an entire ob ject is scanned when

will b e considered liveuntil the next garbage collec

it is rst touched and its referents are grayed the ob

tion cycle On the other hand if ob jects b ecome gar

ject b ecomes black b efore the mutator is allowed to

bage during collection and all paths to those ob jects

see it This advances the wavefront of the collectors

are destroyed before b eing traversed then they wil l

traversal an extra step ahead of the mutators pattern

b e reclaimed That is the mutator mayoverwrite a

of references

Such a black only read barrier prevents any data



Because the list structure is more exible than a contiguous

from b ecoming garbage from the garbage collectors

area of memoryitiseven p ossible to implement a depthrst

p oint of view during a garbage collectionb efore any

traversal with no auxiliary stack in much the same waythat

the Cheney algorithm implements breadthrst WJ

p ointer can b e overwritten the ob ject containing it



This discussion is a bit oversimplied Baker uses four col

will b e scanned and the p ointers referent will b e

ors and whole lists can have their colors changed instanta

grayed In eect this implements a snapshotat

neously bychanging the sense of the bit patterns rather than

the patterns themselves b eginning collection using a read barrier rather than

a write barrier The consistency issues in replication copying are

App el Ellis and Lis concurrent incremental col very dierent from those in Bakerstyle copying The

lector AEL uses virtual memory primitives to im mutator continues to access the same versions of ob

plement a pagewise blackonly read barrier Rather jects during the copying traversal so it neednt check

than detecting the rst reference to any grey ob ject for forwarding p ointers This eliminates the need

in tospace entire pages of unscanned data in tospace for a read barrierconceptually all ob jects are for

are accessprotected so that the virtual memory sys warded to their new versions at once when the ip

tem will implicitly p erform read barrier checks as part o ccurs

of the normal functioning of the virtual memory hard On the other hand this strategy requires a write

ware When the mutator accesses a protected page barrier and the write barrier must deal with more

a sp ecial trap handler immediately scans the whole

than just p ointer up dates In Bakers collector the

page xing up all the p ointers ie blackening all of

mutator only sees the new versions of ob jects so any

the ob jects in the page referents in fromspace are re

writes to ob jects automatically up date the current

lo cated to tospace ie grayed and accessprotecte d tospace version In replication copying however

This avoids the need for continual software checks to the mutator sees the old version in fromspace if an

implement the read barrier and in the usual case is

ob ject has already b een copied to tospace and the

more ecient If the op erating systems trap han

fromspace version is then mo died bythemutator

dling is slow however it may not b e worth it Despite

the new replica can have the wrong old values in

reliance on op erating system supp ort this technique

itit gets out of synch with the version seen by the

is relatively p ortable b ecause most mo dern op erating mutator

systems provide the necessary supp ort

h al l up Toavoid this the write barrier must catc

Unfortunately this scheme fails to provide meaning

dates and the collector must ensure that all up dates

ful realtime guarantees in the general case WM

have b een propagated when the ip o ccurs That is

NS WJ It do es supp ort concurrent collection

all of the mo dications to old versions of ob jects must

however and can greatly reduces the cost of the read

b e made to the corresp onding new versions so that

barrier In the worst case each p ointer traversal may

the program sees the correct values after the ip

cause the scanning of a page of tospace until the whole

This write barrier app ears to b e exp ensive for most



garbage collection is complete

generalpurp ose programming languages but not for

functional languages or nearlyfunctional languages

such as ML where side eects are allowed but infre

Replication Copying Collection

quently used

Recently Nettles et al NOPH ONG have

devised a new kind of incremental copying collec

Coherence and Conservatism Re

tion replication copying which is quite dierent from

visited

Bakers incremental copying scheme Recall that in

Bakers collector garbage collection starts with a

As we mentioned in Sect incremental collectors

ip which copies the immediatelyreachable data

may take dierent approaches to co ordinating the mu

to tospace and invalidates fromspace from that mo

tator with the collectors tracing traversal If these

ment on the mutator is only allowed to see the new

quasiparallel pro cesses co ordinate closely their views

versions of ob jects never the versions in fromspace

of data structures can b e very precise but the co ordi

Replication copying is almost the reverse of this

nation costs may b e unacceptable If they do not co

While copying is going on the mutator continues to

ordinate closelytheymay suer from using outdated

see the fromspace versions of ob jects rather than the

information and retain ob jects whichhave b ecome

replicas in tospace When the copying pro cess is

garbage during collection

complete a ip is p erformed and the mutator then

sees the replicas

Coherence and Conservatism in Non



Johnson has improved on this scheme by incorp orating

copying collection

lazier copying of ob jects to tospace Joh this is essentially

an application of Nilsens lazy copying technique Niltothe

The noncopying writebarrier algorithms wehavede

App elEllisLi collector This decreases the maximum latency

scrib ed lie at dierent p oints along a sp ectrum of ef

but in the very unlikely worst case a page may still b e scanned

fectiveness and conservatism Snapshotatb eginning

at each p ointer traversal until a whole garbage collection has

b een done the hard way algorithms treat everything conservatively reducing

their eectiveness Dijkstra et als incremental up Coherence and Conservatism in Copy

date algorithm is less conservative than snapshot algo ing Collection

rithms but more conservative than Steeles algorithm

Bakers read barrier algorithm do es not fall neatly

In Steeles algorithm if a p ointer to a white ob

into the ab ove sp ectrum It is less conservative than

ject is stored into a black ob ject that white ob ject

snapshotatb eginning in that a p ointerinagrayob

is not immediately grayedinstead the storedinto

ject maybeoverwritten and never traversed it is more

black ob ject is reverted to gray undoing the black

conservative than the incremental up date algorithms

ening done by the collector This means that if the

however b ecause anything reached bythemutator is

storedinto eld is again overwritten the white ob ject

yedob jects cannot b ecome garbage from the col gra

may b ecome unreachable and may b e reclaimed at the

lectors viewp oint after simply b eing touched by the

end of the current collection In contrast Dijkstras

mutator during a collection

algorithm will have grayed that ob ject and hence will

Nettles et als replication copying algorithm like

not reclaim it

an incremental up date algorithm is able to reclaim

ob jects that b ecome unreachable b ecause a p ointer

It may seem that this is a trivial dierence but it

can b e overwritten b efore b eing reached by the collec

is easy to imagine a scenario in which it matters Con

tor Their collector is less conservative than Bakers

sider a program that stores most of its data in stacks

in part b ecause it can use a weaker notion of consis

implemented as linked lists hanging o of stack ob

tency Because the mutator do esnt op erate in tospace

jects If a stack ob ject is reached and blackened by

until after the copying phase is complete the copies of

the collectors traversal and then many ob jects are

data in tospace neednt b e entirely consistent during

pushed onto and p opp ed o of the stack Dijkstras al

incremental copying The changes made to fromspace

gorithm will not reclaim any of the p opp ed itemsas

data structures by the mutator must b e propagated to

the stack ob jects list p ointer progresses through the

tospace eventually but the entire state only needs to

list rep eatedly b eing overwritten with the p ointer to

b e consistent at the end of collection when the atomic

the next item each item will b e grayed when the pre

ip is p erformed Like the other writebarrier algo

vious one is p opp ed Steeles algorithm on the other

rithms replication copying might b enet signicantly

hand may reclaim almost all of the p opp ed items b e

from opp ortunistic traversal ordering

cause the p ointer eld maybeoverwritten manytimes

b efore the collectors traversal examines at it again

Radical Collection and Opp ortunistic

Note that this sp ectrum of conservatism snapshot

Tracing

algorithms Dijkstras Steeles is only a linear order

ing if the algorithms use the same traversal algorithm

The tracing algorithms weve describ ed fall roughly

sc heduled in the same way relative to the programs

into a sp ectrum of decreasing conservatism thus

actual b ehaviorand this is unlikely in practice De

tails of the ordering of collector and mutator actions

Snapshotatb eginning write barrier

determine howmuch oating garbage will b e retained

Blackonly read barrier

Any of these collectors will retain any data reachable

via paths that are traversed by the collector b efore

Bakers read barrier

b eing broken bythemutator

Dijkstras write barrier

This suggests that the reachability graph might

protably b e traversed opportunistical ly ie total

Steeles write barrier

costs might b e reduced by carefully ordering the scan

ning of gray ob jects For example it might b e desir

In considering this quasisp ectrum it is interesting

able to avoid scanning rapidlychanging parts of the

to ask is there anything less conservative than Steeles

graph for as long as p ossible to avoid reaching ob jects

algorithm That is can wehave a b etterinformed

that will shortly b ecome garbage

collector than Steeles one whic h resp onds more ag

All other things b eing equal ie in lieu of op gressivelytochanges in the reachability graph The

p ortunism and random luck snapshotatb eginning answer is yes Such a garbage collector would b e will

is more conservative hence less eective than incre ing to redo some of the traversal its already done

mental up date and Dijkstras incremental up date is unmarking ob jects that were previously reached to

more conservative than Steeles avoid conservatism We refer to this as a radical



garbage collection strategyAt rst glance suchcol scheme SimilarlyDawsons copying scheme pro

lectors may seem impractical but under some circum posed in Daw is cast as a variantofBakers but it

stances approximations of them maymake sense is actually an incremental up date scheme and ob jects

are allo cated in fromspace ie white as in Dijkstras

The limiting case of decreasing conservatism is to

collector

resp ond fully to anychange in the reachability graph

unmarking ob jects that have already b een reached

The choice of a read or writebarrier scheme is

so that all garbage can b e detected Wemight call

likely to b e made on the basis of the available hard

this a ful ly radical collector

ware Without sp ecialized hardware supp ort a write

barrier app ears to b e easier to implement eciently

One way of doing that is to p erform a full trace of

b ecause heap p ointer writes are much less common

the actual reachability graph at every p ointer write

than p ointer traversals If appropriate virtual mem

on the part of the application Naturally this is im

ory supp ort is available and hard realtime resp onse

practical b ecause of its extreme cost A normal non

is not required a pagewise read barrier may b e desir

incremental collector can b e viewed as an approxima

able

tion of this the graph is traversed instantaneously

by stopping the mutator for the whole traversal but

Of write barrier schemes snapshotatb eginning al

thats only done o ccasionally

gorithms are signican tly more conservative than in

Another wayofachieving fully radical collection cremental up date algorithms This advantage of incre

would b e to record all of the dep endencies within mental up date might b e increased by carefully cho os

the reachability graph and up date the dep endency ing the ordering of ro ot traversal traversing the most

database at every p ointer up date Whenever all stable structures rst to avoid having the collectors

paths keeping an ob ject alive is broken the ob ject is work undone bymutator changes

known to b e garbage Again a full implementationof

While incremental up date schemes increase eec

this strategy would b e impractical for generalpurp ose

tiveness they may also increase costs In the worst

garbage collection b ecause the dep endency database

case everything that b ecomes garbage during a col

could b e very large and p ointer up dates would b e very

lection oats ie it b ecomes unreachable to o late

exp ensive

and is traversed and retained anyway If new ob jects

Note however that approximations of this dep en

are allo cated white sub ject to reclamation incre

dency information could b e relatively cheap and in

mental up date algorithms may b e considerably more

fact thats exactly what reference counts are A ref

exp ensive than snapshotatb eginning algorithms in

erence countisaconservative approximation of the

the worst caseit is p ossible that all of the newly

numb er of paths to an ob ject and when those paths

allo cated ob jects will oat and require traversal with

are eliminated the reference counts usually go to zero

no increase in the amount of memory reclaimed We

and allow the ob ject to b e reclaimed immediately

will discuss this in more detail in Sect

Some distributed garbage collection algorithms also

Careful attention should b e paid to write bar

p erform somewhat radical collection by frequently re

rier implementation Bo ehm Demers and Shenkers

computing some lo cal parts of the collectors traversal

BDS Bo e incremental up date algorithm uses

virtual memory dirty bits as a coarse pagewise write

barrier All black ob jects in a page must b e rescanned

Comparing Incremental

if the page is dirtied again b efore the end of a collec

Techniques

tion As with App el Ellis and Lis copy collector

this coarseness sacrices realtime guarantees while

In comparing collector designs it is instructivetokeep

supp orting parallelism It also allows the use of o

in mind the abstraction of tricolor markingas dis

theshelf compilers that dont emit write barrier in

tinct from concrete tracing mechanisms suchasmark

structions along with heap writes

sweep or copy collection The choice of a read or

writebarrier and strategy for ensuring correctness



The use of uniform indirectionsmay b e viewed as avoiding

is mostly indep endentofthechoice of a tracing and

the need for a Bakerstyle read barrierthe indirections isolate

reclamation mechanism

the collector from changes made by the mutator allowing them

to b e decoupled The actual co ordination in terms of tricolor

For example Bro oks copying collector Bro

marking is through a write barrier Bro oks algorithm uses

whichwementioned in Sect is actually an in

a simple write barrier to protect the mutator from the collec

cremental up date write barrier algorithm even though

tor and a simple read barrier to protect the collector from the

Bro oks describ es it as an optimization of Bakers mutator

In a system with compiler supp ort for garbage col of program statements eg to decide whichnoteis

lection a list of storedinto lo cations can b e kept or b eing played what the corresp onding pitch is given

dirty bits can maintained in software for small areas the current tunings how loud to play it which sound

of memory to reduce scanning costs and b ound the comp onents to mix in what prop ortions to achieve the

time sp ent up dating the marking traversal This has right timbre for the given pitch and so on

b een done for other reasons in generational garbage

For most applications therefore a more realistic

collectors as we will discuss in Sect

requirement for real time p erformance is that the ap

plication always b e able to use the CPU for a given

fraction of the time at a timescale relevant to the ap

Realtime Tracing Collection

plication Naturally the relevant fraction will dep end

Incremental collectors are often designed to b e real

on b oth the application and the sp eed of the pro ces

time ie to imp ose strictly limited delays on pro

sor

gram execution so that programmers can guaran

Forachemical factorys pro cess control computer

tee that their garbagecollected programs will meet

it might b e sucient for the controlling application to

realtime deadlines Realtime applications are many

execute for at least one second out of every two b e

and varied including industrial pro cess controllers

cause the controller must resp ond to changes eg in

testing and monitoring equipment audiovisual pro

vat temp eratures within two seconds and one second

cessing ybywire aircraft controls and telephone

is enough to compute the appropriate resp onse On

switching equipment Realtime applications can usu

the other hand a controller for a musical synthesizer

ally b e classied as hard real time where computations

might require the CPU to run its control program for

must complete with strictlylimited time b ounds and

half a millisecond out of every two milliseconds to

soft realtime where it is acceptable for some tasks to

keep delays in the onset of individual notes b elow the

miss their schedules some of the time as long as it

threshold of noticeability



do esnt happ en to o often

Note that either of these applications can function

The criterion for real time garbage collection is of

correctly if the garbage collector sometimes stops the

ten stated as imposing only smal l and bounded delays

application for a quarter of a millisecond Provided

on any particular program operation For example

that these pauses arent to o frequent theyre to o short

traversing a p ointer might never take more than a mi

ant to the applications realtime deadlines to b e relev

crosecond heapallo cating a small ob ject mightnever

But supp ose these pauses are clustered in time if

take more than a few microseconds and so on

they happ en frequently enough they will destroy the

There are two problems with this kind of crite

applications ability to meet deadlines simply bysoak

rion One problem is that the appropriate notion of

ing up to o large a fraction of the CPU time If the ap

a small delay is inevitably dep endent on the nature

plication only executes for a sixteenth of a millisecond

of an application For some applications it is accept

between quartermillisecond pauses it cant get more

able to have resp onses that are delayed by a signicant

than a fth of the CPU time In that case either of

fraction of a second or even man y seconds For other

the ab ove programs would fail to meet its realtime re

applications a delay of a millisecond or twoisnota

quirements even the pro cess control system that only

problem while for others delays of more than a few

needs to resp ond within two seconds

microseconds could b e fatal On one hand consider a

As we describ ed ab ove some copy collectors use

music synthesizer controller where humans own im

virtual memory protections to trigger pagewise scan

precision will swamp a delay of a millisecond and b e

ning and this coarseness may fail to resp ect realtime

unnoticeable on the other consider a highprecision

guarantees In the worst case traversing a list of a

guidance system for antimissile missiles

thousand elements may cause a thousand pages to b e

Another problem with this kind of criterion is that it

scanned p erforming considerable garbage collection

unrealistically emphasizes the smallest program op er

work and incurring trap overheads as well In this

ations When you press a key on a musical keyb oard

way a list traversal that would normally take a few

the controller may b e required to execute thousands

thousand instructions may unexp ectedly take millions



For example in a digital telephone system making a con

increasing the time to traverse the list by several or

nection might b e a soft realtime task but once a connection

ders of magnitude Lo cality of reference maymake

is established delivering continuous audio may b e a hard real

such situations improbable but the probability of bad

time task In this section we will deal primarily with hard

realtime issues cases is not negligible

Unfortunately using a negrained incremental col the convenient prop erty that their time overheads are

lector may not x this problem either Nil Wit more predictable their space costs are muchmoredif

Consider Bakers copying technique The time to tra cult to reason ab out A copying algorithm generally

verse a list dep ends on whether the list elements re frees a large contiguous area of memory and requests

quire relo cation to tospace Traversing a single p ointer for ob jects of any size can b e satised by a constant

may require an ob ject to b e copied this may increase ying al time stacklike allo cation op eration Noncop

the cost of that memory reference by an order of mag gorithms are sub ject to fragmentationmemory that

nitude even if objects are smal l and hardware support is freed may not b e contiguous so it maynotbepos

is available Consider copying a Lisp cons cell con sible to allo cate an ob ject of a given size even if there

sisting of a header a CAR eld and a CDR eld At is that much memory free

least three memory reads and three memory writes

The following sections discuss techniques for obtain

are required for the actual copy plus extra instruc

ing realtime p erformance from an incremental trac

tions to install a forwarding p ointer adjust the free

ing collector We assume that the system is purely

space p ointer and probably to branch to and from

hard real timethat is the program consists only of

the garbage collector routine that do es this work

computations which must complete b efore their dead

In such cases it is p ossible for the garbage collec

lines we also assume that there is only one timescale

tor overheads to consume over of the CPU time

for realtime deadlines In such a system the main

reducing the available computing p owerthat is the

goal is to make the worstcase p erformance as go o d as

power guaranteed to b e available for meeting realtime

p ossible and further increases in exp ectedcase p er

deadlinesby an order of magnitude

formance do no go o d Later we will briey dis

cuss tradeos in systems with soft realtime sched

In Bakers original incremental copying scheme the

ules where dierences in exp ectedcase p erformance

worstcase cost is even worse b ecause any p ointer

may also b e imp ortant We also assume that either a

traversal may force the copying of a large ob ject Ar

copying algorithm is used or all ob jects are of a uni

rays are treated sp ecially and copied lazily ie only



form size This allows us to assume that any mem

when they are actually touched Nilsen reduces the

ory request can b e satised byanyavailable memory

worstcase by extending this lazy copying to all typ es

and ignore p ossible fragmentation of free storage

of ob jects When a p ointer to a tospace ob ject is en

coun tered bythemutator space is simply reserved in

tospace for the ob ject rather than actually copying it

Ro ot Set Scanning

The actual copying o ccurs later incrementally when

An imp ortant determinant of realtime p erformance

the background scavenger scans that part of tospace

is the time required to scan the ro ot set Recall that

Nil

in Bakers incremental collector the ro ot set is up

In deciding on a realtime tracing strategy there

dated and immediatelyreachable ob jects are copied

fore it is imp ortant to decide what kind of guaran

to tospace in a single atomic op eration uninterrupted

tees are necessary and at what timescales While

bymutator execution This means that there will o c

Bakers is the b estknown incremental algorithm it

casionally b e a pause of a duration roughly prop or

may not b e the most suitable for most realtime appli

tional to the size of the ro ot set This pause is likely

cations b ecause its p erformance is very unpredictable

to b e much larger than a pause for a normal incre

at small timescales Algorithms with a weaker cou

ment of tracing and may b e the main limitationon

pling b etween the mutator and the collector suchas

realtime guarantees

most writebarrier algorithms may b e more suitable

Similar pauses o ccur in incremental up date trac

WJ It may b e easier for programmers to reason

ing algorithms when attempting to terminate a collec

ab out realtime guarantees if they know that p ointer

tion Before a collection can b e considered nished

traversals always take a constant time indep endentof

the ro ot set must b e scanned along with anygrayob

whether the p ointer b eing traversed has b een reached

jects recorded by the write barrier in the case of an

by the garbage collector yet Write barrier algorithms

algorithm like Steeles and all reachable data must

require more work p er p ointer store but the work p er

b e traversed and blac kened atomically This ensures

program op eration is less variable and most of it need



In some systems it is feasible to transparently fragment

not b e p erformed immediately to maintain correct

languagelevel ob jects into easilymanaged chunks to make gar

ness

bage collection easier and reduce or eliminate fragmentation

Unfortunately while noncopying algorithms have problems

that no p ointers have b een hidden from the collector some way of ensuring that the collector gets enough

by storing them in ro ots after those ro ots were last CPU time to complete its task b efore free memory is

scanned If this work cannot b e accomplished within exhausted even in the worst p ossible case To pro

the time allowed by the realtime b ounds the collector vide such a guarantee it is necessary to quantify the

must b e susp ended and the mutator resumed and the worst casethat is to put some b ound on what the

entire termination pro cess must b e tried again later collector could b e exp ected to do Since a tracing col

Snapshot at b eginning algorithms dont p ose as dif lector must traverse live data this requires putting a

cult a problem for termination detection since no b ound on the amountoflive data In general the

paths can b e hidden from the collector programmer of an application must ensure that the

program do esnt have more than a certain amountof

One way to b ound the work required for a ip or for

live data to traverse and the collector can then de

termination is to keep the ro ot set small Rather than

termine howfastitmust op erate in order to meet its

considering all lo cal and global variables to b e part of

deadline It can then determine whether this requires

the ro ot set some or all of them may b e treated like

more CPU time than it is allowed to consume Natu

ob jects on the heap Reads or writes to these variables

rally this generally allows some tradeos to b e made

will b e detected by the read barrier or write barrier

in the parameter settings If more memory is avail

and the collector will therefore maintain the relevant

able the collector generally needs a smaller fraction

information incrementally

of the CPU time to guarantee that it nishes b efore

The problem with keeping the ro ot set small is

memory is exhausted

that the cost of the read or write barrier go es up

corresp ondinglya larger number of variables is pro

The usual strategy for ensuring that free memory is

tected by a read or write barrier incurring overhead

not exhausted b efore collection is nished is to use an

each time they are read or written One p ossible trade

al location clockfor each unit of allo cation a corre

o is to avoid the read or writebarrier cost for register

ork is done and the lat sp onding unit of collection w

allo cated variables and to scan only the register set

ter unit is large enough to ensure that the traversal is

atomically when necessary If there are to o manyoper

completed b efore the free space is exhausted Bak

ations on stackallo cated lo cal variables however this

The simplest form of this is to key collection work di

will slow execution signicantly In that case the en

rectly to allo cationeach time an ob ject is allo cated

tire stackmay b e scanned atomically instead While

a prop ortional amount of garbage collection work is

this may sound exp ensive most realtime programs

done This guarantees that no matter howfastapro

never have deep or unb ounded activation stacks and

gram uses up memory the collector is accelerated cor

the cost may b e negligible at the scale of the pro

resp ondingly In actual implementations the work

grams intended resp onse times Similarly for small

is usually batchedupinto somewhat larger units over

systems using fast pro cessors or with relatively large

several allo cations for eciency reasons

timescales for realtime requirements it maybede

In the rest of this section weshowhowtocompute

sirable to avoid the read or write barrier for all global

the minimum safe tracing rate starting with a non

variables and scan them atomically as w ell Interme

copying snapshotatb eginning collector which allo

diate strategies are p ossible treating some variables

cates ob jects black ie not sub ject to collection We

one way and others another p erhaps based on prol

make the simplifying assumption that all ob jects are of

ing information

a uniform size so that there is a single p o ol of memory

that is allo cated from and reclaimed After describing

Guaranteeing Sucient Progress

this simple case we will explain how the safe tracing

rate diers for other incremental tracing algorithms

The preceding section fo cused on ensuring that the

collector do es not use to o much CPU time at the

For a snapshotatb eginning algorithm all of the

relevant timescale keeping the pro cessor from b eing

live data at the b eginning of a collection must b e tra

able to meet its realtime deadlines Converselythe

versed by the end of the collection Other algorithms

collector has a realtime deadline of its own to meet

must do this to o in the worst case b ecause ob jects

it must nish its traversal and free up more memory

mayhavetobetraversed even if they are freed during

b efore the currentlyfree memory is exhausted If it

collection In the absence of any other information

do esnt the application will havetohaltandwait for

from the programmer the collector must generally as

the collection to complete and free up more memory

sume that at the b eginning of collection the maximum

amount of live data is in fact live For hard realtime programs then there must b e

Since we assume that ob jects created during collec safe traversal rate do es not approach zero as mem

tion are allo cated black ie not sub ject to reclama ory b ecomes very largeit approaches the allo cation

tion we need not traverse themthose ob jects will b e rate the traversal must keep up with the allo cation

ignored until the next garbage collection cycle rate and go at least a little faster to ensure that it

eventually catches up If we increase the amountof

At rst glance it might app ear that for a maxi

memory relative to the amount of live data we reach

mum amountoflive data L and a memory size of

a p oint of diminishing returnswemust always trace

M wewould haveM  Lmemoryavailable to

at least as fast as we allo cate

allo catethis would imply a minimum safe tracing

The ab ove analysis applies to noncopying collectors

rate of M  LLtotrace L data b efore this head

for uniformsized ob jects In copying collectors more

ro om is exhausted Unfortunately though we also

memory is required to hold the new versions of ob jects

have to deal with oating garbage The data that are

b eing copied there must b e another L units of mem

live at the b eginning of collection may b ecome gar

ory available in the worst case to ensure that tospace

bage during collection but to o late to b e reclaimed

is not exhausted b efore fromspace is reclaimed This

at this garbage collection cycle The data weveal

a ma jor space cost if L is large relative to the ac

lo cated may also b e garbage but since we allo cate

tual amountofmemoryavailable In noncopying

blackwe dont know that yet If wewere to use up

collectors for nonuniformsized ob jects fragmentation

M  L memorywe mightnotget any space back

must b e taken into account Fragmentation reduces

at this garbage collection cycle and wewould have

the eective memory available requiring faster trac

no headro om left to try another collection the maxi

ing to complete collection in b ounded memory Com

mum data we should allo cate is therefore only half the

putations of worstcase fragmentation are intrinsically

headro om or M  L The minimum safe tracing

programsp ecic WJ due to space limitations we

rate allows us to allo cate that in the time it takes to

will not discuss them here

traverse the maximum live data so the safe tracing

rate is M  LLorM  LL This is su

cient for the worst case in which all garbage oats for

Trading worstcase p erformance for ex

an entire garbage collection cycle but is reclaimed at

p ected p erformance

the next cycle

When a collection phase is complete the collector

As mentioned ab ove the situation is essentially the

versal rate can often determine a less conservativetra

same for other incremental tracing algorithms so long

slowing down the collection pro cess and yielding more

as they allo cate new ob jects black b ecause in the

CPU cycles to the mutator This is p ossible b ecause

worst case they retain all of the same ob jects as a

at the end of the collection the collector can deter

snapshotatb eginning algorithm The minimum safe

mine howmuchlive data was in fact traced and revise

tracing rate is prop ortional to the amount of live data

downward its worstcase estimate of what could b e live

and inversely prop ortional to the amount of free mem

at the next collection This mayimprove p erformance

ory it therefore approaches zero as memory b ecomes

somewhat but usually not dramatically

very large relative to the maximum amountoflive

Alternatively when the collector can determine that

data

it has less than the worstcase amountofwork to do

For allo cating white however the situation is con

it mayavoid GC activityentirely for a while then

siderably worse When allo cating white we are gam

reactivate the collector in time to ensure that it will

bling that newlyallo cated data will b e shortlived we

meet its deadline This is an attractive option if the

therefore make them sub ject to garbage collection in

read or write barrier can b e eciently disabled on the

hop es of reclaiming their space at the current cycle

y

This obliges us to traverse reachable white ob jects

and in the worst case wetraverse everything we al lo

Discussion

cate b efore it b ecomes garbage Even though weas

sume that there is a b ound on the amount of live data

The foregoing analysis assumes a fairly simple mo del

provided by the programmer wemust takeinto ac

of realtime p erformance with a single timescale for

count the conservatism of the traversal pro cess and

hard realtime deadlines More complex schemes are

ersed bythecol the fact that anypointer may b e trav

certainly p ossible in systems with mixed hard and soft

lector b efore its broken by the mutator

deadlines or systems whichhavemultiple timescales

for dierent kinds of goals For example the col When allo cating white therefore the worstcase

lectors infrequent but relatively exp ensive op erations p erformance Algorithms that are less conservative

like ro otset scanning mightbescheduled along with may not b e more attractive than others b ecause the

the applications own longerterm deadlines in a com less conservative algorithms are just as conservative

plementary pattern This could achieve higher p erfor in the worst case

mance overall while providing tight realtime guaran

Even in the usual case the less conservative algo



tees where necessary

rithms may not b e desirable b ecause they maysim

Wehave also assumed a fairly simple mo del of ply b e slower eg b ecause their write barriers re

garbage collection in that there is a single p o ol of quire more instructions Paradoxically this can make

memory available for all memory requests In a non a less conservative algorithm moreconservative in

copying system with ob jects of widely diering sizes practice b ecause its cost maykeep it from b eing run

this will not b e the case b ecause freeing several small

as often Because of the higher overhead the reduced

ob jects do es not necessarily make it p ossible to al conservatism in terms of incremental strategies may

lo cate a larger one On the other hand it app ears intro duce greater conservativeness in how frequently

that many applications memory usage is dominated garbage is collected at all

byavery few sizes of ob jects reasoning ab out real

Overall system design goals are therefore imp ortant

time collection may not b e as hard as it app ears at

y garbage collection algorithm As to the choice of an

rst glance for the ma jority of programs WJ

we will explain in the next section generational tech

Still such reasoning must b e done on applicationby

niques make the overheads of incremental collection

application basis For some programs guaranteeing

unnecessary for many systems where hard realtime

realtime p erformance may cost considerable mem

resp onse is not necessary and it is sucient for the

ory due to p ossible fragmentation unless application

collector to b e nondisruptive in typical op eration

level ob jects can b e split into more uniform chunks

For other systems it may b e desirable to combine in

Another p ossibility is to staticallyallo cate the most

cremental and generational techniques and careful at

troublesome datatyp es as is usually done in realtime

tention should b e paid to how they are combined

systems anyway but rely on the garbage collector to

manage most of the ob jects automatically

For fully general realtime garbage collection with Generational Garbage Col

reasonable worstcase memory usage it app ears that

lection

negrained copying collection is required Nil As

mentioned ab ove copying collection can b e quite ex

Given a realistic amount of memory eciency of sim

p ensive in the worst case even if Lispmachine style

ple copying garbage collection is limited by the fact

hardware supp ort is available to sp eed up the read bar

that the system must copy all live data at a collection

rier EV Wit Nilsen and Schmidt hav e designed

In most programs in a variety of languages most ob

and simulated hardware supp ort which wil l guarantee

jects live a very short time while a smal l percentage

usefully realtime p erformance NS but it is signif

of them live much longer LHUng Sha Zor



icantly more complex

DeTHay While gures vary from language to

language and from program to program usually b e

tween and p ercent of all newlyallo cated heap

Cho osing an Incremental Algo

ob jects die within a few million instructions or b efore

rithm

another megabyte has b een allo cated the ma jorityof

In cho osing an incremental strategy it is imp ortantto

ob jects die even more quickly within tens of kilobytes

prioritize overall average p erformance and worstcase

of allo cation

Heap allo cation is often used as a measure of pro



Consider an autonomous rob ot which might need to revise

gram execution rather than wall clo ck time for two

its overall highlevel planning only every second or so but might

reasons One is that its indep endent of machine and

also need to resp ond reexively to changes in its environment

within a few milliseconds The lowlevel vision and reactive

implementation sp eedit varies appropriately with

adjustments might consume a xed p ercentage of CPU time on

the sp eed at which the program executes whichwall

the scale of a few milliseconds with the remainder available

clo ck time do es not this avoids the need to con tinually

alternately to the highlevel planning functions and to the GC



cite hardware sp eeds It is also appropriate to sp eak

alternating every halfsecond



Nilsens approachisinteresting in that it requires relatively



complex memory controllers but it is compatible with othe One must b e careful however not to interpret it as the

shelf highp erformance micropro cessors ideal abstract measure For example rates of heap allo cation

Younger Generation

in terms of amounts allo cated b ecause the time b e

tween garbage collections is largely determined bythe

 ROOT

tofmemoryavailable Future improvements

amoun SET

in compiler technology may reduce rates of heap allo

cation by putting more heap ob jects on the stack

this is not yet much of a problem for exp erimental

studies b ecause most current stateoftheart compil

ers dont do much of this kind of lifetime analysis

Even if garbage collections are fairly close together

separated by only a few kilobytes of allo cation most

ob jects die b efore a collection and never need to b e

copied Of the ones that do survive to b e copied once

wever alarge fraction survive through many col lec

ho Older Generation

tions These ob jects are copied at every collection

over and over and the garbage collector sp ends most

of its time copying the same old ob jects rep eatedly

This is the ma jor source of ineciency in simple gar

bage collectors

Generational col lection LH avoids muchofthis

rep eated copying by segregating ob jects into multiple

areas by age and collecting areas containing older ob

jects less often than the younger ones Once ob jects

have survived a small numb er of collections they are

Figure A generational copying garbage collector

moved to a less frequently collected area Areas con

b efore garbage collection

taining younger ob jects are collected quite frequently

b ecause most ob jects there will generally die quickly

freeing up space copying the few that survive do esnt

acceptable substitute for more exp ensive incremental

cost much These survivors are advanced to older sta

techniques as well as to improveoverall eciency

tus after a few collections to keep copying costs down

For historical reasons and simplicity of explana

For stopandcollect nonincremental garbage col

tion we will fo cus on generational copying collectors

lection generational garbage collection has an addi

The choice of copying or marking collection is essen

tional b enet in that most collections take only a short

tially orthogonal to the issue of generational collec

timecollecting just the youngest generation is much

tion however DWH

faster than a full garbage collection This reduces the

frequency of disruptive pauses and for many programs

Multiple Subheaps with Varying

without realtime deadlines this is sucientforac

ceptable interactive use The ma jority of pauses are

Collection Frequencies

so brief a fraction of a second that they are unlikely

Consider a generational garbage collector based on the

to b e noticed by users Ung the longer pauses for

semispace organization memory is divided into areas

multigeneration collections can often b e p ostp oned

that will hold ob jects of dierent approximate ages

until the system is not in use or hidden within nonin

or generationseach generations memory is further

teractive computeb ound phases of program op eration

divided into semispaces In Fig we show a simple

WM Generational techniques are often used as an

generational scheme with just two age groups a New

are typically higher in Lisp and Smalltalk b ecause more control

generation and an Old generation Ob jects are allo

information andor intermediate data of computationsmaybe

cated in the New generation until its current semis

passed as p ointers to heap ob jects rather than as structures on

pace is full Then the New generation only is col

the stack



lected copying its live data into the other semispace

Allo cationrelative measures are still not the absolute

b ottomline measure of garbage collector eciency though b e

as shown in Fig

cause decreasing work p er unit of allo cation is not nearly as

If an ob ject survives long enough to b e considered

imp ortant if programs dont allo cate much conversely smaller

old it can b e copied out of the new generation and

p ercentage changes in garbage collection work mean more for

into the old rather than backinto the other semis programs whose memory demands are higher

Younger Generation

ately when the ob ject is moved Either event destroys

the integrity and consistency of data structures in the ROOT

SET heap

Ensuring that the collector can nd p ointers into

young generations requires the use of something like

the write barrier of an incremental collectorthe

running program cant freely store p ointers into heap

ob jects at will Eachpotential p ointer store must

b e accompanied by some extra b o okkeeping in case

an intergenerational p ointer is b eing created As in

an incremental collector this is usually accomplished

byhaving the compiler emit a few extra instructions

Older Generation

along with each store of a p ossible p ointer value into

an ob ject on the heap

The write barrier maydochecking at each store

or it may b e as simple as maintaining dirtybitsand

scanning dirty areas at collection time Sha Sob

WM Wil HMS The imp ortant p ointisthat

all references from old to new memory must b e lo cated

at collection time and used as ro ots for the copying

traversal

Using these intergenerational p ointers as ro ots en

sures that all reachable ob jects in the younger gener

Figure Generational collector after garbage collec

ation are actually reached by the collector in the case

tion

of a copying collector it also ensures that all p ointers

to moved ob jects are appropriately up dated

pace This removes it from consideration by single

As in an incremental collector this use of a write

generation collections so that it is no longer copied

barrier results in a conservative approximation of true

at every collection Since relatively few ob jects live

liveness any p ointers from old to new memory are

this long old memory will ll much more slowly than

used as ro ots but not all of these ro ots are necessarily

new Eventually old memory will ll up and haveto

live themselves An ob ject in old memory may already

b e garbage collected as well Figure shows the gen

have died but that fact is unknown until the next time

eral pattern of memory use in this simple generational

old memory is collected Thus some garbage ob jects

scheme Note the gure is not to scalethe younger

may b e preserved b ecause they are referred to from

generation is typically several times smaller than the

ob jects that are oating undetected garbage This

older one

app ears not to b e a problem in practice Ung UJ

The numb er of generations may b e greater than two

ters from It would also b e p ossible to track all p oin

with each successive generation holding older ob jects

newer ob jects into older ob jects allowing older ob jects

and b eing collected considerably less often Tektronix

to b e collected indep endently of newer ones This is

Smalltalk is such a generational system using

more costlyhowever b ecause there are typically many

semispaces for each of eight generations CWB

more p ointers from new to old than from old to new

In order for this scheme to work it must b e p ossi Such exibility is a consequence of the way references

ble to collect the younger generations without col are typically createdby creating a new ob ject that

lecting the older ones Since liveness of data is a refers to other ob jects which already exist Sometimes

global prop ertyhowever oldmemory data must b e a p ointer to a new ob ject is installed in an old ob ject

taken into account For example if there is a p ointer but this is considerably less common This asymmetri

from old memory to new memory that p ointer must cal treatmentallows ob jectcreating co de like Lisps

b e found at collection time and used as one of the frequentlyused cons op eration to skip the record

ro ots of the traversal Otherwise an ob ject that is ing of intergenerational p ointers Only noninitializing

livemay not b e preserved by the garbage collector stores into ob jects must b e checked for intergenera

or the p ointer may simply not b e up dated appropri tional references writes that initialize ob jects in the

AAA AAA AAA AA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AAA AA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AAA AA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AAA AA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AAA AA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AAA AA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AAA AA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AAA AA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AAA AA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AAA AA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AAA AA AAA AAAA AAAA AAAA AAAA

First (New) AAAA

AAA AAA AAA AA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AAA AA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AA AAA AAA AAAA AAAA AAAA AAAA

Generation AAAA

AAA AAA AA AAA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AA AAA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AA AAA AAA AAAA AAAA AAAA AAAA Memory AAAA

AAA AAA AA AAA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AA AAA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AA AAA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AA AAA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AA AAA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AA AAA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AA AAA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AA AAA AAA AAAA AAAA AAAA AAAA AAAA

AAA AAA AA AAA AAA AAAA AAAA AAAA AAAA AAAA

AA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

Second AA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA Generation AA

Memory

AAAA AAAA AAAA AAAA AAAA AA

AAAA AAAA AAAA AAAA AAAA AA

AAAA AAAA AAAA AAAA AAAA AA

AAAA AAAA AAAA AAAA AAAA AA

AAAA AAAA AAAA AAAA AAAA AA

AAAA AAAA AAAA AAAA AAAA AA

Figure Memory use in a generational copy collector with semispaces for each generation

youngest generation cant create p ointers into younger AdvancementPolicies

ones

The simplest advancement p olicy is to simply advance

Even if youngtoold p ointers are not recorded it

all live data into the next generation whenever they

may still b e feasible to collect a generation without

are traversed This has an advantage of ease of imp

collecting younger ones In this case al l data in the

lementation b ecause it is not necessary to b e able to

younger generations may b e considered p ossible ro ots

distinguish b etween ob jects of dierent ages within a

and they may simply b e scanned for p ointers LH

generation In a copying collector this allows the use

While this scanning consumes time prop ortional to the

of a single contiguous area for a generation with no

amount of data in the younger generations each gen

division into semispaces and it do es not require any

eration is usually considerably smaller than the next

header elds to hold age information

and the cost may b e small relative to the cost of ac

An additional advantage of advancing everything

tually collecting the older generation Scanning the

out of a generation at the rst traversal is that

data in the younger generation may b e preferable to

it avoids the buildup of longlived ob jects within a

collecting b oth generations b ecause scanning is gener

generationa longlived ob ject cannot b e copied re

ally faster than tracing and copying it may also have

p eatedly at the same timescale b ecause it will b e

b etter lo cality

quickly advanced to the next older generation which

The cost of recording intergenerational p ointers is

is collected less often

typically prop ortional to the rate of program execu

The problem here is that ob jects maybeadvanced

tion ie its not particularly tied to the rate of ob

too fastshortlived ob jects allo cated shortly b efore

ject creation For some programs it may b e the ma jor

a collection will b e advanced to the next generation

cost of garbage collection b ecause several instructions

even though they are quite young and likely to die

must b e executed for every p otential p ointer store into

almost immediately Ung WM This will cause

the heap This mayslow program execution down by

the older generation to ll up more quickly and b e

several p ercent

collected more often The problem of very shortlived

Within the framework of the generational strategy

ob jects may b e alleviated by delaying the advance

weve outlined several imp ortant questions remain

ment of ob jects by just one garbage collection cycle

this ensures that all ob jects are roughly the same age

Advancement policy How long must an ob ject

within a factor of two when they are advanced to

survive in one generation b efore it is advanced to

an older generation In an incremental generational

the next

collector allo cating blackcanhave a similar eect if

incremental collection phases are of a sucient dura

Heap organization How should storage space b e

tion WJ

divided and used b etween generations and within

It is unclear whether keeping ob jects within a gen

a generation How do es the resulting reuse pat

eration for more than two collection cycles is worth

tern aect lo cality at the virtual memory level

the extra copying cost Under most conditions it ap

and at the lev el of highsp eed cache memories

p ears that successive copies do not greatly reduce the

amount of data advanced WM Zor although

Col lection scheduling For a nonincremental col

this is highly dep endent on the nature of the applica

lector how mightweavoid or mitigate the eect

tion it may also b e desirable to vary the advancement

of disruptive pauses esp ecially in interactive ap

p olicy dynamically WM UJ

plications Can weimprove eciency by careful

The desirabilityofkeeping data in a generation for

opp ortunistic scheduling Can this b e adapted

multiple collection cycles is also aected bythenum

to incremental schemes to reduce oating gar

b er and size of older generations In general if there

bage

are very few generations eg two as in Ungars Gen

eration Scavenging system it is more desirable to

retain data longer to avoid lling up older genera Intergenerational references Since it must b e

tions If intermediate generations are available it is p ossible to collect younger generations without

usually preferable to advance things more quicklybe collecting the older ones wemust b e able to nd

cause they are likely to die in an inb et the live p ointers from older generations into the ween genera

ones were collecting What is the b est waytodo tion and never b e advanced to the oldest generation

this WM Zor

Heap Organization likely to die shortly after b eing advanced needlessly

taking up space in the next generation and forcing it

A generational collector must treat ob jects of dierent

to b e collected again so oner

ages dierently While tracing it must b e able to

Ungars solution to this problem in the Young gen

tell which generation an ob ject b elongs to in order

eration of his Generation Scavenging collector is

to decide whether to trace its ospring and whether

to use three spaces instead of just two with all ob

to advance it to another generation The write barrier

jects b eing initially allo cated in the third space The

must also b e able to determine ob jects generations to

newlycreated ob jects in this third space are copied

detect whether a p ointer to a younger ob ject is b eing

into a semispace along with the ob jects from the

stored into an older ob ject

other semispace The third space is emptied at ev

In a copying collector this is usually done bykeep

ery garbage collection cycle and can therefore b e

ing ob jects of dierent ages in dierent areas of mem

reused immediately each time It therefore has lo cal

ory In many systems these are contiguous areas an

itycharacteristics similar to those of a singlespace

ob jects generation can therefore b e determined by

p ergeneration system It might seem that this third

simple address comparisons In other systems the

space would increase memory usage since semispaces

areas may b e noncontiguous sets of pagesan ob

are still required in that generation so that ob jects can

jects generation can b e determined by using the page

be kept in the generation for multiple collections The

numb er part of its address to index into a table that

creation area is used in its entiretyateach allo cation

says which generation that page b elongs to

andcollection cycle while each semispace is used to

In other systems such as noncopying collectors

hold the survivors at every other collection cycle Typ

each ob ject b elongs to a generation but ob jects of

ically only a small minority of new ob jects typically

dierent generations maybeintersp ersed in memory

survives even a rst collection so only a small part of

Typicallyeach ob ject has a header eld indicating

each semispaces is actually used most of the time and

which generation it b elongs to

the overall memory usage is lower

Wilsons Opp ortunistic Garbage Collector WM

Subareas in copying schemes

uses a variation on this scheme with the subareas

within a generation used for the additional purp ose

Generational copying collectors divide each genera

of deciding when to advance an ob ject from one gen

tions space into several areas For example each gen

eration to anotherob jects are advanced out of the

eration may consist of a pair of semispaces so that

semispaces to the next generation at each cycle rather

ob jects can b e copied back and forth from one space

than b eing copied back and forth from one semispace

to another to retain them within a generation over

to the other at successive collections In eect this

multiple collections If only one space is used ob jects

is a simple bucket brigade advancementmechanism

must b e advanced to another generation immediately

Sha using the segregation of ob jects into subareas

b ecause theres nowhere to copy them to within the

to enco de their ages for the advancement p olicyIt

same generation

avoids the need for age elds in ob ject headers which

The lo cality of semispace memory usage is p o or

may b e advantageous in some systems where some

only half of the memory of a generation can b e in use 

ob jects do not have headers at all It do es provide

at a given time yet b oth of the spaces are touched in

a guarantee that ob jects will not b e advanced out of

their entiretyevery two collection cycles Lisp ma

a generation without surviving for at least one and

chine garbage collectors Mo oCouavoid this

up to two collection cycles this is sucienttoavoid

problem by using only a single space p er generation

premature advancementofvery shortlived data

Rather than copying ob jects from one semispace to

In several generational copying collection systems

the other until they are advanced garbage collection

the oldest generation is treated sp ecially In the Lisp

of a generation advances al l ob jects into the next gen

machine collectors this is necessitated by the fact that

eration This avoids the need for a pair of semispaces

most generations are emptied at every collection cycle

except in the oldest generation which has no place to

and their contents copied to the next generationfor

copy things to Unfortunatelyithasthedrawback

the oldest generation there isnt an older generation

that relatively young ob jects maybeadvanced along

with relatively old onesob jects allo cated shortly b e



For example in some highp erformance Lisp systems a sp e

fore a collection are not given much time to die b efore

cial p ointer tag signies a p ointer to a cons cell and the cons

b eing advanced These relatively young ob jects are cell itself has no header

to copy things into The oldest generation dynamic plained in Sect pages of older generations dirt

space is therefore structured as a pair of semispaces ied since the previous collection are scanned in their

which are used alternately A further enhancementis entiretyandany p ointers to young generations are

to provide a sp ecial area called static space which noted and used as part of the ro ot set

is not garbage collected at all during normal op eration

This area holds system data and compiled co de that

Discussion

are exp ected to change very rarely

Manyvariations on generational collection are p ossi

Some copying collectors based on Ungars Gener

ble and hybrids are common to allowvarious trade

ation Scavenging system treat the oldest generation

os to b e adjusted

sp ecially by structuring it as a single space and using

It is common for copying collectors to manage

a markcompact algorithm In these systems all gen

large ob jects dierently storing them in a sp ecial

erations typically reside in RAM during normal execu

large object area and avoiding actually copying them

tion and the use of a single space reduces the RAM re

CWB This essentially combines copying of small

quired to keep the oldest generation memory resident

ob jects where its cheap with marksweep for large

While the markcompact algorithm is more exp ensive

ob jects to avoid the larger space and time overheads

than a typical copy collection the ability to p erform

of copying them Commonly large ob jects are actu

a full collection without paging makes it worthwhile

ally represented by a small copycollected proxy ob

for the oldest generation Noncopying techniques can

ject which holds an indirection p ointer to the actual

b e used for the same purp ose although they are more

storage for the ob jects data elds

sub ject to fragmentation problems

Ob jects known not to contain p ointers mayalsobe

segregated from other ob jects to optimize the tracing

Generations in Noncopying Schemes

pro cess andor the scanning involved in some schemes

In our discussion of generational collection thus far we

for tracking intergenerational p ointers Lee this

have fo cused primarily on copying garbage collection

may also improve lo cality of reference during tracing

schemes where generations can b e viewed as areas

if copycollected proxies are used b ecause the actual

of memory holding ob jects of dierent ages This is

storage for nonp ointer ob jects neednt b e touched at

unnecessaryhowever as long as it is p ossible to dis

all

tinguish ob jects of dierent ages and treat them dif

The ParcPlace Smalltalk garbage collector com

ferently Just as incremental collection algorithms are

oung generation bines stopandcopy collection of the y

b est understo o d in terms of the abstraction of tricolor

where the worstcase pause is not large with incre



marking generational algorithms are b est understo o d

mental marksweep collection of older data

in terms of sets of ob jects which are garbage collected

at dierent frequencies Each of these age sets in

Tracking Intergenerational Refer

turn can b e divided into shaded and unshaded sets

ences

for the purp oses of the tracing traversal

The XeroxPARC PCR Portable Common Run

Generational collectors must detect p ointers from

time garbage collector is a generational marksweep

older to younger generations requiring a write barrier

collector with a header eld p er ob ject indicating the

similar to that used by some incremental tracing algo

ob jects age Ob jects of dierent generations maybe

rithms That is a program cannot simply store p oin

allo cated in the same page although the system uses a

ters into heap ob jectsthe compiler andor hardware



heuristic to minimize this for lo cality reasons When

must ensure that each p otential store is accompanied

garbage collecting only young data the PCR collec

bychecking or recording op erations to ensure that if

tor scans the ro ot set and traverses ob jects whose age

any p ointers to younger generations are created they

elds signify that they are sub ject to collection This

can b e found later by the collector Typically the

tracing continues transitively in the usual waytracing

compiler emits additional instructions along with each

all reachable young ob jects The generational write

potential p ointer store instruction to p erform the re

barrier uses pagewise dirtybitsmaintained by virtual

quired write barrier op erations

memory access protection techniques as will b e ex

For many systems this may b e the largest cost



of generational garbage collection For example in

Pages containing old ob jects are not used to hold young

ob jects unless they are more than half empty this tends to



David Ungar p ersonal communication avoid gratuitously mixing older and younger data

a mo dern Lisp system with an optimizing compiler micro co de detected and dereferenced the forwarding

p ointer stores typically account for one p ercentor p ointers automatically making the indirections invis

so of the total instruction count SH Zor If ible to running programs

each p ointer store requires twenty instructions for the

When garbage collecting a particular generation it

write barrier p erformance will b e degraded byroughly

was therefore only necessary to use the entry table as

twenty p ercent Optimizing the write barrier is there

an additional set of ro ots rather than actually nding

fore very imp ortanttooverall garbage collector p er

the p ointers in other generations and up dating them

formance and signicantly faster write barriers have

A somewhat dierentscheme was used in the TI Ex

b een develop ed Because of their key role in the overall

plorer system rather than having a table of incoming

p erformance of a generational collector we will discuss

p ointers p er generation a separate table of outgoing

write barriers in some detail

p ointers was maintained for each combination of older

Many write barrier techniques have b een used with and younger generation So for example the oldest

dierent p erformance tradeos on dierenthardware generation had a separate exit table for eachyounger

generation holding its indirected p ointers into each

and for dierent languages and language implementa

of those generations This allowed the scanning of

tion strategies

tables to b e more precise ie only scanning the p oin

Some systems use a collection strategy that is al

ters relevant to the generations b eing collected and

most generational but without using a write barrier

made it simpler to garbage collect the table entries

to get some of the b enet of generational collection

themselves

Rather than keeping track of p ointers from old data

Unfortunately the indirection table schemes were

to young data as they are created old data are sim

not fast or ecient esp ecially on sto ck hardware with

ply scanned for such p ointers at collection time This

no supp ort for transparently dereferencing forwarded

requires more scanning work and has worse lo cal

p ointers As with Bakerstyle incremental copying

ity than true generational collection but it maybe

the cost of common p ointer op erations is greatly in

considerably faster than tracing all reachable data

creased if p ointers must b e checked for indirections

Scanning is typically several times faster than trac

Recent generational collectors have therefore avoided

ing and has strong spatial lo cality of reference The

indirections and allowed p ointers directly from any

Stratied Garbage Collector for Austin Kyoto Com

generation into any other Rather than requiring such

mon Lisp an enhancementofKyoto CommonLisp

p ointers to b e lo calized they simply keep trackof

uses suchascheme to avoid the overhead of a write



where such p ointers are so that they can b e found at

barrier Bartlett has used a similar technique in a

collection time W e refer to suchschemes as pointer

collector designed to work with otheshelf compilers

recording schemes b ecause they simply record the lo

which do not emit write barrier instructions Bar

cation of p ointers

Indirection Tables

Ungars Rememb ered Sets

The original generational collectors for Lisp Machines

Ungars Generation Scavenging collector used an ob

LH used sp ecialized hardware andor micro co de

jectwise p ointerrecording scheme recording which

to sp eed up the checks for p ointers into younger gen

ob jects had p ointers to younger generations stored

erations and the p ointers that were found were in

into them At each p otential p ointer store the write

directed by a micro co ded routine through an entry

barrier would check to see if an intergenerational

tableNopointers directly into a younger generation

p ointer was b eing createdbychecking to see if the

were allowed only p ointers to a table entry holding

stored value was in fact a p ointer p ointed into the

the actual p ointer Each generation had its own entry

young generation and was b eing stored into an ob

table holding the actual p ointers to ob jects When the

ject in the old generation If so the storedinto ob ject

mutator executed a store instruction and attempted

was added to the remembered set of ob jects holding

to create a p ointer into a younger generation the store

suchpointers if it was not already there Each ob ject

instruction trapp ed to micro co de Rather than actu

had a bit in its header saying whether it was already in

ally creating a p ointer directly into the younger gen

the rememb ered set so that duplicate entries could b e

eration an invisible forwarding pointer was created

avoided This makes the collectiontime scanning cost

and stored instead The Lisp Machine hardware and

dep endentonthenumb er and size of the storedinto



ob jects not the actual numb er of store op erations William Schelter p ersonal communication

In the usual case this scheme worked quite well actual store op erations

for a Smalltalk virtual machine Unfortunatelyin

Unfortunately this scheme would b e considerably

the worst case this checking and recording incurred

slower if implemented on sto ck hardware with larger

tens of instructions at a p ointer store and the relative

pages and no dedicated hardware for page scanning or

cost would have b een appreciably higher in a higher

write checking It also requires the ability to scan an

p erformance language implementation

arbitrary storedinto page from the b eginning which

is more complicated on standard hardware than on the

A signicant drawbackofthisscheme was that the

Symb olics machines in whichevery machine word had

rememb ered set of ob jects must b e scanned in its en

extra bits holding a typ e tag

tirety at the next garbage collection whichcouldbe

More recently virtual memory dirty bits havebeen

exp ensivefortwo reasons Some of the checking cost

used as a coarse write barrier Sha The underlying

was rep eated b ecause a storedinto lo cation might

b e stored into many times b etween collections b eing virtual memory system typically maintains a bit p er

checked each time and b ecause storedinto ob jects page that indicates whether the page has b een dirtied

changed in anyway since it was last written out to

had to b e scanned again at collection time Worse

disk Most of the work done to maintain these bits

very large ob jects might b e stored into regularlyand

is done in dedicated memory hardware so from the

have to b e scanned in their entiretyateach collec

p oint of view of the language implementor it is free

tion The latter was observed to cause large amounts

of scanning workand even thrashingfor some pro Unfortunately most op erating systems do not provide



facilities for examining dirty bits so op erating system

grams running in Tektronix Smalltalk

kernel mo dications are required Alternatively vir

tual memory protection facilities can b e used to simu

Page Marking

late dirty bits by writeprotecting pages so that writes

to them can b e detected by the hardware and invoke

Mo ons Ephemeral Garbage Collector for Symbol

a trap handler BDS this technique is used in the

ics Lisp machines used a dierent p ointerrecording

XeroxPortable Common Runtime garbage collector

scheme Mo o Rather than recording which ob

The trap handler simply records that the page has

jects had intergenerational p ointers stored into them

b een written to since the last garbage collection and

it recorded which virtual memory pages were stored

unprotects the page so that program execution can

into The use of the page as the granularity of record

resume In the PCR collector ob jects of dierent

ing avoided the problem of scanning very large ob

generations may reside in the same page so when a

jects although it increased costs for sparse writes to

dirtied page is scanned at collection time ob jects of ir

small ob jects b ecause the entire page would still b e

relevant generations are skipp ed As with the App el

scanned The scanning cost was not large on the Sym

Ellis and Li collector the use of virtual memory pro

b olics hardware b ecause it had sp ecial tag supp ort

tections makes it imp ossible to satisfy hard realtime

to make generation checking very fast and b ecause

requirements and may incur signicant trap overhead

pages were fairly small Much of the writebarrier was

scanning costs may also b e relatively high if write lo

also implemented directly in hardware rather than

cality is p o or As we will explain in Sect however

by additional instructions accompanying each p ointer

this kind of write barrier has advantages when deal

write so the time cost at each p ointer store was small

ing with compilers that are unco op erative and do not

In this system the information ab out p ointers into

emit write barrier instructions

younger generations is held in a pagewise table This

has the advantage that the table implicitly eliminates

duplicatesa page may b e stored into anynumber of

Word marking

times and the same bit set in the table but the page

In adapting Mo ons collector for standard hardware

will only b e scanned once at the next garbage collec

Sobalvarro avoided the cost of scanning large pages by

tion This duplicate elimination is equivalenttoUn

the use of a word marking system which used a bitmap

gars use of bits in ob ject headers to ensure uniqueness

to record which particular machine words of memory

of entries in the rememb ered set The time required

actually had p ointers stored into them Sob This

to scan the recorded items at a garbage collection is

avoided the need to b e able to scan an arbitrary page

therefore prop ortional to the numb er of storedinto

for p ointers b ecause the lo cations of the relevant p oin

pages and to the page size but not the number of

ters were stored exactly



Patrick Caudill p ersonal communication Sobalvarro also optimized the scheme for standard

hardware by making the write barrier simplermost ing collector AEL In Wilsons scheme the bit

writebarrier checking was eliminated and deferred corresp onding to a card is left set if the card contains

until collection time The storedinto lo cations are a p ointer into a younger generation Such cards must

checked at collection time to see whether the stored b e scanned again at the next garbage collection even

items are intergenerational p ointers While this is less if they are not stored into again

precise than Mo ons or Ungars checking and may

Ungar Chamb ers and Holzle have further rened

cause more words to b e examined at collection time

this cardmarking scheme in a garbage collector for

it also has the b enet that implicit duplicate elimina

the Self language Rather than using a table of bits

tion is p erformed rst and the other checks only need

to record storedinto cards it uses a table of

to b e p erformed once p er storedinto word

even though only a bit is needed a byteisusedbe

The drawback of Sobalvarros scheme is that for a cause stores are fast on most architectures This

allows the write barrier to consist of only three instruc

reasonably large heap the table of bits is fairly large

ab out three p ercent of the total size of memoryScan tions which unconditionally store a byte into a byte

ning this table would b e relatively exp ensiveifitwere array This comes at an increase in scanning costs

represented as a simple linear arrayofbits Stor b ecause the byte array is eight times larger than a

ing individual bits would also make the write bar bit array but for most systems the decrease in write

barrier cost is well worth it Cha DMH Holzle

rier exp ensive on some architectures where subword

has further rened this by relaxing the precision of the

write instructions are slow or must b e synthesized us

ing several other instructions Sobalvarros solution write barriers recording bringing the cost p er store

to this was to use a sparse twolevel representation wo instructions on a Sun SPARC pro ces down to t

of the table this incurred an additional writebarrier sor with a slight increase in scanning costs H As

cost b ecause op erations on the sparse array are signif we will explain in the next section card marking can

also b e combined with store lists to reduce scanning

icantly slower than op erations on a contiguous array

of cards which hold p ointers into younger generations

but arent stored into again

Card Marking

An alternative to marking pages or words is to con

Store Lists

ceptually divide memory into intermediatesized units

The simplest approach to p ointer recording is simply

called cards Sob The use of relatively small cards

to record each storedinto address in a list of some

has the advantage that a single store op eration can

sort This might b e a linked list or a preallo cated

only cause a small amount of scanning at collection

array that has successive lo cations stored into much

time making the cost smaller than pagemarking on

like pushing items on a lineararray stack

average As long as the cards arent extremely small

App els very simple lines of C co de gen

the table used for recording storedin to cards is much

erational collector for Standard ML of New Jersey

smaller than the corresp onding table for word mark

Appb uses such a list which is simply scanned at

ing For most systems this makes it feasible to rep

each collection with go o d p erformance for typical ML

resent the table as a contiguous linear arraykeeping

programs

the write barrier fast

Simple store lists have a disadvantage for many

One problem of using card marking on standard

language implementations however in that they im

hardware is that it requires that cards b e scanned

plement bags multisets of storedinto lo cations not

for p ointers even if the card do es not b egin with the

sets That is the same lo cation may app ear in the list

b eginning of an ob ject Wilsons Opp ortunistic Gar

many times if it is frequently stored into and the gar

bage Collector addresses this bymaintaining a cross

bage collector must examine each of those entries at

ing map recording which cards b egin with an un

collection time The collectiontime cost is therefore

scannable part of an ob ject WM In the case of a

prop ortional to the number of pointer stores rather

card thats not scannable from the b eginning the map

than to the numb er of storedinto lo cations This

can b e used to nd a previous card that is scannable

lack of duplicate elimination can also lead to exces

lo cate an ob ject header on that card and skip forward

sive space usage if p ointer stores are very frequent

ob ject byobjectuntil it nds the headers of the ob

For ML this is generally not a problem b ecause side

jects on the card to b e scanned This is a renement

eects are used relatively infrequently

of the crossing maps used by App el Ellis and Li to

supp ort pagewise scanning in their incremental copy Moss et al have devised a variation of the store list

technique which allows a b ounded numberofentries The actual cost of write barriers is somewhat con

in a sp ecial kind of list called a static store buerand troversial Several studies have measured write bar

calls a sp ecial routine when this buer is full The sp e rier overheads for interpreted systems eg Ung

cial routine pro cesses the list using a fast hash table HMS making them hard to relate to high

to remove duplicates This technique HMSreduces p erformance systems using optimizing compilers

space costs and avoids doing all of the storelist pro Mo oH It may b e more reasonable to combine

cessing at garbage collection time but it do es not have measurements of highp erformance systems with an

the same duplicate elimination advantage as the table analytic understanding of garbage collector costs to

based schemesduplicates are eliminated but only infer what the approximate cost of a wellimplemented

after theyve already b een put in the store list Each collector would b e for a wellimplemented systems

p ointer store creates an entry in the store list which

As mentioned earlier compiled Lisp systems app ear



must b e fetched and examined later

to execute roughly one p ointer store into a heap ob

Hosking and Hudson HHhave combined some

ject p er hundred instructions a cardmarking write

of the b est features of card marking and store

barrier should only slow such a system down by ab out

lists Pointer stores are recorded by a cardmarking

four or ve p ercent executing two or three instruc

write barrier in the usual way but when a card

tions at the time of each p ointer store plus a smaller

is scanned the individual lo cations containing p oin

cardscanning cost at each collection For many pro

ters into younger generations are recorded This al

grams with little live data or lifetime distributions

lows subsequent collections to avoid rescanning whole

favorable to generational collection the tracing and

cards if they are not stored into again

reclamation cost will b e similarly low and the cost of

garbage collection should b e under ten p ercent

This gure can vary considerablyhoweverand of

Discussion

ten upwardbased on the workload typ e informa

In cho osing a write barrier strategy for a generational

tion in the programming language data representa

collector it is imp ortanttotakeinto accountinterac

tions and optimizations used by the compiler Sev

tions with other asp ects of the systems implementa

tation choices will b e considered in later eral implemen

tion For example the XeroxPARC mostlyparallel

sections

collector uses virtual memory techniques to imple

If a compiler generates faster co de the write bar

ment pagewise dirty bits partly b ecause it is designed

rier cost may b ecome a larger fraction of the smaller

to work with a variety of compilers whichmay not co

overall running time On the other hand the compiler

op erate in the implementation of the write barrier

may also b e able to reduce write barrier costs by infer

eg otheshelf C compilers do not emit writebarrier

ring that some p ointer recording is redundant or that

instructions along with eachpointer store In other

some dynamicallytyp ed values will never b e p ointers

systems esp ecially systems with static t yp e systems

Sect

andor typ e inference capabilities the compiler can

Unfortunately the cost of write barriers in conven

signicantly reduce the cost of the write barrier by

tional imp erative staticallytyp ed systems is p o orly

omitting the write barrier checks that can b e done

understo o d Static typ e systems generally distinguish

statically by the compiler

p ointer and nonp ointer typ es whichmay help the

Another issue is whether realtime resp onse is re

compiler but typ e declarations may improve other

quired Tablebased schemes such as page marking

areas of the systems p erformance even more mak

and card marking may make it dicult to scan the

ing the relative p erformance of the garbage collector

recorded p ointers incrementally in realtime Store

worse On the other hand conventional statically

lists are easier to pro cess in real time in fact the work

and stronglytyp ed languages often havelower overall

done for the write barrier may b e similar to the work

rates of heap ob ject allo cation and mutation reducing

done for an incremental up date tracing technique al

b oth the write barrier and tracing costs as a fraction

lowing some of the costs to b e optimized away WJ

of overall program running time

by combining the two write barriers

Programming style can also have a signicantim



Ungars technique of using ag bits to signify whether an

pact on writebarrier costs In many languages de

ob ject is already in a set could conceivably b e used but it

signed for use with garbage collection allo cation rou

is probably not worthwhile to use a bit p er memory word as

tines are primitives whichtakevalues as arguments

opp osed to a bit p er ob ject header unless there is hardware

supp ort to make it fast and initialize elds of ob jects In other languages

the programmer may b e exp ected to initialize a new recently as well

ob jects elds explicitly In the former case the lan

guage implementation can omit the write barrier for

Pitfalls of Generational Collection

initializing writes to p ointer elds but in the latter

Generational collection attempts to improve p erfor

it generally cannot It is not clear whether this is a

mance heuristically taking advantage of characteris

problem programs in such languages maytypically

tics of typical programs naturally this cannot b e suc

havelower rates of heap allo cation and fewer p ointer

cessful for all programs For some programs genera

elds in ob jects

tional collection will fail to improve p erformance and

may decrease it

The Generational Principle Revis

ited

The Pig in the Snake Problem

One problematic kind of data for generational collec

Generational garbage collectors exploit the fact

tion is a cluster of relatively longlived ob jects which

that heapallo cated ob jects are typically shortlived

are created at ab out the same time and p ersist for

Longerlived ob jects are collected less often on the

a signicant p erio d This often o ccurs b ecause data

assumption that the minority ob jects whichlivefora

structures are built during one phase of execution

signicant p erio d are likely to live a while longer still

then traversed during subsequent phases of execution

This simple idea is widespread but it is not obvious

and b ecome garbage all at once eg when the ro ot

that it is true in any strong sense Hay Baka it is

of a large tree b ecomes garbage This kind of data

also unclear that it must b e true to make generational

structure will b e copied rep eatedlyuntil the advance

collection worthwhile in practice

ment p olicy succeeds in advancing it to a generation

Consider a system in which this prop ertydoesnot

large enough to hold the whole cluster of related data

holdie the probability that an ob ject will die at

ob jects This increases traversal costs rst in the

a particular momentis not correlated with its age

youngest generation then in the next generation and

Lifetimes distributions in such a system may still b e

so on like the bulge in a snakeadvancing along the

heavily skewed toward shortlived ob jects A simple



snakes b o dy after a large meal Until the bulge is

example of such a system is one with a random expo

advanced to a generation that will hold it untilitdies

nential decay prop erty where a xed fraction of the

the collectors age heuristic will fail and cause addi

ob jects die in a xed p erio d of time much likethe

tional tracing work

halflife prop erty of radioactive isotop es

The piginthesnake problem therefore favors the

In such a system the lifetime distribution may ap

use of relatively rapid advancement from one genera

p ear ideally suited to generational collection b ecause

tion to the next whichmust b e balanced against the

young ob jects die young On closer examination how

disadvantage of advancing to o much data and forcing

ever it turns out that picking any subset of the ob

the next generation to b e collected more often than

jects will yield an equal prop ortion of live and dead

necessary Ungar and Jackson vary the advancement

ob jects over a given p erio d of time In that case any

p olicy dynamically in an attempt to advance large

advantage of generational collection would b e due to

clusters of data out of the youngest generation b efore

restricting the scop e of collection not to a higher mor

they incur to o much copying cost this app ears to work

tality rate among the ob jects sub ject to collection

well for most programs but may cause the older gen

This analogy app ears to undermine the notion that

eration to ll rapidly in some cases Wilson advo cates

generational collection is a go o d idea but in fact it

the use of more generations to alleviate this problem

may not Even under an exp onential decaymodel

along with careful opp ortunistic scheduling of gar

generational collection may improve locality despite

bage collections in an attempt to collect when little

the fact that it wont directly improve algorithmic

data is live WM Hayess key ob ject opp ortu

eciencyreclaiming and reusing recentlyallo cated

nism renes this by using changes in the ro ot set to

space improves lo cality compared to reusing memory



that has b een idle for a signicant p erio d Travers

Incidentally this term was inspired by a similar descrip

tion of the baby b o om generations use of resources through

ing live ob jects is likely to b e cheap er if the ob jects

its life cyclerst requiring more kindergartens then elemen

were allo catedhence touchedrecently reclaiming

tary scho ols and so on and ultimately causing an increase in

and reusing space of garbage ob jects is also likely to

demand for retirement homes and funeral directors Jon L

be cheap er b ecause that space will havebeentouched White p ersonal communication

inuence garbage collection p olicy Hay These rounding prop erties some algorithms maycompute

techniques app ear to b e b enecial but more exp eri incorrect answers or even fail to terminate Simulat

mentation is needed in larger systems ing these characteristics in software is extremely ex

p ensive The alternative is to sacrice bits from the

exp onent part of the hardwaresupp orted format and

Small Heapallo cated Ob jects

restrict the range of numb ers that can b e represented

One of the assumptions b ehind the generational

erhead This intro duces several instructions extra ov

heuristic is that there will b e few p ointers from old ob

in converting b etween the hardwaresupp orted format

jects to young ones some programs may violate this

and the tagged format Wil Cha

assumption however One example is programs that

Using arrays with typ ed elds intro duce irregulari

use large arrays of p ointers to small heapallo cated

ties into dynamicallytyp ed systems eg most arrays

oating p ointnumb ers In many dynamicallytyp ed

can hold any kind of data but some cant but this

systems oating p ointnumb ers do not t in a ma

strategy is easy to implement eciently and is fre

chine word and in the general case must b e repre

quently used in Lisp systems

sented as tagged p ointers to heapallo cated ob jects

Unfortunately neither of these solutions xes the

Up dating the values in an array of oating p ointnum

problem in the general case b ecause oating p oint

b ers may actually cause new oating p oint ob jects to

numb ers are not the only p ossible data typ e that can

b e allo cated and p ointers to them to b e installed in

cause this problem Consider complex numbersorD

the array If the array is large and not shortlived

p oint ob jects its unlikely that such ob jects are going

it is likely to reside in an older generation and each

to b e crammed intoamachine word Similarly the

up date of a oating p ointvalue will create a young

problem do esnt only o ccur with arraysany aggre

ob ject and an intergenerational p ointer If a large

gate data structure such as a binary tree can exhibit

numb er of elements of the array are up dated eg

the same problem In such cases the programmer may

by a sequential pass through the whole array each

cho ose to use a dierent representation eg parallel

oating p ointnumberobjectislikely to live a sig

arrays of real and imaginary comp onents rather than

nicant p erio d of timelong enough to b e traced by

an array of complex numbers to avoid unnecessary

several youngestgeneration collections and advanced

garbage collection overhead

to an older generation Thus these large numb ers of

intermediatelifetime numb er ob jects will cause con

Large Ro ot Sets

siderable overhead b oth in the write barrier and in

tracing In the case of systems using Ungars original

Another p otential problem with generational collec

rememb ered set scheme the rememb ered set scanning

tion is the handling of ro ot sets This is essentially

costs maybevery large if large ob jects hold intergen

the same problem that o ccurs for incremental collec

erational p ointers

tion Generational techniques reduce the scop e of

tracing work at most collections but that do es not in Toavoid these costs two approaches are common

the use of short oatingp oint formats which can b e

itself reduce the numberofrootsthatmust b e scanned

at each collection If the youngest generation is small represented as tagged immediate values and the use of

and frequently collected and global variables and the arrays with typ ed elds whichcancontain raw oat

stack are scanned each time that may b e a signicant ing p ointvalues rather than p ointers to heapallo cated

cost of garbage collection in large systems Large sys ob jects

ve tens of thousands of global or mo dule tems mayha

The problem with short oating p ointvalues is that

variables

they maynothave the desired numerical characteris

An alternative is to consider fewer things to b e part tics In the rst place numb ers short enough to t in

of the usual ro ot set and treat most variables like heap amachine word may not have sucient precision for

ob jects with a write barrier Stores into suchobjects some applications In addition some bits must b e sac

that may create p ointers into young generations are riced for the tag eld further reducing the precision

then recorded in the usual way so that the p ointers or the range of the numb er The simplest scheme is

can b e found at collection time The problem with to sacrice bits of precision by removing bits from the

this approach is that it may signicantly increase the mantissa but this has the problem that the resulting

cost of stores into lo cal variables This cost can b e number does not map well onto typical oatingp oint

reduced if the compiler can determine typ es at compile hardware which has very carefully designed precision

time and omit the write barrier co de for nonp ointer and rounding characteristics Without the exp ected

writes eective Alternatively the programmer maysupply

weaker assurances at the risk of a failure to meet a

Treatment of lo cal variables is more complicated

in languages that supp ort closurespro cedures which realtime deadline if an assurance is wrong The for

can capture lo cal variable binding environments forc mer reasoning is necessary for missioncritical hard

ing lo cal variables to b e allo cated on the garbage real time systems and it is necessarily application

sp ecic The latter nearrealtime approachissuit

collected heap rather than a stack If all variable

able for many other applications suchastypical inter

bindings are allo cated on the heap this requires that

p ointer stores into lo cal variables use a write barrier active audio and video control programs where the

this may signicantly increase write barrier costs for p ossibility of a reduction in resp onsiveness is not fa

many programs where sideeects to lo cal variables tal

are relatively common For such systems it is desir

When it is desirable to combine generational and in

able to have compiler optimizations whichavoid the

cremental techniques the details of the generational

heap allo cation of lo cal variables that are never ref

scheme may b e imp ortant to enabling prop er incre

erenced from closures and hence can b e allo cated on

mental p erformance For example the Symb olics

a stackorinregistersSuchtechniques are valuable

LMI and TI Lisp machines collectors are the b est

in themselves for improving the sp eed of co de op erat

known realtime generational systems but the in

ing on those variables Kra as well as for reducing

teractions b etween their generational and incremental

overall heap allo cation

features turn out to have a ma jor eect their worst

In most application programs ro ot set scanning

case p erformance

time is negligible b ecause there are only a few thou

Rather than garbage collecting older generations

sand global or mo dule variables In large integrated

slowly ov er the course of several collections of younger

programming environments however this ro ot set

generations only one garbage collection is ongoing

maybevery large to avoid large amounts of scan

at any time and that collection collects only the

ning at every garbage collection it may b e desirable

youngest generation or the youngest two or or the

to use a write barrier for some variables so that only

youngest three etc That is when an older genera

the storedinto variables are actually scanned at col

tion is collected it and all younger generations are ef

lection time It is easy to imagine a large integrated

fectively regarded as a single generation and garbage

development system which uses a write barrier for

collected together This makes it imp ossible to b ene

most variables but which can also generate stripp ed

t from younger generations generational eect while

down standalone application co de for which the ro ot

garbage collecting older generations in the case of a

set is scanned atomically to avoid the writebarrier

full garbage collection it eectively degenerates into a

cost when programs are distributed

simple nongenerational incremental copying scheme

During such largescale collections the collector

Realtime Generational Collection

must op erate fast enough to nish tracing b efore

Generational collection can b e combined with incre

the available free space is exhaustedthere are no

mental techniques but the marriage is not a particu

younger generations that can reclaim space and re

larly happy one WJ Typically realtime garbage duce the safe tracing rate Alternatively the collection

collection is oriented toward providing absolute worst sp eed can b e kept the same but space requirements

case guarantees while generational techniques im will b e much larger during large scale collections For

prove exp ected p erformance at the exp ense of worst

programs with a signicant amountoflonglived data

case p erformance If the generational heuristic fails

therefore this scheme can b e exp ected to have system

and most data are longlived garbage collecting the atic and p erio dic p erformance losses even if the pro

young generations will b e a waste of eort b ecause gram has an ob ject lifetime distribution favorable to

no space will b e reclaimed In that case the fullscale generational collection and the programmer can pro

garbage collection mustproceedjustasfastasifthe

vide the appropriate guarantees or assurances to the

collector were a simple nongenerational incremental

collector Either the collector must op erate at a much

scheme

higher sp eed during full collections or memory usage

Realtime generational collection may still b e desir will go up dramatically The former typically causes

able for many applications however provided that the ma jor p erformance degradation b ecause the collector

programmer can supply guarantees ab out ob ject life uses most of the CPU cycles the latter either requires

times to ensure that the generational scheme will b e very large amounts of memorynegating the advan

tage of generational collectionor incurs p erformance copy the data structure so that it can make a lo cal de

degradation due to virtual cision as to when its memory can b e reclaimed Such

distortions make it extremely dicult to compare the

lo cality of garbage collected and nongarbagecollected

Lo cality Considerations

systems directlyTypical studies that attempt to do

so compare use programs written without garbage col

Garbage collection strategies have a ma jor eect on

lection in mind in b oth their original form and with

the way memory is used and reused naturallythis

explicit deallo cation replaced by garbage collection

has a signicant eect on lo cality of reference

This implicitly hides the eects of distorted program

ming style b ecause the garbage collected version of

Varieties of Lo cality Eects

the program inherits the distortions

The second category of lo cality eectslo calityof

The lo cality eects of garbage collection can b e

the garbage collection pro cess itselfis the one that

roughly divided into three classes

usually comes immediately to mind It is sometimes

Eects on programming style whichchange the

quite signicant although it may b e the least imp or

way data structures are created and manipulated

tant of the three For a full garbage collection all

live data must b e traced and this can interact very

Direct eects of the garbage collection pro cess it

p o orly with conventional memory hierarchies Most

self and

live ob jects will b e touched only once during tracing

Indirect eects of garbage collection esp ecially

so there will b e little temporal lo cality ie few re

patterns of reallo cation of free memory and clus

p eated touches to the same data over a short p erio d

tering of live data

of time On the other hand there may b e considerable

spatial lo calitytouches to several dierent nearbyar

The rst of theseeects of programming style

eas of memory eg within the same virtual memory

is p o orly understo o d In systems with an ecient

page over short p erio ds of time

garbage collector programmers are likely to adopt

The third category of eects is probably the most

a programming style that is appropriate to the task

imp ortant although its signicance is not widely ap

at hand often an ob jectoriented or functional ap

preciated The strategy for memory real location im

proach New data ob jects will b e dynamically allo

poses localitycharacteristics on the way memory is

cated to hold newly computed data and the ob jects

hed rep eatedly even if the objects themselves die touc

will b e discarded when the data are no longer interest

quickly and are therefore never touched againThisis

ing Ideally the programmer expresses the computa

of course one of the main reasons for generational gar

tion in the most natural form with applicationlevel

bage collectionto reuse a small area of memory the

data mapping fairly directly onto languagelevel data

youngest generation rep eatedly by allo cating many

ob jects

shortlived ob jects there It is also the reason that

In contrast explicit allo cation and deallo cation of

activation stacks typically have excellent lo calityof

ten encourage a distorted programming style where

referencenear the top of the stack memory is reused

the programmer reuses languagelevel ob jects to rep

very frequently by allo cating manyvery shortlived

resent conceptually distinct data over time simply b e

activation records there A generational garbage col

cause its to o exp ensive to deallo cate an ob ject and

lector can b e seen as imp osing a roughly stacklike

reallo cate another one Similarly in systems with in

memory reuse pattern on a heap by exploiting the

ecient garbage collection suchasmany older Lisp

fact that the lifetime distributions are roughly simi

implementations programmers often resort to similar

lar

languagelevel ob ject reuse for example destructively

sideeecting list structures to avoid allo cating new As we p ointed out earlier a simple non

list elements or allo cating a single large array used to generational garbage collector has very poor locality

hold several sets of data over time Explicit deallo ca if allo cation rates are high simply b ecause to o much

tion often leads to distortions of the opp osite variety memory is touched in any reasonable p erio d of time

As Ungar has p ointed out Ung simply paging out as wellmapping a single conceptual data ob ject onto

the resulting garbage to make ro om in memory for multiple languagelevel ob jects Programmers mayal

new data would typically b e an unacceptable cost lo cate many extra ob jects to simplify explicit deallo ca

in a highp erformance system A generational gar tion Anymoduleinterested in a data structure may

bage collector restricts the scop e of this lo calitydis of keeping everything in RAM For the latter in con

aster to a manageable areathe youngest generation trast lo cality at the level of virtual memory maybe

or two This allows the real lo calitycharacteris crucial

tics of rep eated access to longerlived data to show

In a nonmoving collector the situation is some

up in the older generations Because of this eect

what dierentthe space cost of the youngest gen

the overall lo calitycharacteristics of garbage collected

eration also dep ends on the degree of fragmentation

systems app ear to b e roughly comparable to that of

and how data from various generations are intermin

nongarbage collected systemsthe youngest genera

gled in memory The extent of these problems is not

tion lters out most of the heapallo cated data that

well understo o d but they may b e less serious than is

might b e stackallo cated in other languages

widely b elieved Hay BZ Bo e

Direct comparisons are dicult however b ecause

of the large numberofdesignchoices involved in b oth

Lo cality of Allo cation and Short

garbage collectors and explicit heap management

lived ob jects

With copying garbage collection there is obviously

another indirect lo cality eectbymoving data ob

As noted ab ove the pattern of allo cation often has

jects around in memory the collector aects the lo cal

the most imp ortant eect on lo cality in a simple col

ity of the programs own accessesthat is it aects

lector In a generational copy collector this eect

the mapping b etween a programs logical references

is much reduced from the p oint of view of virtual

to languagelevel data and the resulting references to

memorythe pages that makeuptheyoungest gener

particular memory lo cations

ation or two are reused so frequently that they simply

This kind of eect is not restricted to copying col

stay in RAM and eectively incur a xed space cost

lection Noncopying collectors also have considerable

On the other hand the frequent reuse of the whole

latitude in deciding how to map languagelevel ob jects

youngest generation mayhave a deleterious eect on

onto the available free memory When the program

the next smaller level of the memory hierarchyhigh

requests a piece of memory of a given size the allo

sp eed CPU caches The cycle of memory reuse has

cator is free to return any suitablysized piece of free

b een made much smaller but if the cycle do es not t

memory It is unclear what rules should govern this

in cache the cache will suer extra misses in much the



decision

same way that main memory do es for a simple collec

The imp ortance of dierentlocality eects is nat

h misses are not as dramatic as tor The eects of suc

urally dep endent on the relative size of the data used

those for virtual memoryhowever b ecause the ratio

by a system and the main memory of the hardware

of cache to main memory sp eeds is not nearly as large

it runs on For many systems the full system heap

as the ratio of main memory to disk sp eeds Also in

ts in main memory including an editor a browser

copying collection at least the pattern of reallo cation

a compiler and application co de and data the nor

is so strongly sequential that misses can b e reduced

mal mo de of such systems is to hav e enough RAM

considerably by simple prefetching strategies or even

that programs typically do not page at all In other

just the use of large blo ck sizes On current pro cessors

systems however the oldest generation or twoistoo

the cost app ears to b e no more than a few p ercentof

large to t in typical main memories either b ecause

overall run time even when allo cation rates are rela

the system itself includes large complex software or

tively high Faster pro cessors may suer more from

b ecause application data or co de are very large

this eects however if cachetomemory bandwidths

In a copying collector there is therefore a binary

do not scale with pro cessor sp eeds or if bus bandwidth

distinction b etween generations that are eectively

is at a premium as in sharedbus multipro cessors

memoryresident and those that are so large they must

Zorn has shown that relatively large caches can b e

truly rely on virtual memory caching For the former

quite eective for generationally garbagecollected sys

whichmay include the whole system lo calityofalloca

tems Zor Wilson has shown that the relative sizes

tion and collection are essentially irrelevant to virtual

of the cache and the youngest generation are esp ecially

memory p erformancethe space cost is the xed cost

imp ortant the particulars of the cache replacement

p olicy may b e imp ortantaswell due to p eculiarities



This o ccurs in explicit heap management systems as well of



in the access patterns due to reallo cation WLM

course but it has not b een systematically studied there either

Most studies of explicit techniques have studied fragmentation



and CPU costs but have ignored eects on caching in hierar The eect of asso ciativityisvery dep endent of the ratio of

chical memories cache size to youngest generation size and lower asso ciativities

Several researchers have suggested optimizations to of the fact that while environments and continuations

avoid the fetching of garbage data in areas ab out to can b e captured the vast ma jority of them are not

b e reallo cated this can cut cachetomemory band and neednt b e put on the heap

width requirements nearly in half Ko opman et al

rst illustrated this in a lan

Lo calityofTracing Traversals

guage implementation based on combinator reduction

KLS and showed that some conventional cache

The lo cality of GC tracing traversals is dicult to



designs can achieve this eect Tarditi and Diwan

study in isolation but it app ears to havesomeobvi

show that the same eect can b e achieved in a more

ous characteristics Most ob jects in the generations

conventional language implementation using genera

b eing collected will b e touched exactly once b ecause

tional garbage collection and demonstrate the value

most ob jects are p ointed to by exactly one other ob

of a cachetomemory interface supp orting high write

ject Rov DeTtypical data structures do not

rates DTM

contain a large numb er of cycles and many cycles are

The dierence in lo calitybetween moving and non

small enough to have little impact on traversal lo cal

moving collectors do es not app ear to b e large at the

ity

scale of highsp eed cache memoriesthe typ e of col

Given this the main characteristic of the traversal

lector is not as imp ortant as rate of allo cation and the

is to exhaustively touch all live data but for the most

size of the youngest generation ie howquickly mem

part very briey There is very little temporal lo cality

ory is used reclaimed and reused in the usual case

of reference ie rep eated touching of the same data

Zor shows that a noncopying collector can have

Most ob jects are referenced by exactly one p ointer at

b etter lo calitythana copying collector using semi

anygiven time and will therefore only b e reached once

spaces simply b ecause it only needs a single space

by the tracing traversal The ma jor lo calitycharac

p er generation WLM shows that Ungars tech

teristic that can b e exploited is the spatial lo calityof

nique of using a sp ecial creation area can yield similar

data structures layouts in memoryif closelylinked

b enets at the level of large caches just as it do es

ob jects are close to each other in memorytouching

at the level of virtual memory WLM also shows

one ob ject may bring several others into fast memory

that the allo cation of variable binding environments

shortly b efore they are traversed

and activation records on the heap can greatly exac

Exp erience with the Xerox PCR system indicates

erbate cachelevel lo cality problems due to a y oungest

that even in a noncopying collector ie without

generation that wont t in the cache This is b orne

compaction there is useful lo cality in ob jects initial

out by simulation studies of Standard ML of New Jer

layouts in memory related ob jects are often created

sey DTM on highp erformance pro cessors It sug

andor die at ab out the same time so simple allo ca

gests that activation information and binding environ

tion strategies result in useful clustering in memory

ments should b e allo cated on a stack using compile

The PCR collector enhances this by sorting its ro ot

time analysis KKR orinasoftware stackcache

set b efore traversing data so that ro ot p ointers into

CHO WM Kel Software stackcaches can

the same area are traversed at ab out the same time

b e used in languages like ML and Scheme where bind

This has b een observed to signicantly reduce paging



ing environments may b e captured by rstclass pro ce

during full garbage collections

dures andor activation chains may b e captured with

In a copying collector it would seem that traver

rstclass continuations The cache takes advantage

sals usually have go o d spatial lo calit yinthatobjects

are typically organized by the traversal ordering when

may actually p erform b etter when the two are nearly equal



they are rst copied and then are traversed in the

The essential feature is the use a of a writeal locate p olicy

in combinationwith subblock placementWriteallo cate means

same order by subsequenttraversals At the rst

that if a blo ck is written to without rst writing it is allo cated a

traversal of course the matchbetween ob jects ini

blo ckofcache memory rather than having the write bypass the

tial allo cation layout and the traversal order may also

cache and simply up date the blo ck out in main memoryThis

b e signicant

ensures that when ob jects are allo cated and hence written

they will b e in cache when they are referenced shortly thereafter

Because of the high spatial lo calityandlow tem

Subblo ck placement means that the cache blo ck is divided into

poral localityitmay b e desirable to limit the mem

several indep endent lines which can b e valid or invalid in the

ory used by a tracing traversal to avoid needlessly

cache This allows writes to one word to avoid triggering a

displacing the contents of memory by large amounts

fetch of the rest of the blo ck Ob jects can thus b e allo cated

and initialized without stalling the pro cessor and fetching the



old contents of the storage they o ccupy Carl Hauser p ersonal communication

of data that are only briey touched during tracing Once hash tables are treated prop erly further lo cality

Wil Baka Incremental tracing may yield some gains can b e made by using an algorithm that clusters

of the same b enet byallowing the mutator to touch data structures hierarchically Similar results were

data during the tracing phase keeping the most active rep orted for the Symb olics Lisp Machine system in a



data in fast memory masters thesis by DL Andre And While those

exp eriments were not particularly wellcontrolled the

results were strikingly p ositive and this is particu

Clustering of LongerLived Ob

larly signicant b ecause the results were obtained for

jects

a large commercial system

Several studies have addressed copying collections in

direct eect on lo calityie the eect of reorganizing

Dynamic Reorganization

the data which are subsequently accessed by the run

In White prop osed a system in which garbage

ning program

collection was deferred for long p erio ds of time but

in which Bakers readbarrier incremental copier was

Static Grouping

enabled to copy data for its lo cality eects Whi

the goal was not to reclaim empty space but instead

Stamos Sta Sta Blau Bla studied the ef

to simply cluster the active data so that it could b e

fects of using dierentcopying traversal algorithms

kept in fast memory One interesting prop erty of this

to reorganize longlived system data and co de in

scheme is that it reorganizes data in the order in which

Smalltalk systems While these algorithms reorganize

they are touched bythemutator and if subsequent

data during program execution ie at garbage col

access patterns are similar it should greatly improve

lection time they are referred to as static grouping

lo cality

algorithms b ecause they reorganize data according to

This scheme is impractical in its original form b e

how ob jects are linked at the time garbage collection

cause the sheer volume of garbage would swamp the

o ccursclustering is not based on the dynamic pat

write bandwidth of typical disks if memory were not

tern of the programs actual accesses to data ob jects

reclaimed Ung but the same basic idea has b een

Both studies concluded that depthrst reorganiza

incorp orated into the garbage collector of the Texas

tion was preferable to breadthrst reorganization but

hines Cou Joh Instruments Explorer Lisp mac

not by a large margin there is useful lo cality infor

This collector avoids p erforming exhaustiveback

mation in the top ology of data structures but any

ground scavenging until toward the end of the gar

reasonable traversal will do a decent job of organizing

bage collection cycle to enhance the o dds that ob jects

the data Breadth and depthrst traversals b oth

will b e reached rst by the mutator and copied in a

dramatically outp erform a random reorganization

lo calityenhancing order

Wilson et al p erformed a similar study for a Lisp

A similar approach has b een used in simulations by

system WLM and showed that traversal algo

the MUSHROOM pro ject at the UniversityofManch

rithms can make an appreciable dierence The most

ester WWH Rather than relying on the garbage

imp ortant dierence in those exp eriments was not b e

collector however this system is triggered bycache

tween traversal algorithms per sehowev er but in how

misses it can therefore resp ond more directly to the

large hash tables were treated System data are often

lo calitycharacteristics of a program rather than to

stored in hash tables which implementlargevariable

the interaction b etween the program and the garbage

binding environments such as a global namespace or

collector

a package Hash tables store their items in pseudo

Unfortunately suchschemes rely on sp ecialized

random order and this may cause a copying collec

hardware to b e worthwhile The Explorer system ex

tor to reach and copy data structures in a pseudo

ploits Lisp machines hardware supp ort for a Baker

random fashion This greatly reduces lo calitybut



style read barrier and the MUSHROOM system is

is easy to avoid by treating hash tables sp ecially



This b enet may not b e large relative to the additional

other nonrandom ordering and then mo dify the collector to

space cost of incremental collectionthe deferred reuse of mem use this information to order its examination of the entries

ory that cant b e reclaimed until the end of the incremental Another technique is to make hash tables indirect indexes into

tracing phase an ordered arrayofentries this has the advantage that it can b e



implemented without mo difying the collector and can therefore One technique is to mo dify the hash tables structure to

b e used for userdened table structures record the order in whichentries are made or imp ose some

based on a novel ob jectoriented architecture More recently microkernel op erating systems have

oered the ability to mo dify virtual memory p olicies

Wilson Wil casts these techniques as a form

without actually mo difying the kernel The kernel

of adaptive prefetching and argues that as memo

calls usersp ecied routines to control the paging of

ries continue to grow such negrained reorganiza

a pro cess rather than hardco ding the entire paging

tion maybeoverkill reorganization of virtual memory

p olicy into the kernel itself In the Mach system for

pages within larger units of disk transfer mayyield

example external pager pro cesses can b e used to con

go o d results on sto ck hardware This is supp orted

trol paging activity this feature has b een used to re

by data such as those from the LOOM ob jectoriented

duce paging for Standard ML of New Jersey in ways

virtual memory for Smalltalk Sta whichshowthat

similar to those used in the Symb olics system Sub

nergrained caching strategies are helpful primarily

when memories are very small As memories get

larger the optimal unit of caching gets larger as well

Lowlevel Implementation Is

and pagewise schemes tend to work roughly as well

as ob jectwise schemes Wilson also argues that the

sues

improvements due to negrained reorganization may

b e mostly due to deciencies in the staticgraph algo So far wehave mostly discussed basic issues of gar

rithms used for comparisonin particular treatment

bage collector design and basic p erformance tradeos

of large hash tables However Llames has rep orted In addition to these primary considerations a garbage

Lla that dynamic reorganization can signicantly collector designer is faced with many design choices

improve lo calityeven after ro ots are treated appro which can have a signicant impact on the ultimate

priately and a go o d background scavenging traversal p erformance of the system and on how easily the gar



is used

bage collector can b e integrated with other comp o

nents of the system In this section we discuss these

lowlevel implementation questions in more detail

Co ordination with Paging

Several systems have co ordinated the garbage collec

Pointer Tags and Ob ject Headers

tor with the virtual memory system to improve paging

For most of this pap er wehave assumed that p ointers

p erformance

are tagged with small tags and that p ointedto ob jects

The Symb olics Lisp machines have had p erhaps the

have header elds that enco de more sp ecic typ e in

most comprehensive co ordination of garbage collec

formation this sp ecic typ e information can b e used

tion with virtual memory The Symb olics allo cator

to determine ob jects layouts including the lo cations

could notify the virtual memory system when a page

of emb edded p ointer elds This is the most common

no longer contained any useful data allo cate pages

h as Lisp scheme in dynamicallytyp ed languages suc

of virtual memory without paging their old garbage

and Smalltalk It is common in such languages that

contents into memory when the page was rst touched

ob jects are divided into two ma jor categories ob jects

Virtual memory co op eration was also used in the

that can b e stored within a machine word as tagged

intergenerational p ointer recording mechanism Be

immediate values and ob jects which are allo cated on

fore paging out a page holding p ointers to younger

the heap and referred to via p ointers Heapallo cated

generations the page was scanned and the intergen

ob jects are often further divided into two categories

erational p ointers found Mo o This allowed the

those which contain only tagged values immediates

garbage collector to avoid paging in data just to scan

and p ointers whichmust b e examined by the collec

them for p ointers into younger generations Virtual

tor to nd the p ointers and those whichcontain only

memory mapping techniques were also used to opti

nonp ointer elds which can b e ignored by the garbage

mize the copying of large ob jectsrather than actu

collector

ally copying the data within a large ob ject the pages

Another p ossibilityisthateach eld contains b oth

holding the data could simply b e mapp ed out of the

the ob ject if its a small immediate value or p ointer

old range of virtual addresses and into the new range

and the detailed typ e information This generally re

Wit

quires elds to b e twowordsone word long enough

to hold a raw p ointer or raw immediate and another



Llames added dynamic grouping to Mo ons Ephemeral Gar

to hold a bit pattern long enough to enco de all of the

bage Collector which uses the static grouping techniques de

typ es in the system Generally a whole word is used scrib ed in And

for the latter eld b ecause of alignment constraints cator to allo cate ob jects in the appropriate pages A



for load and store op erations Despite the waste of typ e test requires masking and shifting a p ointer to

space this scheme may b e attractive on some archi derive the page numb er and a table lo okup to nd

tectures esp ecially those with wide buses the typ e descriptor for ob jects in that page BiBOP

For a language with a static typ e system still an enco ding can save space by letting a tag p er page suf

other p ossibility is that all ob jects have headers and ce to enco de the typ es of many ob jects

that p ointer elds contain no tags This requires a

Another variation on conventional tagging schemes

static typ e system which ensures that immediate val

is to avoid putting ob ject headers directly on the ob

ues cant b e stored in the same elds as p ointersif jects and to store them in a parallel array this may

they could it would require a tag to tell the dier haveadvantages for lo cality of reference by separating

ence Simply knowing which elds of ob jects may

out the data relevant to normal program op eration

contain p ointers is sucient if the p ointedto ob

from those that are only of interest to the collector

jects have headers to deco de their structure Some

and allo cator

dynamicallytyp ed systems use this representation as

well and avoid having immediate values within a

ConservativePointer Finding

wordeven short integers are represented as ob jects



with headers An extreme case of catering to other asp ects of a lan

guage implementationis conservative pointernding

If strictly static typing is used even the headers

which is a strategy for coping with compilers that

can b e omittedonce a p ointer eld is found and the

dont oer any supp ort for runtime typ e identica

typ e of the p ointer is known the typ e of the ob ject it



tion or garbage collection BW In such a system

p oints to is obvious Appa Gol To allowtrac

ing traversals it is only necessary that the typ es of the collector treats anything that might be a pointer

as a p ointereg any prop erlyaligned bit pattern

the ro ot p ointers eg lo cal variables in an activation

that could b e the address of an ob ject in the heap

stack b e known This can b e accomplished in sev

The collector may mistake other values suchasan

eral ways as we will explain later From there we

integer with the same bit pattern for p ointers and

can determine the typ es of their referents and th us

their referents p ointer elds and so on transitively retain ob jects unnecessarily but several techniques

Still some systems with static typing put headers on can b e used to make the probabilityofsuch mistakes

very small Surprisingly these techniques are eec

ob jects anyway b ecause the cost is not that large and



tive enough that most C programs can b e garbage

it simplies some asp ects of the implementation

collected fairly eciently with little or no mo dica

The choice of tag and p ointer schemes is usually

made with some regard to garbage collection but tion Bo e This simplies the garbage collection of

most often the main consideration is making nor programs written without garbage collection in mind

and programs written in multiple languages some of

mal program op erations ecient Tagging schemes

which are unco op erative WDH

are usually chosen primarily to maketyp e dispatch

ing arithmetic op erations and p ointer dereferencing Making such a collector generational requires sp e

as fast as p ossible whichscheme is b est dep ends cial techniques due to the lack of compiler co op eration

largely on the language semantics and the strategy in implementing a write barrier to detect intergenera

for ensuring that the most frequent op erations are fast tional p ointers Virtual memory dirty bits or access

KKR GG SH RosGud protection traps can b e used to detect whichpages

In some systems individual ob jects do not have are written to so that they can b e scanned at collec

headers and typ e information is enco ded by segre tion time to detect p ointers into younger generations

gating ob jects of particular typ es into separate sets of DWH

pages This big bag of pages or BiBOP technique

Conservativepointer nding imp oses additional

asso ciates typ es with pages and constrains the allo

constraints on the garbage collector In particular the

collector is not free to move ob jects and up date p oin



On manyarchitectures normal loads and stores must b e

ters b ecause a nonp ointer might b e mistaken for a

aligned on word b oundaries and on others there is a time

p enalty for unaligned accesses

p ointer and mistakenly up dated This could result in



This typically requires clever implementation strategies to



optimize away the heap allo cation of most integers Yuaa These techniques are usually asso ciated with Bo ehm and



For example if using page marking or card marking for his asso ciates who have develop ed them to a high degree but

generational collection headers makeitmuch simpler to scan similar techniques app ear to have b een used earlier in the Kyoto

pages Common Lisp system and p erhaps elsewhere

mysterious and unpredictable changes to nonp ointer subword parts of a p ointer with temp orary inconsis

data likeintegers and character strings Conservative tencies in the state of the p ointer While most compil

collectors therefore cant use a straightforward copy ers dont p erform these optimizations very often they

ing traversal algorithm do o ccur and a few compilers do them regularlyFor

most compilers it is sucient to turn o the highest

Conservative p ointer nding can b e combined with

levels of optimization to avoid such p ointermangling

other techniques to cop e with language implementa

optimizations but this is not reliable across compil

tions that are only partly co op erative For example

ers and typically costs a few p ercentinruntime e

Barletts and Detlefs mostlycopying collectors use

ciency due to missed optimizations Since most com

headers to deco de elds of ob jects in the heap but

pilers do not provide the ability to selectively turn

rely on conservativetechniques to nd p ointers from

o only the optimizations that are troublesome for

the activation stack Bar Det This supp orts

garbage collectors it is usually necessary to turn o

copying techniques that relo cate and compact most

several optimizations ie the high level optimiza

but not all ob jects Ob jects conservatively identi

tions Toavoid this problem garbage collector design

ed as b eing p ointed to from the stack are pinned



ers have prop osed a set of constraints that compilers

in place and cannot b e moved

can preserve to ensure that garbage collection is p os

The choice of conservative stack scanning and more

sible these constraints do not require ma jor changes

precise heap tracing is often reasonable b ecause it is

to existing compilers Bo e BC

usually easier to retrot ob ject headers into a language

than it is to mo dify the compiler to make stackframe

formats deco dable Headers can b e added to ob jects

by a heap allo cation routine whichmay simply b e

Similarly programming guidelines have b een pro

a library routine that can b e changed or substituted

p osed to ensure that programmers in C avoid con

easily Compilers often record enough information to

structions that make correct garbage collection im

deco de record layouts for debugging purp oses and

p ossible ED by programming in a very slightly

that information can b e captured and massaged into

restricted subset of C it is p ossible to ensure that

runtime typ e identication for heapallo cated ob jects

a co op erative compiler can supp ort correct garbage

WJ

collection By using a slightly more restricted safe

Conservative p ointernding can b e defeated by

subset of C it is p ossible to ensure that even buggy

languagelevel facilities such as the ability to cast p oin

programs do not break the basic memory abstractions

ters to integers destroy the original p ointer and p er

and pro duce hardtodiagnose errors

form arbitrary arithmetic on the integer value If

the original value is then restored and cast backtoa

p ointer the referredto ob ject may no longer exist

the garbage collector mayhave reclaimed the ob ject

Some p eople ob ject to conservative p ointernding

b ecause it couldnt tell that the integer value p ointed

are known not to be techniques b ecause they

to the ob ject even when viewed as a p ointer value

correctit is p ossible for example for an integer

Fortunately most programs do not p erform this se

to b e mistaken for a p ointer causing garbage to b e

quence of op erationsthey may cast a p ointer to an

retained In the worst case this may cause consider

integer but the original p ointer is likely to still b e

able garbage to go unreclaimed and a program may

present and the ob ject will therefore b e retained

simply run out of memory and crash Advo cates of

Compiler optimizations can p erform similar op era

conservative p ointernding counter that the proba

tions on p ointers and this is unfortunately harder for

bility of such an o ccurrence can b e made very small

the programmer to avoidthe compiler may use alge

and that such errors are much less likely than fatal er

braic transformations on p ointer expressions disguis

rors due to programmers mistakes in explicit heap

ing the p ointers or they may p erform op erations on

management In practice therefore the use of an

incorrect technique may b e b etter than having pro 

Such ob jects are pinned in place in memorybutcanbe

grammers write programs that are even less correct

advanced to an older generation bychanging the set to which

a page b elongsthat is ob jects b elong to pages and pages

In the long run it is hop ed conservative p ointer nd

b elong to generations but an entire page can b e moved to a

ing will make garbage collection widely usable and

dierent generation simply bychanging the tables that record

once its widely used compiler vendors will provide

which pages b elong to which generations In essence ob jects

ages are enco ded using something like a BiBOP tagging scheme some degree of supp ort for more precise techniques

Linguistic Supp ort and Smart Recentwork in reective systems has explored lan

guages with very p owerful and regular extension mech

Pointers

anisms and which exp ose some of the underlying

Another approach to retrotting garbage collection

implementation to allow ecient reimplementation of

into existing systems is to use the extension facili

existing language features KdRB MN YS

ties provided by the language and implement gar

While most existing reective languages supply gar

bage collection within the language itself The most

bage collection as part of the base language it is p ossi

common form of this is to implement a sp ecial set of

ble to imagine implementing garbage collection within

garbagecollected data typ es with a restricted set of

a small p owerful reective language As in any reec

access functions that preserve the garbage collector

tive system however it is imp ortant to exp ose only

constraints For example a referencecounted data

the most imp ortantlowlevel issues to avoid limiting

typ e might b e implemented and accessor functions

the choices of the base language implementor this is

or macros used to p erform assignmentsthe acces

an area of active research

sor functions maintain the reference counts as well as

p erforming the actual assignments

Compiler Co op eration and Opti

A somewhat more elegant approach is to use ex

mizations

tension mechanisms suchasoperatoroverloading to

allow normal program op erators to b e used on gar

In any garbagecollected system there must b e a set of

bage collected typ es with the compiler automatically

conventions used by b oth the garbage collector and the

selecting the appropriate userdened op eration for

rest of the system the interpreter or compiled co de

given op erands In C it is common to dene

to ensure that the garbage collector can recognize ob

smart p ointer classes that can b e used with garbage

jects and p ointers In a conservative p ointernding

collectible classes Str with appropriately dened

system the contract b etween the compiler and col

meanings for p ointermanipulating op erations suchas

lector is very weak indeed but its still thereif the

and and for the addresstaking op eration on

compiler avoids strange optimizations that can defeat

collectible ob jects Unfortunately these userdened

the collector In most systems the compiler is much

p ointer typ es cant b e used in exactly the same ways

more co op erative ensuring that the collector can nd

as builtin p ointer typ es for several reasons Ede

p ointers in the stack and from registers

One reason is that there is no waytodeneallofthe

automatic co ercions that the compiler p erforms auto

GCAnytime vs SafePoints Collection

matically for builtin typ es Another problem is that

not all op erators can b e overloaded in this way C

ween the collector and run Typically the contract b et

provides most but not all of the extensibility neces

ning co de takes one of two forms whichwe call the

sary to integrate garbage collection into the language

gcanytime and safepoints strategies In a gcanytime

gracefully It is apparently easier in Ada Bakb

system the compiler ensures that running co de can b e

b ecause the overloading system is more p o werful and

interrupted at any p oint and it will b e safe to p erform

the builtin p ointer typ es have fewer subtleties which

a garbage collectioninformation will b e available to

must b e emulated Yet another limitation is that it is

nd all of the currently activepointer variables and

imp ossible to redene op erations on builtin classes

deco de their formats so that reachable ob jects can b e

making it dicult to enhance the existing parts of the

found

languageonly userdened typ es can b e garbage col

In a safepoints system the compiler only ensures

lected gracefully

that garbage collection will b e p ossible at certain

Still another limitation is that garbage collection

selected p oints during program execution and that

is dicult to implement eciently within the lan

these p oints will o ccur reasonably frequently and reg

guage b ecause it is imp ossible to tell the compiler

ularly In many systems pro cedure calls and back

how to compile for certain imp ortant sp ecial cases

ward branches are guaranteed to b e safe p oints en

Det Ede For example in C or Ada there

suring that the program cannot lo op or recurse in

is no way to sp ecialize an op eration for ob jects that

denitely without reaching a safe p ointthe longest

are known at compile time not to b e allo cated in the

p ossible time b etween safe p oints is the time to take



heap

the longest noncalling forward path through any pro



In C terminology op erators can only b e sp ecialized on the typ es of their arguments not the arguments storage class

cedure Finergrained resp onsiveness can b e guaran where within an ob ject if the collector ensures that

teed byintro ducing intermediate safe p oints if neces headers of ob jects can b e derived from them using

sary alignment constraints If a noncopying collector is

used untraced registers might b e allowed to hold

The advantage of a safep oints scheme is that the

optimized p ointers which might not actually p oint

compiler is free to use arbitrarily complex optimiza

within an ob ject at all due to an algebraic transforma

tions within the unsafe regions b etween safe p oints

tion as long as an unoptimized p ointer to the same

and it is not obligated to record the information nec

ob ject is in a traced register in a known format

essary to make it p ossible to lo cate and deoptimize



p ointer values One disadvantage of a safe p oints The main problem with partitioned register sets is

scheme is that it restricts implementation strategies that it reduces the compilers freedom to allo cate pro

for lightweight pro cesses threads If several threads gram variables and temp oraries in whichever registers

are available Some co de sequences might require sev

of control are executing simultaneously in the same

garbagecollected heap and one thread forces a gar eral p ointer registers and very few nonp ointer regis

bage collection the collection must wait until all ters and other co de sequences might require the opp o

threads have reached a safe p oint and stopp ed site This would mean that more variables would have

to b e spilled ie allo cated in the stack or heap in

One implementation of this scheme is to mask hard

stead of in registers and co de would run more slowly

ware interrupts b etween safe p oints a more common

one is to provide a small routine which can handle An alternative to partitioned register sets is to give

the compiler more freedom in its use of registers but

the actual lowlevel interrupt at anytimeby sim

ply recording basic information ab out it setting a require it to communicate more information to the

ag and resuming normal execution Compiled co de garbage collectorie to tell it where the p ointers

checks the ag at each safe p oint and dispatches to are and howtointerpret them prop erlyWe call this

a highlevel interrupt handler if it is set This intro strategy variable representation recording b ecause the

compiler must record its decisions ab out which reg

duces a higherlevel notion of interrupts somewhat

isters or stackvariables hold p ointers and howto

insulated from actual machine interruptsand with a

somewhat longer latency recover the ob jects address if the p ointer has b een

or each range of in transformed by an optimization F

With either a safep oints or a gcanytime strategy

structions where p ointer variable representations dif

there are many p ossible conventions for ensuring that

fer the compiler must emit an annotation This in

p ointers can b e iden tied for garbage collection with

formation is similar to that required for debugging

a safep oints system however the compiler is free to

optimized co de and most optimizations can b e sup

violate the convention in b etween safe p oints

p orted with little overhead The space requirements

for this additional information which essentially an

Partitioned Register Sets vs Variable

notates the executable co de may not b e negligible

Representation Recording

however and it may b e desirable in some cases to

In many systems the compiler resp ects a simple con

change the compilers co de generation strategy slightly

vention as to which registers can b e used for holding

DMH

which kinds of values In the T system using the

Orbit compiler KKR Kra for example some

Optimization of Garbage Collection It

registers are used only for tagged values and others

self

only for raw nonp ointer values The p ointers are as

While garbage collectors can b e constrained by their

sumed to b e in the normal formatdirect p ointers to

relationship to the compiler and its optimizations it

known oset within the ob ject plus a tag headers

is also p ossible for the compiler to assist in making the

can thus b e extracted from the p ointedto ob jects by

garbage collector ecient For example an optimizing

a simple indexed load instruction

compiler may b e able to optimize away unnecessary

Other register set partitionings and conventions are

or redundant reador writebarrier op erations or to

p ossible For example it would b e p ossible to have

detect that a heap ob ject can b e safely stackallo cated

registers holding raw p ointers p erhaps p ointers any

instead



This is particularly imp ortant when using a highlevel lan

In most generational and incremental algorithms

guage suchasC asanintermediate language and it is un

the write barrier is only necessary when a p ointer is

desirable or imp ossible to prevent the C compiler from using

b eing stored but it may not b e obvious at compile optimizations that mask p ointers

time whether a value will b e a p ointer or not In Free Storage Management

dynamically typ ed languages variables may b e able

Nonmoving collectors must deal with the fact that

to take on either p ointer or nonp ointer values and

freed space may b e distributed through memory in

even in statically typ ed languages p ointer variables

tersp ersed with the live ob jects The traditional way

may b e assigned a null value Compilers may p erform

of dealing with this is to use one or more free lists

dataow analysis whichmayallow the omission of a

but it is p ossible to adapt any of the free storage

write barrier for more nonp ointer assignment op era

management techniques whichhave b een develop ed

tions In the future advanced typ eow analysis such

for explicitlymanaged heaps Knu Sta

as that used in the Self compiler CU CUmay

The simplest scheme used in many early Lisp inter

provide greater opp ortunities for read and write bar

preters is to supp ort only one size of heapallo cated

rier optimizations

ob ject and use a single free list to hold the freed items

When garbage is detected eg during a sweep phase

The compiler may also b e able to assist by eliminat

or when reference counts go to zero the ob jects are

ing redundantchecks or marking when a write barrier

simply strung together into a list using one of the

tests the same p ointer or marks the same ob ject mul

normal data elds to hold a list p ointer

tiple times To the b est of our knowledge no exist

When generalizing this scheme to supp ort multi ing compilers currently do this For this to b e p os

ple ob ject sizes two basic choices are p ossible to sible however the optimizer must b e able to assume

maintain separate lists for each ob ject size or ap that certain things are not changed by the collector in

ways that arent obvious at compile time For exam

proximate ob ject size or to keep a single list hold

ple with a gcanytime collector and multiple threads

ing various sizes of freed spaces Techniques with

of control a thread could b e preempted at any time

separate lists for dierentsized ob jects include seg

regated storage and buddy systems PN Systems and a garbage collection could o ccur b efore the thread

with a unied free list include sequential t meth is resumed In such a system there are fewer opp ortu

o ds and bitmapped techniques An intermediate strat nities for optimization b ecause the collector can b e in

voked b etween anytwo consecutive instructions With

egy is to use a single data structure but use a tree

a safep oints system however the optimizer maybe

or similar structure sorted by the sizes andor ad

dresses of the free spaces Ste to reduce search able to p erform more optimizations across sequences

times Most of these systems and several hybrids of instructions that execute atomically with resp ect to

eg BBDT OA WW GZ are already de garbage collection

scrib ed in the literature on and

Several researchers ha veinvestigated compiler op

we will not describ e them here An exception to this

timizations related to heapallo cated structures b oth

is bitmapp ed memory management whichhasbeen

to detect p otential aliasing and to allow heap ob jects

used in several garbage collectors but is not usually

to b e stack allo cated when it is p ossible to infer that

discussed in the literature

they b ecome garbage at a particular p oint in a pro

Bitmapp ed memory management simply maintains

gram Scha Schb Hud JM RM LH

a bitmap corresp onding to units of memory typically

HPR CWZ Bak Cha discusses interac

words or pairs of words if ob jects are always aligned

tions b etween conventional optimizations and garbage

on twoword b oundaries with the bits value indicat

collection and when garbage collectionoriented opti

ing whether the unit is in use or not The bitmap is

mizations are safe These topics are b eyond the scop e

up dated when ob jects are allo cated or reclaimed The

of this survey and to the b est of our knowledge no

bitmap can b e scanned to construct a free list as in

actual systems use these techniques for garbage col

BDS or it can b e searched at allo cation time in

lection optimizations still suchtechniques maylead

a manner analogous to the search of a free list in a

to imp ortantimprovements in garbage collector p er

sequential t algorithm

formance by shifting much of the garbage detection

work to compile time

Compact Representations of Heap

Certain restricted forms of lifetime analysisfor lo

Data

cal variable binding environments onlycan b e sim

Representations of heap data are often optimized for ple but eectiveinavoiding the need to heapallo cate

sp eed rather than for space In dynamically typ ed most variable bindings in languages with closures

languages for example most elds of most ob jects Kra

are typically of a uniform size large enough to hold a hardware The basic idea is to devote some fraction of

tagged p ointer Pointers in turn are typically repre main memory to storing pages in a compressed form

sented as fullsize virtual addresses in a at nonseg so that one main memory page can hold the data for

mented address space Much of this space is wasted several virtual memory pages Normal virtual memory

in some sense b ecause many nonp ointer values could accessprotection hardware is used to detect references

b e represented in fewer bits and b ecause p ointers typ to compressed pages and trap to a routine that will

ically contain very little information due to limited uncompress them Once touched a page is cached in

heap sizes and lo cality of reference normal uncompressed form for a while so that the

program can op erate on it After a page has not b een

At least two mechanisms havebeendevelop ed to ex

touched for some time it is recompressed and access

ploit the regularities in data structures and allowthem

protected The compression routine is not limited to

to b e stored in a somewhat compressed form most of

compressing CDR eldsit can b e designed to com

the time and expanded on demand when they are

press other p ointers and nonp ointer data suchas

op erated on One of these is a negrained mecha

integers and character strings or executable co de

nism called cdr codingwhich is sp ecic to list cells

In eect compressed paging adds a new level to the

such as Lisp cons cells Han BC LH The

memory hierarchyintermediate in cost b etween nor

other is compressedpaging a more generalpurp ose

mal RAM and disk This may not only decrease over

mechanism that op erates on virtual memory pages

all RAM requirements but actually improve sp eed

Wil Dou WB Both mechanisms are invis

y reducing the numb er of disk seeks Using simple b

ible at the language level

and fast but eective compression algorithms heap

Cdrco ding was used in many early Lisp systems

data can typically b e compressed by a factor of two

when randomaccess memory was very exp ensive Un

to fourin considerably less time than it would take

fortunately it tended to b e rather exp ensiveinCPU

to do a disk seek even on a relatively slow pro cessor

time b ecause the changeable representations of list

WB As pro cessor sp eed improvements continue to

complicate basic list op erations Common op erations

outstrip disk sp eed improvements compressed paging

such as CDR require extra instructions to checkto

b ecomes increasingly attractive

see what kind of list cell they are traversingis it a

normal cell or one that has b een compressed

The compressed representation of a list in a cdr

GCrelated Language Fea

co ded system is really an array of items corresp ond

tures

ing to the items of the original list The cdrco ding

system works in concert with the garbage collector

The main use of garbage collection is to supp ort the

which linearizes lists and pac ks consecutive items into

simple abstraction that innite amounts of uniform

arrays holding the CAR values list items the CDR

memory are available for allo cation so that ob jects

valuesthe p ointers that link the listsare omitted

can simply b e created at will and can conceptually

To make this work it is necessary to store a bit some

live forever Sometimes however it is desirable to

where eg a sp ecial extra bit in the tag of the eld

alter this view It is sometimes desirable to have p oin

holding the CAR value saying that the CDR value

ters which do not prevent the referredto ob jects from

is implicitly a p ointer to the next item in memory

b eing reclaimed or to trigger sp ecial routines when an

Destructive up dates to CDR values require the gen

ob ject is reclaimed It can also b e desirable to have

eration of actual cons cells on demand and the for

more than one heap with ob jects allo cated in dierent

warding of references from the predecessor part of the

heaps b eing treated dierently

array

This scheme is really only worthwhile with sp ecial

Weak Pointers

hardware andor micro co ded routines as found on

Lisp Machines On current generalpurp ose pro ces

A simple extension of the garbage collection abstrac

sors it is seldom worth the time for the savings that

tion is to allow programs to hold p ointers to ob jects

can b e gained Roughly half the data in a Lisp sys

without those p ointers preventing the ob jects from b e

tem consists of cons cells so compressing them from

ing collected Pointers that do not keep ob jects from

twowords to one can only save ab out at b est

b eing collected are known a weak pointersandthey

are useful in a variety of situations One common ap More recently compressedpaging has b een prop osed

plication is the maintenance of tables whichmakeit as a means of reducing memory requirements on sto ck

p ossible to enumerate all of the ob jects of a given kind nalizable ob jects in some way and registering them in

For example it might b e desirable to have a table of a data structure much like that used for weak p oin

all of the le ob jects in a system so that their buers ters They may in fact use the same data structure

could b e ushed p erio dically for fault tolerance An Rather than simply nilling the p ointers if the ob jects

other common application is the maintenance of col arent reached by the primary traversal however the

lections of auxiliary information ab out ob jects where p ointers are recorded for sp ecial treatment after the

the information alone is useless and should not keep collection is nished After the collection is complete

the describ ed ob jects alive Examples include prop and the heap is once again consistent the referredto

erty tables and do cumentation strings ab out ob jects ob jects have their nalization op erations invoked

the usefulness of the description dep ends on the de

Finalization is useful in a variety of circumstances

scrib ed ob ject b eing otherwise interesting not vice

but it must b e used with care Because nalization o c

versa

curs asynchronouslyie whenever the collector no

Weak p ointers are typically implemented by the use

tices the ob jects are unreachable and do es something

of a sp ecial data structure known to the garbage col

ab out itit is p ossible to create race conditions and

lector recording the lo cations of weak p ointer elds

other subtle bugs For a more thorough discussion of

The garbage collector traverses all other p ointers in

both weak p ointers and nalization see Hay

the system rst to determine which ob jects are reach

able via normal paths Then the weak p ointers are tra

versed if their referents have b een reached by normal

paths the weak p ointers are treated normally In a

copying collector they are up dated to reect the new

lo cation of the ob ject If their referents havenotbeen

Multiple DifferentlyManaged

reached however the weak p ointers are treated sp e

Heaps

ciallytypically replaced with a nonp ointer value such

as null to signal that their referents no longer exist

In some systems a garbage collected heap is provided

for convenience and a separate explicitlymanaged

Finalization

heap is provided to allowvery precise control over the

Closely related to the notion of weak p ointers is the

use of memory

concept of nalization ie actions that are p erformed

and In some languages such as Mo dula CDG

automatically when an ob ject is reclaimed This is es

an extended version of C ED a garbage col

p ecially common when an ob ject manages a resource

lected heap co exists with an explicitlymanaged heap

other than heap memorysuchasaleoranetwork

This supp orts garbage collection while allowing pro

connection For example it may b e imp ortant to close

grammers to explicitly control deallo cation of some

a le when the corresp onding heap ob ject is reclaimed

ob jects for maximum p erformance or predictability

In the example of annotations and do cumentation it

Issues in the design of suchmultipleheap systems are

is often desirable to delete the description of an ob ject

discussed in Del and ED

once the ob ject itself is reclaimed Finalization can

thus generalize the garbage collector so that other re

In other systems such as large p ersistent or dis

sources are managed in muc h the same wayasheap

tributed shared memories it may also b e desirable to

memory and with similar program structure This

havemultiple heaps with dierent p olicies for distri

makes it p ossible to write more general and reusable

bution eg shared vs unshared access privileges

co de rather than having to treat certain kinds of ob

and resource managementMos Del

jects very dierently than normal ob jects Con

Many language implementations have had such fea sider a routine that iterates over a list applying an

arbitrary function to each item in the list If le de

tures internally more or less hidden from normal pro

scriptors are garbage collected the very same iteration

grammers but there is little published information

routine can b e used for a list of le descriptors as for

ab out them and little standardization of program

mer interfaces The terminology is also not stan a list of heap ob jects If the list b ecomes unreachable

dardized and these heaps are variously called heaps the garbage collector will reclaim the le descriptors

along with the list structure itself

zones areas arenas segments p o ols re

gions and so on Finalization is typically implemented by marking

Overall Cost of Garbage Col with a stateofthe art generational collector and com

piler co op eration

lection

Tarditi and Diwan TD studied garbage collec



tion costs in Standard ML of NJ and found the total

The costs of garbage collection have b een studied in

time cost of garbage collection b etween and

several systems esp ecially Lisp and Smalltalk sys

While this is a very interesting and informativestudy

tems Most of these studies have serious limitations

we b elieve these numb ers to b e unnecessarily high

however A common limitation is the use of a slow lan

due to extreme load on the garbage collector caused

guage implementation eg an interpreter or an inter

by allo cating tremendous amounts of activation infor

preted virtual machine or a p o or compiler When pro

mation and binding environments on the heap rather

grams execute unrealistically slowly the CPU costs of

than on a stack Sect We also b elieve their write

garbage collection app ear very low Another common

barrier could b e improved

limitation of cost studies is the use of toyprograms

Our own estimate of the typical cost of garbage col

or synthetic b enchmarks Such programs often b e

lection in a wellimplemented nonincremental gener

havevery dierently from large realworld applica

ational system for a highp erformance language imp

tions Often they have little longlived data making

lementation is that it should cost roughly ten p ercent

tracing collection app ear unrealistically ecient This

of running time over the cost of a wellimplemented

may also aect measurements of write barrier costs in

explicit heap management system with a space cost

generational collectors b ecause programs with little

of roughly a factor of two in data memory size Nat

longlived data do not create manyintergenerational

urally increased CPU costs could b e traded for re

p ointers

duced space costs This must b e taken as an educated

Yet another diculty is that most studies are now

guess however comp ensating for what we p erceiveas

somewhat dated b ecause of later improvements in

deciencies in existing systems and limitations of the

garbage collection implementations The most valu

studies to date

able studies are therefore the ones with detailed statis

Clearly more studies of garbage collection in high

tics ab out various events which can b e used to infer

p erformance systems are called for esp ecially for go o d

how dierent strategies would fare

implementations of stronglytyp ed languages

Perhaps the b est study to date is Zorns investi

gation of GC cost in a large commercial Common

Lisp system using eight large programs Zor Zorn

Conclusions and Areas for

found the time cost of generational garbage collection

Research

to b e to W e susp ect those numb ers could

b e improved signicantly with the use of a fast card

Wehave argued that garbage collection is an essen

marking write barrier Shaws thesis provides similar

tial for fully mo dular programming to allow exible

p erformance numb ers Sha and Steenkistes thesis

reusable co de and to eliminate a large class of ex

contains similar relevant statistics Ste

tremely dangerous co ding errors

Zorn has also studied simple nongenerational col

Recentadvances in garbage collection technology

lector using conservativepointernding Zor this

make automatic storage reclamation aordable for use

allowed him to study garbage collection using a high

in highp erformance systems Even relatively sim

p erformance implementation of the C language and

ple garbage collectors p erformance is often comp et

compare the costs of garbage collection to those of

itivewithconventional explicit storage management

various implementations of explicit deallo cation Un

App Zor Generational techniques reduce the

fortunately the garbage collector used was not state

basic costs and disruptiveness of collection by exploit

ofthe art partly due to the lack of compiler co op era

ing the empiricallyobserved tendency of ob jects to die

tion Zorn found that when compared to using a well

young Incremental techniques mayeven makegar

implemented malloc and free programs using

bage collection relatively attractive for hard realtime

a simple conservative GC used b etween and

systems

more CPU time and b etween and more

Wehave discussed the basic op eration of several

memory Zorns test programs were unusually heap

kinds of garbage collectors to provide a framework

allo cationintensive however and the costs would pre



sumably b e lower for a more typical workload We also

ML is a statically typ ed generalpurp ose mostlyfunctional

b elieve these gures could b e improved considerably language

for understanding current research in the eld A key lectors p oint of view A p ersistent store keeps those

p oint is that standard textb o ok analyses of garbage data within the scop e of garbage collection oering

collection algorithms usually miss the most imp ortant the attractive prosp ect of automatic managementof

characteristics of collectorsnamely the constant fac longlived dataas well as the challenge of doing it

tors asso ciated with the various costs such as write eciently

barrier overhead and lo cality eects Similarly real

Parallel computers raise new issues for garbage col

time garbage collection is a more subtle topic than is

lectors as well It is desirable to make the collector

widely recognized These factors require garbage col

concurrent ie able to run on a separate pro cessor

lection designers to take detailed implementation is

from the application to takeadvantage of pro cessors

sues into account and b e very careful in their choices

not in use by the application It is also desirable to

of features Pragmatic decisions such as the need

make the collector itself parallel so that it can b e sp ed

to interop erate with existing co de in other languages

up to keep up with the application Concurrent collec

may also outweigh small dierences in p erformance

tors raise issues of co ordination b etween the collector

Despite these complex issues many systems actu

and the mutator which are somewhat more dicult

ally have fairly simple requirements of a garbage col

than those raised by simple incremental collection

lector and can use a collector that consists of a few

Parallel collectors also raise issues of co ordination of

hundred lines of co de Systems with large complex

dierent parts of the garbage collection pro cess and

optimizing compilers should have more attention paid

of nding sucient parallelism despite p otential b ot

to garbage collection and use stateoftheart tech

tlenecks due to the top ologies of the data structures

niques One promising developmentistheavailabil

b eing traversed

ity of garbage collectors written in p ortable highlevel

Distributed systems pose still more problems

languages typically C and adaptable for use with

AMR the limitations on parallelism are partic



various implementations of various languages

ularly severe when the collection pro cess must pro

Garbage collector designers must also keep up with

ceed across multiple networked computers and com

advances in other asp ects of system design The tech

munication costs are high In large systems with

niques describ ed in this survey app ear to b e su

longrunning applications networks are typically un

cienttoprovide go o d p erformance in most relatively

reliable and distributed garbage collection strategies

conventional unipro cessor systems but continual ad

for such systemslike the applications running on

vances in other areas intro duce new problems for gar

themmust b e robust enough to tolerate computer

bage collector design

and network failures

Persistent ob ject stores ABC DSZ AM

In large p ersistent or distributed systems data in

allow large interrelated data structures to b e sa ved in

t garbage collection tegrity is particularly imp ortan

denitely without writing them to les and rereading

strategies must b e co ordinated with checkp ointing and

them when they are needed again by automatically

recovery b oth for eciency and to ensure that the col

preserving p ointerlinked data structures they relieve

lector itself do es not fail Kol Det ONG

the programmer of tedious and errorprone co ding of

As highp erformance graphics and sound capabil

inputoutput routines Large p ersistent ob ject stores

ities b ecome more widely available and economical

can replace le systems for many purp oses but this in

computers are likely to b e used in more graphical and

tro duces problems of managing large amounts of long

interactiveways Multimedia and virtual realityap

lived data It is very desirable for p ersistent stores to

plications will require garbage collection techniques

have garbage collection so that storage leaks do not

that do not imp ose large delays making incremen

result in a large and p ermanent accumulation of un

tal techniques increasingly desirable Increasing use

reclaimed storage Garbage collecting a large p ersis

of emb edded micropro cessors makes it desirable to fa

tent store is a very dierent task from garbage col

cilitate programming of hard realtime applications

lecting the memory of a single pro cess of b ounded du

making negrained incremental techniques esp ecially

ration In eect a collector for a conventional sys

attractive

tem can avoid the problem of very longlived data

As computer systems b ecome more ubiquitous net

b ecause data written to les disapp ear from the col

worked and heterogeneous it is increasingly desirable



Examples of this include the UMass Garbage Collection

to integrate mo dules which run on dierent comput

To olkit HMDW and Wilson and Johnstones realtime gar

ers and mayhave b een develop ed in extremely dier

bage collector which has b een adapted for use with C Eiel

Scheme and Dylan WJ ent programming languages interop erabilitybetween

diverse systems is an imp ortant goal for the future App Andrew W App el Garbage collection In Pe

ter Lee editor Topics in AdvancedLanguage

and garbage collection strategies must b e develop ed

Implementation pages MIT Press

for such systems

Cambridge Massachusetts

Garbage collection is applicable to many current

Bak Henry G Baker Jr List pro cessing in real

generalpurp ose and sp ecialized programming sys

time on a serial computer Communications

tems but considerable work remains in adapting it

of the ACM April

to new evermore advanced paradigms

Bak Henry G Baker Jr Unify and conquer gar

bage up dating aliasing in functional

languages In ConferenceRecord of the

Acknowledgments

ACM Symposium on LISP and Functional

Programming LFP pages

References

Baka Henry G Bak er Jr Cacheconscious copy

ing collection In OOPSLA Workshop on

ABC MPAtkinson PJ Bailey KJ Chisholm

Garbage Col lection in ObjectOrientedSys

PWCockshott and R Morrison An ap

tems OOP Position pap er

proach to p ersistent programming Computer

Bakb Henry G Baker Jr The Treadmill Real

Journal Decemb er

time garbage collection without motion sick

AEL Andrew W App el John R Ellis and Kai

ness In OOPSLA Workshop on Gar

Li Realtime concurrent garbage collection

bage Col lection in ObjectOriented Systems

on sto ckmultipro cessors In Proceedings of

OOP Position pap er Also app ears as

the SIGPLAN ConferenceonProgram

SIGPLAN Notices March

ming Language Design and Implementation

Baka Henry G Baker Infant mortality and genera

PLD pages

tional garbage collection SIGPLAN Notices

AM Antonio Albano and Ron Morrison editors

April

Fifth International Workshop on Persistent

Bakb Henry G Baker Jr Safe and leakpro of

Object Systems San Miniato Italy Septem

resource management using Ada limited

b er SpringerVerlag

typ es Unpublished

AMR Saleh E Ab dullahi Eliot E Miranda and

Bar Jo el F Bartlett Compacting garbage collec

Graem A Ringwood Distributed garbage

tion with ambiguous ro ots Technical Rep ort

collection In Bekkers and Cohen BC

Digital Equipment Corp oration West

pages

ern Research Lab oratoryPalo Alto Califor

And David L Andre Paging in Lisp programs

nia February

Masters thesis University of Maryland Col

Bar Jo el F Bartlett Mostlycopying garbage col

lege Park Maryland

lection picks up generations and C Tech

nical Note TN Digital EquipmentCorpo

AP S Abraham and J Patel Parallel gar

ration Western Research Lab oratory Octob er

bage collection on a virtual memory sys

tem In E Chiricozzi and A DAmato

editors International ConferenceonParal

BBDT G Bozman W Buco T PDaly and W H

lel Processing and Applications pages

Tetzla Analysis of free storage algorithms

LAquila Italy Septemb er Elsevier

revisited IBM Systems Journal

NorthHolland

App Andrew W App el Garbage collection can

BC Daniel G Bobrow and Douglas W Clark

b e faster than stack allo cation Information

Compact enco dings of list structure ACM

Processing Letters June

gramming Languages and Transactions on Pro

Systems Octob er

Appa Andrew W App el Runtime tags arent neces

sary Lisp and Symbolic Computation

BC HansJuergen Bo ehm and David Chase A

prop osal for garbagecollectorsafe compila

tion The Journal of C Language Translation

Appb Andrew W App el Simple generational gar

Decemb er

bage collection and fast allo cation Soft

warePractice and Experience BC Yves Bekkers and Jacques Cohen editors In

February ternational Workshop on Memory Manage

CG Douglas W Clark and C Cordell Green mentnumb er in Lecture Notes in Com

An empirical study of list structure in LISP puter Science St Malo France September

Communic ations of the ACM SpringerVerlag

February

BDS HansJ Bo ehm Alan J Demers and Scott

Cha David Chase Garbage Col lection and Other

Shenker Mostly parallel garbage collection

Optimizations PhD thesis Rice University

In Proceedings of the SIGPLAN Confer

Houston Texas August

enceonProgramming Language Design and

Cha Craig Chamb ers The Design and Imple

Implementation PLD pages

mentation of the SELF Compiler an Opti

Bla Ricki Blau Paging on an ob jectoriented p er

mizing Compiler for an ObjectOrientedPro

sonal computer for Smalltalk In Proceedings

gramming Language PhD thesis Stanford

of the ACM SIGMETRICS Conferenceon

University March

Measurement and Modeling of Computer Sys

Che C J Cheney A nonrecursive list compact

tems Minneap oli s Minnesota August

ing algorithm Communications of the ACM

Also available as Technical Rep ort UCBCSD

Novemb er

University of California at Berkeley

CHO Will Clinger Anne Hartheimer and Erik Ost

Computer Science Division EECS August

Implementation strategies for continuations

In ConferenceRecord of the ACM Sym

Bob Daniel G Bobrow Managing reentrantstruc

posium on LISP and Functional Program

tures using reference counts ACM Trans

ming pages Snowbird Utah July

actions on Programming Languages and Sys

ACM Press

tems July

Cla Douglas W Clark Measurements of dynamic

Bo e HansJuergen Bo ehm Hardware and op er

list structure use in Lisp IEEE Transactions

ating system supp ort for conservativegar

on Software Engineering January

bage collection In International Workshop

on Memory Management pages Palo

CN Jacques Cohen and Alexandru Nicolau Com

Alto California Octob er IEEE Press

parison of compacting algorithms for garbage

collection ACM Transactions on Program

Bo e HansJuergen Bo ehm Spaceecientconser

ming L anguages and Systems

vative garbage collection In Proceedings of

Octob er

the SIGPLAN ConferenceonProgram

Coh Jacques Cohen Garbage collection of

ming Language Design and Implementation

linked data structures Computing Surveys

PLD pages

Septemb er

Bro Ro dney A Bro oks Trading data space for re

Col George E Collins A metho d for overlapping

duced time and co de space in realtime collec

and erasure of lists Communications of the

tion on sto ck hardware In ConferenceRecord

ACM Decemb er

of the ACM Symposium on LISP and

Cou Rob ert Courts Improving lo cality of refer

Functional Programming LFP pages

ence in a garbagecollecti ng memory manage

ment system Communications of the ACM

BW HansJuergen Bo ehm and Mark Weiser Gar

Septemb er

bage collection in an unco op erativeenviron

CU Craig Chamb ers and David Ungar Cus

ment SoftwarePractice and Experience

tomization Optimizing compiler technology

Septemb er

for Self a dynamicall ytyp ed ob jectoriented

BZ David A Barrett and Bejamin G Zorn Using

language In Proceedings of SIGPLAN

lifetime predictors to improve memory allo ca

pages

tion p erformance In Proceedings of the

CU Craig Chamb ers and David Ungar Making

SIGPLAN ConferenceonProgramming Lan

pure ob jectoriented languages practical In

guage Design and Implementation PLD

Paep ckePae pages

pages

CWB Patrick J Caudill and Allen WirfsBro ck

CDG Luca Cardelli James Donahue Lucille Glass A thirdgeneration Smalltalk implementa

man Mick Jordan Bill Kalso and Greg Nel tion In ConferenceonObject OrientedPro

son Mo dula rep ort revised ResearchRe gramming Systems Languages and Applica

p ort Digital Equipment Corp oration Sys tions OOPSLA Proceedings pages

tems Research Center Novemb er ACM Press Octob er

CWZ David R Chase Mark Wegman and F Ken Persistent Object Systems Marthas Vine

neth Zadeck Analysis of p ointers and struc yard Massachusetts Septemb er Mor

tures In Proceedings of the SIG gan Kaufman

PLAN ConferenceonProgramming Language

DTM Amer Diwan David Tarditi and Eliot Moss

Design and Implementation pages

Memory subsystem p erformance of programs

White Plains New York June ACM

with intensive heap allo cation Submitted for

Press

publication August

Daw Jerey L Dawson Improved eectiveness

DWH Alan Demers Mark Weiser Barry Hayes

from a realtime LISP garbage collector In

Daniel Bobrow and Scott Shenker Combin

ConferenceRecord of the ACM Sym

ing generational and conservative garbage col

posium on LISP and Functional Program

lection Framework and implementation s In

ming pages Pittsburgh Pennsylva

ConferenceRecord of the Seventeenth Annual

nia August ACM Press

ACM Symposium on Principles of Program

DB L Peter Deutsch and Daniel G Bobrow ming Languages pages San Fran

An ecient incremental automatic garbage cisco California January ACM Press

collector Communications of the ACM

ED John R Ellis and David L Detlefs Safe ef

Septemb er

cient garbage collection for C Technical

Del V Delacour Allo cation regions and imple Rep ort Digital Equipment Corp oration

mentation contracts In Bekkers and Cohen Systems ResearchCenter

BC pages

Ede Daniel Ross Edelson Smart p ointers Theyre

DeT John DeTreville Exp erience with concurrent smart but theyre not p ointers In USENIX

garbage collectors for Mo dula Techni C Conference USE pages Tech

cal Rep ort Digital Equipment Corp ora nical Rep ort UCSCCRL Universityof

tion Systems ResearchCenter Palo Alto Cal California at Santa Cruz Baskin Center for

ifornia August Computer Engineering and Information Sci

ences June

Det David L Detlefs Concurrent Atomic Gar

bage Col lection PhD thesis Dept of Com EV Steven Engelstad and Jim Vandendorp Au

puter Science Carnegie Mellon University tomatic storage management for systems with

Pittsburgh Pennsylvania Novemb er real time constraints In OOPSLA

Technical rep ort CMUCS Workshop on Garbage Col lection in Object

Oriented Systems OOP Position pap er

Det David L Detlefs Garbage collection and run

time typing as a C libraryInUSENIX FY Rob ert R Fenichel and Jerome C Yochelson

C Conference USE A LISP garbagecollector for virtualmemory

computer systems Communications of the

DLM Edsger W Dijkstra Leslie Lamp ort A J

ACM Novemb er

Martin C S Scholten and E F M Steens

Onthey garbage collection An exercise in GC Edward Gehringer

co op eration Communications of the ACM and Ellis Chang Hardwareassisted memory

Novemb er management In OOPSLA Workshop on

Memory Management and Garbage Col lection

DMH Amer Diwan Eliot Moss and Richard Hud

OOP Position pap er

son Compiler supp ort for garbage collection

in a staticallytyp ed language In Proceedings GG Ralph E Griswold and Madge T Griswold

of the SIGPLAN ConferenceonPro The Implementation of the Icon Program

gramming Language Design and Implementa ming Language Princeton University Press

tion pages San Francisco Califor Princeton New Jersey

nia June ACM Press

Gol Benjamin Goldb erg Tagfree garbage col

Dou Fred Douglis The compression cache Using lection for stronglytyp ed programming lan

online compression to extend physical mem guages In Proceedings of the SIGPLAN

oryInProceedings of Winter USENIX ConferenceonProgramming Language De

Conference pages San Diego Cali sign and Implementation PLD pages

fornia January

DSZ Alan Dearle Gail M Shaw and Stanley B Gre Richard Greenblatt The LISP machine In

Zdonik editors Implementing Persistent Ob DR Barstow HE Shrob e and E Sande

ject Bases Principles and PracticeProce wall editors Interactive Programming Envi ed

ronments McGraw Hill ingsoftheFourth International Workshop on

Gud David Gudeman Representing typ e informa gramming Sys ence on Object OrientedPro

tion in dynamicall ytyp ed languages Techni tems Languages and Applications OOP

cal Rep ort TR University of Arizona SLA pages Vancouver British

Department of Tucson Columbia Octob er ACM Press Pub

Arizona lished as SIGPLAN Notices Octob er

GZ Dirk

Grunwald and Benjamin Zorn CustoMallo c

HPR Susan Horwitz P Pfeier and T Reps De

Ecientsynthesized memory allo cators Soft

p endence analysis for p ointer variables In

warePractice and Experience

Proceedings of the SIGPLAN SIGPLAN

August

Symposium on Compiler Construction June

Published as SIGPLAN Notices

H Urs Holzle A fast write barrier for gener

ational garbage collectors In OOPSLA

Hud Paul Hudak A semantic mo del of refer

Workshop on Memory Management and Gar

ence counting and its abstraction In Con

bage Col lection OOP Position pap er

ferenceRecord of the ACM Symposium

on LISP and Functional Programming pages

Han Wilfred J Hansen Compact list representa

Cambridge Massachusetts August

tion Denition garbage collection and sys

ACM Press

tem implementation Communications of the

ACM Septemb er

JJ Niels Christian Juul and Eric Jul Compre

hensive and robust garbage collection in a

Hay Barry Hayes Using key ob ject opp ortunism

distributed system In Bekkers and Cohen

to collect old ob jects In Paep ckePae

BC pages

pages

JM Neil D Jones and Steven S Muchnick Flow

Hay Barry Hayes Finalizati on in the garbage

analysis and optimization of LISPlikestruc

collector interface In Bekkers and Cohen

tures In Steven S Muchnik and Neil D

BC pages

Jones editors Program Flow Analysis pages

HH Antony Hosking and Richard Hudson Re

PrenticeHall

memb ered sets can also

Joh Douglas Johnson The case for a read barrier

play cards In OOPSLAGC OOP Avail

In F ourth International ConferenceonArchi

able for anonymous FTP from csutexasedu

tectural Support for Programming Languages

in pubgarbageGC

and Operating Systems ASPLOS IV pages

HJ Reed Hastings and Bob Joyce Purify Fast

Santa Clara California April

detection of memory leaks and access errors

Joh Ralph E Johnson Reducing the latency

In USENIX Winter Technical Confer

of a realtime garbage collector ACM Let

ence pages USENIX Asso ciation

ters on Programming Languages and Systems

January

March

HL Lorenz Huelsb ergen and James R Larus A

KdRB Gregor Kiczales Jim des Rivieres and

concurrent copying garbage collector for lan

Daniel G Bobrow The Art of the Metaobject

guages that distingui sh immutable data In

Protocol MIT Press Cambridge Massachu

Proceedings of the Fourth ACM SIGPLAN

setts

Symposium on Principles and Practiceof

Paral lel Programming PPOPP pages

Kel Richard Kelsey Tail recursivestack dis

San Diego California May ACM

ciplines for an interpreter Available

Press Published as SIGPLAN Notices

via anonymous FTP from nexusyorkuca

July

in pubschemetxtstackgcps Slightly en

hanced version of Technical Rep ort NU

HMDW Richard L Hudson J Eliot B Moss

CCS College of Computer Science

Amer Diwan and Christopher F Weight

Northeastern University

A languageindep endent garbage collector

to olkit Coins Technical Rep ort

KKR David A Kranz Richard Kelsey Jonathan

University of Massachusetts Amherst

Rees Paul Hudak James Philbin and Nor

MA Septemb er

man Adams ORBIT An optimizing com

piler for Scheme In SIGPLAN Symp osium on HMS Antony L Hosking J Eliot B Moss and

Compiler Construction pages Palo Darko Stefanovic A comparative p erfor

Alto California June Published as mance evaluation of write barrier implemen

ACM SIGPLAN Notices July tations In Andreas Paep cke editor Confer

KLS Phillip J Ko opman Jr Peter Lee and McB J Harold McBeth On the reference counter

Daniel P Siewiorek Cache p erformance of metho d Communications of the ACM

combinator graph reduction ACM Trans Septemb er

actions on Programming Languages and Sys

McC John McCarthy Recursive functions of sym

tems April

b olic expressions and their computation by

Knu Donald E Knuth The Art of Computer

machine Communications of the ACM

Programmingvolume Fundamental Algo

April

rithms AddisonWesley Reading Massachu

Mey Norman Meyrowitz editor Conferenceon

setts

Object OrientedProgramming Systems Lan

Kol Elliot Kolo dner Atomic incremental garbage

guages and Applications OOPSLA Pro

collection and recovery for large stable heap

ceedings San Diego California September

In Dearle et al DSZ pages

ACM Press Published as SIGPLAN

Kra David A Kranz ORBIT An Optimizing

Notices Novemb er

Compiler For Scheme PhD thesis Yale Uni

Min Marvin Minsky A LISP garbage collector al

versityNewHaven Connecticut February

gorithm using serial secondary storage AI

Memo Massachusetts Institute of Tech

Lar R G Larson Minimizing garbage collection

nology Pro ject MAC Cambridge Massachu

as a function of region size SIAM Journal on

setts

Computing Decemb er

MN Pattie

LD Bernard Lang and Francis Dup ont Incremen

Maes and Daniele Nardi editors MetaLevel

tal incrementally compacting garbage collec

Architectures and Reection NorthHolland

tion In SIGPLAN Symposium on In

Amsterdam

terpreters and Interpretive Techniques pages

Mo o David Mo on Garbage collection in a large

SaintPaul Minnesota June

eRecordofthe Lisp system In Conferenc

ACM Press Published as SIGPLAN Notices

ACM Symposium on LISP and Func

July

tional Programming LFP pages

Lee Elgin Ho eSing Lee Ob ject storage and in

Mos J Eliot B Moss Addressing large distributed

heritance for SELF a prototyp ebased ob ject

collections of p ersistent ob jects The Mneme

orien ted programming language Engineers

pro jects approach In Second International

thesis Stanford UniversityPalo Alto Cali

Workshop on Database Programming Lan

fornia Decemb er

guages pages Glenedon Beach Ore

LFP ConferenceRecord of the ACM Sympo

gon June Also available as Techni

siumonLISPandFunctional Programming

cal Rep ort UniversityofMassachusetts

Austin Texas August ACM Press

Dept of Computer and Information Science

LFP ConferenceRecord of the ACM Sympo

Amherst Massachusetts

siumonLISPandFunctional Programming

Nil Kelvin Nilsen Garbage collection of strings

Nice France June ACM Press

and linked data structures in real time Soft

LH Henry Lieb erman and Carl Hewitt A real

warePractice and Experience

time garbage collector based on the lifetimes

July

of ob jects Communications of the ACM

NOPH Scott Nettles James OTo ole David Pierce

June

and Nicholas Haines Replicatio nbased in

LH Kai Li and Paul Hudak A new list com

cremental copying collection In Bekkers and

paction metho d SoftwarePractice and Ex

Cohen BC pages

perience February

NR S C North and J H Reppy Concurrent gar

LH James R Larus and Paul N Hilnger De

bage collection on sto ckhardware In Gilles

tecting conicts b etween record accesses In

Kahn editor ACM ConferenceonFunctional

Proceedings of the SIGPLAN Confer

Programming Languages and Computer Arch

enceonProgramming Language Design and

itecturenumb er in Lecture Notes in

Implementation PLD pages

Computer Science pages Springer

Lla Rene Llames PerformanceAnalysis of Gar

Verlag September

bage Col lection and Dynamic Reordering in a

NS Kelvin Nilsen and Willia m J Schmidt A Lisp System PhD thesis Department of Elec

highlevel overview of hardware assisted real trical and Computer Engineering University

time garbage collection Technical Rep ort of Illinois ChampaignUrbana Illinois

TR a Dept of Computer Science Iowa RM Christina Ruggieri and Thomas P Murtagh

State University Ames Iowa Lifetime analysis of dynamically allo cated ob

jects In ConferenceRecord of the Fifteenth

NS Kelvin Nilsen and Willi am J Schmidt

Annual ACM Symposium on Principles of

Costeective ob ject space management for

Programming Languages pages San

hardwareassisted realtime garbage collec

Diego California January ACM Press

tion ACM Letters on Programming Lan

guages and Systems December

Ros John R Rose Fast dispatchmechanisms for

sto ckhardware In Meyrowitz Mey pages

OA R R Oldeho eft and S J Allan Adaptive

exactt storage management Communica

Rov Paul Rovner On adding garbage collection

tions of the ACM May

and runtime typ es to a stronglytyp ed stati

ONG James OTo ole Scott Nettles and David Gif cally checked concurrent language Technical

ford Concurrent compacting garbage collec Rep ort CSL XeroxPalo Alto Research

tion of a p ersistent heap In Proceedings of the

Center Palo Alto California July

Fourteenth Symposium on Operating Systems

Scha Jacob T Schwartz Optimization of very high

Principles Asheville North Carolina Decem

level languagesI Value transmission and its

b er ACM Press Published as Operating

corollaries Journal of Computer Languages

Systems Review

OOP OOPSLA Workshop on Garbage Col

Schb Jacob T Schwartz Optimization of very high

lection in ObjectOriented Systems Octob er

lev el languagesI I Deducing relationships of

Available for anonymous FTP from

inclusion and memb ership Journal of Com

csutexasedu in pubgarbageGC

puter Languages

OOP OOPSLA Workshop on Memory Man

SCN W R Stoye T J W Clarke and A C

agement and Garbage Col lection Octob er

Norman Some practical metho ds for rapid

Available for anonymous FTP from

combinator reduction In ConferenceRecord

csutexasedu in pubgarbageGC

of the ACM Symposium on LISP and

Pae Andreas Paep cke editor Conferenceon

Functional Programming LFP pages

Object OrientedProgramming Systems Lan

guages and Applications OOPSLA

SH Peter Steenkiste and John Hennessy Tags

Pho enix Arizona Octob er ACM

and typ e checking in Lisp In Second In

Press Published as SIGPLAN Notices

ternational ConferenceonArchitectural Sup

Novemb er

port for Programming Languages and Operat

PLD Proceedings of the SIGPLAN Confer

ing Systems ASPLOS II pages Palo

enceonProgramming Language Design and

Alto California Octob er

ImplementationAtlanta Georgia June

Sha Rob ert A Shaw Empirical Analysis of a Lisp

ACM Press

System PhD thesis Stanford UniversityPalo

PLD Proceedings of the SIGPLAN Con

Alto California February Technical

ferenceonProgramming Language Design

Rep ort CSLTR Stanford University

and ImplementationToronto Ontario June

Computer Systems Lab oratory

ACM Press Published as SIGPLAN

Sob Patrick G Sobalvarro A lifetimebased gar

Notices June

bage collector for LISP systems on general

PLD Proceedings of the SIGPLAN Confer

purp ose computers BS thesis Massachu

enceonProgramming Language Design and

setts Institute of Technology EECS Depart

Implementation Albuquerque New Mexico

t Cambridge Massachusetts men

June ACM Press

Sta Thomas Standish Data StructureTech

PN J L Peterson and T A Norman Buddy

niques AddisonWesley Reading Massachu

systems Communications of the ACM

setts

June

PS CJ Peng and Gurindar S Sohi Cache Sta James Willia m Stamos A large ob ject

memory design considerations to supp ort lan oriented virtual memory Grouping strate

guages with dynamic heap allo cation Tech gies measurements and p erformance Tech

nical Rep ort Computer Sciences Dept nical Rep ort SCG XeroxPalo Alto Re

University of Wisconsin Madison Wisconsin searchCenter Palo Alto California May

July

Sta James William Stamos Static grouping WDH Mark Weiser Alan Demers and Carl Hauser

of small ob jects to enhance p erformance of The p ortable common runtime approachto

eedings of the Twelfth a paged virtual memory ACM Transac interop erabili tyInProc

Symposium on Operating Systems Principles tions on Programming Languages and Sys

Decemb er tems May

WF David S Wise and Daniel PFriedman The

Ste Guy L Steele Jr Multipro cessin g compacti

onebit reference count BIT

fying garbage collection Communications of

Septemb er

the ACM Septemb er

WH Paul R Wilson and Barry Hayes The

Ste C J Stephenson Fast ts New metho ds for

OOPSLA Workshop on Garbage Collection in

dynamic storage allo cation In Proceedings of

Ob ject Oriented Systems organizers rep ort

the Ninth Symposium on Operating Systems

In Jerry L Archibald editor OOPSLA

Principles pages Bretton Wo o ds New

Addendum to the Proceedings pages

Hampshire Octob er ACM Press Pub

Pho enix Arizona Octob er ACM Press

lished as Operating Systems Review Oc

Published as OOPS Messenger Octob er

tob er

Ste Peter Steenkiste Lisp on a Reduced

Whi Jon L White Addressmemory management

InstructionSet Processor Characterization

for a gigantic Lisp environment or GC con

and Optimization PhD thesis Stanford Uni

sidered harmful In LISP Conference pages

versityPalo Alto California March

Redwo o d Estates California Au

Technical Rep ort CSLTR Stanford

gust

University Computer System Lab oratory

Wil Paul R Wilson Some issues and strategies

Str Bjarne Stroustrup The evolution of C

in heap management and memory hierarchies

to In USENIX C Workshop

In OOPSLAECOOP Workshop on Gar

pages USENIX Asso ciation

bage Col lection in ObjectOriented Systems

Sub Indira Subramanian Managing discardable

Octob er Also app ears in SIGPLAN No

pages with an external pager In USENIX

tices January

Mach Symposium pages Mon terey

Wil Paul R Wilson Op erating system supp ort

California Novemb er

for small ob jects In International Workshop

on Object Orientation in Operating Systems

TD David Tarditi and Amer Diwan The full

pages Palo Alto California Octob er

cost of a generational copying garbage collec

IEEE Press Revised version to app ear

tion implementation Unpublished Septem

in Computing Systems

b er

Wis Da vid S Wise Design for a multipro

UJ David Ungar and Frank Jackson Tenuring

cessing heap with onb oard reference count

p olicies for generationbased storage reclama

ing In Functional Programming Languages

tion In Meyrowitz Mey pages

and Computer Architecture pages

Ung David M Ungar Generation scaveng

SpringerVerlag Septemb er Lecture

ing A nondisruptive highp erformance stor

Notes in Computer Science series no

age reclamation algorithm In ACM SIG

Wit P T Withington How real is real time gar

SOFTSIGPLAN Software Engineering Sym

bage collection In OOPSLA Workshop

posium on Practical Software Development

on Garbage Col lection in ObjectOriented

Environments pages ACM Press

Systems OOP Position pap er

April Published as ACM SIGPLAN No

tices May

WJ Paul R Wilson and Mark S Johnstone Truly

realtime noncopying garbage collection In

USE USENIX Asso ciation USENIX C Confer

OOPSLA Workshop on Memory Manage

encePortland Oregon August

ment and Garbage Col lection OOP Ex

Wan Thomas Wang MM garbage collector for

panded version of workshop p osition pap er

C Masters thesis California Polytechnic

submitted for publicatio n

State University San Luis Obisp o California

WLM Paul R Wilson Michael S Lam and

Octob er

Thomas G Moher Eective staticgraph re

WB Paul R Wilson and V B Balayoghan Com organization to improve lo cality in garbage

pressed paging In preparation collected systems In Proceedings of the

SIGPLAN ConferenceonProgramming Lan Zor Benjamin Zorn The measured cost of conser

guage Design and Implementation PLD vative garbage collection SoftwarePractice

pages Published as SIGPLAN No and Experience

tices June

WLM Paul R Wilson Michael S Lam and

Thomas G Moher Caching considerations

for generational garbage collection In Con

ferenceRecord of the ACM Symposium

on LISP and Functional Programming pages

San Francisco California June

ACM Press

WM Paul R Wilson and Thomas G Moher De

sign of the Opp ortunistic Garbage Collec

tor In ConferenceonObject OrientedPro

gramming Systems Languages and Applica

tions OOPSLA Proceedings pages

New Orleans Louisiana ACM

Press

WW Charles B Weinsto ck and William A Wulf

Quickt an ecient algorithm for heap stor

age allo cation ACM SIGPLAN Notices

Octob er

WWH Ifor W Williams Mario I Wolczko and

Trevor P Hopkins Dynamic grouping in an

ob jectoriented virtual memory hierarchyIn

Eur opean Conference on Object OrientedPro

gramming pages Paris France June

SpringerVerlag

YS Akinori Yonezawa and Brian C Smith ed

itors Reection and MetaLevel Architec

ture Proceedings of the International Work

shop on New Models for SoftwareArchitecture

Tokyo Japan Novemb er Research

Institute of Software Engineering RISE and

InformationTechnology Promotion Agency

Japan IPA in co op eration with ACM SIG

PLAN JSSST IPSJ

Yuaa Taichi Yuasa The design and implementation

of Kyoto Common Lisp Journal of Informa

tion Processing

Yuab Taichi Yuasa Realtime garbage collection

on generalpurp ose machines Journal of Sys

tems and Software

Zor Benjamin Zorn Comparative Performance

Evaluation of Garbage Col lection Algorithms

ersity of California at Berke PhD thesis Univ

ley EECS Department December

Technical Rep ort UCBCSD

Zor Benjamin Zorn Comparing markandsweep

and stopandcopy garbage collection In

ConferenceRecord of the ACM Sympo

sium on LISP and Functional Programming

LFP pages