Pointer Swizzling at Page Fault Time Eciently and Compatibly



Supp orting Huge Address Spaces on Standard Hardware

Paul R Wilson and Sheetal V Kakkad

Department of Computer Sciences

University of Texas

Austin Texas

wilsoncsutexasedu

I could be bounded in a nutshel l and hardware can sp ecify directly Applications of large

count myself king of innite space address spaces include distributed shared memories

eg Li op erating systems with a single shared

W Shakesp eare Hamlet I Iii address space eg CLLBH and p ersistent ob ject

+

stores eg ABC DSZ Distributed shared

memories provide a single address space for appli

Abstract

cations that span multiple machines shared address

space op erating systems provide a single addressing

Pointer swizzling at page fault time is a novel

mo del for all pro cesses on one or more machines and

address translation mechanism that exploits conven

p ersistent ob ject stores provide sharable recoverable

tional address translation hardware It can supp ort

heap storage to eliminate the use of les for most pur

huge address spaces eciently without long hardware

p oses

addresses such large address spaces are attractive for

p ersistent ob ject stores distributed shared memories

A common feature of these systems is the empha

and shared address space op erating systems This

sis on simplifying programming by preserving p ointer

swizzling scheme can b e used to provide data com

semanticsthat is ob ject identity The programmers

patibility across machines with dierent word sizes

default view of data is of a heap of ob jects connected

and even to provide binary co de compatibility across

by p ointers Programs may traverse p ointers freely

machines with dierent hardware address sizes

and complex data structures may b e passed by refer

Pointers are translated swizzled from a long for

ence

mat to a shorter hardwaresupp orted format at page

It most current systems without large shared ad

fault time No extra hardware is required and no con

dress spaces it is often necessary to write routines

tinual software overhead is incurred by presence checks

to atten data structures into a lowlevel linear form

or indirection of p ointers This pagewise technique ex

and reconstruct them later in order to communicate

ploits temp oral and spatial lo cality in much the same

them from one machine to another or to make them

way as a normal virtual memory this gives it many

recoverable in the case of a crash or to save them

desirable p erformance characteristics esp ecially given

on disk so that they can b e op erated on later p er

the trend toward larger main memories It is easy

haps by a dierent program This tedious and error

to implement using common compilers and op erating

prone co ding is a large software development problem

systems

in conventional systems The explicitly programmed

conversions typically lose most typ e information as

well as p ointer semantics they bypass typ e systems

Huge Address Spaces

leaving consistency entirely up to the

programmer

It is often desirable to supp ort a larger virtual mem

Shared and p ersistent memory systems avoid much

ory address space than the word size of the available

of this diculty but they themselves p ose challenges

for system implementors One problem in implement



Pro c Intl Workshop on Ob ject Orientation

ing such large memories is that it involves addressing a

in Op erating Systems Paris France Sept

huge numb er of ob jects p otentially more than can b e pp IEEE Press

1

sp ecied by the hardwares address bits Schemes for them

supp orting large virtual addresses on normal hardware

eg LOOM Kae Sta E WD and Mneme

Mos typically incur signicant overhead

Address Translation at PageFault

Two approaches are commonly used to implement

Time

large address spaces in software One is to indirect

p ointers through an object table and translate large

Our approach is to fault pages into conventional vir

identiers into ob ject table osets when ob jects are

tual memory on demand translating long p ersistent

brought into memory Untranslated identiers for

memory addresses into normal hardwaresupp orted

2

which there isnt a corresp onding table entry may

addresses at pagefault time Wil This strategy

b e agged and translated into osets lazily

exploits lo cality of reference in much the same way

as a normal virtual memory we b elieve this pagewise

The other main approach is pointer swizzling

approach is increasingly attractive as main memories

Rather than using an indirection through a table

grow larger As the numb er of instructions executed

longformat p ointers are converted to actual hardware

b etween page faults go es up the cost of address trans

supp orted addresses in an incremental fashion

lation can b e amortized across more useful program

In a conventional p ointer swizzling scheme a p er

work In contrast software schemes involving pres

sistent p ointer is converted into a virtual memory ad

ence checks and indirections do not scale as nicely

dress only when the running program tries to use it

most of their address translation cost is tied directly

this entails bringing the ob ject into the transient mem

to the rate of program execution

ory address space This may involve checking each

Any incremental faulting scheme must somehow de

p ointer at each use to see if it is an actual address

tect references to ob jects in p ersistent memory so that

White and Dewitt refer to this as swizzling on dis

they can b e copied into virtual memory b efore b eing

covery WD it is also p ossible to swizzle all of

op erated on We cho ose to use existing virtual mem

the p ointers in an ob ject at once the rst time it is

ory hardwares pagewise access protection capability

touched Mos In either case p ointer elds or ob

for this This allows the checks to o ccur in parallel as

jects must b e checked very frequently to see if theyve

part of the normal functioning of the virtual memory

b een swizzled yet so that unswizzled p ointers can b e

system and avoids continual overhead in software

swizzled b efore b eing traversed

Our approach is analogous to that of App el El

We would like to avoid these costs so that pro

lis and Lis incremental garbage collection scheme

grams that do not access p ersistent data do not pay

AEL Their system copies live reachable ob jects

the freight and so that programs that access p ersis

incrementally from fromspace to tospace to sepa

tent ob jects or pages many times do not incur addi

3

rate them from garbage ob jects ours relo cates pages

tional checking costs at every access Ideally we would

of ob jects from p ersistent memory transient memory

like this mechanism to op erate eciently on standard

so that they may b e directly addressed Still the basic

hardware rather than requiring an exotic ob ject

principles of op eration are the same

oriented hardware memory hierarchy such as that

To simplify the explanation of this technique we

of the MUSHROOM pro ject WWH or capability

will assume for the moment that ob jects in virtual

based addressing schemes eg Lev Ros

memory are the same size and shap e memory layout

For simplicity much of this pap er is cast in terms

as p ersistentmemory ob jects Later we will explain

of p ersistent memories b ecause our current implemen

how mismatches b etween representations eg one

tation fo cus is on p ersistent ob ject stores The pap er

1

This terminology is appropriate in terms of our address

is actually ab out address space implementation more

translation scheme as welllong addresses conceptually endure

generally however the ideas are equally applicable to

over time but they may b e mapp ed in changing transient

op erating systems and distributed shared memories

ways to dierent virtual memory addresses

2

A similar approach develop ed indep endently app ears to

We sometimes refer to conventional hardware

b e used in Ob jectStore LLOW a commercial pro duct from

supp orted virtual memory as transient memory dis

Ob ject Design Inc Few details of their scheme are available

tinguishing it from a larger persistent memory Con

however

3

In a copying garbage collector unreachable garbage data

ventional virtual memories are transient in that an

ob jects are reclaimed implicitlyrather than nding the gar

address space ceases to exist when the heavyweight

bage the live reachable ob jects are moved copied into

pro cess it b elongs to terminates Persistent memories

another area of memory tospace and the obsolete area

fromspace is then reclaimed in its entirety like le systems may outlive the pro cesses that create

the p ersistent store any pages it holds entry p oint vs twoword p ointer elds can b e handled

ers into must b e relo cated into virtual memory and

Our scheme relo cates ob jects into transient memory

accessprotected This allows the entry p ointers to b e

somewhat so oner than a straightforward software

translated into machine format so that the program

only p ointer swizzling scheme This allows us to pre

can b egin execution The second part of the gure serve one essential constraintthe running program is

shows this state with page A app earing in virtual never allowed to encounter any p ointers into p ersistent

0

memory as page A

memory Any pages that contain p ersistent p ointers

must b e accessprotected If the program attempts to

When the program attempts to access such a page

access a page that may contain p ersistent p ointers a

0

in this case A the accessprotection handler trans

trap handler is invoked it translates all of the p er

lates al l of the p ointers in the page into the desired

sistent p ointers in that page into transient p ointers

format For all of the p ointers to b e translated it

relo cating their referents as needed the page is then

must b e known where in memory the referredto ob

unprotected and the program may safely resume

jects will reside That is if the program accesses page

Rather than relo cating particular objects referred

0 0

A and page A contains a p ointer into page B page

to by p ointers in the faultedon page as the App el

0

B must b e relo cated into transient memory as B

Ellis and Li garbage collector do es our scheme relo

0

This allows p ersistent p ointers from page A into page

4

cates whole pages those ob jects are lo cated in This

B to b e translated into absolute virtual memory ad

reduces the size of tables required to hold mappings

0 0

dresses in B so that page A can b e made accessible

b etween transient and p ersistent addressesonly the

to the program We simply treat each p ointer in the

page numb ers must b e recorded not individual ob

faultedon page the same way we treated the entry

jects It also comp orts well with page faultingwe

p ointer This pro cess rep eats whenever a p ointer into

b elieve it is desirable to bring ob jects into memory a

a protected page is traversed advancing the pagewise

whole page at a time anyway caching pages is usu

read barrier of protection one step ahead of the pro

ally more attractive than ob ject faulting when main

grams actual access patterns

memories are not small Sta Wil WD

us o ccurs in a sort of wave

Entry Relo cation of pages th

t just ahead of the running program insulating

Pointer fron

it from untranslated p ointers Pointers can b e derefer

ay ie in a single machine instruc

A B C D enced in the usual w

tion with no extra overhead except at page faults

AAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAA

Entry AAAA

AAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAA

Pointer AAAA

AAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAA A'

i) State of transient memory after translating entry pointer

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Moving Data Lazily

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Entry AAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Pointer AAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA A' D' B'

ii) Transient memory after traversing entry pointer and touching page A'

Note that since the pages are accessprotecte d at

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

relo cation time we can simply reserve the space and

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

Entry AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

Pointer AAAA

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

note the mapping without actual ly copying their con

AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A' D' B' C'

into transient memory The actual data can

iii) After traversing pointer from page A' and touching page B' tents

b e moved more lazily when the program rst at

tempts to touch the page The recording of p ersistent

Figure Incremental faulting and p ointer swizzling

totransient page mappings thus runs one p ointer

ahead of the programs access patterns but the

Figure illustrates this mechanism The top part of

movement of actual data into the p ersistent store is

the gure shows some data ob jects in pages of the p er

just as lazy as with a normal demandpaged virtual

sistent store with p ersistentformat ie long p oint

memory Since reservedbutuntouched pages are ac

ers b etween them When a program is given access to

cess protected anyway they neednt have any storage

4

Thanks to Ralph Johnson for this our original conception

b ehind them at allnot even swap disk Establishing

adhered to o slavishly to the App elEllisLi mo del and relo cated

a p ersistenttotransient page mapping amounts to a

individual ob jects Johnson had devised a somewhat similar

promise that well put a page there if and only if its

scheme from which we got the idea to relo cate whole pages

instead Johnson p ersonal communication needed

system ab out times as many pages may b e re Handling the Pointer Size Mismatch

served as are actually touched While this is unlikely

for most programs it is conceivable and in fact rather

Since the contents of a page need not b e translated

nearfetchedpages holding multiway no des may

at the time it is relo cated but only if it is accessed and

approximate the worst case

its p ointers are translated to the normal hardware

To avoid using up the virtual address space we have

supp orted size ob ject formats in p ersistent memory

two strategies Neither is currently implemented

may b e dierent form those in transient memory The

howeverwe currently have no applications requiring

main requirement is that a p ersistent page full of ob

them but exp ect to in the future The rst strategy

jects t into a transient page of memory once they

is to reduce the eective page size and slow the rate

have had their formats translated It is also necessary

of address space use This strategy may not b e en

to b e able to derive ob jects eventual lo cations in tran

tirely eective so we have also devised an algorithm

sient pages from their lo cations in p ersistent pages

for reclaiming virtual address space incrementally and

The most straightforward p ossibility is making ev

reusing it

ery p ointer in p ersistent memory b e twice the size of

a transient memory p ointer eg use a eld size of

Reducing the Eective Page Size

bits When a p ointer is swizzled into a bit native

p ointer it only uses half the eld This is not a big

Large page sizes are a p otential problem for our

cost for a statically typ ed language like C b ecause

5

scheme b ecause large pages may have a high fan

most elds are not p ointers The space cost is usually

out ie hold p ointers to many other pages We

only ten to twenty p ercent

can reduce the eective page size by only using part

of each virtual address page when allo cating ob jects

with large numb ers of p ointers For example if we

Avoiding Exhaustion of Virtual Ad

only use one fourth of each KB page we reduce the

dress Space

fanout by a factor of four A naive implementation of

this strategy would b e very wasteful of course so it is

A p otential problem with the basic scheme is that

desirable to avoid using actual RAM and disk storage

the transient memory could ll up with relo cated

sparsely

pages that are used for a while and then not used

A b etter strategy is to only use a fraction of each

again for a long time These pages could ll up the

virtual page but arrange the fragments in a com

virtual memory causing excessive paging This is ac

plementary pattern so that several virtual pages can

tually not much of a problem b ecause the pro cess of

share a physical RAM or disk page Supp ose we

swizzling address translation is nearly orthogonal to

wanted to implement KB fragments of KB pages

issues of levels in the storage hierarchyan inactive

we could map four virtual pages to a single physical

page can still b e paged out to backing disk It may

page and use a dierent quarter of each virtual page

b e paged to a swap area temp orarily or it may b e

for data While the physical page as a whole would

unswizzled and evicted back to the p ersistent store

have four aliases virtual page numb ers the nonover

The real problem then is not the exhaustion of

lapping pattern of allo cation would ensure that no ob

hardware memory but the exhaustion of the hardware

ject or cache blo ck was actually aliased

supported virtual address space This is not just a

To avoid fragmenting memory mappings and de

problem for programs that actually touch millions of

creasing virtual memory p erformance an entire range

pages b ecause touching one page may cause the re

of pages could b e aliased by four virtual memory ad

serving of several pages of the address space so that

dress ranges eg using the mmap system call avail

addresses can b e translated In the worst case a page

able in most mo dern versions of UNIX

contains nothing but p ointers causing the reserving

This solution is not entirely satisfactory for two

of as many pages as there are p ointersin our current

reasons First it do es not deal with ob jects whose

size is comparable to a page or larger Second while

5

Our original implementation for Scheme a dialect of Lisp

it wastes little or no physical storage it do es decrease

used a somewhat dierent approach b ecause most ob jects

the eectiveness of translation lo okaside buerseach

elds must b e large enough to hold p ointers In that implemen

tation al l elds were twice as big in the p ersistent store Wil

partiallyused virtual page requires its own virtualto

and p ersistent pages were twice as big as normal virtual mem

physical page mapping in the virtual memory system

ory pages This avoided wasting transient memory space at a

While it defers the exhaustion of the address space

cost in p ersistent data density while allowing simple conversion

from p ersistent addresses to virtual memory addresses in the sense of delaying the discovery and swizzling

Reclamation of pages can b egin after the program of p ointers it actually increases total address space

has run a while to recreate or revalidate all the map usage in the long run by decreasing the usable size

pings of its current working set Candidates for of each virtual page

reclamation are pages that have not b een referenced

This strategy is therefore most attractive for small

since the mass invalidation and which are not directly

p ointerrich ob jects exp ected to b e used in ways that

reachable from pages that have b een The reclamation

create highfanout pages In particular it should b e

p olicy should probably favor evicting pages that are

useful for the no des of trees used as large sparsely

only directly reachable from pages that havent b een

searched indices

touched for a long time To increase eciency in sys

tems where page fault traps are exp ensive the tran

Address Space Reuse

sient virtual memory systems recency information

might b e consulted and the most recentlytouched

The easy way to deal with the exhaustion of the

pages could b e assumed to b e part of the current

address space is simply to o ccasionally evict all of the

working set These pages would have their addresses

pages from virtual memory throw away the existing

recomputed or revalidated immediately rather than

mappings and then b egin faulting pages in again in

b eing accessprotecte d to avoid most of the urry of

6

the normal way Pages that are no longer in use

access protection faults immediately after the mass in

will not b e faulted in again but the current working

validation

set will b e restored quickly This incurs unnecessary

and bursty trac b etween the transient and p ersistent

stores however

Ob jects that Cross Page Boundaries

To avoid this we take advantage of the fact that

address translation and data caching are essentially

One p otential hitch in our scheme is that pages that

orthogonal We dont really have to write pages out to

if an ob ject crosses page b oundaries in the p ersistent

reclaim the corresp onding pages of the virtual address

store the corresp onding pages must b e adjacent in the

space Evicting pages from lo cal virtual memory

transient virtual memory as well If an ob ject strad

storage is easy clean pages can simply b e discarded

dling a page b oundary is not relo cated as a contigu

and dirty pages can b e unswizzled and written back

ous ob ject indexing to access its elds will not work

to the p ersistent store

prop erly Lazy copying is particularly helpful here

Rather than actually writing everything out then

address space must b e reserved for the whole ob ject

we can simply invalidate and incrementally rebuild the

but there neednt actually b e any physical RAM or

virtual memory mappings That is we pretend to

disk memory used for untouched pages

write out all of the data but we leave it cached lo

To supp ort this the language implementation must

cally and actually just accessprotect the pages We

somehow supp ort op erations that nd ob ject b ound

can then incrementally fault on them to build a new

aries andor maintain crossing maps to tell which

set of mappings If a page is faulted on and it is still

pages hold part of an ob ject continued from a previous

in lo cal storage RAM or disk so much the b etter

page These requirements are not much dierent from

its p ointers can simply b e reswizzled according to the

those of garbage collected systems that must p erform

current mappings in much the same way as when the

pagewise or cardwise op erations AEL WMb

page was originally faulted in The obsolete map

within the heap there is no ma jor diculty supp ort

pings could b e consulted and could even b e reused

ing such op erations eciently for languages like Lisp

in many cases If a page is not dirty ie written

or ML Slightly conservative versions of these schemes

to since it was faulted into transient memory then it

will work well for languages with derived p ointers and

contains no p ointers into pages that it did not previ

limited p ointer arithmetic in much the same way

ously contain p ointers into Lowfanout pages should

that conservative garbage collectors op erate with lan

thus have their referredto pages recorded at swizzling

guages like C BW The main mo dications are to

time

the allo cation and deallo cation routines which must

provide headers andor groupings andor alignment

6

Note that we cant just evict a page from the virtual address

restrictions to allow ob jects to b e identied

space b ecause we do not keep track of p ointer assignments

any page that in the virtual address space must b e assumed to

Large ob jects still p ose a p otential problem for our

have p ointers into it from other pages in the address space We

system in terms of exhaustion of the hardware ad

therefore cannot reuse that page we rebuild the mappingswe

dress space If a page is touched and it holds p ointers

have to traverse the graph of p ointers and rebuild the mappings

to nd out which pages are reclaimable to several large multipage ob jects space must b e

reserved for all of those ob jects pages even if they translate the p ointers in a page into the native format

are never touched Programs that deal with many of the receiving machine

very large ob jects may therefore b enet from a larger Pointer swizzling only requires that it b e easy to

hardware address space to decrease the frequency of nd the p ointers in a page and that it b e easy to con

address space reuse While we dont think that this is vert a large p ersistent p ointer into the hardware sup

a serious problem for most programs on most comput p orted format This is done by translating a the high

ers it is worth considering As the following section order bits page numb er to the shorter bit pattern

shows we can integrate machines which require large of the transient page numb er and adjusting the low

hardware addresses with those that dont and allow order bits that represent the oset within the page

the sharing of most data b etween them The simplest way of ensuring this ability is to have

the p ersistent data format b e the same as the tran

sient format so that the oset part of a p ointer do es

not change at all

Compatibility

This can b e done for multiple p ointer sizes by sim

ply leaving enough ro om for the largest hardware

We see p ointer swizzling at page fault time as part

of a general purp ose reconciliation layer that can b e supp orted p ointer size whether it is needed on all

used to structure systems very exibly at little p erfor machines or not So a bit eld can b e used on

bit machines and also on bit machinesbut only

mance cost Wil It can b e used to supp ort data

half of the eld is used for transient p ointers on

formats that allow exible sharing of data across ma

bit machines As mentioned ab ove the other half of

chines with dierent word sizes and even to supp ort

the eld go es to waste but this space cost is relatively

binary co de compatibility within families of machines

that have dierent address sizes These capabilities small

This is similar to the approach used in the Com

only require very slight changes to existing programs

mandos MG op erating system where ob ject iden

andor compilers

tiers are used on disk but they are swizzled to actual

Naturally any program that actually requires a

p ointers in memory Commandos do es not use page

huge at address space cannot b e run on the bit

machines for example programs that need at array fault time swizzling however and incurs overhead in

checking for unswizzled p ointers Using ob ject identi

indexing into multiple gigabyte arrays or programs

ers rather than p ersistent addresses also makes trans

whose lo cality is very bad on the gigabyte scale These

lations more exp ensive

programs are likely to require highend pro cessors any

way and b e executed on the larger machines By

Binary Co de Compatibility Across

and large however most programs and data could

Hardware Address Sizes

b e shared across dierentsized machines

We dont claim to have a panaceathe problems

Not only is it p ossible to dene compatible data for

of data sharing are dicult and deep in the general

mats that can b e used by programs running on hard

case On the other hand the trend toward standard

ware with dierent address sizes it is even p ossible to

numeric formats and byte addressing is encouraging

dene instruction set architectures ISAs so that the

bit integers bit IEEE oats and swizzled p oint

same compiled code can run on machines with dierent

ers would supp ort a much higher greatest common di

word sizes

visor than the currently ubiquitous streams of bytes

For architecture families like the current MIPS pro

cessors ie bit R and bit R or the

Address SizeIndep endent Data For

IBMAppleMotorola PowerPC a few instructions

mats

added to the bit machine can provide an addressing

For compatibility across dierent machines it may mo de that can b e reconciled with bit addressing by

b e desirable to have a single data format that can using p ointer swizzling In each of these families the

b e used irresp ective of the address word size of the bit instruction set architecture is a sup erset of the

machine op erating on the data This is particularly bit ISA The same register designations are used

attractive for a shared p ersistent store or a distributed but on the bit machine the registers are twice as

virtual memory long Backward compatibility is ensured by dening

By using p ointer swizzling to adjust p ointer sizes it most op co des in such a way that on the bit machine

is easy to accomplish this When pages are transferred they op erate on a bit register but the eect on the

from one machine to another it is only necessary to loworder bits is the same as the op eration on the

The implications for compilers are that a compat smaller machines bit registers Any program com

ibility mo de back end can easily b e implemented by piled for the bit ISA still runs on the bit ISA

slightly mo difying existing back ends The instruction b ecause it do esnt dep end on the values in the upp er

set should b e limited to instructions available on b oth halves of registers Some new op co des are added as

bit and bit versions of the architecturethat is well such as bit loads and stores

it should b e limited to the bit ISAexcept when

As describ ed previously data ob jects can b e laid

dealing with addresses bit elds should b e used

out in a compatible way with a bit p ointer eld

for addresses including stackallo cated variables and

only half of which is used on bit machines It is

bit op co des should b e emitted to op erate on them

only necessary to dene the bit machines op co des

The resulting co de should also work on bit archi

on the bit machines to do the right thing for the

tectures with p ointer swizzling at run time

actual hardware address size The op co des for bit

While the ab ove scheme would provide maximum

loads and stores on the bit machines should do

p erformance by simply aliasing existing op erations

bit loads and stores Equally imp ortant the loads and

with new op co des another variation could provide

stores should use bit register values for addressing

compatibility on an installed base of bit machines

The strategy here is to use the bit op co des as

Many machines provide for the fast trapping of unim

compatibility op co des which use whatever hard

plemented op co des so that they can b e emulated in

ware address size is b est on a particular machine

software This trapping could b e used to emulate

That is they b ecome bitsifyouvegotthem op

the compatibility op co des rather than actually hav

co des For compatibility with existing compilers and

ing a new op co de On bit machines bit op co des

libraries co de compiled to use bit addressing on

would trap to routines that simply executed the corre

bit machines using the plain bit op co des will

sp onding bit instructions While less ecient than

work on either machine in the normal way Co de com

hardware aliasing this would still allow the genera

piled to run on bit machines using p ointer swizzling

tion of compatible binaries using bit op co des as

will use bit op co des instead to load bit values

9

compatibility op co des

using bit addresses The p ointer swizzling scheme

will ensure that any p ointer eld the pro cessor can see

Using Existing Languages and Com

will in fact have a bit p ointer in the relevant half of

pilers

the word But the same op co des actually mean bit

loads and stores on bit machines and the p ointer

While p ointer swizzling at page fault time is obvi

swizzling scheme will ensure that those elds do hold

ously applicable to languages like Lisp and Smalltalk

bit addresses on those machines

which use tagged p ointers it can also b e used for

Its not actually quite this simple b ecause of the

stronglytyp ed languages such as Mo dula ML and

p ossibility of p erforming p ointer arithmetic on these

10

with slight restrictions C or C The success

addresses All that is necessary though is to use the

of conservative garbage collectors shows that most C

relevant arithmetic op co des in the same way as load

7

programs require little or no mo dication to meet the

and store op co des

11

necessary constraints Some compiler optimizations

One of the great attractions of this scheme is that

may mutilate p ointers b eyond recognition but there

it is not actually necessary to use any new op erations

12

are ways around this

A slight change to op co de deco ding will do the trick

9

bit instruction op co des which currently trap as il

In marketing terms this would allow the distribution of

compatible binaries that would run on the installed base with

legal instructions on a bit machine should instead

a twolevel upgrade strategy A cheap upgrade would consist

do the equivalent op eration but using bit address

of swapping the CPU chip for a pincompatible chip with the

ing and registers This should not break any exist

op co de aliases in hardware A more exp ensive upgrade would

ing bit co de and requires no change to the bit

b e a fullblown bit CPU

10

8

The main restriction is the avoidance of untagged unions

architecture

holding p ointers in the variant part Another is the avoid

7

In some architecture families bit machines will never at ance of storing intermediate values from p ointer arithmetic ex

tempt to actually do bit op erations so the bit op co des can pressions without retaining an actual p ointer to the ob ject

b e used as pseudonyms for bit op erations In other families BW Bo e

11

the same op co des are used anyway Either way those op co des Untagged unions are not very attractive in C b ecause

eectively b ecome compatibility mo de address arithmetic in of its ob jectoriented features and most p ointer arithmetic that

structions which do the right thing on either architecture breaks garbage collection or p ointer swizzling relies on un

8

Thanks to Rich Oehler and Keith Diefendor for p oint p ortable assumptions anyway

12

ing out a aw in an earlier version of this compatibility mo de The easy way is to forgo certain advanced optimizations

scheme and suggesting details of this one as is done by several systems that use C as an intermedi

zled ob jects to b e used in the same programs with a It is only necessary for heap allo cation routines to

few restrictions on how they may interact Persistence ensure that data ob jects within heap pages can b e

is a prop erty of individual ob jects not of classes recognized and that the p ointers to and within those

a p ersistent ob ject is simply one that is allo cated on ob jects can b e found The techniques for this are well

the p ersistent heap rather than the conventional tran understo o d having b een develop ed for the purp ose of

sient heap garbage collecting staticallytyp ed languages

We have found this to b e extremely convenient in

It is even p ossible to do p ointer swizzling for pro

the pro cess of developing our p ersistent storewe can grams compiled with standard otheshelf compilers

intermix conventional C co de and p ersistent C It is only necessary to treat the stack conservatively

co de even linking to existing binaries It is not nec

as is done by garbage collectors develop ed for othe

essary to recompile libraries for example as long as

shelf compilers BW Bar Det WH Det

transient ob jects are not exp ected to p ersist or sur

Any bit pattern within the stack which could rep

vive across crashes Transient ob jects may hold p oint resent a p ointer must b e conservatively assumed to

ers to p ersistent ob jects and vice versa as long as be a p ointer The p ointedto page is then pinned

they follow a few simple rules SKW Naturally

in the transient address space b ecause the page

more alternatives are p ossible if the source is available

cant b e relo cated without invalidating the p ossible

In that case it can b e run through a precompiler to

p ointer This do es not require the page to actually re

record ob ject layouts and linked with our heap allo main in transient memoryjust that its p ersistentto

cator this allows all ob jects to b e made p ersistent if transient mapping for a particular transient memory

13

desired

not b e changed

Interestingly it is p ossible to cast p ointers to inte

gers and back to pointers again without breaking the

p ointer swizzling system If p ointer casting to an inte

Truly Enormous Address Spaces

ger is implemented as an unswizzling into the p ersis

tent address bit pattern and if casting an integer into

In the scheme describ ed ab ove the size of p ointer

a p ointer swizzles it according to the prevailing page

elds is determined primarily by the size of hardware

14

mappings then all is well This requires the use of

supp orted virtual addresses We currently use a

an integer long enough to hold the p ersistent form of

bit eld b ecause it is large enough to hold either a

the address which is probably ne if only a bit ad

bit or a bit address The size of p ersistent ad

dress space is required It can break copying garbage

dresses can b e arbitrarily large however we b elieve it

collectors however in this resp ect the requirements

will b e desirable to have or bit addressing in

of the swizzling system are actually weaker than those

the future This is easy to do with p ointer swizzling

15

of a copy collector

at page fault time

To supp ort enormous address spaces it is only nec

Linking to Existing Binaries

essary that the p ersistent ob ject format b e able to

hold the maximumsized p ointers The uniform

Because p ointer swizzling at page fault time re

bit transient p ointer elds can still b e supp orted but

quires no changes to ob jects data formats or the co de

pages may take up more space in p ersistent memory

that manipulates them it allows swizzled and unswiz

than in a pro cessors virtual memorythat is the size

of a page may change when it is swizzled Variable

ate languagebut this incurs a small p erformance p enalty A

sized p ersistent pages complicate page lo okup b ecause

b etter technique in the long term is to ensure that compilers

they complicate the mapping of conceptual pages to

dont violate the invariants unnecessarily Bo e or that they

record clarifying information when they do DMH WH

disk blo cks We do not b elieve that this is a serious

BMBC we b elieve compilers should retain this kind of in

problem in the long run however b ecause of current

formation anyway to supp ort deoptimizing in sourcelevel

trends in the evolution of le systemsnamely block

debuggers eg HCU

13

mobility and compressed storage

It is not dicult to swap a page from one machines address

space to anothers and back The pages mappings to transient

Block mobility is increasingly imp ortant as in the

pages do not change in either space but when a page is trans

case of log structured le systemsthat is the lo ca

ferred the p ointers in it are unswizzled from the source space

tion of a blo ck may change over time to maximize

representation and reswizzled into the destination space repre

eective write bandwidth RO and p ossibly adapt

sentation It might b e p ossible to make other format changes at

the same time such as endianness reversal when dealing with

15

dierent machine architectures

Our attitude toward address space might b e summed up as

14

Thanks to David Chase for suggesting this implementation crunch all you wantwell make more

16

to observed patterns of disk usage to reduce read la ther of these systems

tencies Gri Wil

At a higher level of abstraction p ointer swizzling at

page fault time is really a compression of the stream

Compressed storage is likely to b ecome increasingly

of addresses going by on the address bus and also

imp ortant whether at the level of le storage or as

those within the pagewise wavefront around it It is

part of the virtual memory system AL Wil The

therefore philosophically akin to dynamic base register

problems of variablesized units of storage are thus

caching FP but entirely dierent in its implemen

likely to b e solved anyway for entirely dierent rea

tation Our scheme incrementally constructs a com

sons Compressed storage is attractive for reducing

pression table mapping long p ersistent page numb ers

storage costs and for reducing average latency by

to short transient ones which is rebuilt when its al

storing more data in fast memory andor decreas

phab et is exhausted that is when the transient page

ing communication costs If compression is used

numb ers are used up

long p ersistent addresses should typically take up con

The pagewise wavefront of compression mappings is

siderably less space than the inmemory format

a purely pagewise analogue of Bakers read barrier

presumably the actual information content will not

technique for traversing ob jects in incremental tracing

b e much higher despite the very horizontal p ointer

garbage collection Bak Bak Wil Our page

format

wise read barrier is an variation of App el Ellis and

While such compressed storage techniques are b e

Lis incremental garbage collection algorithm which

yond the scop e of this pap er but see WLM

uses a pagewise wavefront with ob jectwise relo cation

Wil we would like to make one observation ab out

why these problems are not as dicult as they may

seem at rst glance it is not necessary to store the

Discussion

compressed pages contiguously Pages only need to b e

contiguous and the right size when they are made

accessible to a program so that hardware addressing

Pagefaulttime swizzling is essentially a RISC ad

works correctly When stored in compressed format

dressing scheme exploiting the common cases of pro

they may b e broken into smaller blo cks to reduce frag

grams memory referencing b ehavior ie mostly re

mentation problems

p eated touches to a subset of all pages it provides

extremely high p erformance in the usual casewith

no impact on cycle timeand traps to software in the

unusual cases Ironically it could have b een used at

Related Work

any time in the last twenty or thirty years to avoid

the hardware and software wo es of segmented archi

tectures

While there are quite a few p ersistent storage sys

It can b e argued that while it would have b een a

tems see eg DSZ to the b est of our knowl

very go o d idea in the past it will so on b e unneces

edge only one uses virtual memory techniques to al

sary b ecause bit hardware addressing will b ecome

low p ointer swizzling at page fault time avoiding con

ubiquitous over the next few years We think that its

tinual runtime overhead and supp orts a very large

still an excellent idea for several reasons the large

address space That is Ob jectStore a commercially

installed base of narrow machines the p ossibility that

available system LLOW from Ob ject Design Inc

narrow machines may b ecome necessary again if hard

Their system was designed indep endently of ours and

ware trends change and the likeliho o d that even

apparently predates it We b elieve that it op erates by

bits will not b e enough We also think its simply the

similar means but no details have b een published

right way to lo ok at memory freeing the notion of a

Our scheme also b ears some resemblance to the

p ointer from its realization as a hardware address

Moby address space designed and implemented by

The current trend toward bit architectures may

Richard Greenblatt at Lisp Machines Incorp orated

well b e irreversible and not as exp ensive as it might

Moby used pagewise relo cation but exploited the

seem at rst glance b ecause most of the costs of a chip

tagged architecture of the Lisp machine to swizzle

are not directly prop ortional to address width Still

p ointers one at a time rather than translating all of

we b elieve our scheme to b e valuable even in a world

the p ointers in a page with a single page protection

dominated by bit machines

fault A similar scheme designed and implemented

16

by Bob Courts was later used on the TI Explorer

Thanks to David Mo on Richard Greenblatt George Car

rette and Doug Johnson for information ab out these systems Apparently no pap ers were ever published ab out ei

bit machines will b e manufactured for quite a Even with compression a relatively small numb er of

while if only for palmtop computers and emb edded cameras could achieve that rate of data capture We

17

applications Their small address spaces should not also b elieve that the uncompressed image should b e

prevent them from sharing a truly global address space mapp ed into memory even if the actual storage is in

with their larger cousins p erhaps over infrared links compressed form A single lo callynetworked site may

within a building or via a cellular telephone network therefore capture hundreds of megabytes of raw data

Pointer swizzling can extend also the life of bit ar p er second an international network of several thou

chitectures indenitely even if many bits are stolen sands such sites could exhaust a bit address space

from the address words and devoted to bit addressing in a single year

pro cessor IDs typ e tags and so on

We also b elieve that it may b e desirable to adopt

As a reconciliation layer swizzling can allow an in

a noreuse p olicy toward virtual addressesthat is

denite numb er of machines say all the computers

when an ob ject is deallo cated its long conceptual

in the world to share an indenitely large address

address should not b e reused Just as CLLBH ad

space without each having to map p otentially shared

vo cates the separation of addressing from protection

pages into the same place in each hardwaresupp orted

we advo cate the separation of addressing and protec

address space And even the lowly emb edded bit

tion from storage Reusing memory should not neces

singlechip computers of the coming decade should b e

sarily entail reusing part of the global address space

able to share data with the largest bit compute

even within a single thread of computation

servers And if current trends change toward lower

This separation of concerns could have several b en

density highersp eed materials eg GaAs or Joseph

ets One would b e that debugging longlived pro

son junctions word size may again b ecome critical

cesses would b e much easier b ecause distinct ob jects

and we can provide a graceful downgrade path

would have distinct addresses over timean address

In CLLBH Chase et al argue that bit com

would not b e reused for a new ob ject This would

puters eliminate the need to play tricks with address

b e esp ecially desirable in the context of a timetravel

spaces either in the conventional way of interpreting

debugging system WMa FB TA

addresses dierently for each pro cess or by p ointer

swizzling They p oint out that even if storage is used

Along with this ability to distinguish distinct ob

at a rate of megabytes p er second it will take sev

jects over time comes the p ossibility of detecting most

eral thousand years to use up a bit address space

uses of dangling p ointers If empty virtual pages are

64

is billion billion and thats a very large numb er

not reused they can simply b e access protected and

While we agree that a single global address space

have their storage reclaimed An attempt to derefer

is very desirable and greatly simplies the design of

ence a stale p ointer will then cause an access protec

sophisticated systems we dont b elieve that bits

tion fault rather than silently op erating on a com

18

is enough in the long run or that the global ad

pletely dierent ob ject

dress space should b e equated with conventional vir

Many other uses of enormous address spaces are

tual memory addressing We also disagree with the

p ossible and the cheap availability of indenitely large

contention that p ointer swizzling interferes with shar

spaces will probably encourage the development of

ing and protection dierent protection domains on a

many more We are particularly interested in the

no de can share data pagesand swizzling mappings

lazy construction of large memorymapp ed data struc

while using dierent virtual memory protections for

tures whose incremental materialization is triggered

the swizzled pages

by accessprotection faults For example programs

We b elieve it is p ossible and even likely that bit

might see an area of address space as a gigabyte ar

address spaces will b e exhausted in the foreseeable fu

ray but the elements of the array would b e computed

ture Eventually computers will encompass multime

on an asneeded basis triggered by actual memory

dia systems and for example highdenition digital

referencing

video data may b e mapp ed into memory HDTV infor

mation rates will b e tremendousconsider megapixels

p er frame in color at tens of frames p er secondand

18

This will not detect all uses of dangling p ointers b ecause

MB p er second ceases to seem all that unusual

the page cannot b e access protected until it is completely empty

17

We b elieve that the clustered births and deaths of ob jects The recent intro duction of new subnoteb ook computers

Hay will frequently cause entire pages to b ecome empty based on bit CPUs suggests that it will b e quite a while

however small pages andor subpage protections would also b efore bit machines take over the entire market Even bit

make this more eective machines could b e supp orted with our scheme

Current Status and Future Work part numb er with the index implemented as an AVL

20

tree Each part ob ject also has p ointers to several

Our initial prototyp e p ersistent store for Scheme

other parts that it is conceptually connected to and

Wil lacked recovery features and a b o dy of data

a p ointer to a set ob ject that holds backp ointers to ob

intensive programs to truly stress it so was little

jects that are connected to it There is lo cality in the

more than an existence pro of showing that it is p os

connections based on mostly nearby part numb ers

sible and in fact easy to p erform address trans

and some randomness as well

lation at page fault time Performance problems

The standard b enchmark searches the database

with our byteco ded Scheme system also prevented

several times in the following way a part numb er

us from precisely measuring the overheads of p ointer

is selected at random and then its connections to

swizzlingthe swizzling cost were to o low compared

other parts are traversed recursively to nd three other

to the overhead of interpretation

parts This traversal continues transitively for seven

We have recently constructed a new highly

levels with a branching factor of three touching a

p ortable p ersistent store for C called Texas

total of connected parts

SKW The current implementation running on

The graph shows the times to p erform successive

unmo died ULTRIX incurs unnecessary costs due to

traversals each starting from a dierent randomly

diculties in avoiding allo cating backing store and

selected part and tracing out its connections

bypassing le system caching Despite these aws

Our p ersistent C system Texas outp erforms

we have run a few b enchmarks that suggest that our

nonp ersistent C in most of these early iterations

19

p ointer swizzling techniques p erform well

which should not b e p ossible but this is apparently an

artifact of the fetch p olicy used in the underlying le

10

system We fetch faultedon pages from the p ersistent

9

h is implemented as le data C simply C++ store whic 8 Texas faults them in from virtual memory as KB pages 7 6 0.2 5 4 C++ 3

seconds per iteration 0.15 Texas 2 1 Hot 0.1 0 0 1 2 3 4 5 6 7 8 9 10 11 iterations

seconds per iteration 0.05

Figure Small OO database traversal times

0

0 25 50 75 100 125 150 175 200

Figure shows the results of running a simple

iteration

b enchmark with our p ersistent store and with non

p ersistent C In b oth cases the test system is a

Figure More iterations traversing small database

DECstation with MB main memory The

virtual memory system and le caches were ushed to

Figure shows what happ ens when more iterations

disk by reading and writing unrelated data b efore the

of the b enchmark are p erformed Both C and

start of the b enchmark

Texas approach the same sp eed as the virtual memory

The plots show the time taken to p erform each of

warms up ie as more and more pages have b een

iterations of the traversal comp onent of the OO

faulted into main memory This is to b e exp ected

b enchmark Cat using the standard small OO test

since Texas incurs little or no overhead except at page

database The database consists of part ob

fault time Following White and Dewitt WD we

jects representing parts in a hyp othetical engineer

also show the hot traversal time traversing only

ing database application The parts are indexed by

19 20

We are currently implementing several improvements these We used the standard AVLMap template parameterized

will b e rep orted in SKW including with details of our new class from the Free Software Foundations GNU C library

logstructured storage manager and its eects on checkp ointing Only the calls to new were changed so that ob jects would

costs b e allo cated on the p ersistent heap

pages that are already in memory by rep eating pre compatible system comp onents and abstractions vious iterations

40

Acknowledgements C++

Texas

e are grateful to Seth White who provided the

30 W

OO b enchmark co de and esp ecially to Vivek Singhal

who implemented much of the Texas p ersistent store 20

seconds per iteration 10

References

+

MP Atkinson PJ Bailey KJ Chis

0 ABC

P W Co ckshott and R Morri 0 20 40 60 80 100 holm

iterations

son An approach to p ersistent program

ming The Computer Journal

Figure OO Large database traversal times

Decemb er

AEL Andrew W App el John R Ellis and Kai

Figure shows the same kind of traversal p erformed

Li Realtime concurrent garbage collec

against the large OO database including

tion on sto ck multipro cessors In Proceed

part ob jects which will not t in memory Cat

ings of SIGPLAN Conference on Pro

The startup eects of lesystem fetching quickly di

gramming Language Design and Imple

minish and the p erformance of b oth systems is domi

mentation pages SIGPLAN ACM

nated by virtual memory paging costs After the rst

Press June Atlanta Georgia

iterations or so the p erformance of Texas is ab out

of the p erformance of C Again the hot times

AL Andrew W App el and Kai Li Virtual

were essentially identical at ab out ms This indi

memory primitives for user programs In

cates that the time overhead of our system is nearly

Proceedings of the Fourth International

zero in a large main memory at least when address

Conference on Architectural Support for

space reuse is not necessary during the execution of a

Programming Languages and Operating

program

Systems ASPLOS IV pages

We should stress that the ab ove measurements are

April Santa Clara CA

very preliminarythey are the rst measurements

weve made of an unoptimized untuned system Still

Bak Henry G Baker Jr List pro cessing in

they suggest that p ointer swizzling at page fault time

real time on a serial computer Communi

will b e quite ecient for many applications and in

cations of the ACM April

particular that it is comp etitive with a conventional

virtual memory

Bak Henry G Baker Jr The Treadmill

Realtime garbage collection without mo

tion sickness ACM SIGPLAN Notices

Conclusions

March

Pointer swizzling at page fault time can supp ort

Bar Jo el F Bartlett Compacting garbage

enormous address spaces on standard hardware us

collection with ambiguous ro ots Techni

ing only commonly available compilers and op erating

cal Rep ort Digital Equipment Cor

sytem features We b elieve it can b e an imp ortant

p oration Western Research Lab oratory

part of a radical and longoverdue reorganization of

Palo Alto California February

the relationships b etween software and hardware with

a highp erformance RISC approach to address trans BMBC HansJuergen Bo ehm Eliot Moss Jo el

lation Nonetheless it is of immediate practical value Bartlett and David Chase Panel discus

given existing hardware op erating systems and com sion Conservative vs accurate garbage

pilers as a reconciliation layer b etween otherwise in collection Summary app ears in Wilson

and Hayes OOPSLA GC workshop Fourth International Workshop on Per

rep ort sistent Object Systems Morgan Kauf

man Marthas Vineyard Massachusetts

Bo e HansJuergen Bo ehm Hardware and

Septemb er

op erating system supp ort for conserva

FB Stuart I Feldman and Channing B

tive garbage collection In Interna

Brown IGOR A system for program

tional Workshop on Memory Manage

debugging via reversible execution In

ment pages Palo Alto California

ACM SIGPLANSIGOPS Workshop on

Octob er IEEE Press

Paral lel and Distributed Debugging May

Also distributed as SIGPLAN No

BW HansJuergen Bo ehm and Mark Weiser

tices January pp

Garbage collection in an unco op erative

environment Software Practice and Ex

perience Septemb er

FP Matthew Farrens and Arvin Park Dy

namic base register caching A technique

Cat R G G Cattell An engineering database

for reducing address bus width In Proc

b enchmark In Jim Gray editor The

th Annual Intl Symposium on Com

Benchmark Handbook for Database and

puter Architecture pages May

Transaction Processing Systems pages

Toronto Canada

MorganKaufman

Gri Knut S Grimsrud Multiple prefetch

CLLBH Jerey S Chase Henry M Levy Ed

adaptive disk caching with strategic data

ward D Lazowska and Miche Baker

layout Masters thesis Brigham Young

Harvey Lightweight shared ob jects in a

University Decemb er

bit op erating system In ACM SIG

Hay Barry Hayes Using key ob ject opp or

PLAN Conference on Object Orien

tunism to collect old ob jects In ACM

ted Programming Systems Languages

SIGPLAN Conference on Object

and Applications OOPSLA Van

Oriented Programming Systems Lan

couver British Columbia Canada Octo

guages and Applications OOPSLA

b er

pages Pho enix Arizona Octob er

ACM Press

Det David L Detlefs Concurrent garbage col

lection for C Technical Rep ort CMU

HCU Urs Ho elzle Craig Chamb ers and David

CS CarnegieMellon University

Ungar Debugging optimized co de with

May

dynamic deoptimization In SIGPLAN

Symposium on Programming Language

Det David L Detlefs Garbage collection

Design and Implementation pages

and runtime typing as a C library

San Francisco California June

In USENIXC Conference Portland

Oregon August

Kae T Kaehler Virtual memory for an

ob jectoriented language Byte

DMH Amer Diwan Eliot Moss and Richard

August

Hudson Compiler supp ort for garbage

Lev Henry M Levy CapabilityBased Com

collection in a staticallytyp ed language

puter Systems Digital Press Bedford

In SIGPLAN Symposium on Program

Massachusetts

ming Language Design and Implementa

tion pages San Francisco Cali

Li Kai Li Shared Virtual Memory on

fornia June

LooselyCoupled Processors PhD thesis

Yale University

DSZ Alan Dearle Gail M Shaw and Stan

ley B Zdonik editors Implement LLOW Charles Lamb Gordon Landis Jack

ing Persistent Object Bases Princi Orenstein and Dan Weinreb The Ob

ples and Practice Proceedings of the jectStore database system Communica

tions of the ACM Octob er Columbia Canada Octob er To ap

p ear

WH Paul R Wilson and Barry Hayes The

MG Jose Alves Marques and Paulo Guedes

OOPSLA Workshop on Garbage

Extending the op erating system to sup

Collection in Ob ject Oriented Systems

p ort an ob jectoriented environment

organizers rep ort In Addendum to

In ACM SIGPLAN Conference

the proceedings of OOPSLA Pho enix

on Object Oriented Programming Sys

Arizona

tems Languages and Applications OOP

SLA pages New Orleans

Wil Paul R Wilson Op erating system sup

Louisiana Octob er

p ort for small ob jects In International

Workshop on Object Orientation in Oper

Mos J Eliot B Moss Working with ob jects

ating Systems Palo Alto California Oc

To swizzle or not to swizzle Tech

tob er IEEE Press Revised version

nical Rep ort University of Mas

to app ear in Computing Systems

sachusetts Amherst Massachusetts May

Wil Paul R Wilson Unipro cessor garbage

collection techniques In International

RO Mendel Rosenblum and John K Ouster

Workshop on Memory Management St

hout The design and implementation of a

Malo France Septemb er Springer

logstructured le system In Thirteenth

Verlag Lecture Notes in Computer Sci

ACM Symposium on Operating Systems

ence series

Principles pages Pacic Grove

California Octob er

WLM Paul R Wilson Michael S Lam and

Thomas G Moher Eective static

Ros John Rosenb erg Architectural supp ort

graph reorganization to improve lo cality

for p ersistent ob jects In Intl Work

in garbagecollected systems In SIG

shop on Object Orientation in Operating

PLAN Symposium on Programming Lan

Systems pages IEEE Computer

guage Design and Implementation pages

So ciety Press Octob er Palo Alto

Toronto Canada June

California

WMa Paul R Wilson and Thomas G Moher

SKW Vivek Singhal Sheetal V Kakkad and

Demonic memory for pro cess histories

Paul R Wilson Texas an ecient

In SIGPLAN Symposium on Program

p ortable p ersistent store In Fifth Inter

ming Language Design and Implementa

national Workshop on Persistent Object

tion pages Portland Oregon

Systems Pisa Italy Septemb er

June Also distributed as SIGPLAN

Notices July

Sta James William Stamos A large ob ject

oriented virtual memory Grouping

WMb Paul R Wilson and Thomas G Mo

strategies measurements and p erfor

her Design of the opp ortunistic gar

mance Technical Rep ort SCG Xe

bage collector In ACM SIGPLAN

rox Palo Alto Research Centers Palo

Conference on Object Oriented Program

Alto California May

ming Systems Languages and Applica

tions OOPSLA pages New

TA Andrew Tolmach and Andrew App el De

Orleans Louisiana Octob er

bugging Standard ML without reverse en

WWH Ifor W Williams Mario I Wolkzko and

gineering In SIGPLAN Symposium on

Trevor P Hopkins Dynamic grouping in

LISP and Functional Programming

an ob jectoriented virtual memory hier

WD Seth J White and David J Dewitt A

archy In European Conference on

p erformance study of alternative ob ject

ObjectOriented Programming pages

faulting and p ointer swizzling strategies

Paris France June Springer

In th International Conference on Very

Verlag

Large Data Bases Vancouver British