Lazy Reference Counting for Transactional Storage Systems

Miguel Castro, Atul Adya, Barbara Liskov

Laboratory for Computer Science,
Massachusetts Institute of Technology,
545 Technology Square, Cambridge, MA 02139

{castro,adya,liskov}@lcs.mit.edu

Abstract

HAC is a novel technique for managing the client cache in a distributed, persistent object storage system. In a companion paper, we showed that it outperforms other techniques across a wide range of cache sizes and workloads. This report describes HAC's solution to a specific problem: how to discard indirection table entries in an indirect pointer swizzling scheme. HAC uses lazy reference counting to solve this problem. Instead of eagerly updating reference counts when objects are modified and eagerly freeing unreferenced entries, which can be expensive, we perform these operations lazily in the background while waiting for replies to fetch and commit requests. Furthermore, we introduce a number of additional optimizations to reduce the space and time overheads of maintaining reference counts. The net effect is that the overhead of lazy reference counting is low.

Keywords: reference counting, pointer swizzling, transactions, caching, performance

(This research was supported in part by the Advanced Research Projects Agency of the Department of Defense, monitored by the Office of Naval Research under contract N00014-91-J-4136. M. Castro is supported by a PRAXIS XXI fellowship.)

1 Introduction

HAC is a novel technique for managing the client cache in a client-server, persistent object storage system [4]. It combines page and object caching to reduce the miss rate in client caches dramatically while keeping overheads low. HAC provides improved performance over earlier approaches, by more than an order of magnitude on common workloads [4]. The contribution of this report is to document lazy reference counting: the technique used by HAC to solve the specific problem of discarding indirection table entries in an indirect pointer swizzling scheme.

Pointer swizzling [11] is a well-established technique to speed up pointer traversals in object-oriented databases. For objects in the client cache, it replaces contained global object names by virtual memory pointers, thereby avoiding the need to translate the object name to a pointer every time a reference is used. Pointer swizzling schemes can be classified as direct [13] or indirect [11, 10]. In direct schemes, swizzled pointers point directly to an object, whereas in indirect schemes swizzled pointers point to an indirection table entry that contains a pointer to the object. Indirection allows HAC to efficiently evict cold objects from the cache and compact hot objects to avoid storage fragmentation: when an object is evicted or moved, it is not necessary to find and correct all pointers to the object; instead, we can just fix the object's entry in the indirection table. However, indirection introduces the problem of reclaiming storage in the indirection table.

Reference counting [5] is a simple incremental automatic storage reclamation technique. It maintains a count of the number of references pointing to an item and reclaims the item when its count drops to zero. Existing reference counting schemes suffer from four problems [14]: space overheads, the problem of cycles, the cost of reclaiming unreferenced entries, and the cost of changing reference counts at every pointer update. Our lazy reference counting scheme overcomes these problems. First, it uses only one-byte reference counts, with larger counts stored in a separate table. Second, the inability to collect cyclic garbage using reference counts is not a problem here, since evicting objects from the cache breaks cycles. Third, freeing an unreferenced entry is achieved by simply setting a bit in the entry; reusing the entry is similarly inexpensive. Finally, reference counts are not corrected at every pointer update, but only once at the end of a transaction to account for all of the transaction's modifications. Furthermore, lazy reference counting performs reference count adjustments and frees unreferenced entries in the background while the client is waiting for replies to fetch and commit requests.

This report describes how lazy reference counting is implemented in HAC [4] and presents the results of experiments that evaluate its performance. The results show that the lazy reference counting scheme has low space and time overheads.

Even though we focus on the implementation of lazy reference counting in HAC, the algorithm and most of the optimizations we introduce could be applied to solve other related problems. For example, assume a client cache management scheme where direct swizzling is used and pages are evicted by read-protecting the virtual page frame where they were mapped. In this setting, lazy reference counting can be used to determine when there are no more pointers to the virtual page frame, allowing it to be reused to map a different page.

The rest of the report is organized as follows. Section 2 presents an overview of our system and object naming scheme. Section 3 describes lazy reference counting. The results of our experiments are presented in Section 4.

[Figure 1: Compaction of pages by HAC. Before compaction, frames 14, 2, and 53 hold interleaved hot and cold objects; after compaction, the cold objects are discarded, the hot objects are packed together, and one frame is free.]

2 System Overview

HAC and lazy reference counting are part of Thor-1, a new implementation of the Thor object-oriented database system; more information about Thor-1 can be found in [4]. This section begins with a brief overview of the client and server architectures. Next, it describes object naming and indirect swizzling.

2.1 Clients and Servers

The system is made up of clients and servers. Servers provide persistent, highly available storage for objects. Clients cache copies of persistent objects. Application transactions run entirely at clients, on the cached copies. More information about the client interface to the database can be found in [8]. We serialize transactions using optimistic concurrency control; the mechanism is described in [1, 7]. Our concurrency control algorithm outperforms the best pessimistic methods on almost all workloads [7].

HAC partitions the client cache into page frames and fetches entire pages from the server. To make room for an incoming page, HAC selects some page frames for compaction, discards the cold objects in these frames, and compacts the hot objects to free one of the selected frames. The approach is illustrated in Figure 1. This compaction process is performed in the background while waiting for the fetch reply.

While a transaction runs, the client keeps track of objects it used in the read set, and objects it modified in the write set. Additionally, when a transaction modifies an object for the first time, the client makes a copy of the object in the undo log. These copies are used to restore modified objects to their previous state when transactions abort. The copies are important because they speed up the re-execution of aborted transactions. This mechanism is essential for our optimistic concurrency control scheme to outperform locking alternatives [7]. The space dedicated to holding the undo log has a limited size; when it fills up, no more copies are created. We expect a small buffer to be capable of holding copies of all modified objects for most transactions [7].

When a transaction commits, the client sends the new versions of all objects modified or created by the committing transaction to a server, along with the names of objects that were read and written during the current transaction. The server validates the transaction against those that have already committed or prepared (we use a two-phase protocol if the transaction used objects at multiple servers). If validation succeeds, information about the committing transaction, including all its modifications, is forced to the transaction log; only after the information has been recorded on stable storage does the server inform the client of the commit.

When a transaction that modified object o commits, it causes all other clients that cache o to contain out-of-date information. We cause invalid information to be flushed from caches by sending invalidations piggybacked on other messages already being sent.
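The undo-log bookkeeping described above (a copy saved the first time an object is modified, restored on abort, bounded by a fixed space budget) can be sketched as follows. This is our illustrative model, not Thor-1's actual code; all names (UndoLog, note_modification) are hypothetical:

```python
# Hypothetical sketch of the undo-log rule: the first time a transaction
# modifies an object, a copy of its old state is saved, subject to a fixed
# space budget; on abort, every saved copy is restored.

class UndoLog:
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.copies = {}              # object id -> saved (old) state

    def note_modification(self, oid, old_state):
        """Save a copy the first time oid is modified, if space allows."""
        if oid in self.copies:
            return True               # already copied this transaction
        if self.used + len(old_state) > self.capacity:
            return False              # log full: no more copies are created
        self.copies[oid] = bytes(old_state)
        self.used += len(old_state)
        return True

    def abort(self, heap):
        """Revert every modified object to its saved copy."""
        for oid, old in self.copies.items():
            heap[oid] = old
        self.clear()

    def clear(self):
        self.copies.clear()
        self.used = 0

heap = {1: b"old"}
log = UndoLog(capacity_bytes=64)
log.note_modification(1, heap[1])     # copy before modifying
heap[1] = b"new"
log.abort(heap)                       # transaction aborts: state restored
assert heap[1] == b"old"
```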

2.2 Object References

Objects stored on disk refer to one another using names. The name of an object identifies the disk page where the object is stored, but not the location of the object within that page. This is important because it allows storage management at servers to avoid fragmentation by moving objects within their pages independently from other clients or servers, since no other server or client knows the actual location where an object resides.

Using names as references is not appropriate for the objects cached at the client, because every dereference requires a translation of the name to the location of the cached object. Instead we use an indirect form of lazy swizzling on discovery [13]; references are swizzled to point to an entry in an indirection table that points to the object's fields. Swizzling is performed when a reference is loaded from an instance variable into a local variable. This is implemented using edge marking [10]; references are marked as swizzled or unswizzled using one bit. When a reference is loaded from an instance variable into a local variable, a check is performed to determine whether the reference is swizzled or not. If the reference is not swizzled, the object name is translated into a pointer to the object's entry in the indirection table (this may require fetching the object if it is not cached), and the resulting value is stored both in the local variable and back in the instance variable.
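The edge-marking check above can be sketched as follows. This is a minimal model of ours, not Thor-1's code: it tags references with a low bit (unswizzled names carry tag 0, swizzled indirection-table indices carry tag 1), and all names (Client, load, swizzle) are hypothetical:

```python
# Hypothetical model of edge marking: loading a reference checks the tag
# bit and, on an unswizzled name, translates it through the swizzle table,
# storing the swizzled value back into the instance variable.

SWIZZLED = 1          # low tag bit: 1 = swizzled, 0 = unswizzled name

class Client:
    def __init__(self):
        self.table = {}               # name -> indirection-table index
        self.entries = []             # index -> object fields

    def fetch(self, name):
        return {"name": name}         # stand-in for a server fetch

    def swizzle(self, name):
        """Translate a name to an indirection-table index, installing on miss."""
        if name not in self.table:
            self.entries.append(self.fetch(name))   # may fetch the object
            self.table[name] = len(self.entries) - 1
        return (self.table[name] << 1) | SWIZZLED

    def load(self, obj, field):
        ref = obj[field]
        if not (ref & SWIZZLED):      # unswizzled: translate and store back
            ref = self.swizzle(ref >> 1)
            obj[field] = ref
        return self.entries[ref >> 1] # follow the indirection table entry

c = Client()
obj = {"next": 42 << 1}               # unswizzled object name 42
target = c.load(obj, "next")          # swizzles on first use
assert obj["next"] & SWIZZLED         # reference was updated in place
assert c.load(obj, "next") is target  # second load takes the fast path
```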

In our system, the indirection table is also a swizzle table; it is used to translate object names into pointers to indirection table entries. The table is implemented as an overflow chaining hash table. Each entry in the indirection table contains a pointer to the fields of some object (or null if that object has been discarded), a reference count, and a next pointer used to link the entries that make up a hash table bucket.

3 Reference Count Management

Indirection table entries are freed using the lazy reference counting scheme. The scheme takes advantage of the copies maintained by our concurrency control algorithm in the undo log to update reference counts and free entries efficiently at commit time (other schemes maintain similar undo logs [12, 2]).

The lazy reference counting scheme is optimized for the common case: transactions whose base copies of modified objects fit in the undo log and whose accessed objects fit in a small fraction of the client cache [7]. We start by presenting the algorithm that handles the common case, and later explain how it is extended to handle transactions that access and modify a large number of objects. We finish the section by describing key implementation details.

3.1 The Base Algorithm

Let e be an entry in the indirection table. The reference count field of e keeps an approximate count of the number of swizzled references to e; we call this count the current reference count of e, or CRC(e). The actual reference count of e, or ARC(e), is the exact number of swizzled references that refer to e from the stack, registers, or instance variables of cached objects. We define stack(e) as the number of swizzled references to e from the stack or registers. The function refs(o, e) is defined as the number of swizzled references in the instance variables of object o that refer to e, and the function refd(o) maps o to the multiset whose elements are the indirection table entries pointed to by each swizzled reference in o.

Eager reference counting schemes ensure that CRC(e) is always equal to ARC(e) for all e, but in our lazy reference counting scheme they are usually different. Instead, our system satisfies the following invariant:

Invariant 1: For each entry e in the indirection table and all objects y modified during the current transaction, the actual number of swizzled references to e is given by:

    ARC(e) = CRC(e) + stack(e) + Σ_y (refs(y, e) − refs(y', e))

where y' is the copy of y in the undo log.

The invariant captures two important features of our algorithm. First, CRC(e) never accounts for references in the stack or registers. Second, CRC(e) does not account for references in the new versions of objects modified during a transaction until it commits, but accounts for the swizzled references in the objects' copies in the undo log. The first feature is similar to what deferred reference counting [6] provides, but the second is new.

We now prove that the system preserves Invariant 1 by induction on the length of the execution, and simultaneously explain the various steps in our algorithm. The invariant holds vacuously for the base case because initially there are no objects in the cache. For the induction step, there are six operations that can change the reference counts in the system; we show that the invariant is preserved in all these cases:

(a) Swizzling: When a reference to entry e from object o is swizzled, ARC(e) increases by one. If o has not been modified, our algorithm increments CRC(e) by one and the invariant is preserved. Otherwise, the algorithm leaves CRC(e) unchanged, but refs(o, e) increases by one and the invariant is maintained.

(b) Creation of a copy in the undo log: Our system creates a copy y' of an object y in the undo log the first time y is modified during a transaction. The invariant is preserved because initially refs(y, e) = refs(y', e) for all e.

(c) Reference modification: Suppose a reference field is updated in object y such that it now refers to entry f instead of containing a reference to object o. (The technique we use to swizzle references ensures that, when a new reference value is assigned to a reference field, the new value is swizzled.) No CRC is modified by our algorithm when this happens. Let us consider two cases:

- If the reference to o was swizzled and pointed to o's corresponding indirection table entry e, ARC(f) increases by one and ARC(e) decreases by one. The invariant is preserved because refs(y, f) increases by one while refs(y, e) decreases by one.

- If the reference to o was not swizzled, only ARC(f) changes; it increases by one. The invariant is preserved because refs(y, f) increases by one. Note that even if o has an associated entry e, refs(y, e) will not change in this case.

(d) Eviction: When an object o is evicted, the ARC of entries in refd(o) decreases. Our algorithm decrements the CRC for all these entries and therefore maintains the invariant. In the current version of the system, we never evict objects modified by the current transaction [4].

(e) Invalidation: Servers send invalidations to a client when an object o in that client's cache is modified by another client. When such an invalidation is received, o is removed from the cache. Thus, as in the eviction case, the ARC of entries in refd(o) decreases. Two cases may occur:

- If the invalidated object was not modified by the current transaction, the object is simply evicted using procedure (d).

- If the invalidated object o was modified by the current transaction, our algorithm preserves Invariant 1 by decrementing CRC(e) for each e in refd(o') and discarding o', o's copy.

After processing the invalidation, if any invalidated object was used by the current transaction, the transaction aborts, triggering the abort processing described next.

(f) Transaction commit: We distinguish two cases:

- If the commit is successful, our lazy reference counting algorithm computes the value of refs for each modified object y and its copy y'; increments CRC(e) by Σ_y (refs(y, e) − refs(y', e)) for each entry e; and clears the transaction logs. Thus, the invariant is maintained.

- If the transaction aborts, the system reverts all modified objects to their copies and clears the transaction logs. Therefore, the value of ARC(e) decreases by refs(y, e) − refs(y', e) for every entry e and modified object y. The invariant is preserved because y is reverted to the value in its copy.

In the current implementation of our system, there are no swizzled references on the stack or registers at commit time, i.e., stack(e) = 0 for all entries e. Therefore, just after the commit processing described in (f) is performed, ARC(e) = CRC(e) for every e. At this point, our algorithm frees entries e such that CRC(e) = 0, ensuring the following basic safety and liveness properties.

Theorem 1 (Safety): Indirection table entries are freed only if there are no swizzled references referring to them.

Theorem 2 (Liveness): Indirection table entries with no swizzled references referring to them are freed when a transaction terminates, unless the entry corresponds to an object that is still cached.
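A minimal model of the successful-commit processing in case (f): each modified object y is compared with its undo-log copy y', CRC(e) is adjusted once by the difference refs(y, e) − refs(y', e), and entries whose count reaches zero become candidates for freeing. The data shapes and names (refd, commit) are our hypothetical constructions; the real system additionally frees an entry only if the corresponding object is no longer cached:

```python
# Hypothetical model of commit-time adjustment in the base algorithm:
# CRCs are corrected once per transaction from the undo-log copies,
# rather than at every pointer update.

from collections import Counter

def refd(obj):
    """Multiset of indirection-table entries referenced by obj's fields."""
    return Counter(obj["fields"])

def commit(crc, modified, undo_copies):
    for oid, y in modified.items():
        diff = refd(y)
        diff.subtract(refd(undo_copies[oid]))   # refs(y, e) - refs(y', e)
        for e, delta in diff.items():
            crc[e] = crc.get(e, 0) + delta
    # With stack(e) = 0 at commit time, ARC(e) = CRC(e) now holds;
    # zero-count entries can be freed (the real system also checks that
    # the entry's object was discarded from the cache).
    return {e for e, count in crc.items() if count == 0}

crc = {"e1": 1, "e2": 1}                  # counts reflect the undo-log copies
modified = {7: {"fields": ["e2", "e2"]}}  # y dropped e1 and gained an e2
undo = {7: {"fields": ["e1", "e2"]}}      # y', the base copy
freed = commit(crc, modified, undo)
assert crc == {"e1": 0, "e2": 2}
assert freed == {"e1"}
```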

3.2 Algorithm Extensions

When objects are evicted, the ARC of some indirection table entries may drop to zero. For transactions that access a large amount of data and incur many client cache misses, the amount of storage wasted by unreferenced indirection table entries may be significant. This section explains how the algorithm is extended to allow freeing of indirection table entries during transactions, and to handle transactions that modify more data than fits in the undo log.

Let eager be a subset of the objects modified by the current transaction. The extended algorithm updates reference counts eagerly to account for modifications to references contained in objects in the eager set. The algorithm inserts into eager modified objects whose copies do not fit in the undo log. Additionally, when the amount of storage wasted by unreferenced entries is significant relative to the total client cache size (e.g., when the storage used up by entries whose CRC dropped to zero is 1% of the client cache size), the algorithm inserts all modified objects in eager in order to allow freeing of entries during the transaction. This involves incrementing the reference counts of entries pointed to by modified objects that are still not in eager, and it is performed while waiting for the reply to a fetch request.

The extended algorithm verifies a modified invariant (this can be shown using a proof similar to the one presented in Section 3.1):

Invariant 2: For each entry e in the indirection table, all objects y ∉ eager modified during the current transaction, and all copies x' of objects x ∈ eager, the actual number of swizzled references to e is given by:

    ARC(e) = CRC(e) + stack(e) − Σ_x refs(x', e) + Σ_y (refs(y, e) − refs(y', e))

where y' is y's copy in the undo log.

The term Σ_x refs(x', e) reflects the fact that CRC(e) accounts for both the references that point to e in objects in eager and those in their copies. This guarantees that entries e that are pointed to by copies have CRC(e) > 0 until the transaction commits or aborts. Therefore, these entries are not freed during a transaction, making it possible to revert the state of modified objects in eager to their copies if the transaction aborts.

The extended algorithm works as follows:

(a1) Swizzling: When a reference to entry e from an object y ∈ eager is swizzled, the algorithm increments CRC(e) by one. For objects not in eager, it proceeds as described in (a).

(b1) Creation of a copy in the undo log: If the undo log is full the first time an object y is modified during a transaction, the algorithm adds y to eager. Otherwise, the algorithm proceeds as in (b).

(c1) Pointer modification: When a pointer contained in object y ∈ eager is modified to point to entry f instead of containing a reference to object o, the algorithm increments CRC(f) by one. If the reference to o was swizzled and pointed to o's corresponding indirection table entry e, CRC(e) is decremented by one. For objects not in eager, the algorithm behaves as described in (c).

(d1) Eviction: While waiting for the reply to a fetch request, the algorithm computes the value of refs for each modified object y ∉ eager; increments CRC(e) by refs(y, e); and adds all such y to eager. Then, for each evicted object o, it decrements CRC(e) for each e in refd(o) and frees e if CRC(e) = 0 and stack(e) = 0.

(e1) Invalidation: If the client receives an invalidation for an object y ∈ eager, the algorithm decrements CRC(e) for each e ∈ refd(y). If y has a copy y', the algorithm decrements CRC(e) for each e ∈ refd(y'). The algorithm also performs the processing described in (e).

(f1) Transaction commit: We distinguish two cases:

- If the commit is successful: The algorithm decrements CRC(e) for each e ∈ refd(y'), where y' is a copy of some object in eager. No processing is performed for objects in eager without a copy. For objects not in eager, the algorithm proceeds as described in (f).

- If the transaction aborts: For each y ∈ eager, the algorithm decrements CRC(e) for all e in refd(y). If y has a copy y', the algorithm also decrements CRC(e) for each e ∈ refd(y') and reverts y to its copy. No processing is required for objects not in eager.

Once more, just after the commit processing described in (f1) is performed, ARC(e) = CRC(e) for every e, and the algorithm frees entries with CRC(e) = 0. Using Invariant 2 it is possible to show that the extended algorithm also verifies the safety and liveness theorems.
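The eager/lazy split in cases (c) and (c1) can be sketched as follows. This is our hypothetical model (names like update_pointer are invented), and it assumes the old reference was swizzled, which is the case in which (c1) performs the decrement:

```python
# Hypothetical sketch of cases (c) and (c1): pointer updates in objects in
# the eager set adjust CRCs immediately; for other objects the adjustment
# is deferred to commit time, when the object is compared with its copy.

def update_pointer(obj, field, new_entry, crc, eager):
    old = obj["fields"].get(field)
    if obj["id"] in eager:
        crc[new_entry] = crc.get(new_entry, 0) + 1   # (c1): eager increment
        if old is not None:                          # old reference assumed
            crc[old] -= 1                            # swizzled: decrement
    # (c): for objects not in eager, no CRC changes here
    obj["fields"][field] = new_entry

crc = {"e1": 1, "e2": 0}
lazy_obj = {"id": 1, "fields": {"f": "e1"}}
eager_obj = {"id": 2, "fields": {"f": "e1"}}

update_pointer(lazy_obj, "f", "e2", crc, eager=set())
assert crc == {"e1": 1, "e2": 0}   # deferred: counts untouched until commit

update_pointer(eager_obj, "f", "e2", crc, eager={2})
assert crc == {"e1": 0, "e2": 1}   # eager: counts corrected immediately
```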

3.3 Implementation and Optimizations

This section describes the key overheads in the implementation of lazy reference counting and presents the techniques used to reduce them.

3.3.1 Space Overheads

The reference count field in the indirection table is only 1 byte, which should be enough to store the CRC of most entries. If an overflow occurs, the field takes the value 255 and the actual reference count is stored in a hash table keyed by the index of the entry in the indirection table. The algorithm also uses 2 bits in the object header: the pointer-update bit, which is set if and only if a pointer in the object was modified during the current transaction; and the eager bit, which is set if and only if the object is in the eager set. The pointer-update bit is set by code that is automatically generated by our Theta [9] compiler.
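A sketch of the one-byte count with an overflow table: the sentinel value 255 and the index-keyed hash table follow the text above, but the code itself (RefCounts and its methods) is our hypothetical model, not the actual implementation:

```python
# Hypothetical sketch of 1-byte reference counts: counts up to 254 live in
# the entry's byte field; 255 is a sentinel meaning the true count is kept
# in a separate hash table keyed by the entry's index.

OVERFLOW = 255

class RefCounts:
    def __init__(self, n_entries):
        self.byte = bytearray(n_entries)   # 1 byte per indirection-table entry
        self.big = {}                      # entry index -> actual large count

    def incr(self, i):
        if self.byte[i] == OVERFLOW:
            self.big[i] += 1
        elif self.byte[i] == OVERFLOW - 1:
            self.byte[i] = OVERFLOW        # spill into the overflow table
            self.big[i] = OVERFLOW
        else:
            self.byte[i] += 1

    def decr(self, i):
        if self.byte[i] == OVERFLOW:
            self.big[i] -= 1
            if self.big[i] < OVERFLOW:     # count fits in the byte again
                self.byte[i] = self.big.pop(i)
        else:
            self.byte[i] -= 1

    def get(self, i):
        return self.big[i] if self.byte[i] == OVERFLOW else self.byte[i]

rc = RefCounts(4)
for _ in range(300):
    rc.incr(0)
assert rc.get(0) == 300 and rc.byte[0] == OVERFLOW
for _ in range(300):
    rc.decr(0)
assert rc.get(0) == 0 and not rc.big      # overflow table is empty again
```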

3.3.2 Time Overheads

We now discuss the time overhead introduced by each step in the lazy reference counting algorithm and the optimizations used to reduce it. We focus on the common case, where all data modified by a transaction fits in the undo log and no entries are freed during a transaction (i.e., the eager set is empty).

Adjusting reference counts when a reference is swizzled can potentially add an extra cache miss. To avoid this overhead, our swizzle table is also the indirection table. Therefore, when a reference is swizzled its corresponding entry in the indirection table is accessed anyway, and the reference count adjustment is inexpensive.

When an object is evicted, its references are scanned and, for each swizzled reference, the reference count of the target entry is decremented; if it drops to zero, the entry's index is appended to the zero log if the corresponding object was discarded from the cache. Similarly, if the entry being discarded also has a null reference count, it is added to the zero log. This process is memory intensive and relatively expensive. To reduce the overhead of adjusting reference counts when objects are evicted, our system performs this processing while waiting for the reply to a fetch request. Unless the client is multi-threaded or multi-programmed, this allows us to completely overlap eviction-time adjustment of reference counts with the fetch delay. Therefore, eviction-time processing usually introduces no overhead.

Commit-time processing is divided into two steps. The first step adjusts reference counts to reflect the current state of modified objects. It compares modified objects with their copies to determine which references were modified, and adjusts reference counts accordingly. As an optimization, the comparison is only performed if the pointer-update bit is set, because otherwise it is known that no adjustments are required.

In a system that performs pointer swizzling, the client must unswizzle references in modified objects before it sends their new states to the server in a commit request. Our implementation takes advantage of this by dividing the first step into an increment phase and a decrement phase. The increments are all performed when references are unswizzled and introduce a negligible overhead, because the unswizzling code already accesses the entries whose counts need to be incremented. The decrement phase is performed in the background while waiting for the reply to the commit request, and introduces no additional delay unless the client is multi-threaded or multi-programmed, in which case it may introduce some delay. Entries whose reference counts drop to zero during the decrement phase are marked as free. However, since these entries are pointed to by the copies of modified objects, the changes performed while freeing them are logged so that they can be undone in case the transaction aborts.

The second step iterates over the zero log and marks as free any entry whose reference count is still zero and whose corresponding object was discarded from the cache. The second step is also performed in the background while waiting for the reply to the commit request. No logging is required in this case, because the objects that referred to these entries were discarded from the cache.

If marking an entry as free required removing the entry from its hash bucket chain in the swizzle table and inserting it in some free list, commit-time processing could be excessive. We avoid this overhead by simply setting a bit in the entry. The free entry remains linked to its hash bucket chain and will be reused when a new object whose name hashes to the same bucket is installed in the indirection table. This technique works well in practice because our multiplicative hashing function spreads the entries across buckets very uniformly.
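The free-bit technique can be modeled as follows. Entry and IndirectionTable are hypothetical names of ours, and Python's built-in hash stands in for the multiplicative hashing function mentioned above:

```python
# Hypothetical sketch of freeing by bit-setting: a dead entry stays linked
# in its hash-bucket chain with its free bit set, and is reused when a new
# name hashing to the same bucket is installed.

class Entry:
    def __init__(self, name):
        self.name = name
        self.obj = None
        self.crc = 0
        self.free = False
        self.next = None              # overflow chain within the bucket

class IndirectionTable:
    def __init__(self, n_buckets=8):
        self.buckets = [None] * n_buckets

    def install(self, name):
        b = hash(name) % len(self.buckets)
        e = self.buckets[b]
        while e is not None:
            if e.free:                # reuse a freed entry in this chain
                e.name, e.free, e.crc = name, False, 0
                return e
            e = e.next
        e = Entry(name)               # no free entry: extend the chain
        e.next = self.buckets[b]
        self.buckets[b] = e
        return e

    def mark_free(self, entry):
        entry.free = True             # no unlinking, no free list

t = IndirectionTable(n_buckets=1)     # one bucket, so the chain is shared
a = t.install("obj-a")
t.mark_free(a)                        # cheap: just sets a bit
b = t.install("obj-b")                # reuses the freed entry in place
assert b is a and b.name == "obj-b" and not b.free
```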

4 Performance Evaluation

This section shows that the overheads introduced by our lazy reference counting scheme are low.

4.1 Experimental Setup

The experiments ran a single client on a DEC 3000/400 workstation, and a server on another, identical workstation. Both workstations had 128 MB of memory and ran OSF/1 version 3.2. They were connected by a 10 Mb/s Ethernet and had DEC LANCE Ethernet interfaces. The database was stored on a Seagate ST-32171N disk, with a peak transfer rate of 15.2 MB/s, an average read seek time of 9.4 ms, and an average rotational latency of 4.17 ms. The server cache was 36 MB and the client cache was 12 MB. We used 8 KB pages.

The experiments ran the single-user OO7 benchmark [3]. The OO7 database contains a tree of assembly objects, with leaves pointing to three composite parts chosen randomly from among 500 such objects. Each composite part contains a graph of atomic parts linked by connection objects; each atomic part has 3 outgoing connections. We used the medium database, which has 200 atomic parts per composite part and a total size of 37.8 MB. We implemented the database in Theta [9], following the specification [3] closely.

We ran traversals T1 and T2b. These traversals perform a depth-first traversal of the assembly tree and execute an operation on the composite parts referenced by the leaves of this tree. T1 is read-only; it reads the entire graph of a composite part. T2b is identical to T1 except that T2b modifies all the atomic parts. All pointer updates in T2b are due to pointer swizzling; therefore, we added a new traversal called R2b that has pointer updates due to both pointer swizzling and assignments to pointer variables. Traversal R2b is similar to T2b except that it swaps the second and third connections in the set of incoming connections of an atomic part.

The code was compiled with GNU's gcc with optimization flag -O2. Each traversal was run in a single transaction as specified in [3]. The traversals were run on an isolated network.

4.2 Experiments

Figure 2 shows the time spent managing reference counts and freeing indirection table entries for T1, T2b and R2b. The bars labeled Configuration 2 correspond to a configuration with a 512 KB undo log and the lazy reference counting algorithm configured to start freeing indirection table entries during a transaction when the storage used up by entries with null reference counts reaches 1% of the client cache size. This is our default configuration, which was used to obtain the results presented in [4].

T1, T2b and R2b run in a single transaction, access significantly more data than fits in the client cache, and T2b and R2b modify more data than fits in the undo log (T2b modifies approximately 4 MB of data and R2b modifies approximately 3.5 MB of data). Therefore, the bars labeled Configuration 2 correspond to transactions that do not fall in the common case for which we designed our lazy reference counting algorithm. We believe such transactions are uncommonly large [7]. To simulate the behavior of the system with smaller transactions, we also ran T1, T2b and R2b with an undo log large enough to hold copies of all modified objects and without freeing entries in the middle of transactions. The bars labeled Configuration 1 correspond to the latter configuration.

The tops of the bars in Figure 2 were erased to represent the portion of the time spent managing reference counts that is overlapped with fetch and commit delays.

The experimental methodology we followed to obtain the overheads presented in Figure 2 consisted in comparing the execution times of our system and a version of our system from which all reference counting code was removed. Therefore, the numbers we present include the overheads due to increased instruction cache and TLB misses. In fact, the foreground overhead of reference counting in traversals T1 and T2b is significantly higher than we expected, and we believe this is due to conflicts in our processor's direct-mapped code cache.

[Figure 2: Reference Count Management Overhead. Time overhead in seconds for T1, T2b and R2b under Configuration 1 (4 MB log, 20% waste) and Configuration 2 (0.5 MB log, 1% waste); the white bars represent overhead that was overlapped with fetch and commit delays.]

The results in Figure 2 show that the system performs better in Configuration 1, which we expect will be the behavior in the common case. This happens because reference count updates and the freeing of entries can be delayed until the end of a transaction, which reduces the total amount of work to be performed. The results also show that performing reference count management operations in the background reduces the overhead.

Table 1 shows the reference counting overhead relative to the total traversal time. The table shows the best-case and worst-case overhead for each traversal. The best case corresponds to Configuration 1 with overlapping, and the worst case corresponds to Configuration 2 without overlapping. The important fact to retain is that the overhead is low: it is lower than 8% in the worst case, lower than 4% in the best case, and we expect most traversals to fall in the best case.

                  T1    T2b    R2b
    Worst Case    6%     8%     8%
    Best Case     3%     4%     3%

Table 1: Reference counting overhead relative to total traversal time

5 Conclusion

In a companion paper [4], we introduced a novel cache management technique, HAC, and showed that it outperforms other techniques across a wide range of cache sizes and workloads. This report's main contribution is a detailed description of lazy reference counting, a technique that solves a specific problem in HAC: how to discard indirection table entries in an indirect pointer swizzling scheme. This report also shows that lazy reference counting has a low cost, because it reduces the amount of reference count management work and performs a significant portion of this work in the background while waiting for the replies to fetch and commit requests.

References

[1] A. Adya, R. Gruber, B. Liskov, and U. Maheshwari. Efficient optimistic concurrency control using loosely synchronized clocks. In Proceedings of the SIGMOD International Conference on Management of Data, pages 23-34. ACM Press, May 1995.

[2] M. Carey et al. The EXODUS extensible DBMS project: An overview. In Readings in Object-Oriented Database Systems, pages 474-499. Morgan Kaufmann, 1990.

[3] M. J. Carey, D. J. DeWitt, and J. F. Naughton. The OO7 benchmark. Technical Report 1140 (revised version dated 7/21/1994), University of Wisconsin-Madison, 1994. At ftp://ftp.cs.wisc.edu/OO7.

[4] Miguel Castro, Atul Adya, Barbara Liskov, and Andrew C. Myers. HAC: Hybrid Adaptive Caching for Distributed Storage Systems. In 16th ACM Symposium on Operating System Principles, Saint-Malo, France, October 1997.

[5] G. Collins. A method for overlapping and erasure of lists. Communications of the ACM, 2(12), December 1960.

[6] L. Deutsch and D. Bobrow. An efficient, incremental, automatic garbage collector. Communications of the ACM, 19(9), October 1976.

[7] R. Gruber. Optimism vs. Locking: A Study of Concurrency Control for Client-Server Object-Oriented Databases. PhD thesis, M.I.T., Cambridge, MA, 1997.

[8] Barbara Liskov, Atul Adya, Miguel Castro, Mark Day, Sanjay Ghemawat, Robert Gruber, Umesh Maheshwari, Andrew C. Myers, and Liuba Shrira. Safe and efficient sharing of persistent objects in Thor. In Proc. SIGMOD Int'l Conf. on Management of Data, pages 318-329, June 1996.

[9] Barbara Liskov, Dorothy Curtis, Mark Day, Sanjay Ghemawat, Robert Gruber, Paul Johnson, and Andrew C. Myers. Theta Reference Manual. Programming Methodology Group Memo 88, MIT Laboratory for Computer Science, Cambridge, MA, February 1994. Available at http://www.pmg.lcs.mit.edu/papers/thetaref/.

[10] J. Eliot B. Moss. Working with persistent objects: To swizzle or not to swizzle. IEEE Transactions on Software Engineering, 18(3), August 1992.

[11] T. Kaehler and G. Krasner. LOOM: Large Object-Oriented Memory for Smalltalk-80 Systems, pages 298-307. Morgan Kaufmann Publishers, Inc., San Mateo, CA, 1990.

[12] S. J. White and D. J. DeWitt. QuickStore: A high performance mapped object store. In SIGMOD '94, pages 187-198, 1994.

[13] Seth J. White and David J. DeWitt. A performance study of alternative object faulting and pointer swizzling strategies. In Proceedings of the Eighteenth International Conference on Very Large Data Bases, pages 419-431, Vancouver, BC, Canada, 1992.

[14] P. Wilson. Uniprocessor garbage collection techniques. ACM Computing Surveys, 1996.