Lazy Reference Counting for Transactional Storage Systems
Miguel Castro, Atul Adya, Barbara Liskov
Laboratory for Computer Science,
Massachusetts Institute of Technology,
545 Technology Square, Cambridge, MA 02139
{castro,adya,liskov}@lcs.mit.edu
Abstract

Hac is a novel technique for managing the client cache in a distributed, persistent object storage system. In a companion paper, we showed that it outperforms other techniques across a wide range of cache sizes and workloads. This report describes Hac's solution to a specific problem: how to discard indirection table entries in an indirect pointer swizzling scheme. Hac uses lazy reference counting to solve this problem. Instead of eagerly updating reference counts when objects are modified and eagerly freeing unreferenced entries, which can be expensive, we perform these operations lazily in the background while waiting for replies to fetch and commit requests. Furthermore, we introduce a number of additional optimizations to reduce the space and time overheads of maintaining reference counts. The net effect is that the overhead of lazy reference counting is low.

Keywords: reference counting, pointer swizzling, transactions, caching, performance

1 Introduction

Hac is a novel technique for managing the client cache in a client-server, persistent object storage system [4]. It combines page and object caching to reduce the miss rate in client caches dramatically while keeping overheads low. Hac provides improved performance over earlier approaches, by more than an order of magnitude on common workloads [4]. The contribution of this report is to document lazy reference counting: the technique used by Hac to solve the specific problem of discarding indirection table entries in an indirect pointer swizzling scheme. (This research was supported in part by the Advanced Research Projects Agency of the Department of Defense, monitored by the Office of Naval Research under contract N00014-91-J-4136. M. Castro is supported by a PRAXIS XXI fellowship.)

Pointer swizzling [11] is a well-established technique to speed up pointer traversals in object-oriented databases. For objects in the client cache, it replaces contained global object names by virtual memory pointers, thereby avoiding the need to translate the object name to a pointer every time a reference is used. Pointer swizzling schemes can be classified as direct [13] or indirect [11, 10]. In direct schemes, swizzled pointers point directly to an object, whereas in indirect schemes swizzled pointers point to an indirection table entry that contains a pointer to the object. Indirection allows Hac to efficiently evict cold objects from the cache and compact hot objects to avoid storage fragmentation: when an object is evicted or moved, it is not necessary to find and correct all pointers to the object; instead, we can just fix the object's entry in the indirection table. However, it introduces the problem of reclaiming storage in the indirection table.

Reference counting [5] is a simple incremental automatic storage reclamation technique. It maintains a count of the number of references pointing to an item and reclaims the item when its count drops to zero. Existing reference counting schemes suffer from four problems [14]: space overheads, the problem of cycles, the cost of reclaiming the unreferenced entries, and the cost of changing reference counts at every pointer update. Our lazy reference counting scheme overcomes these problems. First, it uses only one-byte reference counts, with larger counts stored in a separate table. Second, the inability to collect cyclic garbage using reference counts is not a problem here since evicting objects from the cache breaks cycles. Third, freeing an unreferenced entry is achieved by simply setting a bit in the entry; reusing the entry is similarly inexpensive. Finally, reference counts are not corrected at every pointer update but only once, at the end of a transaction, to account for all of the transaction's modifications. Furthermore, lazy reference counting performs reference count adjustments and frees unreferenced entries in the background while the client is waiting for replies to fetch and commit requests.

This report describes how lazy reference counting is implemented in Hac [4] and presents the results of experiments that evaluate its performance. The results show that the lazy reference counting scheme has low space and time overheads.

Even though we focus on the implementation of lazy reference counting in Hac, the algorithm and most of the optimizations we introduce could be applied to solve other related problems. For example, assume a client cache management scheme where direct swizzling is used and pages are evicted by read-protecting the virtual page frame where they were mapped. In this setting, lazy reference counting can be used to determine when there are no more pointers to the virtual page frame, allowing it to be reused to map a different page.

The rest of the report is organized as follows. Section 2 presents an overview of our system and object naming scheme. Section 3 describes lazy reference counting. The results of our experiments are presented in Section 4.

2 System Overview

Hac and lazy reference counting are part of Thor-1, a new implementation of the Thor object-oriented database system; more information about Thor-1 can be found in [4]. This section begins with a brief overview of the client and server architectures. Next, it describes object naming and indirect swizzling.

2.1 Clients and Servers

The system is made up of clients and servers. Servers provide persistent, highly available storage for objects. Clients cache copies of persistent objects. Application transactions run entirely at clients, on the cached copies. More information about the client interface to the database can be found in [8]. We serialize transactions using optimistic concurrency control; the mechanism is described in [1, 7]. Our concurrency control algorithm outperforms the best pessimistic methods on almost all workloads [7].

Hac partitions the client cache into page frames and fetches entire pages from the server. To make room for an incoming page, Hac selects some page frames for compaction, discards the cold objects in these frames, and compacts the hot objects to free one of the selected frames. The approach is illustrated in Figure 1. This compaction process is performed in the background while waiting for the fetch reply.

[Figure 1: Compaction of pages by Hac. Before compaction, hot and cold objects are mixed across frames; after compaction, the cold objects are discarded, the hot objects are packed together, and one of the selected frames becomes free.]

While a transaction runs, the client keeps track of objects it used in the read set, and objects it modified in the write set. Additionally, when a transaction modifies an object for the first time, the client makes a copy of the object in the undo log. These copies are used to restore modified objects to their previous state when transactions abort. The copies are important because they speed up the re-execution of aborted transactions. This mechanism is essential for our optimistic concurrency control scheme to outperform locking alternatives [7]. The space dedicated to holding the undo log has a limited size; when it fills up, no more copies are created. We expect a small buffer to be capable of holding copies of all modified objects for most transactions [7].
the names of ob jects that were read and written swizzle table; it is used to translate ob ject names
during the current transaction. The server vali- into p ointers to indirection table entries. The
dates the transaction against those that have al- table is implemented as an over ow chaining hash
ready committed or prepared we use a two-phase table. Eachentry in the indirection table contains
proto col if the transaction used ob jects at multiple a p ointer to the elds of some ob ject or null if that
servers. If validation succeeds, information ab out ob ject has b een discarded, a reference count, and
the committing transaction, including all its mo di- a next p ointer used to link the entries that make
up a hash table bucket. cations is forced to the transaction log; only after
the information has b een recorded on stable stor-
3 Reference Count Management
age do es the server inform the client of the commit.
When a transaction that mo di ed ob ject o
Indirection table entries are freed using the lazy
commits, it causes all other clients that cache o
reference counting scheme. The scheme takes ad-
to contain out of date information. We cause
vantage of the copies maintained by our concur-
invalid information to be ushed from caches
rency control algorithm in the undo log to up-
by sending invalidations piggybacked on other
date reference counts and free entries eciently at
messages already b eing sent.
commit time other schemes maintain similar undo
logs [12 , 2 ].
2.2 Ob ject References
The lazy reference counting scheme is optimized
Ob jects stored on disk refer to one another using
for the common case | transactions whose base
names. The name of an ob ject identi es the
copies of mo di ed ob jects t in the undo log
disk page where the ob ject is stored, but not
and whose accessed ob jects t in a small fraction
the lo cation of the ob ject's data within that
of the client cache [7 ]. We start by presenting
page. This is imp ortant b ecause it allows storage
the algorithm that handles the common case
management at servers to avoid fragmentation by
and later explain how it is extended to handle
moving ob jects within their pages indep endently
transactions that access and mo dify a large number
from other clients or servers, since no other server
of ob jects. We nish the section by describing key
or client knows the actual lo cation where an ob ject
implementation details.
resides.
3.1 The Base Algorithm
Using names as references is not appropriate for
the ob jects cached at the client, b ecause every
Let e be an entry in the indirection table. The
dereference requires a translation of the name to
reference count eld of e keeps an approximate
the lo cation of the cached ob ject. Instead we use
count of the number of swizzled references to e.
an indirect form of lazy swizzling on discovery [13 ];
We call this count the current reference count of
references are swizzled to p oint to an entry in an
e or CRCe. The actual reference count of e or
indirection table that p oints to the ob ject's elds.
ARCe is the exact numb er of swizzled references
Swizzling is p erformed when a reference is loaded
that refer to e in the stack, registers or instance
from an instance variable into a lo cal variable. This
variables of cached ob jects. We de ne stacke
is implemented using edge marking [10 ]; references
as the number of swizzled references to e from
are marked as swizzled or unswizzled using one
the stack or registers. The function refso; e is
bit. When a reference is loaded from an instance
de ned as the numb er of swizzled references in the
variable into a lo cal variable, a check is p erformed
instance variables of ob ject o that refer to e, and
to determine whether the reference is swizzled or
the function refdo maps o to the multiset whose
not. If the reference is not swizzled, the ob ject
elements are the indirection table entries p ointed
name is translated into a p ointer to the ob ject's
to by each swizzled reference in o.
entry in the indirection table this may require
Eager reference counting schemes ensure that
fetching the ob ject if it is not cached, and the
CRCe is always equal to ARCe for all e, but in
resulting value is stored b oth in the lo cal variable
our lazy reference counting scheme they are usually
and back in the instance variable.
di erent. Instead, our system satis es the following
invariant: In our system, the indirection table is also a 3
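Concretely, the quantities involved can be rendered in a small Python sketch. This is our own rendering, not Thor-1 code: entries and objects are toy classes, and `arc` computes the exact count by scanning, which the real system of course never does.

```python
class Entry:
    """Indirection table entry: pointer to the object's fields, a CRC,
    and a link for the hash bucket's overflow chain."""
    def __init__(self, obj=None):
        self.obj = obj        # None once the object has been discarded
        self.crc = 0          # current reference count (approximate)
        self.next = None      # next entry in the same hash bucket

class Obj:
    """Toy cached object: its instance variables are a list of references."""
    def __init__(self, fields=()):
        self.fields = list(fields)

def refs(obj, entry):
    """refs(o, e): swizzled references in o's instance variables that refer to e."""
    return sum(1 for r in obj.fields if r is entry)

def arc(entry, stack, objects):
    """ARC(e): exact number of swizzled references to e, from the stack
    (and registers) plus the instance variables of all cached objects."""
    return sum(1 for r in stack if r is entry) + \
           sum(refs(o, entry) for o in objects)
```

In this sketch, `entry.crc` is free to drift from `arc(entry, ...)` between commits, which is exactly the gap the invariant below accounts for.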
Instead, our system satisfies the following invariant:

Invariant 1: For each entry e in the indirection table and all objects y modified during the current transaction, the actual number of swizzled references to e is given by:

ARC(e) = CRC(e) + stack(e) + Σ_y (refs(y, e) − refs(y′, e))

where y′ is the copy of y in the undo log.

The invariant captures two important features of our algorithm. First, CRC(e) never accounts for references in the stack or registers. Second, CRC(e) does not account for references in the new versions of objects modified during a transaction until it commits, but accounts for the swizzled references in the objects' copies in the undo log. The first feature is similar to what deferred reference counting [6] provides, but the second is new.

We now prove that the system preserves Invariant 1 by induction on the length of the execution, and simultaneously explain the various steps in our algorithm. The invariant holds vacuously for the base case because initially there are no objects in the cache. For the induction step, there are six operations that can change the reference counts in the system; we show that the invariant is preserved in all these cases:

(a) Swizzling: When a reference to entry e from object o is swizzled, ARC(e) increases by one. If o has not been modified, our algorithm increments CRC(e) by one and the invariant is preserved. Otherwise, the algorithm leaves CRC(e) unchanged, but refs(o, e) increases by one and the invariant is maintained.

(b) Creation of copy in undo log: Our system creates a copy y′ of an object y in the undo log the first time y is modified during a transaction. The invariant is preserved because initially refs(y, e) = refs(y′, e) for all e.

(c) Reference modification: Suppose a reference field is updated in object y such that it now refers to entry f instead of containing a reference to object o. (The technique we use to swizzle references ensures that, when a new reference value is assigned to a reference field, the new value is swizzled.) No CRC is modified by our algorithm when this happens. Let us consider two cases:

If the reference to o was swizzled and pointed to o's corresponding indirection table entry e, ARC(f) increases by one and ARC(e) decreases by one. The invariant is preserved because refs(y, f) increases by one while refs(y, e) decreases by one.

If the reference to o was not swizzled, only ARC(f) changes; it increases by one. The invariant is preserved because refs(y, f) increases by one. Note that even if o has an associated entry e, refs(y, e) will not change in this case.

(d) Eviction: When an object o is evicted, the ARC of entries in refd(o) decreases. Our algorithm decrements the CRC for all these entries and therefore maintains the invariant. In the current version of the system, we never evict objects modified by the current transaction [4].

(e) Invalidation: Servers send invalidations to a client when an object o in that client's cache is modified by another client. When such an invalidation is received, o is removed from the cache. Thus, as in the eviction case, the ARC of entries in refd(o) decreases. Two cases may occur:

If the invalidated object was not modified by the current transaction, the object is simply evicted using procedure (d).

If the invalidated object o was modified by the current transaction, our algorithm preserves Invariant 1 by decrementing CRC(e) for each e in refd(o′) and discarding o′, o's copy.

After processing the invalidation, if any invalidated object was used by the current transaction, the transaction aborts, triggering the abort processing described next.

(f) Transaction commit: We distinguish two cases:

If the commit is successful, our lazy reference counting algorithm computes the value of refs for each modified object y and its copy y′; increments CRC(e) by Σ_y (refs(y, e) − refs(y′, e)) for each entry e; and clears the transaction logs. Thus, the invariant is maintained.

If the transaction aborts, the system reverts all modified objects to their copies and clears the transaction logs. Therefore, the value of ARC(e) decreases by refs(y, e) − refs(y′, e) for every entry e and modified object y. The invariant is preserved because y is reverted to the value in its copy.
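The successful-commit adjustment of case (f) can be sketched compactly in Python. This is our own simplification: an entry is reduced to its CRC, and an object is reduced to a plain list of entry references, so that iterating over the list realizes the per-entry terms refs(y, e) and refs(y′, e).

```python
class Entry:
    """Minimal stand-in for an indirection table entry: just a CRC."""
    def __init__(self):
        self.crc = 0

def commit_adjust(modified, undo_log):
    """Case (f), successful commit: for every modified object y with copy y',
    add refs(y, e) - refs(y', e) to CRC(e), then clear the transaction logs."""
    for obj_id, y in modified.items():
        for e in y:                    # references in the committed new state
            e.crc += 1
        for e in undo_log[obj_id]:     # references in the undo-log copy
            e.crc -= 1
    modified.clear()
    undo_log.clear()

def free_entries(table):
    """After commit processing, ARC(e) == CRC(e); entries with CRC == 0
    can be freed."""
    return [e for e in table if e.crc == 0]
```

Because the increments and decrements are applied per object, the net effect per entry is exactly the sum in case (f).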
In the current implementation of our system, there are no swizzled references on the stack or registers at commit time, i.e., stack(e) = 0 for all entries e. Therefore, just after the commit processing described in (f) is performed, ARC(e) = CRC(e) for every e. At this point, our algorithm frees entries e such that CRC(e) = 0, ensuring the following basic safety and liveness properties.

Theorem 1 (Safety): Indirection table entries are freed only if there are no swizzled references referring to them.

Theorem 2 (Liveness): Indirection table entries with no swizzled references referring to them are freed when a transaction terminates, unless the entry corresponds to an object that is still cached.

3.2 Algorithm Extensions

When objects are evicted, the ARC of some indirection table entries may drop to zero. For transactions that access a large amount of data and incur many client cache misses, the amount of storage wasted by unreferenced indirection table entries may be significant. This section explains how the algorithm is extended to allow freeing of indirection table entries during transactions and to handle transactions that modify more data than what fits in the undo log.

Let eager be a subset of the objects modified by the current transaction. The extended algorithm updates reference counts eagerly to account for modifications to references contained in objects in the eager set. The algorithm inserts into eager modified objects whose copies do not fit in the undo log. Additionally, when the amount of storage wasted by unreferenced entries is significant relative to the total client cache size (e.g., when the storage used up by entries whose CRC dropped to zero is 1% of the client cache size), the algorithm inserts all modified objects in eager in order to allow freeing of entries during the transaction. This involves incrementing the reference counts of entries pointed to by modified objects that are still not in eager, and it is performed while waiting for the reply to a fetch request.

The extended algorithm verifies a modified invariant (this can be shown using a proof similar to the one presented in Section 3.1):

Invariant 2: For each entry e in the indirection table, all objects y ∉ eager modified during the current transaction, and all copies x′ of objects x ∈ eager, the actual number of swizzled references to e is given by:

ARC(e) = CRC(e) + stack(e) − Σ_x refs(x′, e) + Σ_y (refs(y, e) − refs(y′, e))

where y′ is y's copy in the undo log.

The term Σ_x refs(x′, e) reflects the fact that CRC(e) accounts for both the references that point to e in objects in eager and in their copies. This guarantees that entries e that are pointed to by copies have CRC(e) > 0 until the transaction commits or aborts. Therefore, these entries are not freed during a transaction, making it possible to revert the state of modified objects in eager to their copies if the transaction aborts.

The extended algorithm works as follows:

(a1) Swizzling: When a reference to entry e from an object y ∈ eager is swizzled, the algorithm increments CRC(e) by one. For objects not in eager, it proceeds as described in (a).

(b1) Creation of copy in undo log: If the undo log is full the first time an object y is modified during a transaction, the algorithm adds y to eager. Otherwise, the algorithm proceeds as in (b).

(c1) Pointer modification: When a pointer contained in object y ∈ eager is modified to point to entry f instead of containing a reference to object o, the algorithm increments CRC(f) by one. If the reference to o was swizzled and pointed to o's corresponding indirection table entry e, CRC(e) is decremented by one. For objects not in eager, the algorithm behaves as described in (c).

(d1) Eviction: While waiting for the reply to a fetch request, the algorithm computes the value of refs for each modified object y ∉ eager; increments CRC(e) by Σ_y refs(y, e); and adds all y to eager. Then, for each evicted object o, it decrements CRC(e) for each e in refd(o) and frees e if CRC(e) = 0 ∧ stack(e) = 0.

(e1) Invalidation: If the client receives an invalidation for an object y ∈ eager, the algorithm decrements CRC(e) for each e ∈ refd(y). If y has a copy y′, the algorithm decrements CRC(e) for each e ∈ refd(y′). The algorithm also performs the processing described in (e).

(f1) Transaction commit: We distinguish two cases:

If the commit is successful: The algorithm decrements CRC(e) for each e ∈ refd(y′) where y′ is a copy of some object in eager. No processing is performed for objects in eager without a copy. For objects not in eager, the algorithm proceeds as described in (f).

If the transaction aborts: For each y ∈ eager, the algorithm decrements CRC(e) for all e in refd(y). If y has a copy y′, the algorithm also decrements CRC(e) for each e ∈ refd(y′) and reverts y to its copy. No processing is required for objects not in eager.

Once more, just after the commit processing described in (f1) is performed, ARC(e) = CRC(e) for every e, and the algorithm frees entries with CRC(e) = 0. Using Invariant 2 it is possible to show that the extended algorithm also verifies the safety and liveness theorems.

3.3 Implementation and Optimizations

This section describes the key overheads in the implementation of lazy reference counting and presents the techniques used to reduce them.

3.3.1 Space Overheads

The reference count field in the indirection table is only 1 byte, which should be enough to store the CRC of most entries. If an overflow occurs, the field takes the value 255 and the actual reference count is stored in a hash table keyed by the index of the entry in the indirection table. The algorithm also uses 2 bits in the object header: the pointer-update bit, which is set if and only if a pointer in the object was modified during the current transaction; and the eager bit, which is set if and only if the object is in the eager set. The pointer-update bit is set by code that is automatically generated by our Theta [9] compiler.

3.3.2 Time Overheads

We now discuss the time overhead introduced by each step in the lazy reference counting algorithm and the optimizations used to reduce it. We focus on the common case where all data modified by a transaction fits in the undo log and no entries are freed during a transaction (i.e., the eager set is empty).

Adjusting reference counts when a reference is swizzled can potentially add an extra cache miss. To avoid this overhead, our swizzle table is also the indirection table. Therefore, when a reference is swizzled, its corresponding entry in the indirection table is accessed and the reference count adjustment is inexpensive.

When an object is evicted, its references are scanned and, for each swizzled reference, the reference count of the target entry is decremented; if it drops to zero, the entry's index is appended to the zero log if the corresponding object was discarded from the cache. Similarly, if the entry being discarded also has a null reference count, it is added to the zero log. This process is memory intensive and relatively expensive. To reduce the overhead of adjusting reference counts when objects are evicted, our system performs this processing while waiting for the reply to a fetch request. Unless the client is multi-threaded or multi-programmed, this allows us to completely overlap eviction time adjustment of reference counts with the fetch delay. Therefore, eviction time processing usually introduces no overhead.

Commit time processing is divided in two steps. The first step adjusts reference counts to reflect the current state of modified objects. It compares modified objects with their copies to determine which references were modified, and adjusts reference counts accordingly. As an optimization, the comparison is only performed if the pointer-update bit is set, because otherwise it is known that no adjustments are required.

In a system that performs pointer swizzling, the client must unswizzle references in modified objects before it sends their new states to the server in a commit request. Our implementation takes advantage of this by dividing the first step into an increment phase and a decrement phase. The increments are all performed when references are unswizzled and introduce a negligible overhead because the unswizzling code accesses the entries whose counts need to be incremented. The decrement phase is performed in the background while waiting for the reply to the commit request and introduces no additional delay unless the client is multi-threaded or multi-programmed, in which case it may introduce some delay.
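The two phases of the first step can be sketched as follows. This is a simplified Python rendering under a toy object model of our own (objects as lists of entry references); in the real system the increments piggyback on the unswizzling code and the decrements run while waiting for the commit reply.

```python
class Entry:
    """Minimal stand-in for an indirection table entry: just a CRC."""
    def __init__(self):
        self.crc = 0

def increment_phase(modified):
    """Performed while unswizzling: bump the CRC of every entry referenced
    by the new state of each modified object."""
    for y in modified.values():
        for e in y:
            e.crc += 1

def decrement_phase(undo_log, zero_log):
    """Performed in the background while waiting for the commit reply:
    decrement the CRC for references held by the undo-log copies, and
    record entries whose count drops to zero in the zero log."""
    for y_copy in undo_log.values():
        for e in y_copy:
            e.crc -= 1
            if e.crc == 0:
                zero_log.append(e)
```

The net effect of the two phases on each entry is the same commit-time sum as before; splitting it lets the increments ride on work the client must do anyway.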
Entries whose reference counts drop to zero during the decrement phase are marked as free. However, since these entries are pointed to by the copies of modified objects, the changes performed while freeing them are logged such that they can be undone in case the transaction aborts.

The second step iterates over the zero log and marks as free any entry whose reference count is still zero and whose corresponding object was discarded from the cache. The second step is also performed in the background while waiting for the reply to the commit request. No logging is required in this case because the objects that referred to these entries were discarded from the cache.

If marking an entry as free required removing the entry from its hash bucket chain in the swizzle table and inserting it in some free list, commit time processing could be excessive. We avoid this overhead by simply setting a bit in the entry. The free entry remains linked to its hash bucket chain and will be reused when a new object whose name hashes to the same bucket is installed in the indirection table. This technique works well in practice because our multiplicative hashing function spreads the entries across buckets very uniformly.

4 Performance Evaluation

This section shows that the overheads introduced by our lazy reference counting scheme are low.

4.1 Experimental Setup

The experiments ran a single client on a DEC 3000/400 workstation, and a server on another, identical workstation. Both workstations had 128 MB of memory and ran OSF/1 version 3.2. They were connected by a 10 Mb/s Ethernet and had DEC LANCE Ethernet interfaces. The database was stored on a Seagate ST-32171N disk, with a peak transfer rate of 15.2 MB/s, an average read seek time of 9.4 ms, and an average rotational latency of 4.17 ms. The server cache was 36 MB and the client cache was 12 MB. We used 8 KB pages.

The experiments ran the single-user OO7 benchmark [3]. The OO7 database contains a tree of assembly objects, with leaves pointing to three composite parts chosen randomly from among 500 such objects. Each composite part contains a graph of atomic parts linked by connection objects; each atomic part has 3 outgoing connections. We used the medium database, which has 200 atomic parts per composite part and a total size of 37.8 MB. We implemented the database in Theta [9], following the specification [3] closely.

We ran traversals T1 and T2b. These traversals perform a depth-first traversal of the assembly tree and execute an operation on the composite parts referenced by the leaves of this tree. T1 is read-only; it reads the entire graph of a composite part. T2b is identical to T1 except that T2b modifies all the atomic parts. All pointer updates in T2b are due to pointer swizzling; therefore, we added a new traversal called R2b that has pointer updates due to both pointer swizzling and assignments to pointer variables. Traversal R2b is similar to T2b except that it swaps the second and third connections in the set of incoming connections of an atomic part.

The code was compiled with GNU's gcc with optimization flag -O2. Each traversal was run in a single transaction as specified in [3]. The traversals were run on an isolated network.

4.2 Experiments

Figure 2 shows the time spent managing reference counts and freeing indirection table entries for T1, T2b and R2b. The bars labeled Configuration 2 correspond to a configuration with a 512 KB undo log and the lazy reference counting algorithm configured to start freeing indirection table entries during a transaction when the storage used up by entries with null reference counts reaches 1% of the client cache size. This is our default configuration, which was used to obtain the results presented in [4].

T1, T2b and R2b run in a single transaction, access significantly more data than what fits in the client cache, and T2b and R2b modify more data than what fits in the undo log (T2b modifies approximately 4 MB of data and R2b modifies approximately 3.5 MB of data). Therefore, the bars labeled Configuration 2 correspond to transactions that do not fall in the common case for which we designed our lazy reference counting algorithm. We believe these transactions are uncommonly large [7]. To simulate the behavior of the system with smaller transactions, we also ran T1, T2b and R2b with an undo log large enough to hold copies of all modified objects and without freeing entries in the middle of transactions. The bars labeled Configuration 1 correspond to the latter configuration.

The tops of the bars in Figure 2 were erased to represent the portion of the time spent managing reference counts that is overlapped with fetch and commit delays.

The experimental methodology we followed to obtain the overheads presented in Figure 2 consisted in comparing the execution times of our system and a version of our system from which all reference counting code was removed. Therefore, the numbers we present include the overheads due to increased instruction cache and TLB misses. In fact, the foreground overhead of reference counting in traversals T1 and T2b is significantly higher than we expected, and we believe this is due to conflicts in our processor's direct-mapped code cache.

[Figure 2: Reference Count Management Overhead; the white bars represent overhead that was overlapped with fetch and commit delays. The figure plots the time overhead in seconds for T1, T2b and R2b under Configuration 1 (4 MB log, 20% waste) and Configuration 2 (0.5 MB log, 1% waste).]

The results in Figure 2 show that the system performs better in Configuration 1, which we expect will be the behavior in the common case. This happens because reference count updates and freeing entries can be delayed until the end of a transaction, which reduces the total amount of work to be performed. The results also show that performing reference count management operations in the background reduces the overhead.

Table 1 shows the reference counting overhead relative to the total traversal time. This table shows best case and worst case overhead for each traversal. The best case corresponds to Configuration 1 with overlapping and the worst case corresponds to Configuration 2 without overlapping. The important fact to retain is that the overhead is low; it is lower than 8% in the worst case, lower than 4% in the best case, and we expect most traversals to fall in the best case.

             T1    T2b   R2b
Worst Case   6%    8%    8%
Best Case    3%    4%    3%

Table 1: Reference Counting Overhead Relative to Total Traversal Time

5 Conclusion

In a companion paper [4], we introduced a novel cache management technique, Hac, and showed that it outperforms other techniques across a wide range of cache sizes and workloads. This report's main contribution is a detailed description of lazy reference counting, a technique that solves a specific problem in Hac: how to discard indirection table entries in an indirect pointer swizzling scheme. This report also shows that lazy reference counting has a low cost because it reduces the amount of reference count management work and performs a significant portion of this work in the background while waiting for the replies to fetch and commit requests.

References

[1] A. Adya, R. Gruber, B. Liskov, and U. Maheshwari. Efficient optimistic concurrency control using loosely synchronized clocks. In Proceedings of the SIGMOD International Conference on Management of Data, pages 23-34. ACM Press, May 1995.
[2] M. Carey et al. The EXODUS extensible DBMS project: An overview. In Readings in Object-Oriented Database Systems, pages 474-499. Morgan Kaufmann, 1990.

[3] M. J. Carey, D. J. DeWitt, and J. F. Naughton. The OO7 benchmark. Technical Report 1140 (revised version dated 7/21/1994), University of Wisconsin-Madison, 1994. At ftp://ftp.cs.wisc.edu/OO7.

[4] Miguel Castro, Atul Adya, Barbara Liskov, and Andrew C. Myers. HAC: Hybrid Adaptive Caching for Distributed Storage Systems. In 16th ACM Symp. on Operating System Principles, Saint-Malo, France, October 1997.

[5] G. Collins. A method for overlapping and erasure of lists. Communications of the ACM, 3(12), December 1960.

[6] L. Deutsch and D. Bobrow. An efficient, incremental, automatic garbage collector. Communications of the ACM, 19(9), September 1976.

[7] R. Gruber. Optimism vs. Locking: A Study of Concurrency Control for Client-Server Object-Oriented Databases. PhD thesis, M.I.T., Cambridge, MA, 1997.

[8] Barbara Liskov, Atul Adya, Miguel Castro, Mark Day, Sanjay Ghemawat, Robert Gruber, Umesh Maheshwari, Andrew C. Myers, and Liuba Shrira. Safe and efficient sharing of persistent objects in Thor. In Proc. SIGMOD Int'l Conf. on Management of Data, pages 318-329, June 1996.

[9] Barbara Liskov, Dorothy Curtis, Mark Day, Sanjay Ghemawat, Robert Gruber, Paul Johnson, and Andrew C. Myers. Theta Reference Manual. Programming Methodology Group Memo 88, MIT Laboratory for Computer Science, Cambridge, MA, February 1994. Available at http://www.pmg.lcs.mit.edu/papers/thetaref/.

[10] J. Eliot B. Moss. Working with persistent objects: To swizzle or not to swizzle. IEEE Transactions on Software Engineering, 18(8), August 1992.

[11] T. Kaehler and G. Krasner. LOOM - Large Object-Oriented Memory for Smalltalk-80 Systems, pages 298-307. Morgan Kaufmann Publishers, Inc., San Mateo, CA, 1990.

[12] S. J. White and D. J. DeWitt. QuickStore: A high performance mapped object store. In SIGMOD '94, pages 187-198, 1994.

[13] Seth J. White and David J. DeWitt. A performance study of alternative object faulting and pointer swizzling strategies. In Proceedings of the Eighteenth International Conference on Very Large Data Bases, pages 419-431, Vancouver, BC, Canada, 1992.

[14] P. Wilson. Uniprocessor garbage collection techniques. ACM Computing Surveys, 1996.