Lazy Reference Counting for Transactional Storage Systems
Miguel Castro, Atul Adya, Barbara Liskov
Laboratory for Computer Science, Massachusetts Institute of Technology, 545 Technology Square, Cambridge, MA 02139
fcastro,adya,[email protected]

Abstract

Hac is a novel technique for managing the client cache in a distributed, persistent object storage system. In a companion paper, we showed that it outperforms other techniques across a wide range of cache sizes and workloads. This report describes Hac's solution to a specific problem: how to discard indirection table entries in an indirect pointer swizzling scheme. Hac uses lazy reference counting to solve this problem. Instead of eagerly updating reference counts when objects are modified and eagerly freeing unreferenced entries, which can be expensive, we perform these operations lazily in the background while waiting for replies to fetch and commit requests. Furthermore, we introduce a number of additional optimizations to reduce the space and time overheads of maintaining reference counts. The net effect is that the overhead of lazy reference counting is low.

Keywords: reference counting, pointer swizzling, transactions, caching, performance

This research was supported in part by the Advanced Research Projects Agency of the Department of Defense, monitored by the Office of Naval Research under contract N00014-91-J-4136. M. Castro is supported by a PRAXIS XXI fellowship.

1 Introduction

Hac is a novel technique for managing the client cache in a client-server, persistent object storage system [4]. It combines page and object caching to reduce the miss rate in client caches dramatically while keeping overheads low. Hac provides improved performance over earlier approaches, by more than an order of magnitude on common workloads [4]. The contribution of this report is to document lazy reference counting: the technique used by Hac to solve the specific problem of discarding indirection table entries in an indirect pointer swizzling scheme.

Pointer swizzling [11] is a well-established technique to speed up pointer traversals in object-oriented databases. For objects in the client cache, it replaces contained global object names by virtual memory pointers, thereby avoiding the need to translate the object name to a pointer every time a reference is used. Pointer swizzling schemes can be classified as direct [13] or indirect [11, 10]. In direct schemes, swizzled pointers point directly to an object, whereas in indirect schemes swizzled pointers point to an indirection table entry that contains a pointer to the object. Indirection allows Hac to efficiently evict cold objects from the cache and compact hot objects to avoid storage fragmentation: when an object is evicted or moved, it is not necessary to find and correct all pointers to the object; instead, we can just fix the object's entry in the indirection table. However, it introduces the problem of reclaiming storage in the indirection table.

Reference counting [5] is a simple incremental automatic storage reclamation technique. It maintains a count of the number of references pointing to an item and reclaims the item when its count drops to zero. Existing reference counting schemes suffer from four problems [14]: space overheads, the problem of cycles, the cost of reclaiming the unreferenced entries, and the cost of changing reference counts at every pointer update. Our lazy reference counting scheme overcomes these problems. First, it uses only one-byte reference counts with larger counts stored in a separate table. Second, the inability to collect cyclic garbage using reference counts is not a problem here since evicting objects from the cache breaks cycles. Third, freeing an unreferenced entry is achieved by simply setting a bit in the entry; reusing the entry is similarly inexpensive. Finally, reference counts are not corrected at every pointer update but only once at the end of a transaction to account for all of the transaction's modifications. Furthermore, lazy reference counting performs reference count adjustments and frees unreferenced entries in the background while the client is waiting for replies to fetch and commit requests.

This report describes how lazy reference counting is implemented in Hac [4] and presents the results of experiments that evaluate its performance. The results show that the lazy reference counting scheme has low space and time overheads.

Even though we focus on the implementation of lazy reference counting in Hac, the algorithm and most of the optimizations we introduce could be applied to solve other related problems. For example, assume a client cache management scheme where direct swizzling is used and pages are evicted by read protecting the virtual page frame where they were mapped. In this setting, lazy reference counting can be used to determine when there are no more pointers to the virtual page frame, allowing it to be reused to map a different page.

The rest of the report is organized as follows. Section 2 presents an overview of our system and object naming scheme. Section 3 describes lazy reference counting. The results of our experiments are presented in Section 4.

2 System Overview

Hac and lazy reference counting are part of Thor-1, a new implementation of the Thor object-oriented database system; more information about Thor-1 can be found in [4]. This section begins with a brief overview of the client and server architectures. Next, it describes object naming and indirect swizzling.

2.1 Clients and Servers

The system is made up of clients and servers. Servers provide persistent, highly available storage for objects. Clients cache copies of persistent objects. Application transactions run entirely at clients, on the cached copies. More information about the client interface to the database can be found in [8]. We serialize transactions using optimistic concurrency control; the mechanism is described in [1, 7]. Our concurrency control algorithm outperforms the best pessimistic methods on almost all workloads [7].

Hac partitions the client cache into page frames and fetches entire pages from the server. To make room for an incoming page, Hac selects some page frames for compaction, discards the cold objects in these frames, and compacts the hot objects to free one of the selected frames. The approach is illustrated in Figure 1. This compaction process is performed in the background while waiting for the fetch reply.

Figure 1: Compaction of pages by Hac (panels: before compaction and after compaction; frames 14, 2, and 53; legend: cold object, hot object, free frame)

While a transaction runs, the client keeps track of objects it used in the read set, and objects it modified in the write set. Additionally, when a transaction modifies an object for the first time, the client makes a copy of the object in the undo log. These copies are used to restore modified objects to their previous state when transactions abort. The copies are important because they speed up the re-execution of aborted transactions. This mechanism is essential for our optimistic concurrency control scheme to outperform locking alternatives [7]. The space dedicated to holding the undo log has a limited size; when it fills up, no more copies are created. We expect a small buffer to be capable of holding copies of all modified objects for most transactions [7].

When a transaction commits, the client sends the new versions of all objects modified or created by the committing transaction to a server, along with the names of objects that were read and written during the current transaction. The server validates the transaction against those that have already committed or prepared (we use a two-phase protocol if the transaction used objects at multiple servers). If validation succeeds, information about the committing transaction, including all its modifications, is forced to the transaction log; only after the information has been recorded on stable storage does the server inform the client of the commit.

When a transaction that modified object o commits, it causes all other clients that cache o to contain out of date information. We cause invalid information to be flushed from caches by sending invalidations piggybacked on other messages already being sent.

swizzle table; it is used to translate object names into pointers to indirection table entries. The table is implemented as an overflow chaining hash table. Each entry in the indirection table contains a pointer to the fields of some object (or null if that object has been discarded), a reference count, and a next pointer used to link the entries that make up a hash table bucket.

3 Reference Count Management

Indirection table entries are freed using the lazy reference counting scheme. The scheme takes advantage of the copies maintained by our concurrency control algorithm in the undo log to update reference counts and free entries efficiently at
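
The indirection table entry and swizzle table described in this section can be illustrated with a minimal Python sketch. It is not the Thor-1 implementation. The names `Entry`, `SwizzleTable`, and `adjust_counts` are hypothetical, object names are modeled as plain integers, and the one-byte count with its separate overflow table is omitted. The sketch also assumes that end-of-transaction adjustment compares the references in an object's undo-log copy with those in its new version, and that an entry is marked free as soon as its count reaches zero.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    """One indirection table entry (field names are illustrative)."""
    name: int                        # global object name, modeled as an int
    fields: Optional[list] = None    # the object's fields, or None if discarded
    refcount: int = 0                # number of swizzled pointers to this entry
    next: Optional["Entry"] = None   # next entry in the same hash bucket
    free: bool = False               # the "free" bit: set when unreferenced


class SwizzleTable:
    """Overflow-chaining hash table from object names to entries."""

    def __init__(self, nbuckets: int = 8) -> None:
        self.buckets: List[Optional[Entry]] = [None] * nbuckets

    def lookup(self, name: int) -> Optional[Entry]:
        # Walk the bucket chain until the name matches or the chain ends.
        entry = self.buckets[name % len(self.buckets)]
        while entry is not None and entry.name != name:
            entry = entry.next
        return entry

    def install(self, name: int) -> Entry:
        """Return the entry for name, creating and chaining it if absent."""
        entry = self.lookup(name)
        if entry is None:
            idx = name % len(self.buckets)
            entry = Entry(name=name, next=self.buckets[idx])
            self.buckets[idx] = entry
        return entry


def adjust_counts(table: SwizzleTable,
                  old_refs: List[int],
                  new_refs: List[int]) -> None:
    """Apply one modified object's reference changes once, at transaction end.

    old_refs: names referenced by the undo-log copy (state before the
    transaction); new_refs: names referenced by the new version.
    Increments run first so that an entry referenced both before and after
    never passes through zero.
    """
    for name in new_refs:
        table.install(name).refcount += 1
    for name in old_refs:
        entry = table.lookup(name)
        if entry is None:
            continue
        entry.refcount -= 1
        if entry.refcount == 0:
            entry.free = True  # freeing is just setting a bit in the entry
```

Because the adjustment is driven by the before and after images of each modified object, a pointer field that is overwritten many times during a transaction costs the same as one that is overwritten once.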