The Journal of Systems and Software 71 (2004) 189–198 www.elsevier.com/locate/jss

DIPS: an efficient pointer swizzling strategy for incremental uncaching environments

Jun-Ki Min *, Chin-Wan Chung

Division of Computer Science, Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology, 373-1, Kusong-dong, Yusong-gu, Taejon 305-701, South Korea

Received 11 July 2002; received in revised form 11 January 2003; accepted 9 February 2003

Abstract

Pointer swizzling improves the performance of OODBMSs by reducing the number of table lookups. However, object replacement incurs the unswizzling overhead. In this paper, we propose a new pointer swizzling strategy, the dynamic indirect pointer swizzling (DIPS). DIPS dynamically applies pointer swizzling techniques in order to reduce the overhead of unswizzling. DIPS uses the temporal locality information which is gathered by the object buffer manager. The information is used to select the objects to whose pointers the pointer swizzling techniques are applied and to dynamically bind the pointer swizzling techniques using the virtual function mechanism. We show the efficiency of the proposed strategy through experiments over various object buffer sizes and workloads.

© 2003 Elsevier Inc. All rights reserved.

Keywords: Databases; Persistent objects; Smart pointers; Pointer swizzling

* Corresponding author. Tel.: +82-42-869-5577; fax: +82-42-869-3510/3577.
E-mail addresses: [email protected] (J.-K. Min), chungcw@islab.kaist.ac.kr (C.-W. Chung).

0164-1212/$ - see front matter © 2003 Elsevier Inc. All rights reserved. doi:10.1016/S0164-1212(03)00047-5

1. Introduction

In the past decades, there has been much research on persistent object systems such as OODBMSs since the object-oriented data model supports many important features such as the class concept, the aggregation relationship, and the inheritance relationship. Recently, due to the emergence of the internet and mobile devices, data such as XML, spatial-temporal data, and multimedia data have dramatically increased. These data are commonly represented by the object-oriented data model (eXcelon, 2002; Haahr et al., 2001; Pinheiro et al., 2000). Thus, the proliferation of these kinds of data has awakened the interest of the research community.

The applications of OODBMSs deal with complex structured data. Also, they manage a large volume of data and are computation intensive. Thus, performance is a key issue for the popularization of OODBMSs.

One of the performance improvement techniques for OODBMSs is pointer swizzling (Brahnmath et al., 1998; Kemper and Kossman, 1995; Moss, 1992; McAuliffe and Solomon, 1995). In OODBMSs, each persistent object has a unique object identifier (OID) and the reference to an object is represented by the referenced object's OID. In general, when an object is referenced, the system searches a table using an OID to find whether the object is in memory or not. This table lookup is a performance bottleneck in OODBMSs. The basic idea of pointer swizzling is to reduce the number of table lookups by replacing an OID with the main memory address of a persistent object.

However, since the applications of OODBMSs handle a massive amount of data, the overhead of unswizzling exists. When an object (called a victim) is replaced by the replacement algorithm (Effelsberg and Härder, 1984; Johnson and Shasha, 1994; O'Neil et al., 1993) or saved in the storage system by the concurrency control algorithm (Franklin et al., 1997; Voruganti et al., 1999), the swizzled pointers which point to the victim become dangling pointers. In addition, the swizzled pointers in the victim are invalid since the swizzled pointers are not OIDs but main memory addresses. Thus, the system must replace the dangling pointers and the swizzled pointers in the victim with proper OIDs. This is unswizzling (Kemper and Kossman, 1995; McAuliffe and Solomon, 1995).

According to the swizzling time and the structure of objects, the pointer swizzling techniques are very diverse. In spite of this variety, some persistent object systems (Reinwald et al., 1996; White and DeWitt, 1995; Wilson and Kakkad, 1992) adopt a pointer swizzling technique without considering their applications' behaviors.

However, there is no one pointer swizzling technique superior to the others over all situations (Kemper and Kossman, 1995). Thus, there has been some work about adaptive pointer swizzling techniques (Brahnmath et al., 1998; Kemper and Kossman, 1995). However, since these techniques fix the proper pointer swizzling techniques at compile time, the performance of the application can be reduced when the behavior of the application changes at run time. Furthermore, to apply these adaptive techniques, an application profile generator and/or a specific application compiler is needed in order to select a proper pointer swizzling technique for a particular application. This makes it difficult both to implement the components of a DBMS and to implement application programs.

1.1. Our contribution

In this paper, we propose a new pointer swizzling strategy, dynamic indirect pointer swizzling (DIPS). Basically, our work can be applied to general persistent object systems. However, since the representative of persistent object systems is the OODBMS, we present our work based on OODBMSs. As mentioned above, blindly applying a pointer swizzling technique degrades the performance. Thus, some adaptive techniques have been proposed. However, these adaptive techniques fix the pointer swizzling techniques at compile time. In contrast to these adaptive techniques, DIPS applies the pointer swizzling technique according to the status of objects at run time. The contributions of our approach are summarized as follows:

• Dynamic binding of the pointer swizzling technique at run time.
• Efficient gathering of the status of objects (i.e., temporal locality information) for the pointer swizzling during run time.
• Utilization of a general compiler (e.g., GNU C++) supporting the object-oriented paradigm.
• Modification of the object buffer manager in order to consider the structural relationships among objects.

In addition, we implemented DIPS and conducted an extensive experimental study over various object buffer sizes and workloads. Experimental results confirm that DIPS is superior to the other pointer swizzling techniques.

The rest of this paper is organized as follows: Section 2 presents the basic architecture and the behavior of persistent object systems and describes various pointer swizzling techniques. In Section 3, we introduce our DIPS strategy and the intuitive behavior. Section 4 contains experimental results, which show the efficiency of our proposed strategy. Finally, conclusions of our study and some areas for future research are given in Section 5.

2. Foundation

First, to aid understanding, we present the procedural steps of the manipulation of objects on typical OODBMSs (Cattell and Barry, 2000).

As shown in Fig. 1, user defined objects are specified by the object definition language (ODL) which is processed by the ODL preprocessor. The ODL preprocessor generates corresponding metadata and a header file which is used as a part of the application source code. Users make application programs using the object manipulation language (OML) predefined by OODBMSs. OML supports the mechanism to invoke queries and procedures using programming languages such as C++, JAVA, and Smalltalk. The procedures are for the operations (e.g., create, retrieve, update and delete) on persistent objects and transactions.

Fig. 1. Procedural steps of OODBMS.

Since object-relational databases store and maintain the objects based on the relational data model, some features of the object-oriented data model (e.g., multiple inheritance mechanism) may be restricted. However, the above procedural steps are similarly applied to object-relational DBMSs (Reinwald et al., 1996) since the object-relational DBMSs essentially build an OODBMS view on the top of relational tables.

As shown in Fig. 2, the relationships between persistent objects are represented by the smart pointers (e.g., d_Ref<T> in Fig. 2) (Edelson, 1992).

Fig. 2. An example of an OML source code.

Although the class name and operations of the smart pointer are diverse depending on the OML syntax, we use the ODMG C++ OML syntax (Cattell and Barry, 2000) since the ODMG specification is considered as the OODBMS standard. The smart pointer keeps the corresponding object's identifier (OID) and the dereference is done when the dereference operators (e.g., d_Ref<T>::->) of the smart pointer are invoked. Thus, when a persistent object is accessed via a smart pointer, the smart pointer replaces the OID with the main memory address of the persistent object, i.e., pointer swizzling occurs.

Now, we describe the various pointer swizzling techniques. Basically, the pointer swizzling techniques are classified into two groups (White and DeWitt, 1992). The first is the software check approach: it uses the table lookup to detect the access to non-memory resident objects. The other is the virtual memory mapping approach: it uses the virtual memory access violation (exception handling) to detect the access to non-memory resident objects (Wilson and Kakkad, 1992).

Generally, the virtual memory mapping approach precludes the object replacement and overcomes the lack of main memory space using the swap space of the client system (Kemper and Kossman, 1995; White and DeWitt, 1992; Wilson and Kakkad, 1992). However, this approach has the following disadvantages (Kemper and Kossman, 1995): (1) The required data size of an application cannot exceed the size of the virtual memory. (2) The number of page faults increases due to the low memory utilization, since the unit of the virtual memory management is a page rather than an object, and due to the virtual memory registration table. (3) The cost of a page fault is high since the page fault occurs via the exception handling.

Moss (1992) first classified the software check approach and analyzed each technique in no uncaching (i.e., non-replacement) environments. Kemper and Kossman (1995) analyzed each technique based on Moss's work in incremental uncaching environments. The software check approach is classified by (1) the time when the pointer swizzling occurs and (2) the structure of objects.

In the first classification, the software check approach is divided into eager and lazy swizzling. Eager swizzling swizzles all pointers before accessing them. Although this guarantees that all the pointers are swizzled, it induces unnecessary object fetches. Lazy swizzling swizzles a pointer when the pointer is dereferenced. Lazy swizzling must check whether the pointer is swizzled or not at each dereferencing time. However, lazy swizzling does not induce the unnecessary pointer swizzling of unused pointers and, therefore, the unnecessary disk I/O.

Fig. 3. Structures of objects.

Additionally, following the structure of objects (see Fig. 3), the software check approach is divided into direct and indirect swizzling. In direct swizzling, the swizzled pointer points to an object directly. In this case, when the victim is replaced by the object replacement algorithm, the swizzled pointers which point to the victim become dangling pointers. To avoid the dangling pointers, each object keeps backward pointers that point to swizzled pointers using a list, called RRL (Kemper and Kossman, 1995), or a hash table (McAuliffe and Solomon, 1995)¹ structure. In indirect swizzling, the swizzled pointer points to an object descriptor (OD). The object descriptor keeps the number (num_ref) of swizzled pointers which point to this object descriptor. When the object is replaced, the object descriptor is not removed from memory if this number is not 0. Thus, the dangling pointer problem can be avoided easily in the indirect swizzling technique.

¹ We call it reverse-reference-table (RRT).
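The lazy indirect technique described above can be sketched in a few lines of C++. This is a minimal illustration, not the authors' implementation: `Oid`, `Part`, `ObjectTable`, `Descriptor`, `Ref` and `evict` are hypothetical names, the "load" of an object is faked by constructing it in memory, and only `num_ref` and the descriptor idea come from the text.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>

using Oid = std::uint64_t;               // hypothetical OID representation
struct Part { int id; };                 // stand-in for a persistent class

// Object descriptor (OD): swizzled pointers refer to this, never to the object.
struct Descriptor {
    std::unique_ptr<Part> obj;           // empty while the object is not resident
    int num_ref = 0;                     // swizzled pointers referring to this OD
};

struct ObjectTable {
    std::unordered_map<Oid, Descriptor> map;
    int lookups = 0;                     // counts the table searches

    Descriptor* find(Oid oid) {          // OID -> descriptor (created on demand)
        ++lookups;
        return &map[oid];
    }
};

// Replace (uncache) an object. Returns true if its descriptor could go as well.
inline bool evict(ObjectTable& t, Oid oid) {
    auto it = t.map.find(oid);
    if (it == t.map.end()) return false;
    it->second.obj.reset();              // the object leaves memory
    if (it->second.num_ref != 0) return false;  // OD stays: still referenced
    t.map.erase(it);
    return true;
}

class Ref {                              // simplified stand-in for d_Ref<Part>
public:
    Ref(ObjectTable& t, Oid oid) : table_(&t), oid_(oid) {}
    Ref(const Ref&) = delete;
    Ref& operator=(const Ref&) = delete;
    ~Ref() { unswizzle(); }

    Part* operator->() {
        if (!od_) {                      // lazy: swizzle on first dereference
            od_ = table_->find(oid_);
            ++od_->num_ref;
        }
        if (!od_->obj)                   // object fault: fake a load from disk
            od_->obj = std::make_unique<Part>(Part{static_cast<int>(oid_)});
        return od_->obj.get();
    }
    void unswizzle() { if (od_) { --od_->num_ref; od_ = nullptr; } }
    bool swizzled() const { return od_ != nullptr; }

private:
    ObjectTable* table_;
    Oid oid_;
    Descriptor* od_ = nullptr;           // swizzled form: address of the OD
};
```

In this sketch a second dereference through the same `Ref` performs no table search, and replacing the object while `num_ref` is non-zero leaves the descriptor in place, so the swizzled pointer never dangles.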
As mentioned earlier, the emergence of the internet and mobile devices induces the proliferation of persistent object systems. To enhance the performance of systems, some OODBMSs and persistent object systems embody certain pointer swizzling techniques (eXcelon, 2002; Haahr et al., 2001; Pinheiro et al., 2000; Reinwald et al., 1996; White and DeWitt, 1995). Some commercial and prototype systems and their corresponding pointer swizzling techniques are summarized by Kemper and Kossman (1995).

Kemper and Kossman (1995) claimed that there was no one superior pointer swizzling technique over all situations. They proposed the adaptable pointer swizzling strategies. The adaptable pointer swizzling strategies require the profile generator and the specific compiler to choose the proper pointer swizzling techniques and fix those at compile time. Brahnmath et al. (1998) proposed another adaptive swizzling technique for an array which contains a large number of OIDs. This work is for non-replacement environments and does not require the profile generator. However, this work still requires the specific compiler since it chooses a proper direct swizzling technique with an analysis of source code at compile time. Thus, as mentioned by Kemper and Kossman (1995), it is crucial for the adaptive pointer swizzling strategies to always determine at compile time which technique should be applied.

Like some batch mode applications, if an application does not change the behavior whenever it is invoked, the adaptive pointer swizzling approach is good. However, if an application changes the behavior depending on the input parameters, the performance of the adaptive approach can degrade considerably because the approach optimizes the performance based on the expected behavior. Furthermore, the adaptive pointer swizzling techniques require a specific application compiler and/or a specific application profile generator. In addition, if an application is an interactive program, the adaptive pointer swizzling strategies may not be applied since the behavior of the application program changes during run time. Furthermore, for the interactive program, it is very difficult to obtain the profile.

In contrast to the work of Kemper and Kossman (1995), McAuliffe and Solomon (1995) showed a different result. They indicated that indirect swizzling is superior to direct swizzling over various workloads and system configurations.

In general, since the applications of OODBMSs handle a massive amount of data, object replacements occur during run time (Franklin et al., 1997; Voruganti et al., 1999). In incremental uncaching environments, eager direct pointer swizzling does not work properly due to the unnecessary swizzling of pointers (Kemper and Kossman, 1995). Also, in lazy direct swizzling, the swizzled pointers which point to the victim are unswizzled using RRL or RRT in the victim. RRL or RRT in the objects referenced by the embedded pointers in the victim are rearranged. As a result, the cost of object replacements is high. Therefore, we choose indirect pointer swizzling as the default pointer swizzling technique.

3. Dynamic indirect pointer swizzling

In Section 2, we presented various pointer swizzling techniques. The blind swizzling of pointers incurs the unswizzling overhead in incremental uncaching environments. Therefore, we develop a new pointer swizzling strategy, called DIPS.

3.1. Using the temporal locality information

The performance of pointer swizzling is influenced by the temporal locality of the application and the system configuration. Thus, the basic idea of DIPS is that cold objects do not use the pointer swizzling technique. Among the components of the object manager, the object buffer manager (i.e., the object replacement algorithm) deals with the temporal locality information to increase the hit ratio of the object in the buffer.

LRU (Effelsberg and Härder, 1984), a well known replacement algorithm, uses a simple heuristic that an object accessed at this time will be used again in the near future. Although it is very efficient, it is too simple to be used for the pointer swizzling information.

To the best of our knowledge, 2Q (Johnson and Shasha, 1994)² is as efficient as LRU. Also, since 2Q quickly removes cold objects from the object buffer, the hit ratio is as high as LRU/2 (O'Neil et al., 1993). The basic behavior of 2Q is shown in Fig. 4. 2Q handles the buffer using two separate queues: A1IN and AM. A1IN acts as a first-in-first-out queue and AM acts as an LRU queue. A1IN and AM contain both objects and their descriptors. When an object which was not used in the near past is loaded, the object buffer manager inserts this object into A1IN. When an object which was used in the near past is loaded, the object buffer manager inserts this object into AM.

In 2Q, the history of the object replacement is maintained by the A1OUT queue. A1OUT does not contain the object itself but only the object descriptor. When an object in A1IN is replaced, its descriptor is inserted into A1OUT. If the replaced object is a hot object, the access request for this object will occur while the descriptor of this object is in A1OUT. In other words, A1OUT acts as a filter for distinguishing hot objects from cold objects.

The sizes of A1IN, AM, and A1OUT are controlled by the parameters Kin, Km and Kout, which are fractions of the object buffer size such that Kin + Km = 1 and Kout > 0.

² 2Q was originally devised for the page buffer replacement algorithm. However, 2Q can be applied for the object buffer as well.
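The 2Q behavior described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the buffer manager used in the paper: capacities are counted in objects rather than bytes, Km is taken as the remainder 1 − Kin, the queues hold only OIDs, and the names `TwoQ`, `Queue`, `access` and `remembered` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>

using Oid = unsigned long;               // hypothetical OID representation
enum class Queue { A1IN, AM };

class TwoQ {
public:
    // capacity is counted in objects; kin and kout are fractions of it.
    TwoQ(std::size_t capacity, double kin, double kout)
        : in_cap_(std::max<std::size_t>(1, static_cast<std::size_t>(capacity * kin))),
          am_cap_(std::max<std::size_t>(1, capacity - in_cap_)),
          out_cap_(std::max<std::size_t>(1, static_cast<std::size_t>(capacity * kout))) {}

    // Records an access and reports the queue the object ends up in.
    Queue access(Oid oid) {
        if (has(am_, oid)) {             // hit in AM: move to the LRU front
            am_.remove(oid);
            am_.push_front(oid);
            return Queue::AM;
        }
        if (has(a1in_, oid)) return Queue::A1IN;   // FIFO: position unchanged
        if (has(a1out_, oid)) {          // replaced recently, so treated as hot
            a1out_.remove(oid);
            if (am_.size() >= am_cap_) am_.pop_back();
            am_.push_front(oid);
            return Queue::AM;
        }
        if (a1in_.size() >= in_cap_) {   // replace the oldest object in A1IN
            if (a1out_.size() >= out_cap_) a1out_.pop_back();
            a1out_.push_front(a1in_.back());       // only its identity is kept
            a1in_.pop_back();
        }
        a1in_.push_front(oid);           // not seen recently: cold so far
        return Queue::A1IN;
    }

    bool remembered(Oid oid) const { return has(a1out_, oid); }

private:
    static bool has(const std::list<Oid>& q, Oid oid) {
        return std::find(q.begin(), q.end(), oid) != q.end();
    }
    std::size_t in_cap_, am_cap_, out_cap_;
    std::list<Oid> a1in_, am_, a1out_;
};
```

The linear searches keep the sketch short; a real buffer manager would index the queues, for example through the object table.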

Fig. 4. The behavior of 2Q.

2Q, like other replacement algorithms, does not consider the structural relationships among objects. Thus, we modified 2Q for DIPS in order to consider the structural relationships among objects. The basic behavior of DIPS is shown in Fig. 5. The dotted lines in Fig. 5 represent the differences from 2Q.

Fig. 5. The modification of 2Q for DIPS. The behavior of: (a) DIPS, (b) Full-DIPS.

First of all, we adopted the object descriptor as the queue element for all queues. This avoids an additional indirection overhead for the queue management. However, since an object descriptor whose num_ref is not 0 cannot be removed from A1OUT, this incurs the overhead of A1OUT queue handling. Thus, we made another queue, called AMOUT, which only contains object descriptors. All object descriptors in AMOUT come from A1OUT or AM. When an object descriptor is removed from A1OUT or AM by the object buffer manager, num_ref of the descriptor is checked. If num_ref is greater than 0, the descriptor is inserted into AMOUT. Furthermore, if num_ref of an object descriptor in AMOUT is set to 0 as a result of unswizzling, the descriptor is removed from AMOUT. Thus, num_ref of any object descriptor in AMOUT is not 0. Since the objects whose descriptors are in AM are hot, the pointers of only those objects can be swizzled. This means that there are structural relationships between some objects whose descriptors are in AM and all object descriptors in AMOUT.

Let the object descriptor of the object A be in AM, the descriptor of the object B be in AMOUT, and there be a structural relationship between A and B. When B is accessed via a swizzled pointer (i.e., a structural relationship) in A, the object B and its descriptor are moved to the front of AM³ since A is a hot object and the structural relationship between A and B had been used in the near past. Thus, our modified object buffer manager for DIPS efficiently uses the structural relationship as the buffer management information based on the lazy indirect pointer swizzling technique.

In our object buffer manager, most of the objects in the AM queue are hot objects. In DIPS, only the internal pointers of objects in AM can be swizzled. Thus, when an object in A1IN is replaced, there is no overhead of unswizzling.

3.2. Dynamic binding of pointer swizzling

In the object-oriented programming environment, the reference is represented as the smart pointer (Edelson, 1992). The smart pointer overloads some operations (e.g., ->, *) in order to use the normal pointer syntax of the general compiler (Edelson, 1992).

As mentioned earlier, one of our contributions is the utilization of a general compiler that supports the object-oriented paradigm. In DIPS, the smart pointer changes its behavior according to the temporal locality. Thus, in order to change the behavior of the smart pointer during run time, some operators of the smart pointer are implemented as virtual functions.

As shown in Fig. 6, the smart pointer contains a hidden pointer that points to a virtual function table, called vtbl. vtbl contains addresses of virtual functions. The vtbl pointer (called vptr), which is a volatile pointer, points to vtbl. Generally, vptr in an object is initialized by the constructor of an object implicitly. As addressed in (Biliris et al., 1993), when a persistent object is loaded into main memory from disk, the hidden pointers such as vptr should be reinitialized since these hidden pointers are volatile pointers. Thus, to reinitialize the hidden pointers, the ODL preprocessor generates the specific constructor which does not modify any attribute value of the object. The specific constructor calls the specific constructors of all attributes of the class associated with the object.

Thus, for the dynamic binding of the pointer swizzling technique, the object manager applies a particular constructor of the class when an object is fetched from the page buffer and its descriptor moves to AM.

³ This technique is also applied to lazy indirect and eager indirect in the experiment.
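The idea of binding the swizzling behavior of a smart pointer at run time can be illustrated with a small C++ sketch. Note the substitution: the paper re-initializes the compiler-generated vptr through a specific constructor, whereas this sketch uses an explicit table of function pointers (`Ops`) as a stand-in for the vtbl, because overwriting a vptr in place is not portable in standard C++. `Store`, `Ref`, `Ops`, `bind_hot` and `bind_cold` are hypothetical names.

```cpp
#include <cassert>
#include <unordered_map>

using Oid = unsigned long;               // hypothetical OID representation
struct Part { int id; };                 // stand-in for a persistent class

struct Store {                           // hypothetical object table
    std::unordered_map<Oid, Part> objects;
    int lookups = 0;
    Part* lookup(Oid oid) { ++lookups; return &objects.at(oid); }
};

struct Ref;
struct Ops { Part* (*deref)(Ref&); };    // explicit stand-in for a vtbl

struct Ref {                             // simplified smart pointer
    const Ops* ops;                      // plays the role of vptr
    Store* store;
    Oid oid;
    Part* cached = nullptr;              // swizzled address, if any
    Part* operator->() { return ops->deref(*this); }
};

// Behavior for pointers in cold objects: always go through the table.
inline Part* deref_no_swizzle(Ref& r) { return r.store->lookup(r.oid); }

// Behavior for pointers in hot objects: swizzle lazily, then reuse the address.
inline Part* deref_swizzle(Ref& r) {
    if (!r.cached) r.cached = r.store->lookup(r.oid);
    return r.cached;
}

const Ops kCold{&deref_no_swizzle};
const Ops kHot{&deref_swizzle};

// Rebinding replaces the table pointer; dereferencing code stays unchanged.
inline void bind_hot(Ref& r)  { r.ops = &kHot; }
inline void bind_cold(Ref& r) { r.ops = &kCold; r.cached = nullptr; }
```

The point of the design is that the decision "may this pointer be swizzled?" is made once, when the owning object changes queue, rather than being tested on every dereference.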

Fig. 6. Virtual function table.

The particular constructor that the object manager applies when an object's descriptor moves to AM calls the proper smart pointer constructor (e.g., the constructor of d_Ref<T>_S), which initializes vptr of the smart pointer to point to the vtbl containing the addresses of the virtual functions, each of which performs pointer swizzling.

It is very important that vptr of the smart pointer changes according to the object's temporal locality information. In DIPS, there are two vtbls. The other vtbl contains addresses of the identical virtual functions except that they do not perform pointer swizzling. This avoids an additional software check whenever a pointer is dereferenced. Also, the call to the specific constructor adds little overhead because a constructor call for the hidden pointer initialization is processed whenever an object is fetched by the object manager.

3.3. Variation of DIPS

When the size of the object buffer is extremely large, the object buffer manager does not distinguish cold objects and hot objects since there is no object replacement. This degrades the performance of DIPS. Thus, we devise another version of DIPS (called Full-DIPS). In Full-DIPS (see Fig. 5(b)), when an object in A1IN is accessed and the fault ratio is less than a threshold, its descriptor moves to AM and the eager indirect pointer swizzling technique is applied to its internal pointers using the reinit() function, for which there are two alternatives as follows:

• The reinit() function simply calls a specific constructor of the object, since the specific constructor does not change the attribute value but can change all the vtbl pointers in the object.
• The reinit() function gets the offsets of the smart pointers in the object from the schema manager, and then only calls the specific constructors of the smart pointers.

We choose the reinit() function using the schema information for Full-DIPS since it is a little more efficient than the reinit() function using the constructor which calls constructors of all attributes. However, the performance of Full-DIPS is lower than that of lazy indirect swizzling and DIPS due to the overhead of reinit() and the threshold check overhead. In Section 4.2, we describe the analysis of the performance of Full-DIPS with experimental results in detail.

4. Experiment

4.1. Experimental environment

To show the efficiency of our strategy, we compare it with no swizzling, lazy indirect swizzling, and eager indirect swizzling.⁴ However, we preclude the virtual memory mapping approach in our experiments since our work is mainly focused on limited sized buffers which require the effective object replacement. The gain ratio is

$$\text{Gain ratio} = \frac{(Rtime_{NS} - IOtime_{NS}) - (Rtime - IOtime_{NS})}{Rtime_{NS} - IOtime_{NS}}$$

As in most of the related OODBMS literature, our work is mainly focused on reducing the access time to in-memory persistent objects, not on reducing the disk I/O time. Thus, to show the efficiency of our strategy, we extract the I/O time (IOtimeNS) of the application with no swizzling from the response time (Rtime) of the application with each pointer swizzling, since IOtimeNS is the default I/O time of the application and some pointer swizzling techniques (e.g., eager indirect) induce additional disk I/O.

Also, we perform experiments over various object buffer sizes since the behavior of the object manager is dynamically changed by the ratio of the size of the object buffer to the amount of data required by an application. Thus, we change the size of the object buffer from 1% to 100% of the required data size of each traverse.

The experiments are performed on a Sun Ultra II 168 MHz platform with Solaris 2.5.1 and 384 MBytes of main memory.

⁴ Since the performance of pure eager indirect swizzling is low on small sized object buffer, we implemented eager indirect swizzling such that pointer swizzling occurs at the call time of the smart pointer constructor.
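As a worked illustration of the gain ratio of Section 4.1, the helper below takes it to be the relative reduction in non-I/O time against no swizzling, so that a positive value means the technique is faster than no swizzling. That sign convention is an assumption, and the function name, parameter names and sample numbers are hypothetical, not measurements from the paper.

```cpp
#include <cassert>
#include <cmath>

// All arguments are times in the same unit (for example, seconds).
//   rtime      response time with the pointer swizzling technique under test
//   rtime_ns   response time with no swizzling (NS)
//   iotime_ns  disk I/O time of the no-swizzling run
inline double gain_ratio(double rtime, double rtime_ns, double iotime_ns) {
    const double base = rtime_ns - iotime_ns;   // in-memory time of NS
    assert(base > 0.0);                         // avoid division by zero
    return (base - (rtime - iotime_ns)) / base;
}
```

For example, with illustrative times of 10 s for no swizzling, of which 5 s is disk I/O, a technique that responds in 8 s yields (5 − 3) / 5 = 0.4, that is, a 40% gain; a technique that responds in 11 s yields −0.2.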

We use GNU C++ compiler 2.8.1 as the application compiler.

Fig. 7. The conceptual structure of the experimental system.

The conceptual structure of the experimental system is shown in Fig. 7. In our experiments, we assume that each application keeps its own object buffer such as the object-server architecture (DeWitt et al., 1990). The experimental system consists of four major components: the application API, the object manager, the schema manager and the storage manager.

We use the OMEGA⁵ 1.0 C++ OML class library, which is compatible with ODMG C++ OML (Cattell and Barry, 2000), as the application API.

The object manager consists of the object table (OT), the object buffer manager (see Section 3.1) and the transaction manager (TM). We implemented the OT as a hash table. We run the object buffer manager with parameters set to Kin (25%), Km (75%), Kout (50%), which are the values of the parameters recommended in 2Q (Johnson and Shasha, 1994). If super objects⁶ are replaced, the heap pointer (i.e., return pointer) in a traverse is invalid. Thus, we adopt the object pinning mechanism in order to avoid the replacement of the super objects in a traverse. The TM keeps the information of dirty objects and saves dirty objects at transaction commit time to ensure the inter-transaction cache consistency (Franklin et al., 1997).

We use Shore 1.1.1 Storage Manager (Shore Project Group, 1995) for a storage manager.

The evaluation of the proposed pointer swizzling strategy is based on a variety of experimental tests on OO7 benchmark data (Carey et al., 1993), which is the famous benchmark data for OODBMSs. The structure of OO7 is illustrated in Fig. 8. We stored objects of OO7 data on disk following the first-come-first-serve policy. In our experiments, since we use the same data over various pointer swizzling techniques and preclude the disk I/O time, our performance analysis is not affected by the location of disk-resident objects (e.g., clustering effect). The configuration of OO7 for our experiment is summarized in Table 1. The DB size is 15 MBytes.

Fig. 8. Structure of OO7 (module, manual, complex assemblies, base assemblies, composite parts, atomic parts and connects).

Table 1
Configuration of OO7

Parameter            Number
NumAtomicPerComp     50
NumConnPerAtomic     3
DocumentSize         2000
ManualSize           100k
NumCompPerModule     500
NumAssmPerAssm       3
NumLevel             7
NumCompPerAssm       3
NumModule            1

⁵ Object management systEm for geo-spatial applications (OMEGA) 1.0 is a spatial OODBMS which has been developed at KAIST.
⁶ Super objects are in the reference sequence from the root object to the lastly accessed object at the certain time of a traversal operation.

To show the effect of the temporal locality and the cache consistency algorithm, we implemented the following four traversal operations as the application program.

• T1: traverses the assembly hierarchy. As each base assembly is visited, visit each of its referenced unshared composite parts. As each composite part is visited, perform a depth-first-search (DFS) on its graph of atomic parts (Carey et al., 1993). (The same as the definition in OO7.)
• T2a: traverses T1 and updates one atomic part per composite part (Carey et al., 1993). (The same as the definition in OO7.)
• T4: traverses the assembly hierarchy. As each base assembly is visited, visit each of its referenced unshared composite parts. As each composite part is visited, visit all atomic parts of it in a breadth-first-search (BFS) fashion but do not access any connect of atomic parts.

• T5a: traverses T4 and updates one atomic part per composite part.

The above operations specify the hierarchical navigation of complex assemblies and DFS/BFS traversals of atomic parts. Thus, these operations do not make use of extents (i.e., the set of OIDs of a certain class). Although it depends on the OML specification, if the API of extents is supported as persistent objects, the behavior of accessing objects through an extent changes according to the temporal locality in DIPS. In our experiments, this behavior is partially reflected in T4 and T5a since these operations retrieve the atomic parts of a composite part iteratively.

Each read-only traverse is run in "cold" and "warm" ways. In a cold run, the traverse begins with an empty database buffer. A warm run consists of a first cold run followed by three more runs of the same traverse, and reports the average of the last three runs. The required data sizes of T1, T2a, T4, and T5a are 8.6, 8.6, 3.9, and 3.9 MBytes, respectively.

4.2. Experimental result

Table 2 summarizes the names and symbols of the techniques that we use in the experimental results.

Table 2
Table of symbols

Swizzling technique                        Symbol
No swizzling                               NS
Lazy indirect pointer swizzling            LIPS
Eager indirect pointer swizzling           EIPS
Dynamic indirect pointer swizzling         DIPS
Full-dynamic indirect pointer swizzling    Full-DIPS

When the size of the object buffer is extremely small (1%) or extremely large (100%), most objects are observed to be in A1IN. However, the two cases have different causes. When the object buffer is extremely small, most objects are cold objects because the fault ratio is very high. When the object buffer is extremely large, the object buffer manager cannot distinguish hot objects from cold objects since there is no object replacement. Thus, the performance of DIPS in these cases is slightly lower than that of no swizzling.

Fig. 9. The experimental results of T1: (a) T1-cold, (b) T1-warm. [Plots of gain ratio (%) against buffer size (%) for NS, LIPS, EIPS, DIPS, and Full-DIPS(0.2).]

As shown in Figs. 9 and 10, DIPS is the most efficient technique when the size of the object buffer is from 10% to 60% of the required data sizes of T1 and T4. Moreover, lazy indirect swizzling is less efficient than no swizzling when the size of the object buffer is 1–30% of T1's required data size, due to the unswizzling overhead. Eager indirect swizzling shows the worst performance for object buffer sizes of 1–70% because it swizzles and unswizzles unused pointers and therefore incurs unnecessary disk I/O. Furthermore, eager indirect swizzling is less efficient than lazy indirect swizzling when the object buffer is extremely large (100%). Generally, eager swizzling is more efficient than lazy swizzling for an extremely large buffer because eager swizzling has no software check overhead. However, in our experiment, eager swizzling loads more objects than a traverse requires in order to ensure that all pointers are swizzled. Thus, it incurs both the object replacement overhead and the unswizzling overhead. In addition, Figs. 9 and 10 show the gain ratios of Full-DIPS with threshold 0.2.

In the cold run of T1 (see Fig. 9(a)), the gain ratio of DIPS increases while the size of the object buffer is 1–70% and then decreases. As mentioned earlier, the object buffer manager for DIPS distinguishes hot objects from cold objects based on the object replacement history. When the size of the object buffer is 1–70% of the required data size of T1, the object buffer manager distinguishes hot objects from cold objects in the cold run of T1. However, as the object buffer size increases, hot objects remain in A1IN longer. Thus, more smart pointers behave as no-swizzling pointers in the cold run.
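The hot/cold distinction discussed above can be pictured with a minimal C++ sketch of a smart pointer whose dereference behavior is bound through a virtual function. This is only an illustration under stated assumptions, not the authors' implementation: the names `ObjectTable`, `DerefPolicy`, `NoSwizzling`, `LazySwizzling`, and `SmartPtr` are hypothetical, and the sketch caches the memory address directly in the pointer for brevity rather than modeling an indirect scheme or the A1IN queue.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using Oid = std::uint64_t;              // hypothetical object identifier type

struct Object { int value; };           // stand-in for a persistent object

// Hypothetical resident-object table mapping OIDs to memory addresses.
struct ObjectTable {
    std::unordered_map<Oid, Object*> resident;
    int lookups = 0;                    // number of table lookups performed

    Object* find(Oid oid) {
        ++lookups;
        auto it = resident.find(oid);
        return it == resident.end() ? nullptr : it->second;
    }
};

// Dereference behavior, bound at run time through a virtual function.
struct DerefPolicy {
    virtual ~DerefPolicy() = default;
    virtual Object* deref(Oid oid, Object*& cached, ObjectTable& table) = 0;
};

// Assumed behavior for pointers of cold objects: no swizzling.
struct NoSwizzling : DerefPolicy {
    Object* deref(Oid oid, Object*&, ObjectTable& table) override {
        return table.find(oid);         // table lookup on every access
    }
};

// Assumed behavior for pointers of hot objects: lazy swizzling.
struct LazySwizzling : DerefPolicy {
    Object* deref(Oid oid, Object*& cached, ObjectTable& table) override {
        if (cached == nullptr) cached = table.find(oid);  // swizzle on first use
        return cached;
    }
};

// Smart pointer whose dereference behavior can be rebound dynamically.
class SmartPtr {
public:
    SmartPtr(Oid oid, DerefPolicy* policy) : oid_(oid), policy_(policy) {}

    // Rebinding drops any cached address, i.e. the pointer is unswizzled.
    void rebind(DerefPolicy* policy) { policy_ = policy; cached_ = nullptr; }

    Object* get(ObjectTable& table) {
        assert(policy_ != nullptr);
        return policy_->deref(oid_, cached_, table);
    }

private:
    Oid oid_;
    Object* cached_ = nullptr;
    DerefPolicy* policy_;
};
```

In this sketch, three dereferences under the no-swizzling binding cost three table lookups, whereas three dereferences under the lazy binding cost one. A pointer bound to the no-swizzling policy never holds a cached address, so it has nothing to unswizzle when its owner is replaced.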

The gain ratio pattern of Full-DIPS is similar to that of DIPS when the object buffer size is 1–70% in the cold run of T1. Since Full-DIPS incurs the threshold check overhead at each object access, Full-DIPS is less efficient than DIPS. When the object buffer size is 80–100%, Full-DIPS shows the worst performance. There are two reasons. First, as mentioned earlier, more smart pointers behave as no-swizzling pointers until the object fault ratio drops below the threshold of Full-DIPS. Second, as mentioned in Section 3.3, the reinit() function overhead occurs in the cold run.

In the warm run of T1 (see Fig. 9(b)), the gain ratio of DIPS still increases when the size of the object buffer is greater than or equal to 70% because the object buffer manager can distinguish hot objects from cold objects more exactly in the warm run. The gain ratio of DIPS approaches 30% when the size of the object buffer is 90% of the required data size.

Also, when the object buffer size is 70–80%, the gain ratio of Full-DIPS decreases since the reinit() function overhead exists in the warm run of T1. When the object buffer size is 90–100%, the reinit() function overhead does not exist in the warm run of T1.

As the object buffer size increases, the object fault ratio decreases. Thus, the unswizzling overhead of the victim's internal pointers diminishes the effect of pointer swizzling less. Therefore, the lazy indirect technique is the most efficient in the cold run and the warm run when the size of the object buffer is 70–100% of T1's required data size.

Comparing T4 with T1 (see Fig. 10), the gain ratio of T4 is lower than that of T1. In traverse T1, every atomic part of a composite part is visited at least twice in order to check whether the atomic part has already been visited. In contrast, each atomic part of a composite part is visited only once in T4. Thus, the effect of pointer swizzling in T4 is smaller than that in T1. However, the pattern of T4's gain ratio is similar to that of T1's gain ratio. Also, the object fault ratio of T4 is higher than that of T1. Thus, in the cold run of T4, Full-DIPS with threshold 0.2 acts as DIPS. But when the object buffer size is 90–100% in the warm run of T4, the performance of Full-DIPS is significantly reduced due to the reinit() function overhead.

Fig. 10. The experimental results of T4: (a) T4-cold, (b) T4-warm. [Plots of gain ratio (%) against buffer size (%) for NS, LIPS, EIPS, DIPS, and Full-DIPS(0.2).]

Figs. 11 and 12 show the gain ratio of each pointer swizzling technique in traverses containing some update operations. As mentioned earlier, dirty objects are saved at commit time to ensure the inter-transaction cache consistency (Franklin et al., 1997). Thus, the benefit of pointer swizzling is reduced by the unswizzling overhead at commit time. However, as shown in Fig. 11, DIPS is still more efficient than the others while the size of the object buffer is 20–60% of T2a's required data size.

Fig. 11. The gain ratio of T2a. [Plot of gain ratio (%) against buffer size (%) for NS, LIPS, EIPS, DIPS, and Full-DIPS(0.2).]

Since T5a performs a BFS-style traverse, the pointer swizzling effect is less pronounced than in T2a. Moreover, the unswizzling at transaction commit time reduces the swizzling effect. However, DIPS is affected less than the other pointer swizzling techniques. As a result, the performance of DIPS is not lower than that of no swizzling.

Fig. 12. The gain ratio of T5a. [Plot of gain ratio (%) against buffer size (%) for NS, LIPS, EIPS, DIPS, and Full-DIPS(0.2).]

Consequently, DIPS is shown to provide good performance, particularly when the size of the object buffer is smaller than the required data size of an application.

5. Conclusion

In this paper, we present a new pointer swizzling strategy, called DIPS. Our strategy is based on two ideas: (1) gathering the temporal locality information from the object buffer manager and (2) dynamic binding of the pointer swizzling techniques using the virtual function mechanism. Furthermore, our proposed strategy can be easily implemented using a general compiler.

The results of the experiments indicate that the effect of pointer swizzling is reduced by the unswizzling overhead in incremental uncaching environments. Nevertheless, DIPS shows good performance for navigational and materialized accesses in incremental uncaching environments. The relative improvement obtained by DIPS over no swizzling is from 6% to 31%, except when the size of the object buffer is extremely small or extremely large. The experiments with updates show that DIPS is well suited to incremental uncaching since the effect of pointer swizzling in DIPS is reduced less by the unswizzling overhead at transaction commit time.

In this study, we only consider the temporal locality for pointer swizzling. Thus, as future work, we would like to extend our pointer swizzling strategy so that the temporal locality information and the spatial locality information are considered together. Using both types of information together may improve the performance of DIPS when the size of the object buffer is extremely small or extremely large.

References

Biliris, A., Dar, S., Gehani, N.H., 1993. Making C++ persistent: the hidden pointers. Software-Practice and Experience 23 (12), 1285–1303.
Brahnmath, K., Nystrom, N., Hosking, A.L., Cutts, Q.I., 1998. Swizzle barrier optimizations for orthogonal persistence in Java. In: Proceedings of the International Workshop on Persistent Object Systems and on Persistence and Java (POS/PJW), pp. 268–278.
Carey, M.J., DeWitt, D.J., Naughton, J.F., 1993. The OO7 benchmark. In: Proceedings of ACM SIGMOD International Conference on Management of Data, pp. 12–21.
Cattell, R.G.G., Barry, D.K., 2000. The Object Database Standard: ODMG 3.0. Morgan Kaufmann Publishers.
DeWitt, D.J., Futtersack, P., Maier, D., Velez, F., 1990. A study of three alternative workstation-server architectures for object oriented database systems. In: Proceedings of International Conference on VLDB, pp. 107–121.
Edelson, D.R., 1992. Smart pointers: they're smart, but they're not pointers. In: Proceedings of Usenix C++ Conference, pp. 1–20.
Effelsberg, W., Härder, T., 1984. Principles of database buffer management. ACM TODS 9 (4), 560–595.
eXcelon Corporation, 2002. eXtensible Information Server (XIS).
Franklin, M.J., Carey, M.J., Livny, M., 1997. Transactional client-server cache consistency: alternatives and performance. TODS 22 (3), 315–363.
Haahr, M., Cunningham, R., Cahill, V., 2001. Towards a generic architecture for mobile object-oriented applications. Technical Report, TCD-CS-2001-16, University of Dublin, Trinity College, Ireland.
Johnson, T., Shasha, D., 1994. 2Q: a low overhead high performance buffer management replacement algorithm. In: Proceedings of International Conference on VLDB, pp. 439–450.
Kemper, A., Kossman, D., 1995. Adaptable pointer swizzling strategies in object base. VLDB Journal 4 (3), 427–438.
Moss, J.E.B., 1992. Working with persistent objects: to swizzle or not to swizzle. IEEE TSE 18 (8), 657–673.
McAuliffe, M.L., Solomon, M.H., 1995. A trace-based simulation of pointer swizzling technique. In: Proceedings of IEEE International Conference on Data Engineering (ICDE), pp. 52–61.
O'Neil, E.J., O'Neil, P.E., Weikum, G., 1993. The LRU-K replacement algorithm for database disk buffering. In: Proceedings of ACM SIGMOD International Conference on Management of Data, pp. 297–306.
Pinheiro, E., Chen, D., Dwarkadas, S., Parthasarathy, S., Scott, M., 2000. SDSM for heterogeneous machine architectures. In: Proceedings of the workshop on Software Distributed Shared Memory, 2000.
Reinwald, B., Lehman, T.J., Pirahesh, H., Gottemukkala, V., 1996. Storing and using objects in a relational database. IBM Systems Journal 35 (2), 172–191.
The Shore Project Group, 1995. The Shore Storage Manager Programming Interface, Computer Science Department, University of Wisconsin-Madison, Version 1.1.1.
Voruganti, K., Ozsu, M.T., Unrau, R.C., 1999. An adaptive hybrid server architecture for client caching object DBMSs. In: Proceedings of International Conference on VLDB, pp. 440–451.
White, S.J., DeWitt, D.J., 1992. A performance study of alternative object faulting and pointer swizzling strategies. In: Proceedings of International Conference on VLDB, pp. 419–431.
White, S.J., DeWitt, D.J., 1995. QuickStore: a high performance mapped object store. VLDB Journal 4 (4), 629–673.
Wilson, P.R., Kakkad, S.V., 1992. Pointer swizzling at page fault time: efficiently supporting huge address spaces on standard hardware. In: Proceedings of the International Workshop on Object Orientation and Operating Systems, pp. 364–377.