Pointer Swizzling at Page Fault Time ECiently and Compatibly

Pointer Swizzling at Page Fault Time Eciently and Compatibly Supp orting Huge Address Spaces on Standard Hardware Paul R Wilson and Sheetal V Kakkad Department of Computer Sciences University of Texas Austin Texas wilsoncsutexasedu I could be bounded in a nutshel l and hardware can sp ecify directly Applications of large count myself king of innite space address spaces include distributed shared memories eg Li op erating systems with a single shared W Shakesp eare Hamlet I Iii address space eg CLLBH and p ersistent ob ject + stores eg ABC DSZ Distributed shared memories provide a single address space for appli Abstract cations that span multiple machines shared address space op erating systems provide a single addressing Pointer swizzling at page fault time is a novel mo del for all pro cesses on one or more machines and address translation mechanism that exploits conven p ersistent ob ject stores provide sharable recoverable tional address translation hardware It can supp ort heap storage to eliminate the use of les for most pur huge address spaces eciently without long hardware p oses addresses such large address spaces are attractive for p ersistent ob ject stores distributed shared memories A common feature of these systems is the empha and shared address space op erating systems This sis on simplifying programming by preserving p ointer swizzling scheme can b e used to provide data com semanticsthat is ob ject identity The programmers patibility across machines with dierent word sizes default view of data is of a heap of ob jects connected and even to provide binary co de compatibility across by p ointers Programs may traverse p ointers freely machines with dierent hardware address sizes and complex data structures may b e passed by refer Pointers are translated swizzled from a long for ence mat to a shorter hardwaresupp orted format at page It most current systems without large shared ad fault time No extra hardware is required and no con dress spaces it is often necessary to write routines tinual software overhead is incurred by presence checks to atten data structures into a lowlevel linear form or indirection of p ointers This pagewise technique ex and reconstruct them later in order to communicate ploits temp oral and spatial lo cality in much the same them from one machine to another or to make them way as a normal virtual memory this gives it many recoverable in the case of a crash or to save them desirable p erformance characteristics esp ecially given on disk so that they can b e op erated on later p er the trend toward larger main memories It is easy haps by a dierent program This tedious and error to implement using common compilers and op erating prone co ding is a large software development problem systems in conventional systems The explicitly programmed conversions typically lose most typ e information as well as p ointer semantics they bypass typ e systems Huge Address Spaces leaving data structure consistency entirely up to the programmer It is often desirable to supp ort a larger virtual mem Shared and p ersistent memory systems avoid much ory address space than the word size of the available of this diculty but they themselves p ose challenges for system implementors One problem in implement Pro c Intl Workshop on Ob ject Orientation ing such large memories is that it involves addressing a in Op erating Systems Paris France Sept huge numb er of ob jects p otentially more than can b e pp IEEE Press 1 sp ecied by the hardwares address bits Schemes for them supp orting large virtual addresses on normal hardware eg LOOM Kae Sta E WD and Mneme Mos typically incur signicant overhead Address Translation at PageFault Two approaches are commonly used to implement Time large address spaces in software One is to indirect p ointers through an object table and translate large Our approach is to fault pages into conventional vir identiers into ob ject table osets when ob jects are tual memory on demand translating long p ersistent brought into memory Untranslated identiers for memory addresses into normal hardwaresupp orted 2 which there isnt a corresp onding table entry may addresses at pagefault time Wil This strategy b e agged and translated into osets lazily exploits lo cality of reference in much the same way as a normal virtual memory we b elieve this pagewise The other main approach is pointer swizzling approach is increasingly attractive as main memories Rather than using an indirection through a table grow larger As the numb er of instructions executed longformat p ointers are converted to actual hardware b etween page faults go es up the cost of address trans supp orted addresses in an incremental fashion lation can b e amortized across more useful program In a conventional p ointer swizzling scheme a p er work In contrast software schemes involving pres sistent p ointer is converted into a virtual memory ad ence checks and indirections do not scale as nicely dress only when the running program tries to use it most of their address translation cost is tied directly this entails bringing the ob ject into the transient mem to the rate of program execution ory address space This may involve checking each Any incremental faulting scheme must somehow de p ointer at each use to see if it is an actual address tect references to ob jects in p ersistent memory so that White and Dewitt refer to this as swizzling on dis they can b e copied into virtual memory b efore b eing covery WD it is also p ossible to swizzle all of op erated on We cho ose to use existing virtual mem the p ointers in an ob ject at once the rst time it is ory hardwares pagewise access protection capability touched Mos In either case p ointer elds or ob for this This allows the checks to o ccur in parallel as jects must b e checked very frequently to see if theyve part of the normal functioning of the virtual memory b een swizzled yet so that unswizzled p ointers can b e system and avoids continual overhead in software swizzled b efore b eing traversed Our approach is analogous to that of App el El We would like to avoid these costs so that pro lis and Lis incremental garbage collection scheme grams that do not access p ersistent data do not pay AEL Their system copies live reachable ob jects the freight and so that programs that access p ersis incrementally from fromspace to tospace to sepa tent ob jects or pages many times do not incur addi 3 rate them from garbage ob jects ours relo cates pages tional checking costs at every access Ideally we would of ob jects from p ersistent memory transient memory like this mechanism to op erate eciently on standard so that they may b e directly addressed Still the basic hardware rather than requiring an exotic ob ject principles of op eration are the same oriented hardware memory hierarchy such as that To simplify the explanation of this technique we of the MUSHROOM pro ject WWH or capability will assume for the moment that ob jects in virtual based addressing schemes eg Lev Ros memory are the same size and shap e memory layout For simplicity much of this pap er is cast in terms as p ersistentmemory ob jects Later we will explain of p ersistent memories b ecause our current implemen how mismatches b etween representations eg one tation fo cus is on p ersistent ob ject stores The pap er 1 This terminology is appropriate in terms of our address is actually ab out address space implementation more translation scheme as welllong addresses conceptually endure generally however the ideas are equally applicable to over time but they may b e mapp ed in changing transient op erating systems and distributed shared memories ways to dierent virtual memory addresses 2 A similar approach develop ed indep endently app ears to We sometimes refer to conventional hardware b e used in Ob jectStore LLOW a commercial pro duct from supp orted virtual memory as transient memory dis Ob ject Design Inc Few details of their scheme are available tinguishing it from a larger persistent memory Con however 3 In a copying garbage collector unreachable garbage data ventional virtual memories are transient in that an ob jects are reclaimed implicitlyrather than nding the gar address space ceases to exist when the heavyweight bage the live reachable ob jects are moved copied into pro cess it b elongs to terminates Persistent memories another area of memory tospace and the obsolete area fromspace is then reclaimed in its entirety like le systems may outlive the pro cesses that create the p ersistent store any pages it holds entry p oint vs twoword p ointer elds can b e handled ers into must b e relo cated into virtual memory and Our scheme relo cates ob jects into transient memory accessprotected This allows the entry p ointers to b e somewhat so oner than a straightforward software translated into machine format so that the program only p ointer swizzling scheme This allows us to pre can b egin execution The second part of the gure serve one essential constraintthe running program is shows this state with page A app earing in virtual never allowed to encounter any p ointers into p ersistent 0 memory as page A memory Any pages that contain p ersistent p ointers must b e accessprotected If the program attempts to When the program attempts to access such a page access a page that may contain p ersistent p ointers a 0 in this case A the accessprotection handler trans trap handler is invoked it translates all of the p er lates al l of the p ointers in the page into the desired sistent p ointers in that page into transient

Pointer Swizzling at Page Fault Time ECiently and Compatibly

Efficient Support of Position Independence on Non-Volatile Memory

English Arabic Technical Computing Dictionary

Database : a Toolkit for Constructing Memory Mapped Databases Peter A

Exact Positioning of Data Approach to Memory Mapped Persistent Stores: Design, Analysis and Modelling

Leanstore: In-Memory Data Management Beyond Main Memory

A Survey of B-Tree Logging and Recovery Techniques

R-Code a Very Capable Virtual Computer

Lazy Reference Counting for Transactional Storage Systems

Providing Persistent Objects in Distributed Systems

Address Translation Strategies in the Texas Persistent Store

Object-Oriented Recovery for Non-Volatile Memory

Software Persistent Memory