Implementation of Distributed Orthogonal Persistence Using
Total Page:16
File Type:pdf, Size:1020Kb
Þs-9.15 Implementation of Distributed Orthogonal Persistence llsing Virtual Memory Francis Vaughan Thesis submitted for the degree of Doctor of Philosophy in The University of Adelaide (Faculty of Mathematical and Computer Sciences) December 1994 ,Au,***ol.eJ \ qq 5 Abstract Persistent object systems greatly simplify programming tasks, since they hide the traditional distinction between short-term and long-term storage from the applications programmer. As a result, the programmer can operate at a level of abstraction in which short-term and long-term data are treated uniformly. In the past most persistent systems have been constructed above conventional operating systems and have not supported any form of distributed programming paradigm. In this thesis we explore the implementation of orthogonally persistent systems that make direct use of fhe attributes of paged virtual memory found in the majority of conventional computing platforms. These attributes are exploited to support object movement for persistent storage to addressable memory, to aid in garbage collection, to provide the illusion of larger storage spaces than the underlying architecture allows, and to provide distribution of the persistent system. The thesis further explores the different models of distribution, notably a one world model in which a single persistent space exists, and a federated one in which many co- operating spaces exist. It explores communication mechanisms between federated spaces and the problems of maintaining consistency between separate persistent spaces in a manner which ensures both a reliable and resilient computational environment. In particular characterising the interdependencies using vector clocks and the manner in which vector time can be used to provide a complete mechanism for ensuring reliable and resilient computation. The thesis concludes with a description of a new.operating system design in which support for the mechanisms described earlier are intrinsic in the design. This operating system is able to provide orthogonal persistence in a distributed environment with no effort on the part of users of the operating system. Acknowledgments This thesis owes much to many people. I First and foremost thanks go to my parents for their love, support and forbearance over the last few years. Thanks a¡e due to my supervisor Al Dearle, to whom much is owed, especially for his friendship and encouragement in this endeavour. Ron Morrison and the persistence group at the University of St Andrews: Richard Conner, Quintin Cutts, Graham Kirby and Dave Munro for advice, encouragement and a never wavering welcome when I landed upon their doorstep. John Rosenberg and the Grasshopper group at Sydney University: Franz Henskens, Anders Lindström, Rex di Bona, Matty Farrow and Stephen Norris, for their part in the Grasshopper project. Also for the stimulating and rowdy discussions which made the Grasshopper project such a joy to be a part of. Alex Farkas and David Hulse for their respective contributions to Casper and Grasshopper. The original members of the Casper group, Tracy Lo Basso, Ruth Fazakerley and Bett Koch for putting up with me. Chris Barter for support and friendship over many years, without which this would not have been possible. The members the department at Adelaide University, for providing a convivial home from home for so many years. Contents Chapter 1. Introduction. 11 .Persistence I 1.2. .....Accidents of History 4 1.3. Models of persistence 5 1.3.1. Specially designated objects 5 t.3.2. Embedded persistence 7 1.3.3. Reachability 7 t.4. Distribution 8 1.4.t. ..Single Unstructured 8 1.4.2. .......... Structured single address space 9 1.4.3. ...Fully partitioned 9 t43I ...Federated 10 I.4.4. ......Difficulties 10 15 ... Implementation Mechanisms 11 151 ,............ Specialised Ha¡dware l1 1.5.2. ...Softwa¡e 11 1.5.3. ... Conventional Hardware t2 t2 t532 ...Native code support t3 t3 1.6.r ....Contributions. l4 Chapter 2. Implementation Tactics. 2.1............ ........Introduction t7 )) .....Issues t7 2.2.1. ......Data Movement 18 2.2.1.1. ............Snapshots, Boot and Resume, and Undump. l8 2.2.1.2. ............Data Movement in Persistent Systems t9 2.2.2. Address Generation t9 Pointer Representation 20 Garbage Collection 20 ............ Generation GC 2t ............ GC in Persistent Systems 22 ............ Basic architectures. 22 Object Tables 23 2.3 Hardwa¡e 24 2.3.r. Monads 25 2.3.2. Lisp Machines 26 2.3.2.1 Data Format 26 2.3.2.2 Address Generation and Data Movement 27 2.3.2.3 ..Garbage Collection 27 2.3.3. ..ORSLA 28 2331 ..Data Formats 28 2.3.3.2. ............Garbage Collection 29 2.3.3.3. ............Data Movement 30 2.3.4. ......Rekursiv 30 234t ....Data Movement 3l 2.3.4.2. ............Object Format 32 23 43 ........... Garbage Collection 33 2.4. Software 33 2.4.1 Smalltalk 34 2.4.r.1. Recognising Pointers 35 2.4.2. LOOM 36 2.4.2.7. Smalltalk and Mneme 38 2 422 Garbage Collection 4t 243 ..PS-algol and CPOMS 42 2.4.3.r ..Computational model 42 2432 ..Addresses in PS-algol and CPOMS 42 2433 ..Data Movement 43 2.4.3.4. ............Object Formats and Identifying Pointers 44 2435 .Garbage Collection 45 244 .Napier-88 and the PAM 45 .PAM 46 ....The POMS Store. 47 2.4.6. .......Object Format 47 2.4.6.r. ............Data Movement 48 2.4.6.2. .......Address Generation 49 2.4.6.3. ............Garbage Collection 49 2.5 Page Based 50 ...............Data Movement 51 ..File Mapping 52 q 2.5.2 ..Texas and ObjectStore 52 53 2521r ...........Data Movement in ObjectStore 53 252r2 ...........Ga¡bage Collection 54 2.5.2.2. ...Texas 54 2.5.2.2.1. Data Movement 54 2.5 .2.2.2. ..................Pointer formats 55 2.5.2.2.3. Address Creation. 55 2.6. ......Casper 55 261 .............Integrated Store Management 55 2.6.2. .......Data Movement 55 2.6.3. .Object Formats and Pointer Detection 56 .Address Generation 59 .........Local Heaps and Pointer Quarantine 59 ..Distributed Casper 59 2.6.5.2. .........Copy Out Implementation 60 60 2.6.5.4. -.Luy Copy-Out 6T 2655 ...Page/Card Based 62 266 ...Store level garbage collection 62 2661 .Opportunistic memory recovery 63 2.6.6.2. .Full Garbage Collection 63 2.6.6.2.L Mark Phase 64 2.6.6.2.2. ..................Translaúon Phase 64 26623 ...............Compaction 64 2663 ........Summary of Garbage Collection 65 2.7 ...............Performance 65 27r ..Cost of exceptions 66 27r ..Exceptions with CISC architectures. 66 2.7.1.2. ...Exceptions in Berkeley RISC. 67 27 r3 ...Exceptions on conventional ,...RISC architectures 67 2.7.1 .4 .Exceptions in Unix implementations. 68 2.7.t .5 .Exceptions in OSF and Mach. 68 2.7.1 6 .Measuring Exception Cost 68 27 t7 .Building Exceptions for Speed 69 2.7.2. .Page Granularity 70 2.7.2.t. sets: 7A 2.7.2.2. 7l 2.8 .................Comparisons. 72 2.8.1 Notes 75 Chapter 3. Pointer Swizzling. 3.1............ ........Introduction 77 3.2. ......Software address translation 78 3.3. ......Address translation at page fault time 80 3.4. ........... A hybrid approach 82 3.4.1 Exceptions 84 3.4.2 Data Load. 85 343 Eager Swizzling 86 3.4.4. Deswizzling 88 3.4.4.t. ..... Deswizzling in'Wilson's Scheme 90 3.4.4.2. .....Deswizzling in the Hybrid Scheme 90 3.4.5 Elaboration of detail 9t 3.4.5.t Finding object addresses 9t 3.4.5.2 Pointer comparisons 96 3.4.5.3 Large Objects 97 3.4.5.4 Management of the translation table 97 3.4.5.5 Creation of new objects 97 3.5 Comparison of the schemes 98 3.6 Conclusions 1 00 Chapter 4. Store Architecures. 4.1............ ........Introduction 101 101 r02 r02 4.2.3. Reliability. t02 4.3. Stability and Resilience r02 4.3.I. ......Soft Failure 103 4.3.2. ......Hard failure 103 4.3.3. ......Site Failure 105 4.4............ ........System reference model 105 4.4.I. ...... Self Consistency r07 442 Synchronous and Asynchronous Creation of Persistent State. r07 AA" Modelling. 108 4.5 Mechanisms 109 4.5.1. ......Snapshots 109 4.5.2. Incremental baseline generation 109 4.5.3. Logging. 110 Modelling Logging. 110 Undo and Redo in the one store. rt2 4.5.3.3 Recovering logs rt2 4.5.3.4. Replay Logging rt3 4.5.4. Nested Store Strategies t14 4.5.5. ......Output in replay systems tt4 ............. Concurent Access, Distributed Systems and 115 ;:; T : : :: :::::::: .:: ::::::::::::::::::Üå'l'åli,i?:?'*,- 1r6 4.6.2. ........ Replay of Output. 116 4.6.3. Exploiting Log Granularity tl7 4.7 Trade-offs 118 49. .............. Implementation Strategies. 118 4.8.1. ......Challis's Algorithm rt9 482 Shadowing r20 482 I After-Look r20 r2t t2t Maintaining locality r22 Object Shadowing t23 4.9 Conclusions 123 Chapter 5. Implemetation Strategies. 5.1. ........... ........Introduction 125 52 .Related Work. 125 52r .Shrines 725 5.2.2. .......Page Based NapierSS Stores t25 5.2.3. .......Browns Before-Look Store t25 5.2.3.1. ............Operation 126 5.2.4. .................Munro's After-Look Store r27 525 .................Thatte t29 Systems 131 131 r32 r33 53 CASPERBi-Phase 135 531 .....Basics 135 5.3.2. .....Page Map. 136 5.3.3. .......Store Structure r37 5.3.4. ..............Store Creation 139 535 .Secondary Files 140 536 ..............Store Start-up t40 537 ..Napier8 8 Implementation t4r 538 ..Runtime Actions 142 5.3.8.1 ...Page States 142 5382 ...Read Access r42 144 .LPMap mapping 145 .Meld r46 5.3.10 ............Recovery t47 5.3.1 I ............Implementation Specifics t48 5.3.11.1 .....Placement in memory 148 5.3.t1.2.. .............Page copy sequence 148 5.3.11.3.