Selective Transparency in Distributed Transaction Processing
Total Page:16
File Type:pdf, Size:1020Kb
Selective Transparency in Distributed Transaction Processing Daniel L. McCue Ph.D. Thesis The University of Newcastle upon Tyne Computing Laboratory April1992 ---- I I I f---- t- Abstract Abstract Object-oriented programming languages provide a powerful interface for programmers to access the mechanisms necessary for reliable distributed computing. Using inheritance and polymorphism provided by the object model, it is possible to develop a hierarchy of classes to capture the semantics and inter-relationships of various levels of functionality required for distributed transaction processing. Using multiple inheritance, application developers can selectively apply transaction properties to suit the requirements of the application objects. In addition to the specific problems of (distributed) transaction processing in an environment of persistent objects, there is a need for a unified framework, or architecture in which to place this system. To be truly effective, not only the transaction manager, but the entire transaction support environment must be described, designed and implemented in terms of objects. This thesis presents an architecture for reliable distributed processing in which the management of persistence, provision of transaction properties (e.g., concurrency control), and organisation of support services (e.g., RPC) are all gathered into a unified design based on the object model. Acknowledgments Acknowledgements "Induced, by a conviction of the great utility of such engines, to withdraw for some time my attention from a subject on which it has been engaged during several years, and which possesses charms of a higher order, I have now arrived at a point where success is no longer doubtful." [Charles Babbage 1822J I sincerely thank my supervisor, Professor Santosh Shrivastava, who has taught me so much about reliability and distributed systems. His unfailing generosity with time, ideas and collaborative work is sincerely appreciated. He has been an inspiration and a role model for the highest level of professionalism. Professor Pete Lee, a trusted friend and adviser, deserves my thanks for bringing me to Newcastle in the first place and for his diligent reading of early drafts of this thesis. His patience, stamina and encouragement have kept me going and encouraged me to complete the work once started. I would also like to thank the many members of the Arjuna project, particularly Mark Little, Graham Parrington and Stuart Wheater for their help and support. The members of the Computing Laboratory, especially Ron Kerr and Graham Megson, have also been a continuing source of inspiration and encouragement, keeping me intellectually challenged and making my stay here all the more pleasant. The work described in this thesis has been supported in part by grants from the UK Science and Engineering Council and ESPRIT project No. 2267 (Integrated Systems Archi tecture ). Finally, I must thank my wife, Charlene, who has always been there when I needed her. II Table of Contents Abstract . Acknowledgements. 11 Table of Contents . III List of Figures . v List of Tables. vu 1 Introduction 1 1.1 Developing Reliable Software 1 1.2 The Programming Interface 4 1.3 Mechanisms for Reliable Controlled Sharing . 5 1.4 Mechanisms for Managing Persistent Data . 6 1.5 Programming Language Options . 9 1.6 Building a Reliable Distributed Object System 11 1.7 Contributions of the Thesis 13 2 Transactions on Persistent Objects 14 2.1 Introduction. 14 2.2 Achieving Fault Tolerance with Transactions 15 2.3 Distribution. 21 2.4 Persistent Objects................................... 26 2.5 Transparencies and Properties 27 2.6 Object-Oriented Programming 30 2.7 Objects and Transactions in Distributed Systems. 33 2.8 Related Systems. 34 2.9 Summary. 45 3 Architectural Issues 46 3.1 Introduction. 46 3.2 System Architecture................................. 46 3.3 Support Service Interfaces 62 3.4 Architectural Summary. 67 4 A Class Hierarchy for Actions 68 4.1 Introduction . 68 4.2 Object Properties 69 4.3 Transactions . 82 4.4 Summary of Class Hierarchy Development. 86 4.5 Class Definitions . 87 4.6 Class Hierarchy Summary. 103 III 5 Developing Reliable Distributed Applications 104 5.1 Introduction. 104 5.2 Design and Implementation Issues 105 5.3 An Implementation. 118 5.4 Application Development............................ 119 6 Conclusions and Further Work 138 6.1 Some Remaining Issues 139 6.2 Some Observations of the use of C++ 141 6.3 Thesis summary . 142 References. 144 Index 165 IV List of Figures Figure 2-1: A simple class hierarchy. .. 31 Figure 2-2: An example of single inheritance 32 Figure 2-3: An example of multiple inheritance 32 Figure 3-1: Components of a Persistent Object Transaction System 49 Figure 3-2: An Instantiation of the Architecture. .. 51 Figure 3-3: Object State Transitions. .. 54 Figure 3-4: Objects, Clients and Servers 55 Figure 3-5: An Example Transaction 59 Figure 4-1: Abstract Classes for Object Identity and Naming. .. 72 Figure 4-2: Some possible derivations from object hierarchy 73 Figure 4-3: Transaction object hierarchy(s) including the concept of identity 74 Figure 4-4: An object hierarchy including variants of Persistent 77 Figure 4-5: Class hierarchy including variants of Recoverable. .. 80 Figure 4-6: Class hierarchy including locks 82 Figure 4-7: Class hierarchy including transactions 83 Figure 4-8: Complete class hierarchy . .. 84 Figure 4-9: Class CheckpointObject " 87 Figure 4-10: Class ObjectState . .. 89 Figure 4-11: Class UID . .. 91 Figure 4-12: Class NamedUID . .. 92 Figure 4-13: Class Identified. .. 92 Figure 4-14: Class Recoverable 93 Figure 4-15: Class StateRecoverable 94 Figure 4-16: Class OpRecoverable . .. 95 Figure 4-17: Class UndoRecoverable . .. 96 Figure 4-18: Class Persistent. .. 98 Figure 4-19: Class Lock 100 Figure 4-20: Lock modes and permissible transitions for nested transactionslOl Figure 4-21: Class Shared . .. 101 Figure 4-22: Class Transaction 102 Figure 5-1: Transaction State Transitions " 112 Figure 5-2: A C++ definition of the abstract class BankAccount 122 Figure 5-3: Specialisations of the BankAccount class . .. 122 Figure 5-4: Implementation of the class Bank Account . .. 124 v Figure 5-5: Overloading selected operations in a derived class . .. 125 Figure 5-6: A Client accessing a BankAccount from within a Transaction . 126 Figure 5-7: Multiple signal sources merging through a single queue. .. 127 Figure 5-8: Class definition for N-writer, l-reader, unrecoverable queue. 128 Figure 5-9: Class Definition for a Recoverable, Concurrency-controlled Routing Table 129 Figure 5-10: Defining a Recoverable Storage Allocator. .. 131 Figure 5-11: Implementing a Recoverable Storage Allocator 132 Figure 5-12: Classes of a Printing Subsystem 136 VI List of Tables Table 2-1: Related Systems Summary Chart 39 Table 2-2: Distributed Operating Systems Summary Chart 44 Table 5-1: Transaction transition actions (2-phase commit) 113 VII Introduction 1 Introduction "Contrary to the situation with hardware, where an increase in reliability usually has to be paid for by a higher price, in the case of software unreliability is the greatest cost factor. " Edsger W Dijkstra [Dijkstra 1982J 1.1 Developing Reliable Software As businesses, governments and individuals become increasingly dependent on computer systems, the reliability of computer software and systems becomes ever more important. Software and hardware design techniques that were once employed only for "life-critical" computer applications are now recognised to be necessary for a wide range of "business-critical" functions. Modern approaches to software design should show increasing use of techniques to improve reliability, and increasingly reliable software should result. Unfortunately, some of the otherwise beneficial trends in computer systems development, such as better communications and increased computing power, have had an adverse effect on the level of reliability which has been achieved. Networking technology has advanced to a point to allow even heterogeneous machine interconnection with relative ease. However, reliable management of distributed data remains a difficult, burdensome chore for programmers. Furthermore, the problems of data and code distribution, management of communications and system failures, and access to remote data have been added to the already difficult task of application programming. Larger computer networks and ever faster CPUs further tempt applications designers to attempt ever more complex projects. The effect of the increased size and complexity of computer 1 Introduction applications is to reduce their reliability. Thus, while business demand for reliability is increasing, technology trends are eroding the level of reliability that programmers can achieve with present-day tools. The difficulties of producing reliable software for distributed systems are manifest in many ways. Some of the problems stem directly from the addition of distribution to already complex applications. Other problems arise because distribution introduces more possibilities for failure, which causes overall reliability to suffer. Three fundamental application requirements which are difficult to meet reliably in distributed applications are data sharing, data (and code) distribution and data storage. Each requirement gives rise to problems which programmers must resolve in developing reliable distributed applications . • Data Sharing - As computer systems are applied to a wider variety of applications, business data is increasingly