Computing Machinery

Highly Available Systems for Database Applications WON KIM IBM Research Laboratory, San Jose, Californm 95193 As users entrust more and more of their applications to computer systems, the need for systems that are continuously operational (24 hours per day) has become even greater. This paper presents a survey and analysis of representat~%e architectures and techniques that have been developed for constructing highly available systems for database applications. It then proposes a design of a distributed software subsystem that can serve as a unified framework for constructing database application systems that meet various requirements for high availability. Categories and Subject Descriptors: A.1 [General Literature]: Introductory and Survey; C.1.2 [Processor Architectures]: Multiple Data Stream Architectures (Multiprocessors)--interconnection architectures; C.4 [Computer Systems Organization]: Performance of Systems--reliability, availability, and serviceability; H.2.4 [Database Management]: Systems--distributed systems, transaction processing General Terms: Reliability Additional Key Words and Phrases: Database concurrency control and recovery, relational database INTRODUCTION ager to retrieve and update records in the database. In this paper we examine major hardware A transaction is a collection of reads and and software aspects of highly available writes against a database that is treated as systems. Its scope is limited to those sys- a unit [Gray 1978]. If a transaction com- tems designed for database applications. pletes, its effect becomes permanently re- Database applications require multiple corded in the database; otherwise, no trace paths from the processor to the disks, which of its effect remains in the database. To gives rise to some difficult issues of system support the notion of a transaction, undo architecture and engineering. Further, they log of data before updates and redo log of involve the software issues of concurrency data after updates are used~to allow a trans- control, recovery from crashes, and transaction to be undone or redone after crashes. action management. The Write Ahead Log protocol often is used In a typical business data processing en- to ensure that the log is flushed to the disk vironment, a user message from a terminal before the updated database records are invokes an application program. The appli- written to the disk. cation program interacts with a transaction The most fundamental requirement in manager to initiate and terminate (commit constructing a highly available system for or abort) a transaction. Once a transaction database applications is that each major has been initiated, the application program hardware and software component must at repeatedly interacts with a database man- least be duplicated. At minimum, the sys- Author's current address: Microelectronics and Computer Technology Corporation, 9430 Research Boulevard, Austin, Texas 78759. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1984 ACM 0360-0300/84/0300-0071 $00.75 ComputingSurveys, Vo~'16, No. 1, March 1984 72 * Won Kim CONTENTS The data communication subsystem is the terminal handler that receives use r requests, invokes application programs, and delivers the resultsthat it receives from the INTRODUCTION database/transaction manager. 1. MACHINE AND STORAGE The database manager must supporl; the ORGANIZATION TAXONOMY two fundamental capabilities of concur- 2. INTUITIVE CRITERIA FOR HIGH AVAILABILITY rency control and recovery. That is, it must 3. LOOSELY COUPLED SYSTEMS guarantee that the database remains con- WITH A PARTITIONED DATABASE sistent despite interleaved reads and writes 3.1 Tandem NonStop System to the database by multiple concurrent 3.2 AT&T's Stored Program Controlled Network transactions. Techniques such as locking 3 3 Bank of America's Distributive and time stamping have been developed Computing Facility and extensively studied for this purpose 3.4 Stratus/32 Continuous Processing System [Bernstein and Goodman 1981; Kohler 3.5 Auragen System 4000 1981]. 3 6 System D Prototype 4. TIGHTLY COUPLED SYSTEMS The transaction/database manager must WITH A SHARED DATABASE ensure that the database consistency is not 5. LOOSELY COUPLED MULTIPROCESSOR compromised by system failures or trans- SYSTEM WITH A SHARED DATABASE action failurescaused by software errors or 5.1 GE MARK III 5.2 Computer Consoles' Power System deadlocks. In particular, it must be able to 6. REDUNDANT COMPUTATION SYSTEMS: recover from transaction failures and soft SYNTREX'S GEMINI FILE SERVER crashes, which corrupt only the contents of 7. A FRAMEWORK FOR THE MANAGEMENT the main memory, as well as from hard OF SYSTEM CONFIGURATION crashes, which destroy the contents of the 8. CONCLUDING REMARKS ACKNOWLEDGMENTS disks. For recovery from soft crashes, the REFERENCES undo and redo logs of transactions are used [Gray et al. 1981; Haerder and Reuter v 1983]. To recover from hard crashes, systems rely on periodic dumping of the database into archival storage. tern requires two processors. There may The operating system is needed not .only have to be two paths connecting the pro- to run the other software components, but cessors, and it is desirable to have at least also to detect most of the common, low- two paths from the processors to the data- level software/hardware errors that occur. base, that is, two I/O subsystems consisting Most computer systems rely on progTam of a channel (I/O processor), controller, checks (interrupts) and machine checks to and disk drives. The disk controllers must detect such errors.Program checks are used be multiported, so that they may be con- to detect exceptions and events that occur nected to more than one processor. during execution of the program [IBM On the software side, the system needs 1980]. Exceptions include the improper five essential ingredients: (1) a network specification or use of instructions and communication subsystem, (2) a data com- data, for example, arithmetic overflow or munication subsystem, (3) a database man- underflow, and addressing or protection vi- ager (or a file system), (4) a transaction olation. Machine checks are used to report manager, and (5) the operating system. machine malfunctions, such as memory The network communication subsystem parity errors, I/O errors,and missing inter- must support interprocess(or)communica- rupts. The hardware provides information tion withL. ~ cluster of locally distributed that as~i~s the operating system in deter- processors. If the highly available system is mining the location of the malfunction and a node on a geographically distributed sys- extent of the damage caused by it. tem, the communication subsystem must Most systems have been designed to sur- also support internode communication. vive the failure of a single software/hard- Computing Surveys, Vol. 16, No. 1, March 1984 Highly Available Systems for Database Applications . 73 ware component. If a system is to be truly is marketing a local network file server continuously operational, however, it must called GEMINI; the MARK III Cluster File guarantee availability during multiple con- System, developed by General Electric, pro- current failures of software/hardware com- vides time-sharing services for its tele- ponents, during on-line changes of such phone-switching network users; and the components, and during on-line physical Distributive Computing Facility developed reconfiguration of the database itself. To by Bank of America automates the teller support the full range of high-availability functions for accounts. requirements, the operating system must In addition, the late 1970s SRI SIFT be supplemented with a software subsystem project for aircraft control evoked the Au- that can manage all software and hardware gust Systems' products; Bolt Beranek and components of a system. Such a software Newman (BBN) developed the PLURI- subsystem will receive failure reports from BUS system for use as a highly reliable the system components and reconfigura- communications processor on ARPANET; tion requests from the system operator. It and during the 1960s AT&T developed No. will analyze the status of all the resources 1 ESS and No. 2 ESS (Electronic Switching it manages, and compute the optimal con- System) for telephone switching services. figuration of the system both in response System D, a distributed transaction- to multiple concurrent failures of compo- processing system, was created as a proto- nents and requests for load balancing. Fur- type at IBM Research in San Jose with ther, it will initiate and monitor system availability and modular growth as its ma- reconfiguration, effecting mid-course cor- jor objectives, and there are other research rection of a reconfiguration that does not projects currently under way at IBM Re- succeed. Finally, it will diagnose a class of search to investigate availability and per- failures that other components fail to rec- formance issues under various software/ ognize. hardware structures. The systems introduced in the following The remainder of this paper is organized paragraphs are often considered highly as follows. In Section 1, a taxonomy of available, and are used as

Computing Machinery

Transaction Systems Architects, Inc. 2006 Annual Report

C:\Andrzej\PDF\ABC Nagrywania P³yt CD\1 Strona.Cdr

CA XCOM Data Transport for UNIX and Linux Overview Guide

Using C-Kermit 2Nd Edition

Best Practices in Planning and Managing Your Metered Platform

ACADEMY of SCIENCES ______The Human Resources Director Has Approved the Following Exceptions to the Order of Layoff

Role of Operating System

Stratus® Ftserver® V Series Model 200/400 Systems

Mxload: Read Multics Backup Tapes on UNIX

Milap Shah 83-28, 255 Street Glen Oaks, NY – 11004 USA Email: [email protected]

Stratus® Ftserver® V Series Model 300 Systems

United States Patent (19) 11 Patent Number: 5,680,551 Martino, I 45 Date of Patent: Oct