A View-Based Approach to Relaxing Global Serializability in Multidatabase Systems
Total Page:16
File Type:pdf, Size:1020Kb
Purdue University Purdue e-Pubs Department of Computer Science Technical Reports Department of Computer Science 1993 A View-Based Approach to Relaxing Global Serializability in Multidatabase Systems Aidong Zhang Evaggelia Pitoura Bharat K. Bhargava Purdue University, [email protected] Report Number: 93-082 Zhang, Aidong; Pitoura, Evaggelia; and Bhargava, Bharat K., "A View-Based Approach to Relaxing Global Serializability in Multidatabase Systems" (1993). Department of Computer Science Technical Reports. Paper 1095. https://docs.lib.purdue.edu/cstech/1095 This document has been made available through Purdue e-Pubs, a service of the Purdue University Libraries. Please contact [email protected] for additional information. A VIEW-BASED APPROACH TO RELAXING GLOBAL SERIALIZABILITY IN MULTIDATABASE SYSTEMS Aldong Zhang Evaggelia PitOUI'3 Bharat Bh:Jrgava CSD TR-9J.082 December 1993 (Revised March 1994) A View-Based Approach to Relaxing Global Serializability in Multidatabase Systems Aidong Zhang, Evaggelia Pitoura, and Bharat K Bhargava Department of Computer Science Purdue University West Lafayette, IN 47907 USA Abstract In this paper, we propose a new approach to ensuring the correctness of non serializable executions. The approach is based on relating transaction views of the database to the integrity constraints of the system. Drawing upon this approach, we de velop a new correctness criterion for multidatabase concurrency control. This criterion, caUed view-based two-level serializability, relaxes global serializabitity in multidatabase systems while respecting the autonomy of local database systems. No additional re strictions other than serializallility are imposed on local database systems. 1 Introduction A Illultidataba.-<;e system (MDBS) is a higher-level confederation of a nWllber of pre-existing autonomolls and possibly heterogeneous database systems. The purpose of a multidatabase system is to allow uniform access to the information stored in these component systems. In this envirollment, transaction management is performed at two levels: at a local level by the pre-existing transaction managers (LTMs) of the local databases, and at a global level by the global transaction manager (GTM). There are two types of transactions: • local transactions, which acceSs data at one site only and which are submitted to the appropriate LTM, outside the control of the GTMj and • global transactions, which are submitted to the GTM, where they are parsed into a number of global subtransactions, each of which accesses data stored in a single local 1 database. These subtransactions are then submitted for execution to the appropriate LTM. Global subtransactiolls are viewed by an LTM as ordinary local transactions. Local transaction managers are responsible for the correct execution of all transactions executed at their local sites. The global transaction manager thus retains no control over global transactions after their submission to the LTMs and can make no assumptions about their exp.cution. The autonomous uature of the LTMs thus greatly complicates the problem of Lransaction management in a multidaLabasc system. COllcurrcllcy control in l\/IDBSs has received much attention frolll multidatabase re searchers. In particular, maillLaluing global serializability in the execution of both local and global transactions has been well studied [2, 8, 3, 4, 10]. The difficulties in this regard center upon the inability of the GTM to control the serialization order of global subtransac tions at local sites. As a result, even a serial execution of global transactions at the global level may not ensure that their serialization order at a local site will be consistent with their exp.cution order at the global level. All successful attempts for ensuring global serializabll-ity require the enforcement of conflicting operations among global subtransactions at each local site [4, 5, 11]. The GTM can thus control the serialization order of global subtransaetions by controlling the execution of these conflicting operations. However, enforcing coniliets lllay result in poor performance if most global transactions would not naturally conflict. Relaxing global serializability is thus a significant issue for multidatabase concurrency control. A IlOTl serializable criterion, called two-level serializability (21SR), was introduced in [6]. In general, 2LSR executions of local and global transactions do not guarantee database consistency. In this paper, we present an approach to ensuring that 2LSR executions maintain lllul tidatabase consistency. This approach draws upon the observation that the view of a global transaction, that is, the data it reads, can play an important role in ensuring that its exe cution will maintain database consistency. The underlying concepts of tlle view consistency and view closure of transactions are defined by relating the transaction views to the integrity constrainLs that are defined in the sysLem. The impact of such global transaction views on correctness criteria leads to the formulation of the concept of view-base.d two-level serializ ability. This new correctness criterioll, imposes certain restrictions on the view of global transactions that participate in two-level serializable executions. We shall demonstrate that the view-based two-level serializable execution of local and global transactions can maintain 2 lllultidatabase consistency in vanous practical multidatabase models. As this criterion is more general than serializability and the view consistency and view closure of transactiOllS call be efficiently enforc~d by the system, the proposed approach can be applied to the execution of all global and local transactions. This paper is organized as rollows. Section 2 introduces the fundamental concepts under lying our formulation or view-based two-level serializability, the details of which are presented in Section 3. Section 4 compares the present work with related research, while cOllcluding remarks are offered in Section 5. 2 Preliminaries In this section, we shall introduce the basic terminology to be used in this paper and explore the background of the problem. 2.1 Basic Terminology A multidatabase is the union of all data items stored at the participating local sites. We denote the set of all data items in a local site LSi by D; for i = 1, ..., m and the set of all data items in the lllultidaLabase by D. Thus, D = Ui~l D j • We assume tllat local databases are disjoilltj that is) Di n Dj = 0, i =j:. j. To distinguish between data items prim to multiclatabase integratioll, the set or data items at a local site LSi is partitioned into local data items, denoted LDi , and global data items, denoted GD., such that LDj n GD. = 0 and Di = LDj U GDi • The set of all global data items is denoted GD, GD = Ui~l GDj • Following the traditional approach, a database state is defined as a mapping of every data item to a value of its domain, and integrity constraints are formulas in predicate calculus that express relationships or data items that a database must satisfy. In an MDBS system, there are three types of integrity constraints: local integrity constmints are defined on local data items at a single local sitej local/global integ7'ity constmints are defined between local and global data itemsj and global 'integ1'ity constmints are defined on global data items. The consistency of database state is then defined as follows: Definition 1 (Database consistency) A local database state is consistent if it preserves 3 all integrity constraints that a1·e. defined at the local site. A multidatabase state is consistent if it prescrves all integrity constraints defined in the MDBS environment. In this paper, a transaction is a sequence of read and write operations resulting from the execution of a transaction program. Such a program is conventionally written in a high-level programming language, with assignments, loops, conditional statements, and other control structures. For the elements of a transactioll, we denote the read and write operations as 1'(X) and w(x) (possibly subscripted). We shall alternatively use r(x, v) (or w(x, v)) to denote an operation which reads (or writes) a value v from (or to) the data item x. Two operations conflict with eac:h other if they access the same data item and at least one of them is a wI'ite operation. The execution of a transaction transfers a database from one consistent state to another. In an MDBS environment, a local transaction is a transaction that accesses the data items at a single local site. A global transaction is a set of global subtransactions, within which each global subtransac.tion is a transaction accessing the data items at a single local site. In this paper, we assume that each global transaction has only one subtransaction at each local site. The execution of a global transaction transfers a l11ultidatabase from one consistent state to another. A schedule over a set of transactions is a partial order of all and only the operations of those transactions which orders all conllicting operations alld which respects the order of operations specified by the transactions. III a MDBS environment, a local schedule SDk is a schedule over both local transactions and global subtransactions which are executed at the local site 1S/:, and a gLobaL schedule S is a schedule over both local and global transactions which are executetl in an MDBS. A global subschedule SO is global schedule S restricted to the set 9 of global transactions in S. The correctness of a schedule is defined as follows: Definition 2 (Schedule correctness) A schedule lS correct if it preserves all integrity constmints that are defined in the database system and each transaction in S reads only a consistent database state. 4 2.2 Relaxing Global Serializability A global schedule 5' is considered La be globally serializable if S is serializable [1] on the execution of both local and global transactions. Clearly, if local and global transactions maintain all integrity constralnts defined in the MDBS environment, then a globally se rializable global schedule is coned.