DISTRIBUTED SERIALIZABILITY

DDBMS

Introduction

◦ A serializable schedule is a schedule that allows a set of transactions to execute in some interleaved order such that the effects are equivalent to executing them in some serial order, as in a serial schedule.
◦ Executing transactions in a serializable schedule is a sufficient condition for preventing conflicts.
◦ The serial execution of transactions always leaves the database in a consistent state. Serializability describes the concurrent execution of several transactions.
◦ The objective of serializability is to find the non-serial schedules that allow transactions to execute concurrently without interfering with one another, thereby producing a database state that could be produced by a serial execution.

◦ Serializability must be guaranteed to prevent inconsistency from transactions interfering with one another.
◦ The order of Read and Write operations is important in serializability.
◦ The serializability rules are as follows:
  ◦ If two transactions T1 and T2 only Read a data item, they do not conflict and the order is not important.
  ◦ If two transactions T1 and T2 either Read or Write completely separate data items, they do not conflict and the execution order is not important.
  ◦ If one transaction T1 Writes a data item and another transaction T2 either Reads or Writes the same data item, the order of execution is important.
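The three rules above can be condensed into a single predicate. The following is a minimal sketch; the `Op` record and the `conflicts` function are illustrative names, not part of the slides.

```python
from typing import NamedTuple


class Op(NamedTuple):
    """One database operation (hypothetical representation)."""
    txn: str   # transaction identifier, e.g. "T1"
    kind: str  # "R" for Read, "W" for Write
    item: str  # name of the data item


def conflicts(a: Op, b: Op) -> bool:
    """Return True when the relative order of a and b is important.

    Two operations conflict when they come from different transactions,
    touch the same data item, and at least one of them is a Write.
    """
    return (
        a.txn != b.txn
        and a.item == b.item
        and "W" in (a.kind, b.kind)
    )
```

Under this sketch, two Reads of `x` do not conflict, operations on separate items do not conflict, and a Write of `x` conflicts with any Read or Write of `x` by another transaction.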

◦ Serializability can also be represented by constructing a precedence graph.
◦ A precedence relationship can be defined as follows: transaction T1 precedes transaction T2 if there are two non-permutable actions A1 and A2 such that A1 is executed by T1 before A2 is executed by T2.
◦ Given the existence of non-permutable actions and the sequence of actions in a transaction, it is possible to define a partial order of transactions by constructing a precedence graph.
◦ A precedence graph is a directed graph in which:
  ◦ The set of vertices is the set of transactions.
  ◦ An arc exists from transaction T1 to transaction T2 if T1 precedes T2.
◦ A schedule is serializable if the precedence graph is acyclic.
◦ The serializability property of transactions is important in multi-user and distributed databases, where several transactions are likely to be executed concurrently.

Distributed Serializability
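The precedence-graph test described in the introduction can be sketched as follows. This is an illustrative sketch, not a definitive implementation: it assumes a schedule is a list of `(transaction, kind, item)` tuples in execution order, treats "non-permutable actions" as conflicting Read/Write pairs, and therefore tests conflict serializability. All function names are hypothetical.

```python
from collections import defaultdict


def precedence_graph(schedule):
    """Build arcs T1 -> T2 for every pair of conflicting operations
    in which the operation of T1 is executed first."""
    edges = defaultdict(set)
    for i, (t1, k1, x1) in enumerate(schedule):
        for t2, k2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (k1, k2):
                edges[t1].add(t2)
    return edges


def has_cycle(edges):
    """Depth-first search with three colours to detect a directed cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)  # every node starts WHITE

    def visit(node):
        colour[node] = GREY
        for nxt in edges.get(node, ()):
            if colour[nxt] == GREY:
                return True  # back edge: a cycle exists
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(edges))


def is_conflict_serializable(schedule):
    """A schedule passes when its precedence graph is acyclic."""
    return not has_cycle(precedence_graph(schedule))
```

For example, the schedule R1(x), W2(x), W1(x) yields the arcs T1 -> T2 and T2 -> T1, so the graph is cyclic and the schedule is rejected, whereas R1(x), W1(x), W2(x) yields only T1 -> T2 and is accepted.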

◦ A transaction consists of a sequence of read and write operations, together with some computation steps, that represents a single logical unit of work in a database environment.
◦ The operations of a transaction form a unit of atomicity; that is, either all the operations complete successfully or none of them is carried out at all.
◦ Ozsu has defined a formal notation for the transaction concept. According to this formalization, a transaction Ti is a partial ordering of its operations and the termination condition.
◦ The partial order P = {Σ, α} defines an ordering among the elements of Σ (called the domain), where Σ consists of the read and write operations and the termination condition (abort, commit) of the transaction Ti, and α indicates the execution order of these operations within Ti.

◦ The execution sequence or execution ordering of the operations of a set of transactions is called a schedule.
◦ Let E denote an execution sequence of transactions T1, T2, ..., Tn.
◦ E is a serial execution or serial schedule if no transactions execute concurrently in E; that is, each transaction is executed to completion before the next one begins.
◦ Every serial execution or serial schedule is deemed to be correct, because the properties of transactions imply that a serial execution terminates properly and preserves database consistency.
◦ On the other hand, an execution is said to be a non-serial execution or non-serial schedule if the operations from a set of concurrent transactions are interleaved.
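The distinction between serial and non-serial schedules can be checked mechanically. The sketch below assumes a schedule is a list of `(transaction, operation)` pairs in execution order; `is_serial` is an illustrative name.

```python
def is_serial(schedule):
    """Return True if each transaction runs to completion before the
    next one begins, i.e. no transaction reappears after another
    transaction has started."""
    finished = set()
    current = None
    for txn, _op in schedule:
        if txn != current:
            if txn in finished:
                return False  # txn resumes after being interrupted
            if current is not None:
                finished.add(current)
            current = txn
    return True
```

With this check, the sequence T1, T1, T2, T2 is serial, while T1, T2, T1 is non-serial because T1 resumes after T2 has started.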

◦ A non-serial schedule (or execution) is serializable if it is computationally equivalent to a serial schedule, that is, if it produces the same result and has the same effect on the database as some serial execution.
◦ To prevent data inconsistency, it is essential to guarantee serializability of concurrent transactions.
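The idea of "same result as some serial execution" can be illustrated by running a schedule and comparing the final database state with every serial order. This is only a sketch under stated assumptions: each transaction is modelled as a list of step functions acting on a shared `db` dictionary and a private `local` dictionary, and the two example transactions (add 10 to `x`, double `x`) are invented for illustration. Agreement for one initial state does not prove serializability in general; the check is meant to show how an interleaving can fail.

```python
from itertools import permutations


def run(schedule, programs, initial):
    """Execute the steps in schedule order; each entry names the
    transaction whose next step runs."""
    db = dict(initial)
    local = {t: {} for t in programs}
    pc = {t: 0 for t in programs}  # next step index per transaction
    for t in schedule:
        programs[t][pc[t]](db, local[t])
        pc[t] += 1
    return db


def serial_results(programs, initial):
    """Final states produced by every serial order of the transactions."""
    results = []
    for order in permutations(programs):
        schedule = [t for t in order for _ in programs[t]]
        results.append(run(schedule, programs, initial))
    return results


def matches_some_serial_run(schedule, programs, initial):
    return run(schedule, programs, initial) in serial_results(programs, initial)


# Hypothetical transactions: T1 adds 10 to x, T2 doubles x.
def read_x(db, local):
    local["x"] = db["x"]


def write_x_plus_10(db, local):
    db["x"] = local["x"] + 10


def write_x_doubled(db, local):
    db["x"] = local["x"] * 2


PROGRAMS = {
    "T1": [read_x, write_x_plus_10],
    "T2": [read_x, write_x_doubled],
}
INITIAL = {"x": 100}
```

Starting from x = 100, the serial orders give 220 (T1 then T2) and 210 (T2 then T1). The interleaving T1, T2, T1, T2 lets both transactions read 100, so the final value is 200, which matches neither serial result: T1's update is lost.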

◦ Serializability theory for centralized databases can be extended in a straightforward manner to distributed non-replicated databases.
◦ In a distributed system, each local site maintains a schedule or execution order of the transactions or sub-transactions (parts of a global transaction) running at that site, called the local schedule.
◦ The global schedule or global execution order is the union of all local schedules in a non-replicated system.
◦ Hence, if each local site maintains serializability, then the global schedule also becomes serializable, provided that the local serialization orders are identical.

◦ Replication of data items in a distributed system adds extra complexity in maintaining global or distributed serializability.
◦ In this case, it is possible that the local schedules are serializable, but mutual consistency of the replicated data items is not preserved, owing to conflicting operations on the same data item at multiple sites.
◦ Thus, a distributed schedule or global schedule (the union of all local schedules) is said to be distributed serializable if the execution order at each local site is serializable and the local serialization orders are identical.
◦ This requires that all sub-transactions appear in the same order in the equivalent serial schedule at all sites.

◦ Formally, distributed serializability can be defined as follows.
◦ Let S be the union of all local serializable schedules S1, S2, ..., Sn in a distributed system.
◦ The global schedule S is said to be distributed serializable if, for each pair of conflicting operations Oi and Oj from distinct transactions Ti and Tj at different sites, Oi precedes Oj in the total ordering S if and only if Ti precedes Tj in all of the local schedules where they appear together.
◦ To attain serializability, a DDBMS must incorporate synchronization techniques that control the relative ordering of conflicting operations.
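One common way to test the condition above is to merge the precedence arcs of all local schedules and look for a single serial order that every site agrees with. The sketch below takes that approach as an assumption; it is not presented as the method of any particular DDBMS. It represents each local schedule as a list of `(transaction, kind, item)` tuples keyed by a site name, and uses `graphlib` from the Python standard library (Python 3.9 or later). The function names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def local_precedence_edges(schedule):
    """Arcs (before, after) implied by conflicting operations at one site."""
    edges = set()
    for i, (t1, k1, x1) in enumerate(schedule):
        for t2, k2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (k1, k2):
                edges.add((t1, t2))
    return edges


def global_serial_order(local_schedules):
    """Return one serial order consistent with every site, or None when
    the local serialization orders contradict each other."""
    predecessors = {}
    for schedule in local_schedules.values():
        for txn, _kind, _item in schedule:
            predecessors.setdefault(txn, set())
        for before, after in local_precedence_edges(schedule):
            predecessors[after].add(before)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None
```

For instance, if site A runs W1(x) before R2(x) while site B runs W2(y) before R1(y), each local schedule is serializable on its own, but site A orders T1 before T2 and site B orders T2 before T1, so no global serial order exists and the function returns `None`.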

◦ Let us consider a simple transaction Ti that consists of the following operations:

Transaction Ti:
    Read(a);
    Read(b);
    a := a + b;
    Write(a);
    Commit;

◦ The partial ordering of the above transaction Ti can be formally represented as P = {Σ, α}, where Σ = {R(a), R(b), W(a), C} and α = {(R(a), W(a)), (R(b), W(a)), (W(a), C), (R(a), C), (R(b), C)}.
◦ Here, the ordering relation α specifies the relative ordering of all operations with respect to the termination condition.

◦ The partial ordering of a transaction makes it possible to derive the corresponding directed acyclic graph (DAG) for the transaction.
◦ The DAG of the above transaction Ti is illustrated in the figure.

Directed Acyclic Graph
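As an illustration, the DAG of transaction Ti can be derived directly from its partial order P = {Σ, α}. The sketch below encodes Σ and α as Python sets and uses `graphlib` from the standard library (Python 3.9 or later) to produce one execution order that respects α. The names `SIGMA`, `ALPHA`, `dag_from_partial_order` and `respects` are illustrative.

```python
from graphlib import TopologicalSorter

# Domain and ordering relation of transaction Ti, as given in the slides.
SIGMA = {"R(a)", "R(b)", "W(a)", "C"}
ALPHA = {
    ("R(a)", "W(a)"),
    ("R(b)", "W(a)"),
    ("W(a)", "C"),
    ("R(a)", "C"),
    ("R(b)", "C"),
}


def dag_from_partial_order(sigma, alpha):
    """Map every operation to the set of operations that must precede it."""
    predecessors = {op: set() for op in sigma}
    for before, after in alpha:
        predecessors[after].add(before)
    return predecessors


def respects(order, alpha):
    """Check that a linear order of operations obeys every pair in alpha."""
    position = {op: i for i, op in enumerate(order)}
    return all(position[a] < position[b] for a, b in alpha)


dag = dag_from_partial_order(SIGMA, ALPHA)
# static_order() raises graphlib.CycleError if alpha is not acyclic.
order = list(TopologicalSorter(dag).static_order())
```

The two reads are unordered with respect to each other, so either R(a), R(b), W(a), C or R(b), R(a), W(a), C is a valid linearization; W(a) and C always come last, in that order.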

◦ Consider the following three transactions:

T1:          T2:          T3:
Read(x)      Write(x)     Read(x)
Write(x)     Write(y)     Read(y)
Commit       Read(z)      Read(z)
             Commit       Commit