
The Principle of Commitment Ordering, or Guaranteeing Serializability in a Heterogeneous Environment of Multiple Autonomous Resource Managers Using Atomic Commitment

Yoav Raz
Digital Equipment Corporation, 151 Taylor St. (TAYI), Littleton, MA 01460

Abstract

Commitment Ordering (CO) is a serializability concept that allows global serializability to be effectively achieved across multiple autonomous Resource Managers (RMs). The RMs may use different (any) concurrency control mechanisms. Thus, CO provides a solution for the long standing global serializability problem. RM autonomy means that no concurrency control information is shared with other entities, except Atomic Commitment (AC) protocol (e.g. Two Phase Commitment - 2PC) messages. CO is a necessary condition for guaranteeing global serializability across autonomous RMs. CO generalizes the popular Strong-Strict Two Phase Locking concept (S-S2PL; "release locks applied on behalf of a transaction only after the transaction has ended"). While S-S2PL is subject to deadlocks, CO exhibits deadlock-free executions when implemented as nonblocking (optimistic) concurrency control mechanisms.

1 Introduction

Distributed transaction management services are intended to provide coordination for transactions that span multiple resource managers (RMs). A RM is a software component that manages resources under transactions' control. A resource is any medium with well defined states that are being modified and retrieved while obeying transaction's ("all or nothing") semantics (atomicity). This means that effects of failed transactions are undone, which requires that resources' states be recoverable (i.e. if a resource is modified by a transaction, the state it had when the transaction started can be restored before the transaction ends). A resource is typically (but not necessarily) a data item. The scope of any specific resource (e.g. granularity units, versions, or replications) is defined as a part of a RM's semantics. Examples of resource managers are database systems (DBSs), queue managers, cache managers, some types of management entities/objects (e.g. see [EMA], [OSI-SMO]) etc.
Permission to copy without fee all or part of this material is granted provided that copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Database Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the endowment.
Proceedings of the 18th VLDB Conference, Vancouver, British Columbia, Canada 1992

A RM may impose a certain property of the generated transaction histories (transaction event schedules) to guarantee correctness and certain levels of fault tolerance. However, the global history, i.e. the combined history of all the RMs involved, does not necessarily inherit such a property even if it is provided by all the RMs. The serializability (SER) property is an example. Serializability is the most commonly accepted general criterion for the correctness of concurrent transactions (e.g. see [Bern 87], [Papa 86]), and supported in most RMs. When transactions involve more than one RM, this property may be violated in general, unless special measures are taken, or certain conditions exist to guarantee it. This issue is dealt with, for example, in [Brei 90], [Brei 91], [Elma 87], [Geor 91], [Glig 85], [Litw 89] and [Pu 88]. [Weih 89] deals with the relationships between local and global serializability in the framework of abstract data types. Achieving global serializability with reasonable performance, especially across RMs that implement different concurrency control mechanisms, has been considered a difficult problem (e.g. [Shet 90], [Silb 91]).

Global serializability can be guaranteed, in principle, by several methods if the RMs involved share relevant concurrency control information. Timestamp Ordering (TO) is an example (e.g. [Bern 87], [Lome 90]). If all the RMs involved support TO-based concurrency control and share the same timestamps, then the entire system can exhibit a coherent behavior based on TO, which guarantees global serializability. However, this technology requires a certain RM synchronization as well as timestamp propagation, and is currently unavailable in heterogeneous environments.

Another known method, based on locking, allows RM autonomy. We define a RM to be autonomous if it does not share any resources and concurrency control information (e.g. timestamps) with another entity (external to the RM), and is being coordinated (at the nonapplication level¹) solely via Atomic Commitment (AC) protocols (to achieve global atomicity). Most systems that support distributed transaction services provide AC protocols and related interfaces. These protocols guarantee atomicity even in the presence of certain types of recoverable failures. It means that either a distributed transaction is committed, i.e. its effects on all the resources involved become permanent, or it is aborted (rolled back), i.e. its effects on all the resources are undone. The most commonly used atomic commitment protocols are variants of the Two Phase Commitment protocol (2PC - [Gray 78], [Lamp 76]). Examples are Digital Equipment Corporation's Distributed Transaction Manager - DECdtm ([DECdtm]), Logical Unit Type 6.2 of International Business Machines Corporation ([LU6.2]), and the ISO - OSI standard for Distributed Transaction Processing ([OSI-DTP]).

A well known local (i.e. local to each RM) concurrency control mechanism that together with AC guarantees global serializability is Strong Strict Two Phase Locking (S-S2PL; "release locks issued on behalf of a transaction only after the transaction has ended"). This fact has been known for several years, and has been the major correctness foundation for distributed transactions. Various technical documents about distributed transaction management (e.g. [OSI-CCR]) have mentioned it. The observation that local S-S2PL guarantees global serializability appears explicitly at least in [Pu 88], [Brei 90] and [Brei 91]². The disadvantage of this approach is that all the RMs involved have to implement S-S2PL based concurrency control, even if other types are preferable for some RMs.

In this paper we examine the relationships between histories of individual RMs and the global history that comprises them, and generalize the above observation. We define a history property named Commitment Ordering (CO), and show that guaranteeing it is a necessary and sufficient condition for guaranteeing global serializability under the conditions of RM autonomy. CO can be implemented as standalone serializability mechanisms as well as being incorporated with other concurrency control mechanisms. Since CO can be enforced solely by controlling the order of transactions' commit events, it can be combined with any other concurrency control mechanism without affecting the mechanism's resource access scheduling strategy. This allows selecting and optimizing concurrency control for each RM according to the nature of transactions involved. Enforcing CO does not require aborting more transactions than those needed to be aborted for global serializability violation prevention, which is determined exclusively by the resource access orders, and is independent of the commit orders. S-S2PL based RMs provide CO already, since S-S2PL is a special case of CO. In summary, serializability of transaction histories across (any) different RM types, which may use different concurrency control mechanisms but provide the CO property, is guaranteed without any global coordination or services but AC. Thus, the CO solution is fully distributed.

Section 2 is an overview and reformulation of serializability theory, which provides the foundation for analyzing CO. Section 3 defines CO and describes its

¹ Typically, a RM is unaware of any resource state dependency with states of resources external to the RM, implied by applications. This is also true in the cases where RMs are coordinated by multi-database systems, which provide applications with integrated views of resources.
² [Brei 91] uses the term rigorousness for S-S2PL. [Brei 91] also redefines CO (naming it strong recoverability) and uses it to show that applying S-S2PL locally guarantees global serializability. No algorithm for enforcing CO (beyond S-S2PL) is given there.

properties. Section 4 examines CO schedulers and presents generic CO algorithms. Section 5 deals with multi RM histories, atomic commitment, and relationships between local and global properties. Section 6 shows that CO is exactly the property required to guarantee global serializability across autonomous RMs. Section 7 provides a conclusion. This paper is an abridged version of [Raz 90].

2 Histories and their properties - an overview

This section summarizes and reformulates known concepts and results of concurrency control theory (see also [Bern 87]), as well as introducing some new concepts, as a foundation for the following sections. Some reformulation is required to express and prove results that follow.

2.1 Transactions and histories

A transaction, informally, is an execution of a set of programs that access shared resources. It is required that a transaction is atomic, i.e. either the transaction completes successfully and its effects on the resources become permanent¹, or all its effects on the resources are undone. In the first case, the transaction is committed. In the second, the transaction is aborted. Formally, we use an abstraction that captures only events and relationships among them, which are necessary for reasoning

The (binary, asymmetric, transitive, irreflexive) relation that comprises the partial order is denoted "

The events of interest are the following²:
• The operation of reading a resource; ri[x] denotes that transaction Ti has retrieved (read) the (partial) state of the resource x.
• The operation of writing a resource; wi[x] means that transaction Ti has modified (written) the state of the resource x.
• Ending a transaction; ei means that Ti has ended (has been either committed or aborted) and will not introduce any further operations.

A transaction obeys the following transaction rules (axioms):
• TR1
A transaction Ti has exactly a single event ei. A value is assigned to ei: ei = c if the transaction is committed; ei = a if the transaction is aborted.
Notation: ei may be denoted ci or ai when ei = c or ei = a, respectively.
• TR2
For any operation pi[x] (either ri[x] or wi[x]) pi[x] <i ei

Two operations on a resource x, pi[x], qj[x] are conflicting if they are noncommutative, i.e. applying them in different orders results in two different states³ of x. A more restrictive⁴ approach assumes them to be conflicting, if at least one of them is a write operation.

A complete history H over a set T of transactions is a partial order with a relation

• HIS2

¹ The term permanent is relative and depends on a resource's volatility (e.g. sensitivity to process or media failure).
² More event types such as locking and unlocking may be introduced when necessary.
³ We deal with states informally only. State distinction, and thus operation commutativity, may depend on the RM's semantics.
⁴ Two write operations on the same resource may commute, e.g. increment and decrement of a counter.

semantics (as reflected by resource states) is not uniquely determined, since if ei = a the effect of wi[x] is undone; i.e. reading x after ei results in retrieving the last state of x that was written by an (unaborted) transaction other than Ti.)

Remarks:
• The subscript H in

2.3.1 Serializability

Transaction T2 is in a conflict with transaction T1 if p1[x] < q2[x] for respective conflicting operations q2[x], p1[x].

The conflict types are ww, wr, rw, when p1[x], q2[x] are write-write, write-read, and read-write, respectively.

Remark: Note the asymmetry in the definition above.

There is a conflict equivalence between two histories H and H' (the two are conflict equivalent) if they are both defined over the same set of transactions T, and consist

¹ A prefix of a partial order P over a set S is a partial order P' over a set S' ⊆ S, with the following properties:
If b ∈ S' and a <P b then also a ∈ S'
If a, b ∈ S' then a <P b if and only if a <P' b
² The related state transition function has a history and an event set as arguments. Its values on a given history and its prefixes, or on a given event set and its subsets are compatible. Such a function is neither formalized nor explicitly used in this work.
³ Let P be a partial order over a set S. A projection (restriction) of P on a set S' ⊆ S is a partial order P', a subset of P, that consists of all the elements in P, involving elements of S' only.

Figure 2.1: Transaction states and their transitions

The Serializability Graph of a history H, SG(H), is the following directed graph:
SG(H) = (T, C) where
• T is the set of all unaborted (i.e. committed and undecided) transactions in H
• C (a subset of TxT) is a set of edges that represent transaction conflicts: Let T1, T2 be any two transactions in T. There is an edge from T1 to T2 if T2 is in a conflict with T1.

The Committed Transaction Serializability Graph of a history H, CSG(H), is the subgraph of SG(H) with all the committed transactions as nodes and all the respective edges.

The Undecided Transaction Serializability Graph of a history H, USG(H), is the subgraph of SG(H) with all the undecided transactions as nodes and all the respective edges.

Theorem 2.1 provides a criterion for checking serializability:

Theorem 2.1 - The Serializability Theorem
A history H is serializable (in SER) if and only if CSG(H) is cycle-free.
(For a proof see, for example, [Bern 87]).

2.3.2 Recoverability

This section defines history properties that guarantee certain desired behavior patterns when aborts occur (see also [Bern 87]).

Recoverability is an essential property of histories when aborted transactions are present (i.e., in all real situations). Recoverability guarantees that committed transactions read only resource states written by committed transactions, and hence, no committed transactions read corrupted states. Recoverability also ensures that a serializable history has the same semantics (i.e., the history's outcome as reflected by the resources' states) as a conflict-equivalent serial history (when exists; e.g., for complete histories). This may not be true without recoverability, if aborted transactions are present.

Let T1 and T2 be two distinct transactions. We say that a transaction T2 reads (a resource x) from (in a read-from conflict, or wrf conflict with; wrf is a special case of a wr conflict) transaction T1 if T2 reads x before T1 is aborted (if aborted), and T1 is the last transaction to write x before being read by T2 (i.e., w1[x] < r2[x] and there is no event t such that w1[x] < t < r2[x], where t is either a1 or w3[x] of some T3).

It is required that for any two transactions T1, T2 in H, whenever T2 reads any resource from T1, aborting T1 implies aborting T2 (i.e., (T2 reads from T1) implies (e1 = a implies e2 = a) ). To guarantee this, T2 should be decided only after T1 has been decided (this is a necessary condition¹). Thus, a history H is defined to be recoverable (REC; in REC) if for any two transactions T1, T2 in H, whenever T2 reads any resource from T1, T1 ends before T2 does (e1 < e2), and aborting T1 implies aborting T2. Formally,
(T2 reads from T1) implies (e1 < e2 and (e1 = a implies e2 = a) ).
The above formulation of recoverability allows it to be enforced effectively.

Aborts caused by transactions reading states written by aborted transactions (cascading aborts) are prevented if any transaction in H reads only data written by already committed transactions (i.e., (T2 reads from T1) implies e1 = c). Avoiding cascading aborts (ACA; cascadelessness) is the property which is necessary and sufficient to guarantee the above condition²: H is ACA (in ACA), if for any two transactions

¹ The claim is proven by assuming the contrary, i.e., e2 < e1, and having T2 committed, while T1 is later aborted.

² Sufficiency is obvious. Necessity is proven by assuming r2[x] < e1 and having T1 aborted.

T1, T2 in H, (T2 reads x from T1) implies (e1 = c and e1 < r2[x] ).

Let T1, T2 be any two transactions in H. H is strict (ST; is in ST, has the strictness property) if
w1[x] < p2[x] implies e1 < p2[x],
where p2[x] is either r2[x] or w2[x].

Strictness simplifies the restoration of a resource's state after aborting transactions that have written that resource. The recovery procedures of most existing database systems rely on strictness.

Theorem 2.2 follows immediately from the definitions above:

Theorem 2.2 ([Bern 87])
REC ⊃ ACA ⊃ ST
where "⊃" denotes a strict containment.

2.3.3 Two Phase Locking

Two Phase Locking (2PL) is a serializability mechanism that implements two types of locks¹: write locks and read locks. A write lock on a resource blocks both read and write operations of that resource, while a read lock blocks write operations only. 2PL consists of partitioning a transaction's duration to two phases: in the first, locks are acquired; in the second, locks are released ([Eswa 76]).

A history is defined to be a 2PL history (it is in the class 2PL), if it can be generated by the 2PL mechanism.

When combining strictness (ST) with 2PL we get Strict Two Phase Locking (S2PL = ST ∩ 2PL). To enforce S2PL, write locks issued on behalf of a transaction are not released until its end. Read locks, however, can be released earlier, after the end of phase one of 2PL.

The property Strong-S2PL (S-S2PL) requires that all locks are not released before the transaction ends (either committed or aborted).

Formally: A history H is S-S2PL (in S-S2PL) if for any conflicting operations p1[x], q2[x] in H (of transactions T1, T2 respectively) p1[x] < q2[x] implies e1 < q2[x].

Theorem 2.3 summarizes the relationships among the 2PL classes:

Theorem 2.3
2PL ⊃ S2PL ⊃ S-S2PL

2.4 On inherently blocking and noninherently blocking properties

Some history properties can be enforced only by blocking mechanisms. A mechanism is blocking, if in some situations it delays some transaction's event until a certain event(s) occurs in some other transaction(s).

A mechanism is operation-blocking, if in some situations it delays a transaction's operation until a certain event(s) occurs in some other transaction(s), or aborts all transactions with blocked operations (to avoid operation-blocking, that otherwise would occur).

We define a history property to be inherently-blocking, if it can be enforced by operation-blocking mechanisms only. Otherwise it is noninherently-blocking.

Both serializability and recoverability are noninherently-blocking, since they can always be guaranteed by aborting a violating transaction any time before it ends, without having any operations blocked. This observation is the basis for optimistic concurrency control ([Kung 81]), where transactions run without blocking each other's operations, and are aborted before ending, if they violate serializability or any other desired property. 2PL, ACA and their special cases, on the other hand, are inherently-blocking.

Note that the mutual blocking of two or more transactions is the cause of deadlock situations. Thus, non-blocking mechanisms guarantee deadlock-freeness.

Remark: In this work we deal with blocking and deadlocks informally only.

2.5 On commit-decision delegation

In some situations the decision whether to commit or abort a ready transaction is delegated from one system (object, component) to another system (object, component) via a notification. This notification is denoted as a YES vote on the transaction.

Definition 2.2
Let transaction T be in a ready state. System A delegates the commit decision on a transaction T to system B by voting YES on T, if system A is prepared to either commit T or abort it, according to the decision taken by

¹ A lock is considered any mechanism that blocks resource-access operations.

system B. After voting YES system A cannot affect the decision anymore.

Remark: System A can abort transactions. It cannot vote YES on a transaction after aborting it.

Let yi denote the YES voting event by system A on transaction Ti, and di the decision event that takes place in system B. di takes the values a or c, and may be denoted ai or ci respectively (when a distinction between

Definition 2.3
A system (object) that delegates commit decisions obeys the commit-decision delegation dependency condition (CD3C) for transactions T1 and T2, if it votes YES on T2 only after committing or aborting T1, i.e., the following relationship holds true:
• CD3C
e1

¹ Committing Ti by system A after the decision is made may involve the completion of write operations that have been written before the voting to a temporary storage, and not to the resource itself. However, in such cases the resource is locked for any operation until the transaction ends, and thus CDD1 can be assumed also for this case.

Definition 3.1
A history is in CO if for any conflicting operations p1[x], q2[x] of any committed transactions T1, T2 respectively, p1[x] < q2[x] implies e1

Theorem 3.1
SER ⊃ CO

Proof:
(i) Let a history H be in CO, and let ... → Ti → ... → Tj → ... be a (directed) path in CSG(H). By the CO definition (definition 3.1) and an induction by the order on the path above, we conclude that ci < cj.
(ii) Now, suppose that H is not in SER. By theorem 2.1 (without loss of generality) there is a cycle T1 → T2 → ... → Tn → T1 in CSG(H), where n ≥ 2.
First, let Ti and Tj in (i) be T1 and T2 above, respectively (consider an appropriate prefix of the expression representing the cycle above). This implies by (i) that c1 < c2.
Now, let Ti and Tj in (i) be T2 and T1 above, respectively (consider an appropriate suffix of the expression representing the cycle above). This implies that c2 < c1.
However, c1 < c2 and c2 < c1 contradict each other, since the relation "<" is asymmetric. Hence CSG(H) is acyclic, and H is in SER by Theorem 2.1.
Now examine the following serializable, non CO history to conclude that the containment is strict:
r1[x] w2[x] c2 c1
□

The following diagram summarizes the containment relationships between history classes (some relationships are introduced here without proofs).

Figure 3.1: Class containment relationships
An arrow from a class A to a class B indicates that class A strictly contains B; a lack of a directed path between classes means that the classes are incomparable. A property is inherently blocking if it can be enforced only by blocking transaction's operations until certain events occur in other transactions.

4 Commitment Ordering schedulers

A scheduler is a RM's component that schedules certain transactions' events. Commitment Ordering (CO) schedulers are schedulers that generate CO histories. Generic mechanisms, that can be combined in various ways to implement CO schedulers, are presented in the following sections. The algorithms described below provide additional (algorithmic) characterizations for the properties CO and CO ∩ REC.
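Before turning to schedulers, the CO condition and the cycle criterion of Theorem 2.1 can be tried out on small histories. The following is a minimal sketch, not part of the paper: it assumes a totally ordered history written as a string such as `r1[x] w2[x] c2 c1`, uses the more restrictive conflict notion (two operations on the same resource conflict if at least one is a write), and considers committed transactions only. The names `parse`, `conflict_edges`, `is_co` and `is_serializable` are hypothetical.

```python
from itertools import combinations

def parse(history):
    """Turn 'r1[x] w2[x] c2 c1' into a list of (kind, txn, resource) events."""
    events = []
    for tok in history.split():
        kind = tok[0]                          # 'r', 'w', 'c' or 'a'
        if kind in "rw":
            txn, res = tok[1:].split("[")
            events.append((kind, int(txn), res.rstrip("]")))
        else:                                  # end event: commit or abort
            events.append((kind, int(tok[1:]), None))
    return events

def conflict_edges(events):
    """CSG edges T1 -> T2 (T2 is in a conflict with T1), committed transactions only."""
    committed = {t for k, t, _ in events if k == "c"}
    edges = set()
    # combinations() keeps list order, so the first event of each pair is the earlier one.
    for (k1, t1, x1), (k2, t2, x2) in combinations(events, 2):
        if (t1 != t2 and x1 is not None and x1 == x2
                and "w" in (k1, k2) and {t1, t2} <= committed):
            edges.add((t1, t2))
    return edges

def is_co(events):
    """CO: every conflict T1 -> T2 between committed transactions has c1 before c2."""
    pos = {t: i for i, (k, t, _) in enumerate(events) if k == "c"}
    return all(pos[t1] < pos[t2] for t1, t2 in conflict_edges(events))

def is_serializable(events):
    """Theorem 2.1: serializable iff the CSG is cycle-free (peel off source nodes)."""
    edges = conflict_edges(events)
    nodes = {t for e in edges for t in e}
    while nodes:
        sources = {n for n in nodes
                   if not any(b == n and a in nodes for a, b in edges)}
        if not sources:
            return False                       # every remaining node lies on a cycle
        nodes -= sources
    return True

h = parse("r1[x] w2[x] c2 c1")                 # the example history from the proof
print(is_serializable(h), is_co(h))            # True False
```

On the example history from the proof of Theorem 3.1 the sketch reports a serializable but non-CO history, which is the strictness of the containment; reordering the commits to `c1 c2` makes the same history CO.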

4.1 Schedulers: components and classification

Schedulers typically deal with three types of transaction events:
• Transaction initiation
• Resource access
• Transaction termination

Concentrating on the latter two types¹, we model a (complete) scheduler as consisting of two components:
• Resource Access Scheduler (RAS)
A component that manages the resource access requests arriving on behalf of transactions, and decides when to execute which resource access operation.
• Transaction Termination Scheduler (TTS)
A component that monitors the set of transactions and decides when and which transaction to commit or abort. In a multi RM environment this component participates in atomic commitment procedures on behalf of its RM and controls (within the respective RM) the execution of the decision reached via atomic commitment for each relevant transaction.

A scheduler component is blocking if it executes certain transaction's event requests only after certain events have occurred in some other transaction(s). Otherwise, it is nonblocking.

Nonblocking schedulers implement the so called optimistic concurrency control approach ([Kung 81]). When a scheduler is nonblocking, it provides deadlock-free executions.

4.2 A "pure" CO TTS - The Commitment Order Coordinator (COCO)

The following TTS type, the Commitment Order Coordinator (COCO), checks for CO only and generates CO histories. The generated histories are not necessarily recoverable. Recoverability, if required, can be applied by enhancing the COCO (see section 4.3) or by an external mechanism (e.g., see section 4.4).

A COCO maintains a serializability graph, the USG, of all undecided transactions. Every new transaction processed by the RM is reflected as a new node in the USG; every conflict between transactions in the USG is reflected by a directed edge (an edge between two transactions may represent several conflicts).

USG(H) = (UT, C) where
• UT is the set of all undecided transactions in a history H
• C (a subset of UTxUT) is the set of directed edges between transactions in UT. There is an edge from T1 to T2, if T2 is in a conflict with T1.

The set of transactions aborted as a result of committing a transaction T (to prevent a future commitment ordering violation) is defined as follows:
ABORTCO(T) = { T' | The edge T' → T is in C }

These aborts cannot be compromised, as stated by lemma 4.1:

Lemma 4.1
Aborting all the members of ABORTCO(T), after T is committed, is necessary for guaranteeing CO (assuming that every transaction is eventually decided).

Proof:
Suppose that T is committed. Let T' be some transaction in ABORTCO(T). Thus T' is undecided when T is committed. If T' is later committed, then c < c', where c and c' are the commit events of T and T' respectively. However, T is in a conflict with T', and thus, CO is violated.
□

Lemma 4.1 is the key for the CO algorithm.

Algorithm 4.1 - The CO Algorithm
Repeat the following steps:
• Select any transaction T in the USG in the ready state (using any criteria, such as by priorities assigned to each transaction; a priority can be changed dynamically as long as the transaction is in the USG), and commit it.

¹ A transaction is usually initiated as soon as computing resources are available, without considering the effect on the history's properties. Situations where an advantage can be taken of controlling initiation are ignored here.

• Abort all the transactions in the set ABORTCO(T), i.e. all the transactions (both ready and active) in the USG that have an edge going to T.
• Remove any decided transaction (T and the aborted transactions) from the graph (they do not belong in the USG by definition).

Remark: During each iteration the USG should reflect all operations' conflicts until commit.
□

Example 4.1
The following figure demonstrates one iteration of the algorithm:

Figure 4.1: A USG
If T5 is selected to be committed, then T3 and T4 are aborted; T3, T4 and T5 are then removed from the graph.
□

The following theorem states the algorithm's correctness:

Theorem 4.1
Histories generated by a scheduler involving a COCO (algorithm 4.1) are in CO.

Proof:
The proof is by induction on the number of iterations by the algorithm, starting from an empty history H0, and an empty graph USG0 = USG(H0). H0 is CO.
Assume that the history Hn, generated after iteration n, is CO. USGn (in its UT component) includes all the undecided transactions in Hn.
Now perform an additional iteration, number n+1, and commit transaction T1 (without loss of generality - wlg) in USGn. Hn+1 includes all the transactions in Hn and the new (undecided) transactions that have been generated after completing step n (and are in USGn+1).
Examine the following cases after completing iteration n+1:
• Let T2, T3 (wlg) be two committed transactions in Hn. If T3 is in a conflict with T2 then c2 < c3 since Hn is CO by the induction hypothesis.
• Obviously, c2 < c1 for every (previously) committed transaction T2 in Hn with which T1 is in a conflict.
• Suppose that a committed transaction T2 is in a conflict with T1. This means that T1 is in ABORTCO(T2), and thus aborted when T2 was committed. A contradiction.
The cases above exhaust all possible pairs of conflicting committed transactions in Hn+1. Hence, Hn+1 is CO.
□

By theorems 3.1 and 4.1 we conclude the following:

Corollary 4.1
Histories generated by a system that includes a COCO are serializable.

Note that aborting the transactions in ABORTCO(T) when committing T prevents any cycle involving T being generated in the CSG in the future. This observation is a direct way to show that the algorithm above guarantees serializability. If a transaction exists, that does not reside on any cycle in the USG, then a transaction T exists with no edges from any other transaction. T can be committed without aborting any other transaction since ABORTCO(T) is empty. If all the transactions in the USG are on cycles, at least one transaction has to be aborted when committing another one. If the COCO is combined with a scheduler (see also section 4.4 below) that guarantees (local) serializability, cycles in the USG are either prevented or eliminated by the scheduler aborting transactions.

In a multi-RM environment, a COCO decides when to vote YES on a transaction in an atomic commitment (AC) protocol, rather than deciding when to commit it. After a notification to commit has arrived, when committing a transaction, the actions taken by the COCO are the same as for a single RM environment. When using AC, delaying the voting or blocking it usually reduces the number of aborted transactions (see more in section 5 below).

4.3 The CO Recoverability Coordinator (CORCO)

A CORCO is a CO TTS generating histories that are both CO and recoverable. This TTS is an enhancement of the COCO (section 4.2 above), and differs from it only in processing additional information to guarantee recoverability, and thus, possibly aborting additional transactions, to prevent recoverability violations.

A CORCO maintains an enhanced serializability graph, wrf-USG:
wrf-USG(H) = (UT, C ∪ Cwrf) where
• UT is the set of all undecided transactions in the history H.
• C is a set of edges between transactions in UT: There is a C edge from T1 to T2, if T2 is in a conflict (conflicts) with T1 but has not read from T1.
• Cwrf is a set of edges between transactions in UT as well: There is a Cwrf edge from T1 to T2, if T2 has read from T1 (and possibly is also in conflicts of other types with T1).
Note that C and Cwrf are disjoint.

The set of transactions aborted as a result of committing T (to prevent future CO violation) is defined as follows:
ABORTCO(T) = { T' | T' → T is in C or Cwrf }
The above definition of ABORTCO(T) has the same semantics as the definition of ABORTCO(T) for the COCO.

The set of aborted transactions due to recoverability, as a result of aborting transaction T', is defined as follows:
ABORTREC(T') = { T'' | T' → T'' is in Cwrf, or T''' → T'' is in Cwrf where T''' is in ABORTREC(T') }
Note that the definition is recursive. This reflects the nature of cascading aborts.

A CORCO executes the following algorithm:

Algorithm 4.2
Repeat the following steps:
• Select any ready transaction T in the wrf-USG, that does not have any in-coming Cwrf edge (i.e. such that T is not in ABORTREC(T') for any transaction T' in ABORTCO(T); this avoids the need to later abort T itself), and commit it.
• Abort all the transactions T' (both ready and active) in ABORTCO(T).
• Abort all the transactions T'' (both ready and active) in ABORTREC(T') for every T' aborted in the previous step (cascading aborts).
• Remove any decided transaction (T and all the aborted transactions) from the graph.

Remarks:
During each iteration the wrf-USG should reflect all operations' conflicts till commit.
Transactions on wrf cycles, that are not aborted by the algorithm, are aborted asynchronously.
□

Example 4.2
The following figure demonstrates one iteration of the algorithm:

Figure 4.3: A wrf-USG (legend: C edges and Cwrf edges)
If T5 is selected to be committed then T3, T4, T7, T8 are aborted; all committed and aborted transactions are then removed from the graph.
ABORTCO(T5) = { T3, T4 }
ABORTREC(T3) = ∅ (empty set)
ABORTREC(T4) = { T7, T8 }
□

The algorithm's correctness is stated as follows:

Theorem 4.2
Histories generated by a scheduler involving a CORCO (algorithm 4.2) are both CO and recoverable.
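One iteration of Algorithms 4.1 and 4.2 can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: the graph is held as two sets of `(from, to)` edge pairs, transactions are plain integers, and the function names are hypothetical. With an empty Cwrf set the iteration reduces to the COCO of Algorithm 4.1. The edge sets in the usage example are a hypothetical reconstruction chosen only to reproduce the abort sets listed in Example 4.2, since the figure itself is not available here.

```python
def abort_co(t, c_edges, wrf_edges):
    """ABORT_CO(T): transactions with a C or Cwrf edge going to T."""
    return {a for a, b in c_edges | wrf_edges if b == t}

def abort_rec(t_aborted, wrf_edges):
    """ABORT_REC(T'): everything reachable from T' over Cwrf edges (cascading aborts)."""
    result, frontier = set(), [t_aborted]
    while frontier:
        src = frontier.pop()
        for a, b in wrf_edges:
            if a == src and b not in result:
                result.add(b)
                frontier.append(b)
    return result

def corco_iteration(undecided, ready, c_edges, wrf_edges, choose=min):
    """One iteration of Algorithm 4.2 (Algorithm 4.1 when wrf_edges is empty).

    Returns (committed, aborted, remaining, c_edges, wrf_edges), or None when
    no ready transaction is free of in-coming Cwrf edges.
    """
    # Step 1: select a ready transaction without any in-coming Cwrf edge.
    candidates = {t for t in ready if not any(b == t for _, b in wrf_edges)}
    if not candidates:
        return None
    t = choose(candidates)          # any selection criterion is allowed
    # Step 2: abort ABORT_CO(T).
    aborted = abort_co(t, c_edges, wrf_edges)
    # Step 3: cascading aborts for every transaction aborted in step 2.
    for t1 in set(aborted):
        aborted |= abort_rec(t1, wrf_edges)
    # Step 4: remove all decided transactions from the graph.
    remaining = undecided - aborted - {t}

    def keep(edges):
        return {(a, b) for a, b in edges if a in remaining and b in remaining}

    return t, aborted, remaining, keep(c_edges), keep(wrf_edges)

# Hypothetical edges consistent with the abort sets of Example 4.2.
c_edges = {(3, 5), (4, 5), (5, 6), (1, 2)}
wrf_edges = {(4, 7), (7, 8)}
committed, aborted, remaining, _, _ = corco_iteration(
    set(range(1, 9)), {5}, c_edges, wrf_edges)
print(committed, sorted(aborted), sorted(remaining))   # 5 [3, 4, 7, 8] [1, 2, 6]
```

The selection rule is deliberately left as a parameter (`choose`), mirroring the text's remark that any criterion, such as priorities, may be used to pick the transaction to commit.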

Proof:

The histories generated are CO by theorem 4.1, since a CORCO differs from a COCO only in possibly aborting additional transactions during each iteration (due to the recoverability requirement).

Since all the transactions that can violate recoverability (transactions in ABORT_REC(T') for every aborted transaction T' in ABORT_CO(T)) are aborted during each iteration (i.e. transactions that read resources written by an aborted transaction before the abort), the generated histories are recoverable.
□

By theorems 3.1 and 4.2 we conclude the following:

Corollary 4.2
Histories generated by a system that includes a CORCO are both serializable and recoverable.

4.4 Combining a CO TTS with a RM's scheduler

The CO TTSs above can be combined with any RAS or a complete scheduler. When a CO TTS is combined with a scheduler, the scheduler delegates the commit decision (see section 2.5 above) to the CO TTS. If all the components are non-blocking, then also the combined mechanism is non-blocking, and hence ensures deadlock-freeness:

Corollary 4.3
There exist schedulers incorporating a COCO or CORCO that generate deadlock free executions only.

By Corollaries 4.1 and 4.2 the combined RAS (or scheduler) does not necessarily need to produce serializable histories in order to guarantee serializability, since the CO TTSs above take care of this.

[Figure 4.3: The commit/abort decision process using any CO TTS combined with a RAS or scheduler]

The combined mechanism executes as follows (see Figure 4.3 above):

First, a transaction interacts with a RAS (or a scheduler). Then if unaborted, when ready, the transaction is considered by a CO TTS as a candidate to be committed. A transaction may be aborted to prevent a CO TTS's condition violation.

An important property of the CO TTSs is their total passivity with regard to resource access operations. The RAS (or the complete scheduler) can implement any resource access scheduling strategy without being affected by a combined CO TTS. The only requirement is that a CO TTS has updated conflict information (e.g. a USG or a wrf-USG).

When a scheduler delegates the commit decision, the following statements about recoverability hold true:

Theorem 4.3 - The Recoverability Inheritance Theorem
Let a system consist of a scheduler that guarantees recoverability (cascadelessness, strictness) and some component to which it delegates the commit decisions on all unaborted transactions. Let the scheduler vote YES on a transaction when it can be committed (by the scheduler) without violating recoverability (cascadelessness, strictness). Then the above system guarantees recoverability (cascadelessness, strictness respectively) as well.

Proof:

Since the decision notification can arrive any time after voting, the scheduler has to guarantee the following condition in order to prevent a possible recoverability violation (see a definition of recoverability in section 2.3 above):

(T2 reads from T1) implies (e1 < y2 and (e1 = a implies e2 = a))

Thus, if (T2 reads from T1) the scheduler does not vote YES on T2 before T1 is decided (e1 < y2), and hence e1 < e2 follows by rules CDD2,3. Since e2 follows e1, if e1 = a then (by the above condition) the scheduler can and has to enforce (e1 = a implies e2 = a) by aborting T2. Thus, (T2 reads from T1) implies (e1 < e2 and (e1 = a implies e2 = a)), and recoverability is guaranteed. Note that CD3C for T1 and T2 is maintained.

The cases ACA, ST are straightforward.
□
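The condition used in the proof of Theorem 4.3 can be made concrete with a small sketch. The class below is a hypothetical illustration (its name and methods are not from the paper): it withholds the YES vote on a reader until every transaction it has read from is committed, which gives e1 < y2, and it cascades an abort to undecided readers, which gives (e1 = a implies e2 = a).

```python
class VotingScheduler:
    """Tracks reads-from relations and decides when a YES vote is safe."""

    def __init__(self):
        self.reads_from = {}  # reader -> set of writers it has read from
        self.outcome = {}     # decided transaction -> "c" (commit) or "a" (abort)

    def record_read(self, reader, writer):
        self.reads_from.setdefault(reader, set()).add(writer)

    def may_vote_yes(self, t):
        """YES is allowed only once every transaction t read from is committed."""
        if t in self.outcome:
            return False
        return all(self.outcome.get(w) == "c" for w in self.reads_from.get(t, ()))

    def notify_decision(self, t, outcome):
        """Record a decision; on abort, also abort undecided readers of t."""
        self.outcome[t] = outcome
        cascaded = []
        if outcome == "a":
            for reader, writers in self.reads_from.items():
                if t in writers and reader not in self.outcome:
                    cascaded.append(reader)
                    cascaded.extend(self.notify_decision(reader, "a"))
        return cascaded


scheduler = VotingScheduler()
scheduler.record_read("T2", "T1")  # T2 reads from T1
scheduler.record_read("T3", "T2")  # T3 reads from T2
print(scheduler.may_vote_yes("T2"))          # T1 is still undecided
print(scheduler.notify_decision("T1", "a"))  # readers aborted by the cascade
```

Because a reader never votes YES while a transaction it read from is undecided, the scheduler is still free to abort the reader when the abort notification arrives, which is the point of the proof.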

From theorem 4.3 we conclude corollary 4.4:

Corollary 4.4
If guaranteeing both CO and recoverability is required, a CORCO is necessary to be combined with a scheduler only if the scheduler does not guarantee recoverability by itself. Otherwise a COCO is sufficient.

Remark: Note that if the combined scheduler above is S-S2PL based, then the USG of the respective CO TTS does not have any edges. This means that no aborts by the CO TTS are needed, as one can expect, and the entire CO TTS is unnecessary. This is an extreme case. Other scheduler types may induce other properties of the respective USGs. No wrf conflicts are reflected in the USG when the scheduler guarantees cascadelessness; no ww and wr when it guarantees strictness.

Theorem 4.4 - The Recoverability Enforcement Condition
Let a system consist of a scheduler and some component to which the scheduler delegates the commit decisions on all unaborted transactions. Then

• Guaranteeing CD3C for any T1 and T2 such that T2 is in a read-from (wrf) conflict with T1 is a necessary condition for the system to guarantee recoverability.
• Guaranteeing CD3C for any T1 and T2 such that T2 is in a wr or ww conflict with T1 is a necessary condition for the system to guarantee strictness (ST).

Proof:

(i) Suppose that recoverability is guaranteed and that CD3C is not guaranteed for all T1, T2 such that T2 has read from T1. Thus there may exist T1 and T2 such that T2 reads from T1, where the scheduler has voted YES on both transactions before T1 is decided, violating CD3C. Now suppose that T1 is aborted by the deciding component and then T2 is committed. This is a violation of recoverability, contrary to the assumption that the system guarantees recoverability.

(ii) Suppose that strictness is guaranteed. Let w1[x], p2[x] be conflicting operations of T1, T2, respectively, where p2[x] is either a read or write operation. Due to strictness, e1 < p2[x] (see section 2.3.2). Since p2[x] < y2 by CDD1 (see definition 2.2), also e1 < y2 follows, which is CD3C.
□

5 Multiple Resource Manager environment

5.1 The underlying transaction model

[Figure 5.1: A multiple RM environment. RMs are invoked and coordinated through the DS]

A multi RM environment consists of several (more than one) autonomous RMs, and a Distributed Services (DS) system component.

The DS component provides:

• Application (programs) execution environment
• Application communication services
• Application RM access services
• Transaction Management services (transaction demarcation, RM transaction participation registration, synchronization, atomic commitment, etc.)

No distinction is made between a centralized RM (i.e. confined to a single node) or a distributed one (i.e. executing and accessing resources on several nodes).

Note that the RMs' autonomy implies resource partitioning among the RMs.

The following terminology and notation are used as well:

• An environment is a DS and a set of RMs, where a transaction can span any subset. The RMs involved with a transaction are the participants in the transaction.
• Each RM in an environment has an identifier (e.g. RM 2).
• Events are qualified by both a transaction's identifier and a RM's identifier (e.g. w3,2[x] means a write operation of resource x by RM 2 on behalf of transaction T3).

A multiple RM transaction is a generalization of a single RM transaction:

Definition 5.1
A transaction Ti consists of one or more local subtransactions¹.
A local subtransaction Ti,j accesses all the resources under the control of a participating RM j that Ti needs to access, and only these resources (i.e. all its events are qualified with j).
A local subtransaction obeys the definition of a transaction and rules TR1, TR2 in section 2.
A local subtransaction has states as defined in section 2.
A transaction Ti has an event di of deciding whether Ti is committed or aborted. di is usually distinct from any event ei,j of any local subtransaction Ti,j and takes place in the DS component².
□

¹ Local subtransactions reflect transaction partitioning over RMs, and are independent of a possible explicit transaction's partitioning into nested subtransactions by an application.

² A local transaction is a transaction that consists of a single local subtransaction; a global transaction consists of two or more. Although a local transaction Ti can be decided locally by some RM j (and not necessarily in the DS), assume for convenience that yi,j, di exist and obey CDD.

A distinction between an individual RM's history and the global history is required as well:

Definition 5.2
A local history is generated by a single RM, and defined over the set of its local subtransactions.
A local history obeys the definition of a history in section 2.
Notation: Hi is the history generated by RM i.
A global history obeys the definition of a history in section 2.
□

It is assumed that an atomic commitment (AC) protocol is applied to guarantee atomicity in the distributed environment. An AC protocol implements the following general scheme each time a transaction is decided:

• AC
  Each participating RM delegates the commit decision by voting either YES or NO (also absence of a vote within a time limit may be considered NO) after its respective local subtransaction has reached the ready state, or votes NO if unable to reach the ready state. The transaction is committed by all RMs if and only if all have voted YES. Otherwise it is aborted by all RMs.

Remarks:

The YES vote is an obligation to end the local subtransaction (commit or abort) as decided by the AC protocol. After voting YES, a RM cannot affect the decision.

After voting NO, a local subtransaction may be aborted immediately (thus a NO vote may be represented by the event ei,j = a).

Note that 2PC ([Gray 78], [Lamp 76]) is a special case of AC.

The AC protocol type determines under what failure and recovery conditions atomicity is guaranteed (e.g. see [Bern 87]). No specific AC protocols are dealt with here.

AC enforces the following atomic commitment rules (axioms) in addition to the rules CDD (see section 2.5):

AC1  If di = c then yi,j exists and yi,j < di for all local subtransactions Ti,j (i.e. a transaction is decided to be committed only after receiving YES votes from all the local subtransactions).

AC2  If di = c then di < ei,j for all local subtransactions Ti,j (i.e. all local subtransactions are committed only after the AC protocol has decided to commit the transaction).

For environments that implement AC we conclude the following:

Lemma 5.1 - The Commit Fusion Lemma
Let T1 and T2 be any transactions decided via an AC protocol. If d1 = c then

• event < d1 if and only if event < y1,j for some j, for any event distinct from y1,j and d1.
• d1 < event if and only if e1,j < event for some j, for any event distinct from e1,j and d1.
• For event1 and event2 of T1 and T2 respectively, where event_i, i = 1,2, is either yi,j or di or ei,j, RM j enforces³ event1 < event2 if and only if it enforces e1,j < y2,j (i.e. CD3C for T1,j and T2,j).

Thus if di = c then di and all the events yi,j and ei,j, for every participating RM j, can be fused together into a single event without affecting order relations among other events.

Proof: Follows by the rules CDD and AC.
□

³ "Enforces" means that actions taken by RM j generate events that reflect or imply such an order.

The commitment fusion lemma allows in many situations a simplified transaction model to be used. This model ignores the AC mechanism and assumes that a (committed) multi RM transaction (like a single RM transaction) has a single commitment event. Note that this assumption is not valid for the abort events in a multi-RM transaction.

Example 5.1

The following two transactions both access resources x and y. x, y are under the control of RMs 1, 2 respectively.

[Figure 5.2: T1 and T2 and their local subtransactions]

The RMs generate the following histories (yi,j events are omitted):

RM 1:  H1 = r1,1[x] w2,1[x] c2,1 c1,1
RM 2:  H2 = w2,2[y] c2,2 r1,2[y] c1,2

Figure 5.3: The local histories H1 and H2

Note that the history H1 violates CO, which results in a (global) serializability violation.

The respective global history H is the following:

[Figure 5.4: The history H and its CSG]

Since the transactions are committed, end-transaction events can be fused with respective decision events (by the fusion lemma).
□

5.2 Local and global properties

This section examines the relationship between properties of histories generated by the individual RMs and the global history generated in the environment. History properties are redefined in a way that allows definitions and results for single RM histories also to be used in the multi-RM case.

Definition 5.3 - Global Properties
Let X be a history property, well defined for a single RM's histories. Let X', the associated property of X, be a (global) history property defined by a modified definition of X, where every event of ending a committed transaction (ei) in the definition of X is replaced with the respective commit-decision event (di).
A global history H has property X (is in X) if property X' is well defined and H has property X' (i.e. H obeys the definition of X').
□

Remarks:

• Let X be a history property. In what follows, X also denotes the class of global histories with property X.
• Note that if a property X does not impose constraints on the order of events ei, then the definitions of X and X' are identical (e.g. serializability).
• Note that when the fusion lemma (5.1) is implemented, the properties X and X' are identical.

Example 5.2

CO', the associated property of CO, is defined as follows:
A history is in CO' if for any conflicting operations p1,j[x], q2,j[x] of any committed transactions T1, T2 respectively, p1,j[x] < q2,j[x] implies d1 < d2.
Thus if a (global) history is in CO', it is also in CO by definition 5.3.
□

The following is a representative example for a result of definition 5.3:

Corollary 5.1
SER ⊃ CO (i.e. theorem 3.1 is valid also for global history classes).

The other class containment relationships described in section 3 follow similarly for global history classes.

By lemma 5.1 and the definition of CO we also conclude the following:

Theorem 5.1 - The Global CO Enforcement Condition

• Let H be a global history. Then CD3C for any committed T1,j and T2,j (i.e. e1,j < y2,j), such that T2,j is in a conflict with T1,j, is a necessary and sufficient condition for H to be in CO,

and thus

• Guaranteeing CD3C for any T1,j and T2,j, such that T2,j is in a conflict with T1,j and y1,j exists already, is a necessary and sufficient condition for guaranteeing CO.

(Without guaranteeing CD3C, CO may be violated, and vice versa.)
□

We now define properties of global histories that reflect properties of their respective local histories:

Definition 5.4 - Local-X
Let H be a global history generated in some environment, and Hj its respective local history generated by RM j in the environment.
H'j, the augmented history of a local history Hj, is a history defined over the local subtransactions of RM j, where each local subtransaction Ti,j is augmented with the event di.
Let X be a history property, well defined for single RM histories, and X' the associated property of X (see def. 5.3). H is in Local-X (is locally X) if H'j of every RM j in the environment is in X'.
□

Usually (e.g. for all the history properties that we have explicitly defined) Local-X ⊇ X, i.e. if a global history is in X, it is in Local-X. In particular, the following relationships exist as well:

Theorem 5.2
Local-X = X (i.e. a (global) history is in X if and only if it is in Local-X), where X is any of the following properties:

• REC, ACA, ST, CO, S-S2PL

Proof outline:

The theorem follows by the definitions of global properties (5.3), Local-X (5.4), the commitment fusion lemma (5.1), the RMs' resource partitioning, and the definitions of REC, ACA, ST, CO and S-S2PL.

A proof is demonstrated below for the case X = CO (without using the fusion lemma; it is trivial with the lemma):

Let H be a global history and Hj its respective local history of RM j, and let H'j be the augmented history of Hj. Without loss of generality, let T1 and T2 be committed transactions in H.

Let H be in CO. Suppose that for some RM j, T2,j is in a conflict with T1,j. This means that T2 is in a conflict with T1, which implies d1 < d2 (CO and definition 5.3). Thus for every RM j, H'j is in CO' (see definition 5.4) and H is in Local-CO.

Now let H be in Local-CO. If T2 is in a conflict with T1, it means that for some RM j, T2,j is in a conflict with T1,j. Since H'j is in CO' (Local-CO; definition 5.4) this implies that d1 < d2. Thus H is in CO (definition 5.3).
□

Theorem 5.3
Local-X ⊃ X (i.e. being in Local-X does not imply that a history is in X), where X is any of the following properties:

• SER, 2PL, S2PL

Proof:

Local-X ⊇ X is true for SER (using theorem 2.1) and 2PL (follows by definition).

It is also true for strictness: if H is a global history in ST then (wi,j[x] < pk,j[x] implies di < pk,j[x]) for any relevant RM j by definition 5.3. Thus by definition 5.4 H is in Local-ST. Thus the containment is also true for S2PL = ST ∩ 2PL.

Now let H be the history in example 5.1 above. The history H is in Local-SER, Local-2PL and Local-S2PL since both H'1 and H'2 are in SER', 2PL' and S2PL'.

However H is not in SER, 2PL or S2PL:

• CSG(H) has a cycle, so by theorem 2.1 H is not in SER.
• If it is in 2PL or S2PL it is also in SER, and we have a contradiction.
□

5.3 On generating global CO histories

This section describes how the Commitment Order Coordinator (COCO) defined in section 4.2 takes part in global AC to guarantee global CO histories. (The CORCO described in section 4.3 is handled similarly.)

In a multi-RM environment that implements AC, a COCO typically receives a request via an AC protocol to commit some transaction T in the USG. If the COCO can commit the transaction, it votes YES on it via AC, which is an obligation to either commit or abort according to the decision reached by the AC protocol. Later, after being notified of the decision, if T is committed, all transactions in ABORT_CO(T) need to be aborted (by algorithm 4.1). Thus the COCO (say, of RM i) has to delay its YES vote on T if it has voted YES on any transaction in ABORT_CO(T) (CD3C by theorem 5.1). When a YES vote on T is possible, the COCO may either choose to do so immediately upon being requested (the non-blocking without delays approach), or delay the voting for a given, predetermined amount of time (non-blocking with delays). During the delay the set ABORT_CO(T) may become smaller or empty, since its members may be decided and removed from the USG, and since ABORT_CO(T) cannot increase after T has reached the ready state.

Instead of immediately voting, or delaying the voting for a given amount of time (which may still result in aborts), the COCO can block the voting on T until all transactions in ABORT_CO(T) are decided. However, if another RM in the environment also blocks, this may result in a global deadlock (e.g., if T' is in ABORT_CO(T) for one RM, and T is in ABORT_CO(T') for another RM).

Aborting transactions by timeout is a common mechanism for resolving such deadlocks. Controlling the timeout by the AC protocol, rather than aborting independently by the RMs, is preferable for preventing unnecessary aborts.

Note that aborting transactions by the COCO is necessary only if a local cycle in its USG is not eliminated by some external entity (e.g. a scheduler that generates a cycle-free local USG or one that uses aborts to resolve local cycles), or if a global cycle (across two or more local USGs) is generated. Since the cycles are generated exclusively by the way the RASs operate and are independent of the commit order, the COCO does not have to abort more transactions than need to be aborted (also when using any other concurrency control) for serializability violation prevention.

6 Guaranteeing global serializability by Local Commitment Ordering

This section shows necessary and sufficient conditions for guaranteeing global serializability when the RMs are autonomous, i.e. the only exchanges between them for the purpose of transaction management are those of atomic commitment protocols (with no piggy-backing of any additional concurrency control information; e.g. transaction timestamps).
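Example 5.1 is the reference case for the conditions examined in this section: each RM is locally serializable, yet the global history is not, because one RM violates CO. The Python sketch below checks this mechanically. It is only an illustration under stated assumptions: the helper names are hypothetical, the two local histories are encoded as they are read here from Example 5.1 (r1[x] w2[x] c2 c1 at RM 1, and w2[y] c2 r1[y] c1 at RM 2), local commit events stand in for the fused decision events, and the global conflict graph is taken as the union of the local ones, which relies on the RMs' resource partitioning.

```python
def conflict_edges(history):
    """Edges Ta -> Tb for conflicting operations (same item, at least one write)."""
    edges = set()
    for i, (op1, t1, x1) in enumerate(history):
        for (op2, t2, x2) in history[i + 1:]:
            if op1 in ("r", "w") and op2 in ("r", "w") \
                    and t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges


def is_co(history):
    """True if committed transactions commit in the order of their conflicts."""
    commit_pos = {t: i for i, (op, t, _) in enumerate(history) if op == "c"}
    return all(
        commit_pos[a] < commit_pos[b]
        for (a, b) in conflict_edges(history)
        if a in commit_pos and b in commit_pos
    )


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edge pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        if any(visit(nxt) for nxt in graph[node]):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in graph)


# Local histories of Example 5.1 as encoded here: (operation, transaction, resource).
H1 = [("r", "T1", "x"), ("w", "T2", "x"), ("c", "T2", None), ("c", "T1", None)]
H2 = [("w", "T2", "y"), ("c", "T2", None), ("r", "T1", "y"), ("c", "T1", None)]

print(is_co(H1), is_co(H2))                             # CO per RM
print(has_cycle(conflict_edges(H1)), has_cycle(conflict_edges(H2)))  # local cycles
print(has_cycle(conflict_edges(H1) | conflict_edges(H2)))            # global cycle
```

Under this encoding neither local conflict graph has a cycle, while their union does, and the only history failing the CO check is H1; this matches the remark in Example 5.1 that the CO violation in H1 is what produces the global serializability violation.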

6.1 Local-CO is sufficient for global serializability

The following is a consequence of theorem 3.1 (corollary 5.1) and theorem 5.2:

Theorem 6.1
SER ⊃ Local-CO (i.e. if a history is in Local-CO then it is globally serializable).
□

Remark: Local-CO is maintained if all the RMs in the environment use any types (possibly different, e.g. CORCO and S-S2PL based) of CO mechanisms.

6.2 Conditions when Local-CO is necessary to guarantee global serializability

Theorem 6.1 states that local CO is a sufficient condition for global serializability. We now use (informally) Knowledge Theory based arguments (see for example [Halp 87], [Hadz 87]) to prove that Local-CO is also necessary for guaranteeing global serializability in a multi-RM environment, when the RMs support local serializability and use atomic commitment exchanges only for coordination. ("Necessary for guaranteeing" means that otherwise a violation may occur; see definition 2.1.)

The necessity in CO is proven by requiring that each RM avoid committing any transaction that can potentially cause a serializability violation when committed. If it is clear that a transaction remains in such a situation forever (based on knowledge available to the RM locally), the transaction is aborted. We name such a transaction a permanent risk (PR). The PR property is relative to a RM. The above requirement implies that each RM in the environment has to implement the following commitment strategy (CS):

• CS
  - Starting from a history with no decided transactions, commit any ready transaction via an AC protocol. Every other transaction that is a PR is aborted⁴.
  - Repeat (asynchronously when possible) the following procedure: Commit (via AC) any ready transaction that cannot cause a serializability violation, and abort all the PR transactions.

⁴ A hidden axiom is assumed, that computing resources are not held unnecessarily. Otherwise, PR transactions can be marked and kept undecided forever. Aborting such transactions and reexecuting them also supports a general concept of fairness that requests a transaction's successful completion within a reasonable time interval.

The resulting global histories are proven to be in CO.

Theorem 6.2

• If all the RMs in the environment are autonomous and provide local serializability, then CS is a necessary strategy for each RM, in order to guarantee global serializability.
• If CS is implemented by all the RMs, the global histories generated are in Local-CO.

Proof:

The Serializability Theorem (theorem 2.1) implies that the serializability graph provides all the necessary information about serializability. We assume that every RM, say RM i, "knows" its local serializability graph SGi (it includes all the committed and undecided transactions) and its subgraphs CSGi (includes committed transactions only) and USGi (includes all undecided transactions). We also assume (based on AC) that each RM has committed a transaction if and only if it has voted Yes, and "knows" that all other RMs participating in a transaction have voted Yes, and will eventually commit it.

The goal for each RM is to guarantee a cycle-free (global) CSG (committed transaction serializability graph), by avoiding any action that may create a global cycle (local cycles in CSGi are eliminated by RM i, since local serializability is assumed in the theorem).

First, CS is trivially necessary for the following reasons: since a PR transaction remains PR for ever (by definition), it cannot be committed and thus must be aborted to free computing resources. On the other hand, any ready transaction that cannot cause a serializability violation can be committed.

We now need to identify PR transactions, while implementing CS. We show that this implies that each RM executes algorithm 4.1.

Each RM implements CS as follows:

• Base stage:
  Assume that CSGi does not include any transaction. Commit any ready transaction T (via AC).
  Suppose that prior to committing T there is an edge T' → T in USGi. It is possible that there exists an edge T → T' in some USGj of some RM j, j ≠ i, but RM i, though, cannot verify this (due to the autonomy requirement). This means that committing T' later may cause a cycle in CSG. Since committing T cannot be reversed (see transaction state transitions in section 2), no future event can change this situation. Hence T' is a PR (in RM i), and RM i must abort it (by voting No via AC) upon committing T.

• Inductive stage:
  Suppose that CSGi includes at least one transaction. We show that no ready transaction can cause a serializability violation if committed, and hence can be committed (provided that a consensus to commit is reached by all the participating RMs via AC):
  Commit any ready transaction T.
  (i) Examine any undecided transactions T' (in USGi). Suppose that prior to committing T there is an edge T' → T in USGi. Using again the arguments given for the base stage, T' is a PR, and RM i must abort it. If there is no edge from T' to T, there is no path possibly left from T' to T after aborting the PR transactions above. Thus no additional T' is a PR and no decision on T' is taken at this stage.
  (ii) Examine now any previously committed transaction T'' (in CSGi). It is impossible to have a path T → ... → T'' in CSGi or in CSGj for any RM j, j ≠ i, since, if this path existed at the stage when T'' was committed, it would have been disconnected during that stage, when aborting all the PR transactions (with edges to T''; using (i) above), and since no incoming edges to T'' could have been generated after T'' has been committed. Hence, only a path T'' → ... → T can exist in CSGi or in CSGj for any RM j, j ≠ i. This means that no cycle in CSG through T and T'' can be created, and no T'' needs to be aborted (which is impossible since T'' is committed, and would fail the strategy).
  The arguments above ensure that no ready transaction can cause a serializability violation when committed at the beginning of an inductive stage, as was assumed, and hence (any ready transaction) T could have been committed. Note that committing a transaction can start before the commit process is completed for a previous one (i.e. a concurrent implementation of the strategy), as long as CD3C is maintained for T' and T, where there exists an edge T' → T in USGi. Without enforcing CD3C, a committed transaction can be identified later as a PR in RM i, and can cause a serializability violation.

In the CS implementation above, all the PR transactions are identified and aborted at each stage. Examining this implementation we conclude that it results in exactly performing algorithm 4.1 in each RM (at the stage when T is committed in RM j, the set of all PR transactions is exactly the set ABORT_CO(T)). Hence, by theorem 4.1 every RM involved guarantees CO, and by enforcing CD3C, also CO'. This means that the generated (global) history is in Local-CO. The only possible deviation from the implementation above is by aborting additional transactions at each stage. Such a deviation still maintains the generated history in Local-CO.
□

Theorems 6.1 and 6.2 imply the following:

Corollary 6.1
Guaranteeing Local-CO is a necessary and sufficient condition for guaranteeing (global) serializability in an environment of autonomous RMs.

7 Conclusion

This work generalizes a previously known result, that Strong Strict Two Phase Locking (S-S2PL) together with Two Phase Commit (2PC) guarantee global serializability in a multi resource manager (RM) environment. The new concept defined here, Commitment Ordering (CO), provides additional ways to achieve global serializability, through different concurrency control mechanisms, that may provide deadlock-free executions. This allows the levels of concurrency to be controlled by local trade-offs between blocking implementations of CO (e.g. S-S2PL), which are subject to deadlocks, and deadlock-free CO implementations, which are subject to cascading aborts. To guarantee global serializability, no services, but those of atomic commitment, are necessary for the coordination of transactions across RMs, if each RM supports CO. Another result shown is that guaranteeing CO is necessary for guaranteeing global serializability, when the RMs involved are autonomous (i.e. when only atomic commitment is used for RM coordination).

The relationships between various properties of the histories generated by the individual RMs and respective properties of the respective global history are examined, and assuming that atomic commitment is used, it is shown which properties, CO in particular, are preserved globally when applied locally by the RMs.

Generic CO enforcing mechanisms are described as well, and their behavior in a multi-RM environment is examined. Since CO can be enforced locally in each RM (most existing commercial database systems are S-S2PL based, and already provide CO), no change in existing atomic commitment protocols and interfaces is required to utilize the CO solution.

The study presented in this work suggests that CO is a practical, fully distributed solution for the global serializability problem in a distributed, high-performance transaction-processing environment (see also [Raz 91a] for implementation-oriented aspects of CO).

Autonomy implies that a RM has no knowledge of whether a transaction is local, i.e. confined to the RM, or global, i.e. spanning more than one RM. If a RM is coordinated with other RMs via AC protocols only, and in addition can identify its local transactions (e.g. by notifications from applications (either implicitly or explicitly), or through AC protocols), it is said to have extended knowledge autonomy (EKA). Since local transactions do not need to be coordinated across RMs via AC protocols, they do not need to obey the CO condition for the purpose of global serializability. Under EKA a more general property, Extended Commitment Ordering (ECO), is necessary to guarantee global serializability (see [Raz 91b]). ECO reduces to CO when all the transactions are assumed to be global.

Acknowledgments

Many thanks are due to several people for useful discussions on the subject of this work or comments on an early version of this paper: Rob Abbott, Phil Bernstein, Edward Braginsky, Jeff East, Bill Emberton, Maurice Herlihy, Mei Hsu, Walt Kohler, Dave Lomet, Bob Taylor, Vijay Trehan and Mark Tuttle.

Special thanks are due to Vijay Trehan for continued encouragement and support.

References

[Bern 87] P. Bernstein, V. Hadzilacos, N. Goodman, Concurrency Control and Recovery in Database Systems, Addison-Wesley, 1987.

[Brei 90] Y. Breitbart, A. Silberschatz, G. Thompson, "Reliable Transaction Management in a Multidatabase System", in Proc. of the ACM SIGMOD Int. Conf. on Management of Data, Atlantic City, New Jersey, June 1990.

[Brei 91] Y. Breitbart, Dimitrios Georgakopoulos, Marek Rusinkiewicz, A. Silberschatz, "On Rigorous Transaction Scheduling", IEEE Trans. Soft. Eng., Vol 17, No 9, September 1991.

[DECdtm] J. Johnson, W. Laing, R. Landau, "Transaction Management Support in the VMS Operating System Kernel", Digital Technical Journal, Vol 3, no. 1, Winter 1991.
Philip A. Bernstein, William T. Emberton, Vijay Trehan, "DECdta - Digital's Distributed Transaction Processing Architecture", Digital Technical Journal, Vol 3, no. 1, Winter 1991.

[EMA] Enterprise Management Architecture - General Description, Digital Equipment Corporation, EK-DEMAR-GD-001.

[Elma 90] A. Elmagarmid, W. Du, "A Paradigm for Concurrency Control in Heterogeneous Distributed Database Systems", Proc. of the Sixth Int. Conf. on Data Engineering, Los Angeles, California, February 1990.

[Eswa 76] Eswaran, K. P., Gray, J. N., Lorie, R. A., Traiger, I. L., "The Notions of Consistency and Predicate Locks in a Database System", Comm. ACM 19(11), pp. 624-633, 1976.

[Geor 91] Dimitrios Georgakopoulos, Marek Rusinkiewicz, Amit Sheth, "On Serializability of Multidatabase Transactions Through Forced Local Conflicts", in Proc. of the Seventh Int. Conf. on Data Engineering, Kobe, Japan, April 1991.

[Glig 85] V. Gligor, R. Popescu-Zeletin, "Concurrency Control Issues in Distributed Heterogeneous Database Management Systems", in F. A. Schreiber, W. Litwin editors, Distributed Data Sharing Systems, pp. 43-56, North Holland, 1985.

[Gray 78] Gray, J. N., "Notes on Database Operating Systems", Operating Systems: An Advanced Course, Lecture Notes in Computer Science 60, pp. 393-481, Springer-Verlag, 1978.

[Hadz 87] Vassos Hadzilacos, "A Knowledge Theoretic Analysis of Atomic Commitment Protocols", Proc. of the Sixth ACM Symposium on Principles of Database Systems, pp. 129-134, March 23-25 1987.

[Halp 87] Joseph Y. Halpern, "Using Reasoning about Knowledge to Analyze Distributed Systems", Research Report RJ 5522 (56421) 3/3/87, Computer Science, IBM Almaden Research Center, San Jose, California, 1987.

[Kung 81] Kung, H. T., Robinson, J. T., "On Optimistic Methods for Concurrency Control", ACM Trans. on Database Systems 6(2), pp. 213-226, June 1981.

[Lamp 76] Lampson, B., Sturgis, H., "Crash Recovery in a Distributed Data Storage System", Technical Report, Xerox, Palo Alto Research Center, Palo Alto, California, 1976.

[Litw 89] Litwin, W., H. Tirri, "Flexible Concurrency Control Using Value Date", in Integration of Information Systems: Bridging Heterogeneous Databases, ed. A.

[Pu 88] Calton Pu, "Transactions across Heterogeneous : the Superdatabase Architecture", Technical Report No. CUCS-243-86 (revised June 1988), Department of Computer Science, Columbia University, New York, NY.

[Raz 90] Yoav Raz, "The Principle of Commitment Ordering, or Guaranteeing Serializability in a Heterogeneous Environment of Multiple Autonomous Resource Managers", DEC-TR 841, Digital Equipment Corporation, November 1990, revised April 1992.

[Raz 91a] Yoav Raz, "The Commitment Order Coordinator (COCO) of a Resource Manager, or Architecture for Distributed Commitment Ordering Based Concurrency Control", DEC-TR 843, Digital Equipment Corporation, December 1991, revised April 1992.

[Raz 91b] Yoav Raz, "Extended Commitment Ordering, or Guaranteeing Global Serializability by Applying Commitment Order Selectively to Global Transactions", DEC-TR 842, Digital Equipment Corporation, November 1991, revised April 1992.

[Shet 90] Amit Sheth, James Larson, "Federated Database Systems", ACM Computing Surveys, Vol. 22,
Gupta, IEEE Press,1989. No 3, pp. 183-236,September 1990. [Lome 903 David Lomet, “Consistent Timestamping [Silb 911 Avi Silberschatz, , for Transactionsin Distributed Systems”,Technical Re- Jeff Ullman, “DatabaseSystems: Achievements and Op- port CRL 90/3, Digital Equipment Corporation, Cam- portunities”, Communications of the ACM, Vol. 34, No. bridge ResearchLab, September1990. 10, October 1991. [LU6.2] System Network Architecture - Format [Weih 891 William E. Weihl, “Local Atomicity and Protocol ReferenceManual: Architecture Logic for Properties: Modular Concurrency Control for Abstract LU Type 6.2, SC30-3269-3,International BusinessMa- Data Types”, ACM Transactions on Programming Lan- chines Corporation, 1985. guages and Systems, Vol. 11, No. 2, pp. 249-282, April [OSI-CCR] ISO/IEC IS 9804, 9805, JTCl/SC21, In- 1989. formation Processing Systems - Open Systems Intercon- nection - Commitment, Concurrency and Recovery serv- ice element, October 1989. ISO/IEC JTCl/SC21 N4611 Addendum, CCR Tutorial - Annex C of IS0 9804, August 1990. [OSI-SMO] ISO/IEC DP 10040,JTCl/SC21, Informa- tion Processing Systems - Open Systems Interconnection - System Management Overview, September1989. [OSI-DTP] ISO/IEC DIS 10026 (1,2,3), JTCl/SC21, Information Processing Systems - Open Systems Inter- connection - Distributed Transaction Processing, @to- ber 1989. [Papa863 Papadimitriou, C. H., The Theory of Concurrency Control, Computer SciencePress, 1986.
