3.1 Distributed transactions: Local / global

Outline:
3. Serialization in Distributed DB
  3.1 Distributed Transactions
  3.2 Global and local TA in federations
  3.3 Local and global histories
  3.4 Direct and indirect conflicts
  3.5 Ticket based CC

Global transactions:
- Access data at multiple servers, steps at one or more locations
- Relevant in both homogeneous and heterogeneous distributed DB

Local transactions:
- Run exclusively on a single server
- Relevant in heterogeneous federations

For now: global TAs

based on Weikum / Vossen: Transactional Information Systems, chap. 18 © HS-2010 HS / TA-DCC-2- 2

Homogeneous / heterogeneous TA

A distributed DB server infrastructure is homogeneous if:
- The nodes use the same type of system (say DB2)
- Distribution is transparent to the transaction program
- The nodes have no local autonomy
- TAs have a global coordinator

Typical configuration:
• Parallel DBS
• Multiple Data Managers in 3-tier architecture for performance / [...] reasons

Heterogeneous DB federation

A distributed DB infrastructure is heterogeneous if:
- Nodes have autonomy
- Nodes may be of different type
- TA programs may start subtransactions on other nodes

Typical situation: distributed applications on autonomous systems.
Example: [...] sells a product and sends a transport order to DHL


Global history

TA = { T1, ..., Tm } executed in a federation of n sites.
Steps of Ti are executed at different sites with local histories H1, ..., Hn.

H is a global history for TA with respect to H1, ..., Hn if the local projections of H equal the local history at each site:

  Πi(H) = Hi for 1 <= i <= n

Note: each site has to commit for each TA

Example with 2 sites, one holds x, the other y (time runs from left to right):

  Server 1: r1(x) w2(x) c1 c2
  Server 2: w1(y) c1 r2(y) c2

Global history:

  s = r1(x) w1(y) w2(x) c1 r2(y) c2
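The projection condition Πi(H) = Hi can be checked mechanically. The following is a minimal sketch, not part of the original slides: the function names (`parse`, `project`, `is_global_history`) and the `placement` mapping from sites to data items are assumptions made for illustration. It also assumes, as in the two-server example, that a commit ci appears once in the global history and projects onto every site at which Ti executes.

```python
import re

# An operation is written like "r1(x)", "w2(y)" or "c1" (commit of T1).
OP = re.compile(r"^([rwc])(\d+)(?:\((\w+)\))?$")

def parse(history):
    """Split a history string into (action, transaction, item) triples."""
    ops = []
    for token in history.split():
        m = OP.match(token)
        if m is None:
            raise ValueError(f"not an operation: {token!r}")
        ops.append((m.group(1), int(m.group(2)), m.group(3)))
    return ops

def project(global_ops, site_items, site_txns):
    """Projection of a global history onto one site.

    Keeps reads/writes on items stored at the site, plus the commits
    of every transaction that executes at the site.
    """
    return [
        (a, t, x) for (a, t, x) in global_ops
        if (a == "c" and t in site_txns) or (a != "c" and x in site_items)
    ]

def is_global_history(global_history, local_histories, placement):
    """Check that the projection onto each site equals its local history."""
    g = parse(global_history)
    for site, local in local_histories.items():
        h = parse(local)
        txns = {t for (_, t, _) in h}  # transactions executing at this site
        if project(g, placement[site], txns) != h:
            return False
    return True

# Hypothetical encoding of the two-server example (x at server 1, y at server 2).
placement = {"server1": {"x"}, "server2": {"y"}}
local_histories = {
    "server1": "r1(x) w2(x) c1 c2",
    "server2": "w1(y) c1 r2(y) c2",
}
print(is_global_history("r1(x) w1(y) w2(x) c1 r2(y) c2", local_histories, placement))
```

For the example history s the check succeeds. Moving r2(y) behind c2 in the global history would change the projection onto server 2, so the check would fail.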

Global correctness: Global conflict serializability

A global history S is globally conflict serializable if there exists a serial history over the global (sub-)transactions that is conflict equivalent to S.

Local serializability at all sites is not sufficient!

Example:

  Server 1: r1(x) w2(x) c1 c2
  Server 2: r2(y) c2 w1(y) c1

Both local histories are serializable, however the global history is not:

  r2(y) w1(y) r1(x) w2(x) c1 c2

Server 1 serializes T1 before T2, whereas Server 2 serializes T2 before T1.


Global conflict serializability

Given: a global history S with local histories H1, ..., Hn involving a set TA of transactions Ti, where each Hi is conflict serializable.

Theorem: S is globally conflict serializable
⇔ there exists a total order "<" on TA that is consistent with each local serialization order of the transactions.

Proof: obvious...

Crucial point for the isolation guarantee in distributed transactions:
A total ordering among the TA has to be established by some means, like a manager. How?
Not a big deal in homogeneous distributed DB without autonomy... but it is in heterogeneous, autonomous systems.
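The theorem suggests a simple test: merge the local serialization orders into one precedence graph and look for a topological order. The sketch below is an illustration under assumptions, not the method of the slides: it assumes each site reports a single serialization order as a list of transaction names, and the function name `global_order` is hypothetical. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter, CycleError

def global_order(local_orders):
    """Return a total order consistent with every local serialization
    order, or None if no such order exists."""
    ts = TopologicalSorter()
    for order in local_orders:
        for t in order:
            ts.add(t)  # register the transaction even if it has no predecessor
        for earlier, later in zip(order, order[1:]):
            ts.add(later, earlier)  # 'earlier' must precede 'later'
    try:
        return list(ts.static_order())
    except CycleError:
        return None  # the local orders contradict each other

# Both sites agree on T1 before T2: a global order exists.
print(global_order([["T1", "T2"], ["T1", "T2"]]))

# Site 1 orders T1 before T2, site 2 orders T2 before T1: no global order.
print(global_order([["T1", "T2"], ["T2", "T1"]]))
```

Adding only the edges between neighbours of each local order is enough, because the remaining pairs follow by transitivity.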


3.2 Heterogeneity

Heterogeneous Federations

Independent servers with local autonomy
- If all servers happen to be independent servers: multidatabase system (MDBS)
- No global knowledge ⇒ global protocols are not an option!
- MDBS: loosely coupled distributed servers

Important:
- Local transactions can interfere with global ones
- Can solutions with scheduling guarantees be built from localized components?

[Figure: Heterogeneous federation. Labels: global transactions, global TA manager, local TA managers and data servers, local transactions. Source: Weikum / Vossen: Transactional Inf. Systems]

3.3 Local and global histories

Local history at site S:
All operations of local TAs at S and those operations of global TAs executed at S (subtransactions at S)

Example:
  S1 = {a,b}, S2 = {c,d,e}    /* data */
  t1 = r(a) w(b)              local at S1
  t2 = w(d) r(e)              local at S2
  t3 = w(a) r(d)              global
  t4 = w(b) r(c) w(e)         global

Local histories (t1 and t2 are the local TAs):
  s1: r1(a) w3(a) c3 w1(b) c1 w4(b) c4
  s2: r4(c) w2(d) r3(d) c3 r2(e) c2 w4(e) c4

Local and global history

Let T = {ti} be the set of local and global transactions and s1, ..., sn the local histories (at servers S1, ..., Sn) in a federated system.
A global history h is a sequence of
- exactly the operations of all the local and global transactions, such that
- the local projections of h are the local histories (si)

Simplification: resources disjoint, no [...].

Example:
  s1: r1(a) w3(a) c3 w1(b) c1 w4(b) c4
  s2: r4(c) w2(d) r3(d) c3 r2(e) c2 w4(e) c4
  h:  r1(a) r4(c) w2(d) r3(d) w3(a) c3 r2(e) c2 c3 w1(b) c1 w4(b) w4(e) c4 c4

Here c3 and c4 occur twice in h, once for each site at which t3 and t4 execute.

Serializability of global history

Global history serializable?

  h:  r1(a) r4(c) w2(d) r3(d) w3(a) c3 r2(e) c2 c3 w1(b) c1 w4(b) w4(e) c4 c4
  s1: r1(a) w3(a) c3 w1(b) c1 w4(b) c4
  s2: r4(c) w2(d) r3(d) c3 r2(e) c2 w4(e) c4

  s1: t1 < t3, t1 < t4 ⇒ t1 t3 t4
  s2: t2 < t4, t2 < t3 ⇒ t2 t3 t4

Global history is serializable: t1 < t2 < t3 < t4 ⇒ t1 t2 t3 t4
... by chance!

Counterexample with serializable local histories and conservative scheduling of t1, t2, but without a serializable global history:

  S1 = {a}, S2 = {b,c}
  Global TA: t1 = r(a) w(b), t2 = w(a) r(c)
  Local TA:  t3 = r(b) w(c)

Local histories:
  S1: r1(a) w2(a)
  S2: r3(b) w1(b) r2(c) w3(c)

Local histories are serializable: t1 t2 (at S1) and t2 t3 t1 (at S2)
Global history is not: t1 t2 t3 contradicts S2 (t3 < t1), and t2 t3 t1 contradicts S1 (t1 < t2)

Note: without the local TA t3, the global history would be serializable.

Local serializability does NOT imply global serializability
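Both examples can be replayed by deriving the conflict pairs from the local histories and testing the combined graph for cycles. This is a minimal sketch under the stated simplification that resources are disjoint, so every conflict shows up in exactly one local history and the global conflict graph is the union of the local ones. The function names are assumptions made for illustration, and commits are ignored because all transactions in the examples commit.

```python
import re
from itertools import combinations

def parse(history):
    """Turn 'r1(a) w2(a) c1' into (action, txn, item) triples, skipping commits."""
    return [(a, int(t), x) for a, t, x in re.findall(r"([rw])(\d+)\((\w+)\)", history)]

def conflict_edges(history):
    """Edges (ti, tj): an operation of ti precedes a conflicting operation of tj."""
    edges = set()
    for (a1, t1, x1), (a2, t2, x2) in combinations(parse(history), 2):
        # Two operations conflict if they belong to different transactions,
        # access the same item, and at least one of them is a write.
        if t1 != t2 and x1 == x2 and "w" in (a1, a2):
            edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by 'edges'."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
        succ.setdefault(v, set())
    state = {}  # node -> "open" while on the DFS stack, "done" afterwards

    def visit(u):
        state[u] = "open"
        for v in succ[u]:
            if state.get(v) == "open" or (v not in state and visit(v)):
                return True
        state[u] = "done"
        return False

    return any(u not in state and visit(u) for u in succ)

def globally_serializable(local_histories):
    """Union the conflict graphs of all sites and test for acyclicity."""
    edges = set().union(*(conflict_edges(h) for h in local_histories))
    return not has_cycle(edges)

first = ["r1(a) w3(a) c3 w1(b) c1 w4(b) c4",
         "r4(c) w2(d) r3(d) c3 r2(e) c2 w4(e) c4"]
counterexample = ["r1(a) w2(a)", "r3(b) w1(b) r2(c) w3(c)"]

print(globally_serializable(first))           # True
print(globally_serializable(counterexample))  # False
```

In the counterexample each local history passes the test on its own. Only the union of the edges t1 → t2 (from S1) with t3 → t1 and t2 → t3 (from S2) closes the cycle.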


Global histories

Consequence:
Serializability of the global TAs without the local ones is not sufficient to guarantee global serializability in heterogeneous federations.

Reason: indirect conflict between t1 and t2 at S2 caused by local TA t3:

  r3(b) w1(b) r2(c) w3(c)

Solution in principle: the global TA manager has to guarantee the same serialization sequence of global TAs also in case of indirect conflicts.

Even read-only transactions may be in conflict (!)

  S1 = {a,b}, S2 = {c,d}
  S1: r1(a) r3(a) r3(b) w3(a) w3(b) r2(b)
  S2: r2(c) r4(c) r4(d) w4(c) w4(d) r1(d)

Global transactions t1 and t2 (both read-only!) are serialized differently at either site:

  S1: t1 → t3 → t2
  S2: t2 → t4 → t1
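An indirect conflict is a path in a site's conflict graph that runs through local transactions. One way to make such paths visible, sketched here with hypothetical function names, is to take the transitive closure of each site's conflict edges and keep only the pairs of global transactions. The sketch assumes that t1 and t2 are the global transactions and t3 and t4 the local ones, as in the read-only example.

```python
import re
from itertools import combinations

def conflict_edges(history):
    """Direct conflicts (ti, tj): an operation of ti precedes a conflicting one of tj."""
    ops = re.findall(r"([rw])(\d+)\((\w+)\)", history)
    return {
        (int(t1), int(t2))
        for (a1, t1, x1), (a2, t2, x2) in combinations(ops, 2)
        if t1 != t2 and x1 == x2 and "w" in (a1, a2)
    }

def transitive_closure(edges):
    """Add (u, w) whenever (u, v) and (v, w) are present, until nothing changes."""
    closure = set(edges)
    while True:
        new = {(u, w) for (u, v) in closure for (v2, w) in closure if v == v2}
        if new <= closure:
            return closure
        closure |= new

def global_orderings(history, global_txns):
    """Pairs (ti, tj) of global transactions that this site serializes
    ti before tj, directly or through local transactions."""
    return {
        (u, v) for (u, v) in transitive_closure(conflict_edges(history))
        if u in global_txns and v in global_txns and u != v
    }

s1 = "r1(a) r3(a) r3(b) w3(a) w3(b) r2(b)"
s2 = "r2(c) r4(c) r4(d) w4(c) w4(d) r1(d)"
global_txns = {1, 2}

print(conflict_edges(s1) & {(1, 2), (2, 1)})  # set(): no direct conflict between t1 and t2
print(global_orderings(s1, global_txns))      # {(1, 2)}
print(global_orderings(s2, global_txns))      # {(2, 1)}
```

The two sites report opposite orderings for t1 and t2 although the transactions never conflict directly. A local scheduler sees these paths, but a global TA manager that only observes the operations of global transactions does not.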
