Transaction Processing:
CPS 216 Advanced Systems

2 Announcement

• Project milestone #2 due today (April 14)
• Homework #4 due in 9 days (April 23)
• Recitation session this Friday (April 18)
  - Homework #4 Q&A
• Project demo period in two weeks (April 28-May 3)
  - Sign-up sheet will be available next Monday
• Final exam in 17 days (May 1, 2-5pm)

3 Review

• ACID
  - Atomicity: TX's are either completely done or not done at all
  - Consistency: TX's should leave the database in a consistent state
  - Isolation: TX's must behave as if they are executed in isolation
  - Durability: Effects of committed TX's are resilient against failures
• SQL transactions
    -- Begins implicitly
    SELECT …;
    UPDATE …;
    ROLLBACK | COMMIT;

4 Concurrency control

• Goal: ensure the "I" (isolation) in ACID

    T1:           T2:
    read(A);      read(A);
    write(A);     write(A);
    read(B);      read(C);
    write(B);     write(C);
    commit;       commit;

  Data items: A, B, C

5 Good versus bad schedules

• Good!
  - T1.r(A), T1.w(A), T1.r(B), T1.w(B), T2.r(A), T2.w(A), T2.r(C), T2.w(C)
• Bad!
  - T1.r(A) [read 400], T2.r(A) [read 400], T1.w(A) [write 400 - 100], T2.w(A) [write 400 - 50], T1.r(B), T2.r(C), T1.w(B), T2.w(C)
• Good?
  - T1.r(A), T1.w(A), T2.r(A), T2.w(A), T1.r(B), T2.r(C), T1.w(B), T2.w(C)

6 Serial schedule

• Execute transactions in order, with no interleaving of operations
  - T1.r(A), T1.w(A), T1.r(B), T1.w(B), T2.r(A), T2.w(A), T2.r(C), T2.w(C)
  - T2.r(A), T2.w(A), T2.r(C), T2.w(C), T1.r(A), T1.w(A), T1.r(B), T1.w(B)
  → Isolation achieved by definition!
• Problem: no concurrency at all
• Question: how to reorder operations to allow more concurrency
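
The lost update in the "Bad!" schedule can be replayed in a few lines. This is only a sketch: the `run` helper and the tuple encoding of operations are hypothetical, and it assumes, following the slide's annotations, that T1 subtracts 100 from A and T2 subtracts 50, with A initially 400.

```python
# Hypothetical replay of schedules on a single item A (initially 400).
# Assumption from the slide annotations: T1 subtracts 100, T2 subtracts 50.
DELTA = {"T1": 100, "T2": 50}

def run(schedule, initial=400):
    """Apply a list of (transaction, operation) pairs to item A."""
    a = initial
    local = {}                            # value each transaction last read
    for txn, op in schedule:
        if op == "r":
            local[txn] = a                # read the current database value
        elif op == "w":
            a = local[txn] - DELTA[txn]   # write based on the earlier read
    return a

serial = [("T1", "r"), ("T1", "w"), ("T2", "r"), ("T2", "w")]
bad = [("T1", "r"), ("T2", "r"), ("T1", "w"), ("T2", "w")]

print(run(serial))  # 250: both updates are applied
print(run(bad))     # 350: T1's update is overwritten (lost)
```

Only the operations on A are replayed, because the reads and writes of B and C do not interact between the two transactions.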

7 Conflicting operations

• Two operations on the same data item conflict if at least one of the operations is a write
  - r(X) and w(X) conflict
  - w(X) and r(X) conflict
  - w(X) and w(X) conflict
  - r(X) and r(X) do not
  - r/w(X) and r/w(Y) do not
• Order of conflicting operations matters
  - E.g., if T1.r(A) precedes T2.w(A), then conceptually, T1 should precede T2

8 Precedence graph

• A node for each transaction
• A directed edge from Ti to Tj if an operation of Ti precedes and conflicts with an operation of Tj in the schedule
• Good: no cycle (single edge T1 → T2)
  - T1.r(A), T1.w(A), T2.r(A), T2.w(A), T1.r(B), T2.r(C), T1.w(B), T2.w(C)
• Bad: cycle (edges T1 → T2 and T2 → T1)
  - T1.r(A), T2.r(A), T1.w(A), T2.w(A), T1.r(B), T2.r(C), T1.w(B), T2.w(C)
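
The conflict rule and the precedence graph above can be sketched as follows. The encoding of an operation as a `(transaction, action, item)` tuple and the function names are assumptions made for illustration; the two example schedules are the "no cycle" and "cycle" schedules from the slide.

```python
from itertools import combinations

def conflicts(op1, op2):
    """Operations conflict if they come from different transactions,
    touch the same item, and at least one of them is a write."""
    (t1, a1, x1), (t2, a2, x2) = op1, op2
    return t1 != t2 and x1 == x2 and "w" in (a1, a2)

def precedence_graph(schedule):
    """Return the set of edges (Ti, Tj) of the precedence graph."""
    edges = set()
    # combinations() yields pairs in schedule order: earlier op first.
    for earlier, later in combinations(schedule, 2):
        if conflicts(earlier, later):
            edges.add((earlier[0], later[0]))
    return edges

def has_cycle(edges):
    """Detect a cycle in the directed graph with depth-first search."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}  # node -> "visiting" or "done"

    def visit(u):
        state[u] = "visiting"
        for v in graph[u]:
            if state.get(v) == "visiting":
                return True            # back edge: cycle found
            if v not in state and visit(v):
                return True
        state[u] = "done"
        return False

    return any(visit(u) for u in graph if u not in state)

def parse(text):
    """Turn 'T1.r(A) T1.w(A)' into [('T1', 'r', 'A'), ('T1', 'w', 'A')]."""
    return [(tok[:2], tok[3], tok[5]) for tok in text.split()]

good = parse("T1.r(A) T1.w(A) T2.r(A) T2.w(A) T1.r(B) T2.r(C) T1.w(B) T2.w(C)")
bad = parse("T1.r(A) T2.r(A) T1.w(A) T2.w(A) T1.r(B) T2.r(C) T1.w(B) T2.w(C)")

print(has_cycle(precedence_graph(good)))  # False
print(has_cycle(precedence_graph(bad)))   # True
```

The `parse` helper assumes two-character transaction names and one-character item names, which is enough for these examples.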

9 Conflict-serializable schedule

• A schedule is conflict-serializable iff its precedence graph has no cycles
• A conflict-serializable schedule is equivalent to some serial schedule (and therefore is "good")
  - In that serial schedule, transactions are executed in the topological order of the precedence graph
  - You can get to that serial schedule by repeatedly swapping adjacent, non-conflicting operations from different transactions

10 Locking

• Rules
  - If a transaction wants to read an object, it must first request a shared lock (S mode) on that object
  - If a transaction wants to modify an object, it must first request an exclusive lock (X mode) on that object
  - Allow one exclusive lock, or multiple shared locks

Compatibility matrix (grant the lock?)

                                  Mode of the lock requested
                                  S        X
  Mode of lock(s) currently   S   Yes      No
  held by other transactions  X   No       No
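
The S/X compatibility matrix translates directly into a lookup table. The `LockTable` class below is a hypothetical, minimal sketch: it only answers "grant or not" and, unlike a real lock manager, does not queue blocked requests.

```python
# (mode held by another transaction, mode requested) -> grant the lock?
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

class LockTable:
    def __init__(self):
        self.held = {}  # object -> {transaction: mode}

    def can_grant(self, txn, obj, mode):
        """Check the request against locks held by OTHER transactions."""
        holders = self.held.get(obj, {})
        return all(COMPATIBLE[(m, mode)]
                   for t, m in holders.items() if t != txn)

    def request(self, txn, obj, mode):
        """Grant and record the lock if compatible; return whether granted."""
        if not self.can_grant(txn, obj, mode):
            return False                 # a real DBMS would block the caller
        holders = self.held.setdefault(obj, {})
        if holders.get(txn) != "X":      # never downgrade an X lock to S
            holders[txn] = mode
        return True

    def release(self, txn, obj):
        self.held.get(obj, {}).pop(txn, None)

table = LockTable()
print(table.request("T1", "A", "S"))  # True
print(table.request("T2", "A", "S"))  # True: multiple shared locks
print(table.request("T3", "A", "X"))  # False: readers hold S locks
```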

11 Basic locking is not enough

• T1: add 1 to both A and B (preserves A=B)
• T2: multiply both A and B by 2 (preserves A=B)
• Possible schedule under locking:
    T1: lock-X(A); r(A) [read 100]; w(A) [write 100+1]; unlock(A)
    T2: lock-X(A); r(A) [read 101]; w(A) [write 101*2]; unlock(A)
    T2: lock-X(B); r(B) [read 100]; w(B) [write 100*2]; unlock(B)
    T1: lock-X(B); r(B) [read 200]; w(B) [write 200+1]; unlock(B)
  → A ≠ B!
• But still not conflict-serializable! (precedence graph has edges T1 → T2 and T2 → T1)

12 Two-phase locking (2PL)

• All lock requests precede all unlock requests
  - Phase 1: obtain locks, phase 2: release locks
• Example:
    T1: lock-X(A); r(A); w(A); lock-X(B); unlock(A)
    T2: lock-X(A); r(A); w(A); lock-X(B) [cannot obtain the lock on B until T1 unlocks]
    T1: r(B); w(B); unlock(B)
    T2: r(B); w(B)
  - Resulting schedule: T1.r(A), T1.w(A), T2.r(A), T2.w(A), T1.r(B), T1.w(B), T2.r(B), T2.w(B)
  → 2PL guarantees a conflict-serializable schedule
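
Both points above can be checked mechanically. The sketch below assumes operations are written as strings such as `lock-X(A)` and `unlock(A)`; `is_two_phase` and `replay` are hypothetical helper names. `replay` treats each read-then-write pair on an item as one step, which is enough to reproduce the A ≠ B outcome.

```python
def is_two_phase(ops):
    """True if no lock request follows an unlock in one transaction's ops."""
    unlocked = False
    for op in ops:
        if op.startswith("unlock"):
            unlocked = True
        elif op.startswith("lock") and unlocked:
            return False                 # lock after unlock violates 2PL
    return True

# T1 under basic locking: releases A before locking B.
t1_basic = ["lock-X(A)", "r(A)", "w(A)", "unlock(A)",
            "lock-X(B)", "r(B)", "w(B)", "unlock(B)"]
# T1 under 2PL: obtains the lock on B before releasing A.
t1_2pl = ["lock-X(A)", "r(A)", "w(A)", "lock-X(B)", "unlock(A)",
          "r(B)", "w(B)", "unlock(B)"]

def replay(steps, a=100, b=100):
    """Apply (transaction, item) steps: T1 adds 1, T2 multiplies by 2."""
    db = {"A": a, "B": b}
    for txn, item in steps:
        db[item] = db[item] + 1 if txn == "T1" else db[item] * 2
    return db

print(is_two_phase(t1_basic))  # False
print(is_two_phase(t1_2pl))    # True
# Interleaving allowed by basic locking: A and B end up different.
print(replay([("T1", "A"), ("T2", "A"), ("T2", "B"), ("T1", "B")]))
# Order enforced by 2PL in the example: A and B stay equal.
print(replay([("T1", "A"), ("T2", "A"), ("T1", "B"), ("T2", "B")]))
```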

13 Problem of 2PL

• Schedule: T1.r(A), T1.w(A), T2.r(A), T2.w(A), T1.r(B), T1.w(B), T2.r(B), T2.w(B), then T1 … Abort!
• T2 has read uncommitted data written by T1
• If T1 aborts, then T2 must abort as well
• Cascading aborts possible if other transactions have read data written by T2
• Even worse, what if T2 commits before T1?
  - Schedule is not recoverable if the system crashes right after T2 commits

14 Strict 2PL

• Only release locks at commit/abort time
  - A writer will block all other readers until the writer commits or aborts
→ Used in most commercial DBMS (except Oracle)
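
The recoverability condition can be stated as a small check: a transaction may commit only after every transaction it read from has committed. The sketch below is a simplification under stated assumptions: operations are hypothetical `(transaction, action, item)` tuples with actions `r`, `w`, `c` (commit) and `a` (abort), and the undoing of an aborted transaction's writes is not modeled.

```python
def reads_from(schedule):
    """Map each reader to the set of other transactions whose writes it read."""
    last_writer = {}   # item -> transaction that wrote it most recently
    deps = {}
    for txn, action, item in schedule:
        if action == "w":
            last_writer[item] = txn
        elif action == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                deps.setdefault(txn, set()).add(writer)
    return deps

def is_recoverable(schedule):
    """Each commit must come after the commits of all transactions read from."""
    deps = reads_from(schedule)
    committed = set()
    for txn, action, _ in schedule:
        if action == "c":
            if not deps.get(txn, set()) <= committed:
                return False             # committed while depending on uncommitted data
            committed.add(txn)
    return True

body = [("T1", "r", "A"), ("T1", "w", "A"), ("T2", "r", "A"), ("T2", "w", "A"),
        ("T1", "r", "B"), ("T1", "w", "B"), ("T2", "r", "B"), ("T2", "w", "B")]

# T2 commits before T1, and T1 then aborts: not recoverable.
print(is_recoverable(body + [("T2", "c", None), ("T1", "a", None)]))  # False
# T1 commits first: recoverable.
print(is_recoverable(body + [("T1", "c", None), ("T2", "c", None)]))  # True
```

Under strict 2PL the dirty read of A by T2 could not happen at all, since T1 would hold its X lock on A until it commits or aborts.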

15 Deadlocks

• Example:
    T1: lock-X(A); r(A); w(A)
    T2: lock-X(B); r(B); w(B)
    T1: lock-X(B) [blocked; r(B), w(B) never run]
    T2: lock-X(A) [blocked; r(A), w(A) never run]
  → Deadlock!
• T1 is waiting for T2; T2 is waiting for T1
• Deadlock: cycle in the wait-for graph

16 Dealing with deadlocks

• Impose an order for locking objects
  - Must know in advance which objects a transaction will access
• Timeout
  - If a transaction has been blocked for too long, just abort
• Prevention
  - Idea: abort more often, so blocking is less likely
  - Suppose T is waiting for T'
    * Wait/die scheme: Abort T if it has a lower priority; otherwise T waits
    * Wound/wait scheme: Abort T' if it has a lower priority; otherwise T waits
• Detection using wait-for graph
  - Idea: deadlock is rare, so only deal with it when it becomes an issue
  - When do we detect deadlocks?
  - Which transactions do we abort in case of deadlock?
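
Detection and the two prevention schemes can be sketched as below. The function names are hypothetical, and priorities are assumed to be plain numbers where a larger number means higher priority; how priorities are actually assigned is not specified above.

```python
def find_deadlock(waits_for):
    """Return True if the wait-for graph (txn -> set of txns) has a cycle."""
    visiting, done = set(), set()

    def visit(t):
        visiting.add(t)
        for u in waits_for.get(t, ()):
            if u in visiting or (u not in done and visit(u)):
                return True              # reached a transaction on the current path
        visiting.discard(t)
        done.add(t)
        return False

    return any(t not in done and visit(t) for t in list(waits_for))

def wait_die(t_priority, t_prime_priority):
    """T waits for T'. Abort T if it has lower priority; otherwise T waits."""
    return "abort T" if t_priority < t_prime_priority else "T waits"

def wound_wait(t_priority, t_prime_priority):
    """T waits for T'. Abort T' if it has lower priority; otherwise T waits."""
    return "abort T'" if t_prime_priority < t_priority else "T waits"

# The example above: T1 waits for T2 and T2 waits for T1.
print(find_deadlock({"T1": {"T2"}, "T2": {"T1"}}))  # True
print(find_deadlock({"T1": {"T2"}, "T2": set()}))   # False
print(wait_die(1, 2))    # abort T
print(wound_wait(2, 1))  # abort T'
```

In both schemes a waiting edge only ever points in one priority direction, which is why no cycle can form.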

17 Implementation of locking

• Do not rely on transactions themselves to lock/unlock explicitly
• DBMS inserts lock/unlock requests automatically
• Pipeline:
    Transactions
      → streams of operations
    Insert lock/unlock requests
      → operations with lock/unlock requests
    Scheduler (keeps lock info for each object, including locks currently held and the request queue)
      → serialized schedule with no lock/unlock operations

18 Multiple-granularity locks

• Hard to decide what granularity to lock
  - Trade-off between overhead and concurrency
• Granularities form a hierarchy: Database → Tables → Pages → Rows
• Allow transactions to lock at different granularities, using intention locks
  - S, X: lock the entire subtree in S, X mode, respectively
  - IS: intend to lock some descendant in S mode
  - IX: intend to lock some descendant in X mode
  - SIX (= S + IX): lock the entire subtree in S mode; intend to lock some descendant in X mode

19 Multiple-granularity locking protocol

Compatibility matrix (grant the lock?)

                                   Mode of the lock requested
                                   S     X     IS    IX    SIX
  Mode of lock(s)            S     Yes   No    Yes   No    No
  currently held             X     No    No    No    No    No
  by other transactions      IS    Yes   No    Yes   Yes   Yes
                             IX    No    No    Yes   Yes   No
                             SIX   No    No    Yes   No    No

• Lock: before locking an item, T must acquire intention locks on all ancestors of the item
  - To get S or IS, must hold IS or IX on parent
    * What if T holds S or SIX on parent?
  - To get X or IX or SIX, must hold IX or SIX on parent
• Unlock: release locks bottom-up
• 2PL must also be observed

20 Examples

• T1 reads all of R
  - T1 gets an S lock on R
• T2 scans R and updates a few rows
  - T2 gets an SIX lock on R, and then occasionally gets an X lock for some rows
• T3 uses an index to read only part of R
  - T3 gets an IS lock on R, and then repeatedly gets an S lock on rows it needs to access
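
The matrix and the parent-lock rules can be encoded as two lookup tables. This is an illustrative sketch; `COMPAT`, `PARENT_OK`, and `may_request` are hypothetical names, and the entries left blank in the matrix are taken to mean "No".

```python
MODES = ["S", "X", "IS", "IX", "SIX"]

# Rows: mode held by another transaction. Columns follow MODES (requested).
_ROWS = {
    "S":   [True,  False, True,  False, False],
    "X":   [False, False, False, False, False],
    "IS":  [True,  False, True,  True,  True],
    "IX":  [False, False, True,  True,  False],
    "SIX": [False, False, True,  False, False],
}
# COMPAT[held][requested] -> grant the lock?
COMPAT = {held: dict(zip(MODES, row)) for held, row in _ROWS.items()}

# Modes a transaction must hold on the parent to request a mode on a child.
PARENT_OK = {
    "S":   {"IS", "IX"},
    "IS":  {"IS", "IX"},
    "X":   {"IX", "SIX"},
    "IX":  {"IX", "SIX"},
    "SIX": {"IX", "SIX"},
}

def may_request(child_mode, parent_mode_held):
    """Check the protocol's precondition on the parent's lock."""
    return parent_mode_held in PARENT_OK[child_mode]

# T2 holds SIX on R and asks for X on a row; T3 holds IS on R and asks for S.
print(may_request("X", "SIX"))  # True
print(may_request("S", "IS"))   # True
print(may_request("X", "IS"))   # False: IS on the parent is not enough
# T3's IS on R is compatible with T2's SIX, but T1's S is not.
print(COMPAT["SIX"]["IS"], COMPAT["SIX"]["S"])  # True False
```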

21 Phantom problem revisited

• Reads are repeatable, but may see phantoms
• Example: different average

    -- T1:                                   -- T2:
                                             SELECT AVG(GPA)
                                             FROM Student WHERE age = 10;
    INSERT INTO Student
    VALUES(789, 'Nelson', 10, 1.0);
    COMMIT;
                                             SELECT AVG(GPA)
                                             FROM Student WHERE age = 10;
                                             COMMIT;

→ How do you lock something that does not exist yet?

22 Solutions

• Index locking
  - Use the index on Student(age)
  - T2 locks the index block(s) with entries for age = 10
    * If there are no entries for age = 10, T2 must lock the index block where such entries would be, if they existed!
• Predicate locking
  - "Lock" the predicate (age = 10)
  - Reason with predicates to detect conflicts
  - Expensive to implement
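
A toy version of predicate locking shows the idea of reasoning with predicates. Everything here is hypothetical: the `PredicateLocks` class, the transaction labels `reader` and `writer`, and the column names `SID` and `name` (only `age` and `GPA` appear in the example above). Predicates are plain Python functions tested against the row being inserted; a real system would have to reason about predicates symbolically, which is where the expense comes from.

```python
class PredicateLocks:
    def __init__(self):
        self.locks = []  # list of (transaction, predicate) pairs

    def lock(self, txn, predicate):
        """Record that txn has locked all rows satisfying the predicate."""
        self.locks.append((txn, predicate))

    def blockers(self, txn, row):
        """Other transactions whose locked predicate the new row satisfies."""
        return [t for t, pred in self.locks if t != txn and pred(row)]

locks = PredicateLocks()
# The reader locks the predicate age = 10 before computing AVG(GPA).
locks.lock("reader", lambda row: row["age"] == 10)

nelson = {"SID": 789, "name": "Nelson", "age": 10, "GPA": 1.0}
older = {**nelson, "age": 11}

print(locks.blockers("writer", nelson))  # ['reader']: the insert must wait
print(locks.blockers("writer", older))   # []: no conflict with age = 10
```

The insert of Nelson is blocked even though no such row existed when the lock was taken, which is exactly what row-level locks cannot express.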

23 Concurrency control without locking

• Optimistic (validation-based)
• Timestamp-based
• Multi-version (Oracle, PostgreSQL)

24 Optimistic concurrency control

• Locking is pessimistic
  - Use blocking to avoid conflicts
  - Overhead of locking even if contention is low
• Optimistic concurrency control
  - Assume that most transactions do not conflict
  - Let them execute as much as possible
  - If it turns out that they conflict, abort and restart

25 Sketch of protocol

• Read phase: transaction executes, reads from the database, and writes to a private space
• Validate phase: DBMS checks for conflicts with other transactions; if conflict is possible, abort and restart
  - Requires maintaining a list of objects read and written by each transaction
• Write phase: copy changes in the private space to the database

26 Pessimistic versus optimistic

• Overhead of locking versus overhead of validation and copying private space
• Blocking versus aborts and restarts
• "Concurrency control performance modeling: alternatives and implications," by Agrawal et al. TODS 1987 (in red book)
  - Locking has better throughput for environments with medium-to-high contention
  - Optimistic concurrency control is better when resource utilization is low enough
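
The three phases can be sketched as follows. This is one simple validation rule chosen for illustration (abort if any transaction that committed after we started wrote something we read), not necessarily the rule intended above; `OptimisticDB` and its methods are hypothetical, and the validate and write phases are assumed to run without interruption.

```python
class OptimisticDB:
    def __init__(self, data):
        self.data = dict(data)
        self.commit_log = []   # write sets of committed transactions, in order

    def begin(self):
        # Remember how many transactions had committed when this one started.
        return {"start": len(self.commit_log), "reads": set(), "writes": {}}

    def read(self, txn, key):
        """Read phase: track the read set; see our own private writes first."""
        txn["reads"].add(key)
        return txn["writes"].get(key, self.data[key])

    def write(self, txn, key, value):
        """Read phase: writes go to the private space only."""
        txn["writes"][key] = value

    def commit(self, txn):
        # Validate phase: did a later-committed transaction write what we read?
        for write_set in self.commit_log[txn["start"]:]:
            if write_set & txn["reads"]:
                return False             # conflict possible: abort and restart
        # Write phase: copy the private space to the database.
        self.data.update(txn["writes"])
        self.commit_log.append(set(txn["writes"]))
        return True

db = OptimisticDB({"A": 400})
t1, t2 = db.begin(), db.begin()
db.write(t1, "A", db.read(t1, "A") - 100)
db.write(t2, "A", db.read(t2, "A") - 50)
print(db.commit(t1))   # True
print(db.commit(t2))   # False: T2 read A, which T1 has since overwritten
print(db.data["A"])    # 300
```

A restarted T2 would begin again, read the new value 300, and commit successfully.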

27 Timestamp-based

• Assign a timestamp to each transaction
  - Timestamp order is commit order
• Associate each database object with a read timestamp and a write timestamp
• When transaction reads/writes an object, check the object's timestamp for conflict with a younger transaction; if so, abort and restart
• Problems
  - Even reads require writes (of object timestamps)
  - Ensuring recoverability is hard (plenty of dirty reads)

28 Multi-version concurrency control

• Maintain versions for each database object
  - Each write creates a new version
  - Each read is directed to an appropriate version
  - Conflicts are detected in a similar manner as timestamp concurrency control
• In addition to the problems inherited from timestamp concurrency control
  - Pro: Reads are never blocked
  - Con: Multiple versions need to be maintained
→ Oracle and PostgreSQL use variants of this scheme
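
The timestamp checks can be sketched with the usual basic timestamp-ordering rules, which are assumed here since the exact rules are not spelled out above. Class and method names are hypothetical, timestamps are positive integers with larger meaning younger, and `VersionedObject` only illustrates how a read is directed to a version; it performs no conflict detection.

```python
class Abort(Exception):
    """Raised when a transaction must abort and restart."""

class TimestampedObject:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0    # largest timestamp of any reader so far
        self.write_ts = 0   # timestamp of the latest writer

    def read(self, ts):
        if ts < self.write_ts:           # a younger transaction already wrote
            raise Abort
        self.read_ts = max(self.read_ts, ts)   # even a read updates a timestamp
        return self.value

    def write(self, ts, value):
        if ts < self.read_ts or ts < self.write_ts:   # a younger one read or wrote
            raise Abort
        self.value, self.write_ts = value, ts

class VersionedObject:
    def __init__(self, value):
        self.versions = [(0, value)]     # (write timestamp, value) pairs

    def add_version(self, ts, value):
        """Each write creates a new version (conflict checks omitted)."""
        self.versions.append((ts, value))

    def read(self, ts):
        """Direct the read to the latest version written at or before ts."""
        visible = [v for v in self.versions if v[0] <= ts]
        return max(visible, key=lambda v: v[0])[1]

obj = TimestampedObject(100)
print(obj.read(5))        # 100
obj.write(7, 200)
try:
    obj.read(6)           # older reader arrives after a younger write
except Abort:
    print("abort and restart")

mv = VersionedObject(100)
mv.add_version(7, 200)
print(mv.read(6))         # 100: the older reader still finds its version
```

The last line shows the advantage of keeping versions: the read at timestamp 6 that had to abort under plain timestamp ordering succeeds here.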

29 Summary

• Covered
  - Conflict-serializability
  - 2PL, strict 2PL
  - Deadlocks
  - Multiple-granularity locking
  - Predicate locking and tree locking
  - Overview of other concurrency-control methods
• Not covered
  - View-serializability
  - Concurrency control for search trees (not the same as multiple-granularity locking and tree locking)