<<

Structuring Applications

• Many applications involve long transactions Models of Transactions that make many accesses • To deal with such complex applications many systems Chapter 19 provide mechanisms for imposing some structure on transactions

2

Flat Transaction Flat Transaction • Consists of: begin transaction – Computation on local variables • Abort causes the begin transaction

• not seen by DBMS; hence will be 

ignored in most future discussion execution of a program ¡£ ¤¦©¨   – Access to DBMS using call or ¢¡£ ¥¤§¦©¨  that restores the statement level interface variables updated by the ¡£ ¤¦©¨ • This is transaction schedule; …..

applies to these operations ¢¡£ ¤§¦©¨ ….. transaction to the state • No internal structure they had when the if condition then abort • Accesses a single DBMS transaction first accessed commit commit • Adequate for simple applications them.

3 4

Some Limitations of Flat Transactions • Only total rollback (abort) is possible – Partial rollback not possible Providing Structure Within a • All work lost in case of crash Single Transaction • Limited to accessing a single DBMS • Entire transaction takes place at a single point in time

5 6

1 begin transaction S1;

Savepoints Call to DBMS sp1 := create_savepoint(); S2;

sp2 := create_savepoint(); • Problem: Transaction detects condition that S3; requires rollback of recent database changes if (condition) {rollback (sp1); S5}; S4; that it has made commit

1: Transaction reverses changes • Rollback to spi causes database updates subsequent to creation itself of spi to be undone – S2 and S3 updated the database (else there is no point • Solution 2: Transaction uses the rollback rolling back over them) facility within DBMS to undo the changes • Program counter and local variables are not rolled back • creation does not make prior database changes 7 durable (abort rolls all changes back) 8

Distributed Systems: Integration Example of Savepoints of Legacy Applications • Problem: Many enterprises support multiple • Suppose we are making airplane legacy systems doing separate tasks reservations for a long trip – Increasing automation requires that these systems be integrated – London-NY NY-Chicago Chicago-Des Moines • We might put savepoints after the code that made withdraw part the London-NY and NY-Chicago reservations return part Inventory DBMS 1 Site B • If we cannot get a reservation from Chicago to stock level Application Des Moines, we would rollback to the savepoint order part after the London-NY reservation and then perhaps Billing Site C payment DBMS 2 try to get a reservation through St Louis Application

9 10

Distributed Transactions Distributed Transactions • Incorporate transactions at multiple servers into a single (distributed) transaction • Goal: should be ACID – Not all distributed applications are legacy systems; some are built from scratch as distributed systems – Each subtransaction is locally ACID (e.g., local constraints maintained, locally serializable) tx_begin; Inventory DBMS 1 Site B – In addition the transaction should be globally ACID order_part; Application • A: Either all subtransactions commit or all abort withdraw_part; • C: Global integrity constraints are maintained payment; Billing DBMS 2 Site C tx_commit; Application • I: Concurrently executing distributed transactions are globally serializable Site A • D: Each subtransaction is durable

11 12

2 Banking Example Banking Example (con’t)

• Global atomicity - funds transfer • Global consistency - – Either both subtransactions commit or neither does – Sum of all account balances at bank branches = total assets recorded at main office tx_begin; withdraw(acct1); deposit(acct2); tx_commit;

13 14

Exported Interfaces Banking Example (con’t) Local system might export an interface for executing • Global isolation - local at each site individual SQL statements. does not guarantee – post_interest subtransaction is serialized after audit subtransaction

subtransaction in DBMS at branch 1 and before audit tx_begin; DBMS 1 site B

¤§¦©¨ ¦ ¥¤¢¡¥ 

in DBMS at branch 2 (local isolation), but ¡£

¢£ ¤ ¦ ¦¥§¡¥ 

– there is no global order ¡£ ¤§¦©¨

¡£ ¤§¦©¨ ¦  ¤¢¡   post_interest audit DBMS 2 site C time tx_commit; ↓ sum balances at branch 1; site A subtransaction post interest at branch 1; post interest at branch 2; Alternatively, the local system might export an interface sum balances at branch 2; for executing subtransactions. 15 16

Multidatabase Transaction Hierarchy • A distributed transaction invokes subtransactions. • Set of accessed by a distributed • General model: one distributed transaction might transaction is referred to as a multidatabase (or invoke another as a subtransaction, yielding a federated database) hierarchical structure – Each database retains its autonomy and might support local (non-distributed) transactions Distributed transactions • Multidatabase might have global integrity constraints – e.g., Sum of balances of individual bank accounts at all branch offices = total assets stored at main office

17 18

3 Models of Distributed Transactions Distributed Transactions • Can siblings execute concurrently? • Transaction designer has little control over • Can parent execute concurrently the structure. fixed by with children? distribution of data and/or exported • Who initiates commit? interfaces (legacy environment)

Hierarchical Model: No concurrency among subtransactions, root initiates commit • Essentially a bottom-up design Peer Model: Concurrency among siblings and between parent and children, any subtransaction can initiate commit

19 20

Characteristics of Nested Transactions Nested Transactions • (1) Parent can create children to perform subtasks; children might • Problem: Lack of mechanisms that allow: execute sequentially or concurrently; – a top-down, functional decomposition of a parent waits until all children transaction into subtransactions complete (no communication – individual subtransactions to abort without between parent and children). aborting the entire transaction • (2) Each subtransaction (together with its descendants) is • Although a nested transaction looks similar isolated with respect to each sibling (and its descendants). to a distributed transaction, it is not Hence, siblings are serializable, but order is not determined conceived of as a tool for accessing a and nested transaction is non-deterministic. multidatabase • (3) Concurrent nested transactions are serializable.

21 22

Characteristics of Nested Transactions Nested Transaction - Example

Booking a flight from • (4) A subtransaction is atomic. It can London to Des Moines abort or commit independently of other L -- DM C subtransactions. Commit is conditional concurrent C = commit on commit of parent (since child task is a A = abort subtask of parent task). Abort causes C L -- NY C NY -- DM abort of all subtransaction’s children. sequential

• (5) Nested transaction commits when root commits. At NY -- Chic -- DM A concurrent C NY -- StL -- DM that point updates of committed subtransactions are made stop in Chicago concurrent stop in St. Louis durable. C/A A C C

23 NY -- Chic Chic -- DM NY -- StL StL -- DM24

4 Nested Transactions parent of all Characteristics of Nested nested transactions Transactions • (6) Individual subtransactions are not necessarily concurrent consistent, but nested transaction as a whole is consistent

isolation isolation isolation

25 26

Structuring to Increase Example - Switch Sections Transaction Performance transaction moves student from section • Problem: In the models previously discussed, a s1 to section s2, transaction generally locks items it accesses and holds Move(s1, s2) uses TestInc, Dec locks until commit time to guarantee serializabiltiy Section abstr. acquire on x release lock on x L2 TestInc(s2) Dec(s1) ↓ ↓ T1: r(x:12) .. compute .. w(x:13) commit Tuple abstr. enrollments T2: request read(x) r(x:13) ..compute.. w(x:14) .. L1 Sel(t2) Upd(t2) s t o r e d i n t u p l e s Upd(t1) t1 and t2 ↑ ↑ Page abstr. (wait) acquire lock on x L0 Rd(p2) Rd(p2) Wr(p2) Rd(p1) Wr(p1) – This eliminates bad interleavings, but limits concurrency and tuples stored hence performance in pages 27 time p1 and p2 28

Chained Transactions

• Problem 1 (trivial): Invoking begin_transaction at the start of each Structuring into Multiple transaction involves communication Transactions overhead • With chaining, a new transaction is started automatically for an application program when the program commits or aborts the previous one – This is the approach taken in SQL

29 30

5 Chained Transactions Chained Transactions transaction • Problem 2: If the system crashes during the starts execution of a long-running transaction, begin transaction begin transaction implicitly considerable work can be lost S1 S1 S1 commit commit commit S2 S2 begin transaction • Chaining allows a transaction S1; S3 begin transaction S2 S1 commit; commit to be decomposed into sub- S3 S3 S2 => S2; commit commit transactions with intermediate S3 commit; commit points commit S3; S2 not included in a Equivalent since Chaining • Database updates are made commit; transaction since it S2 does not access equivalent has no db operations the database durable at intermediate points => less work is lost in a crash

31 32

Example Chaining Considerations - S1; -- update recs 1 - 1000 Atomicity commit; S2; -- update recs 1001 - 2000 commit; • Transaction as a whole is not atomic. If S3; -- update recs 2001 - 3000 crash occurs commit; – DBMS cannot roll the entire transaction back • Initial subtransactions have committed, • Chaining compared with savepoints: – Their updates are durable – The updates might have been accessed by other – Savepoint: explicit rollback to arbitrary savepoint; all transactions (locks have been released) updates lost in a crash – Hence, the application must roll itself forward – Chaining: abort rolls back to last commit; only the updates of the most recent transaction lost in a crash

33 34

Chaining Considerations - Atomicity • Roll forward requires that on recovery the application Chaining Considerations can determine how much work has been committed – Each subtransaction must tell successor where it left off • Transaction as a whole is not isolated. • Communication between successive subtransactions – Database state between successive subtransactions might cannot use local variables (they are lost in a crash) change since locks are released (but performance improves) – Use the database to communicate between subtransactions subtransaction 1 subtransaction 2 T1: r(x:15)…w(x:24)… commit r(x:30)… r(rec_index:0); T2: …w(x:30)…commit S1; -- update records 1 - 1000 w(rec_index:1000); -- save index of last record updated • Subtransactions might not be consistent commit; r(rec_index:1000); -- get index of last record updated – Inconsistent intermediate states visible to concurrent S2; -- update records 1001 – 2000 transactions during execution or after a crash w(rec_index:2000); commit; 35 36

6 Alternative Semantics for A Problem With Obtaining Chaining S1; Atomicity With Chaining chain; • Suppose we use the first semantics for chaining S2; – Subtransactions give up locks when they commit chain; • Suppose that after a subtransaction of a transaction S3; T makes its changes to some item and commits commit; – Another transaction changes the same item and • Chain commits the transaction (makes it durable) and commits starts a new transaction, but does not release locks – T would then like to abort – Individual transactions do not have to be consistent – Based on our usual definition of chained transactions, atomicity cannot be achieved because of the committed – Recovery is complicated (as before); rollforward required subtransactions – No performance gain 37 38

Partial Atomicity An Example

• Suppose we want to achieve some measure

of atomicity by undoing the effects of all T1: Update(x)1,1 commit1,1 … abort1

the committed subtransactions when the T2: Update(x) commit overall transaction wants to abort • We might think we can undo the updates

made by T by just restoring the values each If, when T1 aborts, we just restore the value of x to item had when T started (physical logging) the value it had before T1 updated it, T2’s update would be lost – This will not work

39 40

Sagas: An Extension To Chained Compensation Transactions That Achieves Partial Atomicity • One approach to this problem is compensation • For each subtransaction, ST in a chained • Instead of restoring a value physically, we restore i,j it logically by executing a compensating transaction, Ti a compensating transaction, CTi is transaction designed – In the student registration system, a Deregistration • Thus if a transaction T1 consisting of 5 chained subtransaction compensates for a successful subtransactions aborts after the first 3 Registration subtransaction subtransactions have committed, then – Thus Registration increments the Enrollment attribute and Deregistration decrements that same attribute ST1,1ST1,2ST1,3CT1,3CT1,2CT1,1 • Compensation works even if some other concurrent will perform the desired compensation Registration subtransaction has also incremented Enrollment

41 42

7 Declarative Transaction Sagas and Atomicity Demarcation • With this type of compensation, when a • We have already talked about two ways in transaction aborts, the value of every item it which procedures can execute within a changed is eventually restored to the value transaction it had before that transaction started – As a part of the transaction • However, complete atomicity is not • guaranteed – As a child in a nested transaction – Some other concurrent transaction might have read the changed value before it was restored to its original value

43 44

Declarative Transaction Declarative Transaction Demarcation (con’t) Demarcation (con’t) • Two other possible ways • One way to implement such alternatives is – The calling transaction is suspended, and a new transaction is started. When it completes the first through declarative transaction transaction continues demarcation • Example: The called procedure is at a site that charges for its services and wants to be paid even if the calling transaction – Declare in some data structure, outside of any aborts transaction, the desired transactional behavior – The calling transaction is suspended, and the called procedure executes outside of any transaction. When it – When the procedure is called, the system completes the first transaction continues intercepts the call and provides the desired • Example: The called procedure accesses a non-transactional behavior file system

45 46

Implementation of Declarative Transaction Attributes Transaction Demarcation • Declarative transaction demarcation is • Possible attributes (in J2EE) are implemented within J2EE and .NET – Required – RequiresNew – We discuss J2EE (.NET is similar) – Mandatory • The desired transactional behavior of each – NotSupported procedure is declared as an attributed in a – Supports separate file called the deployment – Never • The behavior for each attribute depends on descriptor whether or not the procedure is called from within a procedure

47 – All possibilities are on the next slide 48

8 Status of Calling Method Attribute of Called Not in a In a Method Transaction Transaction Description of Each Attribute Required Starts a New Executes Within Transaction the Transaction RequiresNew Starts a New Starts a New • Required: Transaction Transaction – The procedure must execute within a Mandatory Exception Executes Within transaction Thrown the Transaction • If called from outside a transaction, a new NotSupported Transaction Not Transaction transaction is started Started Suspended • If called from within a transaction, it executes Supports Transaction Not Executes Within within that transaction Started the Transaction Never Transaction Not Exception Started Thrown

All Possibilities 49 50

Description (con’t) Description (con’t)

• RequiresNew: • Mandatory: – Must execute within a new transaction – Must execute within an existing transaction • If called from outside a transaction, a new • If called from outside a transaction, an exception is transaction is started thrown • If called from within a transaction, that transaction is suspended and a new transaction is started. • If called from within a transaction, it executes When that transaction completes, the first within that transaction transaction resumes – Note that this semantics is different from nested transactions. In this case the commit of the new transaction is not conditional.

51 52

Description (con’t) Description (con’t)

• NotSupported: • Supports: – Does not support transaction – Can execute within or not within a transaction, • If called from outside a transaction, a transaction is but cannot start a new transaction not started • If called from outside a transaction, a transaction is • If called from inside a transaction, that transaction is not started suspended until the procedure completes after which • If called from inside a transaction, it executes within the transaction resumes that transaction

53 54

9 Description (con’t) Example

• Never: • The Deposit and Withdraw transactions in – Can never execute within a transaction a banking application would have attribute • If called from outside a transaction, a new Required. transaction is not started – If called to perform a deposit, a new transaction • If called from within a transaction, an exception is would be started thrown – If called from within a Transfer transaction to transfer money between accounts, they would execute within that transaction

55 56

Advantages Multilevel Transactions

• Designer of individual procedures does not have • A multilevel transaction is a nested set of to know the transactional context in which the subtransactions. procedure will be used • The same procedure can be used in different – The commitment of a subtransaction is transaction contexts unconditional, causing it to release its locks, but – Different attributes are specified for each different – Multilevel transactions are atomic and their context concurrent execution is serializable • We discuss J2EE in more detail and how declarative transaction demarcation is implemented in J2EE in the Architecture chapter.

57 58

Example - Switch Sections Multilevel Transactions transaction (sequential), moves student from one section to another, • Data is viewed as a sequence of increasing, Move(s1, s2) uses TestInc, Dec application oriented, levels of abstraction Section abstr. • Each level supports a set of abstract objects L2 TestInc(s2) Dec(s1) and abstract operations (methods) for Tuple abstr. accessing those objects L1 Sel(t2) Upd(t2) Upd(t1) • Each abstract operation is implemented as a Page abstr. transaction using the abstractions at the next L0 Rd(p2) Rd(p2) Wr(p2) Rd(p1) Wr(p1) lower level

59 time 60

10 Multilevel Transactions Multilevel Transactions

• Parent initiates a single subtransaction at a • When a subtransaction (at any level) completes, it time and waits for its completion. Hence a commits unconditionally and releases locks that it has acquired on items at the next lower level. multilevel transaction is sequential. – TestInc(s2) locks t2; unlocks t2 when it commits • All leaf subtransactions in the tree are at the • The change it has made to the locked item becomes same level visible to subtransactions of other transactions • Only leaf transactions access the database. – The incremented value of t2 is visible to a subsequent execution of TestInc or Dec by concurrent transactions • Compare with distributed and nested • This creates problems maintaining isolation and models atomicity.

61 62

Maintaining Isolation Maintaining Atomicity

Move1: TestInc(s2) Dec(s1) abort p2 is unlocked when Sel commits Move2: TestInc(s3) Dec(s1) commit ↓ TestInc1: Sel(t2) Upd(t2) • When T1 aborts, the value of s1 that existed prior to its TestInc2: Sel(t2) Upd(t2) access cannot simply be restored (physical restoration) ↑ • Logical restoration must be done using compensating Sel2 can lock p2 transactions • Problem: Interleaved execution of two – Inc compensates for Dec; Dec compensates for a TestInc’s results in error (we will return to this successful TestInc; no compensation needed for later) unsuccessful TestInc 63 64

Correctness of Multilevel Compensating Transactions Transactions • Multilevel model uses compensating • As we shall see later, transaction – Multilevel transactions are atomic

logical restoration • In contrast with Sagas, which also use (using compensation) compensation, but do not guarantee atomicity caused by abort – Concurrent execution of multilevel transactions is serializable T1: TestInc(s2) Dec(s1) Inc(s1) Dec(s2) T2: TestInc(s3) Dec(s1) commit

65 66

11 Recoverable Queues Transactional Features

• Problem: Distributed model assumes that the T1: begin transaction; T2: begin transaction; compute; dequeue(item); subtransactions of a transaction follow one item := service recoverable perform requested another immediately (or are concurrent). description queue service; enqueue(item); commit; • In some applications the requirement is that a commit; subtransaction be eventually executed, but not necessarily immediately. • Item is enqueued if T1 commits (deleted if it aborts); item • A recoverable queue is a transactional data is deleted if T2 commits (restored if it aborts) structure in which information about • An item enqueued by T1 cannot be dequeued by T2 until T1 commits transactions to be executed later can be durably • Queue is durable stored.

67 68

Pipeline Queue for Billing Concurrent Implemention of the Application Same Application

shipping billing shipping shipping queue queue queue transaction

order entry shipping billing order entry transaction transaction transaction transaction

billing billing queue transaction

69 70

Recoverable Queue Recoverable Queue • Separate implementation takes advantage • Queue could be implemented within database, of semantics to improve performance but performance suffers – enqueue and dequeue are atomic and isolated, – A transaction should not hold long duration locks but some queue locks are released on a heavily used data structure immediately acquire lock on queue in db release lock on queue acquire lock on queue and entry I1

↓ ↓ release lock on queue release lock on I1

T1: enq(I1) …………..compute………………commit T2: request enq(I2) (wait) enq(I2) T1: enq(I1) …………..compute………………commit T3: request enq(I3) (wait) T2: enq(I2) …………….compute………….. T4: request enq(I4) (wait) T3: enq(I3) ………………..compute ………… acquire lock on queue acquire lock 71 T4: enq(I4)…………….compute ….. 72 on queue and I2

12 Recoverable Queue Scheduling begin transaction select DBMS update • As a result, any scheduling policy for enqueue accessing the queue might be enforced select recoverable dequeue queue – but a FIFO queue might not behave in a FIFO commit manner • Queue and DBMS are two separate systems T1: enq(I1) …commit restore I1 – Transaction must be committed at both but T2: enq(I2) …commit • isolation is implemented at the DBMS and applies to the T3: deq(I1) … abort schedule of requests made to the DBMS only T4: deq(I2) …commit

73 74

Performing Real-World Actions Performing Real-World Actions

• Problem: A real-world action performed from within a • Solution: (part 1) T enqueues entry. If T aborts, item transaction, T, cannot be rolled back if crash occurs before is dequeued; if T commits action executed later commit.

T: begin_transaction; T queue T device compute; D crash update database; T: begin_transaction; TD: begin_transaction; activate device; compute; dequeue entry; commit; update database; activate device; enqueue entry; commit; • On recovery after a crash, how can we tell if the action has commit; occurred? – ATM example: We do not want to dispense cash twice. • Server executes TD in a loop – but problem still exists within TD 75 76

Performing Real-World Actions Performing Real-World Actions T queue TD counter

device • On recovery:

• Solution: (part 2) Restore queue and database (value read from – Device maintains read-only counter (hardware) that counter) to last commit; is automatically incremented with each action if (device value > recorded value) • Action and increment are assumed to occur atomically then discard head entry;

– Server performs: TD: begin_transaction; Restart server; dequeue; activate device; record counter in db;

commit; 77 78

13 Forwarding Agent Example of Real World Action • Implementing deferred service. • Suppose the hardware counter and the database counter were both at 100 before the transaction started enqueue dequeue invoke – When the hardware performs its action, it increments its counter to client agent server 101 reply – TD would then increment the database counter to 101 • If the system crashed after the hardware performed its Request action the database increment (if it had occurred) would be queue rolled back to 100 • Thus when the system recovered – If the hardware counter was 101 and the database counter was 100, Response we would know that the action had been performed. dequeue queue enqueue – If both counters were the same (100), we would know that the action had not taken place. • In general there are multiple clients (producers) and 79 multiple servers (consumers) 80

Workflow Task Workflows • Self-contained job performed by an agent • Problem: None of the previous models are – Inventory transaction (agent = database server) sufficiently flexible to describe complex, – Packing task (agent = human) long-running enterprise processes involving • Has an associated role that defines type of job computational and non-computational tasks – An agent can perform specified roles in distributed, heterogeneous systems over • Accepts input from other tasks, produces output extended periods of time • Has physical status: committed, aborted, ... – Committed task has logical status: success, failure

81 82

Workflow Workflow • Task execution precedence specified separately from task itself • Conditional alternatives can be specified: – using control flow language: if (condition) execute T1 else execute T2 initiate T2, T3 when T1 committed • Conditions: – or using graphical tool: – Logical/physical status of a task T2 – Time of day T1 – Value of a variable output by a task

T3 AND condition • Alternative paths can be specified in case of concurrency task failure 83 84

14 Execution Precedence in a Workflow Catalog Ordering System • Specifies flow of data between tasks O bill R AND OR by air T2 take update remove package shipping complete T1 order by land T3

85 86

Flow of Data in a Catalog Workflow Agent Ordering System • Capable of performing tasks bill • Has a set of associated roles describing tasks it can by do air • Has a worklist listing tasks that have been take update remove package shipping complete order assigned to it • Possible implementation: by land – Worklist stored in a recoverable queue – Agent is an infinitely looping process that processes one queue element on each iteration

87 88

Workflow and ACID Properties Workflow and ACID Properties • Each task is either • Individual tasks might be ACID, but – Retriable: Can ultimately be made to commit if workflow as a whole is not retried a sufficient number of times (e.g., deposit) – Some task might not be essential: its failure is – Compensatable: Compensating task exists (e.g., ignored even though workflow completes withdraw) – Concurrent workflows might see each other’s – Pivot: Neither retriable nor compensatable (e.g., intermediate state buy a non-refundable ticket) – Might not choose to compensate for a task even though workflow fails

89 90

15 Workflow Management System Workflow and ACID Properties • Provides mechanism for specifying workflow (control flow language, GUI) • The atomicity of a workflow is guaranteed • Provides mechanism for controlling execution of if each execution path is characterized by concurrent workflows: {compensatable}*, [pivot], {retriable}* – Roles and agents • This does not guarantee isolation since – Worklists and load balancing intermediate states are visible – Filters (data reformatting) and controls flow of data – Task activation – Maintain workflow state durably (data, task status) – Use of recoverable queues – Failure recovery of WFMS itself (resume workflows)

91 92

Importance of Workflows

• Allows management of an enterprise to guarantee that certain activities are carried out in accordance with established business rules, even though those activities involve a collection of agents, perhaps in different locations and perhaps with minimal training

93

16