Database System Implementation Project

Spring 2006-2007
Lecture 5

Transaction Properties
• A transaction system must satisfy the ACID properties
• Atomicity
  – Either all the operations within the transaction are reflected properly in the database, or none are
• Consistency
  – When a transaction completes, the database must be in a consistent state; i.e. all constraints must hold
• Isolation
  – When multiple transactions execute concurrently, they must appear to execute one after the other, in isolation of each other
• Durability
  – After a transaction commits, all changes should persist, even when a system failure occurs

Bank Account Example
• Transfer $400 from account A-201 to A-305
  – Clearly requires multiple steps
• If transaction isn’t atomic:
  – Perhaps only one account shows the change!
• If transaction isn’t consistent:
  – Sum of account balances may not stay constant
• If transaction isn’t isolated:
  – Multiple operations involving either account could result in inaccurate balances
• If transaction isn’t durable:
  – If DB crashes, could end up with inaccurate balances!

Transaction Properties
• A database system must provide transactions with ACID properties
• Several components must work together to provide ACID properties:
  – A transaction manager ensures atomicity of transactions
  – A concurrency-control system ensures proper isolation of concurrent transactions
  – A recovery manager ensures durability of transactions
  – Database application programmers must ensure consistency of their transactions

Transaction States
• Transactions go through a series of states
• Active
  – Transaction starts in this state, and stays active as it progresses
• Partially committed
  – Last operation in transaction has completed successfully, but transaction may still be aborted
  – e.g. a hardware failure may require the transaction to be aborted during recovery
• Committed
  – Transaction is completed; all changes are visible in the database

Transaction States (2)
• Failed
  – Transaction can no longer complete successfully
  – Transaction cannot be committed, only aborted
• Aborted
  – Transaction has been completely rolled back; DB is in original state
• Transaction state diagram:
  [Figure: states shown are active, partially committed, committed, failed, aborted]

Shadow Copies
• Can provide atomicity and durability using a shadow copy scheme
• Requires only one transaction at a time
• Approach:
  – When transaction needs to write to database, the entire database is copied
  – All modifications in the transaction go against the shadow copy
  – When transaction is committed, the new copy replaces the old version in one atomic operation

Shadow Copies (2)
[Figure: db-pointer, old copy of database, new copy of database]
• A db-pointer refers to the last committed state
• When changes are made, they go against a complete copy of the database
• When transaction is committed, db-pointer is changed to new copy (delete old version)
• If transaction is aborted, just delete new version

Shadow Copies (3)
• Very inefficient strategy for atomicity, durability
  – Can only support one transaction at a time!
• Most text editors use this model
  – During transaction:
    • foo.txt is the current copy
    • #foo.txt# is the shadow copy
  – At transaction-commit (“save document”):
    • foo.txt renamed to foo.txt~
    • #foo.txt# renamed to foo.txt
  – Ideally, changing the db-pointer is an atomic operation provided by the OS
    • Necessary for guaranteeing survival of system failures

Transaction Operations
• Transactions are modeled as a series of read and write operations
• Example:
  – Transfer $400 from account A-201 to A-305
  – Transaction T1 schedule:
      read(X1)
      X1 := X1 - 400
      write(X1)
      read(X2)
      X2 := X2 + 400
      write(X2)

Disk Operations
• Table data is stored in one or more files
  – Data files are read at a page granularity
  – Can model these operations:
    • input(B) transfers block B from disk to memory
    • output(B) writes block B from memory to disk
• Databases include a buffer manager
  – Disk pages are kept in a shared buffer
  – Dramatically reduces number of disk reads and writes
  – read() and write() use buffer pages in memory
  – Buffer manager must interact closely with recovery manager

Log-Based Recovery
• Most databases use a transaction log to provide durable transactions
• Table data is distributed across multiple files
  – Providing atomic operations involving multiple files is very difficult
• Operations are logged to a single file
  – Virtually all OSes provide atomic operations for interacting with a single file
• Can use the transaction log in recovery processing to ensure transaction durability
• These schemes are for single-version storage
  – Only one copy/version of each record is stored

Log-Based Recovery (2)
• Several different kinds of log records:
  <Ti start>
    • Transaction Ti was started
  <Ti, Xj, V1, V2>
    • Transaction Ti wrote to data item Xj
    • Old value was V1, new value is V2
  <Ti commit>
    • Transaction Ti was committed
  <Ti abort>
    • Transaction Ti was aborted

Log-Based Recovery (3)
• Update log records: <Ti, Xj, V1, V2>
  – Records every write operation a transaction performs
  – V1 is the old value, V2 is the new value
• If a txn needs to be redone, can rewrite V2 to Xj
  – V2 is “redo data”
• If a txn needs to be undone, can rewrite V1 to Xj
  – V1 is “undo data”
• A data value Xj may have multiple updates in transaction Ti
  – Transaction log will have multiple update records for that value

Deferred Modification
• Deferred-modification technique
  – All updates are recorded to transaction log first
  – Table writes are deferred until txn partially commits
• At commit time for transaction Ti:
  – <Ti commit> record is written to log
  – Transaction log is flushed to disk
  – Records for Ti are used to perform deferred writes
• Undo data is unnecessary with this scheme
  – A table’s data is never written before the txn commits
• Can be very inefficient for large transactions
  – Generate many records, then replay them at commit!

Log-Based Recovery
• When DB system crashes, can scan log to restore DB to a consistent state
  – “Replay the log” against the database
• If log has both <Ti start> and <Ti commit> logs for transaction Ti, redo that transaction
  – Replay all update operations for Ti
  – Model as a redo(Ti) function
• If no commit record for Ti, don’t redo its updates
• redo() function must be idempotent
  – Applying redo() multiple times must be equivalent to applying it only once
  – DB may also crash during recovery processing

Immediate Modification
• Immediate-modification technique
  – Table writes are allowed before a txn commits
    • Called uncommitted modifications
  – If a transaction must be rolled back, old values of data items are required!
• Transaction log is maintained as before
  – Update records must be written to log before corresponding table writes may occur!
• Technique is called write-ahead logging (WAL)
  – Transaction log is called a write-ahead log
  – All updates are logged in WAL before written to tables

Immediate Modification (2)
• A new recovery procedure is needed: undo(Ti)
  – undo() must also be idempotent
  – Update records are applied in reverse order
  – For each record, V1 (old value) is written back to Xj
• If a txn Ti is aborted during normal operation
  – Use txn-log to undo all operations in Ti: undo(Ti)
• During recovery, scan entire log:
  – If Ti has <Ti start> and <Ti commit> logs: redo(Ti)
  – If Ti has only <Ti start>, or <Ti start> and <Ti abort> logs: undo(Ti)
  – Order of application is important! (more later)

Immediate Modification (3)
• Usually more efficient than deferred modification
  – Most transactions will commit successfully
  – Undoing a transaction will be infrequent
• Must carefully manage disk writes and flushes!
  – Transaction logging and buffer management are tightly coupled
  – Can’t flush a buffer page to disk until corresponding txn-log writes for that page have been flushed

Checkpoints
• Transaction logs can grow very large
  – At recovery time, entire log must be replayed
  – Log may need to be scanned twice or more times
• Can write checkpoints to the transaction log
  – Indicates that all transaction-log records before the checkpoint are reflected in stable storage
  – Only need to replay log from most recent checkpoint
• Checkpoint procedure:
  – Flush all transaction log data to disk
  – Flush all modified table data to disk
  – Write a checkpoint record to the disk

Checkpoints (2)
• No other writes may occur during checkpoint
  – Transactions may be active at time of checkpoint
  – Write operations are suspended until checkpoint completes
• Can delete transaction logs before a checkpoint
  – Those log records are reflected in all table data
• DBs often keep two most recent checkpoints
  – If most recent checkpoint was corrupted, DB can go back to second most recent checkpoint
  – If second most recent checkpoint is also corrupted, recovery fails

Concurrent Transactions
• When transactions are serialized, at most one transaction is interrupted
  – Need to undo at most one transaction
  – May need to redo several transactions
• When transactions proceed concurrently, several transactions may be interrupted
  – Checkpoint record specifies transactions that are in flight at time of checkpoint
• Order of redo() and undo() application is critical
  – Transactions often write to the same data values

Concurrent Transactions (2)
• Must make sure that applying undo(Ti) doesn’t accidentally overwrite redo(Tj)
• Must also apply undo operations backwards
  – A txn can write the same data item multiple times
• Recovery procedure:
  – Generate a list of transactions to redo, and a list of transactions to undo
  – Perform all undo operations first, scanning backward through log
  – Then
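
The recovery procedure above can be illustrated with a minimal Python sketch. It is not the course's implementation: the names LogRecord and recover are hypothetical, an in-memory dict stands in for table data on disk, and checkpoints and concurrency control are left out. The source text breaks off after "Then", so the forward redo pass shown here is an assumption based on the earlier redo(Ti) and undo(Ti) slides.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LogRecord:
    """One transaction-log entry (hypothetical layout for illustration)."""
    kind: str                    # "start", "update", "commit", or "abort"
    txn: str                     # transaction name, e.g. "T1"
    item: Optional[str] = None   # data item Xj (update records only)
    old: Optional[int] = None    # V1, the undo data
    new: Optional[int] = None    # V2, the redo data


def recover(log: List[LogRecord], db: Dict[str, int]) -> Dict[str, int]:
    """Bring db (modified in place) to a consistent state using the log."""
    started = {r.txn for r in log if r.kind == "start"}
    committed = {r.txn for r in log if r.kind == "commit"}
    redo_list = started & committed   # has both start and commit records
    undo_list = started - committed   # only start, or start and abort

    # Undo pass: scan backward, writing each old value V1 back to Xj.
    for rec in reversed(log):
        if rec.kind == "update" and rec.txn in undo_list:
            db[rec.item] = rec.old

    # Redo pass (assumed, see lead-in): scan forward, writing each new
    # value V2 to Xj so committed writes are not lost to an earlier undo.
    for rec in log:
        if rec.kind == "update" and rec.txn in redo_list:
            db[rec.item] = rec.new
    return db


# T1 transfers $400 from A-201 to A-305 and commits.
# T2 writes A-305 but never commits before the crash.
example_log = [
    LogRecord("start", "T1"),
    LogRecord("update", "T1", "A-201", 1000, 600),
    LogRecord("update", "T1", "A-305", 500, 900),
    LogRecord("commit", "T1"),
    LogRecord("start", "T2"),
    LogRecord("update", "T2", "A-305", 900, 950),
]

# Table data found after the crash: T1's write to A-201 never reached
# disk, while T2's uncommitted write to A-305 did.
on_disk = {"A-201": 1000, "A-305": 950}

print(recover(example_log, on_disk))  # {'A-201': 600, 'A-305': 900}
print(recover(example_log, on_disk))  # same result: recovery is idempotent
```

Both passes only assign logged values rather than applying increments, which is why running recover() a second time (for example after a crash during recovery) leaves the same balances.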
