ARIES: Logging and Recovery Review

Slides derived from Joe Hellerstein; updated by A. Fekete.

"If you are going to be in the logging business, one of the things that you have to do is to learn about heavy equipment."
- Robert VanNatta, Logging History of Columbia County

Review: The ACID properties
• Atomicity: All actions in the Xact happen, or none happen.
• Consistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent.
• Isolation: Execution of one Xact is isolated from that of other Xacts.
• Durability: If a Xact commits, its effects persist.
• The Recovery Manager guarantees Atomicity & Durability.

Motivation
• Atomicity:
  – Transactions may abort (“Rollback”).
• Durability:
  – What if DBMS stops running? (Causes?)
• Desired behavior after system restarts:
  – T1, T2 & T3 should be durable.
  – T4 & T5 should be aborted (effects not seen).
[Figure: timelines of T1 to T5 up to the crash.]

Intended Functionality
• At any time, each data item contains the value produced by the most recent update done by a transaction that committed.

Assumptions
• Essential concurrency control is in effect.
  – For read/write items: write locks taken and held till commit.
    • E.g. Strict 2PL, but read locks not important for recovery.
  – For more general types: operations of concurrent transactions commute.
• Updates are happening “in place”.
  – i.e. data is overwritten on (deleted from) its location.
    • Unlike multiversion approaches.
• Buffer in volatile memory; data persists on disk.

Challenge: REDO
  Action         Buffer   Disk
  Initially               0
  T1 writes 1    1        0
  T1 commits     1        0
  CRASH                   0
• Need to restore value 1 to item.
  – Last value written by a committed transaction.

Challenge: UNDO
  Action         Buffer   Disk
  Initially               0
  T1 writes 1    1        0
  Page flushed            1
  CRASH                   1
• Need to restore value 0 to item.

Handling the Buffer Pool
• Can you think of a simple scheme to guarantee Atomicity & Durability?
• Force write to disk at commit?
  – Poor response time.
• Policy grid (columns: No Steal, Steal; rows: Force, No Force): Force with No Steal is “Trivial”.
Challenge: UNDO, cont.
• The value to restore is the last value from a committed transaction.

Handling the Buffer Pool, cont.
• Forcing writes to disk at commit provides durability.
• No Steal of buffer-pool frames from uncommitted Xacts (“pin”)?
  – Poor throughput.
  – But easily ensures atomicity.
• Policy grid: No Force with Steal is “Desired”.

More on Steal and Force
• STEAL (why enforcing Atomicity is hard)
  – To steal frame F: current page in F (say P) is written to disk; some Xact holds lock on P.
    • What if the Xact with the lock on P aborts?
    • Must remember the old value of P at steal time (to support UNDOing the write to page P).
• NO FORCE (why enforcing Durability is hard)
  – What if system crashes before a modified page is written to disk?
  – Write as little as possible, in a convenient place, at commit time, to support REDOing modifications.

Basic Idea: Logging
• Record REDO and UNDO information, for every update, in a log.
  – Sequential writes to log (put it on a separate disk).
  – Minimal info (diff) written to log, so multiple updates fit in a single log page.
• Log: An ordered list of REDO/UNDO actions
  – Log record contains: <XID, pageID, offset, length, old data, new data>
  – and additional control info (which we’ll see soon).
  – For abstract types, have operation(args) instead of old value and new value.

Write-Ahead Logging (WAL)
• The Write-Ahead Logging Protocol:
  1. Must force the log record for an update before the corresponding data page gets to disk.
  2. Must write all log records for a Xact before commit.
• #1 (undo rule) allows system to have Atomicity.
• #2 (redo rule) allows system to have Durability.

ARIES
• Exactly how is logging (and recovery!) done?
  – Many approaches (traditional ones used in relational systems of 1980s);
  – ARIES algorithms developed by IBM used many of the same ideas, and some novelties that were quite radical at the time.
    • Research report in 1989; conference paper on an extension in 1989; comprehensive journal publication in 1992.
    • 10 Year VLDB Award 1999.

WAL & the Log
• Each log record has a unique Log Sequence Number (LSN).
[Figure labels: DB, RAM, LSNs, pageLSNs, flushedLSN.]
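The two WAL rules can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class `MiniWAL`, its method names, and the use of Python lists and dicts to stand in for the log disk and the data disk are all hypothetical, and each page holds a single integer value.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    lsn: int
    xid: str
    page_id: str
    old: int  # before-image, kept so the update can be undone
    new: int  # after-image, kept so the update can be redone


@dataclass
class MiniWAL:
    # Hypothetical toy model: lists/dicts stand in for disks and RAM.
    log_tail: list = field(default_factory=list)    # log records still in RAM
    stable_log: list = field(default_factory=list)  # log records flushed to disk
    buffer: dict = field(default_factory=dict)      # page_id -> (value, pageLSN)
    disk: dict = field(default_factory=dict)        # page_id -> (value, pageLSN)
    next_lsn: int = 1

    def flushed_lsn(self):
        """Largest LSN that has reached the stable log (0 if none)."""
        return self.stable_log[-1].lsn if self.stable_log else 0

    def update(self, xid, page_id, new):
        """Log the change first (in RAM), then modify the buffered page."""
        old = self.buffer.get(page_id, self.disk.get(page_id, (0, 0)))[0]
        rec = LogRecord(self.next_lsn, xid, page_id, old, new)
        self.next_lsn += 1
        self.log_tail.append(rec)
        self.buffer[page_id] = (new, rec.lsn)
        return rec.lsn

    def flush_log(self, up_to_lsn):
        """Move log records with LSN <= up_to_lsn from RAM to the stable log."""
        while self.log_tail and self.log_tail[0].lsn <= up_to_lsn:
            self.stable_log.append(self.log_tail.pop(0))

    def write_page(self, page_id):
        """Rule 1: force the log up to pageLSN before the page reaches disk."""
        value, page_lsn = self.buffer[page_id]
        self.flush_log(page_lsn)
        self.disk[page_id] = (value, page_lsn)

    def commit(self, xid):
        """Rule 2: all of the Xact's log records are on disk before commit."""
        last = max((r.lsn for r in self.log_tail if r.xid == xid), default=0)
        self.flush_log(last)
```

Note that `commit` flushes only the log, never the data page, which is the NO FORCE behavior; `write_page` may run before commit, which is the STEAL behavior.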
WAL & the Log, cont.
• LSNs are always increasing.
• Each data page contains a pageLSN.
  – The LSN of the most recent log record for an update to that page.
• System keeps track of flushedLSN.
  – The max LSN flushed so far.
[Figure labels: log records flushed to disk; “Log tail” in RAM; pageLSN.]

Key ideas of ARIES
• Log every change (even undos during txn abort).
• In restart, first repeat history without backtracking.
  – Even redo the actions of loser transactions.
• Then undo actions of losers.
• LSNs in pages used to coordinate state between log, buffer, disk.
(Novel features of ARIES are set in italics in the original slides.)

WAL constraints
• Before a page is written,
  – pageLSN ≤ flushedLSN.
• Commit record included in log; all related update log records precede it in log.

Log Records
• LogRecord fields:
  – prevLSN
  – XID
  – type
  – update records only: pageID, length, offset, before-image, after-image
• Possible log record types:
  – Update
  – Commit
  – Abort
  – End (signifies end of commit or abort)
  – Compensation Log Records (CLRs)
    • for UNDO actions
    • (and some other tricks!)

Other Log-Related State
• Transaction Table:
  – One entry per active Xact.
  – Contains XID, status (running/committed/aborted), and lastLSN.
• Dirty Page Table:
  – One entry per dirty page in buffer pool.
  – Contains recLSN -- the LSN of the log record which first caused the page to be dirty.

Normal Execution of an Xact
• Series of reads & writes, followed by commit or abort.
  – We will assume that page write is atomic on disk.
    • In practice, additional details to deal with non-atomic writes.
• Strict 2PL (at least for writes).
• STEAL, NO-FORCE buffer management, with Write-Ahead Logging.

Checkpointing
• Periodically, the DBMS creates a checkpoint, in order to minimize the time taken to recover in the event of a system crash. Write to log:
  – begin_checkpoint record: Indicates when chkpt began.
  – end_checkpoint record: Contains current Xact table and dirty page table.

The Big Picture: What’s Stored Where
• LOG: LogRecords (prevLSN, ...)
• DB
• RAM: Xact Table
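The log record fields, the transaction table and the dirty page table described above can be sketched as plain Python structures. This is an illustrative sketch only: `LogState` and `append` are hypothetical names, the `offset` and `length` fields are left out for brevity, and the LSN is assumed to be the record's 1-based position in an in-memory list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    lsn: int
    prev_lsn: Optional[int]        # previous log record of the same Xact
    xid: str
    type: str                      # "update", "commit", "abort", "end" or "CLR"
    page_id: Optional[str] = None  # update records only
    before: Optional[int] = None   # before-image (UNDO info)
    after: Optional[int] = None    # after-image (REDO info)


class LogState:
    """Hypothetical container for the log plus the two in-RAM tables."""

    def __init__(self):
        self.log = []
        self.xact_table = {}   # xid -> {"status": ..., "lastLSN": ...}
        self.dirty_pages = {}  # page_id -> recLSN

    def append(self, xid, type_, page_id=None, before=None, after=None):
        entry = self.xact_table.setdefault(
            xid, {"status": "running", "lastLSN": None})
        # prevLSN chains this record to the Xact's previous record.
        rec = LogRecord(len(self.log) + 1, entry["lastLSN"], xid, type_,
                        page_id, before, after)
        self.log.append(rec)
        entry["lastLSN"] = rec.lsn
        if type_ == "update":
            # recLSN is set only by the first record that dirties the page.
            self.dirty_pages.setdefault(page_id, rec.lsn)
        elif type_ == "commit":
            entry["status"] = "committed"
        elif type_ == "abort":
            entry["status"] = "aborted"
        elif type_ == "end":
            del self.xact_table[xid]  # Xact is no longer active
        return rec
```

A real system would also remove a page from the dirty page table when the page is flushed; the sketch omits that step.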
The Big Picture: What’s Stored Where, cont.
• LOG: LogRecords with fields XID, type, pageID, length, offset, before-image, after-image.
• DB: data pages, each with a pageLSN; master record.
• RAM: Xact Table (lastLSN, status); Dirty Page Table (recLSN); flushedLSN.

Checkpointing, cont.
• This is a “fuzzy checkpoint”:
  – Other Xacts continue to run, so the tables saved in the end_checkpoint record are only known to reflect some mix of state after the time of the begin_checkpoint record.
  – No attempt to force dirty pages to disk; effectiveness of checkpoint limited by oldest unwritten change to a dirty page. (So it’s a good idea to periodically flush dirty pages to disk!)
• Store LSN of chkpt record in a safe place (master record).

Simple Transaction Abort
• For now, consider an explicit abort of a Xact.
  – No crash involved.
• We want to “play back” the log in reverse order, UNDOing updates.
  – Get lastLSN of Xact from Xact table.
  – Can follow chain of log records backward via the prevLSN field.
  – Note: before starting UNDO, could write an Abort log record.
    • Why bother?

Abort, cont.
• To perform UNDO, must have a lock on data!
  – No problem!
• Before restoring old value of a page, write a CLR:
  – You continue logging while you UNDO!!
  – CLR has one extra field: undoNextLSN
    • Points to the next LSN to undo (i.e. the prevLSN of the record we’re currently undoing).
  – CLR contains REDO info.
  – CLRs are never undone.
    • Undo needn’t be idempotent (>1 UNDO won’t happen).
    • But they might be redone when repeating history (=1 UNDO guaranteed).
• At end of all UNDOs, write an “end” log record.

Transaction Commit
• Write commit record to log.
• All log records up to Xact’s lastLSN are flushed.
  – Guarantees that flushedLSN ≥ lastLSN.
  – Note that log flushes are sequential, synchronous writes to disk.
  – Many log records per log page.
• Make transaction visible.

Crash Recovery: Big Picture
• Start from a checkpoint (found via master record).
• Three phases. Need to:
  – Figure out which Xacts committed since checkpoint, which failed (Analysis).
  – REDO all actions.
[Figure labels: oldest log rec. of Xact active at crash; smallest recLSN in dirty page table after Analysis.]
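The abort procedure (follow the prevLSN chain backward, write a CLR before restoring each old value, finish with an end record) can be sketched as follows. The function names `append`, `update` and `abort`, the dict-based log records, and the one-integer-per-page model are assumptions made for illustration; LSNs are taken to be 1-based positions in the log list.

```python
def append(log, **fields):
    """Append a record to the log and return its LSN (1-based position)."""
    fields["lsn"] = len(log) + 1
    log.append(fields)
    return fields["lsn"]


def update(log, pages, last_lsn, xid, page_id, new):
    """Log an update, then apply it to the page and stamp the pageLSN."""
    old = pages.get(page_id, {"value": 0})["value"]
    lsn = append(log, type="update", xid=xid, prevLSN=last_lsn.get(xid),
                 pageID=page_id, before=old, after=new)
    pages[page_id] = {"value": new, "pageLSN": lsn}
    last_lsn[xid] = lsn


def abort(log, pages, last_lsn, xid):
    """Undo xid's updates in reverse order, logging a CLR before each restore."""
    to_undo = last_lsn[xid]
    last = append(log, type="abort", xid=xid, prevLSN=to_undo)
    while to_undo is not None:
        rec = log[to_undo - 1]
        if rec["type"] == "update":
            # The CLR carries REDO info (the restored value) and points at
            # the next record to undo: the prevLSN of the record being undone.
            last = append(log, type="CLR", xid=xid, prevLSN=last,
                          pageID=rec["pageID"], after=rec["before"],
                          undoNextLSN=rec["prevLSN"])
            pages[rec["pageID"]] = {"value": rec["before"], "pageLSN": last}
        to_undo = rec["prevLSN"]
    last_lsn[xid] = append(log, type="end", xid=xid, prevLSN=last)
```

For example, if T1 updates pages A and C and is then aborted, the log gains an abort record, one CLR for C, one CLR for A, and an end record, in that order, while updates by other Xacts are left untouched.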
Transaction Commit, cont.
• Making the transaction visible: Commit() returns, locks dropped, etc.
• Write end record to log.

Crash Recovery: Big Picture, cont.
• REDO repeats history.
• Then UNDO effects of failed Xacts.
[Figure: log timeline from the last chkpt to the CRASH, showing the spans of the A, R and U passes.]

Recovery: The Analysis Phase
• Reconstruct state at checkpoint.
  – via end_checkpoint record.
• Scan log forward from begin_checkpoint.
  – End record: Remove Xact from Xact table.
  – Other records: Add Xact to Xact table, set lastLSN=LSN, change Xact status on commit.
  – Update record: If P not in Dirty Page Table,
    • Add P to D.P.T., set its recLSN=LSN.
• This phase could be skipped; information can be regained in the REDO pass following.

Recovery: The REDO Phase
• We repeat History to reconstruct state at crash:
  – Reapply all updates (even of aborted Xacts!), redo CLRs.
• Scan forward from log rec containing smallest recLSN in D.P.T. For each CLR or update log rec LSN, REDO the action unless page is already more up to date than this record:
  – REDO when affected page is in D.P.T., and has pageLSN (in DB) < LSN. [If page has recLSN > LSN, no need to read page in from disk to check pageLSN.]
• To REDO an action:
  – Reapply logged action.
  – Set pageLSN to LSN. No additional logging!

Invariant
• State of page P is the outcome of all changes of relevant log records whose LSN is <= P.pageLSN.
• During redo phase, every page P has P.pageLSN >= redoLSN.
• Thus at end of redo pass, the database has a state that reflects exactly everything on the (stable) log.

Recovery: The UNDO Phase
• Key idea: Similar to simple transaction abort, for each loser transaction (that was in flight or aborted at time of crash).
  – Process each loser transaction’s log records backwards, undoing each record in turn and generating CLRs.
• But: loser may include partial (or complete) rollback actions.
• Avoid undoing what was already undone.
  – undoNextLSN field in each CLR equals prevLSN field from the original action.

UndoNextLSN
[Figure only.]

Recovery: The UNDO Phase
ToUndo = { l | l a lastLSN of a “loser” Xact }
Repeat:
  – Choose largest LSN among ToUndo.
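The three recovery passes can be sketched in a few dozen lines of Python. This is a toy model under several assumptions that are not in the slides: there is no checkpoint, so Analysis scans the whole log; log records are dicts whose LSN is their 1-based position; each page holds one integer; and the function name `recover` is hypothetical. The text above stops after "Choose largest LSN among ToUndo", so the rest of the UNDO loop here follows the usual ARIES handling (a CLR leads to its undoNextLSN, an update is undone with a new CLR and leads to its prevLSN) and should be read as an assumption.

```python
def recover(log, disk):
    """Run Analysis, REDO and UNDO over `log` (list of dicts, LSN = index + 1)."""
    # Analysis: rebuild the Xact table and the dirty page table.
    xacts, dpt = {}, {}
    for rec in log:
        xid = rec["xid"]
        if rec["type"] == "end":
            xacts.pop(xid, None)
            continue
        status = xacts.get(xid, {}).get("status", "running")
        if rec["type"] == "commit":
            status = "committed"
        xacts[xid] = {"lastLSN": rec["lsn"], "status": status}
        if rec["type"] in ("update", "CLR"):
            dpt.setdefault(rec["pageID"], rec["lsn"])  # recLSN

    # REDO: repeat history from the smallest recLSN, losers and CLRs included.
    start = min(dpt.values(), default=len(log) + 1)
    for rec in log[start - 1:]:
        if rec["type"] not in ("update", "CLR"):
            continue
        pid = rec["pageID"]
        if pid not in dpt or dpt[pid] > rec["lsn"]:
            continue  # page cannot be missing this change
        page = disk.setdefault(pid, {"value": 0, "pageLSN": 0})
        if page["pageLSN"] < rec["lsn"]:
            page["value"], page["pageLSN"] = rec["after"], rec["lsn"]

    # UNDO: roll back losers, always taking the largest LSN in ToUndo.
    last = {xid: x["lastLSN"] for xid, x in xacts.items()}
    to_undo = {x["lastLSN"] for x in xacts.values()
               if x["status"] != "committed"}
    while to_undo:
        lsn = max(to_undo)
        to_undo.remove(lsn)
        rec = log[lsn - 1]
        xid = rec["xid"]
        if rec["type"] == "CLR":
            nxt = rec["undoNextLSN"]  # skip what was already undone
        else:
            if rec["type"] == "update":
                clr = {"lsn": len(log) + 1, "type": "CLR", "xid": xid,
                       "prevLSN": last[xid], "pageID": rec["pageID"],
                       "after": rec["before"],
                       "undoNextLSN": rec["prevLSN"]}
                log.append(clr)
                last[xid] = clr["lsn"]
                disk[rec["pageID"]] = {"value": rec["before"],
                                       "pageLSN": clr["lsn"]}
            nxt = rec["prevLSN"]
        if nxt is None:
            log.append({"lsn": len(log) + 1, "type": "end", "xid": xid,
                        "prevLSN": last[xid]})
        else:
            to_undo.add(nxt)
    return disk
```

Because REDO compares pageLSN with each record's LSN and UNDO follows undoNextLSN past work that was already compensated, running `recover` again after a crash during recovery does not undo any update twice.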
