
Multi-Version Concurrency Control
Lecture 18: Case Studies
1 / 82

Recap

Multi-Version Concurrency Control
• The DBMS maintains multiple physical versions of a single logical object in the database:
  - When a txn writes to an object, the DBMS creates a new version of that object.
  - When a txn reads an object, it reads the newest version that existed when the txn started.

Multi-Version Concurrency Control
• Writers don’t block readers. Readers don’t block writers.
• Read-only txns can read a consistent snapshot without acquiring locks or txn ids.
  - Use timestamps to determine visibility.
• Easily support time-travel queries.

Today’s Agenda
• MVCC Protocols
• Microsoft Hekaton (SQL Server)
• TUM HyPer
• SAP HANA
• CMU Cicada

MVCC Protocols

Snapshot Isolation (SI)
• When a txn starts, it sees a consistent snapshot of the database that existed when the txn started.
  - No torn writes from active txns.
  - If two txns update the same object, then first writer wins.
• SI is susceptible to the Write Skew Anomaly.

Write Skew Anomaly [slides 8-9]

Isolation Level Hierarchy [slide 10]

Concurrency Control Protocol
• Approach 1: Timestamp Ordering
  - Assign txns timestamps that determine serial order.
  - Considered to be the original MVCC protocol.
• Approach 2: Optimistic Concurrency Control
  - Three-phase protocol from last class.
  - Use private workspace for new versions.
• Approach 3: Two-Phase Locking
  - Txns acquire the appropriate lock on a physical version before they can read/write a logical tuple.

Tuple Format [slide 12]

Timestamp Ordering (MVTO) [slides 13-19]

Two-Phase Locking (MV2PL) [slides 20-24]

Observation [slides 25-28]

PostgreSQL: Txn Id Wraparound
• Set a flag in each tuple header that says that it is frozen in the past. Any new txn id will always be newer than a frozen version.
• Run the vacuum before the system gets close to the upper limit on txn ids.
• Otherwise, the system must stop accepting new commands when it gets close to the max txn id.
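
The wraparound bullets above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not PostgreSQL's actual code: it assumes a 32-bit txn id counter whose values are compared in circular (modulo 2^32) order, and the names `TupleVersion`, `precedes`, `created_before`, and `vacuum_freeze`, as well as the `freeze_age` threshold, are hypothetical.

```python
from dataclasses import dataclass
from typing import List

XID_BITS = 32                    # assumption: 32-bit txn id counter
XID_MOD = 1 << XID_BITS          # ids wrap around at this value
HALF = 1 << (XID_BITS - 1)       # half of the circular id space


def precedes(a: int, b: int) -> bool:
    """True if txn id a is logically older than b in circular order."""
    # Equivalent to interpreting (a - b) as a signed 32-bit value < 0.
    return (a - b) % XID_MOD >= HALF


@dataclass
class TupleVersion:
    xmin: int             # id of the txn that created this version
    frozen: bool = False  # hypothetical "frozen in the past" header flag


def created_before(version: TupleVersion, current_xid: int) -> bool:
    """A frozen version is older than every txn id; others use circular order."""
    if version.frozen:
        return True
    return precedes(version.xmin, current_xid)


def vacuum_freeze(versions: List[TupleVersion], current_xid: int,
                  freeze_age: int) -> None:
    """Freeze versions whose creator is at least freeze_age txn ids old."""
    for v in versions:
        age = (current_xid - v.xmin) % XID_MOD
        if not v.frozen and precedes(v.xmin, current_xid) and age >= freeze_age:
            v.frozen = True


if __name__ == "__main__":
    v = TupleVersion(xmin=100)
    far_future = (100 + HALF + 1) % XID_MOD   # more than half the id space later
    print(created_before(v, far_future))      # compared before freezing
    vacuum_freeze([v], current_xid=1100, freeze_age=500)
    print(created_before(v, far_future))      # compared after freezing
```

Without the flag, a version created more than half the id space ago would compare as newer than the current txn, which is why the vacuum has to freeze old versions before the counter gets that far.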

Microsoft Hekaton
• Incubator project started in 2008 to create a new OLTP engine for MSFT SQL Server (MSSQL).
  - Reference
• Had to integrate with the MSSQL ecosystem.
• Had to support all possible OLTP workloads with predictable performance.
  - Single-threaded partitioning (e.g., H-Store/VoltDB) works well for some applications but is terrible for others.

Hekaton MVCC
• Each txn is assigned a timestamp when it begins (BeginTS) and when it commits (CommitTS).
• Each tuple contains two timestamps that represent its visibility and current state:
  - BEGIN-TS: The BeginTS of the active txn or the CommitTS of the committed txn that created it.
  - END-TS: The BeginTS of the active txn that created the next version, or infinity, or the CommitTS of the committed txn that created the next version.

Hekaton: Operations [slides 33-42]

Hekaton: Transaction State Map
• Global map of all txns’ states in the system:
  - ACTIVE: The txn is executing read/write operations.
  - VALIDATING: The txn has invoked commit and the DBMS is checking whether it is valid.
  - COMMITTED: The txn is finished but may not have updated its versions’ TS.
  - TERMINATED: The txn has updated the TS for all of the versions that it created.

Hekaton: Transaction Lifecycle [slide 44]

Hekaton: Transaction Meta-Data
• Read Set
  - Pointers to physical versions returned to access method.
• Write Set
  - Pointers to versions updated (old and new), versions deleted (old), and versions inserted (new).
• Scan Set
  - Stores enough information to repeat each scan operation and check its result.
• Commit Dependencies
  - List of txns that are waiting for this txn to finish.

Hekaton: Transaction Validation
• Read Stability
  - Check that each version read is still visible as of the end of the txn.
• Phantom Avoidance
  - Repeat each scan to check whether new versions have become visible since the txn began.
• Extent of validation depends on isolation level:
  - SERIALIZABLE: Read Stability + Phantom Avoidance
  - REPEATABLE READS: Read Stability
  - SNAPSHOT ISOLATION: None
  - READ COMMITTED: None

Hekaton: Optimistic vs. Pessimistic
• Optimistic Txns:
  - Check whether a version read is still visible at the end of the txn.
  - Repeat all index scans to check for phantoms.
• Pessimistic Txns:
  - Use shared & exclusive locks on records and buckets.
  - No validation is needed.
  - Separate background thread to detect deadlocks.

Hekaton: Optimistic vs. Pessimistic [slide 48]

Hekaton: Lessons
• Use only lock-free data structures
  - No latches, spin locks, or critical sections
  - Indexes, txn map, memory alloc, garbage collector
  - Example: Bw-Trees
• Only one single serialization point in the DBMS to get the txn’s begin and commit timestamp
  - Atomic Addition (CAS)

Observation
• Read/scan set validations are expensive if the txns access a lot of data.
• Appending new versions hurts the performance of OLAP scans due to pointer chasing & branching.
• Record-level conflict checks may be too coarse-grained and incur false positives.

HyPer

HyPer MVCC
• Column-store with delta record versioning.
• Reference
  - In-place updates for non-indexed attributes.
  - Delete/insert updates for indexed attributes.
  - Newest-to-oldest version chains.
  - No predicate locks / no scan checks.
• Avoids write-write conflicts by aborting txns that try to update an uncommitted object.

HyPer: Storage Architecture [slides 53-56]

HyPer: Validation
• First-Writer Wins
  - If version vector is not null, then it always points to the last committed version.
  - Do not need to check whether write-sets overlap.
• Check the redo buffers of txns that committed after the validating txn started.
  - Compare the committed txn’s write set for phantoms using Precision Locking.
  - Only need to store the txn’s read predicates and not its entire read set.
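
The validation step above can be sketched as follows. This is a minimal illustration, assuming that read predicates are stored as Python callables and that each committed txn's redo buffer is modeled as a plain list of row images. `ValidatingTxn`, `CommittedTxn`, `written_rows`, and `validate` are hypothetical names, not HyPer's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, int]                 # assumption: a row is a dict of int columns
Predicate = Callable[[Row], bool]    # a read predicate evaluated on one row


@dataclass
class CommittedTxn:
    commit_ts: int
    # Hypothetical stand-in for the redo buffer: images of the rows the txn
    # wrote (before- and after-images, so updates and deletes are covered).
    written_rows: List[Row] = field(default_factory=list)


@dataclass
class ValidatingTxn:
    start_ts: int
    # Only the predicates are kept, not the tuples that matched them.
    read_predicates: List[Predicate] = field(default_factory=list)


def validate(txn: ValidatingTxn, recently_committed: List[CommittedTxn]) -> bool:
    """Return False if a txn that committed after txn started wrote a row
    that one of txn's read predicates would have matched."""
    for other in recently_committed:
        if other.commit_ts <= txn.start_ts:
            continue  # these writes were already part of txn's snapshot
        for row in other.written_rows:
            if any(pred(row) for pred in txn.read_predicates):
                return False  # the earlier read result is now stale
    return True


if __name__ == "__main__":
    reader = ValidatingTxn(start_ts=10,
                           read_predicates=[lambda r: 20 < r["attr2"] < 30])
    unrelated = CommittedTxn(commit_ts=12, written_rows=[{"attr2": 99}])
    conflicting = CommittedTxn(commit_ts=15, written_rows=[{"attr2": 25}])
    print(validate(reader, [unrelated]))
    print(validate(reader, [unrelated, conflicting]))
```

Because only predicates are kept, the work done here grows with the number of predicates and the size of the recent write sets, not with the number of tuples the validating txn read.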

HyPer: Precision Locking [slides 58-60]