CMU 15-445/645 Database Systems (Fall 2018) :: Systems Potpourri

Final Review + Systems Potpourri
Lecture #26
Database Systems 15-445/15-645, Fall 2018
Andy Pavlo, Computer Science, Carnegie Mellon Univ.

ADMINISTRIVIA
Project #4: Monday Dec 10th @ 11:59pm
Extra Credit: Wednesday Dec 12th @ 11:59pm
Final Exam: Sunday Dec 16th @ 8:30am

FINAL EXAM
Who: You
What: http://cmudb.io/f18-final
When: Sunday Dec 16th @ 8:30am
Where: GHC 4401
Why: https://youtu.be/6yOH_FjeSAQ

FINAL EXAM
What to bring:
→ CMU ID
→ Calculator
→ Two pages of handwritten notes (double-sided)
Optional:
→ Spare change of clothes
What not to bring:
→ Your roommate

COURSE EVALS
Your feedback is strongly needed:
→ https://cmu.smartevals.com
Things that we want feedback on:
→ Homework Assignments
→ Projects
→ Reading Materials
→ Lectures

OFFICE HOURS
Andy:
→ Friday Dec. 14th @ 3:00pm-4:00pm

STUFF BEFORE MID-TERM
SQL
Buffer Pool Management
Hash Tables
B+Trees
Storage Models

PARALLEL EXECUTION
Inter-Query Parallelism
Intra-Query Parallelism
Inter-Operator Parallelism
Intra-Operator Parallelism

EMBEDDED LOGIC
User-defined Functions
Stored Procedures
Focus on advantages vs. disadvantages

TRANSACTIONS
ACID
Conflict Serializability:
→ How to check?
→ How to ensure?
View Serializability
Recoverable Schedules
Isolation Levels / Anomalies

TRANSACTIONS
Two-Phase Locking
→ Strict vs. Non-Strict
→ Deadlock Detection & Prevention
Multiple Granularity Locking
→ Intention Locks

TRANSACTIONS
Timestamp Ordering Concurrency Control
→ Thomas Write Rule
Optimistic Concurrency Control
→ Read Phase
→ Validation Phase
→ Write Phase
Multi-Version Concurrency Control
→ Version Storage / Ordering
→ Garbage Collection

CRASH RECOVERY
Buffer Pool Policies:
→ STEAL vs. NO-STEAL
→ FORCE vs. NO-FORCE
Write-Ahead Logging
Logging Schemes
Checkpoints
ARIES Recovery
→ Log Sequence Numbers
→ CLRs

DISTRIBUTED DATABASES
System Architectures
Replication
Partitioning Schemes
Two-Phase Commit

2015                    2016                    2017                    2018
MongoDB            32   MongoDB            33   Google Spanner/F1  15   CockroachDB        26
Google Spanner/F1  22   Google Spanner/F1  22   MongoDB            14   Google Spanner/F1  25
LinkedIn Espresso  16   Apache Cassandra   19   CockroachDB        10   MongoDB            24
Apache Cassandra   16   Facebook Scuba     17   Apache Hbase        9   Amazon Aurora      18
Facebook Scuba     16   Redis              16   Peloton             8   Redis              18
Apache Hbase       14   Apache Hbase       15   Facebook Scuba      6   Apache Cassandra   17
VoltDB             10   CockroachDB        12   Cloudera Impala     6   ElasticSearch      12
Redis              10   LinkedIn Espresso  11   Apache Hive         6   Apache Hive        11
Vertica             5   Cloudera Impala     8   Apache Cassandra    5   Facebook Scuba     10
Cloudera Impala     5   Peloton             7   LinkedIn Espresso   5   MySQL              10

COCKROACHDB
Started in 2015 by ex-Google employees.
Open-source (Apache Licensed)
Decentralized shared-nothing architecture.
Log-structured on-disk storage (RocksDB)
Concurrency Control:
→ MVCC + OCC
→ Serializable isolation only

DISTRIBUTED ARCHITECTURE
Multi-layer architecture on top of a replicated key-value store.
→ All tables and indexes are stored in a giant sorted map in the k/v store.
Uses RocksDB as the storage manager at each node.
Raft protocol (variant of Paxos) for replication and consensus.
[Figure: layer stack labeled SQL Layer, Transactional Key-Value, Router, Replication, Storage]

CONCURRENCY CONTROL
DBMS uses hybrid clocks (physical + logical) to order transactions globally.
→ Synchronized wall clock with local counter.
Txns stage writes as "intents" and then check for conflicts on commit.
All meta-data about txn state resides in the key-value store.
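
The "synchronized wall clock with local counter" idea on the CockroachDB concurrency control slide can be sketched as below. This is a minimal illustration in the style of the published hybrid logical clock algorithm, not CockroachDB's own code. The class name HybridClock, its two methods, and the injectable physical_now callable are assumptions made for this example.

```python
import time
from typing import Callable, Tuple

# A timestamp is (wall-clock reading, logical counter); tuples compare
# lexicographically, which gives a total order over timestamps.
Timestamp = Tuple[int, int]


class HybridClock:
    """Hypothetical hybrid clock: physical time plus a tie-breaking counter."""

    def __init__(self, physical_now: Callable[[], int] = time.time_ns) -> None:
        self._physical_now = physical_now  # assumed loosely synchronized clock
        self._wall = 0                     # largest wall-clock value seen so far
        self._logical = 0                  # counter for events within one wall value

    def now(self) -> Timestamp:
        """Timestamp a local event or an outgoing message."""
        pt = self._physical_now()
        if pt > self._wall:
            # Physical time moved forward: adopt it and reset the counter.
            self._wall, self._logical = pt, 0
        else:
            # Physical time stalled or went backwards: bump the counter instead.
            self._logical += 1
        return (self._wall, self._logical)

    def update(self, remote: Timestamp) -> Timestamp:
        """Merge a timestamp received from another node."""
        pt = self._physical_now()
        r_wall, r_logical = remote
        if pt > self._wall and pt > r_wall:
            self._wall, self._logical = pt, 0
        elif r_wall > self._wall:
            # The remote node is ahead: jump to its wall value, just past its counter.
            self._wall, self._logical = r_wall, r_logical + 1
        elif self._wall > r_wall:
            self._logical += 1
        else:
            self._logical = max(self._logical, r_logical) + 1
        return (self._wall, self._logical)
```

With a fake physical clock that returns 100, 100, 90, 200 on successive reads, the calls now(), now(), update((150, 4)), now() produce (100, 0), (100, 1), (150, 5), (200, 0). Every timestamp is larger than the one before it even though the physical clock stalls once and runs behind the remote node once.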
COCKROACHDB OVERVIEW
[Figure: an Application issues requests through a Key/Value API to a Global Database Keyspace (Logical). The keyspace is divided into System (Key→Location) and Table1, Index1, Table2 (Key→Data), and is spread over Node 1, Node 2, and Node 3. One node is marked Leader, with Raft links between the nodes.]

GOOGLE SPANNER
Google’s geo-replicated DBMS (>2011)
Schematized, semi-relational data model.
Decentralized shared-disk architecture.
Log-structured on-disk storage.
Concurrency Control:
→ Strict 2PL + MVCC + Multi-Paxos + 2PC
→ Externally consistent global write-transactions with synchronous replication.
→ Lock-free read-only transactions.

PHYSICAL DENORMALIZATION
    CREATE TABLE users {
      uid INT NOT NULL,
      email VARCHAR,
      PRIMARY KEY (uid)
    };
    CREATE TABLE albums {
      uid INT NOT NULL,
      aid INT NOT NULL,
      name VARCHAR,
      PRIMARY KEY (uid, aid)
    } INTERLEAVE IN PARENT users
      ON DELETE CASCADE;

Physical Storage
    users(1001)
      ⤷ albums(1001, 9990)
      ⤷ albums(1001, 9991)
    users(1002)
      ⤷ albums(1002, 6631)
      ⤷ albums(1002, 6634)

CONCURRENCY CONTROL
MVCC + Strict 2PL with Wound-Wait Deadlock Prevention
Ensures ordering through globally unique timestamps generated from atomic clocks and GPS devices.
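
The wound-wait rule named on the Spanner concurrency control slide can be illustrated with a toy lock table. This is a sketch of the textbook rule only and makes no claim about Spanner's internals. It assumes exclusive locks and that each transaction is identified by its start timestamp, with a smaller timestamp meaning an older, higher-priority transaction. WoundWaitLockTable and its fields are hypothetical names.

```python
from typing import Dict, List


class WoundWaitLockTable:
    """Toy exclusive-lock table applying wound-wait deadlock prevention."""

    def __init__(self) -> None:
        self.holder: Dict[str, int] = {}         # key -> timestamp of the lock holder
        self.waiters: Dict[str, List[int]] = {}  # key -> timestamps of queued txns
        self.aborted: List[int] = []             # txns that have been wounded

    def acquire(self, txn_ts: int, key: str) -> str:
        owner = self.holder.get(key)
        if owner is None or owner == txn_ts:
            self.holder[key] = txn_ts
            return "granted"
        if txn_ts < owner:
            # Requester is older: wound (abort) the younger holder, which
            # gives up every lock it owns, then take the lock.
            self.aborted.append(owner)
            for k in [k for k, t in self.holder.items() if t == owner]:
                del self.holder[k]
            self.holder[key] = txn_ts
            return "wounded holder"
        # Requester is younger: it waits for the older holder to finish.
        self.waiters.setdefault(key, []).append(txn_ts)
        return "wait"
```

Because a transaction only ever waits for an older one, the waits-for graph cannot contain a cycle, which is why the scheme prevents deadlocks rather than detecting them.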
Database is broken up into tablets:
→ Use Paxos to elect leader in tablet group.
→ Use 2PC for txns that span tablets.

SPANNER TABLETS
[Figure: Tablet A is replicated in Data Center 1, Data Center 2, and Data Center 3, forming one Paxos group. The replica marked Leader handles Writes + Reads; the other replicas handle Snapshot Reads. 2PC runs between this group and the Paxos groups for Tablets B-Z.]

TRANSACTION ORDERING
Spanner orders transactions based on physical "wall-clock" time.
→ This is necessary to guarantee linearizability.
→ If T1 finishes before T2, then T2 should see the result of T1.
Each Paxos group decides in what order transactions should be committed according to the timestamps.
→ If T1 commits at time1 and T2 starts at time2 > time1, then T1's timestamp should be less than T2's.

SPANNER TRUETIME
The DBMS maintains a global wall-clock time across all data centers with bounded uncertainty.
Timestamps are intervals, not single values
[Figure: on a TIME axis, TT.now() returns an interval from earliest to latest of width 2*ε.]

SPANNER TRUETIME
[Figure: commit timeline on a TIME axis. Acquire Locks; pick Commit Timestamp s with s > TT.now().latest; Commit Wait until TT.now().earliest > s; Commit + Release Locks. The two halves of the timeline are each labeled "average ε".]

GOOGLE F1 (2013)
OCC engine built on top of Spanner.
→ In the read phase, F1 returns the last modified timestamp with each row. No locks.
→ The timestamp for a row is stored in a hidden lock column. The client library returns these timestamps to the F1 server.
→ If the timestamps differ from the current timestamps at the time of commit, the transaction is aborted.

GOOGLE CLOUD SPANNER (2017)
Spanner Database-as-a-Service.
AFAIK, it is based on Spanner SQL not F1.

MONGODB
Distributed document DBMS started in 2007.
→ Document → Tuple
→ Collection → Table/Relation
Open-source (Server Side Public License)
Centralized shared-nothing architecture.
Concurrency Control:
→ OCC with multi-granular locking

PHYSICAL DENORMALIZATION
A customer has orders and each order has order items.
    Customers      R1(custId,name,…)
    ⨝ Orders       R2(orderId,custId,…)
    ⨝ Order Items  R3(itemId,orderId,…)
[Figure: each Customer nests its Orders, and each Order nests its Order Items.]
    {
      "custId": 1234,
      "custName": "Andy",
      "orders": [
        { "orderId": 9999,
          "orderItems": [
            { "itemId": "XXXX", "price": 19.99 },
            { "itemId": "YYYY", "price": 29.99 }
          ] }
      ]
    }

QUERY EXECUTION
JSON-only query API
No cost-based query planner / optimizer.
→ Heuristic-based + "random walk" optimization.
JavaScript UDFs (not encouraged).
Supports server-side joins (only left-outer?).
Multi-document transactions (new in 2018).

DISTRIBUTED ARCHITECTURE
Heterogeneous distributed components.
→ Shared nothing architecture
→ Centralized query router.
Master-slave replication.
Auto-sharding:
→ Define 'partitioning' attributes for each collection (hash or range).
→ When a shard
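
The two partitioning options named in the auto-sharding bullet (hash or range on a partitioning attribute) can be sketched as follows. This is a generic illustration under stated assumptions, not MongoDB's routing code. The function names hash_shard and range_shard and the split_points parameter are hypothetical, and md5 is used only because it gives the same result on every run, not as a claim about MongoDB's hash function.

```python
import bisect
import hashlib
from typing import List


def hash_shard(key: str, num_shards: int) -> int:
    """Hash partitioning: spread keys evenly, at the cost of losing key order."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def range_shard(key: int, split_points: List[int]) -> int:
    """Range partitioning over sorted split points.

    Shard 0 holds keys below split_points[0], shard i holds keys in
    [split_points[i-1], split_points[i]), and the last shard holds the rest.
    """
    return bisect.bisect_right(split_points, key)


if __name__ == "__main__":
    # Range partitioning keeps neighboring keys on the same shard.
    print([range_shard(k, [10, 20]) for k in (5, 10, 19, 25)])
    # Hash partitioning assigns each key a stable shard in 0..3.
    print([hash_shard(f"cust-{i}", 4) for i in range(5)])
```

Range partitioning keeps range scans on the partitioning attribute local to a few shards, while hash partitioning trades that locality for a more even spread of keys.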
