class 20 updates 2.0 prof. Stratos Idreos

HTTP://DASLAB.SEAS.HARVARD.EDU/CLASSES/CS165/ UPDATE table_name SET column1=value1,column2=value2,... WHERE some_column=some_value

INSERT INTO table_name VALUES (value1,value2,value3,...)

updates

CS165, Fall 2015 Stratos Idreos 2 /34 the world has changed a little bit by now…

updates

CS165, Fall 2015 Stratos Idreos 3 /34 monitor CPU utilization

monitor memory hierarchy utilization

monitor clicks (frequency, locations, specific links, sequences)

CS165, Fall 2015 Stratos Idreos 4 /34 new entry (a,b,c,d,…) on x N columns, K trees, statistics, … table x A B C D A

D … B

CS165, Fall 2015 Stratos Idreos 5 /34 updates reads

A B C D A B C D

periodic

and/or on-the-fly merge

write optimized-store read optimized-store

A case for fractured mirrors Ravishankar Ramamurthy, David J. DeWitt, Qi Su Very Large Journal (VLDBJ), 2003

CS165, Fall 2015 Stratos Idreos 6 /34 L1 L2

CS165, Fall 2015 Stratos Idreos 7 /34 buffer L1 L2

any problems

CS165, Fall 2015 Stratos Idreos 8 /34 L1 L2

CS165, Fall 2015 Stratos Idreos 9 /34 c0 c1 c3 … L1 L2

c4

c5

c6 …

CS165, Fall 2015 Stratos Idreos 10/34 in-place updates: the cardinal sin

A(t1) A(t2) A(t3)

The Transaction Concept: Virtues and Limitations Jim Gray, Tandem TR 81.3, 1981 …

CS165, Fall 2015 Stratos Idreos 11/34 what can go wrong? failures & >1 queries

CS165, Fall 2015 Stratos Idreos 12/34 update all rows A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

CS165, Fall 2015 Stratos Idreos 13/34 update all rows A B C D where A=v1 & B=v2 search (scan/index) to (a=a/2,b=b/4,c=c-3,d=d+2) to find row to update

+project actions

CS165, Fall 2015 Stratos Idreos 13/34 update all rows A B C D where A=v1 & B=v2 search (scan/index) to (a=a/2,b=b/4,c=c-3,d=d+2) to find row to update

select+project actions

A B C D

list of rowIDs (positions) (sort)

CS165, Fall 2015 Stratos Idreos 13/34 update all rows A B C D where A=v1 & B=v2 search (scan/index) to (a=a/2,b=b/4,c=c-3,d=d+2) to find row to update

select+project actions

A B C D

list of rowIDs (positions) (sort)

we know what to update but nothing happened yet

CS165, Fall 2015 Stratos Idreos 13/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

CPU

level 1 read page in L1 update level 2 persist to L2 A B C D if problem (power/abort) before we write all pages we are left with an inconsistent state

WAL: keep persistent notes as we go so we can resume or undo

CS165, Fall 2015 Stratos Idreos 14/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

CPU read page in L1 update persist to L2 level 1

level 2 A B C D

CS165, Fall 2015 Stratos Idreos 15/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

CPU read page in L1 update persist to L2 level 1

level 2 A B C D

CS165, Fall 2015 Stratos Idreos 15/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

CPU read page in L1 update persist to L2 level 1

level 2 A B C D

CS165, Fall 2015 Stratos Idreos 15/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

CPU apply all updates read page in L1 update for this page persist to L2 level 1

level 2 A B C D

CS165, Fall 2015 Stratos Idreos 15/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

CPU apply all updates read page in L1 update for this page persist to L2 level 1

level 2 A B C D

N updates in persistent storage

CS165, Fall 2015 Stratos Idreos 15/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

CPU apply all updates read page in L1 update for this page persist to L2 level 1

level 2 A B C D

N updates in persistent storage

(update in place/out of place) problems when failure

CS165, Fall 2015 Stratos Idreos 15/34 recovery classic example joe owes mike 100$ both joe and mike have a Bank of Bla account

possible actions

joe -100 mike + 100

mike + 100 joe - 100

what if there is a failure here?

CS165, Fall 2015 Stratos Idreos 16/34 recovery classic example joe owes mike 100$ both joe and mike have a Bank of Bla account

actually possible actions 1)read mike; 2) mike+100; 3) write new mike;

joe -100 mike + 100

mike + 100 joe - 100

what if there is a failure here?

CS165, Fall 2015 Stratos Idreos 16/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

CPU read page in L1 update persist to L2 level 1

level 2 A B C D

what do we need to remember (log)

CS165, Fall 2015 Stratos Idreos 17/34 update all rows A B C D where A=v1 & B=v2 search (scan/index) to (a=a/2,b=b/4,c=c-3,d=d+2) to find row to update

select+project actions

A B C D

list of rowIDs (positions) (sort)

restart: get set of positions again, and either resume last written page or undo all previously written pages

CS165, Fall 2015 Stratos Idreos 18/34 buffer updates buffer updates buffer updates

log log log memory memory memory

disk SSD non volatile memory

disk SSD

disk

CS165, Fall 2015 Stratos Idreos 19/34 flash disks read/write asymmetry

need to erase a block to write it

finite number of erase cycles

MaSM: efficient online updates in data warehouses Manos Athanassoulis, Shimin Chen, Anastasia Ailamaki, Phillip Gibbons, Radu Stoica ACM SIGMOD Conference, 2011

CS165, Fall 2015 Stratos Idreos 20/34 C Mohan, IBM Research ACM SIGMOD Edgar F. Codd Innovations award 1993

(rumor has it he has his own Facebook server) (also check his noSQL lecture)

ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging C. Mohan, Donald J. Haderle, Bruce G. Lindsay, Hamid Pirahesh, Peter M. Schwarz ACM Transactions on Systems, 1992

CS165, Fall 2015 Stratos Idreos 21/34 recovery classic example joe owes mike 100$ both joe andΧ mike have a Bank of Bla account

what about logging and recovery during e-shopping e.g., shopping cart, wish list, check out

Quantifying eventual consistency with PBS Peter Bailis, Shivaram Venkataraman, Michael J. Franklin, Joseph M. Hellerstein, Ion Stoica Communications of the ACM, 2014 CS165, Fall 2015 Stratos Idreos 22/34 db noSQL why noSQL simple complex clean legacy just enough tuning … expensive …

Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing Ashish Gupta et al. International Conference on as apps become Very Large Databases (VLDB), 2014 more complex as apps need to be F1: A Distributed SQL Database That Scales Jeff Shute et al. more scalable International Conference on Very Large Databases (VLDB), 2013 newSQL

(more in 265) CS165, Fall 2015 Stratos Idreos 23/34 all or nothing remember what we changed so we can undo or resume (depends on applications semantics)

CS165, Fall 2015 Stratos Idreos 24/34 parse optimize transaction any database query that plan( can/should be seen as a single task read X write Z … )

CS165, Fall 2015 Stratos Idreos 25/34 Atomicity Consistency Isolation Durability

Jim Gray, IBM, Tandem, DEC, Microsoft ACM Turing award ACM SIGMOD Edgar F. Codd Innovations award

CS165, Fall 2015 Stratos Idreos 26/34 all or nothing

Atomicity correct Consistency Isolation >1 queries Durability persistent

CS165, Fall 2015 Stratos Idreos 27/34 A B C D

queries arrive concurrently Goal: q1) select A>v1, B=B+b, C=C-c process queries in parallel q2) select B>v2, avg(C) leave db in a consistent state q3) select Cv4, A=C return correct results q4) write more queries

CS165, Fall 2015 Stratos Idreos 28/34 query 1 query 2

parse parse optimize optimize

plan( plan( read X read Z write Z write Z … … ) )

CS165, Fall 2015 Stratos Idreos 29/34 locks wait until lock is free

… … get_rlock(X) R W read X read X R yes no release_rlock(X) … W no no …

… … write X get_wlock(X) … write X release_wlock(X) … commit

CS165, Fall 2015 Stratos Idreos 30/34 2 phase locking 1. get all locks

2. do all tasks

3. commit

4. release all locks

(and variations)

HT

be positive! - optimistic concurrency control

CS165, Fall 2015 Stratos Idreos 31/34 lock granularity database table1 table2

C1 C2

… …

A survey of B-tree logging and recovery techniques Goetz Graefe ACM Trans. Database Systems, 2012 CS165, Fall 2015 Stratos Idreos 32/34 textbook: chapters 16, 17, 18

CS165, Fall 2015 Stratos Idreos 33/34 class 20

updates 2.0 DATA SYSTEMS prof. Stratos Idreos