class 20 updates 2.0 prof. Stratos Idreos
HTTP://DASLAB.SEAS.HARVARD.EDU/CLASSES/CS165/ UPDATE table_name SET column1=value1,column2=value2,... WHERE some_column=some_value
INSERT INTO table_name VALUES (value1,value2,value3,...)
updates
CS165, Fall 2015 Stratos Idreos 2 /34 the world has changed a little bit by now…
updates
CS165, Fall 2015 Stratos Idreos 3 /34 monitor CPU utilization
monitor memory hierarchy utilization
monitor clicks (frequency, locations, specific links, sequences)
CS165, Fall 2015 Stratos Idreos 4 /34 insert new entry (a,b,c,d,…) on table x update N columns, K trees, statistics, … table x A B C D A
D … B
CS165, Fall 2015 Stratos Idreos 5 /34 updates reads
A B C D A B C D
periodic merge
and/or on-the-fly merge
write optimized-store read optimized-store
A case for fractured mirrors Ravishankar Ramamurthy, David J. DeWitt, Qi Su Very Large Databases Journal (VLDBJ), 2003
CS165, Fall 2015 Stratos Idreos 6 /34 L1 L2
CS165, Fall 2015 Stratos Idreos 7 /34 buffer L1 L2
any problems
CS165, Fall 2015 Stratos Idreos 8 /34 L1 L2
CS165, Fall 2015 Stratos Idreos 9 /34 c0 c1 c3 … L1 L2
c4
c5
c6 …
CS165, Fall 2015 Stratos Idreos 10/34 in-place updates: the cardinal sin
A(t1) A(t2) A(t3)
The Transaction Concept: Virtues and Limitations Jim Gray, Tandem TR 81.3, 1981 …
CS165, Fall 2015 Stratos Idreos 11/34 what can go wrong? failures & >1 queries
CS165, Fall 2015 Stratos Idreos 12/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)
CS165, Fall 2015 Stratos Idreos 13/34 update all rows A B C D where A=v1 & B=v2 search (scan/index) to (a=a/2,b=b/4,c=c-3,d=d+2) to find row to update
select+project actions
CS165, Fall 2015 Stratos Idreos 13/34 update all rows A B C D where A=v1 & B=v2 search (scan/index) to (a=a/2,b=b/4,c=c-3,d=d+2) to find row to update
select+project actions
A B C D
list of rowIDs (positions) (sort)
CS165, Fall 2015 Stratos Idreos 13/34 update all rows A B C D where A=v1 & B=v2 search (scan/index) to (a=a/2,b=b/4,c=c-3,d=d+2) to find row to update
select+project actions
A B C D
list of rowIDs (positions) (sort)
we know what to update but nothing happened yet
CS165, Fall 2015 Stratos Idreos 13/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)
CPU
level 1 read page in L1 update level 2 persist to L2 A B C D if problem (power/abort) before we write all pages we are left with an inconsistent state
WAL: keep persistent notes as we go so we can resume or undo
CS165, Fall 2015 Stratos Idreos 14/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)
CPU read page in L1 update persist to L2 level 1
level 2 A B C D
CS165, Fall 2015 Stratos Idreos 15/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)
CPU read page in L1 update persist to L2 level 1
level 2 A B C D
CS165, Fall 2015 Stratos Idreos 15/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)
CPU read page in L1 update persist to L2 level 1
level 2 A B C D
CS165, Fall 2015 Stratos Idreos 15/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)
CPU apply all updates read page in L1 update for this page persist to L2 level 1
level 2 A B C D
CS165, Fall 2015 Stratos Idreos 15/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)
CPU apply all updates read page in L1 update for this page persist to L2 level 1
level 2 A B C D
N updates in persistent storage
CS165, Fall 2015 Stratos Idreos 15/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)
CPU apply all updates read page in L1 update for this page persist to L2 level 1
level 2 A B C D
N updates in persistent storage
(update in place/out of place) problems when failure
CS165, Fall 2015 Stratos Idreos 15/34 recovery classic example joe owes mike 100$ both joe and mike have a Bank of Bla account
possible actions
joe -100 mike + 100
mike + 100 joe - 100
what if there is a failure here?
CS165, Fall 2015 Stratos Idreos 16/34 recovery classic example joe owes mike 100$ both joe and mike have a Bank of Bla account
actually possible actions 1)read mike; 2) mike+100; 3) write new mike;
joe -100 mike + 100
mike + 100 joe - 100
what if there is a failure here?
CS165, Fall 2015 Stratos Idreos 16/34 update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)
CPU read page in L1 update persist to L2 level 1
level 2 A B C D
what do we need to remember (log)
CS165, Fall 2015 Stratos Idreos 17/34 update all rows A B C D where A=v1 & B=v2 search (scan/index) to (a=a/2,b=b/4,c=c-3,d=d+2) to find row to update
select+project actions
A B C D
list of rowIDs (positions) (sort)
restart: get set of positions again, and either resume from last written page or undo all previously written pages
CS165, Fall 2015 Stratos Idreos 18/34 buffer updates buffer updates buffer updates
log log log memory memory memory
disk SSD non volatile memory
disk SSD
disk
CS165, Fall 2015 Stratos Idreos 19/34 flash disks read/write asymmetry
need to erase a block to write it
finite number of erase cycles
MaSM: efficient online updates in data warehouses Manos Athanassoulis, Shimin Chen, Anastasia Ailamaki, Phillip Gibbons, Radu Stoica ACM SIGMOD Conference, 2011
CS165, Fall 2015 Stratos Idreos 20/34 C Mohan, IBM Research ACM SIGMOD Edgar F. Codd Innovations award 1993
(rumor has it he has his own Facebook server) (also check his noSQL lecture)
ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging C. Mohan, Donald J. Haderle, Bruce G. Lindsay, Hamid Pirahesh, Peter M. Schwarz ACM Transactions on Database Systems, 1992
CS165, Fall 2015 Stratos Idreos 21/34 recovery classic example joe owes mike 100$ both joe andΧ mike have a Bank of Bla account
what about logging and recovery during e-shopping e.g., shopping cart, wish list, check out
Quantifying eventual consistency with PBS Peter Bailis, Shivaram Venkataraman, Michael J. Franklin, Joseph M. Hellerstein, Ion Stoica Communications of the ACM, 2014 CS165, Fall 2015 Stratos Idreos 22/34 db noSQL why noSQL simple complex clean legacy just enough tuning … expensive …
Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing Ashish Gupta et al. International Conference on as apps become Very Large Databases (VLDB), 2014 more complex as apps need to be F1: A Distributed SQL Database That Scales Jeff Shute et al. more scalable International Conference on Very Large Databases (VLDB), 2013 newSQL
(more in 265) CS165, Fall 2015 Stratos Idreos 23/34 all or nothing remember what we changed so we can undo or resume (depends on applications semantics)
CS165, Fall 2015 Stratos Idreos 24/34 parse optimize transaction any database query that plan( can/should be seen as a single task read X write Z … )
CS165, Fall 2015 Stratos Idreos 25/34 Atomicity Consistency Isolation Durability
Jim Gray, IBM, Tandem, DEC, Microsoft ACM Turing award ACM SIGMOD Edgar F. Codd Innovations award
CS165, Fall 2015 Stratos Idreos 26/34 all or nothing
Atomicity correct Consistency Isolation >1 queries Durability persistent
CS165, Fall 2015 Stratos Idreos 27/34 A B C D
queries arrive concurrently Goal: q1) select A>v1, B=B+b, C=C-c process queries in parallel q2) select B>v2, avg(C) leave db in a consistent state q3) select C
CS165, Fall 2015 Stratos Idreos 28/34 query 1 query 2
parse parse optimize optimize
plan( plan( read X read Z write Z write Z … … ) )
CS165, Fall 2015 Stratos Idreos 29/34 locks wait until lock is free
… … get_rlock(X) R W read X read X R yes no release_rlock(X) … W no no …
… … write X get_wlock(X) … write X release_wlock(X) … commit
CS165, Fall 2015 Stratos Idreos 30/34 2 phase locking 1. get all locks
2. do all tasks
3. commit
4. release all locks
(and variations)
HT
be positive! - optimistic concurrency control
CS165, Fall 2015 Stratos Idreos 31/34 lock granularity database table1 table2
C1 C2
… …
A survey of B-tree logging and recovery techniques Goetz Graefe ACM Trans. Database Systems, 2012 CS165, Fall 2015 Stratos Idreos 32/34 textbook: chapters 16, 17, 18
CS165, Fall 2015 Stratos Idreos 33/34 class 20
updates 2.0 DATA SYSTEMS prof. Stratos Idreos