From Logical Plans to Physical Plans

CSE 344 - Winter 2015

Query Evaluation Steps

SQL query
  → Parse & Check Query: check syntax, access control, names, etc.; translate query string into internal representation
  → Decide how best to answer query: query optimization (logical plan → physical plan)
  → Query Execution (Query Evaluation)
  → Return Results

Example

Supplier(sid, sname, scity, sstate)
Supply(sid, pno, quantity)

    SELECT sname
    FROM Supplier x, Supply y
    WHERE x.sid = y.sid
      and y.pno = 2
      and x.scity = 'Seattle'
      and x.sstate = 'WA'

Give a relational algebra expression for this query.

Relational Algebra

π_sname(σ_{scity='Seattle' ∧ sstate='WA' ∧ pno=2}(Supplier ⋈_{sid=sid} Supply))

Drawn as an operator tree (root at the top, inputs at the leaves):

    π_sname
      σ_{scity='Seattle' ∧ sstate='WA' ∧ pno=2}
        ⋈_{sid=sid}
          Supplier
          Supply

The relational algebra expression is also called the "logical query plan".
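To make the logical plan concrete, here is a minimal Python sketch that evaluates the tree bottom-up: join, then select, then project. The helper names (`join`, `select`, `project`, `logical_plan`) and the toy rows are assumptions made up for illustration; they are not from the lecture, and the sketch says nothing yet about how a real engine would implement each operator.

```python
# Toy data invented for illustration; not from the lecture.
SUPPLIER = [
    {"sid": 1, "sname": "Acme", "scity": "Seattle", "sstate": "WA"},
    {"sid": 2, "sname": "Bolt", "scity": "Portland", "sstate": "OR"},
    {"sid": 3, "sname": "Cogs", "scity": "Seattle", "sstate": "WA"},
]
SUPPLY = [
    {"sid": 1, "pno": 2, "quantity": 10},
    {"sid": 2, "pno": 2, "quantity": 5},
    {"sid": 3, "pno": 7, "quantity": 1},
]


def join(left, right, attr):
    """Equi-join: pair every left row with every right row that agrees on attr."""
    return [{**l, **r} for l in left for r in right if l[attr] == r[attr]]


def select(rows, pred):
    """Selection (σ): keep the rows that satisfy the predicate."""
    return [row for row in rows if pred(row)]


def project(rows, attrs):
    """Projection (π): keep only the listed attributes (duplicates are kept)."""
    return [{a: row[a] for a in attrs} for row in rows]


def logical_plan(supplier, supply):
    """Evaluate π_sname(σ_{...}(Supplier ⋈ Supply)) bottom-up."""
    joined = join(supplier, supply, "sid")
    filtered = select(
        joined,
        lambda t: t["scity"] == "Seattle" and t["sstate"] == "WA" and t["pno"] == 2,
    )
    return project(filtered, ["sname"])


result = logical_plan(SUPPLIER, SUPPLY)
print(result)
```

Each operator takes relations in and returns a relation, which is what lets the operators compose into a tree.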

Physical Query Plan 1

A physical query plan is a logical query plan annotated with physical implementation details.

    π_sname                                          (On the fly)
      σ_{scity='Seattle' ∧ sstate='WA' ∧ pno=2}      (On the fly)
        ⋈_{sid=sid}                                  (Block-nested loop)
          Supplier                                   (File scan)
          Supply                                     (File scan)

Physical Query Plan 2

Different but equivalent logical query plan; different physical plan.

    (d) π_sname                                      (On the fly)
      (c) ⋈_{sid=sid}                                (Sort-merge join)
        (a) σ_{scity='Seattle' ∧ sstate='WA'}        (Scan, write to T1)
              Supplier                               (File scan)
        (b) σ_{pno=2}                                (Scan, write to T2)
              Supply                                 (File scan)
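The two plans above can be sketched in Python to check that they return the same answer. This is an illustrative in-memory sketch under stated assumptions: the function names, the `block_size` parameter, and the toy rows are hypothetical, and "blocks" here are just list slices rather than disk pages.

```python
# Toy data invented for illustration; not from the lecture.
SUPPLIER = [
    {"sid": 1, "sname": "Acme", "scity": "Seattle", "sstate": "WA"},
    {"sid": 2, "sname": "Bolt", "scity": "Portland", "sstate": "OR"},
    {"sid": 3, "sname": "Cogs", "scity": "Seattle", "sstate": "WA"},
    {"sid": 4, "sname": "Dyne", "scity": "Seattle", "sstate": "WA"},
]
SUPPLY = [
    {"sid": 1, "pno": 2, "quantity": 10},
    {"sid": 2, "pno": 2, "quantity": 5},
    {"sid": 3, "pno": 7, "quantity": 1},
    {"sid": 4, "pno": 2, "quantity": 8},
    {"sid": 1, "pno": 2, "quantity": 3},
]


def block_nested_loop_join(outer, inner, attr, block_size=2):
    """Read the outer input one block at a time; scan all of inner per block."""
    out = []
    for start in range(0, len(outer), block_size):
        block = outer[start:start + block_size]
        for r in inner:  # one full scan of the inner input per outer block
            for l in block:
                if l[attr] == r[attr]:
                    out.append({**l, **r})
    return out


def sort_merge_join(left, right, attr):
    """Sort both inputs on attr, then merge, pairing runs of equal keys."""
    left = sorted(left, key=lambda t: t[attr])
    right = sorted(right, key=lambda t: t[attr])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][attr], right[j][attr]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][attr] == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][attr] == lk:
                out.extend({**left[i], **r} for r in right[j:j_end])
                i += 1
            j = j_end
    return out


def plan1(supplier, supply):
    """Plan 1: block-nested loop join, then select and project on the fly."""
    joined = block_nested_loop_join(supplier, supply, "sid")
    return [
        t["sname"]
        for t in joined
        if t["scity"] == "Seattle" and t["sstate"] == "WA" and t["pno"] == 2
    ]


def plan2(supplier, supply):
    """Plan 2: select first (a, b), sort-merge join (c), project (d)."""
    t1 = [s for s in supplier if s["scity"] == "Seattle" and s["sstate"] == "WA"]
    t2 = [y for y in supply if y["pno"] == 2]
    joined = sort_merge_join(t1, t2, "sid")
    return [t["sname"] for t in joined]


names1 = plan1(SUPPLIER, SUPPLY)
names2 = plan2(SUPPLIER, SUPPLY)
print(sorted(names1), sorted(names2))
```

The results are compared after sorting because the two plans may emit rows in different orders. Plan 2 filters before joining, so its join sees fewer rows than the join in plan 1.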

Physical Query Plan 3

Another logical plan that produces the same result and is implemented with a different physical plan.

    (d) π_sname                                      (On the fly)
      (c) σ_{scity='Seattle' ∧ sstate='WA'}          (On the fly)
        (b) ⋈_{sid=sid}                              (Index nested loop)
          (a) σ_{pno=2}                              (Use index)
                Supply                               (Index lookup on pno; assume: clustered)
              Supplier                               (Index lookup on sid; doesn't matter if clustered or not)

Physical Data Independence

• Means that applications are insulated from changes in physical storage details
  – E.g., can add/remove indexes without changing apps
  – Can do other physical tunings for performance

• SQL and relational algebra facilitate physical data independence because both languages are “set-at-a-time”: Relations as input and output
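As a small illustration of the bullets above, the following sketch uses Python's standard `sqlite3` module to run the example query before adding indexes, after adding them, and after dropping them again. The toy rows and the index names `supply_pno` and `supplier_sid` are assumptions chosen for this example. SQLite is only a convenient stand-in here; the plan text it reports varies by version and is not asserted.

```python
import sqlite3

# The application's query; it is never modified below.
QUERY = """
SELECT sname
FROM Supplier x, Supply y
WHERE x.sid = y.sid
  AND y.pno = 2
  AND x.scity = 'Seattle'
  AND x.sstate = 'WA'
"""

conn = sqlite3.connect(":memory:")
# Toy data invented for illustration; not from the lecture.
conn.executescript("""
CREATE TABLE Supplier(sid INTEGER, sname TEXT, scity TEXT, sstate TEXT);
CREATE TABLE Supply(sid INTEGER, pno INTEGER, quantity INTEGER);
INSERT INTO Supplier VALUES
  (1, 'Acme', 'Seattle', 'WA'),
  (2, 'Bolt', 'Portland', 'OR'),
  (3, 'Cogs', 'Seattle', 'WA');
INSERT INTO Supply VALUES (1, 2, 10), (2, 2, 5), (3, 7, 1);
""")


def run(conn):
    """Run the unchanged application query and return the sorted names."""
    return sorted(row[0] for row in conn.execute(QUERY))


def plan(conn):
    """Return the detail text of the plan SQLite chooses (last column)."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]


before, plan_before = run(conn), plan(conn)

# Physical tuning: add indexes. The query text stays the same.
conn.execute("CREATE INDEX supply_pno ON Supply(pno)")
conn.execute("CREATE INDEX supplier_sid ON Supplier(sid)")
after, plan_after = run(conn), plan(conn)

# Remove the indexes again; the query still works.
conn.execute("DROP INDEX supply_pno")
conn.execute("DROP INDEX supplier_sid")
dropped = run(conn)

print(before, after, dropped)
print(plan_before)
print(plan_after)
```

Only the physical plan the engine picks may change across the three runs; the query string and its result stay the same.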
