
Faloutsos CMU SCS 15-415

Carnegie Mellon Univ. School of Computer Science
15-415 - Applications
Lecture #5: Relational Algebra

Overview
• history
• concepts
• Formal query languages
  – relational algebra
  – rel. tuple calculus
  – rel. domain calculus

History
• before: records, pointers, sets etc.
• introduced by E.F. Codd in 1970
• revolutionary!
• first systems: 1977-8 (System R; Ingres)
• Turing award in 1981

Concepts - reminder
• Database: a set of relations (= tables)
• rows: tuples
• columns: attributes (or keys)
• superkey, candidate key, primary key

Example
Database:

STUDENT
Ssn  Name   Address
123  smith  main str
234  jones  forbes ave

TAKES
SSN  c-id    grade
123  15-413  A
234  15-413  B

Example: cont'd
• rel. schema (attr+domains): the header of the table, here STUDENT (Ssn, Name, Address)
• k-th attribute: the k-th column, with domain $D_k$
• $D_i$: the domain of the i-th attribute (e.g., char(10))
• tuple: one row, e.g., (123, smith, main str)
• instance: the set of rows currently in the table
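The schema/instance distinction can be sketched in Python. This is only an illustration: the names STUDENT_SCHEMA and conforms are hypothetical, and Python types stand in for the domains $D_i$.

```python
# Hypothetical representation: a schema is a list of (attribute, domain)
# pairs; an instance is a set of tuples, one value per attribute.
STUDENT_SCHEMA = (("ssn", int), ("name", str), ("address", str))

STUDENT = {
    (123, "smith", "main str"),
    (234, "jones", "forbes ave"),
}

def conforms(row, schema):
    """True if the tuple has one value per attribute, each in its domain."""
    return len(row) == len(schema) and all(
        isinstance(value, domain) for value, (_, domain) in zip(row, schema)
    )

# Every tuple of the instance must fit the schema.
assert all(conforms(row, STUDENT_SCHEMA) for row in STUDENT)
```

Because the instance is a set, inserting the same tuple twice changes nothing, which matches the 'everything is a set' view of relations.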

Overview
• history
• concepts
• Formal query languages
  – relational algebra
  – rel. tuple calculus
  – rel. domain calculus

Formal query languages
• How do we collect information?
• E.g., find ssn's of people in 415
• (recall: everything is a set!)
• One solution: rel. algebra, i.e., set operators
• Q1: Which ones?
• Q2: What is a minimal set of operators?

Relational operators
• ...
• ...
• ...
• set union $\cup$
• set difference '-'

Example:
• Q: find all students (part or full time)
• A: PT-STUDENT union FT-STUDENT

FT-STUDENT
Ssn  Name    Address
129  peters  main str
239  lee     5th ave

PT-STUDENT
Ssn  Name   Address
123  smith  main str
234  jones  forbes ave
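A minimal sketch of union and difference, assuming each relation is modelled as a Python set of (ssn, name, address) tuples. Which rows belong to which table is an assumption made for the illustration.

```python
# Two union-compatible relations: same attributes, in the same order.
FT_STUDENT = {(129, "peters", "main str"), (239, "lee", "5th ave")}
PT_STUDENT = {(123, "smith", "main str"), (234, "jones", "forbes ave")}

# Set union: all students, part or full time.
all_students = PT_STUDENT | FT_STUDENT

# Set difference: part-time students who are not also full-time.
part_time_only = PT_STUDENT - FT_STUDENT
```

Python's set operators stand in directly for the two set operators on the slide. Union compatibility is not checked here; it is simply assumed.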

Observations:
• two tables are 'union compatible' if they have the same attributes ('domains')
• Q: how about intersection $\cap$?
• A: redundant:
  STUDENT $\cap$ STAFF = STUDENT - (STUDENT - STAFF)
  [Venn diagram: overlapping sets STUDENT and STAFF]
• Double negation: we'll see it again, later…
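The redundancy of intersection can be checked on a small example. The ssn values below are made up for illustration; only the identity itself comes from the slide.

```python
# Hypothetical single-attribute relations (ssn only).
STUDENT = {123, 234, 345}
STAFF = {234, 345, 456}

# STUDENT - STAFF: students who are not staff.
only_students = STUDENT - STAFF

# Removing those from STUDENT leaves the students who are staff too.
derived = STUDENT - only_students

# Same result as the built-in intersection.
assert derived == STUDENT & STAFF
```

This is the 'double negation' pattern: remove the ones that fail the test, rather than pick the ones that pass it.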

Other operators?
• e.g., find all students on 'Main street'
• A: 'selection'
  $\sigma_{address = 'main str'}(STUDENT)$

STUDENT
Ssn  Name   Address
123  smith  main str
234  jones  forbes ave

Other operators?
• Notice: selection (and the rest of the operators) expect tables, and produce tables (-> can be cascaded!!)
• For selection, in general:
  $\sigma_{condition}(RELATION)$

Selection - examples
• Find all 'Smiths' on 'Forbes Ave'
  $\sigma_{name = 'Smith' \wedge address = 'Forbes ave'}(STUDENT)$
• 'condition' can be any boolean combination of '=', '>', '>=', ...

Relational operators
• selection $\sigma_{condition}(R)$
• ...
• ...
• set union $R \cup S$
• set difference $R - S$
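Selection can be sketched as a filter over rows. The sketch assumes a relation is a list of dicts keyed by attribute name; the helper name select is hypothetical.

```python
STUDENT = [
    {"ssn": 123, "name": "smith", "address": "main str"},
    {"ssn": 234, "name": "jones", "address": "forbes ave"},
]

def select(relation, condition):
    """sigma_condition(relation): keep the rows satisfying the predicate."""
    return [row for row in relation if condition(row)]

# sigma_{address = 'main str'}(STUDENT)
on_main = select(STUDENT, lambda r: r["address"] == "main str")

# sigma_{name = 'smith' AND address = 'forbes ave'}(STUDENT)
smiths_on_forbes = select(
    STUDENT, lambda r: r["name"] == "smith" and r["address"] == "forbes ave"
)
```

The function takes a table and returns a table, so calls can be nested, which is the cascading property noted above.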

Relational operators
• selection picks rows - how about columns?
• A: 'projection' - e.g.:
  $\pi_{ssn}(STUDENT)$
  finds all the 'ssn' - removing duplicates

STUDENT
Ssn  Name   Address
123  smith  main str
234  jones  forbes ave

Relational operators
Cascading: 'find ssn of students on 'forbes ave''
  $\pi_{ssn}(\sigma_{address = 'forbes ave'}(STUDENT))$
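Projection and cascading can be sketched the same way, again assuming relations are lists of dicts. The helpers select and project are hypothetical names.

```python
STUDENT = [
    {"ssn": 123, "name": "smith", "address": "main str"},
    {"ssn": 234, "name": "jones", "address": "forbes ave"},
]

def select(relation, condition):
    """sigma_condition(relation): keep the rows satisfying the predicate."""
    return [row for row in relation if condition(row)]

def project(relation, attrs):
    """pi_attrs(relation): keep only attrs, dropping duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:  # relations are sets: emit each row once
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result

# pi_ssn(sigma_{address = 'forbes ave'}(STUDENT))
forbes_ssns = project(
    select(STUDENT, lambda r: r["address"] == "forbes ave"), ["ssn"]
)
```

The explicit duplicate check matters because dropping columns can make previously distinct rows identical.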

Relational operators
• selection $\sigma_{condition}(R)$
• projection $\pi_{attlist}(R)$
• ...
• set union $R \cup S$
• set difference $R - S$

Relational operators
Are we done yet?
Q: Give a query we can not answer yet!

Relational operators
A: any query across two or more tables, e.g., 'find names of students in 15-415'
Q: what extra operator do we need??
A: surprisingly, cartesian product is enough!

TAKES
SSN  c-id    grade
123  15-413  A
234  15-413  B

STUDENT
Ssn  Name   Address
123  smith  main str
234  jones  forbes ave

Cartesian product
• e.g., dog-breeding: MALE x FEMALE
• gives all possible couples

MALE      FEMALE      M.name  F.name
name      name        spike   lassie
spike  x  lassie  =   spike   shiba
spot      shiba       spot    lassie
                      spot    shiba

so what?
• E.g., how do we find names of students taking 415?

TAKES
SSN  c-id    grade
123  15-415  A
234  15-413  B

Cartesian product
• A: start from STUDENT x TAKES:

Ssn  Name   Address     ssn  cid     grade
123  smith  main str    123  15-415  A
234  jones  forbes ave  123  15-415  A
123  smith  main str    234  15-413  B
234  jones  forbes ave  234  15-413  B

• keep only the pairs that refer to the same student:
  $\sigma_{STUDENT.ssn = TAKES.ssn}(STUDENT \times TAKES)$
• of those, keep the rows for 15-415:
  $\sigma_{cid = 15\text{-}415}(\sigma_{STUDENT.ssn = TAKES.ssn}(STUDENT \times TAKES))$
• finally, keep only the names:
  $\pi_{name}(\sigma_{cid = 15\text{-}415}(\sigma_{STUDENT.ssn = TAKES.ssn}(STUDENT \times TAKES)))$

FUNDAMENTAL Relational operators
• selection $\sigma_{condition}(R)$
• projection $\pi_{attlist}(R)$
• cartesian product MALE x FEMALE
• set union $R \cup S$
• set difference $R - S$
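The three steps (product, selection, projection) can be traced in Python. This is a sketch: relations are assumed to be lists of dicts, and the helper cartesian, which prefixes attributes with a relation name, is hypothetical.

```python
from itertools import product

STUDENT = [
    {"ssn": 123, "name": "smith", "address": "main str"},
    {"ssn": 234, "name": "jones", "address": "forbes ave"},
]
TAKES = [
    {"ssn": 123, "cid": "15-415", "grade": "A"},
    {"ssn": 234, "cid": "15-413", "grade": "B"},
]

def cartesian(r, r_name, s, s_name):
    """r x s, qualifying every attribute as 'relation.attribute'."""
    rows = []
    for a, b in product(r, s):
        row = {f"{r_name}.{k}": v for k, v in a.items()}
        row.update({f"{s_name}.{k}": v for k, v in b.items()})
        rows.append(row)
    return rows

# STUDENT x TAKES: every student paired with every enrollment.
pairs = cartesian(STUDENT, "STUDENT", TAKES, "TAKES")
# sigma_{STUDENT.ssn = TAKES.ssn}: keep the pairs for the same student.
matched = [p for p in pairs if p["STUDENT.ssn"] == p["TAKES.ssn"]]
# sigma_{cid = 15-415}: keep the enrollments in 15-415.
in_415 = [p for p in matched if p["TAKES.cid"] == "15-415"]
# pi_name: keep the names, without duplicates.
names = sorted({p["STUDENT.name"] for p in in_415})
```

Qualifying the attribute names is what keeps the two ssn columns apart after the product.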

Relational ops
• Surprisingly, they are enough, to help us answer almost any query we want!!
• derived/convenience operators:
  – set intersection
  – join (theta join, equi-join, natural join)
  – 'rename' operator $\rho_{R'}(R)$
  – division $R \div S$

Joins
• Equijoin: $R \bowtie_{R.a = S.b} S = \sigma_{R.a = S.b}(R \times S)$
• theta-joins: $R \bowtie_{\theta} S$
  generalization of equi-join - any condition $\theta$

Joins
• very popular: natural join: $R \bowtie S$
• like equi-join, but it drops duplicate columns:
  STUDENT (ssn, name, address)
  TAKES (ssn, cid, grade)

Joins
• nat. join has 5 attributes: $STUDENT \bowtie TAKES$
• equi-join: 6: $STUDENT \bowtie_{STUDENT.ssn = TAKES.ssn} TAKES$
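The 5-versus-6 attribute count can be reproduced with a small sketch. Relations are assumed to be lists of dicts; equi_join and natural_join are hypothetical helpers, not a library API.

```python
STUDENT = [
    {"ssn": 123, "name": "smith", "address": "main str"},
    {"ssn": 234, "name": "jones", "address": "forbes ave"},
]
TAKES = [
    {"ssn": 123, "cid": "15-415", "grade": "A"},
    {"ssn": 234, "cid": "15-413", "grade": "B"},
]

def equi_join(r, r_name, a, s, s_name, b):
    """sigma_{r.a = s.b}(r x s): both join columns are kept."""
    rows = []
    for x in r:
        for y in s:
            if x[a] == y[b]:
                row = {f"{r_name}.{k}": v for k, v in x.items()}
                row.update({f"{s_name}.{k}": v for k, v in y.items()})
                rows.append(row)
    return rows

def natural_join(r, s):
    """Join on all shared attribute names; shared columns appear once."""
    rows = []
    for x in r:
        for y in s:
            common = x.keys() & y.keys()
            if all(x[k] == y[k] for k in common):
                rows.append({**x, **y})
    return rows

equi = equi_join(STUDENT, "STUDENT", "ssn", TAKES, "TAKES", "ssn")
nat = natural_join(STUDENT, TAKES)
```

With no shared attribute names, the test in natural_join holds for every pair, so the result degenerates to the cartesian product.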

Natural Joins - nit-picking
• if no attributes in common between R, S: nat. join -> cartesian product

Overview - rel. algebra
• fundamental operators
• derived operators
  – joins etc
  – rename
  – division
• examples

Rename op.
• $\rho_{AFTER}(BEFORE)$
• Q: why?
• A: shorthand; self-joins; …
• for example, find the grand-parents of 'Tom', given PC (parent-id, child-id)

PC
p-id   c-id
Mary   Tom
Peter  Mary
John   Tom

Rename op.
• first, WRONG attempt: $PC \bowtie PC$
• (why? how many columns?)
• Second WRONG attempt: $PC \bowtie_{PC.cid = PC.pid} PC$

Rename op.
• we clearly need two different names for the same table - hence, the 'rename' op.
  $\rho_{PC1}(PC) \bowtie_{PC1.cid = PC.pid} PC$
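The grandparent query can be sketched as a self-join over two renamed copies of PC. As a simplifying assumption, both copies are renamed here (PC1 and PC2), whereas the slide renames only one. The helper rename is hypothetical.

```python
PC = [
    {"pid": "Mary", "cid": "Tom"},
    {"pid": "Peter", "cid": "Mary"},
    {"pid": "John", "cid": "Tom"},
]

def rename(relation, new_name):
    """rho_new_name(relation): qualify each attribute with the new name."""
    return [{f"{new_name}.{k}": v for k, v in row.items()} for row in relation]

PC1 = rename(PC, "PC1")  # plays the role (grandparent, parent)
PC2 = rename(PC, "PC2")  # plays the role (parent, child)

# Join where the child in PC1 is the parent in PC2.
joined = [
    {**a, **b} for a in PC1 for b in PC2 if a["PC1.cid"] == b["PC2.pid"]
]

# Keep the rows whose grandchild is Tom, then project the grandparent.
grandparents = sorted(
    {row["PC1.pid"] for row in joined if row["PC2.cid"] == "Tom"}
)
```

Without the rename, both copies would expose the same attribute names and the join condition could not tell them apart.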

Division
• Rarely used, but powerful.
• Example: find suspicious suppliers, i.e., suppliers that supplied all the parts in A_BOMB

SHIPMENT      ABOMB      BAD_S
s#  p#        p#         s#
s1  p1    ÷   p1    =    s1
s2  p1        p2
s1  p2
s3  p1
s5  p3

Division
• Observations: ~reverse of cartesian product
• It can be derived from the 5 fundamental operators (!!)
• How?

Division
• Answer (R and S are the attribute sets of r and s):
  $r \div s = \pi_{(R-S)}(r) - \pi_{(R-S)}[(\pi_{(R-S)}(r) \times s) - r]$
• Observation: find 'good' suppliers, and subtract! (double negation)
• Reading the formula piece by piece:
  – $\pi_{(R-S)}(r)$: all suppliers
  – $s$: all bad parts
  – $\pi_{(R-S)}(r) \times s$: all possible suspicious shipments
  – $(\pi_{(R-S)}(r) \times s) - r$: all possible suspicious shipments that didn't happen
  – $\pi_{(R-S)}[(\pi_{(R-S)}(r) \times s) - r]$: all suppliers who missed at least one suspicious shipment, i.e.: 'good' suppliers
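The derivation can be followed step by step on the SHIPMENT and ABOMB tables. This sketch assumes SHIPMENT is a set of (s#, p#) pairs and ABOMB a set of part ids; the variable names are illustrative.

```python
SHIPMENT = {("s1", "p1"), ("s2", "p1"), ("s1", "p2"), ("s3", "p1"), ("s5", "p3")}
ABOMB = {"p1", "p2"}

# pi_{s#}(r): all suppliers.
suppliers = {s for s, _ in SHIPMENT}

# pi_{s#}(r) x s: all possible suspicious shipments.
all_possible = {(s, p) for s in suppliers for p in ABOMB}

# (pi_{s#}(r) x s) - r: suspicious shipments that didn't happen.
missing = all_possible - SHIPMENT

# pi_{s#}(...): 'good' suppliers, who missed at least one bad part.
good = {s for s, _ in missing}

# r ÷ s: everyone else supplied every part in ABOMB.
bad = suppliers - good
```

Only projection, cartesian product, and set difference are used, which is the point of the derivation.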

Sample schema

STUDENT
Ssn  Name   Address
123  smith  main str
234  jones  forbes ave

CLASS
c-id    c-name  units
15-413  s.e.    2
15-412  o.s.    2

TAKES
SSN  c-id    grade
123  15-413  A
234  15-413  B

Examples
• find names of students that take 15-415
  $\pi_{name}[\sigma_{cid = 15\text{-}415}(STUDENT \bowtie TAKES)]$
• find course names of 'smith'
  $\pi_{cname}[\sigma_{name = 'smith'}(STUDENT \bowtie TAKES \bowtie CLASS)]$

Examples
• find ssn of 'overworked' students, i.e., that take 412, 413, 415
• almost correct answer:
  $\sigma_{cid = 412}(TAKES) \cap \sigma_{cid = 413}(TAKES) \cap \sigma_{cid = 415}(TAKES)$
  (each selection returns whole TAKES rows, and rows for different courses never coincide, so this intersection is always empty)
• Correct answer:
  $\pi_{ssn}[\sigma_{cid = 412}(TAKES)] \cap \pi_{ssn}[\sigma_{cid = 413}(TAKES)] \cap \pi_{ssn}[\sigma_{cid = 415}(TAKES)]$
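The difference between the 'almost correct' and the correct answer shows up on a small TAKES instance. The rows below are made up for illustration and are not the sample data of the slides; rows_for and ssns_for are hypothetical helpers.

```python
# Hypothetical TAKES instance: (ssn, cid, grade).
TAKES = {
    (123, "15-412", "A"), (123, "15-413", "B"), (123, "15-415", "A"),
    (234, "15-413", "B"), (234, "15-415", "C"),
}

def rows_for(cid):
    """sigma_{cid = ...}(TAKES): whole rows for one course."""
    return {t for t in TAKES if t[1] == cid}

def ssns_for(cid):
    """pi_ssn(sigma_{cid = ...}(TAKES)): only the ssn column."""
    return {t[0] for t in rows_for(cid)}

# Intersecting whole rows: the cid values differ, so nothing survives.
almost = rows_for("15-412") & rows_for("15-413") & rows_for("15-415")

# Projecting on ssn first makes the three sets comparable.
correct = ssns_for("15-412") & ssns_for("15-413") & ssns_for("15-415")
```

Student 123 takes all three courses, yet only the projected version reports it.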

Examples
• find ssn of students that work at least as hard as ssn=123 (i.e., they take all the courses of ssn=123, and maybe more)
  $\pi_{ssn,cid}(TAKES) \div \pi_{cid}[\sigma_{ssn = 123}(TAKES)]$
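This query is a division, so it can reuse the 'find the good ones and subtract' recipe. The TAKES rows below are hypothetical, chosen so that one student qualifies and one does not; divide is an illustrative helper for binary relations only.

```python
# Hypothetical TAKES instance: (ssn, cid, grade).
TAKES = {
    (123, "15-413", "A"),
    (234, "15-413", "B"), (234, "15-412", "A"),
    (345, "15-412", "B"),
}

def divide(r, s):
    """r ÷ s for r a set of (x, y) pairs and s a set of y values."""
    candidates = {x for x, _ in r}
    # Pairs that would have to exist but are absent from r.
    missing = {(x, y) for x in candidates for y in s} - r
    return candidates - {x for x, _ in missing}

# pi_{ssn,cid}(TAKES)
enrolled = {(ssn, cid) for ssn, cid, _ in TAKES}
# pi_cid(sigma_{ssn = 123}(TAKES))
courses_of_123 = {cid for ssn, cid in enrolled if ssn == 123}

hard_workers = divide(enrolled, courses_of_123)
```

Student 123 is always part of the answer, since every student takes all of their own courses.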

Conclusions
• Relational model: only tables ('relations')
• relational algebra: powerful, minimal: 5 operators can handle almost any query!