Faloutsos CMU SCS 15-415
CMU SCS CMU SCS
Overview Carnegie Mellon Univ. School of Computer Science • history 15-415 - Database Applications • concepts • Formal query languages – relational algebra Lecture #5: Relational Algebra – rel. tuple calculus – rel. domain calculus
Faloutsos CMU SCS 15-415 #2
CMU SCS CMU SCS
History Concepts - reminder
• before: records, pointers, sets etc • Database: a set of relations (= tables) • introduced by E.F. Codd in 1970 • rows: tuples • revolutionary! • columns: attributes (or keys) • first systems: 1977-8 (System R; Ingres) • superkey, candidate key, primary key • Turing award in 1981
Faloutsos CMU SCS 15-415 #3 Faloutsos CMU SCS 15-415 #4
CMU SCS CMU SCS
Example Example: cont’d k-th attribute Database: Database: (Dk domain)
STUDENT SSN c-id grade Ssn Name Address 123 15-413 A rel. schema (attr+domains) 123 smith main str 234 15-413 B 234 jones forbes ave tuple
Faloutsos CMU SCS 15-415 #5 Faloutsos CMU SCS 15-415 #6
1 Faloutsos CMU SCS 15-415
CMU SCS CMU SCS
Example: cont’d Example: cont’d
• Di: the domain of the i-th attribute (eg., char(10)
STUDENT Ssn Name Address rel. schema (attr+domains) rel. schema (attr+domains) 123 smith main str 234 jones forbes ave instance instance
Faloutsos CMU SCS 15-415 #7 Faloutsos CMU SCS 15-415 #8
CMU SCS CMU SCS
Overview Formal query languages
• history • How do we collect information? • concepts • Eg., find ssn’s of people in 415 • Formal query languages • (recall: everything is a set!) – relational algebra • One solution: Rel. algebra, ie., set operators – rel. tuple calculus • Q1: Which ones?? – rel. domain calculus • Q2: what is a minimal set of operators?
Faloutsos CMU SCS 15-415 #9 Faloutsos CMU SCS 15-415 #10
CMU SCS CMU SCS
Relational operators Example:
• . • Q: find all students (part or full time) • . • A: PT-STUDENT union FT-STUDENT • . • set union U FT-STUDENT PT-STUDENT • set difference ‘-’ Ssn Name Ssn Name Address 129 peters main str 123 smith main str 239 lee 5th ave 234 jones forbes ave
Faloutsos CMU SCS 15-415 #11 Faloutsos CMU SCS 15-415 #12
2 Faloutsos CMU SCS 15-415
CMU SCS CMU SCS
Observations: Observations:
• two tables are ‘union compatible’ if they • A: redundant:
have the same attributes (‘domains’) • STUDENT intersection STAFF =
• Q: how about intersection U
STUDENT STAFF
Faloutsos CMU SCS 15-415 #13 Faloutsos CMU SCS 15-415 #14
CMU SCS CMU SCS
Observations: Observations:
• A: redundant: • A: redundant: • STUDENT intersection STAFF = • STUDENT intersection STAFF = STUDENT - (STUDENT - STAFF)
STUDENT STAFF STUDENT STAFF
Faloutsos CMU SCS 15-415 #15 Faloutsos CMU SCS 15-415 #16
CMU SCS CMU SCS
Observations: Relational operators
• A: redundant: • . • STUDENT intersection STAFF = • . STUDENT - (STUDENT - STAFF) • . • set union U Double negation: • set difference ‘-’ We’ll see it again, later…
Faloutsos CMU SCS 15-415 #17 Faloutsos CMU SCS 15-415 #18
3 Faloutsos CMU SCS 15-415
CMU SCS CMU SCS
Other operators? Other operators?
• eg, find all students on ‘Main street’ • Notice: selection (and rest of operators) • A: ‘selection’ expect tables, and produce tables (-> can be cascaded!!) (STUDENT) address'mainstr' • For selection, in general: STUDENT (RELATION) Ssn Name Address condition 123 smith main str 234 jones forbes ave
Faloutsos CMU SCS 15-415 #19 Faloutsos CMU SCS 15-415 #20
CMU SCS CMU SCS
Selection - examples Relational operators
• Find all ‘Smiths’ on ‘Forbes Ave’ • selection condition (R) • .
name'Smith' address'Forbes ave' (STUDENT) • . • set union R U S ‘condition’ can be any boolean combination of ‘=‘, • set difference R - S ‘>’, ‘>=‘, ...
Faloutsos CMU SCS 15-415 #21 Faloutsos CMU SCS 15-415 #22
CMU SCS CMU SCS
Relational operators Relational operators
• selection picks rows - how about columns? Cascading: ‘find ssn of students on ‘forbes ave’ • A: ‘projection’ - eg.: ssn (STUDENT)
ssn ( address' forbesave' (STUDENT)) finds all the ‘ssn’ - removing duplicates STUDENT Ssn Name Address 123 smith main str 234 jones forbes ave
Faloutsos CMU SCS 15-415 #23 Faloutsos CMU SCS 15-415 #24
4 Faloutsos CMU SCS 15-415
CMU SCS CMU SCS
Relational operators Relational operators
• selection Are we done yet? condition (R) • projection Q: Give a query we can not answer yet! attlist (R) • . • set union R U S • set difference R - S
Faloutsos CMU SCS 15-415 #25 Faloutsos CMU SCS 15-415 #26
CMU SCS CMU SCS
Relational operators Relational operators
A: any query across two or more tables, A: any query across two or more tables, eg., ‘find names of students in 15-415’ eg., ‘find names of students in 15-415’ Q: what extra operator do we need?? Q: what extra operator do we need?? A: surprisingly, cartesian product is enough! TAKES STUDENT SSN c-id grade Ssn Name Address 123 15-413 A 123 smith main str 234 15-413 B 234 jones forbes ave
Faloutsos CMU SCS 15-415 #27 Faloutsos CMU SCS 15-415 #28
CMU SCS CMU SCS
Cartesian product so what?
• eg., dog-breeding: MALE x FEMALE • Eg., how do we find names of students taking • gives all possible couples 415?
SSN c-id grade MALE FEMALE M.name F.name 123 15-415 A name x name = spike lassie 234 15-413 B spike lassie spike shiba spot shiba spot lassie spot shiba
Faloutsos CMU SCS 15-415 #29 Faloutsos CMU SCS 15-415 #30
5 Faloutsos CMU SCS 15-415
CMU SCS CMU SCS
Cartesian product Cartesian product
• A: .. cid15415( STUDENT.ssnTAKES.ssn(STUDENT x TAKES)) ...... STUDENT.ssnTAKES.ssn (STUDENT x TAKES)
Ssn Name Address ssn cid grade 123 smith main str 123 15-415 A 234 jones forbes ave 123 15-415 A 123 smith main str 234 15-413 B 234 jones forbes ave 234 15-413 B
Faloutsos CMU SCS 15-415 #31 Faloutsos CMU SCS 15-415 #32
CMU SCS CMU SCS FUNDAMENTAL Relational operators name( ( (STUDENT x TAKES)) • selection cid 15415 STUDENT.ssnTAKES.ssn condition (R) ) • projection attlist (R) • cartesian product MALE x FEMALE • set union R U S • set difference R - S
Faloutsos CMU SCS 15-415 #33 Faloutsos CMU SCS 15-415 #34
CMU SCS CMU SCS
Relational ops Joins
• Surprisingly, they are enough, to help us • Equijoin: R R.aS.b S R.aS.b (R S) answer almost any query we want!! • derived/convenience operators: – set intersection – join (theta join, equi-join, natural join)
– ‘rename’ operator R' (R) – division R S
Faloutsos CMU SCS 15-415 #35 Faloutsos CMU SCS 15-415 #36
6 Faloutsos CMU SCS 15-415
CMU SCS CMU SCS
Cartesian product Joins R S (R S) • A: ...... STUDENT.ssnTAKES.ssn (STUDENT x TAKES) • Equijoin: R.aS.b R.aS.b • theta-joins: R S Ssn Name Address ssn cid grade generalization of equi-join - any condition 123 smith main str 123 15-415 A 234 jones forbes ave 123 15-415 A 123 smith main str 234 15-413 B 234 jones forbes ave 234 15-413 B
Faloutsos CMU SCS 15-415 #37 Faloutsos CMU SCS 15-415 #38
CMU SCS CMU SCS
Joins Joins
• very popular: natural join: R S • nat. join has 5 attributes STUDENT TAKES • like equi-join, but it drops duplicate columns: STUDENT (ssn, name, address) TAKES (ssn, cid, grade)
equi-join: 6 STUDENT STUDENT.ssnTAKES.ssn TAKES
Faloutsos CMU SCS 15-415 #39 Faloutsos CMU SCS 15-415 #40
CMU SCS CMU SCS
Natural Joins - nit-picking Overview - rel. algebra
• if no attributes in common between R, S: • fundamental operators nat. join -> cartesian product • derived operators – joins etc – rename – division • examples
Faloutsos CMU SCS 15-415 #41 Faloutsos CMU SCS 15-415 #42
7 Faloutsos CMU SCS 15-415
CMU SCS CMU SCS
Rename op. Rename op.
• Q: why? AFTER (BEFORE) • PC (parent-id, child-id) PC PC • A: shorthand; self-joins; … • for example, find the grand-parents of PC ‘Tom’, given PC (parent-id, child-id) p-id c-id Mary Tom Peter Mary John Tom
Faloutsos CMU SCS 15-415 #43 Faloutsos CMU SCS 15-415 #44
CMU SCS CMU SCS
Rename op. Rename op.
• first, WRONG attempt: • we clearly need two different names for the same table - hence, the ‘rename’ op. • (why? how many columns?) • Second WRONG attempt: (PC) PC PC1 PC1.cidPC . pid PC PC .cidPC . pid PC
Faloutsos CMU SCS 15-415 #45 Faloutsos CMU SCS 15-415 #46
CMU SCS CMU SCS
Overview - rel. algebra Division
• fundamental operators • Rarely used, but powerful. • derived operators • Example: find suspicious suppliers, ie., – joins etc suppliers that supplied all the parts in – rename A_BOMB – division • examples
Faloutsos CMU SCS 15-415 #47 Faloutsos CMU SCS 15-415 #48
8 Faloutsos CMU SCS 15-415
CMU SCS CMU SCS
Division Division
SHIPMENT • Observations: ~reverse of cartesian product s# p# ABOMB BAD_S • It can be derived from the 5 fundamental s1 p1 p# s# s2 p1 operators (!!) p1 s1 s1 p2 p2 • How? s3 p1 s5 p3
Faloutsos CMU SCS 15-415 #49 Faloutsos CMU SCS 15-415 #50
CMU SCS CMU SCS
Division Division
• Answer: • Answer:
r s (r) [( (r)s) r] (RS) (RS) (RS)
• Observation: find ‘good’ suppliers, and • Observation: find ‘good’ suppliers, and subtract! (double negation) subtract! (double negation)
Faloutsos CMU SCS 15-415 #51 Faloutsos CMU SCS 15-415 #52
CMU SCS CMU SCS
Division Division
• Answer: • Answer:
All suppliers all possible
All bad parts suspicious shipments
Faloutsos CMU SCS 15-415 #53 Faloutsos CMU SCS 15-415 #54
9 Faloutsos CMU SCS 15-415
CMU SCS CMU SCS
Division Division SHIPMENT s# p# ABOMB BAD_S s1 p1 p# s# s2 p1 p1 s1 • Answer: s1 p2 p2 • Answer: s3 p1 s5 p3 r s (r) [( (r)s) r] (RS) (RS) (RS)
all possible all suppliers who missed suspicious shipments at least one suspicious shipment, that didn’t happen i.e.: ‘good’ suppliers Faloutsos CMU SCS 15-415 #55 Faloutsos CMU SCS 15-415 #56
CMU SCS CMU SCS
Overview - rel. algebra Sample schema find names of students that take 15-415 • fundamental operators STUDENT CLASS • derived operators Ssn Name Address c-id c-name units 123 smith main str 15-413 s.e. 2 – joins etc 234 jones forbes ave 15-412 o.s. 2 – rename – division TAKES SSN c-id grade • examples 123 15-413 A 234 15-413 B
Faloutsos CMU SCS 15-415 #57 Faloutsos CMU SCS 15-415 #58
CMU SCS CMU SCS
Examples Examples
• find names of students that take 15-415 • find names of students that take 15-415
name[ cid15415(STUDENT TAKES)]
Faloutsos CMU SCS 15-415 #59 Faloutsos CMU SCS 15-415 #60
10 Faloutsos CMU SCS 15-415
CMU SCS CMU SCS
Sample schema Examples find course names of ‘smith’
STUDENT CLASS • find course names of ‘smith’ Ssn Name Address c-id c-name units 123 smith main str 15-413 s.e. 2 234 jones forbes ave 15-412 o.s. 2 cname[ name'smith' (
TAKES STUDENT TAKES CLASS SSN c-id grade 123 15-413 A )] 234 15-413 B
Faloutsos CMU SCS 15-415 #61 Faloutsos CMU SCS 15-415 #62
CMU SCS CMU SCS
Examples Examples
• find ssn of ‘overworked’ students, ie., that • find ssn of ‘overworked’ students, ie., that take 412, 413, 415 take 412, 413, 415: almost correct answer:
cname412(TAKES)
cname413(TAKES)
cname415(TAKES)
Faloutsos CMU SCS 15-415 #63 Faloutsos CMU SCS 15-415 #64
CMU SCS CMU SCS
Examples Examples
• find ssn of ‘overworked’ students, ie., that • find ssn of students that work at least as take 412, 413, 415 - Correct answer: hard as ssn=123, ie., they take all the courses of ssn=123, and maybe more [ (TAKES)] ssn cname412 [ (TAKES)] ssn c-namename=413412
ssn[ c-namename=415412 (TAKES)]
Faloutsos CMU SCS 15-415 #65 Faloutsos CMU SCS 15-415 #66
11 Faloutsos CMU SCS 15-415
CMU SCS CMU SCS
Sample schema Examples
STUDENT CLASS • find ssn of students that work at least as Ssn Name Address c-id c-name units hard as ssn=123 (ie., they take all the 123 smith main str 15-413 s.e. 2 courses of ssn=123, and maybe more 234 jones forbes ave 15-412 o.s. 2
TAKES [ ssn,cid (TAKES)] cid [ ssn123(TAKES)] SSN c-id grade 123 15-413 A 234 15-413 B
Faloutsos CMU SCS 15-415 #67 Faloutsos CMU SCS 15-415 #68
CMU SCS
Conclusions
• Relational model: only tables (‘relations’) • relational algebra: powerful, minimal: 5 operators can handle almost any query!
Faloutsos CMU SCS 15-415 #69
12