Relational Algebra Lecture #5: Relational Algebra – Rel
Total Page:16
File Type:pdf, Size:1020Kb
Faloutsos CMU SCS 15-415 CMU SCS CMU SCS Overview Carnegie Mellon Univ. School of Computer Science • history 15-415 - Database Applications • concepts • Formal query languages – relational algebra Lecture #5: Relational Algebra – rel. tuple calculus – rel. domain calculus Faloutsos CMU SCS 15-415 #2 CMU SCS CMU SCS History Concepts - reminder • before: records, pointers, sets etc • Database: a set of relations (= tables) • introduced by E.F. Codd in 1970 • rows: tuples • revolutionary! • columns: attributes (or keys) • first systems: 1977-8 (System R; Ingres) • superkey, candidate key, primary key • Turing award in 1981 Faloutsos CMU SCS 15-415 #3 Faloutsos CMU SCS 15-415 #4 CMU SCS CMU SCS Example Example: cont’d k-th attribute Database: Database: (Dk domain) STUDENT SSN c-id grade Ssn Name Address 123 15-413 A rel. schema (attr+domains) 123 smith main str 234 15-413 B 234 jones forbes ave tuple Faloutsos CMU SCS 15-415 #5 Faloutsos CMU SCS 15-415 #6 1 Faloutsos CMU SCS 15-415 CMU SCS CMU SCS Example: cont’d Example: cont’d • Di: the domain of the i-th attribute (eg., char(10) rel. schema (attr+domains) rel. schema (attr+domains) instance instance Faloutsos CMU SCS 15-415 #7 Faloutsos CMU SCS 15-415 #8 CMU SCS CMU SCS Overview Formal query languages • history • How do we collect information? • concepts • Eg., find ssn’s of people in 415 • Formal query languages • (recall: everything is a set!) – relational algebra • One solution: Rel. algebra, ie., set operators – rel. tuple calculus • Q1: Which ones?? – rel. domain calculus • Q2: what is a minimal set of operators? Faloutsos CMU SCS 15-415 #9 Faloutsos CMU SCS 15-415 #10 CMU SCS CMU SCS Relational operators Example: • . • Q: find all students (part or full time) • . • A: PT-STUDENT union FT-STUDENT STUDENT• . Ssn• set unionName Address U 123 smith main str FT-STUDENT PT-STUDENT • set234 differencejones forbes ‘-’ ave Ssn Name Ssn Name Address 129 peters main str 123 smith main str 239 lee 5th ave 234 jones forbes ave Faloutsos CMU SCS 15-415 #11 Faloutsos CMU SCS 15-415 #12 2 Faloutsos CMU SCS 15-415 CMU SCS CMU SCS Observations: Observations: • two tables are ‘union compatible’ if they • A: redundant: have the same attributes (‘domains’) • STUDENT intersection STAFF = U • Q: how about intersection STUDENT STAFF Faloutsos CMU SCS 15-415 #13 Faloutsos CMU SCS 15-415 #14 CMU SCS CMU SCS Observations: Observations: • A: redundant: • A: redundant: • STUDENT intersection STAFF = • STUDENT intersection STAFF = STUDENT - (STUDENT - STAFF) STUDENT STAFF STUDENT STAFF Faloutsos CMU SCS 15-415 #15 Faloutsos CMU SCS 15-415 #16 CMU SCS CMU SCS Observations: Relational operators • A: redundant: • . • STUDENT intersection STAFF = • . STUDENT - (STUDENT - STAFF) • . • set union U Double negation: • set difference ‘-’ We’ll see it again, later… Faloutsos CMU SCS 15-415 #17 Faloutsos CMU SCS 15-415 #18 3 Faloutsos CMU SCS 15-415 CMU SCS CMU SCS Other operators? Other operators? • eg, find all students on ‘Main street’ • Notice: selection (and rest of operators) • A: ‘selection’ expect tables, and produce tables (-> can be cascaded!!) (STUDENT) address'mainstr' • For selection, in general: STUDENT (RELATION) Ssn Name Address condition 123 smith main str 234 jones forbes ave Faloutsos CMU SCS 15-415 #19 Faloutsos CMU SCS 15-415 #20 CMU SCS CMU SCS Selection - examples Relational operators • Find all ‘Smiths’ on ‘Forbes Ave’ • selection condition (R) • . name'Smith' address'Forbes ave' (STUDENT) • . • set union R U S ‘condition’ can be any boolean combination of ‘=‘, • set difference R - S ‘>’, ‘>=‘, ... Faloutsos CMU SCS 15-415 #21 Faloutsos CMU SCS 15-415 #22 CMU SCS CMU SCS Relational operators Relational operators • selection picks rows - how about columns? Cascading: ‘find ssn of students on ‘forbes ave’ • A: ‘projection’ - eg.: ssn (STUDENT) ssn ( address' forbesave' (STUDENT)) finds all the ‘ssn’ - removing duplicates STUDENT Ssn Name Address 123 smith main str 234 jones forbes ave Faloutsos CMU SCS 15-415 #23 Faloutsos CMU SCS 15-415 #24 4 Faloutsos CMU SCS 15-415 CMU SCS CMU SCS Relational operators Relational operators • selection Are we done yet? • projection Q: Give a query we can not answer yet! attlist (R) • . • set union R U S STUDENT • set differenceSsn Name Address R - S 123 smith main str 234 jones forbes ave Faloutsos CMU SCS 15-415 #25 Faloutsos CMU SCS 15-415 #26 CMU SCS CMU SCS Relational operators Relational operators A: any query across two or more tables, A: any query across two or more tables, condition (R) eg., ‘find names of students in 15-415’ eg., ‘find names of students in 15-415’ Q: what extra operator do we need?? Q: what extra operator do we need?? A: surprisingly, cartesian product is enough! TAKES SSN c-id grade 123 15-413 A 234 15-413 B Faloutsos CMU SCS 15-415 #27 Faloutsos CMU SCS 15-415 #28 CMU SCS CMU SCS Cartesian product so what? • eg., dog-breeding: MALE x FEMALE • Eg., how do we find names of students taking • gives all possible couples 415? SSN c-id grade MALE FEMALE M.name F.name 123 15-415 A name x name = spike lassie 234 15-413 B spike lassie spike shiba spot shiba spot lassie spot shiba Faloutsos CMU SCS 15-415 #29 Faloutsos CMU SCS 15-415 #30 5 Faloutsos CMU SCS 15-415 CMU SCS CMU SCS Cartesian product Cartesian product • A: .. cid15415( STUDENT.ssnTAKES.ssn(STUDENT x TAKES)) ......... STUDENT.ssnTAKES.ssn (STUDENT x TAKES) attlist (R) Ssn Name Address ssn cid grade 123 smith main str 123 15-415 A 234 jones forbes ave 123 15-415 A 123 smith main str 234 15-413 B 234 jones forbes ave 234 15-413 B Faloutsos CMU SCS 15-415 #31 Faloutsos CMU SCS 15-415 #32 CMU SCS CMU SCS FUNDAMENTAL Relational operators name( ( (STUDENT x TAKES)) • selection cid 15415 STUDENT.ssnTAKES.ssn condition (R) ) • projection • cartesian product MALE x FEMALE • set union R U S • set difference R - S Faloutsos CMU SCS 15-415 #33 Faloutsos CMU SCS 15-415 #34 CMU SCS CMU SCS Relational ops Joins • Surprisingly, they are enough, to help us • Equijoin: R R.aS.b S R.aS.b (R S) answer almost any query we want!! • derived/convenience operators: – set intersection – join (theta join, equi-join, natural join) – ‘rename’ operator R' (R) – division R S Faloutsos CMU SCS 15-415 #35 Faloutsos CMU SCS 15-415 #36 6 Faloutsos CMU SCS 15-415 CMU SCS CMU SCS Cartesian product Joins • A: • Equijoin: ......... STUDENT.ssnTAKES.ssn (STUDENT x TAKES) • theta-joins: R S Ssn Name Address ssn cid grade generalization of equi-join - any condition 123 smith main str 123 15-415 A 234 jones forbes ave 123 15-415 A 123 smith main str 234 15-413 B 234 jones forbes ave 234 15-413 B Faloutsos CMU SCS 15-415 #37 Faloutsos CMU SCS 15-415 #38 CMU SCS CMU SCS Joins Joins • very popular: natural join: R S • nat. join has 5 attributes STUDENT TAKES • like equi-join, but it drops duplicate columns: STUDENT (ssn, name, address) TAKES (ssn, cid, grade) equi-join: 6 STUDENT STUDENT.ssnTAKES.ssn TAKES Faloutsos CMU SCS 15-415 #39 Faloutsos CMU SCS 15-415 #40 CMU SCS CMU SCS Natural Joins - nit-picking Overview - rel. algebra • if no attributes in common between R, S: • fundamentalR operators R.aS. b S R.aS.b (R S) nat. join -> cartesian product • derived operators – joins etc – rename – division • examples Faloutsos CMU SCS 15-415 #41 Faloutsos CMU SCS 15-415 #42 7 Faloutsos CMU SCS 15-415 CMU SCS CMU SCS Rename op. Rename op. • Q: why? AFTER (BEFORE) • PC (parent-id, child-id) PC PC • A: shorthand; self-joins; … • for example, find the grand-parents of PC ‘Tom’, given PC (parent-id, child-id) p-id c-id Mary Tom Peter Mary John Tom Faloutsos CMU SCS 15-415 #43 Faloutsos CMU SCS 15-415 #44 CMU SCS CMU SCS Rename op. Rename op. • first, WRONG attempt: • we clearly need two different names for the same table - hence, the ‘rename’ op. • (why? how many columns?) • Second WRONG attempt: PC1 (PC) PC1.cidPC . pid PC PC PC .cidPC . pid PC Faloutsos CMU SCS 15-415 #45 Faloutsos CMU SCS 15-415 #46 CMU SCS CMU SCS Overview - rel. algebra Division • fundamental operators • Rarely used, but powerful. • derived operators • Example: find suspicious suppliers, ie., – joins etc suppliers that supplied all the parts in – rename A_BOMB – division • examples Faloutsos CMU SCS 15-415 #47 Faloutsos CMU SCS 15-415 #48 8 Faloutsos CMU SCS 15-415 CMU SCS CMU SCS Division Division SHIPMENT • Observations: ~reverse of cartesian product s# p# ABOMB BAD_S • It can be derived from the 5 fundamental s1 p1 p# s# s2 p1 operators (!!) p1 s1 s1 p2 p2 • How? s3 p1 s5 p3 Faloutsos CMU SCS 15-415 #49 Faloutsos CMU SCS 15-415 #50 CMU SCS CMU SCS Division Division • Answer: • Answer: r s (r) [( (r)s) r] (RS) (RS) (RS) • Observation: find ‘good’ suppliers, and • Observation: find ‘good’ suppliers, and subtract! (double negation) subtract! (double negation) Faloutsos CMU SCS 15-415 #51 Faloutsos CMU SCS 15-415 #52 CMU SCS CMU SCS Division Division • Answer: • Answer: All suppliers all possible All bad parts suspicious shipments Faloutsos CMU SCS 15-415 #53 Faloutsos CMU SCS 15-415 #54 9 Faloutsos CMU SCS 15-415 CMU SCS CMU SCS Division Division SHIPMENT• Answer: • Answer: s# p# ABOMB BAD_S s1 p1 p# s# s2 p1 p1 s1 s1 p2 p2 s3 p1 STUDENT s5 p3 Ssn Name Addressall possible 123 smith main str all suppliers who missed suspicious shipments 234 jones forbes ave at least one suspicious shipment, that didn’t happen i.e.: ‘good’ suppliers Faloutsos CMU SCS 15-415 #55 Faloutsos CMU SCS 15-415 #56 CMU SCS CMU SCS Overview - rel.