Relational Algebra
Week 3 – Relational Algebra

  Querying and Updating a Database
  The Relational Algebra
  Union, Intersection, Difference
  Renaming, Selection and Projection
  Join, Cartesian Product

CSC343 Introduction to Databases, University of Toronto

Query Languages for Relational Databases

Operations on databases:
  Queries: read data from the database;
  Updates: change the content of the database.
In this lecture unit we discuss the relational algebra, a procedural language that defines database operations in terms of algebraic expressions.
[The Relational Calculus is a declarative language for database operations based on Predicate Logic; we will not discuss it here.]

Relational Algebra

A collection of algebraic operators that
  are defined on relations;
  produce relations as results,
and therefore can be combined to form complex algebraic expressions.
Operators:
  Union, intersection, difference;
  Renaming;
  Selection and Projection;
  Join (natural join, Cartesian product, theta join).

Union, Intersection, Difference

Relations are sets, so we can apply set-theoretic operators.
However, we want the results to be relations (that is, homogeneous sets of tuples).
It is therefore meaningful to apply union, intersection and difference only to pairs of relations defined over the same attributes.
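The restriction to relations over the same attributes can be sketched in Python. This is only an illustration under stated assumptions: the `Relation` type and the function names are hypothetical, "same attributes" is taken to mean the same names in the same order, and the sample data is borrowed from the Graduates and Managers relations used in this lecture.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """Illustrative relation type: attribute names plus a set of value tuples."""
    attrs: tuple
    rows: frozenset


def _require_same_attrs(r: Relation, s: Relation) -> None:
    # Assumption of this sketch: same attribute names, in the same order.
    if r.attrs != s.attrs:
        raise ValueError(f"different attributes: {r.attrs} vs {s.attrs}")


def union(r: Relation, s: Relation) -> Relation:
    _require_same_attrs(r, s)
    return Relation(r.attrs, r.rows | s.rows)


def intersection(r: Relation, s: Relation) -> Relation:
    _require_same_attrs(r, s)
    return Relation(r.attrs, r.rows & s.rows)


def difference(r: Relation, s: Relation) -> Relation:
    _require_same_attrs(r, s)
    return Relation(r.attrs, r.rows - s.rows)


graduates = Relation(
    ("Number", "Surname", "Age"),
    frozenset({(7274, "Robinson", 37), (7432, "O'Malley", 39), (9824, "Darkes", 38)}),
)
managers = Relation(
    ("Number", "Surname", "Age"),
    frozenset({(9297, "O'Malley", 56), (7432, "O'Malley", 39), (9824, "Darkes", 38)}),
)

# Rows are sets, so duplicates disappear automatically; sort only for display.
print(sorted(union(graduates, managers).rows))
print(sorted(intersection(graduates, managers).rows))
print(sorted(difference(graduates, managers).rows))
```

Because the rows are stored in a `frozenset`, every result is again a set of tuples over the same attributes, which is what lets the operators be combined into larger expressions.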
Union

Graduates
Number  Surname   Age
7274    Robinson  37
7432    O'Malley  39
9824    Darkes    38

Managers
Number  Surname   Age
9297    O'Malley  56
7432    O'Malley  39
9824    Darkes    38

Graduates ∪ Managers
Number  Surname   Age
7274    Robinson  37
7432    O'Malley  39
9824    Darkes    38
9297    O'Malley  56

Intersection

Graduates ∩ Managers (same operands as above)
Number  Surname   Age
7432    O'Malley  39
9824    Darkes    38

Difference

Graduates - Managers (same operands as above)
Number  Surname   Age
7274    Robinson  37

A Meaningful but Impossible Union

Paternity
Father   Child
Adam     Cain
Adam     Abel
Abraham  Isaac
Abraham  Ishmael

Maternity
Mother  Child
Eve     Cain
Eve     Seth
Sarah   Isaac
Hagar   Ishmael

Paternity ∪ Maternity ???

The problem: Father and Mother are different names, but both represent a parent.
The solution: rename attributes!

Renaming

This is a unary operator which changes attribute names for a relation without changing any values.
Renaming removes the limitations associated with set operators.

Example of Renaming

Paternity
Father   Child
Adam     Cain
Adam     Abel
Abraham  Isaac
Abraham  Ishmael

ρ_{Father→Parent}(Paternity)
Parent   Child
Adam     Cain
Adam     Abel
Abraham  Isaac
Abraham  Ishmael
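The renaming example above can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the `Relation` type, `rename` and `union` are hypothetical names, and attribute order is assumed to matter when comparing schemas.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """Illustrative relation type: attribute names plus a set of value tuples."""
    attrs: tuple
    rows: frozenset


def rename(r: Relation, mapping: dict) -> Relation:
    """Rename attributes according to mapping (old name -> new name); values are untouched."""
    unknown = set(mapping) - set(r.attrs)
    if unknown:
        raise KeyError(f"no such attributes: {sorted(unknown)}")
    return Relation(tuple(mapping.get(a, a) for a in r.attrs), r.rows)


def union(r: Relation, s: Relation) -> Relation:
    # Union is only defined for relations over the same attributes.
    if r.attrs != s.attrs:
        raise ValueError(f"different attributes: {r.attrs} vs {s.attrs}")
    return Relation(r.attrs, r.rows | s.rows)


paternity = Relation(
    ("Father", "Child"),
    frozenset({("Adam", "Cain"), ("Adam", "Abel"),
               ("Abraham", "Isaac"), ("Abraham", "Ishmael")}),
)
maternity = Relation(
    ("Mother", "Child"),
    frozenset({("Eve", "Cain"), ("Eve", "Seth"),
               ("Sarah", "Isaac"), ("Hagar", "Ishmael")}),
)

# union(paternity, maternity) would raise ValueError: Father and Mother differ.
parents = union(
    rename(paternity, {"Father": "Parent"}),
    rename(maternity, {"Mother": "Parent"}),
)
print(parents.attrs)
print(sorted(parents.rows))
```

Only the attribute names change in `rename`; the set of rows is passed through as-is, which mirrors the statement that renaming does not change any values.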
Notation for renaming: ρ_{OldName→NewName}(r)
For example, ρ_{Father→Parent}(Paternity)
If there are two or more attributes involved in a renaming operation, then ordering is meaningful: e.g., ρ_{Branch,Salary→Location,Pay}(Employees)
The textbook allows positions rather than attribute names, e.g., 1→Parent
The textbook also allows renaming of the relation itself, e.g., Paternity,1→Parenthood,Parent

Renaming and Union

Paternity
Father   Child
Adam     Cain
Adam     Abel
Abraham  Isaac
Abraham  Ishmael

Maternity
Mother  Child
Eve     Cain
Eve     Seth
Sarah   Isaac
Hagar   Ishmael

ρ_{Father→Parent}(Paternity) ∪ ρ_{Mother→Parent}(Maternity)
Parent   Child
Adam     Cain
Adam     Abel
Abraham  Isaac
Abraham  Ishmael
Eve      Cain
Eve      Seth
Sarah    Isaac
Hagar    Ishmael

Renaming and Union with Several Attributes

Employees
Surname    Branch  Salary
Patterson  Rome    45
Trumble    London  53

Staff
Surname  Factory  Wages
Cooke    Chicago  33
Bush     Monza    32

ρ_{Branch,Salary→Location,Pay}(Employees) ∪ ρ_{Factory,Wages→Location,Pay}(Staff)
Surname    Location  Pay
Patterson  Rome      45
Trumble    London    53
Cooke      Chicago   33
Bush       Monza     32

Selection and Projection

These are unary operators, in a sense orthogonal:
  selection for "horizontal" decompositions;
  projection for "vertical" decompositions.
[Figure: selection keeps some of the rows of a table over A, B, C; projection keeps some of its columns, here A and B.]

Selection

This is a unary operation which returns a relation
  with the same schema as the operand;
  but with a subset of the tuples of the operand, i.e., only those that satisfy a condition.
Notation: σ_F(r)
Semantics: σ_F(r) = { t | t ∈ r s.t. t satisfies F, i.e., F(t) }

Selection Example

We can use the or symbol (∨), the not symbol (¬) and the and symbol (∧) as we would for specifying any logical condition.

Employees
Surname  FirstName  Age  Salary
Smith    Mary       25   2000
Black    Lucy       40   3000
Verdi    Nico       36   4500
Smith    Mark       40   3900

σ_{Age<30 ∨ Salary>4000}(Employees)
Surname  FirstName  Age  Salary
Smith    Mary       25   2000
Verdi    Nico       36   4500

Selection, Another Example

Citizens
Surname  FirstName  PlaceOfBirth  Residence
Smith    Mary       Rome          Milan
Black    Lucy       Rome          Rome
Verdi    Nico       Florence      Florence
Smith    Mark       Naples        Florence

σ_{PlaceOfBirth=Residence}(Citizens)
Surname  FirstName  PlaceOfBirth  Residence
Black    Lucy       Rome          Rome
Verdi    Nico       Florence      Florence

Projection

Projection returns a relation which includes a subset of the attributes of the operand.
Notation: Given a relation r(X) and a subset Y of X: π_Y(r)
Semantics: π_Y(r) = { t[Y] | t ∈ r }

Example of Projection

Employees
Surname  FirstName  Department  Head
Smith    Mary       Sales       De Rossi
Black    Lucy       Sales       De Rossi
Verdi    Mary       Personnel   Fox
Smith    Mark       Personnel   Fox

π_{Surname, FirstName}(Employees)
Surname  FirstName
Smith    Mary
Black    Lucy
Verdi    Mary
Smith    Mark

Cardinality of Projection Operations

Note that the result of a projection contains at most as many tuples as the operand relation.
However, it may contain fewer, if several tuples collapse, i.e., they are identical in all their values on the projected attributes.
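Selection and projection as defined above can be sketched in Python. This is an illustration under assumptions, not the lecture's own code: the `Relation` type, the function names and the variable names `employees_pay` and `employees_dept` are hypothetical, and the condition F is passed as an ordinary Python function over a tuple viewed as a dictionary.

```python
from typing import Callable, NamedTuple


class Relation(NamedTuple):
    """Illustrative relation type: attribute names plus a set of value tuples."""
    attrs: tuple
    rows: frozenset


def select(r: Relation, condition: Callable[[dict], bool]) -> Relation:
    """Selection: same schema, only the tuples for which condition holds."""
    kept = frozenset(
        row for row in r.rows if condition(dict(zip(r.attrs, row)))
    )
    return Relation(r.attrs, kept)


def project(r: Relation, attrs: tuple) -> Relation:
    """Projection: keep only attrs; identical results collapse because rows is a set."""
    positions = [r.attrs.index(a) for a in attrs]  # ValueError if an attribute is missing
    return Relation(
        tuple(attrs),
        frozenset(tuple(row[i] for i in positions) for row in r.rows),
    )


employees_pay = Relation(
    ("Surname", "FirstName", "Age", "Salary"),
    frozenset({("Smith", "Mary", 25, 2000), ("Black", "Lucy", 40, 3000),
               ("Verdi", "Nico", 36, 4500), ("Smith", "Mark", 40, 3900)}),
)
employees_dept = Relation(
    ("Surname", "FirstName", "Department", "Head"),
    frozenset({("Smith", "Mary", "Sales", "De Rossi"),
               ("Black", "Lucy", "Sales", "De Rossi"),
               ("Verdi", "Mary", "Personnel", "Fox"),
               ("Smith", "Mark", "Personnel", "Fox")}),
)

# Selection with the condition Age < 30 or Salary > 4000.
young_or_rich = select(employees_pay, lambda t: t["Age"] < 30 or t["Salary"] > 4000)
print(sorted(young_or_rich.rows))

# Projection on (Surname, FirstName) keeps all four tuples;
# projection on (Department, Head) collapses them to two.
print(len(project(employees_dept, ("Surname", "FirstName")).rows))
print(len(project(employees_dept, ("Department", "Head")).rows))
```

Storing rows in a set is what makes the collapse automatic: two tuples that agree on every projected attribute become one element of the result.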
Another Example

Employees
Surname  FirstName  Department  Head
Smith    Mary       Sales       De Rossi
Black    Lucy       Sales       De Rossi
Verdi    Mary       Personnel   Fox
Smith    Mark       Personnel   Fox

π_{Department, Head}(Employees)
Department  Head
Sales       De Rossi
Personnel   Fox

Theorem: π_Y(r) contains as many tuples as r if and only if Y is a superkey for r.
This property holds even if Y is "by chance" a superkey, i.e., it is not defined as a superkey in the schema, but it is a superkey for the current database, as the two examples that follow show.

Tuples that Collapse

Students
RegNum  Surname  FirstName  BirthDate  DegreeProg
284328  Smith    Luigi      29/04/59   Computing
296328  Smith    John       29/04/59   Computing
587614  Smith    Lucy       01/05/61   Engineering
934856  Black    Lucy       01/05/61   Fine Art
965536  Black    Lucy       05/03/58   Fine Art

π_{Surname, DegreeProg}(Students)
Surname  DegreeProg
Smith    Computing
Smith    Engineering
Black    Fine Art

Tuples that do not Collapse, "by Chance"

Students
RegNum  Surname  FirstName  BirthDate  DegreeProg
296328  Smith    John       29/04/59   Computing
587614  Smith    Lucy       01/05/61   Engineering
934856  Black    Lucy       01/05/61   Fine Art
965536  Black    Lucy       05/03/58   Engineering

π_{Surname, DegreeProg}(Students)
Surname  DegreeProg
Smith    Computing
Smith    Engineering
Black    Fine Art
Black    Engineering

Join

The most used operator in the relational algebra.
Allows us to establish connections among data in different relations, taking advantage of the "value-based" nature of the relational model.
Two main versions of the join:
  "natural" join: takes attribute names into account;
  "theta" join.
Both join operations are denoted by the symbol ⋈.

A Natural Join

r1
Employee  Department
Smith     sales
Black     production
White     production

r2
Department  Head
production  Mori
sales       Brown

r1 ⋈ r2
Employee  Department  Head
Smith     sales       Brown
Black     production  Mori
White     production  Mori
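The natural join in the example above can be sketched in Python as a nested loop that pairs tuples agreeing on the attributes the two relations have in common. This is a minimal illustration under assumptions: the `Relation` type and `natural_join` are hypothetical names, and no attempt is made at efficiency.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """Illustrative relation type: attribute names plus a set of value tuples."""
    attrs: tuple
    rows: frozenset


def natural_join(r: Relation, s: Relation) -> Relation:
    """Combine tuples of r and s that have equal values on the common attributes."""
    common = [a for a in r.attrs if a in s.attrs]
    extra = [a for a in s.attrs if a not in r.attrs]
    r_pos = [r.attrs.index(a) for a in common]
    s_pos = [s.attrs.index(a) for a in common]
    extra_pos = [s.attrs.index(a) for a in extra]

    rows = set()
    for t1 in r.rows:
        for t2 in s.rows:
            # With no common attributes this test is trivially true,
            # so every pair of tuples is combined.
            if all(t1[i] == t2[j] for i, j in zip(r_pos, s_pos)):
                rows.add(t1 + tuple(t2[k] for k in extra_pos))
    return Relation(r.attrs + tuple(extra), frozenset(rows))


r1 = Relation(
    ("Employee", "Department"),
    frozenset({("Smith", "sales"), ("Black", "production"), ("White", "production")}),
)
r2 = Relation(
    ("Department", "Head"),
    frozenset({("production", "Mori"), ("sales", "Brown")}),
)

joined = natural_join(r1, r2)
print(joined.attrs)
for row in sorted(joined.rows):
    print(row)
```

The result schema lists the attributes of the first operand followed by the attributes that appear only in the second, so the common attribute `Department` occurs once.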
Definition of Natural Join

r1(X1), r2(X2)
r1 ⋈ r2 (natural join of r1 and r2) is a relation on X1X2 (the union of the two sets):
  { t on X1X2 | t[X1] ∈ r1 and t[X2] ∈ r2 }

Natural Join: Comments

The tuples in the resulting relation are obtained by combining tuples in the operands with equal values on the common attributes
The common attributes often form a key of one of the operands (remember: references