
Relational algebra

Introduction to Database Design 2012, Lecture 5

Rasmus Ejlers Møgelberg

Overview

• Use of logic in constructing queries
• Relational algebra
• Translating queries to relational algebra
• Equations expressed in relational algebra

Use of logic in constructing queries

• Consider the following problem

Find all students who have taken all courses offered by the biology department

• Expressed more formally

Find all students s such that for all courses c, if c is offered by ‘Biology’ then s has taken c

• Translate to SQL:

select * from student where [???]


Use of logic in constructing queries

Find all students s such that for all courses c, if c is offered by ‘Biology’ then s has taken c

• This formulation is not suitable for SQL, because it uses ‘for all’ and ‘if ... then ...’, for which SQL has no direct constructs
• So reformulate

Find all students s such that there is no course c such that c is offered by ‘Biology’ and s has not taken c

• (using classical logic)

Use of logic in constructing queries

Find all students s such that there is no course c such that c is offered by ‘Biology’ and s has not taken c

• This can be formulated in SQL:

select * from student
where not exists
  (select * from course
   where dept_name = 'Biology'
   and course_id not in
     (select course_id from takes
      where takes.id = student.id))

• The middle subquery finds all courses offered by Biology not taken by the student
• The innermost subquery finds all courses taken by the student
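The reformulated query can be tried on a small data set. The following is a minimal sketch using Python's built-in sqlite3 module rather than MySQL; the table and column names follow the slides, but the three tables are cut down to the columns the query needs and the rows (Ann, Bob and their courses) are made up for illustration.

```python
import sqlite3

# Hypothetical miniature database: only the table and column names
# come from the slides, the rows are invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table student (id text, name text);
    create table course  (course_id text, dept_name text);
    create table takes   (id text, course_id text);
    insert into student values ('1', 'Ann'), ('2', 'Bob');
    insert into course  values ('BIO-101', 'Biology'),
                               ('BIO-301', 'Biology'),
                               ('CS-101',  'Comp. Sci.');
    insert into takes   values ('1', 'BIO-101'), ('1', 'BIO-301'),
                               ('2', 'BIO-101'), ('2', 'CS-101');
""")

# "There is no Biology course that the student has not taken."
query = """
    select name from student
    where not exists
      (select * from course
       where dept_name = 'Biology'
       and course_id not in
         (select course_id from takes
          where takes.id = student.id))
"""
all_biology = [row[0] for row in conn.execute(query)]
print(all_biology)  # ['Ann']
```

Ann has taken both Biology courses, so the inner "missing courses" query is empty for her and `not exists` holds. Bob is missing BIO-301 and is filtered out.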


More logic

• Similar analysis is needed for the challenging exercises this week
• You will see more logic in the course Foundations of Computing

Relational algebra

• A language for expressing basic operations in the relational model
• Two purposes
  - Express meaning of queries
  - Express execution plans in DBMSs
• SQL is declarative (what)
• Relational algebra is procedural (how)

Relational algebra in DBMSs

Illustration from book


Projection

• In SQL

select name, salary from instructor;

• In relational algebra

$\Pi_{\text{name, salary}}(\text{instructor})$

Selection

select * from instructor where salary > 90000;

$\sigma_{\text{salary} > 90000}(\text{instructor})$


Combining selection and projection

select name, dept_name from instructor where salary > 90000;

$\Pi_{\text{name, dept\_name}}(\sigma_{\text{salary} > 90000}(\text{instructor}))$

Translating SQL into relational algebra

• Expression

select name, dept_name from instructor where salary > 90000;

• Is translated to

$\Pi_{\text{name, dept\_name}}(\sigma_{\text{salary} > 90000}(\text{instructor}))$

• Relational algebra expression says
  - First do selection
  - Then do projection
• Relational algebra is procedural
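The "first select, then project" reading can be sketched in Python. This is an illustration only, assuming a relation is modelled as a list of dictionaries; the helper names `select` and `project` are chosen for this example and are not part of any library. The three rows are taken from the instructor table used in the lecture.

```python
def select(predicate, relation):
    """Selection (sigma): keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(attributes, relation):
    """Projection (Pi): keep only the named attributes.
    Duplicate rows are dropped, as relations are sets."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attributes}
        if reduced not in result:
            result.append(reduced)
    return result

instructor = [
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000},
    {"ID": "22222", "name": "Einstein", "dept_name": "Physics", "salary": 95000},
    {"ID": "83821", "name": "Brandt", "dept_name": "Comp. Sci.", "salary": 92000},
]

# Inner call runs first (selection), outer call second (projection).
well_paid = project(["name", "dept_name"],
                    select(lambda r: r["salary"] > 90000, instructor))
print(well_paid)
```

The nesting of the function calls mirrors the nesting of the relational algebra expression: the innermost operation is evaluated first.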


Syntax trees

• The syntax tree

      *
     / \
    +   5
   / \
  3   4

• represents the expression (3+4)*5
• Trees grow downwards in computer science!
• Evaluation from bottom up
• Useful graphical way of representing evaluation order (no need for parentheses)
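Bottom-up evaluation of a syntax tree can be sketched as a short recursive function. The tuple encoding of trees below is an assumption made for this example, not notation from the lecture.

```python
import operator

# A tree is either a number (a leaf) or a tuple
# (operator symbol, left subtree, right subtree).
OPS = {"+": operator.add, "*": operator.mul}

def evaluate(tree):
    """Evaluate the subtrees first, then apply the operator at the root."""
    if isinstance(tree, tuple):
        symbol, left, right = tree
        return OPS[symbol](evaluate(left), evaluate(right))
    return tree

expression = ("*", ("+", 3, 4), 5)   # the tree for (3+4)*5
value = evaluate(expression)
print(value)  # 35
```

The shape of the tree fixes the evaluation order, which is why no parentheses are needed.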

Syntax trees for relational algebra

    $\Pi_{\text{name, dept\_name}}$
                  |
    $\sigma_{\text{salary} > 90000}$
                  |
             instructor

• represents $\Pi_{\text{name, dept\_name}}(\sigma_{\text{salary} > 90000}(\text{instructor}))$


Cartesian products

mysql> select * from instructor, department;
+-------+------------+------------+----------+------------+----------+-----------+
| ID    | name       | dept_name  | salary   | dept_name  | building | budget    |
+-------+------------+------------+----------+------------+----------+-----------+
| 10101 | Srinivasan | Comp. Sci. | 65000.00 | Biology    | Watson   |  90000.00 |
| 10101 | Srinivasan | Comp. Sci. | 65000.00 | Comp. Sci. | Taylor   | 100000.00 |
| 10101 | Srinivasan | Comp. Sci. | 65000.00 | Elec. Eng. | Taylor   |  85000.00 |
| 10101 | Srinivasan | Comp. Sci. | 65000.00 | Finance    | Painter  | 120000.00 |
| 10101 | Srinivasan | Comp. Sci. | 65000.00 | History    | Painter  |  50000.00 |
| 10101 | Srinivasan | Comp. Sci. | 65000.00 | Music      | Packard  |  80000.00 |
| 10101 | Srinivasan | Comp. Sci. | 65000.00 | Physics    | Watson   |  70000.00 |
| 12121 | Wu         | Finance    | 90000.00 | Biology    | Watson   |  90000.00 |
| 12121 | Wu         | Finance    | 90000.00 | Comp. Sci. | Taylor   | 100000.00 |
| 12121 | Wu         | Finance    | 90000.00 | Elec. Eng. | Taylor   |  85000.00 |
| 12121 | Wu         | Finance    | 90000.00 | Finance    | Painter  | 120000.00 |
| 12121 | Wu         | Finance    | 90000.00 | History    | Painter  |  50000.00 |
| 12121 | Wu         | Finance    | 90000.00 | Music      | Packard  |  80000.00 |
| 12121 | Wu         | Finance    | 90000.00 | Physics    | Watson   |  70000.00 |
| 15151 | Mozart     | Music      | 40000.00 | Biology    | Watson   |  90000.00 |
| 15151 | Mozart     | Music      | 40000.00 | Comp. Sci. | Taylor   | 100000.00 |
| 15151 | Mozart     | Music      | 40000.00 | Elec. Eng. | Taylor   |  85000.00 |
| 15151 | Mozart     | Music      | 40000.00 | Finance    | Painter  | 120000.00 |
| 15151 | Mozart     | Music      | 40000.00 | History    | Painter  |  50000.00 |
...
+-------+------------+------------+----------+------------+----------+-----------+
84 rows in set (0.01 sec)

Products

select * from instructor, department;

• In relational algebra

$\text{instructor} \times \text{department}$

• Syntax tree

              ×
             / \
   instructor   department
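The row count of 84 in the MySQL output is simply the number of instructors times the number of departments. A minimal sketch with `itertools.product`, using only the instructor IDs and department names that appear in the lecture's tables:

```python
from itertools import product

# 12 instructor IDs and 7 department names, as listed in the lecture's tables.
instructor_ids = ["10101", "12121", "15151", "22222", "32343", "33456",
                  "45565", "58583", "76543", "76766", "83821", "98345"]
department_names = ["Biology", "Comp. Sci.", "Elec. Eng.", "Finance",
                    "History", "Music", "Physics"]

# The Cartesian product pairs every instructor with every department.
pairs = list(product(instructor_ids, department_names))
print(len(pairs))  # 12 * 7 = 84
```

Most of these pairs are meaningless combinations, which is why a product is normally followed by a selection.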


Relational model: natural join

mysql> select * from instructor natural join department;
+------------+-------+------------+----------+----------+-----------+
| dept_name  | ID    | name       | salary   | building | budget    |
+------------+-------+------------+----------+----------+-----------+
| Comp. Sci. | 10101 | Srinivasan | 65000.00 | Taylor   | 100000.00 |
| Finance    | 12121 | Wu         | 90000.00 | Painter  | 120000.00 |
| Music      | 15151 | Mozart     | 40000.00 | Packard  |  80000.00 |
| Physics    | 22222 | Einstein   | 95000.00 | Watson   |  70000.00 |
| History    | 32343 | El Said    | 60000.00 | Painter  |  50000.00 |
| Physics    | 33456 | Gold       | 87000.00 | Watson   |  70000.00 |
| Comp. Sci. | 45565 | Katz       | 75000.00 | Taylor   | 100000.00 |
| History    | 58583 | Califieri  | 62000.00 | Painter  |  50000.00 |
| Finance    | 76543 | Singh      | 80000.00 | Painter  | 120000.00 |
| Biology    | 76766 | Crick      | 72000.00 | Watson   |  90000.00 |
| Comp. Sci. | 83821 | Brandt     | 92000.00 | Taylor   | 100000.00 |
| Elec. Eng. | 98345 | Kim        | 80000.00 | Taylor   |  85000.00 |
+------------+-------+------------+----------+----------+-----------+
12 rows in set (0.01 sec)

• First take the product, then select, then project

Join in relational algebra

• Join can be defined using other constructors

    $\Pi_{\text{dept\_name, ID}, \ldots, \text{budget}}$
                      |
    $\sigma_{\text{instructor.dept\_name} = \text{department.dept\_name}}$
                      |
                      ×
                     / \
           instructor   department
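The three steps of the tree (product, selection, projection) can be sketched directly in Python. This is an illustration under assumptions: relations are lists of dictionaries, the helper `qualified_product` is a name invented here, and only a few rows from the lecture's tables are used.

```python
from itertools import product

instructor = [
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000},
    {"ID": "12121", "name": "Wu", "dept_name": "Finance", "salary": 90000},
]
department = [
    {"dept_name": "Comp. Sci.", "building": "Taylor", "budget": 100000},
    {"dept_name": "Finance", "building": "Painter", "budget": 120000},
    {"dept_name": "Music", "building": "Packard", "budget": 80000},
]

def qualified_product(r, r_name, s, s_name):
    """Product: every pairing of rows, with attributes prefixed by relation name."""
    return [
        {**{f"{r_name}.{k}": v for k, v in a.items()},
         **{f"{s_name}.{k}": v for k, v in b.items()}}
        for a, b in product(r, s)
    ]

# Step 1: product (2 * 3 = 6 rows).
pairs = qualified_product(instructor, "instructor", department, "department")
# Step 2: selection on equal dept_name.
matching = [p for p in pairs
            if p["instructor.dept_name"] == p["department.dept_name"]]
# Step 3: projection, keeping dept_name only once.
joined = [
    {"dept_name": p["instructor.dept_name"], "ID": p["instructor.ID"],
     "name": p["instructor.name"], "salary": p["instructor.salary"],
     "building": p["department.building"], "budget": p["department.budget"]}
    for p in matching
]
print(len(pairs), len(joined))  # 6 2
```

The projection is what removes the duplicated dept_name column that shows up in the plain product.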


Computation of joins

• In practice joins are not always computed this way
• Consider e.g.

• Can often find relevant entry on right hand side fast without having to construct the Cartesian product
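One way to "find the relevant entry fast" is to build a lookup table on the right-hand relation. The sketch below contrasts this with the product-then-select approach. It is an assumption about what such a strategy can look like, not a description of how any particular DBMS computes joins; both function names are invented for this example.

```python
def nested_loop_join(left, right, key):
    """Product-then-select: compares every left row with every right row."""
    comparisons = 0
    result = []
    for l in left:
        for r in right:
            comparisons += 1
            if l[key] == r[key]:
                result.append({**l, **r})
    return result, comparisons

def index_join(left, right, key):
    """Build a dictionary on the right relation, then do one lookup per left row."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    result = []
    for l in left:
        for r in index.get(l[key], []):
            result.append({**l, **r})
    return result

instructor = [
    {"name": "Srinivasan", "dept_name": "Comp. Sci."},
    {"name": "Wu", "dept_name": "Finance"},
    {"name": "Mozart", "dept_name": "Music"},
]
department = [
    {"dept_name": "Comp. Sci.", "building": "Taylor"},
    {"dept_name": "Finance", "building": "Painter"},
    {"dept_name": "Music", "building": "Packard"},
]

slow, comparisons = nested_loop_join(instructor, department, "dept_name")
fast = index_join(instructor, department, "dept_name")
print(comparisons)   # 9 comparisons for 3 * 3 rows
print(slow == fast)  # True
```

Both functions return the same rows, but the second never materialises the nine pairs of the product.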

Expressing execution plans

• DBMSs use a variant of relational algebra for this
• Still, basic relational algebra is a good way of understanding the meaning of queries


General joins

• Define

$R \bowtie_{\Theta} S = \sigma_{\Theta}(R \times S)$

• For example

select * from student join advisor on s_ID = ID

• Is translated to relational algebra as

$\text{student} \bowtie_{\text{ID} = \text{s\_ID}} \text{advisor}$
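The definition of the general join translates almost word for word into Python. A minimal sketch, assuming the two relations have disjoint attribute names; the advisor row (including the `i_ID` attribute and its value) is hypothetical.

```python
from itertools import product

def theta_join(r, s, theta):
    """General join: selection with condition theta applied to the product.
    Attribute names of r and s are assumed to be disjoint."""
    return [{**a, **b} for a, b in product(r, s) if theta({**a, **b})]

student = [{"ID": "00128", "name": "Zhang"},
           {"ID": "12345", "name": "Shankar"}]
advisor = [{"s_ID": "00128", "i_ID": "45565"}]  # hypothetical pairing

result = theta_join(student, advisor, lambda row: row["ID"] == row["s_ID"])
print(result)
```

Only Zhang has an advisor row here, so the result has a single row containing the attributes of both relations.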

Set operations

• Usual set operations in relational algebra

$R \cup S \qquad R \cap S \qquad R \setminus S$

• These are only allowed between relations with the same set of attributes!
• Warning:
  - The book treats relational algebra
  - Might have been better to use relational algebra


Using left outer join

mysql> select * from student natural left outer join takes;
+-------+----------+------------+----------+-----------+--------+----------+------+-------+
| ID    | name     | dept_name  | tot_cred | course_id | sec_id | semester | year | grade |
+-------+----------+------------+----------+-----------+--------+----------+------+-------+
| 00128 | Zhang    | Comp. Sci. |      102 | CS-101    | 1      | Fall     | 2009 | A     |
| 00128 | Zhang    | Comp. Sci. |      102 | CS-347    | 1      | Fall     | 2009 | A-    |
| 12345 | Shankar  | Comp. Sci. |       32 | CS-101    | 1      | Fall     | 2009 | C     |
| 12345 | Shankar  | Comp. Sci. |       32 | CS-190    | 2      | Spring   | 2009 | A     |
| 12345 | Shankar  | Comp. Sci. |       32 | CS-315    | 1      | Spring   | 2010 | A     |
| 12345 | Shankar  | Comp. Sci. |       32 | CS-347    | 1      | Fall     | 2009 | A     |
| 19991 | Brandt   | History    |       80 | HIS-351   | 1      | Spring   | 2010 | B     |
| 23121 | Chavez   | Finance    |      110 | FIN-201   | 1      | Spring   | 2010 | C+    |
| 44553 | Peltier  | Physics    |       56 | PHY-101   | 1      | Fall     | 2009 | B-    |
| 45678 | Levy     | Physics    |       46 | CS-101    | 1      | Fall     | 2009 | F     |
| 45678 | Levy     | Physics    |       46 | CS-101    | 1      | Spring   | 2010 | B+    |
| 45678 | Levy     | Physics    |       46 | CS-319    | 1      | Spring   | 2010 | B     |
| 54321 | Williams | Comp. Sci. |       54 | CS-101    | 1      | Fall     | 2009 | A-    |
| 54321 | Williams | Comp. Sci. |       54 | CS-190    | 2      | Spring   | 2009 | B+    |
| 55739 | Sanchez  | Music      |       38 | MU-199    | 1      | Spring   | 2010 | A-    |
| 70557 | Snow     | Physics    |        0 | NULL      | NULL   | NULL     | NULL | NULL  |
| 76543 | Brown    | Comp. Sci. |       58 | CS-101    | 1      | Fall     | 2009 | A     |
| 76543 | Brown    | Comp. Sci. |       58 | CS-319    | 2      | Spring   | 2010 | A     |
| 76653 | Aoi      | Elec. Eng. |       60 | EE-181    | 1      | Spring   | 2009 | C     |
| 98765 | Bourikas | Elec. Eng. |       98 | CS-101    | 1      | Fall     | 2009 | C-    |
| 98765 | Bourikas | Elec. Eng. |       98 | CS-315    | 1      | Spring   | 2010 | B     |
| 98988 | Tanaka   | Biology    |      120 | BIO-101   | 1      | Summer   | 2009 | A     |
| 98988 | Tanaka   | Biology    |      120 | BIO-301   | 1      | Summer   | 2010 | NULL  |
+-------+----------+------------+----------+-----------+--------+----------+------+-------+
23 rows in set (0.00 sec)

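Outer join

• Left outer join defined as

$(\text{student} \bowtie \text{takes}) \;\cup\; \big( (\text{student} - \Pi_{\text{student}}(\text{student} \bowtie \text{takes})) \times \{(\text{null}, \ldots, \text{null})\} \big)$

  - $\Pi_{\text{student}}(\text{student} \bowtie \text{takes})$: students who have taken a course
  - $\text{student} - \Pi_{\text{student}}(\text{student} \bowtie \text{takes})$: students who have not taken a course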
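The definition of the left outer join can be followed step by step in code. This is a sketch under assumptions: relations are lists of dictionaries, `None` stands in for null, the projection onto "student" is read as projection onto the attributes of student, and the relations are cut down to two columns each.

```python
def natural_join(r, s):
    """Join rows that agree on all shared attribute names."""
    return [{**a, **b} for a in r for b in s
            if all(a[k] == b[k] for k in a.keys() & b.keys())]

def left_outer_join(r, s):
    """Natural join, plus the unmatched rows of r padded with nulls."""
    joined = natural_join(r, s)
    r_attrs = list(r[0].keys()) if r else []
    # Projection of the join onto the attributes of r: rows of r with a match.
    matched = [{k: row[k] for k in r_attrs} for row in joined]
    # Difference: rows of r without a match.
    unmatched = [row for row in r if row not in matched]
    # Product with the single all-null row over the remaining attributes of s.
    extra_attrs = [k for k in (s[0].keys() if s else []) if k not in r_attrs]
    padded = [{**row, **{k: None for k in extra_attrs}} for row in unmatched]
    return joined + padded

student = [{"ID": "00128", "name": "Zhang"},
           {"ID": "70557", "name": "Snow"}]
takes = [{"ID": "00128", "course_id": "CS-101"},
         {"ID": "00128", "course_id": "CS-347"}]

rows = left_outer_join(student, takes)
for row in rows:
    print(row)
```

Snow has taken no course, so the row for Snow comes from the second half of the union and has `None` for course_id, matching the NULL entries in the MySQL output.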


Generalised projections

• Projections can be combined with basic operations on numbers, dates, booleans or strings, e.g.

$\Pi_{\text{flight\_num},\ \text{capacity} - \text{reservations}}(\ldots)$

Renaming

• It is often necessary to rename a relation
• The expression

$\rho_{R(a, \ldots, b)}(S)$

• renames relation S to R and the attributes of S to a, ..., b


Aggregation

• Special symbol for aggregation
• SQL

mysql> select avg(salary), dept_name from instructor
    -> group by dept_name;

• Relational algebra

${}_{\text{dept\_name}}\mathcal{G}_{\text{average(salary)}}(\text{instructor})$

Aggregation with having

• First group, then select groups

mysql> select avg(salary), dept_name from instructor
    -> group by dept_name
    -> having count(ID)>1;


Having

• Recall that having is just selection on the group level
• Translate

mysql> select avg(salary), dept_name from instructor
    -> group by dept_name
    -> having avg(salary) > 80000;

• as

    $\sigma_{\text{avg\_salary} > 80000}$
                  |
    $\rho_{R(\text{avg\_salary, dept\_name})}$
                  |
    ${}_{\text{dept\_name}}\mathcal{G}_{\text{average(salary)}}$
                  |
             instructor
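The tree reads bottom up: group and average, name the result column, then select on it. A minimal Python sketch of those steps, using the department and salary values from the instructor table shown earlier; the function name `group_average` is invented for this example, and the renaming step is folded into it by naming the output column `avg_salary` directly.

```python
from collections import defaultdict

def group_average(relation, group_attr, value_attr):
    """Grouping with average: one output row per group.
    The output column is named avg_salary, which plays the role of the renaming."""
    groups = defaultdict(list)
    for row in relation:
        groups[row[group_attr]].append(row[value_attr])
    return [{group_attr: g, "avg_salary": sum(v) / len(v)}
            for g, v in groups.items()]

# (dept_name, salary) pairs from the instructor table in the lecture.
instructor = [
    {"dept_name": d, "salary": s} for d, s in [
        ("Comp. Sci.", 65000), ("Finance", 90000), ("Music", 40000),
        ("Physics", 95000), ("History", 60000), ("Physics", 87000),
        ("Comp. Sci.", 75000), ("History", 62000), ("Finance", 80000),
        ("Biology", 72000), ("Comp. Sci.", 92000), ("Elec. Eng.", 80000),
    ]
]

averages = group_average(instructor, "dept_name", "salary")
# "having" is an ordinary selection, applied after grouping.
high = [row for row in averages if row["avg_salary"] > 80000]
print(high)
```

With these salaries only Finance (85000) and Physics (91000) pass the condition; Elec. Eng. averages exactly 80000 and is excluded by the strict comparison.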

Subqueries

• Example

mysql> select name from instructor,
    -> (select max(salary) as max_salary from instructor) as S
    -> where instructor.salary = S.max_salary;

• Insert tree from subquery in tree from outer query
• (details on blackboard)
• Nested queries in where clause are more involved

mysql> select name from instructor
    -> where salary >= all (select salary from instructor);


Equations

• Many different relational algebra expressions compute the same thing
• When evaluating queries, DBMS will
  - generate many different relational algebra expressions computing the query
  - choose the one it thinks is most efficient
• Here we see some basic equalities of expressions


Relational algebra in DBMSs

Illustration from book

Equalities (examples)

• Selection is commutative

$\sigma_{\Theta_1}(\sigma_{\Theta_2}(R)) = \sigma_{\Theta_2}(\sigma_{\Theta_1}(R)) = \sigma_{\Theta_1 \text{ and } \Theta_2}(R)$

• Join is commutative

$R_1 \bowtie R_2 = R_2 \bowtie R_1$

• (only difference is order of attributes)
• Join is associative

$R_1 \bowtie (R_2 \bowtie R_3) = (R_1 \bowtie R_2) \bowtie R_3$
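These equalities can be spot-checked on small relations. The sketch below is a test on one example, not a proof. Rows are compared as sets of (attribute, value) pairs so that attribute order does not matter; the third relation with a `campus` attribute is hypothetical, as are the helper names.

```python
def select(predicate, relation):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def natural_join(r, s):
    """Join rows that agree on all shared attribute names."""
    return [{**a, **b} for a in r for b in s
            if all(a[k] == b[k] for k in a.keys() & b.keys())]

def as_set(relation):
    """Order-insensitive view: each row as a frozenset of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in relation}

r1 = [{"ID": "10101", "dept_name": "Comp. Sci."},
      {"ID": "15151", "dept_name": "Music"}]
r2 = [{"dept_name": "Comp. Sci.", "building": "Taylor"},
      {"dept_name": "Music", "building": "Packard"}]
r3 = [{"building": "Taylor", "campus": "North"}]  # hypothetical third relation

theta1 = lambda row: row["dept_name"] == "Comp. Sci."
theta2 = lambda row: row["ID"] > "10000"

selection_commutes = (as_set(select(theta1, select(theta2, r1)))
                      == as_set(select(theta2, select(theta1, r1))))
selection_merges = (as_set(select(theta1, select(theta2, r1)))
                    == as_set(select(lambda row: theta1(row) and theta2(row), r1)))
join_commutes = as_set(natural_join(r1, r2)) == as_set(natural_join(r2, r1))
join_associates = (as_set(natural_join(r1, natural_join(r2, r3)))
                   == as_set(natural_join(natural_join(r1, r2), r3)))
print(selection_commutes, selection_merges, join_commutes, join_associates)
```

All four comparisons come out true on this data. The two sides of each equality may still differ a lot in how much intermediate data they produce.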


More equalities

• Suppose Θ1 only talks about attributes of R1 and similarly Θ2 only talks about attributes of R2
• Then

$\sigma_{\Theta_1 \text{ and } \Theta_2}(R_1 \bowtie R_2) = \sigma_{\Theta_1}(R_1) \bowtie \sigma_{\Theta_2}(R_2)$

• Right hand side is often much less expensive to compute (DBMS makes such optimizations automatically)
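The saving from pushing selections below the join can be made visible by counting how many row pairs each side has to consider. A small sketch using a few rows from the lecture's instructor and department tables; the counts refer to a naive nested-loop join, which is an assumption of this example, and the helper names are invented here.

```python
def select(predicate, relation):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def natural_join(r, s):
    """Naive join: considers len(r) * len(s) row pairs."""
    return [{**a, **b} for a in r for b in s
            if all(a[k] == b[k] for k in a.keys() & b.keys())]

instructor = [
    {"name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000},
    {"name": "Wu", "dept_name": "Finance", "salary": 90000},
    {"name": "Einstein", "dept_name": "Physics", "salary": 95000},
    {"name": "Brandt", "dept_name": "Comp. Sci.", "salary": 92000},
]
department = [
    {"dept_name": "Comp. Sci.", "budget": 100000},
    {"dept_name": "Finance", "budget": 120000},
    {"dept_name": "Physics", "budget": 70000},
]

theta1 = lambda row: row["salary"] > 90000   # only mentions instructor attributes
theta2 = lambda row: row["budget"] > 90000   # only mentions department attributes

# Left hand side: join first, then select.
lhs = select(lambda row: theta1(row) and theta2(row),
             natural_join(instructor, department))
pairs_lhs = len(instructor) * len(department)

# Right hand side: select first, then join the smaller relations.
small_r1 = select(theta1, instructor)
small_r2 = select(theta2, department)
rhs = natural_join(small_r1, small_r2)
pairs_rhs = len(small_r1) * len(small_r2)

print(lhs == rhs)            # True
print(pairs_lhs, pairs_rhs)  # 12 4
```

Both sides return the single row for Brandt, but the right hand side joins two relations of two rows each instead of four and three.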

Summary

• After this lecture you should be able to
  - Translate simple queries to relational algebra
  - Draw the syntax tree of relational algebra expressions
• Future goal:
  - Judge which relational algebra expression represents the most efficient evaluation plan for a query
