
Database Systems
Relational Algebra

H. Turgut Uyar, Şule Öğüdücü
2002-2010

License

© 2002-2010 T. Uyar, Ş. Öğüdücü

You are free:
- to Share: to copy, distribute and transmit the work
- to Remix: to adapt the work

Under the following conditions:
- Attribution: You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).
- Noncommercial: You may not use this work for commercial purposes.
- Share Alike: If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.

Legal code (the full license): http://creativecommons.org/licenses/by-nc-sa/3.0/

Topics

Relational Algebra
- Introduction
- Selection
- Joining
- Set Operations

Queries
- Introduction
- Joining
- Subqueries
- Set Operations

Closure

Definition
closure: the input and output of all operations are relations

- the output of one operation can be the input of another operation
- operations can be nested

Leap

- operator first, operands later:
    operator (operand1) (operand2) ...
- data types:
  - string, integer, boolean
  - no real numbers
- dotted notation to distinguish attributes with identical names

Data Definition

Creating Relations
    relation (relation_name) (
      (attribute_name, type, length)
      [, ...]
    )

Deleting Relations
    delrel (relation_name)

Relation Creation Example

Example (creating the movie relation)
    relation (movie) (
      (id, integer, 4),
      (title, string, 80),
      (yr, integer, 4),
      (score, integer, 3),
      (votes, integer, 5),
      (directorid, integer, 4)
    )

Data Manipulation

Adding Tuples
    add (relation_name) (value [, ...])

- values must be given in the order in which the attributes were defined
- no quotes around values

Deleting Tuples
    delete (relation_name) (condition)

- all values must be in quotes when writing conditions

Data Manipulation Examples

Example (adding a movie)
    add (movie) (
      6,
      Usual Suspects,
      1995,
      87,
      35027,
      639
    )

Example (deleting movies)
    delete (movie)
      (score < '30')

Example Relations

Example (MOVIE Relation)

ID    TITLE                           YR    SCORE  VOTES  DIRECTORID
6     Usual Suspects                  1995  87     35027  639
70    Being John Malkovich            1999  83     13809  1485
107   Batman & Robin                  1997  35     10577  105
110   Sleepy Hollow                   1999  75     10514  148
112   Three Kings                     1999  77     10319  0
151   Gattaca                         1997  74     8388   2020
213   Blade                           1998  67     6885   0
228   Ed Wood                         1994  78     6587   148
251   End of Days                     1999  55     6095   103
281   Dangerous Liaisons              1988  77     5651   292
373   Fear and Loathing in Las Vegas  1998  65     4658   59
432   Stigmata                        1999  61     4141   0
433   eXistenZ                        1999  69     4130   97
573   Dead Man                        1995  74     3333   175
1468  Europa                          1991  76     1042   615
1512  Suspiria                        1977  71     1004   2259
1539  Cry-Baby                        1990  59     972    364

Example Relations

Example (PERSON Relation)

ID   NAME                     ID    NAME
9    Arnold Schwarzenegger    308   Gabriel Byrne
26   Johnny Depp              350   Jennifer Jason Leigh
59   Terry Gilliam            364   John Waters
97   David Cronenberg         406   Patricia Arquette
103  Peter Hyams              503   John Malkovich
105  Joel Schumacher          615   Lars von Trier
138  George Clooney           639   Bryan Singer
148  Tim Burton               745   Udo Kier
175  Jim Jarmusch             793   Jude Law
187  Christina Ricci          1485  Spike Jonze
243  Uma Thurman              1641  Iggy Pop
282  Cameron Diaz             2020  Andrew Niccol
292  Stephen Frears           2259  Dario Argento
302  Benicio Del Toro         3578  Traci Lords

Example (CASTING Relation)

MOVIEID  ACTORID  ORD    MOVIEID  ACTORID  ORD    MOVIEID  ACTORID  ORD
6        308      2      213      745      6      432      308      2
6        302      3      213      3578     8      432      406      1
70       282      2      228      26       1      433      350      1
70       503      14     228      406      4      433      793      2
107      9        1      251      9        1      573      26       1
107      138      2      251      308      2      573      308      12
107      243      4      251      745      10     573      1641     6
110      26       1      281      243      7      1468     745      3
110      187      2      281      503      2      1512     745      9
112      138      1      373      26       1      1539     26       1
112      1485     4      373      187      6      1539     1641     5
151      243      2      373      282      8      1539     3578     7
151      793      3      373      302      2

Selection

Definition
selection: creating a new relation from all the tuples that satisfy a condition

Statement
    select (relation_name) (condition)

- also known as: restrict
- output relation header = input relation header

Selection Examples - 1

Example (movies with more than 10000 votes)
    s1 = select (movie) (votes > '10000')

ID   TITLE                 YR    SCORE  VOTES  DIRECTORID
6    Usual Suspects        1995  87     35027  639
70   Being John Malkovich  1999  83     13809  1485
107  Batman & Robin        1997  35     10577  105
110  Sleepy Hollow         1999  75     10514  148
112  Three Kings           1999  77     10319  0
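The selection statement above can be mimicked in a few lines of Python. This is only a sketch: modelling a relation as a list of dicts and the helper name `select` are assumptions made for illustration, not part of Leap, and only four MOVIE tuples are copied in.

```python
# Hypothetical model: a relation is a list of dicts (this is not Leap's API).
def select(relation, condition):
    """Return the tuples of `relation` for which `condition` holds."""
    return [row for row in relation if condition(row)]

# Four tuples copied from the MOVIE relation (directorid left out for brevity).
movie = [
    {"id": 6, "title": "Usual Suspects", "yr": 1995, "score": 87, "votes": 35027},
    {"id": 107, "title": "Batman & Robin", "yr": 1997, "score": 35, "votes": 10577},
    {"id": 213, "title": "Blade", "yr": 1998, "score": 67, "votes": 6885},
    {"id": 1468, "title": "Europa", "yr": 1991, "score": 76, "votes": 1042},
]

# Movies with more than 10000 votes.
s1 = select(movie, lambda m: m["votes"] > 10000)
# Movies older than 1992 with scores higher than 75.
s2 = select(movie, lambda m: m["yr"] < 1992 and m["score"] > 75)

print([m["title"] for m in s1])
print([m["title"] for m in s2])
```

Every selected tuple keeps all of its attributes, which is the "output relation header = input relation header" property.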

Selection Examples - 2

Example (movies older than 1992, with scores higher than 75)
    s2 = select (movie)
      ((yr < '1992') and (score > '75'))

ID    TITLE               YR    SCORE  VOTES  DIRECTORID
281   Dangerous Liaisons  1988  77     5651   292
1468  Europa              1991  76     1042   615

Projection

Definition
projection: creating a new relation using the given attributes

Statement
    project (relation_name) (attribute [, ...])

- output relation header = attribute list
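Projection can be sketched the same way. The helper `project` and the list-of-dicts model are assumptions for illustration only. Because a relation is a set of tuples, the sketch collapses duplicates, which is why projecting MOVIE on yr yields each year once.

```python
# Hypothetical model: a relation is a list of dicts (this is not Leap's API).
def project(relation, attributes):
    """Keep only `attributes`; identical result tuples are emitted once."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

# Four tuples copied from the MOVIE relation, reduced to two attributes.
movie = [
    {"title": "Being John Malkovich", "yr": 1999},
    {"title": "Sleepy Hollow", "yr": 1999},
    {"title": "Gattaca", "yr": 1997},
    {"title": "Three Kings", "yr": 1999},
]

p2 = project(movie, ["title", "yr"])  # one tuple per movie
p3 = project(movie, ["yr"])           # duplicate years collapse
print(p3)
```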

Projection Examples - 1

Example (titles of all movies)
    p1 = project (movie) (title)

TITLE
Usual Suspects
Being John Malkovich
Batman & Robin
Sleepy Hollow
Three Kings
Gattaca
Blade
Ed Wood
End of Days
Dangerous Liaisons
Fear and Loathing in Las Vegas
Stigmata
eXistenZ
Dead Man
Europa
Suspiria
Cry-Baby

Projection Examples - 2

Example (titles and years of all movies)
    p2 = project (movie) (title, yr)

TITLE                           YR
Usual Suspects                  1995
Being John Malkovich            1999
Batman & Robin                  1997
Sleepy Hollow                   1999
Three Kings                     1999
Gattaca                         1997
Blade                           1998
Ed Wood                         1994
End of Days                     1999
Dangerous Liaisons              1988
Fear and Loathing in Las Vegas  1998
Stigmata                        1999
eXistenZ                        1999
Dead Man                        1995
Europa                          1991
Suspiria                        1977
Cry-Baby                        1990

Projection Examples - 3

Example (years of all movies)
    p3 = project (movie) (yr)

YR
1995
1999
1997
1998
1994
1988
1991
1977
1990

Projection Examples - 4

Example (titles of all movies with votes more than 5000 and scores higher than 70)

1. all movies with votes more than 5000 and scores higher than 70
2. titles of all movies with votes more than 5000 and scores higher than 70

Projection Examples - 4

Example (all movies with votes more than 5000 and scores higher than 70)
    p4a = select (movie)
      ((votes > '5000') and (score > '70'))

ID   TITLE                 YR    SCORE  VOTES  DIRECTORID
6    Usual Suspects        1995  87     35027  639
70   Being John Malkovich  1999  83     13809  1485
110  Sleepy Hollow         1999  75     10514  148
112  Three Kings           1999  77     10319  0
151  Gattaca               1997  74     8388   2020
228  Ed Wood               1994  78     6587   148
281  Dangerous Liaisons    1988  77     5651   292

Projection Examples - 4

Example (titles of all movies with votes more than 5000 and scores higher than 70)
    p4 = project (p4a) (title)

TITLE
Usual Suspects
Being John Malkovich
Sleepy Hollow
Three Kings
Gattaca
Ed Wood
Dangerous Liaisons

Projection Examples - 4

Example (titles of all movies with votes more than 5000 and scores higher than 70)
    p4 = project (select (movie)
      ((votes > '5000') and (score > '70')))
      (title)

Product

Definition
product: relation1 × relation2
Cartesian product of two relations

Statement
    product (relation1) (relation2)

- output relation header = relation1 header + relation2 header

Product Example

Example (s2 × p4)
    x1 = product (s2) (p4)

ID    TITLE               YR    SCORE  VOTES  DIRECTORID  p4.TITLE
281   Dangerous Liaisons  1988  77     5651   292         Usual Suspects
281   Dangerous Liaisons  1988  77     5651   292         Being John Malkovich
281   Dangerous Liaisons  1988  77     5651   292         Sleepy Hollow
281   Dangerous Liaisons  1988  77     5651   292         Three Kings
281   Dangerous Liaisons  1988  77     5651   292         Gattaca
281   Dangerous Liaisons  1988  77     5651   292         Ed Wood
281   Dangerous Liaisons  1988  77     5651   292         Dangerous Liaisons
1468  Europa              1991  76     1042   615         Usual Suspects
1468  Europa              1991  76     1042   615         Being John Malkovich
1468  Europa              1991  76     1042   615         Sleepy Hollow
1468  Europa              1991  76     1042   615         Three Kings
1468  Europa              1991  76     1042   615         Gattaca
1468  Europa              1991  76     1042   615         Ed Wood
1468  Europa              1991  76     1042   615         Dangerous Liaisons

Joining

Definition
joining: creating a new relation by matching tuples of two relations over common values of one or more attributes

Statement
    join (relation1) (relation2) (condition)

- equivalent to: select (product (relation1) (relation2)) (condition)
- output relation header = relation1 header + relation2 header
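The statement that a join is a selection over a product can be checked with a small Python sketch. The helper names, the list-of-dicts model and the `person.` prefix used to keep attribute names apart are assumptions for illustration; Leap's own dotted notation may behave differently in detail.

```python
from itertools import product as cartesian

# Hypothetical model: a relation is a list of dicts (this is not Leap's API).
def product(r1, r2, prefix2):
    """Cartesian product; attributes of r2 get a prefix to avoid name clashes."""
    return [{**a, **{f"{prefix2}.{k}": v for k, v in b.items()}}
            for a, b in cartesian(r1, r2)]

def select(relation, condition):
    return [row for row in relation if condition(row)]

def join(r1, r2, prefix2, condition):
    """join = select over product."""
    return select(product(r1, r2, prefix2), condition)

# A few tuples copied from the MOVIE and PERSON relations.
movie = [
    {"id": 6, "title": "Usual Suspects", "directorid": 639},
    {"id": 1468, "title": "Europa", "directorid": 615},
]
person = [
    {"id": 615, "name": "Lars von Trier"},
    {"id": 639, "name": "Bryan Singer"},
    {"id": 745, "name": "Udo Kier"},
]

x = product(movie, person, "person")  # 2 * 3 = 6 tuples
j1a = join(movie, person, "person",
           lambda t: t["directorid"] == t["person.id"])
print([(t["title"], t["person.name"]) for t in j1a])
```

Building the full product first is the simplest way to express the definition; a real system would normally avoid materialising it.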

Join Examples - 1

Example (titles of all movies and the names of their directors)

1. all movies and their directors
2. titles of all movies and the names of their directors

Join Examples - 1

Example (all movies and their directors)
    j1a = join (movie) (person)
      (movie.directorid=person.id)

ID    TITLE                 ...  DIRECTORID  PERSON.ID  NAME
6     Usual Suspects        ...  639         639        Bryan Singer
70    Being John Malkovich  ...  1485        1485       Spike Jonze
107   Batman & Robin        ...  105         105        Joel Schumacher
110   Sleepy Hollow         ...  148         148        Tim Burton
151   Gattaca               ...  2020        2020       Andrew Niccol
...
433   eXistenZ              ...  97          97         David Cronenberg
573   Dead Man              ...  175         175        Jim Jarmusch
1468  Europa                ...  615         615        Lars von Trier
1512  Suspiria              ...  2259        2259       Dario Argento
1539  Cry-Baby              ...  364         364        John Waters

Join Examples - 1

Example (titles of all movies and the names of their directors)
    j1 = project (j1a) (title, name)

TITLE                 NAME
Usual Suspects        Bryan Singer
Being John Malkovich  Spike Jonze
Batman & Robin        Joel Schumacher
Sleepy Hollow         Tim Burton
Gattaca               Andrew Niccol
...
eXistenZ              David Cronenberg
Dead Man              Jim Jarmusch
Europa                Lars von Trier
Suspiria              Dario Argento
Cry-Baby              John Waters

Join Examples - 2

Example (titles of all movies and the names of their actors along with their ordinal numbers)

1. all movies and the identities of their actors
2. all movies and their actors
3. titles of all movies and the names of their actors along with their ordinal numbers

Join Examples - 2

Example (all movies and the identities of their actors)
    j2a = join (movie) (casting)
      (movie.id=casting.movieid)

ID    TITLE                 ...  MOVIEID  ACTORID  ORD
6     Usual Suspects        ...  6        308      2
6     Usual Suspects        ...  6        302      3
70    Being John Malkovich  ...  70       282      2
70    Being John Malkovich  ...  70       503      14
...
1512  Suspiria              ...  1512     745      9
1539  Cry-Baby              ...  1539     26       1
1539  Cry-Baby              ...  1539     1641     5
1539  Cry-Baby              ...  1539     3578     7

Join Examples - 2

Example (all movies and their actors)
    j2b = join (j2a) (person)
      (j2a.actorid=person.id)

ID    TITLE                 ...  ACTORID  ORD  PERSON.ID  NAME
6     Usual Suspects        ...  308      2    308        Gabriel Byrne
6     Usual Suspects        ...  302      3    302        Benicio Del Toro
70    Being John Malkovich  ...  282      2    282        Cameron Diaz
70    Being John Malkovich  ...  503      14   503        John Malkovich
...
1512  Suspiria              ...  745      9    745        Udo Kier
1539  Cry-Baby              ...  26       1    26         Johnny Depp
1539  Cry-Baby              ...  1641     5    1641       Iggy Pop
1539  Cry-Baby              ...  3578     7    3578       Traci Lords

Join Examples - 2

Example (titles of all movies and the names of their actors along with their ordinal numbers)
    j2 = project (j2b) (title, name, ord)

TITLE                 NAME              ORD
Usual Suspects        Gabriel Byrne     2
Usual Suspects        Benicio Del Toro  3
Being John Malkovich  Cameron Diaz      2
Being John Malkovich  John Malkovich    14
...
Suspiria              Udo Kier          9
Cry-Baby              Johnny Depp       1
Cry-Baby              Iggy Pop          5
Cry-Baby              Traci Lords       7

Join Examples - 3

Example (names of actors who played with Johnny Depp)

1. Johnny Depp's identity
2. identities of movies Johnny Depp was in
3. identities of actors who were in the movies Johnny Depp was in
4. names of actors who played with Johnny Depp

Join Examples - 3

Example (Johnny Depp's identity)
    j3a = project (select (person)
      (name='Johnny Depp'))
      (id)

ID
26

Join Examples - 3

Example (identities of movies Johnny Depp was in)
    j3b = project (join (j3a) (casting)
      (j3a.id=casting.actorid))
      (movieid)

MOVIEID
110
228
373
573
1539

Join Examples - 3

Example (identities of actors who were in the movies Johnny Depp was in)
    j3c = project (join (j3b) (casting)
      (j3b.movieid=casting.movieid))
      (actorid)

ACTORID
26
187
406
282
302
308
1641
3578

Join Examples - 3

Example (names of actors who played with Johnny Depp)
    j3 = project (join (j3c) (person)
      (j3c.actorid=person.id))
      (name)

NAME
Johnny Depp
Christina Ricci
Patricia Arquette
Cameron Diaz
Benicio Del Toro
Gabriel Byrne
Iggy Pop
Traci Lords

Natural Join

Definition
natural join: joining two relations over the attributes which have the same name

Statement
    natjoin (relation1) (relation2)

- output relation header = relation1 header ∪ relation2 header

Natural Join

- in Leap, a condition must be predefined for a natural join
    add (relship) (frelation, prelation,
      fkey1, fkey2, fkey3,
      pkey1, pkey2, pkey3)

Example (movie.directorid → person.id)
    add (relship) (movie, person,
      directorid, -, -,
      id, -, -)

Natural Join Example

Example (all movies and their directors)
    n1 = natjoin (movie) (person)

ID    TITLE                           ...  DIRECTORID  NAME
6     Usual Suspects                  ...  639         Bryan Singer
70    Being John Malkovich            ...  1485        Spike Jonze
107   Batman & Robin                  ...  105         Joel Schumacher
110   Sleepy Hollow                   ...  148         Tim Burton
151   Gattaca                         ...  2020        Andrew Niccol
228   Ed Wood                         ...  148         Tim Burton
251   End of Days                     ...  103         Peter Hyams
281   Dangerous Liaisons              ...  292         Stephen Frears
373   Fear and Loathing in Las Vegas  ...  59          Terry Gilliam
433   eXistenZ                        ...  97          David Cronenberg
573   Dead Man                        ...  175         Jim Jarmusch
1468  Europa                          ...  615         Lars von Trier
1512  Suspiria                        ...  2259        Dario Argento
1539  Cry-Baby                        ...  364         John Waters

Division

Definition
division: relation1 / relation2
creating a new relation from the tuples of the first relation that contain all the tuples in the second relation

Statement
    divide (relation1) (relation2)

- relation2 header ⊆ relation1 header
- output relation header = relation1 header - relation2 header

Division Example

Example (titles of movies in which Johnny Depp and Christina Ricci both played)

1. identities of Johnny Depp and Christina Ricci
2. identities of all movies and actors
3. identities of movies in which Johnny Depp and Christina Ricci both played
4. titles of movies in which Johnny Depp and Christina Ricci both played

Division Example

Example (identities of Johnny Depp and Christina Ricci)
    v1a = project (select (person)
      ((name='Johnny Depp')
      or (name='Christina Ricci')))
      (id)

ID
26
187

Division Example

Example (identities of all movies and actors)
    v1b = project (casting) (movieid, actorid)

MOVIEID  ACTORID
6        308
6        302
70       282
70       503
...
110      26
110      187
...
228      26
...
373      26
373      187
...
573      26
...
1512     745
1539     26
1539     1641
1539     3578

Division Example

Example (identities of movies in which Johnny Depp and Christina Ricci both played)
    v1c = project (divide (v1b) (v1a)) (movieid)

MOVIEID
110
373

Division Example

Example (titles of movies in which Johnny Depp and Christina Ricci both played)
    v1 = project (join (v1c) (movie)
      (v1c.movieid=movie.id))
      (title)

TITLE
Sleepy Hollow
Fear and Loathing in Las Vegas

Division Example

Example (product - division relationship)
    product (v1c) (v1a)

MOVIEID  ACTORID
110      26
110      187
373      26
373      187
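Division is the least intuitive of the operations, so a small Python sketch may help. The helper `divide`, its `keep` and `over` parameters and the list-of-dicts model are assumptions for illustration. The sketch also names the attribute of v1a `actorid` so that it matches v1b, whereas the slides project it as `id`.

```python
# Hypothetical model: a relation is a list of dicts (this is not Leap's API).
def divide(r1, r2, keep, over):
    """Values of `keep` in r1 that appear with every `over` value of r2."""
    required = {row[over] for row in r2}
    found = {}
    for row in r1:
        found.setdefault(row[keep], set()).add(row[over])
    return sorted(k for k, values in found.items() if required <= values)

# (movieid, actorid) pairs copied from the CASTING relation.
v1b = [
    {"movieid": 110, "actorid": 26}, {"movieid": 110, "actorid": 187},
    {"movieid": 228, "actorid": 26}, {"movieid": 228, "actorid": 406},
    {"movieid": 373, "actorid": 26}, {"movieid": 373, "actorid": 187},
    {"movieid": 373, "actorid": 282},
    {"movieid": 573, "actorid": 26},
]
# Johnny Depp (26) and Christina Ricci (187).
v1a = [{"actorid": 26}, {"actorid": 187}]

v1c = divide(v1b, v1a, keep="movieid", over="actorid")
print(v1c)
```

The product - division relationship shows up here as well: every pair formed from a movie in `v1c` and an actor in `v1a` is present in `v1b`.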

Intersection

Definition
intersection: relation1 ∩ relation2
creating a new relation from the tuples found in both relations

Statement
    intersect (relation1) (relation2)

- output relation header = relation1 header = relation2 header

Intersection Example

Example (names of all directors who also acted)

1. identities of all directors
2. identities of all actors
3. identities of all directors who also acted
4. names of all directors who also acted

Intersection Example

Example (identities of all directors)
    i1a = project (movie) (directorid)

DIRECTORID
639
1485
105
148
0
2020
103
292
59
97
175
615
2259
364

Intersection Example

Example (identities of all actors)
    i1b = project (casting) (actorid)

ACTORID
308
302
282
503
9
138
243
26
187
1485
793
745
3578
406
350
1641

Intersection Example

Example (identities of all directors who also acted)
    i1c = intersect (i1a) (i1b)

ACTORID
1485

Intersection Example

Example (names of all directors who also acted)
    i1 = project (join (i1c) (person)
      (actorid=person.id))
      (name)

NAME
Spike Jonze

Union

Definition
union: relation1 ∪ relation2
creating a new relation from the tuples which are found in at least one of the relations

Statement
    union (relation1) (relation2)

- output relation header = relation1 header = relation2 header

Union Example

Example (names of directors and actors of all movies newer than 1997)

1. all movies newer than 1997
2. all movies newer than 1997 and their actors
3. identities of directors of all movies newer than 1997
4. identities of actors of all movies newer than 1997
5. identities of directors and actors of all movies newer than 1997
6. names of directors and actors of all movies newer than 1997

Union Example

Example (all movies newer than 1997)
    u1a = select (movie) (yr > '1997')

ID   TITLE                           YR    SCORE  VOTES  DIRECTORID
70   Being John Malkovich            1999  83     13809  1485
110  Sleepy Hollow                   1999  75     10514  148
112  Three Kings                     1999  77     10319  0
213  Blade                           1998  67     6885   0
251  End of Days                     1999  55     6095   103
373  Fear and Loathing in Las Vegas  1998  65     4658   59
432  Stigmata                        1999  61     4141   0
433  eXistenZ                        1999  69     4130   97

Union Example

Example (all movies newer than 1997 and their actors)
    u1b = join (u1a) (casting)
      (u1a.id=casting.movieid)

ID   TITLE                           ...  DIRECTORID  MOVIEID  ACTORID  ORD
70   Being John Malkovich            ...  1485        70       282      2
70   Being John Malkovich            ...  1485        70       503      14
110  Sleepy Hollow                   ...  148         110      26       1
110  Sleepy Hollow                   ...  148         110      187      2
112  Three Kings                     ...  0           112      138      1
...
373  Fear and Loathing in Las Vegas  ...  59          373      302      2
432  Stigmata                        ...  0           432      308      2
432  Stigmata                        ...  0           432      406      1
433  eXistenZ                        ...  97          433      350      1
433  eXistenZ                        ...  97          433      793      2

Union Example

Example (identities of directors of all movies newer than 1997)
    u1c = project (u1b) (directorid)

DIRECTORID
1485
148
0
103
59
97

Union Example

Example (identities of actors of all movies newer than 1997)
    u1d = project (u1b) (actorid)

ACTORID
282
503
26
187
138
1485
745
3578
9
308
302
406
350
793

Union Example

Example (identities of directors and actors of all movies newer than 1997)
    u1e = union (u1c) (u1d)

ACTORID
1485
148
0
103
59
97
282
503
26
187
138
745
3578
9
308
302
406
350
793

Union Example

Example (names of directors and actors of all movies newer than 1997)
    u1 = project (join (u1e) (person)
      (u1e.actorid=person.id))
      (name)

NAME
Spike Jonze
Tim Burton
Peter Hyams
Terry Gilliam
David Cronenberg
Cameron Diaz
John Malkovich
Johnny Depp
Christina Ricci
George Clooney
Udo Kier
Traci Lords
Arnold Schwarzenegger
Gabriel Byrne
Benicio Del Toro
Patricia Arquette
Jennifer Jason Leigh
Jude Law

Difference

Definition
difference: relation1 - relation2
creating a new relation from the tuples which are found in the first relation but not in the second

Statement
    difference (relation1) (relation2)

- output relation header = relation1 header = relation2 header

Difference Example

Example (names of actors who have not played with Johnny Depp)

1. identities of actors who played with Johnny Depp
2. identities of all actors
3. identities of actors who have not played with Johnny Depp
4. names of actors who have not played with Johnny Depp

Difference Example

Example (identities of actors who have not played with Johnny Depp)
    d1a = difference (i1b) (j3c)

ACTORID
503
9
138
243
1485
793
745
350

Difference Example

Example (names of actors who have not played with Johnny Depp)
    d1 = project (join (d1a) (person)
      (d1a.actorid=person.id))
      (name)

NAME
John Malkovich
Arnold Schwarzenegger
George Clooney
Uma Thurman
Spike Jonze
Jude Law
Udo Kier
Jennifer Jason Leigh
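For single-attribute relations such as the identity lists in these examples, the three set operations map directly onto Python's built-in set type. The sketch below reuses the values of i1a, i1b and j3c from the examples; using bare integers instead of one-attribute tuples is a simplification made here for illustration.

```python
# Identity sets copied from the examples (one-attribute relations as plain sets).
i1a = {639, 1485, 105, 148, 0, 2020, 103, 292, 59, 97, 175, 615, 2259, 364}  # directors
i1b = {308, 302, 282, 503, 9, 138, 243, 26,
       187, 1485, 793, 745, 3578, 406, 350, 1641}                            # actors
j3c = {26, 187, 406, 282, 302, 308, 1641, 3578}  # actors who played with Johnny Depp

i1c = i1a & i1b       # intersection: directors who also acted
d1a = i1b - j3c       # difference: actors who have not played with Johnny Depp
everyone = i1a | i1b  # union: anyone who directed or acted

print(sorted(i1c))
print(sorted(d1a))
print(len(everyone))
```

The union has 29 members rather than 30 because identity 1485 occurs in both inputs and a set keeps it once.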

References

Required text: Date
- Chapter 7: Relational Algebra
  - 7.1. Introduction
  - 7.2. Closure Revisited
  - 7.4. The Original Algebra: Semantics

Simple Queries

Statement
    SELECT [ ALL | DISTINCT ] column_name [, ...]
      FROM table_name

- identical rows are allowed
  - ALL: preserve identical rows (default)
  - DISTINCT: identical rows appear only once
- *: all columns
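The difference between ALL and DISTINCT can be tried out with Python's standard `sqlite3` module. This is a sketch under assumptions: the in-memory database and the three-row MOVIE table are set up only for the demonstration, with rows copied from the MOVIE relation used earlier.

```python
import sqlite3

# Throwaway in-memory database holding a reduced MOVIE table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE MOVIE (ID INTEGER, TITLE TEXT, YR INTEGER)")
con.executemany("INSERT INTO MOVIE VALUES (?, ?, ?)", [
    (70, "Being John Malkovich", 1999),
    (110, "Sleepy Hollow", 1999),
    (151, "Gattaca", 1997),
])

# ALL (the default) keeps identical rows; DISTINCT returns each row once.
all_rows = con.execute("SELECT ALL YR FROM MOVIE").fetchall()
distinct_rows = con.execute("SELECT DISTINCT YR FROM MOVIE").fetchall()

print(all_rows)
print(sorted(distinct_rows))
```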

Query Examples

Example (all data of all movies)
    SELECT * FROM MOVIE

Example (titles and years of all movies)
    SELECT TITLE, YR FROM MOVIE

Example (years when movies were filmed)
    SELECT DISTINCT YR FROM MOVIE

Sorting Results

Statement
    SELECT [ ALL | DISTINCT ] column_name [, ...]
      FROM table_name
      [ ORDER BY { column_name [ ASC | DESC ] } [, ...] ]

- sort order:
  - ASC: ascending (default)
  - DESC: descending

Query Examples

Example (years when movies were filmed, in ascending order)
    SELECT DISTINCT YR FROM MOVIE
      ORDER BY YR

Example (years when movies were filmed, in descending order)
    SELECT DISTINCT YR FROM MOVIE
      ORDER BY YR DESC

Expressions

Statement
    SELECT [ ALL | DISTINCT ]
      { expression [ AS column_name ] } [, ...]
      FROM table_name
      [ ORDER BY { column_name [ ASC | DESC ] } [, ...] ]

- the new column can be named
- the name or the position of the column can be used for sorting

Query Examples

Example (titles and total scores of all movies)
    SELECT TITLE, SCORE * VOTES
      FROM MOVIE

Query Examples

Example (titles and total scores of all movies, in descending order of total scores)
    SELECT TITLE, SCORE * VOTES AS POINTS
      FROM MOVIE
      ORDER BY POINTS DESC

    SELECT TITLE, SCORE * VOTES
      FROM MOVIE
      ORDER BY 2 DESC

Selecting Rows

Statement
    SELECT [ ALL | DISTINCT ]
      { expression [ AS column_name ] } [, ...]
      FROM table_name
      [ WHERE condition ]
      [ ORDER BY { column_name [ ASC | DESC ] } [, ...] ]

- comparison operators: = < > <= >= <>
- connectives: NOT AND OR

Condition Expressions

- whether a column is empty or not:
    column_name IS { NULL | NOT NULL }
- set membership:
    column_name IN (value_set)
- string comparison:
    column_name LIKE pattern
  - in the pattern, % matches any sequence of characters

Query Examples

Example (year of "Citizen Kane")
    SELECT YR FROM MOVIE
      WHERE (TITLE = 'Citizen Kane')

Example (titles of movies with scores less than 3 and votes more than 10)
    SELECT TITLE FROM MOVIE
      WHERE ((SCORE < 3) AND (VOTES > 10))

Query Examples

Example (titles of movies with unknown year)
    SELECT TITLE FROM MOVIE
      WHERE (YR IS NULL)

Example (titles of movies in years 1967, 1954 and 1988)
    SELECT TITLE, YR FROM MOVIE
      WHERE (YR IN (1967, 1954, 1988))

Query Examples

Example (titles and scores of "Police Academy" movies)
    SELECT TITLE, SCORE FROM MOVIE
      WHERE (TITLE LIKE 'Police Academy%')

Grouping

Statement
    SELECT [ ALL | DISTINCT ]
      { expression [ AS column_name ] } [, ...]
      FROM table_name
      [ WHERE condition ]
      [ GROUP BY column_name [, ...] ]
      [ HAVING condition ]
      [ ORDER BY { column_name [ ASC | DESC ] } [, ...] ]

- selected rows can be grouped
- groups can be filtered

Processing Order

- the rows that satisfy the WHERE condition are selected
- selected rows are grouped using the columns specified in the GROUP BY clause
  - if no group, the entire result is one group
- the groups that satisfy the HAVING condition are selected
- the expressions given in the column list are calculated
- the result is sorted on the column list given in the ORDER BY clause

Group Values

- one value for each group
  - the value of the grouping column
  - the result of an aggregate function
- aggregate functions: COUNT SUM AVG MAX MIN
  - column name as parameter
  - null values are ignored
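The processing order can be traced on a tiny table with Python's standard `sqlite3` module. The five rows below are made up for the illustration (they are not from the course database), and the HAVING threshold is lowered to 2 so that the effect is visible on so few rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE MOVIE (ID INTEGER, YR INTEGER, SCORE INTEGER, VOTES INTEGER)")
# Hypothetical rows, chosen so that every clause removes something.
con.executemany("INSERT INTO MOVIE VALUES (?, ?, ?, ?)", [
    (1, 1998, 60, 50),
    (2, 1998, 80, 30),   # removed by WHERE (VOTES > 40)
    (3, 1999, 70, 100),
    (4, 1999, 90, 200),
    (5, 1997, 50, 500),
])

# 1. WHERE keeps rows 1, 3, 4, 5.
# 2. GROUP BY forms the groups 1997 (1 row), 1998 (1 row), 1999 (2 rows).
# 3. HAVING keeps only the groups with at least 2 rows, i.e. 1999.
# 4. The column list is evaluated per remaining group, then sorted.
rows = con.execute("""
    SELECT YR, AVG(SCORE)
      FROM MOVIE
      WHERE (VOTES > 40)
      GROUP BY YR
      HAVING (COUNT(ID) >= 2)
      ORDER BY YR
""").fetchall()

print(rows)
```

Row 2 would have raised the 1998 group to two rows, but it is already gone when the groups are formed, which shows that WHERE is applied before HAVING.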

Query Examples

Example (for every year, the number of movies with score greater than 8.5 in that year)
    SELECT YR, COUNT(*) FROM MOVIE
      WHERE (SCORE > 8.5)
      GROUP BY YR

Query Examples

Example (score of the favorite movie of every year, in ascending order of years)
    SELECT YR, MAX(SCORE) FROM MOVIE
      GROUP BY YR
      ORDER BY YR

Query Examples

Example (total number of votes)
    SELECT SUM(VOTES)
      FROM MOVIE

Query Examples

Example (averages of movie scores which were filmed in the years where there are at least 25 movies for which more than 40 people have voted, in ascending order of years)
    SELECT YR, AVG(SCORE)
      FROM MOVIE
      WHERE (VOTES > 40)
      GROUP BY YR
      HAVING (COUNT(ID) >= 25)
      ORDER BY YR

Joining

- joining can be done using WHERE conditions
- list the tables to join in the table list
- use the dotted notation for columns with identical names
- processing order:
  - the Cartesian product of tables is obtained
  - the rows that satisfy the WHERE condition are selected
  - ...

Query Examples

Example (name of the director of "Star Wars")
    SELECT NAME
      FROM MOVIE, PERSON
      WHERE ((DIRECTORID = PERSON.ID)
        AND (TITLE = 'Star Wars'))
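The two processing steps of a WHERE-based join can be observed separately with Python's standard `sqlite3` module. The in-memory tables hold a few rows copied from the MOVIE and PERSON relations used earlier; since "Star Wars" is not among them, the sketch queries for "Europa" instead.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE MOVIE (ID INTEGER, TITLE TEXT, DIRECTORID INTEGER);
    CREATE TABLE PERSON (ID INTEGER, NAME TEXT);
    INSERT INTO MOVIE VALUES (6, 'Usual Suspects', 639), (1468, 'Europa', 615);
    INSERT INTO PERSON VALUES (615, 'Lars von Trier'), (639, 'Bryan Singer'),
                              (745, 'Udo Kier');
""")

# Step 1: listing two tables without a condition yields their Cartesian product.
pairs = con.execute("SELECT COUNT(*) FROM MOVIE, PERSON").fetchone()[0]

# Step 2: the WHERE condition keeps only the matching rows of that product.
names = con.execute("""
    SELECT NAME
      FROM MOVIE, PERSON
      WHERE ((DIRECTORID = PERSON.ID)
        AND (TITLE = 'Europa'))
""").fetchall()

print(pairs, names)
```

This describes the logical order only; a database engine is free to evaluate the query differently as long as the result is the same.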

Query Examples

Example (names of all the cast of "Alien")
    SELECT NAME
      FROM MOVIE, PERSON, CASTING
      WHERE ((TITLE = 'Alien')
        AND (MOVIEID = MOVIE.ID)
        AND (ACTORID = PERSON.ID))

Query Examples

Example (titles of all movies Harrison Ford was in)
    SELECT TITLE
      FROM MOVIE, PERSON, CASTING
      WHERE ((NAME = 'Harrison Ford')
        AND (MOVIEID = MOVIE.ID)
        AND (ACTORID = PERSON.ID))

Query Examples

Example (titles of all movies where Harrison Ford was not the lead)
    SELECT TITLE
      FROM MOVIE, PERSON, CASTING
      WHERE ((NAME = 'Harrison Ford')
        AND (MOVIEID = MOVIE.ID)
        AND (ACTORID = PERSON.ID)
        AND (ORD > 1))

Query Examples

Example (titles and names of lead actors of all movies in 1962)
    SELECT TITLE, NAME
      FROM MOVIE, PERSON, CASTING
      WHERE ((YR = 1962)
        AND (MOVIEID = MOVIE.ID)
        AND (ACTORID = PERSON.ID)
        AND (ORD = 1))

Table Expressions

- the join operation can be expressed as a table expression:
  - by specifying a condition
  - over columns with the same name
  - natural join
  - outer join
  - product

Statement
    SELECT ...
      FROM table_expression [ AS table_name ]
      WHERE selection_condition
      ...

Joining Using Conditions

Expression Syntax
    table1 JOIN table2 ON join_condition

Query Examples

Example (name of the director of "Star Wars")
    SELECT NAME
      FROM MOVIE JOIN PERSON
        ON (DIRECTORID = PERSON.ID)
      WHERE (TITLE = 'Star Wars')

Joining Over Columns with the Same Name

Expression Statement
    table1 JOIN table2
      USING (column_name [, ...])

- repeated columns are taken once

Natural Join

Expression Syntax
    table1 NATURAL JOIN table2

Outer Join

- in inner join, if a row of a table does not match with any row of the other table, it is NOT included in the result
- in outer join, it IS included where the row from the other table consists of empty values

Statement
    table1 [ LEFT | RIGHT | FULL ] [ OUTER ] JOIN table2

Outer Join Examples

Example (left outer join)

Table: T1      Table: T2
NUM  NAME      NUM  VALUE
1    a         1    xxx
2    b         3    yyy
3    c         5    zzz

Table: SELECT * FROM T1 LEFT JOIN T2
NUM  NAME  NUM  VALUE
1    a     1    xxx
2    b
3    c     3    yyy

Outer Join Examples

Example (right outer join)

Table: T1      Table: T2
NUM  NAME      NUM  VALUE
1    a         1    xxx
2    b         3    yyy
3    c         5    zzz

Table: SELECT * FROM T1 RIGHT JOIN T2
NUM  NAME  NUM  VALUE
1    a     1    xxx
3    c     3    yyy
           5    zzz

Outer Join Examples

Example (full outer join)

Table: T1      Table: T2
NUM  NAME      NUM  VALUE
1    a         1    xxx
2    b         3    yyy
3    c         5    zzz

Table: SELECT * FROM T1 FULL JOIN T2
NUM  NAME  NUM  VALUE
1    a     1    xxx
2    b
3    c     3    yyy
           5    zzz

Query Examples

Example (titles of all movies with no known actor)
    SELECT TITLE
      FROM MOVIE LEFT JOIN CASTING
        ON (MOVIEID = MOVIE.ID)
      WHERE (ACTORID IS NULL)
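The T1 and T2 tables of the outer join examples are small enough to reproduce with Python's standard `sqlite3` module. The sketch compares an inner join with a left outer join. It spells out the join condition on NUM, which the slide captions leave implicit, and it stays with LEFT JOIN because RIGHT and FULL joins are only available in newer SQLite releases.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE T1 (NUM INTEGER, NAME TEXT);
    CREATE TABLE T2 (NUM INTEGER, VALUE TEXT);
    INSERT INTO T1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO T2 VALUES (1, 'xxx'), (3, 'yyy'), (5, 'zzz');
""")

# Inner join: the unmatched row (2, 'b') of T1 is dropped.
inner = con.execute("""
    SELECT * FROM T1 JOIN T2 ON (T1.NUM = T2.NUM)
      ORDER BY T1.NUM
""").fetchall()

# Left outer join: (2, 'b') is kept and padded with NULLs (None in Python).
left = con.execute("""
    SELECT * FROM T1 LEFT JOIN T2 ON (T1.NUM = T2.NUM)
      ORDER BY T1.NUM
""").fetchall()

print(inner)
print(left)
```

Filtering the padded rows with IS NULL, as in the "no known actor" query, is a common way to find rows that have no partner in the other table.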

Products

Expression Syntax
    table1 CROSS JOIN table2

Self Join

- if the columns to join are in the same table
- give a new name to the table in the expression

Query Examples

Example (titles of all movies with the same number of votes)
    SELECT M1.TITLE, M2.TITLE
      FROM MOVIE AS M1, MOVIE AS M2
      WHERE (M1.VOTES = M2.VOTES)
        AND (M1.ID < M2.ID)

Subqueries

Statement
    WHERE expression operator
      [ ALL | ANY ] (subquery)

- result of a subquery used in a condition expression
- careful: row and column counts of subquery
- ALL: for all values from subquery
- ANY: for at least one value from subquery
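The meaning of ALL and ANY in a subquery comparison corresponds to Python's built-in `all()` and `any()`. The sketch below illustrates the semantics only and does not run SQL; the movie names and scores are hypothetical values invented for the example.

```python
# Hypothetical outer rows: title -> score.
scores = {"Movie A": 20, "Movie B": 45, "Movie C": 70}
# Hypothetical result of the subquery: a single column of scores.
subquery = [30, 50]

# WHERE (SCORE < ALL (subquery)): the comparison must hold for every value.
less_than_all = [t for t, s in scores.items() if all(s < x for x in subquery)]
# WHERE (SCORE < ANY (subquery)): the comparison must hold for at least one value.
less_than_any = [t for t, s in scores.items() if any(s < x for x in subquery)]

print(less_than_all)
print(less_than_any)
```

With the `<` operator, ALL therefore amounts to comparing against the smallest subquery value and ANY against the largest.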

Query Examples

Example (titles and scores of all movies more popular than "Star Wars", in descending order of scores)
    SELECT TITLE, SCORE FROM MOVIE
      WHERE ( SCORE >
        ( SELECT SCORE FROM MOVIE
            WHERE (TITLE = 'Star Wars') )
      )
      ORDER BY SCORE DESC

Query Examples

Example (titles of movies with more votes than the sum of all "Police Academy" movies)
    SELECT TITLE FROM MOVIE
      WHERE ( VOTES >
        ( SELECT SUM(VOTES) FROM MOVIE
            WHERE (TITLE LIKE 'Police Academy%') )
      )

Query Examples

Example (titles of movies with scores smaller than all "Police Academy" movies)
    SELECT TITLE FROM MOVIE
      WHERE ( SCORE < ALL
        ( SELECT SCORE FROM MOVIE
            WHERE (TITLE LIKE 'Police Academy%') )
      )

Set Operations

- an operation on two subquery results
- basic set operations in the relational model:
  - intersection: INTERSECT
  - union: UNION
  - difference: EXCEPT
- identical rows are not allowed in the result

Query Examples

Example (number of people who both directed and acted)
    SELECT COUNT(*) FROM (
      ( SELECT DISTINCT DIRECTORID FROM MOVIE )
      INTERSECT
      ( SELECT DISTINCT ACTORID FROM CASTING )
    ) AS DIRECTORACTOR

Query Examples

Example (number of directors who have not acted)
    SELECT COUNT(*) FROM (
      ( SELECT DISTINCT DIRECTORID FROM MOVIE )
      EXCEPT
      ( SELECT DISTINCT ACTORID FROM CASTING )
    ) AS DIRECTORONLY
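The set operations can be tried with Python's standard `sqlite3` module. Two points in this sketch are assumptions or adaptations: the table contents are a handful of rows picked from the MOVIE and CASTING relations used earlier, and the parentheses around the operand queries are dropped because SQLite does not accept them in a compound SELECT. The helper `count` is hypothetical and only saves repetition.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE MOVIE (ID INTEGER, DIRECTORID INTEGER);
    CREATE TABLE CASTING (MOVIEID INTEGER, ACTORID INTEGER);
    INSERT INTO MOVIE VALUES (70, 1485), (110, 148), (228, 148);
    INSERT INTO CASTING VALUES (70, 282), (110, 26), (112, 1485);
""")

def count(op):
    """Count the rows of <directors> op <actors> for a set operator `op`."""
    sql = f"""
        SELECT COUNT(*) FROM (
          SELECT DIRECTORID FROM MOVIE
          {op}
          SELECT ACTORID FROM CASTING
        ) AS RESULT
    """
    return con.execute(sql).fetchone()[0]

both = count("INTERSECT")       # people who both directed and acted
only_directors = count("EXCEPT")  # directors who have not acted
either = count("UNION")         # people who directed or acted

print(both, only_directors, either)
```

Director 148 appears twice in MOVIE but is counted once, since these operators remove identical rows from their results.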

Query Examples

Example (number of people who worked in movies before 1930)
    SELECT COUNT(*) FROM (
      ( SELECT DISTINCT DIRECTORID FROM MOVIE
          WHERE (YR < 1930) )
      UNION
      ( SELECT DISTINCT ACTORID FROM CASTING
          WHERE (MOVIEID IN
            ( SELECT ID FROM MOVIE
                WHERE (YR < 1930) )) )
    ) AS OLDMOVIES

Extra Examples

- how many movies acted in and in which years
- titles and number of cast members of the movies filmed in 1978, in ascending order of cast member numbers
- names of actors who played with Johnny Depp
- titles and names of lead actors of the movies Uma Thurman was in
- names of actors with at least 10 lead roles

References

Required text: Date
- Chapter 8: Relational Calculus
  - 8.6. SQL Facilities
- Appendix B: SQL Expressions
- Chapter 19: Missing Information

SQLzoo
- A Gentle Introduction to SQL: http://sqlzoo.net/