<<

Languages of DBMS

 DDL

 define the schema and storage stored in a Data Relational Query Languages Dictionary  Data Manipulation Language DML

 Manipulative populate schema, update

 Retrieval querying content of a database  DCL

 permissions, access control etc...

Data Manipulation Language  Relational operators transform either a single or a pair of relations into a result that is a relation that can be used as an operand on  Theory behind operations is formally defined later operations and equivalent to a first-order logic (FOL)  For every operator operand and result, ∀ ∃ ≡  ( , ) Relational relations are free of duplicates Algebra  Operators are oriented or set oriented  is a retrieval  Structured Query Language (SQL) based on set operators and relational  an ANSI standard for relational , operators based on relational algebra/calculus  SQL2 1992  SQL3 1998

Operations in the Query Operators  Theory behind operations is formally defined and equivalent to a first order logic (FOL)  Relational Algebra  Relational operators transform either a simple  tuple (unary) Selection, Projection relation or a pair of relations into a result that is a relation  set (binary) Union, Intersection, Difference  tuple (binary) Join, Division  The result can be used as an operand on  Additional Operators later activities  Outer Join, Outer Union  For every operand and result, relations are free of duplicates  Operators are tuple oriented or set oriented

1 Relational Algebra A Retrieval DML Must Express Select Project

 Attributes required in a result  target list Union Intersection Difference  Criteria for selecting for that result R R R S S S  qualifier  The relations that take part in the query Join Division  set generators *

 Independent of the instances in the database  Expressions are in terms of the database Cartesian product schema X

SQL Retrieval Statement π Project Operator SELECT[all|distinct] selects a subset of the attributes of a relation {*|{.*|expr[alias]|.*} [,{table.*|expr[alias]}]...} attribute list are drawn from the specified relation; FROM table [alias][,table[alias]] ... if the key attribute is in the list then card(result) = card(relation) [WHERE condition] [CONNECT BY condition [START WITH condition]] π Result = (attribute list)(relation name) [GROUP BY expr [,expr] ...] [HAVING condition] resulting relation [{UNION|UNION ALL|INTERSECT|MINUS} has only the the degree(result) = number of attributes in the attribute list SELECT ...] attributes in the list, in same order [ORDER BY {expr|position} as they appear in no duplicates in the result [ASC|DESC][,expr|position}[ASC|DESC]. the list [FOR UPDATE OF [,column] ... [NOWAIT]]

π Project Operator π Project Operator SELECT

STUDENT STUDENT studno name hons tutor year studno name hons tutor year s1 jones ca bush 2 select * s1 jones ca bush 2 s2 brown cis kahn 2 from student; s2 brown cis kahn 2 s3 smith cs goble 2 s3 smith cs goble 2 s4 bloggs ca goble 1 s4 bloggs ca goble 1 s5 jones cs zobel 1 s5 jones cs zobel 1 s6 peters ca kahn 3 s6 peters ca kahn 3 tutor tutor bush kahn select tutor bush π goble from student; kahn tutor(STUDENT) goble goble zobel zobel kahn

2 σ Select Operator σ Select Operator selects a subset of the tuples in a relation that STUDENT satisfy a selection condition studno name hons tutor year s1 jones ca bush 2 s2 brown cis kahn 2 a boolean expression specified on the attributes s3 smith cs goble 2 of a specified relation s4 bloggs ca goble 1 s5 jones cs zobel 1 σ s6 peters ca kahn 3 Result = (selection condition)(relation name) a relation that • stands for the usual comparison operators ‘<‘, studno name hons tutor year has the same ‘<>‘, ‘<=‘, ‘>‘, ‘>=‘, etc s4 bloggs ca goble 1 attributes as the • clauses can be arbitrarily connected with boolean operators AND, NOT, OR source relation; σ name=‘bloggs’(STUDENT) degree(result) = degree(relation); card(result) <= card(relation)

retrieve tutor who tutors Bloggs SQL retrieval expressions

STUDENT studno name hons tutor year  s1 jones ca bush 2 select studentno, name from student s2 brown cis kahn 2 s3 smith cs goble 2 where hons != ‘ca’ and s4 bloggs ca goble 1 s5 jones cs zobel 1 (tutor = ‘goble’ or tutor = ‘kahn’); s6 peters ca kahn 3  select * from enrol where labmark > 50; π (σ (STUDENT)) tutor name=‘bloggs’  select * from enrol select tutor from student where labmark between 30 and 50; where name = ‘bloggs’;

Cartesian Product Operator  select * from enrol where labmark in (0, 100); Definition:  select * from enrol  The cartesian product of two relations where labmark is ; R1(A1,A2,...,An) with cardinality i and  select * from student R2(B1,B2,...,Bm) with cardinality j is a relation R3 with degree k=n+m, cardinality i*j and where name is like ‘b%’; attributes (A1,A2,...,An,B1,B2,...,Bm)  select studno, courseno,  The result, denoted by R1XR2, is a relation exammark+labmark total from enrol that includes all the possible combinations of tuples from R1 and R2 where is not NULL; labmark  Used in conjunction with other operations

3 Cartesian Product Example X Cartesian Product STUDENT studno name hons tutor year lecturer roomno STUDENT COURSE studno name hons tutor year s1 jones ca bush 2 kahn IT206 studno name courseno subject equip s1 jones ca bush 2 bush 2.26 s1 jones ca bush 2 s1 jones ca bush 2 goble 2.82 s1 jones cs250 prog sun s2 brown cis kahn 2 s1 jones ca bush 2 zobel 2.34 s1 jones ca bush 2 watson IT212 s2 brown cs150 prog sun s3 smith cs goble 2 s1 jones ca bush 2 woods IT204 s6 peters cs390 specs sun s4 bloggs ca goble 1 s1 jones ca bush 2 capon A14 s1 jones ca bush 2 lindsey 2.10 s5 jones cs zobel 1 s1 jones ca bush 2 barringer 2.125 s2 brown cis kahn 2 kahn IT206 s6 peters ca kahn 3 s2 brown cis kahn 2 bush 2.26 STUDENT X COURSE s2 brown cis kahn 2 goble 2.82 studno name courseno subject equip s2 brown cis kahn 2 zobel 2.34 STAFF s2 brown cis kahn 2 watson IT212 s1 jones cs250 prog sun s2 brown cis kahn 2 woods IT204 s1 jones cs150 prog sun lecturer roomno s2 brown cis kahn 2 capon A14 s2 brown cis kahn 2 lindsey 2.10 s1 jones cs390 specs sun kahn IT206 s2 brown cis kahn 2 barringer 2.125 s2 brown cs250 prog sun bush 2.26 s3 smith cs goble 2 kahn IT206 s3 smith cs goble 2 bush 2.26 s2 brown cs150 prog sun goble 2.82 s3 smith cs goble 2 goble 2.82 s2 brown cs390 specs sun zobel 2.34 s3 smith cs goble 2 zobel 2.34 s3 smith cs goble 2 watson IT212 s6 peters cs250 prog sun watson IT212 s3 smith cs goble 2 woods IT204 s6 peters cs150 prog sun s3 smith cs goble 2 capon A14 woods IT204 s3 smith cs goble 2 lindsey 2.10 s6 peters cs390 specs sun capon A14 s3 smith cs goble 2 barringer 2.125 s4 bloggs ca goble 1 kahn IT206 lindsey 2.10 s4 bloggs ca goble 1 bush 2.26 barringer 2.125

θ θ Join Operator Join Operator STAFF lecturer roomno Definition: STUDENT kahn IT206 studno name hons tutor year bush 2.26 The of two relations R1(A1,A2,...,An) and s1 jones ca bush 2 goble 2.82 s2 brown cis kahn 2 R2(B1,B2,...,Bm) is a relation R3 with degree k=n+m zobel 2.34 and attributes (A ,A ,...,A , B ,B ,...,B ) that satisfy s3 smith cs goble 2 watson IT212 1 2 n 1 2 m s4 bloggs ca goble 1 woods IT204 the join condition s5 jones cs zobel 1 capon A14 s6 peters ca kahn 3 lindsey 2.10 Result = R1 (θ join condition) R2 barringer 2.125

stud name hons tutor year lecturer roomno •stands for the usual comparison no The result is a concatenated operators ‘<‘, ‘<>‘, ‘<=‘, ‘>‘, ‘>=‘, etc s1 jones ca bush 2 bush 2.26 set but only for those tuples s2 brown cis kahn 2 kahn IT206 where the condition is true. Θ s3 smith cs goble 2 goble 2.82 •comparing terms in the clauses can s4 bloggs ca goble 1 goble 2.82 It does not require union be arbitrarily connected with boolean compatibility of R1 and R2 s5 jones cs zobel 1 zobel 2.34 operators AND, NOT, OR s6 peters ca kahn 3 kahn IT206

ENROL More joins stud course lab exam no no mark mark STUDENT s1 cs250 65 52 Natural Join Operator studno name hons tutor year s1 cs260 80 75 s1 jones ca bush 2 s1 cs270 47 34  θ s2 brown cis kahn 2 s2 cs250 67 55 Of all the types of -join, the equi-join is the s3 smith cs goble 2 s2 cs270 65 71 only one that yields a result in which the s4 bloggs ca goble 1 s3 cs270 49 50 compared columns are redundant to each s5 jones cs zobel 1 s4 cs280 50 51 s6 peters ca kahn 3 s5 cs250 0 3 other—possibly different names but same s6 cs250 2 7 values  The natural join is an equi-join but one of the stud name hons tutor year stud course lab exam no no no mark mark redundant columns (simple or composite) is s1 jones ca bush 2 s1 cs250 65 52 s1 jones ca bush 2 s1 cs260 80 75 omitted from the result s1 jones ca bush 2 s1 cs270 47 34 s2 brown cis kahn 2 s2 cs250 67 55  Relational join is the principle algebraic s2 brown cis kahn 2 s2 cs270 65 71 s3 smith cs goble 2 s3 cs270 49 50 counterpart of queries that involve the s4 bloggs ca goble 1 s4 cs280 50 51 s5 jones cs zobel 1 s5 cs250 0 3 existential quantifier ∃ s6 peters ca kahn 3 s6 cs250 2 7

4 Self Join: Joins on the same relation Exercise π (lecturer, (staff (appraiser = lecturer) staff) Get student’s name, all their courses, subject roomno,appraiser, approom) of course, labmark for course, lecturer of course and lecturer’s roomno for ‘ca’ students STAFF lecturer roomno appraiser lecturer roomno appraiser approom kahn IT206 watson kahn IT206 watson IT212 University Schema bush 2.26 capon bush 2.26 capon A14 goble 2.82 capon goble 2.82 capon A14  STUDENT(studno,name,hons,tutor,year) zobel 2.34 watson zobel 2.34 watson IT212 watson IT212 barringer watson IT212 barringer 2.125  ENROL(studno,courseno,labmark,exammark) woods IT204 barringer woods IT204 barringer 2.125  capon A14 watson capon A14 watson IT212 COURSE(courseno,subject,equip) lindsey 2.10 woods lindsey 2.10 woods IT204  barringer 2.125 null STAFF(lecturer,roomno,appraiser)  TEACH(courseno,lecturer) select e.lecturer, e.roomno, m.lecturer appraiser, m.roomno approom from staff e, staff m  YEAR(yearno,yeartutor) where e.appraiser = m.lecturer

Set Theoretic Operators ∪ Union Operator Definition:  Union, Intersection and Difference The union of two relations  Operands need to be union compatible for R1(A ,A ,...,A ) and R2(B ,B ,...,B ) the result to be a valid relation 1 2 n 1 2 m is a relation R3(C1,C2,...,Cn) such that ≤ ≤ dom(Ci)= dom(Ai) =dom (Bi) for 1 i n Definition: Two relations  The result R1 ∪ R2 is a relation that includes R1(A1,A2,...,An) and R2(B1,B2,...,Bm) all tuples that are either in R1 or R2 or in both are union compatible iff: without duplicate tuples n = m and, dom(A )= dom (B ) for 1 ≤ i ≤ n  The resulting relation might have the same i i attribute names as the first or the second relation

Retrieve all staff that lecture or tutor ∩ Intersection Operator STUDENT TEACH studno name hons tutor year courseno lecturer s1 jones ca bush 2 cs250 lindsey Definition: s2 brown cis kahn 2 cs250 capon s3 smith cs goble 2 cs260 kahn The intersection of two relations s4 bloggs ca goble 1 cs260 bush R1(A1,A2,...,An) and R2(B1,B2,...,Bm) is a s5 jones cs zobel 1 cs270 zobel s6 peters ca kahn 3 cs270 woods relation R3(C1,C2,...,Cn) such that cs280 capon ∩ ≤ ≤ dom(Ci)= dom(Ai) dom (Bi) for 1 i n lecturer lindsey ∩ Lecturers π TEACH capon  The result R1 R2 is a relation that includes (lecturer) kahn only those tuples in R1 that also appear in R2 π bush Tutors (tutor)STUDENT zobel  The resulting relation might have the same Lecturers ∪ Tutors woods capon attribute names as the first or the second goble relation

5 Retrieve all staff that lecture and tutor − Difference Operator

STUDENT TEACH studno name hons tutor year courseno lecturer s1 jones ca bush 2 cs250 lindsey Definition: s2 brown cis kahn 2 cs250 capon The difference of two relations s3 smith cs goble 2 cs260 kahn R1(A ,A ,...,A ) and R2(B ,B ,...,B ) is a s4 bloggs ca goble 1 cs260 bush 1 2 n 1 2 m s5 jones cs zobel 1 cs270 zobel relation R3(C1,C2,...,Cn) such that s6 peters ca kahn 3 cs270 woods − ≤ ≤ cs280 capon dom(Ci)= dom(Ai) dom (Bi) for 1 i n

 The result R1 − R2 is a relation that includes Lecturers π TEACH lecturer (lecturer) kahn all tuples that are in R1 and not in R2 Tutors π STUDENT bush (tutor) zobel  The resulting relation might have the same ∩ Lecturers Tutors attribute names as the first or the second relation

Retrieve all staff that lecture but don’t Outer Join Operation tutor  In an equi-join, tuples without a ‘match’ are STUDENT TEACH studno name hons tutor year courseno lecturer eliminated s1 jones ca bush 2 cs250 lindsey  Outer join keeps all tuples in R1 or R2 or both s2 brown cis kahn 2 cs250 capon s3 smith cs goble 2 cs260 kahn in the result, padding with nulls s4 bloggs ca goble 1 cs260 bush  Left outer join R1 R2 s5 jones cs zobel 1 cs270 zobel s6 peters ca kahn 3 cs270 woods  keeps every tuple in R1 cs280 capon  select * from R1, R2 where R1.a = R2.a (+)  Right outer join R1 R2 π Lecturers (lecturer)TEACH lecturer  keeps every tuple in R2 lindsey π  select * from R1, R2 where R1.a (+) = R2.a Tutors (tutor)STUDENT capon woods  Double outer join R1 R2 Lecturers -Tutors  keeps every tuple in R1 and R2  select * from R1, R2 where R1.a (+) = R2.a (+)

Outer Join Operator Outer Join Operator

STUDENT STAFF STUDENT STAFF studno name hons tutor year lecturer roomno lecturer roomno studno name hons tutor year s1 jones ca bush 2 kahn IT206 s1 jones ca bush 2 kahn IT206 s2 brown cis kahn 2 bush 2.26 s2 brown cis kahn 2 bush 2.26 s3 smith cs goble 2 goble 2.82 s3 smith cs goble 2 goble 2.82 s4 bloggs ca null 1 zobel 2.34 s4 bloggs ca null 1 zobel 2.34 s5 jones cs zobel 1 watson IT212 s5 jones cs zobel 1 watson IT212 s6 peters ca null 3 woods IT204 s6 peters ca null 3 woods IT204 capon A14 capon A14 select * from student, staff lindsey 2.10 lindsey 2.10 barringer 2.125 select * from student, staff barringer 2.125 where tutor = lecturer (+) where tutor = lecturer studno name hons tutor year lecturer roomno studno name hons tutor year lecturer roomno s1 jones ca bush 2 bush 2.26 s1 jones ca bush 2 bush 2.26 s2 brown cis kahn 2 kahn IT206 s2 brown cis kahn 2 kahn IT206 s3 smith cs goble 2 goble 2.82 s3 smith cs goble 2 goble 2.82 s5 jones cs zobel 1 zobel 2.34 s5 jones cs zobel 1 zobel 2.34 s4 bloggs ca null 1 null null s6 peters ca null 3 null null

6 Outer Self Join π Outer Union (lecturer, (staff (appraiser = lecturer) staff) roomno,appraiser, approom)  Takes the union of tuples from two relations

STAFF lecturer roomno appraiser approom that are not union compatible lecturer roomno appraiser kahn IT206 watson IT212  kahn IT206 watson bush 2.26 capon A14 The two relations, R1 and R2, are partially bush 2.26 capon goble 2.82 capon A14 compatible—only some of their attributes are goble 2.82 capon zobel 2.34 watson IT212 zobel 2.34 watson watson IT212 barringer 2.125 union compatible watson IT212 barringer woods IT204 barringer 2.125 woods IT204 barringer capon A14 watson IT212  The attributes that are not union compatible capon A14 watson lindsey 2.10 woods IT204 from either relation are kept in the result and lindsey 2.10 woods barringer 2.125 null null barringer 2.125 null tuples without values for these attributes are padded with nulls select e.lecturer, e.roomno, m.lecturer appraiser, m.roomno approom from staff e, staff m where e.appraiser = m.lecturer (+)

Ordering results Completeness of Relational Algebra select *  Five fundamental operations from enrol, student  σπ X ∪ — where labmark is not null and  Additional operators are defined as = combination of two or more of the basic student.studno enrol.studno operations, order by  e.g. R1 ∩ R2 = R1 ∪ R2 — ((R1 — R2) ∪ (R2—R1) hons, courseno, name σ R1 R1 = (R1 X R2)

default is ascending

÷ Division Operation ÷ Division Operation Example

Definition: Retrieve the studnos of students who are enrolled on all the courses that Capon lectures on The division of two relations R1(A1,A2,...,An)  with cardinality i and R2(B1,B2,...,Bm) with Small_ENROL ÷ Capon_TEACH cardinality j is a relation R3 with degree k=n-m,

cardinality i*j and attributes Small_ENROL (A ,A ,...,A ,B ,B ,...,B ) studno courseno 1 2 n 1 2 m s1 cs250 s1 cs260 that satisfy the division condition Capon_TEACH s1 cs280 result courseno  The principle algebraic counterpart of queries s2 cs250 s1 cs250 that involve the universal quantifier ∀ s2 cs270 s4 s3 cs270 cs280  Relational languages do not express relational s4 cs280 s4 cs250 division s6 cs250

7 Aggregation Functions Aggregation Functions in SQL  Aggregation functions on collections of data values: average, minimum, maximum, sum, count ENROL select studno, count(*), stud course lab exam avg(labmark)  Group tuples by value of an attribute and apply no no mark mark from enrol aggregate function independently to each group of tuples s1 cs250 65 52 s1 cs260 80 75 group by studno s1 cs270 47 34 s2 cs250 67 55 (relation name) ENROL ƒ s2 cs270 65 71 stud course lab exam s3 cs270 49 50 s4 cs280 50 51 no no mark mark studno ƒ COUNT courseno (ENROL) studno count avg s1 cs250 65 52 s5 cs250 0 3 s1 3 64 s1 cs260 80 75 s6 cs250 2 7 s2 2 66 studno count s1 cs270 47 34 s3 1 49 s1 3 s2 cs250 67 55 s4 1 50 s2 2 s2 cs270 65 71 s5 1 0 s3 1 s3 cs270 49 50 s6 1 2 s4 1 s4 cs280 50 51 s5 1 s5 cs250 0 3 s6 1 s6 cs250 2 7

Aggregation Functions in SQL Nested Subqueries

ENROL  where stud course lab exam select studno, count(*), Complete select queries within a no no mark mark clause of another outer query s1 cs250 65 52 avg(labmark) s1 cs260 80 75 from enrol  Creates an intermediate result s1 cs270 47 34 s2 cs250 67 55 group by studno  No limit to the number of levels of nesting s2 cs270 65 71 having count(*) >= 2 List all students with the same tutor as bloggs s3 cs270 49 50 s4 cs280 50 51 s5 cs250 0 3 select studno, name, tutor s6 cs250 2 7 from student studno name tutor studno count avg s3 smith goble s1 3 64 where tutor =(select tutor s4 bloggs goble s2 2 66 from student where name = ‘bloggs’)

Union compatibility in nested Nested Subqueries subqueries select distinct name select distinct studno from student from enrol where studno in where (courseno, exammark)in (select studno (select courseno, exammark from enrol, teach, year where from student s, enrol e year.yeartutor = teach.lecturer and where s.name = ‘bloggs’ and teach.courseno = enrol.courseno) e.studno = s.studentno);

8 Nested subqueries set comparison Subqueries operators May be used in these situations:  Outer query qualifier includes a value v compared with a bag of values V generated  to define the set of rows to be inserted in the from a subquery target table of an insert, create table or copy command  Comparison v with V evaluates TRUE  to define one or more values to be assigned v in V if v is one of the elements V to existing rows in an update statement v = any V if v equal to some value in V v > any V if v > some value in V (same for <)  to provide values for comparison in where, v > all V if v greater than all the values in V having and start with clauses in (same for <) select, update and delete commands

Correlated subqueries Exists and correlated sub queries  Exists is usually used to check whether the result of a  A condition in the where clause of a nested correlated nested query is empty query references some attribute of a relation  Exists (Q) returns TRUE if there is at least one tuple in declared in the outer (parent) query the results query Q and FALSE otherwise  The nested query is evaluated once for each select name tuple in the outer (parent) query from student where exists (select * select name from enrol, teach from student where student.studno=enrol.studno and where 3 > (select count (*) enrol.courseno = teach.courseno and from enrol teach.lecturer = ‘Capon’) where student.studno=enrol.studno) Retrieve the names of students who have registered for at least one course taught by Capon

Exists and correlated sub queries Conclusions  Not Exists (Q) returns FALSE if there is at least one tuple in the results query Q and  The only logical structure is that of a relation TRUE otherwise  Constraints are formally defined on the select name concept of domain and key from student  Operations deal with entire relations rather where not exists (select * than single record at a time from enrol  Operations are formally defined and can be where student.studno=enrol.studno) combined in a declarative language to Retrieve the names of students who have no implement user queries on the database enrolments on courses

9 Conclusions on SQL  Retrieval: many ways to achieve the same result—though different performance costs  Comprehensive and powerful facilities  Non-procedural  Can’t have recursive queries  Limitations are overcome by use of a high- level procedural language that permits embedded SQL statements

10