Outline

SQL: Structured Query Language (Data Manipulation Language)

- Introduction
- Example DB
- Data Manipulation
  - Retrieval
  - Updates
- Query Processing


Introduction

Query Languages

- Data Manipulation Language
  - Formal manipulation languages (based on predicate logic)
  - User-oriented query languages
    - Structured Query Language (SQL)
    - QUEry Language (QUEL)
    - Query By Example (QBE)
- Link with programming languages
  - The use of SQL statements inside a program written in a higher-level language is known as Embedded SQL (Pascal, C, ...)

SQL Presentation

- Functionalities:
  - data definition and data manipulation in the relational format
  - data control
- Manipulation language
  - non-procedural
  - borrowed from relational algebra and from the tuple relational calculus
- Power of the manipulation language
  - Relational Algebra + Functions-Aggregates + Sorting
- An SQL query (without functions and sorting) corresponds to a set of relational algebra operations


DBMS SQL - 1

SQL Presentation (2)

- History
  - SEQUEL language of the SYSTEM/R relational DBMS prototype (74-76), IBM research lab at San José
  - Standardization at ISO
    - The SQL1 standard (1986, 1989)
    - The SQL2 standard (1992)
    - The SQL3 standard
- Query language of almost all relational DBMSs
  - ORACLE (Oracle Corporation - 1977)
  - INGRES (Ingres Technology - 1980)
  - DB2 (IBM - 1984)
  - INFORMIX (Informix Inc - 1981)
  - SYBASE (Sybase Inc - 1984)
  - MySQL (1995)

Example DB: Company DB

Relation schemas (the letter after each schema is the abbreviation used for the relation):

    Employee(ESSN, LastName, FirstName, BirthDate, Address, Sex, Salary, BonusAmount, SupervSSN, DepNumber)    E
    Department(DepNumber, DepName, MgrSSN, MgrStartDate)                                                        D
    Dept_locat(DepNumber, DLocation)                                                                            DL
    Project(ProjNumber, ProjName, PLocation, DepNumber)                                                         P
    Work_on(ProjNumber, ESSN, Nbh)                                                                              W


ER Model of Company DB

[Figure: ER diagram. Entities: Employee (ESSN, LastName, FirstName, BirthDate, Address, Sex, Salary, BonusAmount), Department (DepNumber, DepName), Project (ProjNumber, ProjName, PLocation), Dept-Locat (DLocation). Relationships: Supervise (roles: supervises, supervised_by), Works_for, Manages (MgrStartDate), Work_on (nbh), Locate, Control.]

Populated Database (1)

Employee

    ESSN       LastName  FirstName  BirthDate   Address                   Sex  Salary  BonusAmount  SupervSSN  DepNumber
    123456789  Smith     John       1967-01-09  731 Fondren, Houston, TX  M    300000  10000        333445555  5
    333445555  Wong      Franklin   1955-12-08  638 Voss, Houston, TX     M    400000  5000         888665555  5
    999887777  Zelaya    Alicia     1968-07-19  3321 Castel, Spring, TX   F    250000  2000         987654321  4
    987654321  Wallace   Jennifer   1941-06-20  291 Berry, Bellaire, TX   F    430000  9000         888665555  4
    666884444  Narayan   Ramesh     1962-09-15  975 Fire Oak, Humble, TX  M    380000  15000        333445555  5
    453453453  English   Joyce      1972-07-31  5631 Rice, Houston, TX    F    250000  2500         333445555  5
    987987987  Jabbar    Ahmad      1969-03-29  980 Dallas, Houston, TX   M    250000  12000        987654321  4
    888665555  Borg      James      1937-11-10  450 Stone, Houston, TX    M    550000  8000         null       1

Work_on

    ProjNumber  ESSN       Nbh
    1           123456789  32.5
    2           123456789  7.5
    3           666884444  40.0
    1           453453453  20.0
    2           453453453  20.0
    2           333445555  10.0
    3           333445555  10.0
    10          333445555  10.0
    20          333445555  10.0
    30          999887777  30.0
    10          999887777  10.0
    10          987987987  35.0
    30          987987987  5.0
    30          987654321  20.0
    20          987654321  15.0
    20          888665555  null

Project

    ProjNumber  ProjName         PLocation  DepNumber
    1           productx         Bellaire   5
    2           producty         Sugarland  5
    3           productz         Houston    5
    10          computerization  Stanford   4
    20          reorganization   Houston    1
    30          newbenefits      Stanford   4


Populated Database (2)

Dept_locat

    DepNumber  DLocation
    1          Houston
    4          Stanford
    5          Bellaire
    5          Sugarland
    5          Houston

Department

    DepNumber  DepName         MgrSSN     MgrStartDate
    5          R&D             333445555  1988-05-22
    4          Administration  987654321  1995-01-01
    1          Headquarters    888665555  1981-06-19

Data Retrieval

- General retrieval syntax
- Selection and projection
- Join operator
- Set-theoretical operations
- Aggregate functions
- Grouping
- Predicates and division
- Summary
- Complete example


General Syntax of Data Retrieval

Syntax, and how to fill the clauses:

    SELECT    which result does the user want to see (result schema)?
    FROM      where are the attributes that I need?
    [WHERE    are there conditions on the values used in my query? Do I have several ...]

Selection and Projection

"Retrieve all employees who are involved in a project for more than 20 hours"

    SELECT ESSN, ProjNumber, Nbh
    FROM Work_on
    WHERE Nbh > 20;

Result (on the populated database):

    ESSN       ProjNumber  Nbh
    123456789  1           32.5
    666884444  3           40.0
    999887777  30          30.0
    987987987  10          35.0
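As a rough illustration of how the three clauses combine, the sketch below runs the query on the Work_on rows of the populated database, using Python's standard sqlite3 module. The in-memory database and the `ORDER BY ESSN` (added only so that the output order is predictable) are assumptions of this sketch, not part of the slides.

```python
import sqlite3

# In-memory SQLite database holding only the Work_on relation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Work_on (ProjNumber INTEGER, ESSN INTEGER, Nbh REAL)")
work_on_rows = [
    (1, 123456789, 32.5), (2, 123456789, 7.5), (3, 666884444, 40.0),
    (1, 453453453, 20.0), (2, 453453453, 20.0), (2, 333445555, 10.0),
    (3, 333445555, 10.0), (10, 333445555, 10.0), (20, 333445555, 10.0),
    (30, 999887777, 30.0), (10, 999887777, 10.0), (10, 987987987, 35.0),
    (30, 987987987, 5.0), (30, 987654321, 20.0), (20, 987654321, 15.0),
    (20, 888665555, None),  # None is stored as SQL NULL
]
conn.executemany("INSERT INTO Work_on VALUES (?, ?, ?)", work_on_rows)

# SELECT: result schema, FROM: source relation, WHERE: condition on values.
# ORDER BY is an addition of this sketch, for a deterministic output order.
long_assignments = conn.execute(
    "SELECT ESSN, ProjNumber, Nbh FROM Work_on WHERE Nbh > 20 ORDER BY ESSN"
).fetchall()

for row in long_assignments:
    print(row)
```

Rows with exactly 20 hours are excluded by the strict comparison, and the row whose Nbh is null is excluded as well, because a comparison with NULL is never true.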


Projection (1)

"Find all the information registered in the Project records"

    SELECT *
    FROM Project

The wild-card character * is used instead of listing all the columns.

Result:

    ProjNumber  ProjName         PLocation  DepNumber
    1           productx         Bellaire   5
    2           producty         Sugarland  5
    3           productz         Houston    5
    10          computerization  Stanford   4
    20          reorganization   Houston    1
    30          newbenefits      Stanford   4

Projection (2)

"Determine all the project locations"

    SELECT DISTINCT PLocation
    FROM Project

DISTINCT discards duplicate tuples from the result.

Result:

    PLocation
    Bellaire
    Sugarland
    Houston
    Stanford
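The effect of `*` and of `DISTINCT` can be checked with a small sketch on the Project relation, again assuming an in-memory SQLite database driven from Python. The `ORDER BY` is added here only to fix the output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Project (ProjNumber INTEGER, ProjName TEXT, PLocation TEXT, DepNumber INTEGER);
INSERT INTO Project VALUES
  (1,  'productx',        'Bellaire',  5),
  (2,  'producty',        'Sugarland', 5),
  (3,  'productz',        'Houston',   5),
  (10, 'computerization', 'Stanford',  4),
  (20, 'reorganization',  'Houston',   1),
  (30, 'newbenefits',     'Stanford',  4);
""")

# SELECT * expands to every column of the relation, in declaration order.
cursor = conn.execute("SELECT * FROM Project")
column_names = [description[0] for description in cursor.description]

# Without DISTINCT, the projection keeps one value per tuple (duplicates included).
all_locations = [row[0] for row in conn.execute("SELECT PLocation FROM Project")]

# With DISTINCT, duplicate values are discarded.
distinct_locations = [
    row[0]
    for row in conn.execute("SELECT DISTINCT PLocation FROM Project ORDER BY PLocation")
]

print(column_names)
print(len(all_locations), distinct_locations)
```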


Selection

"Find all the employees who work for a department whose number is between 1 and 4"

Three equivalent formulations:

    SELECT *
    FROM Employee
    WHERE DepNumber >= 1 AND DepNumber <= 4

    SELECT *
    FROM Employee
    WHERE DepNumber BETWEEN 1 AND 4

    SELECT *
    FROM Employee
    WHERE DepNumber IN (1, 2, 3, 4)

Selection and sorting

"Find all the employees whose last name begins with 'b' or 'B'"

    SELECT *
    FROM Employee
    WHERE LastName LIKE 'b%' OR LastName LIKE 'B%'

"Retrieve the list of women ordered by their salary"

    SELECT *
    FROM Employee
    WHERE Sex = 'F'
    ORDER BY Salary

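The equivalence of the three selection forms, and the behaviour of LIKE and ORDER BY, can be checked with the following sketch. It assumes an in-memory SQLite database restricted to the columns the queries need. The helper `names` and the secondary sort key `LastName` (which breaks salary ties) are additions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (LastName TEXT, Sex TEXT, Salary INTEGER, DepNumber INTEGER)"
)
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", [
    ("Smith", "M", 300000, 5), ("Wong", "M", 400000, 5),
    ("Zelaya", "F", 250000, 4), ("Wallace", "F", 430000, 4),
    ("Narayan", "M", 380000, 5), ("English", "F", 250000, 5),
    ("Jabbar", "M", 250000, 4), ("Borg", "M", 550000, 1),
])

def names(where_clause):
    """Hypothetical helper: last names satisfying a WHERE condition, sorted."""
    sql = f"SELECT LastName FROM Employee WHERE {where_clause} ORDER BY LastName"
    return [row[0] for row in conn.execute(sql)]

# Three ways of writing the same interval condition.
with_comparisons = names("DepNumber >= 1 AND DepNumber <= 4")
with_between = names("DepNumber BETWEEN 1 AND 4")
with_in = names("DepNumber IN (1, 2, 3, 4)")

# Pattern matching on strings: % stands for any sequence of characters.
starts_with_b = names("LastName LIKE 'b%' OR LastName LIKE 'B%'")

# Selection followed by sorting; LastName only breaks ties between equal salaries.
women_by_salary = conn.execute(
    "SELECT LastName, Salary FROM Employee WHERE Sex = 'F' ORDER BY Salary, LastName"
).fetchall()

print(with_between, starts_with_b, women_by_salary)
```

In SQLite, LIKE is case-insensitive for ASCII letters by default, so either pattern alone would match 'Borg' there; writing both patterns keeps the query portable to DBMSs where LIKE is case-sensitive.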

The join operation

"Retrieve the ESSN values for all the employees working in the 'R&D' department at Houston"

    SELECT E.ESSN
    FROM Employee E, Department D, Dept_locat DL    -- Cartesian product
    WHERE E.DepNumber = D.DepNumber                 -- join conditions
      AND D.DepNumber = DL.DepNumber
      AND DL.DLocation = 'Houston'
      AND D.DepName = 'R&D'

Procedural Join

The same request, written with nested sub-queries:

    SELECT ESSN
    FROM Employee
    WHERE DepNumber IN (
        SELECT DepNumber
        FROM Department
        WHERE DepName = 'R&D'
          AND DepNumber IN (            -- attribute name is not ambiguous
            SELECT DepNumber
            FROM Dept_locat
            WHERE DLocation = 'Houston'))
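A minimal sketch, assuming an in-memory SQLite database reduced to the columns involved, that runs the flat join and the nested (procedural) formulation and checks that they return the same employees. The conditions are qualified so that DLocation is read from Dept_locat and DepName from Department, following the schema. The `ORDER BY` clauses are additions for a stable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (ESSN INTEGER, DepNumber INTEGER);
CREATE TABLE Department (DepNumber INTEGER, DepName TEXT);
CREATE TABLE Dept_locat (DepNumber INTEGER, DLocation TEXT);
INSERT INTO Employee VALUES
  (123456789, 5), (333445555, 5), (999887777, 4), (987654321, 4),
  (666884444, 5), (453453453, 5), (987987987, 4), (888665555, 1);
INSERT INTO Department VALUES (5, 'R&D'), (4, 'Administration'), (1, 'Headquarters');
INSERT INTO Dept_locat VALUES
  (1, 'Houston'), (4, 'Stanford'), (5, 'Bellaire'), (5, 'Sugarland'), (5, 'Houston');
""")

# Flat form: Cartesian product in FROM, join conditions in WHERE.
flat_join = [row[0] for row in conn.execute("""
    SELECT E.ESSN
    FROM Employee E, Department D, Dept_locat DL
    WHERE E.DepNumber = D.DepNumber
      AND D.DepNumber = DL.DepNumber
      AND DL.DLocation = 'Houston'
      AND D.DepName = 'R&D'
    ORDER BY E.ESSN
""")]

# Nested form: each level filters on the result of the sub-query below it.
nested_join = [row[0] for row in conn.execute("""
    SELECT ESSN
    FROM Employee
    WHERE DepNumber IN (
        SELECT DepNumber FROM Department
        WHERE DepName = 'R&D'
          AND DepNumber IN (
            SELECT DepNumber FROM Dept_locat WHERE DLocation = 'Houston'))
    ORDER BY ESSN
""")]

print(flat_join)
print(nested_join)
```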


Auto-join

- Join of a relation with itself
- Requires synonyms (aliases) to distinguish the two uses of the relation

"Employees who earn more than their supervisor"

    SELECT E.LastName
    FROM Employee E, Employee SUP
    WHERE E.SupervSSN = SUP.ESSN
      AND E.Salary > SUP.Salary

Join (SQL2 syntax)

- SQL2 introduces a new syntax closer to relational algebra (joins are expressed within the FROM clause)
- Supported by several DBMSs (Oracle 9 and later, MySQL, SQL Server, …)
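The sketch below, assuming an in-memory SQLite database with only the columns needed, joins Employee with itself under the aliases E and SUP, once with the join condition in WHERE and once with the SQL2 `JOIN ... ON` form. Listing each employee with the supervisor's last name is an addition of this sketch. On the populated database nobody earns more than their supervisor, so the "earn more" query returns no row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (ESSN INTEGER, LastName TEXT, Salary INTEGER, SupervSSN INTEGER);
INSERT INTO Employee VALUES
  (123456789, 'Smith',   300000, 333445555),
  (333445555, 'Wong',    400000, 888665555),
  (999887777, 'Zelaya',  250000, 987654321),
  (987654321, 'Wallace', 430000, 888665555),
  (666884444, 'Narayan', 380000, 333445555),
  (453453453, 'English', 250000, 333445555),
  (987987987, 'Jabbar',  250000, 987654321),
  (888665555, 'Borg',    550000, NULL);
""")

# Self-join with the join condition in the WHERE clause.
pairs_where = conn.execute("""
    SELECT E.LastName, SUP.LastName
    FROM Employee E, Employee SUP
    WHERE E.SupervSSN = SUP.ESSN
    ORDER BY E.LastName
""").fetchall()

# Same self-join with the SQL2 syntax: the join is expressed in the FROM clause.
pairs_join = conn.execute("""
    SELECT E.LastName, SUP.LastName
    FROM Employee E JOIN Employee SUP ON E.SupervSSN = SUP.ESSN
    ORDER BY E.LastName
""").fetchall()

# "Employees who earn more than their supervisor".
better_paid = conn.execute("""
    SELECT E.LastName
    FROM Employee E, Employee SUP
    WHERE E.SupervSSN = SUP.ESSN AND E.Salary > SUP.Salary
""").fetchall()

print(pairs_where)
print(better_paid)
```

Borg does not appear in the pairs: his SupervSSN is null, so the join condition is never true for his tuple.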


SQL2 join examples

- Cross product

      SELECT *
      FROM Employee CROSS JOIN Project

- Join

      SELECT E.ESSN
      FROM Employee E
           JOIN Department D ON (E.DepNumber = D.DepNumber)
           JOIN Dept_locat DL ON (D.DepNumber = DL.DepNumber)
      WHERE DL.DLocation = 'Houston'
        AND D.DepName = 'R&D'

- "Natural" join
  - An explicit equality on attributes with the same name can be replaced by an implicit one using NATURAL JOIN, or JOIN … USING (attrs)

Set-Theoretical Operations

- Union (SQL1 standard): automatic elimination of duplicates

  "List the staff costs (salary and bonus)"

      SELECT Salary
      FROM Employee
      UNION
      SELECT BonusAmount
      FROM Employee

- Difference (SQL2 standard!)

  "Employees who are not supervisors"

      SELECT ESSN
      FROM Employee
      EXCEPT                  -- written MINUS in Oracle
      SELECT SupervSSN
      FROM Employee

- Intersection (SQL2 standard!)

  "Employees who supervise other employees and work on at least one project"

      SELECT SupervSSN
      FROM Employee
      INTERSECT
      SELECT ESSN
      FROM Work_on
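The three set operations can be tried on the populated database with the sketch below. It assumes an in-memory SQLite database, which spells the difference operator EXCEPT (the standard keyword) rather than MINUS. The helper `column` and the comparison with UNION ALL are additions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (ESSN INTEGER, Salary INTEGER, BonusAmount INTEGER, SupervSSN INTEGER);
CREATE TABLE Work_on (ProjNumber INTEGER, ESSN INTEGER);
INSERT INTO Employee VALUES
  (123456789, 300000, 10000, 333445555), (333445555, 400000, 5000,  888665555),
  (999887777, 250000, 2000,  987654321), (987654321, 430000, 9000,  888665555),
  (666884444, 380000, 15000, 333445555), (453453453, 250000, 2500,  333445555),
  (987987987, 250000, 12000, 987654321), (888665555, 550000, 8000,  NULL);
INSERT INTO Work_on VALUES
  (1, 123456789), (2, 123456789), (3, 666884444), (1, 453453453),
  (2, 453453453), (2, 333445555), (3, 333445555), (10, 333445555),
  (20, 333445555), (30, 999887777), (10, 999887777), (10, 987987987),
  (30, 987987987), (30, 987654321), (20, 987654321), (20, 888665555);
""")

def column(sql):
    """Hypothetical helper: first column of the result, sorted in Python."""
    return sorted(row[0] for row in conn.execute(sql))

# UNION removes duplicates (250000 appears three times as a salary).
costs = column("SELECT Salary FROM Employee UNION SELECT BonusAmount FROM Employee")
costs_with_duplicates = column(
    "SELECT Salary FROM Employee UNION ALL SELECT BonusAmount FROM Employee"
)

# Difference: employees who are not supervisors.
not_supervisors = column(
    "SELECT ESSN FROM Employee EXCEPT SELECT SupervSSN FROM Employee"
)

# Intersection: supervisors who also work on at least one project.
working_supervisors = column(
    "SELECT SupervSSN FROM Employee INTERSECT SELECT ESSN FROM Work_on"
)

print(len(costs), len(costs_with_duplicates))
print(not_supervisors)
print(working_supervisors)
```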


Aggregate functions

- 5 predefined functions: COUNT, SUM, MIN, MAX, AVG
- Principle:
  - applied to all the values in a column of a relation
  - produces a single value
- Rule without grouping (grouping is covered later):
  - only in the SELECT clause, never in the WHERE clause
  - do not mix functions and attributes in the SELECT clause!

Examples with functions (1)

"Find the maximum salary, the minimum salary, and the average salary among all employees"

    SELECT MAX(Salary), MIN(Salary), AVG(Salary)
    FROM Employee

"Find the total salary amount and the total bonus received by the employees"

    SELECT SUM(Salary), SUM(BonusAmount)
    FROM Employee


Examples with functions (2)

"Retrieve the number of employees in the 'R&D' department"

    SELECT COUNT(ESSN)
    FROM Employee E, Department D
    WHERE E.DepNumber = D.DepNumber
      AND DepName = 'R&D'

"Retrieve the total number of employees in the company"

    SELECT COUNT(*)
    FROM Employee

Examples with functions (3)

"Employees whose salary is greater than the average salary"

    SELECT ESSN
    FROM Employee
    WHERE Salary > (
        SELECT AVG(Salary)
        FROM Employee)
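The aggregate examples can be evaluated on the populated database with the sketch below, which assumes an in-memory SQLite database reduced to the columns used. It also illustrates the rule that a function is not written directly in the WHERE clause: the comparison with the average goes through a sub-query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (ESSN INTEGER, Salary INTEGER, BonusAmount INTEGER, DepNumber INTEGER);
CREATE TABLE Department (DepNumber INTEGER, DepName TEXT);
INSERT INTO Employee VALUES
  (123456789, 300000, 10000, 5), (333445555, 400000, 5000,  5),
  (999887777, 250000, 2000,  4), (987654321, 430000, 9000,  4),
  (666884444, 380000, 15000, 5), (453453453, 250000, 2500,  5),
  (987987987, 250000, 12000, 4), (888665555, 550000, 8000,  1);
INSERT INTO Department VALUES (5, 'R&D'), (4, 'Administration'), (1, 'Headquarters');
""")

# Each aggregate collapses a whole column into a single value.
salary_stats = conn.execute(
    "SELECT MAX(Salary), MIN(Salary), AVG(Salary) FROM Employee"
).fetchone()
totals = conn.execute("SELECT SUM(Salary), SUM(BonusAmount) FROM Employee").fetchone()

# COUNT after a join: number of employees of the 'R&D' department.
rd_headcount = conn.execute("""
    SELECT COUNT(ESSN)
    FROM Employee E, Department D
    WHERE E.DepNumber = D.DepNumber AND DepName = 'R&D'
""").fetchone()[0]
headcount = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]

# The average is computed in a sub-query, then compared in the outer WHERE.
above_average = sorted(row[0] for row in conn.execute("""
    SELECT ESSN FROM Employee
    WHERE Salary > (SELECT AVG(Salary) FROM Employee)
"""))

print(salary_stats, totals, rd_headcount, headcount)
print(above_average)
```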


Grouping

- Principle
  - Horizontal subgroups of tuples based on the value of an attribute or a group of attributes, specified in the GROUP BY clause
  - The relation is (logically) fragmented into groups of tuples, where all the tuples of each group have the same value for the grouping attribute(s)
  - The function is applied to each subgroup independently
- Selection on the groups: HAVING clause
  - to retrieve the values of these functions for only those groups that satisfy a given condition
  - the HAVING clause is used for specifying a selection condition on groups (rather than on individual tuples)

Grouping examples

"Find the average salary for each department"

    SELECT DepNumber, AVG(Salary)
    FROM Employee
    GROUP BY DepNumber

"… ordered by descending department number"

    SELECT DepNumber, AVG(Salary)
    FROM Employee
    GROUP BY DepNumber
    ORDER BY 1 DESC

"… only if the department has at least 4 employees"

    SELECT DepNumber, AVG(Salary)
    FROM Employee
    GROUP BY DepNumber
    HAVING COUNT(*) >= 4
    ORDER BY 1 DESC
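A short sketch of the grouping examples, assuming an in-memory SQLite database that holds only the Salary and DepNumber columns of Employee. It shows the grouped averages and the effect of the HAVING condition on the groups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (Salary INTEGER, DepNumber INTEGER);
INSERT INTO Employee VALUES
  (300000, 5), (400000, 5), (250000, 4), (430000, 4),
  (380000, 5), (250000, 5), (250000, 4), (550000, 1);
""")

# One result tuple per department, ordered by the first result column.
average_by_department = conn.execute("""
    SELECT DepNumber, AVG(Salary)
    FROM Employee
    GROUP BY DepNumber
    ORDER BY 1 DESC
""").fetchall()

# HAVING filters whole groups: only departments with at least 4 employees remain.
large_departments = conn.execute("""
    SELECT DepNumber, AVG(Salary)
    FROM Employee
    GROUP BY DepNumber
    HAVING COUNT(*) >= 4
    ORDER BY 1 DESC
""").fetchall()

print(average_by_department)
print(large_departments)
```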

Grouping computation

1. Sort the relation on the grouping attributes.
2. Create a sub-relation for each group having the same value on all the grouping attributes, here DepNumber.
3. Apply the SELECT clause on each group (in the example, the department number and the average salary of the group).
4. Unify the results.
5. Apply the HAVING clause.

Initial relation:

    Employee  Salary  DepNumber
              300000  5
              400000  5
              250000  4
              430000  4
              380000  5
              250000  5
              250000  4
              550000  1

After steps 1 and 2 (sorted and split into groups):

    Employee  Salary  DepNumber
              300000  5
              400000  5
              380000  5
              250000  5

              250000  4
              430000  4
              250000  4

              550000  1

After steps 3 and 4:

    Employee  avg(Salary)  DepNumber
              332500       5
              310000       4
              550000       1

After step 5:

    Employee  avg(Salary)  DepNumber
              332500       5

Example of wrong query

    SELECT DepNumber, ESSN, AVG(Salary)
    FROM Employee
    GROUP BY DepNumber

Expected result:

    DepNumber  ESSN                                          avg(Salary)
    5          {123456789, 333445555, 666884444, 453453453}  332500
    4          {987987987, 999887777, 987654321}             310000
    1          {888665555}                                   550000

Problem: ESSN is multivalued for each DepNumber, whereas a relation must hold one value per cell (1st normal form).
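The evaluation steps of a grouped query can be mimicked by hand. The sketch below is plain Python, not how a DBMS is actually implemented: it sorts the (Salary, DepNumber) pairs, splits them into groups, applies the average to each group, and finally keeps the groups with at least 4 tuples. The variable names are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# (Salary, DepNumber) pairs of the initial relation.
employee = [
    (300000, 5), (400000, 5), (250000, 4), (430000, 4),
    (380000, 5), (250000, 5), (250000, 4), (550000, 1),
]

# Step 1: sort the relation on the grouping attribute (DepNumber).
sorted_rows = sorted(employee, key=itemgetter(1), reverse=True)

# Step 2: one sub-relation per DepNumber value (groupby needs sorted input).
groups = {
    dep_number: [salary for salary, _ in rows]
    for dep_number, rows in groupby(sorted_rows, key=itemgetter(1))
}

# Steps 3 and 4: apply the SELECT clause to each group and collect the results.
averages = {dep: sum(salaries) / len(salaries) for dep, salaries in groups.items()}

# Step 5: apply the HAVING condition, here COUNT(*) >= 4.
kept = {dep: avg for dep, avg in averages.items() if len(groups[dep]) >= 4}

print(averages)
print(kept)
```

Each group yields exactly one output tuple, which is why a non-grouping attribute such as ESSN, with several values per group, has no place in the SELECT clause.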

Predicates: ALL, ANY, EXISTS

- ANY
  - Tests if the value of an attribute satisfies a comparison criterion with at least one result of a sub-query

"Retrieve the employees who work on at least one of the projects of the employee whose ESSN = 13334"

    SELECT ESSN
    FROM Work_on
    WHERE ProjNumber = ANY (
        SELECT ProjNumber
        FROM Work_on
        WHERE ESSN = 13334)

Predicates (2)

- ALL
  - Tests if the value of an attribute satisfies a comparison criterion with all the results of a sub-query

"Employees who have the best salary"

    SELECT ESSN
    FROM Employee
    WHERE Salary >= ALL (
        SELECT Salary
        FROM Employee)
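SQLite does not implement the ANY and ALL quantifiers, so the sketch below uses equivalent formulations instead: `= ANY (sub-query)` is rewritten as `IN (sub-query)`, and `>= ALL (sub-query)` as a comparison with the maximum, which gives the same answer here because no salary is null. The ESSN 333445555 from the populated database replaces the short identifier of the example, and DISTINCT is added to list each employee once. Both are assumptions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (ESSN INTEGER, Salary INTEGER);
CREATE TABLE Work_on (ProjNumber INTEGER, ESSN INTEGER);
INSERT INTO Employee VALUES
  (123456789, 300000), (333445555, 400000), (999887777, 250000), (987654321, 430000),
  (666884444, 380000), (453453453, 250000), (987987987, 250000), (888665555, 550000);
INSERT INTO Work_on VALUES
  (1, 123456789), (2, 123456789), (3, 666884444), (1, 453453453),
  (2, 453453453), (2, 333445555), (3, 333445555), (10, 333445555),
  (20, 333445555), (30, 999887777), (10, 999887777), (10, 987987987),
  (30, 987987987), (30, 987654321), (20, 987654321), (20, 888665555);
""")

# "= ANY (sub-query)" expressed with IN: the project number must match
# at least one project of employee 333445555 (projects 2, 3, 10 and 20).
coworkers = [row[0] for row in conn.execute("""
    SELECT DISTINCT ESSN
    FROM Work_on
    WHERE ProjNumber IN (SELECT ProjNumber FROM Work_on WHERE ESSN = 333445555)
    ORDER BY ESSN
""")]

# ">= ALL (sub-query)" expressed with MAX: the salary must be at least
# as high as every salary, hence at least the maximum.
best_paid = [row[0] for row in conn.execute("""
    SELECT ESSN
    FROM Employee
    WHERE Salary >= (SELECT MAX(Salary) FROM Employee)
""")]

print(coworkers)
print(best_paid)
```

The result of the first query includes 333445555 itself, since that employee trivially works on their own projects.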


Predicates (3)

- EXISTS
  - Tests whether the answer to a sub-query is non-empty

"Retrieve the last names of the employees who work on at least one project"

    SELECT LastName
    FROM Employee E
    WHERE EXISTS (
        SELECT ProjNumber
        FROM Work_on W
        WHERE W.ESSN = E.ESSN);

Division using EXISTS predicate

"Which departments are distributed on all the sites?"

- A department is kept if there is no site in which it is not located.
- => Double negation

    SELECT DepNumber
    FROM Department D
    WHERE NOT EXISTS (
        SELECT DLocation
        FROM Dept_locat L1
        WHERE NOT EXISTS (
            SELECT L2.*
            FROM Dept_locat L2
            WHERE L2.DLocation = L1.DLocation
              AND L2.DepNumber = D.DepNumber))


Division without EXISTS

    SELECT D.*
    FROM Department D
    WHERE D.DepNumber IN (
        SELECT DL.DepNumber
        FROM Dept_locat DL
        GROUP BY DL.DepNumber
        HAVING COUNT(*) = (
            SELECT COUNT(DISTINCT DLocation)
            FROM Dept_locat))

Summary

The clauses are listed in writing order; the number gives the order of evaluation.

    6  SELECT    projection on the A1, ..., Ap attributes and functions computation (applied to groups if there are any)
    1  FROM      Cartesian product of the Ri relations
    2  WHERE     selection of the tuples of (1) respecting the C1 condition
    3  GROUP BY  partition of the set obtained in (2) according to the Ak values
    4  HAVING    selection of the groups of (3) verifying the C2 condition
    5  ORDER BY  sort according to the Al values
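The two formulations of the division (double NOT EXISTS and counting) can be compared with the sketch below, which assumes an in-memory SQLite database. On the populated database no department is located on all four sites, so both queries return nothing. The extra row (5, 'Stanford') is hypothetical and is added only to obtain a non-empty answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (DepNumber INTEGER, DepName TEXT);
CREATE TABLE Dept_locat (DepNumber INTEGER, DLocation TEXT);
INSERT INTO Department VALUES (5, 'R&D'), (4, 'Administration'), (1, 'Headquarters');
INSERT INTO Dept_locat VALUES
  (1, 'Houston'), (4, 'Stanford'), (5, 'Bellaire'), (5, 'Sugarland'), (5, 'Houston');
""")

# Division by double negation: no site exists where the department is absent.
NOT_EXISTS_FORM = """
    SELECT D.DepNumber
    FROM Department D
    WHERE NOT EXISTS (
        SELECT L1.DLocation FROM Dept_locat L1
        WHERE NOT EXISTS (
            SELECT L2.* FROM Dept_locat L2
            WHERE L2.DLocation = L1.DLocation AND L2.DepNumber = D.DepNumber))
    ORDER BY D.DepNumber
"""

# Division by counting: the department has as many sites as there are distinct sites.
COUNTING_FORM = """
    SELECT D.DepNumber
    FROM Department D
    WHERE D.DepNumber IN (
        SELECT DL.DepNumber FROM Dept_locat DL
        GROUP BY DL.DepNumber
        HAVING COUNT(*) = (SELECT COUNT(DISTINCT DLocation) FROM Dept_locat))
    ORDER BY D.DepNumber
"""

def run(sql):
    """Hypothetical helper: department numbers returned by a query."""
    return [row[0] for row in conn.execute(sql)]

before = (run(NOT_EXISTS_FORM), run(COUNTING_FORM))

# Hypothetical extra location, so that department 5 covers every site.
conn.execute("INSERT INTO Dept_locat VALUES (5, 'Stanford')")
after = (run(NOT_EXISTS_FORM), run(COUNTING_FORM))

print(before, after)
```

The counting form relies on Dept_locat holding no duplicate (DepNumber, DLocation) pair; with duplicates, COUNT(DISTINCT DL.DLocation) would be needed in the HAVING clause.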


Summary (2)

- Search condition: WHERE (selection of tuples), HAVING (selection of groups)
  - Composition of elementary conditions (AND, OR, NOT)
  - valued to True or False
- Elementary condition: valued to True or False
  - Predicate:
    - Comparison: =, <, <=, >, >=, <>
      - Attribute/value
      - Attribute/attribute
    - Interval: BETWEEN
    - String: LIKE
    - Nullity: IS NULL
    - Belonging: IN
    - Quantification: EXISTS, ANY, ALL

Complete example

"List the departments with the number of employees who do not receive any bonus and are involved in at least one project, keeping the departments with more than one such employee, ordered by ascending department number"

    SELECT D.DepName, D.DepNumber, COUNT(DISTINCT E.ESSN)
    FROM Department D, Employee E, Work_on T
    WHERE E.BonusAmount = 0
      AND D.DepNumber = E.DepNumber
      AND E.ESSN = T.ESSN
    GROUP BY D.DepNumber, D.DepName
    HAVING COUNT(DISTINCT E.ESSN) > 1
    ORDER BY D.DepNumber;
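In the three-table join of the complete example, an employee appears once per project worked on, so COUNT(*) counts assignments while COUNT(DISTINCT E.ESSN) counts employees. The sketch below, assuming an in-memory SQLite database, contrasts the two. Since every employee of the populated database has a bonus, the zero BonusAmount values used here are hypothetical, and only a few Work_on rows are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (DepNumber INTEGER, DepName TEXT);
CREATE TABLE Employee (ESSN INTEGER, BonusAmount INTEGER, DepNumber INTEGER);
CREATE TABLE Work_on (ProjNumber INTEGER, ESSN INTEGER);
INSERT INTO Department VALUES (5, 'R&D'), (4, 'Administration'), (1, 'Headquarters');
-- Hypothetical bonuses: three employees are given no bonus for the example.
INSERT INTO Employee VALUES
  (123456789, 0, 5), (453453453, 0, 5), (987987987, 0, 4),
  (333445555, 5000, 5), (888665555, 8000, 1);
INSERT INTO Work_on VALUES
  (1, 123456789), (2, 123456789), (1, 453453453), (2, 453453453),
  (10, 987987987), (30, 987987987), (2, 333445555), (20, 888665555);
""")

QUERY = """
    SELECT D.DepName, D.DepNumber, COUNT(*), COUNT(DISTINCT E.ESSN)
    FROM Department D, Employee E, Work_on T
    WHERE E.BonusAmount = 0
      AND D.DepNumber = E.DepNumber
      AND E.ESSN = T.ESSN
    GROUP BY D.DepNumber, D.DepName
    HAVING {condition}
    ORDER BY D.DepNumber
"""

# Filtering on the number of join rows keeps a department with a single
# employee who works on two projects.
by_rows = conn.execute(QUERY.format(condition="COUNT(*) > 1")).fetchall()

# Filtering on the number of distinct employees keeps only departments
# with more than one such employee.
by_employees = conn.execute(
    QUERY.format(condition="COUNT(DISTINCT E.ESSN) > 1")
).fetchall()

print(by_rows)
print(by_employees)
```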


Update

- Insert
- Delete
- Modification

Insert (1)

- Insert of only one tuple

"Create the new department 'Distribution' number 6, managed by the employee 18886 since the 15/09/96"

    INSERT INTO Department
    VALUES (6, 'Distribution', 18886, '1996-09-15');

"Create the new department 'Distribution' number 6, managed by the employee 18886. The start date is unknown for the moment"

    INSERT INTO Department (DepNumber, DepName, MgrSSN)
    VALUES (6, 'Distribution', 18886);
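The difference between the two INSERT forms can be observed with the sketch below, assuming an in-memory SQLite database with an empty Department relation. The DELETE between the two inserts is an addition of this sketch, so that department 6 is created only once at a time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Department "
    "(DepNumber INTEGER, DepName TEXT, MgrSSN INTEGER, MgrStartDate TEXT)"
)

# Form 1: no column list, so one value per column, in declaration order.
conn.execute("INSERT INTO Department VALUES (6, 'Distribution', 18886, '1996-09-15')")
full_row = conn.execute("SELECT * FROM Department WHERE DepNumber = 6").fetchone()

# Remove the tuple again so the second form can be shown on the same key.
conn.execute("DELETE FROM Department WHERE DepNumber = 6")

# Form 2: explicit column list; the omitted column (MgrStartDate) becomes NULL.
conn.execute(
    "INSERT INTO Department (DepNumber, DepName, MgrSSN) "
    "VALUES (6, 'Distribution', 18886)"
)
partial_row = conn.execute("SELECT * FROM Department WHERE DepNumber = 6").fetchone()

print(full_row)
print(partial_row)
```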


Insert (2)

- Insert of a set of tuples

    CREATE TABLE WomenEmployees(ESSN Integer, LastName Char(40), FirstName Char(40))

    INSERT INTO WomenEmployees
    SELECT ESSN, LastName, FirstName
    FROM Employee
    WHERE Sex = 'F'

Or, creating and filling the table in one statement:

    CREATE TABLE WomenEmployees AS
    SELECT ESSN, LastName, FirstName
    FROM Employee
    WHERE Sex = 'F'

Deleting

"Process the end of the project ProductY"

    DELETE FROM Work_on
    WHERE ProjNumber IN (SELECT P.ProjNumber
                         FROM Project P
                         WHERE P.ProjName = 'ProductY');

"Delete all the employees of the 'R&D' department"

    DELETE FROM Employee
    WHERE DepNumber IN (SELECT D.DepNumber
                        FROM Department D
                        WHERE D.DepName = 'R&D')
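A minimal sketch of the set-oriented insert and of a delete driven by a sub-query, assuming an in-memory SQLite database reduced to the columns involved. The second table name, WomenEmployeesCopy, is hypothetical and is used only so that both creation forms can coexist in one database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (ESSN INTEGER, LastName TEXT, FirstName TEXT, Sex TEXT, DepNumber INTEGER);
CREATE TABLE Department (DepNumber INTEGER, DepName TEXT);
INSERT INTO Employee VALUES
  (123456789, 'Smith',   'John',     'M', 5), (333445555, 'Wong',    'Franklin', 'M', 5),
  (999887777, 'Zelaya',  'Alicia',   'F', 4), (987654321, 'Wallace', 'Jennifer', 'F', 4),
  (666884444, 'Narayan', 'Ramesh',   'M', 5), (453453453, 'English', 'Joyce',    'F', 5),
  (987987987, 'Jabbar',  'Ahmad',    'M', 4), (888665555, 'Borg',    'James',    'M', 1);
INSERT INTO Department VALUES (5, 'R&D'), (4, 'Administration'), (1, 'Headquarters');

-- Form 1: create the table, then fill it with the result of a query.
CREATE TABLE WomenEmployees (ESSN INTEGER, LastName CHAR(40), FirstName CHAR(40));
INSERT INTO WomenEmployees
  SELECT ESSN, LastName, FirstName FROM Employee WHERE Sex = 'F';

-- Form 2: create and fill in one statement (hypothetical table name).
CREATE TABLE WomenEmployeesCopy AS
  SELECT ESSN, LastName, FirstName FROM Employee WHERE Sex = 'F';
""")

women = [r[0] for r in conn.execute(
    "SELECT LastName FROM WomenEmployees ORDER BY LastName")]
women_copy = [r[0] for r in conn.execute(
    "SELECT LastName FROM WomenEmployeesCopy ORDER BY LastName")]

# Delete every employee of the 'R&D' department; the sub-query finds its number.
deleted = conn.execute("""
    DELETE FROM Employee
    WHERE DepNumber IN (SELECT D.DepNumber FROM Department D WHERE D.DepName = 'R&D')
""").rowcount
remaining = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]

print(women, women_copy, deleted, remaining)
```

The copies are independent of Employee: deleting the 'R&D' employees afterwards leaves WomenEmployees unchanged.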


Updates

"Increase by 10 % the salary of all the employees who have no bonus"

    UPDATE Employee
    SET Salary = Salary * 1.1
    WHERE BonusAmount = 0;

SQL Query Processing

[Figure: processing pipeline for an SQL query such as

    SELECT LastName, FirstName
    FROM Employee
    WHERE Salary > 300000

Stages: Syntactic Analysis, Verification, Optimisation, Access plan generation (producing the query execution form), Execution on the DB. Labels attached to the DD: Schemas, Rights, Views, placing, index, IC, Statistics.]
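The UPDATE statement can be tried with the sketch below, assuming an in-memory SQLite database. Every employee of the populated database has a bonus, so the zero BonusAmount given to one of the two rows is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (ESSN INTEGER, Salary REAL, BonusAmount INTEGER);
-- Hypothetical data: the first employee is given no bonus for the example.
INSERT INTO Employee VALUES (453453453, 250000, 0), (123456789, 300000, 10000);
""")

# SET computes the new value from the old one; WHERE selects the tuples to modify.
updated = conn.execute(
    "UPDATE Employee SET Salary = Salary * 1.1 WHERE BonusAmount = 0"
).rowcount

salaries = dict(conn.execute("SELECT ESSN, Salary FROM Employee"))

print(updated)
print(salaries)
```

Only the tuple satisfying the WHERE condition is modified; without a WHERE clause, every salary would be increased. The multiplication is done in floating point, so the new salary may differ from 275000 by a tiny rounding error.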
