Relational databases and MySQL
Juha Takkinen

Outline
1. Introduction: Relational data model and SQL
2. Creating tables in MySQL
3. Simple queries and matching
4. Joins and aliasing
5. More about syntax and built-in functions
6. What is NULL?
7. Inserting, deleting and modifying data
8. Views
Thanks to José and Vaida for most of the slides.
Overview
[Figure: the real world is captured in a model. Queries go to the DBMS, which processes queries and updates, accesses the stored data in the physical database, and returns answers.]

Relational data model
Edgar F. Codd, IBM Research Laboratory, San Jose, California (1970), “A Relational Model of Data for Large Shared Data Banks”, in Communications of the ACM, vol. 13, no. 6, p. 377-387, June 1970.
All data organized into tables
Other models, e.g. hierarchical, network, object or object-relational DBMS, XML (back to network database model!)
Relational model concepts

Relation name: EMPLOYEE
Attributes: FNAME, M, LNAME, SSN, BDATE, ADDRESS, S, SALARY, SUPERSSN, DNO
Domain of values, e.g.: Integer; String shorter than 30 chars; Character, M or F; 400 < x < 8000; yyyy-mm-dd
Tuples: the rows of the table

EMPLOYEE
FNAME   M  LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
Ramesh  K  Narayan  666884444  1962-09-15  …        M  38000   333445555  5
Joyce   A  English  453453453  1972-07-31  …        F  25000   333445555  5
Ahmad   V  Jabbar   987987987  1969-03-29  …        M  25000   987654321  4
James   E  Borg     888665555  1937-11-10  …        M  55000   null       1
...

Relation schema: EMPLOYEE (FNAME, M, LNAME, SSN, BDATE, ADDRESS, S, SALARY, SUPERSSN, DNO)
Relation: set of tuples, i.e. no duplicates
Database: collection of relations
Database schema: collection of relation schemas + integrity constraints

Relational database constraints
(annotations on the same EMPLOYEE table)
Is not null
Unique
Candidate keys
Primary key + entity integrity constraint
Relational database constraints
Foreign keys + referential integrity constraint

EMPLOYEE
FNAME   M  LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
Ramesh  K  Narayan  666884444  1962-09-15  …        M  38000   333445555  5
Joyce   A  English  453453453  1972-07-31  …        F  25000   333445555  5
Ahmad   V  Jabbar   987987987  1969-03-29  …        M  25000   987654321  4
James   E  Borg     888665555  1937-11-10  …        M  55000   null       1

DEPARTMENT
DNAME           DNUMBER  MGRSSN     MGRSTARTDATE
Research        5        333445555  1988-05-22
Administration  4        987654321  1995-01-01
Headquarters    1        888665555  1981-06-19

Integrity constraints
(Atomic) domain (or NULL).
Key.
NOT NULL.
Entity integrity: PK is NOT NULL.
Referential integrity: FK of R refers to S if domain(FK(R)) = domain(PK(S)) and, for every tuple r in R, r.FK = s.PK for some s in S, otherwise r.FK is NULL.
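The integrity constraints listed above can be tried out in a few lines. The following is a minimal sketch, not part of the original slides: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so that the example runs without a server), a cut-down EMPLOYEE/DEPARTMENT schema, and a hypothetical helper named is_rejected.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption, for runnability only).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked to

conn.execute("""
    CREATE TABLE department (
        dname   TEXT NOT NULL,
        dnumber INTEGER PRIMARY KEY
    )""")
conn.execute("""
    CREATE TABLE employee (
        lname TEXT NOT NULL,
        ssn   TEXT NOT NULL PRIMARY KEY,
        dno   INTEGER REFERENCES department(dnumber)
    )""")
conn.execute("INSERT INTO department VALUES ('Research', 5)")
conn.execute("INSERT INTO employee VALUES ('Narayan', '666884444', 5)")


def is_rejected(sql):
    """Hypothetical helper: True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Key constraint: a second tuple with an existing primary key value.
duplicate_key = is_rejected("INSERT INTO employee VALUES ('English', '666884444', 5)")
# Entity integrity: the primary key may not be NULL.
null_key = is_rejected("INSERT INTO employee VALUES ('English', NULL, 5)")
# Referential integrity: department 7 does not exist.
dangling_fk = is_rejected("INSERT INTO employee VALUES ('English', '453453453', 7)")
# A NULL foreign key is allowed (r.FK = s.PK for some s, otherwise NULL).
null_fk = is_rejected("INSERT INTO employee VALUES ('Borg', '888665555', NULL)")

print(duplicate_key, null_key, dangling_fk, null_fk)
```

The first three inserts are refused and the last one is accepted, which mirrors the key, entity integrity and referential integrity rules of the slide.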
SQL and MySQL
Structured Query Language
DDL and DML
Declarative (what, not how)
Originally interface to System R (SEQUEL)
Used in many database systems, e.g. Oracle
http://www.forbes.com/lists/2010/10/billionaires-2010_Lawrence-Ellison_JKEX.html
Standard language for relational databases
MySQL: Open source DBMS.
Table, row, column = relation, tuple, attribute

COMPANY schema from book
EMPLOYEE (FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO)
DEPT-LOCATIONS (DNUMBER, DLOCATION)
DEPARTMENT (DNAME, DNUMBER, MGRSSN, MGRSTARTDATE)
WORKS-ON (ESSN, PNO, HOURS)
PROJECT (PNAME, PNUMBER, PLOCATION, DNUM)
DEPENDENT (ESSN, DEPENDENT-NAME, SEX, BDATE, RELATIONSHIP)
Creating tables
Backticks around the table name are optional, but necessary if the table name contains reserved words:
CREATE TABLE `
Creating tables

CREATE TABLE dependent (
    essn integer references emp(ssn) on delete cascade,
    dependent_name varchar(9) default 'NN',
    sex varchar(1) check (sex in ('F', 'M')),
    bdate date,
    relationship varchar(8),
    constraint pk_dependent primary key (essn, dependent_name))
ENGINE=InnoDB;

Creating tables

TEACHER (PNum, FName, LName, Office, Phone)

CREATE TABLE TEACHER (
    PNum CHAR(11),
    FName VARCHAR(20) UNIQUE,
    LName VARCHAR(20),
    Office CHAR(10) DEFAULT 'CommonRoom',
    Phone CHAR(4) NOT NULL,
    CONSTRAINT pk_TEACHER PRIMARY KEY (PNum),
    CONSTRAINT fk_TEACHER FOREIGN KEY (Office) REFERENCES OFFICE(ID)
        ON DELETE CASCADE ON UPDATE SET NULL
) ENGINE=InnoDB;

A candidate key over several columns is declared as a table constraint:
    CONSTRAINT cand_key UNIQUE(FName,LName)
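To see what DEFAULT, CHECK and ON DELETE CASCADE in the dependent table do at run time, here is a small sketch. It is an illustration under assumptions, not the slides' own material: SQLite (via Python's sqlite3) replaces MySQL, the ENGINE=InnoDB clause is dropped because SQLite has no storage engines, the emp table is reduced to two columns, and the dependent rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")        # SQLite stands in for MySQL (assumption)
conn.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE in SQLite

conn.execute("CREATE TABLE emp (ssn INTEGER PRIMARY KEY, lname VARCHAR(20))")
conn.execute("""
    CREATE TABLE dependent (
        essn           INTEGER REFERENCES emp(ssn) ON DELETE CASCADE,
        dependent_name VARCHAR(9) DEFAULT 'NN',
        sex            VARCHAR(1) CHECK (sex IN ('F', 'M')),
        CONSTRAINT pk_dependent PRIMARY KEY (essn, dependent_name)
    )""")
conn.execute("INSERT INTO emp VALUES (333445555, 'Wong')")

# dependent_name is omitted, so the DEFAULT value 'NN' is stored.
conn.execute("INSERT INTO dependent (essn, sex) VALUES (333445555, 'F')")
default_name = conn.execute("SELECT dependent_name FROM dependent").fetchone()[0]

# The CHECK constraint rejects any sex value other than 'F' or 'M'.
try:
    conn.execute("INSERT INTO dependent VALUES (333445555, 'X1', 'Q')")
    check_rejected = False
except sqlite3.IntegrityError:
    check_rejected = True

# Deleting the employee cascades to the dependents that reference it.
conn.execute("DELETE FROM emp WHERE ssn = 333445555")
remaining = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]

print(default_name, check_rejected, remaining)
```

How strictly such clauses are enforced can differ between DBMSs and versions, so the behaviour should be verified on the MySQL installation actually used.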
Modifying tables
Change the definition of a table: add, delete and modify columns and constraints
ALTER TABLE EMPLOYEE ADD COLUMN JOB VARCHAR(12);
ALTER TABLE EMPLOYEE DROP COLUMN ADDRESS CASCADE;

Querying tables
SELECT
Simple query
List SSN for all employees

SELECT SSN
FROM EMPLOYEE;

SSN
123456789
333445555
999887777
987654321
666884444
453453453
987987987
888665555

Use of *
List all information about the employees of department 5

SELECT FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO
FROM EMPLOYEE
WHERE DNO = 5;

or

SELECT *
FROM EMPLOYEE
WHERE DNO = 5;
Simple query
List last name, birth date and address for all employees whose name is 'Alicia J. Zelaya'

SELECT LNAME, BDATE, ADDRESS
FROM EMPLOYEE
WHERE FNAME = 'Alicia' AND MINIT = 'J' AND LNAME = 'Zelaya';

LNAME    BDATE       ADDRESS
Zelaya   1968-07-19  3321 Castle, Spring, TX

Exact vs substring matching
List last name, birth date and address for all employees whose name contains the substring 'aya'

SELECT LNAME, BDATE, ADDRESS
FROM EMPLOYEE
WHERE LNAME LIKE '%aya%';

Different from WHERE LNAME = '%aya%';
% replaces 0 or more characters
_ replaces a single character

LNAME    BDATE       ADDRESS
Zelaya   1968-07-19  3321 Castle, Spring, TX
Narayan  1962-09-15  975 Fire Oak, Humble, TX

Case insensitive!!!
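The difference between = and LIKE, and the two wildcards, can be checked with a short experiment. This is a sketch under assumptions: SQLite (through Python's sqlite3) is used in place of MySQL, the table holds only four last names from the slides, and lnames is a hypothetical helper. In SQLite, LIKE ignores ASCII case by default while = is case sensitive; in MySQL the case sensitivity of both depends on the column's collation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.execute("CREATE TABLE employee (lname TEXT)")
conn.executemany("INSERT INTO employee VALUES (?)",
                 [("Zelaya",), ("Narayan",), ("Smith",), ("Wong",)])


def lnames(where, pattern):
    """Hypothetical helper: last names matching the WHERE clause, sorted."""
    sql = f"SELECT lname FROM employee WHERE {where} ORDER BY lname"
    return [row[0] for row in conn.execute(sql, (pattern,))]


substring = lnames("lname LIKE ?", "%aya%")  # % matches 0 or more characters
literal = lnames("lname = ?", "%aya%")       # = compares with the literal string '%aya%'
one_char = lnames("lname LIKE ?", "_ong")    # _ matches exactly one character
upper = lnames("lname LIKE ?", "%AYA%")      # LIKE ignores ASCII case in SQLite

print(substring, literal, one_char, upper)
```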
Tables as multisets
Difference wrt relational model!!!
SQL considers a table as a multi-set (bag), i.e. tuples can occur more than once in a table
Why?
    Removing duplicates is expensive
    User may want information about duplicates
    Aggregation operators
Exceptions:
    The table has a key, i.e. PK or UNIQUE (which is not compulsory).
    Use SELECT DISTINCT instead of SELECT.
    SELECT attributes1 FROM tables1 WHERE condition1
    UNION
    SELECT attributes2 FROM tables2 WHERE condition2;
    But not if UNION ALL.

Example
List all salaries

SELECT SALARY
FROM EMPLOYEE;

SALARY
30000
40000
25000
43000
38000
25000
25000
55000

List all salaries without duplicates.

SELECT DISTINCT SALARY
FROM EMPLOYEE;

SALARY
30000
40000
25000
43000
38000
55000
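The bag semantics can be verified by counting rows. The sketch below is an illustration, not part of the slides: it runs on SQLite via Python's sqlite3 instead of MySQL (an assumption), uses the eight salaries from the example, and the two WHERE conditions in the UNION queries are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.execute("CREATE TABLE employee (lname TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [
    ("Smith", 30000), ("Wong", 40000), ("Zelaya", 25000), ("Wallace", 43000),
    ("Narayan", 38000), ("English", 25000), ("Jabbar", 25000), ("Borg", 55000)])


def column(sql):
    """Hypothetical helper: first column of the query result as a list."""
    return [row[0] for row in conn.execute(sql)]


all_salaries = column("SELECT salary FROM employee")
distinct_salaries = column("SELECT DISTINCT salary FROM employee")

# UNION removes duplicates, UNION ALL keeps them.
union_rows = column(
    "SELECT salary FROM employee WHERE salary < 30000 "
    "UNION SELECT salary FROM employee WHERE salary < 40000")
union_all_rows = column(
    "SELECT salary FROM employee WHERE salary < 30000 "
    "UNION ALL SELECT salary FROM employee WHERE salary < 40000")

print(len(all_salaries), len(distinct_salaries), len(union_rows), len(union_all_rows))
```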
Join. Cartesian product
List all employees and their department

SELECT LNAME, DNAME
FROM EMPLOYEE
INNER JOIN DEPARTMENT;

Cartesian product
Result: each tuple in EMPLOYEE is combined with each tuple in DEPARTMENT, i.e. 24 rows (LNAME, DNAME): each of Smith, Wong, Zelaya, Wallace, Narayan, English, Jabbar and Borg paired with Research, then with Administration, then with Headquarters.

Join. Equijoin
List all employees and their department

SELECT LNAME, DNAME
FROM EMPLOYEE
INNER JOIN DEPARTMENT
ON DNO = DNUMBER;

DNO: foreign key in EMPLOYEE
DNUMBER: primary key in DEPARTMENT

Equijoin (result emphasized; rows with DNO = DNUMBER are marked with *)

LNAME    DNO  DNAME           DNUMBER
Smith    5    Research        5        *
Wong     5    Research        5        *
Zelaya   4    Research        5
Wallace  4    Research        5
Narayan  5    Research        5        *
English  5    Research        5        *
Jabbar   4    Research        5
Borg     1    Research        5
Smith    5    Administration  4
Wong     5    Administration  4
Zelaya   4    Administration  4        *
Wallace  4    Administration  4        *
Narayan  5    Administration  4
English  5    Administration  4
Jabbar   4    Administration  4        *
Borg     1    Administration  4
Smith    5    Headquarters    1
Wong     5    Headquarters    1
Zelaya   4    Headquarters    1
Wallace  4    Headquarters    1
Narayan  5    Headquarters    1
English  5    Headquarters    1
Jabbar   4    Headquarters    1
Borg     1    Headquarters    1        *
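The row counts of the two joins can be reproduced as follows. This is a minimal sketch assuming SQLite (Python's sqlite3) as a stand-in for MySQL and a reduced schema that keeps only the columns used in the join; the data are the employees and departments shown in the result tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.execute("CREATE TABLE department (dname TEXT, dnumber INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE employee (lname TEXT, dno INTEGER REFERENCES department(dnumber))")
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [("Research", 5), ("Administration", 4), ("Headquarters", 1)])
conn.executemany("INSERT INTO employee VALUES (?, ?)", [
    ("Smith", 5), ("Wong", 5), ("Zelaya", 4), ("Wallace", 4),
    ("Narayan", 5), ("English", 5), ("Jabbar", 4), ("Borg", 1)])

# No join condition: every employee is paired with every department.
cartesian = conn.execute(
    "SELECT lname, dname FROM employee INNER JOIN department").fetchall()

# Join condition on foreign key = primary key: one department per employee.
equijoin = conn.execute(
    "SELECT lname, dname FROM employee INNER JOIN department "
    "ON dno = dnumber ORDER BY lname").fetchall()

print(len(cartesian), len(equijoin))
```

With 8 employees and 3 departments the Cartesian product has 8 × 3 = 24 rows, of which the equijoin keeps the 8 where DNO = DNUMBER.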
Ambiguous names. Aliasing
Why?
    Same attribute name used in different relations
    To increase readability (long relation names)

No alias
SELECT LNAME, DNAME
FROM EMPLOYEE INNER JOIN DEPARTMENT
ON DNO = DNUMBER;

Whole name
SELECT EMPLOYEE.LNAME, DEPARTMENT.DNAME
FROM EMPLOYEE INNER JOIN DEPARTMENT
ON EMPLOYEE.DNO = DEPARTMENT.DNUMBER;

Alias
SELECT E.LNAME, D.DNAME
FROM EMPLOYEE E INNER JOIN DEPARTMENT D
ON E.DNO = D.DNUMBER;

Join. Self-join
List last name for all employees together with last names of their bosses

SELECT E.LNAME Employee, S.LNAME Boss
FROM EMPLOYEE E
INNER JOIN EMPLOYEE S
ON E.SUPERSSN = S.SSN;

+ WHERE E.LNAME = 'Borg';

Employee  Boss
Smith     Wong
Wong      Borg
Zelaya    Wallace
Wallace   Borg
Narayan   Wong
English   Wong
Jabbar    Wallace
Join. Outer join
List last name for all employees together with last names of their bosses

SELECT E.LNAME Employee, S.LNAME Boss
FROM EMPLOYEE E
INNER JOIN EMPLOYEE S
ON E.SUPERSSN = S.SSN;

Employee  Boss
Smith     Wong
Wong      Borg
Zelaya    Wallace
Wallace   Borg
Narayan   Wong
English   Wong
Jabbar    Wallace

The underlying combinations of E and S tuples:

SELECT E.LNAME, E.SUPERSSN, S.LNAME, S.SSN
FROM EMPLOYEE E INNER JOIN EMPLOYEE S

E.LNAME  E.SUPERSSN  S.LNAME  S.SSN
Smith    333445555   Smith    123456789
Wong     888665555   Smith    123456789
Zelaya   987654321   Smith    123456789
Wallace  888665555   Smith    123456789
Narayan  333445555   Smith    123456789
English  333445555   Smith    123456789
Jabbar   987654321   Smith    123456789
Borg     NULL        Smith    123456789
Smith    333445555   Wong     333445555
Wong     888665555   Wong     333445555
Zelaya   987654321   Wong     333445555
Wallace  888665555   Wong     333445555
Narayan  333445555   Wong     333445555
English  333445555   Wong     333445555
Jabbar   987654321   Wong     333445555
Borg     NULL        Wong     333445555
Smith    333445555   Zelaya   999887777
Wong     888665555   Zelaya   999887777
...

Equijoin does not consider tuples having join attributes with NULL values, i.e. an employee “Borg” is not included in the answer.
Use “outer join” instead, to catch the NULLs!

Join. Outer join, cont’d
List last name for all employees and, if available, show last names of their bosses

SELECT E.LNAME “Employee”, S.LNAME “Boss”
FROM EMPLOYEE E
LEFT OUTER JOIN EMPLOYEE S
ON E.SUPERSSN = S.SSN;

Employee  Boss
Smith     Wong
Wong      Borg
Zelaya    Wallace
Wallace   Borg
Narayan   Wong
English   Wong
Jabbar    Wallace
Borg      NULL

Also: RIGHT OUTER JOIN.
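The effect of the NULL in Borg's SUPERSSN on the two kinds of join can be reproduced as follows. This is a sketch assuming SQLite (Python's sqlite3) instead of MySQL and an EMPLOYEE table cut down to the three columns the self-join needs; the SSN values are the ones appearing in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.execute("CREATE TABLE employee (lname TEXT, ssn TEXT PRIMARY KEY, superssn TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    ("Smith", "123456789", "333445555"), ("Wong", "333445555", "888665555"),
    ("Zelaya", "999887777", "987654321"), ("Wallace", "987654321", "888665555"),
    ("Narayan", "666884444", "333445555"), ("English", "453453453", "333445555"),
    ("Jabbar", "987987987", "987654321"), ("Borg", "888665555", None)])

query = """
    SELECT e.lname AS employee, s.lname AS boss
    FROM employee e {join} employee s ON e.superssn = s.ssn
"""
# Map each employee to the boss found by the join.
inner = dict(conn.execute(query.format(join="INNER JOIN")).fetchall())
outer = dict(conn.execute(query.format(join="LEFT OUTER JOIN")).fetchall())

print(len(inner), len(outer), outer["Borg"])
```

The inner join drops Borg because NULL = S.SSN never evaluates to true, while the left outer join keeps him and fills the boss column with NULL (None in Python).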
Joins – revisited

A                 B
A1    A2          B1    B2
100   A           100   W
null  B           200   X
300   C           null  Y
null  D           null  Z

Cartesian product
SELECT * FROM a INNER JOIN b;

A2  A1    B1    B2
A   100   100   W
B   null  100   W
C   300   100   W
D   null  100   W
A   100   200   X
B   null  200   X
C   300   200   X
D   null  200   X
A   100   null  Y
B   null  null  Y
C   300   null  Y
D   null  null  Y
A   100   null  Z
B   null  null  Z
C   300   null  Z
D   null  null  Z

Equijoin, natural join, inner join
SELECT * FROM a INNER JOIN b ON a1=b1;

A2  A1    B1    B2
A   100   100   W

Thetajoin
SELECT * FROM a INNER JOIN b ON a1>b1;

A2  A1    B1    B2
C   300   100   W
C   300   200   X

Outer joins – revisited
(same tables A and B)

Right outer join
SELECT * FROM a RIGHT OUTER JOIN b ON a1=b1;

A2    A1    B1    B2
A     100   100   W
null  null  200   X
null  null  null  Y
null  null  null  Z

Left outer join
SELECT * FROM a LEFT OUTER JOIN b ON a1=b1;

A2  A1    B1    B2
A   100   100   W
C   300   null  null
B   null  null  null
D   null  null  null
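The five result tables for A and B can be recomputed with a few queries. The sketch below rests on assumptions: it uses SQLite through Python's sqlite3 rather than MySQL, a hypothetical helper named rows, and it obtains the right outer join by swapping the operands of a left outer join, because older SQLite versions do not support RIGHT OUTER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.execute("CREATE TABLE a (a1 INTEGER, a2 TEXT)")
conn.execute("CREATE TABLE b (b1 INTEGER, b2 TEXT)")
conn.executemany("INSERT INTO a VALUES (?, ?)",
                 [(100, "A"), (None, "B"), (300, "C"), (None, "D")])
conn.executemany("INSERT INTO b VALUES (?, ?)",
                 [(100, "W"), (200, "X"), (None, "Y"), (None, "Z")])


def rows(join_clause):
    """Hypothetical helper: result rows (A2, A1, B1, B2) as a set of tuples."""
    return set(conn.execute(f"SELECT a2, a1, b1, b2 FROM {join_clause}"))


cartesian = rows("a INNER JOIN b")
equijoin = rows("a INNER JOIN b ON a1 = b1")
thetajoin = rows("a INNER JOIN b ON a1 > b1")
left_outer = rows("a LEFT OUTER JOIN b ON a1 = b1")
# Same rows as: a RIGHT OUTER JOIN b ON a1 = b1
right_outer = rows("b LEFT OUTER JOIN a ON a1 = b1")

print(len(cartesian), equijoin, thetajoin)
```

Rows whose join attribute is NULL never satisfy a1 = b1 or a1 > b1, so they appear only in the Cartesian product and, padded with NULLs, in the outer joins.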
Subqueries SQL syntax
Which employees have a 10 hour (exact) project assignment? SELECT
Aggregate functions
Built-in functions: AVG(), SUM(), MIN(), MAX(), COUNT(), …
May appear just in SELECT and HAVING clauses!
Argument: [DISTINCT] attribute, or *

List the number of employees

SELECT COUNT(*)
FROM EMPLOYEE;

COUNT(*) ≡ NULLs are not ignored
COUNT(expression) ≡ NULLs are ignored
(Wrong in YOUR notes)

Grouping
Used to apply an aggregate function to subgroups of tuples in a relation
GROUP BY: grouping attributes
HAVING: condition that a group has to satisfy

List for each department the department number, the number of employees and the average salary.

SELECT DNO, COUNT(*), AVG(SALARY)
FROM EMPLOYEE
GROUP BY DNO
HAVING COUNT(*) > 2;

The SELECT clause may contain only grouping attributes and aggregate functions.
No HAVING without GROUP BY

Result of the grouping; with the HAVING condition, department 1 (only one employee) is filtered out:

DNO  COUNT(*)  AVG(SALARY)
5    4         33250
4    3         31000
1    1         55000
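The two COUNT variants and the effect of HAVING can be checked against the numbers on the slide. This is a sketch assuming SQLite (Python's sqlite3) in place of MySQL and an EMPLOYEE table reduced to four columns; salaries, supervisors and department numbers are taken from the result tables in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.execute(
    "CREATE TABLE employee (lname TEXT, salary INTEGER, superssn TEXT, dno INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    ("Smith", 30000, "333445555", 5), ("Wong", 40000, "888665555", 5),
    ("Zelaya", 25000, "987654321", 4), ("Wallace", 43000, "888665555", 4),
    ("Narayan", 38000, "333445555", 5), ("English", 25000, "333445555", 5),
    ("Jabbar", 25000, "987654321", 4), ("Borg", 55000, None, 1)])

# COUNT(*) counts rows; COUNT(superssn) skips the row where superssn is NULL.
count_star, count_super = conn.execute(
    "SELECT COUNT(*), COUNT(superssn) FROM employee").fetchone()

grouped = conn.execute(
    "SELECT dno, COUNT(*), AVG(salary) FROM employee "
    "GROUP BY dno ORDER BY dno DESC").fetchall()
having = conn.execute(
    "SELECT dno, COUNT(*), AVG(salary) FROM employee "
    "GROUP BY dno HAVING COUNT(*) > 2 ORDER BY dno DESC").fetchall()

print(count_star, count_super)
print(grouped)
print(having)
```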
Order of query results
Select department names and their locations in alphabetical order.

SELECT DNAME, DLOCATION
FROM DEPARTMENT D, DEPT_LOCATIONS DL
WHERE D.DNUMBER = DL.DNUMBER
ORDER BY DNAME ASC, DLOCATION DESC;

DNAME           DLOCATION
Administration  Stafford
Headquarters    Houston
Research        Sugarland
Research        Houston
Research        Bellaire

NULL
NULL = unknown, unavailable, or not applicable.
Hence, each NULL is different from every other.
Hence, three-valued logic for AND, OR and NOT operators (T, F, UNKNOWN, and only tuples that evaluate to T are selected).
Moreover,

Wrong! (Each NULL is different)
SELECT FName, LName
FROM TEACHER
WHERE Office = NULL;

Use IS NULL (or IS NOT NULL):
SELECT FName, LName
FROM TEACHER
WHERE Office IS NULL;
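The difference between = NULL and IS NULL, and the three-valued logic, can be observed directly. The following sketch makes several assumptions: SQLite (Python's sqlite3) replaces MySQL, the TEACHER table is reduced to three columns, the two teacher rows are made up, and count is a hypothetical helper. UNKNOWN is returned to Python as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.execute("CREATE TABLE teacher (fname TEXT, lname TEXT, office TEXT)")
conn.executemany("INSERT INTO teacher VALUES (?, ?, ?)",
                 [("Ada", "Berg", "B101"), ("Carl", "Dahl", None)])  # made-up rows


def count(where):
    """Hypothetical helper: number of teachers satisfying the WHERE clause."""
    return conn.execute(f"SELECT COUNT(*) FROM teacher WHERE {where}").fetchone()[0]


equals_null = count("office = NULL")      # comparison is UNKNOWN for every row
is_null = count("office IS NULL")         # the teacher without an office
is_not_null = count("office IS NOT NULL") # the teacher with an office

# Three-valued logic: UNKNOWN shows up as None, F as 0, T as 1.
null_eq_null, null_and_false, null_or_true, not_null = conn.execute(
    "SELECT NULL = NULL, NULL AND 0, NULL OR 1, NOT (NULL)").fetchone()

print(equals_null, is_null, is_not_null)
print(null_eq_null, null_and_false, null_or_true, not_null)
```

Since only tuples whose condition evaluates to T are selected, the query with Office = NULL returns nothing even though one teacher has no office.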
Inserting new data
Deleting stored data
May be a subquery
DELETE FROM