Relational databases and MySQL
Juha Takkinen
[email protected]

Outline
1. Introduction: Relational data model and SQL
2. Creating tables in MySQL
3. Simple queries and matching
4. Joins and aliasing
5. More about syntax and built-in functions
6. What is NULL?
7. Inserting, deleting and modifying data
8. Views

Thanks to José and Vaida for most of the slides.

Overview

[Figure labels: Real world, Model, Database, DBMS (processing of queries and updates; access to stored data), Physical database, Query, Answer]

Relational data model

• IBM Research Laboratory, San Jose, California
• Edgar F. Codd (1970), “A Relational Model of Data for Large Shared Data Banks”, in Communications of the ACM, vol. 13 no. 6, p. 377-387, June 1970.
• All data organized into tables
• Other models, e.g. hierarchical, network, object or object-relational DBMS, XML (back to network database model!)

Relational model concepts

Relation name: EMPLOYEE
Attributes: FNAME, M, LNAME, SSN, BDATE, ADDRESS, S, SALARY, SUPERSSN, DNO
Domains of values (examples): Integer; String shorter than 30 chars; Character; 400 < x < 8000; M or F; yyyy-mm-dd

EMPLOYEE (each row is a tuple)
FNAME   M  LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
Ramesh  K  Narayan  666884444  1962-09-15  …        M  38000   333445555  5
Joyce   A  English  453453453  1972-07-31  …        F  25000   333445555  5
Ahmad   V  Jabbar   987987987  1969-03-29  …        M  25000   987654321  4
James   E  Borg     888665555  1937-11-10  …        M  55000   null       1

• Relation schema: EMPLOYEE (FNAME, M, LNAME, SSN, BDATE, ADDRESS, S, SALARY, SUPERSSN, DNO)
• Relation – set of tuples, i.e. no duplicates
• Database – collection of relations
• Database schema – collection of relation schemas + integrity constraints

Relational database constraints

Annotations on the same EMPLOYEE table:
• Candidate keys
• Primary key + entity integrity constraint
• Unique
• Is not null

Relational database constraints

Foreign keys + referential integrity constraint

EMPLOYEE
FNAME   M  LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
Ramesh  K  Narayan  666884444  1962-09-15  …        M  38000   333445555  5
Joyce   A  English  453453453  1972-07-31  …        F  25000   333445555  5
Ahmad   V  Jabbar   987987987  1969-03-29  …        M  25000   987654321  4
James   E  Borg     888665555  1937-11-10  …        M  55000   null       1

DEPARTMENT
DNAME           DNUMBER  MGRSSN     MGRSTARTDATE
Research        5        333445555  1988-05-22
Administration  4        987654321  1995-01-01
Headquarters    1        888665555  1981-06-19

Integrity constraints

• (Atomic) domain (or NULL).
• Key.
• NOT NULL.
• Entity integrity: PK is NOT NULL.
• Referential integrity: FK of R refers to S if domain(FK(R)) = domain(PK(S)) and, for every tuple r in R, r.FK = s.PK for some tuple s in S, otherwise r.FK is NULL.
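The key and referential integrity rules above can be tried out in a few lines. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL (SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`), and the lower-case table names, the reduced column set and the helper `try_insert` are assumptions made for the illustration, not part of the slides.

```python
import sqlite3

# SQLite in memory stands in for MySQL; foreign keys must be switched on explicitly.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (
    dname   TEXT NOT NULL,
    dnumber INTEGER NOT NULL PRIMARY KEY          -- entity integrity: PK is NOT NULL
);
CREATE TABLE employee (
    lname TEXT NOT NULL,
    ssn   INTEGER NOT NULL PRIMARY KEY,
    dno   INTEGER REFERENCES department(dnumber)  -- FK: matches some PK value, or is NULL
);
INSERT INTO department VALUES ('Research', 5), ('Administration', 4), ('Headquarters', 1);
""")

def try_insert(row):
    """Hypothetical helper: report whether the DBMS accepts the row."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = {
    "valid FK": try_insert(("Narayan", 666884444, 5)),
    "NULL FK": try_insert(("Borg", 888665555, None)),
    "dangling FK": try_insert(("English", 453453453, 7)),   # no department 7
    "duplicate PK": try_insert(("Jabbar", 666884444, 4)),   # SSN already used
}
print(results)
```

A NULL foreign key is accepted, while a value that matches no primary key in DEPARTMENT is refused, which is the "r.FK = s.PK for some s, otherwise NULL" rule.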

SQL and MySQL

• Structured Query Language
• DDL and DML (data definition and data manipulation language)
• Declarative (what, not how)
• Originally interface to System R (SEQUEL)
• Used in many database systems, e.g. Oracle
    - http://www.forbes.com/lists/2010/10/billionaires-2010_Lawrence-Ellison_JKEX.html
• Standard language for relational databases
• MySQL: Open source DBMS.
• Table, row, column = relation, tuple, attribute

COMPANY schema (book)

• EMPLOYEE (FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO)
• DEPT-LOCATIONS (DNUMBER, DLOCATION)
• DEPARTMENT (DNAME, DNUMBER, MGRSSN, MGRSTARTDATE)
• WORKS-ON (ESSN, PNO, HOURS)
• PROJECT (PNAME, PNUMBER, PLOCATION, DNUM)
• DEPENDENT (ESSN, DEPENDENT-NAME, SEX, BDATE, RELATIONSHIP)

Creating tables

    CREATE TABLE `<table name>` (
        <column name> <data type> [<constraint>],
        …,
        [<constraint>],
        …
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;

The backquotes around the table name are optional, but necessary if the table name contains reserved words.

• data types: integer, decimal(3,1), boolean, varchar(8), char(2), date, time, datetime, timestamp, enum('v1','v2','v3'), …
• constraints: not null, primary key, foreign key, unique, check
• Other: auto_increment

Creating tables

    CREATE TABLE works_on (
        essn integer references emp(ssn) on delete cascade,
        pno integer,
        hours decimal(3,1),
        constraint pk_workson primary key (essn, pno),
        constraint fk_workson foreign key (pno) references proj(pnumber) on delete cascade
    ) ENGINE=InnoDB;

Creating tables

    CREATE TABLE dependent (
        essn integer references emp(ssn) on delete cascade,
        dependent_name varchar(9) default 'NN',
        sex varchar(1) check (sex in ('F', 'M')),
        bdate date,
        relationship varchar(8),
        constraint pk_dependent primary key (essn, dependent_name)
    ) ENGINE=InnoDB;

Creating tables

TEACHER (PNum, FName, LName, Office, Phone)

    CREATE TABLE TEACHER (
        PNum CHAR(11),
        FName VARCHAR(20) UNIQUE,
        LName VARCHAR(20),
        Office CHAR(10) DEFAULT 'CommonRoom',
        Phone CHAR(4) NOT NULL,
        CONSTRAINT pk_TEACHER PRIMARY KEY (PNum),
        CONSTRAINT fk_TEACHER FOREIGN KEY (Office) REFERENCES OFFICE(ID)
            ON DELETE CASCADE ON UPDATE SET NULL
    ) ENGINE=InnoDB;

A candidate key over several columns is declared as a table constraint: CONSTRAINT cand_key UNIQUE(FName,LName)
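To see what the DEFAULT, CHECK and ON DELETE CASCADE clauses in the dependent table do, here is a minimal sketch. It runs against SQLite through Python's sqlite3 module rather than MySQL, so the ENGINE clause is dropped; the constraint name `fk_dependent`, the cut-down `emp` table and the inserted rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed in SQLite for FK enforcement
conn.executescript("""
CREATE TABLE emp (ssn INTEGER PRIMARY KEY, lname VARCHAR(20));
CREATE TABLE dependent (
    essn INTEGER,
    dependent_name VARCHAR(9) DEFAULT 'NN',
    sex VARCHAR(1) CHECK (sex IN ('F', 'M')),
    CONSTRAINT pk_dependent PRIMARY KEY (essn, dependent_name),
    CONSTRAINT fk_dependent FOREIGN KEY (essn) REFERENCES emp(ssn) ON DELETE CASCADE
);
INSERT INTO emp VALUES (333445555, 'Wong');
INSERT INTO dependent (essn, sex) VALUES (333445555, 'F');  -- no name given
""")

# The omitted column falls back to its DEFAULT value.
default_name = conn.execute("SELECT dependent_name FROM dependent").fetchone()[0]

# A value outside the CHECK list is refused.
try:
    conn.execute("INSERT INTO dependent VALUES (333445555, 'Test', 'X')")
    check_enforced = False
except sqlite3.IntegrityError:
    check_enforced = True

# Deleting the employee also deletes the dependents that reference it.
conn.execute("DELETE FROM emp WHERE ssn = 333445555")
remaining = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]
print(default_name, check_enforced, remaining)
```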

Modifying tables

• Change the definition of a table: add, delete and modify columns and constraints

    ALTER TABLE EMPLOYEE ADD COLUMN JOB VARCHAR(12);
    ALTER TABLE EMPLOYEE DROP COLUMN ADDRESS CASCADE;
    ALTER TABLE `DEPTS-INFO` DROP PRIMARY KEY;
    ALTER TABLE `DEPTS-INFO` DROP FOREIGN KEY fk_DEPARTMENT_EMPLOYEE;
    ALTER TABLE `DEPTS-INFO` ADD CONSTRAINT PK_Dept PRIMARY KEY (Dno);
    ALTER TABLE TEACHER ADD CONSTRAINT fk_TEACHER FOREIGN KEY (Office)
        REFERENCES OFFICE(ID) ON DELETE CASCADE ON UPDATE SET NULL;
    ALTER TABLE EMPLOYEE MODIFY COLUMN ADDRESS VARCHAR(10) DEFAULT 'None';
    ALTER TABLE ACCOUNT CONVERT TO CHARACTER SET latin1 COLLATE latin1_swedish_ci;

• Delete a table and its definition

    DROP TABLE EMPLOYEE;

• Getting information about the tables

    SHOW TABLES;
    DESCRIBE TEACHER;
    SHOW CREATE TABLE TEACHER;

Querying tables

    SELECT <attribute-list>
    FROM <table-list>
    WHERE <condition>;

• attribute-list: R1.A1, …, Rk.Ar
  Attributes that are required
• table-list: R1, …, Rk
  Relations that are needed to process the query
• condition: expression with logical operators (and, or, not) and equality, inequality and comparison operators (=, <>, >, >=, …); identifies the tuples that should be retrieved

Simple query

• List SSN for all employees

    SELECT SSN
    FROM EMPLOYEE;

    SSN
    123456789
    333445555
    999887777
    987654321
    666884444
    453453453
    987987987
    888665555

Use of *

• List all information about the employees of department 5

    SELECT FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO
    FROM EMPLOYEE
    WHERE DNO = 5;

  or

    SELECT *
    FROM EMPLOYEE
    WHERE DNO = 5;

Simple query

• List last name, birth date and address for all employees whose name is 'Alicia J. Zelaya'

    SELECT LNAME, BDATE, ADDRESS
    FROM EMPLOYEE
    WHERE FNAME = 'Alicia' AND MINIT = 'J' AND LNAME = 'Zelaya';

    LNAME   BDATE       ADDRESS
    Zelaya  1968-07-19  3321 Castle, Spring, TX

Exact vs substring matching

• List last name, birth date and address for all employees whose last name contains the substring 'aya'

    SELECT LNAME, BDATE, ADDRESS
    FROM EMPLOYEE
    WHERE LNAME LIKE '%aya%';

    LNAME    BDATE       ADDRESS
    Zelaya   1968-07-19  3321 Castle, Spring, TX
    Narayan  1962-09-15  975 Fire Oak, Humble, TX

• Different from WHERE LNAME = '%aya%'; which compares with the literal string %aya%
• % replaces 0 or more characters
• _ replaces a single character
• Case insensitive!!!
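The difference between `=` and `LIKE` is easy to check on the eight last names used in the slides. The sketch below is an illustration under assumptions: it runs on SQLite via Python's sqlite3 module, where LIKE happens to be case insensitive for ASCII letters by default (similar to the MySQL behaviour noted on the slide), and the helper `lnames` and the one-column table are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (lname TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?)",
    [(n,) for n in ["Smith", "Wong", "Zelaya", "Wallace",
                    "Narayan", "English", "Jabbar", "Borg"]],
)

def lnames(where, pattern):
    """Hypothetical helper: last names matching a WHERE clause with one parameter."""
    sql = "SELECT lname FROM employee WHERE " + where + " ORDER BY lname"
    return [row[0] for row in conn.execute(sql, (pattern,))]

substring = lnames("lname LIKE ?", "%aya%")  # % matches zero or more characters
literal = lnames("lname = ?", "%aya%")       # = compares with the literal text %aya%
single = lnames("lname LIKE ?", "_org")      # _ matches exactly one character
upper = lnames("lname LIKE ?", "%AYA%")      # case does not matter here
print(substring, literal, single, upper)
```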

Tables as multisets (difference wrt relational model!!!)

• SQL considers a table as a multi-set (bag), i.e. tuples can occur more than once in a table
• Why?
    - Removing duplicates is expensive
    - User may want information about duplicates
    - Aggregation operators
• Exceptions:
    - The table has a key, i.e. PK or UNIQUE (which is not compulsory).
    - Use SELECT DISTINCT instead of SELECT.
    - SELECT attributes1 FROM tables1 WHERE condition1
      UNION
      SELECT attributes2 FROM tables2 WHERE condition2;
    - But not if UNION ALL.

Example

• List all salaries

    SELECT SALARY
    FROM EMPLOYEE;

    SALARY
    30000
    40000
    25000
    43000
    38000
    25000
    25000
    55000

• List all salaries without duplicates.

    SELECT DISTINCT SALARY
    FROM EMPLOYEE;

    SALARY
    30000
    40000
    25000
    43000
    38000
    55000
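The bag semantics can be confirmed by counting rows. The following sketch loads the eight salaries listed above into an in-memory SQLite table (used here through Python's sqlite3 module as a stand-in for MySQL); the helper `n_rows` is an assumption for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?)",
    [(s,) for s in [30000, 40000, 25000, 43000, 38000, 25000, 25000, 55000]],
)

def n_rows(sql):
    """Hypothetical helper: number of rows a query returns."""
    return len(conn.execute(sql).fetchall())

plain = n_rows("SELECT salary FROM employee")              # duplicates kept
distinct = n_rows("SELECT DISTINCT salary FROM employee")  # duplicates removed
union = n_rows("SELECT salary FROM employee UNION SELECT salary FROM employee")
union_all = n_rows("SELECT salary FROM employee UNION ALL SELECT salary FROM employee")
print(plain, distinct, union, union_all)
```

UNION behaves like a set operation and removes duplicates, while UNION ALL simply concatenates the two bags.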

Join. Cartesian product

• List all employees and their department

    SELECT LNAME, DNAME
    FROM EMPLOYEE
    INNER JOIN DEPARTMENT;

Result: each tuple in EMPLOYEE is combined with each tuple in DEPARTMENT (Cartesian product)

    LNAME    DNAME
    Smith    Research
    Wong     Research
    Zelaya   Research
    Wallace  Research
    Narayan  Research
    English  Research
    Jabbar   Research
    Borg     Research
    Smith    Administration
    Wong     Administration
    Zelaya   Administration
    Wallace  Administration
    Narayan  Administration
    English  Administration
    Jabbar   Administration
    Borg     Administration
    Smith    Headquarters
    Wong     Headquarters
    Zelaya   Headquarters
    Wallace  Headquarters
    Narayan  Headquarters
    English  Headquarters
    Jabbar   Headquarters
    Borg     Headquarters

Join. Equijoin

• List all employees and their department

    SELECT LNAME, DNAME
    FROM EMPLOYEE
    INNER JOIN DEPARTMENT
    ON DNO = DNUMBER;

DNO is a foreign key in EMPLOYEE, DNUMBER is the primary key in DEPARTMENT. The equijoin keeps the rows of the Cartesian product where DNO = DNUMBER (marked with *):

    LNAME    DNO  DNAME           DNUMBER
    Smith    5    Research        5        *
    Wong     5    Research        5        *
    Zelaya   4    Research        5
    Wallace  4    Research        5
    Narayan  5    Research        5        *
    English  5    Research        5        *
    Jabbar   4    Research        5
    Borg     1    Research        5
    Smith    5    Administration  4
    Wong     5    Administration  4
    Zelaya   4    Administration  4        *
    Wallace  4    Administration  4        *
    Narayan  5    Administration  4
    English  5    Administration  4
    Jabbar   4    Administration  4        *
    Borg     1    Administration  4
    Smith    5    Headquarters    1
    Wong     5    Headquarters    1
    Zelaya   4    Headquarters    1
    Wallace  4    Headquarters    1
    Narayan  5    Headquarters    1
    English  5    Headquarters    1
    Jabbar   4    Headquarters    1
    Borg     1    Headquarters    1        *
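The two queries differ only in the ON clause, and the row counts show what that clause does. This is a sketch on SQLite through Python's sqlite3 module (SQLite, like MySQL, treats INNER JOIN without ON as a Cartesian product); the lower-case names and reduced columns are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (lname TEXT, dno INTEGER);
CREATE TABLE department (dname TEXT, dnumber INTEGER PRIMARY KEY);
INSERT INTO employee VALUES ('Smith', 5), ('Wong', 5), ('Zelaya', 4), ('Wallace', 4),
                            ('Narayan', 5), ('English', 5), ('Jabbar', 4), ('Borg', 1);
INSERT INTO department VALUES ('Research', 5), ('Administration', 4), ('Headquarters', 1);
""")

# No join condition: every employee is paired with every department.
product = conn.execute(
    "SELECT lname, dname FROM employee INNER JOIN department"
).fetchall()

# Join condition on FK = PK: every employee is paired with its own department only.
equijoin = conn.execute(
    "SELECT lname, dname FROM employee INNER JOIN department ON dno = dnumber"
).fetchall()

print(len(product), len(equijoin))
```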

Ambiguous names. Aliasing

• Why?
    - Same attribute name used in different relations
    - To increase readability (long relation names)
• No alias

    SELECT LNAME, DNAME
    FROM EMPLOYEE INNER JOIN DEPARTMENT
    ON DNO = DNUMBER;

• Whole name

    SELECT EMPLOYEE.LNAME, DEPARTMENT.DNAME
    FROM EMPLOYEE INNER JOIN DEPARTMENT
    ON EMPLOYEE.DNO = DEPARTMENT.DNUMBER;

• Alias

    SELECT E.LNAME, D.DNAME
    FROM EMPLOYEE E INNER JOIN DEPARTMENT D
    ON E.DNO = D.DNUMBER;

Join. Self-join

• List last name for all employees together with last names of their bosses

    SELECT E.LNAME Employee, S.LNAME Boss
    FROM EMPLOYEE E
    INNER JOIN EMPLOYEE S
    ON E.SUPERSSN = S.SSN;

    Employee  Boss
    Smith     Wong
    Wong      Borg
    Zelaya    Wallace
    Wallace   Borg
    Narayan   Wong
    English   Wong
    Jabbar    Wallace

• Variation: + WHERE E.LNAME = 'Borg';

Join. Outer join

• List last name for all employees together with last names of their bosses

    SELECT E.LNAME Employee, S.LNAME Boss
    FROM EMPLOYEE E
    INNER JOIN EMPLOYEE S
    ON E.SUPERSSN = S.SSN;

• Equijoin does not consider tuples whose join attributes have NULL values, i.e. the employee "Borg" is not included in the answer
• Use "outer join" instead, to catch the NULLs!

For comparison, the join without a condition:

    SELECT E.LNAME, E.SUPERSSN, S.LNAME, S.SSN
    FROM EMPLOYEE E INNER JOIN EMPLOYEE S

    E.LNAME  E.SUPERSSN  S.LNAME  S.SSN
    Smith    333445555   Smith    123456789
    Wong     888665555   Smith    123456789
    Zelaya   987654321   Smith    123456789
    Wallace  888665555   Smith    123456789
    Narayan  333445555   Smith    123456789
    English  333445555   Smith    123456789
    Jabbar   987654321   Smith    123456789
    Borg                 Smith    123456789
    Smith    333445555   Wong     333445555
    Wong     888665555   Wong     333445555
    Zelaya   987654321   Wong     333445555
    Wallace  888665555   Wong     333445555
    Narayan  333445555   Wong     333445555
    English  333445555   Wong     333445555
    Jabbar   987654321   Wong     333445555
    Borg                 Wong     333445555
    Smith    333445555   Zelaya   999887777
    Wong     888665555   Zelaya   999887777
    ...

Join. Outer join, cont'd

• List last name for all employees and, if available, show last names of their bosses

    SELECT E.LNAME "Employee", S.LNAME "Boss"
    FROM EMPLOYEE E
    LEFT OUTER JOIN EMPLOYEE S
    ON E.SUPERSSN = S.SSN;

    Employee  Boss
    Smith     Wong
    Wong      Borg
    Zelaya    Wallace
    Wallace   Borg
    Narayan   Wong
    English   Wong
    Jabbar    Wallace
    Borg      NULL

• RIGHT OUTER JOIN.
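The effect of the NULL in Borg's SUPERSSN can be reproduced with the employee data from the slides. The sketch uses Python's sqlite3 module with SQLite standing in for MySQL; the lower-case names and the three-column table are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (lname TEXT, ssn INTEGER PRIMARY KEY, superssn INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    ("Smith", 123456789, 333445555), ("Wong", 333445555, 888665555),
    ("Zelaya", 999887777, 987654321), ("Wallace", 987654321, 888665555),
    ("Narayan", 666884444, 333445555), ("English", 453453453, 333445555),
    ("Jabbar", 987987987, 987654321), ("Borg", 888665555, None),
])

query = ("SELECT e.lname, s.lname FROM employee e "
         "{join} employee s ON e.superssn = s.ssn")

# employee -> boss, once with an inner join and once with a left outer join
inner = dict(conn.execute(query.format(join="INNER JOIN")).fetchall())
outer = dict(conn.execute(query.format(join="LEFT OUTER JOIN")).fetchall())

print(len(inner), len(outer), outer["Borg"])
```

The inner join drops Borg because his SUPERSSN is NULL; the left outer join keeps him and pads the boss column with NULL (None in Python).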

Joins – revisited

    A                B
    A1    A2         B1    B2
    100   A          100   W
    null  B          200   X
    300   C          null  Y
    null  D          null  Z

Cartesian product

    SELECT * FROM a INNER JOIN b;

    A2  A1    B1    B2
    A   100   100   W
    B   null  100   W
    C   300   100   W
    D   null  100   W
    A   100   200   X
    B   null  200   X
    C   300   200   X
    D   null  200   X
    A   100   null  Y
    B   null  null  Y
    C   300   null  Y
    D   null  null  Y
    A   100   null  Z
    B   null  null  Z
    C   300   null  Z
    D   null  null  Z

Equijoin, natural join, inner join

    SELECT * FROM a INNER JOIN b ON a1=b1;

    A2  A1   B1   B2
    A   100  100  W

Thetajoin

    SELECT * FROM a INNER JOIN b ON a1>b1;

    A2  A1   B1   B2
    C   300  100  W
    C   300  200  X

Outer Joins – revisited

(same tables A and B)

Right outer join

    SELECT * FROM a RIGHT OUTER JOIN b ON a1=b1;

    A2    A1    B1    B2
    A     100   100   W
    null  null  200   X
    null  null  null  Y
    null  null  null  Z

Left outer join

    SELECT * FROM a LEFT OUTER JOIN b ON a1=b1;

    A2  A1    B1    B2
    A   100   100   W
    C   300   null  null
    B   null  null  null
    D   null  null  null
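The small tables a and b can be used to check all of these join variants, in particular that a NULL never matches another NULL. The sketch below assumes SQLite via Python's sqlite3 module instead of MySQL. Because older SQLite versions lack RIGHT OUTER JOIN, the right outer join is written as a left outer join with the operands swapped, which is a substitution made for this example only; the helper `run` is also hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (a1 INTEGER, a2 TEXT);
CREATE TABLE b (b1 INTEGER, b2 TEXT);
INSERT INTO a VALUES (100, 'A'), (NULL, 'B'), (300, 'C'), (NULL, 'D');
INSERT INTO b VALUES (100, 'W'), (200, 'X'), (NULL, 'Y'), (NULL, 'Z');
""")

def run(sql):
    """Hypothetical helper: run a query and return all rows."""
    return conn.execute(sql).fetchall()

cols = "SELECT a2, a1, b1, b2 FROM "
cross = run(cols + "a INNER JOIN b")
equi = run(cols + "a INNER JOIN b ON a1 = b1")
theta = run(cols + "a INNER JOIN b ON a1 > b1 ORDER BY b1")
left = run(cols + "a LEFT OUTER JOIN b ON a1 = b1 ORDER BY a2")
# Equivalent to: a RIGHT OUTER JOIN b ON a1 = b1
right = run(cols + "b LEFT OUTER JOIN a ON a1 = b1 ORDER BY b2")

print(len(cross), equi, theta, left, right, sep="\n")
```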

Subqueries

• Which employees have a 10 hour (exact) project assignment?
• The following query returns duplicates (why?):

    SELECT LNAME
    FROM EMPLOYEE INNER JOIN WORKS_ON
    ON SSN = ESSN
    WHERE HOURS = 10.0;

• With a subquery:

    SELECT LNAME
    FROM EMPLOYEE
    WHERE SSN IN (SELECT ESSN
                  FROM WORKS_ON
                  WHERE HOURS = 10.0);

  Or

    SELECT LNAME
    FROM EMPLOYEE
    WHERE EXISTS (SELECT *
                  FROM WORKS_ON
                  WHERE SSN = ESSN AND HOURS = 10.0);

SQL syntax

    SELECT <attribute-list>
    FROM <table-list>
    [ WHERE <condition> ]
    [ GROUP BY <grouping attributes> ]
    [ HAVING <group condition> ]
    [ ORDER BY <attribute-list> ];

• Comparison with a subquery: {=, >, <, >=, <=, <>} + {ANY, SOME, ALL}
• SOME ≡ ANY
• IN ≡ =ANY
• NOT EXISTS, NOT IN
• A subquery can be given a name: (SELECT ….) [AS] NAME
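Why the join version returns duplicates becomes visible once an employee has more than one 10-hour assignment. The sketch below uses SQLite through Python's sqlite3 module in place of MySQL. The WORKS_ON rows are invented sample data (only the row 123456789, 1, 32.5 appears in the slides), and the helper `names` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (lname TEXT, ssn INTEGER PRIMARY KEY);
CREATE TABLE works_on (essn INTEGER, pno INTEGER, hours REAL, PRIMARY KEY (essn, pno));
INSERT INTO employee VALUES ('Smith', 123456789), ('Wong', 333445555), ('Zelaya', 999887777);
-- Assumed sample assignments: Wong has two 10-hour projects.
INSERT INTO works_on VALUES (123456789, 1, 32.5), (333445555, 2, 10.0),
                            (333445555, 3, 10.0), (999887777, 10, 10.0);
""")

def names(sql):
    """Hypothetical helper: first column of the result, sorted by the query itself."""
    return [row[0] for row in conn.execute(sql)]

with_join = names("""SELECT lname FROM employee INNER JOIN works_on ON ssn = essn
                     WHERE hours = 10.0 ORDER BY lname""")
with_in = names("""SELECT lname FROM employee
                   WHERE ssn IN (SELECT essn FROM works_on WHERE hours = 10.0)
                   ORDER BY lname""")
with_exists = names("""SELECT lname FROM employee
                       WHERE EXISTS (SELECT * FROM works_on
                                     WHERE ssn = essn AND hours = 10.0)
                       ORDER BY lname""")
print(with_join, with_in, with_exists)
```

The join produces one row per matching assignment, whereas IN and EXISTS test each employee once.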

Aggregate functions

• Built-in functions: AVG(), SUM(), MIN(), MAX(), COUNT(), …
    - Argument: [DISTINCT] attribute, or *
    - May appear just in SELECT and HAVING clauses!
• List the number of employees

    SELECT COUNT(*)
    FROM EMPLOYEE;

• COUNT(*) ≡ NULLs are not ignored
• COUNT(expression) ≡ NULLs are ignored

Grouping

• Used to apply an aggregate function to subgroups of tuples in a relation
    - GROUP BY – grouping attributes
    - HAVING – condition that a group has to satisfy
• List for each department the department number, the number of employees and the average salary.

    SELECT DNO, COUNT(*), AVG(SALARY)
    FROM EMPLOYEE
    GROUP BY DNO
    HAVING COUNT(*) > 2;

  The SELECT clause may contain only grouping attributes and aggregate functions.

    DNO  COUNT(*)  AVG(SALARY)
    5    4         33250
    4    3         31000
    1    1         55000

  The table shows the groups before the HAVING condition is applied; with HAVING COUNT(*) > 2 the row for department 1 is removed.

• Wrong in YOUR notes: No HAVING without GROUP BY
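The grouping example and the two COUNT variants can be checked on the eight employees from the slides. This is a sketch using SQLite via Python's sqlite3 module as a stand-in for MySQL; the lower-case names and reduced columns are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (lname TEXT, salary INTEGER, superssn INTEGER, dno INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    ("Smith", 30000, 333445555, 5), ("Wong", 40000, 888665555, 5),
    ("Zelaya", 25000, 987654321, 4), ("Wallace", 43000, 888665555, 4),
    ("Narayan", 38000, 333445555, 5), ("English", 25000, 333445555, 5),
    ("Jabbar", 25000, 987654321, 4), ("Borg", 55000, None, 1),
])

# COUNT(*) counts rows; COUNT(superssn) skips the row where superssn is NULL.
counts = conn.execute("SELECT COUNT(*), COUNT(superssn) FROM employee").fetchone()

groups = conn.execute("""SELECT dno, COUNT(*), AVG(salary) FROM employee
                         GROUP BY dno ORDER BY dno DESC""").fetchall()

# HAVING filters whole groups after aggregation.
large_groups = conn.execute("""SELECT dno, COUNT(*), AVG(salary) FROM employee
                               GROUP BY dno HAVING COUNT(*) > 2
                               ORDER BY dno DESC""").fetchall()
print(counts, groups, large_groups, sep="\n")
```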

Order of query results

• Select department names and their locations in alphabetical order.

    SELECT DNAME, DLOCATION
    FROM DEPARTMENT D, DEPT_LOCATIONS DL
    WHERE D.DNUMBER = DL.DNUMBER
    ORDER BY DNAME ASC, DLOCATION DESC;

    DNAME           DLOCATION
    Administration  Stafford
    Headquarters    Houston
    Research        Sugarland
    Research        Houston
    Research        Bellaire

NULL

• NULL = unknown, unavailable, or not applicable.
• Hence, each NULL is different from every other.
• Hence, three-valued logic for AND, OR and NOT operators (T, F, UNKNOWN, and only tuples that evaluate to T are selected).
• Moreover, comparing with NULL requires IS (or IS NOT):

    Wrong! (each NULL is different):

    SELECT FName, LName
    FROM TEACHER
    WHERE Office = NULL;

    Correct:

    SELECT FName, LName
    FROM TEACHER
    WHERE Office IS NULL;
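The three-valued logic can be observed directly. In the sketch below, which runs on SQLite through Python's sqlite3 module rather than MySQL, UNKNOWN shows up as NULL (None in Python). The two TEACHER rows are invented sample data and the helper `fnames` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teacher (fname TEXT, lname TEXT, office TEXT);
-- Assumed sample rows: one teacher without an office, one with.
INSERT INTO teacher VALUES ('Ann', 'Ek', NULL), ('Bo', 'Lind', 'B101');
""")

def fnames(condition):
    """Hypothetical helper: first names of teachers satisfying the condition."""
    sql = "SELECT fname FROM teacher WHERE " + condition
    return [row[0] for row in conn.execute(sql)]

eq_null = fnames("office = NULL")         # comparison is UNKNOWN for every row
is_null = fnames("office IS NULL")
is_not_null = fnames("office IS NOT NULL")

# NULL = NULL, UNKNOWN AND FALSE, UNKNOWN OR TRUE, NOT UNKNOWN
logic = conn.execute("SELECT NULL = NULL, NULL AND 0, NULL OR 1, NOT (NULL)").fetchone()
print(eq_null, is_null, is_not_null, logic)
```

UNKNOWN AND FALSE is FALSE and UNKNOWN OR TRUE is TRUE, but NULL = NULL and NOT UNKNOWN stay UNKNOWN, so `Office = NULL` never selects a row.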

Inserting new data

    INSERT INTO <table> (<attr>, …) VALUES (<value>, …);
    INSERT INTO <table> (<attr>, …) <subquery>;

• Store in WORKS_ON information about how many hours an employee works for project 1:

    INSERT INTO WORKS_ON VALUES (123456789, 1, 32.5);

• Insert the result of a subquery:

    INSERT INTO TEACHER(PNum, FName, LName, Office, Phone, Dep)
    SELECT *
    FROM OLD_TEACHER
    WHERE Phone<999;

NULL? DEFAULT? REFERENTIAL INTEGRITY ACTIONS!!! INTEGRITY CONSTRAINTS!!!

Deleting stored data

    DELETE FROM <table>
    WHERE <condition>;

The condition may be a subquery.

• Delete employees having the last name 'Borg' from the EMPLOYEE table

    DELETE FROM EMPLOYEE
    WHERE LNAME = 'Borg';

    EMPLOYEE
    FNAME   M  LNAME    SSN
    Ramesh  K  Narayan  666884444
    Joyce   A  English  453453453
    Ahmad   V  Jabbar   987987987
    James   E  Borg     888665555

    DEPARTMENT
    DNAME           DNUMBER  MGRSSN
    Research        5        333445555
    Administration  4        987654321
    Headquarters    1        888665555

MGRSSN is a foreign key referring to SSN, and Borg is the manager of Headquarters.

NULL? CASCADE? DEFAULT? REFERENTIAL INTEGRITY ACTIONS!!! INTEGRITY CONSTRAINTS!!!
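What happens to the Headquarters row when Borg is deleted depends on the referential integrity action declared on MGRSSN. The sketch below tries three actions, using SQLite via Python's sqlite3 module as a stand-in for MySQL. The function `delete_borg` and the two-table schema are assumptions for the illustration, and the slides do not say which action the COMPANY schema actually uses.

```python
import sqlite3

def delete_borg(action):
    """Hypothetical helper: delete Borg under the given ON DELETE action
    and return the DEPARTMENT rows left, or 'rejected' if the DBMS refuses."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # required in SQLite
    conn.executescript(f"""
    CREATE TABLE employee (lname TEXT, ssn INTEGER PRIMARY KEY);
    CREATE TABLE department (
        dname TEXT,
        dnumber INTEGER PRIMARY KEY,
        mgrssn INTEGER REFERENCES employee(ssn) ON DELETE {action}
    );
    INSERT INTO employee VALUES ('Borg', 888665555);
    INSERT INTO department VALUES ('Headquarters', 1, 888665555);
    """)
    try:
        conn.execute("DELETE FROM employee WHERE lname = 'Borg'")
    except sqlite3.IntegrityError:
        return "rejected"
    return conn.execute("SELECT dname, mgrssn FROM department").fetchall()

restrict = delete_borg("RESTRICT")   # the delete is refused
set_null = delete_borg("SET NULL")   # the department keeps existing, without manager
cascade = delete_borg("CASCADE")     # the department is deleted as well
print(restrict, set_null, cascade)
```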

Modifying stored data

    UPDATE <table>
    SET <attr> = <value>, …
    WHERE <condition>;

• Give all employees in the 'Research' department a 10% raise in salary.

    UPDATE EMPLOYEE
    SET SALARY = SALARY*1.1
    WHERE DNO IN (SELECT DNUMBER
                  FROM DEPARTMENT
                  WHERE DNAME = 'Research');

NULL? CASCADE? DEFAULT? REFERENTIAL INTEGRITY ACTIONS!!! INTEGRITY CONSTRAINTS!!!

Views (virtual tables)

• A virtual table derived from other – possibly virtual – tables

    CREATE VIEW dept_view AS
    SELECT DNUMBER, DNAME
    FROM DEPARTMENT;

    DROP VIEW dept_view;

• Why?
    - Simplify query commands
    - Increase efficiency
    - Always up-to-date
• Updating of views is problematic
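Both the UPDATE with a subquery and the "always up-to-date" property of views can be tried on a small data set. This is a sketch on SQLite through Python's sqlite3 module, not MySQL; the reduced tables and the choice of two employees and two departments are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dname TEXT, dnumber INTEGER PRIMARY KEY);
CREATE TABLE employee (lname TEXT, salary REAL, dno INTEGER);
INSERT INTO department VALUES ('Research', 5), ('Administration', 4);
INSERT INTO employee VALUES ('Smith', 30000, 5), ('Zelaya', 25000, 4);
CREATE VIEW dept_view AS SELECT dnumber, dname FROM department;
""")

# 10% raise for everybody in the Research department only.
conn.execute("""UPDATE employee
                SET salary = salary * 1.1
                WHERE dno IN (SELECT dnumber FROM department
                              WHERE dname = 'Research')""")
salaries = {name: round(salary)
            for name, salary in conn.execute("SELECT lname, salary FROM employee")}

# The view stores no data: it is evaluated against the base table at query time.
before = conn.execute("SELECT COUNT(*) FROM dept_view").fetchone()[0]
conn.execute("INSERT INTO department VALUES ('Headquarters', 1)")
after = conn.execute("SELECT COUNT(*) FROM dept_view").fetchone()[0]
print(salaries, before, after)
```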
