SQL Integrity Constraints

- Required data
- Domain constraints
- Entity integrity
- Referential integrity
- General constraints

Required data

- Ensures every tuple has a non-NULL value for an attribute
- In the attribute definition:
    Salary NUMBER(6) NOT NULL
- As a constraint:
    CONSTRAINT must_have_salary CHECK (Salary IS NOT NULL)

Domain constraints

- Restrict the domain of an attribute beyond the datatype
- In the attribute definition:
    Sex CHAR NOT NULL CHECK ( Sex IN ('M','F') )
- As a constraint:
    CONSTRAINT salary_too_big CHECK (Salary < 200000)
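The required-data and domain constraints above can be tried out with a small sketch. It uses SQLite through Python's standard sqlite3 module only so that it runs anywhere; the Staff table, its Id column, and the try_insert helper are assumptions made for illustration, and Oracle's type names and error messages differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Staff table combining the NOT NULL and CHECK forms shown above.
conn.execute("""
    CREATE TABLE Staff (
        Id     INTEGER PRIMARY KEY,
        Sex    CHAR    NOT NULL CHECK (Sex IN ('M', 'F')),
        Salary NUMERIC NOT NULL,
        CONSTRAINT salary_too_big CHECK (Salary < 200000)
    )
""")

def try_insert(row):
    """Return 'ok' if the row is accepted, else the constraint error text."""
    try:
        conn.execute("INSERT INTO Staff (Id, Sex, Salary) VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)

results = {
    "valid row":      try_insert((1, "F", 50000)),
    "missing salary": try_insert((2, "M", None)),     # violates NOT NULL
    "bad sex code":   try_insert((3, "X", 50000)),    # violates CHECK on Sex
    "salary too big": try_insert((4, "M", 250000)),   # violates salary_too_big
}
accepted = conn.execute("SELECT COUNT(*) FROM Staff").fetchone()[0]
```

Only the first row should be stored; each of the other three is rejected by the constraint named in its comment.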

Entity integrity

- Defines a primary key or a unique (alternate) key
- In an attribute definition:
    Id NUMBER(9) NOT NULL PRIMARY KEY
- As a separate line (better):
    PRIMARY KEY (DeptCode, CourseNum)
- As a constraint (even better):
    CONSTRAINT student_id_not_unique PRIMARY KEY (Id)
    CONSTRAINT crn_year_not_unique UNIQUE (crn, year)
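A minimal sketch of entity integrity in action, again using SQLite via Python's sqlite3 module as a stand-in. The Section table layout and the section_not_unique constraint name are assumptions for illustration; only crn_year_not_unique comes from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Section table: a composite primary key plus a UNIQUE alternate key.
conn.execute("""
    CREATE TABLE Section (
        DeptCode  VARCHAR(4) NOT NULL,
        CourseNum INTEGER    NOT NULL,
        crn       INTEGER    NOT NULL,
        year      INTEGER    NOT NULL,
        CONSTRAINT section_not_unique  PRIMARY KEY (DeptCode, CourseNum, year),
        CONSTRAINT crn_year_not_unique UNIQUE (crn, year)
    )
""")

def accepted(row):
    """True if the row can be inserted without violating a key constraint."""
    try:
        conn.execute("INSERT INTO Section VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

outcomes = [
    accepted(("CSCI", 101, 1001, 2017)),  # first row: fine
    accepted(("CSCI", 101, 1002, 2017)),  # same primary key: rejected
    accepted(("MATH", 201, 1001, 2017)),  # same (crn, year): rejected
    accepted(("MATH", 201, 1001, 2018)),  # different year: fine
]
stored = conn.execute("SELECT COUNT(*) FROM Section").fetchone()[0]
```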

Referential integrity

- Defines foreign keys to other relations
- In an attribute definition:
    DeptCode VARCHAR(4) REFERENCES Dept(DeptCode)
- As a separate line (better):
    FOREIGN KEY (crn, year) REFERENCES Section(crn, year)
- As a constraint (even better):
    CONSTRAINT no_matching_section FOREIGN KEY (crn, year) REFERENCES Section(crn, year)

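General Constraints

- Can add more general CHECK clauses with general conditions
- Conditions can include queries in the SQL standard, although many DBMSs (Oracle included) do not allow subqueries in CHECK constraints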

Referential Integrity with Delete/Update

- A foreign key constraint can include two options that say what happens to the child when the parent changes:
    ON DELETE option
    ON UPDATE option
- Option can be any one of four:
    CASCADE – delete/update the matching rows in child
    SET NULL – set child attributes to NULL
    SET DEFAULT – set child attributes to default value
    NO ACTION – reject delete/update operation on parent if referenced by child
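The effect of these options can be observed with a short sketch in SQLite (via Python's sqlite3 module), which supports the same ON DELETE / ON UPDATE clauses once foreign keys are switched on. The Dept, Course, and Advisor tables and the no_matching_dept constraint name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Dept (DeptCode VARCHAR(4) PRIMARY KEY);
    CREATE TABLE Course (
        CourseNum INTEGER PRIMARY KEY,
        DeptCode  VARCHAR(4),
        CONSTRAINT no_matching_dept FOREIGN KEY (DeptCode)
            REFERENCES Dept(DeptCode) ON DELETE CASCADE ON UPDATE CASCADE
    );
    CREATE TABLE Advisor (
        Name     TEXT PRIMARY KEY,
        DeptCode VARCHAR(4),
        -- ON UPDATE is omitted here, so it defaults to NO ACTION
        FOREIGN KEY (DeptCode) REFERENCES Dept(DeptCode) ON DELETE SET NULL
    );
    INSERT INTO Dept VALUES ('CSCI'), ('MATH');
    INSERT INTO Course VALUES (101, 'CSCI'), (201, 'MATH');
    INSERT INTO Advisor VALUES ('Lee', 'CSCI');
""")

# NO ACTION: Advisor still references 'CSCI', so renaming the parent key is rejected.
try:
    conn.execute("UPDATE Dept SET DeptCode = 'CS' WHERE DeptCode = 'CSCI'")
    update_rejected = False
except sqlite3.IntegrityError:
    update_rejected = True

# ON DELETE CASCADE removes Course 101; ON DELETE SET NULL clears Lee's DeptCode.
conn.execute("DELETE FROM Dept WHERE DeptCode = 'CSCI'")
# ON UPDATE CASCADE propagates the new key value to Course 201.
conn.execute("UPDATE Dept SET DeptCode = 'MTH' WHERE DeptCode = 'MATH'")

courses = conn.execute("SELECT CourseNum, DeptCode FROM Course ORDER BY CourseNum").fetchall()
advisors = conn.execute("SELECT Name, DeptCode FROM Advisor").fetchall()
```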

SQL Query Syntax

SELECT [DISTINCT | ALL] { * | [ column_expr [AS new_name] ] [ , … ] }
FROM table_name [ alias ] [ , … ]
[ WHERE condition ]
[ GROUP BY column_list ]
[ HAVING condition ]
[ ORDER BY column_name [ ASC | DESC ] [ , … ] ]

DISTINCT – eliminates duplicate tuples
ALL – all tuples (including duplicates) – default

* – all attributes / fields
column_expr can be built from:
- Attributes
- Math expressions on attributes, e.g. salary / 12
- Aggregates COUNT, SUM, AVG, MAX, MIN – can only be used in SELECT or HAVING
- Date and String Functions
- Relationals, Booleans, BETWEEN, IN, NOT IN, IS NULL, IS NOT NULL, LIKE

ORDER BY – sorts in ascending order by default, on the fields in the order given; add DESC after a field to sort that particular field in descending order
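To make the syntax concrete, here is a small sketch run against SQLite through Python's sqlite3 module. The Staff rows are invented sample data; the query shapes follow the template above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (sno TEXT PRIMARY KEY, bno TEXT, salary INTEGER);
    INSERT INTO Staff VALUES
        ('S1', 'B1', 24000), ('S2', 'B1', 36000),
        ('S3', 'B2', 12000), ('S4', 'B2', 36000);
""")

# ALL (the default) keeps duplicate tuples; DISTINCT eliminates them.
all_bnos = conn.execute("SELECT ALL bno FROM Staff ORDER BY bno").fetchall()
distinct_bnos = conn.execute("SELECT DISTINCT bno FROM Staff ORDER BY bno").fetchall()

# A math expression renamed with AS, a table alias, a BETWEEN condition,
# and a sort on salary descending, then sno ascending.
monthly = conn.execute("""
    SELECT sno, salary / 12 AS monthly_salary
    FROM Staff s
    WHERE salary BETWEEN 20000 AND 40000
    ORDER BY salary DESC, sno ASC
""").fetchall()
```

S2 and S4 tie on salary, so the second sort key (sno ascending) decides their order.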

Using Aggregates

If the SELECT list includes an aggregate and no GROUP BY is used, then no item in the SELECT list can reference a column outside an aggregate, e.g.
    SELECT sno, COUNT(salary) FROM Staff;
is illegal.

Be careful with DISTINCT and aggregates
- With DISTINCT, COUNT, AVG, and SUM include only the distinct occurrences

GROUP BY
- Treats NULLs as equal, so all NULLs fall into one group

When GROUP BY is used, each item in SELECT list must be single-valued per group.

SELECT clause can only contain
- Attribute names – if also in the GROUP BY clause
- Aggregates on anything
- Constants
- Expressions including the above

e.g.
    SELECT bno, COUNT(sno) AS count, SUM(salary) AS sum
    FROM staff
    GROUP BY bno
    ORDER BY bno

HAVING
- Restricts the groups that appear in the result
- Column names must appear in GROUP BY or be inside aggregates, e.g.
    HAVING COUNT(sno) > 1
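The grouping rules above, including the treatment of NULLs and DISTINCT, can be checked with a short sketch in SQLite via Python's sqlite3 module. The sample rows and the staff_count / salary_sum aliases are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (sno TEXT PRIMARY KEY, bno TEXT, salary INTEGER);
    INSERT INTO Staff VALUES
        ('S1', 'B1', 24000), ('S2', 'B1', 36000), ('S3', 'B2', 24000),
        ('S4', NULL, 9000),  ('S5', NULL, NULL);
""")

# One row per branch; the two NULL branch numbers form a single group.
per_branch = conn.execute("""
    SELECT bno, COUNT(sno) AS staff_count, SUM(salary) AS salary_sum
    FROM Staff
    GROUP BY bno
    ORDER BY bno
""").fetchall()

# HAVING filters whole groups, using an aggregate rather than a column alias.
big_branches = conn.execute("""
    SELECT bno, COUNT(sno) AS staff_count
    FROM Staff
    GROUP BY bno
    HAVING COUNT(sno) > 1
    ORDER BY bno
""").fetchall()

# COUNT(*) counts rows, COUNT(salary) skips NULLs, COUNT(DISTINCT salary)
# also collapses the repeated 24000.
counts = conn.execute(
    "SELECT COUNT(*), COUNT(salary), COUNT(DISTINCT salary) FROM Staff"
).fetchone()
```

SQLite sorts NULL before other values in ascending order, which is why the NULL group comes first here; other systems may place it last.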

SUBQUERIES - Inner SELECT statement used in WHERE and HAVING clauses

A subquery compared with =, >, <, >=, <=, <>, or != must return a single value

WHERE salary = ( SELECT MAX(salary) FROM … )

Otherwise use IN or NOT IN or > ANY or < ALL or > SOME or < SOME

WHERE salary IN ( SELECT salary FROM … )
             NOT IN ( SELECT … )
             > ANY ( SELECT … )
             < ALL ( SELECT … )

EXISTS

WHERE EXISTS ( SELECT * FROM … )
WHERE NOT EXISTS ( SELECT * FROM … )

e.g.
SELECT LName, FName
FROM Student s
WHERE EXISTS ( SELECT * FROM Enroll e
               WHERE s.ID = e.ID AND e.DeptCode = 'CSCI' );

SQL Subquery Rules - For subqueries in WHERE or HAVING clause

1. No ORDER BY clause in the subquery

2. SELECT list of subquery must be single column, unless being tested with EXISTS

3. A column name in a subquery refers to tables in the subquery by default; qualify the name to reach outer tables

4. Subquery must be right-hand side of comparison operator e.g. year = (SELECT … ) not (SELECT … ) = year
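The three main subquery forms can be sketched as follows, using SQLite via Python's sqlite3 module. The Student and Enroll rows are invented; note that SQLite does not implement ANY, ALL, or SOME, so only the scalar, IN, and EXISTS forms are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (ID INTEGER PRIMARY KEY, LName TEXT, FName TEXT);
    CREATE TABLE Enroll (ID INTEGER, DeptCode TEXT, grade INTEGER);
    INSERT INTO Student VALUES (1, 'Smith', 'Ann'), (2, 'Jones', 'Bob'), (3, 'Mack', 'Cy');
    INSERT INTO Enroll VALUES
        (1, 'CSCI', 95), (2, 'MATH', 80), (3, 'CSCI', 70), (1, 'MATH', 88);
""")

# Scalar subquery: MAX returns one value, so '=' is allowed.
top = conn.execute("""
    SELECT ID FROM Enroll
    WHERE grade = (SELECT MAX(grade) FROM Enroll)
""").fetchall()

# Multi-row subquery: the inner SELECT returns several IDs, so use IN.
math_students = conn.execute("""
    SELECT LName FROM Student
    WHERE ID IN (SELECT ID FROM Enroll WHERE DeptCode = 'MATH')
    ORDER BY LName
""").fetchall()

# Correlated subquery: s.ID is qualified to reach the outer table.
csci_students = conn.execute("""
    SELECT LName, FName FROM Student s
    WHERE EXISTS (SELECT * FROM Enroll e
                  WHERE s.ID = e.ID AND e.DeptCode = 'CSCI')
    ORDER BY LName
""").fetchall()

no_csci = conn.execute("""
    SELECT LName, FName FROM Student s
    WHERE NOT EXISTS (SELECT * FROM Enroll e
                      WHERE s.ID = e.ID AND e.DeptCode = 'CSCI')
""").fetchall()
```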

Strings – Equality and Pattern Matching

- For queries involving strings, SQL has two pattern matching wildcards

% - any sequence of zero or more characters

_ - (underscore) – any single character

- All other characters in the pattern represent themselves

Examples:

WHERE LastName = 'Smith' - exact match

WHERE LastName LIKE 'van%' - any last name starting with 'van'

WHERE LastName NOT LIKE 'Mac%' - avoid those 'Mac'-somethings

WHERE Address LIKE '%Main Street' - any address ending with 'Main Street'

WHERE Address LIKE '%Glasgow%' - any address containing 'Glasgow'

WHERE FirstName LIKE 'J___' - any 4-character first name starting with 'J' (three underscores, no spaces)

To match a wildcard character literally in a LIKE pattern, name an escape character with ESCAPE, e.g. '\' (then use '\\' to indicate a literal '\')

WHERE Discount LIKE '10\% Off' ESCAPE '\' - match '10% Off'
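The patterns above can be exercised with a small sketch in SQLite via Python's sqlite3 module. The Person and Offer tables and the last_names helper are hypothetical; also note that SQLite's LIKE ignores case for ASCII letters by default, which Oracle's does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (FirstName TEXT, LastName TEXT);
    INSERT INTO Person VALUES
        ('John', 'Smith'), ('Jo', 'van Dyke'), ('Jane', 'MacLeod'), ('Mary', 'Smithson');
    CREATE TABLE Offer (Discount TEXT);
    INSERT INTO Offer VALUES ('10% Off'), ('100 Off'), ('10 percent Off');
""")

def last_names(condition):
    """Run a WHERE condition against Person and return matching last names, sorted."""
    rows = conn.execute(f"SELECT LastName FROM Person WHERE {condition} ORDER BY LastName")
    return [r[0] for r in rows]

exact = last_names("LastName = 'Smith'")           # equality: no wildcards involved
van = last_names("LastName LIKE 'van%'")           # starts with 'van'
not_mac = last_names("LastName NOT LIKE 'Mac%'")   # everything except 'Mac...'
four_letter_j = last_names("FirstName LIKE 'J___'")  # 'J' plus exactly three characters

# Unescaped, % is a wildcard, so the pattern matches all three discounts.
unescaped = conn.execute(
    "SELECT COUNT(*) FROM Offer WHERE Discount LIKE '10% Off'"
).fetchone()[0]

# With ESCAPE '\', the sequence \% stands for a literal percent sign.
escaped = conn.execute(r"""
    SELECT Discount FROM Offer
    WHERE Discount LIKE '10\% Off' ESCAPE '\'
""").fetchall()
```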

Combining Resulting Tables

- The three SQL operators UNION, INTERSECT, and MINUS represent the set operations union, intersection, and set difference. (MINUS is Oracle's keyword; the ISO standard spells it EXCEPT.)

For example,

SELECT * FROM cs.Student WHERE FName = 'John'
INTERSECT
SELECT * FROM cs.Student WHERE LName LIKE 'Mac%';

- The tables being combined must be union-compatible.
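A minimal sketch of the three set operators, using SQLite via Python's sqlite3 module. SQLite follows the standard and uses EXCEPT where Oracle uses MINUS; the Student rows are invented and the cs. schema prefix is dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (ID INTEGER PRIMARY KEY, FName TEXT, LName TEXT);
    INSERT INTO Student VALUES
        (1, 'John', 'MacDonald'), (2, 'John', 'Smith'), (3, 'Mary', 'MacLeod');
""")

# Both queries return the same columns in the same order, so they are union-compatible.
johns = "SELECT * FROM Student WHERE FName = 'John'"
macs = "SELECT * FROM Student WHERE LName LIKE 'Mac%'"

both = conn.execute(f"{johns} INTERSECT {macs} ORDER BY ID").fetchall()
either = conn.execute(f"{johns} UNION {macs} ORDER BY ID").fetchall()
johns_only = conn.execute(f"{johns} EXCEPT {macs} ORDER BY ID").fetchall()
```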

Database Updates

Three SQL statements can modify the contents of tables in the database

INSERT – add new rows to a table
UPDATE – modify existing data in a table
DELETE – remove data from a table

INSERT takes on two forms

INSERT INTO TableName [(column_list)] VALUES (dataValueList);

INSERT INTO TableName [(column_list)] SELECT …

- In the latter form, the SELECT can be any valid SELECT statement that returns values matching the column list

UPDATE allows contents of existing rows in a table to be changed

UPDATE TableName
SET columnName1 = dataValue1 [ , columnName2 = dataValue2 … ]
[ WHERE searchCondition ]

- The WHERE clause is optional – without it, all rows in the table will be affected

DELETE allows rows to be deleted from a named table

DELETE FROM TableName [ WHERE searchCondition ]

- The WHERE clause is optional – without it, all rows in the table will be deleted
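The three statements can be tried together in a short sketch, using SQLite via Python's sqlite3 module. The Staff and StaffArchive tables and their rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (sno TEXT PRIMARY KEY, bno TEXT, salary INTEGER);
    CREATE TABLE StaffArchive (sno TEXT PRIMARY KEY, bno TEXT, salary INTEGER);
""")

def count(table):
    """Number of rows currently in the given table."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

# Form 1: INSERT ... VALUES adds one row per statement.
conn.execute("INSERT INTO Staff (sno, bno, salary) VALUES ('S1', 'B1', 24000)")
conn.execute("INSERT INTO Staff (sno, bno, salary) VALUES ('S2', 'B1', 36000)")
conn.execute("INSERT INTO Staff (sno, bno, salary) VALUES ('S3', 'B2', 12000)")

# Form 2: INSERT ... SELECT copies the rows produced by a query.
conn.execute("""
    INSERT INTO StaffArchive (sno, bno, salary)
    SELECT sno, bno, salary FROM Staff WHERE bno = 'B1'
""")
archived = count("StaffArchive")

# UPDATE with a WHERE clause changes only the matching rows.
conn.execute("UPDATE Staff SET salary = salary + 1000 WHERE bno = 'B1'")
# DELETE with a WHERE clause removes only the matching rows.
conn.execute("DELETE FROM Staff WHERE bno = 'B2'")
remaining = conn.execute("SELECT sno, salary FROM Staff ORDER BY sno").fetchall()

# DELETE without WHERE empties the table (the table itself still exists).
conn.execute("DELETE FROM Staff")
after_delete_all = count("Staff")
```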

View
- The dynamic result of one or more relational operations operating on the base relations to produce another relation
- Virtual relation – does not exist in the database but can be produced upon request

CREATE VIEW vname [ (newColName [ , … ] ) ] AS subselect [ WITH CHECK OPTION ]

- View is defined by specifying an SQL SELECT statement (defining query)

- Can optionally assign a name to each column of the view (instead of using the default one)
    o If so, must list the same number of names as columns returned by the subselect

- WITH CHECK OPTION makes SQL reject any insert or update through the view that would produce a row failing the WHERE condition of the subselect, i.e. a change that would make the tuple disappear from the view

Horizontal view – restricts user’s access to selected rows of one or more tables

CREATE VIEW Math17 AS SELECT * FROM cs.Section WHERE deptCode = 'MATH' AND year = 2017;

Vertical view – restricts user’s access to selected columns of one or more tables

CREATE VIEW Math17nums AS SELECT CourseNo, SNo FROM Math17;

CREATE VIEW Math17courses AS SELECT DISTINCT CourseNo FROM Math17nums;
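The horizontal and vertical views above can be built and queried in a short sketch, using SQLite via Python's sqlite3 module. The Section columns and rows are assumed for illustration and the cs. schema prefix is dropped. The sketch also shows that a view is recomputed on request: a row added to the base table appears in the views immediately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Section (DeptCode TEXT, CourseNo INTEGER, SNo INTEGER, year INTEGER);
    INSERT INTO Section VALUES
        ('MATH', 101, 1, 2017), ('MATH', 101, 2, 2017),
        ('MATH', 201, 1, 2016), ('CSCI', 150, 1, 2017);

    -- Horizontal view: selected rows only
    CREATE VIEW Math17 AS
        SELECT * FROM Section WHERE DeptCode = 'MATH' AND year = 2017;
    -- Vertical views: selected columns only, defined on top of Math17
    CREATE VIEW Math17nums AS SELECT CourseNo, SNo FROM Math17;
    CREATE VIEW Math17courses AS SELECT DISTINCT CourseNo FROM Math17nums;
""")

before = conn.execute("SELECT CourseNo FROM Math17courses ORDER BY CourseNo").fetchall()

# Change the base table; the views are not copies, so they reflect it at once.
conn.execute("INSERT INTO Section VALUES ('MATH', 301, 1, 2017)")

after = conn.execute("SELECT CourseNo FROM Math17courses ORDER BY CourseNo").fetchall()
nums = conn.execute("SELECT * FROM Math17nums ORDER BY CourseNo, SNo").fetchall()
```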

Check option:

CREATE VIEW Highgrades AS
    SELECT *
    FROM cs.Enroll
    WHERE grade > 90
    WITH CHECK OPTION;

With this view, the following is not allowed:

UPDATE Highgrades
    SET grade = 89
    WHERE ID = 201200125;

It would remove the row from the view, so it is rejected as an integrity constraint violation.

Restrictions on Views (ISO Standard)

Aggregates
- If a column in a view is based on an aggregate function, then the column may only appear in SELECT and ORDER BY clauses of queries that access the view
- Such a column may not be used in a WHERE clause and may not be an argument to another aggregate function in any query based on the view
- Oracle allows it

CREATE VIEW MathSectionCount AS SELECT CourseNo, COUNT(*) AS Offerings FROM Math17nums GROUP BY CourseNo;

Groupings
- A grouped view can never be joined with another table or view
- Oracle allows it

View Updatability - DBMS must be able to trace any row or column back to its row/column in the source table

According to the ISO Standard, a view is updatable iff in the defining query:
- DISTINCT is not specified
    o duplicates are not eliminated
- SELECT list contains only column names (each appearing only once)
    o No constants, expressions, or aggregate functions
- FROM clause specifies only one table (or one view that itself satisfies these conditions)
    o No joins, unions, intersections, or differences
- WHERE clause does not include nested SELECTs that reference the table in the FROM clause
    o No such subqueries in the view
- GROUP BY and HAVING clauses are not specified

Also, any rows added to view must not violate integrity constraints of base tables

View Resolution – resolving query on View to query on base tables

Merge the two queries by:

1. View column names in the SELECT list are replaced with the corresponding base table column names
2. View names in FROM are replaced with base table names
3. WHERE clauses are combined (with AND)
4. GROUP BY and HAVING clauses are combined
5. ORDER BY clause is copied, with view column names translated
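The merge steps can be illustrated by writing the resolved query by hand and checking that it returns the same rows as the query on the view. This is a sketch in SQLite via Python's sqlite3 module; the Section data, the Math17Sections view, and its SectionNum column name are hypothetical, and a real DBMS performs this rewriting internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Section (DeptCode TEXT, CourseNo INTEGER, SNo INTEGER, year INTEGER);
    INSERT INTO Section VALUES
        ('MATH', 101, 2, 2017), ('MATH', 101, 1, 2017),
        ('MATH', 201, 1, 2017), ('CSCI', 101, 1, 2017);

    CREATE VIEW Math17Sections (CourseNo, SectionNum) AS
        SELECT CourseNo, SNo FROM Section
        WHERE DeptCode = 'MATH' AND year = 2017;
""")

# The user's query, written against the view.
on_view = conn.execute("""
    SELECT SectionNum
    FROM Math17Sections
    WHERE CourseNo = 101
    ORDER BY SectionNum
""").fetchall()

# The same query after resolution by hand:
#   1. SectionNum -> SNo            (view column name to base column name)
#   2. Math17Sections -> Section    (view name to base table name)
#   3. the two WHERE clauses are joined with AND
#   5. ORDER BY is copied, with SectionNum translated to SNo
resolved = conn.execute("""
    SELECT SNo AS SectionNum
    FROM Section
    WHERE CourseNo = 101 AND DeptCode = 'MATH' AND year = 2017
    ORDER BY SNo
""").fetchall()
```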

View Advantages

Data independence
- structure does not change even if underlying tables change (just view definition)
- may have to redefine view based on new structure of underlying tables

Currency – changes immediately reflected in view (as opposed to copying part of table)

Customization – different users see data in different ways

Improved security – restrict access via small set of views containing data appropriate for user

Reduced complexity – combine tables via view so queries can be done simply on view

Convenience – only confronted with necessary data / structure

Data integrity – WITH CHECK OPTION allows conditions defining view to be enforced on update to view

View Disadvantages

Update restriction – some views cannot be updated

Structure restriction
- view structure determined at time of creation
- structure not automatically updated
- e.g. a view of the form SELECT * FROM … refers to the columns the base table had at time of creation; adding columns to the table(s) later does not add them to the view

Performance – view resolution takes time