SQL Integrity Constraints
- Required data
- Domain constraints
- Entity integrity
- Referential integrity
- General constraints
Required data
- Ensures every tuple has a non-NULL value for an attribute
- In the attribute definition:
  Salary NUMBER(6) NOT NULL
- As a constraint:
  CONSTRAINT must_have_salary CHECK (Salary IS NOT NULL)
Domain constraints
- Restrict the domain of an attribute beyond the datatype
- In the attribute definition:
  Sex CHAR NOT NULL CHECK ( Sex IN ('M','F') )
- As a constraint:
  CONSTRAINT salary_too_big CHECK (Salary < 200000)
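The required-data and domain constraints above can be tried out with a small sketch. It uses Python's built-in sqlite3 module and SQLite rather than Oracle, which is an assumption of this example, so the Oracle types NUMBER and CHAR become INTEGER and TEXT. The Staff table and the try_insert helper are hypothetical names introduced only for illustration.

```python
import sqlite3

# In-memory SQLite database; SQLite stands in for Oracle in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Staff (
        Id     INTEGER PRIMARY KEY,
        Sex    TEXT    NOT NULL CHECK (Sex IN ('M', 'F')),
        Salary INTEGER NOT NULL,
        CONSTRAINT salary_too_big CHECK (Salary < 200000)
    )
""")

def try_insert(row):
    """Hypothetical helper: report whether the DBMS accepts the row."""
    try:
        conn.execute("INSERT INTO Staff (Id, Sex, Salary) VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "IntegrityError"

results = [
    try_insert((1, "F", 50000)),   # satisfies every constraint
    try_insert((2, "F", None)),    # violates NOT NULL on Salary
    try_insert((3, "X", 50000)),   # violates the CHECK on Sex
    try_insert((4, "M", 250000)),  # violates salary_too_big
]
print(results)
```

Only the first row should be stored; each of the others breaks exactly one constraint.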
Entity integrity
- Defines a primary key or a candidate key
- In an attribute definition:
  Id NUMBER(9) NOT NULL PRIMARY KEY
- As a separate line (better):
  PRIMARY KEY (DeptCode, CourseNum)
- As a constraint (even better):
  CONSTRAINT student_id_not_unique PRIMARY KEY (Id)
  CONSTRAINT crn_year_not_unique UNIQUE (crn, year)
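As a rough illustration of entity integrity, the sketch below declares a primary key and a composite candidate key as named constraints and then tries to break each one. It runs on SQLite through Python's sqlite3 module (an assumption, the notes target Oracle), and the Offering table and the rejected helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Offering (
        Id   INTEGER NOT NULL,
        crn  INTEGER NOT NULL,
        year INTEGER NOT NULL,
        CONSTRAINT offering_id_not_unique PRIMARY KEY (Id),
        CONSTRAINT crn_year_not_unique UNIQUE (crn, year)
    )
""")
conn.execute("INSERT INTO Offering VALUES (1, 1001, 2017)")
# Same crn in a different year is fine: only the (crn, year) pair must be unique.
conn.execute("INSERT INTO Offering VALUES (2, 1001, 2018)")

def rejected(sql):
    """Hypothetical helper: True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

dup_primary_key = rejected("INSERT INTO Offering VALUES (1, 2002, 2017)")
dup_candidate_key = rejected("INSERT INTO Offering VALUES (3, 1001, 2017)")
row_count = conn.execute("SELECT COUNT(*) FROM Offering").fetchone()[0]
print(dup_primary_key, dup_candidate_key, row_count)
```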
Referential integrity
- Defines foreign keys to other relations
- In an attribute definition:
  DeptCode VARCHAR(4) REFERENCES Dept(DeptCode)
- As a separate line (better):
  FOREIGN KEY (crn, year) REFERENCES Section(crn, year)
- As a constraint (even better):
  CONSTRAINT no_matching_section FOREIGN KEY (crn, year) REFERENCES Section(crn, year)
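General Constraints
- Can add more general CHECK clauses with general conditions
- Conditions can include queries (allowed by the ISO standard, though many DBMSs, including Oracle, do not accept subqueries in a CHECK clause)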
Referential Integrity with Delete/Update
- A foreign key constraint can include two options that say what happens to the child when changes are made to the parent:
  ON DELETE option
  ON UPDATE option
- Option can be any one of four:
  CASCADE – delete/update the matching rows in the child table
  SET NULL – set child attributes to NULL
  SET DEFAULT – set child attributes to their default value
  NO ACTION – reject the delete/update operation on the parent if it is referenced by a child
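The effect of the ON DELETE options can be seen in a minimal sketch, again on SQLite through Python's sqlite3 module (an assumption of this example). SQLite only enforces foreign keys after PRAGMA foreign_keys = ON. The Course and Instructor tables are hypothetical; Dept(DeptCode) follows the example in the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement
conn.executescript("""
    CREATE TABLE Dept (DeptCode TEXT PRIMARY KEY);
    CREATE TABLE Course (
        CourseNum INTEGER PRIMARY KEY,
        DeptCode  TEXT,
        CONSTRAINT no_matching_dept FOREIGN KEY (DeptCode)
            REFERENCES Dept (DeptCode) ON DELETE CASCADE
    );
    CREATE TABLE Instructor (
        Id       INTEGER PRIMARY KEY,
        DeptCode TEXT REFERENCES Dept (DeptCode) ON DELETE SET NULL
    );
    INSERT INTO Dept VALUES ('CSCI'), ('MATH');
    INSERT INTO Course VALUES (101, 'CSCI'), (201, 'MATH');
    INSERT INTO Instructor VALUES (1, 'CSCI');
""")

# A child row with no matching parent row is rejected.
try:
    conn.execute("INSERT INTO Course VALUES (301, 'PHYS')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the parent cascades to Course and nulls out Instructor.DeptCode.
conn.execute("DELETE FROM Dept WHERE DeptCode = 'CSCI'")
remaining_courses = conn.execute(
    "SELECT CourseNum FROM Course ORDER BY CourseNum").fetchall()
instructor_dept = conn.execute(
    "SELECT DeptCode FROM Instructor WHERE Id = 1").fetchone()[0]
print(orphan_rejected, remaining_courses, instructor_dept)
```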
SQL Query Syntax
SELECT [DISTINCT | ALL] { * | [ column_expr [AS new_name] ] [ , … ] }
FROM table_name [ alias ] [ , … ]
[ WHERE condition ]
[ GROUP BY column_list ]
[ HAVING condition ]
[ ORDER BY column_name [ ASC | DESC ] [ , … ] ]
DISTINCT – eliminates duplicate tuples
ALL – all tuples (including duplicates) – default
* - all attributes / fields
column_expr
- Attributes
- Math expressions on attributes, e.g. salary / 12
- Aggregates COUNT, SUM, AVG, MAX, MIN – can only be used in SELECT or HAVING
- Date and String Functions
condition
- Relationals, Booleans, BETWEEN, IN, NOT IN, IS NULL, IS NOT NULL, LIKE
ORDER BY – sorts the result on the listed fields, in the order given; ascending by default, add DESC after a column to sort that column in descending order
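A short sketch of the SELECT pieces described above: DISTINCT, a computed column with AS, a BETWEEN condition, and ORDER BY with DESC. It uses Python's sqlite3 module with SQLite (an assumption); the Staff rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (sno TEXT PRIMARY KEY, bno TEXT, salary INTEGER);
    INSERT INTO Staff VALUES
        ('S1', 'B1', 24000), ('S2', 'B1', 36000), ('S3', 'B2', 12000);
""")

# DISTINCT removes the duplicate 'B1'.
branches = conn.execute(
    "SELECT DISTINCT bno FROM Staff ORDER BY bno").fetchall()

# A math expression renamed with AS, filtered with BETWEEN, sorted descending.
monthly = conn.execute("""
    SELECT sno, salary / 12 AS monthly_salary
    FROM Staff
    WHERE salary BETWEEN 20000 AND 40000
    ORDER BY salary DESC
""").fetchall()
print(branches)
print(monthly)
```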
Using Aggregates
If SELECT includes an aggregate and no GROUP BY is used, then no item in the SELECT list can reference a column outside an aggregate, e.g.
SELECT sno, COUNT(salary) FROM Staff;
is illegal
Be careful with DISTINCT and aggregates
- With DISTINCT inside the aggregate, COUNT, AVG, and SUM include only the distinct values
GROUP BY
- Treats NULLs as equal
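The two points above are easy to check on a tiny table. This is a sketch on SQLite via Python's sqlite3 module (an assumption), with hypothetical sample rows in which two staff share a salary and two have a NULL branch number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (sno TEXT PRIMARY KEY, bno TEXT, salary INTEGER);
    INSERT INTO Staff VALUES
        ('S1', 'B1', 12000), ('S2', 'B1', 12000),
        ('S3', NULL, 18000), ('S4', NULL, 30000);
""")

# DISTINCT inside the aggregate counts or sums each distinct salary once.
all_count, distinct_count = conn.execute(
    "SELECT COUNT(salary), COUNT(DISTINCT salary) FROM Staff").fetchone()
all_sum, distinct_sum = conn.execute(
    "SELECT SUM(salary), SUM(DISTINCT salary) FROM Staff").fetchone()

# GROUP BY puts both NULL branch numbers into one group.
group_sizes = conn.execute(
    "SELECT bno, COUNT(*) FROM Staff GROUP BY bno").fetchall()
print(all_count, distinct_count, all_sum, distinct_sum, group_sizes)
```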
When GROUP BY is used, each item in SELECT list must be single-valued per group.
SELECT clause can only contain
- Attribute names – if also in GROUP BY clause
- Aggregates on anything
- Constants
- Expressions including the above
e.g.
SELECT bno, COUNT(sno) AS count, SUM(salary) AS sum FROM staff GROUP BY bno ORDER BY bno
HAVING
- Restricts the groups that appear in the result
- Column names must appear in GROUP BY or be inside aggregates
e.g. HAVING COUNT(sno) > 1
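Grouping and HAVING can be sketched as follows, using SQLite through Python's sqlite3 module (an assumption). The sample rows are hypothetical, and the aliases staff_count and total are names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (sno TEXT PRIMARY KEY, bno TEXT, salary INTEGER);
    INSERT INTO Staff VALUES
        ('S1', 'B1', 24000), ('S2', 'B1', 36000),
        ('S3', 'B2', 12000),
        ('S4', 'B3', 18000), ('S5', 'B3', 9000), ('S6', 'B3', 9000);
""")

# One result row per branch: the grouping column plus aggregates.
per_branch = conn.execute("""
    SELECT bno, COUNT(sno) AS staff_count, SUM(salary) AS total
    FROM Staff
    GROUP BY bno
    ORDER BY bno
""").fetchall()

# HAVING filters whole groups after aggregation; B2 has only one member.
multi_staff = conn.execute("""
    SELECT bno, COUNT(sno) AS staff_count, SUM(salary) AS total
    FROM Staff
    GROUP BY bno
    HAVING COUNT(sno) > 1
    ORDER BY bno
""").fetchall()
print(per_branch)
print(multi_staff)
```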
SUBQUERIES - Inner SELECT statement used in WHERE and HAVING clauses
Subquery with equality, >, <, >=, <=, <>, != must reference single value
WHERE salary = ( SELECT MAX(salary) FROM …
Otherwise use IN or NOT IN or > ANY or < ALL or > SOME or < SOME
WHERE salary IN ( SELECT salary FROM …
NOT IN ( SELECT …
> ANY ( SELECT …
< ALL ( SELECT …
EXISTS
WHERE EXISTS ( SELECT * FROM …
WHERE NOT EXISTS ( SELECT * FROM …
e.g.
SELECT LName, FName FROM Student s WHERE EXISTS ( SELECT * FROM Enroll e WHERE s.ID = e.ID AND e.DeptCode = 'CSCI' );
SQL Subquery Rules - For subqueries in WHERE or HAVING clause
1. No ORDER BY clause in the subquery
2. SELECT list of subquery must be single column, unless being tested with EXISTS
3. Column name in subquery refers to tables in subquery by default, must qualify to reach outer tables
4. Subquery must be right-hand side of comparison operator e.g. year = (SELECT … ) not (SELECT … ) = year
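The main subquery forms can be compared side by side in a small sketch on SQLite via Python's sqlite3 module (an assumption). SQLite does not implement the ANY, SOME, and ALL quantifiers, so only the single-value comparison, IN, EXISTS, and NOT EXISTS are shown. The column layouts of Student and Enroll and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (ID INTEGER PRIMARY KEY, LName TEXT, FName TEXT);
    CREATE TABLE Enroll (ID INTEGER, DeptCode TEXT, grade INTEGER);
    INSERT INTO Student VALUES (1, 'Smith', 'Ann'), (2, 'Jones', 'Bob'), (3, 'Lee', 'Cy');
    INSERT INTO Enroll VALUES (1, 'CSCI', 95), (1, 'MATH', 80), (2, 'MATH', 95);
""")

# Single-value subquery on the right-hand side of "=".
top_ids = conn.execute("""
    SELECT ID FROM Enroll
    WHERE grade = (SELECT MAX(grade) FROM Enroll)
    ORDER BY ID
""").fetchall()

# Multi-row subquery tested with IN.
math_students = conn.execute("""
    SELECT LName FROM Student
    WHERE ID IN (SELECT ID FROM Enroll WHERE DeptCode = 'MATH')
    ORDER BY LName
""").fetchall()

# Correlated subquery: s.ID is qualified to reach the outer table.
csci_students = conn.execute("""
    SELECT LName, FName FROM Student s
    WHERE EXISTS (SELECT * FROM Enroll e
                  WHERE s.ID = e.ID AND e.DeptCode = 'CSCI')
""").fetchall()

not_enrolled = conn.execute("""
    SELECT LName FROM Student s
    WHERE NOT EXISTS (SELECT * FROM Enroll e WHERE s.ID = e.ID)
""").fetchall()
print(top_ids, math_students, csci_students, not_enrolled)
```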
Strings – Equality and Pattern Matching
- For queries involving strings, SQL has two pattern matching wildcards
% - any sequence of zero or more characters
_ - (underscore) – any single character
- All other characters in the pattern represent themselves
Examples:
WHERE LastName = 'Smith' - exact match
WHERE LastName LIKE 'van%' - any last name starting with 'van'
WHERE LastName NOT LIKE 'Mac%' - avoid those 'Mac'-somethings
WHERE Address LIKE '%Main Street' - any address ending with 'Main Street'
WHERE Address LIKE '%Glasgow%' - any address containing 'Glasgow'
WHERE FirstName LIKE 'J___' - any 4-character first name starting with 'J' (three underscores, no spaces between them)
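May have to use an escape character such as '\' to match a wildcard literally (use '\\' to indicate '\'); standard SQL and Oracle name the escape character with an ESCAPE clause
WHERE Discount LIKE '10\% Off' ESCAPE '\' - match '10% Off'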
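The wildcard rules can be exercised with a few sample strings. This sketch uses SQLite through Python's sqlite3 module (an assumption); note that SQLite's LIKE ignores ASCII case by default, which Oracle's does not. The Person and Offer tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (FirstName TEXT, LastName TEXT);
    INSERT INTO Person VALUES
        ('John', 'van Dyke'), ('Jo', 'MacLeod'), ('Jane', 'Smith');
    CREATE TABLE Offer (Discount TEXT);
    INSERT INTO Offer VALUES ('10% Off'), ('100 Off'), ('10 percent Off');
""")

def column(sql):
    """Hypothetical helper: first column of every result row, as a list."""
    return [row[0] for row in conn.execute(sql).fetchall()]

starts_with_van = column("SELECT LastName FROM Person WHERE LastName LIKE 'van%'")
not_mac = column(
    "SELECT LastName FROM Person WHERE LastName NOT LIKE 'Mac%' ORDER BY LastName")
# 'J' followed by exactly three characters; 'Jo' is too short.
four_letter_j = column(
    "SELECT FirstName FROM Person WHERE FirstName LIKE 'J___' ORDER BY FirstName")

# Unescaped, % is still a wildcard, so all three discounts match.
unescaped = column("SELECT Discount FROM Offer WHERE Discount LIKE '10% Off'")
# With ESCAPE, \% stands for a literal percent sign.
escaped = column(
    r"SELECT Discount FROM Offer WHERE Discount LIKE '10\% Off' ESCAPE '\'")
print(starts_with_van, not_mac, four_letter_j, len(unescaped), escaped)
```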
Combining Resulting Tables
- The three SQL operators UNION, INTERSECT, and MINUS represent the set operations union, intersection, and set difference. MINUS is Oracle's name for the operator the ISO standard calls EXCEPT.
For example,
SELECT * FROM cs.Student WHERE FName = 'John'
INTERSECT
SELECT * FROM cs.Student WHERE LName LIKE 'Mac%';
- The tables being combined must be union-compatible: the same number of columns, with corresponding columns of compatible types.
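The three set operations can be compared on the same pair of queries. The sketch runs on SQLite via Python's sqlite3 module (an assumption), so set difference is written EXCEPT rather than Oracle's MINUS and the cs. schema prefix is dropped. The Student rows and the ids helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (ID INTEGER PRIMARY KEY, FName TEXT, LName TEXT);
    INSERT INTO Student VALUES
        (1, 'John', 'MacDonald'), (2, 'John', 'Smith'), (3, 'Mary', 'MacLeod');
""")

johns = "SELECT * FROM Student WHERE FName = 'John'"
macs = "SELECT * FROM Student WHERE LName LIKE 'Mac%'"

def ids(operator):
    """Hypothetical helper: sorted IDs of the rows in <johns> operator <macs>."""
    rows = conn.execute(f"{johns} {operator} {macs}").fetchall()
    return sorted(row[0] for row in rows)

union_ids = ids("UNION")          # in either result
intersect_ids = ids("INTERSECT")  # in both results
difference_ids = ids("EXCEPT")    # Johns who are not Mac-somethings
print(union_ids, intersect_ids, difference_ids)
```

Both operands select every column of the same table, so they are trivially union-compatible.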
Database Updates
Three SQL statements can modify the contents of tables in the database
INSERT – add new rows to a table
UPDATE – modify existing data in a table
DELETE – remove data from a table
INSERT takes on two forms
INSERT INTO TableName [(column_list)] VALUES (dataValueList);
INSERT INTO TableName [(column_list)] SELECT …
- In the latter form, the SELECT clause can be any valid SELECT statement
UPDATE allows contents of existing rows in a table to be changed
UPDATE TableName SET columnName1 = dataValue1 [ , columnName2 = dataValue2 … ] [ WHERE searchCondition ]
- The WHERE clause is optional – without it, all rows in the table will be affected
DELETE allows rows to be deleted from a named table
DELETE FROM TableName [ WHERE searchCondition ]
- The WHERE clause is optional – without it, all rows in the table will be deleted
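The three update statements, including both forms of INSERT, can be sketched as below on SQLite through Python's sqlite3 module (an assumption). The Staff and HighEarner tables and their rows are hypothetical; rowcount is the sqlite3 cursor attribute giving the number of rows a statement changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (sno TEXT PRIMARY KEY, bno TEXT, salary INTEGER);
    CREATE TABLE HighEarner (sno TEXT PRIMARY KEY, salary INTEGER);
""")

# Form 1: INSERT ... VALUES
conn.execute("INSERT INTO Staff (sno, bno, salary) VALUES ('S1', 'B1', 24000)")
conn.execute("INSERT INTO Staff (sno, bno, salary) VALUES ('S2', 'B1', 36000)")
conn.execute("INSERT INTO Staff (sno, bno, salary) VALUES ('S3', 'B2', 12000)")

# Form 2: INSERT ... SELECT copies the rows produced by a query.
conn.execute("""
    INSERT INTO HighEarner (sno, salary)
    SELECT sno, salary FROM Staff WHERE salary > 20000
""")

# The WHERE clause limits which rows are changed or removed.
updated = conn.execute(
    "UPDATE Staff SET salary = salary + 1000 WHERE bno = 'B1'").rowcount
deleted = conn.execute("DELETE FROM Staff WHERE salary < 20000").rowcount

staff = conn.execute("SELECT sno, salary FROM Staff ORDER BY sno").fetchall()
high = conn.execute("SELECT sno FROM HighEarner ORDER BY sno").fetchall()
print(updated, deleted, staff, high)
```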
View
- The dynamic result of one or more relational operations operating on the base relations to produce another relation
- Virtual relation – does not exist in the database but can be produced upon request
CREATE VIEW vname [ (newColName [ , … ] ) ] AS subselect [ WITH CHECK OPTION ]
- View is defined by specifying an SQL SELECT statement (defining query)
- Can optionally assign a name to each column of the view (instead of using default one)
 o If so, must list same number of names as columns returned by subquery
- WITH CHECK OPTION causes SQL to reject any insert or update made through the view that would produce a row failing the WHERE clause of the subselect, so changes made through the view cannot remove tuples from it
Horizontal view – restricts user’s access to selected rows of one or more tables
CREATE VIEW Math17 AS SELECT * FROM cs.Section WHERE deptCode = 'MATH' AND year = 2017;
Vertical view – restricts user’s access to selected columns of one or more tables
CREATE VIEW Math17nums AS SELECT CourseNo, SNo FROM Math17;
CREATE VIEW Math17courses AS SELECT DISTINCT CourseNo FROM Math17nums;
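The three view definitions above can be run almost unchanged on SQLite through Python's sqlite3 module (an assumption of this sketch; the cs. schema prefix is dropped). The column layout of Section and the sample rows are hypothetical. The final insert shows that a view is recomputed from the base table on every query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Section (deptCode TEXT, CourseNo INTEGER, SNo INTEGER, year INTEGER);
    INSERT INTO Section VALUES
        ('MATH', 101, 1, 2017), ('MATH', 101, 2, 2017),
        ('MATH', 201, 1, 2016), ('CSCI', 150, 1, 2017);

    -- Horizontal view: a subset of the rows
    CREATE VIEW Math17 AS
        SELECT * FROM Section WHERE deptCode = 'MATH' AND year = 2017;
    -- Vertical view: a subset of the columns
    CREATE VIEW Math17nums AS SELECT CourseNo, SNo FROM Math17;
    CREATE VIEW Math17courses AS SELECT DISTINCT CourseNo FROM Math17nums;
""")

math17_rows = conn.execute("SELECT COUNT(*) FROM Math17").fetchone()[0]
nums = conn.execute("SELECT * FROM Math17nums ORDER BY SNo").fetchall()
courses = conn.execute("SELECT * FROM Math17courses").fetchall()

# A new base-table row shows up in the views immediately.
conn.execute("INSERT INTO Section VALUES ('MATH', 301, 1, 2017)")
courses_after = conn.execute(
    "SELECT CourseNo FROM Math17courses ORDER BY CourseNo").fetchall()
print(math17_rows, nums, courses, courses_after)
```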
Check option:
CREATE VIEW Highgrades AS SELECT * FROM cs.Enroll WHERE grade > 90 WITH CHECK OPTION;
cannot: UPDATE Highgrades SET grade = 89 WHERE ID = 201200125;
- would remove it from view – integrity constraint
Restrictions on Views (ISO Standard)
Aggregates
- If a column in a view is based on an aggregate function, then the column may only appear in SELECT and ORDER BY clauses of queries that access the view
- Such a column may not be used in a WHERE clause and may not be an argument to another aggregate function in any query based on the view
- Oracle allows it
CREATE VIEW MathSectionCount AS SELECT CourseNo, COUNT(*) AS Offerings FROM Math17nums GROUP BY CourseNo;
Groupings
- A grouped view can never be joined with another table or view
- Oracle allows it
View Updatability
- DBMS must be able to trace any row or column back to its row/column in source table
According to the ISO Standard, a view is updatable iff in the defining query:
- DISTINCT is not specified
 o duplicates are not eliminated
- SELECT list contains only column names (each appearing only once)
 o No constants, expressions, or aggregate functions
- FROM clause specifies only one table (or a view satisfying these conditions)
 o No joins, unions, intersections, or differences
- WHERE clause does not include nested SELECTs that reference the table in the FROM clause
 o No subqueries in view
- GROUP BY or HAVING clauses are not specified
Also, any rows added to view must not violate integrity constraints of base tables
View Resolution – resolving query on View to query on base tables
Merge the two queries by:
1. View column names in the SELECT list are replaced with the corresponding column names of the defining query
2. View names in FROM are replaced with base table names
3. WHERE clauses are combined (with AND)
4. GROUP BY and HAVING clauses are combined
5. ORDER BY clause is copied
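The merge steps can be followed by hand on a small case: a query against a view and the hand-merged query against the base table should return the same rows. This is a sketch on SQLite via Python's sqlite3 module (an assumption). The view Math17secs, its column names Course and Sec, and the sample rows are hypothetical, and the merged query is written out manually here rather than being something the DBMS exposes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Section (deptCode TEXT, CourseNo INTEGER, SNo INTEGER, year INTEGER);
    INSERT INTO Section VALUES
        ('MATH', 101, 1, 2017), ('MATH', 101, 2, 2017),
        ('MATH', 201, 2, 2016), ('CSCI', 150, 2, 2017);

    CREATE VIEW Math17secs (Course, Sec) AS
        SELECT CourseNo, SNo FROM Section
        WHERE deptCode = 'MATH' AND year = 2017;
""")

# The query as a user would write it against the view.
on_view = "SELECT Course FROM Math17secs WHERE Sec > 1 ORDER BY Course"

# The same query merged by hand onto the base table:
#   Course -> CourseNo and Sec -> SNo (column names translated),
#   Math17secs -> Section (view name replaced),
#   both WHERE clauses joined with AND, ORDER BY copied.
merged = """
    SELECT CourseNo FROM Section
    WHERE deptCode = 'MATH' AND year = 2017 AND SNo > 1
    ORDER BY CourseNo
"""

on_view_rows = conn.execute(on_view).fetchall()
merged_rows = conn.execute(merged).fetchall()
print(on_view_rows, merged_rows)
```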
View Advantages
Data independence
- structure does not change even if underlying tables change (just view definition)
- may have to redefine view based on new structure of underlying tables
Currency – changes immediately reflected in view (as opposed to copying part of table)
Customization – different users see data in different ways
Improved security – restrict access via small set of views containing data appropriate for user
Reduced complexity – combine tables via view so queries can be done simply on view
Convenience – only confronted with necessary data / structure
Data integrity – WITH CHECK OPTION allows conditions defining view to be enforced on update to view
View Disadvantages
Update restriction – some cases cannot be updated
Structure restriction
- view structure determined at time of creation
- structure not automatically updated
e.g. a view of the form SELECT * FROM … refers to the columns of the base table at time of creation; adding columns to the table(s) later does not add them to the view
Performance – view resolution takes time