CS425 – Summer 2016 Jason Arnold Chapter 4: Introduction to SQL

CS425 – Summer 2016 Jason Arnold Chapter 4: Introduction to SQL Modified from: Database System Concepts, 6th Ed. ©Silberschatz, Korth and Sudarshan See www.db-book.com for conditions on re-use Chapter 4: Introduction to SQL n Overview of the SQL Query Language n Data Definition n Basic Query Structure n Additional Basic Operations n Set Operations n Null Values n Aggregate Functions n Nested Subqueries n Modification of the Database Textbook: Chapter 3 CS425 – Fall 2013 – Boris Glavic 4.2 ©Silberschatz, Korth and Sudarshan History n IBM Sequel language developed as part of System R project at the IBM San Jose Research Laboratory n Renamed Structured Query Language (SQL) n ANSI and ISO standard SQL: l SQL-86, SQL-89, SQL-92 l SQL:1999, SQL:2003, SQL:2008 n Commercial systems offer most, if not all, SQL-92 features, plus varying feature sets from later standards and special proprietary features. l Not all examples here may work one-to-one on your particular system. CS425 – Fall 2013 – Boris Glavic 4.3 ©Silberschatz, Korth and Sudarshan Data Definition Language The SQL data-definition language (DDL) allows the specification of information about relations, including: n The schema for each relation. n The domain of values associated with each attribute. n Integrity constraints n And as we will see later, also other information such as l The set of indices to be maintained for each relations. l Security and authorization information for each relation. l The physical storage structure of each relation on disk. CS425 – Fall 2013 – Boris Glavic 4.4 ©Silberschatz, Korth and Sudarshan Domain Types in SQL n char(n). Fixed length character string, with user-specified length n. n varchar(n). Variable length character strings, with user-specified maximum length n. n int. Integer (a finite subset of the integers that is machine- dependent). n smallint. Small integer (a machine-dependent subset of the integer domain type). n numeric(p,d). Fixed point number, with user-specified precision of p digits, with n digits to the right of decimal point. n real, double precision. Floating point and double-precision floating point numbers, with machine-dependent precision. n float(n). Floating point number, with user-specified precision of at least n digits. n More are covered in Chapter 4. CS425 – Fall 2013 – Boris Glavic 4.5 ©Silberschatz, Korth and Sudarshan Create Table Construct n An SQL relation is defined using the create table command: create table r (A1 D1, A2 D2, ..., An Dn, (integrity-constraint1), ..., (integrity-constraintk)) l r is the name of the relation l each Ai is an attribute name in the schema of relation r l Di is the data type of values in the domain of attribute Ai n Example: create table instructor ( ID char(5), name varchar(20) not null, dept_name varchar(20), salary numeric(8,2)) n insert into instructor values (‘10211’, ’Smith’, ’Biology’, 66000); n insert into instructor values (‘10211’, null, ’Biology’, 66000); CS425 – Fall 2013 – Boris Glavic 4.6 ©Silberschatz, Korth and Sudarshan Integrity Constraints in Create Table n not null n primary key (A1, ..., An ) n foreign key (Am, ..., An ) references r Example: Declare ID as the primary key for instructor . create table instructor ( ID char(5), name varchar(20) not null, dept_name varchar(20), salary numeric(8,2), primary key (ID), foreign key (dept_name) references department) primary key declaration on an attribute automatically ensures not null CS425 – Fall 2013 – Boris Glavic 4.7 ©Silberschatz, Korth and Sudarshan And a Few More Relation Definitions n create table student ( ID varchar(5), name varchar(20) not null, dept_name varchar(20), tot_cred numeric(3,0), primary key (ID), foreign key (dept_name) references department) ); n create table takes ( ID varchar(5), course_id varchar(8), sec_id varchar(8), semester varchar(6), year numeric(4,0), grade varchar(2), primary key (ID, course_id, sec_id, semester, year), foreign key (ID) references student, foreign key (course_id, sec_id, semester, year) references section ); l Note: sec_id can be dropped from primary key above, to ensure a student cannot be registered for two sections of the same course in the same semester CS425 – Fall 2013 – Boris Glavic 4.8 ©Silberschatz, Korth and Sudarshan Even more n create table course ( course_id varchar(8) primary key, title varchar(50), dept_name varchar(20), credits numeric(2,0), foreign key (dept_name) references department) ); l Primary key declaration can be combined with attribute declaration as shown above CS425 – Fall 2013 – Boris Glavic 4.9 ©Silberschatz, Korth and Sudarshan Drop and Alter Table Constructs n drop table student l Deletes the table and its contents n alter table l alter table r add A D where A is the name of the attribute to be added to relation r and D is the domain of A. All tuples in the relation are assigned null as the value for the new attribute. l alter table r drop A where A is the name of an attribute of relation r Dropping of attributes not supported by many databases l And more … CS425 – Fall 2013 – Boris Glavic 4.10 ©Silberschatz, Korth and Sudarshan Basic Query Structure n The SQL data-manipulation language (DML) provides the ability to query information, and insert, delete and update tuples n A typical SQL query has the form: select A1, A2, ..., An from r1, r2, ..., rm where P l Ai represents an attribute l Ri represents a relation l P is a predicate. n The result of an SQL query is a relation. CS425 – Fall 2013 – Boris Glavic 4.11 ©Silberschatz, Korth and Sudarshan The select Clause n The select clause list the attributes desired in the result of a query l corresponds to the projection operation of the relational algebra n Example: find the names of all instructors: select name from instructor n NOTE: SQL keywords are case insensitive (i.e., you may use upper- or lower-case letters.) l E.g. Name ≡ NAME ≡ name l Some people use upper case wherever we use bold font. CS425 – Fall 2013 – Boris Glavic 4.12 ©Silberschatz, Korth and Sudarshan The select Clause (Cont.) n SQL allows duplicates in relations as well as in query results. n To force the elimination of duplicates, insert the keyword distinct after select. n Find the names of all departments with instructor, and remove duplicates select distinct dept_name from instructor n The (redundant) keyword all specifies that duplicates not be removed. select all dept_name from instructor CS425 – Fall 2013 – Boris Glavic 4.13 ©Silberschatz, Korth and Sudarshan The select Clause (Cont.) n An asterisk in the select clause denotes “all attributes” select * from instructor n The select clause can contain arithmetic expressions involving the operation, +, –, , and /, and operating on constants or attributes of tuples. l Most systems also support additional functions E.g., substring l Most systems allow user defined functions (UDFs) n The query: select ID, name, salary/12 from instructor would return a relation that is the same as the instructor relation, except that the value of the attribute salary is divided by 12. CS425 – Fall 2013 – Boris Glavic 4.14 ©Silberschatz, Korth and Sudarshan The from Clause n The from clause lists the relations involved in the query l Corresponds to the Cartesian product operation of the relational algebra. n Find the Cartesian product instructor X teaches select from instructor, teaches l generates every possible instructor – teaches pair, with all attributes from both relations n Cartesian product not very useful directly, but useful combined with where-clause condition (selection operation in relational algebra) CS425 – Fall 2013 – Boris Glavic 4.15 ©Silberschatz, Korth and Sudarshan The where Clause n The where clause specifies conditions that the result must satisfy l Corresponds to the selection predicate of the relational algebra. n To find all instructors in Comp. Sci. dept with salary > 80000 select name from instructor where dept_name = ‘Comp. Sci.' and salary > 80000 n Comparison results can be combined using the logical connectives and, or, and not. n Comparisons can be applied to results of arithmetic expressions. n SQL standard: any valid expression that returns a boolean result l Vendor specific restrictions may apply! CS425 – Fall 2013 – Boris Glavic 4.16 ©Silberschatz, Korth and Sudarshan Cartesian Product: instructor X teaches instructor teaches CS425 – Fall 2013 – Boris Glavic 4.17 ©Silberschatz, Korth and Sudarshan Joins n For all instructors who have taught some course, find their names and the course ID of the courses they taught. select name, course_id from instructor, teaches where instructor.ID = teaches.ID n Find the course ID, semester, year and title of each course offered by the Comp. Sci. department select section.course_id, semester, year, title from section, course where section.course_id = course.course_id and dept_name = ‘Comp. Sci.' CS425 – Fall 2013 – Boris Glavic 4.18 ©Silberschatz, Korth and Sudarshan Joined Relations n Join operations take two relations and return as a result another relation. n A join operation is a Cartesian product which requires that tuples in the two relations match (under some condition). It also specifies the attributes that are present in the result of the join n The join operations are typically used as subquery expressions in the from clause CS425 – Fall 2013 – Boris Glavic 4.19 ©Silberschatz, Korth and Sudarshan Join operations – Example n Relation course n Relation prereq n Observe that prereq information is missing for CS-315 and course information is missing for CS-437 CS425 – Fall 2013 – Boris Glavic 4.20 ©Silberschatz, Korth and Sudarshan Outer Join n An extension of the join operation that avoids loss of information. n Computes the join and then adds tuples form one relation that does not match tuples in the other relation to the result of the join.

CS425 – Summer 2016 Jason Arnold Chapter 4: Introduction to SQL

Chapter 11 Querying

Using the Set Operators Questions

SQL Version Analysis

“A Relational Model of Data for Large Shared Data Banks”

Relational Algebra and SQL Relational Query Languages

Best Practices Managing XML Data

Chapter 2 – Object-Relational Views and Composite Types Outline

On the Logic of SQL Nulls

Information Technology — Database Languages — SQL — Part 14: XML

DS 1300 - Introduction to SQL Part 3 – Aggregation & Other Topics

Relational Completeness of Data Base Sublanguages

A Next Generation Data Language Proposal