ACS-3902
Ron McFadyen
Slides are based on chapter 5 (7th edition) (chapter 3 in 6th edition)
ACS-3902 1

The Relational Data Model and Relational Database Constraints
• Relational model
  – Ted Codd (IBM) 1970
  – First commercial implementations available in early 1980s
  – Widely used
ACS-3902 2

Relational Model Concepts
• Database is a collection of relations
• Implementation of relation: table comprising rows and columns
• In practice a table/relation represents an entity type or relationship type (entity-relationship model … later)
• At intersection of a row and column in a table there is a simple value
• Row
  • Represents a collection of related data values
  • Formally called a tuple
• Column names
  • Columns may be referred to as fields or, formally, as attributes
  • Values in a column are drawn from a domain of values associated with the column/field/attribute
ACS-3902 3

Relational Model Concepts
[7th edition, Figure 5.1]
ACS-3902 4

Domains
• Domain
  – Atomic
    • A domain is a collection of values where each value is indivisible
    • Not meaningful to decompose further
  – Specifying a domain
    • Name, data type, rules
  – Examples
    • Domain of department codes for UW is the list {“ACS”, “MATH”, “ENGL”, “HIST”, etc.}
    • Domain of gender values for UW is the list {“male”, “female”}
  – Cardinality: number of values in a domain
  – Database implementation & support vary
ACS-3902 5

Domain example - PostgreSQL
    CREATE DOMAIN posint AS integer CHECK (VALUE > 0);
    CREATE TABLE mytable (id posint);
    INSERT INTO mytable VALUES(1);  -- works
    INSERT INTO mytable VALUES(-1); -- fails
https://www.postgresql.org/docs/current/domains.html
ACS-3902 6

Domain example - PostgreSQL
    CREATE DOMAIN domain_code_type AS character varying NOT NULL
        CONSTRAINT domain_code_type_check
        CHECK (VALUE IN ('ApprovedByAdmin', 'Unapproved', 'ApprovedByEmail'));
    CREATE TABLE codes__domain (
        code_id integer NOT NULL,
        code_type domain_code_type NOT NULL,
        CONSTRAINT codes_domain_pk PRIMARY KEY (code_id)
    );
ACS-3902 7

Relation
• Relation schema R
  – Name R and a list of attributes:
    • Denoted by R(A1, A2, ..., An)
    • E.g. STUDENT (Name, Ssn, Home_phone, Address, Office_phone, Age, Gpa)
  – Attribute Ai
    • Name of a role played by some domain D in the relation schema R
    • Each attribute has a name and a domain
    • E.g. age: integer, firstName: aName, lastName: aName
• Degree (or arity) of a relation
  – Number of attributes in its relation schema
  – E.g. STUDENT has 7 attributes, so its degree is 7
ACS-3902 8

Relations
• The relation (or relation state) r(R)
  – Set of n-tuples r = {t1, t2, ..., tm} (m tuples in the relation; each tuple has n values)
  – Each n-tuple t (each value comes from a domain)
    • Ordered list of n values t = <v1, v2, ..., vn>
    • Each value vi, 1 ≤ i ≤ n, is an element of domain(Ai) or is NULL
  – r(R) is a subset of the Cartesian product of the domains of R:
    • r(R) ⊆ (domain(A1) × domain(A2) × ... × domain(An))
ACS-3902 9

Relational Databases and Relational Database Schemas
• Relational database schema S
  – Set of relation schemas S = {R1, R2, ..., Rm}
  – Set of integrity constraints IC
• Relational database state
  – Set of relation states DB = {r1, r2, ..., rm}
  – Each ri is a state of Ri, and together the ri relation states satisfy the integrity constraints specified in IC
ACS-3902 10

Characteristics of Relations
• No ordering of tuples in a relation
  – Relation defined as a set of tuples
  – Order of attributes is not important in the formal model (some database systems may offer practical tips on column order)
ACS-3902 11

Characteristics of Relations (cont’d.)
Figures 5.1 and 5.2 show the same relation state; the order of tuples is not important
[7th edition, Figures 5.2 and 5.1]
ACS-3902 12

Characteristics of Relations (cont’d.)
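As a rough illustration of the characteristics covered so far (a relation state is a set of tuples, so listing order and repeated rows do not matter, and every tuple is drawn from the Cartesian product of the attribute domains), here is a minimal Python sketch. The two domains and the tuples are made-up values chosen for illustration; they are not data from the textbook figures, and NULL values are left out to keep the sketch small.

```python
from itertools import product

# Hypothetical, deliberately tiny domains (assumptions for illustration only).
name_domain = {"Ann", "Bob"}
age_domain = {19, 20}

# Cartesian product of the domains: every possible (name, age) tuple.
all_possible = set(product(name_domain, age_domain))

# A relation state modelled as a Python set of tuples.
r1 = {("Ann", 19), ("Bob", 20)}

# The same tuples written in a different order, with one repeated.
r2 = {("Bob", 20), ("Ann", 19), ("Bob", 20)}

# r(R) is a subset of the Cartesian product of its attribute domains.
assert r1 <= all_possible

# Order and repetition do not change the state: both denote the same set.
assert r1 == r2
```

A Python set is only a stand-in for the formal definition here; an SQL table is not required to behave this way unless a key or unique constraint is declared on it.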
• Values
  – Each value in a tuple is atomic
  – Flat relational model
    • Composite and multivalued attributes not allowed
    • First normal form assumption
  – Multivalued attributes
    • Must be represented by separate relations
  – Composite attributes
    • Represented only by simple component attributes in basic relational model
An EERD may have such things … we cover proper mapping of EERD to relations
ACS-3902 13

Characteristics of Relations (cont’d.)
• NULLs
  – Represent the values of attributes that may be unknown or may not apply to a tuple
  – Meanings for NULL values
    • Value unknown
    • Value exists but is not available
    • Attribute does not apply to this tuple (value undefined)
    • …
ACS-3902 14

Characteristics of Relations (cont’d.)
• Interpretation (meaning) of a relation
  • Each tuple in the relation is a fact
  • In the Student relation there are five assertions: five students exist and have the characteristics given
  • We must be able to make statements regarding the meaning of tuples … see slide 11
  • E.g. Dick Davidson is identified by the SSN 422-11-2320, has an unknown home phone number, lives at 3452 Elgin Road, has an office phone number (817)749-1253, is of age 25 and has a current GPA of 3.53
ACS-3902 15

Relational Model Notation
• To refer to the current set of tuples (its state): STUDENT
• To refer to the schema: STUDENT (Name, Ssn, Home_phone, Address, Office_phone, Age, Gpa)
• An attribute can be qualified with the name of the relation to which it belongs by using dot notation: STUDENT.Name
  – Needed in SQL sometimes
ACS-3902 16

Constraints
• Integrity Constraints
  – Restrictions on the actual values in a database state
  – Derived from the rules in the miniworld that the database represents
  – Three categories:
    • Constraints inherent in the data model
    • Constraints expressed in the schema
    • General constraints not falling into the first two categories
      – Expressed in application code
ACS-3902 17

Constraints
• Constraints expressed in a data model:
  – There are no duplicate tuples
    E.g. there is at least one attribute value that differentiates one student from another
  – Values for an attribute must come from its domain
    E.g. each SSN value is numeric
In practice these are expressed in the DDL, i.e. in the schema
    Create table, create index, …
ACS-3902 18

Relational Model Constraints (cont’d.)
• Schema-based constraints
  – Domains
  – Keys
  – NULLs
  – Entity integrity
  – Referential integrity
ACS-3902 19

Domain Constraints
• In DDL we specify a datatype for an attribute
  – Numeric data types for integers and real numbers
  – Characters
  – Booleans
  – If supported …. a domain
  – etc.
    CREATE TABLE Customer(
        First_Name char(50),
        Last_Name char(50),
        Address char(50),
        City char(50),
        Country char(25),
        Birth_Date datetime
    );
ACS-3902 20

Domain Constraints
• In DDL we can specify a check constraint for an attribute
  E.g. suppose the value of the age attribute in a row must be greater than zero. The following will cause INSERT INTO Customer VALUES ('joe', 'smith', 0) to be rejected
    CREATE TABLE Customer(
        First_Name char(50),
        Last_Name char(50),
        Age int CHECK (Age > 0)
    );
ACS-3902 21

Domain Constraints
    CREATE TABLE distributors (
        did integer,
        name varchar(40),
        CONSTRAINT con1 CHECK (did > 100 AND name <> '')
    );
ACS-3902 22

Domain Constraints
• In DDL we can ensure each row has a value for an attribute
  E.g. suppose the value of the age attribute in a row must be known. The following will cause INSERT INTO Customer VALUES ('joe', 'smith', NULL) to be rejected
    CREATE TABLE Customer(
        First_Name char(50),
        Last_Name char(50),
        Age int NOT NULL
    );
ACS-3902 23

Domain Constraints
• SQL has a create domain statement (see page 94). Examples:
    CREATE DOMAIN persons_name CHAR(30);
    CREATE DOMAIN street_address CHAR(35);
• Domain definitions can be used in DDL:
    CREATE TABLE Customers (
        ID INT DEFAULT AUTOINCREMENT PRIMARY KEY,
        Name persons_name,
        Street street_address
    );
ACS-3902 24

Key Constraints
• Superkey: a combination of attributes whose values, taken together, are unique for every tuple
• Key
  – Of course, a key is a superkey
  – But a key is minimal in the sense that if you remove an attribute from the key it is no longer a superkey
  – The term candidate keys refers to all keys of a table
    • One key in a table is the primary key
    • Other keys in a table are alternate keys
ACS-3902 25

Key Constraints
• Candidate key
  – Relation schema may have more than one key
  – SQL: unique constraint
• Primary key of the relation
  – Only one CK is designated as the PK
  – Other candidate keys are designated as unique keys (secondary keys or alternate keys)
  – SQL: primary key clause
ACS-3902 26

Key Constraints
    CREATE TABLE Schedule (
        eventID INT,
        room CHAR(10),
        startTime DATETIME,
        endTime DATETIME,
        eventDescription VARCHAR(255),
        CONSTRAINT pk PRIMARY KEY (eventID),
        CONSTRAINT ck1 UNIQUE (room, startTime),
        CONSTRAINT ck2 UNIQUE (room, endTime)
    );
ACS-3902 27

Referential Integrity Constraints
    Empid  Mgrid  Empname   Salary
    1      NULL   Nancy     9000.00
    2      1      Andrew    5000.00
    3      1      Janet     5000.00
    4      1      Margaret  5000.00
    5      2      Steven    2500.00
    6      2      Michael   2500.00
    7      3      Robert    2500.00
    8      3      Laura     2500.00
    9      3      Ann       2500.00
    10     4      Ina       2500.00
    11     7      David     2000.00
    12     7      Ron       2000.00
    13     7      Dan       2000.00
    14     11     James     1500.00
Mgrid must be NULL or match the Empid value of an existing row of Employee
ACS-3902 28

Referential Integrity Constraints
Mgrid must be NULL or match the Empid value of an existing row of Employee
    CREATE TABLE Employees (
        empid int NOT NULL,
        mgrid int NULL,
        empname varchar(25) NOT NULL,
        salary money NOT NULL,
        CONSTRAINT PK_Employees_empid PRIMARY KEY (empid),
        CONSTRAINT FK_Employees_Employees FOREIGN KEY (mgrid) REFERENCES Employees (empid)
    );
ACS-3902 29

By convention the PK is underlined
[Figure callouts: PK, composite PK]
ACS-3902 30

Structure diagram showing PKs and FKs: Figure 5.7
A directed line shows that a FK references a PK
ACS-3902 31

Referential Integrity
• Foreign key rules:
  – The attributes in a FK have the same domain(s) as the primary key attributes PK
  – Value of FK in a tuple t1 of the current state r1(R1) either occurs