
Normalization and Transactions
LECTURE 8
Dr. Philipp Leitner
@xLeitix
6/1/16 Chalmers

Some SQL Query Examples

EMPLOYEE(id, fname, lname, bday, location, manager)
    manager -> EMPLOYEE.id
DEPENDENT(name, relationship, employee)
    employee -> EMPLOYEE.id
PROJECT(id, name, department)
WORKS_ON(employee, project)
    employee -> EMPLOYEE.id
    project -> PROJECT.id

Show all employees and their dependents.

    select *
    from employee, dependent
    where dependent.employee = employee.id;

Which employee has no manager?

    select *
    from employee
    where manager is null;

Show the names of all employees and the names of the projects they work on. Sort by the employee last name.

    select fname, lname, name
    from employee, works_on, project
    where employee.id = works_on.employee
      and works_on.project = project.id
    order by lname;

Or:

    select fname, lname, name
    from employee
      inner join works_on on employee.id = works_on.employee
      inner join project on works_on.project = project.id
    order by lname;

Show the last names of all employees and the last names of their managers. Employees with no managers should also be contained in the result.

    select e.lname as Employee, m.lname as Manager
    from employee e
      left outer join employee m on e.manager = m.id;

LECTURE 8 Covers …
- Database Normalization (Chapter 14)
- Transactions (Chapter 20)

Assignments
No directly matching assignment tasks - use the time to finish working on the SQL tasks in Assignment 2

What constitutes a “good” relational design?

Informal criteria for “good” design
- Clear semantics
- Mapping to the real world
- Minimal (controlled) redundancy
- Avoidance of NULL values
- Support for arbitrary queries
- Efficiency

An example of a model that does not allow arbitrary queries
Original (good) mapping:
    EMPLOYEE(SSN, Fname, Minit, Lname, Bdate, Address, Sex, Salary, Super_ssn, Dno)
Bad mapping:
    EMPLOYEE(SSN, Fname, Minit, Lname, Bdate, Address, Sex, Salary)
(EMPLOYEE is missing the foreign keys, so we can no longer find out which department an employee works in or who their manager is)

An example of redundancy
Original (good) mapping:
    WORKS_ON(ESSN, Pno, Hours)
Bad mapping (redundant):
    WORKS_ON(Emp#, Proj#, Ename, Pname, No_hours)
(the employee and project names are redundant, because we could also get them via a join to the employee and project relations)
Informally: A redundancy occurs when information is stored explicitly that could also be derived from some other place in the database.

On redundancy
Two problems of redundancy:
(1) It wastes space (the same information is stored multiple times). Nowadays this is no longer the biggest problem.
(2) Data can easily become corrupted when updating (update anomalies).

However: There is usually a trade-off, and in some database designs developers will choose redundancy to improve query times. Joins, and even more so calculations, are expensive, but projections are cheap.
Alternatively: Using database views

Functional dependencies
Functional dependencies (FDs)
- Are used to specify formal measures of the "goodness" of relational designs
- Together with keys, are used to define normal forms for relations
- Are constraints that are derived from the meaning and interrelationships of the data attributes
A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique value for Y
Mathematically: X → Y (“X determines Y”)
Or in RA form, for any two tuples t1 and t2 of the relation: t1[X] = t2[X] → t1[Y] = t2[Y]

Some examples
- Social security number determines employee name:
    SSN → ENAME
- Project number determines project name and location:
    PNUMBER → {PNAME, PLOCATION}
- Employee SSN and project number together determine the hours per week that the employee works on the project:
    {SSN, PNUMBER} → HOURS

FDs and keys
- FDs are not bidirectional: SSN → ENAME, but the inverse is not true
- A key automatically has an FD on every attribute in the relation: if you know the key, you can determine all other attribute values

Redundancies formulated as FDs
Note that data redundancies can be formulated as (unwanted) functional dependencies:
    WORKS_ON(Emp#, Proj#, Ename, Pname, No_hours)
    WORKS_ON[Ename] → EMPLOYEE[Ename]
    WORKS_ON[Pname] → PROJECT[Pname]
These are said to be redundant FDs, as opposed to the expected FDs between primary / foreign keys. Colloquially, we can state that we need to maintain multiple pairs of FDs where one would do.
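
The tuple-based definition of an FD (t1[X] = t2[X] → t1[Y] = t2[Y]) can be checked mechanically against a given set of rows. The following is a minimal sketch, not part of the lecture: the function name `fd_holds` and the sample `WORKS_ON` rows are made up for illustration. Note that a check like this can only show that an FD is violated by the data; an FD that happens to hold in one database state is not thereby proven to be a real constraint.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on `lhs` but differ on `rhs`.

    rows: list of dicts (one dict per tuple)
    lhs, rhs: lists of attribute names (X and Y in X -> Y)
    """
    seen = {}  # maps each X-value to the first Y-value observed for it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y on first sight of x, otherwise returns the stored value
        if seen.setdefault(x, y) != y:
            return False  # same X, different Y: the FD is violated
    return True


# Hypothetical sample state, loosely modelled on the {SSN, PNUMBER} -> HOURS example.
WORKS_ON = [
    {"ssn": "111", "pnumber": 1, "hours": 10},
    {"ssn": "111", "pnumber": 2, "hours": 20},
    {"ssn": "222", "pnumber": 1, "hours": 10},
]

key_determines_hours = fd_holds(WORKS_ON, ["ssn", "pnumber"], ["hours"])
ssn_determines_hours = fd_holds(WORKS_ON, ["ssn"], ["hours"])
```

With these rows, the composite key determines the hours, while the SSN alone does not, because employee 111 appears with two different hour values.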

Acceptable FDs
Acceptable FDs always involve keys
X[x] → Y[y] is ok if (and only if):
(1) x or a subset of x is a key (primary or unique) and X == Y, or
(2) x or a subset of x is a foreign key pointing at a key of Y

Finding FDs
- We cannot define functional dependencies without knowing what the attributes in our data mean
- However, given a valid database state, we can rule out certain FDs

In-Class Exercise - Functional Dependencies
STUDENT(personnr, name, birthday)
ENROLLMENT(id, student_id, course_name, student_name)
    student_id -> STUDENT.personnr
• Identify one (ok) functional dependency in STUDENT
• Identify one (ok) functional dependency that spans both relations
• Identify one likely unwanted functional dependency

In-Class Exercise - Functional Dependencies
HOTEL(id, name, age, type)
ROOM(name, hotel_id, hotel_name, size, price)
    hotel_id -> HOTEL.id
• Identify one (ok) functional dependency in ROOM
• Identify one likely unwanted functional dependency

Normal Forms and Normalization
Normalization is the process of identifying FDs and refactoring a database schema so that (1) keys are properly identified and (2) unwanted FDs are removed.
Normal Forms: A database is said to be in normal form after normalization.
Typically: 1NF, 2NF, 3NF (there are more which we won’t cover)

First Normal Form (1NF)
Disallows
- Composite attributes
- Multivalued attributes
- Nested relations
This basically comes automatically with the relational model (RM), which does not support models that are not in 1NF

Second Normal Form (2NF)
Requires that every attribute that is not part of a primary key is fully functionally dependent on the primary key. That is, no attribute should have an FD on only a part of a composite primary key.

Third Normal Form (3NF)
Requires that no non-key attribute has a functional dependency on another non-key attribute.
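
The 3NF rule can be illustrated with a small sketch using Python's built-in `sqlite3` module. The tables `emp_flat`, `employee`, and `department` and their contents are hypothetical and not taken from the lecture. In `emp_flat`, the non-key attribute `dno` determines the non-key attribute `dname`, so the department name is stored once per employee; in the decomposed design it is stored once per department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Violates 3NF: dno -> dname is an FD between two non-key attributes.
    CREATE TABLE emp_flat (id INTEGER PRIMARY KEY, lname TEXT, dno INTEGER, dname TEXT);
    INSERT INTO emp_flat VALUES (1, 'Smith', 5, 'Research'), (2, 'Wong', 5, 'Research');

    -- Decomposed design: dname is stored once, keyed by dno.
    CREATE TABLE department (dno INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE employee (id INTEGER PRIMARY KEY, lname TEXT,
                           dno INTEGER REFERENCES department(dno));
    INSERT INTO department VALUES (5, 'Research');
    INSERT INTO employee VALUES (1, 'Smith', 5), (2, 'Wong', 5);
""")

# Rename department 5, but (by mistake) only through one employee row.
conn.execute("UPDATE emp_flat SET dname = 'R&D' WHERE id = 1")
# In the decomposed design there is only one place to update.
conn.execute("UPDATE department SET dname = 'R&D' WHERE dno = 5")

# Distinct names now recorded for department 5 in each design.
flat_names = sorted({r[0] for r in conn.execute(
    "SELECT dname FROM emp_flat WHERE dno = 5")})
normalized_names = sorted({r[0] for r in conn.execute(
    "SELECT d.dname FROM employee e JOIN department d ON e.dno = d.dno")})
```

After the updates, `flat_names` contains two conflicting names for the same department (an update anomaly), while `normalized_names` contains exactly one.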
Basically: Avoid unwanted redundancies

More on normal forms
In the book you find (much) more formal definitions and concrete algorithms for normalization. Plus: more normal forms
In practice you will be fine if you follow these informal guidelines:
- Define relations / attributes in a way that makes domain sense
- Avoid redundancy
- Define primary keys

Kahoot Quiz

What we will be covering
Transactions

Multi-User Databases
Single-user DBMS
- At most one user at a time can use the system
- So far we have used our DBMS in this way
Multi-user DBMS
- Many users can access the DBMS concurrently
- So far we did not think about the potential of other users running competing updates in parallel

Concurrency in DBMSs
- DBMSs are typically designed to support multiple concurrent users
- Transactions are a way to ensure consistency of interleaved processes

Example [figure]

Transactions
A transaction is an atomic set of statements:
- Either all statements should be executed, or none of them
- No other statements should be executed between statements in a transaction

Transaction Boundaries
Transactions are typically defined through transaction boundaries
Three kinds of markings:
- BEGIN TRANSACTION
- COMMIT TRANSACTION (save changes to database)
- ROLLBACK TRANSACTION (discard changes)

Transactions in SQL
- BEGIN is used to start a new transaction
- COMMIT is used to save changes to disk
- ROLLBACK is used to undo changes

In Postgres
For example:
    BEGIN;
    INSERT INTO …
    INSERT INTO …
    INSERT INTO …
    COMMIT;

ACID Properties of SQL Transactions
- Atomicity: All changes are applied, or nothing is applied
- Consistency: Database is always in a consistent state (no “in-between” time)
- Isolation: Concurrency control guarantees that concurrent transactions do not interfere with each other
- Durability: Once a change is applied, it remains even through failures

Some Transactional Problems: Lost Updates [figure]

Some Transactional Problems: Temporary Updates [figure]
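
The BEGIN / COMMIT / ROLLBACK boundaries and the atomicity property described above can be tried out with a minimal sketch. It uses Python's built-in `sqlite3` module rather than Postgres, so that it runs without a server; the `account` table, the `transfer` function, and the balances are hypothetical and not part of the lecture. The connection is opened with `isolation_level=None` so that the module does not start transactions implicitly and the explicit boundaries below are the only ones in effect.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst`; either both updates apply or neither."""
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (src,)).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute("COMMIT")    # save both updates
        return True
    except ValueError:
        conn.execute("ROLLBACK")  # discard both updates
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

Calling `transfer(conn, 1, 2, 30)` on a fresh database commits both updates, while a transfer that would overdraw the source account is rolled back and leaves both balances unchanged.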
