Introduction to Database Management Systems
Relation Normalization
Dave McDonald, CIS, GSU

Outline
• Why Normalization?
• Functional Dependencies
• First, Second, and Third Normal Forms
• Boyce/Codd Normal Form
• Fourth and Fifth Normal Forms
• Nonloss Decomposition
• Summary

Why Normalization?
• An ill-structured relation contains redundant data.
• Data redundancy causes modification anomalies:
  – Insertion anomalies: suppose we want to enter SCUBA as an activity that costs $100. We cannot do so until a student signs up for it.
  – Update anomalies: if we change the price of swimming for student 150, there is no guarantee that student 200 will pay the new price.
  – Deletion anomalies: if we delete student 100, we lose not only the fact that he/she is a skier, but also the fact that skiing costs $200.
• Normalization is the process used to remove modification anomalies.

ACTIVITY
SID   Activity   Fee
100   Skiing     200
150   Swimming    50
175   Squash      50
200   Swimming    50

How can this table be changed to fix these problems?

Why Normalization? (cont’d)

Course
SID   Name     Grade   Course#   Text   Major   Dept
s1    Joseph   A       CIS8110   b1     CIS     CIS
s1    Joseph   B       CIS8120   b2     CIS     CIS
s1    Joseph   A       CIS8140   b5     CIS     CIS
s2    Alice    A       CIS8110   b1     CS      MCS
s2    Alice    A       CIS8140   b5     CS      MCS
s3    Tom      B       CIS8110   b1     Acct    Acct
s3    Tom      B       CIS8140   b5     Acct    Acct
s3    Tom      A       CIS8680   b1     Acct    Acct

• Is there any redundant data?
• Insertion anomalies?
• Update anomalies?
• Deletion anomalies?

Functional Dependencies
Given two attributes, X and Y, of a relation R, Y is functionally dependent on X iff each X value in R is always associated with the same Y value.
    R.X --> R.Y   or   X --> Y
List all FDs in the Course relation:

Functional Dependencies (cont’d)
• X is called the determinant of Y.
• X and Y may be composite.
• Dependency relationships change with attribute semantics.
• X and Y may be mutually dependent: Husband --> Wife and Wife --> Husband, written Husband <--> Wife.
• X may or may not be the key attribute of R.
• A Y value can occur in more than one tuple in R, e.g. Course# --> Text.

Fully Functional Dependencies
A fully functional dependency (FFD) exists between attributes X and Y if Y is functionally dependent on X but not on any proper subset of X.
• (SID, Course#) --> Name?
• (SID, Course#) --> Grade?
• (SID, Name) --> Major?
• (SID, Name) --> SID?
Note that if X is not composite, then X --> Y is always an FFD.
By default, the term FD refers to FFD.

Transitive Functional Dependencies
Given attributes X, Y, and Z of a relation R, Z is transitively dependent on X iff X --> Y and Y --> Z.
• Given SID --> Dept and Dept --> College, SID --> ?
• Given SID --> Major and Major --> Dept, SID --> ?

Graphical Representation
Course (SID, Name, Grade, Course#, Text, Major, Dept)
[Figure: dependency diagram of the Course relation, marking the primary key and the attributes SID, Course#, Name, Grade, Text, Major, and Dept]

First Normal Form (1NF)
A relation R is in 1NF iff all attribute domains contain atomic values only.
A relation in 1NF can still have modification anomalies.
INVENTORY (Part#, WHouse#, WAddress, QTY)
[Figure: dependency diagram of INVENTORY over Part#, WHouse#, WAddress, and QTY]

Second Normal Form (2NF)
A relation R is in 2NF iff R is in 1NF and every non-key attribute is fully dependent on the primary key (i.e. R has no partial functional dependencies).
The term "non-key attribute" refers to any attribute that does not belong to any candidate key.
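The FD, FFD, and 2NF definitions above can be tried out on the Course relation from the earlier slide. The following is a minimal sketch, not part of the original slides: the helper names fd_holds and partial_dependencies are hypothetical, and the code assumes (SID, Course#) is the only candidate key. Note that sample data can only refute an FD; whether an FD really holds depends on the semantics of the attributes.

```python
from itertools import combinations

ATTRS = ("SID", "Name", "Grade", "Course#", "Text", "Major", "Dept")

# Rows of the Course relation as shown on the slide.
COURSE = [dict(zip(ATTRS, row)) for row in [
    ("s1", "Joseph", "A", "CIS8110", "b1", "CIS", "CIS"),
    ("s1", "Joseph", "B", "CIS8120", "b2", "CIS", "CIS"),
    ("s1", "Joseph", "A", "CIS8140", "b5", "CIS", "CIS"),
    ("s2", "Alice", "A", "CIS8110", "b1", "CS", "MCS"),
    ("s2", "Alice", "A", "CIS8140", "b5", "CS", "MCS"),
    ("s3", "Tom", "B", "CIS8110", "b1", "Acct", "Acct"),
    ("s3", "Tom", "B", "CIS8140", "b5", "Acct", "Acct"),
    ("s3", "Tom", "A", "CIS8680", "b1", "Acct", "Acct"),
]]


def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y for a new x, otherwise returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True


def partial_dependencies(rows, key, attrs):
    """List (subset, attribute) pairs where a non-key attribute depends
    on a proper, non-empty subset of the key (a 2NF violation).
    Assumption: `key` is the only candidate key of the relation."""
    non_key = [a for a in attrs if a not in key]
    found = []
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            for attr in non_key:
                if fd_holds(rows, subset, (attr,)):
                    found.append((subset, attr))
    return found


if __name__ == "__main__":
    print(fd_holds(COURSE, ("SID", "Course#"), ("Grade",)))
    for subset, attr in partial_dependencies(COURSE, ("SID", "Course#"), ATTRS):
        print(", ".join(subset), "-->", attr)
```

On the sample rows, the sketch reports that Name, Major, and Dept depend on SID alone and that Text depends on Course# alone, so under the stated key assumption the Course relation is not in 2NF. Grade is the only non-key attribute that needs the whole key.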
Example: INVENTORY (Part#, WHouse#, WAddress, QTY)
[Figure: dependency diagram of INVENTORY over Part#, WHouse#, WAddress, and QTY]

Modification Anomalies in 2NF
2NF relations can still have modification anomalies:
• Redundant information?
• Update anomalies?
• Insertion anomalies?
• Deletion anomalies?
Which FD causes the redundant data?

INVENTORY
Part#   WHouse#   WAddress     QTY
123     4         Atlanta       10
456     5         Birmingham     6
456     2         Columbus      10
123     7         Oakland        8
235     1         Denver         2

Third Normal Form (3NF)
A relation R is in 3NF iff R is in 2NF and every non-key attribute is non-transitively dependent on the primary key.
Student (SID, Name, Major, Dept)
Discussion: If a relation does not have any non-key attribute, would it automatically be in 3NF?

Modification Anomalies in 3NF
LOCATION (Employee, Department, Location)
• Redundant information?
• Update anomalies?
• Insertion anomalies?
• Deletion anomalies?
• All determinants?
[Figure: dependency diagram of LOCATION over Employee, Department, and Location]

Boyce/Codd Normal Form (BCNF)
A relation R is in BCNF iff every determinant is a candidate key.
BCNF needs to be checked separately from 3NF when R has two or more candidate keys and
1. the candidate keys are composite, and
2. the candidate keys overlap.
ADVISE (Student, Major, Advisor)
[Figure: dependency diagram of ADVISE over Student, Advisor, and Major]

BCNF Example
Teach (Student, Course, Instructor)

Student   Course              Instructor
Narayan   Database            Mark
Smith     Database            Jeffries
Smith     Operating Systems   Ammar
Smith     Theory              Schulman
Wallace   Database            Mark
Wallace   Operating Systems   Ahamad
Wong      Database            Omiecinski
Zelaya    Database            Jeffries
Narayan   Operating Systems   Ammar

BCNF Example (cont’d)
[Figure: dependency diagram of Teach over Student, Instructor, and Course]
There are three possible decompositions:
• (Student, Instructor) and (Student, Course)
• (Course, Instructor) and (Course, Student)
• (Instructor, Course) and (Instructor, Student)
Which of the three will not generate spurious tuples after a join?

Boyce–Codd Normal Form (BCNF)
• The difference between 3NF and BCNF is that, for a functional dependency A → B, 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a candidate key.
• BCNF, in contrast, insists that for this dependency to remain in a relation, A must be a candidate key.
• Every relation in BCNF is also in 3NF. However, a relation in 3NF may not be in BCNF.
• Violation of BCNF is quite rare.
• The potential to violate BCNF may occur in a relation that:
  – contains two (or more) composite candidate keys;
  – has candidate keys that overlap (i.e. have at least one attribute in common).

Fourth Normal Form (4NF)
• Although BCNF removes anomalies due to functional dependencies, another type of dependency called a multi-valued dependency (MVD) can also cause data redundancy.
• The possible existence of MVDs in a relation is due to 1NF and can result in data redundancy.
• An MVD is a dependency between attributes (for example, A, B, and C) in a relation, such that for each value of A there is a set of values for B and a set of values for C. However, the sets of values for B and C are independent of each other.
• An MVD between attributes A, B, and C in a relation is written using the following notation:
    A --->> B
    A --->> C

Multi-Valued Dependencies and Fourth Normal Form
• A multi-valued dependency occurs when a determinant is associated with a set of values for each of two or more dependents, and the dependents are independent of each other.
• Ex.: course multi-determines teacher (course --->> teacher); course multi-determines text (course --->> text), where teacher and text are independent.
• A Relation with course, teacher, and text is all key, and exhibits redundancy, but is in 3NF.
  – R(course, teacher, text)
• Updates can exhibit anomalies.

Course    Teacher        Text
Physics   Prof. Greene   Basic Mechanics
Physics   Prof. Greene   Principles of Optics
Physics   Prof. Brown    Basic Mechanics
Physics   Prof. Brown    Principles of Optics
Math      Prof. Greene   Basic Mechanics
Math      Prof. Greene   Vector Analysis
Math      Prof. Greene   Trigonometry

Fourth Normal Form
• Relation R is in 4NF if and only if, whenever there exist subsets A and B of the attributes of R such that the nontrivial multi-valued dependency A --->> B (A multi-determines B) is satisfied, then all attributes of R are also functionally dependent on A.
• In the previous example, decompose (course, teacher, text) into two Relations: (course, teacher) and (course, text).

Course    Text
Physics   Basic Mechanics
Physics   Principles of Optics
Math      Basic Mechanics
Math      Vector Analysis
Math      Trigonometry

Course    Teacher
Physics   Prof. Greene
Physics   Prof. Brown
Math      Prof. Greene

4NF – Text Example
[Figure: 4NF text example]

Join Dependencies and Fifth Normal Form
• There exist Relations that cannot be nonloss-decomposed into two Relations, but can be nonloss-decomposed into more than two.
• Ex.: supplier, part, project
• A supplier supplies parts and projects, a project is supplied by suppliers and parts, but from this you may not validly conclude that a particular supplier supplies a particular part to a particular project.

Supplier#   Part#   Project#
S1          P1      J2
S1          P2      J1
S2          P1      J1
S1          P1      J1

[Figure: ternary relationship S-P-Pr among Supplier, Part, and Project, with cardinalities M, N, and O]

Join Dependency
• Let R be a Relation, and let A, B, … Z be subsets of the attributes of R. Then we say that R satisfies