Normalisation to 3NF

Database Systems
Lecture 11
Natasha Alechina

In This Lecture

• Normalisation to 3NF
  • Data redundancy
  • Functional dependencies
  • Normal forms
  • First, Second, and Third Normal Forms
• For more information
  • Connolly and Begg chapter 13
  • Ullman and Widom ch. 3.6.6 (2nd edition), 3.5 (3rd edition)

Redundancy and Normalisation

• Redundant data
  • Can be determined from other data in the database
  • Leads to various problems
    • INSERT anomalies
    • UPDATE anomalies
    • DELETE anomalies
• Normalisation
  • Aims to reduce data redundancy
  • Redundancy is expressed in terms of dependencies
  • Normal forms are defined that do not have certain types of dependency

First Normal Form

• In most definitions of the relational model
  • All data values should be atomic
  • This means that entries should be single values, not sets or composite objects
• A relation is said to be in first normal form (1NF) if all data values are atomic

Normalisation to 1NF

To convert to a 1NF relation, split up any non-atomic values

Unnormalised
Module  Dept  Lecturer  Texts
M1      D1    L1        T1, T2
M2      D1    L1        T1, T3
M3      D1    L2        T4
M4      D2    L3        T1, T5
M5      D2    L4        T6

1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6

Problems in 1NF
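The conversion to 1NF shown on the previous slide can be sketched in a few lines of Python. This is only an illustration: the helper name `to_1nf` and the assumption that texts are stored as a comma-separated string are not part of the lecture.

```python
# Hypothetical helper: emit one row per text for each unnormalised row.
def to_1nf(rows):
    """rows: list of (module, dept, lecturer, texts), texts like 'T1, T2'."""
    result = []
    for module, dept, lecturer, texts in rows:
        # Split the non-atomic Texts value into atomic Text values.
        for text in texts.split(","):
            result.append((module, dept, lecturer, text.strip()))
    return result

# The unnormalised table from the slide.
unnormalised = [
    ("M1", "D1", "L1", "T1, T2"),
    ("M2", "D1", "L1", "T1, T3"),
    ("M3", "D1", "L2", "T4"),
    ("M4", "D2", "L3", "T1, T5"),
    ("M5", "D2", "L4", "T6"),
]

one_nf = to_1nf(unnormalised)
for row in one_nf:
    print(*row)
```

Each module with two texts now appears in two rows, which is where the redundancy discussed next comes from.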

1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6

• INSERT anomalies
  • Can't add a module with no texts
• UPDATE anomalies
  • To change lecturer for M1, we have to change two rows
• DELETE anomalies
  • If we remove M3, we remove L2 as well

Functional Dependencies

• Redundancy is often caused by a functional dependency
• A functional dependency (FD) is a link between two sets of attributes in a relation
• We can normalise a relation by removing undesirable FDs
• A set of attributes, A, functionally determines another set, B, or: there exists a functional dependency between A and B (A → B), if whenever two rows of the relation have the same values for all the attributes in A, then they also have the same values for all the attributes in B.

Example
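The definition of A → B can be tested directly against the rows of a table. The sketch below does this for the 1NF module table from the earlier slides; the function name `fd_holds` is an assumption made for illustration. Note that a check like this can only show that an FD is not contradicted by the current rows; whether it holds in general is a design decision about the relation.

```python
ATTRS = ("Module", "Dept", "Lecturer", "Text")
# The 1NF table from the lecture, one dict per row.
one_nf = [dict(zip(ATTRS, r.split())) for r in [
    "M1 D1 L1 T1", "M1 D1 L1 T2", "M2 D1 L1 T1", "M2 D1 L1 T3",
    "M3 D1 L2 T4", "M4 D2 L3 T1", "M4 D2 L3 T5", "M5 D2 L4 T6",
]]

def fd_holds(rows, lhs, rhs):
    """True if rows that agree on lhs always agree on rhs."""
    seen = {}
    for row in rows:
        a_values = tuple(row[a] for a in lhs)
        b_values = tuple(row[b] for b in rhs)
        # Remember the first B-values seen for these A-values;
        # any later row with the same A-values must match them.
        if seen.setdefault(a_values, b_values) != b_values:
            return False
    return True

print(fd_holds(one_nf, ["Module"], ["Lecturer"]))  # holds in these rows
print(fd_holds(one_nf, ["Dept"], ["Lecturer"]))    # D1 has both L1 and L2
```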

• {ID, modCode} → {First, Last, modName}
• {modCode} → {modName}
• {ID} → {First, Last}

ID   First  Last    modCode  modName
111  Joe    Bloggs  G51PRG   Programming
222  Anne   Smith   G51DBS

FDs and Normalisation

• We define a set of 'normal forms'
  • Each normal form has fewer FDs than the last
  • Since FDs represent redundancy, each normal form has less redundancy than the last
• Not all FDs cause a problem
  • We identify various sorts of FD that do
  • Each normal form removes a type of FD that is a problem
  • We will also need a way to remove FDs

Properties of FDs

• In any relation
  • The primary key functionally determines any set of attributes in that relation: K → X
    • K is the primary key, X is a set of attributes
  • Same for candidate keys
  • Any set of attributes functionally determines itself: X → X
• Rules for FDs
  • Reflexivity: If B is a subset of A then A → B
  • Augmentation: If A → B then A ∪ C → B ∪ C
  • Transitivity: If A → B and B → C then A → C

FD Example
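The rules for FDs can be applied mechanically. One common way to do so, which the slides do not cover, is to compute the closure of a set of attributes: start from the set itself (reflexivity) and keep adding the right hand side of every FD whose left hand side is already included (augmentation and transitivity). The sketch below assumes the FDs {Module} → {Lecturer} and {Lecturer} → {Dept} from the module example; the function name `closure` is chosen for illustration.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs.

    fds is a list of (lhs, rhs) pairs, each a set of attribute names.
    """
    result = set(attrs)  # reflexivity: attrs determines itself
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already determine lhs, we also determine rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [({"Module"}, {"Lecturer"}), ({"Lecturer"}, {"Dept"})]

print(sorted(closure({"Module"}, fds)))
print(sorted(closure({"Module", "Text"}, fds)))
```

A set of attributes whose closure is the whole scheme determines every attribute, which is the property K → X stated above for keys.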

1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6

• The primary key is {Module, Text} so {Module, Text} → {Dept, Lecturer}
• 'Trivial' FDs, e.g.:
  • {Text, Dept} → {Text}
  • {Module} → {Module}
  • {Dept, Lecturer} → { }

FD Example

1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6

• Other FDs are
  • {Module} → {Lecturer}
  • {Module} → {Dept}
  • {Lecturer} → {Dept}
• These are non-trivial and determinants (left hand side of the dependency) are not keys.

Partial FDs and 2NF

• Partial FDs:
  • A FD, A → B is a partial FD, if some attribute of A can be removed and the FD still holds
  • Formally, there is some proper subset of A, C ⊂ A, such that C → B
  • Let us call attributes which are part of some candidate key, key attributes, and the rest non-key attributes.
• Second normal form:
  • A relation is in second normal form (2NF) if it is in 1NF and no non-key attribute is partially dependent on a candidate key
  • In other words, no C → B where C is a strict subset of a candidate key and B is a non-key attribute.

Second Normal Form
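Whether a relation breaks the 2NF condition can be checked from its FDs: look for a strict subset of the key that already determines some non-key attribute. The sketch below is a simplification that assumes a single candidate key; the names `closure` and `partial_dependencies` are illustrative and not from the lecture.

```python
from itertools import combinations

def closure(attrs, fds):
    """All attributes determined by attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def partial_dependencies(key, all_attrs, fds):
    """Pairs (C, B): C is a strict subset of key, B the non-key attributes it determines."""
    non_key = set(all_attrs) - set(key)
    found = []
    for size in range(1, len(key)):  # strict, non-empty subsets only
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) & non_key
            if determined:
                found.append((set(subset), determined))
    return found

fds = [({"Module"}, {"Lecturer"}), ({"Lecturer"}, {"Dept"})]
attrs = {"Module", "Dept", "Lecturer", "Text"}

# Key {Module, Text}: {Module} alone determines Lecturer and Dept.
print(partial_dependencies({"Module", "Text"}, attrs, fds))
```

An empty result means no partial dependency was found for that key, so (under the single-key assumption) the relation satisfies the 2NF condition.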

1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6

• The relation 1NF is not in 2NF
  • We have the FD {Module, Text} → {Lecturer, Dept}
  • But also {Module} → {Lecturer, Dept}
  • And so Lecturer and Dept are partially dependent on the primary key

Removing FDs

• Suppose we have a relation R with scheme S and the FD A → B where A ∩ B = { }
• Let C = S – (A ∪ B)
• In other words:
  • A – attributes on the left hand side of the FD
  • B – attributes on the right hand side of the FD
  • C – all other attributes
• It turns out that we can split R into two parts:
  • R1, with scheme C ∪ A
  • R2, with scheme A ∪ B
• The original relation can be recovered as the natural join of R1 and R2:
  • R = R1 NATURAL JOIN R2

1NF to 2NF – Example

1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6

2NFa
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4

2NFb
Module  Text
M1      T1
M1      T2
M2      T1
M2      T3
M3      T4
M4      T1
M4      T5
M5      T6

Problems Resolved in 2NF
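The 1NF to 2NF split in the preceding example follows the recipe from the Removing FDs slide with A = {Module}, B = {Dept, Lecturer} and C = {Text}. As a rough check that nothing is lost, the sketch below projects the 1NF table onto the two schemes and joins the results back together. The helpers `project`, `natural_join` and `as_set` are hypothetical names written for this illustration, not library functions.

```python
ATTRS = ("Module", "Dept", "Lecturer", "Text")
one_nf = [dict(zip(ATTRS, r.split())) for r in [
    "M1 D1 L1 T1", "M1 D1 L1 T2", "M2 D1 L1 T1", "M2 D1 L1 T3",
    "M3 D1 L2 T4", "M4 D2 L3 T1", "M4 D2 L3 T5", "M5 D2 L4 T6",
]]

def project(rows, attrs):
    """Keep only attrs from each row and drop duplicate rows."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result

def natural_join(r1, r2):
    """Combine every pair of rows that agree on all shared attributes."""
    result = []
    for a in r1:
        for b in r2:
            shared = a.keys() & b.keys()
            if all(a[c] == b[c] for c in shared):
                merged = {**a, **b}
                if merged not in result:
                    result.append(merged)
    return result

def as_set(rows):
    """Order-independent view of a table, for comparing two tables."""
    return {tuple(sorted(row.items())) for row in rows}

two_nf_a = project(one_nf, ["Module", "Dept", "Lecturer"])  # R2: A ∪ B
two_nf_b = project(one_nf, ["Module", "Text"])              # R1: C ∪ A
rejoined = natural_join(two_nf_a, two_nf_b)

print(as_set(rejoined) == as_set(one_nf))
```

The comparison is done on sets of rows because a relation has no row order.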

• Problems in 1NF
  • INSERT – Can't add a module with no texts
  • UPDATE – To change lecturer for M1, we have to change two rows
  • DELETE – If we remove M3, we remove L2 as well
• In 2NF the first two are resolved, but not the third one

2NFa
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4

Problems Remaining in 2NF

2NFa
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4

• INSERT anomalies
  • Can't add lecturers who teach no modules
• UPDATE anomalies
  • To change the department for L1 we must alter two rows
• DELETE anomalies
  • If we delete M3 we delete L2 as well

Transitive FDs and 3NF

• Transitive FDs:
  • A FD, A → C is a transitive FD, if there is some set B such that A → B and B → C are non-trivial FDs
  • A → B non-trivial means: B is not a subset of A
  • We have A → B → C
• Third normal form:
  • A relation is in third normal form (3NF) if it is in 2NF and no non-key attribute is transitively dependent on a candidate key

Third Normal Form
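A transitive dependency on the key can be spotted by looking for an FD B → C whose left hand side is neither part of the key nor a key itself. The sketch below makes two assumptions beyond the slide: it considers a single candidate key, and it skips any B that determines the whole scheme (textbook definitions usually exclude that case). The names `closure` and `transitive_dependencies` are illustrative only.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def transitive_dependencies(key, all_attrs, fds):
    """Pairs (B, c): non-key attribute c depends on the key via the set B."""
    key, all_attrs = set(key), set(all_attrs)
    found = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) >= all_attrs
        # Skip FDs from a key, and FDs from part of the key (those are partial FDs).
        if is_superkey or lhs <= key:
            continue
        for attr in sorted(rhs - lhs - key):
            found.append((lhs, attr))
    return found

fds = [({"Module"}, {"Lecturer"}), ({"Lecturer"}, {"Dept"})]

# 2NFa(Module, Dept, Lecturer) with key {Module}: Dept depends on Module via Lecturer.
print(transitive_dependencies({"Module"}, {"Module", "Dept", "Lecturer"}, fds))
```

An empty result means this check found no transitive dependency on the given key.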

2NFa
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4

• 2NFa is not in 3NF
  • We have the FDs
    {Module} → {Lecturer}
    {Lecturer} → {Dept}
  • So there is a transitive FD from the primary key {Module} to {Dept}

2NF to 3NF – Example

2NFa
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4

3NFa
Lecturer  Dept
L1        D1
L2        D1
L3        D2
L4        D2

3NFb
Module  Lecturer
M1      L1
M2      L1
M3      L2
M4      L3
M5      L4

Problems Resolved in 3NF

• Problems in 2NF
  • INSERT – Can't add lecturers who teach no modules
  • UPDATE – To change the department for L1 we must alter two rows
  • DELETE – If we delete M3 we delete L2 as well
• In 3NF all of these are resolved (for this relation, but 3NF can still have anomalies!)

3NFa
Lecturer  Dept
L1        D1
L2        D1
L3        D2
L4        D2

3NFb
Module  Lecturer
M1      L1
M2      L1
M3      L2
M4      L3
M5      L4

Normalisation and Design

• Normalisation is related to DB design
  • A database should normally be in 3NF at least
  • If your design leads to a non-3NF DB, then you might want to revise it
• When you find you have a non-3NF DB
  • Identify the FDs that are causing a problem
  • Think if they will lead to any insert, update, or delete anomalies
  • Try to remove them

Next Lecture

• More normalisation
  • Lossless decomposition; why our reduction to 2NF and 3NF is lossless
  • Boyce-Codd normal form (BCNF)
  • Higher normal forms
  • Denormalisation
• For more information
  • Connolly and Begg chapter 14
  • Ullman and Widom chapter 3.6