Relational Database Is a Digital Database Based on the Relational Model of Data, As Proposed by E

1 LECTURE #3: DATABASES, DATABASES, DATABASES
E4520: Data Science for Mechanical Systems
Instructor: Josh Browne, PhD
Guest Lecturer: Gilman Callsen
Feb 5, 2020

2 Gilman Callsen
[email protected]
• “Entrepreneur with a penchant for technology companies.”
• Yale, BA Psychology
• Started as a physics major, though!
• Databases have been a big part of my entire career.
• Not a database administrator.

3 Basic Timeline
• Late 90s & Early 00s: Websites
• Chromic Décor (2006)
• MC10 (2008)
• Pit Rho (2012)
• Rho AI (2016)

4 I learned the value of databases pretty quickly at this point...

5 PREP

6 Pair Up
We will do some thought experiments and actual coding throughout. Make sure:
• You’re next to at least one person you can talk to/brainstorm with
• You have a computer or are near a computer that got through pre-class prep

7 Let’s Get Those Laptops Out
    cd ~/Documents
    git clone https://github.com/gcallsen/database-class-examples.git
    git clone https://github.com/rhoai/python-dev.git
    cd ~/Documents/python-dev
    docker-compose up -d
    docker run -it --rm --net python-dev_default -v ~/Documents/database-class-examples:/code rhoai/python-dev:v0.1.0

8 OVERVIEW

9 Data, Data, Data
Who remembers what Erik Allen (Lecture #1) said about “Data Collection and Preparation”?

10 Data Collection and Preparation
• Be prepared to spend a lot of time (~80%) on data collection and cleaning
• If you’ve got a data set, be very very grateful!
• Expect to be a partner in the data generation process
• Have a full tool belt of models, to prepare for a paucity of data early in the process

11 Where Do the Data Live?
With that in mind...where do you think all those collected and prepared data live?

12 You’re Right!
DATABASES!

13 Cooking Analogy
As we go through this lecture, keep cooking in mind:
• Grocery Store
• Blue Apron
• Family Cook

14 Question
What is a Database?

15 What is a Database?
• An organized collection of data?
Diagram labels: Data, Database, Do Stuff

16 What is a Database?
• An organized collection of data?
Diagram labels: Grocery Store, Blue Apron, Collection of Ingredients, Chef, The Food

17 Where are Databases Used?
1. Name an industry/profession/etc.
2. Brainstorm some “data” they might have

18 Where are Databases Used?
• Everywhere!
• Databases are the backbone of nearly every digital ‘thing’ we interact with today.
• Examples
  • Software engineer
  • Mechanical engineer
  • Business person
  • Marketing
  • Generic consumer
  • Even Schools...

19 https://xkcd.com/327/

20 Data from Mechanical Systems?
1. What sorts of data might mechanical systems produce?

21 DBs for DS in Mechanical Systems?
• Types of data and application needs will vary wildly
  • Super fast (real-time mechanical systems)
  • Highly accurate
  • Complex connections
  • Research
  • Development
• It is unlikely anyone here will be a DBA, but an understanding of how to think about databases goes a long way
• Rubicon Global (waste management)
  • Automated pickup detection - on board vs cloud
  • Video processing
  • Accelerometer data
  • GPS location + time

22 Takeaways
1. Nearly every industry in the world produces data
2. Those data need to be stored somewhere to be useful

23 Types of Databases
• No silver bullet
• An incredibly wide array of databases exists, all with strengths and weaknesses
• Every situation requires considering what mix of DBs to use.
• Polyglot persistence
• Grocery Store vs Blue Apron vs Cook
Diagram labels: Non-Relational, Relational
Why are there only two ‘types’ here when our analogy has 3 ‘types’?

24 Relational Overview
• A relational database is a digital database based on the relational model of data, as proposed by E. F. Codd in 1970.
• Virtually all relational database systems use SQL (Structured Query Language) for querying and maintaining the database.
Straight from Wikipedia: https://en.wikipedia.org/wiki/Relational_database

25 Non-Relational Overview
• A NoSQL (originally referring to "non SQL" or "non relational") database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases.
• NoSQL databases are increasingly used in big data and real-time web applications.
• NoSQL systems are also sometimes called "Not only SQL" to emphasize that they may support SQL-like query languages, or sit alongside SQL databases in a polyglot persistence architecture.
Straight from Wikipedia: https://en.wikipedia.org/wiki/NoSQL

26 https://www.alooma.com/blog/types-of-modern-databases

27 In Practice
Amazon.com
How do you think they store your data? (What data do they store, and in what type of database(s)?)
https://neo4j.com/blog/neo4j-doc-manager-polyglot-persistence-mongodb/

28 In Practice
• In most cases, more than one database is used!
Polyglot Persistence
https://neo4j.com/blog/neo4j-doc-manager-polyglot-persistence-mongodb/

29 RELATIONAL
A man walks into a bar and sees two tables. He says, “May I join you?”

30 https://www.alooma.com/blog/types-of-modern-databases

31 Summary Version
• A database is a means of storing information in such a way that information can be retrieved from it.
• A relational database is one that presents information in tables with rows and columns.
• A table is a collection of objects of the same type (rows).
• Data in a table can be related to other tables (typically using ‘keys’)
• The ability to retrieve these related data gives us the term relational database.

32 Relational Databases
• Core concepts for today
  • Tables
  • Normalization
  • SQL query language

33 Question
How would you describe the contents of a grocery store in an Excel spreadsheet? What “columns” would it have?
34 Tables
• Grocery store

Products
  id           int
  name         string
  price        float
  description  string
  department   string

Products
  id  name     price  description          department
  1   Apple    1.99   Delicious Apple      Produce
  2   Banana   3.49   Bunch of Bananas     Produce
  3   Bread    3.99   Loaf of whole wheat  Bakery
  4   Cheddar  2.79   Sliced cheddar       Deli

35 Normalization
• Database normalization is a technique for organizing the data in the database.
• Basically, you break apart tables to:
  • eliminate data redundancy and
  • reduce malformed data when performing CRUD* operations
• Importantly, this allows the database itself to enforce data integrity.
• Once you’ve done this, you need the concept of ‘joins’.
• To perform a join you need two items:
  • two tables and a join condition
  • the tables contain the rows to be combined, and the join condition gives the instructions for matching rows together
*CRUD - Create Read Update Delete

36 Things can get crazy...

37 Question
What column(s) from our previous table are good candidate(s) for normalization?
Products columns: id, name, price, description, department

38 Basic Normalization
• Grocery store

Departments
  id           int
  name         string
  description  string

Products
  id             int
  department_id  int
  name           string
  price          float
  description    string

39 Basic Normalization
• Grocery store

Departments
  id  name     description
  1   Produce  Healthy stuff!
  2   Bakery   Breads and goodies
  3   Deli     Meats, cheeses, etc

Products
  id  name     price  description          department_id
  1   Apple    1.99   Delicious Apple      1
  2   Banana   3.49   Bunch of Bananas     1
  3   Bread    3.99   Loaf of whole wheat  2
  4   Cheddar  2.79   Sliced cheddar       3

40 SQL
• The fundamentals of most SQL dialects are the same
• Variations exist based on the database’s functionality
• https://www.w3schools.com/sql/sql_intro.asp
  • Worth going through that as a primer

41 SQL Common Commands
• SELECT - extracts data from a database
• UPDATE - updates data in a database
• DELETE - deletes data from a database
• INSERT INTO - inserts new data into a database
• CREATE DATABASE - creates a new database
• ALTER DATABASE - modifies a database
• CREATE TABLE - creates a new table
• ALTER TABLE - modifies a table
• DROP TABLE - deletes a table
• CREATE INDEX - creates an index (search key)
• DROP INDEX - deletes an index

42 Let’s Go to the Grocery Store
• Start databases.
  • Go to the `python-dev` folder
  • docker-compose up -d
  • docker ps -a
• Get inside your python-dev docker container
  • docker run -it --rm --net python-dev_default -v ./database-class-examples:/code rhoai/python-dev:v0.1.0
• Connect to postgres (the password may be “postgres”)
  • psql -h postgres -U postgres -d postgres_db
  • \d → nothing → \q
• Seed tables
  • python ./code/ex_postgres.py

43 SQL On Our Tables

Command                          Outcome
\d+                              Describe the tables/relations
SELECT * FROM products;          Get all the Products
SELECT * FROM departments;       Get all the Departments
SELECT p.name, d.name, price     Get all of the products; display their name, price,
FROM products p                  and the name of the department they are in.
FULL OUTER JOIN departments d
ON d.id = p.department_id;

44

45 BREAK
A programmer’s wife sends him to the grocery store with the instructions, “get a loaf of bread and, if they have eggs, get a dozen.” He comes home with a dozen loaves of bread and tells her, “they had eggs.”

46 NON-RELATIONAL
Database Admins walked into a NoSQL bar. …a little while later they walked out because they couldn’t find a table.
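
Looking back at the relational half: the normalized grocery-store tables and the join from “SQL On Our Tables” can also be tried without the Docker/Postgres setup. The following is only a minimal sketch under stated assumptions. It uses Python's built-in sqlite3 module as a stand-in for the class's Postgres container, the function names are hypothetical, and it uses a plain JOIN where the slide uses FULL OUTER JOIN (for this data both return the same four rows, since every product has a department and every department has a product). It is not the course's ex_postgres.py script.

```python
import sqlite3


def build_grocery_db():
    """Create the normalized Departments/Products tables in an in-memory SQLite DB."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE departments (
            id          INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            description TEXT
        );
        CREATE TABLE products (
            id            INTEGER PRIMARY KEY,
            department_id INTEGER NOT NULL REFERENCES departments(id),
            name          TEXT NOT NULL,
            price         REAL,
            description   TEXT
        );
        """
    )
    # Rows copied from the "Basic Normalization" slide.
    conn.executemany(
        "INSERT INTO departments (id, name, description) VALUES (?, ?, ?)",
        [
            (1, "Produce", "Healthy stuff!"),
            (2, "Bakery", "Breads and goodies"),
            (3, "Deli", "Meats, cheeses, etc"),
        ],
    )
    conn.executemany(
        "INSERT INTO products (id, department_id, name, price, description) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, "Apple", 1.99, "Delicious Apple"),
            (2, 1, "Banana", 3.49, "Bunch of Bananas"),
            (3, 2, "Bread", 3.99, "Loaf of whole wheat"),
            (4, 3, "Cheddar", 2.79, "Sliced cheddar"),
        ],
    )
    conn.commit()
    return conn


def products_with_departments(conn):
    """Join each product to its department: (product name, department name, price)."""
    return conn.execute(
        """
        SELECT p.name, d.name, p.price
        FROM products p
        JOIN departments d ON d.id = p.department_id
        ORDER BY p.id
        """
    ).fetchall()


def orphan_product_is_rejected(conn):
    """Return True if the database refuses a product whose department does not exist."""
    try:
        conn.execute(
            "INSERT INTO products (id, department_id, name, price, description) "
            "VALUES (99, 42, 'Mystery item', 0.0, 'No such department')"
        )
    except sqlite3.IntegrityError:
        return True
    return False


if __name__ == "__main__":
    db = build_grocery_db()
    for row in products_with_departments(db):
        print(row)
    print("orphan rejected:", orphan_product_is_rejected(db))
```

The last helper illustrates the normalization slide's point that the database itself can enforce data integrity: once department_id is declared as a reference to departments, an insert that points at a missing department fails instead of silently storing malformed data.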
47 https://www.alooma.com/blog/types-of-modern-databases

48 Summary Version
• A database is a means of storing information in such a way that information can be retrieved from it.
• A non-relational database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases.
• There are a lot of options. Why?
  • Data do not always fit nicely into columns and rows
  • Wide array of use cases that can have highly optimized solutions (e.g. time-series data)
  • Scalability (horizontal scalability)
  • Often, CAP Theorem is at play

49 What Doesn’t Fit in a Table?
Think of some examples of data that would be difficult to represent in a relational database
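
One possible answer to the question above, offered only as an illustrative sketch: a nested record with a variable-length list inside it. The pickup event below is hypothetical. Every field name and value is invented for illustration, loosely inspired by the accelerometer and GPS data mentioned earlier in the lecture. Stored as a single document it keeps its shape. Flattened into rows, the event-level fields repeat on every sample (or must be split into a second table, as in the normalization slides).

```python
import json

# Hypothetical pickup event; all names and numbers are made up for illustration.
event = {
    "truck_id": "truck-17",
    "timestamp": "2020-02-05T09:30:00Z",
    "gps": {"lat": 40.8075, "lon": -73.9626},
    # Variable-length list: one event may carry 3 samples, the next 3,000.
    "accelerometer": [
        {"t_ms": 0, "x": 0.01, "y": -0.02, "z": 9.81},
        {"t_ms": 10, "x": 0.40, "y": -0.10, "z": 9.60},
        {"t_ms": 20, "x": 1.20, "y": 0.30, "z": 8.90},
    ],
}


def flatten_event(ev):
    """Turn one nested event into flat rows, one row per accelerometer sample."""
    return [
        (
            ev["truck_id"],
            ev["timestamp"],
            ev["gps"]["lat"],
            ev["gps"]["lon"],
            sample["t_ms"],
            sample["x"],
            sample["y"],
            sample["z"],
        )
        for sample in ev["accelerometer"]
    ]


# Document-style storage: the whole event is one self-contained JSON string.
document = json.dumps(event)

# Table-style storage: the truck, timestamp and GPS values repeat on every row.
rows = flatten_event(event)

if __name__ == "__main__":
    print(len(document), "characters as one document")
    for row in rows:
        print(row)
```

Neither form is better in general. The sketch only shows why data shaped like this is awkward to hold in a single table.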