Database Design

Database design - Satya
Satya © 2004

Architect’s buzZ word

"Normalization is a logical concept, performance is determined at the physical level. Therefore, it is impossible to denormalize for performance."
Fabian Pascal – co-founder & editor of Database Debunkings (dbdebunk.com)

Architect’s buzZ word

“Denormalization, if necessary, should be done at the level of stored files, not at the level of base relvars”
“denormalization, is not ‘good for performance’, it is good for the performance of specific applications”
Chris J. Date – most respected database expert in the computer industry; author, Database Systems.

Background
• Presenting concepts, not syntax.
• Presenting “How” & “What”, not “Why”, in the RDBMS.

Agenda
• Introduction

Ready for the data eXplosion?
[Figure: timeline of information milestones plotted against data volume in gigabytes: cave paintings and bone tools (40,000 BCE), writing (3500), paper (105), printing (1450), electricity and telephone (1870), transistor (1947), computing (1950), Internet (DARPA, late 1960s), the web (1993); 1½B gigabytes in 1999. Source: UC Berkeley]

The coming content - “Big Bang”
[Figure: the same timeline extended past 1999, in gigabytes: 3B in 2000, 6B in 2001, 12B in 2002, 24B in 2003. Source: UC Berkeley]

Future data size
• Terabytes of data
  – Common corporate expression
  – Petabytes (10^15) & Exabytes (10^18) are fast approaching
• 2-3 Exabytes = total volume of all information generated worldwide annually
  – Need structure to efficiently handle large data.
Source: 2001 - IBM Informix Conference, Las Vegas.

Database – the solution
Database? An Organized Store of Information
– Flat Files: Adabas, FileMaker
– Hierarchical Databases: IBM’s Information Management System (IMS) – used in Apollo Moon Landing.
– Network Databases: GE’s Integrated Data Store (IDS)
– Relational Databases: Oracle, Db2, Sybase, MS SQL, Postgres
– Object Relational Databases: Oracle 9
– Object Databases: Cloudscape

Database development activities during the systems development life cycle (SDLC)
• Project Identification and Selection (enterprise modeling)
• Project Initiation and Planning (conceptual data modeling)
• Analysis
• Logical Design
• Physical Design
• Implementation
• Maintenance

Database Application Lifecycle
• Database planning
• Systems definition
• Requirements analysis
• DB design: conceptual design, logical design, physical design (alongside DBMS selection and application design)
• Prototyping
• Implementation
• Data loading / migration
• Testing
• Maintenance

Database design flow
• Conceptual design (DBMS independent)
  – Data analysis and requirements: Determine end user views, outputs, and transaction processing requirements.
  – Entity relationship modeling and normalization: Define entities, attributes and relationships. Draw ER diagrams. Normalize tables.
  – Data model verification: Identify main processes, insert, update and delete rules. Validate reports, queries, views, integrity, sharing and security.
  – Distributed database design: Define location of tables, access requirements and fragmentation strategy.
• DBMS software selection
• Logical design (DBMS dependent)
  – Translate the conceptual model into definitions for tables, views and so on…
• Physical design (Hardware dependent)
  – Define storage structures and access paths for optimum performance.

Design Approach
• Entity-Relationship (ER) data modeling
  – A graphical technique for understanding and organizing the data independently of the eventual database implementation
• Normalization
  – An algorithmic process for evaluating the quality of a database design - most applicable to relational database designs
• Types of Models
  – Models (of databases or anything else) can be built at different levels of abstraction
  – For databases (following the text):
    • Conceptual – logical → ER Models (represent semantics)
    • Internal - for the chosen DBMS
    • External - the way the user sees the data
    • Physical - for the actual physical storage

ER Modeling

ER Modeling concepts
Entity-Relationship Basics
• The concepts upon which ER models are built are:
  – Entities (or, more correctly, entity types)
    – also called “relvars”, “base relvars” or “relations”
    – at the physical implementation level, called tables
  – Relationships (between entities)
  – Attributes (of entities and relationships)

Entities & Entity types
• An entity is “A person, place, event, or thing for which we intend to collect data”
• Normally a database will contain data about groups of similar entities (e.g. students, subjects, licenses, aircraft or whatever)
• These groups of similar entities are referred to as entity types, but often this is shortened to just “entity” or “entities”

Entity types & Attributes
• Entity types are conventionally named in the singular
• Attributes are represented on ER diagrams as ellipses attached to the relevant entity type symbol
• There are other notations as well (e.g. a list of attributes next to the entity type symbol) but they are conceptually equivalent
[Figure: entity type student with attributes Number, Name, DOB, Address and Gender, shown in both notations]

Relationships
• A relationship is an association between entity types
• Relationships are represented by diamond shaped symbols on ER diagrams
• A descriptive name is placed inside the relationship symbol
[Figure: student – enrols in – subject]

Relationship & entity
• Entity type names are usually nouns
• Relationship names are usually, though not always, verbs (or verb phrases)
• Most relationships are binary (i.e. connect 2 entity types) - like “enrolls in”
• Other types of relationships are possible

Degree of a Relationship
• The degree of a relationship is the number of entity type(s) that it connects
  – One: Unary
  – Two: Binary
  – Three: Ternary
• Relationships of degree higher than three are rare
[Figure: binary relationship: student – enrolls – subject; unary relationship: employee – supervises; ternary relationship: sale connecting item, vendor and purchaser, shown as equal to three binary relationships linking sale to item, vendor and purchaser]

Relationship Connectivity (Cardinality)
• Relationships can have different connectivity(s)
  – one-to-one (1:1)
  – one-to-many (1:N)
  – many-to-many (M:N)
• Indicated on the ER diagram by placing an appropriate symbol on each “leg” of the relationship
[Figure: student – enrolls – subject (M:N); employee – supervises, with the “1” leg labelled supervisor; lecturer – teaches – subject (1:N)]

Minimum and maximum cardinalities for a relationship R between entity types E and F:
• One-to-one relationship: min-card(E,R)=0, max-card(E,R)=1, min-card(F,R)=0, max-card(F,R)=1
• Many-to-one relationship: min-card(E,R)=0, max-card(E,R)=N, min-card(F,R)=1, max-card(F,R)=1
• Many-to-many relationship: min-card(E,R)=0, max-card(E,R)=N, min-card(F,R)=0, max-card(F,R)=N

Relationship Participation
• Entity types connected by a relationship can have two kinds of “participation” in it
  – Partial (or optional)
  – Total (or mandatory)
• “Total” means that every entity instance must be connected (through the relationship) to an instance of the other participating entity type(s)
• “Partial” means not total
[Figure: staff – Head of – department (1:1)]

Key Attribute(s)
• There will normally be one, or perhaps several, attributes that will be unique for every entity instance
• Example:
  – Every student will have a unique student number
• Such an attribute (or combination) is called a key
• If the key for an entity set consists of two or more attributes in combination it is called a concatenated key
• Key attribute(s) are underlined on the ER diagram
[Figure: entity type person with attributes Number, Name, DOB, Age, Address, Gender and Qualification]

Derived, Multi-valued attributes
• Sometimes it is useful to have, on the ER diagram, attributes that can be derived from other attributes
• Example:
  – An attribute Age can be derived from an attribute DOB and the current date
• Derived attributes can be indicated on the ER diagram by using a dashed ellipse and connecting line to the relevant entity type

Relationship attributes
• A relationship is an association between entity sets
• Relationships can also have attributes
• An attribute of a relationship is drawn attached to the relationship diamond
• Usually only M:N relationships have attributes
[Figure: employee – supervises (M:N), with attribute Task attached to the relationship]

Strong & Weak entities/entity types
• Sometimes the instances of one entity type depend, for their unique identification, on their relationship to the instances of another entity type
[Figure: building – consists of – room, with attributes Name and Number]

Supertypes & Subtypes
• Sometimes notionally different entity types are really specializations of a more general entity type
• Example:
  – Trucks, cars, motorcycles, buses, taxis are all motor vehicles
  – Some attributes are common to all, others are specific to one group
• This kind of situation can be dealt with using a generalization hierarchy (or supertype/subtype hierarchy)
• The attribute(s) that are common belong to the supertype
• The attributes that are specific are attached to the relevant subtype
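
One common way to carry a generalization hierarchy into relational tables is a table for the supertype plus one table per subtype that shares the supertype's key. The sketch below is only an illustration of that idea using Python's built-in sqlite3 module; it is not taken from the slides. The table names follow the motor vehicle example, while the columns registration, seats, payload_kg and route_no are assumptions chosen for the example.

```python
import sqlite3

# In-memory database; foreign key enforcement is off by default in SQLite.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE motor_vehicle (          -- supertype: common attributes
    registration TEXT PRIMARY KEY,    -- assumed key attribute
    seats        INTEGER NOT NULL     -- assumed common attribute
);
CREATE TABLE truck (                  -- subtype: shares the supertype key
    registration TEXT PRIMARY KEY REFERENCES motor_vehicle(registration),
    payload_kg   INTEGER NOT NULL     -- hypothetical truck-only attribute
);
CREATE TABLE bus (                    -- subtype: shares the supertype key
    registration TEXT PRIMARY KEY REFERENCES motor_vehicle(registration),
    route_no     TEXT NOT NULL        -- hypothetical bus-only attribute
);
""")

# Each vehicle gets one supertype row and one row in its subtype table.
conn.execute("INSERT INTO motor_vehicle VALUES (?, ?)", ("ABC123", 2))
conn.execute("INSERT INTO truck VALUES (?, ?)", ("ABC123", 8000))
conn.execute("INSERT INTO motor_vehicle VALUES (?, ?)", ("BUS001", 45))
conn.execute("INSERT INTO bus VALUES (?, ?)", ("BUS001", "42A"))

# Joining subtype to supertype recovers the full set of truck attributes.
trucks = conn.execute(
    "SELECT v.registration, v.seats, t.payload_kg "
    "FROM motor_vehicle v JOIN truck t ON t.registration = v.registration"
).fetchall()


def orphan_subtype_rejected() -> bool:
    """Return True if a subtype row without a supertype row is refused."""
    try:
        conn.execute("INSERT INTO truck VALUES (?, ?)", ("NOPE", 1))
    except sqlite3.IntegrityError:
        return True
    return False
```

In this layout the common attributes are stored once, in motor_vehicle, and the foreign key keeps every subtype row tied to a supertype row. Other mappings (a single table with nullable columns, or one table per subtype only) are equally possible and trade storage against query simplicity.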
Supertypes & Subtypes
[Figure: supertype motor vehicle (attributes Seats, Registration) with subtypes truck, car and bus, marked “d”; each subtype carries its own attributes]

Supertypes & Subtypes
[Figure: supertype employee (attributes Gender, TFN, DOB, Address) with subtypes safety officer, engineer and pilot, marked “o”; each subtype carries its own attributes]

Three schema architecture for Database development
External level (individual user views)
External (COBOL)
    01 EMPC.
       02 EMPNO PIC X(6).
External (XML)
    <xsd:element name="Emp">
      <xsd:element name="Eno" type="Number" />
      <xsd:element name="Dno"
Conceptual level