Introduction to Data Modeling
Chapter 2: Introduction to Data Modeling
CS275 Fall 2010

• Data modeling reduces complexities of database design
• Designers, programmers, and end users see data in different ways
• Different views of same data lead to designs that do not reflect organization’s operation
• Various degrees of data abstraction help reconcile varying views of same data

Objectives: to understand
• Data modeling and why data models are important
• The basic data-modeling building blocks
• What business rules are and how they influence database design
• How the major data models evolved historically
• How data models can be classified by level of abstraction

Data Modeling and Data Models
• Model: an abstraction of a real-world object or event
  – Useful in understanding complexities of the real-world environment
• Data models
  – Relatively simple representations of complex real-world data structures
    • Often graphical
• Creating a data model is iterative and progressive

The Importance of Data Models
• Facilitate interaction among the designer, the applications programmer, and the end user
• End users have different views and needs for data
• Data model organizes data for various users
• Data model is a conceptual model: an abstraction
• It’s a graphical collection of logical constructs representing the data structure and relationships within the database.
  – Cannot draw required data out of the data model
  – An implementation model would represent how the data are represented in the database.

Data Model Basic Building Blocks
• Entity: anything about which data are to be collected and stored
• Attribute: a characteristic of an entity
• Relationship: describes an association among entities
  – One-to-many (1:M) relationship
  – Many-to-many (M:N or M:M) relationship
  – One-to-one (1:1) relationship
• Constraint: a restriction placed on the data

Business Rules Terminology
• Descriptions of policies or principles within an organization
• Description of operations or procedures, to create/enforce actions within an organization’s environment
  – Must be in writing and kept up to date
  – Must be easy to understand and widely disseminated
  – Sometimes externally defined, e.g. government regulations
• These describe characteristics of data as viewed by the company

Discovering Business Rules
• Sources of business rules:
  – Company managers
  – Policy makers
  – Department managers
  – Written documentation
    • Procedures
    • Standards
    • Operations manuals
  – Direct interviews with end users
• Always verify sources of information

Importance of Business Rules
• Standardize company’s view of data
• Useful as a communications tool between users and designers
• Allows the designer to
  – understand the nature, role, and scope of data
  – understand business processes
  – develop appropriate relationship participation rules and constraints
• Promotes the creation of an accurate data model

Translating Business Rules into Data Model Components
• Generally, nouns translate into entities
• Verbs translate into relationships among entities
• Relationships are bidirectional
• Two questions to identify the relationship type:
  – How many instances of B are related to one instance of A?
  – How many instances of A are related to one instance of B?

Naming Conventions
• Naming occurs during translation of business rules to data model components
• Names should make the object unique and distinguishable from other objects
• Names should also be descriptive of objects in the environment and be familiar to users
• Proper naming:
  – Facilitates communication between parties
  – Promotes self-documentation

Evolution of Data Implementation Models
• Hierarchical
  – Logically represented by an upside-down tree
    • Each parent can have many children
    • Each child has only one parent
• Network
• Relational
• Object oriented
• Hybrid, XML

The Hierarchical Model
• The hierarchical model was developed in the 1960s to manage large amounts of data for manufacturing projects
• Basic logical structure is represented by an upside-down “tree”
• Hierarchical structure contains levels or segments
  – Segment analogous to a record type
  – Set of one-to-many relationships between segments
• Example: manufacturing a car from components (a, b, or c), each made of subassemblies (1, 2, or 3), each having parts (x, y, and z), which forms a tree structure

Hierarchical Structure
• Each parent can have many children
• Each child has only one parent
• Tree is defined by path that traces parent segments to child segments, beginning from the left
• Hierarchical path
  – Ordered sequencing of segments tracing hierarchical structure
• Preorder traversal or hierarchic sequence
  – “Left-list” path

The Hierarchical Model
• GUAM (Generalized Update Access Method)
  – Based on the recognition that the many smaller parts would come together as components of still larger components
• Information Management System (IMS)
  – World’s leading mainframe hierarchical database system in the 1970s and early 1980s
• TCDMS/ADABAS: jointly developed by IBM and Lane County

The Hierarchical Model
• Advantages
  – Conceptual simplicity
  – Database security
  – Data independence
  – Database integrity
  – Efficiency
• Disadvantages
  – Complex implementation
  – Difficult to manage
  – Lacks structural independence
  – Complex applications programming and use
  – Implementation limitations
  – Lack of standards

The Network Model
• The network model was created to represent complex data relationships more effectively than the hierarchical model
  – Improves database performance
  – Imposes a database standard
  – Represents complex data relationships more effectively, such as a child with multiple parents
• Conference on Data Systems Languages (CODASYL)
• American National Standards Institute (ANSI)
• Database Task Group (DBTG)

The Network Model
• Collection of records in 1:M relationships
• A set is a relationship composed of two record types:
  – Owner: equivalent to the hierarchical model’s parent
  – Member: equivalent to the hierarchical model’s child

The Network Model Components
• Concepts still used today:
  – Schema: conceptual organization of entire database as viewed by the database administrator
  – Subschema: database portion “seen” by the application programs
  – Data management language (DML): defines the environment in which data can be managed
  – Data definition language (DDL): enables the administrator to define the schema components

The Network Model
• Advantages:
  – Conformance to standards
  – Handled more relationship types
  – Data access flexibility
• Disadvantages of the network model:
  – System complexity
  – Lack of ad hoc query capability placed burden on programmers to generate code for reports
  – Structural change in the database could produce havoc in all application programs

The Relational Model
• Developed by E.F. Codd (IBM) in 1970
• Relational models were considered impractical in the 1970s
• Model was conceptually simple at expense of computer overhead
• Relational table is purely logical structure
  – How data are physically stored in the database is of no concern to the user or the designer
  – This concept is the source of a real database revolution

Relational Table
• A relational table is a purely logical structure
  – How data are physically stored in the database is of no concern to the user or the designer
• Stores a collection of related entities
  – Resembles a file
• Table (relations)
  – Matrix consisting of a series of row/column intersections
  – Each row in a relation is called a tuple
  – Related to each other by sharing a common entity characteristic

The Relational Model Components
• Relational database management system (RDBMS)
  – Performs same functions provided by hierarchical model, but hides complexity from the user
• Relational schema/diagram
  – Visual representation of relational database’s entities, attributes within those entities, and relationships between those entities
• Relational diagram
  – Representation of entities, attributes, and relationships
• Relational table stores collection of related entities
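
Example (a minimal sketch, not taken from the slides): the relational table ideas above can be tried out with Python's built-in sqlite3 module, used here as a stand-in for an RDBMS. The table names (agent, customer), column names, and sample rows are all hypothetical and chosen only for illustration. Each inserted row is a tuple, and the two tables are related only by sharing a common attribute, agent_code.

```python
import sqlite3

# In-memory database; nothing here depends on how the data are physically stored.
conn = sqlite3.connect(":memory:")

# Hypothetical tables (not from the slides). customer.agent_code declares
# the common attribute that links each customer row to one agent row.
conn.executescript("""
    CREATE TABLE agent (
        agent_code INTEGER PRIMARY KEY,
        agent_name TEXT NOT NULL
    );
    CREATE TABLE customer (
        cus_code   INTEGER PRIMARY KEY,
        cus_name   TEXT NOT NULL,
        agent_code INTEGER REFERENCES agent(agent_code)
    );
""")

# Hypothetical sample rows; each Python tuple becomes one row (tuple) of a table.
conn.executemany(
    "INSERT INTO agent VALUES (?, ?)",
    [(501, "Alby"), (502, "Hahn")],
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(10010, "Ramas", 502), (10011, "Dunne", 501), (10012, "Smith", 502)],
)


def customers_by_agent(connection):
    """Return (agent_name, cus_name) pairs matched on the shared agent_code value."""
    return connection.execute("""
        SELECT a.agent_name, c.cus_name
        FROM agent AS a
        JOIN customer AS c ON c.agent_code = a.agent_code
        ORDER BY a.agent_name, c.cus_name
    """).fetchall()


pairs = customers_by_agent(conn)
for agent_name, cus_name in pairs:
    print(agent_name, "->", cus_name)
```

The query never states where or how the rows are stored. It only names tables and the common attribute, which is the sense in which the relational table is a purely logical structure. One agent matching several customers is also a small instance of a 1:M relationship.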

The Relational DBMS Application
• SQL-based relational database application involves three parts:
  – User interface
    • Allows end user to interact with the data
  – Set of tables stored in the database
    • Each table is independent from another
    • Rows in different tables are related based on common values in common attributes
  – SQL “engine”
    • Executes all queries

The Relational Implementation Model
• Advantages
  – Structural independence
  – Improved conceptual simplicity
  – Easier database design, implementation, management, and use
  – Ad hoc query capability (SQL)
  – Powerful database management system
• Disadvantages
  – Substantial hardware and system software overhead
  – Can facilitate poor design and implementation
  – May promote “islands of information” problems

Logical/Conceptual Model: The Entity Relationship Model
• Widely accepted standard for data modeling
• Introduced by Chen in 1976
• Entity instance (or occurrence) is row in table
• Entity set is collection of like entities
• Connectivity labels types of relationships
• Graphical representation of entities and their