2 Conceptual Database Design

2.1 Requirement analysis
  2.1.1 Overview Conceptual Design
  2.1.2 Requirement Analysis (case study)
2.2 Modeling languages
  2.2.1 Basic Modeling Primitives
  2.2.2 Modeling Languages: UML and Entity-Relationship Model (ERM)
  2.2.3 Conceptual DB design: basics
  2.2.4 From Requirements to Models

References: Kemper / Eickler chap. 2, Elmasri / Navathe chap. 3, Garcia-Molina / Ullman / Widom chap. 2

2.1.1 Overview: Lifecycle

Requirement analysis -> Text
Conceptual Design -> Conceptual Model
Logical Schema Design -> Database schema

  CREATE TABLE Studentin (SID INTEGER PRIMARY KEY, VName CHAR(20), Name CHAR(30) NOT NULL, Email CHAR(40));
  CREATE TABLE Kurs (KID CHAR(10) PRIMARY KEY, Name CHAR(40) NOT NULL, Dauer INTEGER);
  CREATE TABLE Order (ODate DATE, soldBy INTEGER FOREIGN KEY REFERENCES Peronal(PID), CID INTEGER FOREIGN KEY REFERENCES Customer (CID));

Physical Schema Design -> Access paths
Administration
Redesign

© HS-2009 02-DBS-Conceptual-2
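The step from conceptual model to database schema can be tried out directly. The following is a minimal sketch, assuming SQLite (via Python's standard sqlite3 module) as the target system; it uses the Studentin and Kurs tables from the lifecycle slide, while the helper names build_schema, table_names and insert_student are made up for illustration.

```python
import sqlite3

# DDL from the lifecycle slide; table and column names are the slide's own.
DDL = """
CREATE TABLE Studentin (
    SID   INTEGER PRIMARY KEY,
    VName CHAR(20),
    Name  CHAR(30) NOT NULL,
    Email CHAR(40)
);
CREATE TABLE Kurs (
    KID   CHAR(10) PRIMARY KEY,
    Name  CHAR(40) NOT NULL,
    Dauer INTEGER
);
"""


def build_schema() -> sqlite3.Connection:
    """Create an in-memory database holding the two example tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn


def table_names(conn):
    """List the tables that make up the logical schema."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def insert_student(conn, sid, vname, name):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO Studentin (SID, VName, Name) VALUES (?, ?, ?)",
            (sid, vname, name),
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_schema()
print(table_names(conn))
```

The schema itself enforces the constraints: a row without Name or with a duplicate SID is rejected by the database, not by application code.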
Database Design: Terminology

Def.: Database Design
The process of defining the overall structure of a database, i.e. the schema, on different layers of abstraction.

• Design levels: conceptual, logical, physical
• Includes "Analysis" and "Design" from software engineering (SE)
• DB modeling: defining the "static model" using formal or visual languages

DB                     SE
Requirements           Requirements
Conceptual modeling    Analysis
Logical modeling       Design
Physical modeling      Implementation

2.1.2 Requirement Analysis

Most important: talk with your customers!

Tasks during requirement analysis (RA):
• Identify essential "real world" information (e.g. interviews)
• Remove redundant, unimportant details
• Clarify unclear natural language statements
• Fill remaining gaps in discussions
• Distinguish data and operations

Requirement analysis and conceptual design aim at focusing thoughts and discussions!
Example: Geo-DB

The database we develop will contain data about countries, cities, organizations and geographical facts. In the first step, countries, cities, regions (like "Bundesländer" or geographical regions), and continents are to be represented in the DB.

In the requirements analysis it has to be clarified what kind of information is supposed to be represented, not how it should be represented!

First step: filter essential information, ignore unimportant details.

Note: the importance of a piece of information depends on the application scenario.

Requirement Analysis

Clarify unclear statements
• What is a country? Political unit: compare Korea vs. South / North Korea

Fill gaps
• Cities are located in regions. What if a country does not have regions? → region is country itself
• Can a region belong to different countries? No, but there may be regions with the same name in different countries
• Can a country belong to different continents? Yes.

Distinguish data from operations
• Gross National Product per inhabitant: calculate
• "It happens that countries are united" ....
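The clarifications above can be written down as a small type-level sketch. This is only one possible reading, assuming Python dataclasses; the class layout, the helper default_region and the sample values are invented for illustration and are not part of the case study.

```python
from dataclasses import dataclass, field


@dataclass
class Country:
    name: str
    population: int   # stored data
    gnp: float        # stored data: Gross National Product
    # A country may belong to several continents (answer: "Yes").
    continents: set = field(default_factory=set)

    @property
    def gnp_per_inhabitant(self) -> float:
        # Operation, not data: calculated on demand instead of being stored.
        return self.gnp / self.population


@dataclass
class Region:
    name: str
    country: Country  # a region belongs to exactly one country


def default_region(country: Country) -> Region:
    """Fill the gap: a country without regions is treated as its own region."""
    return Region(name=country.name, country=country)


# Hypothetical sample values, chosen only to make the arithmetic obvious.
example = Country("Exampleland", population=1000, gnp=50000.0,
                  continents={"Europe", "Asia"})
print(example.gnp_per_inhabitant)
```

Two regions with the same name in different countries remain distinguishable here because each Region carries its country.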
2.2.1 Basic modeling primitives

Conceptual modeling

Distinguish between types (classes) and individual facts (metadata vs. data):
• Individual fact: The name of this woman is Kunz with first name Tamara.
• As opposed to the type level: A person is identified by first name, last name and birth date.

Describe reality on a type level.

Use a graphical language in order to get an overall impression of the domain modeled.

Modeling language requirements

What is the right language for "modeling reality"? Which language primitives?

An old problem of philosophy: how to describe the world in an appropriate, comprehensible way?

One of the answers was logic languages. They allow us to express more than we (currently) want to: facts and rules.
e.g.: human(Plato), ∀x (human(x) ⇒ mortal(x))
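To see why logic languages express more than a data model needs, the fact and the rule from the example can be evaluated mechanically. The following is a minimal sketch of naive forward chaining for unary predicates, assuming Python; the encoding of facts as (predicate, argument) pairs and the function name forward_chain are choices made here, not a standard notation.

```python
# Fact: human(Plato), encoded as a (predicate, argument) pair.
facts = {("human", "Plato")}

# Rule: for all x, human(x) => mortal(x), encoded as (premise, conclusion).
rules = [("human", "mortal")]


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for predicate, argument in list(derived):
                new_fact = (conclusion, argument)
                if predicate == premise and new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


print(sorted(forward_chain(facts, rules)))
```

A conceptual data model keeps only the type-level structure behind the facts; rules of this kind are outside its scope.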
Basic modeling primitives

Modeling the "Real World"

• Entity (type): something which exists, has a name (e.g. City, Country)
• Attribute: property of an entity (e.g. name, populat, GNP)
• Relationship: connects two or more entities (e.g. Country and Continent)

[Figure: entity types City and Country with attributes such as name, populat and GNP; relationship "encompasses" between Country and Continent, with attributes such as name, inhabit and area. Is population an attribute?]

Basic modeling primitives

Issues

Design choices
• Attribute or entity? E.g. continent: attribute of country or separate entity?
• There is never exactly one way of modeling reality. Many good designs, many more bad designs.

Identification
• E.g. name obviously identifies continents but not cities.
• Are identifying attributes needed at all?

"Non sunt multiplicanda entia praeter necessitatem" (Entities are not to be multiplied beyond necessity)
William of Ockham, English philosopher, 14th century (Principle of Economy, Law of Parsimony)
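The design choice "attribute or entity?" can be made concrete with the continent example. The following is a rough sketch in Python under the assumption that entity types are written as dataclasses; the names CountryA, CountryB and continents_of are invented for the comparison.

```python
from dataclasses import dataclass


# Design A: continent is a plain attribute of Country.
@dataclass(frozen=True)
class CountryA:
    name: str
    continent: str  # room for exactly one continent


# Design B: Continent is an entity type of its own.
@dataclass(frozen=True)
class Continent:
    name: str  # the name identifies a continent


@dataclass(frozen=True)
class CountryB:
    name: str


# Relationship "encompasses" as a set of (continent name, country name) pairs.
encompasses = {("Europe", "Turkey"), ("Asia", "Turkey"), ("Europe", "France")}


def continents_of(country_name, relation):
    """All continents related to the given country, in alphabetical order."""
    return sorted(cont for cont, country in relation if country == country_name)


print(continents_of("Turkey", encompasses))
```

Design A is simpler but cannot record a country that lies on two continents; Design B can, and also gives continents a place for attributes of their own.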
2.2.2 Modeling notations and languages

Entity-Relationship-Model (ERM)
• Data-oriented: static modeling of data
• Introduced in 1976 by P. P. Chen (Peter P. Chen: The Entity-Relationship Model - Toward a Unified View of Data. ACM TODS 1(1): 9-36, 1976, see Reader)
• Traditional graphical notation with squares, bullets and diamonds

[Figure: ER diagram. Student (Name, FName, Matr-Nr, Email) attend Lecture (Title, LID, hours)]

Graphical modeling languages

Unified modeling language (UML)
• Modeling of data and operations
• Object-oriented flavor, e.g. each object (entity) has identity, a unique pointer
  ERM: entities having the same type and the same attribute values are indistinguishable
• Attributes may be constructed (lists, sets, arrays, …)
• Relationships are directed (uni- or bidirectional)
  ERM: always bidirectional
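The difference between identity by value (ERM) and identity by pointer (UML) shows up in any object-oriented language. The following sketch assumes Python; StudentValue and StudentObject are illustrative names, and the attribute values are arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudentValue:
    """ERM-like view: an entity is nothing more than its attribute values."""
    matr_nr: int
    name: str


class StudentObject:
    """UML-like view: every object has its own identity (its address)."""

    def __init__(self, matr_nr, name):
        self.matr_nr = matr_nr
        self.name = name


a = StudentValue(1, "Kunz")
b = StudentValue(1, "Kunz")
x = StudentObject(1, "Kunz")
y = StudentObject(1, "Kunz")

print(a == b)       # same values: indistinguishable
print(len({a, b}))  # a set keeps only one of them
print(x == y)       # same values, but two distinct objects
```

This is why the ERM needs identifying attributes, while UML can do without keys.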
UML versus ERM

[Figure: UML class diagram. Student (Matr-Nr, Name) attend Lecture (LID, Title). The relationship is called an association in UML and may be directed.]

2.2.3 Conceptual Design: Basics

Entities & attributes
Basics

Identifying attributes
"Axiom" of Relational DB:

Conceptual Design: Basics

Relationships
Modeling basics

Weak entity

[Figure: account (accNumber, name, acc_type, balance, …) has accTransaction (s_number, day, value //+ or -, …)]

Example: an account statement is identified by "number" and "acc_number", the latter of which is not an attribute of the 'statement' entity (!)

Conceptual Design: UML

Notation! UML terminology
• Class = entity type (UML: attribute = field)
• Object = entity
• Association = relationship
• NO keys: objects are identified by unique address
• Relationship may have a direction
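The weak-entity example (an account statement identified by its own number together with the account number) maps naturally to a composite key. The following is a sketch only, assuming SQLite through Python's sqlite3 module; the table name statement, the column types and the helper try_insert are assumptions made here, while accNumber and s_number follow the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only on request
conn.executescript("""
CREATE TABLE account (
    accNumber INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
-- Weak entity: s_number alone is not unique; the owning account is part of the key.
CREATE TABLE statement (
    accNumber INTEGER NOT NULL REFERENCES account(accNumber),
    s_number  INTEGER NOT NULL,
    day       TEXT,
    value     REAL,
    PRIMARY KEY (accNumber, s_number)
);
""")


def try_insert(sql, params):
    """Return True if the row was stored, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


try_insert("INSERT INTO account VALUES (?, ?)", (1, "Kunz"))
try_insert("INSERT INTO account VALUES (?, ?)", (2, "Meier"))
```

Statement number 1 may then exist once per account, but not twice for the same account, and never for an account that does not exist.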
Conceptual Design: Basics

Notation
• Sometimes attributes are omitted

[Figure: Country belongs_to Continent; in UML notation: Country encompasses Continent]

Role names
• Used in ERM and UML to distinguish the roles of entities in a relationship
• Role names are basically for documentation

[Figure: Customer (role: renter) rents DVD (role: rental object)]

Conceptual Design: Basics

Recursive relationships
• "superior": order of relationship roles?

[Figure: Employee (PK pid, name, first_name, position, salary) related to Employee by "superior"]

Roles: particularly useful in recursive relations

[Figure: Employee "has superior" Employee with roles boss and subordinate; Person related to Person with roles child and parent]
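Role names become table aliases once a recursive relationship is queried. The following is a small sketch, assuming SQLite through Python's sqlite3 module and a reduced Employee table; the sample rows and the function name has_superior are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (
    pid      INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    superior INTEGER REFERENCES Employee(pid)  -- NULL at the top of the hierarchy
);
INSERT INTO Employee VALUES (1, 'Ada', NULL);
INSERT INTO Employee VALUES (2, 'Ben', 1);
INSERT INTO Employee VALUES (3, 'Cy', 1);
""")


def has_superior():
    """List (subordinate, boss) pairs; the aliases play the two roles."""
    return conn.execute("""
        SELECT subordinate.name, boss.name
        FROM Employee AS subordinate
        JOIN Employee AS boss ON subordinate.superior = boss.pid
        ORDER BY subordinate.pid
    """).fetchall()


print(has_superior())
```

Without the roles boss and subordinate it would be unclear which side of "superior" each Employee occurrence stands on.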
Conceptual Design: Basics

Multiple relationships

2.2.4 From requirements to models

Text to conceptual model: the only step which cannot be automated
Conceptual Design: case study

[Figure: conceptual model of the Geo-DB. Entity types: Continent, Country, Region, City. Relationships: encompass, belongs-to, is_neighbor (with position ∈ {N,W,S,E,Z..}), locatedIn, capitalOf (role: capital). Attributes shown: r_id, L_ID and name (String); population, area, GNP, longitude and latitude (Numb).]

Summary

• Conceptual modeling: the art of structuring the data of an application domain
• Basis: careful requirement analysis
• Simple, powerful base constructs: entities, attributes, relationships
• Visual (graphical) language
• E-R modeling language and UML are related
  E-R language:
  - simpler
  - more appropriate for modeling of data
  - many dialects
  Compatibility with UML makes sense
  Some differences, e.g. no keys in UML