LogicalLogical SchemaSchema Design:Design: TheThe RelationalRelational DataData ModelModel
Basics of the Relational Model From Conceptual to Logical Schema LogicalLogical SchemaSchema DesignDesign
Select data model Hierarchical data model: hierarchies of record types mainframe oldie, still in use, outdated Network data model: graph like data structures, still in use, outdated Relational data model: the most important one today Object-Oriented data model: more flexible data structures, less powerful data manipulation language Object-Relational: the best of two worlds?
FU-Berlin, DBS I 2006, FU-Berlin, DBS Transform conceptual model into logical schema of data model easy for Relational Data Model (RDM)
Hinze / Scholz can be performed automatically (e.g. by Oracle Designer)
1.2 LogicalLogical SchemaSchema Design:Design: RelationalRelational DataData ModelModel
The Relational Data Model Important terms 1970 introduced by E.F. Codd, honoured by Turing award (Codd: A Relational Model of Data for Large Shared Data Banks. ACM Comm. 13( 6): 377- 387, 1970) Basic ideas: Database is collection of relations Relation R = set of n-tuples
Relation schema R(A1,A2,…An)
Attributes Ai = atomic values of domain Di=dom(Ai)
Student relation (table) FName Name Email SID FU-Berlin, DBS I 2006, FU-Berlin, DBS attribute Tina Müller mueller@... 13555 Anna Katz katz@... 12555 tuple Carla Maus piep@... 11222 relation schema: Hinze / Scholz Student(Fname, Name, Email, SID) or Student(Fname:string, Name:string, Email:string, SID:number) 1.3 LogicalLogical SchemaSchema Design:Design: RelationalRelational DataData ModelModel
Relation Schema R(A1, A2, …, An)
notation sometimes R:{[A1, A2, …, An]}
Relation R ⊆ dom(A1) x dom(A2) x … x dom(An)
Attribute set R ={A1, A2, …, An} Degree of a relation: number of attributes
Example: Student(Fname, Name, Email, SID) FU-Berlin, DBS I 2006, FU-Berlin, DBS Student ⊆ string x string x string x number R (Student) ={Fname, Name, Email, SID} Hinze / Scholz Database Schema is set of relation schemas 1.4 LogicalLogical SchemaSchema Design:Design: RelationalRelational DataData ModelModel
Time-variant relations Relations have state at each point in time. Integrity constraints on state part of DB schema
Tuples (rows, records) Not ordered No duplicate tuples (Relations are sets) Null-values for some attributes possible Distinguishable based on tuple values (key-concept) FU-Berlin, DBS I 2006, FU-Berlin, DBS Different to object identification in o-o languages: there, each object has implicit identity, usually completely unrelated to its fields Hinze / Scholz
1.5 LogicalLogical SchemaSchema Design:Design: RelationalRelational DataData ModelModel
Superkey: Important terms subset of attributes of a relation schema R for which no two tuples have the same value.
Every relation at least 1 superkey. Example: Student(Fname, Name, Email, SID) Superkeys: {Fname, Name, Email, SID}, {Name, Email, SID}, {Fname, Email, SID}, {Fname, Name, SID}, {Fname, SID}, {Name, SID}, {Fname, Name, Email}, FU-Berlin, DBS I 2006, FU-Berlin, DBS {Fname, Email}, {Name, Email}, {Email, SID}, {Email}, {SID} Hinze / Scholz
1.6 LogicalLogical SchemaSchema Design:Design: RelationalRelational DataData ModelModel
Candidate Key: Important terms superkey K of relation R such that if any attribute A ∈ K is removed, the set of attributes K\A is not a superkey of R.
Example: Student(Fname, Name, Email, SID) Candidate Keys: {Email}, {SID}
Primary Key:
FU-Berlin, DBS I 2006, FU-Berlin, DBS arbitrarily designated candidate key
Example: Student(Fname, Name, Email, SID)
Hinze / Scholz Primary Key: {SID}
1.7 LogicalLogical SchemaSchema Design:Design: RelationalRelational DataData ModelModel
Foreign Key: Important terms set of attributes FK in relation schema R1 that references relation R2 if: Attributes of FK have the same domain as the attributes
of primary key K2 of R2
A value of FK in tuple t1 in R1 either occurs as a value of K2 for some t2 in R2 or is null.
Example: FU-Berlin, DBS I 2006, FU-Berlin, DBS R1: Exercise(EID, Tutor, ExcerciseHours, Lecture)
R2: Student(Fname, Name, Email, SID)
Hinze / Scholz K1={EID}, K2={SID} Tutor is foreign key of Student (from Exercise ). 1.8 LogicalLogical SchemaSchema Design:Design: TransformationTransformation
1. Select data model → relational data model 2. Transform conceptual model into logical schema of relational data model
Define relational schema, table names, attributes and types, invariants Design steps: Translate entities into relations Translate relationships into relations Simplify the design FU-Berlin, DBS I 2006, FU-Berlin, DBS Define tables in SQL Define additional invariants (Formal analysis of the schema) Hinze / Scholz
1.9 LogicalLogical SchemaSchema Design:Design: EntitiesEntities
Step 1: Transform entities For each entity E create relation R with all attributes Key attributes of E transform to keys of the relation
director year id category Price_per_day
length title
Movie FU-Berlin, DBS I 2006, FU-Berlin, DBS Relational Schema: Movie(id, title, category, year, director, Price_per_day, length) Hinze / Scholz
1.10 LogicalLogical SchemaSchema Design:Design: WeakWeak EntitiesEntities
Step 2: Transform weak entities For each weak entity WE create relation R Include all attributes of WE Add key of identifying entity to weak entities key Part of key is foreign key
eid cid (0,*) (1,1) Employee has Child FU-Berlin, DBS I 2006, FU-Berlin, DBS Relational Schema: Employee(eid) Child(cid, eid)
Hinze / Scholz Note: weak entity and defining relationship transformed together! 1.11 LogicalLogical SchemaSchema Design:Design: WeakWeak EntitiesEntities
Step 2: Transform weak entities Item iid Example: (1,1)
has
date (1,*) (1,*) (1,1) Delivery contains Order did
oid FU-Berlin, DBS I 2006, FU-Berlin, DBS Relational Schema: contact date Delivery(did, date) Order(oid, did, date, contact) Item(iid, oid, did) Hinze / Scholz
1.12 LogicalLogical SchemaSchema Design:Design: RelationshipsRelationships
Step 3: Transform Relationships For each 1:1-relationship between E1, E2 create relation R Include all key attributes Define as key: key of entity E1 or entity E2 Choose entity with total participation for key (driving licence)
(1,1) (1,1) Country has President
CName PName FU-Berlin, DBS I 2006, FU-Berlin, DBS Relational Schema: Country(CName) President(PName)
Hinze / Scholz has(CName, PName) or has(Cname, PName)
1.13 LogicalLogical SchemaSchema Design:Design: RelationshipsRelationships
Step 3: Transform Relationships For each 1:N-relationship between E1, E2 create relation R Include all key attributes Define as key: key of N-side of relationship
lid Lecture hold Lecturer (1,1) (0,*) lnr name
FU-Berlin, DBS I 2006, FU-Berlin, DBS Relational Schema: Lecture(lnr) Lecturer(lid, name) hold(lnr,lid) Hinze / Scholz
1.14 LogicalLogical SchemaSchema Design:Design: RelationshipsRelationships
Step 3: Transform Relationships For each N:M-relationship create relation R Include all key attributes Define as key: keys from both entities
User_account has emailMessage (0,*) (1,*) id username from
FU-Berlin, DBS I 2006, FU-Berlin, DBS Relational Schema: User_account(username) emailMessage(id, from) has(username, messageid) Hinze / Scholz
1.15 LogicalLogical SchemaSchema Design:Design: RelationshipsRelationships
Step 3: Transform Relationships Include all relationship attributes in relation R
lid Lecture hold Lecturer (1,1) (0,N) lnr date name Relational Schema: Lecture(lnr) Lecturer(lid, name) hold(lnr,lid, date) lnr
FU-Berlin, DBS I 2006, FU-Berlin, DBS Rename keys in recursive relationships Lecture Relational Schema: predecessor successor require(preLNR, succLNR) (0,1) (0,*) Hinze / Scholz require
1.16 LogicalLogical SchemaSchema Design:Design: RelationshipsRelationships
Step 3: Transform Relationships For each n-ary relationship (n>2) create relation R Include all key attributes Define as key: keys from all numerously involved entities
rating
(0,*) (1,*) Lecturer recommend Textbook ISBN
(0,*) title lid name lnr FU-Berlin, DBS I 2006, FU-Berlin, DBS Lecture author Relational Schema: Lecture(lnr) Lecturer(lid, name) Hinze / Scholz Textbook(ISBN, title, author) recommend(lid, lnr, ISBN, rating) 1.17 LogicalLogical SchemaSchema Design:Design: GeneralizationGeneralization
Step 4: Generalization FName 3 Variants for Transformation Person Name Email
is-a is-a pid
SID Student Lecturer Contract
Person(pid, name, fname, email) Student(pid, sid) Lecturer(pid, contract) FU-Berlin, DBS I 2006, FU-Berlin, DBS Student(pid, name, fname, email, sid) Lecturer(pid, name, fname, email, contract) Hinze / Scholz Person(pid, name, fname, email, sid, contract)
1.18 LogicalLogical SchemaSchema Design:Design: GeneralizationGeneralization aid (1) Separate relation for each entity A a Key in specialized relations: is-a is-a foreign key to general relation b B C c
A(aid, a) B: B(aid, b) A: C(aid, c) C:
Gathering data from different tables time consuming
FU-Berlin, DBS I 2006, FU-Berlin, DBS Appropriate specialization prevents unnecessary data access Hinze / Scholz
1.19 LogicalLogical SchemaSchema Design:Design: GeneralizationGeneralization aid (2) Relations for specialization A a Separate tables which include is-a is-a A’s attributes B C Separate keys for relations b c
AB(aid, a, b) AB: AC(aid, a, c) AC:
FU-Berlin, DBS I 2006, FU-Berlin, DBS Only valid for exhaustive specialization (no data in A only) Time consuming data retrieval for non-distinct specializations Hinze / Scholz
1.20 LogicalLogical SchemaSchema Design:Design: GeneralizationGeneralization aid (3) one relation for all entities A a is-a is-a
b B C c ABC(aid, a, b, c)
ABC: NULLs, if subtypes are disjoint
Empty, if subtypes are exhaustive FU-Berlin, DBS I 2006, FU-Berlin, DBS Removes generalization! Relation may contain many NULL-values Hinze / Scholz
1.21 LogicalLogical SchemaSchema Design:Design: ExampleExample year director title category id charge Format Price_per_day name (0,*) (1,*) hold Movie length belong_to (1,1) (0,*) (1,1) play
id Tape First_name (1,*) Address (0,*) Last_name is_in Telephone Actor (1,1) Mem_no FU-Berlin, DBS I 2006, FU-Berlin, DBS birthday from Rental stage_name
until (1,1) real_name
Hinze / Scholz (0,*) have Customer
1.22 LogicalLogical SchemaSchema Design:Design: ExampleExample
Format(name, charge) Tape(id) Movie(id, title, category, year, director, price_per_day, length) Actor(stage_name, real_name, birthday) Customer(mem_no, last_name, first_name, address, telephone) Rental(tape_id, member, from, until)
Belong_to(formatname, tape_id) Hold(tape_id, movie_id)
FU-Berlin, DBS I 2006, FU-Berlin, DBS Play(movie_id, actor_name)
Reduce relation number by simplification! Hinze / Scholz
1.23 LogicalLogical SchemaSchema Design:Design: SimplificationSimplification
Collect those relation schemas with the same key attributes into one relation schema Semantics of attributes must match, not literal name Do not destroy generalization! hours title LID Lecture(lid, title, hours) (1,1) Lecture hold Person(pid, fname, name, email) (3,N) FNameName Student(pid, sid) (0,N) Email Person pid Lecturer(pid, contract) attend Attend(lid, pid) FU-Berlin, DBS I 2006, FU-Berlin, DBS is-a is-a Hold(lid, pid) (1,N) Student Lecturer Hinze / Scholz Lecture(lid, title, hours, lecturer) SID Contract 1.24 LogicalLogical SchemaSchema Design:Design: ExampleExample
Format(name, charge) Tape(id) Movie(id, title, category, year, director, price_per_day, length) Actor(stage_name, real_name, birthday) Customer(mem_no, last_name, first_name, address, telephone) Rental(tape_id, member, from, until)
Belong_to(formatname, tape_id) Hold(tape_id, movie_id) Play(movie_id, actor_name) FU-Berlin, DBS I 2006, FU-Berlin, DBS Simplified relation from Tape/Belong_to/Hold: Tape(id, format, movie_id) Hinze / Scholz No simplification based on partial keys!
1.25 LogicalLogical SchemaSchema Design:Design: SimplificationSimplification
Restrictions on schema simplification Simplification (folding) of relations may result in NULL values in some tables duration
Student get Grant gid (0,1) (0,1) … … id amount since responsible Student(id, …) Grant(gid, duation, amount) Get(student, gid, since, responsible)
FU-Berlin, DBS I 2006, FU-Berlin, DBS Grant(gid, duation, amount, student, since, responsible)
Consequence: optional relationships should not be transformed if Hinze / Scholz many NULLS to be expected May lead to time consuming retrieval 1.26 LogicalLogical SchemaSchema Design:Design: ShortShort summarysummary
Relational data model Representation data in relations (tables) the most important data model today
Important terms & concepts Relation: set of n-tuples Relation schema defines structure Attribute: property of relation, atomic values Superkey, candidate key, primary key identify tuple Transformation rules for entities, relationships
FU-Berlin, DBS I 2006, FU-Berlin, DBS Simplification
Open aspects of logical schema design DDL for defining and changing relation schemas Hinze / Scholz Definition of constraints (Min-cardinalities, Value restrictions for attributes,…) 1.27