LogicalLogical SchemaSchema Design:Design: TheThe RelationalRelational DataData ModelModel

Basics of the From Conceptual to LogicalLogical SchemaSchema DesignDesign

Select ƒ Hierarchical data model: hierarchies of record types mainframe oldie, still in use, outdated ƒ Network data model: graph like data structures, still in use, outdated ƒ Relational data model: the most important one today ƒ Object-Oriented data model: more flexible data structures, less powerful data manipulation language ƒ Object-Relational: the best of two worlds?

FU-Berlin, DBS I 2006, FU-Berlin, DBS Transform conceptual model into logical schema of data model ƒ easy for Relational Data Model (RDM)

Hinze / Scholz ƒ can be performed automatically (e.g. by Oracle Designer)

1.2 LogicalLogical SchemaSchema Design:Design: RelationalRelational DataData ModelModel

The Relational Data Model Important terms ƒ 1970 introduced by E.F. Codd, honoured by Turing award (Codd: A Relational Model of Data for Large Shared Data Banks. ACM Comm. 13( 6): 377- 387, 1970) Basic ideas: ƒ is collection of relations ƒ Relation R = set of n-tuples

ƒ Relation schema R(A1,A2,…An)

ƒ Attributes Ai = atomic values of domain Di=dom(Ai)

Student relation (table) FName Name Email SID FU-Berlin, DBS I 2006, FU-Berlin, DBS attribute Tina Müller mueller@... 13555 Anna Katz katz@... 12555 tuple Carla Maus piep@... 11222 relation schema: Hinze / Scholz Student(Fname, Name, Email, SID) or Student(Fname:string, Name:string, Email:string, SID:number) 1.3 LogicalLogical SchemaSchema Design:Design: RelationalRelational DataData ModelModel

Relation Schema R(A1, A2, …, An)

notation sometimes R:{[A1, A2, …, An]}

Relation R ⊆ dom(A1) x dom(A2) x … x dom(An)

Attribute set R ={A1, A2, …, An} Degree of a relation: number of attributes

Example: Student(Fname, Name, Email, SID) FU-Berlin, DBS I 2006, FU-Berlin, DBS Student ⊆ string x string x string x number R (Student) ={Fname, Name, Email, SID} Hinze / Scholz  is set of relation schemas 1.4 LogicalLogical SchemaSchema Design:Design: RelationalRelational DataData ModelModel

Time-variant relations ƒ Relations have state at each point in time. ƒ Integrity constraints on state part of DB schema

Tuples (rows, records) ƒ Not ordered ƒ No duplicate tuples (Relations are sets) ƒ Null-values for some attributes possible ƒ Distinguishable based on tuple values (key-concept) FU-Berlin, DBS I 2006, FU-Berlin, DBS Different to object identification in o-o languages: there, each object has implicit identity, usually completely unrelated to its fields Hinze / Scholz

1.5 LogicalLogical SchemaSchema Design:Design: RelationalRelational DataData ModelModel

Superkey: Important terms subset of attributes of a relation schema R for which no two tuples have the same value.

ƒ Every relation at least 1 superkey. ƒ Example: Student(Fname, Name, Email, SID) Superkeys: {Fname, Name, Email, SID}, {Name, Email, SID}, {Fname, Email, SID}, {Fname, Name, SID}, {Fname, SID}, {Name, SID}, {Fname, Name, Email}, FU-Berlin, DBS I 2006, FU-Berlin, DBS {Fname, Email}, {Name, Email}, {Email, SID}, {Email}, {SID} Hinze / Scholz

1.6 LogicalLogical SchemaSchema Design:Design: RelationalRelational DataData ModelModel

Candidate Key: Important terms superkey K of relation R such that if any attribute A ∈ K is removed, the set of attributes K\A is not a superkey of R.

ƒ Example: Student(Fname, Name, Email, SID) Candidate Keys: {Email}, {SID}

Primary Key:

FU-Berlin, DBS I 2006, FU-Berlin, DBS arbitrarily designated candidate key

ƒ Example: Student(Fname, Name, Email, SID)

Hinze / Scholz Primary Key: {SID}

1.7 LogicalLogical SchemaSchema Design:Design: RelationalRelational DataData ModelModel

Foreign Key: Important terms set of attributes FK in relation schema R1 that references relation R2 if: ƒ Attributes of FK have the same domain as the attributes

of primary key K2 of R2

ƒ A value of FK in tuple t1 in R1 either occurs as a value of K2 for some t2 in R2 or is null.

ƒ Example: FU-Berlin, DBS I 2006, FU-Berlin, DBS R1: Exercise(EID, Tutor, ExcerciseHours, Lecture)

R2: Student(Fname, Name, Email, SID)

Hinze / Scholz K1={EID}, K2={SID} Tutor is foreign key of Student (from Exercise ). 1.8 LogicalLogical SchemaSchema Design:Design: TransformationTransformation

1. Select data model → relational data model 2. Transform conceptual model into logical schema of relational data model

 Define relational schema, table names, attributes and types, invariants  Design steps: ƒ Translate entities into relations ƒ Translate relationships into relations ƒ Simplify the design FU-Berlin, DBS I 2006, FU-Berlin, DBS ƒ Define tables in SQL ƒ Define additional invariants ƒ (Formal analysis of the schema) Hinze / Scholz

1.9 LogicalLogical SchemaSchema Design:Design: EntitiesEntities

Step 1: Transform entities ƒ For each entity E create relation R with all attributes ƒ Key attributes of E transform to keys of the relation

director year id category Price_per_day

length title

Movie FU-Berlin, DBS I 2006, FU-Berlin, DBS Relational Schema: Movie(id, title, category, year, director, Price_per_day, length) Hinze / Scholz

1.10 LogicalLogical SchemaSchema Design:Design: WeakWeak EntitiesEntities

Step 2: Transform weak entities ƒ For each weak entity WE create relation R ƒ Include all attributes of WE ƒ Add key of identifying entity to weak entities key ƒ Part of key is foreign key

eid cid (0,*) (1,1) Employee has Child FU-Berlin, DBS I 2006, FU-Berlin, DBS Relational Schema: Employee(eid) Child(cid, eid)

Hinze / Scholz Note: weak entity and defining relationship transformed together! 1.11 LogicalLogical SchemaSchema Design:Design: WeakWeak EntitiesEntities

Step 2: Transform weak entities Item iid Example: (1,1)

has

date (1,*) (1,*) (1,1) Delivery contains Order did

oid FU-Berlin, DBS I 2006, FU-Berlin, DBS Relational Schema: contact date Delivery(did, date) Order(oid, did, date, contact) Item(iid, oid, did) Hinze / Scholz

1.12 LogicalLogical SchemaSchema Design:Design: RelationshipsRelationships

Step 3: Transform Relationships ƒ For each 1:1-relationship between E1, E2 create relation R ƒ Include all key attributes ƒ Define as key: key of entity E1 or entity E2 ƒ Choose entity with total participation for key (driving licence)

(1,1) (1,1) Country has President

CName PName FU-Berlin, DBS I 2006, FU-Berlin, DBS Relational Schema: Country(CName) President(PName)

Hinze / Scholz has(CName, PName) or has(Cname, PName)

1.13 LogicalLogical SchemaSchema Design:Design: RelationshipsRelationships

Step 3: Transform Relationships ƒ For each 1:N-relationship between E1, E2 create relation R ƒ Include all key attributes ƒ Define as key: key of N-side of relationship

lid Lecture hold Lecturer (1,1) (0,*) lnr name

FU-Berlin, DBS I 2006, FU-Berlin, DBS Relational Schema: Lecture(lnr) Lecturer(lid, name) hold(lnr,lid) Hinze / Scholz

1.14 LogicalLogical SchemaSchema Design:Design: RelationshipsRelationships

Step 3: Transform Relationships ƒ For each N:M-relationship create relation R ƒ Include all key attributes ƒ Define as key: keys from both entities

User_account has emailMessage (0,*) (1,*) id username from

FU-Berlin, DBS I 2006, FU-Berlin, DBS Relational Schema: User_account(username) emailMessage(id, from) has(username, messageid) Hinze / Scholz

1.15 LogicalLogical SchemaSchema Design:Design: RelationshipsRelationships

Step 3: Transform Relationships ƒ Include all relationship attributes in relation R

lid Lecture hold Lecturer (1,1) (0,N) lnr date name Relational Schema: Lecture(lnr) Lecturer(lid, name) hold(lnr,lid, date) lnr

FU-Berlin, DBS I 2006, FU-Berlin, DBS ƒ Rename keys in recursive relationships Lecture Relational Schema: predecessor successor require(preLNR, succLNR) (0,1) (0,*) Hinze / Scholz require

1.16 LogicalLogical SchemaSchema Design:Design: RelationshipsRelationships

Step 3: Transform Relationships ƒ For each n-ary relationship (n>2) create relation R ƒ Include all key attributes ƒ Define as key: keys from all numerously involved entities

rating

(0,*) (1,*) Lecturer recommend Textbook ISBN

(0,*) title lid name lnr FU-Berlin, DBS I 2006, FU-Berlin, DBS Lecture author Relational Schema: Lecture(lnr) Lecturer(lid, name) Hinze / Scholz Textbook(ISBN, title, author) recommend(lid, lnr, ISBN, rating) 1.17 LogicalLogical SchemaSchema Design:Design: GeneralizationGeneralization

Step 4: Generalization FName ƒ 3 Variants for Transformation Person Name Email

is-a is-a pid

SID Student Lecturer Contract

Person(pid, name, fname, email) Student(pid, sid) Lecturer(pid, contract) FU-Berlin, DBS I 2006, FU-Berlin, DBS Student(pid, name, fname, email, sid) Lecturer(pid, name, fname, email, contract) Hinze / Scholz Person(pid, name, fname, email, sid, contract)

1.18 LogicalLogical SchemaSchema Design:Design: GeneralizationGeneralization aid (1) Separate relation for each entity A a ƒ Key in specialized relations: is-a is-a foreign key to general relation b B C c

A(aid, a) B: B(aid, b) A: C(aid, c) C:

ƒ Gathering data from different tables time consuming

FU-Berlin, DBS I 2006, FU-Berlin, DBS ƒ Appropriate specialization prevents unnecessary data access Hinze / Scholz

1.19 LogicalLogical SchemaSchema Design:Design: GeneralizationGeneralization aid (2) Relations for specialization A a ƒ Separate tables which include is-a is-a A’s attributes B C ƒ Separate keys for relations b c

AB(aid, a, b) AB: AC(aid, a, c) AC:

FU-Berlin, DBS I 2006, FU-Berlin, DBS ƒ Only valid for exhaustive specialization (no data in A only) ƒ Time consuming data retrieval for non-distinct specializations Hinze / Scholz

1.20 LogicalLogical SchemaSchema Design:Design: GeneralizationGeneralization aid (3) one relation for all entities A a is-a is-a

b B C c ABC(aid, a, b, c)

ABC: NULLs, if subtypes are disjoint

Empty, if subtypes are exhaustive FU-Berlin, DBS I 2006, FU-Berlin, DBS ƒ Removes generalization! ƒ Relation may contain many NULL-values Hinze / Scholz

1.21 LogicalLogical SchemaSchema Design:Design: ExampleExample year director title category id charge Format Price_per_day name (0,*) (1,*) hold Movie length belong_to (1,1) (0,*) (1,1) play

id Tape First_name (1,*) Address (0,*) Last_name is_in Telephone Actor (1,1) Mem_no FU-Berlin, DBS I 2006, FU-Berlin, DBS birthday from Rental stage_name

until (1,1) real_name

Hinze / Scholz (0,*) have Customer

1.22 LogicalLogical SchemaSchema Design:Design: ExampleExample

Format(name, charge) Tape(id) Movie(id, title, category, year, director, price_per_day, length) Actor(stage_name, real_name, birthday) Customer(mem_no, last_name, first_name, address, telephone) Rental(tape_id, member, from, until)

Belong_to(formatname, tape_id) Hold(tape_id, movie_id)

FU-Berlin, DBS I 2006, FU-Berlin, DBS Play(movie_id, actor_name)

Reduce relation number by simplification! Hinze / Scholz

1.23 LogicalLogical SchemaSchema Design:Design: SimplificationSimplification

Collect those relation schemas with the same key attributes into one relation schema ƒ Semantics of attributes must match, not literal name ƒ Do not destroy generalization! hours title LID Lecture(lid, title, hours) (1,1) Lecture hold Person(pid, fname, name, email) (3,N) FNameName Student(pid, sid) (0,N) Email Person pid Lecturer(pid, contract) attend Attend(lid, pid) FU-Berlin, DBS I 2006, FU-Berlin, DBS is-a is-a Hold(lid, pid) (1,N) Student Lecturer Hinze / Scholz Lecture(lid, title, hours, lecturer) SID Contract 1.24 LogicalLogical SchemaSchema Design:Design: ExampleExample

Format(name, charge) Tape(id) Movie(id, title, category, year, director, price_per_day, length) Actor(stage_name, real_name, birthday) Customer(mem_no, last_name, first_name, address, telephone) Rental(tape_id, member, from, until)

Belong_to(formatname, tape_id) Hold(tape_id, movie_id) Play(movie_id, actor_name) FU-Berlin, DBS I 2006, FU-Berlin, DBS Simplified relation from Tape/Belong_to/Hold: Tape(id, format, movie_id) Hinze / Scholz No simplification based on partial keys!

1.25 LogicalLogical SchemaSchema Design:Design: SimplificationSimplification

Restrictions on schema simplification ƒ Simplification (folding) of relations may result in NULL values in some tables duration

Student get Grant gid (0,1) (0,1) … … id amount since responsible Student(id, …) Grant(gid, duation, amount) Get(student, gid, since, responsible)

FU-Berlin, DBS I 2006, FU-Berlin, DBS Grant(gid, duation, amount, student, since, responsible)

Consequence: ƒ optional relationships should not be transformed if Hinze / Scholz many NULLS to be expected ƒ May lead to time consuming retrieval 1.26 LogicalLogical SchemaSchema Design:Design: ShortShort summarysummary

Relational data model ƒ Representation data in relations (tables) ƒ the most important data model today

Important terms & concepts ƒ Relation: set of n-tuples ƒ Relation schema defines structure ƒ Attribute: property of relation, atomic values ƒ Superkey, candidate key, primary key identify tuple ƒ Transformation rules for entities, relationships ƒ

FU-Berlin, DBS I 2006, FU-Berlin, DBS Simplification

Open aspects of logical schema design ƒ DDL for defining and changing relation schemas Hinze / Scholz ƒ Definition of constraints (Min-cardinalities, Value restrictions for attributes,…) 1.27