<<

The Entity-Relationship

Chapter 2 ER Model Basics name ssn dob

Employees

. Entity: Real-world object distinguishable from other objects. An entity is described (in DB) using a set of attributes. . Entity Set: A collection of similar entities. E.g., all employees. – All entities in an entity set have the same set of attributes. (Until we consider ISA hierarchies, anyway!) – Each entity set has a key. – Each attribute has a domain. Relationships

. Connect two or more entity sets. . Represented by diamonds.

name address title film-type

Movies stars-in Stars

year length

3 Relationship Set Think of the “value” of a relationship set as a table. . One column for each of the connected entity sets. . One row for each list of entities, one from each set, that are connected by the relationship. Example: stars-in relationship set Movies Stars Sharon Stone Total Recall Total Recall Sharon Stone … …

4 Multiplicity of Relationships manager dept movies stars employee dept

Many-many Many-one One-one

Representation of Many-One: Arrow pointing = “at most one.” Thick line = ”at least one” Thick arrow = “exactly one.”

5 Multi-way Relationships Usually binary relationships suffice.

Studios Contracts Stars stars-in Movies

. However, there are some cases where three or more E.S. must be connected by one relationship. . Example: Stars have Contracts with Studios for particular Movies.

6 Multi-way (ternary) Relationships

Stars Contracts Movies

Studios Stars Movies Studios Sharon Stone Basic Instinct Sony Arnold Schwarzenegger Total Recall Columbia Sharon Stone Total Recall Columbia … … … 7 Multi-way Relationships (cont.)

Notice arrow convention for multi-way relationships: “all other E.S. determine one of these.” . Not sufficiently general to express any possibility. . Assume that a movie has only one studio, then with the ternary relationship we cannot guarantee that the movie also has many actors. . We could use two 2-way relationships: Studios- Movies and Movies-Stars. . Or better: just make studio an attribute of Movies.

8 Roles in relationships

Sometimes an E.S. participates more than once in a relationship. Thus, we need to label edges with roles to distinguish.

Acted-with Husband Wife Arnold Schw Sharon Stone Husband Wife Arnold Schw Jamie Lee Curtis … … Stars

9 Roles in relationships (cont.)

Co-star Actor1 Actor2

1 2 Nicole Kidman Tom Cruise Stars … …

Note: Co-star is symmetric, Acted-with was not. There is no way to say “symmetric” in E/R. Design Question: What if we replace Husband and Wife by one role, e.g., in-a-couple-with? 10 Attributes on Relationships

salary

Movies Contracts Stars

11 Attributes on Relationships

Is this reasonable?

Salaries value

Movies Contracts Stars

12 Attributes on Relationships

Studio

Movies Contracts Stars

13 Converting Multi-way to 2-Way Baroque in E/R, but necessary in certain “object-oriented” models. . Create a new E.S. to represent rows of a relationship set. . Add many-one relationships from the new E.S. to the E.S.’s that participated in the original relationship.

Contracts The- The- Movies Stars Movie Star

The- Studio

Studios 14 name ssn lot

Employees

Monitors until Wrong !! since started_on dname pid pbudget did budget

Projects Sponsors Departments

15 name ssn lot

Employees

Partially correct !!

since started_on dname pid pbudget did budget

Projects Sponsors Departments

 Now, a dept not only must sponsor at most one project, but it is monitored by only one employee

16 name ssn lot

Aggregation Employees

 Used when we have to model a Monitors until relationship involving (entitity started_on since sets and) a dname relationship set. pid pbudget did budget . Aggregation allows us Sponsors to treat a relationship Projects Departments set as an entity set for purposes of  Aggregation vs. ternary relationship: participation in  Monitors is a distinct relationship, (other) relationships. with a descriptive attribute.  Also, can say that each sponsorship is monitored by at most one employee. 17 Attributes on Relationships salary

Movies Contracts Stars

… a shorthand for the 3-way relationship:

Salaries salary

has

Movies Contracts Stars

18 Binary vs. Ternary Relationships

name ssn lot pname age  If each policy is Employees Covers Dependents owned by just 1 employee, and Bad design Policies each dependent policyid cost is tied to the name pname age covering policy, ssn lot first diagram is Dependents inaccurate. Employees

 Purchaser What are the Beneficiary additional constraints in the Better design 2nd diagram? Policies policyid cost 19 More Design Issues

1. Subclasses. 2. Keys. 3. Weak entity sets.

21 Subclasses

Subclass = special case = fewer entities = more properties. . Example: A cartoon is a kind of movie. In addition to the properties (= attributes and relationships) of movies, there is a voices relationship for cartoons.

22 E/R Subclasses

. Subclasses form a tree (no multiple inheritance). . isa relationships indicate the subclass relation . they are one-one

title Movies length

isa isa to Stars

Murder voices Cartoons weapon Mysteries

23 Different Subclass Viewpoints

. E/R viewpoint: When we talk about entities in the higher entity set, we are considering all entities, even the subclasses (which might have more attributes) Object-oriented viewpoint: An object (entity) belongs to exactly one class. It inherits properties of its superclasses.

title Movies length

to Stars isa

voices Cartoons

Courtesy of Pixar 24 Multiple Inheritance Theoretically, an E.S. could be a subclass of several other entity sets.

name nominated name nominated

Stars Directors

isa isa

Star- Directors

25 Problems

How should conflicts be resolved? . Example: nominated means Oscar nominated as an actor for Stars, and as a director for Directors. What does it mean for Star-Directors? . Need ad-hoc notation to resolve meanings. . In practice, we shall assume a tree of entity sets connected by isa, with all “isa’s” pointing from child to parent.

26 Keys Consider an entity set E. A key is a set of attributes K of E such that, given two entities e1 and e2 of E, e1 and e2 cannot have identical values in K . In E/R model, every E.S. must have a key. – It could have more than one key, but one set of attributes is the “designated” key. . In E/R diagrams, you should underline all attributes of the designated key. . Keys: – Super key (a key in general). – Candidate Key: A minimal superkey – Primary key: The candidate key we chose to use 27 Example

Suppose title is the key for Movies. It makes sense to use title as key for Cartoons also. Thus, we use root key as key for all subclasses.

title Movies length

to Stars isa

voices Cartoons

28 Example: A Multi-attribute Key

title year

Movies

film-type length

Possibly, the combination of title + length also forms a key, but we have not designated it as such.

29 Weak Entity Sets Consider an entity set E. It’s possible that E’s key comes not (completely) from its own attributes, but also from the keys of one or more E.S.’s to which E is linked (through a supporting many-one relationship). . Entity set E is called a weak E.S. . It is represented by putting double rectangle around E and a double diamond around each supporting relationship. . Many-one-ness of supporting relationship (includes 1-1) essential. – With many-many, we wouldn't know which entity provided the key value. . “Exactly one” also essential, or else we might not be able to extract key attributes by following the supporting relationship.

30 Example: Logins (Email Addresses)

Login name = user name + host name, e.g. [email protected] . A “login” entity corresponds to a user name on a particular host. The Login entity doesn’t record the host, just the user name, e.g., tasos. . Key for a login = the user name at the host (which is unique for that host only) + the IP address of the host (which is unique globally).

name Logins @ Hosts name

. Design issue: Under what circumstances could we simply make login-name and host-name be attributes of logins, and dispense with the weak E.S.?

31 Example: Chain of “Weakness” Consider IP addresses consisting of a primary domain (e.g., edu), subdomain (e.g., cdf.toronto), and host (e.g., eddie).

name name name

2ndary Primary Hosts dot1@ Domains dot2@ Domains

. Key for primary domain = its name. . Key for secondary domain = its name + name of primary domain. . Key for host = its name + key of secondary domain = its name + name of secondary domain + name of primary domain. 32 All “Connecting” Entity Sets Are Weak

Contracts The- The- Movies Stars Movie Star

The- title name Studio

Studios name . If Contracts are uniquely determined only by the movie and the star, we can omit Studios name from the key, and remove the double diamond from Studios.

. Better: Studios is attribute of Contracts. 33 Design Principles Setting: client has (possibly vague) idea of what he/she wants. You must design a database that represents these thoughts and only these thoughts.

Important note: Avoid redundancy,i.e., saying the same thing more than once. It wastes space and encourages inconsistency. Example Good: name name Movies filmed-by Studios addr

36 Example This is a bad design…

studio-name name Movies studio-addr

Yeap! That’s also a bad design…

name name Movies filmed-by Studios studio-name addr

37 Use Schema to Enforce Constraints

The design schema should enforce as many constraints as possible. – Don't rely on future data to follow assumptions. Example: . If a user must associate only one director with a movie, don't allow sets of directors and count on users to enter only one director per movie.

38 Entity Sets Vs. Attributes

. You may be unsure which concepts are worthy of being entity sets, and which are handled more simply as attributes. . Especially tricky for the class design project, since there is a temptation to create needless entity sets to make project “larger.”

name Movies filmed-by Studios name

Which one do you like?!!? name Movies studio-name

39 Intuitive Rule for E.S. Vs. Attribute Make an entity set only if it either: 1. Is more than a name of something; i.e., it has non-key attributes or it has relationships with a number of different entity sets, or 2. Is the “many” in a many-one relationship.

40 Example

The following design illustrates both points:

name name Movies filmed-by Studios addr

. Studios deserves to be an E.S. because we record addr, a non-key attribute. . Movies deserves to be an E.S. because it is at the “many” end.

41 Don't Overuse Weak E.S.

. There is a tendency to feel that no E.S. has its entities uniquely determined without following some relationships. . However, in practice, we almost always create unique ID's to compensate: social-security numbers, OHIP's, etc. . The only times weak E.S.'s seem necessary are when: a) We can't easily create such ID's; e.g., no one is going to accept a “species ID” as part of the standard nomenclature (species is a weak E.S. supported by membership in a genus). b) There is no global authority to create them, e.g., crews and studios.

42 Summary of Conceptual Design

. Conceptual design follows requirements analysis, – Yields a high-level description of data to be stored . ER model popular for conceptual design – Constructs are expressive, close to the way people think about their applications. . Basic constructs: entities, relationships, and attributes (of entities and relationships). . Some additional constructs: weak entities, ISA hierarchies, and aggregation. . Note: There are many variations on ER model. Summary of ER (Contd.)

. Several kinds of integrity constraints can be expressed in the ER model: key constraints, participation constraints, and overlap/covering constraints for ISA hierarchies. Some foreign key constraints are also implicit in the definition of a relationship set. – Some constraints (notably, functional dependencies) cannot be expressed in the ER model. – Constraints play an important role in determining the best database design for an enterprise. Summary of ER (Contd.)

. ER design is subjective. There are often many ways to model a given scenario! Analyzing alternatives can be tricky, especially for a large enterprise. Common choices include: – Entity vs. attribute, entity vs. relationship, binary or n-ary relationship, whether or not to use ISA hierarchies, and whether or not to use aggregation. . Ensuring good database design: resulting relational schema should be analyzed and refined further. FD information and normalization techniques are especially useful. Exercise

. university database contains information about professors (identified by social security number, or SSN) and courses (identified by courseid). Professors teach courses; each of the following situations concerns the Teaches relationship set. For each situation, draw an ER diagram that describes it (assuming no further constraints hold). Professors can teach the same course in several semesters, and each offering must be recorded. Professors can teach the same course in several semesters, and only the most recent such offering needs to be recorded. (Assume this condition applies in all subsequent questions.) Every professor must teach some course. Every professor teaches exactly one course (no more, no less). Every professor teaches exactly one course (no more, no less), and every course must be taught by some professor. . Now suppose that certain courses can be taught by a team of professors jointly, but it is possible that no one professor in a team can teach the course. Model this situation, introducing additional entity sets and relationship sets if necessary.

46