2 Database Design
Designing a database system is a complex undertaking typically divided into four phases: requirements specification, conceptual design, logical design, and physical design.

2.1.1 Requirements specification

Requirements specification collects information about the users’ needs with respect to the database system. Many approaches to requirements specification have been developed by both academia and practitioners. Active participation of users during this phase increases their satisfaction with the delivered system and helps avoid errors, which can be very expensive to correct once the subsequent phases have been carried out.

2.1.2 Conceptual design

Conceptual design aims at building a user-oriented representation of the database that does not contain any implementation considerations. This is done by using a conceptual model to identify the relevant concepts of the application. The entity-relationship model is one of the most widely used conceptual models for designing a database application. Similar models are the object-oriented modeling techniques based on the UML (Unified Modeling Language) notation.

Conceptual design can follow one of two approaches, depending on a) the complexity of the target system and b) the developers’ experience:

– Top-down design: The requirements of the various users are merged before the design process begins, and a single schema is built. Afterward, the views corresponding to individual users’ requirements can be separated out. This approach can be difficult and expensive for large databases and inexperienced developers.
– Bottom-up design: A separate schema is built for each group of users with different requirements, and later, during the view integration phase, these schemas are merged to form a unified conceptual schema for the target database. This approach is mainly used for large databases.
2.1.3 Logical design

Logical design aims at translating the conceptual representation of the database obtained in the previous phase into a particular logical model common to several DBMSs. Currently, the most common logical model is the relational model. To ensure an adequate logical representation, we specify a set of suitable mapping rules that transform the constructs of the conceptual model into appropriate structures of the logical model.

2.1.4 Physical design

Physical design aims at customizing the logical representation of the database obtained in the previous phase to a physical model targeted at a particular DBMS platform. Common DBMSs include SQL Server, MS Access, Oracle, DB2, MySQL, and PostgreSQL, among many others.

2.2 The Entity Relationship Model (ER-Model)

The Entity Relationship model is a popular conceptual model for designing databases. Entity–relationship (ER) diagrams have been widely adopted in the conceptual modeling community as the fundamental approach for data modeling. The term ER Diagram was introduced by Peter Chen in his milestone paper. The core concepts of the ER Diagram are the Entity, the Relationship (which exists between Entities) and the Attribute (of an Entity or Relationship). Additional concepts are the primary key (for Entities) and cardinalities and roles (for Relationships).

ER diagrams are typically used in the requirements analysis and design phases, when the modeler employs ER to construct a data model that spans the conceptual, logical, and physical layers. The objectives of ER modeling have traditionally been related to data modeling and database design, a prominent use being the generation of data schemata. These concepts are analyzed in the following section with representative examples.

2.3 ER Concepts

Entity types are used to represent a set of real-world objects of interest to an application. Examples of entity types are the Students or Professors of a University.
Entities (also called instances) are objects that belong to an entity type. For example, John Smith, who studies at a specific Faculty of the University, is an object that belongs to the Students entity type.

The characteristics of each entity are known as Attributes. A Students entity type may have Attributes such as Name, Date of Birth, Address and Student_id.

Relationships are the associations between objects in the real world, and Relationship types are used to represent these associations between entity types. In our example, the phrase "studies at" a specific Faculty indicates the relationship type between the entity types STUDENT and FACULTY.

Role is the participation of an entity type in a relationship type. It is drawn as a line (link) that connects the entity type to the relationship type.

Cardinality: Each role of a relationship type has an associated pair of cardinalities describing the minimum and maximum number of times that an entity may participate in that relationship type.

Referential integrity specifies a link between two tables (or between a table and itself), where a set of attributes in one table, called the foreign key, references the primary key of the other table. This means that every value in the foreign key must also exist in the primary key.

The Entity Relationship diagram (ERD) is the most popular technique for data design and for data-oriented systems. It is a tool for illustrating relationships in a database. An ERD shows:

– Types of objects, called entities, for which data is stored
– Relationships between entities (types of relationships)
– Attributes of the entities that are stored (not always shown)

The ERD is used during both the analysis and the design phase to define the data model. During the analysis phase the existing data model is documented, and during the design phase the required data model is designed. It is a common tool for relational database design.
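As a rough illustration of the referential integrity rule described above, the STUDENT and FACULTY example can be sketched with Python's built-in sqlite3 module. The table names, column names and sample values are assumptions made for this sketch and do not come from any particular schema; SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical table and column names, chosen for illustration only.
conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE faculty (faculty_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        faculty_id INTEGER REFERENCES faculty(faculty_id)  -- the foreign key
    )
    """
)

conn.execute("INSERT INTO faculty VALUES (1, 'Informatics')")
# This foreign key value exists as a primary key in faculty, so it is accepted.
conn.execute("INSERT INTO student VALUES (100, 'John Smith', 1)")


def try_insert_orphan(connection):
    """Return True if a student that references a missing faculty is rejected."""
    try:
        connection.execute("INSERT INTO student VALUES (101, 'Orphan Row', 99)")
    except sqlite3.IntegrityError:
        return True
    return False


rejected = try_insert_orphan(conn)
student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected, student_count)
```

The second insert fails because no faculty row has the primary key value 99, so only the first student row is stored.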
Rules for drawing an ERD:

– Entities are drawn as rectangles, and their names are nouns.
– A relationship is depicted with a diamond on the line that connects two entities, and its name is always a verb.
– Attributes of an entity are drawn as circles, and their names are nouns.

Figure 1 Simple ERD (entities Reader and Book linked by the relationship Orders, with cardinalities 1,1 and 1,N; attributes shown include Name, Title, Date and ISBN)

Cardinality Types (the symbol is a mark on the relationship line between entities A and B):

a) Exactly one
b) None or one
c) One or more
d) None or more
e) More than one

Figure 2 A More Complex Entity Relationship Diagram

2.4 Integrity (Keys)

Primary Key

A table’s PK is a field (possibly composite) that has unique values: each row has a PK value different from that of any other row in the table. Such a field is a unique identifier.

Foreign Key

A foreign key is a field (or combination of fields) in a table B that is associated with a PK in a table A through a relationship (A and B can be the same table).

Entity Integrity

When we define a PK for a table we are enforcing entity integrity. Entity integrity means that each row in the table is identifiable through its primary key. MS Access requires a value for a PK in a newly added row, and it enforces uniqueness of those values.

Referential Integrity

Suppose we have two tables, A and B, where a relationship is defined between the primary key of table A and a foreign key in table B. We say RI exists for this relationship if for each row in B either:

– the FK has no value at all (i.e. it is null), or
– the FK has a value that exists as a PK value in some row of A

2.5 Normal Forms

1 NF:

Rule: All attributes are atomic. No “repeating groups” are allowed.

Examples:

Tables not in 1NF (0 NF)

Surname   Name
Jones     John, Tomas, Kevin
Tones     Nick, James

Name    Weight
Nick    79 Kg
Tomas   90 lb

Tables in 1NF

Surname   Name
Jones     John
Jones     Tomas
Jones     Kevin
Tones     Nick
Tones     James

Name    Weight   Unit
Nick    79       Kg
Tomas   90       lb

2 NF:

Rules: The table is in 1NF. All non-key attributes are fully dependent on the PK (Primary Key).
No partial dependencies exist.

Examples:

Table not in 2NF

Student   Course_ID   Grade   Address
Jones     BIN         1       Praha 1
Tones     APS         2       Liberec 1
Brand     IS1         1       Olomouc 3

Here the PK is the composite (Student, Course_ID), but Address depends on Student alone, which is a partial dependency.

Tables normalized in 2NF

Student   Address
Jones     Praha 1
Tones     Liberec 1
Brand     Olomouc 3

Course_ID   Grade   Student
BIN         1       Jones
APS         2       Tones
IS1         1       Brand

3 NF:

Rules: The table is in 2NF. No transitive dependencies are allowed.

Examples:

Table not in 3NF

Student   Course_ID   Grade   Grade_Value
Jones     BIN         A       1
Tones     APS         B       2
Brand     IS1         A       1

Here Grade_Value depends on Grade, which is itself a non-key attribute, so Grade_Value depends on the key only transitively.

Tables normalized in 3NF

Student   Course_ID   Grade
Jones     BIN         A
Tones     APS         B
Brand     IS1         A

Grade   Grade_Value
A       1
B       2
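The 3NF example above can be checked with a short Python sketch. It splits the rows of the table that is not in 3NF into the two normalized tables and then joins them back to confirm that no information is lost. The function names decompose and rejoin are hypothetical helpers introduced for this illustration; only the sample rows come from the example.

```python
# Rows of the table that is not in 3NF:
# (Student, Course_ID, Grade, Grade_Value)
unnormalized = [
    ("Jones", "BIN", "A", 1),
    ("Tones", "APS", "B", 2),
    ("Brand", "IS1", "A", 1),
]


def decompose(rows):
    """Split the rows into an enrolment table and a grade lookup table."""
    enrolment = [(student, course, grade) for student, course, grade, _ in rows]
    grade_values = {}
    for _, _, grade, value in rows:
        # Grade -> Grade_Value must map each grade to exactly one value.
        if grade_values.setdefault(grade, value) != value:
            raise ValueError(f"Grade {grade!r} maps to more than one value")
    return enrolment, grade_values


def rejoin(enrolment, grade_values):
    """Join the two normalized tables back together on Grade."""
    return [(s, c, g, grade_values[g]) for s, c, g in enrolment]


enrolment, grade_values = decompose(unnormalized)
print(enrolment)
print(grade_values)
# The join reproduces the original rows, so the decomposition is lossless.
print(rejoin(enrolment, grade_values) == unnormalized)
```

Storing each Grade_Value once in the lookup table means that changing the value of a grade touches a single row instead of every enrolment that uses it.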