Fall, 2009 CIS 550 Database and Information Systems Homework 3

Total Page:16

File Type:pdf, Size:1020Kb

Fall, 2009 CIS 550 Database and Information Systems Homework 3 Fall, 2009 CIS 550 Database and Information Systems Homework 3 Due on October 21, 2009 Problem 1 [15 points]: Consider a relation R with four attributes ABCD. You are given the following (possibly incomplete) set of dependencies: A ! BC, C ! D, CD ! AB. 1. List all keys for R (other than superkeys). C; A 2. Is R in 3NF? Why or why not? Yes, it is in BCNF. 3. Is R in BCNF? Why or why not? Yes, C; A; CD are all keys. Problem 2 [30 points]: The task is to design a social network schema, roughly along the lines of the example schemas that we have been using (but with enhancements). We will have users who may create accounts, and will add profile information and connections to other users (friends, partners). Users might also post messages that can be viewed by other users or by everyone. These messages will be searchable by keyword. • Each user will have a unique numeric ID. Additionally, the first and last names, email address, age, gender, and physical address will need to be stored. Users may be categorized as Students or Workers depending on education and employment status (described below). • Each user will have one or more educational experiences, at some institution, some level (e.g., grade school, university), and with a start and end time. • Each user will have zero or more interests. • Each user will have zero or more employment experiences, with a job title, employer, employer address, and salary. • Each user will be connected to zero or more friends (who are also users). • Users can assign to friends different access levels that will control how much personal infor- mation they can see. 1 • Users have personal relationships with other users (e.g., dating, marriage, etc.). • Users have mailboxes with IDs. • Each mailbox has zero or more posts, with the option of being publicly viewable, or only viewable by the recipient. Draw an ER diagram for the social network system. The ER diagram should include various at- tributes, keys, participation constraints, overlap and covering constraints. Problem 3 [25 points]: Consider a relation R with five attributes ABCDE and the FD set F = fAB ! E; AC ! B; E ! D; D ! Eg. Let F + denote the closure set of F . 1. For each of the following attribute sets, do the following: (i) write down a minimal cover of the subset of F + that holds over the set; (ii) name the strongest normal form that is not violated by the relation containing these attributes; (iii) decompose it into a collection of BCNF relations if it is not already in BCNF. (a) ABDE (i) fAB ! E; E ! D; D ! Eg (ii) ABDE has a single key AB. Thus, neither E ! D or D ! E satisfy the conditions for 3NF. However, neither of these is a partial dependency, and so the relation is 2NF. (iii) ABE; DE (b) ABCD (i)fAB ! D; AC ! Bg (ii) Corrected: ABCD has key AC. Therefore it is not in BCNF, since AB is not a superkey, nor is it in 3NF since both B and D are not in AC. (iii) ABC; ABD 2. For each of the following decompositions of the above R = ABCDE, with the same set of functional dependencies F , say whether the decomposition is (i) dependency preserving, and (ii) lossless join. (a) fABC; ABE; EDg (i) FABC contains fAC ! Bg, FABE contains fAB ! Eg, and FDE contains E ! D; D ! E, + + so F = (FABC [ FABE [ FDE) and so the decomposition is dependency preserving. (ii) ABC 1 ABE is lossless since F + contains AB ! E. ABCE 1 ED is lossless since F + contains E ! D. Thus, the decomposition is lossless. (b) fABDE; ABCg (i) FABDE contains fAB ! E; E ! D; D ! Eg and FABC contains fAC ! Bg, so the decomposition is dependency preserving. (ii) F + contains AB ! C so the decomposition is lossless. Problem 4 [20 points]: Suppose you are given a relation R(A; B; C; D). For each of the following (complete) sets of FDs, (i) identify the candidate key(s) for R, and (ii) state whether or not the proposed decomposition of R into smaller relations is a \good" decomposition and briefly explain why or why not. 2 1. A ! B, B ! CD. Decompose into AB, BCD. (i) R has one candidate key, A. (ii) The decomposition is good. First off, it is in BCNF, eliminating redundancy. Furthermore, it a lossless join decomposition (F + contains B ! CD), and dependency preserving A ! B is cointained in FAB and B ! CD is contained in FBCD. 2. D ! A, B ! C. Decompose into AD and BC. (i) R has one candidate key, BD. (ii) The decomposition is \bad" in the sense that it is lossy. Problem 5 [15 points]: What is the difference between an FD and a candidate key? Do candidate keys capture all possible FDs? FDs are more general than candidate keys. A key is just one type of functional dependency. Candidate keys do not capture all possible FDs. For example, ABC might have the FDs A ! BC and B ! C. A is a candidate key. However, this does not capture the fact that a relation instance cannot, for example, contain tuples (a1; b1; c1); (a2; b1; c2). 3.
Recommended publications
  • Keys Are, As Their Name Suggests, a Key Part of a Relational Database
    The key is defined as the column or attribute of the database table. For example if a table has id, name and address as the column names then each one is known as the key for that table. We can also say that the table has 3 keys as id, name and address. The keys are also used to identify each record in the database table . Primary Key:- • Every database table should have one or more columns designated as the primary key . The value this key holds should be unique for each record in the database. For example, assume we have a table called Employees (SSN- social security No) that contains personnel information for every employee in our firm. We’ need to select an appropriate primary key that would uniquely identify each employee. Primary Key • The primary key must contain unique values, must never be null and uniquely identify each record in the table. • As an example, a student id might be a primary key in a student table, a department code in a table of all departments in an organisation. Unique Key • The UNIQUE constraint uniquely identifies each record in a database table. • Allows Null value. But only one Null value. • A table can have more than one UNIQUE Key Column[s] • A table can have multiple unique keys Differences between Primary Key and Unique Key: • Primary Key 1. A primary key cannot allow null (a primary key cannot be defined on columns that allow nulls). 2. Each table can have only one primary key. • Unique Key 1. A unique key can allow null (a unique key can be defined on columns that allow nulls.) 2.
    [Show full text]
  • CANDIDATE KEYS a Candidate Key of a Relation Schema R Is a Subset X
    CANDIDATEKEYS A candidate key of a relation schema R is a subset X of the attributes of R with the following twoproperties: 1. Every attribute is functionally dependent on X, i.e., X + =all attributes of R (also denoted as X + =R). 2. No proper subset of X has the property (1), i.e., X is minimal with respect to the property (1). A sub-key of R: asubset of a candidate key; a super-key:aset of attributes containing a candidate key. We also use the abbreviation CK to denote "candidate key". Let R(ABCDE) be a relation schema and consider the fol- lowing functional dependencies F = {AB → E, AD → B, B → C, C → D}. Since (AC)+ =ABCDE, A+ =A,and C+ =CD, we knowthat ACisacandidate key,both A and C are sub-keys, and ABC is a super-key.The only other candidate keysare AB and AD. Note that since nothing determines A, A is in every can- didate key. -2- Computing All Candidate Keys of a Relation R. Givenarelation schema R(A1, A2,..., An)and a set of functional dependencies F which hold true on R, howcan we compute all candidate keysofR? Since each candidate key must be a minimal subset Z of {A1,..., + An}such that Z =R,wehav e the following straightforward (and brute-force) algorithm: (1) Construct alist L consisting of all non-empty subsets of {A1, n − ..., An}(there are 2 1ofthem). These subsets are arranged in L in ascending order of the size of the subset: 〈 〉 ≤ We get L = Z1, Z2,..., Z2n−1 ,such that |Zi| |Zi+1|.
    [Show full text]
  • There Are Only Four Types of Relational Keys Candidate Key As Stated
    There are only four types of relational keys Candidate Key As stated above, a candidate key is any set of one or more columns whose combined values are unique among all occurrences (i.e., tuples or rows). Since a null value is not guaranteed to be unique, no component of a candidate key is allowed to be null. There can be any number of candidate keys in a table . Relational pundits are not in agreement whether zero candidate keys is acceptable, since that would contradict the (debatable) requirement that there must be a primary key. Primary Key The primary key of any table is any candidate key of that table which the database designer arbitrarily designates as "primary". The primary key may be selected for convenience, comprehension, performance, or any other reasons. It is entirely proper (albeit often inconvenient) to change the selection of primary key to another candidate key. As we will see below, there is no property of a primary key which is not shared by all other candidate keys in of the table except this arbitrary designation. Unfortunately E-R methodologies and RDBMS products have come to rely on the primary key almost to the exclusion of the concept of candidate keys, which are rarely supported by software. Alternate Key The alternate key s of any table are simply those candidate keys which are not currently selected as the primary key. According to { Date95 } (page 115), "... exactly one of those candidate keys [is] chosen as the primary key [and] the remainder, if any, are then called alternate keys ." An alternate key is a function of all candidate keys minus the primary key.
    [Show full text]
  • All Keys in Dbms with Example
    All Keys In Dbms With Example Leigh usually fudging whithersoever or concertina ineffably when diachronic Dickey unbuilds eventually and catachrestically. Nocturnal Chane toast deafly or syllabify presumptively when Dirk is gratulatory. Muriatic and strawy Pavel feds almost algebraically, though Rolfe evolve his almahs hobnails. We would make the keys in all dbms example of attributes Is known s control key contains a database choosing a set to retrieve the same plant as with all in dbms keys in below. Keys with all you can be in all dbms keys with example to? This example to all the ad links and in all dbms keys with example: the minimum number also update operation that each tuple. Used for each employee, let us in the. Nulls in your experience platforms are two same first columns as primary key are required for this set to table roll_no, as demonstrated here. It would work as the keys with the foreign. What is referred table then the curve with all keys in dbms example, you would be a list? Ssis tutorial of your dbms keys by the tables suggests that is not simple words, driving license number field cannot be ignored during your preferred and in all dbms keys with example. Solution for me with databases and with all keys in example, various levels table. You with all keys in dbms example, andere brauche ich für statistiken zu verfolgen. The following example: roll number of the keys and design stage, and with all candidate key in all dbms keys with example might be in.
    [Show full text]
  • CSC 443 – Database Management Systems Data and Its Structure
    CSC 443 – Database Management Systems Lecture 3 –The Relational Data Model Data and Its Structure • Data is actually stored as bits, but it is difficult to work with data at this level. • It is convenient to view data at different levels of abstraction . • Schema : Description of data at some abstraction level. Each level has its own schema. • We will be concerned with three schemas: physical , conceptual , and external . 1 Physical Data Level • Physical schema describes details of how data is stored: tracks, cylinders, indices etc. • Early applications worked at this level – explicitly dealt with details. • Problem: Routines were hard-coded to deal with physical representation. – Changes to data structure difficult to make. – Application code becomes complex since it must deal with details. – Rapid implementation of new features impossible. Conceptual Data Level • Hides details. – In the relational model, the conceptual schema presents data as a set of tables. • DBMS maps from conceptual to physical schema automatically. • Physical schema can be changed without changing application: – DBMS would change mapping from conceptual to physical transparently – This property is referred to as physical data independence 2 Conceptual Data Level (con’t) External Data Level • In the relational model, the external schema also presents data as a set of relations. • An external schema specifies a view of the data in terms of the conceptual level. It is tailored to the needs of a particular category of users. – Portions of stored data should not be seen by some users. • Students should not see their files in full. • Faculty should not see billing data. – Information that can be derived from stored data might be viewed as if it were stored.
    [Show full text]
  • DBMS Keys Mahmoud El-Haj 13/01/2020 The
    DBMS Keys Mahmoud El-Haj 13/01/2020 The following is to help you understand the DBMS Keys mentioned in the 2nd lecture (2. Relational Model) Why do we need keys: • Keys are the essential elements of any relational database. • Keys are used to identify tuples in a relation R. • Keys are also used to establish the relationship among the tables in a schema. Type of keys discussed below: Superkey, Candidate Key, Primary Key (for Foreign Key please refer to the lecture slides (2. Relational Model). • Superkey (SK) of a relation R: o Is a set of attributes SK of R with the following condition: . No two tuples in any valid relation state r(R) will have the same value for SK • That is, for any distinct tuples t1 and t2 in r(R), t1[SK] ≠ t2[SK] o Every relation has at least one default superkey: the set of all its attributes o Basically superkey is nothing but a key. It is a super set of keys where all possible keys are included (see example below). o An attribute or a set of attributes that can be used to identify a tuple (row) of data in a Relation (table) is a Superkey. • Candidate Key of R (all superkeys that can be candidate keys): o A "minimal" superkey o That is, a (candidate) key is a superkey K such that removal of any attribute from K results in a set of attributes that IS NOT a superkey (does not possess the superkey uniqueness property) (see example below). o A Candidate Key is a Superkey but not necessarily vice versa o Candidate Key: Are keys which can be a primary key.
    [Show full text]
  • Course Overview the Relational Model
    COURSE OVERVIEW THE RELATIONAL MODEL CS121: Relational Databases Fall 2017 – Lecture 1 Course Overview 2 ¨ Introduction to relational database systems ¤ Theory and use of relational databases ¨ Focus on: ¤ The Relational Model and relational algebra ¤ SQL (the Structured Query Language) ¤ The Entity-Relationship model ¤ Database schema design and normal forms ¤ Various common uses of database systems ¨ By end of course: ¤ Should be very comfortable using relational databases ¤ Familiar with basic relational database theory Textbook 3 ¨ No textbook is required for the course ¤ The lecture slides contain most of the relevant details ¤ Other essential materials are provided with the assignments ¤ I also make lecture recordings available th ¨ A great book: Database System Concepts, 5 ed. ¤ Silberschatz, Korth, Sudarshan ¤ (The current edition is 6th; they messed a lot of things up…) ¤ Covers theory, use, and implementation of relational databases, so good to have for 121/122/123 sequence Assignments 4 ¨ Assignments are given approximately weekly ¤ Set of problems focusing on that week’s material ¤ Most include hands-on practice with real databases ¤ Made available around Wednesday each week ¤ Due approx. one week later: Thursdays at 2am n That’s the start of Thursday, not the end of Thursday ¨ Midterm and final exam are typically 4-6 hours long ¨ Assignment and exam weighting: ¤ 8 assignments, comprising 70% of your grade ¤ Midterm counts for 15% of your grade ¤ Final exam counts for 15% of your grade Course Website and Submissions 5 ¨ CS121
    [Show full text]
  • The Relational Model
    The Relational Model Read Text Chapter 3 Laks VS Lakshmanan; Based on Ramakrishnan & Gehrke, DB Management Systems Learning Goals given an ER model of an application, design a minimum number of correct tables that capture the information in it given an ER model with inheritance relations, weak entities and aggregations, design the right tables for it given a table design, create correct tables for this design in SQL, including primary and foreign key constraints compare different table designs for the same problem, identify errors and provide corrections Unit 3 2 Historical Perspective Introduced by Edgar Codd (IBM) in 1970 Most widely used model today. Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc. “Legacy systems” are usually hierarchical or network models (i.e., not relational) e.g., IMS, IDMS, … Unit 3 3 Historical Perspective Competitor: object-oriented model ObjectStore, Versant, Ontos A synthesis emerging: object-relational model o Informix Universal Server, UniSQL, O2, Oracle, DB2 Recent competitor: XML data model In all cases, relational systems have been extended to support additional features, e.g., objects, XML, text, images, … Unit 3 4 Main Characteristics of the Relational Model Exceedingly simple to understand All kinds of data abstracted and represented as a table Simple query language separate from application language Lots of bells and whistles to do complicated things Unit 3 5 Structure of Relational Databases Relational database: a set of relations Relation: made up of 2 parts: Schema : specifies name of relation, plus name and domain (type) of each field (or column or attribute). o e.g., Student (sid: string, name: string, address: string, phone: string, major: string).
    [Show full text]
  • Chapter 17: Parallel Databases
    ChapterChapter 18:18: ParallelParallel DatabasesDatabases IntroductionIntroduction • Parallel machines are becoming quite common and affordable – Prices of microprocessors, memory and disks have dropped sharply – Recent desktop computers feature multiple processors and this trend is projected to accelerate • Databases are growing increasingly large – large volumes of transaction data are collected and stored for later analysis. – multimedia objects like images are increasingly stored in databases • Large-scale parallel database systems increasingly used for: – storing large volumes of data – processing time-consuming decision-support queries – providing high throughput for transaction processing ParallelismParallelism inin DatabasesDatabases • Data can be partitioned across multiple disks for parallel I/O. • Individual relational operations (e.g., sort, join, aggregation) can be executed in parallel – data can be partitioned and each processor can work independently on its own partition. • Queries are expressed in high level language (SQL, translated to relational algebra) – makes parallelization easier. • Different queries can be run in parallel with each other. Concurrency control takes care of conflicts. • Thus, databases naturally lend themselves to parallelism. I/OI/O ParallelismParallelism • Reduce the time required to retrieve relations from disk by partitioning • The relations on multiple disks. • Horizontal partitioning – tuples of a relation are divided among many disks such that each tuple resides on one disk. • Partitioning techniques (number of disks = n): Round-robin: Send the I th tuple inserted in the relation to disk i mod n. Hash partitioning: – Choose one or more attributes as the partitioning attributes. – Choose hash function h with range 0…n - 1 – Let i denote result of hash function h applied to the partitioning attribute value of a tuple.
    [Show full text]
  • Data Dictionary for HANDS Version 12.4.1
    Data Dictionary for HANDS Version 12.4.1 Date Prepared: 6/26/2019 6/26/2019 11:15:29PM Data Dictionary for HANDS Version 12.4.1 Table of Contents 1. Tablespaces................................................................Page 3 Includes Tablespace Names and file system names. Also, base tables and indexes include their tablespace name. 2. Tables..........................................................................Page 4 Includes Column, Constraint, Index, and Dependency Information, as well as creation dates, row counts, and descriptions of all base tables and columns 3. Views..........................................................................Page 713 Includes Column and Referenced Table Information 4. Functions....................................................................Page 742 Includes Function Return Types, Input and Output Parameters and Parameter Types 5. Stored Procedures......................................................Page 745 Includes Input Parameters and Parameter Types 6. Sequences..................................................................Page 755 Includes Minimum, Maximum, and Incremental Values and Last Number Assigned 7. Description Quick Reference.......................................Page 773 Listing of Object name and description for quick reference A. Tablespace Reference.............................Page 774 B. Table Reference.....................................Page 807 B. View Reference......................................Page 845 C. Column Reference..................................Page
    [Show full text]
  • The Relational Model
    The Relational Model Watch video: https://youtu.be/gcYKGV-QKB0?t=5m15s Topics List • Relational Model Terminology • Properties of Relations • Relational Keys • Integrity Constraints • Views Relational Model Terminology • A relation is a table with columns and rows. • Only applies to logical structure of the database, not the physical structure. • Attribute is a named column of a relation. • Domain is the set of allowable values for one or more attributes. Relational Model Terminology • Tuple is a row of a relation. • Degree is the number of attributes in a relation. • Cardinality is the number of tuples in a relation. • Relational Database is a collection of normalised relations with distinct relation names. Instances of Branch and Staff Relations Examples of Attribute Domains Alternative Terminology for Relational Model Topics List • Relational Model Terminology • Properties of Relations • Relational Keys • Integrity Constraints • Views Properties of Relations • Relation name is distinct from all other relation names in relational schema. • Each cell of relation contains exactly one atomic (single) value. • Each attribute has a distinct name. • Values of an attribute are all from the same domain. Properties of Relations • Each tuple is distinct; there are no duplicate tuples. • Order of attributes has no significance. • Order of tuples has no significance, theoretically. Topics List • Relational Model Terminology • Properties of Relations • Relational Keys • Integrity Constraints • Views Relational Keys • Superkey • Super key stands for superset of a key. A Super Key is a set of one or more attributes that are taken collectively and can identify all other attributes uniquely. • An attribute, or set of attributes, that uniquely identifies a tuple within a relation.
    [Show full text]
  • Full UDS Data Dictionary
    Full UDS Data Dictionary • Overview • Concepts o Normalized and Denormalized Data o Data Load / Snapshots o Data Quality o No Fake / Test Data in Production o Best Practices for Querying the UDS • Data Model Diagram • Submissions o Data Sources ▪ Data Source documentation ▪ Data ingestion processes o Tables ▪ submission ▪ Description ▪ Columns ▪ measurement_set ▪ Description ▪ Columns ▪ measurement ▪ Description ▪ Columns ▪ Measurement Type Field Mappings ▪ multi_performance_rate_stratum ▪ Description ▪ Columns ▪ acr_measurement_count ▪ Description ▪ Columns ▪ submission_audit_log ▪ Description ▪ Columns ▪ submission_request_log ▪ Description ▪ Columns ▪ submission_request_json_flattened ▪ Description ▪ Columns • Submission Scoring o Data Sources ▪ Data Source documentation ▪ Data ingestion processes o Submission Scoring Discrepancies o Tables ▪ submission_score ▪ Description ▪ Columns ▪ score_part ▪ Description ▪ Columns ▪ category_score ▪ Description ▪ Columns ▪ measurement_set_score_part ▪ Description ▪ Columns ▪ measurement_score ▪ Description ▪ Columns ▪ measurement_score_part ▪ Description ▪ Columns ▪ measures_picked ▪ Description ▪ Columns • Final Scoring o Data Sources ▪ Data Source documentation ▪ Performance Year Changes ▪ Data ingestion processes o Tables ▪ final_score ▪ Description ▪ Columns ▪ candidate_score ▪ Description ▪ Columns ▪ targeted_review ▪ Description ▪ Columns ▪ hvbp_provider_score/hvbp_provider_preview ▪ Description ▪ Columns ▪ hvbp_score_percentiles/hvbp_preview_percentiles ▪ Description ▪ Columns ▪ payment_adjustment
    [Show full text]