
CISC 7610 Lecture 2
Review of relational databases

Topics:
● Relational database management systems
● Example data modeling problem
● Entity-relationship diagrams
● Structured query language

A relational database management system (RDBMS)
● Uses relational data structures
● Has a declarative data manipulation language at least as powerful as the relational algebra
● Not required, but typically also
  – Supports ACID transactions
  – Uses SQL as the data manipulation language

Uses relational data structures
● Relation: table with rows and columns
● Attribute: column
● Tuple: row
● Key: combination of attributes that uniquely identifies each row
● Integrity rules: constraints imposed upon the database

Has a declarative data manipulation language
● Declarative: says what to do with the data, not how to do it
● Relational algebra
  – Selection: extract a subset of tuples
  – Projection: extract a subset of attributes
  – Cartesian product: form every pair of tuples, one from each of two relations
  – Union: combine two sets of tuples
  – Set difference: remove one set of tuples from another

Supports ACID transactions
● Transaction: a sequence of DB operations that represents a single real-world operation
● ACID properties
  – Guaranteed by RDBMSs
  – Atomicity: all operations happen or none
  – Consistency: transaction moves DB from one state that meets integrity constraints to another
  – Isolation: concurrent transactions have the same effect as if they ran serially
  – Durability: once committed, transaction’s effects are permanent
● Example: bank account transfer
● Relaxed by NoSQL databases in various combinations

Structured query language (SQL)
● Data definition language
  – Define relational schemata (plural of schema)
  – Create/alter/delete tables and their attributes
● Data manipulation language
  – Insert/delete/modify tuples in relations
  – Query one or more tables
● Can implement relational algebra, but also takes some liberties with it

Example data: Music collection
● Artists: Name
● Albums: Name, Release date
● Tracks: Name, Duration, Number
● Each album has one artist
● Tracks can appear on multiple albums (compilations)

Schema normalization: Unnormalized data

Artist       Album                 Released  Num  Track             Dur
David Bowie  Space Oddity          1969      1    Space Oddity      5:15
David Bowie  … Ziggy Stardust ...  1972      10   Suffragette City  3:25
David Bowie  Best of Bowie         2002      1    Space Oddity      5:15
David Bowie  Best of Bowie         2002      8    Suffragette City  3:25
Queen        Hot Space             1982      11   Under Pressure    4:02

Entity-relationship diagrams
[Figure: ER diagram notation. An Entity with an Attribute is connected through a Relationship to Entity2, with a Cardinality label at each end of the relationship.]

Do: Draw ER diagram for example data
● Artists: Name
● Albums: Name, Release date
● Tracks: Name, Duration, Number
● Each album has one artist
● Tracks can appear on multiple albums (compilations)

Translating ER diagrams to schema
● Entities become tables
● Attributes of entities become attributes (columns) of those tables
● Many-to-many relationships become join tables
  – Can have additional attributes
● Other relationships become foreign keys
  – One-to-one, many-to-one, one-to-many
  – Attributes added to table

Do: Translate ER diagram to schema for example data

SQL CREATE statement
    CREATE TABLE table_name (
        column_name1 data_type(size),
        column_name2 data_type(size),
        column_name3 data_type(size),
        ....
    );

Do: Create tables for example data

SQL INSERT statement
    INSERT INTO table_name (column1,column2,column3,...)
    VALUES (value1,value2,value3,...);

Do: Populate tables with example data

Artists
Id  Name
1   David Bowie
2   Queen

Albums
Id  Name                  Release  ArtistId
1   Space Oddity          1969     1
2   … Ziggy Stardust ...  1972     1
3   Best of Bowie         2002     1
4   Hot Space             1982     2

Track
Id  Name              Duration
1   Space Oddity      5:15
2   Suffragette City  3:25
3   Under Pressure    4:02

AlbumsHaveTracks
AlbumId  TrackId  Number
1        1        1
2        2        10
3        1        1
3        2        8
4        3        11

Schema normalization: Anomalies in unnormalized data
● The unnormalized example table can suffer from three types of “anomalies”
  – Update anomaly: repeated data could be inconsistent between rows
  – Insertion anomaly: can’t add info on an artist or album without a track
  – Deletion anomaly: deleting the last track of an album also deletes the album, and possibly the artist

Schema normalization: Normal forms
● Schema normalization factors logically independent data into independent relations
● And links them using foreign key relationships
● Projection is the operation used to factor an unnormalized relation into separate normalized relations
● Boyce-Codd normal form: the only non-trivial functional dependencies are from superkeys (sets of attributes that uniquely identify entities) to other attributes

Schema normalization: Unnormalized data

Artist       Album                 Released  Num  Track             Dur
David Bowie  Space Oddity          1969      1    Space Oddity      5:15
David Bowie  … Ziggy Stardust ...  1972      10   Suffragette City  3:25
David Bowie  Best of Bowie         2002      1    Space Oddity      5:15
David Bowie  Best of Bowie         2002      8    Suffragette City  3:25
Queen        Hot Space             1982      11   Under Pressure    4:02

Schema normalization: Normalized data

Artists
Id  Name
1   David Bowie
2   Queen

Albums
Id  Name                  Release  ArtistId
1   Space Oddity          1969     1
2   … Ziggy Stardust ...  1972     1
3   Best of Bowie         2002     1
4   Hot Space             1982     2

Track
Id  Name              Duration
1   Space Oddity      5:15
2   Suffragette City  3:25
3   Under Pressure    4:02

AlbumsHaveTracks
AlbumId  TrackId  Number
1        1        1
2        2        10
3        1        1
3        2        8
4        3        11

Reminder: Main question of course
How can systems process and store multimedia data so that users can find what they are looking for in the future?
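
The CREATE and INSERT exercises for the normalized music schema can be tried out with a small script. The following is only a sketch of one possible answer, using Python's built-in sqlite3 module to run the SQL. The column types, the PRIMARY KEY and REFERENCES constraints, storing Duration as text, and the helper name build_music_db are assumptions made for illustration; the slides give only the table and column names and the data. The elided album title is kept exactly as it appears in the slides.

```python
import sqlite3


def build_music_db():
    """Create the four normalized tables in memory and load the slide data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Artists (
            Id   INTEGER PRIMARY KEY,
            Name TEXT NOT NULL
        );
        CREATE TABLE Albums (
            Id        INTEGER PRIMARY KEY,
            Name      TEXT NOT NULL,
            "Release" INTEGER,  -- quoted because RELEASE is an SQL keyword
            ArtistId  INTEGER REFERENCES Artists(Id)  -- each album has one artist
        );
        CREATE TABLE Track (
            Id       INTEGER PRIMARY KEY,
            Name     TEXT NOT NULL,
            Duration TEXT  -- assumption: kept as 'm:ss' text, as in the slides
        );
        -- Join table for the many-to-many relationship between albums and tracks
        CREATE TABLE AlbumsHaveTracks (
            AlbumId INTEGER REFERENCES Albums(Id),
            TrackId INTEGER REFERENCES Track(Id),
            Number  INTEGER,
            PRIMARY KEY (AlbumId, TrackId)
        );
    """)
    conn.executemany(
        "INSERT INTO Artists (Id, Name) VALUES (?, ?)",
        [(1, "David Bowie"), (2, "Queen")],
    )
    conn.executemany(
        'INSERT INTO Albums (Id, Name, "Release", ArtistId) VALUES (?, ?, ?, ?)',
        [
            (1, "Space Oddity", 1969, 1),
            (2, "… Ziggy Stardust ...", 1972, 1),
            (3, "Best of Bowie", 2002, 1),
            (4, "Hot Space", 1982, 2),
        ],
    )
    conn.executemany(
        "INSERT INTO Track (Id, Name, Duration) VALUES (?, ?, ?)",
        [(1, "Space Oddity", "5:15"), (2, "Suffragette City", "3:25"),
         (3, "Under Pressure", "4:02")],
    )
    conn.executemany(
        "INSERT INTO AlbumsHaveTracks (AlbumId, TrackId, Number) VALUES (?, ?, ?)",
        [(1, 1, 1), (2, 2, 10), (3, 1, 1), (3, 2, 8), (4, 3, 11)],
    )
    conn.commit()
    return conn


conn = build_music_db()

# Joining the four tables rebuilds the rows of the unnormalized table.
rows = conn.execute("""
    SELECT ar.Name, al.Name, al."Release", aht.Number, t.Name, t.Duration
    FROM Artists AS ar
    JOIN Albums AS al ON al.ArtistId = ar.Id
    JOIN AlbumsHaveTracks AS aht ON aht.AlbumId = al.Id
    JOIN Track AS t ON t.Id = aht.TrackId
    ORDER BY al.Id, aht.Number
""").fetchall()

for row in rows:
    print(row)
```

Each fact is stored once here: renaming an artist touches one row of Artists, and an album can exist with no tracks. Note that SQLite does not enforce the REFERENCES constraints unless foreign keys are switched on with PRAGMA foreign_keys = ON.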

Queries: find what they are looking for
● Search through the data
● Search through complex relationships
● Aggregate over the data for reporting
● And do all of this efficiently...

SQL SELECT, single table
    SELECT attribute1, attribute2
    FROM relation
    WHERE attribute1 = 'condition'
    ORDER BY attribute2;

Do: Write a select query to answer
● What is the duration of “Suffragette City”?

SQL SELECT, multiple tables
    SELECT r1.attribute1, r2.attribute1
    FROM relation1 AS r1, relation2 AS r2
    WHERE r1.attribute1 = 'condition'
      AND r1.attribute1 = r2.attribute2
    ORDER BY r1.attribute1;

Do: Write a select query to answer
● Find the AlbumIds of all of David Bowie's albums

Do: Write a select query to answer
● Find the TrackIds of all of David Bowie's tracks

Do: Write a select query to answer
● Find all songs containing David Bowie's vocals
● Find all songs at 120 beats per minute
● Find all songs sampled by other artists
  – These all require further modeling or analysis of the audio...

How do we make databases that are
● Effective (correct, durable, coherent, ...)
  – Transactions
● Efficient
  – Concurrency
  – Memory hierarchy
  – Indexing
  – Query optimization
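
The first three select exercises in this section can be checked against the example music data. The sketch below shows one possible set of answers, run through Python's built-in sqlite3 module. The column types and the variable names are assumptions for illustration, and "David Bowie's tracks" is read here as the tracks that appear on albums whose artist is David Bowie, since the example schema links tracks to artists only through albums.

```python
import sqlite3

# Assumed setup: the normalized example tables, with types chosen for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Artists (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Albums (Id INTEGER PRIMARY KEY, Name TEXT,
                         "Release" INTEGER, ArtistId INTEGER);
    CREATE TABLE Track (Id INTEGER PRIMARY KEY, Name TEXT, Duration TEXT);
    CREATE TABLE AlbumsHaveTracks (AlbumId INTEGER, TrackId INTEGER, Number INTEGER);

    INSERT INTO Artists VALUES (1, 'David Bowie'), (2, 'Queen');
    INSERT INTO Albums VALUES
        (1, 'Space Oddity', 1969, 1),
        (2, '… Ziggy Stardust ...', 1972, 1),
        (3, 'Best of Bowie', 2002, 1),
        (4, 'Hot Space', 1982, 2);
    INSERT INTO Track VALUES
        (1, 'Space Oddity', '5:15'),
        (2, 'Suffragette City', '3:25'),
        (3, 'Under Pressure', '4:02');
    INSERT INTO AlbumsHaveTracks VALUES
        (1, 1, 1), (2, 2, 10), (3, 1, 1), (3, 2, 8), (4, 3, 11);
""")

# Single table: what is the duration of "Suffragette City"?
duration = conn.execute("""
    SELECT Duration
    FROM Track
    WHERE Name = 'Suffragette City'
""").fetchone()[0]

# Two tables: the AlbumIds of all of David Bowie's albums.
bowie_album_ids = [row[0] for row in conn.execute("""
    SELECT al.Id
    FROM Artists AS ar, Albums AS al
    WHERE ar.Name = 'David Bowie'
      AND al.ArtistId = ar.Id
    ORDER BY al.Id
""")]

# Three tables: the TrackIds of all tracks on David Bowie's albums.
# DISTINCT is needed because a track can appear on several albums.
bowie_track_ids = [row[0] for row in conn.execute("""
    SELECT DISTINCT aht.TrackId
    FROM Artists AS ar, Albums AS al, AlbumsHaveTracks AS aht
    WHERE ar.Name = 'David Bowie'
      AND al.ArtistId = ar.Id
      AND aht.AlbumId = al.Id
    ORDER BY aht.TrackId
""")]

print(duration)          # 3:25
print(bowie_album_ids)   # [1, 2, 3]
print(bowie_track_ids)   # [1, 2]
```

The last group of exercises (vocals, tempo, samples) has no counterpart here: nothing in these tables records that information, which is the point the slide makes about needing further modeling or audio analysis.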