CISC 7610 Lecture 2 Review of Relational Databases


Topics:
● Relational database management systems
● Example data modeling problem
● Entity-relationship diagrams
● Structured query language

A relational database management system (RDBMS)
● Uses relational data structures
● Has a declarative data manipulation language at least as powerful as the relational algebra
● Not required, but typically also
  – Supports ACID transactions
  – Uses SQL as the data manipulation language

Uses relational data structures
● Relation: table with rows and columns
● Attribute: column
● Tuple: row
● Key: combination of attributes that uniquely identifies each row
● Integrity rules: constraints imposed upon the database

Has a declarative data manipulation language
● Declarative: says what to do with the data, not how to do it
● Relational algebra
  – Selection: extract a subset of tuples
  – Projection: extract a subset of attributes
  – Cartesian product: extract all combinations of pairs of tuples from two relations
  – Union: combine two sets of tuples
  – Set difference: remove one set of tuples from another

Supports ACID transactions
● Transaction: a sequence of DB operations that represents a single real-world operation
● ACID properties
  – Guaranteed by RDBMSs
  – Atomicity: all operations happen or none do
  – Consistency: a transaction moves the DB from one state that meets the integrity constraints to another
  – Isolation: concurrent transactions have the same effect as if they were run serially
  – Durability: once committed, a transaction’s effects are permanent
● Example: bank account transfer
● Relaxed by NoSQL databases in various combinations

Structured query language (SQL)
● Data definition language
  – Define relational schemata (plural of schema)
  – Create/alter/delete tables and their attributes
● Data manipulation language
  – Insert/delete/modify tuples in relations
  – Query one or more tables
● Can implement relational algebra, but also takes some liberties with it

Example data: Music collection
● Artists: Name
● Albums: Name, Release date
● Tracks: Name, Duration, Number
● Each album has one artist
● Tracks can appear on multiple albums (compilations)

Schema normalization: Unnormalized data

Artist       Album                 Released  Track Num  Track             Dur
David Bowie  Space Oddity          1969      1          Space Oddity      5:15
David Bowie  … Ziggy Stardust ...  1972      10         Suffragette City  3:25
David Bowie  Best of Bowie         2002      1          Space Oddity      5:15
David Bowie  Best of Bowie         2002      8          Suffragette City  3:25
Queen        Hot Space             1982      11         Under Pressure    4:02

Entity-relationship diagrams
[Figure: ER diagram notation, with parts labeled Attribute, Entity, Cardinality, Relationship, Cardinality2, Entity2]

Do: Draw ER diagram for ex data
● Artists: Name
● Albums: Name, Release date
● Tracks: Name, Duration, Number
● Each album has one artist
● Tracks can appear on multiple albums (compilations)

Translating ER diagrams to schema
● Entities become tables
● Attributes become their attributes
● Many-to-many relationships become join tables
  – Can have additional attributes
● Other relationships become foreign keys
  – One-to-one, many-to-one, one-to-many
  – Attributes added to table

Do: Translate ER diagram to schema for example data

SQL CREATE statement
    CREATE TABLE table_name (
        column_name1 data_type(size),
        column_name2 data_type(size),
        column_name3 data_type(size),
        ....
    );

Do: Create tables for example data

SQL INSERT statement
    INSERT INTO table_name (column1, column2, column3, ...)
    VALUES (value1, value2, value3, ...);

Do: Populate tables with ex data

Artists
Id  Name
1   David Bowie
2   Queen

Albums
Id  Name                  Release  ArtistId
1   Space Oddity          1969     1
2   … Ziggy Stardust ...  1972     1
3   Best of Bowie         2002     1
4   Hot Space             1982     2

Track
Id  Name              Duration
1   Space Oddity      5:15
2   Suffragette City  3:25
3   Under Pressure    4:02

AlbumsHaveTracks
AlbumId  TrackId  Number
1        1        1
2        2        10
3        1        1
3        2        8
4        3        11

Schema normalization: Anomalies in unnormalized data
● The unnormalized example table can suffer from three types of “anomalies”
  – Update anomaly: repeated data could become inconsistent between rows
  – Insertion anomaly: can’t add info on an artist or album without a track
  – Deletion anomaly: deleting the last track of an album also deletes the info on that album or artist

Schema normalization: Normal forms
● Schema normalization factors logically independent data into independent relations
● And links them using foreign key relationships
● Projection is the operation used to factor an unnormalized relation into separate normalized relations
● Boyce-Codd normal form: every non-trivial functional dependency goes from a superkey (a set of attributes that uniquely identifies a tuple) to other attributes

Schema normalization: Unnormalized data

Artist       Album                 Released  Track Num  Track             Dur
David Bowie  Space Oddity          1969      1          Space Oddity      5:15
David Bowie  … Ziggy Stardust ...  1972      10         Suffragette City  3:25
David Bowie  Best of Bowie         2002      1          Space Oddity      5:15
David Bowie  Best of Bowie         2002      8          Suffragette City  3:25
Queen        Hot Space             1982      11         Under Pressure    4:02

Schema normalization: Normalized data

Artists
Id  Name
1   David Bowie
2   Queen

Albums
Id  Name                  Release  ArtistId
1   Space Oddity          1969     1
2   … Ziggy Stardust ...  1972     1
3   Best of Bowie         2002     1
4   Hot Space             1982     2

Track
Id  Name              Duration
1   Space Oddity      5:15
2   Suffragette City  3:25
3   Under Pressure    4:02

AlbumsHaveTracks
AlbumId  TrackId  Number
1        1        1
2        2        10
3        1        1
3        2        8
4        3        11

Reminder: Main question of the course
● How can systems process and store multimedia data so that users can find what they are looking for in the future?
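
One possible answer to the "create tables" and "populate tables" exercises above is sketched here with Python's built-in sqlite3 module. The column types, the primary and foreign key constraints, the choice to store durations as whole seconds, and the helper name build_music_db are all assumptions made for illustration; the slides give only the table and column names. The elided album title is kept exactly as elided. Joining the four normalized tables reproduces the rows of the unnormalized table.

```python
import sqlite3

# Hypothetical DDL for the normalized schema. Types and constraints are
# assumptions; "Release" is quoted because RELEASE is an SQLite keyword.
SCHEMA = """
CREATE TABLE Artists (
    Id   INTEGER PRIMARY KEY,
    Name TEXT NOT NULL
);
CREATE TABLE Albums (
    Id        INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL,
    "Release" INTEGER,
    ArtistId  INTEGER NOT NULL REFERENCES Artists(Id)
);
CREATE TABLE Track (
    Id       INTEGER PRIMARY KEY,
    Name     TEXT NOT NULL,
    Duration INTEGER  -- seconds (assumption; the slides show m:ss)
);
CREATE TABLE AlbumsHaveTracks (
    AlbumId INTEGER NOT NULL REFERENCES Albums(Id),
    TrackId INTEGER NOT NULL REFERENCES Track(Id),
    Number  INTEGER,
    PRIMARY KEY (AlbumId, TrackId)
);
"""


def build_music_db():
    """Create an in-memory database holding the example music collection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO Artists (Id, Name) VALUES (?, ?)",
        [(1, "David Bowie"), (2, "Queen")],
    )
    conn.executemany(
        'INSERT INTO Albums (Id, Name, "Release", ArtistId) VALUES (?, ?, ?, ?)',
        [
            (1, "Space Oddity", 1969, 1),
            (2, "… Ziggy Stardust ...", 1972, 1),  # title elided in the slides
            (3, "Best of Bowie", 2002, 1),
            (4, "Hot Space", 1982, 2),
        ],
    )
    conn.executemany(
        "INSERT INTO Track (Id, Name, Duration) VALUES (?, ?, ?)",
        [(1, "Space Oddity", 315), (2, "Suffragette City", 205), (3, "Under Pressure", 242)],
    )
    conn.executemany(
        "INSERT INTO AlbumsHaveTracks (AlbumId, TrackId, Number) VALUES (?, ?, ?)",
        [(1, 1, 1), (2, 2, 10), (3, 1, 1), (3, 2, 8), (4, 3, 11)],
    )
    conn.commit()
    return conn


conn = build_music_db()

# Joining the normalized tables rebuilds the rows of the unnormalized table.
rows = conn.execute(
    """
    SELECT ar.Name, al.Name, al."Release", aht.Number, t.Name, t.Duration
    FROM Artists AS ar
    JOIN Albums AS al ON al.ArtistId = ar.Id
    JOIN AlbumsHaveTracks AS aht ON aht.AlbumId = al.Id
    JOIN Track AS t ON t.Id = aht.TrackId
    ORDER BY al.Id, aht.Number
    """
).fetchall()

for row in rows:
    print(row)
```

In this layout each artist, album, and track name is stored once, so a rename touches a single row, and an artist can be inserted before any of their tracks exist.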
Queries: find what they are looking for
● Search through the data
● Search through complex relationships
● Aggregate over the data for reporting
● And do all of this efficiently...

SQL SELECT, single table
    SELECT attribute1, attribute2
    FROM relation
    WHERE attribute1 = 'condition'
    ORDER BY attribute2;

Do: Write a select query to answer
● What is the duration of “Suffragette City”?

SQL SELECT, multiple tables
    SELECT r1.attribute1, r2.attribute1
    FROM relation1 AS r1, relation2 AS r2
    WHERE r1.attribute1 = 'condition'
      AND r1.attribute1 = r2.attribute2
    ORDER BY r1.attribute1;

Do: Write a select query to answer
● Find the AlbumIds of all of David Bowie's albums

Do: Write a select query to answer
● Find the TrackIds of all of David Bowie's tracks

Do: Write a select query to answer
● Find all songs containing David Bowie's vocals
● Find all songs at 120 beats per minute
● Find all songs sampled by other artists
  – These all require further modeling or analysis of the audio...

How do we make databases that are
● Effective (correct, durable, coherent, ...)
  – Transactions
● Efficient
  – Concurrency
  – Memory hierarchy
  – Indexing
  – Query optimization
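
The first three "Do: Write a select query" exercises above could be answered along the following lines. This is a minimal sketch using Python's built-in sqlite3 module against a compact copy of the example data; the column types, the durations stored as whole seconds, and the Python variable names are assumptions for illustration, not something the slides specify. The elided album title is kept as elided.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Compact, assumed version of the normalized example schema and data.
conn.executescript(
    """
    CREATE TABLE Artists (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Albums (Id INTEGER PRIMARY KEY, Name TEXT,
                         "Release" INTEGER, ArtistId INTEGER);
    CREATE TABLE Track (Id INTEGER PRIMARY KEY, Name TEXT,
                        Duration INTEGER);  -- seconds (assumption)
    CREATE TABLE AlbumsHaveTracks (AlbumId INTEGER, TrackId INTEGER,
                                   Number INTEGER);

    INSERT INTO Artists VALUES (1, 'David Bowie'), (2, 'Queen');
    INSERT INTO Albums VALUES
        (1, 'Space Oddity', 1969, 1),
        (2, '… Ziggy Stardust ...', 1972, 1),
        (3, 'Best of Bowie', 2002, 1),
        (4, 'Hot Space', 1982, 2);
    INSERT INTO Track VALUES
        (1, 'Space Oddity', 315),
        (2, 'Suffragette City', 205),
        (3, 'Under Pressure', 242);
    INSERT INTO AlbumsHaveTracks VALUES
        (1, 1, 1), (2, 2, 10), (3, 1, 1), (3, 2, 8), (4, 3, 11);
    """
)

# Single-table query: duration of "Suffragette City".
duration = conn.execute(
    "SELECT Duration FROM Track WHERE Name = ?",
    ("Suffragette City",),
).fetchone()[0]

# Two-table query: AlbumIds of all of David Bowie's albums.
bowie_album_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT al.Id
        FROM Artists AS ar, Albums AS al
        WHERE ar.Name = ?
          AND al.ArtistId = ar.Id
        ORDER BY al.Id
        """,
        ("David Bowie",),
    )
]

# Three-table query: TrackIds of all tracks on David Bowie's albums.
# DISTINCT removes tracks repeated on the compilation album.
bowie_track_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT DISTINCT aht.TrackId
        FROM Artists AS ar, Albums AS al, AlbumsHaveTracks AS aht
        WHERE ar.Name = ?
          AND al.ArtistId = ar.Id
          AND aht.AlbumId = al.Id
        ORDER BY aht.TrackId
        """,
        ("David Bowie",),
    )
]

print(duration, bowie_album_ids, bowie_track_ids)  # 205 [1, 2, 3] [1, 2]
```

Because this schema links tracks to artists only through albums, "David Bowie's tracks" here means tracks that appear on his albums; the remaining questions (vocals, tempo, sampling) cannot be expressed against these tables at all.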
