Database Management Systems

Fall 2007 CS 370 / IT 376 Exam 2 Page 1 Database Management Systems 11/7/07 Name______

1. True/False. [20 pts]

______ JDBC is a statement-level interface and access protocol for embedded SQL.
______ A prepared statement in JDBC is a static SQL statement that cannot be altered before execution.
______ Changes to the result set with a scroll-sensitive cursor also change the database.
______ The SQLSTATE system variable is an integer type that contains standard return codes.
______ A terabyte (a trillion bytes) is considered a medium- to large-sized database.
______ Disk accesses are typically a million times slower than RAM accesses.
______ ODBC is a database access protocol that is for object-oriented languages only.
______ A cylinder on a disk is typically composed of one track per platter surface.
______ A heap or pile file maintains the data in purely chronological order of insertion.
______ A table's tuples stored with a B+-tree index will be clustered by the primary key.
______ Entity-relationship modeling is part of the UML system of design tools.
______ Person ISA sub-entities such as (employee, client, donor) are examples of mutually exclusive specialization.
______ Extendible hashing may require an additional disk access per random read but eliminates the need to rehash a file to accommodate a larger, growing data set.
______ CASCADE DELETE in a weak entity table is appropriate when referencing a strong entity.
______ INSERTing into a view that is based on a projection is always acceptable.
______ Null values are equivalent to zero for numeric attributes in relational databases.
______ Nulls are not permitted in a 1NF table.
______ NOT NULL UNIQUE tags on an attribute imply a candidate key.
______ Hash files are good for queries that request a range of keys.
______ Telephone numbers are best stored as DECIMAL(10).

2. Explain what a logical transaction is. Why do we have them? How is it started? How is it terminated? [7 pts]

A logical transaction is a sequence of SQL statements that forms an all-or-nothing block. If any statement in the sequence fails, the effects of all the statements are discarded. Only if all of them succeed does the database record the changes. A transaction starts with its first statement and ends with a COMMIT or ROLLBACK statement.
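The all-or-nothing behavior can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The account table, its CHECK constraint, and the transfer function are hypothetical and are not part of the exam. The transaction begins implicitly with the first UPDATE and ends with an explicit commit or rollback.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing unit."""
    try:
        # The first data-changing statement implicitly opens the transaction.
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                     (amount, dst))
        # This statement fails the CHECK constraint if src would go negative.
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                     (amount, src))
        conn.commit()        # both changes become permanent together
        return True
    except sqlite3.Error:
        conn.rollback()      # the earlier credit is undone as well
        return False

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

first = transfer(conn, 1, 2, 30)     # succeeds
second = transfer(conn, 1, 2, 500)   # debit fails, so the credit is rolled back
```

In the failed transfer the credit to account 2 has already executed when the debit fails, yet neither change survives the rollback.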

3. Given the relation schema

Customer(CID, CustName, OrderID, Address, PhoneNum, OrderDate, OrderTotal, ItemID, ItemPrice, ItemName, Quantity)

and the functional dependencies below, generate a 3NF decomposition and show the resulting relations. Note that the functional dependencies are not a minimal cover, meaning that some of the listed functional dependencies are superfluous. [11 pts]

CID → CustName, Address
(CID, OrderID) → OrderDate
OrderID → CID, OrderTotal
(OrderID, ItemID) → ItemPrice, Quantity
ItemID → ItemPrice, ItemName
Phone → CID
Phone → CustName

Customers(CID, CustName, Address)

Orders(OrderID, CID, OrderDate, OrderTotal)

Items(ItemID,ItemName,ItemPrice)

OrderItems(OrderID,ItemID,Quantity)

Phones(CID,Phone)
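One way to check a decomposition like this is to compute attribute closures under the given functional dependencies. The sketch below is a hedged illustration and not part of the original answer key: the closure function and the FDS list are names introduced here, and the dependencies are transcribed from the question as written (using Phone rather than PhoneNum).

```python
def closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire any dependency whose left side is already covered.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Functional dependencies from the question, as (left side, right side) pairs.
FDS = [
    ({"CID"}, {"CustName", "Address"}),
    ({"CID", "OrderID"}, {"OrderDate"}),
    ({"OrderID"}, {"CID", "OrderTotal"}),
    ({"OrderID", "ItemID"}, {"ItemPrice", "Quantity"}),
    ({"ItemID"}, {"ItemPrice", "ItemName"}),
    ({"Phone"}, {"CID"}),
    ({"Phone"}, {"CustName"}),
]

order_attrs = closure({"OrderID"}, FDS)

# A dependency is superfluous if the remaining ones still imply it.
without_last = FDS[:-1]
phone_attrs = closure({"Phone"}, without_last)
```

Here order_attrs contains OrderDate, since OrderID determines CID and (CID, OrderID) determines OrderDate. phone_attrs still contains CustName, which shows that Phone → CustName is one of the superfluous dependencies.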

4. Interpret what each of the following complementary SQL triggers accomplishes. Any syntax errors are not intended, and there are some extensions to the grammar. [7 pts]

CREATE TRIGGER INV_TR
BEFORE INSERT ON ORDER_ITEM
REFERENCING NEW AS P
FOR EACH ROW
WHEN (P.Quantity <= (SELECT T.QuantityOnHand FROM ITEMS T WHERE P.ItemID = T.ItemID))
UPDATE ITEMS SET QuantityOnHand = QuantityOnHand - P.Quantity WHERE P.ItemID = ITEMS.ItemID;

As items are added to an order, the corresponding quantity on hand is reduced, provided enough stock is on hand to cover the requested quantity.
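The effect of the inventory trigger can be tried out with a small sketch in Python's sqlite3 module. This is an adaptation and not the exam's exact syntax: SQLite has no REFERENCING clause, so the new row is referred to as NEW, and the trigger body is wrapped in BEGIN ... END. The table layouts and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (ItemID INTEGER PRIMARY KEY, QuantityOnHand INTEGER NOT NULL);
CREATE TABLE order_item (OrderID INTEGER, ItemID INTEGER, Quantity INTEGER);

-- SQLite rendering of INV_TR: fire only when stock covers the request.
CREATE TRIGGER inv_tr BEFORE INSERT ON order_item
FOR EACH ROW
WHEN NEW.Quantity <= (SELECT QuantityOnHand FROM items WHERE ItemID = NEW.ItemID)
BEGIN
    UPDATE items SET QuantityOnHand = QuantityOnHand - NEW.Quantity
    WHERE ItemID = NEW.ItemID;
END;

INSERT INTO items VALUES (1, 10);
""")

def on_hand(item_id):
    row = conn.execute("SELECT QuantityOnHand FROM items WHERE ItemID = ?",
                       (item_id,)).fetchone()
    return row[0]

conn.execute("INSERT INTO order_item VALUES (100, 1, 3)")
after_small_order = on_hand(1)    # stock covers 3, so the trigger fires

conn.execute("INSERT INTO order_item VALUES (101, 1, 20)")
after_large_order = on_hand(1)    # stock does not cover 20, so nothing changes

order_rows = conn.execute("SELECT COUNT(*) FROM order_item").fetchone()[0]
```

Note that when the WHEN condition is false the trigger simply does not run. The oversized order row is still inserted, so rejecting such orders would need a separate mechanism.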

CREATE TRIGGER OTOTAL_TR
AFTER INSERT ON ORDER_ITEM
REFERENCING NEW AS P
FOR EACH ROW
UPDATE ORDER SET OrderTotal = OrderTotal + P.Quantity*P.ItemPrice WHERE P.OrderID = ORDER.OrderID;

As items are added to an order, the order's total cost is adjusted upward automatically.

5. The following components are crucial for successful access to a database from Java or in any embedded SQL setting. Give a brief explanation of what each of the objects contains AND accomplishes. [18 pts]

a. Driver: This object provides the driver software required for the type of database being accessed: PostgreSQL, MySQL, SQL Server, Access, etc.

b. Connection: This object establishes the path to the database server to transmit requests and receive results. Authorization to a particular database on the server is set here.

c. PreparedStatement: This object holds a SQL statement that may have one or more placeholders for program variable values to be inserted prior to submission to the server for execution. Its methods replace the placeholders and initiate the execution.

d. ResultSet: This object receives the data table that is returned by the query. It provides methods to step through the tuples and access values within each tuple. Updates to the data may also be managed within this object.

e. Cursor: Internal to the result set, this object tracks the current tuple for access by the host program and controls how the next tuple is selected.
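The roles above can be seen in miniature in the following sketch. It uses Python's DB-API with the built-in sqlite3 module and not JDBC itself, so the mapping is only approximate: the sqlite3 module plays the part of the driver, the `?` markers act as the prepared-statement placeholders, and the DB-API cursor combines the jobs of JDBC's ResultSet and its internal cursor. The table contents are made-up sample data.

```python
import sqlite3

# Connection: the path to the database (an in-memory one for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE driver (driverLicNum TEXT PRIMARY KEY, "
             "driverName TEXT, startYear INTEGER)")
conn.executemany("INSERT INTO driver VALUES (?, ?, ?)", [
    ("A100", "Alvarez", 1999),   # hypothetical rows
    ("B200", "Brandt", 2003),
    ("C300", "Chen", 2005),
])

# Statement with a placeholder; the value is bound at execution time
# and is never spliced into the SQL text.
query = ("SELECT driverName, startYear FROM driver "
         "WHERE startYear >= ? ORDER BY driverName")

cur = conn.cursor()
cur.execute(query, (2002,))

# The cursor tracks the current tuple; iterating advances it one row at a time.
names = []
for name, year in cur:
    names.append(name)

conn.close()
```

Binding values through placeholders keeps program data separate from the SQL text, which is also the usual defense against SQL injection.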

6. Disk drives.

a. If a disk has sectors that are 2048 bytes and tuple sizes are 51 bytes, describe how the data in a large relational table of those tuples would likely be broken up. [4 pts]

The likely arrangement is to store up to 40 tuples per block/sector: 40*51 is 2040 bytes, leaving 8 bytes unused. Much depends on the file structure. A B+-tree may result in a varying number of records (20-40) in a block.
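The arithmetic behind this answer is short enough to spell out. The sketch below assumes tuples are never split across sectors and that there is no per-sector header. The sectors_needed helper and the one-million-tuple table size are illustrative and do not come from the exam.

```python
SECTOR_BYTES = 2048
TUPLE_BYTES = 51

# Whole tuples only: a tuple is not split across two sectors (assumption).
tuples_per_sector = SECTOR_BYTES // TUPLE_BYTES
used_bytes = tuples_per_sector * TUPLE_BYTES
unused_bytes = SECTOR_BYTES - used_bytes

def sectors_needed(num_tuples):
    """Sectors required for a fully packed file (ceiling division)."""
    return -(-num_tuples // tuples_per_sector)

example = sectors_needed(1_000_000)   # hypothetical table size
```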

b. What are the three components of time that account for a random disk drive access? [3 pts]

Seek time (movement of the r/w heads) + Latency (time for the disk to spin the desired sector past the head) + Transfer (time for the entire sector to pass the r/w head)

c. Give an estimate (ballpark) for the typical total access time (within one order of magnitude). [2 pts]

5-25 msec
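A back-of-the-envelope calculation shows how the three components add up to a figure in this range. All drive parameters below are assumptions chosen as plausible values for a drive of that era. They are not taken from the exam.

```python
# Assumed drive characteristics (hypothetical, for illustration only).
RPM = 7200                 # spindle speed
AVG_SEEK_MS = 9.0          # average head movement time
TRANSFER_MB_PER_S = 50.0   # sustained transfer rate
SECTOR_BYTES = 2048

rotation_ms = 60_000 / RPM            # time for one full revolution
latency_ms = rotation_ms / 2          # on average, wait half a revolution
transfer_ms = SECTOR_BYTES / (TRANSFER_MB_PER_S * 1_000_000) * 1000

total_ms = AVG_SEEK_MS + latency_ms + transfer_ms
```

With these numbers, seek and rotational latency dominate. The transfer of a single sector contributes only a few hundredths of a millisecond.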

7. Construct the index node(s) for the B+-tree below. The number of values that will fit in one node is 5. For degree = 6, you normally have index nodes with two to five keys in them (the root can have just one). Initially for level 1, assume that you have one node, which is the root for now. Show it and the pointers for the data below. [8 pts]

Root level

Level 1:

Data Nodes: (2-3-5-9-13) (17-19-23-28-35) (40-43-45-50-54) (59-63-66) (70-76-81) (83-88-90)

Now show how the B+-tree appears after inserting the value 44. You may need to generate a new root level.
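The mechanics of this question can be traced with a small sketch. It is one possible working under stated conventions and not the official answer: each separator key is taken to be the smallest key of the leaf to its right, an overflowing leaf splits evenly, and an overflowing index node pushes its middle key up. Other textbooks split slightly differently, which would give a different but equally valid tree. All function names are introduced here for illustration.

```python
from bisect import bisect_right

CAPACITY = 5  # values per node, as stated in the question

def separators(leaves):
    """Index keys for a run of leaves: the smallest key of each leaf after the first."""
    return [leaf[0] for leaf in leaves[1:]]

def insert(leaves, key):
    """Insert `key` into the proper leaf, splitting that leaf on overflow."""
    i = bisect_right(separators(leaves), key)   # leaf whose range covers key
    leaf = sorted(leaves[i] + [key])
    if len(leaf) <= CAPACITY:
        return leaves[:i] + [leaf] + leaves[i + 1:]
    mid = (len(leaf) + 1) // 2                  # even split (assumed convention)
    return leaves[:i] + [leaf[:mid], leaf[mid:]] + leaves[i + 1:]

def split_index(keys):
    """Split an overflowing index node; the middle key moves up to the parent."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]

leaves = [[2, 3, 5, 9, 13], [17, 19, 23, 28, 35], [40, 43, 45, 50, 54],
          [59, 63, 66], [70, 76, 81], [83, 88, 90]]

root_before = separators(leaves)          # fits in a single node
leaves_after = insert(leaves, 44)         # the (40-43-45-50-54) leaf is full
keys_after = separators(leaves_after)     # six keys: one too many for a node
left, new_root_key, right = split_index(keys_after)
```

Under these conventions the single root initially holds 17, 40, 59, 70, 83. Inserting 44 splits the third leaf into (40-43-44) and (45-50-54), the index node overflows with six keys, and a new root holding 59 is created above index nodes (17, 40, 45) and (70, 83).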

For the SQL queries, use the following relational schema for a moving company database. Keys are underlined. The attributes should be self-evident. If not, please ask for clarification. (It is simplified from the previous exam.)

CUSTOMER(custID, custName, currentAddr, currentZip, currentPhone, newAddr, newZip, newPhone)
CSZ(zip, city, state)
INVOICE(invID, custID, dateOfMove, dateOfDelivery, estWeight, estCost, mileage, finalCost, finalTotalBill)
TRUCK(truckID, capacity, make, yearBought)
DRIVER(driverLicNum, driverName, startYear)
RUN(invID, truckID, driverLicNum, dateStart, dateEnd)

Syntax for SQL, where [] means optional and {op1|op2|...} means choice:

SELECT [DISTINCT] {* | attribute-list | aggregate functions}...
FROM tables-list and aliases
WHERE condition
[GROUP BY attr [HAVING condition]]

SQL conditions consist of <, >, <=, >=, <>, =, AND, OR, BETWEEN value AND value, [NOT] EXISTS ({list | SELECT...}), rel-op {ANY|SOME|ALL} ({list | SELECT...}), IS [NOT] NULL

Aggregate functions: COUNT(*), MIN(attr), MAX(attr), SUM(attr), AVG(attr)

8. Give the SQL select statements for the following queries. [20 pts]

a) List all drivers who have worked for the company since 2002.

SELECT d.driverName FROM DRIVER d WHERE d.startYear >= 2002;

b) List customer names whose moves were more than 1000 miles.

SELECT c.custName FROM CUSTOMER c, INVOICE i WHERE c.custID=i.custID AND i.mileage>1000;

c) Hazardous material (mercury) was found on truck ID ‘356’ and it is not known for how long the material was present. List all names who have driven this truck ‘356’.

SELECT d.driverName FROM DRIVER d, RUN r WHERE d.driverLicNum=r.driverLicNum AND r.truckID='356';

d) List all customers and their new phone numbers whose moves used that contaminated truck ‘356’.

SELECT c.custName, c.newPhone FROM CUSTOMER c, INVOICE i, RUN r WHERE c.custID=i.custID AND i.invID=r.invID AND r.truckID='356';

e) List the zipcodes of destinations of moves where the total costs in that zipcode were greater than $50000 for the moves so far this year. (You need to use group by – having….)

SELECT c.newZip FROM CUSTOMER c, INVOICE i WHERE i.dateOfMove >= '1-1-2007' AND c.custID = i.custID GROUP BY c.newZip HAVING SUM(i.finalCost) > 50000;
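The GROUP BY ... HAVING pattern in query (e) can be exercised on a tiny data set with Python's sqlite3 module. Everything about the data is an assumption made for illustration: the tables are cut down to the columns the query touches, the rows are invented, and dates are stored as ISO strings (YYYY-MM-DD) because SQLite compares them as text. The cost is read from INVOICE, which is where the schema places finalCost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (custID INTEGER PRIMARY KEY, custName TEXT, newZip TEXT);
CREATE TABLE invoice (invID INTEGER PRIMARY KEY, custID INTEGER,
                      dateOfMove TEXT, finalCost REAL);

-- Hypothetical sample rows.
INSERT INTO customer VALUES (1, 'Ames', '46383'), (2, 'Baker', '46383'),
                            (3, 'Cole', '60601');
INSERT INTO invoice VALUES (10, 1, '2007-03-01', 30000),
                           (11, 2, '2007-06-15', 25000),
                           (12, 3, '2007-02-10', 40000),
                           (13, 3, '2006-12-30', 20000);
""")

QUERY = """
SELECT c.newZip, SUM(i.finalCost)
FROM customer c, invoice i
WHERE i.dateOfMove >= '2007-01-01'   -- row filter, applied before grouping
  AND c.custID = i.custID
GROUP BY c.newZip
HAVING SUM(i.finalCost) > 50000      -- group filter, applied after grouping
"""

rows = conn.execute(QUERY).fetchall()
```

Zip 46383 qualifies with two 2007 moves totaling 55000. Zip 60601 does not, because its 2006 invoice is removed by WHERE before the group total is computed.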