October 2017 Database Management Systems Solution Set

1 a Suppose you want to build a video site similar to YouTube and keep the data in a file-processing system. Discuss the relevance of each of the following points to the storage of actual video data, and to metadata about the video, such as title, the user who uploaded it, tags, and which users viewed it.
i. Data redundancy and inconsistency
ii. Difficulty in accessing data
iii. Data isolation
iv. Integrity problems
v. Atomicity problems
vi. Concurrent-access anomalies
vii. Security problems
Ans 1a
i. Data redundancy and inconsistency. This would be relevant to metadata to some extent, although not to the actual video data, which is not updated. There are very few relationships here, and none of them can lead to redundancy.
ii. Difficulty in accessing data. If video data is only accessed through a few predefined interfaces, as is done in video sharing sites today, this will not be a problem. However, if an organization needs to find video data based on specific search conditions (beyond simple keyword queries) and the metadata were stored in files, it would be hard to find relevant data without writing application programs. Using a database would be important for the task of finding data.
iii. Data isolation. Since data is not usually updated, but instead newly created, data isolation is not a major issue. Even the task of keeping track of who has viewed which videos is (conceptually) append-only, again making isolation not a major issue. However, if authorization is added, there may be some issues of concurrent updates to authorization information.
iv. Integrity problems. It seems unlikely that there are significant integrity constraints in this application, except for primary keys. If the data is distributed, there may be issues in enforcing primary key constraints. Integrity problems are probably not a major issue.
v. Atomicity problems. When a video is uploaded, the video and the metadata about it should be added atomically; otherwise there would be an inconsistency in the data. An underlying recovery mechanism would be required to ensure atomicity in the event of failures.
vi. Concurrent-access anomalies. Since data is not updated, concurrent-access anomalies would be unlikely to occur.
vii. Security problems. These would be an issue if the system supported authorization.

1 b State the advantages and disadvantages of the following data models: Hierarchical, Network, Relational, Entity Relationship, Object Oriented and NoSQL. State if the models support data and structural independence.


1 c State and explain the twelve Codd’s rules for relational databases.


1 d What is Unified modelling language? What are its parts? Show the ER diagram notations and equivalent notations in UML. Ans 1 d The Unified Modeling Language (UML) is a standard developed under the auspices of the Object Management Group (OMG) for creating specifications of various components of a software system. Some of the parts of UML are: • Class diagram: A class diagram is similar to an E-R diagram. • Use case diagram: Use case diagrams show the interaction between users and the system, in particular the steps of tasks that users perform (such as withdrawing money or registering for a course). • Activity diagram: Activity diagrams depict the flow of tasks between various components of a system. • Implementation diagram: Implementation diagrams show the system components and their interconnections, both at the software component level and the hardware component level.


1 e Construct an E-R diagram for a car insurance company whose customers own one or more cars each. Each car has associated with it zero to any number of recorded accidents. Each insurance policy covers one or more cars, and has one or more premium payments associated with it. Each payment is for a particular period of time, and has an associated due date, and the date when the payment was received.
Ans 1 e The E-R diagram is shown in the figure below. Payments are modeled as weak entities since they are related to a specific policy. Note that the participation of accident in the relationship participated is not total, since it is possible that there is an accident report for which the participating car is unknown.

1 f i. Design an E-R diagram for keeping track of the exploits of your favourite sports team. You should store the matches played, the scores in each match, the players in each match, and individual player statistics for each match. Summary statistics should be modelled as derived attributes.


ii. Consider an E-R diagram in which the same entity set appears several times, with its attributes repeated in more than one occurrence. Why is allowing this redundancy a bad practice that one should avoid? The different occurrences of an entity may have different sets of attributes, leading to an inconsistent diagram. Instead, the attributes of an entity should be specified only once. All other occurrences of the entity should omit attributes. Since it is not possible to have an entity without any attributes, an occurrence of an entity without attributes clearly indicates that the attributes are specified elsewhere.

2 a The natural outer-join operations extend the natural-join operation so that tuples from the participating relations are not lost in the result of the join. Describe how the theta join operation can be extended so that tuples from the left, right, or both relations are not lost from the result of a theta join.

2 b Given the following relational schemas: R = (A, B, C) and S = (D, E, F). Suppose the relations r(R) and s(S) are defined. Write the expressions in tuple relational calculus equivalent to each of the following:
i. ∏_A(r)
ii. σ_{B=17}(r)
iii. r × s
iv. ∏_{A,F}(σ_{C=D}(r × s))
Ans 2b
i. {t | ∃ q ∈ r (q[A] = t[A])}
ii. {t | t ∈ r ∧ t[B] = 17}
iii. {t | ∃ p ∈ r ∃ q ∈ s (t[A] = p[A] ∧ t[B] = p[B] ∧ t[C] = p[C] ∧ t[D] = q[D] ∧ t[E] = q[E] ∧ t[F] = q[F])}
iv. {t | ∃ p ∈ r ∃ q ∈ s (t[A] = p[A] ∧ t[F] = q[F] ∧ p[C] = q[D])}
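The four calculus expressions above can be read almost literally as set comprehensions. The following is only an illustrative sketch: the sample tuples in `r` and `s` and the helper `as_set` are assumptions made for the example, not part of the question.

```python
# Hypothetical sample relations; each tuple is a dict keyed by attribute name.
r = [{"A": 1, "B": 17, "C": 5}, {"A": 2, "B": 3, "C": 6}]
s = [{"D": 5, "E": 8, "F": 9}, {"D": 7, "E": 8, "F": 4}]


def as_set(rows):
    """Relations are sets, so convert rows to hashable form and drop duplicates."""
    return {tuple(sorted(row.items())) for row in rows}


# i.   {t | exists q in r (q[A] = t[A])}
proj_a = as_set({"A": q["A"]} for q in r)

# ii.  {t | t in r and t[B] = 17}
sel_b17 = as_set(t for t in r if t["B"] == 17)

# iii. {t | exists p in r, q in s (t agrees with p on A, B, C and with q on D, E, F)}
product = as_set({**p, **q} for p in r for q in s)

# iv.  {t | exists p in r, q in s (t[A] = p[A] and t[F] = q[F] and p[C] = q[D])}
join_af = as_set({"A": p["A"], "F": q["F"]} for p in r for q in s if p["C"] == q["D"])

print(proj_a, sel_b17, len(product), join_af)
```

Each existential quantifier becomes a `for` clause and each conjunct becomes either a field of the result tuple or an `if` condition.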

2 c Consider the relational database below, where primary keys are underlined.
employee (person name, street, city)
works (person name, company name, salary)
company (company name, city)
manages (person name, manager name)
Give an expression in tuple relational calculus for each of the following queries:
i. Find all employees who work directly for “Jones.”
ii. Find all cities of residence of all employees who work directly for “Jones.”


iii. Find the name of the manager of the manager of “Jones.”
iv. Find those employees who earn more than all employees living in the city “Mumbai.”
Ans 2c
i. {t | ∃ m ∈ manages (t[person name] = m[person name] ∧ m[manager name] = ’Jones’)}
ii. {t | ∃ m ∈ manages ∃ e ∈ employee (e[person name] = m[person name] ∧ m[manager name] = ’Jones’ ∧ t[city] = e[city])}
iii. {t | ∃ m1 ∈ manages ∃ m2 ∈ manages (m1[manager name] = m2[person name] ∧ m1[person name] = ’Jones’ ∧ t[manager name] = m2[manager name])}
iv. {t | ∃ w1 ∈ works (t[person name] = w1[person name] ∧ ¬∃ w2 ∈ works (w1[salary] ≤ w2[salary] ∧ ∃ e2 ∈ employee (w2[person name] = e2[person name] ∧ e2[city] = ’Mumbai’)))}

2 d What is normalization? What is its objective? Give a distinguishing characteristic of 1NF, 2NF, 3NF, 4NF and BCNF.
Ans 2d Normalization is a process for evaluating and correcting table structures to minimize data redundancies, thereby reducing the likelihood of data anomalies. The normalization process involves assigning attributes to tables based on the concept of determination. The objective of normalization is to ensure that each table conforms to the concept of well-formed relations, that is, tables that have the following characteristics:
• Each table represents a single subject.
• No data item will be unnecessarily stored in more than one table (in short, tables have minimum controlled redundancy). The reason for this requirement is to ensure that the data is updated in only one place.
• All nonprime attributes in a table are dependent on the primary key: the entire primary key and nothing but the primary key. The reason for this requirement is to ensure that the data is uniquely identifiable by a primary key value.
• Each table is void of insertion, update, or deletion anomalies, which ensures the integrity and consistency of the data.


i. Using the INVOICE table structure shown in table below, write the relational schema, draw its dependency diagram and identify all dependencies (including all partial and transitive dependencies). You can assume that the table does not contain repeating groups and that any invoice number may reference more than one product. (Hint: This table uses a composite primary key.)

Attribute Name  Sample Value     Sample Value        Sample Value     Sample Value     Sample Value
INV_NUM         211347           211347              211347           211348           211349
PROD_NUM        AA-E3422QW       QD-300932X          RU-995748G       AA-E3422QW       GH-778345P
SALE_DATE       15-Jan-2016      15-Jan-2016         15-Jan-2016      15-Jan-2016      16-Jan-2016
PROD_LABEL      Rotary sander    0.25-in. drill bit  Band saw         Rotary sander    Power drill
VEND_CODE       211              211                 309              211              157
VEND_NAME       NeverFail, Inc.  NeverFail, Inc.     BeGood, Inc.     NeverFail, Inc.  ToughGo, Inc.
QUANT_SOLD      1                8                   1                2                1
PROD_PRICE      ₹4995            ₹345                ₹3999            ₹4995            ₹8775

ii. Using the initial dependency diagram drawn in question i, remove all partial dependencies, draw the new dependency diagrams, and identify the normal forms for each table structure you created.
iii. Using the table structures you created in question ii, remove all transitive dependencies and draw the new dependency diagrams. Also identify the normal forms for each table structure you created.

Ans 2e Note: QUANT_SOLD = NUM_SOLD
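As a sanity check on the dependencies one would draw for the INVOICE table, the sample rows can be tested mechanically. This is a sketch under assumptions: the helper `holds` and the particular dependencies tested are one reading of the sample data, not a reproduction of the dependency diagrams from the answer. A dependency that survives five sample rows is only not contradicted by them; it is not proven in general.

```python
COLUMNS = ["INV_NUM", "PROD_NUM", "SALE_DATE", "PROD_LABEL",
           "VEND_CODE", "VEND_NAME", "QUANT_SOLD", "PROD_PRICE"]
ROWS = [
    (211347, "AA-E3422QW", "15-Jan-2016", "Rotary sander", 211, "NeverFail, Inc.", 1, 4995),
    (211347, "QD-300932X", "15-Jan-2016", "0.25-in. drill bit", 211, "NeverFail, Inc.", 8, 345),
    (211347, "RU-995748G", "15-Jan-2016", "Band saw", 309, "BeGood, Inc.", 1, 3999),
    (211348, "AA-E3422QW", "15-Jan-2016", "Rotary sander", 211, "NeverFail, Inc.", 2, 4995),
    (211349, "GH-778345P", "16-Jan-2016", "Power drill", 157, "ToughGo, Inc.", 1, 8775),
]
invoice = [dict(zip(COLUMNS, row)) for row in ROWS]


def holds(lhs, rhs, rows):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


# Candidate partial dependencies (determinant is only part of the composite key).
partial_inv = holds(["INV_NUM"], ["SALE_DATE"], invoice)
partial_prod = holds(["PROD_NUM"], ["PROD_LABEL", "VEND_CODE", "PROD_PRICE"], invoice)
# Candidate transitive dependency (determinant is a nonkey attribute).
transitive = holds(["VEND_CODE"], ["VEND_NAME"], invoice)
# QUANT_SOLD needs the whole composite key; PROD_NUM alone does not determine it.
full_key = holds(["INV_NUM", "PROD_NUM"], ["QUANT_SOLD"], invoice)
prod_to_quant = holds(["PROD_NUM"], ["QUANT_SOLD"], invoice)

print(partial_inv, partial_prod, transitive, full_key, prod_to_quant)
```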


2f Explain the phases of database design.
Ans 2f
• The initial phase of database design is to characterize fully the data needs of the prospective database users. The database designer needs to interact extensively with domain experts and users to carry out this task. The outcome of this phase is a specification of user requirements.
• Next, the designer chooses a data model and, by applying the concepts of the chosen data model, translates these requirements into a conceptual schema of the database. The schema developed at this conceptual-design phase provides a detailed overview of the enterprise. The entity-relationship model is typically used to represent the conceptual design. Stated in terms of the entity-relationship model, the conceptual schema specifies the entities that are represented in the database, the attributes of the entities, the relationships among the entities, and constraints on the entities and relationships. Typically, the conceptual-design phase results in the creation of an entity-relationship diagram that provides a graphic representation of the schema.
• A fully developed conceptual schema also indicates the functional requirements of the enterprise. In a specification of functional requirements, users describe the kinds of operations (or transactions) that will be performed on the data. Example operations include modifying or updating data, searching for and retrieving specific data, and deleting data. At this stage of conceptual design, the designer can review the schema to ensure it meets functional requirements.
• The process of moving from an abstract data model to the implementation of the database proceeds in two final design phases.
◦ In the logical-design phase, the designer maps the high-level conceptual schema onto the implementation data model of the database system that will be used. The implementation data model is typically the relational data model, and this step typically consists of mapping the conceptual schema defined using the entity-relationship model into a relation schema.
◦ Finally, the designer uses the resulting system-specific database schema in the subsequent physical-design phase, in which the physical features of the database are specified.


3a What are constraints? What are the different types of constraints? Explain.
• Constraints enforce rules at the table level.
• Constraints prevent the deletion of a table if there are dependencies.
• The following are the five types of constraints:
• NOT NULL
• UNIQUE
• PRIMARY KEY
• FOREIGN KEY
• CHECK
The NOT NULL Constraint
The NOT NULL constraint ensures that the column contains no null values. Columns without the NOT NULL constraint can contain null values by default. The NOT NULL constraint can be specified only at the column level, not at the table level.
The UNIQUE Constraint
A UNIQUE key integrity constraint requires that every value in a column or set of columns (key) be unique; that is, no two rows of a table can have duplicate values in a specified column or set of columns. The column (or set of columns) included in the definition of the UNIQUE key constraint is called the unique key. If the UNIQUE constraint comprises more than one column, that group of columns is called a composite unique key. UNIQUE constraints allow the input of nulls unless you also define NOT NULL constraints for the same columns. In fact, any number of rows can include nulls for columns without NOT NULL constraints because nulls are not considered equal to anything. A null in a column (or in all columns of a composite UNIQUE key) always satisfies a UNIQUE constraint. UNIQUE constraints can be defined at the column or table level. A composite unique key is created by using the table-level definition.
The PRIMARY KEY Constraint
A PRIMARY KEY constraint creates a primary key for the table. Only one primary key can be created for each table. The primary key is a column or set of columns that uniquely identifies each row in a table. This constraint enforces uniqueness of the column or column combination and ensures that no column that is part of the primary key can contain a null value. PRIMARY KEY constraints can be defined at the column level or table level. A composite PRIMARY KEY is created by using the table-level definition. A table can have only one PRIMARY KEY constraint but can have several UNIQUE constraints.
The FOREIGN KEY Constraint
The FOREIGN KEY, or referential integrity constraint, designates a column or combination of columns as a foreign key and establishes a relationship with a primary key or a unique key in the same table or a different table. A foreign key value must match an existing value in the parent table or be NULL.


Foreign keys are based on data values and are purely logical, not physical, pointers. FOREIGN KEY constraints can be defined at the column or table constraint level. A composite foreign key must be created by using the table-level definition. The foreign key is defined in the child table, and the table containing the referenced column is the parent table. The foreign key is defined using a combination of the following keywords: • FOREIGN KEY is used to define the column in the child table at the table constraint level. • REFERENCES identifies the table and column in the parent table. • ON DELETE CASCADE indicates that when the row in the parent table is deleted, the dependent rows in the child table will also be deleted. • ON DELETE SET NULL converts foreign key values to null when the parent value is removed. The default behavior is called the restrict rule, which disallows the update or deletion of referenced data. Without the ON DELETE CASCADE or the ON DELETE SET NULL options, the row in the parent table cannot be deleted if it is referenced in the child table. The CHECK Constraint The CHECK constraint defines a condition that each row must satisfy. The condition can use the same constructs as query conditions, with the following exceptions: • References to the CURRVAL, NEXTVAL, LEVEL, and ROWNUM pseudocolumns • Calls to SYSDATE, UID, USER, and USERENV functions • Queries that refer to other values in other rows A single column can have multiple CHECK constraints which refer to the column in its definition. There is no limit to the number of CHECK constraints which you can define on a column. CHECK constraints can be defined at the column level or table level.
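The five constraint types can be tried out with a small script. The sketch below uses Python's built-in `sqlite3` module rather than the Oracle server the answer describes, so details differ (for instance, SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`). The tables `dept` and `emp` and the helper `violates` are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement
conn.executescript("""
CREATE TABLE dept (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE emp (
    emp_id  INTEGER PRIMARY KEY,
    email   TEXT UNIQUE,
    salary  NUMERIC CHECK (salary > 0),
    dept_id INTEGER REFERENCES dept (dept_id) ON DELETE CASCADE
);
INSERT INTO dept VALUES (10, 'Sales');
INSERT INTO emp VALUES (1, 'a@example.com', 5000, 10);
""")


def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


rejected = {
    "NOT NULL": violates("INSERT INTO dept VALUES (20, NULL)"),
    "UNIQUE": violates("INSERT INTO emp VALUES (2, 'a@example.com', 4000, 10)"),
    "PRIMARY KEY": violates("INSERT INTO emp VALUES (1, 'b@example.com', 4000, 10)"),
    "FOREIGN KEY": violates("INSERT INTO emp VALUES (3, 'c@example.com', 4000, 99)"),
    "CHECK": violates("INSERT INTO emp VALUES (4, 'd@example.com', -1, 10)"),
}

# ON DELETE CASCADE: deleting the parent row removes the dependent child row.
conn.execute("DELETE FROM dept WHERE dept_id = 10")
remaining = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
print(rejected, remaining)
```

Each rejected statement corresponds to one constraint type, and the final delete shows the cascade rule removing the only employee row.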

3b What is a view? What are its advantages? A view is a logical table based on a table or another view. A view contains no data of its own but is like a window through which data from tables can be viewed or changed. The tables on which a view is based are called base tables. The view is stored as a SELECT statement in the data dictionary. Advantages of Views • Views restrict access to the data because the view can display selective columns from the table. • Views can be used to make simple queries to retrieve the results of complicated queries. For example, views can be used to query information from multiple tables without the user knowing how to write a join statement. • Views provide data independence for ad hoc users and application programs. One view can be used to retrieve data from several tables.


• Views provide groups of users access to data according to their particular criteria.
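A short sketch of the first two advantages, again using Python's `sqlite3` module as a stand-in for the Oracle server. The table names, sample rows, and the view name `emp_directory` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT,
                        salary NUMERIC, dept_id INTEGER);
INSERT INTO departments VALUES (10, 'Sales'), (20, 'IT');
INSERT INTO employees VALUES (1, 'Asha', 5000, 10), (2, 'Ravi', 7000, 20);

-- The view hides the salary column and the join from its users.
CREATE VIEW emp_directory AS
    SELECT e.name, d.dept_name
    FROM employees e JOIN departments d ON e.dept_id = d.dept_id;
""")

cursor = conn.execute("SELECT * FROM emp_directory ORDER BY name")
columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(columns, rows)
```

A user querying `emp_directory` writes a single-table SELECT, never sees `salary`, and does not need to know how the two base tables are joined.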

3c State the rules for performing DML operations on a view.
Ans Performing DML Operations on a View
You can perform DML operations on data through a view if those operations follow certain rules.
You can remove a row from a view unless it contains any of the following:
• Group functions
• A GROUP BY clause
• The DISTINCT keyword
• The pseudocolumn ROWNUM

You can add data through a view unless it contains any of the items listed below or there are NOT NULL columns without default values in the base table that are not selected by the view. All required values must be present in the view, because values are added directly into the underlying table through the view. You cannot add data through a view if the view includes:
• Group functions
• A GROUP BY clause
• The DISTINCT keyword
• The pseudocolumn ROWNUM
• Columns defined by expressions
• NOT NULL columns in the base tables that are not selected by the view

You cannot modify data in a view if it contains:
• Group functions
• A GROUP BY clause
• The DISTINCT keyword
• The pseudocolumn ROWNUM
• Columns defined by expressions

You can ensure that no DML operations occur on your view by creating it with the WITH READ ONLY option.

3d Explain GROUP BY and ORDER BY clauses with examples.
The GROUP BY Clause
You can use the GROUP BY clause to divide the rows in a table into groups. You can then use the group functions to return summary information for each group.
SELECT column, group_function
FROM table
[WHERE condition]
[GROUP BY group_by_expression]
[HAVING group_condition]
[ORDER BY column];
In the syntax, group_by_expression specifies the columns whose values determine the basis for grouping rows.
• If you include a group function in a SELECT clause, you cannot select individual results as well, unless the individual column appears in the GROUP BY clause. You receive an error message if you fail to include the column list in the GROUP BY clause.
• Using a WHERE clause, you can exclude rows before dividing them into groups.
• All columns in the SELECT list that are not group functions must be included in the GROUP BY clause.
• You cannot use a column alias in the GROUP BY clause.
• Rows may be returned in ascending order of the columns in the GROUP BY list, but this should not be relied on. Use the ORDER BY clause to control the order.
Restricting Group Results
In the same way that you use the WHERE clause to restrict the rows that you select, you use the HAVING clause to restrict groups.
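As a concrete instance of the GROUP BY syntax with WHERE and HAVING, here is a minimal sketch run through Python's `sqlite3` module (not the Oracle server). The `employees` table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (name TEXT, dept_id INTEGER, salary INTEGER);
INSERT INTO employees VALUES
    ('Asha', 10, 5000), ('Ravi', 10, 7000),
    ('Meera', 20, 4000), ('John', 30, 9000), ('Sara', 30, 1000);
""")

query = """
SELECT dept_id, AVG(salary) AS avg_salary
FROM employees
WHERE salary > 2000          -- rows are filtered before grouping
GROUP BY dept_id             -- one result row per department
HAVING AVG(salary) > 4500    -- groups are filtered after aggregation
ORDER BY dept_id
"""
result = conn.execute(query).fetchall()
print(result)
```

The WHERE clause drops Sara's row before grouping, so department 30 averages only John's salary. The HAVING clause then removes department 20, whose average of 4000 is below the threshold.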

The ORDER BY Clause
The order of rows returned in a query result is undefined. The ORDER BY clause can be used to sort the rows. If you use the ORDER BY clause, it must be the last clause of the SQL statement. You can specify an expression, an alias, or a column position as the sort condition.
Syntax
SELECT expr
FROM table
[WHERE condition(s)]
[ORDER BY {column, expr} [ASC|DESC]];
In the syntax:
ORDER BY specifies the order in which the retrieved rows are displayed
ASC orders the rows in ascending order (this is the default order)
DESC orders the rows in descending order
If the ORDER BY clause is not used, the sort order is undefined, and the Oracle server may not fetch rows in the same order for the same query twice. Use the ORDER BY clause to display the rows in a specific order.

The default sort order is ascending:
• Numeric values are displayed with the lowest values first, for example, 1–999.
• Date values are displayed with the earliest value first, for example, 01-JAN-92 before 01-JAN-95.
• Character values are displayed in alphabetical order, for example, A first and Z last.
• Null values are displayed last for ascending sequences and first for descending sequences.
Reversing the Default Order
To reverse the order in which rows are displayed, specify the DESC keyword after the column name in the ORDER BY clause. For example, sorting by hire date in descending order lists the most recently hired employee first.

You can sort query results by more than one column. The sort limit is the number of columns in the given table. In the ORDER BY clause, specify the columns, and separate the column names using commas. If you want to reverse the order of a column, specify DESC after its name. You can also order by columns that are not included in the SELECT clause.

Examples of each clause.
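One possible set of ORDER BY examples, sketched with Python's `sqlite3` module. The table and its rows are assumptions made for illustration. The sample data deliberately contains no nulls, because SQLite places nulls first in ascending order, unlike the Oracle behaviour described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (name TEXT, dept_id INTEGER, salary INTEGER, hire_date TEXT);
INSERT INTO employees VALUES
    ('Asha', 10, 5000, '2015-03-01'),
    ('Ravi', 10, 7000, '2017-06-15'),
    ('Meera', 20, 4000, '2016-01-10');
""")


def names(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]


# Default (ascending) order on a numeric column.
by_salary = names("SELECT name FROM employees ORDER BY salary")
# DESC reverses the order: most recently hired first.
newest_first = names("SELECT name FROM employees ORDER BY hire_date DESC")
# Several sort columns, each with its own direction.
by_dept_then_pay = names("SELECT name FROM employees ORDER BY dept_id, salary DESC")
# Sorting by a column alias.
by_alias = conn.execute(
    "SELECT name, salary * 12 AS annual FROM employees ORDER BY annual DESC"
).fetchall()
print(by_salary, newest_first, by_dept_then_pay, by_alias)
```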

3e What are NULL values? Explain. Ans Null Values If a row lacks the data value for a particular column, that value is said to be null, or to contain a null. A null is a value that is unavailable, unassigned, unknown, or inapplicable. A null is not the same as zero or a space. Zero is a number, and a space is a character. Columns of any data type can contain nulls. However, some constraints, NOT NULL and PRIMARY KEY, prevent nulls from being used in the column.

If any column value in an arithmetic expression is null, the result is null. For example, if you attempt to perform division with zero, you get an error. However, if you divide a number by null, the result is a null or unknown.
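The propagation of nulls through expressions can be observed directly. The sketch below uses Python's `sqlite3` module, where a SQL null is returned as Python's `None`. It avoids division by zero on purpose, since engines differ on that case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = conn.execute("""
    SELECT 10 + NULL,        -- arithmetic with a null yields null
           10 / NULL,        -- dividing by null yields null
           NULL = NULL,      -- comparing with = yields null (unknown), not true
           NULL IS NULL,     -- IS NULL is the correct test for a null
           COALESCE(NULL, 0) -- substitute a default for a null
""").fetchone()
print(row)
```

A null is neither zero nor a space, and the only reliable way to test for it is `IS NULL`, not `=`.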

3f What are joins? What are different types of joins? Explain. Ans An SQL join clause combines columns from one or more tables in a relational database. It creates a set that can be saved as a table or used as it is. A JOIN is a means for combining columns from one (self-join) or more tables by using values common to each. Oracle proprietary joins are: • Equijoin • Non-equijoin • Outer join • Self join


ANSI-standard SQL specifies five types of JOIN: INNER, LEFT OUTER, RIGHT OUTER, FULL OUTER and CROSS. As a special case, a table (base table, view, or joined table) can JOIN to itself in a self-join.
Defining Joins
When data from more than one table in the database is required, a join condition is used. Rows in one table can be joined to rows in another table according to common values existing in corresponding columns, that is, usually primary and foreign key columns. To display data from two or more related tables, write a simple join condition in the WHERE clause.
SELECT table1.column, table2.column
FROM table1, table2
WHERE table1.column1 = table2.column2;
In the syntax:
table1.column denotes the table and column from which data is retrieved
table1.column1 = table2.column2 is the condition that joins (or relates) the tables together

• When writing a SELECT statement that joins tables, precede the column name with the table name for clarity and to enhance database access. • If the same column name appears in more than one table, the column name must be prefixed with the table name. • To join n tables together, you need a minimum of n-1 join conditions. For example, to join four tables, a minimum of three joins is required. This rule may not apply if your table has a concatenated primary key, in which case more than one column is required to uniquely identify each row.

Explanation of each type of join expected in brief.
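A brief illustration of four of the join types, run through Python's `sqlite3` module. The two tables and their rows are hypothetical. RIGHT and FULL OUTER joins are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT,
                        dept_id INTEGER, manager_id INTEGER);
INSERT INTO departments VALUES (10, 'Sales'), (20, 'IT');
INSERT INTO employees VALUES
    (1, 'Asha', 10, NULL), (2, 'Ravi', 10, 1), (3, 'Meera', NULL, 1);
""")


def q(sql):
    return conn.execute(sql).fetchall()


# Inner join (equijoin): only employees with a matching department.
inner = q("""SELECT e.name, d.dept_name FROM employees e
             INNER JOIN departments d ON e.dept_id = d.dept_id
             ORDER BY e.emp_id""")
# Left outer join: unmatched employees are kept, padded with nulls.
left_outer = q("""SELECT e.name, d.dept_name FROM employees e
                  LEFT OUTER JOIN departments d ON e.dept_id = d.dept_id
                  ORDER BY e.emp_id""")
# Cross join: every employee paired with every department (3 x 2 rows).
cross_count = q("SELECT COUNT(*) FROM employees CROSS JOIN departments")[0][0]
# Self-join: the employees table joined to itself to find each worker's manager.
self_join = q("""SELECT w.name, m.name FROM employees w
                 JOIN employees m ON w.manager_id = m.emp_id
                 ORDER BY w.emp_id""")
print(inner, left_outer, cross_count, self_join)
```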

4a The lost update anomaly is said to occur if a transaction Tj reads a data item, then another transaction Tk writes the data item (possibly based on a previous read), after which Tj writes the data item. The update performed by Tk has been lost, since the update done by Tj ignored the value written by Tk.
i. Give an example of a schedule showing the lost update anomaly.
ii. Give an example schedule to show that the lost update anomaly is possible with the read committed isolation level.
iii. Explain why the lost update anomaly is not possible with the repeatable read isolation level.
Ans 4a
i. A schedule showing the Lost Update Anomaly:


In the above schedule, the value written by the transaction T2 is lost because of the write of the transaction T1.
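The interleaving that the definition in the question describes (T1 reads, T2 reads and writes, then T1 writes) can be replayed step by step. This is an illustrative sketch only: the data item `A`, its starting value, and the amounts added and subtracted are assumptions, and it is not a reproduction of the schedule figure.

```python
db = {"A": 100}            # hypothetical shared data item
schedule = []              # the interleaved operations, in execution order

t1_local = db["A"]         # T1: read(A)
schedule.append("T1: read(A)")

t2_local = db["A"]         # T2: read(A)
schedule.append("T2: read(A)")

db["A"] = t2_local + 50    # T2: write(A), adds 50
schedule.append("T2: write(A)")
after_t2 = db["A"]

db["A"] = t1_local - 30    # T1: write(A), based on its stale read of 100
schedule.append("T1: write(A)")
final = db["A"]

serial_result = 100 + 50 - 30   # what either serial order would produce
print(schedule, after_t2, final, serial_result)
```

T1's write is computed from the value it read before T2's update, so the 50 added by T2 disappears from the final value.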

ii. Lost Update Anomaly in Read Committed Isolation Level

The locking in the above schedule ensures the Read Committed isolation level. The value written by transaction T2 is lost due to T1’s write. iii. Lost Update Anomaly is not possible in Repeatable Read isolation level. In repeatable read isolation level, a transaction T1 reading a data item X, holds a shared lock on X till the end. This makes it impossible for a newer transaction T2 to write the value of X (which requires X-lock) until T1 finishes. This forces the serialization order T1, T2 and thus the value written by T2 is not lost.

4b State and explain the ACID properties of transactions. • Atomicity. Either all operations of the transaction are reflected properly in the database, or none are. • Consistency. Execution of a transaction in isolation (that is, with no other transaction executing concurrently) preserves the consistency of the database. • Isolation. Even though multiple transactions may execute concurrently, the system guarantees that, for every pair of transactions Ti and Tj , it appears to Ti that either Tj finished execution before Ti started or Tj started execution after Ti finished. Thus, each transaction is unaware of other transactions executing concurrently in the system. • Durability. After a transaction completes successfully, the changes it has made to the database persist, even if there are system failures.
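Atomicity in particular is easy to observe in practice. The sketch below uses Python's `sqlite3` module; the `account` table, the `transfer` helper, and the amounts are assumptions made for the example. The credit is issued before the debit on purpose, so that a failed debit leaves a partial update for the rollback to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id TEXT PRIMARY KEY,
                      balance INTEGER CHECK (balance >= 0));
INSERT INTO account VALUES ('A', 100), ('B', 50);
""")


def transfer(src, dst, amount):
    """Move amount from src to dst as one transaction; return True if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer("A", "B", 30)        # both updates are applied
failed = transfer("A", "B", 500)   # debit violates CHECK, so the credit is undone too
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

After the failed transfer, account B does not keep the 500 it was briefly credited with: either both updates take effect or neither does.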

4c i. Consider a database for a bank where the database system uses snapshot isolation. Describe a particular scenario in which a nonserializable execution occurs that would present a problem for the bank.


ii. Consider a database for an airline where the database system uses snapshot isolation. Describe a particular scenario in which a nonserializable execution occurs, but the airline may be willing to accept it in order to gain better overall performance.
Ans i. Suppose that the bank enforces the integrity constraint that the sum of the balances in the checking and the savings account of a customer must not be negative. Suppose the checking and savings balances for a customer are $100 and $200 respectively. Suppose that transaction T1 withdraws $200 from the checking account after verifying the integrity constraint by reading both the balances. Suppose that concurrent transaction T2 withdraws $200 from the savings account after verifying the integrity constraint by reading both the balances. Since each of the transactions checks the integrity constraint on its own snapshot, if they run concurrently each will believe that the sum of the balances after the withdrawal is $100 and therefore that its withdrawal does not violate the integrity constraint. Since the two transactions update different data items, they do not have any update conflict, and under snapshot isolation both of them can commit. This is a non-serializable execution, and it causes a serious problem: more money is withdrawn than the constraint allows, leaving the sum of the balances at −$100.
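The bank scenario can be simulated in a few lines. This is a toy model of snapshot isolation, not a real database engine: snapshots are plain dictionary copies, and the commit check only looks for overlapping write sets. The function `withdraw` and the variable names are hypothetical.

```python
committed = {"checking": 100, "savings": 200}


def withdraw(snapshot, account, amount):
    """Check the constraint on a private snapshot; return the write set or None."""
    if snapshot["checking"] + snapshot["savings"] - amount < 0:
        return None  # the withdrawal would violate the constraint
    return {account: snapshot[account] - amount}


# Both transactions start concurrently and see the same committed state.
snapshot_t1 = dict(committed)
snapshot_t2 = dict(committed)

writes_t1 = withdraw(snapshot_t1, "checking", 200)
writes_t2 = withdraw(snapshot_t2, "savings", 200)

# Snapshot isolation aborts a transaction only if the write sets overlap.
conflict = bool(set(writes_t1) & set(writes_t2))
if not conflict:
    committed.update(writes_t1)
    committed.update(writes_t2)

total = committed["checking"] + committed["savings"]
print(writes_t1, writes_t2, conflict, total)
```

Each transaction passes its own check because it sees a total of $300 minus $200. With no overlapping writes, both commit, and the committed total ends up negative.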

ii. Consider a web-based airline reservation system. There could be many concurrent requests to see the list of available flights and available seats in each flight and to book tickets. Suppose, there are two users A and B concurrently accessing this web application, and only one seat is left on a flight. Suppose that both user A and user B execute transactions to book a seat on the flight, and suppose that each transaction checks the total number of seats booked on the flight, and inserts a new booking record if there are enough seats left. Let T3 and T4 be their respective booking transactions, which run concurrently. Now T3 and T4 will see from their snapshots that one ticket is available and insert new booking records. Since the two transactions do not update any common data item (tuple), snapshot isolation allows both transactions to commit. This results in an extra booking, beyond the number of seats available on the flight. However, this situation is usually not very serious since cancellations often resolve the conflict; even if the conflict is present at the time the flight is to leave, the airline can arrange a different flight for one of the passengers on the flight, giving incentives to accept the change. Using snapshot isolation improves the overall performance in this case since the booking transactions read the data from their snapshots only and do not block other concurrent transactions.

4d Show that the two-phase locking protocol ensures conflict serializability, and that transactions can be serialized according to their lock points.
Ans Suppose two-phase locking does not ensure serializability. Then there exists a set of transactions T0, T1, ..., Tn−1 which obey 2PL and which produce a non-serializable schedule. A non-serializable schedule implies a cycle in the precedence graph, and we shall show that 2PL cannot produce such cycles. Without loss of generality, assume the following cycle exists in the precedence graph:


T0 → T1 → T2 → ... → Tn−1 → T0.
Let ai be the time at which Ti obtains its last lock (i.e. Ti’s lock point). Then for all transactions such that Ti → Tj, ai < aj. Then for the cycle we have
a0 < a1 < a2 < ... < an−1 < a0.
Since a0 < a0 is a contradiction, no such cycle can exist. Hence 2PL cannot produce non-serializable schedules. Because of the property that for all transactions such that Ti → Tj, ai < aj, the lock point ordering of the transactions is also a topological sort ordering of the precedence graph. Thus, transactions can be serialized according to their lock points.

4e Consider the following two transactions:
T34: read(A);
     read(B);
     if A = 0 then B := B + 1;
     write(B).
T35: read(B);
     read(A);
     if B = 0 then A := A + 1;
     write(A).
Add lock and unlock instructions to transactions T34 and T35, so that they observe the two-phase locking protocol. Can the execution of these transactions result in a deadlock?
Ans i. Lock and unlock instructions:
T34: lock-S(A)
     read(A)
     lock-X(B)
     read(B)
     if A = 0 then B := B + 1
     write(B)
     unlock(A)
     unlock(B)

T35: lock-S(B)
     read(B)
     lock-X(A)
     read(A)
     if B = 0 then A := A + 1
     write(A)
     unlock(B)
     unlock(A)


ii. Execution of these transactions can result in deadlock. For example, consider the following partial schedule:

The transactions are now deadlocked.
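One interleaving that leads to this deadlock can be traced with a toy lock manager and a wait-for graph. The sketch assumes a particular order of lock requests (each transaction takes its shared lock before either asks for its exclusive lock). The functions `request` and `has_cycle` are hypothetical helpers, not part of any real DBMS.

```python
holders = {}     # data item -> {transaction: lock mode}
waits_for = {}   # transaction -> set of transactions it is waiting on


def request(txn, item, mode):
    """Grant the lock if compatible with other holders; otherwise record a wait."""
    others = {t: m for t, m in holders.get(item, {}).items() if t != txn}
    # Only shared locks are compatible with each other.
    compatible = all(mode == "S" and held == "S" for held in others.values())
    if compatible:
        holders.setdefault(item, {})[txn] = mode
        return True
    waits_for.setdefault(txn, set()).update(others)
    return False


def has_cycle(graph):
    """Depth-first search for a cycle in the wait-for graph."""
    def visit(node, path):
        if node in path:
            return True
        return any(visit(nxt, path | {node}) for nxt in graph.get(node, ()))
    return any(visit(node, frozenset()) for node in graph)


granted = [
    request("T34", "A", "S"),   # T34: lock-S(A) granted
    request("T35", "B", "S"),   # T35: lock-S(B) granted
    request("T34", "B", "X"),   # T34: lock-X(B) must wait for T35
    request("T35", "A", "X"),   # T35: lock-X(A) must wait for T34
]
deadlocked = has_cycle(waits_for)
print(granted, waits_for, deadlocked)
```

The cycle T34 → T35 → T34 in the wait-for graph is exactly what a deadlock detection scheme looks for before choosing a transaction to roll back.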

4f Explain the different ways to handle deadlocks. Ans There are two principal methods for dealing with the deadlock problem. We can use a deadlock prevention protocol to ensure that the system will never enter a deadlock state. Alternatively, we can allow the system to enter a deadlock state, and then try to recover by using a deadlock detection and deadlock recovery scheme. Both methods may result in transaction rollback. Prevention is commonly used if the probability that the system would enter a deadlock state is relatively high; otherwise, detection and recovery are more efficient. A detection and recovery scheme requires overhead that includes not only the run-time cost of maintaining the necessary information and of executing the detection algorithm, but also the potential losses inherent in recovery from a deadlock.

Explanation expected.

5a What are triggers? What are different types of triggers? How are they created? Give the syntax and examples of the same.
Ans A trigger is a PL/SQL block or procedure that is associated with a table, view, schema, or database and executes implicitly whenever a particular event takes place.

Types of Triggers
Application triggers execute implicitly whenever a particular data manipulation language (DML) event occurs within an application. An example of an application that uses triggers extensively is one developed with Oracle Forms Developer.

Database triggers execute implicitly when a data event occurs, such as DML on a table (an INSERT, UPDATE, or DELETE triggering statement), an INSTEAD OF trigger on a view, or a data definition language (DDL) statement such as CREATE or ALTER, no matter which user is connected or which application is used. Database triggers also execute implicitly when certain user actions or database system actions occur, for example, when a user logs on or the DBA shuts down the database.


Note: Database triggers can be defined on tables and on views. If a DML operation is issued on a view, the INSTEAD OF trigger defines what actions take place. If these actions include DML operations on tables, then any triggers on the base tables are fired. Database triggers can be system triggers on a database or a schema. With a database, triggers fire for each event for all users; with a schema, triggers fire for each event for that specific user.

Syntax and examples expected.
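A minimal sketch of the expected syntax and examples, written for Oracle PL/SQL. It assumes the EMPLOYEES table of Oracle's HR sample schema; the salary_audit table and both trigger names are hypothetical and exist only for illustration. The first trigger is a row-level trigger and the second is a statement-level trigger.

```sql
-- General form of a DML trigger:
--   CREATE [OR REPLACE] TRIGGER trigger_name
--     {BEFORE | AFTER | INSTEAD OF}
--     {INSERT | UPDATE [OF column, ...] | DELETE} [OR ...]
--     ON table_or_view
--     [FOR EACH ROW]
--     [WHEN (condition)]
--   trigger_body

-- Hypothetical audit table used by the row-level trigger below.
CREATE TABLE salary_audit (
  employee_id NUMBER(6),
  old_salary  NUMBER(8,2),
  new_salary  NUMBER(8,2),
  changed_on  DATE
);

-- Row-level trigger: fires once for every row whose salary actually changes.
CREATE OR REPLACE TRIGGER audit_salary_change
  AFTER UPDATE OF salary ON employees
  FOR EACH ROW
  WHEN (NEW.salary <> OLD.salary)
BEGIN
  INSERT INTO salary_audit (employee_id, old_salary, new_salary, changed_on)
  VALUES (:OLD.employee_id, :OLD.salary, :NEW.salary, SYSDATE);
END audit_salary_change;
/

-- Statement-level trigger: fires once per triggering statement,
-- regardless of how many rows the statement affects.
CREATE OR REPLACE TRIGGER restrict_weekend_changes
  BEFORE INSERT OR UPDATE OR DELETE ON employees
BEGIN
  IF TO_CHAR(SYSDATE, 'DY', 'NLS_DATE_LANGUAGE=ENGLISH') IN ('SAT', 'SUN') THEN
    RAISE_APPLICATION_ERROR(-20500,
      'Changes to EMPLOYEES are not allowed on weekends.');
  END IF;
END restrict_weekend_changes;
/
```

Inside the trigger body the old and new column values are referenced as :OLD and :NEW, while the WHEN clause uses the same qualifiers without the colon.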

5b What are packages? What are the components of packages? How are packages developed? Explain with syntax and example. Ans Packages bundle related PL/SQL types, items, and subprograms into one container. For example, a Human Resources package can contain hiring and firing procedures, commission and bonus functions, and tax exemption variables. A package usually has a specification and a body, stored separately in the database. The specification is the interface to your applications. It declares the types, variables, constants, exceptions, cursors, and subprograms available for use. The package specification may also include PRAGMAs, which are directives to the compiler. The body fully defines cursors and subprograms, and so implements the specification. The package itself cannot be called, parameterized, or nested. Still, the format of a package is similar to that of a subprogram. Once written and compiled, the contents can be shared by many applications. When you call a packaged PL/SQL construct for the first time, the whole package is loaded into memory. Thus, later calls to constructs in the same package require no disk input/output (I/O).


You create a package in two parts: first the package specification, and then the package body. Public package constructs are those that are declared in the package specification and defined in the package body. Private package constructs are those that are defined solely within the package body.

How to Develop a Package
1. Write the syntax: Enter the code in a text editor and save it as a SQL script file.
2. Compile the code: Run the SQL script file to generate and compile the source code. The source code is compiled into P code.

There are three basic steps to developing a package, similar to those steps that are used to develop a stand-alone procedure.
1. Write the text of the CREATE PACKAGE statement within a SQL script file to create the package specification and run the script file. The source code is compiled into P code and is stored within the data dictionary.
2. Write the text of the CREATE PACKAGE BODY statement within a SQL script file to create the package body and run the script file. The source code is compiled into P code and is also stored within the data dictionary.
3. Invoke any public construct within the package from an Oracle server environment.

Syntax:
CREATE [OR REPLACE] PACKAGE package_name
IS|AS
    public type and item declarations
    subprogram specifications
END package_name;

To create packages, you declare all public constructs within the package specification.
• Specify the REPLACE option when the package specification already exists.
• Initialize a variable with a constant value or formula within the declaration, if required; otherwise, the variable is initialized implicitly to NULL.

Example to create package must be given.
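One possible example, sketched in Oracle PL/SQL. The package name comm_package, its variable g_comm, and its subprograms are hypothetical, as is the 0 to 0.5 validity range. The specification exposes one public variable and one public procedure; the body adds a private function that is visible only inside the package.

```sql
-- Package specification: the public interface.
CREATE OR REPLACE PACKAGE comm_package IS
  g_comm NUMBER := 0.10;                     -- public variable with an initial value
  PROCEDURE reset_comm (p_comm IN NUMBER);   -- public procedure
END comm_package;
/

-- Package body: implements the specification.
CREATE OR REPLACE PACKAGE BODY comm_package IS

  -- Private function: declared only in the body, so callers cannot see it.
  FUNCTION validate_comm (p_comm IN NUMBER) RETURN BOOLEAN IS
  BEGIN
    RETURN (p_comm BETWEEN 0 AND 0.5);       -- assumed validity range
  END validate_comm;

  -- Public procedure: its header matches the one in the specification.
  PROCEDURE reset_comm (p_comm IN NUMBER) IS
  BEGIN
    IF validate_comm(p_comm) THEN
      g_comm := p_comm;
    ELSE
      RAISE_APPLICATION_ERROR(-20210, 'Invalid commission');
    END IF;
  END reset_comm;

END comm_package;
/

-- Invoking a public construct: prefix it with the package name.
BEGIN
  comm_package.reset_comm(0.15);
  DBMS_OUTPUT.PUT_LINE('Commission is now ' || comm_package.g_comm);
END;
/
```

Because validate_comm is not declared in the specification, a call such as comm_package.validate_comm(0.2) from outside the package would fail to compile.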

5c What are functions? What are procedures? How do they differ from each other? What are the benefits of stored procedures and functions?
Ans A function is a named PL/SQL block that can accept parameters and be invoked. Generally speaking, you use a function to compute a value. Functions and procedures are structured alike. A function must return a value to the calling environment, whereas a procedure returns zero or more values to its calling environment. Like a procedure, a function has a header, a declarative part, an executable part, and an optional exception-handling part. A function must have a RETURN clause in the header and at least one RETURN statement in the executable section.

Functions can be stored in the database as schema objects for repeated execution. A function stored in the database is referred to as a stored function. Functions can also be created on the client side.

Functions promote reusability and maintainability. When validated, they can be used in any number of applications. If the processing requirements change, only the function needs to be updated.

A function is called as part of a SQL expression or as part of a PL/SQL expression. In a SQL expression, a function must obey specific rules to control side effects. In a PL/SQL expression, the function identifier acts like a variable whose value depends on the parameters passed to it.

CREATE [OR REPLACE] FUNCTION function_name
    [(parameter1 [mode1] datatype1,
      parameter2 [mode2] datatype2,
      . . .)]
RETURN datatype
IS|AS
PL/SQL Block;
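As an illustration of this syntax, the sketch below defines a small stored function and calls it from both SQL and PL/SQL. The function name tax and the 8% rate are hypothetical; the query assumes the EMPLOYEES table of Oracle's HR sample schema.

```sql
-- A stored function: RETURN clause in the header,
-- RETURN statement in the executable section.
CREATE OR REPLACE FUNCTION tax (p_value IN NUMBER)
  RETURN NUMBER
IS
BEGIN
  RETURN (p_value * 0.08);   -- assumed flat rate, for illustration only
END tax;
/

-- Called as part of a SQL expression.
SELECT employee_id, last_name, salary, tax(salary) AS tax_due
  FROM employees
 WHERE department_id = 100;

-- Called as part of a PL/SQL expression.
DECLARE
  v_tax NUMBER;
BEGIN
  v_tax := tax(1000);
  DBMS_OUTPUT.PUT_LINE('Tax due: ' || v_tax);
END;
/
```

The function can be used in the SELECT list because it takes only IN parameters, returns a SQL data type, and performs no DML.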

A procedure is a named PL/SQL block that can accept parameters (sometimes referred to as arguments), and be invoked. Generally speaking, you use a procedure to perform an action. A procedure has a header, a declaration section, an executable section, and an optional exception-handling section. A procedure can be compiled and stored in the database as a schema object.


Procedures promote reusability and maintainability. When validated, they can be used in any number of applications. If the requirements change, only the procedure needs to be updated.

CREATE [OR REPLACE] PROCEDURE procedure_name
    [(parameter1 [mode1] datatype1,
      parameter2 [mode2] datatype2,
      . . .)]
IS|AS
PL/SQL Block;

Syntax Definitions

Parameter         Description
procedure_name    Name of the procedure
parameter         Name of a PL/SQL variable whose value is passed to or populated by the calling environment, or both, depending on the mode being used
mode              Type of argument: IN (default), OUT, or IN OUT
datatype          Data type of the argument; can be any SQL or PL/SQL data type. Can be of %TYPE, %ROWTYPE, or any scalar or composite data type.
PL/SQL block      Procedural body that defines the action performed by the procedure

You create new procedures with the CREATE PROCEDURE statement, which may declare a list of parameters and must define the actions to be performed by the standard PL/SQL block. The CREATE clause enables you to create stand-alone procedures, which are stored in an Oracle database.
• PL/SQL blocks start with either BEGIN or the declaration of local variables and end with either END or END procedure_name. You cannot reference host or bind variables in the PL/SQL block of a stored procedure.
• The REPLACE option indicates that if the procedure exists, it will be dropped and replaced with the new version created by the statement.
• You cannot restrict the size of the data type in the parameters.
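The points above can be sketched with a short stored procedure that uses IN and OUT parameters. The procedure name query_emp and the employee number 100 are illustrative assumptions; the EMPLOYEES table is assumed to be the one from Oracle's HR sample schema.

```sql
-- Parameters are declared with %TYPE, so no size is given for them.
CREATE OR REPLACE PROCEDURE query_emp
  (p_id     IN  employees.employee_id%TYPE,
   p_name   OUT employees.last_name%TYPE,
   p_salary OUT employees.salary%TYPE)
IS
BEGIN
  -- The procedure performs an action and hands results back
  -- through its OUT parameters.
  SELECT last_name, salary
    INTO p_name, p_salary
    FROM employees
   WHERE employee_id = p_id;
END query_emp;
/

-- Invoking the procedure from an anonymous block.
DECLARE
  v_name   employees.last_name%TYPE;
  v_salary employees.salary%TYPE;
BEGIN
  query_emp(100, v_name, v_salary);
  DBMS_OUTPUT.PUT_LINE(v_name || ' earns ' || v_salary);
END;
/
```

Unlike a function, the procedure call is a statement in its own right and cannot appear inside a SQL expression.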


Benefits
In addition to modularizing application development, stored procedures and functions have the following benefits:
• Improved performance
    - Avoid reparsing for multiple users by exploiting the shared SQL area
    - Avoid PL/SQL parsing at run time by parsing at compile time
    - Reduce the number of calls to the database and decrease network traffic by bundling commands
• Easy maintenance
    - Modify routines online without interfering with other users
    - Modify one routine to affect multiple applications
    - Modify one routine to eliminate duplicate testing
• Improved data security and integrity
    - Control indirect access to database objects from nonprivileged users with security privileges
    - Ensure that related actions are performed together, or not at all, by funneling activity for related tables through a single path
• Improved code clarity: By using appropriate identifier names to describe the actions of the routine, you reduce the need for comments and enhance clarity.

5d What is a cursor? Explain implicit and explicit cursors. How are explicit cursors controlled? Ans Implicit and Explicit Cursors The Oracle server uses work areas, called private SQL areas, to execute SQL statements and to store processing information. You can use PL/SQL cursors to name a private SQL area and access its stored information.


Cursor Type    Description
Implicit       Implicit cursors are declared by PL/SQL implicitly for all DML and PL/SQL SELECT statements, including queries that return only one row.
Explicit       For queries that return more than one row, explicit cursors are declared and named by the programmer and manipulated through specific statements in the block's executable actions.

The Oracle server implicitly opens a cursor to process each SQL statement not associated with an explicitly declared cursor. PL/SQL allows you to refer to the most recent implicit cursor as the SQL cursor.
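A brief sketch of how the SQL cursor can be inspected after a DML statement, assuming the EMPLOYEES table of Oracle's HR sample schema. The department number and the 5% raise are arbitrary illustrative values.

```sql
BEGIN
  -- The server opens an implicit cursor to process this UPDATE.
  UPDATE employees
     SET salary = salary * 1.05
   WHERE department_id = 50;

  -- SQL%FOUND and SQL%ROWCOUNT describe the most recent implicit cursor.
  IF SQL%FOUND THEN
    DBMS_OUTPUT.PUT_LINE(SQL%ROWCOUNT || ' row(s) updated.');
  ELSE
    DBMS_OUTPUT.PUT_LINE('No rows matched.');
  END IF;

  ROLLBACK;  -- undo the change so the illustration leaves the data untouched
END;
/
```

The attributes must be read before the next SQL statement runs, because each new statement replaces the information held by the SQL cursor.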

Explicit Cursors
Use explicit cursors to individually process each row returned by a multiple-row SELECT statement. The set of rows returned by a multiple-row query is called the active set. Its size is the number of rows that meet your search criteria. An explicit cursor "points" to the current row in the active set, which allows your program to process the rows one at a time. A PL/SQL program opens a cursor, processes rows returned by a query, and then closes the cursor. The cursor marks the current position in the active set.

Explicit cursor functions:
• Can process beyond the first row returned by the query, row by row
• Keep track of which row is currently being processed
• Allow the programmer to manually control explicit cursors in the PL/SQL block

Controlling Explicit Cursors
1. Declare the cursor by naming it and defining the structure of the query to be performed within it.
2. Open the cursor. The OPEN statement executes the query and binds any variables that are referenced. Rows identified by the query are called the active set and are now available for fetching.
3. Fetch data from the cursor. After each fetch, test the cursor for any existing row. If there are no more rows to process, close the cursor.
4. Close the cursor. The CLOSE statement releases the active set of rows. It is now possible to reopen the cursor to establish a fresh active set.


You use the OPEN, FETCH, and CLOSE statements to control a cursor. The OPEN statement executes the query associated with the cursor, identifies the result set, and positions the cursor before the first row. The FETCH statement retrieves the current row and advances the cursor to the next row until either there are no more rows or until the specified condition is met. Close the cursor when the last row has been processed. The CLOSE statement disables the cursor.
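The declare, open, fetch, and close cycle might look like the following sketch. It assumes the EMPLOYEES table of Oracle's HR sample schema; the cursor name, variable names, and department number are illustrative.

```sql
DECLARE
  -- 1. Declare: name the cursor and define its query.
  CURSOR emp_cursor IS
    SELECT employee_id, last_name
      FROM employees
     WHERE department_id = 50
     ORDER BY employee_id;

  v_empno employees.employee_id%TYPE;
  v_name  employees.last_name%TYPE;
BEGIN
  -- 2. Open: execute the query and identify the active set.
  OPEN emp_cursor;

  LOOP
    -- 3. Fetch: read the current row and advance the cursor.
    FETCH emp_cursor INTO v_empno, v_name;
    -- Test for an existing row after each fetch.
    EXIT WHEN emp_cursor%NOTFOUND;
    DBMS_OUTPUT.PUT_LINE(v_empno || ' ' || v_name);
  END LOOP;

  -- 4. Close: release the active set.
  CLOSE emp_cursor;
END;
/
```

The %NOTFOUND test is placed immediately after the FETCH so that the last row is not processed twice.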

5e What are hierarchical queries? Explain the syntax of hierarchical queries.
Ans Using hierarchical queries, you can retrieve data based on a natural hierarchical relationship between rows in a table. A relational database does not store records in a hierarchical way. However, where a hierarchical relationship exists between the rows of a single table, a process called tree walking enables the hierarchy to be constructed. A hierarchical query is a method of reporting, in order, the branches of a tree. Imagine a family tree with the eldest members of the family found close to the base or trunk of the tree and the youngest members representing branches of the tree. Branches can have their own branches, and so on.

SELECT [LEVEL], column, expr...
FROM table
[WHERE condition(s)]
[START WITH condition(s)]
[CONNECT BY PRIOR condition(s)] ;

Hierarchical queries can be identified by the presence of the CONNECT BY and START WITH clauses. In the syntax:

SELECT              Is the standard SELECT clause.
LEVEL               For each row returned by a hierarchical query, the LEVEL pseudocolumn returns 1 for a root row, 2 for a child of a root, and so on.
FROM table          Specifies the table, view, or snapshot containing the columns. You can select from only one table.
WHERE               Restricts the rows returned by the query without affecting other rows of the hierarchy.
condition           Is a comparison with expressions.
START WITH          Specifies the root rows of the hierarchy (where to start). This clause is required for a true hierarchical query.
CONNECT BY PRIOR    Specifies the columns in which the relationship between parent and child rows exists. This clause is required for a hierarchical query.

The SELECT statement cannot contain a join or query from a view that contains a join.
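As a hedged illustration of these clauses, the query below walks a management hierarchy from the top down. It assumes the EMPLOYEES table of Oracle's HR sample schema, in which manager_id holds the employee_id of each employee's manager and is NULL for the root.

```sql
-- Top-down tree walk: start at the employee with no manager,
-- then repeatedly find rows whose manager_id equals the
-- employee_id of the row already reached (the PRIOR row).
SELECT LEVEL,
       LPAD(' ', 2 * (LEVEL - 1)) || last_name AS org_chart
  FROM employees
 START WITH manager_id IS NULL
CONNECT BY PRIOR employee_id = manager_id;
```

Placing PRIOR on the other side of the condition (CONNECT BY PRIOR manager_id = employee_id) reverses the direction and walks from a chosen employee up toward the root.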

5f What are composite data types? Explain the PL/SQL records. How is a PL/SQL record created?
Ans Like scalar variables, composite variables have a data type. Composite data types (also known as collections) are RECORD, TABLE, NESTED TABLE, and VARRAY. You use the RECORD data type to treat related but dissimilar data as a logical unit. You use the TABLE data type to reference and manipulate collections of data as a whole object. A record is a group of related data items stored as fields, each with its own name and data type. A table contains a column and a primary key to give you array-like access to rows. After they are defined, tables and records can be reused.

PL/SQL Records
A record is a group of related data items stored in fields, each with its own name and data type. For example, suppose you have different kinds of data about an employee, such as name, salary, hire date, and so on. This data is dissimilar in type but logically related. A record that contains such fields as the name, salary, and hire date of an employee allows you to treat the data as a logical unit. When you declare a record type for these fields, they can be manipulated as a unit.
• Each record defined can have as many fields as necessary.
• Records can be assigned initial values and can be defined as NOT NULL.
• Fields without initial values are initialized to NULL.
• The DEFAULT keyword can also be used when defining fields.
• You can define RECORD types and declare user-defined records in the declarative part of any block, subprogram, or package.
• You can declare and reference nested records. One record can be the component of another record.

TYPE type_name IS RECORD
    (field_declaration[, field_declaration]…);
identifier type_name;

where field_declaration is:

field_name {field_type | variable%TYPE | table.column%TYPE | table%ROWTYPE}
    [[NOT NULL] {:= | DEFAULT} expr]

To create a record, you define a RECORD type and then declare records of that type. In the syntax:

type_name     Is the name of the RECORD type. (This identifier is used to declare records.)
field_name    Is the name of a field within the record.
field_type    Is the data type of the field. (It represents any PL/SQL data type except REF CURSOR. You can use the %TYPE and %ROWTYPE attributes.)
expr          Is the expression that supplies the field's initial value.

The NOT NULL constraint prevents assigning nulls to those fields. Be sure to initialize NOT NULL fields.
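A minimal sketch of defining a RECORD type, declaring a record of that type, and filling it with one row. The type and variable names are hypothetical; the EMPLOYEES table and employee number 100 are assumed to come from Oracle's HR sample schema.

```sql
DECLARE
  -- Define the RECORD type: three fields, one of them NOT NULL
  -- and therefore initialized in the declaration.
  TYPE emp_record_type IS RECORD
    (last_name employees.last_name%TYPE,
     salary    employees.salary%TYPE,
     hire_date DATE NOT NULL := SYSDATE);

  -- Declare a record of that type.
  emp_record emp_record_type;
BEGIN
  -- Fetch one row directly into the record; the select list
  -- matches the fields in number, order, and type.
  SELECT last_name, salary, hire_date
    INTO emp_record
    FROM employees
   WHERE employee_id = 100;

  -- Fields are referenced as record_name.field_name.
  DBMS_OUTPUT.PUT_LINE(emp_record.last_name || ' earns ' || emp_record.salary);
END;
/
```

When a record should mirror every column of a table, declaring it as employees%ROWTYPE avoids listing the fields by hand.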
