BIS4435 Discussion Topic 1 Question


Why do you think entity and referential integrity are very important when you create a database? Can you imagine any application or situation where it is not possible to maintain these integrities absolutely? Please cite references throughout your text and list the sources at the end.

BIS4435 Discussion Topic 1 Answer by Dodis

When creating a database, entity and referential integrity are very important. Entity integrity is an integrity rule which states that every table of a database must have a primary key and that the column or columns chosen to be the primary key should be unique and not null. A direct consequence of this rule is that duplicate rows are forbidden in a table: if each value of a primary key must be unique, no duplicate rows can logically appear in a table. The NOT NULL characteristic of a primary key ensures that a value can be used to identify every row in a table of a database [1]. The other rule which is very important in creating a database is the referential integrity rule. Referential integrity is a property of data which, when satisfied, requires every value of one attribute (column) of a relation (table) to exist as a value of another attribute (column) in a different relation (table). For example, customer numbers in a CUSTOMER file are the primary keys, and customer numbers in the ORDER file are the foreign keys. If a customer record is deleted, the order records must also be deleted; otherwise they are left without a primary reference [2].

One situation where it is not possible to maintain entity and referential integrity absolutely is the use of out-dated and legacy systems that rely on file systems, rather than a DBMS, to store data. A spreadsheet application is one example: it lacks any built-in data integrity model. This requires companies to invest a large amount of time, money, and personnel in the creation of data integrity systems [3].

Sources

[1] http://www.answers.com/topic/entity-integrity#cite_note-0 http://www.answers.com/topic/database-integrity

[2] http://www.answers.com/topic/referential-integrity http://www.answers.com/topic/data-domain

[3] http://www.answers.com/topic/data-integrity

BIS4435 Discussion Topic 1 Answer by Maria Chrysandreou

I think entity integrity is important when we create a database because it ensures that the primary key of our table is unique and does not have a NULL value. For example, in a table called Book the primary key is the ISBN, which is associated with a book. We cannot have two records within the table “Book” with the same ISBN. It is also important because, when we insert a new record or update an existing one and the operation would produce a duplicate key or a NULL key value, the operation is rejected because of the entity integrity rule [1].
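The rejection behaviour described in this answer can be tried out with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database. The Book table and its ISBN key come from the answer; the title column, the placeholder ISBN strings and the try_insert helper are assumptions made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is written out explicitly: SQLite, for historical reasons, does not
# imply it for a non-integer PRIMARY KEY column.
conn.execute("CREATE TABLE Book (isbn TEXT PRIMARY KEY NOT NULL, title TEXT)")


def try_insert(isbn, title):
    """Return True if the row was accepted, False if the DBMS rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO Book (isbn, title) VALUES (?, ?)", (isbn, title))
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert("isbn-placeholder-1", "First book")      # accepted
duplicate = try_insert("isbn-placeholder-1", "Same ISBN")   # rejected: duplicate key
missing = try_insert(None, "No ISBN")                       # rejected: NULL key
print(first, duplicate, missing)  # True False False
```

Only the first row ends up in the table; the other two inserts are refused by the database itself, without any checking code in the application.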

Furthermore, referential integrity is important when we create a database because it ensures that if a foreign key exists in a table, then that foreign key must match a primary key of another table [2]. It also ensures that users or applications do not enter inconsistent data [3]. For example, we have two tables, Customer and Order, where the relationship is one-to-many (1:*): a customer can place more than one order. So in the Order table we add a column which is the foreign key. Every value of this foreign key must exist in the Customer table as a primary key value.
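A minimal sketch of the Customer and Order example, again assuming Python's sqlite3 module and an in-memory database. The column names (customer_no, order_no), the sample customer and the place_order helper are hypothetical. Note that SQLite only checks foreign keys once the foreign_keys pragma has been switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on
conn.execute("CREATE TABLE Customer (customer_no INTEGER PRIMARY KEY, name TEXT)")
# "Order" is a reserved word in SQL, so the table name has to be quoted.
conn.execute("""
    CREATE TABLE "Order" (
        order_no    INTEGER PRIMARY KEY,
        customer_no INTEGER NOT NULL REFERENCES Customer(customer_no)
    )
""")
conn.execute("INSERT INTO Customer VALUES (1, 'Alice')")
conn.commit()


def place_order(order_no, customer_no):
    """Return True if the order row was accepted, False if it was rejected."""
    try:
        with conn:
            conn.execute('INSERT INTO "Order" VALUES (?, ?)', (order_no, customer_no))
        return True
    except sqlite3.IntegrityError:
        return False


ok_first = place_order(100, 1)    # accepted: customer 1 exists
ok_second = place_order(101, 1)   # accepted: one customer, many orders
ok_orphan = place_order(102, 99)  # rejected: there is no customer 99
print(ok_first, ok_second, ok_orphan)  # True True False
```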

One situation where it is not possible to maintain entity and referential integrity absolutely is the use of out-dated and legacy systems that rely on file systems, rather than a DBMS, to store data. A spreadsheet application is one example: it lacks any built-in data integrity model. This requires companies to invest a large amount of time, money, and personnel in the creation of data integrity systems [4].

References

[1] http://www.databasedev.co.uk/entity_integrity.html

[2] Connolly, T.M., and Begg, C.E., Database Systems: A practical approach to Design, Implementation and management.

[3] http://www.webopedia.com/TERM/R/referential_integrity.html

[4] http://www.answers.com/topic/data-integrity

BIS4435 Discussion Topic 1 Answer by Laith Gharib

I believe that entity and referential integrity are important because, in database design, the term referential integrity simply means that if a row in a table has a pointer to a row in another table, the row that is pointed at must exist. Put another way, you should not remove a row which contains information that other rows depend on. It's the key to correct data being entered in the database. It is also the set of rules that govern what can be inserted and what cannot.

Also when we talk about referential integrity we need to know that it is a database constraint that ensures that references between data are indeed valid and intact. Referential integrity is a fundamental principle of database theory and arises from the notion that a database should not only store data, but should actively seek to ensure its quality.

Here is some additional information that I found on the Web:

“Referential integrity in a relational database is consistency between coupled tables. Referential integrity is usually enforced by the combination of a primary key and a foreign key. For referential integrity to hold, any field in a table that is declared a foreign key can contain only values from a parent table's primary key field” [1]. “Referential integrity is a database management safeguard that ensures every foreign key matches a primary key. For example, client numbers in a client file are the primary keys, and client numbers in the order file are the foreign keys. If a client record is deleted, the order records must also be deleted; otherwise they are left without a primary reference. If the DBMS does not test for this, it must be programmed into the applications.” [2]. Also, “in the relational data model, entity integrity is one of the three inherent integrity rules. Entity integrity is an integrity rule which states that every table must have a primary key and that the column or columns chosen to be the primary key should be unique and not null” [3].

Finally, I can't imagine any application or situation where it is not possible to maintain these integrities, because referential integrity rules prevent you from:

a) Adding records to a foreign key in the related table where there are no matching values in the primary key of the parent table.
b) Changing values in the primary key field of the parent table if there are matching records in any related tables.
c) Deleting records from the primary table when there are matching records in any related tables.

However, “in terms of changing or deleting records, referential integrity can be overridden by selecting the cascade update or cascade delete option. However this is not advisable, certainly not for the novice database creator, and it doesn't adhere to good relational database build standards.” [4]
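The difference between refusing a delete and cascading it can be compared side by side. The sketch below assumes Python's sqlite3 module. The client and client_order tables, their columns and the delete_client helper are hypothetical names chosen for illustration. With RESTRICT the parent row cannot be deleted while orders refer to it; with CASCADE the delete goes through and the dependent orders are removed with it, so no orphaned rows are left in either case.

```python
import sqlite3


def delete_client(on_delete):
    """Create the tables with the given ON DELETE rule, try to delete the
    client, and return (delete_succeeded, orders_remaining)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE client (client_no INTEGER PRIMARY KEY)")
    # on_delete is a fixed keyword supplied by this script, never user input.
    conn.execute(
        "CREATE TABLE client_order ("
        " order_no INTEGER PRIMARY KEY,"
        " client_no INTEGER NOT NULL"
        " REFERENCES client(client_no) ON DELETE " + on_delete + ")"
    )
    conn.execute("INSERT INTO client VALUES (1)")
    conn.executemany("INSERT INTO client_order VALUES (?, 1)", [(10,), (11,)])
    conn.commit()
    try:
        with conn:
            conn.execute("DELETE FROM client WHERE client_no = 1")
        deleted = True
    except sqlite3.IntegrityError:
        deleted = False
    remaining = conn.execute("SELECT COUNT(*) FROM client_order").fetchone()[0]
    conn.close()
    return deleted, remaining


print(delete_client("RESTRICT"))  # (False, 2): delete refused, orders kept
print(delete_client("CASCADE"))   # (True, 0): client and its orders removed
```

Which rule is appropriate depends on the business rules; the cascade option is convenient but silently removes dependent data, which is one reason the quoted source advises caution.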

References

[1] en.wikipedia.org/wiki/Referential_integrity

[2] computing-dictionary.thefreedictionary.com/referential+integrity

[3] Beynon-Davies P. (2004). Database Systems 3rd Edition. Palgrave, Basingstoke, UK. ISBN 1-4039-1601-2

[4] http://www.brighthub.com

Respond Comments on Laith Gharib answer by Nida Waseem

Regarding the importance of entity and referential integrity, I completely agree with the point of my course fellow that "It's the key to correct data being entered in the database". Elaborating a bit further, entity integrity makes sure that each row in a table is uniquely identified and that no duplicate row is entered, by mistake or deliberately. Having duplicate records can lead to update anomalies and inconsistent data. In the same way, referential integrity makes sure that data in a child table does not reference a record in the parent table that no longer exists. It makes sure that any change to a given table is properly propagated to related tables, to prevent incorrect or dirty data in the database.

Now, regarding a situation where it is not possible to maintain these integrities: I also can’t think of a scenario where it is impossible, but there could be situations where maintaining them is costly. For example, if we have a very large table with millions of records and we have defined an entity integrity constraint on it, then operations such as insert could be time consuming, because before inserting a new record we need to check that a record with the same key values does not already exist [1]. Furthermore, take a distributed database scenario. In distributed databases, enforcing referential integrity between two remote tables can be complicated. For example, if you have defined a relationship between two remote tables and you update one table in a way that requires the entry in the other remote table to be changed, and the other remote table is not available because of a network failure, then both tables may become inaccessible [2]. Otherwise, I believe entity and referential integrity are very important in maintaining non-redundant, consistent data.

References

[1] http://download.oracle.com/docs/cd/B28359_01/server.111/b28313/constra.htm

[2] http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28424/adfns_constraints.htm#i1006526

Respond Comment on Nida Waseem comment to Laith Gharib answer by Gilgith Karthika Vilasan

There are two basic ways to implement referential integrity (RI) rules: one is within the database, and the other is in application logic, either in the business objects themselves or within your database encapsulation layer. If we implement RI within our business layer, we can more easily tackle the problems created by the distributed database scenario.

BIS4435 Discussion Topic 1 Answer by Khuram Shahzad

Integrities (entity and referential) are very important in a relational (table) database management system. We have to maintain them throughout the database system in order to transfer, store and retrieve data from different tables according to predefined rules and constraints.

In a relational database system, the integrity rules between parent and child tables support insertion, deletion and modification.

ENTITY INTEGRITY

Entity integrity is based on the following rules:

1) Every table must have a primary key.
2) Each primary key value must be unique.
3) A primary key value must not be null.

The relationship between some rows of the DEPT and EMP tables, shown in the following figure, illustrates entity integrity concepts. For example, entity integrity ensures that every primary key value in the DEPTNO column of the DEPT table is unique and not null, so that each department row can be identified (fig 1 Ref.1).

If a parent table has a primary key, an insertion fails when a user attempts to enter a duplicate or null key value. Every row therefore carries a key value that can be used during insertion, modification, deletion and searching in the parent and child tables.

REFERENTIAL INTEGRITY

Referential integrity requires every value of one attribute of a relation (table) to exist as a value of another attribute in a different table.

Referential integrity is based on the following rules:

1) Referential integrity is enforced with a primary key and a foreign key (2).
2) The values of the foreign key must be valid.

The relationship between some rows of the DEPT and EMP tables, shown in the following figure, illustrates referential integrity concepts. For example, referential integrity ensures that every foreign key value in the DEPT column of the EMP table matches a primary key value in the DEPTNO column of the DEPT table (fig 1 Ref.1).

A foreign key is a column or combination of columns in a table (referred to as the child table) which takes its values from the primary key column of the parent table (2).

Figure 1.

Entity Integrity and Referential integrity of DEPT and EMP tables (Ref.1)

In conclusion, I think it would be impossible to maintain an application without these integrities; their absence would create problems during insertion, deletion, modification and record searching. In a relational database we must maintain strong integrity in order to generate quality, accurate information from the database system.

Reference:

[1] http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=/com.ibm.db29.doc.intro/db2z_integrity.htm

[2] http://oa.mo.gov/itsd/cio/architecture/domains/information/CC-DBMSIntegrityARC.pdf

BIS4435 Discussion Topic 1 Answer by Althea Walters

Entity integrity is important because according to (Article world.org- no date [online]) “primary keys serve as identifiers for individual tuples (rows) in a database. This means that no primary key should be "missing". If a primary key happens to be null, it would be a contradiction in terms. This is because it would effectively state that there is an entity without a known identity. This is the reason for the use of the term 'entity integrity'. Also, if nulls are allowed, this implies that some tuples cannot be uniquely identified. For example, if two or more tuples have null as their primary key, then we cannot distinguish between them.”

Secondly, entity integrity is important because, according to (Wikipedia Free Encyclopedia- no date [online]), “it forbid duplication of row.” This is because the constraint requires each value of the primary key to be unique, so “no duplicate rows can logically appear.”

All in all, entity integrity ensures that the data you store remains accurate and understandable to all aspects of the database.

On the other hand, referential integrity is important because it creates the relationship or link between the relations in a database. This relationship/link is created through a foreign key. According to (Wikipedia Free Encyclopedia- no date [online]), a “foreign key is simple a column or set of column in one (referencing) table that refers to a column or set of columns in another (referenced) table.” “If these links are ever broken, the system becomes unreliable at best and unusable at its worst” states (Builderau.com- no date [online]). In other words, referential integrity makes sure that all the references within a database are valid and that no invalid link exists between the various tables that make up the system. It also helps to prevent user error by failing any attempt by the user to link to records that do not exist.

Since integrity constraints are put in place to ensure data accuracy, comprehensibility and usability, I cannot think of any application or situation where it is not possible to maintain this integrity absolutely. I would assume that if there are such situations, then there will also be a high risk of data inconsistency, data redundancy and incomprehensibility.

References

Entity Integrity n.d Retrieved Oct 9th 2009 http://www.articleworld.org/index.php/Entity_integrity

Entity Integrity n.d Retrieved Oct 9th 2009 http://en.wikipedia.org/wiki/Entity_integrity

Referential Integrity n.d Retrieved Oct 9th 2009 http://www.builderau.com.au/program/mysql/soa/Using-foreign-keys

Referential Integrity n.d Retrieved Oct 9th 2009 http://en.wikipedia.org/wiki/Referential_integrity

BIS4435 Discussion Topic 1 Answer by Faith Ime Abakada

Entity and referential integrity are two very important rules that need to be observed when creating a database.

They are the access cards to correct and efficient data being inputted in the database.

To explain my reason for accepting their importance I would first of all define what these two terms mean and their relative importance with respect to the creation of a database.

Entity integrity ''ensures that there are no duplicate records within the table created and that the field that identifies each record within the table is unique and never null''. This identifying field is known as the primary key. [1]

Referential integrity ''ensures that relationships between tables remain consistent. When one table has a foreign key to another table, the concept of referential integrity states that you may not add a record to the table that contains the foreign key unless there is a corresponding record in the linked table. It also includes the techniques known as cascading update and cascading delete, which ensure that changes made to the linked table are reflected in the primary table''[2]

The importance of these two rules is that they decide what can be inserted or updated and what cannot. Also, when a database obeys these two integrity constraints, duplicate and orphaned records are rejected at the point of entry, which greatly reduces the risk of data redundancy problems and makes the database more suitable for its end users.

With respect to whether I can imagine any application or situation where it is not possible to maintain these integrities absolutely, my answer is no: I don't believe there is any such situation.

References

[1] http://www.databasedev.co.uk

[2] http://www.databaseabout.com

http://www.articleworld.org

http://www.webopedia.com

BIS4435 DISCUSSION TOPIC 1 READING MATERIAL

Database integrity

Database integrity ensures that data entered into the database is accurate, valid, and consistent. Any applicable integrity constraints and data validation rules must be satisfied before permitting a change to the database.

Three basic types of database integrity constraints are:

• Entity integrity, allowing no two rows to have the same identity within a table.
• Domain integrity, restricting data to predefined data types, e.g. dates.
• Referential integrity, requiring the existence of a related row in another table, e.g. a customer for a given customer ID.

http://www.answers.com/topic/database-integrity

Types of integrity constraints

Data integrity is normally enforced in a database system by a series of integrity constraints or rules. Three types of integrity constraints are an inherent part of the relational data model: entity integrity, referential integrity and domain integrity.

Entity integrity concerns the concept of a primary key. Entity integrity is an integrity rule which states that every table must have a primary key and that the column or columns chosen to be the primary key should be unique and not null.

Referential integrity concerns the concept of a foreign key. The referential integrity rule states that any foreign key value can only be in one of two states. The usual state of affairs is that the foreign key value refers to a primary key value of some table in the database. Occasionally, and this will depend on the rules of the business, a foreign key value can be null. In this case we are explicitly saying that either there is no relationship between the objects represented in the database or that this relationship is unknown.
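The two permitted states of a foreign key value (a reference to an existing primary key, or null) can be sketched as follows, assuming Python's sqlite3 module. The dept and emp tables, their columns and the add_emp helper are hypothetical and serve only as an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE dept (deptno INTEGER PRIMARY KEY)")
# deptno in emp is deliberately nullable: an employee may have no known department.
conn.execute("""
    CREATE TABLE emp (
        empno  INTEGER PRIMARY KEY,
        deptno INTEGER REFERENCES dept(deptno)
    )
""")
conn.execute("INSERT INTO dept VALUES (10)")
conn.commit()


def add_emp(empno, deptno):
    """Return True if the employee row was accepted, False otherwise."""
    try:
        with conn:
            conn.execute("INSERT INTO emp VALUES (?, ?)", (empno, deptno))
        return True
    except sqlite3.IntegrityError:
        return False


refers_to_existing = add_emp(1, 10)   # accepted: department 10 exists
no_relationship = add_emp(2, None)    # accepted: null means none or unknown
dangling = add_emp(3, 99)             # rejected: department 99 does not exist
print(refers_to_existing, no_relationship, dangling)  # True True False
```

If the business rules say every employee must belong to a department, declaring the column NOT NULL removes the second state.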

Domain integrity specifies that all columns in relational database must be declared upon a defined domain. The primary unit of data in the relational data model is the data item. Such data items are said to be non-decomposable or atomic. A domain is a set of values of the same type. Domains are therefore pools of values from which actual values appearing in the columns of a table are drawn.

If a database supports these features, it is the responsibility of the database to ensure data integrity as well as the consistency model for the data storage and retrieval. If a database does not support these features, it is the responsibility of the application to ensure data integrity while the database supports the consistency model for the data storage and retrieval. Having a single, well controlled, and well defined data integrity system increases stability (one centralized system performs all data integrity operations), performance (all data integrity operations are performed in the same tier as the consistency model), re-usability (all applications benefit from a single centralized data integrity system), and maintainability (one centralized system for all data integrity administration).

Today, since all modern databases support these features (see Comparison of relational database management systems), it has become the de facto responsibility of the database to ensure data integrity. Out-dated and legacy systems that use file systems (text, spreadsheets, ISAM, flat files, etc.) for their consistency model lack any kind of data integrity model. This requires companies to invest a large amount of time, money, and personnel in the creation of data integrity systems on a per-application basis that effectively just duplicate the existing data integrity systems found in modern databases. Many companies, and indeed many database systems themselves, offer products and services to migrate out-dated and legacy systems to modern databases to provide these data integrity features. This offers companies substantial savings in time, money, and resources because they do not have to develop per-application data integrity systems that must be re-factored each time business requirements change.

Examples

An example of a data integrity mechanism in cryptography is the use of SHA-256 hash values. These blocks of bytes function as a compact fingerprint (digest) of the content of a data item. Should the data change even slightly, the SHA-256 hash would yield a totally different result. MD5, which was formerly used to provide message integrity, has since been broken and is no longer recommended for that purpose.
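The hash comparison can be sketched with Python's standard hashlib module. The sha256_hex helper and the sample messages are hypothetical; only the use of SHA-256 itself comes from the text.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


original = sha256_hex(b"Transfer 100 to account 42")
unchanged = sha256_hex(b"Transfer 100 to account 42")
tampered = sha256_hex(b"Transfer 900 to account 42")  # one character differs

print(original == unchanged)  # True: same content, same digest
print(original == tampered)   # False: a small change gives a different digest
```

A receiver who knows the expected digest can recompute it over the data it received and compare the two values to detect accidental or deliberate modification.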

Another example of a data integrity mechanism is the parent and child relationship of related records. If a parent record owns one or more related child records, all of the referential integrity processes are handled by the server itself, which automatically ensures the accuracy and integrity of the data so that no child record can exist without a parent (also called being orphaned) and no parent loses its child records. It also ensures that no parent record can be deleted while it owns any child records. All of this is handled at the database level and does not require coding integrity checks into each application.

http://www.answers.com/topic/data-integrity

Entity integrity

In the relational data model, entity integrity is one of the three inherent integrity rules. Entity integrity is an integrity rule which states that every table must have a primary key and that the column or columns chosen to be the primary key should be unique and not null [1]. A direct consequence of this integrity rule is that duplicate rows are forbidden in a table. If each value of a primary key must be unique no duplicate rows can logically appear in a table. The NOT NULL characteristic of a primary key ensures that a value can be used to identify all rows in a table.

Within relational databases using SQL, entity integrity is enforced by adding a primary key clause to a schema definition. The system enforces entity integrity by not allowing operations (INSERT, UPDATE) to produce an invalid primary key. Any operation that would create a duplicate primary key or one containing nulls is rejected. Entity integrity ensures that the data you store remains in the proper format and stays comprehensible.

http://www.answers.com/topic/entity-integrity#cite_note-0

Referential integrity

A database management safeguard that ensures every foreign key matches a primary key. For example, customer numbers in a customer file are the primary keys, and customer numbers in the order file are the foreign keys. If a customer record is deleted, the order records must also be deleted; otherwise they are left without a primary reference. If the DBMS does not test for this, it must be programmed into the applications.

Referential integrity is a property of data which, when satisfied, requires every value of one attribute (column) of a relation (table) to exist as a value of another attribute in a different (or the same) relation (table).

Less formally, and in relational databases: For referential integrity to hold, any field in a table that is declared a foreign key can contain only values from a parent table's primary key or a candidate key. For instance, deleting a record that contains a value referred to by a foreign key in another table would break referential integrity. Some relational database management systems (RDBMS) can enforce referential integrity, normally either by deleting the foreign key rows as well to maintain integrity, or by returning an error and not performing the delete. Which method is used may be determined by a referential integrity constraint defined in a data dictionary.

Consider an example of a database that has not enforced referential integrity. In this example, there is a foreign key (artist_id) value in the album table that references a non-existent artist; in other words, there is a foreign key value with no corresponding primary key value in the referenced table. What happened here was that there was an artist called "Aerosmith", with an artist_id of "4", which was deleted from the artist table. However, the album "Eat the Rich" referred to this artist. With referential integrity enforced, this would not have been possible.

http://www.answers.com/topic/referential-integrity
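The Aerosmith situation can be reproduced, and the resulting orphan located, with a short sketch. It assumes Python's sqlite3 module, with foreign key enforcement explicitly switched off to imitate a system that does not check references. The artist and album tables and the artist_id column follow the example in the text; the album_id column and its value are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Enforcement is switched off on purpose, so nothing stops the delete below.
conn.execute("PRAGMA foreign_keys = OFF")
conn.execute("CREATE TABLE artist (artist_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE album (
        album_id  INTEGER PRIMARY KEY,
        title     TEXT,
        artist_id INTEGER REFERENCES artist(artist_id)
    )
""")
conn.execute("INSERT INTO artist VALUES (4, 'Aerosmith')")
conn.execute("INSERT INTO album VALUES (1, 'Eat the Rich', 4)")
conn.execute("DELETE FROM artist WHERE artist_id = 4")  # leaves the album orphaned
conn.commit()

# An outer join finds foreign key values that no longer match any primary key.
orphans = conn.execute("""
    SELECT album.title, album.artist_id
    FROM album
    LEFT JOIN artist ON album.artist_id = artist.artist_id
    WHERE album.artist_id IS NOT NULL
      AND artist.artist_id IS NULL
""").fetchall()
print(orphans)  # [('Eat the Rich', 4)]
```

A query of this kind is one way an application or a clean-up job might audit a database in which the DBMS itself does not enforce the references.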

Data domain

In data management and database analysis, a data domain refers to all the unique values which a data element may contain. The rule for determining the domain boundary may be as simple as a data type with an enumerated list of values.

For example, a database table that has information about people, with one record per person, might have a "gender" column. This gender column might be declared as a string data type, and allowed to have one of two known code values: "M" for male, "F" for female, and NULL for records where gender is unknown or not applicable (or arguably "U" for unknown as a sentinel value). The data domain for the gender column is: "M", "F". In a normalized data model, the reference domain is typically specified in a reference table. Following the previous example, a Gender reference table would have exactly two records, one per allowed value, excluding NULL. Reference tables are formally related to other tables in a database by the use of foreign keys.

Less simple domain boundary rules, if database-enforced, may be implemented through a check constraint or, in more complex cases, in a database trigger. For example, a column requiring positive numeric values may have a check constraint declaring the values must be greater than zero.
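A check constraint of the kind described here might look as follows. The sketch assumes Python's sqlite3 module; the person table, the height_cm column and the add_person helper are hypothetical, while the "M"/"F" codes and the greater-than-zero rule come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        gender    TEXT CHECK (gender IN ('M', 'F')),      -- NULL still allowed: unknown
        height_cm REAL NOT NULL CHECK (height_cm > 0)     -- must be a positive number
    )
""")


def add_person(person_id, gender, height_cm):
    """Return True if the row satisfies the domain rules, False if rejected."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO person VALUES (?, ?, ?)",
                (person_id, gender, height_cm),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    add_person(1, "M", 180.0),   # accepted
    add_person(2, None, 165.5),  # accepted: gender unknown
    add_person(3, "X", 170.0),   # rejected: 'X' is outside the domain
    add_person(4, "F", -1.0),    # rejected: height must be greater than zero
]
print(results)  # [True, True, False, False]
```

The NULL gender passes because a check constraint only rejects a row when its condition evaluates to false, and a comparison with NULL evaluates to unknown.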

This definition combines the concepts of domain as an area over which control is exercised and the mathematical idea of a set of values of an independent variable for which a function is defined. See: domain (mathematics).

http://www.answers.com/topic/data-domain

Unique key or primary key

In relational database design, a unique key or primary key is a candidate key to uniquely identify each row in a table. A unique key or primary key comprises a single column or set of columns. No two distinct rows in a table can have the same value (or combination of values) in those columns. Depending on its design, a table may have arbitrarily many unique keys but at most one primary key.

A unique key must uniquely identify all possible rows that exist in a table and not only the currently existing rows. Examples of unique keys are Social Security numbers (associated with a specific person[1][2]) or ISBNs (associated with a specific book). Telephone books and dictionaries cannot use names, words, or Dewey Decimal system numbers as candidate keys because they do not uniquely identify telephone numbers or words.

A primary key is a special case of unique keys. The major difference is that for unique keys the implicit NOT NULL constraint is not automatically enforced, while for primary keys it is enforced. Thus, the values in unique key columns may or may not be NULL. Another difference is that primary keys must be defined using another syntax.
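The different treatment of NULL can be observed directly. The sketch below assumes Python's sqlite3 module; the member table, its columns, the sample address and the add_member helper are hypothetical. In SQLite, as in the SQL standard, several NULLs may coexist in a UNIQUE column, but this detail varies between database systems, so the last two results should not be assumed for other products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE member (
        member_id INTEGER PRIMARY KEY,   -- primary key
        email     TEXT UNIQUE            -- unique key, NULL permitted
    )
""")


def add_member(member_id, email):
    """Return True if the row was accepted, False if a key rule rejected it."""
    try:
        with conn:
            conn.execute("INSERT INTO member VALUES (?, ?)", (member_id, email))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    add_member(1, "a@example.org"),  # accepted
    add_member(2, "a@example.org"),  # rejected: duplicate unique key value
    add_member(3, None),             # accepted: unique key may be NULL
    add_member(4, None),             # accepted in SQLite: NULLs do not collide
    add_member(1, "b@example.org"),  # rejected: duplicate primary key value
]
print(results)  # [True, False, True, True, False]
```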

The relational model, as expressed through relational calculus and relational algebra, does not distinguish between primary keys and other kinds of keys. Primary keys were added to the SQL standard mainly as a convenience to the application programmer.

Unique keys as well as primary keys can be referenced by foreign keys.

Defining primary keys

Primary keys are defined in the ANSI SQL Standard, through the PRIMARY KEY constraint. The syntax to add such a constraint to an existing table is defined in SQL:2003 like this:

ALTER TABLE <table identifier>
    ADD [ CONSTRAINT <constraint identifier> ]
    PRIMARY KEY ( <column expression> {, <column expression>}... )

The primary key can also be specified directly during table creation. In the SQL Standard, primary keys may consist of one or multiple columns. Each column participating in the primary key is implicitly defined as NOT NULL. Note that some DBMS require that primary key columns are explicitly marked as being NOT NULL.

CREATE TABLE table_name (
    id_col  INT,
    col2    CHARACTER VARYING(20),
    ...
    CONSTRAINT tab_pk PRIMARY KEY(id_col),
    ...
)

If the primary key consists only of a single column, the column can be marked as such using the following syntax:

CREATE TABLE table_name (
    id_col  INT PRIMARY KEY,
    col2    CHARACTER VARYING(20),
    ...
)
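The statements above show single-column keys. A primary key made of several columns, which the standard also allows, can be sketched as follows, assuming Python's sqlite3 module. The order_line table, its columns and the add_line helper are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_line (
        order_no INTEGER NOT NULL,
        line_no  INTEGER NOT NULL,
        product  TEXT,
        CONSTRAINT order_line_pk PRIMARY KEY (order_no, line_no)
    )
""")


def add_line(order_no, line_no, product):
    """Return True if the row was accepted, False if the key was a duplicate."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO order_line VALUES (?, ?, ?)",
                (order_no, line_no, product),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    add_line(1, 1, "pen"),   # accepted
    add_line(1, 2, "ink"),   # accepted: same order, different line
    add_line(2, 1, "pad"),   # accepted: same line number, different order
    add_line(1, 1, "clip"),  # rejected: the combination (1, 1) already exists
]
print(results)  # [True, True, True, False]
```

Uniqueness applies to the combination of columns, so either column on its own may repeat.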

Defining unique keys

The definition of unique keys is syntactically very similar to primary keys.

ALTER TABLE <table identifier>
    ADD [ CONSTRAINT <constraint identifier> ]
    UNIQUE ( <column expression> {, <column expression>}... )

Likewise, unique keys can be defined as part of the CREATE TABLE SQL statement.

CREATE TABLE table_name (
    id_col   INT,
    col2     CHARACTER VARYING(20),
    key_col  SMALLINT,
    ...
    CONSTRAINT key_unique UNIQUE(key_col),
    ...
)

CREATE TABLE table_name (
    id_col   INT PRIMARY KEY,
    col2     CHARACTER VARYING(20),
    ...
    key_col  SMALLINT UNIQUE,
    ...
)

Surrogate keys

In some design situations the natural key that uniquely identifies a tuple in a relation is difficult to use for software development. For example, it may involve multiple columns or large text fields. A surrogate key can be used as the primary key. In other situations there may be more than one candidate key for a relation, and no candidate key is obviously preferred. A surrogate key may be used as the primary key to avoid giving one candidate key artificial primacy over the others.

Since primary keys exist primarily as a convenience to the programmer, surrogate primary keys are often used—in many cases exclusively—in database application design.
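A surrogate key alongside a natural key might be set up as in the sketch below, which assumes Python's sqlite3 module. The book table, its columns, the placeholder ISBN strings and the add_book helper are hypothetical. The surrogate book_id is generated by the DBMS, while the natural key keeps a UNIQUE constraint so that duplicates are still rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE book (
        book_id INTEGER PRIMARY KEY,     -- surrogate key, assigned by the DBMS
        isbn    TEXT NOT NULL UNIQUE,    -- natural key, still protected
        title   TEXT
    )
""")


def add_book(isbn, title):
    """Insert a book and return its generated surrogate id, or None if rejected."""
    try:
        with conn:
            cur = conn.execute(
                "INSERT INTO book (isbn, title) VALUES (?, ?)", (isbn, title)
            )
        return cur.lastrowid
    except sqlite3.IntegrityError:
        return None


id_a = add_book("isbn-placeholder-A", "First title")
id_b = add_book("isbn-placeholder-B", "Second title")
id_dup = add_book("isbn-placeholder-A", "Same natural key")  # rejected
print(id_a != id_b, id_dup)  # True None
```

Other tables can then reference the short integer book_id, while the UNIQUE constraint keeps the natural key from being duplicated.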

Due to the popularity of surrogate primary keys, many developers and in some cases even theoreticians have come to regard surrogate primary keys as an inalienable part of the relational data model. This is largely due to a migration of principles from the Object-Oriented Programming model to the relational model, creating the hybrid object-relational model. In the ORM, these additional restrictions are placed on primary keys:

• Primary keys should be immutable, that is, not change until the record is destroyed.
• Primary keys should be anonymous integer or numeric identifiers.

However, neither of these restrictions is part of the relational model or any SQL standard. Due diligence should be applied when deciding on the immutability of primary key values during database and application design. Some database systems even imply that values in primary key columns cannot be changed using the UPDATE SQL statement.

Notes

1. ^ SSN uniqueness: Rare SSN duplicates do exist in the field, a condition that led to problems with early commercial computer systems that relied on SSN uniqueness. Practitioners are taught that well-known duplications in SSN assignments occurred in the early days of the SSN system. This situation points out the complexity of designing systems that assume unique keys in real-world data.

2. ^ http://news.yahoo.com/s/ap/us_identity_sharing The Federated States of Micronesia, the Republic of the Marshall Islands and the Republic of Palau still use local social security numbers which overlap with those of residents of New Hampshire and Maine.

http://www.answers.com/topic/unique-key
