
International Journal of Computer Applications (0975 – 8887)
Volume 100 – No. 12, August 2014

Automating the Lower and Higher Normal Form Process for the Database Systems

Manal Fadel Younis
Department of Computer Engineering, University of Baghdad
Baghdad-Iraq

ABSTRACT
Normalization is an important technique for the analysis of relational databases. It aims to create a set of relational tables with minimum data redundancy that preserve consistency and facilitate correct insertion, deletion, and modification. Doing this data analysis manually is very time consuming. This paper therefore proposes a system that aims to automate the most complex phase of database design, normalization. It will help to achieve a good database design and eliminate the drawbacks of the manual normalization process.

The system is suited to eliminating redundancy and inconsistent dependencies automatically. It aims to handle the normalization process up to fifth normal form. This includes creating tables and establishing relationships between those tables by applying the general definitions of the normal forms step by step to the set of functional dependencies in order to remove redundant data. The system is then tested on many examples with multiple candidate keys taken from different sources.

General Terms
Normalization, Relational Database System

Keywords
Functional Dependency, Keys, Redundancy, Normal Forms.

1. INTRODUCTION
A good relational database system alone is not enough to avoid data redundancy. Normalization is a process used for evaluating and correcting table structures to minimize this redundancy and to reduce data anomalies, based on the tables' functional dependencies and primary keys. The normalization process involves assigning attributes to tables based on the concept of determination [1]. It usually involves dividing a database into two or more tables and defining relationships between the tables. The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships [2].

A key is one or more attributes which determine other attributes [1][3].
• Superkey: An attribute (or combination of attributes) that uniquely identifies each row in a table.
• Candidate key: A superkey that does not contain a subset of attributes that is itself a superkey.
• Non-prime attribute: An attribute that does not occur in any candidate key.
• Primary key: A candidate key selected to uniquely identify all other attribute values in any given row. It cannot contain null entries.

2. A FUNCTIONAL DEPENDENCY (FD) [1][4][3]
The attribute B is functionally dependent on the attribute A if each value in column A determines one and only one value in column B (written A → B).
• Partial dependency: The determinant is only part of the primary key.
• Full functional dependency: If attribute B is functionally dependent on a composite key A but not on any subset of that composite key, the attribute B is fully functionally dependent on A.
• Transitive dependency: A functional dependency that exists among non-prime attributes.
• Multivalued dependency: Two or more attributes are dependent on a determinant and each dependent attribute has a specific set of values. The values in these dependent attributes are independent of each other.
• Join dependency: A table T is subject to a join dependency if T can always be recreated by joining multiple tables, each having a subset of the attributes of T.

Normalization works through a series of stages called normal forms. These are [1][5]:
• First Normal Form (1NF).
• Second Normal Form (2NF).
• Third Normal Form (3NF).
• Boyce-Codd Normal Form (BCNF).
• Fourth Normal Form (4NF).
• Fifth Normal Form (5NF).

The first normal form (1NF):
• There are no repeating groups in the table. In other words, each row/column intersection contains one and only one value, not a set of values.
• Define the primary key.
• Define all dependencies on the table.

The second normal form (2NF):
• It is in 1NF.
• Remove all partial dependencies.

The third normal form (3NF):
• It is in 2NF.
• It contains no transitive dependencies.

The Boyce-Codd normal form (BCNF):
• Every determinant in the table is a candidate key.
• It is a special case of 3NF: when the table contains only one candidate key, 3NF and BCNF are equivalent.

The fourth normal form (4NF):
• It is in 3NF.
• Remove the multivalued dependencies.

The fifth normal form (5NF):
• It is in 4NF.
• The entity has no join dependencies. It is also called project-join normal form.

Section 1 introduces the proposed work. Section 2 focuses on the literature survey. Section 3 describes the proposed system for automatic higher normal forms. Section 4 focuses on the results of the proposed system.

3. LITERATURE SURVEY
This section surveys the literature related to the paper:

Sherry Verma, "Comparing manual and automatic normalization techniques for relational database" [6], compares manual and automatic normalization techniques for relational databases, based on the dependency matrix and an approach that automatically generates the primary key identified for the final table.

Amir Hassan Bahmani, Mahmoud Naghibzadeh, "Automatic database normalization and primary key generation" [7]: the authors propose an approach for automatic database normalization and primary key generation. It discusses how to automatically distinguish one primary key for every final table that is generated. The problem is to normalize the database tables automatically. In the current normalization process, even first, second, and third normal forms are difficult to produce automatically.

P.B. Alappanavar, Dhiraj Patil, Radhika Grover, Srishti Hunjan, Yuvraj Girnar, "An Ameliorated Approach towards Automating the Relational Database Normalization Process" [8], aims to automate the most complex and elaborate phase of the database design process, normalization, which will help to achieve the trademarks of an acceptable database design.

G. Sunitha, Dr. A. Jaya, "A KNOWLEDGE BASED APPROACH FOR AUTOMATIC DATABASE NORMALIZATION" [9], aims to provide automatic normalization of databases up to 3NF in order to reduce the time consumed by manual normalization.

Moussa Demba, "ALGORITHM FOR RELATIONAL DATABASE NORMALIZATION UP TO 3NF" [2]: the author proposes an algorithmic approach for database normalization up to third normal form that takes into account all candidate keys, including the primary key.

4. PROPOSED SYSTEM FOR AUTOMATIC ALL NORMAL FORM
First Case study: Consider the following table and its set of attributes, to which the proposed system is applied:

Table 1: Employee

Name   Project   Task   Office   Floor   Phone
Bill   100X      T1     400      4       1400
                 T2     400      4       1400
       200Y      T1     400      4       1400
                 T2     400      4       1400
Sue    100X      T33    442      4       1442
       200Y      T33    442      4       1442
       300Z      T33    442      4       1442
Ed     100X      T2     588      5       1588

First Normal Form:

Step 1: Eliminate the Repeating Groups
Eliminate the nulls by making sure that each repeating group attribute contains an appropriate data value.

Step 2: Identify the Primary Key
The primary key of the above table must consist of more than one field (Name, Project, Task), because the field Name alone does not uniquely identify all of the remaining attributes of an entity (row). For example, the Name value Bill can identify either of two projects. Likewise, a primary key composed of Name and Project cannot uniquely identify one of the two tasks.

To maintain a proper primary key that uniquely identifies any attribute value, the new key must be composed of a combination of Name, Project and Task.

For example, if Name=Bill, Project=100X and Task=T1, the entries for the attributes Office, Floor and Phone must be 400, 4, 1400, and so on. This change converts the table Employee into the table Employee2, which is in 1NF.

Table 2: Employee2

Name   Project   Task   Office   Floor   Phone
Bill   100X      T1     400      4       1400
Bill   100X      T2     400      4       1400
Bill   200Y      T1     400      4       1400
Bill   200Y      T2     400      4       1400
Sue    100X      T33    442      4       1442
Sue    200Y      T33    442      4       1442
Sue    300Z      T33    442      4       1442
Ed     100X      T2     588      5       1588

Step 3: Identify All Dependencies
The primary key from step 2 gives the following dependency:

Name, Project, Task → Office, Floor, Phone

a- Partial dependency:
   Name → Office, Floor, Phone
b- Full dependency: None
c- Transitive dependency:
   • Office is the office number for the employee; for example, Bill works in office number 400.
   • Floor is the floor on which the office is located.
   • Phone is the phone associated with the given office.
   Office → Floor, Phone
d- Multivalued dependency:
   Name →→ Project
   Name →→ Task

Second Normal Form:
Split the table that results from 1NF into two relations (tables) according to the partial dependency:

Table 3: EmpNPT (Name, Project, Task)

Name   Project   Task
Bill   100X      T1
Bill   100X      T2
Bill   200Y      T1
Bill   200Y      T2
Sue    100X      T33
Sue    200Y      T33
Sue    300Z      T33
Ed     100X      T2

Table 4: EmpNOFP (Name, Office, Floor, Phone)

Name   Office   Floor   Phone
Bill   400      4       1400
Sue    442      4       1442
Ed     588      5       1588

Third Normal Form:

Step 1: Identify the transitive dependency
There is a transitive dependency in the table EmpNOFP (Name, Office, Floor, Phone).

Step 2: Split EmpNOFP into two tables

Table 5: EmpNPT (Name, Project, Task)

Name   Project   Task
Bill   100X      T1
Bill   100X      T2
Bill   200Y      T1
Bill   200Y      T2
Sue    100X      T33
Sue    200Y      T33
Sue    300Z      T33

Table 8: EmpNP (Name, Project)

Name   Project
Bill   100X
Bill   200Y
Sue    100X
Sue    200Y
Sue    300Z
Ed     100X

Table 9: EmpNT (Name, Task)

Name   Task
Bill   T1
Bill   T2
Sue    T33
Ed     T2

Table 10: EmpNO (Name, Office)

Name   Office   Floor
Bill   400      4
Sue    442      4
Ed     588      5

Table 11: EmpOPF (Office, Phone)

Office   Phone   Floor
400      1400    4
442      1442    4
588      1588    5

Second Case study of BCNF:
To check for BCNF, all the determinants must be identified and each must be verified to be a candidate key.
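The determinant check described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the proposed system's implementation: the function names `closure`, `candidate_keys` and `bcnf_violations` are hypothetical, and because the table of the second case study is not available here, the example reuses the functional dependencies of the first case study (Name → Office, Floor, Phone and Office → Floor, Phone). A determinant is accepted when its attribute closure covers the whole relation, that is, when it is a superkey and therefore contains a candidate key.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its determinant is already covered.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation (brute force)."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            subset = frozenset(combo)
            if any(key <= subset for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(subset, fds) == attributes:
                keys.append(subset)
    return keys

def bcnf_violations(attributes, fds):
    """Non-trivial FDs whose determinant does not determine every attribute."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != attributes]

def fd(lhs, rhs):
    """Helper (hypothetical) to build an FD as a pair of frozensets."""
    return frozenset(lhs), frozenset(rhs)

# EmpNOFP from the first case study, before the 3NF split.
emp_nofp = frozenset({"Name", "Office", "Floor", "Phone"})
fds_nofp = [fd({"Name"}, {"Office", "Floor", "Phone"}),
            fd({"Office"}, {"Floor", "Phone"})]

# Office-level table after the split (attributes Office, Floor, Phone).
emp_opf = frozenset({"Office", "Floor", "Phone"})
fds_opf = [fd({"Office"}, {"Floor", "Phone"})]

print([sorted(k) for k in candidate_keys(emp_nofp, fds_nofp)])
print([(sorted(l), sorted(r)) for l, r in bcnf_violations(emp_nofp, fds_nofp)])
print(bcnf_violations(emp_opf, fds_opf))
```

Under these assumptions, EmpNOFP has the single candidate key Name, and the determinant Office is reported because it does not determine Name; the office-level table produces no violations. The brute-force key search is exponential in the number of attributes, so it is only suitable for small relations like the ones in the case studies.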