
Chapter 8 Database Design and Construction
A database is the place where permanent data are stored for software products. This chapter presents details on database design and construction.

In Chapter 8, we will learn

• What a database is
• What a relational database is
• What the elements of a database are
• How to do database design
• What an entity-relationship diagram is
• How to create database relationships
• How to ensure the integrity of a database

Databases are designed to hold a huge amount of data in such a manner that these data can easily be accessed and modified when required. The data in a database are permanent in nature unless you delete them: they won’t be lost when the power is off or when a software application crashes. They won’t be lost if you lose the connection to your website.

The data you store in a database must conform to some requirements so that they are usable. Some important characteristics of a database include the following:

• Data consistency
• Data query
• Data manipulation

Databases are very important for most industries that run their businesses using software products. These industries perform their everyday business transactions, such as sales orders and purchase orders, by storing the information about these transactions in their databases. If any of the transaction data are lost or get corrupted, then these businesses face problems. Databases are also used to store data about many other things. Today, social media has become very popular. People use social media to make friends as well as to do business. Storing the personal and business data of the users and keeping them secure and safe are important for the owners of these social media sites.

Databases are an integral part of any software product. Most software products need to create and keep permanent data, and this permanent data is created and kept in databases.

Databases are extremely important for many types of software products. The power of search engines like Google comes from their sophisticated and huge databases. Most commercial and enterprise software products (ERPs, SAS, accounting, etc.) depend on their databases for all business transactions and reports. Even social media software products like Facebook and Twitter have databases as their backbone.

There are two parts to any database: the database engine and the stored data. The database engine itself is a software product, and many software companies build database engines. For example, the Oracle database engine is created by Oracle Corporation and MS SQL Server is created by Microsoft Corporation.

Software companies that build software products use one of these database engines to create and store the data used by their software products.


Databases and Software Engineering Methodologies
In the Waterfall methodology, the size of the software project teams tends to be large. Thus, database design and construction require software engineers who have specialized skills in those activities. Such software engineers will be solely working on designing and creating the databases.

In agile methodologies, there are no separate team members who exclusively take on database design and construction assignments. Database design and creation are mostly done by the same team members who are also involved in design and programming activities. This is the norm except for very large agile projects.

Database Types
In general discussions, when people use the term databases, they mean both the database engine and the database schemas. However, there is a difference between these two.
• The database engine can be thought of as a superstructure. In the database engine, you can create many schemas. The database engine is also used for querying and accessing the data in the schemas.
• The database schema is the data structure (e.g., database tables in the case of relational databases, which are explained later) and the data that reside in that data structure.

Here, when we use the term database types, we are actually talking about the database engine. From the very beginning of the software industry, companies and researchers have tried to create databases that can be used in building commercial and scientific software products. Because of these efforts, many types of databases exist. Some database types include file server based, relational, object oriented, object relational, and NoSQL. Of all these types of databases, relational databases have been the most commonly used. Some of the most popular relational databases include Oracle, Microsoft’s SQL Server, and MySQL. Relational databases can hold very large amounts of data (trillions of gigabytes in size).

NoSQL databases

There are many types of databases. Most databases fall under the category of either NoSQL databases or relational databases.

Web-based software products have an architecture in which the database sits far away from the users’ browsers. If the users want to validate their input using traditional databases, then the responsiveness of the web-based software products becomes a concern. There was a need for a better way to do it. This need was addressed when XML was invented. XML is a file format that can be stored on an operating system and can be queried quickly. Such a file is generally stored at the web server and can be easily accessed by the user interface (web browser). This provides a quick response if any queries arising from the web browser need to be validated. XML can help in validating the user data quickly without accessing a traditional database that is located at the back end. NoSQL databases are based on this idea.

NoSQL databases are essentially a file system. They are not relational in nature: you cannot make a relationship between two NoSQL files. This is their limitation. In contrast, relational databases allow a relationship between two or more database tables, as explained in the later sections of this chapter.

NoSQL databases are essentially a file containing data. The data in these NoSQL databases is kept in a structured format. For example, we can store data in an XML file.
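For example, a small, hypothetical XML file holding account data could look like the following sketch (the element names are made up purely for illustration):

<accounts>
  <account>
    <account_number>100001</account_number>
    <account_name>John Smith</account_name>
  </account>
</accounts>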

The advantage of NoSQL databases is that they can be stored alongside other kinds of files. For example, an XML file can be kept on a web server alongside the HTML files. Reading from and writing to these XML files is fast compared to a relational database because they reside near the files that need their data. For web-based software products, XML files can thus be used by client-side scripts to perform user validations.

The disadvantage of NoSQL databases is that it is difficult to enforce integrity rules to keep data atomic and non-repeating. When a data file becomes too large, reading from or writing to it becomes very slow.


Relational databases
Relational databases keep their data in tables. Each table contains many columns. Each column represents a field and each row represents a record. Relational database technology is the most mature among all the database technologies available today. That is why most businesses and government agencies use relational databases to keep their data safely.

Relational databases are the most often used type of database for software products. Relational databases are powerful and can grow to contain extremely large amounts of data.

The main idea behind relational databases is that data can be kept in different tables. A relationship can then be set among all these tables so that it is possible to create data and keep this data in different tables. Whenever it is required to view or modify data, a search through all the connected tables is possible.

The ability to separate data in tables and yet be able to connect all this data through a relation between tables is something which makes relational databases powerful.

Database languages
To create a database, to search it, or to manipulate the data in it, you need to communicate with the database. This communication is done using database languages.

The database language is divided into two parts:
• The part of the language which deals with the creation and management of database entities is known as Data Definition Language (DDL).
• The part of the language which deals with data creation and management is known as Data Manipulation Language (DML).

Together, these languages are known as Structured Query Language (SQL). Both DDL and DML use SQL.

Database entities include database schemas, database tables, sequences, indexes, primary keys, foreign keys, stored procedures, triggers etc.

We will discuss relational databases, their entities, and how they work throughout this chapter.

Database entities: database schema
A database consists of many entities. There is a top-level entity known as the schema, under which all other database entities are defined. You need to use DDL to define a new database entity. You can also alter or delete a database entity using DDL.

Schema. The schema is the topmost-level entity in a database. A schema is a blueprint that describes how the data will be stored and what structures will be available for the storage of data inside the database. A schema is divided into database tables and other database entities such as stored procedures, sequences, primary keys, and foreign keys.

Tables. Tables are the second top-level entity inside a database after the schema. A table may contain many columns. Each column keeps the data at the most atomic level. For example, you can keep the data about the branch names of a bank in one column of a table. A table contains data related to a task or a function.

The schema is the top-level database entity. Generally, you define one schema for all the database entities of a software product. For example, you can have a schema “ABCbankDB” for a software product for ABC Bank. If you have one more software product, then you can create another database schema for that product and create all of its database entities inside this second schema. This way, all database entities belonging to the second software product will be separate from the database entities you created for ABC Bank under the “ABCbankDB” schema.

Using DDL, a schema can be created or dropped:

CREATE SCHEMA ABCbankDB;
DROP SCHEMA ABCbankDB;

To understand database tables better, let us look at an example. Figure 8.1 shows the table structure for a table named Account_transaction. It has columns named Transaction_number, Account_number, Account_name, Date, Debit/credit, Amount, and Balance.


Figure 8.1 Database table structure for the Account_transaction table.
Table 8.1 Database Table Account_transaction with Records.

Table 8.1 depicts the database table Account_transaction with records. After creating this table in the database, you need to populate this table. Data can be inserted, deleted, or updated in a table using SQL queries. The SQL language used for these operations is known as DML. Each row in Table 8.1 is known as a record in the database. Most of the operations of manipulating the data in a database are done at the record level.
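As an illustration, a new record could be added to the Account_transaction table with a DML statement such as the following (the values are invented, and the Date and Debit/credit columns are written here as Tr_Date and Debit_credit to keep the names legal in SQL):

INSERT INTO Account_transaction
    (Transaction_number, Account_number, Account_name, Tr_Date, Debit_credit, Amount, Balance)
VALUES
    (1, '100001', 'John Smith', '2016-01-15', 'Credit', 500.00, 4500.00);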

A database table is an entity which holds data. This data is kept in columns. Each table has many rows of data. Each row of data is known as a record. The above table depicts a database table with many rows or records in many columns.

There can be many tables in a database. These tables can be joined so that related data can be created across all the joined tables. When a search is performed to view or modify data, it is possible to find the exact data being looked for and, based on that piece of data, view or modify existing data.

Using DDL, a table can be created, altered, or dropped:

CREATE TABLE table_name (
    column_name1 datatype1,
    column_name2 datatype2,
    PRIMARY KEY (column_name1)
);

Using DML, insert, update, delete, or view operations can be done on a table:

INSERT INTO table_name (column_name1, column_name2, ...) VALUES (value1, value2, ...);
DELETE FROM table_name WHERE condition;
UPDATE table_name SET column_name = 'some value' WHERE condition;
SELECT column_name1, column_name2 FROM table_name WHERE condition;

Primary key
To join two tables, a key is created on a column in one of the tables. Another key is created on a column in the other table. Using these two keys, the tables are joined. The first key is known as the primary key and the second key is known as the foreign key.

The primary key ensures that whatever data is created in the column on which the primary key is set is unique and that there are no NULL values in this column. A primary key can also be set on a combination of more than one column.

To ensure that a good primary key is made for a table, some considerations include the following:
• The data in the primary key column should not be changeable.
• The data in the primary key column should be simple.
• Fact-less data is better for primary key columns; factual data has problems of repetition and change.
Auto-increment numbers are the best way to generate and populate data for primary key columns. In most databases, sequences are used to create auto-incrementing numbers.
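Following these considerations, a fact-less, auto-incrementing numeric column makes a good primary key. A minimal sketch in SQL Server syntax (the table follows Figure 8.3; the data types are assumptions):

CREATE TABLE Account (
    Account_ID     INT IDENTITY(1,1) PRIMARY KEY,  -- fact-less, auto-incrementing key
    Account_number VARCHAR(6)  NOT NULL,
    Account_name   VARCHAR(15) NOT NULL
);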

Foreign key
A foreign key is placed on a column in a table so that the data created in this column is dependent on the data created in the primary key column of the other table joined with this table.

If a piece of data in a foreign key column is not present in the corresponding primary key column of the other table, then the relationship between these two tables will no longer hold. So all data belonging to the foreign key column must already exist in the primary key column of the other table.
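A foreign key is declared when the child table is created. A hedged sketch, continuing the Account example from Figure 8.3 (column names other than Account_ID are assumptions):

CREATE TABLE Account_transaction (
    Transaction_number INT IDENTITY(1,1) PRIMARY KEY,
    Account_ID         INT NOT NULL,
    Amount             DECIMAL(12,2),
    -- every Account_ID here must already exist in the Account table
    FOREIGN KEY (Account_ID) REFERENCES Account (Account_ID)
);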


Figure 8.2 Table join between Account and Account_transaction tables.
Figure 8.3 A better join between Account and Account_transaction tables using primary and foreign keys.

We know that the power of relational databases comes from keeping data separate in different tables while all this data remains related through table joins. Good table joins are extremely important: they make it possible to build a large database in which all pieces of data stay atomic and pure, and are easily available when required for viewing or modification. Reading and writing these pieces of data will also be fast because the database engine has less work to do; the primary key consists of a simple number, so searches, inserts, and other database operations are easy and fast.

Bad table joins, on the other hand, lead to repeated, impure, and slow-performing data. One example of a bad table join is provided in Figure 8.2 above. The table join is made between the Account and Account_transaction tables using the columns Account_number and Account_name. The account number for any bank account is normally unique and is never NULL. But it can be changed, for example, when the number of bank customers has grown and the previous account numbers no longer fit in the number of digits used for creating an account number. In that case, the account numbers of all existing account holders may need more digits and thus need to be changed. The column Account_name has also been included in the primary key. This field will contain many duplicate entries, as customer names cannot be unique. Even though the combination with Account_number makes the primary key data unique, the data in this column will still be strings, and it is a bad idea to search and insert on strings, as these operations on string data types will always be slow. There is also a violation of table join rules, as all entries for Account_number and Account_name are duplicated in both tables.

Good table joins are the key to building reliable databases with fast operations. In Figure 8.3 above, there is a table join between the Account and Account_transaction tables. This table join is a good one for many reasons:
• The primary key selected for the join is the Account_ID column of the Account table. This field is auto-generated, so the data in this column will be simple and follow a fixed pattern.
• This field is fact-less. There is no possibility of any changes in its data.
• This is a numeric field. Searches and inserts in this column will be faster.
• This is a single column. Searches and inserts in this column will be faster.
• There is no duplication of data in the Account_transaction table. Only the Account_ID column will have data that is the same as that in the Account_ID column of the Account table, because Account_ID in the Account_transaction table is the foreign key. All other columns in this table hold data which is not in the Account table.

Indexes
Indexes are used to speed up searches for data in databases. Indexes in databases are like indexes in books: you can search for a keyword in a book by finding the page number in its index. When an index is put on a column in a table, a keyword list is generated which keeps information about the location of the corresponding records. This makes searches inside databases extremely fast.
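For example, an index could be created on a frequently searched column with a single DDL statement (the index name is hypothetical):

CREATE INDEX idx_account_name ON Account (Account_name);

-- searches like this one can now use the index instead of scanning the whole table
SELECT * FROM Account WHERE Account_name = 'John Smith';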

Indexes take their own memory space. Whenever some data is created, deleted, or modified in a table that has an index, the index needs to be rebuilt. If a table has many transactions, then for each transaction the index will be rebuilt. This rebuilding of the index consumes much memory, and transactions may become very slow. So indexes should not be created on tables which have many transactions going on.

Tables which are used for reporting are the places where indexes are most useful. Reporting becomes fast if indexes are put on these tables. These tables are also not transaction oriented, so their indexes do not consume much computer memory.

Sequences & Stored procedures
Sequences in databases are used to generate auto-increment numbers for populating columns with unique data. Since the numbers generated by sequences are fact-less, unique, and never NULL, they are used to populate the primary key columns of tables.
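A sequence is created once and then used to supply values during inserts. A sketch in SQL Server syntax (the sequence name is hypothetical, and it assumes Account_ID is populated from the sequence rather than declared as an identity column; Oracle uses sequence_name.NEXTVAL instead):

CREATE SEQUENCE Account_ID_seq START WITH 1 INCREMENT BY 1;

INSERT INTO Account (Account_ID, Account_number, Account_name)
VALUES (NEXT VALUE FOR Account_ID_seq, '100002', 'Jane Doe');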

Stored procedures are pieces of software programs which can be saved in the database. When these programs are run, they perform database operations and the required computation. Stored procedures run very fast as they reside near the database entities they operate on.

CREATE PROCEDURE Account_insert
    @Account_ID INT,
    @Account_number VARCHAR(6),
    @Account_name VARCHAR(15)
AS
BEGIN
    INSERT INTO Account (Account_ID, Account_number, Account_name)
    VALUES (@Account_ID, @Account_number, @Account_name)
END

In this example, a stored procedure named Account_insert is created with three parameters. In the database table Account, an insert operation is done using the values stored in these parameters.

Triggers
Triggers are very similar to stored procedures in that they are software programs saved in the database. Triggers are always associated with an event. When this event occurs, the trigger gets activated and is executed. For example, if a delete operation is performed on a table and a trigger is defined for this operation, then the trigger will fire and the programming code written in the trigger is executed.
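As a sketch (SQL Server syntax; the audit table and all names here are hypothetical), a trigger tied to the delete event on the Account table could record every deleted row:

CREATE TRIGGER Account_delete_audit
ON Account
AFTER DELETE
AS
BEGIN
    -- the "deleted" pseudo-table holds the rows removed by the DELETE statement
    INSERT INTO Account_audit (Account_ID, Account_number)
    SELECT Account_ID, Account_number FROM deleted
END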

Triggers are not used much because they are difficult to control, and poorly controlled triggers can cause data corruption.

Entity relationship diagrams
Once you are clear on which data need to be kept in the database, the next step is to decide on how to keep those data in the database.

Let us study how we can build an ER diagram. The relationship between different entities within a database is depicted using an ER diagram. An entity can be a place, person (or organization), event, or object that is part of the information system for which the database is being developed. The main elements of any ER diagram include entity, relationship, and attribute. In the online bookstore scenario, the title of the book, the author of the book, the name of the bookstore, and so on are all entities.

The relationships among the entities follow:
• A book is being sold at the store.
• An author has written a book.
• A book by this author is available at the store.

These statements show the relationships among the entities. A relationship describes the relation between two or more entities. For example, the statement “XYZ has written the book ABC” describes the relationship between the book ABC and the author XYZ. A relationship can be of a binary, recursive, or ternary type. A binary relationship can further be of a one-to-one, one-to-many, or many-to-many type. A recursive relationship is the relation of an entity with itself. A recursive relationship is also known as a unary relationship because there is only one entity involved.

Recursive relationships can be depicted in databases as a self-join. You need to create a join where the table joins itself. Figure 8.4 depicts the self-join between Student_ID and Professor_ID. This is because each professor is also a student in some course. In Table 8.2, you can see that a professor named Chris with Professor_ID value 106 is also a student enrolled in a geology course. If you want to know which professors are teaching which students, then you need to do a self-join on the table Student_information. The DML statement for the self-join will be as follows:

select e1.Student_name 'Professor', e2.Student_name 'Student'
from Student_information e1
join Student_information e2 on e1.Student_ID = e2.Professor_ID

You can see the result of the self-join in Table 8.3. We assumed that all professors are also students. In real life, this assumption is not true.

Figure 8.4 Self-join for a recursive relationship.
Table 8.2 Database Table Student_information with Data.
Table 8.3 Result of “Select” Statement after Self-Join on a Table.

In ER diagrams, a rectangle represents an entity, a diamond represents a relationship, and an oval represents an attribute. Figure 8.5 shows an ER diagram for author and book entities. These entities are related by the “writes” relationship. An author and a book can have a one-to-one or even a many-to-many relationship. A book can have many authors and an author can write many books. Both kinds of relationships can exist.

Figure 8.5 ER diagram for author and book entities.
Figure 8.6 ER diagram for bookstore and book entities.
Figure 8.7 Attributes of a book.

Figure 8.6 depicts an ER diagram for the bookstore and the book. Again, a bookstore and a book can have a many-to-many relationship. A bookstore can have many books and a book can be at many bookstores. Figure 8.7 depicts some attributes of a book. A book can have several attributes like ISBN, book_ID, title, price, and author. The book_ID and ISBN can be key attributes here.

You can build the entire schema for a database using ER diagrams. Here, you have seen the attributes of a book, the relationship between a book and an author, and the relationship of a book with a bookstore. You can build similar attributes for bookstore, author, and so on. You can also join the entities with each other when they have dependency through a primary and foreign key combination. This way, you can build the entire schema for the online bookstore.

Entity relationship diagrams (ER diagrams) are used to design databases. To design a database, the first thing to find out is the entities which will either generate or consume data. Once entities are identified, you need to find out the attributes of those entities. Next, you need to find out the relationships among those entities. These three things are depicted in any ER diagram. If you need to create a database for a bookstore, then the entities can be books, stores, and authors. Each book can have attributes like author name, title, price, subject, etc. Similarly, the other entities will have their own attributes.

The relationships among entities can be worked out by finding what kind of data each entity will generate or consume, and from whom. For example, a book can be written by one or many authors. At the same time, an author can write many books. This means that books and authors have a many-to-many relationship. We can have just one store, but this store can have many books. Here we can see that a one-to-many relationship exists.

Sometimes an entity has a recursive relationship. In this case, a subset of the entity has a relationship with the entity itself. For example, a manager is an employee of a company. A manager has many employees under him/her. So a relationship exists between an employee and a manager where the manager him/herself is also an employee.

When a table join is depicted for a recursive relationship, it is also known as a self-join. In the two figures above, a table self-join is shown as well as the data after the join.

Normalization
A relational database should contain atomic and unique data so that it becomes useful to a software product. ER diagrams help in finding out entities, their attributes, and the relationships among these entities. Using ER diagrams we can design database tables and their relationships. The next level of database design comes when we think about the quality of the data which will be created and saved inside the database.

When we create tables and columns in the database schema, we may end up with one or more columns containing duplicate data. Normalization is the process of eliminating this duplication of data. A novice may ask: why do you need many tables in the database, and why can you not put the entire data in one big table? Good questions. Let us examine them.

Suppose you want to create a database for a bank. You want to keep data about the customers (i.e., the customer address and customer name) and the customer accounts (transaction date, account ID, debit/credit, and balance). If we create just one table to keep all these data, then it looks similar to Table 8.1. We also have Table 8.4, which shows the data (records) for the table join depicted in Figure 8.3. Although these two tables show very similar data, they are actually very different. We will learn about the difference between these two in this section.

For good database design, duplication of data is not good. To make sure that the entry of data into the database happens at only one place, you may need to create many tables instead of just one or a few tables. This is done using normalization. We have already seen in ER diagrams that there can be a binary relationship between entities. This binary relationship between the entities can be taken further to design the tables for those entities. When creating the tables, we can use normalization to divide the tables and make sure that each table contains only one entry of any data. Normalization is done using three or more steps. After the first step, the first normal form of the data can be achieved. It can be further refined in the succeeding steps to get the second and third normal forms. Let us see more details on normalization.

First normal form. A table is in the first normal form if every field in every row of that table contains a single atomic value but not a list of values. This is also called the “atomic” property.

Table 8.5 presents a database table that is not in first normal form. This is because the data in the columns Tr_Date and Debit/Credit have more than one value. This happens because, on the same date, more than one debit/credit transaction happened and the columns have to accommodate two entries.

Second normal form. A table is in the second normal form if it is in the first normal form and every non-key column is dependent on the entire primary key. If some non-key column depends on only part of a composite primary key, the table is not in the second normal form.

In Table 8.1, the data are in the second normal form. You can see that the data in the columns Account_name and Account_number are repeated. However, no column has more than one value in any field; therefore, this table is already in the first normal form. In addition, a single column (Transaction_number) represents the primary key. Thus, this table is in the second normal form.

Third normal form. A table is in the third normal form if the non–key columns are not dependent on each other and the table is already in the second normal form. That is, the non–key columns of the table are dependent on the primary key alone.


In Table 8.1, since the Account_name and Account_number columns have repeating data, this implies that these two columns need to be moved to another table if you want this table to be in the third normal form. What we can do is create a new table to move these columns. We have done so in Figure 8.3. Figure 8.3 depicts a primary key–foreign key join between Account and Account_transaction tables. We have created a new column, Account_ID. This column is the primary key in table Account and is a foreign key in table Account_transaction. Now, the table Account containing the primary key will be in the third normal form. Similarly, the table Account_transaction will also be in the third normal form. Both tables will have no repeating data.

Table 8.4 Customer Information and Transaction Table.
Table 8.5 Database Table That Is Not in the First Normal Form.

Normalization is the process of finding out whether the data which will be created will be atomic and unique. Any table in a database can be in any of the three normal forms: first normal form, second normal form, or third normal form, depending on the kind of data present in the table.
• If created data is stored in a database table where there is no list of values in any of the columns, then all data in the table is atomic and the table is in the first normal form.
• If created data is in the first normal form and all data in the non-key columns is dependent on the data in the primary key column, then the table is in the second normal form.
• If created data is in the second normal form and no data in any column is dependent on data in any column other than the primary key column, then the table is in the third normal form.

The figure above depicts a database table (Table 8.5) with data that is not normalized. You can observe that in the Tr_Date and Debit/Credit columns, some fields contain more than one value.

Relationship among Tables
If a database contains unrelated tables, then it does not provide any additional help compared to a regular spreadsheet. However, if the tables are related, then you can create powerful queries out of those tables to retrieve useful data. While designing a database, you should first identify the relationships between the tables. Different relationships that may exist between the tables include the following:
• One-to-many
• Many-to-many
• One-to-one

One-to-Many. In a bookstore database, a store can have zero or more books. In a company database, one chairman has many employees under him or her, while all those employees have only one chairman.

Figure 8.8 depicts a one-to-many relationship between the tables. To create a one-to-many relationship, we need two tables: a Store table to store the information about the store with Store_ID as the primary key and a Books table to store the information about the books with book_ID as its primary key. Then, a one-to-many relationship between these two tables can be created by storing the primary key (Store_ID) of the Store table (the parent table or the “one” end of that one-to-many relation) in the Books table (the child table or the “many” end of that one-to-many relation).

The Store_ID column in the child table, Books, is known as the foreign key. A foreign key of a child table is related to the primary key of the parent table. The parent table is referenced by the child table using this primary key. For example, in Figure 8.8, Store_ID is a foreign key of the child table (Books) but Store_ID is a primary key of the parent table (Store). It is possible to have zero or more rows in the child table for each row in the parent table. In addition, there is one and only one row in the parent table for each row in the child table.
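A hedged sketch of this parent-child pair in SQL (Store_ID and book_ID follow Figure 8.8; the other columns and data types are assumptions):

CREATE TABLE Store (
    Store_ID INT PRIMARY KEY,
    Address  VARCHAR(100)
);

CREATE TABLE Books (
    book_ID  INT PRIMARY KEY,
    Title    VARCHAR(100),
    Store_ID INT NOT NULL,
    -- the "many" end stores the primary key of the "one" end as a foreign key
    FOREIGN KEY (Store_ID) REFERENCES Store (Store_ID)
);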


Figure 8.8 One-to-many relationship among tables.
Figure 8.9 Many-to-many relationship among tables.

Many-to-Many. In a many-to-many relationship, one or more rows in a table are related to zero or more rows in another table. If you consider a bookstore database, each book has one or more persons as the authors but each person may write zero or more books. Let us illustrate this with our bookstore database. We begin with two tables: Author and Books. The Author table contains information about the author (such as author_ID, name, address, and e-mail) with author_ID as its primary key. The Books table contains book_ID, ISBN, and so on, as shown in Figure 8.9.

To create a many-to-many relationship between two tables, a new table (i.e., a third table) is needed. This table is known as a junction table. We created a Book Details table as the junction table in Figure 8.9. By using the junction table, the many-to-many relationship can be implemented as two one-to-many relationships. In Figure 8.9, for each row in the Author table, there could be zero, one, or more rows in the Book Details table. For every row in the Book Details table, there is one and only one row in the Author table. For each row in the Books table, there could be zero, one, or more rows in the Book Details table. For every row in the Book Details table, there is one and only one row in the Books table.
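The junction table itself can be sketched in SQL as follows (names follow Figure 8.9, with Book Details written as Book_Details; the composite primary key is an assumption):

CREATE TABLE Book_Details (
    author_ID INT NOT NULL REFERENCES Author (author_ID),
    book_ID   INT NOT NULL REFERENCES Books (book_ID),
    PRIMARY KEY (author_ID, book_ID)  -- each author-book pair appears only once
);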

One-to-One. In a one-to-one relationship, each row in a table is linked to one (and only one) row in another table. The number of rows in these two tables should be exactly the same. One-to-one relationships are rare.

Two figures are presented above. The left figure depicts a one-to-many relationship between a store and the books it is selling. The store has attributes like store_ID, address, phone, email, etc. The book has attributes like title, ISBN, book_ID, store_ID, etc. The two tables are joined on the store_ID fields. The store table has store_ID as its primary key and the book table has store_ID as a foreign key.

The relationship between books and authors is many-to-many. To create a many-to-many relationship between two tables, an intermediate table is created. A one-to-many relationship is created between one table and this intermediate table, and another one-to-many relationship is created between this intermediate table and the other table. For this purpose, an intermediate table “book details” has been created. Now we can have a many-to-many relationship between the books and author tables by having a one-to-many relationship between books and the “book details” table and a one-to-many relationship between author and the “book details” table.

In some databases, there is a limit on the maximum number of columns a table can have, in which case you can split a table into two different tables and then maintain a one-to-one relationship between those split tables. If you have some sensitive data, then you can use one of those tables for storing those sensitive data and keep the remaining data in the other table. In Figure 8.10, you can see two tables, User Role and Sensitive Role. You can keep sensitive user roles in the Sensitive Role table and the rest of the user roles in the User Role table. Similarly, you can separate the rarely used columns of a table into another table to improve the performance of the database.

Figure 8.10 One-to-one relationship between two tables.

The most common relationship between two tables is one-to-many. A many-to-many relationship between two tables is also not uncommon. Sometimes we need a one-to-one relationship between two tables. For example, suppose we have defined roles for people in a company, and these roles are used to create role-specific software product features. Some roles are sensitive, and people in these roles perform some sensitive work. In this case, these sensitive roles and the data related to them should be kept separate and secure.

In this case, two tables should be created: one containing common roles and the other containing sensitive roles. A one-to-one relationship should be established between these two tables, as sketched below.
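One way to enforce a one-to-one relationship is to make the primary key of the second table also a foreign key referencing the first. A sketch based on Figure 8.10 (all table and column names here are assumptions):

CREATE TABLE User_role (
    Role_ID   INT PRIMARY KEY,
    Role_name VARCHAR(30)
);

CREATE TABLE Sensitive_role (
    -- the shared primary key allows at most one sensitive row per user role
    Role_ID        INT PRIMARY KEY REFERENCES User_role (Role_ID),
    Sensitive_data VARCHAR(100)
);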

Integrity Rules
Integrity rules pertain to the data contained inside the tables. You need to define some rules to enforce them when some data are written to the tables in a database. If there is an attempt to feed some wrong data into any of the tables, then the integrity rules will prevent this from happening. There are two integrity rules: the entity integrity rule and the referential integrity rule. Both rules are automatically enforced when you define the primary and foreign keys of the tables.

Entity Integrity Rule. There should not be a NULL value in the primary key. Otherwise, the primary key cannot uniquely identify the row in which the NULL value is. In the case of a composite key (that contains several columns), there should not be a NULL value in any of those columns.

Referential Integrity Rule. Each foreign key value should match a primary key value in the parent table.

We have seen how to create a join among database tables using primary and foreign keys. These keys are governed by integrity rules. The primary key is governed by the entity integrity rule and the foreign key is governed by the referential integrity rule. The entity integrity rule states that a primary key column should not contain any NULL values. If it does, then a relationship with a foreign key column cannot be maintained, and the row (record) in which a primary key column contains a NULL value cannot be uniquely identified. The referential integrity rule states that data in the foreign key column must match data in the primary key column. If any data in the foreign key column does not match data in the primary key column, then the relationship between these two tables is not valid.

Master and transaction data
If you are building a software product that is transaction oriented, then a lot of transaction data will be generated and added on a daily basis to the transaction tables of the database of that software product. If you are not careful about the database design, then thousands of unnecessary data entries will be stored in the database.

In transaction-oriented databases, tables can be divided into two categories. One category of tables will store the master data and the other will store the transaction data. Now, what are master data and transaction data? Transaction data are the data that are created as a result of some transactions. For example, sales data are an example of transaction data. Every day, hundreds or thousands of sales could be made by a shopping store. The data that are generated by these sales transactions are transaction data. In contrast, master data belong to the defining entities. For example, shopping store details such as its address, nature of business operations, and variety of goods available can be considered master data. Master data are almost permanent and do not change very often in contrast to transaction data. Master data are always created only once. For example, you create the store information only once. Thus, master data are very different from transaction data.

You can create master and transaction tables in such a way that they can be easily joined. For example, Tables 8.6 and 8.7 can be easily joined. The Cust_ID column can be set as the primary key in the master table (Table 8.6), and the Cust_ID column given in Table 8.7 (transaction table) can be the foreign key in that table.
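For example, a report combining both kinds of data can join the two tables on Cust_ID (the table names and the other column names here are assumptions):

SELECT m.Cust_name, t.Tr_Date, t.Amount
FROM Customer_master m
JOIN Account_transaction t ON t.Cust_ID = m.Cust_ID;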

Table 8.6 Customer Account Master Table with Primary Key.
Table 8.7 Account Transaction Table after Removing Master Data.


Once we understand master data and transaction data, we can create separate tables for these types of data and then join those tables (using a table join) so that the data between them are linked. You can observe how ER diagrams, type of data (master or transaction), and normalization rules can be implemented to make your data in the database atomic and avoid duplicate data anywhere.

One important aspect of database design is master and transaction data. Master data is about entities and their attributes. In contrast, transaction data is about transactions which happen on a daily basis.

Once master data is defined, it seldom changes. The quantity of data used for populating entities and their attributes is small, and it may increase only when new entities or attributes need to be incorporated into existing tables. Transaction data, on the other hand, keeps increasing on a daily basis, so its volume tends to be very large. For example, a software product for payroll will have transaction data about salaries and other payments given to employees over time. Each week or month (when salaries are paid), a large amount of data about these salaries and other payments is added to the database.

When designing database tables, master and transaction data must be taken care of. Otherwise, redundant data may enter the database. For example, if master data creeps into transaction tables, then all this master data will be duplicated, as it is also present in the master tables. All master data should always be contained inside master tables. Master data should be linked to transaction data through table joins.

Database Management Systems (DBMS)
Database management systems (DBMS) are specialized software products which rely heavily on databases. Some examples of DBMS-based systems include payroll management systems, electronic data processing (EDP) systems, financial accounting systems, manufacturing and warehouse accounting systems, etc.

These systems are characterized by user interfaces which are meant to perform transactions and save the data created during these transactions. A lot of transaction data is created by these systems. There are also reporting user interfaces which are used by managers to get various reports. Since pure data is important for these systems, their databases need to be very secure.

A DBMS can be divided into many layers. The most common are the user interface, the business logic layer, and the database layer. We have already learned how to create the user interface and business logic layers, and we have now learned about databases, specifically relational databases.

Object relational databases
In object-oriented programming, we create classes to create objects. For any software product that will use databases, it makes sense to create classes whose objects have exactly the same structure as the database schema. If this happens, the project team can save time and effort by making just one design instead of creating two separate designs, one for the business logic and another for the database schema. The other factor is that the same structure for the objects and the database schema will result in faster business logic development. It will also enhance the clarity of the source code and the maintainability of the system.

Figure 8.11 Class and table structures with identical data.

To understand this concept, let us create a class and a corresponding database entity. In Figure 8.11, you can see that we have a User role class and a User role table, both having the same structure. Now, if we create a class diagram and design the software product, then we can again use the same class diagram to create the database schema. This is because the classes and their relationships will be the same as the database tables and their relationships. Hence, you need to design once and use it for both purposes. This is the essence of object relational databases.


Object relational databases are a combination of object-oriented software design and relational databases. We need to create a design for the software product. We also need to create a design for the database. Data flows in the system from the user interface to the business logic layer to the database, and back. The design for the business logic is generally not very different from the design for the database; after all, the same data passes through the business logic to the database. Of course, some processing happens in the business logic, but it is possible to design the database from the design of the business logic. Thus, instead of creating two designs (one each for the business logic and the database), it is possible to create just one design. The other design can be derived automatically from the first one. Object relational databases work on this principle.

Some companies like Oracle Corporation and Microsoft Corporation have been doing research work to build object relational databases.


Chapter 9 Software Testing (Verification and Validation)
Software testing is an integral part of any software product development process. This chapter presents various testing strategies that are used in the software industry.

In Chapter 9, we will learn
• What software testing is
• What software verification and validation is
• What the levels of testing are
• What unit testing and integration testing are
• What system testing and user acceptance testing are
• What functional testing and nonfunctional testing are
• What alpha testing and beta testing are
• How testing can be done in maintenance and production
• What the steps involved in testing are

Software testing is important because you need to ensure that the software product you are building is defect-free. Software testing is the process of finding defects in a software product so that they can be fixed. Software testing originally involved testing the finished software product to uncover software defects. Later, many people realized that software requirement specifications and software designs also contain defects, and that if these defects can be removed at the point of occurrence, then they will not propagate to downstream processes. This results in better quality software products. At the same time, the effort required to fix software defects comes down, since fewer software defects will need to be fixed. The entire process of finding and fixing defects in all artifacts (requirement specifications, software design, source code, etc.) during the software development life cycle is known as verification and validation.

Although a 100% defect-free software product is desired, there are some practical limitations in achieving this goal. During testing, a software product is tested to make sure that it conforms to its functional and nonfunctional requirements. Functional requirements are related to the features of the software product. Nonfunctional aspects of a software product include safety, security, performance, and portability. Testing activities include test case writing, test case execution, defect logging, defect tracking, defect fixing verification, and defect closing. Additional testing activities include test bed preparation and test case automation. Testing activities need to be managed. Testing management activities include test planning and monitoring. Testing management also includes test result reporting.

Software Testing and Software Engineering Methodologies
In the Waterfall model, software testing has traditionally been relegated to carrying out only integration and system testing (explained later in this chapter). This is because software testing can be done only after the source code has been written and the developers have handed over the software product to the software testers. At this stage, software testers can only perform integration and system testing.

In agile and spiral software development methodologies, software products are developed using incremental integration so that new features are added to the existing software products continuously. This process poses a grave risk because the existing functionalities of the product may be broken in the new versions. This happens because when new features are added to the existing software product, many times, the existing source code will be changed to accommodate the new features. Because of this, during software testing, old features are also tested when the new software product is released or when new features are added to the existing product. This kind of testing is known as regression testing.

Verification & validation & levels of software testing
Different software testing types are needed to test the functionality of a piece of software (or any other software artifact). The fundamental question in software testing is how testing is performed and on what artifact. As discussed in Chapter 2, during software development and maintenance, we have several artifacts such as requirement specifications, software design, software source code, user documentation, and the software in the production environment. Testing of all these artifacts must be carried out to ensure consistent quality for the finished software products. Let us get acquainted with some major categories of tests. Figure 9.1 presents several types of tests, their levels, hierarchies, and some of their relationships to each other. It also shows the levels at which these tests are performed and the verification and validation activities.

Figure 9.1 Some software testing types and their relationships.

Many types of software testing (verification & validation) are done during the entire software development life cycle. The figure above depicts most of the types of testing done on software projects. Verification tests are also known as static tests, as no execution of the artifact is required during these tests. Verification tests include requirement specification reviews, design reviews, source code reviews, etc. Validation tests are also known as dynamic tests, as the artifact under test must be executed during these tests. Validation tests are further divided into white box and black box tests. White box tests are so named because the source code is used during testing. White box tests are also known as structural tests, as they test small parts (structures) of the software product. Unit tests and integration tests fall under these tests. Black box tests are so named because the source code is not used during testing; instead, the executable binary machine code is used. Black box tests are further divided into functional and nonfunctional tests. Nonfunctional tests are further divided into performance, security, usability, and similar tests. Black box tests are done both at the system level and at the deployment (user acceptance) level. Regression tests are done for both white box and black box tests.

Since software “testing” is not the appropriate term for all these dynamic and static testing activities, a new term was coined by Barry Boehm: verification and validation. All the tests that are carried out under static conditions are known as verification, and all the tests that are carried out under dynamic conditions are known as validation. Some people also refer to these terms using the questions “Are we building the product right?” (verification) and “Are we building the right product?” (validation). The complete verification and validation processes are depicted in Figure 9.2. In this figure, in each box, a software artifact (e.g., requirement specification) is depicted on the left-hand side and the testing (e.g.,


requirement specification review) done on that artifact is depicted on the right-hand side. This way, the testing activity is linked to the type of artifact on which that testing activity is performed.

Figure 9.2 Verification and validation activities in a typical project.

Verification is the part of software testing where testing is done on software project artifacts that have no executable part. Software requirement specifications, software designs, and source code (when only viewing the source code, not running it) are project artifacts which are tested using verification techniques. Requirement specifications are tested using requirement specification reviews, software designs are tested using design reviews, and source code (construction) is tested using source code reviews.

Validation is the part of software testing where testing is done on the software product by executing its source code. Business logic at the class (component) level is tested using unit testing, the integration of components is tested using integration testing, the software product is tested using system testing, and finally the deployment of the software product is tested by end users using user acceptance testing.

Introduction to Levels of Software Testing (Validation)
Software testing (validation) is done at many levels. These levels include unit, integration, system, and deployment. All of these levels come under validation.

Software components are tested for business logic correctness at the unit test level. These components are then tested for integration with the source code build using integration testing.

The completely built software product, after integration of all components, is again tested at the system level. At this level, functional tests are done to check whether the software product features are working as per the software requirement specifications. When the software product is ready, it is deployed at the client site. The end users do user acceptance testing to ensure that the software product works as per the requirement specifications.

As shown in Figure 9.3, software testing can be done at many levels. Different types of tests are conducted at different levels. Unit testing, integration testing, system testing, and user acceptance testing are all part of “levels of software testing” (validation).

Figure 9.3 Levels of software testing.

V Model
The V model is another way of looking at the levels of testing. It connects the artifacts that are developed during a software development project with the level of testing that is performed on those artifacts. The V model is used to capture the essence of system testing and user acceptance testing. The different levels and types of testing carried out during these testing activities are best depicted by the V model developed by Barry Boehm.

If we observe closely, we can see that the validation activities maintain a reverse chronological relation with the phases of the software development life cycle. In Figure 9.4, you can see two long arrows, one of which is pointing down and the other pointing up. These arrows indicate the chronological order in which the phases and validation activities are carried out in the software development life cycle. Even here, this chronological order is not followed strictly for unit and integration testing because, in this figure, unit testing occurs after integration testing. Still, the V model is important because it establishes a relationship between validation activities and phases in the software development life cycle.

Barry Boehm’s V model is all about linking the artifacts developed during the software development life cycle with levels of testing. Software requirement specifications are used to perform user acceptance testing. Software design specifications are used to perform system testing. A software unit (software component) is used to perform unit testing. A software module or subsystem is used to perform integration testing.

Figure 9.4 V model.

Verification Verification tests include requirement specification reviews, software design reviews, source code reviews, etc. Requirement specification reviews are done after these artifacts are completed. Software testers or business analysts review the requirement specifications for completeness, ambiguity, and errors. Any errors or incompleteness found are rectified. Software design reviews are done to check whether the software design is correct as per the requirement specifications. Software design reviews also include reviewing the software component structure. For example, if large classes with many methods are designed, the resulting source code will be difficult to understand. Similar structural problems can be found during design reviews, and the identified design problems then need to be rectified. Source code reviews are done to check for dead code, wrong implementation of business logic, etc. If any of these defects are found, they are rectified. Code walkthroughs are also used for source code checking. Here, the developer presents his/her source code in front of other project team members. The team members review the source code and find any defects present in the code. The software developer later rectifies those defects. The term code inspection (review) generally means that the review is carried out formally by people other than the developer. Code inspection appears to be similar to debugging, but it is different. Debugging involves fixing the source code for syntactic errors so that the source code compiles cleanly, whereas code inspection involves finding logic errors as well as source code that never runs (dead code).
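As an illustration, the short snippet below shows the kinds of defects a source code review can catch; the class, the business rule, and the constant are all invented for this example:

    public class DiscountCalculator {
        // Feature flag left over from an old promotion.
        private static final boolean PROMO_ACTIVE = false;

        public double applyDiscount(double price, boolean isMember) {
            if (isMember) {
                // Logic error: this charges members 10% of the price
                // instead of giving them a 10% discount (price * 0.90).
                return price * 0.10;
            }
            if (PROMO_ACTIVE) {
                // Dead code: PROMO_ACTIVE is always false, so this
                // branch can never run.
                return price * 0.95;
            }
            return price;
        }
    }

Both defects compile cleanly and would survive a syntax-focused debugging session, which is exactly why reviews look for them.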

Unit testing Unit testing is done after a class or component is constructed. Each class implements some business logic. The source code in the class is checked to verify whether the business logic is implemented properly. Unit testing is also extended when database access is involved. In such cases, the database access aspect as well as the business logic needs to be considered: testing the database connection and testing whether the database operation actually happens are also done in these tests. Although the name of the testing does not imply it, unit testing is all about testing the business logic of your source code (Figure 9.5).

Figure 9.5 Unit testing is all about testing the business logic.

Let us understand how unit testing works by considering an example. Suppose we have a business scenario where we need to add two integers. As mentioned earlier, the range within which this addition is done should be 1 to 100. If either of the integers is less than 1 or more than 100, then the system should signal an error (here, by returning false). Hence, we can have a class like the one below (the original pseudocode is rendered here as compilable Java):

    class Addition {
        public static boolean adding(int one, int two) {
            if (one >= 1 && one <= 100 && two >= 1 && two <= 100) {
                int x = one + two;
                System.out.println(x);
                return true;
            } else {
                return false;
            }
        }
    }

To test this class, we need to create a test class. The test class will have a test method. Generally, a test method will test one target method. However, if the target method is too big, then you can end up creating many test methods to test several parts of that target method. The above class implements a business logic which needs to be tested. The test code can be as follows:

    class TestAddition {
        public void testAdding() {
            boolean x = Addition.adding(-3, 5);
            // x is expected to be false, because -3 is outside the valid range.
        }
    }

The above test case is a negative test case, as the expected result should be a failure. Using positive and negative tests, you can find out whether your business logic works fine. The class for the addition operation can be tested using the above test class.

For positive and negative testing, this test class can be provided with various values, and these values are passed to the class under test through parameters.
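In practice, such positive and negative tests are usually written with a unit testing framework. The sketch below assumes JUnit 5 (the org.junit.jupiter API) and the Addition class as rendered above; the test names and values are illustrative:

    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertFalse;
    import static org.junit.jupiter.api.Assertions.assertTrue;

    class AdditionTest {

        @Test
        void addingWithinRangeReturnsTrue() {
            // Positive test: both operands are inside the 1..100 range.
            assertTrue(Addition.adding(3, 5));
        }

        @Test
        void addingBelowRangeReturnsFalse() {
            // Negative test: -3 is below the allowed range.
            assertFalse(Addition.adding(-3, 5));
        }

        @Test
        void addingAboveRangeReturnsFalse() {
            // Negative test: 101 is above the allowed range.
            assertFalse(Addition.adding(101, 5));
        }
    }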

The earlier example is about testing a class that is correctly implemented. What the design document prescribes is to perform the addition of two numbers if both are within the range and to signal an error otherwise.

Now, let us see a case where the software developers made a mistake in implementing what the software design prescribed. Suppose the design document stated that a calculation is required for a fixed deposit (known as a certificate of deposit in North America) in a bank account at an interest rate of 8.5%. A software developer wrote the following piece of source code (rendered as Java; the cast makes the faulty truncation explicit):

    public class InterestCompute {
        public static int computeInterest(float interest) {
            int principal = 293;
            int accruedAmount;
            // Defect: accruedAmount is declared as an integer, so the
            // fractional part of the computed amount is lost.
            accruedAmount = (int) (principal + principal * interest / 100);
            return accruedAmount;
        }
    }

In some cases, the business logic seems to be fine but its implementation is incorrect. The above example performs the right computation, but the final result is wrong: because of the faulty variable declaration, the final value is truncated. For an interest rate of 8.5%, the computation yields 293 + 293 × 8.5/100 = 317.905, but the integer variable truncates this to 317.
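A unit test would catch this defect. The sketch below again assumes JUnit 5 and the InterestCompute class as rendered above; the expected value follows from the 8.5% rate stated in the design:

    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertEquals;

    class InterestComputeTest {

        @Test
        void accruedAmountKeepsFractionalPart() {
            // Per the design, 293 at 8.5% should accrue to 317.905.
            // This assertion fails against the faulty implementation,
            // which returns the truncated value 317.
            assertEquals(317.905, InterestCompute.computeInterest(8.5f), 0.001);
        }
    }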

    import java.sql.*;

    class DataAccess {
        // A JDBC-style rendering of the original pseudocode; the table
        // and column names (customer, id, customer_name) are illustrative.
        public void deleteRow(Connection conn) throws SQLException {
            String y = "Alice";
            // Search the customer table for the record number (id) of the
            // customer named Alice, then delete that row.
            PreparedStatement find = conn.prepareStatement(
                "SELECT id FROM customer WHERE customer_name = ?");
            find.setString(1, y);
            ResultSet rs = find.executeQuery();
            if (rs.next()) {
                int x = rs.getInt("id");
                PreparedStatement del = conn.prepareStatement(
                    "DELETE FROM customer WHERE id = ?");
                del.setInt(1, x);
                del.executeUpdate();
            }
        }
    }

When a class involves database access, then apart from unit testing the business logic, the database connection needs to be checked, and you also need to find out whether the database operation actually took place. In the above example, you will also need to verify that the mentioned record has indeed been deleted from the database. Database operations work on the principle of first searching for a particular record and then operating on it. If a particular record is changed or deleted by a test, then the same test script will not work the next time: since the record was changed by the first execution, you may need to change the test script (or restore the test data) before running it again.

If the database connection and database operations do not need to be tested, then the database environment can be simulated using mocking tools.
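As an illustration, the sketch below assumes the Mockito library and a hypothetical CustomerRepository interface; it simulates the database so that only the business logic is exercised:

    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertTrue;
    import static org.mockito.Mockito.mock;
    import static org.mockito.Mockito.verify;
    import static org.mockito.Mockito.when;

    // Hypothetical data-access interface standing in for the database layer.
    interface CustomerRepository {
        Integer findIdByName(String name);
        void deleteById(int id);
    }

    // Business logic under test; it never touches a real database here.
    class CustomerService {
        private final CustomerRepository repo;
        CustomerService(CustomerRepository repo) { this.repo = repo; }

        boolean deleteCustomer(String name) {
            Integer id = repo.findIdByName(name);
            if (id == null) {
                return false;
            }
            repo.deleteById(id);
            return true;
        }
    }

    class CustomerServiceTest {
        @Test
        void deleteRemovesTheRecordThatWasFound() {
            // Simulate the database with a mock instead of a real connection.
            CustomerRepository repo = mock(CustomerRepository.class);
            when(repo.findIdByName("Alice")).thenReturn(42);

            assertTrue(new CustomerService(repo).deleteCustomer("Alice"));

            // Verify that the delete operation was requested for the found id.
            verify(repo).deleteById(42);
        }
    }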

Integration testing Integration testing is part of validation. It is done by the software developers or software testers. Integration testing is considered part of the software development life cycle; thus, it is not included in the software testing life cycle. It is also difficult to include this testing in the testing life cycle because of the difficulty in separating it from software development.

Integration testing is used to find out whether a newly created class, component, or module can be integrated correctly with the rest of the software product. Integration testing is thus a test of the interfaces of a class, component, or module. If the interface is correct, then the class, component, or module will integrate with the rest of the software product.

Unit testing ensures that the business logic is implemented correctly in each class. Unit-tested classes now need to be integrated with other classes that are present in the software build. Integration testing is done to ensure that your unit-tested classes integrate properly with other classes. Integration of classes happens through their interfaces. A class may have methods that are called from inside a method of another class. If the arguments supplied by the calling method do not match the parameters of the called method, then there will be integration issues between these two classes. Using integration testing, it is possible to determine whether the interfaces of two classes are compatible with each other. Thus, integration testing is all about testing the interfaces. Figure 9.6 depicts this reality.

Figure 9.6 Integration testing is all about testing the interfaces of the class.
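As a small illustration of an interface problem that compiles cleanly but fails integration testing, consider the invented classes below, where the caller and the called method disagree about the meaning of a parameter:

    class TaxCalculator {
        // Version 2 of this component changed the second parameter from a
        // whole percentage (18) to a fractional rate (0.18) but kept the
        // same method name and type.
        static double tax(double amount, double rate) {
            return amount * rate;
        }
    }

    class InvoiceService {
        // Written against version 1 of TaxCalculator: it still passes a
        // whole percentage. Each class passes its own unit tests, but
        // integration testing reveals totals that are 100 times too large.
        static double totalWithTax(double amount) {
            return amount + TaxCalculator.tax(amount, 18);
        }
    }

Because the parameter types still match, only a test that exercises the two classes together can expose the mismatch.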

There are three main methods of integration, namely, big bang, top-down, and bottom-up. As the name indicates, the big bang approach tries to integrate all the modules or components of the system in one step. This way, a lot of time can be saved. The main drawback with this approach is that, even after conducting the unit testing of all the components that need to be integrated, there could be some minor differences in the interfaces, and these differences may prevent the integration from happening. It is also difficult to find the real cause of integration failure at such an aggregate level. This leads to numerous problems in performing integration and, in turn, in conducting integration testing.

In top-down integration testing, the top-level modules of the system are integrated and tested first. Later, the internal parts of these modules are integrated with each other and subsequently tested. This kind of testing is employed when there are many branches and deep hierarchies inside those top modules. When integration testing is done, each branch and hierarchy is checked for integration errors. Lower-level parts that are not ready yet can be replaced with stubs, as in the sketch below.
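A minimal sketch of a stub: a stand-in for a lower-level module that is not yet built, so that the top-level module can be integrated and tested first. The module names here are invented:

    // Interface of a lower-level module that is not yet implemented.
    interface PaymentGateway {
        boolean charge(String account, double amount);
    }

    // Stub standing in for the unfinished module.
    class PaymentGatewayStub implements PaymentGateway {
        public boolean charge(String account, double amount) {
            return true; // always succeeds; just enough for the test
        }
    }

    // Top-level module under test, wired to the stub.
    class OrderProcessor {
        private final PaymentGateway gateway;
        OrderProcessor(PaymentGateway gateway) { this.gateway = gateway; }

        boolean placeOrder(String account, double amount) {
            // This call path can be integration-tested even though
            // the real gateway does not exist yet.
            return gateway.charge(account, amount);
        }
    }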


Bottom-up integration is the most commonly used integration methodology (Figure 9.7). Development starts by writing the source code for the lowest-level units. When they are ready, they are integrated with each other and tested for any integration errors. When this level of integration is complete, the higher-level integration is done. The bottom-up integration method is used when the continuous integration method (explained later in this chapter) is used for software development. Until the higher-level modules exist, a simple driver can exercise the finished units, as sketched below.
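In bottom-up integration, the mirror image of a stub is a driver: throwaway code that exercises the finished lower-level units before any higher-level module exists to call them. A minimal sketch with invented names:

    // Low-level unit that is finished first.
    class PriceCalculator {
        static double withDiscount(double price, double discount) {
            return price - price * discount;
        }
    }

    // Throwaway driver used until the real callers are built.
    class PriceCalculatorDriver {
        public static void main(String[] args) {
            double result = PriceCalculator.withDiscount(200.0, 0.10);
            System.out.println("expected 180.0, got " + result);
        }
    }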

Integration can be done in many ways depending on the frequency with which the software code is checked into the main build.

The frequency of integration of the source code and subsequent testing can vary widely (Figure 9.8). At one extreme, integration is done very infrequently. Sometimes, it could be months since the last integration of new (fresh) source code with the main build. In some projects, integration is done nightly.

Figure 9.7 Bottom-up and continuous integration testing.

Figure 9.8 Frequency of integration and integration testing.

Continuous integration of the source code build is used on agile projects. Here, whenever a unit of source code (a class or component) is written by a developer, it is integrated with the source code build. Integration testing is performed before a piece of source code is integrated with the build. If the integration testing fails, then the piece of source code is checked for interface problems that prevent it from integrating. The corrected piece of source code is tested again for integration. This continues until there are no integration failures. Once the integration test passes, the piece of source code is integrated with the source code build.

On Waterfall projects, separate modules of the software product are developed first and integrated with each other later. On agile projects, however, source code is integrated continuously. These two strategies imply that the frequency of integration varies considerably: on Waterfall projects, source code integration is infrequent, while on agile projects the frequency is very high. In fact, on agile projects, every time a new class is created, it is immediately integrated with the source code build.

This discussion implies that integration testing on agile projects is done frequently, which leads to a better software product with fewer integration defects. When large modules of a software product are integrated with each other all at once, a large number of integration problems arise, and it is difficult to pinpoint their causes. Fixing the integration problems is then also difficult and time consuming. This is the reason software projects are increasingly using continuous integration of source code.

System, regression and user acceptance testing System testing is part of validation. It is also known as black box testing because the source code is not used or required for this testing. This testing is done by running the executable code of the software product. It is done by the software testers.


System testing is actually a group of various testing types done together to check various aspects of the software product. For example, functional testing is carried out to check the business logic aspects of the software product. On the other hand, nonfunctional testing is done to check aspects such as performance, security, and usability of the software product.

Many types of testing are done on software projects during system and user acceptance testing.

Functional testing is done to check whether the software product features are working as per the software design and requirement specifications.

Nonfunctional testing is done to check whether the software product works within acceptable limits even when environmental conditions deteriorate. If a website receives tremendous web traffic but is still able to respond to user page requests within acceptable limits, then the performance of this website is considered good. Similarly, if a website receives malware attacks but is able to withstand these attacks and keep running, then the security of this website is considered good. Nonfunctional tests include performance, security, usability, localization, and other types of tests.

Regression tests are also performed during system testing for software products that are built incrementally. Regression tests check whether the existing features of a software product still work after an increment of new features has been integrated.

Figure 9.9 Software testing types during software development, deployment, and maintenance.

Software testing is done not only during software development but also during deployment and maintenance. During software development, all verification and validation tests are performed (requirement specification reviews, design reviews, code reviews, and unit, integration, and system tests, both functional and nonfunctional); regression tests are also performed. During deployment, only user acceptance testing is performed: all black box (functional and nonfunctional) tests are run, and the goal is to validate that the software product works as per the requirement specifications.

After deployment, the software product is used by its users. This is the production phase of the software product. During production, the software product is checked using performance and sanity tests. When the software product is taken up for maintenance, regression tests are also done.

Other important software tests There are some other tests that are also done on software products. If alpha and beta releases of a software product are made (as depicted in Figure 9.10), then alpha and beta testing are done for the software product. Smoke testing is done when a piece of source code is integrated with the source code build.

Figure 9.10 Alpha and beta release cycle of a software product.

Planned testing involves writing test cases and then executing them to find software defects. Planned testing has the limitation that only a limited amount of testing can be done within the time allowed for testing. Exploratory testing removes this barrier. Instead of spending too much time writing test cases, exploratory testing does not use test cases at all. A software tester simply runs various commands to execute business processes and finds out whether they work correctly. If the tester finds any incorrect behavior, then he/she files a defect with the details. Using this technique, more testing of the software product can be done within a short span of time.

A coverage-based testing strategy is used when it is known in advance which parts of the software product need to be tested. This strategy is adopted when it is known that it is not possible to test the entire software product in the time allotted for software testing on the project. A good strategy in such a situation is to prioritize the test cases and run only those that are high in priority.

Test case description A test case is a statement that defines what needs to be tested, how it will be tested, what the precondition for that testing is, and what the expected output from that testing is. A test case consists of a test case description, entry criteria, exit criteria, expected result, actual result, and test execution result (pass/fail). The test case description contains the steps that need to be performed during execution of the test case.

Table 9.1 shows a test case template. This kind of template can be used to create test cases. Entry criteria are the conditions that should hold before a test case can be executed. Exit criteria are the conditions that should hold after a test case is executed. The expected result is the expected outcome of the computation after execution of a test case; the actual result is the actual outcome. If the actual result is the same as the expected result, then the test case passes, and it is assumed that the tested business process worked as expected; thus there is no software defect. In contrast, if the actual result is not the same as the expected result, then it is assumed that the tested business process did not work as expected; thus there is a software defect, and it needs to be fixed.

Table 9.1 Test Case Template
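As an illustration, a filled-in test case based on the fields described above might look like the following (the login scenario and the identifier are hypothetical):

    Test case ID: TC-01
    Test case description: Log in with a valid user name and password.
    Entry criteria: The application is running and the login page is displayed.
    Exit criteria: A user session has been created.
    Expected result: The user is taken to the home page.
    Actual result: The user is taken to the home page.
    Test execution result: Pass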

Test case design Designing good test cases is important because all testing needs to be carried out within time limits. If the test designs are not good, testing may be ineffective. With even a limited number of good test cases, it is possible to test a software product effectively.

Using boundary value analysis, it is possible to test a business process with just three test cases, as the sketch below shows. In one test case, a value above the allowed range of valid input is used to check that the business logic correctly produces the negative result. Similarly, a value below the allowed range is used to check the negative result. Finally, a value within the allowed range is used to check that the business logic correctly produces the positive result.
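A minimal sketch, again assuming JUnit 5 and the Addition class (valid range 1 to 100) from the unit testing example earlier in this chapter, using values just outside and just inside the boundaries:

    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertFalse;
    import static org.junit.jupiter.api.Assertions.assertTrue;

    class AdditionBoundaryTest {

        @Test
        void valueJustBelowRangeIsRejected() {
            assertFalse(Addition.adding(0, 50));   // 0 is just below the lower bound 1
        }

        @Test
        void valueJustAboveRangeIsRejected() {
            assertFalse(Addition.adding(101, 50)); // 101 is just above the upper bound 100
        }

        @Test
        void valuesOnTheBoundariesAreAccepted() {
            assertTrue(Addition.adding(1, 100));   // both bounds are valid inputs
        }
    }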

Using decision trees, it is possible to test complex business logic where a matrix of values is needed to exercise a business process, as sketched below.
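One way to drive such a matrix of values through a test is a parameterized test. The sketch below assumes JUnit 5 with its junit-jupiter-params module; the decision matrix (membership and order size determining a discount) is invented for illustration:

    import org.junit.jupiter.params.ParameterizedTest;
    import org.junit.jupiter.params.provider.CsvSource;
    import static org.junit.jupiter.api.Assertions.assertEquals;

    class DiscountRuleTest {

        // Invented business rule: members with large orders get 15%,
        // other members 10%, non-members with large orders 5%, else 0%.
        static int discount(boolean member, boolean largeOrder) {
            if (member) {
                return largeOrder ? 15 : 10;
            }
            return largeOrder ? 5 : 0;
        }

        // Each row of the matrix becomes one executed test case.
        @ParameterizedTest
        @CsvSource({
            "true,  true,  15",
            "true,  false, 10",
            "false, true,  5",
            "false, false, 0"
        })
        void discountMatchesDecisionMatrix(boolean member, boolean largeOrder, int expected) {
            assertEquals(expected, discount(member, largeOrder));
        }
    }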

Test preparation To test a software system, a testing environment must be created. If testing is performed in an unsuitable or unstable environment, defects found during testing will be difficult to replicate later. In that situation, if a software developer asks to confirm that a reported defect is actually a defect, it will be hard to reproduce it. Therefore, a good testing environment is a must.

A good test environment is important for finding software defects. If a good test environment is not provided, then it can lead to difficulty in finding defects.

Once a software release needs to be tested, a stable and isolated test environment is created on a computer, and an exact replica of the software product is installed in this environment. Test data are also created. Apart from the test environment, supporting facilities such as a defect tracking system and a configuration management system also need to be prepared for a testing cycle.

Test Life Cycle A test life cycle (Figure 9.11) is used for system testing. For unit and integration testing, this kind of formal test life cycle is not maintained because unit and integration testing are considered more a part of the development life cycle.

The test life cycle consists of several stages, which are shown in Figure 9.11. As already outlined, unit- and integration-level testing are not considered part of the test life cycle; only system-level tests and user acceptance–level tests are. The test cycle is the process that is followed from test planning to test case writing to test execution to defect logging.

Figure 9.11 Test life cycle.
Figure 9.12 Defect life cycle.

Test planning involves finding out which software features need to be tested, how the testing will be done, and for how long the testing activities will be carried out.

Test case design involves deciding which testing techniques (boundary value analysis, decision trees, etc.) will be used and how much test coverage will be achieved during the test cycle.

Test cases are written by testers after the test design is decided. How many test cases need to be written depends on the test coverage, the testing techniques used, etc.

Test case execution starts when test preparation is done and the test cases have been written. The failed test cases point to software defects. These software defects are logged so that they can be fixed.

The defect life cycle (Figure 9.12) is part of the “Test case execution” stage of the test life cycle. Once a test case enters the test life cycle shown in Figure 9.11 and reaches the test case execution stage, if the test case passes, then it is done with the test life cycle. However, if it fails at the test case execution stage, then the defect life cycle starts: defect detection, defect reporting, and the other stages shown in Figure 9.12 are executed. Once all the stages shown in Figure 9.12 are complete, the “Defect logging” stage of the test life cycle (the last stage in Figure 9.11) is executed.

When test case execution starts, some of the test cases will pass while others fail. The failed test cases point to software defects, which are logged in a defect tracking system. The software developer who owns the software product feature containing the defect is notified. The developer fixes the defect and notifies the software tester who logged it. The tester then verifies whether the defect has indeed been fixed; if so, the tester closes the defect in the defect tracking system. This is the complete life cycle of a software defect and its fixing.
