August Rodil

Spring 2021 Database Education Officer Deliverable

Please keep all questions and information on your deliverable when submitting it. Use any method to explain how to solve them:

1. Explain what the 4 major components of the Database course are, why they’re important, and how they relate to each other.

Within MIS, there are 3 courses that teach students the basics of databases and their structure: Systems Analysis and Design, Database I, and Database II. These classes cover 4 major components: Relational Databases, Structured Query Language (SQL), Cleaning Data, and Users.

• Relational Database – Strictly defined, a relational database is “structured to recognize relations among stored items of information.” In practice, the data is organized into tables of columns (attributes) and rows (tuples), and the tables stay connected to one another through the attributes they share. Laying the data out this way is also a great way to visually reaffirm the relationships between tables and see the database in its fullest form. Personally, there were many times when my professor would lecture on the data in a table and I would understand the concepts he was talking about but could not wrap my head around some of the logic behind them. Having a visual aid such as a relational database really helped me understand some of the core mechanics and laid the foundation for learning more about databases.

• Structured Query Language (SQL) – There are many computer languages for a database application development project, but MIS courses rely heavily on SQL as the starting language for aspiring data scientists. SQL first appeared in the mid-1970s and has maintained its popularity and usability to this day. First, it allows a single command to access multiple records. Second, it does not require the user to specify how to reach those records, with or without an index. SQL was also one of the first commercial languages to use Edgar F. Codd’s relational model, and it was adopted as both a domestic (ANSI) and an international (ISO) standard, greatly increasing its usage among institutions. With so many decades of use, SQL is a great starting language for those wanting to understand the mechanics of data manipulation.
While many students may complain that SQL is outdated compared to more modern methods, this language is a good baseline from which to branch out into other database concepts such as relational databases, data normalization, and so on.

• Cleaning Data – Data is only as useful as it is readable. In the courses at the University of Houston, there are many ways to go about organizing data, but the method we mainly used is called data normalization. Organizing data may seem easy, but so many factors go into it that it can be hard for one person to keep track of everything. One concept that comes up regularly in the database courses is the data anomaly. An anomaly occurs when a connection in a database is broken, when the stored data no longer makes logical sense to the DBMS, or when human error enters through poorly planned data input. Data normalization structures data into normal forms, which removes redundancies, simplifies the process of writing queries, and helps validate the data in a database. When we think of the bigger picture, any chance a specialist gets to validate their information and ensure its integrity is crucial, especially in a business working with Big Data.

• Users – There is a term that is thrown around a lot nowadays called “Big Data.” Our population has gotten to the point where all our information combined is so massive that it is impossible to practically sort and use as it stands. But somewhere in there, that data is useful to someone. After collecting this “Big Data,” it can be segmented, refined, and repurposed into information for the users we wish to reach. Users are a very important aspect of the database courses at the University of Houston, as they are the client we are presenting our end product to. Our courses regularly present tables and databases with mock business information in them, where users’ passwords, names, job titles, etc. are located. It is up to us to use the aforementioned database concepts to get that data into a usable state for the users of our program. The more comprehensible our data is to users, the quicker they perform their duties, which results in a more effective business unit with greater performance results.

These four concepts showcase different segments of the database management system: learning the languages used in manipulating tables of data, visualizing and understanding the logic of data movement, and finally turning that data into information for an end user. These are the main components used in our courses at the University of Houston.
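The point about anomalies and normalization can be sketched with a small runnable example. This is only an illustration, not course material: it uses Python's built-in sqlite3 module, and the table names (emp_flat, store, employee) and the rows in them are hypothetical. It shows an update anomaly in a table that repeats the store's city on every employee row, and how splitting the data into two tables removes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Unnormalized (hypothetical) table: the store's city is repeated per employee.
cur.execute(
    "CREATE TABLE emp_flat (emp_id INTEGER, fname TEXT, store_id INTEGER, store_city TEXT)"
)
cur.executemany(
    "INSERT INTO emp_flat VALUES (?, ?, ?, ?)",
    [(1, "Ana", 10, "Houston"), (2, "Ben", 10, "Houston")],
)

# Update anomaly: changing the city on one row leaves the other row stale.
cur.execute("UPDATE emp_flat SET store_city = 'Austin' WHERE emp_id = 1")
cities_flat = cur.execute(
    "SELECT DISTINCT store_city FROM emp_flat WHERE store_id = 10 ORDER BY store_city"
).fetchall()
print(cities_flat)  # the same store now reports two different cities

# Normalized: each store fact is stored once and employees reference it.
cur.execute("CREATE TABLE store (store_id INTEGER PRIMARY KEY, city TEXT)")
cur.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, fname TEXT, "
    "store_id INTEGER REFERENCES store(store_id))"
)
cur.execute("INSERT INTO store VALUES (10, 'Houston')")
cur.executemany("INSERT INTO employee VALUES (?, ?, ?)", [(1, "Ana", 10), (2, "Ben", 10)])

# One update is now enough, and every employee sees the same city.
cur.execute("UPDATE store SET city = 'Austin' WHERE store_id = 10")
cities_norm = cur.execute(
    "SELECT DISTINCT s.city FROM employee e JOIN store s ON e.store_id = s.store_id"
).fetchall()
print(cities_norm)
conn.close()
```

In the flat table the city has to be corrected on every row that mentions the store, while in the normalized version the single row in store is the only place the city lives.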

2. Explain each concept with as much detail as you can. Please also compare and contrast them to other concepts.

a. Unique identifiers – As the name suggests, this is a unique name or value that allows a DB specialist to single out and identify an entity from others without confusion. Using unique alphanumeric names for each column and table within a database ensures you are accessing and manipulating the proper data. It would be horrible if a specialist wanted to retrieve an old password for an ID of 111 only to find out there are multiple columns with the name ID. Just like key attributes, this is another measure to keep everything standardized and keep multiple specialists on the same page without having to meet up every time before a change is made to the database.

b. Key attributes – Key attributes are unique characteristics within a table. There are many keys in a DBMS; to name some: super key, candidate key, primary key, secondary key, and more. Super, candidate, and primary keys uniquely identify each record in a table, while a secondary key is used to look records up and does not have to be unique. All four of these concepts help ensure a standard when manipulating data.

c. Min Max Notation – Min-max notation denotes the minimum and the maximum number of times an entity can participate in a relationship. This term is usually used in entity-relationship modeling and helps set boundaries for how many times an entity enters a relationship in a specified database. This concept is very similar to cardinality ratios, the main difference being that min-max notation can set a specific number for the times an entity is in a relationship. Cardinality ratios use only one or many to define these relationships.

d. Cardinality ratios – A cardinality ratio defines the “maximum number of relationship instances in which an entity can be a part of.” There are several cardinalities, such as many-to-many, many-to-one, and one-to-one, each with its own symbols in an entity-relationship diagram.
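The way keys and participation limits turn into table rules can be sketched as follows. This is a minimal illustration under stated assumptions: it uses Python's built-in sqlite3 module, and the store and employee tables, their columns, and the rows are hypothetical. A PRIMARY KEY gives each row a unique identifier, UNIQUE marks a candidate key, and a NOT NULL foreign key expresses a (1,1) participation for the employee side (every employee belongs to exactly one store) and a many-to-one cardinality ratio between employees and stores.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE store (store_id INTEGER PRIMARY KEY, city TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,          -- primary key: unique identifier
        email    TEXT UNIQUE,                  -- candidate key: also unique
        store_id INTEGER NOT NULL              -- min 1, max 1 store per employee
                 REFERENCES store(store_id)    -- many employees to one store
    )
    """
)
conn.execute("INSERT INTO store VALUES (10, 'Houston')")
conn.execute("INSERT INTO employee VALUES (111, 'a@example.com', 10)")


def try_insert(row):
    """Attempt an insert and report whether a constraint rejected it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return type(exc).__name__


results = {
    "duplicate_pk": try_insert((111, "b@example.com", 10)),   # emp_id 111 already exists
    "unknown_store": try_insert((112, "c@example.com", 99)),  # store 99 does not exist
    "no_store": try_insert((113, "d@example.com", None)),     # violates the minimum of 1
    "valid": try_insert((114, "e@example.com", 10)),
}
print(results)
```

Only the last row satisfies every constraint; the other three are rejected by the primary key, the foreign key, and the NOT NULL rule respectively.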

3. How would you resolve each of the following errors?

a. Ambiguously defined column – This error occurs when a statement refers to a column name that exists in two or more of the tables being combined, so the database cannot tell which table’s column is meant. The best way to address this issue is to qualify each such column with its table name or a table alias (for example, Employees.empid instead of empid) and retry the code.

b. Disconnected from the rest of the join graph – This warning points to a logic oversight in the query rather than in the compiler. It appears when a table listed in the FROM clause is not linked to the other tables by any join condition, so its rows would be combined with every row of the other tables. If a DB specialist joins the Emp_Store table to the Employees table, the query needs a condition such as Emp_Store.empid=Employees.empid; writing the two sides of the equality in the opposite order makes no difference. So essentially, every table in the query should be connected to at least one other table through a join condition.
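The ambiguous-column case can be reproduced in a small sketch. It is only an illustration: it runs on Python's built-in sqlite3 module rather than the course's database, so the exact error wording differs from other systems, and the Employees and Emp_Store columns and rows shown here are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (empid INTEGER PRIMARY KEY, fname TEXT)")
conn.execute("CREATE TABLE Emp_Store (empid INTEGER, storeid INTEGER)")
conn.execute("INSERT INTO Employees VALUES (1, 'Ana')")
conn.execute("INSERT INTO Emp_Store VALUES (1, 10)")

# empid exists in both tables, so an unqualified reference is ambiguous.
try:
    conn.execute(
        "SELECT empid FROM Employees JOIN Emp_Store "
        "ON Employees.empid = Emp_Store.empid"
    )
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)
print(error_message)

# Fix: qualify the column with a table alias so the source table is explicit.
rows = conn.execute(
    "SELECT e.empid, es.storeid FROM Employees e "
    "JOIN Emp_Store es ON e.empid = es.empid"
).fetchall()
print(rows)
```

Qualifying every shared column name this way is a habit worth keeping even when a query happens to run without it, because adding another table later can make a previously unique name ambiguous.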

4. How do you set aliases in SQL? Give an example.

• Aliases are temporary names given to tables or columns in tables. Sometimes the names of columns or tables are too long or unreadable, so aliases help a specialist make the names more understandable while working on the database.

• Example: SELECT fname AS employee_first_name FROM Employees;
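A short sketch of both kinds of alias, run through Python's built-in sqlite3 module. The Employees table definition and its single row are hypothetical. The column alias changes the name of the result column, while the table alias (e) only shortens references inside the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (empid INTEGER PRIMARY KEY, fname TEXT)")
conn.execute("INSERT INTO Employees VALUES (1, 'Ana')")

# Column alias (AS employee_first_name) and table alias (AS e) in one query.
cur = conn.execute("SELECT e.fname AS employee_first_name FROM Employees AS e")
column_names = [col[0] for col in cur.description]  # names of the result columns
rows = cur.fetchall()
print(column_names)
print(rows)
```

The alias exists only for the duration of the query; the column in the Employees table is still called fname afterwards.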

5. How do you compare/contrast JOIN, INNER JOIN, and OUTER JOIN?

• JOIN – A join statement “combines columns from one or more tables in a relational database by using values common to each table.” In most SQL dialects, JOIN written on its own behaves as an INNER JOIN.

• INNER JOIN – An inner join clause selects the records from Table A and Table B that satisfy the specified join condition. Let’s say you want to combine the Emp_Store and Employees tables. You would combine them on the EmpID column since that is a field both tables share. This inner join leaves out the rows that do not have a match in both tables.

• OUTER JOIN – An outer join returns the matched rows plus the unmatched rows from the left table (LEFT OUTER JOIN), the right table (RIGHT OUTER JOIN), or both tables (FULL OUTER JOIN), with NULL filling the columns that have no match. A full outer join therefore results in a full combination of both tables regardless of whether a row has a match in the other table. This is the main difference between an inner and an outer join: an inner join keeps only the rows that match on the join condition, while an outer join also keeps the rows that do not.
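The difference between the join types is easiest to see on a few rows. The sketch below uses Python's built-in sqlite3 module; the Employees and Emp_Store columns and the sample rows are hypothetical, chosen so that one employee (Cy) has no store assignment. Only a left outer join is shown, since it is available in every SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (empid INTEGER PRIMARY KEY, fname TEXT)")
conn.execute("CREATE TABLE Emp_Store (empid INTEGER, storeid INTEGER)")
conn.executemany("INSERT INTO Employees VALUES (?, ?)", [(1, "Ana"), (2, "Ben"), (3, "Cy")])
conn.executemany("INSERT INTO Emp_Store VALUES (?, ?)", [(1, 10), (2, 20)])

# INNER JOIN: only employees that have a matching Emp_Store row.
inner_rows = conn.execute(
    "SELECT e.fname, es.storeid FROM Employees e "
    "INNER JOIN Emp_Store es ON e.empid = es.empid ORDER BY e.empid"
).fetchall()

# Plain JOIN: treated as an inner join by SQLite.
plain_rows = conn.execute(
    "SELECT e.fname, es.storeid FROM Employees e "
    "JOIN Emp_Store es ON e.empid = es.empid ORDER BY e.empid"
).fetchall()

# LEFT OUTER JOIN: every employee, with NULL (None) where no store matches.
left_rows = conn.execute(
    "SELECT e.fname, es.storeid FROM Employees e "
    "LEFT OUTER JOIN Emp_Store es ON e.empid = es.empid ORDER BY e.empid"
).fetchall()

print(inner_rows)
print(plain_rows)
print(left_rows)
```

Cy is missing from the inner join results and shows up in the left outer join with None in place of a store, which is the unmatched-row behavior described above.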


Here is a visual to help better understand this concept as this can be quite confusing when described only through words.

Using the Access screenshots given, please answer question 6 below.

6. For each store, display the store’s address, city, state, zip, the manager’s first and last name, and manager’s phone number

SELECT S.address, S.city, S.state, S.zip, E.fname, E.lname, E.phone
FROM STORE S
INNER JOIN EMPLOYEE E ON S.manager = E.EmployeeId;
