Chapter 6

6.1 Operational Data Modeling and Normalization

• One-to-One Relationships
• One-to-Many Relationships
• Many-to-Many Relationships

Data Modeling and Normalization

• First Normal Form
• Second Normal Form
• Third Normal Form

[Figure: entity-relationship diagram. Entities: Vehicle-Type, Customer. Attributes: Make, Customer ID, Type ID, Year, Income Range.]

Figure 6.1 A simple entity-relationship diagram

Table 6.1a • Relational Table for Vehicle-Type

Type ID   Make        Year
4371      Chevrolet   1995
6940      Cadillac    2000
4595      Chevrolet   2001
2390      Cadillac    1997

Table 6.1b • Relational Table for Customer

Customer ID   Income Range ($)   Type ID
0001          70–90K             2390
0002          30–50K             4371
0003          70–90K             6940
0004          30–50K             4595
0005          70–90K             2390

Table 6.2 • Join of Tables 6.1a and 6.1b

Customer ID   Income Range ($)   Type ID   Make        Year
0001          70–90K             2390      Cadillac    1997
0002          30–50K             4371      Chevrolet   1995
0003          70–90K             6940      Cadillac    2000
0004          30–50K             4595      Chevrolet   2001
0005          70–90K             2390      Cadillac    1997

6.2 Data Warehouse Design

The Data Warehouse

“A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision making process (W.H. Inmon).”

Granularity

Granularity is a term used to describe the level of detail of stored information.

[Figure: process model. Components shown: External Data, Operational Database(s), ETL Routine (Extract/Transform/Load), Data Warehouse, Extract/Summarize Data, Dependent Data Mart, Independent Data Mart, Report.]

Figure 6.2 A data warehouse process model

Entering Data into the Warehouse
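As a minimal sketch of what an ETL routine might do, the Python example below extracts the rows of Tables 6.1a and 6.1b from an in-memory stand-in for the operational database, transforms them by joining on Type ID (which reproduces Table 6.2) and splitting each income label into numeric bounds, and loads the result into a separate warehouse table. The use of SQLite, the table and column names (vehicle_type, customer, customer_vehicle), and the helpers build_operational_db, parse_income_range, and run_etl are assumptions made for illustration; none of them come from the slides. Income labels are written with a plain hyphen to keep the parsing simple.

```python
import sqlite3

# Rows copied from Tables 6.1a and 6.1b (income labels use a plain hyphen here).
VEHICLE_TYPES = [(4371, "Chevrolet", 1995), (6940, "Cadillac", 2000),
                 (4595, "Chevrolet", 2001), (2390, "Cadillac", 1997)]
CUSTOMERS = [("0001", "70-90K", 2390), ("0002", "30-50K", 4371),
             ("0003", "70-90K", 6940), ("0004", "30-50K", 4595),
             ("0005", "70-90K", 2390)]


def build_operational_db():
    """Create an in-memory stand-in for the operational database (hypothetical names)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE vehicle_type (type_id INTEGER PRIMARY KEY, make TEXT, year INTEGER)")
    db.execute("CREATE TABLE customer (customer_id TEXT PRIMARY KEY, income_range TEXT, type_id INTEGER)")
    db.executemany("INSERT INTO vehicle_type VALUES (?, ?, ?)", VEHICLE_TYPES)
    db.executemany("INSERT INTO customer VALUES (?, ?, ?)", CUSTOMERS)
    return db


def parse_income_range(text):
    """Turn a label such as '70-90K' into numeric bounds (70000, 90000)."""
    low, high = text.rstrip("K").split("-")
    return int(low) * 1000, int(high) * 1000


def run_etl(operational, warehouse):
    """Extract, transform, and load; returns the number of rows loaded."""
    # Extract: join the two operational tables on type_id (the join of Table 6.2).
    rows = operational.execute(
        "SELECT c.customer_id, c.income_range, c.type_id, v.make, v.year "
        "FROM customer AS c JOIN vehicle_type AS v ON c.type_id = v.type_id "
        "ORDER BY c.customer_id").fetchall()
    # Transform: replace the income label with numeric lower and upper bounds.
    transformed = []
    for customer_id, income, type_id, make, year in rows:
        low, high = parse_income_range(income)
        transformed.append((customer_id, low, high, type_id, make, year))
    # Load: write the transformed rows into the warehouse table.
    warehouse.execute(
        "CREATE TABLE customer_vehicle (customer_id TEXT, income_low INTEGER, "
        "income_high INTEGER, type_id INTEGER, make TEXT, year INTEGER)")
    warehouse.executemany("INSERT INTO customer_vehicle VALUES (?, ?, ?, ?, ?, ?)", transformed)
    return len(transformed)


operational_db = build_operational_db()
warehouse_db = sqlite3.connect(":memory:")
loaded = run_etl(operational_db, warehouse_db)
print(loaded, "rows loaded")
for row in warehouse_db.execute("SELECT * FROM customer_vehicle ORDER BY customer_id"):
    print(row)
```

Keeping the warehouse in its own connection mirrors the separation between operational data and the warehouse in Figure 6.2; a real ETL routine would also handle missing keys and malformed values, which this sketch does not.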

• Independent Data Mart
• ETL (Extract, Transform, Load Routine)

Structuring the Data Warehouse: Two Methods

• Structure the warehouse model using the star schema
• Structure the warehouse model as a multidimensional array

The Star Schema

• Dimension Tables
• Slowly Changing Dimensions

Purchase Dimension
Purchase Key   Category
1              Supermarket
2              Travel & Entertainment
3              Auto & Vehicle
4              Retail
5              Restaurant
6              Miscellaneous

Time Dimension
Time Key   Month   Day   Quarter   Year
10         Jan     5     1         2002
......

Fact Table
Cardholder Key   Purchase Key   Location Key   Time Key   Amount
1                2              1              10         14.50
15               4              5              11         8.25
1                2              3              10         22.40
......

Cardholder Dimension
Cardholder Key   Name         Gender   Income Range
1                John Doe     Male     50 - 70,000
2                Sara Smith   Female   70 - 90,000
......

Location Dimension
Location Key   Street          City         State   Region
10             425 Church St   Charleston   SC      3

Figure 6.3 A star schema for credit card purchases

The Multidimensionality of the Star Schema

[Figure: axes Purchase Key, Time Key, and Location Key for cardholder Ci, with the point A(Ci,1,2,10) marked.]

Figure 6.4 Dimensions of the fact table shown in Figure 6.3

Additional Relational Schemas

• Constellation Schema

Time Dimension
Time Key   Month   Day   Quarter   Year
5          Dec     31    4         2001
8          Jan     3     1         2002
10         Jan     5     1         2002
......

Promotion Dimension
Promotion Key   Description   Cost
1               watch promo   15.25

Purchase Dimension
Purchase Key   Category
1              Supermarket
2              Travel & Entertainment
3              Auto & Vehicle
4              Retail
5              Restaurant
6              Miscellaneous

Promotion Fact Table
Cardholder Key   Promotion Key   Time Key   Response
1                1               5          Yes
2                1               5          No
......

Purchase Fact Table
Cardholder Key   Purchase Key   Location Key   Time Key   Amount
1                2              1              10         14.50
15               4              5              11         8.25
1                2              3              10         22.40
......

Cardholder Dimension
Cardholder Key   Name         Gender   Income Range
1                John Doe     Male     50 - 70,000
2                Sara Smith   Female   70 - 90,000
......

Location Dimension
Location Key   Street          City         State   Region
5              425 Church St   Charleston   SC      3

Figure 6.5 A constellation schema for credit card purchases and promotions

Decision Support: Analyzing the Warehouse Data
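One reporting task a warehouse supports is summarizing a fact table by an attribute of one of its dimensions. As a minimal sketch, the example below loads the purchase dimension and the three visible fact rows of Figure 6.3 into an in-memory SQLite database and totals the purchase amounts by category. SQLite, the table and column names (purchase_dim, purchase_fact), and the helper amount_by_category are assumptions chosen for illustration; the rows elided in the figure are not invented here.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE purchase_dim (purchase_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE purchase_fact (
    cardholder_key INTEGER,
    purchase_key   INTEGER,
    location_key   INTEGER,
    time_key       INTEGER,
    amount         REAL
);
""")

# Dimension rows as listed in Figure 6.3.
db.executemany("INSERT INTO purchase_dim VALUES (?, ?)", [
    (1, "Supermarket"), (2, "Travel & Entertainment"), (3, "Auto & Vehicle"),
    (4, "Retail"), (5, "Restaurant"), (6, "Miscellaneous"),
])

# Only the three fact rows visible in Figure 6.3.
db.executemany("INSERT INTO purchase_fact VALUES (?, ?, ?, ?, ?)", [
    (1, 2, 1, 10, 14.50),
    (15, 4, 5, 11, 8.25),
    (1, 2, 3, 10, 22.40),
])


def amount_by_category(conn):
    """Join the fact table to the purchase dimension and total amounts per category."""
    return conn.execute(
        "SELECT d.category, COUNT(*), ROUND(SUM(f.amount), 2) "
        "FROM purchase_fact AS f "
        "JOIN purchase_dim AS d ON f.purchase_key = d.purchase_key "
        "GROUP BY d.category ORDER BY d.category").fetchall()


report = amount_by_category(db)
for category, purchases, total in report:
    print(f"{category}: {purchases} purchase(s), total {total:.2f}")
```

The fact table stores only keys and the measured amount; descriptive text such as the category name is looked up through the join, which is the point of separating facts from dimensions in a star schema.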

• Reporting Data
• Analyzing Data
• Knowledge Discovery

6.3 On-line Analytical Processing

OLAP Operations

• Slice – A single dimension operation
• Dice – A multidimensional operation
• Roll-up – A higher level of generalization
• Drill-down – A greater level of detail
• Rotation – View data from a new perspective

[Figure: a cube with axes Month (Jan. through Dec.), Category (Supermarket, Restaurant, Vehicle, Travel, Retail, Miscellaneous), and Region (One through Four). The highlighted cell is Month = Dec., Category = Vehicle, Region = Two, with Amount = 6,720 and Count = 110.]

Figure 6.6 A multidimensional cube for credit card purchases

Concept Hierarchy

A mapping that allows attributes to be viewed from varying levels of detail.

[Figure: levels from most general to most detailed: Region, State, City, Street Address.]
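The location hierarchy above gives a natural way to illustrate roll-up: the same purchase amounts are totaled at the street, city, state, or region level, and moving back toward street address corresponds to drill-down. The sketch below is only an illustration under stated assumptions. The first location row comes from the location dimension of Figure 6.3; the other two locations, all purchase amounts, and the names LEVELS, LOCATIONS, PURCHASES, and roll_up are hypothetical.

```python
from collections import defaultdict

# Levels of the location hierarchy, most detailed first.
LEVELS = ("street", "city", "state", "region")

# Location records keyed by location key. The first row is taken from
# Figure 6.3; the other two are hypothetical, added so the roll-up is visible.
LOCATIONS = {
    10: {"street": "425 Church St", "city": "Charleston", "state": "SC", "region": 3},
    11: {"street": "12 Hypothetical Ave", "city": "Charleston", "state": "SC", "region": 3},
    12: {"street": "7 Example Rd", "city": "Columbia", "state": "SC", "region": 3},
}

# (location_key, amount) pairs; all amounts are hypothetical.
PURCHASES = [(10, 14.50), (11, 8.25), (12, 22.40), (10, 5.00)]


def roll_up(purchases, locations, level):
    """Total purchase amounts at the requested level of the hierarchy."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    totals = defaultdict(float)
    for location_key, amount in purchases:
        totals[locations[location_key][level]] += amount
    return {key: round(total, 2) for key, total in totals.items()}


# Each step up the hierarchy merges groups from the level below.
for level in LEVELS:
    print(level, roll_up(PURCHASES, LOCATIONS, level))
```

Storing every level on each location record, as a location dimension table does, lets the level of detail be chosen at query time without changing the stored purchases.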

Figure 6.7 A concept hierarchy for location

[Figure: a cube with axes Time (Q1 through Q4), Category (Supermarket, Restaurant, Vehicle, Travel, Retail, Miscellaneous), and Region (One through Four). The highlighted cell is Month = Oct./Nov./Dec., Category = Supermarket, Region = One.]

Figure 6.8 Rolling up from months to quarters

6.4 Excel Pivot Tables for

Creating a Simple Pivot Table

Figure 6.9 A pivot table template
Figure 6.10 A summary report for income range
Figure 6.11 A pie chart for income range

Pivot Tables for Hypothesis Testing

Figure 6.12 A pivot table showing age and credit card insurance choice
Figure 6.13 Grouping the credit card promotion data by age
Figure 6.14 PivotTable Layout Wizard

Creating a Multidimensional Pivot Table

[Figure: a cube with axes Watch Promo, Magazine Promo, and Life Insurance Promo, each taking the values Yes and No. The highlighted cell is Watch Promo = No, Life Insurance Promo = Yes, Magazine Promo = Yes.]

Figure 6.15 A credit card promotion cube
Figure 6.16 A pivot table with page variables for credit card promotions
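The promotion cube of Figure 6.15 can be approximated outside Excel by counting customers for each combination of the three promotion responses. The sketch below is a plain-Python illustration, not a reproduction of the Excel steps. The response records are hypothetical, since the underlying data is not shown in the slides, and the names RESPONSES, DIMENSIONS, build_cube, and slice_cube are assumptions. Fixing one dimension to a single value plays a role similar to a page variable in Figure 6.16.

```python
from collections import Counter

# Order of the dimensions in each record.
DIMENSIONS = ("watch", "magazine", "life_insurance")

# Hypothetical responses, one tuple per customer:
# (watch promo, magazine promo, life insurance promo).
RESPONSES = [
    ("No", "Yes", "Yes"),
    ("No", "Yes", "Yes"),
    ("Yes", "Yes", "No"),
    ("Yes", "No", "No"),
    ("No", "No", "Yes"),
    ("Yes", "Yes", "Yes"),
]


def build_cube(rows):
    """Count customers in each cell of the three-dimensional cube."""
    return Counter(rows)


def slice_cube(cube, dimension, value):
    """Fix one dimension to a value and return counts over the remaining two."""
    axis = DIMENSIONS.index(dimension)
    result = Counter()
    for cell, count in cube.items():
        if cell[axis] == value:
            remaining = cell[:axis] + cell[axis + 1:]
            result[remaining] += count
    return result


cube = build_cube(RESPONSES)

# The cell highlighted in Figure 6.15: watch = No, magazine = Yes, life insurance = Yes.
print("highlighted cell count:", cube[("No", "Yes", "Yes")])

# Two-dimensional table (magazine, life insurance) for customers with watch = No.
for (magazine, life), count in sorted(slice_cube(cube, "watch", "No").items()):
    print(f"magazine={magazine}, life_insurance={life}: {count}")
```

Each key of the resulting counter corresponds to one cell of the cube, so a cell with no customers is simply absent and reads as a count of zero.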