Dimensional Model: An Overview (Why)
Dan Kirpes, Fireman’s Fund Insurance Company, Novato, CA

ABSTRACT
This paper is an introduction to Dimensional Model Data Warehousing. It first contrasts “Report Centric” with “Information Centric” reporting for decision support. A brief history of Data Warehousing follows, along with Data Warehouse definitions. The Entity-Relationship model and, more specifically, the Dimensional Model are then reviewed. The relationship of Business Intelligence to data warehousing is also discussed.

REPORT CENTRIC VS. INFORMATION CENTRIC
Many companies are Report Centric in their decision support design. In many Decision Support or Management Information departments, hundreds and most likely thousands of reports are created each month in an effort to satisfy the users’ appetite for information. Unfortunately, this approach will never satisfy that need. Users continue to develop new report requests, ad hoc or one-off reports, which the Report Centric design cannot handle. The users are spoon-fed the data in precise formats, use these reports for their job functions, and then react to the information.

It is recommended to move to the Information Centric design, where users are empowered to query or browse the data themselves and as a result can be proactive. This is a cultural shift for most organizations: management, report users, and report application designers and developers. The Information Centric design requires that the data be arranged in a manner that is understandable to users, user friendly to query, and consistent in its results.

The Data Warehousing Dimensional Model design, in conjunction with a Business Intelligence tool, provides the business or organization with an Information Centric design.

HISTORY OF DATA WAREHOUSING
1960’s: General Mills and Dartmouth University, in a joint project, developed the terms “dimension” and “fact”. This was the beginning of the first

1970’s: Nielsen Marketing Research took the techniques of “dimensions” and “facts” forward for grocery and drug store audit data.

1980’s: Nielsen Marketing Research and IRI used grocery and drug store scanner data and tied it together with customers’ internal shipment data.

DATA WAREHOUSE DEFINITIONS
Data Warehouse: The queryable source of data in the enterprise.

Data Mart: A logical subset of the complete data warehouse.

Operational Data Store (ODS): The point of integration for operational systems.

Metadata Database: All of the information in the data warehouse environment that is not the actual data itself. It is centrally maintained and stored.

MODELS
ENTITY-RELATIONSHIP MODELING (ER): Entity-relationship modeling is a logical design technique that seeks to eliminate data redundancy. ER models show the relationships among data. These models are difficult to read and understand unless the reader is trained in the modeling methodology. It is also difficult to understand the business from viewing the ER model. See the appendix for an example.

Dimensional modeling is the name of a logical design technique used for data warehouses. Every dimensional model is composed of a “fact” table and a set of “dimension” tables. Conformed fact and dimension elements are elements that conform to the enterprise’s centralized definitions. For example, “store_id” would have a common definition and common attributes across the enterprise, and as such would carry the same information across dimension tables. See the appendix for an example of a Dimensional Model.

BUSINESS INTELLIGENCE AND DATA WAREHOUSING
Today's business environment requires a responsiveness that can only be achieved with timely and accurate insight into business conditions. To be successful, businesses need rapid, easy access to information about their customers, their internal finances, and external market conditions, collectively known as business intelligence.

Business Intelligence is the process of analyzing your organization's accumulated raw data and extracting useful insight from it. BI provides decision makers with the right information, at the right time, in the right place, enabling them to make better business decisions.

BUSINESS INTELLIGENCE
A: Data: Centralize data from multiple sources into a data warehouse.
B: Insight: BI tools analyze the data to help better understand the business.
C: Action: Act on the insight provided by BI tools by reallocating resources.

BUSINESS INTELLIGENCE IS MADE UP OF MANY RELATED ACTIVITIES
Discover relationships among data points
OLAP (Online Analytical Processing): Online analysis of transactional data
Query & Reporting: View and manipulate data via multiple report formats
Proactive Information Delivery: Receive information on a scheduled or event-driven basis via web, wireless, or voice device

CONCLUSION
1. Information Centric reporting is a 180-degree shift from the Report Centric reporting environment that most organizations currently use.
2. Information Centric reporting via the data warehouse’s dimensional model puts the effort into the organization and storage of the data for easy access instead of into creating endless reporting applications.
3. Information Centric reporting empowers users to be proactive instead of reactive.
4. Information Centric reporting provides increased efficiency and productivity, which results in increased profitability.
5. Data warehousing and Business Intelligence provide the foundation for Information Centric reporting.

REFERENCES
Dimensional Modeling, www.ou.edu/class/aschwarz/DataWarehouse

Welbrock, Peter R., Strategic Data Warehousing Principles Using SAS Software, Cary, NC: SAS Institute Inc., 1998. 384pp.

Kimball, Ralph, The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, New York, NY: John Wiley and Sons, Inc., 2002. 436pp.

Kimball, Ralph, The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing, and Deploying Data Warehouses, New York, NY: John Wiley and Sons, Inc., 1998. 771pp.

CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:
Dan Kirpes
Fireman’s Fund Insurance Company
777 San Marin Drive
Novato, CA 94998
(415) 899-3561
Email: [email protected]

APPENDIX
1. ENTITY-RELATIONSHIP MODEL (ERD)

PRODUCT: SKU (PK), description, brand, category
CUSTOMER: customer_ID (PK), customer_name, purchase_profile, credit_profile, address
STORE: store_ID (PK), store_name, address, district, floor_type
CLERK: clerk_id (PK), clerk_name, clerk_grade
PROMOTION: promotion_NUM (PK), promotion_name, price_type, ad_type
ORDER: order_num (PK), customer_ID (FK), store_ID (FK), clerk_ID (FK), date
ORDER-LINE: order_num (PK) (FK), SKU (PK) (FK), promotion_key (FK), dollars_sold, units_sold, dollars_cost

2. DIMENSIONAL MODEL

FACT: time_key (FK), store_key (FK), clerk_key (FK), product_key (FK), customer_key (FK), promotion_key (FK), dollars_sold, units_sold, dollars_cost
TIME: time_key (PK), SQL_date, day_of_week, month
PRODUCT: product_key (PK), SKU, description, brand, category
STORE: store_key (PK), store_ID, store_name, address, district, floor_type
CUSTOMER: customer_key (PK), customer_name, purchase_profile, credit_profile, address
CLERK: clerk_key (PK), clerk_id, clerk_name, clerk_grade
PROMOTION: promotion_key (PK), promotion_name, price_type, ad_type
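
To make the appendix's dimensional model more concrete, the following is a minimal sketch, not part of the original paper, of how a reduced version of it might be built and queried with Python's built-in sqlite3 module. It keeps only three of the dimensions (TIME, STORE, PRODUCT) and the measures from the FACT table. The table names time_dim and sales_fact, the choice of SQLite, and every data row are assumptions made purely for illustration.

```python
import sqlite3

# Illustrative sketch only: a reduced star schema based on the appendix.
# The names time_dim and sales_fact and all rows below are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, sql_date TEXT,
                       day_of_week TEXT, month TEXT);
CREATE TABLE store    (store_key INTEGER PRIMARY KEY, store_id TEXT,
                       store_name TEXT, district TEXT);
CREATE TABLE product  (product_key INTEGER PRIMARY KEY, sku TEXT,
                       description TEXT, brand TEXT, category TEXT);
-- The fact table holds foreign keys to the dimensions plus the measures.
CREATE TABLE sales_fact (
    time_key     INTEGER REFERENCES time_dim(time_key),
    store_key    INTEGER REFERENCES store(store_key),
    product_key  INTEGER REFERENCES product(product_key),
    dollars_sold REAL, units_sold INTEGER, dollars_cost REAL);
""")

# Hypothetical sample rows for each dimension and for the fact table.
conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?, ?)",
                 [(1, "2002-01-07", "Monday", "January")])
conn.executemany("INSERT INTO store VALUES (?, ?, ?, ?)",
                 [(1, "S01", "Downtown", "North"),
                  (2, "S02", "Harbor", "South")])
conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?, ?)",
                 [(1, "SKU-1", "Cola 12oz", "Acme", "Beverage"),
                  (2, "SKU-2", "Chips", "Zest", "Snack")])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?, ?)",
                 [(1, 1, 1, 10.0, 5, 6.0),
                  (1, 1, 2, 4.0, 2, 2.5),
                  (1, 2, 1, 20.0, 10, 12.0)])

# A typical user question: sales by district and brand.
# Each dimension is one join away from the fact table.
rows = conn.execute("""
    SELECT s.district, p.brand,
           SUM(f.dollars_sold) AS dollars_sold,
           SUM(f.units_sold)   AS units_sold
    FROM sales_fact AS f
    JOIN store   AS s ON s.store_key   = f.store_key
    JOIN product AS p ON p.product_key = f.product_key
    GROUP BY s.district, p.brand
    ORDER BY s.district, p.brand
""").fetchall()

for row in rows:
    print(row)
```

The point of the sketch is the shape of the query: every descriptive attribute a user might group or filter by sits in a dimension table that joins directly to the fact table. This is one reason the dimensional model is considered easier for users to browse than the ER model, where the same question would require navigating ORDER and ORDER-LINE first.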
