Data Warehouse – Part 02

Based on Chapter 06, The Data Warehouse, in Data Mining: A Tutorial-Based Primer by Roiger and Geatz

Data Warehouse Purpose
  • House data for decision support
  • Support organizational decision making so that it can be fact-based instead of ad hoc

Decision Support Categories
  • Reporting
  • Analyzing
  • Knowledge discovery

Sample of Credit Card Promotion Data (from Table 2.3)

  Income Range   Magazine Promo   Watch Promo   Life Ins Promo   CC Ins   Sex      Age
  40-50K         Yes              No            No               No       Male     45
  30-40K         Yes              Yes           Yes              No       Female   40
  40-50K         No               No            No               No       Male     42
  30-40K         Yes              Yes           Yes              Yes      Male     43
  50-60K         Yes              No            Yes              No       Female   38
  20-30K         No               No            No               No       Female   55
  30-40K         Yes              No            Yes              Yes      Male     35
  20-30K         No               Yes           No               No       Male     27
  30-40K         Yes              No            No               No       Male     43
  30-40K         Yes              Yes           Yes              No       Female   41

Credit Card Purchases and Promotions: Constellation Design
  [Figure: constellation design diagram]

Online Analytical Processing (OLAP)
  • Query-based methodology that supports data analysis
  • OLAP engine structures data as a cube
  • A cube can have more than three dimensions, as the term "cube" is used in business intelligence and data warehousing

Find the Total Sales by Product by Year and by Region
  [Figure: data cube with axes Region (South, Central), Product (Mythic World), and Year (2005)]

Data Cubes
  [Figures: data cube illustrations]
  Sources:
  http://www.info-source.us/data__g_gwarehousing_mining/Data-Mining-and-Data-Warehousing-in-Biology-Medicine-and-Health-Care.html
  http://zeesql.wordpress.com/2008/05/21/data-cubes/

Data Cube Characteristics
  • Designed for a specific purpose
  • For four dimensions, visualize multiple cubes with the same three dimensions, where each cube represents a particular value of the fourth dimension
  • Extrapolate to n dimensions
  Source: http://zeesql.wordpress.com/2008/05/21/data-cubes/

Data Cube Characteristics
  • Cubes with many empty cells are not as useful
  • Thus, a cube with two time dimensions is not a good design, because the intersection of quarter and month would often be empty
  Source: http://www.info-source.us/data__g_gwarehousing_mining/Data-Mining-and-Data-Warehousing-in-Biology-Medicine-and-Health-Care.html

Data Store Behind Data Cube

  Relational                                  Multidimensional
  Star schema                                 Arrays
  Advantage: user can view data at the        Advantage: query speed
  detail level defined by the star schema

OLAP Interfaces
  • Many are emerging, especially interfaces designed for visual exploration
  • Default interface is a spreadsheet workbook format
  • Useful OLAP functionality
      • Different views of data
      • Statistical calculations
      • Drill-down and reverse drill-down (or roll-up): look at data at a more granular (detail) level or vice versa
  • Short video in right panel demoing an OLAP interface: http://www.softwarefx.com/Extensions/featuresOlap.aspx

Slice
  • A slice is a subset of a multi-dimensional array corresponding to a single value for one or more members of the dimensions not in the subset.
  Source: http://www.practicaldb.com/blog/cubes/

Dice
  • The dice operation is a slice on more than two dimensions of a data cube (or more than two consecutive slices)

OLAP Concept Example
  • Credit card purchase data
  • Highlighted cell: Month = Dec., Category = Vehicle, Region = Two, Amount = 6,720, Count = 110
  • The cell holds the total amount and total number of vehicle purchases in region two for the month of December
  [Figure 6.6: A multidimensional cube for credit card purchases, with axes Month (Jan. through Dec.), Region (One, Two, Three, Four), and Category (Retail, Travel, Vehicle, Restaurant, Supermarket, Miscellaneous)]

Attributes May Be Based on Concept Hierarchy
  [Figure: concept hierarchy for Location]

Excel Pivot Tables
  • Accomplish the cube concept
  • Aggregate your information
  • Show a new perspective
  Source: http://www.timeatlas.com/5_minute_tips/chunkers/learn_to_use_pivot_tables_in_excel_2007_to_organize_data

Excel Pivot Table – Example – p. 1
  • Open CreditCardPromotion.xlsx
  • Copy the original data to a new worksheet in order to preserve the original data
  • Remove any blank columns or rows
  • Each column must have a heading
  • Cells should be properly formatted for the data type
  • Highlight the data

Excel Pivot Table – Example – p. 2
  • Click Insert
  • Select Pivot Table
  • Select Pivot Table to open the Create Pivot Table dialog box
  • Select Table/Range to make sure you selected the correct range
  • Select the New Worksheet button
  • Click OK

Excel Pivot Table – Example – p. 3
  • Select Income Range for row labels
  • Select Income Range for values
  • Click on Count of Income Range
  • Go to Field Settings
  • Choose the % of column setting

Excel Pivot Table – Example – p. 4
  • Highlight the percentages
  • Select Insert Pie Chart
  [Figure: pie chart titled Total with segments 20-30K, 30-40K, 40-50K, 50-60K]

Excel Pivot Table – Example – p. 5
  • Check out the drill-down functionality
  • Double-click in the pivot table on the % value for a particular income range
  • The detail values for that income range are displayed in a new worksheet

Continue – With Page 205
  • Creating a Multidimensional Pivot Table
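
The pivot table steps in the Excel example (count Income Range, show it as % of column, then drill down) can be approximated outside Excel. The following is a minimal Python sketch, not the book's or Excel's method; it uses the ten rows of the sample credit card promotion table, and the names ROWS, pivot_percent, and drill_down are hypothetical helpers introduced only for illustration.

```python
from collections import Counter

# Rows from the sample credit card promotion table:
# (income_range, magazine, watch, life_ins, cc_ins, sex, age)
ROWS = [
    ("40-50K", "Yes", "No", "No", "No", "Male", 45),
    ("30-40K", "Yes", "Yes", "Yes", "No", "Female", 40),
    ("40-50K", "No", "No", "No", "No", "Male", 42),
    ("30-40K", "Yes", "Yes", "Yes", "Yes", "Male", 43),
    ("50-60K", "Yes", "No", "Yes", "No", "Female", 38),
    ("20-30K", "No", "No", "No", "No", "Female", 55),
    ("30-40K", "Yes", "No", "Yes", "Yes", "Male", 35),
    ("20-30K", "No", "Yes", "No", "No", "Male", 27),
    ("30-40K", "Yes", "No", "No", "No", "Male", 43),
    ("30-40K", "Yes", "Yes", "Yes", "No", "Female", 41),
]


def pivot_percent(rows):
    """Count rows per income range and express each count as % of the total."""
    counts = Counter(row[0] for row in rows)
    total = sum(counts.values())
    return {income: 100.0 * n / total for income, n in sorted(counts.items())}


def drill_down(rows, income_range):
    """Return the detail rows behind one pivot-table cell."""
    return [row for row in rows if row[0] == income_range]


percentages = pivot_percent(ROWS)
for income, pct in percentages.items():
    print(f"{income}: {pct:.0f}%")

# Drill-down: the detail rows that make up the 30-40K percentage.
details = drill_down(ROWS, "30-40K")
```

The drill_down helper plays the role of double-clicking a percentage in the pivot table: it returns the underlying rows rather than the aggregate.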

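The slice, dice, and roll-up operations from the OLAP slides can be sketched on a tiny in-memory cube. This Python sketch is an assumption-laden illustration, not how an OLAP engine stores data: the cube is a plain dictionary keyed by (month, category, region), the helpers select and roll_up are hypothetical names, and only the Dec./Vehicle/Two cell (amount 6,720, count 110) comes from the slides. All other cell values are made up.

```python
from collections import defaultdict

DIMS = ("month", "category", "region")

# Each cell is keyed by (month, category, region) and holds (amount, count).
# Only the Dec/Vehicle/Two cell comes from the slides; the other cells are
# made-up values for illustration.
CUBE = {
    ("Dec", "Vehicle", "Two"): (6720, 110),
    ("Dec", "Vehicle", "One"): (1000, 20),
    ("Dec", "Travel", "Two"): (500, 10),
    ("Nov", "Vehicle", "Two"): (3000, 50),
}


def select(cube, **fixed):
    """Keep the cells whose coordinates match every fixed dimension value."""
    return {
        key: cell
        for key, cell in cube.items()
        if all(key[DIMS.index(dim)] == value for dim, value in fixed.items())
    }


def roll_up(cube, dim):
    """Sum amount and count over every dimension except `dim`."""
    totals = defaultdict(lambda: [0, 0])
    axis = DIMS.index(dim)
    for key, (amount, count) in cube.items():
        totals[key[axis]][0] += amount
        totals[key[axis]][1] += count
    return {value: tuple(pair) for value, pair in totals.items()}


# Slice: fix a single value on one dimension.
december = select(CUBE, month="Dec")

# Dice (in the slides' sense): restrict more than two dimensions at once.
dec_vehicle_two = select(CUBE, month="Dec", category="Vehicle", region="Two")

# Roll-up: aggregate to the month level.
by_month = roll_up(CUBE, "month")
```

A dictionary keeps only the non-empty cells, which sidesteps the empty-cell problem the slides mention for sparse cubes, at the cost of the fast positional lookup a multidimensional array offers.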