DB2 UDB's High Function Business Intelligence in E-Business
Total Page:16
File Type:pdf, Size:1020Kb
Front cover DB2 UDB’s High Function Business Intelligence in e-business Exploit DB2’s materialized views (ASTs/MQTs) feature Leverage DB2’s statistics, analytic and OLAP functions Review sample business scenarios Nagraj Alur Peter Haas Daniela Momiroska Paul Read Nicholas Summers Virginia Totanes Calisto Zuzarte ibm.com/redbooks International Technical Support Organization DB2 UDB’s High-Function Business Intelligence in e-business September 2002 SG24-6546-00 Take Note! Before using this information and the product it supports, be sure to read the general information in “Notices” on page xvii. First Edition (September 2002) This edition applies to all operating system platforms that DB2 UDB Version 8 supports. Comments may be addressed to: IBM Corporation, International Technical Support Organization Dept. QXXE Building 80-E2 650 Harry Road San Jose, California 95120-6099 When you send information to IBM, you grant IBM a non-exclusive right to use or distribute the information in any way it believes appropriate without incurring any obligation to you. © Copyright International Business Machines Corporation 2002. All rights reserved. Note to U.S Government Users – Documentation related to restricted rights – Use, duplication or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp. Contents Figures . vii Tables . .xi Examples. xiii Notices . xvii Trademarks . xviii Preface . xix The team that wrote this redbook. xx Notice . xxii Comments welcome. xxii Chapter 1. Business Intelligence overview. 1 1.1 e-business drivers . 2 1.1.1 Impact of e-business . 5 1.1.2 Importance of BI . 7 1.2 IBM’s BI strategy and offerings . 9 1.2.1 BI and analytic enhancements in DB2 UDB . 11 1.2.2 Advantages of BI functionality in the database engine . 12 1.3 Redbook focus . 12 1.3.1 Materialized views. 13 1.3.2 Statistics, analytic and OLAP functions. 14 Chapter 2. DB2 UDB’s materialized views . 15 2.1 Materialized view overview . 16 2.1.1 Materialized view motivation . 16 2.1.2 Materialized view concept overview . 17 2.1.3 Materialized view usage considerations . 19 2.1.4 Materialized view terminology . 20 2.2 Materialized view CREATE considerations . 21 2.2.1 Step 1: Create the materialized view . 22 2.2.2 Step 2: Populate the materialized view . 22 2.2.3 Step 3: Tune the materialized view . 26 2.3 Materialized view maintenance considerations . 26 2.3.1 Deferred refresh . 27 2.3.2 Immediate refresh . 34 2.4 Loading base tables (LOAD utility) . 37 © Copyright IBM Corp. 2002 iii 2.5 Materialized view ALTER considerations . 41 2.6 Materialized view DROP considerations . 42 2.7 Materialized view matching considerations . 42 2.7.1 State considerations . 44 2.7.2 Matching criteria considerations . 44 2.7.3 Matching permitted . 45 2.7.4 Matching inhibited . 56 2.8 Materialized view design considerations . 60 2.8.1 Step 1: Collect queries & prioritize . 63 2.8.2 Step 2: Generalize local predicates to GROUP BY . 64 2.8.3 Step 3: Create the materialized view . 65 2.8.4 Step 4: Estimate materialized view size . 65 2.8.5 Step 5: Verify query routes to “empty” the materialized view . 66 2.8.6 Step 6: Consolidate materialized views . 66 2.8.7 Step 7: Introduce cost issues into materialized view routing. 67 2.8.8 Step 8: Estimate performance gains . 67 2.8.9 Step 9: Load the materialized views with production data . 69 2.8.10 Generalizing local predicates application example . 69 2.9 Materialized view tuning considerations . 87 2.10 Refresh optimization . 90 2.11 Materialized view limitations . 92 2.11.1 REFRESH DEFERRED and REFRESH IMMEDIATE . 92 2.11.2 REFRESH IMMEDIATE and queries with staging table . 93 2.12 Replicated tables in nodegroups . 95 Chapter 3. DB2 UDB’s statistics, analytic, and OLAP functions. 99 3.1 DB2 UDB’s statistics, analytic, and OLAP functions . 100 3.2 Statistics and analytic functions . 100 3.2.1 AVG. 101 3.2.2 CORRELATION . 101 3.2.3 COUNT . 102 3.2.4 COUNT_BIG . 102 3.2.5 COVARIANCE . 103 3.2.6 MAX . 103 3.2.7 MIN . 104 3.2.8 RAND . 104 3.2.9 STDDEV . 105 3.2.10 SUM . 106 3.2.11 VARIANCE . 106 3.2.12 Regression functions. 107 3.2.13 COVAR, CORR, VAR, STDDEV, and regression examples. 110 3.3 OLAP functions . 117 3.3.1 Ranking, numbering and aggregation functions . 118 iv High-Function Business Intelligence in e-business 3.3.2 GROUPING capabilities ROLLUP & CUBE . 125 3.3.3 Ranking, numbering, aggregation examples. 127 3.3.4 GROUPING, GROUP BY, ROLLUP and CUBE examples. 138 Chapter 4. Statistics, analytic, OLAP functions in business scenarios. 149 4.1 Introduction . 150 4.1.1 Using sample data . 150 4.1.2 Sampling and aggregation example . 151 4.2 Retail . 154 4.2.1 Present annual sales by region and city . 154 4.2.2 Provide total quarterly and cumulative sales revenues by year . 156 4.2.3 List the top 5 sales persons by region this year . 159 4.2.4 Compare and rank the sales results by state and country . 160 4.2.5 Determine relationships between product purchases . 164 4.2.6 Determine the most profitable items and where they are sold . 167 4.2.7 Identify store sales revenues noticeably different from average . 171 4.3 Finance . 173 4.3.1 Identify the most profitable customers . 173 4.3.2 Identify the profile of transactions concluded recently . 176 4.3.3 Identify target groups for a campaign . 181 4.3.4 Evaluate effectiveness of a marketing campaign . 184 4.3.5 Identify potential fraud situations for investigation . 192 4.3.6 Plot monthly stock prices movement with percentage change . 193 4.3.7 Plot the average weekly stock price in September . 195 4.3.8 Project growth rates of Web hits for capacity planning purposes . 198.