Front cover IBM® DB2®

IBM® Information Management Software

Leveraging DB2 10 for High Performance of Your Data Warehouse

Comprehensive platform for architected data warehouse solutions

Faster time to value for business decision making

Standards-based for flexibility and change

Whei-Jen Chen
Scott Andrus
Bhuvana Balaji
Enzo Cialini
Michael Kwok
Roman B. Melnyk
Jessica Rockwood

ibm.com/redbooks

International Technical Support Organization

Leveraging DB2 10 for High Performance of Your Data Warehouse

January 2014

SG24-8157-00

Note: Before using this information and the product it supports, read the information in “Notices” on page ix.

First Edition (January 2014)

This edition applies to DB2 for Linux, UNIX, and Windows Version 10.5.

© Copyright International Business Machines Corporation 2014. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Contents

Notices  ix
Trademarks  x

Preface  xi
Authors  xi
Acknowledgements  xiv
Now you can become a published author, too!  xv
Comments welcome  xv
Stay connected to IBM Redbooks  xv

Part 1. Overview  1

Chapter 1. Gaining business insight with IBM DB2  3
1.1 Current business challenges  4
1.2 Big data and the data warehouse  5
1.2.1 Data warehouse infrastructure  6
1.3 High performance warehouse with DB2 for Linux, UNIX, and Windows  7

Chapter 2. Technical overview of IBM DB2 Warehouse  9
2.1 DB2 Warehouse solutions  10
2.1.1 The component groups  10
2.1.2 Editions  11
2.2 Expert systems  17
2.2.1 PureData for operational analytics  18
2.2.2 DB2 Warehouse components  19
2.2.3 Database management system  22
2.3 DB2 Warehouse topology  25

Chapter 3. Warehouse development lifecycle  29
3.1 Defining business requirements  30
3.2 Building the data model  31
3.2.1 Defining the physical data model  31
3.2.2 Creating the physical data model  32
3.2.3 Working with diagrams  36
3.2.4 Using the Diagram Editor  38
3.2.5 Optimal attributes of the physical data model  39
3.2.6 Model analysis  40
3.2.7 Deploying the data model  42
3.2.8 Maintaining model accuracy  45

3.3 Matching the technology to the requirements  48

Part 2. Technologies  51

Chapter 4. Column-organized data store with BLU Acceleration  53
4.1 Simple and fast with BLU Acceleration  54
4.1.1 Dynamic in-memory columnar technology  54
4.1.2 Actionable compression  55
4.1.3 Parallel vector processing  55
4.1.4 Data skipping  56
4.2 Getting started with BLU Acceleration  56
4.2.1 Capacity planning  57
4.2.2 Storage requirements  58
4.3 Creating a column-organized data store  58
4.3.1 Single configuration setting for analytic workloads  59
4.3.2 Fine-tuning a configuration  60
4.3.3 Creating column-organized tables  61
4.3.4 Converting to column-organized tables  63
4.4 Loading a column-organized data store  63
4.4.1 Synopsis tables and data skipping  64
4.4.2 Compression of column-organized tables  65
4.4.3 Reducing load times  67
4.5 Managing a column-organized data mart  68
4.5.1 Query execution plans and the new CTQ operator  68
4.5.2 Database maintenance  70

Chapter 5. Row-based data store  71
5.1 Scale-out solution: Database Partitioning Feature  72
5.1.1 Partition  72
5.1.2 Partition group  73
5.1.3 Table spaces in a DPF environment  74
5.1.4 Partitioning keys  74
5.1.5 Partition maps  75
5.1.6 Choosing partitioning keys  76
5.1.7 Concluding remarks  78
5.2 Scale-up solution: Intrapartition parallelism  79
5.3 Other features for row-based data warehouse  81
5.3.1 Compression  81
5.3.2 Multidimensional data clustering  83
5.3.3 Range partitioning  87

Chapter 6. Data movement and transformation  91
6.1 Introduction  92
6.1.1 Load  92

6.1.2 Ingest  94
6.1.3 Continuous data ingest  96
6.2 DB2 Warehouse SQL Warehousing Tool  98
6.2.1 Architecture  99
6.2.2 Development environment  100
6.2.3 Moving from development to production  101
6.2.4 Runtime environment  102

Chapter 7. Monitoring  105
7.1 Understanding monitor elements  106
7.1.1 Monitor element collection levels  108
7.1.2 New monitoring elements for column-organized tables  110
7.2 Using DB2 table functions to monitor your database in real time  111
7.2.1 Monitoring requests  111
7.2.2 Monitoring activities  112
7.2.3 Monitoring data objects  112
7.2.4 Monitoring locks  113
7.2.5 Monitoring system memory  113
7.2.6 Monitoring routines  113
7.3 Using event monitors to capture information about database events  114
7.4 Monitoring your DB2 system performance with IBM InfoSphere Optim Performance Manager  116
7.4.1 Information dashboards  117
7.4.2 OPM support for DB2 10.5  118

Chapter 8. High availability  119
8.1 IBM PureData System for Operational Analytics  120
8.2 Core warehouse availability  123
8.2.1 Roving HA group configuration for the administration hosts  124
8.2.2 Roving HA group configuration for the data hosts  125
8.2.3 Core warehouse HA events monitored by the system console  127
8.3 Management host availability  128
8.3.1 Management host failover events that are monitored by the system console  131
8.4 High availability management  132
8.4.1 High availability toolkit  132
8.4.2 Starting and stopping resources with the HA toolkit  133
8.4.3 Monitoring the status of the core warehouse HA configuration  134
8.4.4 Monitoring the status of the management host HA configuration  137
8.4.5 Moving database partition resources to the standby node as a planned failover  140
8.4.6 Moving resources to the standby management host as a planned failover  141

Chapter 9. Workload management  145
9.1 Introducing DB2 workload management  146
9.1.1 Defining business goals  146
9.1.2 Identifying work that enters the data server  147
9.1.3 Managing work in progress  147
9.1.4 Monitoring to ensure that the data server is being used efficiently  148
9.2 Workload management administrator authority  149
9.3 Workload management concepts  150
9.3.1 Workloads  150
9.3.2 Work classes and work class sets  150
9.3.3 Service classes  151
9.3.4 Thresholds  151
9.3.5 Threshold actions  151
9.3.6 Work actions and work action sets  152
9.3.7 Histograms and histogram templates  152
9.4 Configuring DB2 workload management in stages  153
9.4.1 Default workload management (Stage 0)  153
9.4.2 Untuned workload management (Stage 1)  154
9.4.3 Tuned workload management (Stage 2)  154
9.4.4 Advanced workload management (Stage 3)  155
9.4.5 Moving from Stage 0 to Stage 2  155
9.5 Workload management dispatcher  156
9.6 Default query concurrency management  158
9.6.1 Default workload management objects for concurrency control  159

Chapter 10. Mining and unstructured text analytics  163
10.1 Overview  164
10.2 Business goals  164
10.3 Business scenarios  165
10.3.1 In data mining  165
10.3.2 In text analytics and unstructured analysis  166
10.4 Data mining techniques and process  167
10.4.1 Data mining techniques  168
10.4.2 Data mining process: preparing, building, and deploying  169
10.5 Data mining and text analytics in DB2 Warehouse  170
10.5.1 Data mining  172
10.5.2 Unstructured analysis  174

Chapter 11. Providing the analytics  177
11.1 Implementing OLAP  178
11.1.1 Dimensional model  178
11.1.2 Providing OLAP data  179
11.1.3 Using OLAP data  181

11.1.4 Pulling it all together  181
11.2 Cognos and the cube model  182
11.2.1 Cognos architecture  182
11.2.2 Cognos metadata and Cognos Cube Designer  183
11.2.3 Dimensional metadata and dynamic cubes  184
11.3 Dynamic Cubes architecture and its lifecycle  184
11.3.1 Design and model phase  185
11.3.2 Deploy phase  186
11.3.3 Run phase  186
11.3.4 Optimization  186
11.3.5 Cognos Dynamic Cubes caching  188
11.3.6 Cognos Dynamic Cubes and the data warehouse  191
11.3.7 Cognos Dynamic Cubes and DB2 with BLU Acceleration  192
11.4 Resources  193

Related publications  195
IBM Redbooks  195
Other publications  195
Online resources  196
Help from IBM  196

Notices

This information was developed for products and services offered in the U.S.A.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve as an endorsement of those websites. The materials at those websites are not part of the materials for this IBM product and use of those websites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing application programs conforming to IBM's application programming interfaces.

Trademarks

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. These and other IBM trademarked terms are marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:

AIX®, Cognos®, DataStage®, DB2®, DB2 Connect™, DB2 Extenders™, GPFS™, IBM®, IBM PureData™, Informix®, InfoSphere®, Intelligent Miner®, Optim™, POWER®, PureData™, pureQuery®, pureScale®, PureSystems™, pureXML®, Rational®, Redbooks®, Redbooks (logo)®, solidDB®, SPSS®, System z®, Tivoli®, WebSphere®, z/OS®

The following terms are trademarks of other companies:

Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

Java, and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Other company, product, or service names may be trademarks or service marks of others.

Preface

Building on the business intelligence (BI) framework and capabilities that are outlined in InfoSphere Warehouse: A Robust Infrastructure for Business Intelligence, SG24-7813, this IBM® Redbooks® publication focuses on the new business insight challenges that have arisen in the last few years and the new technologies in IBM DB2® 10 for Linux, UNIX, and Windows that provide powerful analytic capabilities to meet those challenges.

This book is organized into two parts. The first part provides an overview of data warehouse infrastructure and DB2 Warehouse, and outlines the planning and design process for building your data warehouse. The second part covers the major technologies that are available in DB2 10 for Linux, UNIX, and Windows. We focus on functions that help you get the most value and performance from your data warehouse. These technologies include database partitioning, intrapartition parallelism, compression, multidimensional clustering, range (table) partitioning, data movement utilities, database monitoring interfaces, infrastructures for high availability, DB2 workload management, data mining, and relational OLAP capabilities. A chapter on BLU Acceleration gives you all of the details about this exciting DB2 10.5 innovation that simplifies and speeds up reporting and analytics. Easy to set up and self-optimizing, BLU Acceleration eliminates the need for indexes, aggregates, or time-consuming database tuning to achieve top performance and storage efficiency. No SQL or schema changes are required to take advantage of this breakthrough technology.

This book is primarily intended for use by IBM employees, IBM clients, and IBM Business Partners.

Authors

This book was produced by a team of specialists from around the world working at the International Technical Support Organization, San Jose Center.

Whei-Jen Chen is a Project Leader at the International Technical Support Organization, San Jose Center. She has extensive experience in application development, database design and modeling, and IBM DB2 system administration. Whei-Jen is an IBM Certified Solutions Expert in Database Administration and Application Development, and an IBM Certified IT Specialist.

Scott Andrus is the development manager for IBM PureData™ for Operational Analytics and IBM Smart Analytics Systems at the IBM Toronto Laboratory in Canada. He focuses on expert integrated systems and ensures that these products deliver the best performance and customer experience possible. Scott works closely with customers and helps them develop and implement key systems to support critical business objectives. He has more than 15 years of experience in software development, architecture, and support.

Bhuvana Balaji is a Software Engineering Manager and IBM SWG Information Management Division Technical Quality Champion at IBM Silicon Valley Laboratory. She has more than 15 years of experience in software development, technical support, and quality assurance roles. Bhuvana holds a dual Master’s degree in Mathematics from the Indian Institute of Technology, Madras, India, and the University of Cincinnati, OH, US.

Enzo Cialini is a Senior Technical Staff Member in the IBM Information Management Software division at the IBM Toronto Laboratory. He is the Chief Quality Assurance Architect and is responsible for the technical test strategy for DB2 for Linux, UNIX, and Windows, PureData, and Customer Operational Profiling. Enzo has more than 20 years of experience in software development, testing, and support. He is actively engaged with customers and the field. He co-authored The High Availability Guide for DB2 and various papers.

Michael Kwok is the senior manager for the DB2 Warehouse and BLU Performance team at the IBM Toronto Lab in Canada. He focuses on database performance in the data warehouse and business analytic space, and ensures that DB2 products continue to deliver the best performance possible. He also works extensively with customers and provides technical consultation. Michael Kwok holds a Ph.D. degree from the University of Waterloo, Canada, with his doctoral dissertation in the area of scalability analysis of distributed systems.

Roman B. Melnyk, Ph.D., is a senior member of the DB2 Information Development team. Roman co-authored Hadoop for Dummies (Wiley, in preparation), DB2 Version 8: The Official Guide (Prentice Hall Professional Technical Reference, 2003), DB2: The Complete Reference (Osborne/McGraw-Hill, 2001), DB2 Fundamentals Certification for Dummies (Hungry Minds, 2001), and DB2 for Dummies (IDG Books, 2000). Roman also edited DB2 10.5 with BLU Acceleration: New Dynamic In-Memory Analytics for the Era of Big Data (McGraw-Hill, 2013), Harness the Power of Big Data: The IBM Big Data Platform (McGraw-Hill, 2013), Warp Speed, Time Travel, Big Data, and More: DB2 10 for Linux, UNIX, and Windows New Features (McGraw-Hill, 2012), and - Off to the Races (Pearson Education, 2006).

Jessica Rockwood is the senior manager for the DB2 Performance Benchmarks team at the IBM Toronto Lab in Canada. Before taking on the benchmark team, her most recent focus was on BLU Acceleration, working to enable other IBM products to leverage the performance enhancements for analytic workloads and building a knowledge base for the technical sales organization. Jessica has also had a wide and varied set of roles in the past, including customer proof of concepts (POCs) and performance benchmarks of warehouse environments, DB2 pureScale® specialist, DB2 HADR specialist, and a long history in autonomics and DB2 administration and monitoring.

Acknowledgements

Thanks to the following people for their contributions to this project:

Chris Eaton, Sam Lightstone, Berni Schiefer, Jason Shayer, Les King
IBM Toronto Laboratory, Canada

Thanks to the authors of the previous editions of this book.

Authors of InfoSphere Warehouse: A Robust Infrastructure for Business Intelligence, SG24-7813, published in June 2010, were:

Chuck Ballard, Nicole Harris, Andrew Lawrence, Meridee Lowry, Andy Perkins, and Sundari Voruganti

Now you can become a published author, too!

Here’s an opportunity to spotlight your skills, grow your career, and become a published author—all at the same time! Join an ITSO residency project and help write a book in your area of expertise, while honing your experience using leading-edge technologies. Your efforts will help to increase product acceptance and customer satisfaction, as you expand your network of technical contacts and relationships. Residencies run from two to six weeks in length, and you can participate either in person or as a remote resident working from your home base.

Find out more about the residency program, browse the residency index, and apply online at: ibm.com/redbooks/residencies.html

Comments welcome

Your comments are important to us!

We want our books to be as helpful as possible. Send us your comments about this book or other IBM Redbooks publications in one of the following ways:
- Use the online Contact us review Redbooks form found at:
  ibm.com/redbooks
- Send your comments in an email to:
  [email protected]
- Mail your comments to:
  IBM Corporation, International Technical Support Organization
  Dept. HYTD Mail Station P099
  2455 South Road
  Poughkeepsie, NY 12601-5400

Stay connected to IBM Redbooks

- Find us on Facebook:
  http://www.facebook.com/IBMRedbooks
- Follow us on Twitter:
  http://twitter.com/ibmredbooks

- Look for us on LinkedIn:
  http://www.linkedin.com/groups?home=&gid=2130806
- Explore new Redbooks publications, residencies, and workshops with the IBM Redbooks weekly newsletter:
  https://www.redbooks.ibm.com/Redbooks.nsf/subscribe?OpenForm
- Stay current on recent Redbooks publications with RSS Feeds:
  http://www.redbooks.ibm.com/rss.html


Part 1 Overview

The increasing focus on “speed of thought” analytics as a critical success factor for organizations underscores the urgent need for a cost-effective business intelligence framework that is easy to implement, easy to manage, and provides outstanding performance. The IBM DB2 10 for Linux, UNIX, and Windows data warehouse offerings provide such a framework with all of these characteristics and more.

The first phase of developing a data warehouse solution is to understand your requirements and how they can be addressed by the warehouse offering. The chapters in this section provide an overview of data warehouse infrastructure, DB2 Warehouse, and the planning and design process.

Chapter 1, “Gaining business insight with IBM DB2” on page 3 covers the warehouse capabilities that you want when you build a business intelligence (BI) architectural framework to meet the business challenges of today, including the great opportunities around big data.

Chapter 2, “Technical overview of IBM DB2 Warehouse” on page 9 provides an overview of the IBM DB2 for Linux, UNIX, and Windows data warehouse offerings, including solution components, architecture, and product licensing.

Chapter 3, “Warehouse development lifecycle” on page 29 outlines the planning and design process for building your data warehouse, which includes defining business requirements and building a data model.


Chapter 1. Gaining business insight with IBM DB2

With the continued growth and focus on Business Analytics as a critical success factor for any organization, there is a need for a business intelligence (BI) framework that provides ease of adoption, ease of management, and most importantly, optimal performance. This chapter describes the capabilities that are required for building a BI architectural framework to meet the business needs of today.

Building on the BI framework and capabilities that are outlined in InfoSphere Warehouse: A Robust Infrastructure for Business Intelligence, SG24-7813, this book focuses on the new business insight challenges that have arisen in the last few years and new technologies in DB2 10 for Linux, UNIX, and Windows that provide powerful analytic capabilities to meet those challenges.

Several aspects are changing the requirements for BI architectures: the amount of data, the speed with which that data is generated and must be analyzed, and the need for more dynamic analysis of that data. Gone are the days when generating an operational report meant taking a coffee break. Today’s definition of success is speed of thought analytics.

1.1 Current business challenges

To succeed in business today, companies are focusing on their bottom line and turning a profit. In general, businesses are willing to spend money if it makes them money. The challenge is in knowing where to invest, what opportunities to pursue, and how to ensure a high return on investment (ROI).

Making an informed decision comes down to having the correct data and analyzing it effectively, that is, business intelligence and analytics. With good business intelligence, organizations can be efficient, agile, and informed risk takers. Organizations can streamline operations, looking for any areas for cost savings. They can identify and respond to business trends quickly, whether it is for understanding customer behavior or understanding key business metrics to realize new opportunities. Just as importantly, organizations can leverage business intelligence to predict future performance and identify new opportunities for a product or the potential cross-sell or up-sell of existing products.

Historically, Business Analytics was “nice-to-have” for organizations as the analysis of data was a lengthy process and therefore tended to be a historical view of what happened. Now, it is a must-have to differentiate an organization from competitors. Studies show that organizations competing on analytics are 2.2 times more likely to substantially outperform their industry peers.1

To fully leverage this competitive advantage, there is a renewed focus on combining real-time data with historical data to allow executives and front-line employees to make informed decisions more quickly and with a higher degree of confidence. Instead of basing decisions on what was, today’s Business Analytics can provide a picture of what was, what is, and what might be.

In addition to the challenge of analyzing in real time, the amount of data to be analyzed is growing exponentially. The volume and speed with which data is being generated is pushing the bounds of computing capacity and driving business intelligence architectures to handle larger data volume and answer more business questions at a faster rate than ever before. On top of it all, the challenge for business intelligence providers is to meet those requirements all while leveraging the existing investment in the hardware infrastructure that is used to support this data and analysis.

1 Analytics: The new path to value, found at: http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=PM&subtype=XB&appname=GBSE_GB_TI_USEN&htmlfid=GBE03371USEN&attachment=GBE03371USEN.PDF

1.2 Big data and the data warehouse

In the last several years, the newest buzz term in the context of Business Analytics is big data. Big data is a term that is used to refer to the exponential growth of data created in recent years, which is qualified on the four dimensions of volume, velocity, variety, and veracity. Even more than just the volume of information, big data is also about making your business more agile, answering questions that might not have been previously considered, and finding insights in new and emerging types of data and content.

The future of the data warehouse in the context of big data has been questioned, but when you look at IBM Big Data Platform, it is easy to see the important role data warehouses will continue to play in the future. In Figure 1-1, the data warehouse is shown as a key component of the data access and storage layer.

Figure 1-1 IBM Big Data Platform2

2 From http://www.ibm.com/software/data/bigdata

A big data platform must support both traditional analytics (on structured data from traditional sources) and a new style of exploratory analytics on unstructured data. Data warehouses provide support for traditional analytics, running deep analytic queries on huge volumes of structured data with massive parallel processing.

1.2.1 Data warehouse infrastructure

The critical role of the data warehouse is to provide a central repository of structured data, which is created by integrating data from one or more disparate sources. A single version, or view, of the business is implemented as a single, scalable, and consolidated data warehouse. After a single version of the data is consolidated in the warehouse, it is available for analysis by all levels of the business. Additionally, the warehouse contents can be subdivided into data marts that are specific to particular business units, as shown in Figure 1-2.

Figure 1-2 Data warehouse architecture

To support the workloads of today, the data warehouse architecture must optimize both traditional deep analytic queries and shorter transactional type queries. There must be capacity for real-time loading and updating of warehouse data and increased storage capacity to cope with the data growth explosion. Furthermore, a shortened time to build and deploy a warehouse leads to a reduced time to value and overall increased ROI.

1.3 High performance warehouse with DB2 for Linux, UNIX, and Windows

DB2 for Linux, UNIX, and Windows provides all the capabilities that are required to build a high performance warehouse for whatever business challenges exist. The business requirements and data drive the choice of technologies that are outlined in the following chapters.

The components of the DB2 Business Intelligence Solution Framework, which are shown in Figure 1-3, are little changed from what is outlined in Chapter 1, “Gaining business insight with IBM InfoSphere® Warehouse”, in InfoSphere Warehouse: A Robust Infrastructure for Business Intelligence, SG24-7813.

Figure 1-3 DB2 warehouse architecture

The chapters that follow focus on two primary areas: embedded analytics and performance optimization. In embedded analytics, this past year brought the addition of IBM Cognos® Business Intelligence Dynamic Cubes to the architecture. Chapter 11, “Providing the analytics” on page 177 focuses on how to take advantage of the in-memory acceleration that is provided by Dynamic Cubes to provide speed of thought analytics over terabytes of enterprise data.

In performance optimization, there are two main areas of discussion. The first is the introduction of BLU Acceleration to DB2. This new technology, introduced in DB2 10.5 for Linux, UNIX, and Windows, is in-memory optimized columnar processing that optimizes memory, processor, and I/O for analytic queries. The second area of focus is the performance advancements for both scale up and scale out options for standard row-based tables.

Data movement and transformation, modeling and design, and administration and control are reviewed, but primarily from a focus of what is new with DB2 10. For a complete examination of these components, see InfoSphere Warehouse: A Robust Infrastructure for Business Intelligence, SG24-7813.


Chapter 2. Technical overview of IBM DB2 Warehouse

IBM DB2 Advanced Enterprise Server Edition represents the IBM framework for implementing an expert-integrated system for business intelligence (BI). IBM DB2 Advanced Edition integrates tools and a runtime infrastructure for Online Analytical Processing (OLAP) and data mining, and the delivery of embedded inline analytics on a common platform.

IBM DB2 Advanced Edition removes cost and complexity barriers, and enables the delivery of powerful analytics to an enterprise. The integrated tools enable a low total cost of ownership (TCO) and provide improved time-to-value for developing and delivering analytics enterprise-wide.

Furthering the data warehousing proposition, IBM developed and produced the concept of an expert-integrated system. This expert-integrated system transforms your typical static data warehouse from a repository that is primarily used for batch reporting into an active end-to-end solution that uses IBM preferred practices.

This chapter provides an overview of the IBM DB2 data warehouse offerings, explores the solution components and architecture, and reviews product licensing.

Note: Not all of the functions, features, and operators of IBM DB2 Advanced Enterprise Server Edition on the Linux, UNIX, and Windows platforms are the same as on the IBM System z® platform. We point out the differences where appropriate in each chapter.

2.1 DB2 Warehouse solutions

IBM DB2 Advanced Edition contains a set of components that combines the DB2 database engine with powerful Business Analytics tools. It provides a comprehensive Business Analytics platform that enables you to design, develop, and deploy analytical applications in your enterprise.

IBM DB2 Advanced Edition can be used to build a complete data warehousing solution that includes a highly scalable relational database engine, data access capabilities, Business Analytics, and user analysis tools. It integrates core components for data warehouse administration, data mining, OLAP, inline analytics, reporting, and workload management.

2.1.1 The component groups

IBM DB2 Advanced Enterprise Server Edition has a component-based architecture, complete with client and server components. Those components are arranged into three logical component groups, as shown in Figure 2-1 on page 11. These component groups are typically installed on three different computers, but can be installed on one or two computers because they can operate in a multitier configuration.

The component groups are as follows:
- Data server components: This category includes DB2 and Workload Manager.
- Application server components: This category includes IBM WebSphere® Application Server, Administration Console, and the Cubing Services cube server.
- Client components: The client components are Design Studio, IBM Data Server Client, Intelligent Miner® Visualizer, and the Cubing Services client.

In addition to these component products, the IBM DB2 Advanced Enterprise Server Edition documentation and tutorials can also be installed in any of the logical component groups.

Figure 2-1 Logical component groups in IBM DB2 Advanced Enterprise Server Edition

The highly scalable and robust DB2 database servers are the foundation of DB2 Warehouse. All the design and development is performed by using Design Studio, which is an Eclipse-based development environment. The IBM WebSphere Application Server is used as the runtime environment for deployment and management of all DB2 Warehouse based applications.

2.1.2 Editions

IBM DB2 for Linux, UNIX, and Windows 10.5 offers various editions and features that provide customers with choices based on business need. For data warehouse usage, you can choose from Advanced Enterprise Server Edition, Advanced Workgroup Server Edition, and Developer Edition.

Figure 2-2 shows the DB2 10.5 editions.

Figure 2-2 DB2 10.5 editions

IBM DB2 Advanced Enterprise Server Edition for Linux, UNIX, and Windows 10.5

DB2 Advanced Enterprise Server Edition includes the functions and tools that are needed for large, complex enterprise environments, and is ideal for both transactional and data warehouse workloads. The edition combines all the functions from DB2 Enterprise Server Edition with full data compression functionality, including the following features:
- BLU Acceleration
- IBM pureScale clustering
- Workload management
- Distributed partitioning functionality (DPF)
- Federation between DB2 for Linux, UNIX, and Windows databases and databases from other vendors, including Oracle and SQL Server
- Queue-based replication (Q Replication) between three DB2 for Linux, UNIX, and Windows systems
- Change Data Capture (CDC) replication between a DB2 for Linux, UNIX, and Windows source and up to two DB2 for Linux, UNIX, and Windows 10.5 targets for HADR
- IBM solidDB® and solidDB Universal Cache
- Cognos Business Intelligence
- Warehouse model packs
- A rich set of tools:
  – IBM Data Studio
  – IBM InfoSphere Optim™ Performance Manager Extended Edition
  – IBM InfoSphere Optim Query Workload Tuner
  – IBM InfoSphere Optim Configuration Manager
  – IBM InfoSphere Data Architect (ten authorized users)
  – IBM InfoSphere Optim pureQuery® Runtime for Linux, UNIX, and Windows

The six tools that are listed above can be used with any supported version of DB2 Advanced Enterprise Server Edition. Similarly, any of these tools that were included with prior versions of DB2 Advanced Enterprise Server Edition can be used with DB2 Advanced Enterprise Server Edition 10.5, if supported.

DB2 Advanced Enterprise Server Edition can be deployed on Linux, UNIX, or Windows servers of any size, from one processor to hundreds of processors, and on both physical and virtual servers.

DB2 10.5 Advanced Enterprise Server Edition is available on a Processor Value Unit (PVU), per Authorized User Single Install (AUSI), or per Terabyte charge metric. Under the AUSI charge metric, you must acquire a minimum of 25 AUSI licenses per 100 PVUs. Under the per Terabyte charge metric, you must meet the following requirements:
- Use the database for a data warehouse and run a data warehouse workload. A data warehouse is a subject-oriented, integrated, and time-variant collection of data that integrates data from multiple data sources for historical ad hoc data reporting and querying. A data warehouse workload is a workload that typically scans thousands or millions of rows in a single query to support the analysis of data in a data warehouse.
- Have user data from a single database spread across two or more active data partitions, or at least 75% of the user data in BLU Acceleration column-organized tables, with most of the workload accessing the BLU Acceleration column-organized tables (a rough way to estimate this share is sketched after this list).
- Not use the DB2 for Linux, UNIX, and Windows built-in high availability disaster recovery (HADR) or pureScale capabilities.
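
The 75% threshold is defined by the license terms, but a rough, unofficial way to gauge how much of your user data is column-organized is to query the DB2 catalog. The following sketch assumes DB2 10.5, where the TABLEORG column of SYSCAT.TABLES is 'C' for column-organized tables and 'R' for row-organized tables, and it uses the NPAGES statistic as a proxy for data volume, so run RUNSTATS first; the schema filter is only an example.

-- Rough estimate (not a licensing measurement) of the share of user data
-- that is stored in column-organized tables, based on catalog statistics.
SELECT DECIMAL(100.0 * SUM(CASE WHEN TABLEORG = 'C' THEN NPAGES ELSE 0 END)
       / NULLIF(SUM(NPAGES), 0), 5, 2) AS PCT_COLUMN_ORGANIZED
FROM SYSCAT.TABLES
WHERE TYPE = 'T'
  AND TABSCHEMA NOT LIKE 'SYS%'
  AND NPAGES > 0;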

You can upgrade from DB2 Enterprise Server Edition to DB2 Advanced Enterprise Server Edition by using a Processor Value Unit (PVU) based upgrade part number.

IBM DB2 Enterprise Server Edition for Linux, UNIX, and Windows 10.5

DB2 Enterprise Server Edition is designed to meet the data server needs of medium- to large-size businesses and is ideal for transactional and mixed workloads. It can be deployed on Linux, UNIX, or Windows servers of any size, from one processor to hundreds of processors, and on both physical and virtual servers. The edition includes the following features:
- IBM Data Studio
- Label Based Access Control
- Row and Column Access Control
- Multi-Temperature Data Management
- Time Travel Query and Table Partitioning
- HADR
- Online Reorganization and Materialized Query Tables (MQTs)
- Multidimensional Clustering (MDC)
- Query Parallelism
- Scan Sharing
- Connection Concentrator
- IBM pureXML®
- Backup compression
- IBM Tivoli® System Automation for Multiplatforms for use with high availability

The product also includes DB2 Homogeneous Federation and Homogeneous SQL Replication, allowing federated data access and replication between DB2 for Linux, UNIX, and Windows servers, and web services federation.

DB2 10.5 Enterprise Server Edition is available on a PVU or per Authorized User Single Install (AUSI) charge metric. Under the AUSI charge metric, you must acquire a minimum of 25 AUSI licenses per 100 PVUs.

IBM DB2 Advanced Workgroup Server Edition for Linux, UNIX, and Windows 10.5

DB2 Advanced Workgroup Server Edition includes the capabilities that are needed for medium-sized business environments, and is ideal for both transactional and data warehouse workloads. The edition includes the same functions and tools as DB2 Advanced Enterprise Server Edition. The main difference is that DB2 Advanced Workgroup Server Edition has processor core, socket, memory, and terabyte restrictions.

In addition, DB2 Advanced Workgroup Server Edition supports federation only with DB2 for Linux, UNIX, and Windows and IBM Informix® data sources.

DB2 10.5 Advanced Workgroup Server Edition is available on a PVU, per Authorized User Single Install (AUSI), or per Terabyte charge metric. Under the AUSI charge metric, you must acquire a minimum of 25 AUSI licenses per 100 PVUs. Under the PVU and AUSI charge metrics, you are restricted to 16 processor cores and 128 GB of instance memory. Under the per Terabyte charge metric, you are restricted to four processor sockets. These restrictions are per physical or, where partitioned, virtual server, except in a pureScale or DPF cluster, where the restrictions apply to the entire cluster. Under all charge metrics, you are restricted to 15 TB of user data per database. Under the Terabyte charge metric, you must also meet the following requirements:
- Use the database for a data warehouse and run a data warehouse workload. A data warehouse is a subject-oriented, integrated, and time-variant collection of data that integrates data from multiple data sources for historical ad hoc data reporting and querying. A data warehouse workload is a workload that typically scans thousands or millions of rows in a single query to support the analysis of data in a data warehouse.
- Have user data from a single database spread across two or more active data partitions, or at least 75% of the user data in BLU Acceleration column-organized tables, with most of the workload accessing the BLU Acceleration column-organized tables.
- Not use DB2 for Linux, UNIX, and Windows built-in HADR or pureScale capabilities.

IBM DB2 Workgroup Server Edition for Linux, UNIX, and Windows 10.5

DB2 Workgroup Server Edition is suitable for transactional database workloads in a departmental, workgroup, or medium-sized business environment. The edition shares functions with DB2 Enterprise Server Edition, but has the processor core and memory restrictions that are described below. DB2 Workgroup Server Edition can be deployed in Linux, UNIX, and Windows server environments. If you use multiple virtual servers on a physical server, there is no limit on the cores or memory that are available to the physical server, provided that the virtual servers running DB2 observe the processor and memory restrictions. This makes DB2 Workgroup Server Edition ideal for consolidating multiple workloads onto a large physical server that runs DB2 Workgroup Server Edition in multiple virtual servers.

DB2 10.5 Workgroup Server Edition is available on a per Limited Use Socket, PVU, or Authorized User Single Install (AUSI) charge metric. Under the AUSI metric, you must acquire a minimum of five AUSI licenses per installation. Under all metrics, you are restricted to 16 processor cores, 128 GB of instance memory, and 15 TB of user data per database. In addition, under the Socket metric, you are also restricted to no more than four processor sockets. These restrictions are per physical or, where partitioned, virtual server.

IBM DB2 Express Server Edition for Linux, UNIX, and Windows 10.5

DB2 Express Server Edition is a full-function transactional data server, which provides attractive entry-level pricing for the small and medium business (SMB) market. It comes with simplified packaging and is easy to transparently install within an application. With DB2 Express Server Edition, it is easy to upgrade later to the other editions of DB2 10.5 because DB2 Express Server Edition includes most of the same features, including security and HADR, as the more scalable editions. DB2 Express Server Edition can be deployed in x64 server environments and is restricted to eight processor cores and 64 GB of memory per physical or, where partitioned, virtual server. If you use multiple virtual servers on a physical server, there is no limit on the cores or memory that are available to the physical server, provided that the virtual servers running DB2 observe the processor and memory restrictions. This makes DB2 Express Server Edition ideal for consolidating multiple workloads onto a large physical server that runs DB2 Express Server Edition in multiple virtual servers.

DB2 10.5 Express Server Edition is available on a per Authorized User Single Install (AUSI), PVU, Limited Use Virtual Server, or 12-month Fixed Term licensing model. If licensed under the AUSI metric, you must acquire a minimum of five AUSI licenses per installation. DB2 Express Server Edition can also be licensed on a yearly Fixed Term License pricing model. Under all charge metrics, you are restricted to 15 TB of user data per database.

IBM DB2 Express-C Edition for Linux and Windows 10.5

DB2 10.5 Express-C is a no-charge, entry-level edition of the DB2 data server for the developer and partner community. It is designed to be up and running in minutes, and easy to use and embed. It includes self-management features, and embodies all of the core capabilities of DB2 for Linux, UNIX, and Windows, such as Time Travel Query. One notable difference between DB2 Express-C and the production editions of DB2 for Linux, UNIX, and Windows is that you cannot cluster together servers for the purposes of high availability. Solutions that are developed by using DB2 Express-C can be seamlessly deployed by using more scalable DB2 editions without modifications to the application code.

DB2 10.5 Express-C can be used for development and deployment at no charge. It can be installed on x64-based physical or virtual systems and may use up to a maximum of two processor cores and 16 GB of memory with no more than 15 TB of user data per database. DB2 Express-C comes with online community-based assistance. Users requiring more formal support, access to fix packs, or more capabilities such as high availability, Homogeneous Federation, and replication, can purchase an optional yearly subscription for DB2 Express Server Edition (Fixed Term License) or upgrade to other DB2 editions. In addition, IBM Data Studio is also available to facilitate solutions deployment and management.

IBM DB2 Developer Edition for Linux, UNIX, and Windows 10.5

IBM DB2 Developer Edition offers a package for a single application developer to design, build, test, and prototype applications for deployment on any of the IBM DB2 client or server platforms. This comprehensive developer offering includes DB2 Workgroup Server Edition, DB2 Advanced Workgroup Server Edition, DB2 Enterprise Server Edition, DB2 Advanced Enterprise Server Edition, IBM DB2 Connect™ Enterprise Edition, and all of the built-in DB2 10.5 capabilities, allowing you to build solutions that use the latest data server technologies.

The software in this package cannot be used for production systems. You must acquire a separate Authorized User license for each unique person who is given access to the program.

2.2 Expert systems

The time has come for a new breed of systems. Expert-integrated systems are systems with integrated expertise that combine the flexibility of a general-purpose system, the elasticity of cloud, and the simplicity of an appliance that is tuned to the workload. IBM PureData systems can help IBM clients achieve greater simplicity, speed, and lower cost in their IT environments. Expert-integrated systems reduce the time, cost, and risk of custom designed systems by fully integrating, tuning, and providing a management and maintenance framework ready to use. Expert-integrated systems fundamentally change both the experience and economics of IT.

Expert-integrated systems are more than a static stack of self-tuned components: a server here, some database software there, serving a fixed application at the top. Instead, these systems have three truly unique attributes:
- Built-in expertise: Think of expert integrated systems as representing the collective knowledge of thousands of deployments, established preferred practices, innovative thinking, IT industry leadership, and the distilled expertise of business partners and solution providers, captured into the system in a deployable form from the base system infrastructure through the application.
- Integrated by design: All the hardware and software components are deeply integrated and tuned in the lab and packaged in the factory into a single ready-to-go system that is optimized for the workloads it is running. All of the integration is done for you, by experts.
- Simpler experience: The entire experience is much simpler, from the moment you start designing what you need to the time you purchase, to setting up the system, and to operating, maintaining, and upgrading it over time. Management of the entire system of physical and virtual resources is integrated.

IBM offers a PureData system for both transactional and analytics workloads. IBM PureData System for Transactions is a highly reliable and scalable database platform that helps reduce complexity, accelerate time to value, and lower ongoing data management costs. The system enables IT departments to easily deploy, optimize, and manage transactional database workloads. IBM PureData System for Operational Analytics is an expert-integrated data system that is designed and optimized specifically for the demands of an operational analytics workload. The system is a complete solution for operational analytics that is ready for immediate use and provides both the simplicity of an appliance and the flexibility of a custom solution. Designed to handle 1000+ concurrent operational queries, it delivers mission-critical reliability, scalability, and outstanding performance. These expert-integrated systems set a new standard in workload-optimized systems.

2.2.1 PureData for operational analytics

IBM is making a strategic investment in developing IBM PureSystems™ to deliver innovations that do the following tasks:
- Simplify system deployment, management, and maintenance.
- Deliver greater system performance, scalability, and efficiency through deep integration.
- Accelerate the benefits of cloud computing.
- Integrate and automate preferred practices through patterns of expertise that are delivered by IBM and solution partners, or captured by clients themselves.

This is why IBM introduced the PureData System. This system is optimized exclusively for delivering data services for transactional applications and for analytics. It provides an integrated data platform: database and data warehouse management, integrated system management, and integrated compute, storage, and networking resources. More than just a bundle of components, the PureData system also builds expertise into models that optimize performance and simplicity for specific types of data workloads.

The PureData system also simplifies the entire lifecycle. It is factory integrated to be data load ready in hours, and offers integrated management and support.

Having an integrated system enables delivery of integrated and automated maintenance, which eliminates manual steps that cost time and increase the risk of human error.

2.2.2 DB2 Warehouse components

The components of DB2 Warehouse provide an integrated platform for warehouse administration and for the development of warehouse-based analytics. The IBM DB2 Warehouse components include the following ones:
- DB2 Warehouse Design Studio
- DB2 Warehouse Administration Console
- Common Configuration
- SQL Warehousing
- Cubing Services
- Mining
- Cognos Business Intelligence for Reporting
- IBM Optim Performance Manager Extended Edition
- Text Analysis

The categories and components of the architecture are shown in Figure 2-3, and described in the remainder of this section.

Figure 2-3 IBM DB2 Advanced Enterprise Server Edition functional component architecture

DB2 Warehouse Design Studio

The Design Studio provides a common design environment for creating physical data models, OLAP cubes, SQL data flows, and control flows. The Design Studio is built on the Eclipse Workbench, which is a development environment that you can customize.

DB2 Warehouse Administration Console

The DB2 Warehouse Administration Console is a web application from which you can deploy and manage applications, control flows, database resources, and system resources. You can do the following types of tasks in the Administration Console:
- Common configuration
- SQL warehousing
- Cubing Services
- Mining

Common configuration

This component allows you to create and manage database and system resources, including driver definitions, log files, and notifications.

SQL warehousing

This component allows you to run and monitor data warehousing applications, view deployment histories, and run statistics.

Cubing Services

Cubing Services works with business intelligence tools to provide OLAP access to data directly from DB2 Warehouse. Cubing Services includes tools for multidimensional modeling to design OLAP metadata (cubes), an optimization advisor for recommending materialized query tables (MQTs) in DB2, and a cube server for providing multidimensional access to the data. Each of these Cubing Services components is integrated with the DB2 Warehouse user interfaces for design (Design Studio) and administration and maintenance (Administration Console).

The Cubing Services cube server processes multidimensional queries that are expressed in MDX and produces multidimensional results. The cube server fetches data from DB2 through SQL queries as needed to respond to the MDX queries. The MQTs that are recommended by the optimization advisor are used by the DB2 optimizer, which rewrites incoming SQL queries and routes eligible queries to the appropriate MQT for faster query performance. In addition to these performance enhancements, the cube server includes multiple caching layers for further optimizing the performance of MDX queries.
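
For illustration, the following is a minimal sketch of the kind of summary table (MQT) that the optimization advisor might recommend. The table and column names (a SALES fact table joined to a PRODUCT dimension in a MART schema) are hypothetical and are not taken from this book; an actual recommendation depends on your cube model and workload.

CREATE TABLE mart.sales_by_product_month AS
  (SELECT p.product_line,
          s.sale_month,
          SUM(s.revenue) AS total_revenue,
          COUNT(*) AS sale_count
   FROM mart.sales s
   JOIN mart.product p ON p.product_id = s.product_id
   GROUP BY p.product_line, s.sale_month)
  DATA INITIALLY DEFERRED REFRESH DEFERRED
  MAINTAINED BY SYSTEM;

-- Populate the MQT so that the optimizer can route eligible queries to it.
-- For REFRESH DEFERRED MQTs, the CURRENT REFRESH AGE special register must
-- allow their use (for example, SET CURRENT REFRESH AGE ANY).
REFRESH TABLE mart.sales_by_product_month;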

DB2 Warehouse provides a Cubing Services Client ODBO Provider to allow access to cube data in Microsoft Excel.

Mining

The mining component allows you to run and monitor data mining applications. DB2 Warehouse includes the following mining features:
- Intelligent Miner Easy Mining
- Intelligent Miner Modeling
- Intelligent Miner Scoring
- Intelligent Miner Visualization
- Text Analysis

These features provide rapid enablement of data mining analysis in Data Warehousing, e-commerce, or traditional Online Transaction Processing (OLTP) application programs. Although previously available as separately priced products, they are now available as an integrated part of DB2 Warehouse.

Cognos 10 Business Intelligence for Reporting
IBM Cognos 10 Business Intelligence for Reporting lets organizations implement a rich set of business intelligence capabilities. It is an ideal solution for companies that want to consolidate data marts, information silos, and Business Analytics to deliver a single version of the truth to all users. It allows easy creation of reports and quick analysis of data from the data warehouse.

IBM Optim Performance Manager Extended Edition
IBM Optim Performance Manager Extended Edition is a follow-on to DB2 Performance Expert. Optim Performance Manager Extended Edition helps optimize the performance and availability of mission-critical databases and applications.

Text Analysis
With DB2 Warehouse, you can create business insight from unstructured information. You can extract information from text columns in your data warehouse and then use the extracted information in reports, multidimensional analysis, or as input for data mining.

2.2.3 Database management system

The database management system is the foundation for the DB2 Warehouse. DB2 offers industry-leading performance, scalability, and reliability on your choice of platform, from Linux to IBM z/OS®. It is optimized to deliver high performance across multiple workloads while lowering administration, storage, development, and server costs.

Data movement and transformation
One of the key operating principles of the DB2 Warehouse is to maintain simplicity. More importantly, the principle is to have integrated tools to design, develop, and deploy data warehouse applications. Use the DB2 Warehouse Design Studio to simplify data transformation by designing and managing SQL warehousing applications. With DB2 Warehouse, the data movement and transformation for intra-warehouse data movement is accomplished by using the SQL Warehousing tool (SQW).

The SQW tool is part of the Eclipse-based Design Studio. SQW enables DBAs to define data transformation and data movement tasks by using a simple and intuitive graphical interface. The SQW tool generates DB2-specific SQL based on visual operator flows that you model in the Warehouse Design Studio.

Work flow for building SQL Warehousing applications
The work flow for developing a data warehousing application involves two main types of users: architects and administrators. The architects use the Design Studio to design data flows and control flows and prepare the application for deployment. The administrators deploy and manage the application in the Administration Console.

Designing data warehousing applications
You design data warehousing applications with the SQW tool in the Design Studio. Data warehousing applications consist of extract, load, and transform (ELT) operations that are run inside a DB2 database.

Administering data warehouse applications
Use the Administration Console to administer the control flows in data warehouse applications.

Performance optimization
The DB2 Warehouse has several features that can be used to enhance the performance of analytical applications by using data from the data warehouse.

You can use the db2batch Benchmark Tool in DB2 for Linux, UNIX, and Windows to compare query performance before and after you create the recommended summary tables.
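For illustration, a minimal db2batch invocation might look like the following command, assuming a database named MYDB and a file named queries.sql that contains the queries to be timed (both names are placeholders):

db2batch -d mydb -f queries.sql -r results.out

The -r option writes the per-query and summary timings to results.out, so you can keep one result file from before the summary tables are created and one from after, and compare them.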

In a data warehouse environment, where a huge amount of data exists, creating cube materialized query tables (MQTs) and multidimensional clustering (MDC) tables can increase query performance.

For more information about these features, see the following websites:
- For DB2 for Linux, UNIX, and Windows: http://www-01.ibm.com/software/data/db2/linux-unix-windows/
- For System z: http://www-01.ibm.com/software/data/db2/zos/family/

Physical data modeling
You can create a physical data model in the Data Project Explorer so that you can manage OLAP metadata. Application developers and data warehouse architects use this component to work with physical data models for source and target databases and staging tables.

A physical data model is essentially the metadata representing the physical characteristics of a database. The data model can be developed by using the data model graphical editor or can be reverse-engineered from an existing database. From the physical data model, you can perform the following tasks:
- Create the physical objects in a database.
- Compare a data model to the database (and generate only the delta changes).
- Analyze the model for errors, such as naming violations.
- Perform impact analysis.

The physical model is also used to provide metadata information to the other functional components of your warehouse, such as the SQW tool. Figure 2-4 on page 25 shows a physical data model that is open in the Data Project Explorer, the graphical editor view of the physical data model, and the Properties view of a selected table. It is not necessary to read or understand the data model; the figure is for illustration purposes only.


Figure 2-4 Warehouse physical data model

2.3 DB2 Warehouse topology

DB2 Warehouse components are grouped into four categories for installation:
- Clients: The client category includes the InfoSphere Warehouse administration client, InfoSphere Warehouse Design Studio, the workload manager, and the data mining visualization programs. The clients are supported on Windows 32-bit and Linux systems.
- Database servers: This category includes the DB2 database server with DPF support plus the database functions to support Intelligent Miner, Cubing Services, and the workload manager. The database server is supported on IBM AIX®, various Linux platforms, Windows Server 2003, and System z.
- Application servers: The application server category includes the WebSphere Application Server, Cognos Business Intelligence server components, and InfoSphere Warehouse Administration Console server components.
- Documentation: This category includes the PDF and online versions of the manuals and can be installed with any of the other categories.

For more information about DB2 Warehouse installation, see the DB2 information center that is found at the following website:
http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.qb.server.doc/doc/r0025127.html

The DB2 Warehouse components can be installed on multiple machines in a number of topologies. Three common topologies are shown in Figure 2-5.

Figure 2-5 Common InfoSphere Warehouse installation topologies

- One tier: In this topology, the InfoSphere Warehouse Client, InfoSphere Warehouse Database Server, and the InfoSphere Warehouse Application Server are all on one system. This topology is used only for development and testing purposes and only on a Windows platform.
- Two tier: In this topology, the InfoSphere Warehouse Database Server and the InfoSphere Warehouse Application Server are on one system with InfoSphere Warehouse Clients on separate systems. This topology can suffice for a test system or for smaller installations. The database and applications can be on any of the supported Windows, Linux, or AIX platforms.
- Three tier: In any large installation, the InfoSphere Warehouse Database Server, the InfoSphere Warehouse Application Server, and the InfoSphere Warehouse Clients should all be installed on separate servers. A DB2 client, at a minimum, is required to connect to database servers. It is a preferred practice that a DB2 server be installed for local access to the runtime metadata databases. The application server is supported on AIX, Linux, and Windows Server 2003.


Chapter 3. Warehouse development lifecycle

This chapter describes the planning and design process of building the data warehouse. The InfoSphere Warehouse Design Studio (Design Studio) provides a platform and a set of integrated tools for developing your data warehouse. You can use these tools to build, populate, and maintain tables and other structures for data mining and Online Analytical Processing (OLAP) analysis.

Design Studio includes the following tools and features:
- Integrated physical data modeling, which is based on InfoSphere Data Architect
- SQL Warehousing Tool (SQW) for data flow and control flow design
- Data mining, exploration, and visualization tools
- Tools for designing OLAP metadata, MQTs, and cube models
- Integration points with IBM InfoSphere DataStage® ETL systems

By integrating these tools, Design Studio offers a fast time-to-value and managed cost for warehouse-based analytics. For an introduction to Design Studio, see Chapter 3, “InfoSphere Warehouse Design Studio”, in InfoSphere Warehouse: A Robust Infrastructure for Business Intelligence, SG24-7813.

3.1 Defining business requirements

The first step in the lifecycle of the data warehouse is to define the business requirements. Without understanding the needs of the business, it is impossible to make the correct decisions at the start of the lifecycle to direct the design and implementation of the data warehouse. The data modeler must fully understand the needs of the business user, service level agreements, and data behavior to design an optimal data model and choose the best technology fit for the data warehouse.

To define business requirements, you must answer the following questions:
- What is the expected query (or report) performance?
- How many users are going to be using the system at the same time? How many types of activities will those users drive?
- What is the expected data availability and recoverability?
- How is the data to be loaded into and pruned from the warehouse, and with what frequency?
- What initial volume of data is expected? How quickly will data volumes grow? Can the hardware and storage systems support that data volume and growth rate?

Many of these questions might require iterative discussions with all the business stakeholders. Taking the time at the start of the design process ensures that the data modeler and database administrator are equipped with enough knowledge about the expectations for the data warehouse to make effective decisions to optimize the performance and maintenance of the system.

3.2 Building the data model

The data infrastructure for any business is built around data models. There are three types of data models, each building on the previous type:
- Conceptual data model: This is a high-level business view of the data. It identifies the data entities that are used in the business and their relationship to each other, which aids in the understanding and development of the logical and physical data models.
- Logical data model: This is where the data model is defined and where the data entities are identified in detail along with any existing relationships between them. These are normalized entities that are described by specific attributes and data types, and candidate keys that are used for direct access to data. It is the logical data model that determines the degree to which the data is normalized and whether, for a given set of related data in the warehouse, the model is implemented as a star or snowflake schema.
- Physical data model: This is the physical implementation of the logical model in a specific database management system. It identifies the implementation details in terms of such things as the configuration, keys, indexes selected, and constraints, and the means by which it is physically stored in the selected database structures. It is a translation from the logical model to the detailed implementation in the data warehouse. This is a major step in the lifecycle of the data warehouse. To help with this process, the InfoSphere Warehouse Design Studio includes a subset of the functionality that is provided in the IBM Rational® Data Architect product.

3.2.1 Defining the physical data model

Physical data models define the internal schema of the data sources in your warehouse environment. Data models outline data tables, the columns in those tables, and the relationships between tables. Design Studio includes the components that are needed to create a physical model and generate the SQL that is appropriate for your implementation target.

The physical models that you create in Design Studio can be used to create schemas or to update schemas that exist in the target database.

Using Design Studio, in your physical data models, you can define the following data elements for the target database:
- Databases (one per data model)
- Tables
- Views
- Primary keys and foreign keys
- Indexes
- Stored procedures and functions
- DB2 table spaces and buffer pools
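The DDL that Design Studio ultimately generates for these elements is standard DB2 DDL. The following sketch, with hypothetical object names, shows the kinds of statements that correspond to several of the elements in the preceding list (a buffer pool, a table space, a table, and an index):

CREATE BUFFERPOOL bp_dwh SIZE AUTOMATIC PAGESIZE 32K;
CREATE TABLESPACE ts_dwh PAGESIZE 32K MANAGED BY AUTOMATIC STORAGE BUFFERPOOL bp_dwh;
CREATE TABLE store_sales (
  store_id  INTEGER       NOT NULL,
  sale_date DATE          NOT NULL,
  amount    DECIMAL(12,2) NOT NULL
) IN ts_dwh;
CREATE INDEX ix_store_sales_store ON store_sales (store_id);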

The physical data models that you create and update in Design Studio are implemented as entity relationship diagrams. The models are visually represented by using either Information Engineering (IE) notation or Unified Modeling Language (UML) notation. Before you begin your modeling work, configure Design Studio to use the notation that you prefer. Configure Design Studio by clicking Data → Diagram, which takes you to the configuration window where you can select the notation that you want.

3.2.2 Creating the physical data model

There are several different ways to create physical data models with Design Studio. For example, models can be created as follows:
- From an empty template
- From a database or DDL using the Physical Model wizard
- Dragging and dropping from the Database Explorer view to the project in the Data Project Explorer view

Creating the physical data model from a template
You can create a physical model from the empty template that is provided with Design Studio. To create a model this way, complete the following steps:
1. Highlight the Data Models folder in the project, right-click, and select New → Physical Data Model. This action starts the New Physical Data Model wizard that is shown in Figure 3-1 on page 33. This wizard can also be started from the main Design Studio menu by clicking File → New → Physical Data Model.

Figure 3-1 Physical Data Model wizard

2. Select the Create from template radio button. Design Studio includes a modeling template that is called Empty Physical Model Template, which is shown on the next page of the wizard.
3. Choose the template and click Finish.

From the blank design template, you can use the visual diagram editor with its palette to design the model or you can use the Data Project Explorer to add objects to the model. For more information, see 3.2.4, “Using the Diagram Editor” on page 38.

Creating the physical data model from an existing database
Another approach to creating physical data models is to reverse engineer one from a database connection or database definition. The steps to reverse engineer start out similar to the template approach:
1. Start the New Physical Model wizard by clicking File → New → Physical Data Model from the main menu or by right-clicking either the project or the Data Models folder in the Data Project Explorer view and selecting New → Physical Data Model.
2. Provide details, such as a destination folder, model name, and database type and version.
3. Select the Create from reverse engineering radio button and click Next.
4. The next window provides the choice to reverse engineer from either a database or a DDL script.

5. Enter the requested database connection information or the location of the DDL file (based on the choice you made in step 4 on page 33). You can either use an existing connection or define one. After you enter the information, you are presented with a list of schemas that can be reverse engineered.
6. Select a schema, as shown in Figure 3-2.

Figure 3-2 Selecting schemas

Note: If there is a filter that is defined as part of the database connection, the filter is applied every time the database connection is used. Therefore, this action impacts the list of schemas that you see in the New Physical Data Model wizard. To verify whether filters are in use, review the objects that are associated with the database connection in the Database Explorer view. If the schema folder is labeled Schemas[Filtered], then filters are enabled. To modify or review the filters, select the Schema folder in the Database Explorer, right-click, and select Filter. You can then make any changes as necessary.

As illustrated in Figure 3-3 on page 35, the schema object list at this step in the wizard might also be filtered dynamically. This filter specifies the qualifiers to be included in the list.

Figure 3-3 Filtering schemas

After you select the schemas that you want to use in the model, click Next to be presented with a list of the database elements that you want to include.
7. Select all the database object types of interest and click Next. The final window of the wizard provides options to create an overview diagram and to infer implicit relationships. If you do not select these options, these components can be added to your model later.

Creating the physical data model from the Database Explorer
Another method for creating a physical model from an existing database is to drag the objects from the Database Explorer to the Data Project Explorer. To use this method, complete the following steps:
1. Verify that there is an active connection to the source database in the Database Explorer view.

2. Expand the database connection and select the objects to be reverse engineered. The level of granularity that is available ranges from database to table. Drag the selected objects to the Data Project Explorer and on to an existing project, as shown in Figure 3-4. The resulting physical model name is the name of the database connection. A number is added to the end of the model name for uniqueness, if necessary.

Figure 3-4 Dragging from the Database Explorer to create a physical data model

3.2.3 Working with diagrams

Design Studio uses entity relationship diagrams to provide a visual representation of the physical data models in your projects. Entity relationship diagrams are a useful mechanism for understanding the data elements in your projects and communicating that information to others. In Design Studio, you can have multiple diagrams per schema in the projects. It is often helpful to organize the data models into multiple subject areas, especially when these are large and complex.

In addition to the value that the diagrams bring to the projects by simplifying and improving the understanding of complex data environments, the diagrams can also be used to edit and modify the physical data model.

Creating a diagram
Design Studio provides several ways to add entity relationship diagrams to projects.

If you choose to reverse engineer the data model, you can have the wizard create an overview diagram for you. The wizard provides prompts that allow you to specify what elements you want to include in the diagram.

You can still use diagrams in the data projects, even if you do not create the data models through the reverse engineering approach. New diagrams can be created from any Diagrams folder in the Project Explorer. Select the Diagrams folder, right-click, and select New Overview Diagram. You are prompted to select which elements from the current schema you want to include in the diagram.

You can also create a blank diagram, rather than including existing schema elements. To create a blank diagram, right-click the Diagrams folder in the Project Explorer and select New Blank Diagram.

3.2.4 Using the Diagram Editor

An overall view of the Diagram Editor is shown in Figure 3-5. The two main components of the Diagram Editor are the drawing area, or canvas, and the palette. Elements can be added to the diagram from either the palette or the Data Project Explorer. To use the Data Project Explorer, drag elements from the Data Models folder onto the diagram canvas.

Figure 3-5 Diagram Editor view

The shapes representing tables and elements can be moved to make them more readable by highlighting either a rectangle or a line and dragging it to a new position.

Items from the palette can be placed on the drawing area by clicking an element to select it, moving the mouse to the drawing area, and clicking again. Elements that are added to the diagram in this manner are provided with a default name, but you can change the name to something more meaningful, as shown in Figure 3-6 on page 39. When elements are added to the canvas, as in Figure 3-6 on page 39, they are also added to the Data Project Explorer.

Figure 3-6 Adding palette elements to the diagram

When the diagram contains tables, columns can be added directly from the visual diagram. When you select a model element, action bars appear, providing context-aware dialog boxes. The options that are available through the action bar include the ability to add a key, column, index, or trigger. The palette can be used to establish identifying and non-identifying relationships between the various tables that make up each diagram. As you are using this approach, you can also use the Properties view to further define the objects that you are creating.

3.2.5 Optimal attributes of the physical data model

As you build your physical data model, there are a number of considerations. There are tools to help confirm that the physical model is valid and to improve the quality of the model, as outlined in 3.2.6, “Model analysis” on page 40. However, the better the model is to start with, the faster the model analysis phase is. The remainder of this section outlines some of the optimal attributes of a physical data model.

Ensure that standard data types are used across the data warehouse to facilitate joins between tables and the logical model definition in business intelligence tools. Moreover, use a DATE data type for date columns wherever possible.

For a given table definition, define columns as NOT NULL if they should have data. Use INTEGER data type for primary or foreign key columns, where possible. Define an index on foreign key columns; this index can be part of a unique primary key. For dimension tables, define a primary key to ensure uniqueness (if that cannot be ensured by the ETL processing).

To properly represent the relationship between tables in the data warehouse, use informational referential constraints between a dimension pair. For example, define a foreign key between a column in the fact table and the primary key column of a dimension table.
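The following DDL is a sketch of these guidelines, using hypothetical table and column names: NOT NULL integer keys, an index on the fact table foreign key column, and an informational (not enforced) referential constraint between the fact table and a dimension table.

CREATE TABLE dim_store (
  store_id   INTEGER     NOT NULL PRIMARY KEY,
  store_name VARCHAR(60) NOT NULL
);

CREATE TABLE fact_sales (
  store_id  INTEGER       NOT NULL,
  sale_date DATE          NOT NULL,
  amount    DECIMAL(12,2) NOT NULL
);

CREATE INDEX ix_fact_sales_store ON fact_sales (store_id);

-- Informational referential constraint: not enforced, but usable by the optimizer
ALTER TABLE fact_sales
  ADD CONSTRAINT fk_sales_store FOREIGN KEY (store_id)
  REFERENCES dim_store (store_id)
  NOT ENFORCED ENABLE QUERY OPTIMIZATION;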

For more information, see the DB2 for Linux, UNIX, and Windows preferred practices paper Physical database design for data warehouse environments, found at: https://ibm.biz/Bdx2nr

3.2.6 Model analysis

Design Studio includes a rule-based model analysis tool that allows you to evaluate the adherence of the data model to design and optimization preferred practices. The model analysis utility provides a mechanism to improve the integrity and quality of your physical models, and helps ensure that the physical model is valid. Validate physical data models before they are deployed to ensure that there are no problems in the design.

Model analysis can be performed against databases or schemas. To perform model analysis, complete the following steps:
1. Start the wizard.
2. Select the database or schema you are interested in analyzing, right-click, and choose Analyze Model.

Models can be analyzed against the categories that are shown in Figure 3-7 on page 41. In each of these categories, there are several rules that you can choose to include in the analysis, such as key constraints, object naming standards, and several rules that are related to OLAP preferred practices.

Figure 3-7 Model Analysis

3. Select the set of rules that are of interest and click Finish to start the analysis.

The results of the Model Analysis utility are shown in the Problems view. An example of the output is shown in Figure 3-8. Double-click an item in the Problems view to navigate to the relevant model objects in the Project Explorer. Take corrective action and rerun the model analysis to verify that all problems were corrected.

Figure 3-8 Model analysis output

Correct any issues that the model analysis identifies until all errors are cleared. After the model is validated, you can deploy it.

3.2.7 Deploying the data model

After the physical model is designed and validated, you can deploy the model into your environment. The deployment process results in the schemas, tables, and other objects that were modeled being created in the target database. After the physical model is deployed, you can begin populating the structures with data.

Using Design Studio to deploy a physical data model
Physical data models can be deployed from Design Studio through the Generate DDL wizard, which generates context-driven DDL. When you generate the DDL code, you can choose a database, schema, user, table, buffer pool, or table space as the root for the code generation. To start the DDL wizard, select the relevant data element, right-click, and select Generate DDL. Alternatively, to start the wizard from the menu, with the data element selected, click Data → Generate DDL. Respond to the prompts of the wizard, indicating what model elements and objects you want to include in the DDL script.

For an example of the Generate DDL wizard, see Figure 3-9.

Figure 3-9 Generating a DDL

After you select the objects that you want to deploy, you are presented with a summary of the DDL script, with options to save the DDL file or run the DDL script on the server. These options are shown in Figure 3-10. If you choose to run the DDL on the server, you are prompted to select a database connection or create a database connection. The DDL script is saved in the SQL Scripts logical folder in your data design project.

Figure 3-10 Options for saving and running a DDL

The DDL scripts that are in the SQL Scripts folder can be saved for later execution, and they can also be modified before being run. To edit the DDL, select the file in the SQL Scripts folder, right-click, and select Open With → SQL Editor. This causes the file to be opened within a text editor. When you are ready to run the DDL, select the file, right-click, and select Run SQL. You can review the results in the Data Output view, which includes status information, messages, parameters, and results.
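A saved script can also be run outside Design Studio from the DB2 command line processor. A minimal sketch, assuming the generated script was saved as warehouse_model.ddl and uses a semicolon as the statement terminator, and that the target database is named MYDB:

db2 connect to mydb
db2 -tvf warehouse_model.ddl

The -t option tells the command line processor to treat the semicolon as the statement terminator, -v echoes each statement, and -f reads the statements from the named file.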

3.2.8 Maintaining model accuracy

It is important to manage changes to the models that might happen over time. As business requirements change, there is often an adjustment that must be made to the physical data models. Design Studio can help you manage that type of change. It is helpful to compare the differences between two physical data models, and also compare the difference between a physical data model and the database where it was deployed. After changes are identified, it is important to synchronize the two models being compared. It is also important to analyze the impact of a change to the model before the change is implemented.

Comparing objects in the physical data model
Design Studio provides object-based compare and synchronization capabilities as a mechanism to assist and simplify the task of maintaining model accuracy. From the Database Explorer, you can select a database object, right-click, and select Compare With → Another Data Object. You are prompted to select the object to use for comparison. Another way to achieve the same comparison is to highlight two data objects, right-click, and select Compare With → Each Other. In the Project Explorer, there is a third comparison available if the source of the model was created by reverse engineering. If that is the case, select an object, right-click, and select Compare → Original Source. This option is only available from the Project Explorer, not the Database Explorer.

These options under the Compare With tool are helpful if you must compare physical objects with each other (for example, to compare objects from a test database to objects in a production database). You can compare the baseline database objects from the Database Explorer with changed objects from your project before deploying changes.

Visualizing differences between objects
The results of the object comparison are displayed in two views, which are connected. The upper view is called Structural Compare. The Structural Compare shows the differences in a tree format. The first column is a tree. Each entry in the tree represents a part of the model where a difference was found. The second column in the Structural Compare output shows the first object as input to the Compare tool, and the third column shows the second object to which it was being compared. A copy of the differences can be saved by clicking Export in the upper right portion of the Structural Compare view. The file that results from this option is an XML file.

If two different items that appear on different rows in the comparison output must be compared to each other, you can use the Pair and Separate buttons to facilitate their comparison and align the comparison results. This is helpful when the column names are different, but the objects represent the same data.

As you work down the tree in the Structural Compare, differences are highlighted in the lower view, which is called the Property Compare. This shows the Properties view. An example of the output from the Compare Editor is shown in Figure 3-11.

Figure 3-11 The Compare Editor

Synchronization of differences
As you view the differences between the objects you have compared, you can use the Copy From buttons, which are on the right side of the toolbar that separates the Structural Compare and Property Compare views. These buttons allow you to implement changes from one model or object to another. You can copy changes from left to right, or from right to left. As you use the Copy From buttons, the information that is displayed in the Structural Compare is updated to reflect the change.

Another way to implement changes to bring your models in synchronization with each other is to edit the values that are displayed in the Property Compare view. Any changes that are made can be undone or redone by clicking Edit → Undo and Edit → Redo.

After you have reviewed all of the differences and decided what must be done to bring your models in sync with each other, you are ready to generate delta DDL. This DDL can be saved for later execution or run directly from Design Studio.

Impact analysis
Design Studio also provides a mechanism for performing impact analysis. It is beneficial to understand the implications of changes to models before they are implemented. The impact analysis utility shows all of the dependencies for the selected object. The results are visually displayed, with a dependency diagram and a Model Report view added to the Output pane of Design Studio.

The impact analysis discovery can be run selectively. To start the utility, highlight an object, right-click, and select Analyze Impact. Select the appropriate options, as shown in Figure 3-12.

Figure 3-12 Impact Analysis Options

The results of the impact analysis are shown in a dependency diagram, as shown in Figure 3-13.

Figure 3-13 Impact Analysis Report

3.3 Matching the technology to the requirements

The next phase of developing the data warehouse is choosing the optimal technology for the business requirements and data model. The subsequent chapters of this book describe the strengths of each technology that is available in DB2 10 for Linux, UNIX, and Windows. The database software contains multiple technologies, each optimized for a particular type of workload, that can be used in concert together to match the best technology to a given workload.

It is a best-of-both-worlds architecture that can both handle embedded analytics of structured data and generate and leverage knowledge from unstructured information. It is the business requirements that drive the decision to scale up within a single system (using either BLU Acceleration or standard intrapartition parallelism) versus scale out with a shared-nothing architecture across multiple logical or physical nodes.


Part 2 Technologies

The next phase of developing the data warehouse is choosing the optimal technology for the business requirements and data model. The chapters in this section describe the strengths of the major technologies that are available in DB2 10 for Linux, UNIX, and Windows that you can leverage to optimize the value and performance of your data warehouse.

Chapter 4, “Column-organized data store with BLU Acceleration” on page 53 shows how this remarkable innovation provides order-of-magnitude improvements in analytic workload performance, significant storage savings, and reduced time to value.

Traditional row-based DB2 warehouse solutions are the subject of Chapter 5, “Row-based data store” on page 71, which covers database partitioning, intrapartition parallelism, and other technologies that you can use to build a high-performance data warehouse environment, including compression, multidimensional clustering, and range (table) partitioning.

Chapter 6, “Data movement and transformation” on page 91 gives you an overview of some of the important DB2 data movement utilities and tools, and describes the data movement and transformation capabilities in the DB2 Warehouse SQL Warehousing Tool (SQW).

Database monitoring, which includes all of the processes and tools that you use to track the operational status of your database and the overall health of your database management system, is the subject of Chapter 7, “Monitoring” on page 105, which provides an overview of the DB2 monitoring interfaces and IBM InfoSphere Optim Performance Manager Version 5.3.

Chapter 8, “High availability” on page 119 describes the high availability characteristics of IBM PureData System for Operational Analytics, which provides redundant hardware components and availability automation to minimize or eliminate unplanned downtime.

DB2 workload management is the focus of Chapter 9, “Workload management” on page 145, which gives you an overview of this important technology whose purpose is to ensure that limited system resources are prioritized according to the needs of your business. In a data warehouse environment, business priorities and workload characteristics are continually changing, and resource management must keep pace.

The data mining process and unstructured text analytics in the DB2 Warehouse are covered in Chapter 10, “Mining and unstructured text analytics” on page 163, which describes embedded data mining for the deployment of mining-based business intelligence (BI) solutions.

Chapter 11, “Providing the analytics” on page 177 focuses on the relational OLAP capabilities of the Dynamic Cubes feature of IBM Cognos Business Intelligence V10.2, which enables high-speed query performance, interactive analysis, and reporting over terabytes of data for many users.


Chapter 4. Column-organized data store with BLU Acceleration

This chapter introduces BLU Acceleration, which provides order-of-magnitude benefits in analytic workload performance, storage savings, and time to value.

BLU Acceleration is a new technology in DB2 for analytic queries. This set of technologies encompasses CPU-, memory-, and I/O-optimization with unique runtime handling and memory management and unique encoding for speed and compression.

4.1 Simple and fast with BLU Acceleration

BLU Acceleration is a combination of complementary innovations from IBM that simplifies and speeds up analytic workloads. More importantly, BLU Acceleration is easy to set up and is self-optimizing, which means a lot less work for the DBA to deploy and manage the solution. BLU Acceleration is introduced in DB2 for Linux, UNIX, and Windows 10.5 (DB2 10.5).

BLU Acceleration is a fully integrated feature of DB2 and does not require any SQL or schema changes to implement. In fact, BLU tables can coexist with traditional row tables using the same table spaces and buffer pools. With no need for database tuning, indexes, or aggregates, the DBA can load the data and go.

To understand the benefits of BLU Acceleration, it is important to review its complementary technologies. Dynamic in-memory columnar technologies provide an efficient method to scan and find relevant data. Those technologies are then combined with innovations such as parallel vector processing and actionable compression to make analytic queries far faster.

4.1.1 Dynamic in-memory columnar technology

Database tables that use BLU Acceleration are stored differently than the traditional row store table that you know from previous versions of DB2. With BLU Acceleration, a table can now be created as a column organized table. When data is stored in this fashion, only the columns that are necessary to satisfy a query are scanned, which can reduce the time that is required to locate the most relevant data. The logical representation of a row still exists, though, as shown in Figure 4-1.

Figure 4-1 The relationship between TSN (logical row representation) and data pages in BLU Acceleration

BLU Acceleration is in-memory optimized, but is not limited by the available memory. Although BLU Acceleration is optimized for accessing data in memory, performance does not suffer as data size grows beyond the available memory because of the other technological innovations that are described in the following sections. In addition, BLU Acceleration dynamically moves unused data to storage, ensuring an efficient use of memory resources, as only the most relevant data is kept in memory.

4.1.2 Actionable compression

There are many benefits to actionable compression.

The first benefit is storage savings. BLU Acceleration uses approximate Huffman encoding, prefix compression, and offset compression. Approximate Huffman encoding is a frequency-based compression that uses the fewest number of bits to represent the most common values so the most common values are compressed the most. As a result, clients report compression rates with BLU Acceleration that are greater than that experienced with static or adaptive compression of row tables. Combine the improved compression of data with no need for indexes or aggregates and the overall storage requirements for the database are dramatically reduced.

The second benefit is that actionable compression is order-preserving on each column, which means that data can be analyzed while compressed. This results in dramatic I/O, memory, and processor savings. Encoded values (the compressed data) can be packed into a processor register, optimizing each processor cycle to process the largest amount of data. Less I/O and fewer processor cycles lead to dramatic improvements in processing time.

A further benefit of actionable compression is that the data is not decoded (or decompressed) when processing SQL predicates (that is, =, <, >, >=, <=, BETWEEN, and so on), joins, aggregations, and more. Rather, the decompression occurs only when necessary, for example, when the result set is returned to the user.

4.1.3 Parallel vector processing

Building on actionable compression and the ability to pack each processor register with a maximum amount of data to process, parallel vector processing adds yet another level of processor optimization.

Multiple data parallelism is achieved by using special hardware instructions that are known as Single Instruction Multiple Data (SIMD). By using SIMD, it is possible to work on multiple data elements with a single instruction. The processor register is packed with more data (because of actionable compression, which is described in 4.1.2, “Actionable compression” on page 55), and with SIMD, all the data is processed with a single instruction.

Multi-core parallelism is the other aspect of parallel vector processing. BLU Acceleration takes advantage of the available processor cores and drives multi-core parallelism for the queries it processes. Queries on column-organized tables (BLU tables) are automatically parallelized to optimize performance.

4.1.4 Data skipping

One final BLU Acceleration technology to be aware of is data skipping. Data skipping allows BLU Acceleration to skip unnecessary processing of irrelevant or duplicate data. If BLU Acceleration automatically detects a large section of data that does not qualify for a query, the data is ignored and skipped. This leads to order of magnitude savings for I/O, memory, and processor. There is no DBA action that is required to enable data skipping. Rather, BLU Acceleration automatically maintains a persistent store of minimum and maximum values, called a synopsis table, for sections of data values. The automatic maintenance occurs during load, insert, update, and delete of data. When a query accesses a column, BLU Acceleration uses the synopsis table to use data skipping wherever possible.

4.2 Getting started with BLU Acceleration

As of the general availability of DB2 for Linux, UNIX, and Windows Version 10.5, BLU Acceleration is supported on IBM POWER®, AIX, x86, and Linux platforms. The operating system minimum requirements for using the BLU Acceleration feature are no different from the standard DB2 10.5 requirements.

BLU Acceleration is available in either DB2 Advanced Enterprise Server Edition (AESE) or DB2 Advanced Workgroup Server Edition (AWSE). As outlined in Chapter 2, “Technical overview of IBM DB2 Warehouse” on page 9, both of these editions are full-featured, but AWSE has capacity limitations (namely, processor core and instance memory limitations depending on license type). For more information about DB2 10.5 system requirements, consult the System requirements for IBM DB2 for Linux, UNIX, and Windows, found at: http://www.ibm.com/support/docview.wss?uid=swg27038033#105

4.2.1 Capacity planning

To determine the minimum requirements for using BLU Acceleration, it is important to determine the amount of active data in the database.

Note: Active data refers to the data that is commonly accessed in queries.

The size of active data, not the total size of the database, is the relevant metric to use when planning the capacity that is required for BLU Acceleration. The total size of the database is required only when considering total storage requirements.

In the context of BLU Acceleration, the process to estimate active data size follows an approach similar to this one:
1. Determine the total amount of raw data.
   a. If you are starting from scratch, the total amount of raw data is the size of the comma-separated value (CSV) files that are used to load the table.
   b. If you are sizing based on an existing DB2 database, the total amount of raw data can be calculated based on the size of the data that is loaded into the database, as shown in Example 4-1.

Example 4-1 Estimating raw data size for an existing DB2 table
select sum(a.fpages * (1.0/(1.0 - (cast(a.pctpagessaved as decimal(5,2))/100))) * c.pagesize/1024/1024/1024) as uncompressed_table_data_GB
FROM syscat.tables a, syscat.datapartitions b, syscat.tablespaces c
WHERE a.tabschema not like 'SYS%'
  AND a.tabschema = b.tabschema
  AND a.tabname = b.tabname
  AND b.datapartitionid = 0
  AND b.tbspaceid = c.tbspaceid

2. Determine the percentage of data that is accessed frequently (that is, by more than 90% of the queries). For example, suppose that you have a table of sales data that contains data that is collected 2000 - 2013, but only the most recent three years are accessed by most queries. In this case, only data 2010 - 2013 is considered active, and all other data in that table is considered inactive and can be excluded from sizing calculations.
3. Determine the percentage of columns that are frequently accessed. With BLU Acceleration, only the columns that are accessed in the query are considered active data.
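As a rough worked example with hypothetical numbers: if the raw data is 2 TB, about 30% of the rows are frequently accessed, and about 50% of the columns are referenced by most queries, then the active data estimate is approximately:

2 TB x 0.30 x 0.50 ≈ 300 GB of active data

This 300 GB figure, not the 2 TB total, is the value to use when sizing memory and cores for BLU Acceleration; the 2 TB total still matters for overall storage sizing.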

After you determine an estimate of the raw active data size, you can then consider the system resources that are required regarding concurrency and complexity of the workload, the amount of active data to be kept in memory in buffer pools, and more. The minimum system requirements for BLU Acceleration are eight cores and 64 GB of memory. As a preferred practice, maintain a minimum ratio of 8 GB of memory per core.

For more information about sizing guidelines, see Best Practices: Optimizing analytic workloads using DB2 10.5 with BLU Acceleration, found at:
https://www.ibm.com/developerworks/community/wikis/form/anonymous/api/wiki/0fc2f498-7b3e-4285-8881-2b6c0490ceb9/page/ecbdd0a5-58d2-4166-8cd5-5f186c82c222/attachment/e66e86bf-4012-4554-9639-e3d406ac1ec9/media/DB2BP_BLU_Acceleration_0913.pdf

4.2.2 Storage requirements

BLU Acceleration has more random I/O access than row-organized data marts that typically perform large sequential scans. As a result, it can be beneficial to use I/O subsystems and devices that provide high random I/O access when using BLU Acceleration. Examples of storage that provide better random I/O access are flash-based storage and solid-state drives (SSD). BLU accelerated tables that are accessed frequently benefit from improved I/O subsystem performance, particularly when the table greatly exceeds the memory size that is available for use by BLU Acceleration. In addition, any temporary table space usage can greatly benefit from higher performing I/O subsystems.

In terms of overall storage sizing, the requirements are similar to that of standard row store. You must ensure adequate space for user data and temporary data. With improved compression on column-organized tables, the user data requirements are typically less than that of static or adaptive compressed row-organized tables.
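For example, one way to place frequently accessed column-organized tables on faster devices is to define a dedicated storage group on flash or SSD storage and create a table space that uses it. The path and object names below are placeholders:

CREATE STOGROUP sg_ssd ON '/ssd/db2data';
CREATE TABLESPACE ts_hot MANAGED BY AUTOMATIC STORAGE USING STOGROUP sg_ssd;

Tables that are created in ts_hot then benefit from the better random I/O characteristics of the underlying devices.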

4.3 Creating a column-organized data store

A key tenet of BLU Acceleration is simplicity. Less effort is required to create and load a BLU accelerated data mart. With a single registry variable, and the review of a few database configuration options, the only DBA action that is required is to load the data into the tables. After the data is loaded, the database is ready for business, and self-optimizes its performance.

4.3.1 Single configuration setting for analytic workloads

If you are creating a database that is dedicated to analytic workloads, you can use a simple registry setting to configure DB2 and the database for BLU Acceleration. The single registry variable automatically configures the instance and database configuration for optimal analytic performance. Example 4-2 shows the three simple commands that are required to create a database that is optimized for analytic workloads.

Example 4-2 Creating a database that is optimized for BLU Acceleration
db2set DB2_WORKLOAD=ANALYTICS
db2start
db2 create db mydb autoconfigure using mem_percent 100 apply db and dbm

Note: Do not specify CREATE DATABASE AUTOCONFIGURE NONE, which disables the ability of the registry variable to influence the automatic configuration of the database for analytic workloads.

The default setting for the AUTOCONFIGURE command is USING MEM_PERCENT 25. Specify the USING MEM_PERCENT option to allocate more memory to the database and instance.

When DB2_WORKLOAD=ANALYTICS, the auto-configuration of the database sets the default table organization method (dft_table_org) to COLUMN (that is, BLU Acceleration) and enables intra-query parallelism. Database memory configuration parameters that cannot use self-tuning memory manager and need an explicit value (that is, sort-related parameters) are set to a value higher than default and optimized for the hardware you are running on. Automatic space reclamation is enabled. Finally, workload management is enabled by default to ensure maximum efficiency and usage of the server. For more information, see the “Column-organized tables” topic in the DB2 10.5 information center at the following website: http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/index.jsp
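To verify the effect of the automatic configuration, you can inspect the database configuration. A minimal check (the database name is a placeholder) might look like this:

db2 get db cfg for mydb | grep -i dft_table_org

The output should show the default table organization set to COLUMN for a database that was created with DB2_WORKLOAD=ANALYTICS in effect.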

Note: The registry variable DB2_WORKLOAD is set at the instance level. It influences all databases in that instance. If the instance is going to have multiple databases, not all of which are BLU Accelerated, then set the registry variable before creating the databases you want to optimize for analytic workloads and then unset the registry variable before creating the other databases.

To optimize a pre-existing database for analytic workloads, set the registry variable and run the AUTOCONFIGURE command. Example 4-3 illustrates the required commands.

Example 4-3 Optimizing a pre-existing database for analytic workloads
db2set DB2_WORKLOAD=ANALYTICS
db2start
db2 connect to mydb
db2 autoconfigure apply db only

4.3.2 Fine-tuning a configuration

The automatic configuration of the database is largely influenced by the existing hardware resources. As a result, you should review the size of the buffer pools, sort heap, and sort heap threshold that are set by auto-configure based on application needs and your expected data volumes.

In general, you want to split 80 - 90% of the memory that is allocated to DB2 between buffer pool and sort memory allocations. The remaining memory is used for all other heaps, including the utility heap, which is critical to LOAD performance. With higher concurrency workloads, you might want to allocate more memory to sort than buffer pool.

When allocating sort memory, there are two configuration parameters to consider: SHEAPTHRES_SHR and SORTHEAP. SHEAPTHRES_SHR determines the limit for the allocation of sort memory. When a certain percentage of the value of SHEAPTHRES_SHR is attained, the DB2 database manager begins to throttle the memory that is allocated to sort memory consumers to ensure that over commitment of memory does not occur. This means that consumers might receive only their minimum allocation requirement. As such, it is important to consider the level of concurrency that is expected for your applications and the WLM configuration that controls how many run concurrently.

Table 4-1 provides an example of memory allocation for low and high concurrency workloads where the WLM configuration allows 20 concurrent high-cost queries.

Table 4-1 Example of memory distribution for a BLU-accelerated database based on workload concurrency

Low concurrency (< 20 concurrent workloads)    High concurrency (> 20 concurrent workloads)
40% buffer pool                                25% buffer pool
40% SHEAPTHRES_SHR                             50% SHEAPTHRES_SHR
SORTHEAP = SHEAPTHRES_SHR/5                    SORTHEAP = SHEAPTHRES_SHR/20
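As a hedged sketch of how these ratios might translate into configuration commands, assume a database named MYDB, roughly 400 GB of memory dedicated to DB2, and a high-concurrency workload. With 50% of memory (about 200 GB) allocated to sort memory, SHEAPTHRES_SHR is roughly 50,000,000 4 KB pages and SORTHEAP is one twentieth of that; the values are illustrative only:

db2 update db cfg for mydb using SHEAPTHRES_SHR 50000000 SORTHEAP 2500000

Adjust the values to your own memory size and concurrency profile, and remember that the buffer pool share comes out of the same overall memory budget.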

4.3.3 Creating column-organized tables

When the database is created as optimized for analytic workloads (and the database configuration setting is dft_table_org=column), the CREATE TABLE statement defaults to creating tables that are column-organized. However, the ORGANIZE BY clause can be used to explicitly specify the type of column organization regardless of the default table organization for the database, as shown in Example 4-4.

Example 4-4 Creating a table with an ORGANIZE BY clause
CREATE TABLE col_table ... ORGANIZE BY COLUMN IN data_tablespace;
CREATE TABLE row_table ... ORGANIZE BY ROW IN data_tablespace;

The same SQL syntax that is used to define columns for row-organized tables is used for column-organized tables. For more information about supported options for the CREATE TABLE statement for row- and column-organized tables, see the DB2 10.5 information center at the following website:

http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/index.jsp

Note: Create fact tables in their own table space to allow for better manageability and performance by isolating storage groups and buffer pools for a certain fact table.

Ensure that you use the NOT NULL attribute for columns where possible. Every nullable column in a column-organized table is stored internally as a two-column group: one column for the nullability indicator and one for the actual column values. These nullable columns cause extra work to be done on each row during query processing.

To determine whether a table is defined as row-organized or column-organized, you can query the SYSCAT.TABLES catalog view, as shown in Example 4-5. A new column, TABLEORG, contains “C” for column-organized and “R” for row-organized.

Example 4-5 Querying the system catalog to determine table organization
SELECT tabname, tableorg, compression FROM syscat.tables
WHERE tabname LIKE 'SALES%';

TABNAME     TABLEORG  COMPRESSION
----------- --------- -----------
SALES_COL   C
SALES_ROW   R         N

2 record(s) selected.

Note: The COMPRESSION column for column-organized tables is blank because the data in such tables is always compressed.

Constraints
Primary key, informational foreign key, and any informational check constraints should be defined after your initial data is loaded.

DB2 10.5 introduces an option for specifying not enforced (informational) primary key constraints or unique constraints. The NOT ENFORCED clause does not ensure uniqueness, so this option should be used only when you are certain that the data is unique. Informational primary key constraints or unique constraints require less storage and less time to create because no internal B-tree structure is involved. However, informational primary key, foreign key, and check constraints can be beneficial to the query optimizer. If your data is cleansed as part of a rigorous ETL process, then consider the use of informational constraints.
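For example, after the initial load, an informational primary key might be added as follows. The table and column names are hypothetical, and the data is assumed to already be unique:

-- Not enforced: no unique index is built, so the data must already be unique
ALTER TABLE sales_col
  ADD CONSTRAINT pk_sales PRIMARY KEY (sale_id) NOT ENFORCED;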

Note: An enforced primary key constraint uses extra space because a unique index is required to enforce the constraint. In addition, like any index, this unique index increases the cost of insert, update, or delete operations. An unenforced primary key constraint does not have these additional considerations.

4.3.4 Converting to column-organized tables

If you want to convert one or more row-organized tables in your database to column-organized tables, you can use the db2convert command to automate the process. With a single invocation of the command, the db2convert command uses the ADMIN_MOVE_TABLE stored procedure to move data between the original row-organized table definition and the new column-organized table definition. The db2convert command also drops any dependent objects (such as secondary indexes or MQTs) as part of the conversion process because they are not required for column-organized tables. Example 4-6 shows an example of calling the db2convert command to convert a single table.

Example 4-6 Sample syntax for converting a row-organized table to column-organized
$ db2convert -d mydb -z db2inst1 -t mytable

Conversion notes for exceptional table(s)
-----------------------------------------
Table: DB2INST1.MYTABLE:
  -Secondary indexes will be dropped from the table.

Enter 1 to proceed with the conversion. Enter 2 to quit.

The db2convert command can convert a single table or multiple tables at one time.
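For example, omitting the -t option widens the scope of a single invocation. The following sketch assumes the -d and -z options behave as shown in Example 4-6; the database and schema names are placeholders:

$ db2convert -d mydb                 # convert all row-organized user tables in the database
$ db2convert -d mydb -z db2inst1     # convert all row-organized tables in the DB2INST1 schema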

4.4 Loading a column-organized data store

The load process for a column-organized table is similar to that of a row-organized table. A column-organized table is loaded from input data sources that are in a row-organized format (for example, a comma-separated value data file). During the load or insert process, the data is compressed and converted into column-organized format and metadata tables are maintained. As part of the initial load operation on a column-organized table, BLU Acceleration builds a column compression dictionary for each column. The following sections provide more information about each of these unique aspects of loading column-organized tables.

A load into a column-organized table automatically collects statistics according to the profile that is defined for the table. If a profile does not exist, default automatic statistics (equivalent to the WITH DISTRIBUTION AND SAMPLED DETAILED INDEXES ALL clause) are collected by the RUNSTATS utility. With this automatic statistics collection, you do not need to update statistics after loading a column-organized table unless primary key constraints are created after the load.
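If you want the automatic collection to use your own options instead of the defaults, you can register a statistics profile before the load. A minimal sketch, assuming a hypothetical table named DB2INST1.SALES_COL, follows:

RUNSTATS ON TABLE db2inst1.sales_col
  WITH DISTRIBUTION AND SAMPLED DETAILED INDEXES ALL
  SET PROFILE ONLY;

The SET PROFILE ONLY clause records the profile without gathering statistics immediately; subsequent load operations and automatic statistics collection then use the registered options.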

For more information about data movement with BLU Acceleration, see “Load considerations for column organized tables” on page 98.

4.4.1 Synopsis tables and data skipping

As introduced in 4.1.4, “Data skipping” on page 56, BLU Acceleration uses internal metadata tables that are called synopsis tables to determine what data can be skipped if it is of no interest to a query. For example, suppose that a query calculates the sum of last month’s sales. There is no need to look at data that is older than the last month, so BLU Acceleration can automatically skip over that older data.

The synopsis table contains all of the user table’s non-character columns (namely, datetime, Boolean, or numeric columns) and those character columns that are part of a primary key or foreign key definition. The synopsis table stores the minimum and maximum values for each column for a range of data, as shown in Figure 4-2. The range of data is determined by tuple sequence number (TSN), which indicates which column values belong together as a logical row.

Figure 4-2 Sample of a synopsis table showing the relationship to its user table

A synopsis table is approximately 0.1% of the size of its user table with one row in the synopsis table for every 1024 rows in the user table. To determine the name of the synopsis table for a particular user table, query the SYSCAT.TABLES catalog view, as shown in Example 4-7 on page 65.

Example 4-7 Querying the system catalog to identify the synopsis table for a user table
SELECT tabschema, tabname, tableorg FROM syscat.tables
WHERE tableorg = 'C' AND tabname LIKE '%SALES_COL%';

TABSCHEMA   TABNAME                           TABLEORG
----------- --------------------------------- --------
DB2INST1    SALES_COL                         C
SYSIBM      SYN130330165216275152_SALES_COL   C

  2 record(s) selected.

It is important to understand how synopsis tables are used to optimize data skipping so that you can design an optimal physical data model. Synopsis tables store minimum and maximum values for all numeric, datetime, and primary and foreign key columns. Therefore, the more columns that use numeric and datetime data types, the more effective data skipping is. For example, defining a non-key column that stores a date as the DATE data type rather than a character data type ensures that the column is included in the synopsis table.
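A minimal sketch of this preferred practice is shown below (the table and column names are hypothetical); only the numeric and datetime columns are eligible for the synopsis table unless a character column is part of a key:

CREATE TABLE sales_col (
  sale_id    BIGINT        NOT NULL,
  sale_date  DATE          NOT NULL,   -- DATE, not CHAR(10): included in the synopsis table
  amount     DECIMAL(12,2) NOT NULL,
  promo_code CHAR(8)                   -- non-key character column: not used for data skipping
) ORGANIZE BY COLUMN;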

The synopsis tables are maintained during LOAD or INSERT, UPDATE, and DELETE processing to ensure a correct representation of the minimum and maximum values for a data range.

4.4.2 Compression of column-organized tables

A new phase is introduced for the load process of column-organized tables: the ANALYZE phase. During the ANALYZE phase, column compression dictionaries are built if they do not yet exist (that is, this is the first load of data into the table). If the column compression dictionaries already exist, the ANALYZE phase is not required.

The ANALYZE phase precedes the LOAD and BUILD phases of load. During the LOAD phase, the data is compressed by using column and page compression dictionaries, the compressed values are written to data pages in column-organized form, and synopsis tables are maintained. During the BUILD phase, any unique indexes for enforced primary key constraints are built. This phase always occurs for column-organized tables.

Calculating table size and compression details There is nothing new to calculating table size; the same method should be used for column-organized and row-organized tables. The ADMIN_GET_TAB_INFO table function is updated to add information specific to column-organized tables.

To calculate the total size of a table, use the SQL statement that is shown in Example 4-8.

Example 4-8 SQL statement to calculate table size
SELECT SUBSTR(TABNAME,1,20) AS tabname,
       COL_OBJECT_P_SIZE,
       DATA_OBJECT_P_SIZE,
       INDEX_OBJECT_P_SIZE,
       (COL_OBJECT_P_SIZE + DATA_OBJECT_P_SIZE + INDEX_OBJECT_P_SIZE) AS total_size
FROM SYSIBMADM.ADMINTABINFO
WHERE tabname LIKE 'mytable%'
WITH UR

The COL_OBJECT_P_SIZE element represents the user table size. The DATA_OBJECT_P_SIZE element represents the size of the metadata, including column compression dictionaries and synopsis tables. Both of these values must be combined with the value of the INDEX_OBJECT_P_SIZE element to determine the total size of the table.

To determine how much a table is compressed, look at the PCTPAGESSAVED value in SYSCAT.TABLES. For a column-organized table, the PCTPAGESSAVED value is based on an estimate of the number of data pages that are needed to store the table in decompressed row organization, so the PCTPAGESSAVED value can be used to compare compression ratios between row-organized and column-organized tables.

The unique aspect of understanding the compression factor for column-organized tables is the degree to which a particular column's values were encoded as part of approximate Huffman encoding. The new PCTENCODED column in the SYSCAT.COLUMNS catalog view represents the percentage of values that are encoded as a result of compression for a column in a column-organized table.

If the overall compression ratio for your column-organized table is too low, check this statistic to see whether values in specific columns were left unencoded.
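For example, a query such as the following (the schema and table names are hypothetical) lists the columns with the lowest percentage of encoded values:

SELECT SUBSTR(colname,1,20) AS colname, pctencoded
FROM syscat.columns
WHERE tabschema = 'DB2INST1' AND tabname = 'SALES_COL'
ORDER BY pctencoded ASC;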

Note: If you see many columns with a very low value (or even 0) for PCTENCODED, the utility heap might have been too small when the column compression dictionaries were created.

You might also see very low values for columns that were incrementally loaded with data that was outside of the scope of the column compression dictionaries. Those dictionaries are created during the initial load operation. Additional page-level dictionaries might be created to take advantage of local data clustering at the page level and to further compress the data.

Performance considerations for compression
There are two performance considerations when loading data into column-organized tables: the amount of memory that is available to the utility heap (UTIL_HEAP_SZ) and the sorting of the input data.

UTIL_HEAP_SZ is set to AUTOMATIC by default, which means that the self-tuning memory manager manages the heap and uses the available database memory as required. Ensure that there is sufficient database memory available before loading data.
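If the utility heap was overridden with a fixed value in the past, you can return it to the self-tuning behavior with commands along these lines (the database name is a placeholder):

UPDATE DB CFG FOR mydb USING UTIL_HEAP_SZ AUTOMATIC;
GET DB CFG FOR mydb SHOW DETAIL;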

If possible, presort the data by columns that are frequently referenced by predicates that filter the table data, or by columns that are used to join highly filtered dimension tables. Presorting the input data on those columns can improve compression ratios and the likelihood of data skipping by ensuring that a particular column value is clustered on fewer pages, which can result in improved performance.

4.4.3 Reducing load times

Loading a column-organized table can take longer than loading an equivalent amount of data into a row-organized table because of additional processing. Although the LOAD and BUILD phases cannot be shortened, it is possible to reduce the duration of the ANALYZE phase by creating the column compression dictionary for a table with only a portion of the input data.

The most challenging part of reducing load times is obtaining a representative sample of data for the table. Without a representative sample, the column compression dictionaries are not as effective as if they were based on the full set of input data, which can result in poorer compression.

Example 4-9 illustrates the steps to reduce the load time for a large table. The full set of data is in all_data.csv, and a representative sample of the data is in subset_data.csv. The first command runs only the ANALYZE phase and generates the column compression dictionary. The second command loads the data and skips the ANALYZE phase because the dictionary already exists.

Example 4-9 LOAD commands to generate a dictionary on a subset of data to reduce load time
LOAD FROM subset_data.csv OF DEL REPLACE RESETDICTIONARYONLY INTO mytable;
LOAD FROM all_data.csv OF DEL INSERT INTO mytable;

The subset of data can be created by either copying a subset of the full file, or by generating a sample of the original source to get a subset. To illustrate a sampling method, Example 4-10 shows the commands to load a column-organized table, building the column compression dictionary by sampling an existing table.

Example 4-10 Loading a column-organized table with a column compression dictionary built from a sampling of cursor data
DECLARE c1 CURSOR FOR SELECT * FROM my_row_table TABLESAMPLE BERNOULLI (10);
DECLARE c2 CURSOR FOR SELECT * FROM my_row_table;
LOAD FROM c1 OF CURSOR REPLACE RESETDICTIONARYONLY INTO my_col_table;
LOAD FROM c2 OF CURSOR INSERT INTO my_col_table;

4.5 Managing a column-organized data mart

With the simplicity of BLU Acceleration, there are far fewer considerations to managing a column-organized data mart. Chapter 7, “Monitoring” on page 105 and Chapter 9, “Workload management” on page 145 provide all the information for monitoring and managing the concurrency of the database. This section focuses on understanding the usage of BLU Acceleration in query plans and database maintenance of a column-organized data mart.

4.5.1 Query execution plans and the new CTQ operator

With any new feature, the first question DBAs often ask is, “How do you know whether the feature is working the way that it should?” For BLU Acceleration, the answer should be obvious: the optimal performance of analytic queries. However, additional information is available if you look at the explain plan for queries on column-organized tables.

A new CTQ operator in query execution plans indicates the transition between column-organized data processing (BLU Acceleration) and row-organized processing. All operators that appear below the CTQ operator in an execution plan are optimized for column-organized tables.

A good execution plan for column-organized tables has the majority of operators below the CTQ operator and only a few rows flowing through the CTQ operator. With this type of plan, as shown in Figure 4-3, DB2 is working directly on encoded data and optimizing with BLU Acceleration.

Figure 4-3 An example of a good query execution plan for a column-organized table

Access methods and join enumeration are simplified with BLU Acceleration, but join order is still dictated by cost, so cardinality estimation is still important.

Collecting column group statistics, for example, is helpful for both column-organized and row-organized tables. Informational primary key constraints and unique constraints, where applicable, can also provide the optimizer with additional information that is not otherwise available, resulting in good query execution plans for BLU-accelerated tables.

Query execution plans can be generated by using engine tools (db2expln and db2exfmt). In addition, Optim Performance Manager and Optim Query Workload Tuner can explain statements. For more information, see IBM Optim Performance Manager for DB2 for Linux, UNIX, and Windows, SG24-7925.

4.5.2 Database maintenance

The automatic statistics collection is enabled by default during a load operation into a column-organized table. In addition, automatic statistics collection is enabled by default for a database containing either row-organized or column-organized tables. As a result, an explicit RUNSTATS operation might not be necessary. The only reason to run RUNSTATS explicitly is if you cannot wait for automatic statistics collection to run.

Space is automatically reclaimed for column-organized tables when DB2_WORKLOAD=ANALYTICS is set or when the AUTO_REORG database configuration parameter is set to ON. Only if neither of these conditions is true is manual intervention required to reclaim free space, which is done by using the REORG TABLE ... RECLAIM EXTENTS command. To check the amount of space that is available to reclaim, use the RECLAIMABLE_SPACE column in the ADMINTABINFO administrative view. Example 4-11 shows a sample SQL statement to check for reclaimable space.

Example 4-11 SQL statement to check reclaimable space
SELECT SUBSTR(tabname,1,20) AS tabname, RECLAIMABLE_SPACE
FROM SYSIBMADM.ADMINTABINFO
WHERE tabname LIKE 'mytable%'
ORDER BY tabname
WITH UR
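If the check reports reclaimable space and neither DB2_WORKLOAD=ANALYTICS nor automatic reorganization is in effect, the space can be returned with a command of this form (the table name is a placeholder):

REORG TABLE mytable RECLAIM EXTENTS;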

All other maintenance activities that involve column-organized tables, such as backup and restore, are identical to those that involve row-organized tables.


Chapter 5. Row-based data store

DB2 for Linux, UNIX, and Windows offers a comprehensive platform for building data mart and warehouse solutions that are powerful yet flexible and cost-effective. DB2 can create a warehouse database on a single server with one or more processors, or across multiple servers. These scale-up and scale-out solutions allow you to build a dynamic, scalable, and large database environment.

Chapter 4, “Column-organized data store with BLU Acceleration” on page 53 introduced BLU Acceleration, a new columnar technology in DB2 that provides order-of-magnitude benefits in analytic workload performance, storage savings, and time to value. It is also a scale-up solution: BLU Acceleration scales linearly and seamlessly as more processors and memory are added to a single server.

Traditionally, DB2 also has row-organized tables, where data is inserted and stored one row after another. A row-based data store is more suitable for a warehouse environment with a high volume of transactions. DB2 provides the capability to scale up a row-based data store on a single server with the intrapartition parallelism feature. As the data volume continues to grow, a single server might not be sufficient to meet the increased business demand. The DB2 row-based data store offers a scale-out solution where additional servers can easily be added to the warehouse environment. This important feature is known as the Database Partitioning Feature (DPF) or interpartition parallelism.

This chapter describes the traditional row-based warehouse solution in DB2. It starts with the Database Partitioning Feature, then describes the intrapartition parallelism feature. Finally, it describes some other technologies in DB2 that you can use to build a high-performance data warehouse environment.

5.1 Scale-out solution: Database Partitioning Feature

DB2 uses database partitioning to provide a scale-out solution. It allows you to support large databases and use large hardware environments efficiently. Database partitioning refers to splitting a database into separate physical partitions, which can be distributed across multiple servers but still appear to the user and database administrator as a single-image database. This feature is known as the Database Partitioning Feature (DPF) or interpartition parallelism.

In a DPF environment, a database is divided into partitions. By searching these database partitions in parallel, elapsed times for queries can be dramatically reduced. Data is distributed across multiple database partitions, enabling database scaling into terabytes.

SQL operations are performed in parallel across all partitions and therefore run much faster. DPF provides "parallel everything": all transactions (selects, updates, deletes, and inserts) and all utility operations run in parallel across all partitions.

The architecture divides the workload so that each partitioned workload has its own private data and computing resources, which enhances scalability. In this way, near-linear scalability is achieved because resources are not shared across data partitions and there is no central locking of resources.

The following sections give an overview of some of the concepts and preferred practices when applying the Database Partitioning Feature.

5.1.1 Partition

A database partition is part of a database that consists of its own data, indexes, configuration files, and transaction logs. A database partition is sometimes called a node or a database node. It can be either logical or physical. Logical partitions are on the same physical server and can take advantage of the symmetric multiprocessor (SMP) architecture. These partitions use common memory, processors, disk controllers, and disks. Physical partitions consist of two or more physical servers, and the database is partitioned across these servers. Each partition has its own memory, processors, disk controllers, and disks.

User interaction occurs through one database partition, which is known as the coordinator node for that user. The coordinator runs on the database partition to which the application is connected. Any database partition can be used as a coordinator node.

5.1.2 Partition group

A partition group is a logical layer that allows the grouping of one or more partitions to perform operations on all the partitions in the group. A database partition can belong to more than one partition group. When a database is created, DB2 creates three default partition groups that cannot be dropped:
- IBMDEFAULTGROUP: This is the default partition group for any table that you create. It consists of all database partitions as defined in the db2nodes.cfg file. This partition group cannot be modified. Table space USERSPACE1 is created in this partition group.
- IBMTEMPGROUP: This partition group is used by all system temporary tables. It also consists of all database partitions as defined in the db2nodes.cfg file. Table space TEMPSPACE1 is created in this partition group.
- IBMCATGROUP: This partition group contains the catalog tables (table space SYSCATSPACE), and thus it includes only the database catalog partition. This partition group cannot be modified.

To create database partition groups, you can use the CREATE DATABASE PARTITION GROUP statement. This statement creates the database partition group in the database, assigns the database partitions that you specify to the partition group, and records the partition group definition in the database system catalog tables.

Example 5-1 shows the creation of a pgall partition group on all partitions that are specified in the db2nodes.cfg file.

Example 5-1 Creating a partition group across all database partitions
create database partition group pgall on all dbpartitionnums

5.1.3 Table spaces in a DPF environment

A table space can be created in specific partitions by associating it with a partition group. The CREATE TABLESPACE statement with the IN DATABASE PARTITION GROUP clause can be used for this purpose. This allows users to have flexibility as to which partitions store their tables. Example 5-2 shows a table space mytbls, which spans partitions 0, 1, 2, and 3.

Example 5-2 Creating a table space in the database partition group pg0123
create large tablespace mytbls in database partition group pg0123

In a DPF environment, a table can be in one or more database partitions, depending on the table space where the table is created and the associated database partition group. When a table is in a database partition group consisting of multiple partitions, some of its rows are stored in one partition and other rows are stored in other partitions.

5.1.4 Partitioning keys

A partitioning key is a column (or group of columns) that is used to determine the partition in which a particular row of data is stored. A partitioning key is defined on a table by using the CREATE TABLE statement, as shown in Example 5-3.

Example 5-3 Partitioning key
create table mytable (col1 int, col2 int, col3 int, col4 char(10))
  in mytbls1
  distribute by hash (col1, col2, col3)

Figure 5-1 on page 75 shows another example of a DPF environment with four partitions (0, 1, 2, and 3). Two tables, customer and store_sales, are created. Both have a partitioning key named cust_id. The value in cust_id is hashed to generate a partition number, and the corresponding row is stored in the relevant partition.

Figure 5-1 Example of a DPF environment

5.1.5 Partition maps

When a database partition group is created or modified, a partitioning map is automatically created by DB2. A partitioning map, with a partitioning key and a hashing algorithm, is used by DB2 to determine which database partition in the database partition group stores a row of data.

Where a new row is inserted is determined by hashing the partitioning key value of the row to an entry in the partitioning map, which contains the partition number to use, as shown in Figure 5-2.

Figure 5-2 Hash partitioning map

Partition group pg0123 is defined on partitions 0, 1, 2, and 3. An associated partitioning map is automatically created. This is an array with 32768 entries containing the values 0, 1, 2, 3, 0, 1, 2, 3.... Partition numbers are stored in round-robin fashion by default, although this can be changed. Table mytable is created with a partitioning key consisting of columns col1, col2, and col3. For each row, the partitioning key column values are passed to the hashing algorithm, which returns an output number 0 - 32767. This number corresponds to one of the entries in the partitioning map array that contains the value of the partition number where the row is to be stored. If the hashing algorithm had returned an output value of 7, the row would have been stored in partition p3.

The partitioning map is an internally generated array containing either 32768 entries for multiple-partition database partition groups, or a single entry for single-partition database partition groups. The partition numbers of the database partition group are specified in a round-robin fashion.

Although it is possible to change the partitioning map to correct data skew or to make allowances for clustered servers of different processor performance, in practice a large data warehouse should be based on balanced configuration principles and have many tables that are partitioned across many identical, balanced partition units. It is likely to be impractical to correct data skew for a subset of tables, and doing so would likely increase skew for other tables with different partitioning keys in the same partition group.

5.1.6 Choosing partitioning keys

Given a particular warehouse workload and configuration in a DPF environment, one of the bigger performance factors is the choice of partitioning keys for the tables. There might not be one absolute and obvious best set of keys, however, because one set might optimize some queries while another set optimizes other queries. You must look at the overall workload and know what is most important to optimize. Try various alternatives to get the best results.

The following list details a few preferred practices for partitioning key selection:
- Do the following actions:
  - Include frequently used join columns.
  - Spread data evenly across partitions.
  - Have a broad range domain.
  - Use integer data types, which are more efficient than character, which in turn is more efficient than decimal.
  - Use the DB2 Design Advisor.

- Do not do the following actions:
  - Use LOB or LONG fields.
  - Have a unique index or primary key unless it is a superset of the partitioning key.
  - Allow alterations of the partitioning key.
  - Include columns that are updated frequently.

Important: DPF partitioning is based on hashing, with the intent to spread rows evenly over partitions and the available hardware resources to maximize performance. It is not a preferred practice to choose partitioning keys in an attempt to implement range partitioning, which should be accomplished by using table partitioning (also known as range partitioning).

The following list details other considerations that are involved in choosing partitioning keys:
- After a table is created as partitioned, you cannot directly change its partitioning key. To change the key, try one of these approaches:
  - Unload the table to a file, drop and re-create the table with the new partitioning key, and reload the table.
  - Create a version of the table with a temporary name and the new partitioning key, load the new table from the old one (the fastest way is by running declare mycursor for select * from t1, followed by load from mycursor of cursor replace into t1_new), drop the old table, rename the new table to the real name, and re-create indexes and redo other steps as when the table was originally created.
- With ALTER TABLE, you can add or drop a partitioning key, but only for a non-partitioned table.
- The columns in any unique or primary key constraint that is defined on the table must be a superset of the partitioning key.
- Avoid uneven distribution of rows across partitions by choosing partitioning key columns with high cardinality. If this cannot be achieved, try to use columns with uniformly distributed values. After a table is populated, you can check how many of its rows are in each partition, and how many of its rows map to each entry in the partitioning map (for possible redistribution purposes), by running queries to analyze the table contents (see the sample queries after this list). Unless you have extremely large tables, differences of a few percent in cardinalities per partition can be ignored.

- Partitioning keys should not include columns that are updated frequently. Whenever a partitioning key value is updated, DB2 might need to drop the row and reinsert it into a different partition, as determined by hashing the new partitioning key value.
- Unless a table is not critical or you have no idea what a good partitioning key choice is, you should not let the partitioning key be chosen by default. For the record, the default partitioning key is the first column of the primary key, and if there is none, it is the first column that has an eligible data type.
- Ensure that you understand collocation and the different join types. For more information, see the DB2 Partitioning and Clustering Guide, SC27-2453. Collocated tables must fulfill the following requirements:
  - Be in the same database partition group (one that is not being redistributed). (During redistribution, tables in the database partition group might be using different partitioning maps; they are not collocated.)
  - Have partitioning keys with the same number of columns.
  - Have the corresponding columns of the partitioning key be partition compatible.
  - Be in a single-partition database partition group, if not in the same database partition group, that is defined on the same partition.
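The sample queries that are referred to in the preceding list might look like the following sketch (the table and column names are hypothetical); DBPARTITIONNUM returns the partition that holds each row, and HASHEDVALUE returns the partitioning map entry to which each row hashes:

-- Rows per database partition
SELECT DBPARTITIONNUM(col1) AS dbpartnum, COUNT(*) AS row_count
FROM mytable
GROUP BY DBPARTITIONNUM(col1)
ORDER BY dbpartnum;

-- Rows per partitioning map entry (useful when considering redistribution)
SELECT HASHEDVALUE(col1) AS map_entry, COUNT(*) AS row_count
FROM mytable
GROUP BY HASHEDVALUE(col1)
ORDER BY row_count DESC;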

5.1.7 Concluding remarks

DB2 with DPF provides a scale-out solution for building a data mart or warehouse that can, in theory, grow without limit. It supports various hardware configurations, such as symmetric multiprocessors (SMP) on a single server and a cluster of connected servers. As the data grows or the workload demand increases, additional resources can easily be deployed. For example, you can add a physical server to the configuration, and consequently one (or more) database partitions on the new physical server. The database can then be spread across these new partitions, and query execution can take advantage of the new configuration immediately.

5.2 Scale-up solution: Intrapartition parallelism

Within a single server with symmetric multiprocessors (SMP), DB2 can also parallelize query execution by dividing the work among subagents running on different hardware threads or processors. Thus, processor resources in the SMP environment can be better used and query elapsed time can be reduced, which is especially beneficial to long-running warehouse queries. More importantly, it does not require any form of data partitioning. This capability is known as intrapartition parallelism or multi-core parallelism. It can be enabled by a configuration parameter that is called intra_parallel.
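A minimal sketch of enabling it at the instance level follows; because intra_parallel is a database manager configuration parameter, the instance must be restarted for the change to take effect:

$ db2 UPDATE DBM CFG USING INTRA_PARALLEL YES
$ db2stop
$ db2start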

Intrapartition parallelism provides a scale-up solution where processor and memory resources can easily be added to the single server. The new columnar technology, BLU Acceleration, belongs to this scale-up solution. In fact, DB2 has been supporting intrapartition parallelism for row-based data for many releases.

Figure 5-3 shows a query that is broken into four pieces that can be run in parallel, with the results returned more quickly than if the query were run in serial fashion. The pieces are copies of each other.

Figure 5-3 Intrapartition parallelism

DB2 utilities can also take advantage of intrapartition parallelism, as in the following examples:
- The load utility takes advantage of multiple processors for tasks such as parsing and formatting data.
- During index creation, the scanning and subsequent sorting of the data can occur in parallel with intrapartition parallelism.
- Backing up and restoring data can use both I/O parallelism and intrapartition parallelism when performing backup and restore operations.

It is worth mentioning that in DB2 10.1 the scalability of intrapartition parallelism was improved through better load balancing, more efficient parallelization techniques, and reduced latch contention. For example, DB2 10.1 introduces a new operator, called REBAL, that redistributes rows during query execution to ensure that all subagents do equal work.

Individual applications or workloads can dynamically throttle the degree of parallelism to optimize performance for the types of queries being run. This is important for warehouse environments that also have transactional workloads. A transactional workload typically consists of short insert, update, and delete transactions and does not benefit from parallelization; in some cases, intrapartition parallelism can even reduce throughput. Conversely, a data warehouse workload benefits greatly from parallelization because its queries are often processor-intensive and long running. Starting with DB2 10.1, you can configure different workloads to use the optimal parallelism settings (ON, OFF, and query degree) with WLM workload control to achieve the best performance.
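For example, under the assumption of hypothetical workload definitions and application names, the per-workload degree could be set with DDL along these lines (the MAXIMUM DEGREE attribute of CREATE and ALTER WORKLOAD was added in DB2 10.1):

-- Reporting application: allow a high degree of intrapartition parallelism
CREATE WORKLOAD reports APPLNAME('reportapp') MAXIMUM DEGREE 8;

-- Transactional application: effectively disable intrapartition parallelism
CREATE WORKLOAD oltp APPLNAME('orderapp') MAXIMUM DEGREE 1;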

Finally, DB2 allows you to combine both scale-up and scale-out solutions. In a DPF environment (scale-out), intrapartition parallelism (scale-up) can be enabled in each database partition. A single database operation (like a query) is first subdivided into subparts to be run on individual database partitions; within each partition, a subpart can be further divided into smaller parts to run on the data in this partition, as illustrated in Figure 5-4 on page 81. Combining both DPF and intrapartition parallelism makes DB2 a powerful and flexible platform to support large data warehouse databases.

Figure 5-4 DPF and intrapartition parallelism

5.3 Other features for row-based data warehouse

The scale-up and scale-out solutions provide unprecedented scalability for a row-based data store. Many other features that can improve performance and reduce the total cost of ownership (TCO) are also built into DB2. This section highlights a number of features that are highly recommended for all data warehouse implementations.

5.3.1 Compression

The DB2 Storage Optimization Feature enables you to transparently compress data on disk to decrease disk space and storage infrastructure requirements. Because disk storage systems are often the most expensive components of a database solution, even a small reduction in the storage subsystem can result in substantial cost savings for the entire database solution. This is especially important for a data warehouse solution, which typically has a huge volume of data.

Row compression
DB2 uses a variant of the Lempel-Ziv algorithm to apply compression to each row of a table. Log records are also compressed. Savings are extended to backup disk space, racks, cables, floor space, and other disk subsystem peripherals.

Because compressed rows are smaller, not only do you need fewer disks, but your overall system performance might be improved. By storing compressed data on disk, fewer I/O operations need to be performed to retrieve or store the same amount of data. Therefore, for disk I/O-bound workloads, the query processing time can be noticeably improved.

DB2 stores the compressed data on both disk and memory, reducing the amount of memory that is consumed and freeing it up for other database or system operations.

To enable row compression in DB2, you can specify the COMPRESS YES STATIC option when a table is created. Compression can also be enabled for an existing table by using the ALTER TABLE statement. If you alter an existing table to enable compression, a REORG must be run to build the dictionary and compress the existing data in the table. For more details, see the DB2 10.5 information center at the following website:
http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.welcome.doc/doc/welcome.html
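As a sketch (the table definition is hypothetical), classic row compression can be enabled at creation time or added later, followed by a REORG to build the dictionary:

CREATE TABLE sales_hist (
  sale_id   BIGINT        NOT NULL,
  sale_date DATE          NOT NULL,
  amount    DECIMAL(12,2)
) COMPRESS YES STATIC;

-- For an existing table: enable compression, then build the dictionary
ALTER TABLE sales_hist COMPRESS YES STATIC;
REORG TABLE sales_hist RESETDICTIONARY;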

Adaptive row compression
DB2 further enhances classic row compression with an advanced row compression technique that uses both table-level and page-level compression dictionaries. DB2 can reach higher compression ratios than ever before, and maintain them over time without table reorganization. This enables more granular, adaptive, and dynamic updates, enhances query performance, and improves data availability. Adaptive row compression is available in DB2 10.1 and later releases. Starting with DB2 10.1, it is also the default compression method: if you specify COMPRESS YES, adaptive row compression is used by default. (The keyword STATIC must be used, for example, COMPRESS YES STATIC, if only classic row compression is needed.)

XML compression
The verbose nature of XML implies that XML fragments and documents typically use much disk space. DB2 stores XML data in a parsed hierarchical format, replacing tag names (for example, employee) with integer shorthand. Repeated occurrences of the same tags are assigned the same shorthand. Storing text-rich tags as integer shorthand reduces space consumption and helps improve performance when querying data. Moreover, XML tag parsing, like data row compression, is transparent to users and applications.

Index compression
Indexes, including indexes on declared or created temporary tables, can be compressed in DB2 to reduce storage costs. This is especially useful for large data warehouse environments.

By default, index compression is enabled for compressed tables and disabled for uncompressed tables. You can override this default behavior by using the COMPRESS YES option of the CREATE INDEX statement. When working with existing indexes, use the ALTER INDEX statement to enable or disable index compression; you must then perform an index reorganization to rebuild the index.
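A brief sketch of both paths (the index and table names are hypothetical):

-- New index with compression
CREATE INDEX ix_sales_date ON sales_hist (sale_date) COMPRESS YES;

-- Existing index: enable compression, then rebuild it
ALTER INDEX ix_sales_date COMPRESS YES;
REORG INDEXES ALL FOR TABLE sales_hist;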

Processor usage might increase slightly as a result of the processing that is required for index compression or decompression. If this is not acceptable, you can disable index compression for new or existing indexes.

5.3.2 Multidimensional data clustering

Multidimensional data clustering (MDC) is a method of organizing and storing data along multiple dimensions. Organizing data in this manner can greatly improve the performance of certain query types and reduce reorg and index maintenance.

MDC enables a table to be physically clustered on more than one key (or dimension) simultaneously. MDC tables are organized physically by associating records (or rows) that have similar values for the dimension attributes in an extent (or block). DB2 maintains this physical layout efficiently and provides methods of processing database operations that deliver significant performance improvements.

Figure 5-5 presents a conceptual diagram of multidimensional clustering along three dimensions: region, year, and color.

Figure 5-5 MDC conceptual diagram

Each block contains only rows that have the same unique combination of dimension values. The set of blocks that have the same unique combination of dimension values is called a cell. A cell might consist of one or more blocks in the MDC table. As shown in Figure 5-5, a cell can be described as a combination of unique values of year, nation, and color, such as 1997, Canada, and Blue. All records that have 1997 as year, Canada as nation, and blue as the color are stored in the same extents of that table.

With MDC tables, clustering is ensured. If an existing block satisfies the unique combination of dimension values, the row is inserted into that block, assuming there is sufficient space. If there is insufficient space in the existing blocks, or if no block exists with the unique combination of dimension values, a new block is created.

Example 5-4 shows the DDL for creating an MDC table. The ORGANIZE keyword is used to define an MDC table.

Example 5-4 MDC example
CREATE TABLE sales (
  cust_id INTEGER,
  year    CHAR(4),
  nation  CHAR(15),
  color   CHAR(12),
  amount  INTEGER)
ORGANIZE BY (nation, color, year)

Starting with DB2 10.5, you must specify the ORGANIZE BY ROW USING DIMENSION keywords.

MDC introduced a new type of index that is called a block index. When you create an MDC table, the following two kinds of block indexes are created automatically:
- Dimension block index: A dimension block index per dimension contains pointers to each occupied block for that dimension.
- Composite block index: A composite block index contains all columns that are involved in all dimensions that are specified for the table, as shown in Figure 5-6. The composite block index is used to maintain clustering during insert and update activities. It can also be used for query processing.

Figure 5-6 Block index

With MDC, data is organized on the disk based on dimensions. Queries can skip parts of the table space that the optimizer has determined do not apply. When data is inserted, it is automatically put in the correct place so that you no longer need to reorganize the data. In addition, because one index entry represents the entire data page (versus having one index entry per row with traditional indexes), MDCs reduce the overall size of the space that is required for the indexes. This reduces disk requirements and produces faster queries because of the reduced amount of I/O needed for a query. MDCs also improve delete performance because DB2 now has to drop only a few data pages. Inserts are also faster because DB2 rarely has to update an index page (only the data page).

Performance consideration
The performance of an MDC table depends on the correct choice of dimensions and on the block (extent) size of the table space for the data and application workload. A poor choice of dimensions and extent size can result in low disk storage utilization and poor query access and load utility performance.

In choosing dimensions, you must identify the queries in the existing or planned workloads that can benefit from multidimensional clustering. For existing applications, you can use the DB2 Design Advisor to analyze the workload and recommend dimensions for the MDC tables. For a new table or database, you need a good understanding of the expected workload. The following dimension candidates are typical:
- Columns in range, equality, or IN-list predicates
- Columns with coarse granularity
- Columns used for roll-in and roll-out of data
- Columns that are referenced in GROUP BY and ORDER BY clauses
- Foreign key columns or join clauses in the fact table of a star schema database
- Combinations of the above

Extent size is related to the concept of cell density, which is the percentage of space that is occupied by rows in a cell. Because an extent contains rows only with the same unique combination of dimension values, significant disk space can be wasted if dimension cardinalities are high (for example, when there is a dimension with unique values that would result in an extent per row).

Defining small extent sizes can increase cell density, but increasing the number of extents per cell can result in more I/O operations, and potentially poorer performance when retrieving rows from this cell. However, unless the number of extents per cell is excessive, performance should be acceptable. If every cell occupies more than one extent, it can be considered excessive.

At times, because of data skew, some cells occupy many extents while others occupy only a small percentage of an extent. Such cases signal a need for a better choice of dimension keys. Currently, the only way to determine the number of extents per cell is for the DBA to run appropriate SQL queries or to run db2dart.

Performance might be improved if the number of blocks can be reduced by consolidation. Unless the number of extents per cell is excessive, this situation is not considered an issue.

Multidimensional clustering is a unique capability that is targeted for large database environments, providing an elegant method for flexible, continuous, and automatic clustering of data along multiple dimensions. The result is improvement in the performance of queries, and a reduction in the impact of data maintenance operations (such as reorganization) and index maintenance operations during insert, update, and delete operations.

5.3.3 Range partitioning

Table partitioning (TP), often known as range partitioning (RP), is a data organization scheme where table data is divided across multiple storage objects, called data partitions (not to be confused with database partitions), according to values in one or more table columns. These storage objects can be in different table spaces, in the same table space, or in a combination of both.

DB2 supports data partitions or data ranges based on various attributes. Commonly used partitioning schemes cluster together data partitions by date, such as by year or month, or have numeric attributes for partitioning. For example, records with IDs 1 - 1,000,000 are stored in one data partition, IDs 1,000,000 - 2,000,000 in another data partition, and so on. Or, for example, records for clients with names starting with A - C can be in one data partition, D - M in the second data partition, N - Q in a third data partition, and R - Z in the last data partition.

DB2 provides flexibility for creating partitioned tables. For example, if there is one year of data, it can be partitioned by date so that each quarter is in a separate data partition. The create table syntax in Example 5-5 shows how this can easily be done.

Example 5-5 Create table syntax for range partitioning
CREATE TABLE orders(id INT, shipdate DATE, …)
  PARTITION BY RANGE(shipdate)
  (STARTING '1/1/2006' ENDING '12/31/2006' EVERY 3 MONTHS)

This results in a table being created with four data partitions, each with three months of data, as shown in Figure 5-7.

Figure 5-7 Table partitions

Table partitioning provides the following benefits:
- Improved manageability
  DB2 allows the various data partitions to be administered independently. For example, you can choose to back up and restore individual data partitions instead of entire tables. This enables time-consuming maintenance operations to be segmented into a series of smaller operations.
- Increased query performance
  The DB2 optimizer is data partition aware. Therefore, during query execution, only the relevant data partitions are scanned. This eliminates the need to scan data partitions that are not affected by the query, and can result in improved performance, as illustrated in Figure 5-8.

Figure 5-8 Scanning only relevant partitions

- Fast roll-in and roll-out
  DB2 allows data partitions to be easily added to or removed from a table without taking the database offline. This ability can be useful in a data warehouse environment where there is a need to load or delete data to run decision-support queries. For example, a typical insurance data warehouse might have three years of claims history. As each month is loaded and rolled in to the data warehouse, the oldest month can be archived and removed (rolled out) from the active table. This method of rolling out data partitions is also more efficient because it does not need to log delete operations, which is the case when deleting specific data ranges.

- Better optimization of storage costs
  Table partitioning enables better integration with hierarchical storage models. By using the fastest and most expensive storage hardware for only the most active data partitions, DB2 allows optimization of the overall storage costs and improves performance. If most queries run against only the last three months of data, there is an option to assign slower and less expensive storage hardware to the older data.
- Larger table capacity
  Without partitioning, there are limits to the maximum amount of data that a storage object, and hence a table, can hold. However, by dividing the contents of the table into multiple storage objects or data partitions, each capable of supporting as much data as a non-partitioned table, you can effectively create databases that are virtually unlimited in size.

Starting with DB2 9.7, indexes can either span the rows of all data partitions in a partitioned table (known as nonpartitioned indexes), or the index itself can be partitioned so that each data partition has an associated index partition (known as partitioned indexes).

Using partitioned indexes that match the underlying partitioned tables in a large data warehouse environment can reduce the impact that is involved in attaching or detaching partitions.

When a partition is attached to a table, the SET INTEGRITY statement must be run before the data becomes visible. Among other tasks, SET INTEGRITY updates any existing nonpartitioned indexes, which can be a lengthy operation and can use a considerable amount of log space.

When the new partition is attached to a table with partitioned indexes, and indexes that match those partitioned indexes are already built on the incoming data, the SET INTEGRITY statement does not incur the performance and logging impact that is associated with index maintenance.
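As a hedged sketch that builds on Example 5-5 (the partition, staging, and archive table names are hypothetical), a quarterly roll-in and roll-out might look like this:

-- Roll in: attach a staging table that holds the new quarter
ALTER TABLE orders
  ATTACH PARTITION q1_2007
  STARTING '1/1/2007' ENDING '3/31/2007'
  FROM orders_q1_2007_stage;
SET INTEGRITY FOR orders IMMEDIATE CHECKED;

-- Roll out: detach the oldest quarter into an archive table
ALTER TABLE orders
  DETACH PARTITION q1_2006 INTO orders_q1_2006_archive;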

Table partitioning is similar to MDC in that it enables rows with similar values to be stored together. However, the following TP characteristics distinguish it from MDC:
- TP supports partitioning a table into data partitions along a single dimension. A common design is to create a data partition for each month. MDC supports defining multiple dimensions.
- With TP, the user can manually define each data partition, including the range of values to include in that data partition. MDC automatically defines a cell (and creates blocks to store data for that cell) for each unique combination of MDC dimension values.

- Each TP partition is a separate database object (unlike other tables, which are a single database object). TP supports attaching and detaching a data partition from the TP table. A detached partition becomes a regular table. Also, each data partition can be placed in its own table space, if wanted.
- Essentially, the distinct TP benefit is related to adding or removing large numbers of rows from a table, that is, roll-in and roll-out. If you are familiar with the use of a Union All View (UAV) to partition history tables on date, TP can be considered an analogous but superior solution.


Chapter 6. Data movement and transformation

This chapter provides information about some of the commonly used DB2 for Linux, UNIX, and Windows utilities for data movement. It also describes the data movement and transformation capabilities of the DB2 Warehouse SQL Warehousing Tool (SQW), which provides these services.

6.1 Introduction

Moving data across different databases is a common task in IT departments. Moving data mainly refers to copying data from a source system to a target system, not necessarily removing it at the source location. You can process the data remotely (for example, export it to a file), which is a valid solution for infrequent processing of reasonable amounts of data, or you can move the data to the most suitable place for the overall processing (for example, move data from one system to another). For example, it is common to copy production data (or at least a subset of it) to a test system.

Typical DB2 data movement tasks involve three steps:
1. Exporting the data from the source database into a temporary data exchange file in binary or text format
2. Moving the generated file between systems
3. Importing or loading the data from the file into the target database

This section describes some of the most common methods and how they are used in moving data into a table and database.

6.1.1 Load

Using the load utility to insert data is the most common practice in data warehouse environments. The load utility is used to insert the data into a table of a database after extracting that data from the same or a different database by using other utilities, such as export. The export and import utilities use SQL to extract and add data. The load utility is often faster than the import utility because it bypasses the DB2 SQL engine and builds physical database pages and puts them directly into the table space. These utilities can be used to move data between databases on any operating systems and versions. You can also use them to add data in files to a database.

Although the LOAD command is often much faster than the import utility and can load much data to a table quickly, there are some aspects of the LOAD command that you must consider.

If a table that you load has referential constraints to other tables, you must run the SET INTEGRITY command after loading your tables. This command verifies that the referential and all other constraints are valid. If there are constraints and you do not use the SET INTEGRITY command, the loaded table and its dependent tables might not be accessible.
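A minimal sketch of a load followed by the integrity check (the file and table names are placeholders):

LOAD FROM sales.del OF DEL REPLACE INTO sales;
SET INTEGRITY FOR sales IMMEDIATE CHECKED;

If dependent tables are also placed in set integrity pending state, they need their own SET INTEGRITY processing.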

You should also understand the COPY YES/NO parameters of the LOAD command. Depending on which you choose, the table space that contains the table that you load can be placed into “backup pending” status, and no access to any of the tables in that table space is allowed until a backup is done. This utility allows you to copy data between different versions of DB2 and databases on different operating systems.

Large object and XML considerations
Large objects (LOBs) and XML columns can be imported, exported, and loaded without using the special LOB and XML parameters, but with significant limitations. For LOB data, columns are limited to 32 KB in these utilities if you do not use the LOB parameters. When exporting tables with columns of the XML data type, the XML columns are always exported to files that are separate from the regular relational data. Before using the LOAD command, read and understand how this utility handles these special data types, as described in Data Movement Utilities Guide and Reference, SC27-3867.

Loading from cursor
The CURSOR option of the LOAD command allows you to connect to another database, or to tables in the same database, and have the load utility select data from a table on the source and load it directly into the target. One nice attribute of this process is that you do not need space for an intermediate file into which to extract the table data, because the data is loaded as it is extracted. This option allows you to copy data between different versions of DB2 and databases on different operating systems.

This option avoids generating the data exchange file by using the export utility, which is often a lengthy process in the case of large amounts of data. Besides, different database code pages and operating systems must be considered when moving data in and out of a database.

Note: This implies that loading from cursor handles database code page differences, but loading from a file does not, so code pages must be an additional consideration in that scenario. When you are importing or exporting a DEL file, the code page of the data is assumed to be the same as that of the application running the utility. If it is different, unpredictable results might occur. When loading a DEL file, the code page of the data is assumed to be the same as that of the database.

When you specify the FROM CURSOR option, the load utility directly references the result set of a SQL query as the source of a data load operation, thus bypassing the need to produce a temporary data exchange file. This way, LOAD FROM CURSOR is a fast and easy way to move data between different table spaces or different databases. LOAD FROM CURSOR operations can be run on the command line and from within an application or a stored procedure by using the ADMIN_CMD stored procedure.
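For instance, a cursor load that is run through the ADMIN_CMD stored procedure might look like the following sketch (the schema and table names are hypothetical, and the exact option set depends on your recoverability requirements):

CALL SYSPROC.ADMIN_CMD(
  'LOAD FROM (SELECT * FROM staging.sales) OF CURSOR
   INSERT INTO dwh.sales NONRECOVERABLE');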

For examples of LOAD FROM CURSOR, see Fast and easy data movement using DB2's LOAD FROM CURSOR feature, found at:
http://www.ibm.com/developerworks/data/library/techarticle/dm-0901fechner/

6.1.2 Ingest

The ingest utility, which was introduced in DB2 10.1, incorporates the best features of both the load and import utilities and adds data import features. The ingest utility meets the demanding requirements of the extract, load, and transform (ELT) operations of data warehouse maintenance. It supports the need to have data that is current and also available 24x7 to make mission-critical decisions in the business environment of today.

Like the load utility, the ingest utility is fast, although not as fast as the load utility. Like the import utility, the ingest utility checks constraints and fires triggers so there is no need to use the SET INTEGRITY facility after running ingest utility. In addition, the table being loaded is available during loading. The ingest utility also can do continuous loading of data, so if you have data arriving constantly, you can set up the utility to continuously add data to your table, as described in 6.1.3, “Continuous data ingest” on page 96.

Furthermore, you can use ingest to update rows, merge data, and delete rows that match records in the input file.
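The following sketch shows the general shape of the INGEST command for a simple insert and for a merge (the file, field, and table names are hypothetical, and the field definitions follow the documented delimited-format style):

INGEST FROM FILE sales.csv
  FORMAT DELIMITED (
    $sale_id INTEGER EXTERNAL,
    $amount  DECIMAL(12,2) EXTERNAL
  )
  INSERT INTO sales(sale_id, amount) VALUES($sale_id, $amount);

INGEST FROM FILE sales_updates.csv
  FORMAT DELIMITED (
    $sale_id INTEGER EXTERNAL,
    $amount  DECIMAL(12,2) EXTERNAL
  )
  MERGE INTO sales
    ON (sale_id = $sale_id)
    WHEN MATCHED THEN UPDATE SET amount = $amount
    WHEN NOT MATCHED THEN INSERT (sale_id, amount) VALUES($sale_id, $amount);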

With its rich set of functions, the ingest utility can be a key component of your ETL system:
- In near real-time data warehouses
  In databases, specifically data marts and data warehouses, where data must be updated on a near real-time basis, users cannot wait for the next load window to get fresh data. Additionally, they cannot afford to have offline LOAD jobs running during business hours.

- To populate HADR databases
  Using the load utility in HADR databases is complicated. The only way to use LOAD in an HADR database is to specify the COPY YES clause. Nevertheless, using the load utility in this way requires a shared file system between the primary and standby servers, or manual DBA intervention to keep the HADR standby in sync after the LOAD. Ingest provides a better option in this case because it is fully logged, so HADR automatically replicates all the inserted rows to the standby server without any user intervention.

For more information about the ingest utility, see Unleashing DB2 10 for Linux, UNIX, and Windows, SG24-8032. This section summarizes some of the salient points.

The ingest utility is a high-speed, configurable, client-side DB2 utility. With its multithreaded architecture, the ingest utility can pump massive amounts of data quickly and continuously into DB2 tables from files or pipes, with minimal effect on concurrent user workloads and data server resources.

The ingest utility differs from other data movement utilities in several key areas:
 With the ingest utility, queries can run concurrently against the DB2 tables that are being loaded, with minimal effect on data availability. You no longer need to choose between data concurrency and availability.
 The ingest utility features high fault tolerance and recoverability.
 Unlike other data movement utilities, the ingest utility supports SQL expressions (including predicates and casting), so in addition to performing the load stage of your ETL process, the ingest utility can also participate in the transform stage.
 The ingest utility supports a range of DML statements, including INSERT, UPDATE, DELETE, MERGE, and REPLACE.
 Not only is the ingest utility compatible with DB2 Warehouse, but it is database-partition-aware, which means that the utility routes inserted rows directly to the appropriate partition (rather than going through a coordinator). This approach contributes to its speed in a partitioned database environment.
 Support for SQL expressions, including predicates and casting, allows column values to be derived from more than one data field, or from an expression at run time. This feature decreases costs in the transform step because fewer scripts must be maintained to manipulate data after it is exported from the OLTP database.

 The INGEST command can load data continuously when it is called from a scheduled script. Because the ingest utility issues SQL statements just like any other DB2 application, changes are logged and its statements can be recovered if necessary.
 Furthermore, the ingest utility commits data frequently, and the commit interval can be configured by time or by number of rows. Rejected rows can be discarded or placed into a file or table.
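
For illustration, here is a minimal sketch of two INGEST invocations (the file, column, and table names are hypothetical): the first appends delimited data to a table, and the second merges incoming records into an existing table.

INGEST FROM FILE sales.del FORMAT DELIMITED INSERT INTO dwh.sales_fact;

INGEST FROM FILE sales_updates.del
   FORMAT DELIMITED ($id INTEGER EXTERNAL, $qty INTEGER EXTERNAL)
   MERGE INTO dwh.sales_fact
      ON (id = $id)
      WHEN MATCHED THEN UPDATE SET qty = $qty
      WHEN NOT MATCHED THEN INSERT VALUES($id, $qty);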

6.1.3 Continuous data ingest

Continuous data ingest (CDI) is the process of moving data into DB2 tables with no impact on running applications. The main target of this technology is to feed near real-time data marts and data warehouses. In such environments, up-to-date data is so critical that customers cannot wait for the next load window to get access to newer data.

In these cases, ingest can be used to insert data into the tables without impacting running queries or applications. The high concurrency levels that are provided by ingest eliminate the need for an offline window to populate the database. Additionally, the performance that is provided by the tool ensures that large amounts of data are loaded with high throughput.

The CDI function is implemented by the ingest tool, a client-side utility that uses parallel sessions to insert data into the tables. You can deploy your solution on the server or client side. The tool also provides parameters to control the execution, such as defining the number of parallel threads that insert or format data. This new method is supported by single and multi-partitioned databases, pureScale environments, and HADR-enabled databases.

Ingest architecture
Internally, the ingest architecture consists of a multi-threaded process. There are three types of threads (transporters, formatters, and flushers) that are responsible for each phase of the ingest operation, as shown in Figure 6-1.

Figure 6-1 Ingest architecture

A single INGEST command goes through three major phases:
 Transport
The transporters read from the data source and put records on the formatter queues. For INSERT and MERGE operations, there is one transporter thread for each input source (for example, one thread for each input file). For UPDATE and DELETE operations, there is only one transporter thread.
 Format
The formatters parse each record, convert the data into the format that DB2 database systems require, and put each formatted record on one of the flusher queues for that record's partition. The number of formatter threads is specified by the num_formatters configuration parameter. The default is (number of logical processors)/2.
 Flush
The flushers issue the SQL statements to perform the operations on the DB2 tables. The number of flushers for each partition is specified by the num_flushers_per_partition configuration parameter. The default is max(1, ((number of logical processors)/2)/(number of partitions)).
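
If the defaults do not fit your system, these parameters can be adjusted with the INGEST SET command before INGEST is run in the same CLP session. A brief sketch (the values shown are illustrative only):

INGEST SET num_formatters 4;
INGEST SET num_flushers_per_partition 2;
INGEST SET commit_count 10000;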

For more information, see Introduction to the DB2 Continuous Data Ingest feature, found at: http://www.ibm.com/developerworks/data/library/techarticle/dm-1304ingestcmd/

Load considerations for column-organized tables
Loading data into column-organized tables is similar to loading data into row-organized tables, but you must also consider the following items:
 The input data source is processed twice; the input source can be reopened or temporarily cached in the load temporary file directory (specified by the TEMPFILES PATH option of the LOAD command).
 Additional storage space (equivalent to the raw size of the input data) is required on the server for building the column compression dictionary if the data source (such as a pipe or a socket) cannot be reopened.
 Memory requirements temporarily increase when the column compression dictionary is created.
 For optimal load performance, additional cache memory is required to write column-organized data in extent-sized amounts, rather than one page at a time, which reduces I/O costs.
 By default, when you create a database with DB2_WORKLOAD=ANALYTICS, the util_heap_sz (utility heap size) database configuration parameter is set to AUTOMATIC. If the automatic setting is not used, set util_heap_sz to at least 1,000,000 pages to address the resource needs of the LOAD command. If the database server has at least 128 GB of memory, set util_heap_sz to 4,000,000 pages or more. If concurrent load utilities are running, increase the util_heap_sz value to accommodate higher memory requirements. If memory is scarce, the util_heap_sz value can be increased dynamically only when a load operation is running.
 The blocknonlogged database configuration parameter must be set to NO before loading data into a column-organized table.
 After load, import, and ingest operations, follow-up operations such as SET INTEGRITY and REORG are not required for column-organized tables.
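
As a hedged sketch of the settings that the preceding list describes (the database name SALESDW is hypothetical): the DB2_WORKLOAD registry variable is normally set before the database is created, and util_heap_sz and blocknonlogged can be adjusted later with UPDATE DB CFG.

db2set DB2_WORKLOAD=ANALYTICS
UPDATE DB CFG FOR SALESDW USING UTIL_HEAP_SZ 1000000;
UPDATE DB CFG FOR SALESDW USING BLOCKNONLOGGED NO;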

6.2 DB2 Warehouse SQL Warehousing Tool

The basic function of SQW is to manage and move data into and around the data warehouse when transforming it for various purposes. SQW provides these services by using the power of the DB2 relational database engine and the SQL language, which classifies SQW in the ETL category of data movement and transformation tools. SQW also provides sequencing and flow control functions and functions to integrate non-database processing.

6.2.1 Architecture

SQW architecture separates the tools that are needed for the development environment and the runtime environment, as shown in Figure 6-2.

Figure 6-2 SQW architecture

The development tool is the DB2 Warehouse Design Studio, where the data movement and transformation routines are developed and tested.

After testing in the development environment against development databases, the routines are deployed to the runtime environment, where they can be scheduled for execution in the production databases. The web-based administration console is used to manage the runtime environment, including deployment, scheduling, execution, and monitoring functions.

6.2.2 Development environment

The development tool, Design Studio, is an Eclipse-based graphical editor and a member of the IBM Data Studio family of products.

All the necessary steps to develop the logic to extract, transform, and load target warehouses based on the available source data happen in the development environment.
 The starting point is a physical model of the databases (sources and target), which is essentially an extraction of the schema, tables, and other details that describe the physical implementation of the databases. The physical data model provides metadata about the structure of the source and target tables.
 The flow of data is then designed. A data flow is a representation of what happens to the data as it flows into the execution database from source to target. The SQW plug-in in the Design Studio offers many operators and routines to help define source data, transformations, and target tables in a data flow.

A data flow is essentially the set of operations that performs the following tasks:
– Bring data into the data warehouse.
– Move and transform the data in the data warehouse.
– Update target tables.

From this model, Design Studio generates optimized SQL to perform the actions that are specified in the flow model. The SQL is run at the DB2 database that is selected as the execution database. The execution database does the transformation work when the SQL code in a data warehousing application runs.
 When you determine where data is coming from, what must be extracted, how the data is transformed, and where the data is updated, the sequencing of the activities and how control is passed from one to another is determined based on defined conditions. This activity is the Control Flow development phase.

A control flow is a set of operations that accomplish the following tasks:
– Start activities of various technologies.
– Sequence activities into a defined flow.
– Define how control is passed between activities based on defined conditions.

Activities represent work that is accomplished at a server or a flow control decision point. Activities are arranged in sequences that define in what order and under what conditions the activities are run. From this model, Design Studio generates code that manages the execution of activities in the correct sequence, dynamically, based on the defined conditions.
 The next step is to verify that the data and control flows do what they are designed to do. The testing and debugging step ensures that the flow of data and the sequencing of the activities happen successfully before the process is implemented in a production environment. This is an iterative process to ensure that the design is successful.
 During the process of validation, and when the flows are ready for deployment into production, code generation happens. The generated code might be optimized DB2 SQL embedded in an internal script-like control language that is called an execution plan graph (EPG), as in the case of data flows. In the case of control flows, it might also contain code units for other tasks, such as sequencing, conditions, and variables, in addition to the optimized SQL.

6.2.3 Moving from development to production

After the data flows and control flows are developed and tested in the development environment of Design Studio, they must be installed into the runtime environment for production use.

This deployment process consists of two phases:
 Preparation of the deployment “package”
This phase is done in Design Studio and consists of defining a warehouse application that is a collection of data and control flows. Code for the included flows is generated and packaged into a compressed deployment file.
 Deployment of the package into the runtime environment
The compressed deployment package file can then be used by the administrator of the runtime environment. The administrator uses the web-based administration console to define all the database and system resources that are required for implementing a specific warehouse application package, install the warehouse application (which is essentially the intelligence that is contained in the data and control flows through the generated code), and configure any needed schedules.

6.2.4 Runtime environment

A collection of related control flows that are organized into a data warehouse application is deployed into the runtime environment that is managed by the web-based administration console. The warehouse application can be run on demand or through a schedule, and monitored. Various levels of statistics are also maintained to enable better management of the environment. The SQW runtime server component manages the running of each activity, in the correct sequence and on the appropriate database or system.

In the runtime environment, the unit of execution is the control flow. Control flows can be managed from the control flows page of the SQL Warehousing tab, as shown in Figure 6-3. From this page, you can perform the following tasks:
 Manually run a control flow.
 Create and manage execution schedules.
 Manage execution profiles.
 View execution statistics.
 View the application log.

Figure 6-3 SQL Warehouse tab

The SQW component in InfoSphere Warehouse Version 10.1 Fix Pack 2 (10.1.0.2) includes support for the continuous data ingest (CDI) facility in DB2. The ingest utility is a high-speed client-side DB2 utility that streams data from files into DB2 target tables. Because the ingest utility can move large amounts of real-time data without locking the target table, you do not need to choose between data currency and availability. You can use the ingest utility to continuously pump data into DB2 tables by using SQL array inserts, updates, and deletes until the sources are exhausted.

Support for the Liberty profile of the WebSphere Application Server Network Deployment component was added in InfoSphere Warehouse Version 10.1 Fix Pack 2 for the runtime environment (Administration Console). The Liberty profile is embedded in InfoSphere Warehouse. At installation time, users can choose to use either a stand-alone instance of WebSphere Application Server or the version that is embedded in InfoSphere Warehouse.

As of SQW for DB2 Warehouse 10.5, SQW supports BLU Acceleration (load, import, and ingest into column-organized tables) and DB2 Oracle Compatible mode.

Integration between DataStage and DB2 Warehouse
DB2 Warehouse provides support for IBM DataStage jobs through the SQW plug-in in Design Studio. Integration of the Design Studio and DataStage systems allows you to export and import data flows and jobs from one execution environment to the other. XML metadata files serve as the foundation for the movement of jobs and data flows between the two systems.

Not all DataStage stages or even all properties of DataStage stages are supported in the Design Studio data flow environment.

The DataStage parallel job is first exported into an XML metadata format by using the Export utility in the DataStage Manager. Then, from within the Design Studio explorer, the Import Parallel Job as Subflow utility is used to create a subflow that corresponds to the original DataStage job. This feature allows jobs that were developed in the DataStage Designer and that contain operators that Design Studio does not support to be used from the Design Studio design and runtime environments.

DataStage also supports column-organized tables in its extract and load execution environment.

For more information about creating data flows and how to import a DataStage job into Design Studio, see the information center found at: http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/index.jsp?topic=%2Fcom.ibm.dwe.welcome.doc%2Fdwe91design.html


Chapter 7. Monitoring

As described in Chapter 9, “Workload management” on page 145, DB2 workload management has four clearly defined stages, the last of which is monitoring to ensure that your data server is being used efficiently. Database monitoring includes all of the processes and tools that you use to examine the operational status of your database, which is a key activity to ensure the continued health and excellent performance of your database management system. The DB2 monitoring infrastructure collects information from the database manager, active databases, and connected applications to help you analyze the performance of specific applications, SQL queries, and indexes. You can then use that data to troubleshoot poor system performance or to gauge the efficacy of your tuning efforts.

You can monitor your database in one of two ways:
 Monitoring with table functions: You can monitor your database operations in real time by using monitoring table functions. These table functions return data from monitor elements that report on most database operations at a specific point in time. The monitoring table functions use a high-speed monitoring infrastructure that was introduced in DB2 for Linux, UNIX, and Windows 9.7. Before Version 9.7, the DB2 monitoring infrastructure included snapshot monitoring routines. Although these routines are still available, this technology is no longer being enhanced in the DB2 product, and you are encouraged to use the monitoring table functions wherever possible.
 Monitoring with event monitors: You can configure event monitors to capture information about specific database events (such as deadlocks) over time. Event monitors generate output in different formats, but all of them can write event data to regular tables.

This chapter provides an overview of the DB2 monitoring interfaces. It also introduces you to IBM InfoSphere Optim Performance Manager Version 5.3, which provides a convenient web interface to help you identify and analyze database performance problems.

7.1 Understanding monitor elements

Monitor elements are data structures that are used to store information about the operational status of your database. The DB2 monitor elements can be classified according to the type of data that they track:
 Counter elements track the number of times that some specific event occurs. For example, the deadlocks monitor element records the total number of deadlocks that occurred.
 Gauge elements track how much of something is happening or used. For example, the total_sort_time monitor element tracks the total elapsed time for all sort operations.
 Watermark elements track the highest value reached. For example, the sort_heap_top monitor element tracks the private sort memory high watermark (in 4 KB pages) across the DB2 instance.

 Text elements track text values. For example, the stmt_text monitor element contains the text of an SQL statement.
 Timestamp elements track the time at which an event occurred. For example, the conn_time monitor element tracks the time at which a connection was made to the database.

Monitor elements can be further classified by the type of work that they monitor:
 Request monitor elements measure the amount of work that is needed to process different types of application requests. A request is a directive to a database agent to perform some work that consumes database resources. For example, an application request is a directive that is issued directly by an external application.
 Activity monitor elements are a subset of request monitor elements. Activity monitor elements measure the amount of work that is needed to run SQL statement sections, including locking, sorting, and the processing of row-organized or column-organized data.
 Data object monitor elements return information about operations that are performed on specific data objects, including buffer pools, containers, indexes, tables, and table spaces.
 Time-spent monitor elements track how time is spent in the system.
– Wait-time monitor elements track the amount of time during which the database manager waits before it continues processing. For example, the database manager might spend time waiting for locks on objects to be released; this time is tracked by the lock_wait_time monitor element.
– Component processing time monitor elements track the amount of time that is spent processing data within a specific logical component of the database. For example, the total_commit_proc_time monitor element tracks the amount of time that is spent committing transactions.
– Component elapsed time monitor elements track the total amount of elapsed time that is spent within a specific logical component of the database. This time includes both processing time and wait time. For example, the total_commit_time monitor element tracks the total amount of time that is spent performing commit processing on the database server.

For more information about the DB2 monitor elements, see the DB2 for Linux, UNIX, and Windows 10.5 information center at the following web page: http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/index.jsp

7.1.1 Monitor element collection levels

The collection level for a monitor element identifies the settings that must be active before data can be collected for that element.

You can use database configuration parameters to set the default collection level for a specific class of monitor elements for the entire database. For example, if you set the mon_req_metrics database configuration parameter to BASE, application request data is collected for all agents running in your database. You can also specify collection levels for specific DB2 workload management objects that are different from the level that is used for the database as a whole.

Data for some monitor elements is always collected. There are no configuration parameters or SQL statement options to control the collection of this data.
 Monitor elements with collection level DATA OBJECT METRICS BASE or DATA OBJECT METRICS EXTENDED are collected if you set the mon_obj_metrics database configuration parameter to BASE or EXTENDED. A value of EXTENDED causes certain monitor elements to collect data at a higher level of granularity. If you set this parameter to NONE, data is not collected.
 Monitor elements with collection level REQUEST METRICS BASE or REQUEST METRICS EXTENDED are collected if you set the effective collection level to BASE or EXTENDED. The effective collection level for request metrics is determined by the mon_req_metrics database configuration parameter and the COLLECT REQUEST METRICS option on a service superclass. Monitor elements are collected for a request if you set the configuration parameter to a value other than NONE, or if the request is submitted by a connection that is mapped to a subclass under a superclass that has a COLLECT REQUEST METRICS setting other than NONE.
 Monitor elements with collection level ACTIVITY METRICS BASE or ACTIVITY METRICS EXTENDED are collected if you set the effective collection level to BASE or EXTENDED. The effective collection level for activity metrics is determined by the mon_act_metrics database configuration parameter and the COLLECT ACTIVITY METRICS option on a workload. Monitor elements are collected for an activity if you set the configuration parameter to a value other than NONE, or if the activity is submitted by a connection that is associated with a workload that has a COLLECT ACTIVITY METRICS setting other than NONE.

 Monitor elements with collection level SECTION ACTUALS BASE are collected if you set the effective collection level to BASE. The effective collection level for section actuals is determined by the section_actuals database configuration parameter, the specification of the INCLUDE ACTUALS clause on the CREATE or ALTER WORKLOAD, CREATE or ALTER WORK ACTION SET, or CREATE or ALTER SERVICE CLASS statements, and the collectsectionactuals setting on the WLM_SET_CONN_ENV procedure.
 Monitor elements with collection level COLLECT AGGREGATE ACTIVITY DATA or COLLECT AGGREGATE REQUEST DATA are collected if you specify the COLLECT AGGREGATE ACTIVITY DATA or COLLECT AGGREGATE REQUEST DATA clause on the CREATE or ALTER statement for specific types of DB2 workload management objects.

Important: The effective collection level for a monitor element is determined by applying the highest collection level that was specified for that element for the scope in which it is collected. For example, if you specify a collection level of NONE for request monitor elements at the database level (using the SAMPLE database), but want to collect EXTENDED metrics for agents that are running in the default service superclass, you can achieve this by running the following command and SQL statement:
UPDATE DB CFG FOR SAMPLE USING MON_REQ_METRICS NONE;
ALTER SERVICE CLASS SYSDEFAULTUSERCLASS COLLECT REQUEST METRICS EXTENDED;

In this example, request metrics are collected for all agents that run in the SYSDEFAULTUSERCLASS, but not for agents that run outside of the SYSDEFAULTUSERCLASS.

Now suppose that you specify a collection level of EXTENDED for activity monitor elements at the database level, but you do not want to collect activity metrics for the default user workload. You can achieve this by running the following command and SQL statement:
UPDATE DB CFG FOR SAMPLE USING MON_ACT_METRICS EXTENDED;
ALTER WORKLOAD SYSDEFAULTUSERWORKLOAD COLLECT ACTIVITY METRICS NONE;

In this example, activity metrics are collected for all agents that run in the database, including those that run as part of the SYSDEFAULTUSERWORKLOAD. The effective collection level is determined by the broader collection level (EXTENDED) that was specified at the database level by setting the mon_act_metrics configuration parameter.

7.1.2 New monitoring elements for column-organized tables

DB2 10.5 introduces column-organized data processing to DB2 databases. The release also introduces new monitor elements to help you monitor workloads that involve queries against column-organized tables.

Here is a list of the new monitor elements:
 New monitor elements to assess buffer pool efficiency
You can monitor data page I/O for column-organized tables separately from that of row-organized tables, and use monitor elements to understand what portion of the I/O is being driven by access to column-organized tables when a workload impacts both row-organized and column-organized tables. You can also use these elements to help you decide whether to place column-organized tables in separate table spaces, or whether to use a separate buffer pool. Here is a list of these elements:
– Counters for total logical and physical column-organized data page reads and pages found
– A counter for column-organized data page writes
– Counters for asynchronous column-organized data page reads and writes and pages found
– Counters for column-organized data page reads per table and per statement per table
 New monitor elements to monitor prefetch requests for data in column-organized tables
New monitor elements to measure prefetcher efficiency can help you track the volume of requests for data in column-organized tables that are being submitted to prefetchers, and the number of pages that prefetchers skipped reading because the pages are already in memory. Efficient prefetching of data in column-organized tables is important for mitigating the I/O costs of data scans.
 New monitor elements to measure column data size
Column-organized tables are associated with a new table object where the column data is stored. New monitor elements (col_object_l_size, col_object_p_size, and col_object_l_pages) help you estimate the size of the column data. The tab_organization monitor element, which is returned by the MON_GET_TABLE table function, reports the organization of data in the table.

 New time-spent monitor elements
Three new monitor elements (total_col_time, total_col_proc_time, and total_col_executions) count the total time that is spent in column-organized data processing across all column-organized processing subagents. The total_col_time monitor element represents total elapsed time over all column-organized processing subagents. The total_col_proc_time monitor element represents the subset of total_col_time in which the column-organized processing subagents were not idle on a measured wait time. The total_col_executions monitor element represents the total number of times that data in column-organized tables was accessed during statement execution.
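
As a hedged illustration of how some of these elements can be queried (a sketch only; verify the column names against the 10.5 documentation), the following statement lists column-organized tables and their logical page counts by using the MON_GET_TABLE table function:

SELECT VARCHAR(TABSCHEMA, 12) AS SCHEMA,
       VARCHAR(TABNAME, 20) AS NAME,
       TAB_ORGANIZATION,
       COL_OBJECT_L_PAGES
FROM TABLE(MON_GET_TABLE('', '', -2))
WHERE TAB_ORGANIZATION = 'C';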

For more information about the new monitoring metrics for column-organized data processing in DB2 10.5, see the information center at the following web page: http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/index.jsp

7.2 Using DB2 table functions to monitor your database in real time

DB2 table functions return data from monitor elements that report on the following items at a specific point in time:
 Application requests
 Activities
 Operations on data objects
 Locks
 System memory use
 Routines

For more information about the DB2 table functions that return data from monitor elements, see the DB2 10.5 information center at the following web page: http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/index.jsp

7.2.1 Monitoring requests

You can use request monitor elements when you monitor data server operations that are associated with processing application requests. Request monitor elements are aggregated by unit of work, workload, service class, and connection.

Use the following table functions to access current request monitoring information:
 MON_GET_AGENT
 MON_GET_CONNECTION and MON_GET_CONNECTION_DETAILS
 MON_GET_SERVICE_SUBCLASS and MON_GET_SERVICE_SUBCLASS_DETAILS
 MON_GET_UNIT_OF_WORK and MON_GET_UNIT_OF_WORK_DETAILS
 MON_GET_UTILITY
 MON_GET_WORKLOAD and MON_GET_WORKLOAD_DETAILS

For a database that is created in DB2 10.5, these table functions collect request monitoring information by default.
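
For example, a simple sketch of a query that ranks current connections by CPU consumption (the column list is illustrative) might look like this:

SELECT APPLICATION_HANDLE,
       VARCHAR(APPLICATION_NAME, 20) AS APPL_NAME,
       TOTAL_RQST_TIME,
       TOTAL_CPU_TIME
FROM TABLE(MON_GET_CONNECTION(CAST(NULL AS BIGINT), -2))
ORDER BY TOTAL_CPU_TIME DESC;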

7.2.2 Monitoring activities

You can use activity monitor elements when you monitor the amount of work that is needed to run SQL statement sections. Activity monitor elements are aggregated in the package cache by all executions of each SQL statement section.

Use the following table functions to access current activity monitoring information:
 MON_GET_ACTIVITY_DETAILS
 MON_GET_PKG_CACHE_STMT and MON_GET_PKG_CACHE_STMT_DETAILS

For a database that is created in DB2 10.5, these table functions collect activity monitoring information by default.
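
As a sketch (column choices are illustrative), the following query returns the most CPU-intensive statement sections that are currently in the package cache:

SELECT VARCHAR(STMT_TEXT, 80) AS STATEMENT,
       NUM_EXECUTIONS,
       TOTAL_CPU_TIME
FROM TABLE(MON_GET_PKG_CACHE_STMT(NULL, NULL, NULL, -2))
ORDER BY TOTAL_CPU_TIME DESC
FETCH FIRST 10 ROWS ONLY;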

7.2.3 Monitoring data objects

You can use data object monitor elements when you monitor operations that are performed on specific data objects, including buffer pools, containers, indexes, tables, and table spaces.

Use the following table functions to access current data object monitoring information:
 MON_GET_BUFFERPOOL
 MON_GET_CONTAINER
 MON_GET_INDEX

 MON_GET_TABLE
 MON_GET_TABLESPACE

For a database that is created in DB2 10.5, these table functions collect data object monitoring information by default.
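
For example, a minimal sketch that identifies the most heavily read tables (the columns shown are illustrative) is:

SELECT VARCHAR(TABSCHEMA, 12) AS SCHEMA,
       VARCHAR(TABNAME, 20) AS NAME,
       ROWS_READ,
       TABLE_SCANS
FROM TABLE(MON_GET_TABLE('', '', -2))
ORDER BY ROWS_READ DESC
FETCH FIRST 5 ROWS ONLY;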

7.2.4 Monitoring locks

You can use the following table functions to access current lock monitoring information:
 MON_GET_LOCKS
 MON_GET_APPL_LOCKWAIT

These table functions collect lock monitoring information by default.
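
A quick sketch of a lock overview query (the columns shown are illustrative) is:

SELECT APPLICATION_HANDLE,
       LOCK_OBJECT_TYPE,
       LOCK_MODE,
       LOCK_STATUS
FROM TABLE(MON_GET_LOCKS(NULL, -2));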

7.2.5 Monitoring system memory

You can use the following table functions to access system memory information:
 MON_GET_MEMORY_POOL
 MON_GET_MEMORY_SET

You can examine memory usage by specific memory pools within a memory set, or at the level of memory sets, which are allocations of memory from the operating system.
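
For example, a hedged sketch that lists memory consumption by pool (parameter values and columns are illustrative) is:

SELECT VARCHAR(MEMORY_SET_TYPE, 20) AS SET_TYPE,
       VARCHAR(MEMORY_POOL_TYPE, 20) AS POOL_TYPE,
       MEMORY_POOL_USED
FROM TABLE(MON_GET_MEMORY_POOL(NULL, NULL, -2))
ORDER BY MEMORY_POOL_USED DESC;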

7.2.6 Monitoring routines

You can use the following table functions to access current information about routines:
 MON_GET_ROUTINE and MON_GET_ROUTINE_DETAILS
 MON_GET_ROUTINE_EXEC_LIST
 MON_GET_SECTION_ROUTINE
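
As a sketch (the columns shown are illustrative and should be verified against the documentation), the following query lists the routines that consume the most CPU time:

SELECT VARCHAR(ROUTINE_SCHEMA, 12) AS SCHEMA,
       VARCHAR(ROUTINE_NAME, 20) AS NAME,
       TOTAL_CPU_TIME
FROM TABLE(MON_GET_ROUTINE(NULL, NULL, NULL, NULL, -2))
ORDER BY TOTAL_CPU_TIME DESC;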

7.3 Using event monitors to capture information about database events

You can create event monitors to capture information about specific events that take place in your system at the time at which they occur. Monitored events include lock assignments, deadlocks, operations (such as the execution of SQL statements) that create database activity, completion of transactions (units of work), eviction of sections from the package cache, application connections to a database, database deactivations, statistics collection, violations of workload management thresholds, and database or database manager configuration changes.

The process of creating and using an event monitor is similar for all event monitor types. The first step is to create the event monitor by using the CREATE EVENT MONITOR statement and specifying the type of event monitor that you want to create. The second step is to activate the event monitor (by using the SET EVENT MONITOR STATE statement), and the third step is to access and work with the captured data.

The following example code creates and activates event monitors for activities and statistics in the SAMPLE database:
CONNECT TO SAMPLE;

CREATE EVENT MONITOR DB2ACTIVITIES FOR ACTIVITIES WRITE TO TABLE;
CREATE EVENT MONITOR DB2STATISTICS FOR STATISTICS WRITE TO TABLE;

SET EVENT MONITOR DB2ACTIVITIES STATE 1;
SET EVENT MONITOR DB2STATISTICS STATE 1;

Event monitors can write their output to relational tables, named pipes, or files. By default, some event monitors activate automatically upon database activation; others require that you activate them manually. To disable an event monitor, run the SET EVENT MONITOR STATE statement and specify the STATE 0 option.
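
For example, to deactivate the activities event monitor that was created earlier:

SET EVENT MONITOR DB2ACTIVITIES STATE 0;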

For certain event monitors (activities, locking, package cache, statistics, and unit of work event monitors), you must explicitly enable data collection, either by setting the appropriate database configuration parameters or by enabling the collection of specific kinds of data for specific types of workload management objects.

For example, to configure the collection of basic data for a unit of work event monitor when any transaction finishes, you can set the mon_uow_data database configuration parameter to BASE. To collect unit of work data for a specific workload only, run the CREATE WORKLOAD or ALTER WORKLOAD statement, specifying the COLLECT UNIT OF WORK DATA BASE clause.
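
The two approaches look like this (the workload name REPORTS is hypothetical):

UPDATE DB CFG FOR SAMPLE USING MON_UOW_DATA BASE;
ALTER WORKLOAD REPORTS COLLECT UNIT OF WORK DATA BASE;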

Tip: After you run the application for which you want to collect data, it is a good idea to deactivate the event monitor to avoid the collection of unneeded data.

If the data is written to a relational table, you can use SQL to access the data, but if the data is written to an unformatted event table, you must run the db2evmonfmt command or call the EVMON_FORMAT_UE_TO_TABLES procedure before you can examine the event data.

Monitor elements that complement one another are grouped in useful sets called logical data groups. For example, the event_activity logical data group includes monitor elements like appl_id (application ID) and uow_id (unit of work ID), which are often returned together, as in the following query:
SELECT VARCHAR(A.APPL_NAME, 15) AS APPL_NAME,
       VARCHAR(A.TPMON_CLIENT_APP, 20) AS CLIENT_APP_NAME,
       VARCHAR(A.APPL_ID, 30) AS APPL_ID,
       A.ACTIVITY_ID,
       A.UOW_ID,
       VARCHAR(S.STMT_TEXT, 300) AS STMT_TEXT
FROM ACTIVITY_DB2ACTIVITIES AS A, ACTIVITYSTMT_DB2ACTIVITIES AS S
WHERE A.APPL_ID = S.APPL_ID AND A.ACTIVITY_ID = S.ACTIVITY_ID AND A.UOW_ID = S.UOW_ID;

Important: DB2ACTIVITIES is the name of the event monitor that you created earlier in this section. The standard table reference ACTIVITY_DB2ACTIVITIES or ACTIVITYSTMT_DB2ACTIVITIES consists of the name of the logical data group concatenated with the underscore (_) character concatenated with the name of the event monitor.

The DB2 data server associates a default set of logical data groups with each event monitor, and the monitor therefore collects a useful set of elements for you automatically.

For a list of the logical data groups and the monitor elements that they can return during event monitoring, see the DB2 10.5 information center at the following web page: http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/index.jsp?topic=%2Fcom.ibm.db2.luw.admin.mon.doc%2Fdoc%2Fr0007595.html

Chapter 7. Monitoring 115 Tip: Prune data that you no longer need from event monitor tables for frequently used event monitors. If you must prune event monitor output regularly, consider using an unformatted event table to record event monitor output because unformatted event tables can be pruned automatically after data is transferred to relational tables.

7.4 Monitoring your DB2 system performance with IBM InfoSphere Optim Performance Manager

InfoSphere Optim Performance Manager (OPM) is a performance analysis and tuning tool for managing DB2 systems through a web interface. OPM helps you identify and resolve database problems before they impact your business. OPM collects performance metrics and determines whether they exceed defined thresholds.

OPM features and functions can be grouped into the following categories:
 Identifying and diagnosing performance problems
From a single control point, OPM can monitor and analyze multiple DB2 instances that are running various workloads. Predefined and customizable monitoring templates enable you to quickly deploy performance monitoring in online transaction processing and business intelligence environments.
 Preventing performance problems
Workload management tools let you easily define workloads, assign business priorities, enable concurrency controls, track resource consumption, system capacity, and response times, identify performance issues, and capture historical data for trend analysis and growth planning.
 Solving performance problems
OPM is fully integrated with the IBM Optim family of products to help you quickly resolve problems:
– Optim Query Workload Tuner analyzes and tunes problematic SQL statements.
– Optim pureQuery Runtime facilitates collaboration between DBAs and developers to resolve problems and optimize queries without altering your application.
– Tivoli Composite Application Manager provides a consolidated view of the transactions across your enterprise and enables you to start OPM in context.

7.4.1 Information dashboards

When a problem is identified, diagnostic dashboards in the OPM web console focus on the impacted area and display information that helps you to analyze the problem:
 The Health Summary dashboard displays an alerts overview on the Alerts page so that you can identify which of your databases have critical issues that require attention. You can define alert notifications by type, severity, and database.
 The dashboards that are listed under Inflight Dashboards in the main menu provide real-time and historical information about specific databases. The Overview dashboard shows the high-level status of your database so that you can identify potential problems. In addition to the Overview dashboard, there are other dashboards that relate to specific performance categories and that you can use to investigate potential problems:
– Buffer Pool and I/O: This dashboard shows database I/O at the buffer pool, table space, and table level. You can find the buffer pools with low hit ratios and high activity and consider increasing their size to improve performance.
– Connection: This dashboard shows active database connections during a specific time frame. The dashboard displays key performance indicators, such as lock wait times, to help you identify applications that are not performing well.
– Locking: This dashboard helps you to identify and analyze deadlocks, timeouts, lock waits, locking conflicts, locked objects, and the SQL statements that are holding and waiting for locks.
– Logging: This dashboard shows configuration and logging activity, and helps you determine whether tuning is needed for the log files or log buffer.
– Memory: This dashboard shows memory consumption by instance, database, and application, and shows the memory that is shared between applications.
– SQL Statements: This dashboard provides performance data for queries.
– System: This dashboard shows the resources for the system on which the database is running.
– Utilities: This dashboard shows the status and progress of utilities. You can use this information to determine whether some utilities should run at different times to avoid high workloads.
– Workload: This dashboard shows information about the workloads on the database, including connected applications and transactions.

 The Extended Insight dashboard (available with IBM InfoSphere Optim Performance Manager Extended Insight (OPM EI)) displays data about the entire database application system, including clients, application servers, data servers, and the network.

7.4.2 OPM support for DB2 10.5

Optim Performance Manager Version 5.3 provides support for DB2 10.5. New performance metrics for column-organized tables are available on the following dashboards:
 Overview
 Buffer Pool and I/O
 Connection
 SQL Statements
 Extended Insight

You can use these performance metrics, which include table and column access statistics, I/O statistics, and extended insight statistics, to verify that your workloads with column-organized tables behave as expected.

IBM InfoSphere Optim Query Workload Tuner (OQWT) Version 4.1 can help you tune individual SQL statements or a complete SQL workload. OQWT includes the Workload Table Organization Advisor, which examines the tables that are referenced by the statements in a query workload. The advisor makes recommendations about which tables are good candidates for conversion from row to column organization. The advisor also estimates the performance gain that can be realized if the recommended tables are converted from row to column organization. For more information about the Workload Table Organization Advisor, see the following web page: http://pic.dhe.ibm.com/infocenter/dstudio/v4r1/index.jsp?topic=%2Fcom.ibm.datatools.qrytune.workloadtunedb2luw.doc%2Ftopics%2Fgenrecswtoa.html

For a summary of the new features and enhancements in Version 5.x of IBM InfoSphere Optim Performance Manager for DB2 for Linux, UNIX, and Windows and its modifications and fix packs, see the following web page: http://www-01.ibm.com/support/docview.wss?uid=swg27023197

For more information about IBM InfoSphere Optim Performance Manager, see the Version 5.3 information center at the following web page: http://pic.dhe.ibm.com/infocenter/perfmgmt/v5r2/index.jsp


Chapter 8. High availability

IBM PureData System for Operational Analytics provides redundant hardware components and availability automation to minimize or eliminate the impact of hardware and software failures.

This chapter describes the high availability characteristics present in the IBM PureData System for Operational Analytics.

8.1 IBM PureData System for Operational Analytics

IBM PureData System for Operational Analytics provides a highly available database computing environment. The high availability (HA) design consists of both redundant components and automation software for recovery. These capabilities make many failures transparent to the functioning of the system and help minimize the downtime that is caused by single failure events.

In IBM PureData System for Operational Analytics, the following components are designed for redundancy:
 RAID arrays for external storage
 Mirrored volume groups for internal storage
 Quad-port Fibre Channel adapters
 Redundant storage area network (SAN) switches
 Dual active RAID controllers
 Dual hot-swappable power/cooling units
 Redundant network switches
 Highly available network interfaces

In addition to the numerous redundancies that are part of its design, IBM PureData System for Operational Analytics also uses IBM Tivoli System Automation for Multiplatforms (SA MP) cluster management software to provide high availability capabilities at the software and hardware level. Tivoli SA MP integration provides automated recovery that takes specific actions when a detectable resource failure occurs. This action can be as simple as restarting a failed component (for example, software) in place, or it can automatically fail over components from the active host to the standby host. A resource failure can include:
 Network failure
 Host failure
 DB2 instance or partition failure

The IBM PureData System for Operational Analytics HA design protects the Management Host and the core warehouse, which consists of the administration host on the foundation node, and data hosts on the data nodes. The HA configuration consists of two peer domains: one for the management host and one for the core warehouse. This configuration allows the HA resources on the management host to be managed independently of the HA resources for the core warehouse.

The HA solution is implemented in multiple HA groups. Each HA group consists of up to three active nodes and an identical standby node. Having an identical standby node as the failover target ensures that there is no performance degradation after failover occurs. The standby node is available to fail over processing from any one of the active nodes in the same HA group.

Figure 8-1 shows an overview of a high availability configuration consisting of three active nodes and a standby node, also known as a 3+1 HA configuration. This means that for every three active nodes there is one standby node available to fail over processing from any one of the active nodes.

Figure 8-1 3+1 high availability configuration

The HA solution for IBM PureData System for Operational Analytics is implemented with a roving standby configuration for each HA group. This means that the standby role moves among the nodes within an HA group so that any node in an HA group can be an active or standby node as required. When a failure occurs, the standby node assumes the role of the active node. When the failed node is brought back online, it assumes the role of the standby node for the respective HA group, as shown in Figure 8-2.

Figure 8-2 Roving standby post node failure

Furthermore, if a subsequent active node fails, its processing is failed over to the standby node in the HA group, as shown in Figure 8-3 on page 123.

Figure 8-3 Roving standby post subsequent node failure

The roving standby HA configuration that is implemented in IBM PureData System for Operational Analytics increases availability and reduces the administration that is associated with failover activities.

8.2 Core warehouse availability

High availability for the core warehouse is implemented in a roving HA group configuration. In the roving HA group configuration, core warehouse hosts are placed in groups that contain up to three active hosts, the nodes where the hosts run, the storage nodes that are attached to those nodes, one standby host, and the standby node where the standby host runs.

The first roving HA group in the core warehouse contains the administration host on the foundation node and the standby administration host on the standby foundation node. This roving HA group contains only one active host and one standby host.

Subsequent roving HA groups in the core warehouse contain up to three active data hosts and one standby data host.

If a detectable failure occurs on an active host, its storage and processing capability fail over to the standby host that is contained in the same roving HA group. Because the standby host is powered on, installed with the full software stack, and connected to both SAN switches, it can access the storage and database partitions from the failed host and take over the processing for new workloads. All hosts in the system communicate through the highly available internal application network, and access the file systems on the external storage nodes through redundant Fibre Channel connections. The roving HA group configuration makes the database partitions on each host in the core warehouse highly available.

8.2.1 Roving HA group configuration for the administration hosts

High availability for the administration hosts is implemented as a roving HA group with one active host and one standby host. The administration host on the foundation node is paired with the standby administration host on the standby foundation node. Figure 8-4 shows the roving HA group that contains the active administration host and the standby administration host.

Figure 8-4 Roving HA group for the administration hosts

The HA resources that support the five database partitions run on the active administration host. If a detectable failure occurs, the resources fail over to the standby administration host and the standby administration host continues to process the workload. When the standby administration host takes over the workload processing, it assumes the role of the active administration host. When the failed administration host is brought back online, it assumes the role of the standby administration host.

Figure 8-5 shows the administration hosts after the resources are failed over and the failed administration host is brought back online as the standby administration host.

Figure 8-5 Roving HA group for the administration hosts after a failover

8.2.2 Roving HA group configuration for the data hosts

High availability for the data hosts is implemented as a roving HA group with up to three active hosts and one standby host. The data host on the data node is paired with the standby data host on the standby node.

Figure 8-6 shows the initial roving HA group configuration that contains three active data hosts and one standby data host.

Figure 8-6 Roving HA group for the data hosts

If a detectable failure occurs on any one active data host, the resources of the failed data host fail over to the standby data host, and the standby data host assumes the role of the active data host in the roving HA group and starts processing new workloads. When the failed data host is brought back online, it integrates and assumes the role of the standby data host for the HA group. The resources do not need to be manually failed back to the original data host.

Figure 8-7 on page 127 shows the roving HA group after the resources on a data host have failed over and the failed data host assumes the role of the standby data host.

Figure 8-7 Roving HA group for the data hosts after a failover

8.2.3 Core warehouse HA events monitored by the system console

The system console displays alerts for core warehouse high availability events. The following core warehouse failover and failure scenarios are monitored by the system console:
 A user-initiated move of the core warehouse resources, called a manual failover or a resource move. Typically, this event indicates that the resources were moved for maintenance purposes by using the hafailover command. The event type is displayed as a Resource Move with a severity of Informational.
 A failover that was successful. After a detectable failure, the core warehouse resources moved from the active core warehouse host to the standby host. The event type is displayed as a Resource Failover with a severity of Warning.
 A failure that results in a persistent outage. After a detectable failure, the resources did not move from the active core warehouse host to the standby host successfully. The event type is displayed as a Resource Failover with a severity of Critical.

8.3 Management host availability

The HA solution for the management host is implemented as an HA pair that includes the management host and a standby management host, where the HA resources on the management host can fail over to the standby management host. The HA resources that are included in the configuration are the resources for the database performance monitor and the resources for the warehouse tools. These HA resources on the management hosts are contained in a peer domain (called mgmtdomain).

The HA pair configuration consists of the active management host on the foundation node paired with the passive standby management host on the standby foundation node, as shown in Figure 8-8.

Figure 8-8 Management host HA configuration

The resource model for the management host includes separate resource groups for the database performance monitor and the warehouse tools. This separation of resource groups allows for independent availability management if there is a failure, such that only the affected resources are failed over as required.

If a detectable failure occurs on the management host and affects both the database performance monitor and the warehouse tools, the resources fail over to the standby management host, which allows the database performance monitor to continue to monitor the performance of the core warehouse database and allows the warehouse tools to process new jobs and requests. However, the system console event monitoring capability is not available on the standby management host.

Figure 8-9 shows the HA configuration of management host and the standby management host after a failure of the management host.

Figure 8-9 Management host HA configuration after management host failure

When the failed management host is repaired and brought back online, the resources must be manually moved back (that is, fall back) from the standby management host to the management host at the next available maintenance window or when a short planned service outage is acceptable. Until the resources are failed back, the system console event monitoring capability is not available and the management host is running unprotected against another failure event.

If a failure of a resource that affects only the warehouse tools occurs, the HA resources for the warehouse tools fail over to the standby management host. The HA resources for the performance monitor continue to run, unaffected, on the management host, as shown in Figure 8-10.

Figure 8-10 Management host HA configuration of warehouse tools resources

Similarly, if a resource fails that affects the database performance monitor resources only, then the database performance monitor resources are failed over to the standby management host and the warehouse tools continue to run, unaffected, on the management host.

After either such event, manually fall back the resources from the standby management host to the management host at the next scheduled maintenance window or when a short planned service outage is acceptable.

8.3.1 Management host failover events that are monitored by the system console

The system console sends SNMP notifications for the following failover events on the management host.

Database performance monitor failover events
The following database performance monitor failover and failure scenarios on the management host are monitored by the system console:
 A user-initiated move of the database performance monitor, called a manual failover or a resource move. Typically, this event indicates that the resources were moved for maintenance purposes by using the hafailover command. The event type is displayed as a Resource Move with a severity of Informational.
 A failover of the database performance monitor that was successful. After a detectable failure, the resources moved from the management host to the standby management host. The event type is displayed as a Resource Failover with a severity of Warning.
 A failure that results in a persistent outage of the database performance monitor. After a detectable failure, the resources did not move from the management host to the standby management host successfully. The event type is displayed as a Resource Failover with a severity of Critical.

Warehouse tools failover events
The following warehouse tools failover events on the management host are monitored by the system console:
• A failover of the warehouse tools that was successful. After a detectable failure, the resources moved from the management host to the standby management host. The system console displays the following critical system events during the failover: either InfoSphere Warehouse high availability is offline or InfoSphere Warehouse high availability is partially online. When the warehouse tools resources start on the standby management host, the system console displays an informational alert indicating that InfoSphere Warehouse high availability is online.

• A failure that results in a persistent outage of the warehouse tools. After a detectable failure, the resources did not move from the management host to the standby management host successfully. The system console displays the following critical system events:
  – InfoSphere Warehouse high availability is offline.
  – WebSphere Application Server is offline.
  – Metadata database is offline.
  – Cube servers are offline.

8.4 High availability management

This section describes high availability environment management.

8.4.1 High availability toolkit

The high availability (HA) toolkit helps you manage the high availability configuration for the core warehouse (administration host, standby administration host, data host, and standby data hosts). Some commands are also available for the high availability configuration on the management hosts.

The HA toolkit provides commands, such as hachkconfig, to manage your high availability environment. For a list of commands, see High availability toolkit at the IBM PureData System for Operational Analytics information center at the following web page:
http://pic.dhe.ibm.com/infocenter/whsesys/topic/com.ibm.7700.r2.user.doc/doc/r00000163.html

When your system is deployed, the following settings are automatically configured for the HA toolkit:
• The PATH environment variable on each host is set with the HA toolkit installation directory (/usr/IBM/analytics/ha_tools).
• The configuration settings in the hatools.conf file on each host are set for your system.

Note: If you update a configuration parameter in the hatools.conf file, you must copy the change to all other hosts in the system.

• Passwordless SSH is configured for the root user on all hosts.

8.4.2 Starting and stopping resources with the HA toolkit

To start and stop core warehouse resources, run the hastartdb2 and hastopdb2 commands in the HA toolkit. With these commands, you can either start and stop the resources on all hosts or you can start and stop resources on a specific set of hosts. You must have root authority on the core warehouse hosts.

When you use the HA toolkit to stop resources in the core warehouse, the commands do not check for resource dependencies with components that run outside of the core warehouse. Before you stop the resources in the core warehouse, verify that there are no active workloads running in the core warehouse. If active workloads are running, stopping resources in the core warehouse can result in interrupted transactions on components that run outside of the core warehouse.

If you do not specify any options when you run the hastartdb2 or hastopdb2 commands, the commands run synchronously and do not return until the start or stop operation is complete. When the hastartdb2 command is run synchronously, the core warehouse database is activated as part of the start operation. The database is defined by the DBNAME field in the configuration file for the HA toolkit.

If you specify the -async option, the commands run asynchronously and return immediately. To verify that the resources started or stopped correctly when the commands are asynchronous, run the hals command to monitor the status of the resources.

When you run the hastartdb2 command or the hastopdb2 command, the following core warehouse resources are started or stopped on each host:
• An aggregate database partition resource that represents all of the database partitions on the host
• A service IP resource for the internal application network
• For the administration host only, the service IP resource for the corporate network

The hastartdb2 and hastopdb2 commands enforce the dependencies between the database partition resources and the mount points for the file systems that are managed by IBM General Parallel File System (GPFS™) software. When the core warehouse peer domain is online, the operational state of the GPFS file systems should always be Online, which indicates that they are mounted. When you run the hastopdb2 command, the database partition resources are stopped but the file systems that are managed by the GPFS software remain mounted.

Perform the following tasks on any core warehouse host:
• To start the database partition resources, run one of the following commands:
  – hastartdb2
    This command starts all database partition resources on all core warehouse hosts (the administration host, the standby administration host, the data hosts, and the standby data hosts).
  – hastartdb2 host_list
    This command starts the database partition resources on specific hosts in the core warehouse, where host_list represents a space-delimited list of host names. For example, the command hastartdb2 bcu001 bcu002 starts the resources on the bcu001 and bcu002 hosts only.
• To stop the database partition resources, run one of the following commands:
  – hastopdb2
    This command stops all database partition resources on all core warehouse hosts.
  – hastopdb2 host_list
    This command stops the database partition resources on specific hosts in the core warehouse, where host_list represents a space-delimited list of host names.

Note: The file systems that are managed by the GPFS software remain mounted.

8.4.3 Monitoring the status of the core warehouse HA configuration

To monitor the status of the highly available configuration for the core warehouse, use the system console and the hals command of the HA toolkit. You can identify whether the core warehouse is in a good state or whether one or more core warehouse resources have failed over.

You must have the system administration role with at least read-only permissions to perform this task in the system console, and you must either have root authority on the core warehouse hosts or be logged in as the core warehouse instance owner.

The sample output that is provided in this section is based on a core warehouse with the following two roving high availability (HA) groups:
• The first roving HA group includes the administration host (bcu001) and the standby administration host (bcu002).
• The second roving HA group includes two data hosts (bcu003 and bcu004) and one standby data host (bcu005).

To monitor the status of the core warehouse HA configuration, complete the following steps:
1. Log in to the system console, and determine whether there are any alert events with the type Resource Failover and a severity of Critical. Events with these values indicate that a failover of the core warehouse resources was attempted and did not complete. You must identify the cause of the failover and correct the problem immediately to restore the operation of the core warehouse. If there are critical resource failover events, complete the following steps:
   a. As the root user or core warehouse instance owner, run the lssam command on any core warehouse host and identify the resources with Failed Offline states.
   b. Determine the cause of the Failed Offline state for each resource and correct the problem.
   c. To clear the Failed Offline states, run the hareset command as the root user on any core warehouse host.
2. If the system console displays events with the type Resource Failover and a severity of Informational, this indicates that a failover occurred in the core warehouse. The core warehouse remains operational after the failover, but to restore the HA configuration to a good state, you must address any Failed Offline resource states on the failed host so that it can be reintegrated into the system as a standby host. While the failed host remains offline, if there is a subsequent failure in the same roving HA group, there is an outage of the core warehouse. Complete the following steps:
   a. As the root user or core warehouse instance owner, run the lssam command on any core warehouse host and identify the resources with Failed Offline states.
   b. Determine the cause of the Failed Offline state for each resource and correct the problem.
   c. To clear the Failed Offline states, run the hareset command as the root user on any core warehouse host.

3. If there are no Resource Failover events, you can use the hals command to determine the status of the HA configuration for the core warehouse. On any core warehouse host, run the hals -core command.
   – If all resources display a Normal HA status and an Online operational state (OPSTATE) similar to the sample output that is shown in Figure 8-11, it indicates that your core warehouse HA configuration is running in a good state.

Figure 8-11 HA status - normal and online

   – If the resources display an HA status of Manual Mode, it indicates that automation for the core warehouse peer domain is disabled. When automation is disabled, the HA configuration for the core warehouse is not monitored for failures and the operational state of resources that is returned by the hals command might be incorrect.
   – If any resources have an HA status of Pending, this indicates that the resources are in the process of starting or in the process of stopping. The resources might be starting or stopping as the result of you running the hastartdb2 or hastopdb2 command, as the result of a move by running the hafailover command, or because the resources are failing over. Check the system console for any alerts with the type of Resource Move or Resource Failover. If the resources are starting or stopping as the result of a manual failover, the Pending state is cleared automatically after several minutes elapse, and the system console displays alerts to indicate that the core warehouse and the core warehouse high availability are running correctly. After the database partition resources come online, the resources display an Online state in the hals output. The database crash recovery process might be waiting to roll forward or roll back active work in the core warehouse database. To verify whether there are pending rollback or rollforward operations, run the db2pd -recovery -db bcudb command, where bcudb represents the name of the core warehouse database.

   – If any resources have an HA status of Failed, Stuck, or Offline (Figure 8-12), there might be a hardware or software problem and additional troubleshooting might be required.

Figure 8-12 HA status - Failed, Stuck, or Offline

Typically, a resource with a Failed operational state, as shown in Figure 8-12, requires additional troubleshooting. After you resolve the problem that is preventing the resource from starting, run the hareset command to reset the HA configuration.
4. If a failover is occurring in your core warehouse HA configuration, run the hals command to verify that the failover completes. If all resources have an HA status of Normal and an operational state of Online, and the host name of the previous standby host (bcu005) is now listed in the “Current” column, the standby host assumed the role of an active data host and can start processing new workloads for the database partitions that it hosts. In this scenario, the failover completed and the core warehouse HA configuration is running in a good state, as shown in Figure 8-13.

Figure 8-13 HA status - Normal and Online

8.4.4 Monitoring the status of the management host HA configuration

To monitor the status of the highly available (HA) resources for the database performance monitor and the warehouse tools, use the system console and the hals command of the HA toolkit. You must be assigned the system administration role with at least read-only permissions to perform this task in the system console and you must have root authority on the management hosts.

To monitor the status of the management host HA configuration, complete the following steps:
1. Log in to the system console, and determine whether there are any alert events with the type Resource Failover and a severity of Critical. Events with these values indicate that a failover of the database performance monitor was attempted and did not complete. You must identify the cause of the failover and correct the problem immediately to restore the operation of the database performance monitor. If there are critical resource failover events, complete the following steps:
   a. As the root user, run the lssam command on the management host and identify the resources with Failed Offline states.
   b. Determine the cause of the Failed Offline state for each resource and correct the problem.
   c. To clear the Failed Offline states, run the hareset command as the root user on the management host.
2. If the system console displays events with the type Resource Failover and a severity of Informational, this indicates that a failover occurred for the database performance monitor. The database performance monitor remains operational after the failover, but to restore the HA configuration to a good state, you must address any Failed Offline resource states on the failed host so that it can be reintegrated into the system. While the failed management host remains offline, if there is a subsequent failure in the management HA pair, there is an outage of all resources on the management host. Complete the following steps:
   a. As the root user, run the lssam command on the management host and identify the resources with Failed Offline states.
   b. Determine the cause of the Failed Offline state for each resource and correct the problem.
   c. To clear the Failed Offline states, run the hareset command as the root user on any core warehouse host.
3. If there are no Resource Failover events, you can run the hals command to determine the status of the database performance monitor and the warehouse tools. On the management host, run the hals command. If all resources display a Normal HA status and an Online operational state (OPSTATE) similar to the sample output that is shown in Figure 8-14 on page 139, it indicates that the HA resources on the management host are running in a good state.

Figure 8-14 HA status - Normal, Online

If the resources display an HA status of Manual Mode, it indicates that automation for the management peer domain is disabled. When automation is disabled, the HA configuration for the management host is not monitored for failures and the operational state of resources that is returned by the hals command might be incorrect.
If any resources have an HA status of Pending, this indicates that the resources are in the process of starting or in the process of stopping. The resources might be starting or stopping as the result of a failover, because they were moved by the hafailover command, or because they were started or stopped by an HA toolkit command (hastartdpm, hastopdpm, hastartapp, or hastopapp). Check the system console for any critical database performance monitor or warehouse tools alerts. If the resources are starting or stopping as the result of a manual failover, the Pending state is cleared automatically in several minutes, and the system console displays alerts to indicate that the database performance monitor or warehouse tools are running correctly.
4. If a failover is occurring on your management host, run the hals command to verify that the failover completes. If all resources show a Failover HA status and an Online operational state, and the host name of the standby management host is listed in the “Current” column, the failover completed successfully and the HA resources are running on the standby management host, as shown in Figure 8-15.

Figure 8-15 HA status - failover occurred

8.4.5 Moving database partition resources to the standby node as a planned failover

To perform maintenance on a single node, fail over the database partition resources to the standby node. When the resources are failed over, the standby node becomes the active data node and can start processing new workloads. When the node is reintegrated into the system after the node maintenance is complete, it acts as the standby node.

Typically, you update the system by installing fix packs through the system console.

When the database partition resources are moved to the standby node as part of a planned failover, a temporary outage of the core warehouse instance is required, so you should perform the procedure during a scheduled maintenance window. If you perform the procedure when the core warehouse instance is active, your system users might experience disconnected sessions or transaction rollbacks.

To perform a planned failover, move the database partition resources from an active host to a standby host by running the hafailover command of the HA toolkit. In the example that is shown in Figure 8-16, the resources on the data host (bcu004) are moved to the standby data host (bcu003). Before the resources are moved, the hals output shows that the database partition resources are running on the active hosts in the core warehouse.


Figure 8-16 HA status - resource moved

To move the database partition resources to the standby host for maintenance, complete the following steps:
1. Verify that there are no DB2 jobs running.
2. As the core warehouse instance owner (bcuaix), connect to the core warehouse database and terminate all connections to the database by running the following command:
   db2 force applications all

3. To verify that there are no resources on the target host that are in a Failed or Stuck state, run the lssam command on any host in the core warehouse peer domain. If a resource on the target host is in one of these states, the failover procedure fails. To reset the Failed or Stuck resources, run the hareset command on any host.
4. Fail the active host over to the standby host by running the hafailover command. For example, to fail over the resources from the bcu004 host to the standby host in its HA group, run the following command:
   hafailover bcu004
   Figure 8-17 shows the hals output, which indicates that the resources for database partitions 5 - 12 that were previously running on the bcu004 host are now running on the bcu003 host.

Figure 8-17 HA status - host is failed over

Note: When you complete the maintenance on the node and reintegrate it into the system, you do not have to fail back the database partition resources. The node is reintegrated as the standby node.

8.4.6 Moving resources to the standby management host as a planned failover

To perform maintenance on the management host, fail over the database performance monitor resources and the warehouse tools resources to the standby host.

You must have root authority on the management hosts to complete the planned move.

When the resources are moved to the standby management host, the system console and event monitoring are unavailable. The core warehouse remains online, and the database performance monitor and the warehouse tools continue to run.

To move resources to the standby management host in a planned fashion, complete the following steps:
1. Determine the host name of the host where the database performance monitor (DPM) is running. Run the following command:
   hals
   Identify the row in the output for the DPM component. The host name where the database performance monitor is running appears in the CURRENT column. In the sample hals output that is shown in Figure 8-18, the database performance monitor is running on the bcu01 host.

Figure 8-18 HA status - database performance monitor running

2. Fail over the resources for the database performance monitor by running the following command:
   hafailover management_host DPM
   management_host represents the host name of the host where the database performance monitor is running.
3. Determine the host name of the host where the warehouse tools are running. Run the following command:
   hals
   Identify the row in the output for the WASAPP component. The host name where the warehouse tools are running appears in the CURRENT column. In the sample hals output that is shown in Figure 8-19 on page 143, the warehouse tools are running on the bcu01 host.

Figure 8-19 HA status - warehouse tools running

4. Fail over the resources for the warehouse tools by running the following commands:
   a. If there are cube servers that are not managed as highly available resources, stop the cube servers.
   b. Fail over the resources for the warehouse tools by running the following command:
      hafailover management_host APP
      management_host represents the host name of the host where the warehouse tools are running.
   c. If there are cube servers that are not managed as highly available resources, start the cube servers.


Chapter 9. Workload management

The purpose of workload management is to ensure that limited system resources are prioritized according to the needs of the business. Work that is designated as being of greatest importance to the business is given the highest priority access to system resources. In a data warehouse environment, this means that users running online queries can expect predictable and acceptable response times at the possible expense of long running batch jobs.

It is possible to monitor and manage the work at every stage of processing: in the application, the network, the database management system (DBMS), the operating system, and the storage subsystem. This chapter provides an overview of the workload management capabilities of the database.

Workload management is a continuous process of design, deployment, monitoring, and refinement. In a data warehouse environment, business priorities and the volume and mix of work are constantly changing. These changes must also be reflected in the management of resources.

DB2 workload management (WLM) is the primary tool that is available for managing an IBM InfoSphere Warehouse database workload. WLM was introduced with DB2 for Linux, UNIX, and Windows 9.5 and enhanced in every subsequent release of DB2 for Linux, UNIX, and Windows.

InfoSphere Warehouse V9.7 introduces the ability to design and implement a simplified subset of DB2 WLM functions through the administration console. InfoSphere Warehouse V10.1 extends those capabilities with enhancements such as integrated multi-temperature data management and the workload management dispatcher. Now, DB2 for Linux, UNIX, and Windows 10.5 includes all of the functions and tools that were offered in the previous generation of InfoSphere Warehouse editions and features, including even more WLM enhancements, such as default query concurrency management.

9.1 Introducing DB2 workload management

DB2 workload management is an infrastructure and a set of capabilities that facilitate the implementation of successful workload management solutions in various database environments. This infrastructure incorporates a core set of capabilities and a set of capabilities that are available only under license; the licensed set of capabilities is referred to as DB2 workload manager.

The core technology is always present and active when a DB2 database is active. All DB2 database work runs within a specific workload management configuration, even if it is only the default configuration that is provided by the DB2 database manager, including default workload and service classes that are installed as part of every DB2 database. You can use all of the workload management monitoring capabilities with this default configuration.

The DB2 workload manager capabilities that are controlled by license enable you to customize the default DB2 workload management configuration, including the ability to create the following objects:
• A workload
• A service class
• A threshold
• A work action set

DB2 workload management has four clearly defined stages, which are outlined in the following sections.

9.1.1 Defining business goals

Understand the work that is running on your data server and how that work relates to your business commitments. If formal service level agreements (SLAs) exist, ensure that the work that is associated with those agreements can be identified.

An example of such a goal is “queries from application A should run in under 30 seconds.” Goals can also be tied to a particular time of day. For example, overnight batch ETL work might have to finish by 8 a.m. so that the daily sales reports can be completed on time. Some goals can be difficult to quantify. Examples of such goals might be “keep users satisfied” and “prevent aberrant database activity from hampering day-to-day work.” Other goals might be less specific, for which it might be appropriate to set priorities in broad categories, such as “low”, “medium”, and “high”.

If clear business requirements do not exist, classify work by its expected impact on the server by using categories such as “short”, “medium”, and “long”. Treat short requests as being most important and long requests as being least important. The rationale is that the more short queries you can push through the system, the better the service that is provided to the enterprise as a whole.

9.1.2 Identifying work that enters the data server

Identify workload types, ideally activities with different performance or monitoring objectives. The following list gives examples of possible activity groupings:
• Queries from specific user departments
• Queries from a specific application
• Batch data extracts
• ETL processes
• DDL work by the DBA

9.1.3 Managing work in progress

Determine whether your system can run work without any restrictions. You can submit only as much work as your system can handle. Divide the work based on its importance to the business without exceeding your system’s capabilities.

Work can be managed in the following ways (a sketch of the first and third approaches follows this list):
• By queuing queries. Set a maximum concurrency level for each query type and make queries that exceed that limit wait until a running query completes execution.
• By setting resource priorities. When one segment of the workload is allocated a higher priority, it is more likely to obtain the resources that it needs to continue.
• By terminating jobs that exceed a preset limit. Set a failsafe to kill long-running queries that might have been submitted in error.
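
The following statements are a minimal, hedged sketch of the first and third approaches; the service class name and the limits are arbitrary illustrations, not recommended values.

-- Create a service class for heavier work, then queue queries in it beyond a
-- concurrency limit: up to 20 coordinator activities run at a time, and later
-- arrivals wait in an unbounded queue.
CREATE SERVICE CLASS HEAVY_SC;

CREATE THRESHOLD QUEUE_HEAVY_QUERIES
  FOR SERVICE CLASS HEAVY_SC ACTIVITIES
  ENFORCEMENT DATABASE
  WHEN CONCURRENTDBCOORDACTIVITIES > 20 AND QUEUEDACTIVITIES UNBOUNDED
  CONTINUE;

-- Terminate any activity that runs longer than a preset limit, keeping a record
-- of the violating activity for later review.
CREATE THRESHOLD STOP_RUNAWAY_QUERIES
  FOR DATABASE ACTIVITIES
  ENFORCEMENT DATABASE
  WHEN ACTIVITYTOTALTIME > 2 HOURS
  COLLECT ACTIVITY DATA WITH DETAILS
  STOP EXECUTION;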

9.1.4 Monitoring to ensure that the data server is being used efficiently

Use real-time operational data (such as a list of running workload occurrences, the activities running in a service class, or average response times) to determine the health of your system. WLM monitoring involves the usage of table functions to access in-memory data.

Table functions provide access to a set of data (such as workload management statistics) that exists inside a DB2 database as a virtual DB2 table against which you can run SELECT statements. You can write applications to query this data and to analyze it as though it were part of a physical table on the data server.

General statistical information is available at the following levels (a sample query follows this list):
• Service superclass and service subclasses
• Workloads
• Work action sets
• WLM queues
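
For example, the following query is a hedged sketch that summarizes the in-memory statistics for every service subclass; the exact set of columns that is returned by the WLM_GET_SERVICE_SUBCLASS_STATS table function can vary by DB2 release.

-- Report completed coordinator activities and their average lifetime (milliseconds)
-- for all service subclasses across all members (-2).
SELECT SUBSTR(SERVICE_SUPERCLASS_NAME, 1, 24) AS SUPERCLASS,
       SUBSTR(SERVICE_SUBCLASS_NAME, 1, 24)   AS SUBCLASS,
       COORD_ACT_COMPLETED_TOTAL,
       COORD_ACT_LIFETIME_AVG
FROM TABLE(WLM_GET_SERVICE_SUBCLASS_STATS(CAST(NULL AS VARCHAR(128)),
                                          CAST(NULL AS VARCHAR(128)), -2)) AS STATS
ORDER BY SUPERCLASS, SUBCLASS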

Table functions also provide information about the work that is running on the system. This information is available at the following levels:
• Service class agents
• Workload occurrences
• Workload occurrence activities
• Activity details

Example 9-1 shows a query that uses the WLM_GET_SERVICE_CLASS_WORKLOAD_OCCURRENCES table function to return a list of the workload occurrences. Figure 9-1 on page 149 shows a sample output.

Example 9-1 Getting a list of current workload occurrences

SELECT SUBSTR(SERVICE_SUPERCLASS_NAME,1,19) AS SUPERCLASS_NAME,
       SUBSTR(SERVICE_SUBCLASS_NAME,1,18) AS SUBCLASS_NAME,
       SUBSTR(CHAR(MEMBER),1,4) AS MEMB,
       SUBSTR(CHAR(COORD_MEMBER),1,4) AS COORDMEMB,
       SUBSTR(CHAR(APPLICATION_HANDLE),1,7) AS APPHNDL,
       SUBSTR(WORKLOAD_NAME,1,22) AS WORKLOAD_NAME,
       SUBSTR(CHAR(WORKLOAD_OCCURRENCE_ID),1,6) AS WLO_ID
FROM TABLE(WLM_GET_SERVICE_CLASS_WORKLOAD_OCCURRENCES
       (CAST(NULL AS VARCHAR(128)), CAST(NULL AS VARCHAR(128)), -2)) AS SCINFO
ORDER BY SUPERCLASS_NAME, SUBCLASS_NAME, MEMB, APPHNDL, WORKLOAD_NAME, WLO_ID

SUPERCLASS_NAME     SUBCLASS_NAME      MEMB COORDMEMB APPHNDL WORKLOAD_NAME          WLO_ID
------------------- ------------------ ---- --------- ------- ---------------------- ------
SYSDEFAULTUSERCLASS SYSDEFAULTSUBCLASS 0    0         383     SYSDEFAULTUSERWORKLOAD 1

1 record(s) selected.

Figure 9-1 Sample output from Example 9-1

Event monitors capture detailed activity information (per workload, work class, or service class, or when a threshold is exceeded) and aggregate activity statistics for historical analysis.

DB2 WLM uses the following event monitors:
• Activity event monitors capture information about individual activities in a workload, work class, or service class, or activities that violated a threshold.
• Threshold violations event monitors capture information when a threshold is exceeded. They identify the threshold, the activity that was the source of the exception, and what action was taken in response to the violation.
• Statistics event monitors serve as a low-impact alternative to capturing detailed activity information by collecting aggregate data (for example, the number of activities that are completed, or the average execution time).

Note: There is no impact for unused WLM event monitors. Create event monitors so that individual workloads, service classes, and work actions can be altered to capture events when needed.

To store the events in DB2 tables, see the sample DDL in ~/sqllib/misc/wlmevmon.ddl.
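
As a hedged illustration (the monitor and workload names are hypothetical, and the COLLECT clause options vary slightly by release), an activities event monitor can be created, activated, and fed by asking a workload to collect activity details:

-- Create an activities event monitor that writes captured activities to tables,
-- then activate it.
CREATE EVENT MONITOR DWACTIVITIES FOR ACTIVITIES WRITE TO TABLE;
SET EVENT MONITOR DWACTIVITIES STATE 1;

-- Ask an existing workload (REPORT_WL is an assumed name) to collect activity
-- details on the coordinator member while the event monitor is active.
ALTER WORKLOAD REPORT_WL COLLECT ACTIVITY DATA ON COORDINATOR MEMBER WITH DETAILS;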

You can find examples of Perl scripts for the analysis of workload management in the following files:
• ~/sqllib/samples/perl/wlmhist.pl
• wlmhistrep.pl

9.2 Workload management administrator authority

You need workload management administrator (WLMADM) authority to manage workload objects for a specific database. This authority enables you to create, alter, drop, comment on, grant access to, or revoke access from DB2 workload management objects, which are system objects similar to other objects, such as buffer pools and table spaces.

The security administrator, who is someone holding SECADM authority, or a user with ACCESSCTRL authority can grant WLMADM authority to a user, group, or role.

WLMADM gives you the authority to perform the following tasks (see the sketch after this list):
• Issue CREATE, ALTER, COMMENT ON, and DROP statements for the following DB2 workload management objects:
  – Histogram templates
  – Service classes
  – Thresholds
  – Work action sets
  – Work class sets
  – Workloads
• Issue GRANT and REVOKE statements for workload privileges
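
As a brief, hedged sketch (the user ID and workload name are hypothetical), a security administrator could grant the authority, and a WLMADM holder could then grant the USAGE privilege on a workload:

-- Granted by a user who holds SECADM or ACCESSCTRL authority.
GRANT WLMADM ON DATABASE TO USER dwadmin;

-- A WLMADM holder can then grant the workload USAGE privilege
-- (REPORT_WL is an assumed workload name).
GRANT USAGE ON WORKLOAD REPORT_WL TO PUBLIC;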

If you hold database administrator (DBADM) authority, you hold WLMADM authority implicitly.

9.3 Workload management concepts

The following sections explain the basic concepts around DB2 WLM.

9.3.1 Workloads

A workload is an object that is used to identify incoming work based on its source so that the work can be managed correctly. The workload attributes are assigned when the workload establishes a database connection. Examples of workload identifiers include the application name, system authorization ID, client user ID, and connection address.
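
For example (a hedged sketch; the application name, user group, and service class are assumptions, and the target service class must already exist), a workload can map connections from a specific application that is run by a specific user group to a service class:

-- Identify connections from a hypothetical reporting application that is run by
-- members of the SALES group, and run their work in the REPORTS_SC service class.
CREATE WORKLOAD SALES_REPORTS_WL
  APPLNAME('reportapp')
  SESSION_USER GROUP('SALES')
  SERVICE CLASS REPORTS_SC;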

9.3.2 Work classes and work class sets

The type of work can be identified through work classes, which are grouped into work class sets. Examples of work types that can be identified by using work classes include READ (such as SQL SELECT and XQuery), WRITE (such as SQL insert, update, delete, and merge), CALL, DML, DDL, LOAD, and ALL.
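
The following statement is a hedged sketch (the names and the timeron boundary are arbitrary) of a work class set that separates inexpensive read queries from heavier ones and from load operations:

-- Classify incoming work by type and, for reads, by estimated cost in timerons.
CREATE WORK CLASS SET DW_WORK_CLASSES
  (WORK CLASS SMALL_READS WORK TYPE READ FOR TIMERONCOST FROM 0 TO 100000,
   WORK CLASS LARGE_READS WORK TYPE READ FOR TIMERONCOST FROM 100000 TO UNBOUNDED,
   WORK CLASS LOAD_WORK   WORK TYPE LOAD);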

9.3.3 Service classes

The purpose of a service class is to define an execution environment in which the work can run. This environment can include available resources and various execution thresholds. When you define a workload, you must specify the service class where work that is associated with that workload runs. A default workload ensures that all data server work is running inside a service class. Service classes consist of a two-level hierarchy, in which a service superclass contains service subclasses, where the work runs.
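
A minimal sketch of such a hierarchy (the names are hypothetical): one superclass with two subclasses in which work actually runs.

-- Define a two-level service class hierarchy.
CREATE SERVICE CLASS DW_SUPER;
CREATE SERVICE CLASS INTERACTIVE_SC UNDER DW_SUPER;
CREATE SERVICE CLASS BATCH_SC UNDER DW_SUPER;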

9.3.4 Thresholds

Thresholds help you maintain stability in the system. You create threshold objects to trap work that behaves abnormally. Abnormal behavior can be identified predictively (before the work begins running, based on the projected impact) or reactively (as the work runs and uses resources). An example of work that can be controlled with thresholds is a query that consumes large amounts of processor time at the expense of all other work running on the system. Such a query can be controlled either before it begins running, based on estimated cost, or after it has begun running and while it is using more than the permitted amount of resources.

9.3.5 Threshold actions

The types of action that can be taken when a threshold is exceeded depend on the threshold itself and include the following actions (see the sketch after this list):
• Collect data.
• Stop execution.
• Continue execution.
• Queue activities.
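
As a hedged sketch (the name and the cost limit are arbitrary), a predictive threshold can stop an activity before it starts, based on its estimated cost, while collecting data about the violation:

-- Stop queries whose estimated cost exceeds 5,000,000 timerons before they start,
-- and record the violating activity for later analysis.
CREATE THRESHOLD BLOCK_HUGE_QUERIES
  FOR DATABASE ACTIVITIES
  ENFORCEMENT DATABASE
  WHEN ESTIMATEDSQLCOST > 5000000
  COLLECT ACTIVITY DATA WITH DETAILS
  STOP EXECUTION;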

9.3.6 Work actions and work action sets

A work action defines an action that can be applied to a work class. If a work class is to be active and have activities that are assigned to it, there must be a work action that is defined for that work class. A work action set can contain one or more work actions that can be applied to activities in a specific service superclass or to the database as a whole. (A sketch follows this list.)
• If you apply a work action set to a service superclass, the work actions can map activities to a service class, prevent execution, collect activity or aggregate data, or count the activities. Typically, a work action set maps an activity to a service subclass and has thresholds that are defined on the subclass to help manage the activity.
• If you apply a work action set to a database, the work actions can define thresholds, prevent execution, collect data, or count the activities.
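
Continuing the hypothetical names from the earlier sketches (DW_WORK_CLASSES, DW_SUPER, INTERACTIVE_SC, and BATCH_SC are assumptions), a work action set could map each work class to a service subclass:

-- Map small reads to the interactive subclass and large reads to the batch subclass;
-- work that matches no work class runs in the superclass default subclass.
CREATE WORK ACTION SET DW_ACTIONS
  FOR SERVICE CLASS DW_SUPER
  USING WORK CLASS SET DW_WORK_CLASSES
  (WORK ACTION MAP_SMALL_READS ON WORK CLASS SMALL_READS MAP ACTIVITY TO INTERACTIVE_SC,
   WORK ACTION MAP_LARGE_READS ON WORK CLASS LARGE_READS MAP ACTIVITY TO BATCH_SC);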

9.3.7 Histograms and histogram templates

A histogram is a collection of bins, which are containers for collecting discrete ranges of data. In DB2 workload management, histograms are useful for workload analysis and performance tuning. You can use them to study the distribution of values, identify outliers, or compute statistics.

DB2 workload management histograms have a fixed number of 41 bins. The 40th bin contains the highest defined value for the histogram, and the 41st bin contains values that extend beyond the highest defined value.

In a multimember database environment, histograms are collected on each member. Histogram bins have the same range of values on all database members, with specific counts per bin per member. You can use the bins to analyze information on a per-member basis or globally across all database members.

Histograms are available for service subclasses, workloads, and work classes (through work actions). Histograms are collected for these objects when you specify one of the COLLECT AGGREGATE ACTIVITY DATA, COLLECT AGGREGATE REQUEST DATA, or COLLECT AGGREGATE UNIT OF WORK DATA clauses when creating or altering the objects. For work classes, histograms are also collected if you apply a COLLECT AGGREGATE ACTIVITY DATA work action to the work class.

You can optionally define a histogram template (by using the CREATE HISTOGRAM TEMPLATE statement) and specify a high bin value. All other bin values are automatically defined as exponentially increasing values that approach the high bin value. A measurement unit, which depends on the context in which the histogram template is used, is assigned to the histogram when a service subclass, workload, or work action is created or altered. The new histogram template overrides the default histogram template with a new high bin value.
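
A hedged sketch (the template name, bin value, and service class names are hypothetical; the unit here is milliseconds because the template is used for an activity lifetime histogram):

-- Define a histogram template whose top bin ends at 25 hours (90,000,000 ms),
-- then use it for the activity lifetime histogram of an assumed service subclass.
CREATE HISTOGRAM TEMPLATE LONG_LIFETIMES HIGH BIN VALUE 90000000;

ALTER SERVICE CLASS BATCH_SC UNDER DW_SUPER
  ACTIVITY LIFETIME HISTOGRAM TEMPLATE LONG_LIFETIMES;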

9.4 Configuring DB2 workload management in stages

The Implementing DB2 Workload Management in a Data Warehouse preferred practices paper defines four stages in the evolution of a DB2 workload management configuration:
• Default workload management configuration (Stage 0)
• Untuned workload management configuration (Stage 1)
• Tuned workload management configuration (Stage 2)
• Advanced workload management configuration (Stage 3)

For a complete description of these stages, download a copy of this paper from the following web page:
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Wc9a068d7f6a6_4434_aece_0d297ea80ab1/page/Implementing%20DB2%20workload%20management%20in%20a%20data%20warehouse

If your system includes a data warehouse, implementing a Stage 2 (tuned) workload management configuration at a minimum is recommended. Consider evolving to a Stage 3 implementation when specific performance or strategic requirements must be addressed.

9.4.1 Default workload management (Stage 0)

A Stage 0 DB2 workload management configuration is the default workload management configuration that is established when a database is first created. Its objective is to separate user and system work for monitoring purposes. All user connections to a DB2 database are mapped to a default workload, SYSDEFAULTUSERWORKLOAD. All incoming user work for the database runs in the SYSDEFAULTUSERCLASS service class, system work runs in the SYSDEFAULTSYSTEMCLASS service class, and background maintenance work runs in the SYSDEFAULTMAINTENANCECLASS service class.

9.4.2 Untuned workload management (Stage 1)

A Stage 1 untuned workload management configuration is the result of transforming the default Stage 0 configuration by creating a service superclass and six subclasses. You can do this transformation by applying the configuration template that is described in Implementing DB2 Workload Management in a Data Warehouse. All work is mapped into the appropriate service subclass, based on the estimated cost and type of activity, by using a DB2 work class set.

The Stage 1 configuration has the following objectives:
• Promote database system stability by ensuring that incoming demand does not exceed capacity.
• Achieve a better understanding of the work being done and its timing through baseline monitoring. Use this feedback when you tune the settings to better reflect the environment.

9.4.3 Tuned workload management (Stage 2)

A Stage 2 tuned workload management configuration is the result of tuning and customizing the Stage 1 untuned configuration. Tuning and customization are accomplished by adding active concurrency thresholds to more closely model the actual working environment. A Stage 2 configuration is a stable configuration that includes a monitoring regime and is subject to periodic review.

The Stage 2 configuration has the following objectives:
• Achieve the optimal implementation of a stable and predictable system.
• Identify and address any misclassified or misbehaving work.
• Identify the primary sources of work to enable more refined monitoring of performance and resource consumption. This objective is a precursor to further Stage 3 customization.

Workload management configuration template
The configuration template provides a structure that functions well for all warehouse deployments. The template consists of several predefined service classes and thresholds, and is intended to help stabilize the performance characteristics of the warehouse and provide a solid foundation for further customization. In practical terms, this means that the template is meant to drive a tuned (Stage 2) workload management configuration with a view to eventually achieving an advanced (Stage 3) configuration.

The template defines the following objects:
• A single DB2 service superclass that contains a series of service subclasses, each representing a distinct work type that is found in typical DB2 warehouse environments.
• A DB2 work class set with a unique work class definition representing each of the distinct work types.
• A DB2 work action set based on these work class definitions on the new service superclass to map incoming work to the appropriate service subclasses; all other work is left to run in the default service subclass of the new service superclass.
• Activity timeout and concurrency thresholds. The activity timeout thresholds are enabled by the template, but they only collect information about violations; they do not stop any activities that exceed the defined limit. The concurrency thresholds are created in a disabled state and are activated during the implementation process.

9.4.4 Advanced workload management (Stage 3)

A Stage 3 advanced workload management configuration is any configuration that exceeds or differs significantly from a Stage 2 configuration. A Stage 3 configuration is recommended when business requirements exceed basic system stability. Such requirements might include ensuring consistent throughput for one application above all others, or offering different levels of service to users depending on their funding arrangements.

9.4.5 Moving from Stage 0 to Stage 2

The following list summarizes the basic steps that you can take to systematically move your system’s workload management configuration from a Stage 0 configuration to a Stage 2 configuration:
1. Apply the workload management configuration template to your system.
2. Collect baseline monitoring information about the estimated cost distribution of SQL statements on your system.
3. Adjust template work class (and possibly service class) definitions to better reflect the types and distribution of work in your environment.
4. Collect baseline monitoring information about resource consumption by service subclasses on your system.
5. Adjust the concurrency threshold definitions on the different service subclasses to better reflect preferred resource allocations.

6. Define activity thresholds to detect misclassified or misbehaving queries and prevent them from threatening the system.
7. Establish a monitoring plan to ensure the ongoing health of your Stage 2 configuration, and repeat steps 1 - 6 as needed to keep your configuration tuned.

9.5 Workload management dispatcher

The workload management dispatcher (available since DB2 for Linux, UNIX, and Windows 10.1) lets you specifically allocate CPU resources to work that is run on a database server; CPU resource entitlements can be controlled by using CPU shares and CPU limit attributes on DB2 service classes.

The workload management dispatcher, which operates at the instance level, limits the number of running agents that are dispatched to the operating system and how long each agent is allowed to run. The term dispatch concurrency level refers to the number of running agents that can be dispatched simultaneously. You can set the dispatch concurrency level by using the wlm_disp_concur database manager configuration parameter.

The dispatcher manages service class CPU resource entitlements by using the following attributes:
• Soft CPU shares (uncapped, almost unrestricted). Use soft CPU shares to give any unused CPU resources to service classes running high priority work.
• Hard CPU shares or CPU limits (capped, limited). Use hard CPU shares or CPU limits to enforce controls on the CPU resource entitlements of service classes running low priority work, limiting their impact on high-priority work.

The dispatcher supports flexible CPU allocation by enabling both permanent allocations that are enforced all the time (hard CPU shares and CPU limits) and dynamic allocations that are enforced only when demand exceeds capacity (soft CPU shares).

The workload management dispatcher is disabled by default. To enable the dispatcher, set the wlm_dispatcher database manager configuration parameter to YES. When the dispatcher is enabled, all work that is running in the user and maintenance service classes is under dispatcher control. Work that is running in the system service class cannot be configured for CPU resource control because critical DB2 subsystems that run in this service class are given maximum priority and are not subject to dispatcher control.

By default, the dispatcher can manage CPU resources only by way of CPU limit settings. To enable the dispatcher to manage CPU resources by using both CPU shares and CPU limits, set the wlm_disp_cpu_shares database manager configuration parameter to YES. You can set and adjust CPU shares and CPU limits by using the CREATE SERVICE CLASS and ALTER SERVICE CLASS statements.

A service class with hard CPU shares assigned cannot exceed its CPU resource entitlement to consume unused CPU resources that become available on the host or logical partition (LPAR) while work is still running in competing service superclasses or in competing service subclasses within the same service superclass. If competing workloads are not present, service classes with hard CPU shares are able to claim unused CPU resources.

The hard CPU shares setting is most effective when you want to prevent work running in a service class from interrupting more important work running on the host or LPAR. Assign hard CPU shares to service classes running complex or intensive queries that might otherwise degrade the performance of higher priority work because of contention for limited resources.

A service class with soft CPU shares that are assigned can exceed its CPU resource entitlement to consume any unused CPU resources that become available on the host or LPAR. If two or more service classes have soft shares, unused CPU resources become available, and there is enough CPU resource demand from each service class to consume the spare capacity, allocation of the CPU resources is done proportionally according to the relative share of each active service class.

The soft CPU shares setting is most effective for high-priority work that should be able to temporarily claim any spare CPU resource that becomes available, or for workloads that are expected to consume little resource beyond their immediate CPU requirements.

Configure a CPU limit to enforce a fixed maximum CPU resource entitlement for work in a service class. If a CPU limit is set for all service classes, you can reserve a portion of the CPU resource to perform work regardless of any other work running on the instance. The CPU resource allocation of any service class is computed from the shares of that service class relative to the shares of all other service classes within the instance.

Although CPU limits can be configured at either the service superclass level or the subclass level, by applying CPU limits to your superclasses and CPU shares to your subclasses, you can use the CPU limits to control the absolute CPU resource entitlement of each superclass, and the CPU shares to control the relative CPU resource entitlements of service subclasses running within those superclasses.
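
A hedged sketch of that pattern (the names and numbers are arbitrary, and the dispatcher must first be enabled through the wlm_dispatcher and wlm_disp_cpu_shares database manager configuration parameters):

-- Cap the superclass at 70% of the CPU, then divide that entitlement between two
-- subclasses: interactive work gets soft shares (it can borrow idle CPU), and
-- batch work gets hard shares (it cannot).
ALTER SERVICE CLASS DW_SUPER CPU LIMIT 70;
ALTER SERVICE CLASS INTERACTIVE_SC UNDER DW_SUPER SOFT CPU SHARES 7000;
ALTER SERVICE CLASS BATCH_SC UNDER DW_SUPER HARD CPU SHARES 3000;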

The workload management dispatcher always uses the most restrictive CPU limits or CPU shares assignments when allocating CPU resources to service classes. For example, if a service class reaches its CPU limit before it fully uses its shares-based CPU resource entitlement, the dispatcher uses the CPU limit.

Before enabling the workload management dispatcher for the first time, monitor your workloads to determine the relative CPU resources that they consume. This information can help you to make decisions about service class creation, CPU shares assignment, and whether to use CPU limits.

Table functions and monitor elements are provided to help you monitor the performance of the workload management dispatcher. After analyzing the collected data, you can adjust the dispatcher concurrency level or redistribute CPU entitlements by adjusting service class CPU shares and CPU limits to tune the dispatcher performance.

For a complete description of the DB2 workload management dispatcher, see the DB2 for Linux, UNIX, and Windows 10.1 information center at the following web page: http://pic.dhe.ibm.com/infocenter/db2luw/v10r1/index.jsp

9.6 Default query concurrency management

DB2 for Linux, UNIX, and Windows 10.5 introduces columnar capabilities to DB2 databases, which include data that is stored with column organization and vector processing of column data. (A column-organized table is a table where the data pages contain column data instead of row data.) This enhancement is ideal for analytic or OLAP workloads. If most of the tables in your database are going to be column-organized tables, set the DB2_WORKLOAD registry variable to ANALYTICS before you create the database. Doing so helps you configure memory, table organization, page size, and extent size, and enables automated workload management, which can immediately improve the performance of workloads with several queries running.

To ensure that heavier workloads on column-organized data do not overload the system when many queries are submitted simultaneously, there is a limit on the number of “heavyweight” queries that can run on the database concurrently. This limit can be implemented by using the default workload management concurrency threshold that is automatically enabled on new databases when the value of the DB2_WORKLOAD registry variable is set to ANALYTICS, or that can be manually enabled on existing databases.

The processing of queries against column-organized tables is designed to run fast by using the highly parallelized in-memory processing of data. The trade-off for this high performance is a relatively large resource footprint. The running of these types of queries is optimal when relatively few of them are admitted to the system concurrently. When the limit on the number of heavyweight queries is reached, the remaining queries are queued and must wait until other queries leave the system before beginning their execution. This can help ensure system stability when many complex ad hoc queries are running on systems that have not implemented a specific workload management strategy.

9.6.1 Default workload management objects for concurrency control

Default query concurrency management uses the following new default workload management objects within the existing DB2 workload management infrastructure:
• A SYSDEFAULTMANAGEDSUBCLASS service subclass under the existing SYSDEFAULTUSERCLASS superclass, where heavyweight queries run and can be controlled and monitored as a group.
• A CONCURRENTDBCOORDACTIVITIES threshold, SYSDEFAULTCONCURRENT, which is applied to the SYSDEFAULTMANAGEDSUBCLASS subclass to control the number of queries that run concurrently in that subclass.
• A SYSDEFAULTUSERWCS work class set and a new SYSMANAGEDQUERIES work class, which identify the class of heavyweight queries that are to be controlled. The SYSMANAGEDQUERIES work class encompasses queries that are classified as READ DML (an existing work type for work classes) falling above a timeron threshold that reflects heavier queries.
• A SYSDEFAULTUSERWAS work action set and SYSMAPMANAGEDQUERIES work action, which map all queries that fall into the SYSMANAGEDQUERIES work class to the SYSDEFAULTMANAGEDSUBCLASS service subclass.

When a database is created or an existing database is upgraded, the following behavior applies to the new default objects:
• The work action set is enabled by default so that queries meeting the criteria specified in the SYSMANAGEDQUERIES work class run in the SYSDEFAULTMANAGEDSUBCLASS service subclass.

• If the value of the DB2_WORKLOAD registry variable was set to ANALYTICS, the threshold on the SYSDEFAULTMANAGEDSUBCLASS service subclass is enabled by default for newly created databases only; otherwise, it is left disabled. If you want to disable the threshold later, it must be disabled manually.
• Running the DB2 Configuration Advisor causes the work action set and the work action to be enabled if either were disabled, and the threshold to be enabled or disabled, depending on whether DB2_WORKLOAD was set to ANALYTICS.

The default concurrency limit
A limit on the number of queries that can run concurrently should protect the system from overload, but also avoid over-throttling the workload to a point where performance might suffer. The default concurrency limit is calculated based on the current host configuration when the default workload management objects for concurrency control are created during database creation or upgrade. The limit is recalculated when the DB2 Configuration Advisor runs against the database. Rerun the DB2 Configuration Advisor after significant changes are made to the underlying host, such as enabling or removing CPU capacity.
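
As a hedged illustration (verify the SYSCAT.THRESHOLDS column names on your DB2 level), you can inspect the default threshold and, on an existing database, enable it manually:

-- Inspect the default concurrency threshold and its current limit.
SELECT THRESHOLDNAME, MAXVALUE, ENABLED
FROM SYSCAT.THRESHOLDS
WHERE THRESHOLDNAME = 'SYSDEFAULTCONCURRENT';

-- Enable the default threshold if it was left disabled.
ALTER THRESHOLD SYSDEFAULTCONCURRENT ENABLE;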

For more information about how to manually adjust the default query concurrency configuration, see the DB2 for Linux, UNIX, and Windows 10.1 information center on the following web page: http://pic.dhe.ibm.com/infocenter/db2luw/v10r1/index.jsp

Tuning the settings
The recommended method for tuning the default object settings is to examine the overall system resource usage with a full workload running, and to perform incremental adjustments based on whether the system resources appear underutilized or overutilized. A system with a run queue depth on the order of (10 * number of processor cores) and physical memory usage under 100% is considered to be in a fully used and healthy state. This section provides guidance about how to tune default concurrency control when the threshold is enabled and the system is running in a non-customized workload management environment.

For recommendations about tuning the default concurrency threshold value and the default work class timeron range when the system appears to be underutilized or overutilized, see the DB2 for Linux, UNIX, and Windows 10.1 information center. For a comprehensive set of recommendations that apply to monitoring both system utilization and workload characteristics, download a copy of Implementing DB2 Workload Management in a Data Warehouse from the following web page:
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Wc9a068d7f6a6_4434_aece_0d297ea80ab1/page/Implementing%20DB2%20workload%20management%20in%20a%20data%20warehouse


Chapter 10. Mining and unstructured text analytics

Data mining used to be the exclusive province of highly trained statisticians and mathematicians. Today, most businesses need many business analysts to pose business questions and answer them by using methodologies that are based in analytics. Embedded data mining is an approach to making sophisticated data mining methodology and technology available to many business users and enabling them to solve business problems or take better advantage of business opportunities.

This chapter describes data mining and text analytics and their implementation in the DB2 Warehouse. This includes business goals and scenarios, a description of the data mining process and unstructured (text) analytics, and the concepts of embedded data mining for deploying mining-based business intelligence (BI) solutions to many business users.

10.1 Overview

Essentially, data mining discovers patterns and relationships that are hidden in your data. It is part of a larger process that is called knowledge discovery, specifically, the step in which advanced statistical analysis and modeling techniques are applied to the data to discover useful patterns and relationships. The knowledge discovery process as a whole is essential for successful data mining because it describes the steps you must take to ensure meaningful results.

A pattern that is interesting (according to a user-imposed interest measure) and certain enough (again according to the user’s criteria) is called knowledge. The output of a program that monitors the set of facts in a database and produces patterns in this sense is discovered knowledge.

Unstructured analysis is the technique of extracting or transforming unstructured information into a structure of interest that can be analyzed together with existing structured information in the warehouse to create business insights.

This information of interest lies in various sources, such as call center notes, problem reports, emails, and product reviews, where its presence or value is hidden or unclear. After it is extracted, the resulting structured data can be used for further statistical analysis or as input to data mining.

DB2 Warehouse provides a robust and tightly integrated environment and tools for data mining and text analytics functions as extensions to core DB2 functions.

10.2 Business goals

The goal of data mining and text analytics is to produce low-volume results (only a few rules and relationships) that are also high-value results (new rules and relationships that nobody thought to look for). It is a purely data-driven discovery approach, in contrast to other BI techniques that mostly give precise and detailed answers to precisely formulated questions. The mining approach does not assume that users already know what they are looking for. Begin with the premise that you do not know what patterns or relationships exist in the data.

The other BI methods (for example, OLAP) are based on hypotheses. The user in this case knows the data well enough to formulate precise queries and use known relationships in the data.

Thus, the goal of data mining and analytics is to solve business problems by using the technology (mining algorithms) to support problem solving. Use the mining algorithms to detect and model potentially valuable but hidden patterns in your data. These techniques characterize relationships that are not expressed explicitly.

10.3 Business scenarios

This section describes some of the common scenarios where data mining and text analytics are applicable and useful.

10.3.1 In data mining

The data mining process begins with a problem that you want to solve. Because the purpose of this process is to generate results (discoveries, insights, information, and models) that help solve the business problem, you must begin by clearly stating the business problem, translating it into one or more questions that data mining can address, and understanding how the data mining results are used in a BI solution to improve the business.

Data mining has many high-value uses, but it is not always the correct tool for a particular purpose. For example, if the business need is to make the sales force more productive by reducing the time that is required to gather certain information, then the correct solution might be a new reporting system that is based on real-time, multidimensional queries. Conversely, if your need is to understand client behaviors and preferences to improve targeting or promotional campaigns, the correct solution might be based on client segmentation and link analysis (associations and sequences). Understanding the business problem and how the data mining results are deployed into the business enables you to determine the best approach to solving the problem.

Stating the business problem clearly and concisely forces you to focus on the root issue of what must change to enable you to accomplish a specific goal. After you identify the root issue and determine that data mining is appropriate, you can formulate one or more specific questions to address that issue.

For example, to design a new targeted promotion, you can ask the following questions:
- What do my customers look like? What are their behavioral and demographic profiles?
- Which customers should I target in a promotion? Who is the best target audience?
- Which products should I use for the promotion?
- Which products should I replenish in anticipation of a promotion?
- How should I lay out my stores based on customer preferences?
- What item is a customer most likely to purchase next?

Data mining is used successfully in many different business and scientific fields, including retail, banking, insurance, telecommunications, manufacturing, healthcare, and pharmaceuticals. For example, in the retail, financial, and telecom industries, well-known uses of data mining include client segmentation for targeted marketing and fraud detection, store profiling for category and inventory management, associations for cross-selling and upselling, and predictive techniques for client retention and risk management. In manufacturing, data mining applications include profiling and predictive methods for quality assurance, warranty claims mitigation, and risk assessment. In healthcare and pharmaceutical research, high-value uses include associations, sequences, and predictive techniques for disease management, associations and prediction for cost management, and patient segmentation for clinical trials. In every case, the data mining results must be conveyed in such a way that someone can make better decisions or formulate an action to solve the business problem.

10.3.2 In text analytics and unstructured analysis

Likewise, for text analysis, you solve the problem of identifying insights that are lost or buried in unstructured data and making them available in a structured form so that they can be consumed by data mining or OLAP reports.

For example, a huge amount of unstructured information (text), such as call center notes, problem reports, repair reports, insurance claims, emails with customers, and product reviews, exists in enterprises. This unstructured information cannot be used with existing BI tools to create insight.

Although the structured data yields valuable insights, many more insights are lost because of the difficulty of identifying them in the vast amount of unstructured data that the organization uses to run its business. Instead, the data warehouse contains mostly backward-oriented (historical) financial data that does not answer important business questions, which is a problem. This gap affects business analysts in medium to large enterprises and leads to wrong decisions because of missing information.

You can try to solve this problem by using text analytics to take the following actions:
- Find the top issues in the (unstructured) call center log and perform causal analysis by relating these issues to structured data that is associated with the log entry.
- Identify and address the top types of problems that are encountered by the most profitable customers to reduce loyal customer churn.
- Analyze customer ratings and customer surveys about your and your competitors' products.
- Analyze call center notes to find interest in other products (cross-sell) or new products (growth).
- Detect early warning signals in problem reports to avoid costly recalls and lawsuits.
- Analyze claims data and patient records to identify insurance fraud (health insurance).

So, you might end up taking some of the following actions:
- Create simple reports: Identify the top 10 customer satisfaction problems that are recorded in a text field of a customer survey.
- Create multidimensional reports: In retail, you might want to derive a new Return reasons OLAP dimension from text analysis and combine it with existing dimensions, such as time, geography, or product.
- Generate additional input fields for data mining to improve the predictive power of data mining models: You might want to use symptoms that are included in patient records to improve the predictive power of a data mining model that otherwise predicts which patients need treatment based only on available structured data, such as patient age, blood pressure, and related conditions.

10.4 Data mining techniques and process

This section describes the discovery and predictive methods that are used in data mining and the techniques of each method. It also gives an overview of the various steps in the process of implementing a data mining solution.

10.4.1 Data mining techniques

Data mining techniques can be divided into two broad groups:
- Discovery
- Predictive

Discovery methods are data mining techniques that find patterns that exist in the data, but without any prior knowledge of what those patterns might be. The objective is to discover the relationships that are inherent in the data.

There are three discovery methods:
- Clustering groups data records into segments by how similar they are, based on the attributes of interest. Clustering can be used, for example, to find distinct profiles of clients with similar behavioral and demographic attributes to create a client segmentation model.
- Associations are a type of link analysis that finds links or associations among the data records of single transactions. A common use of the Associations method is market basket analysis, which finds which items tend to be purchased together in a single market basket, such as chips and soda.
- Sequences are a type of link analysis (sequential patterns) that finds associations among data records across sequential transactions. A store can use sequential patterns to analyze purchases over time and, at checkout time, use that model to print customized discount coupons for customers to use on their next visit.

Predictive methods are data mining techniques that can help predict categorical or numeric values.

There are three predictive methods:
- Classification is used to predict values that fall into predefined buckets or categories. For example, a classification model can predict whether a particular treatment cures, harms, or has no effect on a particular patient.
- Regression is used to predict a numerical value on a continuous scale, for example, predicting how much each customer spends in a year. If the range of values is 0 - 1, the prediction becomes the probability of an event happening, such as the likelihood of a customer leaving.
- Time series forecasting predicts the values of a numerical data field for a future period. You can use a time series model to predict future events based on known past events. For example, forecasting can be used to create stock level forecasts to reduce warehouse costs.

For more information about the discovery and predictive mining methods that are available in DB2 Warehouse, see Dynamic Warehousing: Data Mining Made Easy, SG24-7418.

10.4.2 Data mining process: preparing, building, and deploying

After you determine that mining is the analytical approach that best fits your goals and requirements (that is, your business problem), the approach to data mining can be summarized as a series of steps, as shown in Figure 10-1.

Figure 10-1 Data mining process

The data mining process includes the following steps:
1. Define the data mining approach: Break down the business problem into components and determine whether each component can be solved by using data mining or text analytics and which technique is suitable for that component.
2. Perform data preparation for data mining: Consider the data that is available, and understand what data types and formats (and the ETL process that is required to extract, transform, derive, and format the data) are needed to enable you to achieve your goal.
3. Perform the data mining process: Build models to run the wanted technique with the appropriate parameters. Modeling is an interactive and iterative process as the initial results are reviewed, model parameters are adjusted to produce a better model, and any additional data preparation is performed.
4. Interpret and evaluate results: Visualize the model to help interpret the result, assess the model quality, and determine whether the model fulfills its business purpose. Improvements to the input data, model parameters, and modeling technique can be made to obtain a model that meets the objective.
5. Deploy the solution: This is the final and most important step in the data mining process because how and where the results are deployed are crucial to realizing the maximum value from the data mining. This step leans heavily on the usage of scoring techniques and scoring results (applying a data mining model to generate a prediction for each record, depending on the type of model).

10.5 Data mining and text analytics in DB2 Warehouse

The Mining and Analytic capability in DB2 is a suite of statistical, preprocessing, and mining functions that you can use to analyze large databases. It also provides visualization tools for viewing and interpreting mining results.

Mining functions use elaborate mathematical techniques to discover hidden patterns in your data. After you interpret the results of your data mining process, you can modify your selection of data, data processing and statistical functions, or mining parameters to improve and extend your results.

Data mining is an iterative process that typically involves selecting input data, transforming it, running a mining function, and interpreting the results. The DB2 Intelligent Miner for Data assists you with all the steps in this process. You can apply the functions of the DB2 Intelligent Miner for Data independently, iteratively, or in combination.

Figure 10-2 shows the shared and integrated environment, which greatly enhances ease of use and makes data mining accessible to many users with various needs and skill sets, not just a small group of experts, and which eliminates the need to move data to a separate analytic environment.

Figure 10-2 Data mining and text analytics in a DB2 Warehouse environment

10.5.1 Data mining

DB2 Warehouse has three components (Figure 10-3) for DB2-based data mining to support each of the steps of the mining process:
- Modeling component
- Visualization component
- Scoring component

Figure 10-3 Data Warehouse components

Modeling

The modeling component is a DB2 SQL application programming interface (API) that is implemented as a DB2 extender. Modeling is accessed graphically through the Design Studio to build data mining models from information in DB2 databases.

All six discovery and predictive methods of data mining described in 10.4.1, “Data mining techniques” on page 168 have multiple algorithms to provide more flexibility in model development.

These IBM DB2 Extenders™ provide a set of SQL stored procedures and user-defined functions to build a model and store it in a DB2 table. These procedures and functions are collectively referred to as easy mining procedures. As the model is set up through graphical wizards in Design Studio, the easy mining procedures use the wizard inputs to automatically create mining tasks that specify the type of model to be built, the parameter settings, the data location, and data settings (for example, which columns to use in the model), and to call the appropriate mining kernel in DB2.

A DB2 table containing columns representing the behaviors and other attributes of the records (such as clients, stores, accounts, and machines), including the response or outcome, if any, is used as the data source for building (training) the model and, for predictive mining, validating (testing) the model.
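As a minimal sketch only, the following hypothetical training table illustrates this kind of data source for a churn-prediction (classification) model; the table and column names are assumptions for illustration and are not part of DB2 Warehouse.

   -- Hypothetical training table: one row per client, with behavioral
   -- attributes and the known outcome (CHURNED) used to train and,
   -- for predictive mining, validate the model.
   CREATE TABLE mart.churn_training (
     customer_id       INTEGER       NOT NULL,
     age               SMALLINT,
     tenure_months     SMALLINT,
     avg_monthly_spend DECIMAL(9,2),
     num_complaints    SMALLINT,
     churned           CHAR(1)       -- 'Y' or 'N': the response column
   );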

The new model is stored in Predictive Model Markup Language (PMML) format in a DB2 table where it is accessible for deployment.

Visualization

The visualization component in DB2 Warehouse is a Java application that uses SQL to call and graphically display PMML models, enabling the analyst to assess a model’s quality, decide how to improve the model by adjusting model content or parameters, and interpret the final model results for business value.

Visualization has visualizers for all six mining methods in the modeling component. The visualizers are tailored to each mining method and provide various graphical and tabular information for model quality assessment and interpretation in light of the business problem.

Visualization can also display PMML models that are generated by other tools if the models contain appropriate visualization extensions, such as the quality information or distribution statistics that are produced by modeling. Models that lack this extended information do not present as well in visualization.

Scoring

Like modeling, the scoring component is implemented as a DB2 extender. It enables application programs to use the SQL API to apply PMML models to large databases, subsets of databases, or single records. Because the focus of the PMML standard is interoperability for scoring, scoring supports all the model types that are created by modeling, and selected model types that are generated by other applications (for example, SAS and IBM SPSS®) that support PMML models.

Scoring also supports the radial basis function (RBF) prediction technique, which is not yet part of PMML. RBF models can be expressed in XML format, enabling them to be used with scoring.

Scoring includes scoring JavaBeans, which enables the scoring of a single data record in a Java application. This capability can be used to integrate scoring into client-facing or e-business applications, for example.

A PMML model is either created and automatically stored by modeling or created by another application and imported into DB2 by using the scoring component’s SQL import function. Accessed through Design Studio, scoring applies the PMML model to new data. The score that is assigned to each record depends on the type of mining model. For example, with a clustering model, the score is the best-fit cluster for a given record. The results are written to a new DB2 table or view, where they are available to other applications for reporting or further analysis, such as OLAP.
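As an illustrative sketch only, the following query shows how such a scoring output table might be used downstream; the table and column names (a clustering score table and the earlier hypothetical training table) are assumptions, not objects that the scoring component creates for you.

   -- Hypothetical reporting query: profile each best-fit cluster that
   -- the scorer assigned, joining the score table back to the input data.
   SELECT s.cluster_id,
          COUNT(*)                 AS customers,
          AVG(t.avg_monthly_spend) AS avg_spend
     FROM mart.customer_cluster_scores s
     JOIN mart.churn_training t
       ON t.customer_id = s.customer_id
    GROUP BY s.cluster_id
    ORDER BY customers DESC;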

10.5.2 Unstructured analysis

DB2 Warehouse uses the Unstructured Information Management Architecture (UIMA) information extraction implementation to support analysis of unstructured data. UIMA is an open, scalable, and extensible platform for creating, integrating, and deploying text-analysis solutions.

Figure 10-4 on page 175 shows the architecture of the UIMA integration.

Figure 10-4 The architecture of the UIMA integration

The middle part shows the Text Analytics transformation run time, which is embedded in the database server and runs SQL-based transformations in the DB2 database.

Before you can run the transformation flows, you must design them. The design-time components on the left side of Figure 10-4 consist of two parts:
- A workbench to configure the text analysis engines (in UIMA terminology, annotators). For example, if you have a rule-based annotator, you must specify the rules, depending on your business problem and text corpus. If you have a list- or dictionary-based annotator, you must be able to specify the list of words to be used. UIMA annotators, or analysis engines, are UIMA-based components that are used to extract entities such as names, sentiments, or relationships.
- The second part, which is used after the analysis engine is configured, defines the transformation flows themselves. Specify the input table to be analyzed and the configured analysis engine to be used, and map the analysis results to columns and tables in the database.

After you convert the text into structure, you can use existing and well-known reporting and analysis tools from IBM (for example, Cognos).

UIMA text analytics can be considered an “ELT for text” that extracts structured information from the text. The extracted information is stored in a relational database (DB2 Data Warehouse) and can be used to describe OLAP metadata (Cubing Services). Reporting or OLAP tools, such as Cognos, can use that metadata to create reports on the combination of the pre-existing structured data and the structured data that is obtained from the text analysis.

Combining the results of text analysis on unstructured textual information with structured data in data mining lets you analyze structured and unstructured information together to create new insight. By using the text analysis results, data mining algorithms can also improve the predictive power of the data mining models.
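As a small, hedged sketch of this combination, the following query joins a hypothetical table of issue categories extracted by a text analysis flow with structured customer data; all table and column names are illustrative assumptions.

   -- Hypothetical query: rank the extracted call center issue categories
   -- that are mentioned by the most profitable customers.
   SELECT e.issue_category,
          COUNT(*) AS mentions
     FROM mart.callcenter_issues e          -- output of a text analysis flow
     JOIN mart.customer_dim      c
       ON c.customer_key = e.customer_key
    WHERE c.profit_segment = 'TOP'
    GROUP BY e.issue_category
    ORDER BY mentions DESC
    FETCH FIRST 10 ROWS ONLY;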

Step-by-step details to create data mining and text analysis flows using Design Studio in DB2 Warehouse are described in InfoSphere Warehouse: A Robust Infrastructure for Business Intelligence, SG24-7813.


Chapter 11. Providing the analytics

Online Analytical Processing (OLAP) is a powerful data analytical method that is a core component of data processing. It enables users to interrogate data by intuitively navigating data from a summary level to a detail level.

The tools that are available in the DB2 Advanced Enterprise Server Edition bundle, such as IBM Cognos Business Intelligence and IBM InfoSphere Warehouse Cubing Services, provide OLAP capability to navigate data that is contained in enterprise DB2 data warehouses.

This chapter focuses on the Relational OLAP (ROLAP) capabilities of the Dynamic Cubes feature of IBM Cognos Business Intelligence V10.2. This feature enables high-speed interactive analysis and reporting of terabytes of data. Dynamic Cubes uses in-memory member and data caches, and its awareness of in-memory and in-database aggregates, to provide fast query performance for many users.

This chapter provides an overview of the Dynamic Cubes architecture, and modeling and deploying OLAP functions by using IBM Cognos Business Intelligence. It also describes the Dynamic Cubes lifecycle.

Semantically, Cognos Dynamic Cubes is a generational evolution of the Cubing Services technology. Integration between Dynamic Cubes and IBM InfoSphere Warehouse Cubing Services is also described.

11.1 Implementing OLAP

OLAP is a key component of many data warehousing and business intelligence projects and often is a core requirement of users. OLAP enables users (business analysts, managers, and executives) to gain insight into data through a fast, consistent, and interactive interface, enabling increased productivity and faster decision making.

11.1.1 Dimensional model

In a dimensional data warehouse, you model relational database tables by using a star or snowflake schema. This type of data warehouse stores information about the data in fact and dimension tables, describing the relationships within the data by using joins between the dimension and fact tables, the collection of dimension keys in a fact table, and the different attribute columns in a dimension table.

The OLAP style of reporting is performed by using a specific type of data organization called multidimensional modeling. Multidimensional modeling organizes the data into facts (also called measures) and dimensions. Facts are the data values that you want to analyze. Dimensions describe how you want to view the data. For example, a “sales amount” is a fact that you want to measure, whereas “store”, “region”, and “time” are the dimensions (how you want to see the data presented). This is called viewing sales by store, by geography, and by time, and is shown in Figure 11-1.

Figure 11-1 A simple star-schema
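As a minimal sketch only (the table and column names are illustrative assumptions, not a prescribed design), DDL for a star schema like the one in Figure 11-1 might look like the following example, with one fact table and two dimension tables.

   -- Hypothetical star schema: sales amount by store/region and time.
   CREATE TABLE mart.date_dim (
     date_key      INTEGER  NOT NULL PRIMARY KEY,
     calendar_date DATE     NOT NULL,
     month_nbr     SMALLINT NOT NULL,
     year_nbr      SMALLINT NOT NULL
   );

   CREATE TABLE mart.store_dim (
     store_key  INTEGER     NOT NULL PRIMARY KEY,
     store_name VARCHAR(60) NOT NULL,
     region     VARCHAR(40) NOT NULL
   );

   CREATE TABLE mart.sales_fact (
     date_key     INTEGER       NOT NULL,   -- joins to date_dim
     store_key    INTEGER       NOT NULL,   -- joins to store_dim
     sales_amount DECIMAL(15,2) NOT NULL
   );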

In addition to the relationship between fact and dimensions, there are data relationships that are present in a dimension. Attributes in a dimension represent how data is summarized or aggregated. These are organized in a hierarchy.

Figure 11-2 shows a typical time hierarchy. Sales Amount in a specific store, in a specific region, can be aggregated at Year, Month, and Day levels.

Figure 11-2 A simple time hierarchy
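To illustrate aggregation along such a hierarchy in SQL (reusing the hypothetical tables from the earlier star schema sketch), a ROLLUP grouping computes the Day, Month, and Year levels in a single pass; this query is only a sketch, not a required modeling step.

   -- Hypothetical query: sales aggregated at Day, Month, and Year levels.
   SELECT d.year_nbr, d.month_nbr, d.calendar_date,
          SUM(f.sales_amount) AS sales_amount
     FROM mart.sales_fact f
     JOIN mart.date_dim  d
       ON d.date_key = f.date_key
    GROUP BY ROLLUP (d.year_nbr, d.month_nbr, d.calendar_date);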

Note: It is important to ensure that the star or snowflake schema dimensions and facts that are used in the data warehouse do not have cases where fact records have unknown or invalid dimension key values, or invalid snowflake dimensions (where an outer table has no matching values in the inner table).

To prevent these cases, enforce referential integrity. Typically, referential integrity enforcement is turned off or made declarative (informational) and is instead enforced during extract, transform, and load (ETL) processing. Erroneous modifications that are made to the data during or outside of the ETL process can create cases where a fact table has no matching dimension records.
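As a hedged sketch of the declarative approach, DB2 lets you define informational (NOT ENFORCED) referential constraints; the table names below reuse the hypothetical star schema from the earlier sketch and are assumptions. Use ENABLE QUERY OPTIMIZATION only when the ETL process really does keep the data clean, because the optimizer trusts the constraint.

   -- Informational constraint: documents the join for the optimizer,
   -- while actual enforcement is left to the ETL process.
   ALTER TABLE mart.sales_fact
     ADD CONSTRAINT fk_sales_date
         FOREIGN KEY (date_key) REFERENCES mart.date_dim (date_key)
         NOT ENFORCED
         ENABLE QUERY OPTIMIZATION;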

11.1.2 Providing OLAP data

IBM Cognos Business Intelligence provides a proven enterprise BI platform with an open-data access strategy. It enables customers to pull data from various data sources, package the data into a business model, and make the data available to consumers in various interfaces that are suited to the task, using business language that is relevant to consumers.

Figure 11-3 shows how Cognos Dynamic Cubes is tightly integrated into the Cognos BI stack, and its data can be displayed through any of the Cognos interfaces. This method allows existing customers to integrate this technology into their application environment without affecting existing users, who are already familiar with interfaces such as Report Studio, Business Workspace, and Business Workspace Advanced.

Figure 11-3 Dynamic Cubes in the integrated Cognos BI stack

Cognos Dynamic Cubes uses the database and data cache for scalability, and also uses a combination of caching, optimized aggregates (in-memory and in-database), and optimized SQL to achieve performance. The Cognos Dynamic Cubes solution uses multi-pass SQL that is optimized for the relational database, minimizing the movement of data between the relational database and the Cognos Dynamic Cubes engine. It is aggregate-aware, and able to identify and use both in-memory and in-database aggregates to achieve optimal performance. It optimizes aggregates (in-memory and in-database) by using workload-specific analysis.

This solution can achieve low latency over large data volumes, such as billions of rows or more of fact data and millions of members in a dimension.

11.1.3 Using OLAP data

Data warehouses are the recognized foundation for enterprise analytics. They enable an organization to bring together cleansed data from separate sources of input, both internal and external, such as from partners or suppliers. Data warehouses provide higher-quality information and enable high-performance data access for analytic-style applications.

Cognos Dynamic Cubes technology helps leverage the core strengths of an enterprise data warehouse (EDW) and takes the EDW to the next level of performance for analytics by making deployment and tuning easier and faster.

11.1.4 Pulling it all together

With data volumes across businesses exploding into the petabyte range, growth is accelerating at an unprecedented pace. Understanding your own business and the data holdings that you own is more important than ever. Analytics is key to understanding and optimizing the data and providing concrete business results. With this growing urgency to make sense of the sea of data, using data warehouse technology and maintaining a connection between your data warehouse and your analysis tools helps rationalize your investment in this technology.

Easy visibility into enterprise data is a critical step on the journey to gain insight from various sources, whether those sources are traditional or emerging. Semi-structured or unstructured data makes sense only within the context of your business. Your data warehouse provides this context.

Cognos Dynamic Cubes gives customers an additional tool in the query stack to handle the growing business problem of high-performance analytics over large data warehouses.

Provided that it has a correctly designed data warehouse (in addition to database server performance features such as BLU Acceleration) on a server with adequate memory, Cognos Dynamic Cubes can handle medium to large data volumes (25 million or more fact table rows) and is an ideal choice for your needs.

11.2 Cognos and the cube model

Cognos Dynamic Cubes introduces a performance layer in the Cognos Query stack to allow low-latency and high-performance OLAP analytics over large relational data warehouses. By using the power and scale of a relational database, Cognos Dynamic Cubes can provide OLAP analytics of terabytes of warehouse data.

This section provides an overview of IBM Cognos Dynamic Cubes and describes how the solution extends the scalability of the IBM Cognos Business Intelligence platform and uses the core strengths of an enterprise data warehouse. This section also describes the Cognos architecture, the IBM Cognos Cube Designer, the Cognos Dynamic Cubes Server, and the Aggregate Advisor (a part of IBM Cognos Dynamic Query Analyzer).

11.2.1 Cognos architecture

IBM Cognos Business Intelligence (BI) has a multitiered architecture. For description purposes, it can be separated into three tiers: web server, applications, and data. The tiers are based on business function, and are typically separated by network firewalls. IBM Cognos BI user interfaces sit above the tiers (Figure 11-4).

Figure 11-4 IBM Cognos Business Intelligence architecture

IBM Cognos BI is configured by using IBM Cognos Configuration, which is also used to start and stop IBM Cognos services.

In addition to IBM Cognos Configuration, IBM Cognos BI has web-based and Windows-based user interfaces, such as Framework Manager, the various Studios (for querying and analyzing data), and Cognos Administration.

The IBM Cognos BI web server tier contains one or more IBM Cognos BI gateways. Web communication in IBM Cognos Business Intelligence is typically through gateways, which are on one or more web servers.

The IBM Cognos BI applications tier contains one or more IBM Cognos BI servers. An IBM Cognos BI server runs requests, such as reports, analyses, and queries, which are forwarded by a gateway. It also renders the IBM Cognos Connection, which is the portal to all web-based interfaces.

The IBM Cognos Business Intelligence data tier includes the content store. The content store is a relational database that contains data that your IBM Cognos BI product needs to operate, such as report specifications, published models, and the packages that contain them, connection information for data sources, information about the external namespace and the Cognos namespace itself, and information about scheduling. The query databases that contain the data that the reports extract are also part of the data tier.

More details about these various tiers and their components can be found in the IBM Cognos Business Intelligence 10.2.0 information center (see 11.4, “Resources” on page 193).

11.2.2 Cognos metadata and Cognos Cube Designer

IBM Cognos Cube Designer is the application that you use to model dimensional metadata and dynamic cubes. You use it to build dynamic cubes and publish them for use in the IBM Cognos Studios. To get started, you import metadata from a relational database. Using the metadata, you model dynamic cubes and save the cube definitions in a project. After you publish the cubes, they are listed as data sources in Content Manager and their related packages are available to report authors.

If you want to model dimensional metadata and dynamic cubes based on a relational database, you import the metadata from a Content Manager data source.

Note: There are two modeling tools available in IBM Cognos BI V10.2: Framework Manager is a metadata modeling tool that is used for designing dimensionally modeled relational (DMR) metadata. Dynamic Cubes can be modeled by using the Cube Designer only.

You can import cube metadata from an IBM InfoSphere Warehouse Cubing Services model. IBM Cognos Cube Designer creates a project with a separate dynamic cube for each cube that is contained in the imported model.

11.2.3 Dimensional metadata and dynamic cubes

Understanding concepts that relate to dimensional metadata and dynamic cubes helps you plan and create effective dynamic cubes.

In Cognos Dynamic Cubes, dimensional metadata refers to dimensions and hierarchies. You can create commonly used dimensional metadata that is independent of any dynamic cubes in a project. The appropriate dimensional metadata can then be shared by one or more cubes in a project.

A dynamic cube represents a multidimensional view of a star or snowflake schema. It is based on a single fact table and defines the relationships between dimensions and measures.

A basic dynamic cube contains the following items:
- A measure dimension that contains at least one measure
- At least one dimension
- At least one hierarchy, and its associated levels, defined for each dimension
- Mappings between the measures and dimensions
- Attributes that reference table columns either directly, by expressions, or by an expression that is a constant value

11.3 Dynamic Cubes architecture and its lifecycle

Cognos Dynamic Cubes adds relational online analytic processing (ROLAP) capabilities to the IBM Cognos Dynamic Query Mode (DQM) server in Version 10.2 of IBM Cognos, enabling reporting and ad hoc analysis.

This section provides an overview of the various components of the Cognos Dynamic Cubes architecture in relationship to the main activities that are performed during a lifecycle by using the corresponding tools.

Figure 11-5 on page 185 shows the lifecycle of Dynamic Cubes.

Figure 11-5 Lifecycle of Dynamic Cubes

IBM Cognos Cube Designer provides dynamic cube design and modeling capability. The Administration Console is used to deploy and manage the cube data. The IBM Cognos Dynamic Query Mode (DQM) server maintains the cube data. Studio applications use the data in reporting environments. In addition, tools such as Dynamic Query Analyzer and its Aggregate Advisor are used to analyze and optimize the data as necessary.

11.3.1 Design and model phase

Cognos Cube Designer is a modeling tool that brings together the best modeling principles from past successful modeling technology, with a modern and extensible architecture. The first step to deploying Cognos Dynamic Cubes is to provide a model by using the Cognos Cube Designer. During the modeling phase of the lifecycle, IBM Cognos Cube Designer is used to perform the following tasks:
- Importing relational metadata to use as the basis for dynamic cube design
- Designing dynamic, aggregate, and virtual cubes
- Setting cube-level security for hierarchies and measures
- Publishing the dynamic cube

11.3.2 Deploy phase

During the deploy phase of the lifecycle, an administrator uses a published dynamic cube as a data source, configures the cube in IBM Cognos Administration, and makes it available for use. The administrator might perform the following tasks:
- Adjust the memory allocation for the query service and in-memory aggregates.
- Assign users, groups, and roles to security views.
- Monitor cube metrics.
- Start and stop cubes.
- Refresh the caches.
- Enable logging.
- Schedule administrative tasks.

The Cognos Dynamic Cubes server manages all aspects of data retrieval, and leverages memory to maximize responsiveness while giving you the full flexibility to manage what is contained in memory and when you want to refresh in-memory data. You manage dynamic cubes in the Cognos Administration environment.

11.3.3 Run phase

After the dynamic cube is configured and started, the report author can create reports by using the various reporting applications, such as Business Insight Advanced, Report Studio, or Analysis Studio. In the run phase of the lifecycle, users can do the following tasks:
- Validate the design of the cube.
- Author reports.
- Run reports.
- Evaluate the performance to determine whether any optimization is required.

11.3.4 Optimization

If the performance of the reports does not meet your expectations, optimization is performed to achieve the wanted results. By using Cognos Administration, you can adjust various performance parameters.

Cognos Dynamic Cubes supports two types of pre-computed aggregate values: those stored in database tables and those stored in its in-memory aggregate cache. For warehouses that do not yet have aggregates, or to supplement existing database aggregates with in-memory and other in-database aggregates, the Aggregate Advisor in IBM Cognos Dynamic Query Analyzer can be used to generate various recommendations.

Aggregate Advisor

Aggregate Advisor is a tool that is available with IBM Cognos Dynamic Query Analyzer that can analyze the underlying model in a dynamic cube data source and recommend which aggregates to create. These aggregates can be created both in-database and in-memory.

Aggregate Advisor can also reference a workload log file to suggest aggregate tables (in-database or in-memory) that correspond directly to the reports that are contained in the log file. Before you run the Aggregate Advisor, the expectation is that the dynamic cube is published in the content store, can be started successfully, and that reports and analyses run and return correct results.

The usage of aggregates can improve the performance of queries by providing data that is aggregated at levels higher than the grain of the fact.

Database aggregates

Database aggregates are tables of pre-computed data that can improve query performance by reducing the number of database rows that are processed for a query.

Cognos Dynamic Cubes brings the aggregate routing logic up into its query engine, where the multidimensional OLAP context is preserved and Cognos Dynamic Cubes can better determine whether to route to aggregates.

Cognos Dynamic Cubes can also use aggregate tables that the database optimizer does not know about (for example, MQTs that are not enabled for queries, or regular tables that contain aggregated data) or that the database optimizer might not route to because of complex OLAP-style SQL. After a database aggregate table is modeled as an aggregate cube, Cognos Dynamic Cubes can select data directly from that database aggregate table.
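As a hedged sketch of such a database aggregate over the hypothetical star schema used earlier (the names are assumptions), a DB2 materialized query table can hold monthly sales by store. DISABLE QUERY OPTIMIZATION keeps the DB2 optimizer from routing queries to it, which mirrors the case where only Cognos Dynamic Cubes uses the table through an aggregate cube definition.

   -- Hypothetical aggregate table (MQT): monthly sales by store.
   CREATE TABLE mart.sales_by_month_store AS (
     SELECT d.year_nbr, d.month_nbr, f.store_key,
            SUM(f.sales_amount) AS sales_amount
       FROM mart.sales_fact f
       JOIN mart.date_dim  d
         ON d.date_key = f.date_key
      GROUP BY d.year_nbr, d.month_nbr, f.store_key )
   DATA INITIALLY DEFERRED
   REFRESH DEFERRED
   DISABLE QUERY OPTIMIZATION;

   -- Populate the aggregate (typically as part of the ETL window).
   REFRESH TABLE mart.sales_by_month_store;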

Chapter 11. Providing the analytics 187 In-memory aggregates In addition to ensured routing to aggregates in the database, a major performance feature of Cognos Dynamic Cubes is its support of aggregates that are in memory. In-memory aggregates provide precomputed data and can improve query performance because the aggregated data is stored in memory in the aggregate cache of the dynamic cube. This situation avoids the impact of transferring data from the data warehouse to the BI server.

In-memory aggregates are aggregate tables that can be created in memory by the IBM Cognos Business Intelligence server every time the cube is started or the data cache is refreshed. The definition of these aggregates is stored in the content store.

In-database aggregates are built during a predefined ETL processing “window”. In-memory aggregates are built during cube-start. The Aggregate Advisor is used to recommend additional in-database aggregate tables and a set of in-memory aggregates.

11.3.5 Cognos Dynamic Cubes caching

The basis for the performance of Cognos Dynamic Cubes is its various in-memory caches.

The aggregate and data caches in Cognos Dynamic Cubes provide fast query response and help alleviate the processing load on the underlying relational database because Cognos Dynamic Cubes does not have to re-retrieve data that is held in its various caches.

Figure 11-6 on page 189 shows the extensive caching structures that are used in a Cognos Dynamic Cubes solution.

Figure 11-6 Cognos Dynamic Cubes caching model

Cognos Dynamic Cubes uses five caches, each with a separate purpose.

Data cache

The data cache contains the result of queries that are posed by the MDX engine to a dynamic cube for data. When users run the system and run queries, individual data items are loaded in memory. Because security is applied on top of the dynamic cube, the reuse of data is maximized, allowing as many users and reports as possible to benefit from previous queries.

Member cache

Members of each hierarchy are retrieved by running a SQL statement to retrieve all of the attributes for all levels within a hierarchy, including multilingual property values, and are stored in memory. The parent-child and sibling relationships that are inherent in the data are used to construct the hierarchical structure in memory. When you start a cube, all members are loaded into memory. This approach ensures a fast experience as users explore the metadata tree. This approach also helps the server maximize its ability to run the most efficient queries possible.

The member cache must be flushed when there is a change to the dimension or hierarchy (members are added or deleted, and so on). Any flush of the member cache also flushes the data cache.

Result set cache

When a report is run, its result set is saved for later reuse. For commonly used reports, this allows the engine to benefit from previous runs, giving users the fastest response time possible.

This cache stores the result set of each multidimensional expression (MDX) query that is run through the report, and it is kept in the on-disk result set cache. Each MDX query that is planned for execution is first searched for in the result set cache, so if the query was run before, its result set is obtained from the cache after the security view rules for that user are applied. Besides sharing report execution results between users, the result set cache also speeds navigation operations, because navigation involves rerunning a query.

Expression cache

To accelerate query planning, the engine saves expressions that are generated when queries are run so that they can be reused when planning future queries.

As with the result set cache, the intermediate results that are stored in the expression cache are security-aware and are flushed when the data cache is refreshed.

Because the expression cache is stored in memory and is so closely associated with the data cache, the expression cache is stored within the space that is allotted to the data cache.

Aggregate cache

In-memory aggregates contain measure values that are aggregated by the members at the level of one or more hierarchies within the cube. These values can be used to provide values at the same level of aggregation.

The aggregates that are recommended by the Aggregate Advisor, and their sizes, are estimated at the time of the advisor run. This estimate grows as the size of the data warehouse grows, at a scale relative to the size of the member cache. The amount of memory that is allocated to this cache is determined by the Maximum amount of memory to use for aggregate cache cube property (set from IBM Cognos Administration). This property determines the maximum size that is allocated to this cache. An aggregate that cannot fit into the cache is discarded.

Any action that refreshes the member cache or data cache, or a cube start/restart, initiates the loading of the aggregate cache. Cube metrics that are available in Cognos Administration can be used to monitor and indicate when aggregates complete loading and to monitor the aggregate cache hit rate (along with the hit rates of the result set cache, data cache, and database aggregate tables).

A request for data from the MDX engine can be satisfied by data that exists in the data or aggregate cache. However, if data is not present in either cache to satisfy the request for data, or if only part of it can be retrieved from the cache, dynamic cubes obtain the data either by aggregating data in the aggregate cache or by retrieving data from the underlying relational database. In either case, the data obtained is stored in the data cache as a cubelet, which is a multidimensional container of data. Thus, the data cache is a collection of such cubelets that are continuously ordered in such a manner as to reduce the time that is required to search for data in the cache.

All the caches are built at the time of cube start and are flushed out when the cube stops/restarts.

11.3.6 Cognos Dynamic Cubes and the data warehouse

There are two key areas in which interaction occurs between the Cognos Dynamic Cubes layer and the data warehouse:
- During cube start, when the member cache and aggregate cache are initialized. For the cube to become active (or be started), it must retrieve all the member information from the raw dimension tables in the data warehouse. After that data is retrieved and stored in memory, the cube is “started” and reports can be accessed. However, performance is not optimized until the aggregate cache is loaded. The loading of the aggregate cache is parallelized based on the configured maximum number of aggregate load threads. When an aggregate is loaded into the in-memory cache, it is available for use in a report. The number of queries that are posed concurrently to populate the in-memory aggregate cache can be controlled by an advanced property of Cognos Dynamic Cubes (reduce the value of the qsMaxAggregateLoadThreads property by using Cognos Administration) to ensure that the underlying relational database is not saturated with concurrent requests computing summary values. Fewer threads require more overall time to load the aggregates.
- During report processing, when queries are pushed to the data warehouse. To answer a report request, Cognos Dynamic Cubes uses a waterfall lookup method. If all the required data is not present in the in-memory cache, then queries are pushed to the database.

In summary, at dynamic cube start:
- A member cache is filled with queries to data warehouse dimension tables.
- An aggregate cache is filled with queries to database aggregates (if defined) and the data warehouse.

During report processing, a waterfall lookup for data is performed on the following caches, in descending order, until all data is provided for the query:
- Result set cache
- Query data cache
- Aggregate cache
- Database aggregate
- Data warehouse

11.3.7 Cognos Dynamic Cubes and DB2 with BLU Acceleration

The performance of cube start and querying can be further accelerated by taking advantage of BLU Acceleration in DB2 10.5.

The “load and go” simplicity of BLU Acceleration helps implement a far more efficient DB2 data warehouse and analytical environment. BLU Acceleration is integrated with DB2. It does not require SQL or schema changes to implement; therefore, setup is easy. You can easily convert tables to the column-organized format by using the new db2convert command-line utility or by using similar tools in IBM Data Studio V4.1, which can convert any number of tables from row to column organization.
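As a hedged sketch only, a single-table conversion from the operating system command line might look like the following example; the database, schema, and table names are placeholders, and the available db2convert options should be confirmed in the DB2 10.5 documentation for your fix pack.

   db2convert -d SALESDW -z MART -t SALES_FACT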

In addition to an easier setup, there is less ongoing maintenance. There are no indexes or materialized query tables (MQTs) to define or tune, which reduces storage costs and issues.

On the analytics side, the metadata model for Dynamic Cubes is transferable, that is, you can use the same model for the row-organized and the column-organized versions of the same tables. There is no need to revalidate reports for the column-organized data warehouse (the SQL that is generated by the Cognos reports and queries remains unchanged). BLU Acceleration reuses the same SQL compiler and optimizer as the DB2 row-based engine.

The various caches that are used by Cognos still must be primed and are still used to further accelerate performance. However, the storage and optimization requirements of the data warehouse, and a significant amount of the effort toward its maintenance, are drastically reduced in this scenario.

Thus, DB2 with BLU Acceleration and IBM Cognos Business Intelligence work together to deliver blazing fast business analytics. You can analyze key facts and freely explore information from multiple angles and perspectives to make more informed decisions about enterprise volumes of data at breakthrough speeds.

11.4 Resources

For more information about many of the concepts that are described in this chapter, see the following resources:
- IBM Cognos dynamics FAQ: http://www-01.ibm.com/support/docview.wss?uid=swg27036155
- IBM Cognos Dynamic Cubes Version 10.2.1.1 User Guide: http://public.dhe.ibm.com/software/data/cognos/documentation/docs/en/10.2.1/ug_cog_rlp.pdf
- IBM Cognos Business Intelligence 10.2.0 information center: http://pic.dhe.ibm.com/infocenter/cbi/v10r2m0/index.jsp
- IBM Cognos Dynamic Cubes, SG24-8064: http://www.redbooks.ibm.com/abstracts/sg248064.html

Related publications

The publications that are listed in this section are considered suitable for a more detailed discussion of the topics that are covered in this book.

IBM Redbooks

The following IBM Redbooks publications provide additional information about the topic in this document. Some publications referenced in this list might be available in softcopy only.
- IBM Cognos Dynamic Cubes, SG24-8064
- InfoSphere Warehouse: A Robust Infrastructure for Business Intelligence, SG24-7813

You can search for, view, download, or order these documents and other Redbooks, Redpapers, Web Docs, draft and additional materials, at the following website: ibm.com/redbooks

Other publications

These publications are also relevant as further information sources:
- Analytics: The new path to value, found at: http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=PM&subtype=XB&appname=GBSE_GB_TI_USEN&htmlfid=GBE03371USEN&attachment=GBE03371USEN.PDF
- Best Practices: Optimizing analytic workloads using DB2 10.5 with BLU Acceleration, found at: https://www.ibm.com/developerworks/community/wikis/form/anonymous/api/wiki/0fc2f498-7b3e-4285-8881-2b6c0490ceb9/page/ecbdd0a5-58d2-4166-8cd5-5f186c82c222/attachment/e66e86bf-4012-4554-9639-e3d406ac1ec9/media/DB2BP_BLU_Acceleration_0913.pdf
- DB2 Partitioning and Clustering Guide, SC27-2453
- Eaton and Cialini, High Availability Guide for DB2, IBM Press, 2004, ISBN 0131448307
- IBM Cognos Dynamic Cubes Version 10.2.1.1 User Guide, found at: http://public.dhe.ibm.com/software/data/cognos/documentation/docs/en/10.2.1/ug_cog_rlp.pdf
- Implementing DB2 Workload Management in a Data Warehouse, found at: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Wc9a068d7f6a6_4434_aece_0d297ea80ab1/page/Implementing%20DB2%20workload%20management%20in%20a%20data%20warehouse
- Physical database design for data warehouse environments, found at: https://ibm.biz/Bdx2nr

Online resources

These websites are also relevant as further information sources:
- IBM Cognos Business Intelligence 10.2.0 information center: http://pic.dhe.ibm.com/infocenter/cbi/v10r2m0/index.jsp
- IBM Cognos dynamics FAQ: http://www-01.ibm.com/support/docview.wss?uid=swg27036155
- IBM DB2 Version 10.5 information center: http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.welcome.doc/doc/welcome.html
- IBM Big Data: http://www-01.ibm.com/software/data/bigdata

Help from IBM

IBM Support and downloads: ibm.com/support

IBM Global Services: ibm.com/services


Back cover

Leveraging DB2 10 for High Performance of Your Data Warehouse

Comprehensive platform for architected data warehouse solutions
Faster time to value for business decision making
Standards-based for flexibility and change

Building on the business intelligence (BI) framework and capabilities that are outlined in InfoSphere Warehouse: A Robust Infrastructure for Business Intelligence, SG24-7813, this IBM Redbooks publication focuses on the new business insight challenges that have arisen in the last few years and the new technologies in DB2 10 for Linux, UNIX, and Windows that provide powerful analytic capabilities to meet those challenges.

This book is organized into two parts. The first part provides an overview of data warehouse infrastructure and DB2 Warehouse, and outlines the planning and design process for building your data warehouse. The second part covers the major technologies that are available in DB2 10 for Linux, UNIX, and Windows that help you get the most value and performance from your data warehouse. These technologies include database partitioning, intrapartition parallelism, compression, multidimensional clustering, range (table) partitioning, data movement utilities, database monitoring interfaces, infrastructures for high availability, DB2 workload management, data mining, and relational OLAP capabilities. A chapter on BLU Acceleration gives you all of the details about this exciting DB2 10.5 innovation that simplifies and speeds up reporting and analytics. Easy to set up and self-optimizing, BLU Acceleration eliminates the need for indexes, aggregates, or time-consuming database tuning to achieve top performance and storage efficiency. No SQL or schema changes are required to take advantage of this breakthrough technology.

This book is primarily intended for use by IBM employees, IBM clients, and IBM Business Partners.

IBM Redbooks are developed by the IBM International Technical Support Organization. Experts from IBM, Customers and Partners from around the world create timely technical information based on realistic scenarios. Specific recommendations are provided to help you implement IT solutions more effectively in your environment.

For more information: ibm.com/redbooks

SG24-8157-00 ISBN 0738438979