IBM Db2 Db2 is Resilient and Consumable

IBM Hybrid Data Management / November 2020 / © 2020 IBM Corporation

Your Presenters

Les King, Hybrid Data Management Solutions (lking@ca..com)
David Kalmuk, STSM, Db2 Warehouse Architect
Nidhi Bhatnagar, Db2 Offering Management
Loic Julien, STSM, Db2 Cloud Deployments

IBM Db2 Introducing Db2 11.5.5

Collect: IBM Hybrid Data Management

Enterprise and open source data: IBM® Db2®

All workloads: OLTP/Operational, OLAP/Mixed, Big data, Fast data

One engine and experience: Db2 common SQL engine

Easily access all data: Data virtualization

All deployment targets: Public cloud, Private cloud, On-premises, Appliance

On my cloud: IBM Cloud Pak for Data

Hyperconverged Private Cloud System

Db2 V11.5 - Make Data Simple and Accessible in a Complex, AI-Driven, Multi-Cloud World

Making Data Simple

Rock Solid Database / Enterprise Readiness / Infuse AI / Cost Savings
• Performant
• Secure
• Available
• Automated administration & monitoring
• ML optimizer
• Adaptive workload management

Making Data Accessible

Modern Development / Containers / Consumability
• Developer friendly
• Multi modal
• Deploy across multiple form factors
• Multi Cloud and Containers

Db2 - The “Common Engine” Cornerstone of most HDM Offerings

Db2 Warehouse on Cloud
• Fully Managed
• Optimized for Analytic Workloads

Db2 on Cloud
• Fully Managed
• Optimized for OLTP and Operational Workloads

Db2 Big SQL
• Public Cloud, Private Cloud and On-premises
• Native support in HDFS environment
• Targeted at cornerstone of Logical Data Lake

Db2 Warehouse
• Private Cloud (CP4D) and On-premises
• Optimized for Analytic Workloads

Db2 Event Store
• Public Cloud, Private Cloud and On-premises
• Event processing from many IoT sources
• Full integration with many data sources

Integrated Analytics System
• Db2 Appliance Deployment Form
• On-premises
• Optimized for Analytic Workloads

Db2 V11.5
• Customizable and fully functional version of Db2
• Ability to optimize for any workload

Db2 - The “Collect” Cornerstone for Modern Workloads

Multi-Modal
• Traditional Relational - SQL
• JSON / BSON - Mongo API
• XML - XQuery
• Graph Query - Gremlin
• NoSQL / NewSQL Data Store

Mixed Workloads
• OLTP => Operational
• OLAP => Mixed Analytics
• Data Ingestion => Real Time - Continual Data Ingestion

Data Virtualization
• Access to all business data
• No matter where it is stored
• Minimize data movement
• Full integration with in Db2 locally

Multi/Hybrid Cloud
• Deploy Anywhere
• Public Cloud - Fully Managed
• Private Cloud

In-Db2 ML
• Performance & Scalability
• In-Db2 Model Training
• In-Db2 Exploration & Preparation
• In-Db2 Scoring

Containerization
• Rapid Deployment
• Flexible Deployment
• Elastic and Modular

Cloud Pak for Data
• Integration with Cornerstone of “Collect” phase
• Integration in Organize, Analyze and Infuse
• Common Management and base platform

Db2 11.5.5

Modern Workloads
• Native support for Graph Data fully integrated with Db2 data (Technology Preview)
• NoSQL / NewSQL Access to graph data via Gremlin language (Technology Preview)
• Spatial Analytics Enhancements
• Integrated Machine Learning data cleansing and model calls

Performance & Scalability
• Machine Learning Optimizer - Cardinality Estimations - Phase 3 (Technology Preview)
• Compact VARCHAR Phase 2 - GROUP BY and HASH JOIN performance improvements
• JOIN filter push-down
• WORKLOAD=SAP Enhancements (DB2_SELECTIVITY=ALL)
• UPDATE/DELETE ability to skip locked data (Technology Preview)
• Mixed Workload Performance enhancements

BLU & MPP (DPF)
• Compact VARCHAR Phase 2 - GROUP BY and HASH JOIN reduced memory requirements
• Automatic REORG with RECOMPRESS options for faster performance and additional page-based string compression
• Reduce storage consumption for Columnar On-line Incremental Schema level Backup Restore (COISBAR) objects

Db2 11.5.5

Availability & Recoverability / Core Engine / pureScale

▪ Columnar On-line Incremental Schema Backup & Restore (for Db2 Warehouse/IIAS only)
  ▪ Includes LOAD support
  ▪ Includes quiesce schema
  ▪ Includes RCAC support
  ▪ REORG Table reclaim extents
  ▪ Storage savings
▪ Advanced Log Space Management (ALSM)
  ▪ Log mirroring support
  ▪ Additional monitoring capabilities
▪ Pacemaker GA for HADR
▪ Concurrent on-line modification updates
▪ GPFS 5.0.5.0 Support and Integration (pureScale, IAS)
▪ db2histmon - new reporting functions
▪ Automatically periodically validate the health of the cluster topology
▪ Lightspeed RDMA ping validation to ensure all components are alive

Db2 11.5.5

Data Management Console
• Support for Db2 on Cloud and Db2 Warehouse on Cloud connections
• Support for monitoring and alerts in pureScale environment
• Deploy DMC within a Docker container
• dmcTop
• Job management and scheduling
• Monitoring reports
• Custom (user-defined) alerts
• More monitoring KPIs and database objects to manage
• Audit logging: keep auditing trail of user activity for DMC
• DMC can use Db2 on Cloud as a repository database for historical data
• Multiple UI and usability enhancements

Application Development
▪ Compliance with JDBC 4.3
▪ OData
  ▪ MQT Support
  ▪ VIEW Support
▪ JCC Runtime for OpenJDK 13
▪ Visual Studio Enhancements
▪ Entity Framework 6 Enhancements

Security
▪ Schema Level Authorization with Audit controls
▪ JWT Enhancements

Db2 11.5.5

Deployment & Containerization
• Db2 container for Red Hat OpenShift aligned with November CP4D release
• RH-OKD Support
• HADR added to tile for Db2 Warehouse SMP
• Operator based deployment
• Support for Db2 Graph
• Support for OCP v4.5 and zLinux
• Storage solutions:
• IBM Spectrum Scales CSI 2.0
• Microsoft Azure Blob Storage (Object Storage)
• Amazon S3 Cloud object storage
• Red Hat OpenShift Container storage (OCS) 4.5
• Portworx 2.5.5
• Upgrade to TSA 4.1.0.6
• Upgrade to GPFS 5.0.5.0
• Upgrade to Java 8.0.6.15

Data Virtualization
▪ New Federated Sources:
  ▪ MySQL CE
  ▪ Snowflake via ODBC
  ▪ Snowflake native
  ▪ PostgreSQL 10.x
  ▪ Hive 3.x

Appliance
▪ GPFS 5.0.5.0 upgrade

Fully Managed
▪ Db2 on Cloud V1 -> V2: Packaging
▪ Db2 on Cloud V1 -> V2: Migration
▪ Self-service maintenance windows
▪ Db2 Hosted (deprecation)
▪ Global DR Support
▪ HIPAA certification
▪ Backup and Retention Policy

IBM Db2 Security Enhancements

Schema Authorization

• 11.5.5 extends existing schema authorizations with many new privileges and authorities
  • Allow delegation of key authorities from database to schema
  • Give applications full control over a schema
  • Simplify ongoing maintenance of object privileges within the schema

• Previously available in container-based environments (DB2U, Db2 Warehouse on Cloud, IIAS etc.), it is now available on-prem

Schema Level Security

Database
  Administration: DBADM
  Security: SECADM, ACCESSCTRL
  Access: DATAACCESS
  Load: LOAD

Schema
  Administration: SCHEMAADM
  Security: ACCESSCTRL
  Access: DATAACCESS, SELECTIN, INSERTIN, UPDATEIN, DELETEIN
  Load: LOAD

Object
  Administration: Owner/CONTROL AUTHORITY
  Security: Owner/CONTROL PRIVILEGE
  Access: SELECT, INSERT, UPDATE, DELETE
  Load: LOAD
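
As an illustration of delegating access at the schema level, here is a minimal Python sketch that builds GRANT statements for the schema privileges and authorities named on this slide. The helper name `schema_grants`, the schema `SALES` and the user `APPUSER` are hypothetical, and the statement shape (`GRANT ... ON SCHEMA ... TO USER ...`) is an assumption that should be checked against the Db2 GRANT reference before use.

```python
import re

# Schema-level names taken from the slide (spelled DATAACCESS here).
SCHEMA_PRIVILEGES = ("SELECTIN", "INSERTIN", "UPDATEIN", "DELETEIN")
SCHEMA_AUTHORITIES = ("SCHEMAADM", "ACCESSCTRL", "DATAACCESS", "LOAD")

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def schema_grants(schema, grantee, privileges):
    """Build one GRANT statement per requested schema privilege or authority.

    Hypothetical helper: it only formats text and does not talk to a database.
    """
    for name in (schema, grantee):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    allowed = set(SCHEMA_PRIVILEGES) | set(SCHEMA_AUTHORITIES)
    statements = []
    for privilege in privileges:
        keyword = privilege.upper()
        if keyword not in allowed:
            raise ValueError(f"not a schema-level privilege or authority: {privilege!r}")
        statements.append(f"GRANT {keyword} ON SCHEMA {schema} TO USER {grantee}")
    return statements


if __name__ == "__main__":
    for statement in schema_grants("SALES", "APPUSER", ["SELECTIN", "INSERTIN"]):
        print(statement)
```

Granting at the schema level means new tables created in the schema are covered without issuing further object-level grants, which is the maintenance saving the slide describes.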

JWT Enhancements

• 11.5.4 introduced the ability to use JSON Web Tokens (JWT) for authentication
  • For web applications that are integrated with an Identity Provider to obtain the JWT

• 11.5.5 enhances that support to allow multiple certificates (JWT_IDP_*_LABEL) to be used when validating an IDP’s JWTs
  • Allows for new and soon-to-expire certificates
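
The idea of accepting both a new and a soon-to-expire certificate can be sketched as "try each configured key until one validates the signature". The sketch below is not Db2's implementation: it uses HMAC (HS256) shared secrets from the Python standard library instead of certificates, and the labels `old` and `new` and all function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(payload: dict, key: bytes) -> str:
    """Create a compact HS256-signed token (stand-in for the identity provider)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (
        b64url(json.dumps(header).encode("utf-8"))
        + "."
        + b64url(json.dumps(payload).encode("utf-8"))
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


def verify_with_any(token: str, keys_by_label: dict):
    """Return the label of the first key that validates the token, else None."""
    signing_input, _, signature_part = token.rpartition(".")
    signature = b64url_decode(signature_part)
    for label, key in keys_by_label.items():
        expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
        if hmac.compare_digest(signature, expected):
            return label
    return None


if __name__ == "__main__":
    keys = {"old": b"expiring-secret", "new": b"replacement-secret"}
    token = sign_hs256({"sub": "appuser"}, keys["new"])
    print(verify_with_any(token, keys))
```

Keeping both keys configured during a rotation window lets tokens signed with either one validate, so the identity provider can switch over without an outage.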

• db2pd -authntokencfg
  • Prints the in-memory version of the token configuration
  • Config is dynamic; this confirms which settings the engine is using

IBM Db2 Performance + Availability

Db2 Performance Highlights

Core Engine / Columnar / Analytics / PureScale

Query Performance
• Early aggregation + distinct
• Full outer join
• Join residual predicate support
• ML optimizer - tech preview
• Varchar performance for group-by and joins

ETL Performance
• Vectorized Insert + Update
• Reduced UNDO logging
• Update + Delete performance enhancements

Authentication Caching
Faster index splitting at non-leaf levels under high contention
2x faster LOAD w/ Range Partitioned Tables

Compact Varchar for Columnar Join + Group-by

Impact
- Improved memory efficiency for wide VARCHARs in Columnar Group By and Join
- Reduce memory consumption, spill I/O and Out of Memory errors
- Performance improvements
- Increase in concurrency within Group By and Join operator

Results (from internal performance benchmarking)
- Overall workload elapsed time, memory footprint and spilling greatly improved
  • Performance: Up to 2.9X overall workload, 17.6X individual query
  • Memory reduction: Up to 1.1X overall workload, 2.5X individual query
  • Spilling reduction: Up to 5.6X overall workload, >1200X individual query

Authentication Caching

Without a cache – every connection needs to call out to the LDAP Server

With caching - only a small subset need to, the rest are serviced by the cache
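
A minimal sketch of the caching idea, assuming a time-to-live policy: successful authentications are remembered for a short period so repeat connections skip the directory call. The class name `AuthCache`, the `backend` callable (standing in for the LDAP server) and the 60-second default are all hypothetical, and this says nothing about how Db2 stores or expires its entries.

```python
import hashlib
import time


class AuthCache:
    """Remember successful authentications for ttl_seconds."""

    def __init__(self, backend, ttl_seconds=60.0, clock=time.monotonic):
        self._backend = backend      # callable(user, password) -> bool
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._expiry_by_key = {}

    def authenticate(self, user, password):
        # Assumption for the sketch: a plain SHA-256 digest keeps the raw
        # password out of the cache; a real system would use a salted or
        # keyed scheme.
        digest = hashlib.sha256(password.encode("utf-8")).hexdigest()
        key = (user, digest)
        now = self._clock()
        expiry = self._expiry_by_key.get(key)
        if expiry is not None and expiry > now:
            return True              # served from the cache, no backend call
        if self._backend(user, password):
            self._expiry_by_key[key] = now + self._ttl
            return True
        self._expiry_by_key.pop(key, None)
        return False


if __name__ == "__main__":
    calls = []

    def fake_ldap(user, password):
        calls.append(user)
        return password == "secret"

    cache = AuthCache(fake_ldap, ttl_seconds=30.0)
    for _ in range(5):
        cache.authenticate("appuser", "secret")
    print(len(calls))
```

Only successes are cached in this sketch, so a wrong password always reaches the backend; that choice keeps failed attempts visible to the directory's own lockout policy.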

[Diagram: short duration connections pass through a cache in front of the LDAP Server]

Db2 Availability + Resiliency Highlights

Columnar / Analytics / Core Engine / HADR + PureScale

Adaptive Workload Management
Improved MPP resiliency
Automatic in-doubt resolution
Advanced Log Space Management
Logical Schema Backup and Restore
Parallel logging when using Mirrored Logs
Ability to block reorg pending operations with registry variable
Faster database activation
Automated HADR with Pacemaker for RHEL
PureScale concurrent online fixpack

Advanced Log Space Management

• Avoid log full scenarios due to long running transactions
• Extract log records from long running active transactions to allow intermediate log files to be closed, archived, reused
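
The effect of extraction can be shown with a toy model, which is an illustration only and not Db2's algorithm: each log file is a list of transaction IDs that wrote records into it, and a file stays active while any still-running transaction has records in it or in an older file. Copying a long-running transaction's records to a separate extraction file removes its hold on the older files. All names below are hypothetical.

```python
def first_active_file(files, active_txns, extracted=frozenset()):
    """Index of the oldest log file still pinned by an active transaction.

    files: list of log files, oldest first; each is a list of transaction IDs.
    active_txns: IDs of transactions that have not committed yet.
    extracted: IDs whose records were copied to a separate extraction file.
    Returns len(files) when no file is pinned.
    """
    for index, records in enumerate(files):
        for txid in records:
            if txid in active_txns and txid not in extracted:
                return index
    return len(files)


def extract_records(files, txid):
    """Collect (file index, txid) pairs for one transaction, oldest first."""
    return [
        (index, record)
        for index, records in enumerate(files)
        for record in records
        if record == txid
    ]


if __name__ == "__main__":
    # T1 is long running and only wrote into the oldest file.
    files = [["T1", "T2"], ["T2", "T3"], ["T3"], ["T4"]]
    active = {"T1", "T4"}
    print(first_active_file(files, active))                    # pinned by T1
    print(extract_records(files, "T1"))
    print(first_active_file(files, active, extracted={"T1"}))  # pinned by T4 only
```

In this model the three oldest files become eligible to be closed, archived and reused once T1's records are extracted, which mirrors the "available log files" versus "active log files" split on the slide.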

[Diagram: log records from a long running transaction are extracted from S0000000.LOG through S0000003.LOG into TXID_EXTRACT.LOG, separating available log files from active log files]

Availability -- FASTER Database STARTUP [11.5.4]

ON by DEFAULT

Speeds database startup by up to 120x. The larger the buffer pools and lock list, the larger the speed-up.

[Chart: Database Activation Time for 11.5, 11.5 Tech Preview and 11.5 future, on LINUX - 320 GB and AIX 512 GB; annotated "43x faster" and "120x faster"]

Adaptive Workload Management [11.5.4]

Intelligent Job Scheduling Out of the Box

[Diagram: time and memory estimates, compared with historical actuals, feed the "Which cost class?" decision in Db2]

• Evaluate resource requirements and expected runtime of incoming SQL
• Schedule based on fit
• Automatically balance different classes of SQL based on runtime
• Continuous feedback and adaptation based on actual behavior

MPP Cluster Resiliency

• MPP configurations allow fast failover to standby nodes for HA
• In 11.5.4 we improved our communications infrastructure to narrow the outage window on failover
• Faster database activation and automatic in-doubt recovery further narrow the outage window

HA Group #1

[Diagram: Host A, Host B, Host C and a Standby host on a clustered / shared file system, with partitions 0 to 23 spread across the hosts; on failure, containers restart on the standby host]

Logical Schema Backup and Restore

db_backup and db_restore Python scripts
• IIAS or WH only (not traditional DB2)
• Run outside of engine - use SQL
• Backup image is a file with unloaded table contents, also includes db2look output
• Unload uses External Tables
• Restore DROPs and CREATEs tables and non-table objects in schema, then runs INSERT
• Disregards database logging, no ROLLFORWARD
• Not intended for disaster recovery (use DB2 BACKUP)
• Also known as Schema BnR, Logical BnR

Columnar Online Incremental Schema BnR:
✓ IN lock on tables (allows read and write access)
✓ Supports INCREMENTAL (delta and cumulative)
✓ Point in time snapshot backup
✓ QREP support
❖ Only Columnar tables
❖ Only in schema enabled for ROW MODIFICATION TRACKING (aka RMT)
❖ Tables have 3 extra system hidden columns: SYSROWID, CREATEXID, DELETEXID

New in 11.5.5

• REORG TABLE … RECLAIM EXTENTS reclaims space from deleted SYSROWID/CREATEXID/DELETEXID columns

• DB2 LOAD utility is allowed into tables with 2 identity columns

• Methods to validate backup image integrity

• Schemas that are enabled for row modification tracking can be part of multi-schema backup/restore

• Schema backup/restore parallelism via –sessions parameter

• Target restore schema name can be specified using -target-schema option

Pacemaker & Corosync 2 Node automated HADR failover for Linux

Cloud Ready

• Integrated bundling and install
• Cluster manager-aware integrated Db2 commands
• Integrated data collection via db2support
• New cluster manager configuration utility - db2cm
• Fast redeployment via import & export support
• RHEL 8.1, SLES 15 SP1 support on Intel and Linux on IBM Z
• Two node support with fencing on AWS
• Validated on AWS with RHEL 8.1
• Support on POWER (AIX and PPCLE)
• Multiple Standby Support
• Active-Passive HA Configuration
• Multiple instances & databases support
• Enhanced quorum type support with QDevice
• Mount point monitoring
• DPF HA
• More Cloud validations
• Majority Quorum (Potentially)
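
Why a two-node cluster needs fencing or an extra quorum vote can be seen from simple vote counting. The sketch below is a simplified model of majority quorum, assuming one vote per node and an optional single tie-breaker vote standing in for a QDevice; it does not model Corosync's actual two-node handling or fencing, and the function names are hypothetical.

```python
def has_quorum(votes_present, total_votes):
    """A partition has quorum when it holds a strict majority of all votes."""
    return votes_present >= total_votes // 2 + 1


def partition_outcome(side_a_nodes, side_b_nodes, tie_breaker_to=None):
    """Quorum result for each side of a network split.

    tie_breaker_to: None for no tie-breaker, or "a" / "b" for the side that
    the single extra vote (for example a quorum device) is awarded to.
    Returns (side_a_has_quorum, side_b_has_quorum).
    """
    extra = 1 if tie_breaker_to in ("a", "b") else 0
    total = side_a_nodes + side_b_nodes + extra
    votes_a = side_a_nodes + (1 if tie_breaker_to == "a" else 0)
    votes_b = side_b_nodes + (1 if tie_breaker_to == "b" else 0)
    return has_quorum(votes_a, total), has_quorum(votes_b, total)


if __name__ == "__main__":
    print(partition_outcome(1, 1))        # two nodes split, no tie-breaker
    print(partition_outcome(1, 1, "a"))   # tie-breaker awarded to side a
    print(partition_outcome(2, 1))        # three nodes, one isolated
```

With two nodes and no tie-breaker neither side reaches a majority after a split, which is why the slide lists fencing and QDevice support alongside two-node HADR.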

Platforms: Linux / Linux + AIX

IBM Db2 Survey Questions

Db2 is Resilient and Consumable - Poll questions

What version of Db2 is being used in your organization? (Multiple selections allowed)
a. 10.1 or earlier
b. 10.5
c. 11.1
d. 11.5

When do you expect your organization will shift Db2 workloads to the cloud? (Multiple selections allowed)
a. Already have shifted ALL Db2 workloads to the cloud
b. Already have shifted SOME Db2 workloads to the cloud
c. Plan to shift some Db2 workloads in 2021
d. Plan to shift some Db2 workloads in 2022+
e. Have no plans to shift Db2 workloads to the cloud

If you plan to shift Db2 workloads to the cloud, is Db2 on Cloud or Db2 Warehouse on Cloud your default choice?
a. Yes
b. No
c. Undecided
d. N/A

Regarding your Db2 workloads, does IBM Cloud Pak for Data resonate with your organization?
a. Never heard of Cloud Pak for Data
b. Heard of Cloud Pak for Data but not familiar with it
c. Yes, Cloud Pak for Data is strategic for us
d. No, Cloud Pak for Data is not strategic for us

Regarding your Db2 deployment, have you adopted a containerized deployment strategy?
a. Yes
b. No, but we plan to do so
c. No, we have no plans to do so

What is the perception of Db2 within your organization? (Allow for comments)
a. Positive
b. Neutral
c. Negative

IBM Db2 Integrated with Cloud Pak for Data

Containers are the future of Cloud-Ready Applications

71% of organizations are planning to containerize existing applications

[Diagram: container and Red Hat OpenShift (RHOS) benefits. Labels include: Portable, Secure, Light weight, Kubernetes Principles, Prebuilt OS and Containers Abstraction, Certified, extend Cloud experience to any Private DC, Microservices Architecture, Agile Infrastructure for complex operations, Operators maintain desired State, Unify DevOps Pipeline]

Introducing Cloud-Ready Containerized Database Db2 on Red Hat OpenShift

OpenShift Containers and Kubernetes

Develop secure and scalable Kubernetes applications without dealing with complexity of Kubernetes

RedHat OpenShift is a tightly integrated stack with Infrastructure as a Service, Operating System and Kubernetes working in tandem
• Simple and unified interface to leverage the power of Kubernetes without needing to learn Kubernetes
• An enterprise-grade Kubernetes distribution with hundreds of security, defect, and performance fixes in each release
• Validated popular storage and networking plug-ins for Kubernetes
• Automated operations from over-the-air updates for RedHat CoreOS and Kubernetes (coming soon)

Provides an abstraction layer for faster application deployments and faster time to value

IBM Cloud Pak for Data / © 2020 IBM Corporation

What is IBM Cloud Pak for Data and How Db2 adds value

Personas: App Developers, Data Engineers, Data Stewards, Data Scientists, Business Users & Analysts

The Ladder to AI

Collect
o Data virtualization
o Data warehousing
o Databases on-demand
o Data source ingestion
o Distributed processing
Powered by: Db2 and Db2 Warehouse

Organize
o Discovery & search
o Data transformation
o Data cataloging
o Business glossary
o Policies, rules & privacy
Powered by: InfoSphere, DataStage and IGC/WKC

Analyze
o Data visualization
o Machine learning
o Model build & deploy
o Model management
o Dashboards
Powered by: Studio, open source technologies and Cognos

Multicloud Services
• Logging
• Metering
• Kubernetes
• Identity Access Mgmt.
• Monitoring
• Persistent Storage
• Security
• Docker Registry / Helm

Cloud Pak for Data

Why IBM Db2 in Cloud Pak for Data?

Accelerate your cloud and AI journey and Reduce Complexity

Easy to Deploy
❖ Cookie Cutter Deployment
  ❖ Deploy within minutes, enables standardized deployment and management with flexibility
  ❖ Pre-customized DB created during provisioning
❖ Automated Failover (New in Q1 2020):
  ❖ Out of Box enhanced Kubernetes availability
  ❖ Support for HA/DR
❖ Automated Update:
  ❖ Service Packs, versions and mods can be deployed at the push of a button

Easy to Manage
❖ Automated Management by “Application Group”
  ❖ Use Namespaces to manage access control and provisioning options
❖ Monitor and Manage at an application level
  ❖ Platform and Service level monitoring and management

IBM Db2 in Cloud Pak for Data Ecosystem

IBM Db2 contributes to the rich ecosystem offered by Cloud Pak for Data

Ecosystem integration
• Integration with the Marketplace experience

Platform integration
• User Management
• Lifecycle management
• Unified Db2 Console

Data & AI integration
• Notebook integration
• AI & ML integration
• Catalog/Discovery integration

New / Coming
• Platform level Backup/restore
• Platform level Disaster Recovery
• Metering, Logging and serviceability

IBM Db2 as Cloud-Native Containerized Database

• Rapid Deployment of Database which is Agile, Elastic and Modular
• IBM Db2 services deployed as Microservices that can be developed, updated and scaled independently
• Simplified Lifecycle Management (Upgrade via Helm; Coming soon: Operator Backup/restore)
• Flexibility to deploy on-prem or any cloud provider

Simplified High Availability

Keep your systems up and running in an operational state with negligible impact on performance and manage it easily

❖ Spread your workloads and mitigate single points of failure by configuring a standby and failing over when needed to keep your business up and running
❖ Out of Box enhanced Kubernetes availability
❖ New in 11.5.4: Automated Failover for high availability with Single Standby

Keep your business running 24*7 with Db2 High Availability

Db2 supports High Availability on various topologies, such as:
❖ Primary and Standby on the same OpenShift/Kubernetes cluster (same or different namespaces)
❖ Primary and Standby on different OpenShift/Kubernetes clusters

Improved Security

Fully secure your data with powerful Db2 security along with integrated RedHat OpenShift Container security capabilities

RedHat OpenShift provides container security capabilities on top of Db2 Security features to secure Data

Db2 Security
• Supports Transport Layer Security (TLS) to encrypt data in transit
• Client-server communications are fully encrypted at both the network and disk level

Cloud Pak for Data security certifications:
• Leverage UDB RedHat base image designed for OpenShift Containers
• RedHat image certification
• SELinux enforcing mode on OpenShift
• All IBM Db2 containers running as non-root users
• Helm Charts certified by Cloud Pak for Data

Traditional & Software Defined Storage

Db2 on OpenShift supports all Traditional Storage (SAN for performance, NFS, Cluster File System such as GPFS, and local disks)

New in 11.5.4:
• Software Defined Storage
  • It is native to RedHat OpenShift and Kubernetes, and a defined control plane manages storage-layer functions such as dynamic provisioning, replication, snapshots and expansion
  • Virtualizes storage by decoupling software from hardware, eliminating the need for specialized expensive hardware
  • Following storage types are supported: Portworx, Ceph, OCS, IBM Cloud File Storage (Gold), IBM Spectrum Scale CSI

Choose storage options based on your performance and scalability needs, and scale the storage as needed to reduce costs

* Based on internal performance test results between Ceph and Portworx, Ceph is the recommended storage solution

Db2 is Resilient and Consumable

DEMO

Db2 Resources

Information Resources:
• Db2 Roadmap - http://ibm.biz/AnalyticsRoadmaps
• Db2 RFE (Idea) Portal - http://ibm.biz/submitdb2idea
• Db2 Recorded Educational Webinars - http://ibm.biz/db2webinar
• Subscribe to Db2 technical newsletter - http://ibm.biz/db2nlsignup
• Connect with the Db2 online community - http://ibm.biz/db2tribe

Developer Resources:
• Db2 Developer Page to get started - http://ibm.biz/db2developer
• For experienced Db2 developers, get your favorite Db2 code sample on GitHub - http://ibm.biz/db2github
• Want to try Machine Learning with Db2? Check out - http://ibm.biz/learndb2
• Want details on Db2 Python Driver - http://ibm.biz/db2-drivers-python
• Want details on Db2 PHP Driver - http://ibm.biz/db2-drivers-php
• Want details on Db2 Node.js Driver - http://ibm.biz/db2-drivers-node
• Download the free Db2 Python e-book - http://ibm.biz/db2pythonbook