What’s Coming in Fall 2020 with IBM

Nagapriya (Priya) Tiruthani
Offering Management – ntiruth@us..com

As data becomes more ACCESSIBLE it provides more VALUE

Data Driven Insight Driven Digital Transformation Outcomes Culture Change Prediction New Business Models Breaking Silos Optimization Disruptive Technology Discover “what” Automation Real-Time Decisions Understand “why” Collaboration

Capabilities Self Service Models AI Reports Visualization Multi-Cloud Business Intelligence Applications

Cost Reduction Competitive Market Drivers Modernization Leader Value from Data

IBM Data & AI / © 2020 IBM Corporation

There is no AI without an IA (“Information Architecture”)

“No amount of AI algorithmic sophistication will overcome a lack of data [architecture]

Data collection & preparation is the most time consuming and difficult part of AI”

Source: 2018 MIT Sloan “Reshaping Business with AI”

Open source will continue to drive innovation and speed up the journey to AI

• 3+: average number of Open Source Databases used by enterprises today
• 87% of AI Developers depend on Open Source technology
• 85% of enterprises are engaged in Open Source projects
• >50% of revenue will be from servers that run open source AI software
• 3 of the top 5 most popular databases are Open Source, including PostgreSQL

Source: https://dzone.com/articles/2019-open-source-database-report-top-databases-pub

IBM’s Commitment To Open Source

A long open history: In 1999, IBM supported Linux by investing $1 billion in its development, making it less risky to traditional enterprise users.

Serve on the board: IBMers serve on many open source boards including Linux, Eclipse, Apache, CNCF, Node.js, Hyperledger, and more.

Built on open source: Many of IBM’s offerings leverage open source, including cloud, big data and analytics, blockchain, IoT, machine learning, and AI.

IBM and Cloudera Relationship and Significant Milestones
Together selling more than $100M annually in software, support and services

2017
• IBM + Hortonworks Strategic Partnership Announced
• HDP and HDF Certified for IBM Power, Spectrum Scale, Db2 Big SQL and Watson Studio

2018
• IBM wins Hortonworks overall “2018 Partner of the Year”
• IBM Announces intent to buy Red Hat for $34B and become the world’s largest hybrid cloud provider

2019
• Cloudera + Hortonworks Merger Completed with hybrid cloud vision
• IBM and Cloudera expand partnership to include resell and support of entire (1) Cloudera portfolio
• IBM releases NEW versions of Db2 Big SQL supporting Cloudera’s CDH 5 & 6
• IBM Wins Cloudera overall “2019 Partner of the Year”

2020
• Cloudera announces Red Hat OpenShift as preferred container solution for CDP
• IBM releases NEW version of Db2 Big SQL supporting Cloudera’s CDP Private Cloud
• IBM announces workshops to help customer plan for CDP Private Cloud
• IBM Power and IBM Storage come to market with CDP Private Cloud Base

(1) Cloudera portfolio:
✓ All legacy Cloudera offerings
✓ All legacy Hortonworks offerings
✓ Cloudera Data Platform (CDP)
✓ All Services & Training offerings (DSE, PSE, Operational Services)

Changing Customer Needs

Any Tier | All Data Data Lifecycle Secure & Open + Governed Standards

• Multiple public, Hybrid, Private, Data center, Edge
• Streaming, Data engineering, Data warehousing, Machine learning & AI
• Data & metadata, Fine grained security, Lineage and provenance, Data & workload migration
• 100% Open source, Open data formats, Open storage & compute, Open APIs

Data Landscape is Evolving
The new realities of managing data and workloads across clouds

Decade 1: Hadoop on-prem and on the cloud
Decade 2: Hadoop powered data clouds

USE CASES
Decade 1:
● Need to efficiently store & process data
● Batch process “big data”
Decade 2:
● Need to integrate the entire lifecycle
● Industrialize data-driven decision making

TECHNOLOGY INFRASTRUCTURE
Decade 1:
● Co-locate compute and storage to use commodity hardware and avoid costly network transfers
Decade 2:
● High performance analytics with remote disaggregated storage with memory and SSD caching

USER EXPERIENCE
Decade 1:
● Deploy software in months and quarters
Decade 2:
● Spin up services in minutes

PRIVACY, SECURITY & GOVERNANCE
Decade 1:
● Network perimeter & physical access controls are the norm
● Simplicity over robust mechanisms
Decade 2:
● Security at the workload, data & metadata layer
● Solutions for new regulations (GDPR)

Complete Enterprise Data Lifecycle
Manage and secure the data lifecycle in any cloud or datacenter

01 Collect: Streaming & Data Flow
02 Curate: Data Engineering
03 Report: Data Warehouse
04 Serve: Operational Database
05 Predict: Machine Learning & AI

Security | Governance | Lineage | Management | Automation

Poll: Which platform are you using today?

• Cloudera Enterprise Data Hub
• Hortonworks Data Platform
• Cloudera Data Platform

Introducing Cloudera Data Platform: Industry’s First Enterprise Data Cloud

Cloudera Data Platform Private Cloud with IBM

Cloudera Data Platform: Cloud Data Flow | Cloud Data Hub | Data Center Software | Cloud Data Warehouse | Cloud Machine Learning

Today’s Products: DataFlow (CDF, HDF) | HDP Enterprise Plus | Cloudera Enterprise Data Hub | Data Science Workbench

Introducing Cloudera Data Platform

• Control cloud costs with auto scale, suspend and resume
• Optimize workloads based on analytics and machine learning
• View data lineage across any cloud and transient clusters
• Use a single pane of glass across hybrid and multi-clouds
• Scale to petabytes of data and 1,000s of diverse users

Platform layers (from the diagram):
Data Anywhere: Data center & Private Cloud | Public Multi-Cloud | Hybrid Cloud | Control Plane
Governed Everywhere: Catalog | Schema | Migration | Security | Governance
Edge to AI Analytics: Data Flow & Streaming | Data Engineering | Data Warehouse | Operational Database | Machine Learning
Open Distribution: Cloudera Runtime
Management Console: Identity | Orchestration | Management | Operations

One Platform – Two Form Factors

CDP Public Cloud (platform-as-a-service), on AWS, Azure, GCP:
• Virtual Private Clusters: Data Hub
• Self-Serve Experiences: DW, ML, DE, …

CDP Private Cloud (Base + Plus) (installable software):
• Self-Serve Experiences (Private Cloud): DW, ML, DE, …
• Physical Clusters (CDP Datacenter): Data Center

Control Plane

Cloudera Runtime

CDP Public / Private Cloud Architecture

Management Console (Data Catalog, Workload Manager, Replication Manager) - A single pane of glass to manage one or more environments and the services that run within each environment

Environment - A logical encapsulation of a customer network and the services that run within that network (like an Azure virtual network)

Cluster (Data Hub, CDW and CML clusters) - A distributed computing service that runs on VMs (Data Hub) or K8s (the experiences) and has access to the shared SDX

SDX - The data access control layer that sits on top of the backend object store and provides coherent data security and governance for all the applications running within the environment

Poll: How soon are you planning to migrate to Cloudera Data Platform?

• 6 months
• 12 months
• 18 months
• 24 months or later

Cloudera Data Platform Private Cloud (BASE & PLUS)

CDP Private Cloud PLUS expands upon the value of CDP Private Cloud BASE by providing:
• New set of Experiences (DataFlow, Machine Learning, Data Warehouse, Data Engineering)
• Leverages Red Hat OpenShift
• Provides customers greater flexibility as they can run on any private/public cloud of choice

CDP Private Cloud BASE (SDX: Schema, Security, Governance; HDFS / Ozone):
• Leverages BareMetal
• Allows customers to leverage their existing investments and architecture
• Allows customers to build toward future state of compute and storage

Cloudera Data Platform / CDP Private Cloud BASE

The most comprehensive Data Analytics Platform

EDH (Cloudera Enterprise Data Hub) + HDP (Hortonworks Data Platform) + New Features = CDP Private Cloud BASE (recently renamed from CDP Data Center)

CDP Private Cloud Plus

CDP Private Cloud PLUS expands upon the value of CDP Private Cloud BASE by providing:

• 10x faster deployments of analytics and machine learning services with a petabyte-scale hybrid data architecture that can burst to public clouds

• 100% tenant isolation in meeting the SLAs of your mission-critical workloads eliminating the noisy neighbor problem

• 50% reduced data center costs by drastically improving efficiency and utilization of your compute infrastructure and eliminating data replication

New Features for everyone…
CDP Private Cloud BASE: First Step to Private Cloud PLUS and MAX

New features for CDH 6 customers | New features for HDP 3 customers

Cloudera Manager
• Virtual private clusters
• Automated wire encryption setup
• Fine-grained RBAC for administrators
• Streamlined maintenance workflows

Ranger 2.0
• Dynamic row filtering & column masking
• Attribute-based access control
• SparkSQL fine-grained access control

Atlas 2.0
• Advanced data discovery
• Advanced data lineage
• Improved performance and scalability
• Faceted search

Solr 7
• Relevance-based text search over unstructured data (text, pdf, jpg, …)

Hive 3
• Hive-on-Tez for better ETL performance
• ACID transactions

Impala
• Better fit for Data Mart migration use cases (interactive, BI style queries)

Ozone (Preview)
• 10x scalability of HDFS

Knox*: Gateway-based SSO
Hue: Built-in SQL Editor
Druid*: Low-latency DataMart for real-time and aggregate data
Kudu: Better performance for fast changing / updateable data
Key Trustee Server, NavEncrypt*: Better at-rest Encryption
Spark on Docker*: Simplified dependency management

* In future release

CDP Data Center: First Step to Private Cloud (Includes SDX and many other important capabilities)

CDP Private Cloud Base 7.1 Components

Customers on HDP 2.6.x/3.x and CDH 5.x/6.x End-of-Support Dates

Current End-of-Support (EoS) Dates

The following table specifies the planned End of Support Schedule for Cloudera products. All future dates are provided for planning purposes only and are subject to change, but with the expectation that dates may move later but will not move earlier. In each case, the projected EoS Date is considered to be the last day of the month specified in the table below. Check website for dates: https://www.cloudera.com/legal/policies/support-lifecycle-policy.html

Release     End of Full Support Date
HDP 2.3     July 2018
HDP 2.4     March 2019
HDP 2.5     August 2019
HDP 2.6     December 2020
HDP 3.0     July 2021
HDP 3.1     December 2021
CDH 5.14    December 2020
CDH 5.15    December 2020
CDH 5.16    December 2020
CDH 6.0     August 2021
CDH 6.1     December 2021
CDH 6.2     March 2022
CDH 6.3     March 2022
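The "last day of the month" rule for the End-of-Support table can be turned into a small planning helper. The following is a minimal sketch, not an official tool: the function names `eos_date` and `days_remaining` and the dictionary `EOS_MONTHS` are hypothetical, and the dates are simply copied from the table on this slide, so they should be rechecked against the Cloudera support lifecycle page before use.

```python
import calendar
from datetime import date

# End of Full Support months copied from the slide's table: release -> (year, month).
# These are planning dates only and may move later.
EOS_MONTHS = {
    "HDP 2.3": (2018, 7),
    "HDP 2.4": (2019, 3),
    "HDP 2.5": (2019, 8),
    "HDP 2.6": (2020, 12),
    "HDP 3.0": (2021, 7),
    "HDP 3.1": (2021, 12),
    "CDH 5.14": (2020, 12),
    "CDH 5.15": (2020, 12),
    "CDH 5.16": (2020, 12),
    "CDH 6.0": (2021, 8),
    "CDH 6.1": (2021, 12),
    "CDH 6.2": (2022, 3),
    "CDH 6.3": (2022, 3),
}


def eos_date(release: str) -> date:
    """Return the projected EoS date: the last day of the listed month."""
    year, month = EOS_MONTHS[release]
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, last_day)


def days_remaining(release: str, today: date) -> int:
    """Days from `today` until the projected EoS date (negative if already past)."""
    return (eos_date(release) - today).days


if __name__ == "__main__":
    # Hypothetical planning date, chosen only for illustration.
    planning_date = date(2020, 10, 1)
    for release in sorted(EOS_MONTHS, key=eos_date):
        print(release, eos_date(release), days_remaining(release, planning_date))
```

Sorting by the projected date, as in the usage loop, puts the releases that need attention first.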

Three Paths to CDP

1. Migrate to Public Cloud
Copy data and metadata to a public cloud; implement new, or migrate existing workloads on CDP Public Cloud.
Small initial investment

2. Migrate to CDP PvC BASE
Build a new CDP Private Cloud BASE cluster on-premises; copy data and metadata from existing classic cluster; migrate existing workloads.
Higher initial investment

3. Upgrade to CDP PvC BASE
Upgrade from classic cluster to CDP Private Cloud BASE in-place on the same hardware and infrastructure.
Single cutover, lower capital investment

Upgrading to CDP – Private Cloud

CDP - Private Cloud PLUS: create new apps using CDP - Private Cloud (faster time to value)
Self-serve analytic experiences:
✔ Isolation from noisy neighbors
✔ Decoupled from storage
✔ Decoupled upgrade cycles
✔ Elastic compute

CDP Private Cloud Base provides the stateful elements (SDX) for a new wave of containerized applications:
• Storage
• Table Schema
• Authentication & Authorization
• Governance

Container Cloud (batteries included or customer provided): create new apps using CDP - Private Cloud as sidecar to CDH / HDP clusters (faster time to value)

CDP - Private Cloud BASE (DistroX on bare metal): upgrade existing clusters & applications in-place (protect existing investment)
• CDH 5 / HDP 2 → Cluster Upgrade → CDH 6 / HDP 3 (✔ Latest upstream features)
• CDH 6 / HDP 3 → Cluster Upgrade → CDP - Private Cloud BASE (✔ Best of CDH and HDP features)
• Direct Upgrade
• Existing Apps, Existing Data and Existing Hardware at each stage

Other diagram labels: Altus DataPlane, DistroX Clusters

Upgrading an Existing Cluster: Option A

Step 1: Upgrade an existing cluster to CDP PvC Base, thus creating an SDX environment based on existing data

Step 2: Install CDP Private Cloud and use the Experiences to build new applications

Step 3: Use Workload Manager to intelligently migrate key workloads from the CDP PvC Base cluster to the CDP Private Cloud Experiences

Diagram:
• CDP Private Cloud: Management Console; Data Hub, CDW, CML
• CDH 5 / HDP 2 → Upgrade → CDH 6 / HDP 3 → Upgrade → CDP PvC Base (SDX environment)
• Existing Apps, Existing Data and Existing Hardware at each stage

Migrating from an Existing Cluster: Option B

Step 1: Install CDP Data Center on new hardware and use Replication Manager to replicate data, metadata, and policies from an existing cluster to create the SDX environment

Step 2: Install CDP Private Cloud and use the Experiences to build new applications

Step 3: Use Workload Manager to intelligently migrate key workloads from the CDH / HDP cluster to the CDP Private Cloud Experiences

Diagram:
• CDP Private Cloud: Management Console; Data Hub, CDW, CML
• CDH / HDP (Existing Apps, Existing Data, Existing Hardware) → Intelligent Replication (data, metadata, policies) → CDP PvC Base (SDX environment; No bare metal apps, New Data, New Hardware)

Complete Data Lifecycle

Data at rest:
• Collect: Streaming & Data Flow
• Curate: Data Engineering
• Report: Data Warehouse
• Serve: Operational Database
• Predict: Machine Learning & AI

Complete and Connected Data Lifecycle

Data in motion:
• Components: Stream Messaging, Flow Management, Streaming Analytics
• Stages: Buffer, Distribute, Analyze, Act

Links between data in motion and data at rest: Enrichment, Batch curation, Operational Insights, Scoring

Data at rest:
• Collect: Streaming & Data Flow
• Curate: Data Engineering
• Report: Data Warehouse
• Serve: Operational Database
• Predict: Machine Learning & AI

Poll: Are you exploring real time use cases in your data platform?

Cloudera Data Platform DataFlow

Cloudera Data Platform Streaming Edition

Advanced messaging and stream processing powered by + Friends.

This new product combines 3 offerings into 1 package built on CDP:

1 3 CDP Streaming

2 Edition

CDP Streaming Offerings

Offerings compared: OLD CSP | NEW CDP Streams Messaging Base | OLD CSP & CSM | NEW CDP-Streaming Edition | CDP PvC Base

Included in all five offerings: Cloudera Manager, Zookeeper, Knox, Ranger (Sentry in OLD CSP and OLD CSP & CSM), Atlas, Kafka, Schema Registry, Kafka Streams
Included in three offerings: SMM, SRM
Included in two offerings: Cruise Control (NEW), Kafka Connect (NEW), HDFS / HBase / Solr, YARN
Included in one offering: Flink, Balance of CDP PvC Base (30+ components)

2020-21 Roadmap

2020
• CDP Private Cloud Base
  • v7.0.3
  • v7.1.3
• CDP Private Cloud Plus (DW and ML experiences)
• CDP Streaming Edition, includes
  • Stream Processing
  • Streams Management
  • Streaming Analytics

2021
• Introducing new experiences in CDP Private Cloud Plus
  • DataFlow
  • Data Engineering
• CDP Private Cloud Control Plane
  • Replication Manager
  • Workload Manager
  • Data Catalog
• Enhanced Data Warehouse and Machine Learning capabilities

Platform support across 2020-21
• CDP Private Cloud Base support on POWER
• CDP Private Cloud Base support on POWER/Spectrum Scale
• Db2 Big SQL support on CDP PvC Base

* Roadmap plans may change

Db2 Big SQL: Supercharge Big Data Workloads on CDP

FAST, INTEGRATED and SECURE DATA ACCESS LAYER for data platforms

INDUSTRY LEADING ADVANCED ELASTIC SQL SQL ENGINE FOR ANALYTICS AND ML ENGINE FOR BIG DATA MADE SIMPLE WORKLOADS

• SQL compatibility with many SQL dialects, enables reuse of skills and applications
• BI and data science tools can access data stored in Hadoop or object stores
• Scale compute nodes based on workloads to efficiently use resources

Supports all open source file formats like ORC, Parquet, Avro, etc.

Infuse the power of Db2 in CDP using Db2 Big SQL

SQL Compatibility
• Understands different SQL dialects
• Reuse skills and applications with less/no changes

Federation
• Connect to remote data sources
• Query pushdown
• Spark connectors for more data sources & ML models

Performance
• Execute all 99 TPCDS queries
• Scales linearly with increased concurrency

Enterprise & Security
• Automatic memory management
• Role/column based data security

Db2 Big SQL is the only SQL engine on the open source data platform that …

SQL Compatibility
• SQL compatible with:
• Applications work as-is without any changes

Federation
• Federates to more than 10 data sources: RDBMS, NoSQL and/or Object Stores
• Integrates bi-directionally with Spark, like no other

Performance
• Exhibits high performance even when data scales up to 100TB with complex SQLs
• Handles many concurrent users without relinquishing performance

Enterprise & Security
• Secures data using SQL with roles
• Integrates with Ranger for centralized management
• Operationalizes ML models

To Summarize - Get more for Less with Db2 Big SQL

• Accelerate time to market while modernizing your warehouse
• Empower SQL users to operationalize ML models
• Augment disparate data for deep analytics and AI
• Enable BI analytics with high performance and enterprise security

Above all, bring stability to your applications even when the platform goes through updates.

Now available: Introducing Db2 Big SQL support on CDP v7.1

Make Db2 Big SQL the point of entry to Big Data irrespective of which platform has the data

World Class Customer Support

Since the beginning of the partnership:
• More than 200 customers
• Financial, retail, travel, automotive, energy, communications
• More than 2000 Cases
• 98+% managed within the SLA

Support levels: LEVEL 1, LEVEL 2

IBM Support is your competitive advantage:
• The ability for your business to easily and quickly access high-quality support is a critical advantage that will help you keep your business ahead of the competition.
• To meet the growing needs of your business in today’s competitive landscape, IBM embarked on a transformation journey to reimagine the way we deliver support to you. One of the results of this revolution is the Cognitive Support Platform (CSP).
• Infused with IBM’s enterprise AI technology, Watson, our Cognitive Support Platform helps you resolve issues quickly by providing you with an omnichannel support experience that is driven by insights, fueled by knowledge, and powered by Cognition.
• One Click, One Call
• One Case Owner

Professional Services

Our Professional Services will help you unlock the value of your data throughout your data-driven journey.

• Optimize at every stage of your data journey
• Shorten your time to production and value
• Realize the full value of your data

We have just the right package to ensure your success:
• SmartStart - Get started with the Cloudera Platform in your data center
• SmartMigrate - Migrate legacy Cloudera CDH or HDP workloads to CDP in your data center
• SmartUpgrade - Upgrade existing Cloudera CDH or HDP deployment to CDP with minimum disruption
• SmartOffload - Offload your legacy data warehouse to the Cloudera platform
• SmartHealth - Comprehensive platform health check for optimal performance overall

Our goal is to ensure your infrastructure outperforms standards at every stage of your organization's journey to becoming truly data-driven.

The Power of ONE

Greater Outcomes

IBM and Cloudera have a prescriptive approach:
1. No vendor lock-in: Lay the Foundation with Red Hat’s OpenShift Container Platform
2. New Revenue: Explore new business models with machine learning at scale
3. Reduce risk: establish and enforce the enterprise data governance and security policies for data
4. Reduce costs: RHOS delivers up to 38% lower infrastructure and development costs per application
5. Improve productivity: automate and govern the data and AI lifecycle while ensuring compliance

• Mitigate the risk of compliance fines - up to 4% of gross sales by each personal information data breach incident.
• GDPR will become a reality in 2018.

IBM / Hybrid Data Management / © 2019 IBM Corporation

Events and Resources

IBM and Cloudera Partnership ibm.com/analytics/partners/cloudera

IBM Open Source Offerings Community

Stay tuned and learn about upcoming events by joining ibm.biz/hdmoscomm

Thank you

Priya Tiruthani
Offering Manager – Cloudera products and Db2 Big SQL
[email protected]

Lynn Chou
Offering Manager – Cloudera products
[email protected]

Fred Koopmans
VP, Product Management – Cloudera

Venky Sellappa
Partner Solutions – Cloudera

Dave Fowler
Partner Solutions – Cloudera

Use Cases

Modernize Enterprise Data
Architect the information architecture of the enterprise to have a secured and governed platform to help drive:
• Drive growth
• Connect business
• Secure the processes

Grow your business
Increase your revenue, improve your customer satisfaction, and power new business models by focusing on use cases such as:
• Marketing automation
• Personalized marketing
• Customer experience
• Churn prevention
• Customer retention

Protect your business
Protect your business by tackling challenging use cases such as:
• Regulatory compliance
• Risk modeling & analysis
• Financial crime prevention
• Fraud detection
• Cybersecurity

Connect your business
Connect Operational Technology (OT) with IT to achieve operational excellence around use cases such as the following:
• Predictive maintenance
• Connected vehicles
• Smart cities
• Healthcare analytics
• Industrial IoT