YugabyteDB – Distributed SQL on Kubernetes

Amey Banarse, VP of Product, Yugabyte, Inc.
Taylor Mull, Senior Data Engineer, Yugabyte, Inc.

Introduction – Amey

Amey Banarse

VP of Product, Yugabyte, Inc.

Pivotal • FINRA • NYSE

University of Pennsylvania (UPenn)

@ameybanarse

about.me/amey

Introduction – Taylor

Taylor Mull

Senior Data Engineer, Yugabyte, Inc.

DataStax • Charter

University of Colorado at Boulder

Kubernetes Is Massively Popular in Fortune 500s

● Walmart – Edge Computing, KubeCon 2019
https://www.youtube.com/watch?v=sfPFrvDvdlk

● Target – Data @ Edge
https://tech.target.com/2018/08/08/running-cassandra-in-kubernetes-across-1800-stores.html

● eBay – Platform Modernization
https://www.ebayinc.com/stories/news/ebay-builds-own-servers-intends-to-open-source/

The State of Kubernetes 2020

● Substantial Kubernetes growth in large enterprises
● Clear evidence of production use in enterprise environments
● On-premises is still the most common deployment method
● Though there are pain points, most developers and executives alike feel Kubernetes is worth it

Source: VMware, The State of Kubernetes 2020 report
https://tanzu.vmware.com/content/ebooks/the-state-of-kubernetes-2020
https://containerjournal.com/topics/container-ecosystems/vmware-releases-state-of-kubernetes-2020-report/

Data on K8s Ecosystem Is Evolving Rapidly

Why Data Services on K8s?

Containerized data workloads running on Kubernetes offer several advantages over traditional VM- or bare-metal-based data workloads, including:

● Better cluster resource utilization
● Portability between cloud and on-premises
● A robust automation framework that can be embedded inside CRDs (Custom Resource Definitions), commonly referred to as a 'K8s Operator'
● Self-service experience: seamlessly scale on demand during peak traffic
● Simple and selective instant upgrades

A Brief History of Yugabyte

Part of Facebook's cloud native DB evolution; builders of multiple popular databases. The Yugabyte founding team ran Facebook's public-cloud-scale DBaaS:

● 1+ trillion ops/day, 100+ petabyte data set sizes
● The Yugabyte team dealt with this growth first hand
● Massive geo-distributed deployment, given global users
● Worked with a world-class infra team to solve these issues

Transactional, distributed SQL database designed for resilience and scale

100% open source, PostgreSQL-compatible, enterprise-grade RDBMS, built to run across all your cloud environments

Enabling Business Outcomes

● System of record driving a shopping list service
● Designed to be multi-cloud on GCP and Azure
● Scaling to 42 states and 9M shoppers

● Real-time APIs for financial services data
● 14 billion requests per day, with data from over 100 worldwide exchanges
● Serves major financial tech and services firms, from Betterment to Schwab

● Retail personalization platform serving 600+ retailers like Walmart and Nike
● Designed to be multi-cloud/multi-AZ, and tuned to handle Black Friday

YugabyteDB – The Promise of Distributed SQL

● PostgreSQL-compatible SQL
● Resilience & HA
● Low latency
● Horizontal scalability
● Geo distribution
● Multi/hybrid cloud & K8s
● 100% open source

Designing the Perfect Distributed SQL Database

PostgreSQL is more popular than MongoDB; Aurora is much more popular than Spanner.

Amazon Aurora – a highly available, MySQL- and PostgreSQL-compatible relational database service:
● Not scalable, but HA
● All RDBMS features
● PostgreSQL & MySQL syntax

Google Spanner – the first horizontally scalable, strongly consistent relational database service:
● Scalable and HA
● Missing RDBMS features
● New SQL syntax

Comparing PostgreSQL and Google Spanner

Feature                 | PostgreSQL                     | Google Spanner
------------------------|--------------------------------|-------------------------------
Query Layer             | ✓ Very well known              | ❌ New SQL flavor
SQL Feature Depth       | High (many advanced features)  | Low (basic features missing)
Fault Tolerance with HA | ❌                             | ✓
Horizontal Scalability  | ❌                             | ✓
Distributed Txns        | ❌                             | ✓
Replication             | Async                          | Sync, Read Replicas

Feature set and ecosystem support are critical for adoption.

Spanner Design + PostgreSQL Compatibility

Feature                 | PostgreSQL                    | Google Spanner                | YugabyteDB
------------------------|-------------------------------|-------------------------------|--------------------------------------
SQL Wire Protocol       | ✓ Very well known             | ❌ New SQL flavor             | ✓ Reuses PostgreSQL
RDBMS Feature Support   | High (many advanced features) | Low (basic features missing)  | High (supports most Postgres features)
Fault Tolerance with HA | ❌                            | ✓                             | ✓
Horizontal Scalability  | ❌                            | ✓                             | ✓
Distributed Txns        | ❌                            | ✓                             | ✓
Global Txn Replication  | Async                         | Sync, Read Replicas           | Sync, Read Replicas, Async

Designed for Cloud Native Microservices

Feature           | PostgreSQL                       | Google Spanner    | YugabyteDB
------------------|----------------------------------|-------------------|----------------------------------
Query Layer       | ✓ Massively adopted SQL ecosystem | ✘ New SQL flavor | ✓ Reuses PostgreSQL (YSQL), plus YCQL
RDBMS Features    | Advanced                         | Basic             | Advanced
Highly Available  | ✘                                | ✓                 | ✓
Horizontal Scale  | ✘                                | ✓                 | ✓
Distributed Txns  | ✘                                | ✓                 | ✓
Data Replication  | Async                            | Sync              | Sync + Async

[Architecture: the Yugabyte Query Layer exposes YSQL and YCQL on top of DocDB, a distributed document store providing sharding & load balancing, Raft consensus replication, a distributed transaction manager with MVCC, and a document storage layer built on a custom RocksDB storage engine.]

All Nodes Are Identical

The microservices platform can connect to ANY node, and nodes can be added or removed anytime.

[Diagram: three identical YugabyteDB nodes, each running the YugabyteDB Query Layer on top of the DocDB Storage Layer.]

YugabyteDB Deployed as StatefulSets

[Diagram: admin clients connect to the yb-masters headless service; app clients connect to the yb-tservers headless service. The yb-master StatefulSet runs pods yb-master-0 through yb-master-2; the yb-tserver StatefulSet runs pods yb-tserver-0 through yb-tserver-3, spread across nodes 1–4, each backed by a local or remote persistent volume.]
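For reference, a minimal sketch of standing up this topology with the Yugabyte Helm chart (the chart repo is public; the yb-demo release and namespace names are illustrative, and value names can differ across chart versions):

helm repo add yugabytedb https://charts.yugabyte.com
helm repo update
kubectl create namespace yb-demo
# Creates the yb-master and yb-tserver StatefulSets behind their headless services
helm install yb-demo yugabytedb/yugabyte --namespace yb-demo \
  --set replicas.master=3,replicas.tserver=3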

Under the Hood – 3 Node Cluster

YB-Master (yb-master1–3, one per node) – manages shard metadata and coordinates cluster-wide operations.

YB-TServer (yb-tserver1–3) – stores and serves data in/from tablets (shards); tablet leaders and followers are spread across the tservers. The global transaction manager tracks ACID transactions across multi-row operations, including clock skew management.

DocDB storage engine – purpose-built for ever-growing data, extended from RocksDB.

Raft consensus replication – highly resilient; used for both data replication and leader election.

Deployment Topologies

1. Single Region, Multi-Zone – spans Availability Zones 1–3 within one region. Consistent across zones; no WAN latency, but no auto region-level failover/repair.

2. Single Cloud, Multi-Region – spans Regions 1–3 within one cloud. Consistent across regions; cross-region WAN latency, with auto region-level failover/repair.

3. Multi-Cloud, Multi-Region – spans Clouds 1–3. Consistent across clouds; cross-cloud WAN latency, with cloud-level failover/repair.
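Placement in all three topologies is driven by the same mechanism; a sketch for the multi-zone case (cloud/region/zone values are illustrative):

# Each server advertises its fault domain via flags, e.g.
#   --placement_cloud gcp --placement_region us-west1 --placement_zone us-west1-b
# The masters then spread tablet replicas across those domains (RF=3 here):
kubectl exec -n yb-demo yb-master-0 -- \
  yb-admin -master_addresses yb-master-0.yb-masters.yb-demo.svc.cluster.local:7100 \
  modify_placement_info gcp.us-west1.us-west1-a,gcp.us-west1.us-west1-b,gcp.us-west1.us-west1-c 3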

Multi-Master Deployments w/ xCluster Replication

[Diagram: Master Cluster 1 in Region 1 and Master Cluster 2 in Region 2, each spanning Availability Zones 1–3, linked by bidirectional async replication.]

Each cluster is consistent across its zones, with no cross-region latency for either writes or reads. On failure, the app connects to the master cluster in the other region.
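xCluster links are configured with yb-admin; a hedged sketch of one direction of the bidirectional pair (UUIDs, addresses, and table IDs are placeholders left unfilled; repeat with the clusters swapped for the other direction):

# Run against the consumer cluster, pointing it at the producer
yb-admin -master_addresses <consumer_master_addresses> \
  setup_universe_replication <producer_universe_uuid> \
  <producer_master_addresses> \
  <comma_separated_table_ids>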

Deploying Yugabyte on Multi-Region K8s

● Scalable and highly available data tier

● Business continuity

● Geo-partitioning and data compliance

YugabyteDB on K8s Multi-Region Requirements

● Pod-to-pod communication over TCP ports using RPC calls across all n K8s clusters

● Global DNS resolution across all the K8s clusters, so that pods in one cluster can connect to pods in other clusters (see the sketch after this list)

● Ability to create load balancers in each region/DB

● RBAC: ClusterRole and ClusterRoleBinding

● Reference: Deploy YugabyteDB on multi cluster GKE https://docs.yugabyte.com/latest/deploy/kubernetes/multi-cluster/gke/helm-chart/
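For the global DNS requirement specifically, one common way to satisfy it is kube-dns stub domains, so each cluster forwards lookups for the other clusters' DNS domains to their resolvers. A sketch, with domains and DNS IPs left as placeholders:

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  # Forward lookups for the other clusters' domains to their kube-dns IPs
  stubDomains: |
    {"<domain-of-cluster-2>": ["<kube-dns-ip-of-cluster-2>"],
     "<domain-of-cluster-3>": ["<kube-dns-ip-of-cluster-3>"]}
EOF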

YugabyteDB on K8s

Demo – Single YB Universe Deployed on 3 GKE Clusters

YugabyteDB Universe on 3 GKE Clusters

Deployment: 3 GKE clusters, each with 3 × n1-standard-8 nodes; 3 pods in each cluster, each using 4 cores

Cores: 4 cores per pod

Memory: 7.5 GB per pod

Disk: ~ 500 GB total for universe

Yugabyte Platform

Demo

Ensuring High Performance

LOCAL STORAGE (since K8s v1.10)                      | REMOTE STORAGE (most used)
-----------------------------------------------------|-----------------------------------------------------
Lower latency, higher throughput                     | Higher latency, lower throughput
Recommended for workloads that do their own          | Recommended for workloads that do not perform any
replication                                          | replication on their own
Pre-provisioned outside of K8s                       | Provisioned dynamically in K8s
Use SSDs for latency-sensitive apps                  | Use alongside local storage for cost-efficient tiering
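For the "pre-provisioned outside of K8s" row, a sketch of the StorageClass that local volumes typically use (standard Kubernetes local-volume support; the class name is illustrative):

cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-ssd
# Local PVs are pre-provisioned, so no dynamic provisioner is used
provisioner: kubernetes.io/no-provisioner
# Delay binding until a pod is scheduled, so the PV lands on the right node
volumeBindingMode: WaitForFirstConsumer
EOF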

Configuring Data Resilience

POD ANTI-AFFINITY (a pod spec sketch follows below)
● Pods of the same type should not be scheduled on the same node
● Keeps the impact of node failures to an absolute minimum

MULTI-ZONE / REGIONAL / MULTI-REGION POD SCHEDULING
● Multi-Zone – tolerates zone failures for K8s worker nodes
● Regional – tolerates zone failures for both K8s worker and master nodes
● Multi-Region / Multi-Cluster – requires network discovery between the clusters
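A sketch of the anti-affinity rule described above, patched onto the yb-tserver StatefulSet (the label selector is illustrative, and the Helm chart may already set an equivalent rule):

kubectl patch statefulset yb-tserver -n yb-demo --type merge -p '
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: yb-tserver
              # Spread pods of the same type across K8s worker nodes
              topologyKey: kubernetes.io/hostname
'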

Automating Day 2 Operations

HANDLING FAILURES
● Pod failure is handled by K8s automatically
● Node failure has to be handled manually, by adding a new node to the K8s cluster
● Local storage failure has to be handled manually, by mounting a new local volume to K8s

ROLLING UPGRADES
● StatefulSets support two updateStrategy types: OnDelete and RollingUpdate
● Pick the rolling update strategy for DBs that support zero-downtime upgrades, such as YugabyteDB
● A new instance of each pod is spawned with the same network identity and storage

BACKUP & RESTORE
● Backups and restores are a database-level construct
● YugabyteDB can perform a distributed snapshot and copy it to a target for a backup
● Restore the backup into an existing cluster, or a new cluster with a different number of TServers
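A sketch of the upgrade and backup pieces (the image.tag value name follows the public Helm chart but can differ by chart version; the snapshot commands are from the yb-admin CLI; the version placeholder is left unfilled):

# Rolling upgrade: bump the image tag; with the RollingUpdate strategy,
# pods are replaced one at a time
helm upgrade yb-demo yugabytedb/yugabyte -n yb-demo --set image.tag=<new-version>

# Distributed snapshot of a YSQL database (backups are a database-level construct)
kubectl exec -n yb-demo yb-master-0 -- \
  yb-admin -master_addresses yb-master-0.yb-masters.yb-demo.svc.cluster.local:7100 \
  create_database_snapshot ysql.yugabyte
kubectl exec -n yb-demo yb-master-0 -- \
  yb-admin -master_addresses yb-master-0.yb-masters.yb-demo.svc.cluster.local:7100 \
  list_snapshots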

Extending StatefulSets with Operators

● Based on custom controllers that have direct access to the lower-level K8s API
● An excellent fit for stateful apps that require human operational knowledge to correctly scale, reconfigure, and upgrade, while simultaneously ensuring high performance and data resilience
● Complementary to Helm for packaging

[Diagram: when CPU usage in the yb-tserver StatefulSet exceeds 80% for 1 min and max_threshold is not exceeded, the operator scales the pods.]

https://github.com/yugabyte/yugabyte-platform-operator

Yugabyte Platform

Demo

Live Demo: Cluster Scale Up

[Diagram: tablets t1–t4, each with leaders and followers, distributed across Tablet Servers 1–3, then rebalanced across Tablet Servers 1–4 as the cluster expands.]

1. Before expansion, all 3 nodes are taking traffic.
2. New node just added; no traffic to the new node yet.
3. The new node receives t2 from the current leader, which is guaranteed to have a consistent copy. A simple file transfer, and with just that one tablet it is ready to take traffic. Expansion is still in progress.
4. Zero-downtime cluster expansion, much faster because of Raft and strong consistency.
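In the demo, the expansion amounts to raising the tserver replica count; the tablet rebalancing pictured above then happens automatically. A sketch with the Helm chart assumed earlier:

# Add a fourth tablet server; tablet leaders and followers rebalance onto it
helm upgrade yb-demo yugabytedb/yugabyte -n yb-demo --set replicas.tserver=4
# Watch the new pod join
kubectl get pods -n yb-demo -w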

Live Demo: Terminate a YB Pod

kubectl delete pods yb-tserver-1 -n yb-dev-yb-webinar-us-west1-c

kubectl get pods -n yb-dev-yb-webinar-us-west1-c

A Classic Enterprise App Scenario

Yugastore – E-Commerce App: A Real-World Demo

Deployed on

https://github.com/yugabyte/yugastore-java

Yugastore – Kronos Marketplace

Classic Enterprise Microservices Architecture

[Diagram: the UI app calls a REST API gateway, which routes to the product, checkout, and cart microservices. The product and checkout microservices use YCQL, and the cart microservice uses YSQL, against the Yugabyte cluster.]
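Every yb-tserver pod serves both APIs in this diagram; a sketch of connecting to each from inside the cluster (service names assume the Helm defaults; ysqlsh and ycqlsh ship in recent tserver images, while older releases name the CQL shell cqlsh):

# YSQL – PostgreSQL-compatible API, port 5433
kubectl exec -it -n yb-demo yb-tserver-0 -- \
  ysqlsh -h yb-tserver-0.yb-tservers.yb-demo
# YCQL – Cassandra-compatible API, port 9042
kubectl exec -it -n yb-demo yb-tserver-0 -- \
  ycqlsh yb-tserver-0.yb-tservers.yb-demo 9042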

Istio Traffic Management for Microservices

[Diagram: the UI app reaches the cart, checkout, and product microservices through the Istio edge gateway (an Envoy edge proxy), with Istio providing service discovery and route configuration. The Istio control plane consists of Pilot, Citadel, and Galley.]
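The route-configuration piece maps request paths to the microservices; a sketch using Istio's VirtualService API (hostnames, gateway, and service names are illustrative for Yugastore):

cat <<'EOF' | kubectl apply -f -
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: yugastore-routes
spec:
  hosts:
  - "*"
  gateways:
  - yugastore-gateway   # assumes an Istio Gateway of this name exists
  http:
  - match:
    - uri:
        prefix: /cart
    route:
    - destination:
        host: cart-microservice
  - match:
    - uri:
        prefix: /products
    route:
    - destination:
        host: product-microservice
EOF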

A Platform Built for a New Way of Thinking

➔ Event + Microservice first design

➔ Team autonomy with platform efficiency

➔ 100% Cloud Native operating model on K8s

➔ Turnkey multi-cloud

➔ Full Spring Data support

We're a Fast Growing Project

Growth in 1 year: Clusters ▲ 12x · Slack ▲ 12x · GitHub Stars ▲ 7x

We ♥ stars! Give us one: github.com/YugaByte/yugabyte-db

Join our community: yugabyte.com/slack

Thank You

Join us on Slack: yugabyte.com/slack

Star us on GitHub: github.com/yugabyte/yugabyte-db
