Geo-Distributed Databases: Engineering Around the Physics of Latency

Who we are
● Taylor Mull - Senior Data Engineer; previously DataStax and Charter Communications
● Suda Srinivasan - VP of Solutions; ~15 years in tech wearing many hats, at Nutanix, Deloitte, Microsoft, and a bunch of startups

Cloud native relational database for cloud native applications
A transactional distributed SQL database built for resilience and scale. 100% open source. Runs in any cloud.
● PostgreSQL-compatible SQL
● Resilience and high availability
● Horizontal scalability
● Geographic distribution
● ACID transactions
● Security

What is a geo-distributed database?
A single database that is spread across two or more geographically distinct locations - data centers, availability zones, or regions - and runs without performance delays in executing transactions. But physics!

The physics of wire latency
Within a data center, a round trip costs 1-2 ms; between distant regions it runs from roughly 60 ms to roughly 150 ms. The floor is set by the speed of light; transmission media, packet size, packet loss, signal strength, propagation delays, and more add to it. (A worked example follows the core concepts list below.)

Latency in the I/O path
Keep your data close to usage and compute close to data.

Why deploy geo-distributed databases?
● Resilience: data centers, cloud AZs, and even regions can fail; applications and databases should stay resilient and available through failures.
● Performance: customers and users are located around the world; moving data close to usage and compute close to data lowers latency.
● Compliance: data residency laws require data about a nation's citizens or residents to be collected, processed, and/or stored inside the country.

Core concepts
0. YugabyteDB architecture
1. Synchronous replication within a YugabyteDB cluster
2. Follower reads
3. xCluster asynchronous replication - unidirectional and bidirectional
4. Read replicas
5. Geo-partitioning
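To make the wire-latency numbers concrete, here is a back-of-the-envelope propagation estimate; the route and distance are illustrative assumptions, not figures from the deck. Light in optical fiber travels at roughly two-thirds of its vacuum speed, about 200,000 km/s, so for a New York to London link of roughly 5,600 km great-circle distance:

$$ t_{\text{one-way}} \approx \frac{5{,}600\ \text{km}}{200{,}000\ \text{km/s}} \approx 28\ \text{ms}, \qquad \text{RTT} \approx 56\ \text{ms} $$

Real fiber paths are longer than the great-circle distance and add switching and queueing delays, which is how observed cross-region round trips reach the ~60-150 ms range above. No amount of application tuning removes this floor; only data placement does.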
Core concept 0: YugabyteDB architecture
● Nodes are spread across DCs, zones, and regions
● User tables are sharded into tablets (groups of rows)
● Tablets are distributed evenly across nodes, both per table and across tables
● Sharding and distribution are transparent to the user

Designing the perfect distributed SQL DB
● Amazon Aurora: a highly available MySQL- and PostgreSQL-compatible relational database service. Not scalable, but HA, with all the RDBMS features of PostgreSQL and MySQL.
● Google Spanner: the first horizontally scalable, strongly consistent relational database service. Scalable and HA, but missing RDBMS features behind new SQL syntax.
Aurora is much more popular than Spanner. YugabyteDB aims to combine the strengths of both: the Yugabyte Query Layer (YSQL and YCQL APIs) sits on top of DocDB, a distributed document store providing distributed sharding and load balancing, Raft consensus replication and MVCC, a distributed transaction manager, and a document storage layer built on a customized RocksDB storage engine. See bit.ly/distributed-sql-deconstructed.

Core concept 1: Synchronous replication by default
● Each tablet is replicated
● YugabyteDB uses the Raft consensus protocol for leader election and replication
● Writes are replicated to all of a tablet's peers and must be acknowledged by a majority of them before the write succeeds (see the quorum arithmetic after the xCluster overview below)
● Reads and writes are served by the tablet leader (by default)
● Sync replication offers: consistency and resilience
● Sync replication costs: latency

Geo-distribution with sync replication

Enabling business outcomes: top-5 global retailer
An American multinational retail corporation that operates a chain of hypermarkets, department stores, and grocery stores in countries around the world.
Why Yugabyte:
● Linear scale with product growth
● Open source
● Cloud-agnostic, geo-distributed
● Multi-row ACID transactions
● Alternate-key lookups
● Better performance and resiliency than Azure Cosmos DB, Azure Cloud SQL, and other databases
Solution and benefits:
● System of record (SOR) for a product catalog of 100+ million items with billions of mappings, serving over 100K queries per second
● Enhanced product agility
● Handled Black Friday and Cyber Monday peaks
● Service remained resilient and available through the Texas cloud outage
● $10M in lost revenue recovered

Multi-region deployment for resilience: top-5 retailer
Deployment: 27 Azure nodes across 3 regions - US-East, US-West (Seattle), and US-South (Texas).
● Cores: 16 per node
● Memory: 128 GB per node
● Disk: 2 x 1024 GB premium P40 disks per node
● OS: CentOS 7.8
● Preferred leaders in the US-South (South Central) region
Service remained resilient and available through the Texas cloud power outage.

Core concept 2: Follower reads trade off freshness for latency
● Follower reads can return stale data
● Followers located near the client can serve data with low latency
● Follower read configuration is at the app level (see the YSQL sketch after the xCluster overview below)
● Follower reads offer: low latency
● Follower reads cost: data accuracy (freshness)
Example timeline for a write that changes a value from 15 to 20 at RF 3:

Event                          | Leader | Follower 1 | Follower 2
Initial state                  | 15     | 15         | 15
Write request received         | 20     | 15         | 15
Write completed (majority ack) | 20     | 20         | 15
Read request received          | 20     | 20         | 15
Tablet fully replicated        | 20     | 20         | 20

A follower read served by Follower 2 before full replication returns the stale value 15.

Core concept 3: xCluster asynchronous replication
● Two independent clusters, e.g. Cluster 1 in Region 1 and Cluster 2 in Region 2, each spanning three availability zones and consistent across its own zones
● Unidirectional or bidirectional async replication between the clusters
● No cross-region latency for either writes or reads
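As noted under core concept 1, the majority-acknowledgement rule is simple arithmetic. For a replication factor $RF$, a Raft write needs:

$$ \text{quorum}(RF) = \left\lfloor RF/2 \right\rfloor + 1, \qquad \text{quorum}(3) = 2, \qquad \text{failures tolerated} = RF - \text{quorum}(RF) = 1 \ \text{at}\ RF = 3 $$

With RF 3 spread across three regions, the leader acknowledges its own write, so every commit still waits on at least one cross-region acknowledgement - the latency cost called out on that slide.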
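And the YSQL sketch promised under core concept 2: a minimal way to enable follower reads for a session. The two yb_* parameters follow YugabyteDB's documentation; the table name and the staleness value are illustrative assumptions, so check the current docs before relying on them.

```sql
-- Follower reads apply only to read-only work, so make the session
-- (or a single transaction) read-only first.
SET default_transaction_read_only = on;

-- Allow reads to be served by the closest tablet peer, leader or follower.
SET yb_read_from_followers = true;

-- Upper bound on acceptable staleness (the documented default is 30 s;
-- the value here is illustrative).
SET yb_follower_read_staleness_ms = 30000;

-- This query may now be answered by a nearby follower, trading a little
-- freshness for avoiding a cross-region hop. (Hypothetical table.)
SELECT * FROM shopping_list WHERE shopper_id = 42;
```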
Enabling business outcomes: Kroger
Largest supermarket chain in the US, with over 2,750 supermarkets and multi-department stores. Rapidly growing digital channel, especially during the COVID-19 crisis.
Why Yugabyte:
● Distributed ACID transactions and scalability
● Geo-distributed deployment for resilience
● Multi-API support: YSQL and YCQL
● Automatic data sharding
● Open source
Solution and benefits:
● YugabyteDB is the SOR for the shopping list service, covering 42 states and 9 million shoppers
● Multi-region deployment with sync replication for resilience, with single-digit-millisecond latency
● xCluster bidirectional replication
● Designed to be multi-cloud on GCP and Azure
"We have been leveraging YugabyteDB as the distributed SQL database running natively inside Kubernetes to power the business-critical apps that require scale and high availability." - Mahesh Thyagarajan, VP Engineering

Core concept 4: Read replicas
● Read replicas offer low-latency reads, e.g. a read-replica cluster in a distant region attached to a primary cluster spread across AZs
● Read replicas can't be used for resilience or failover

Admiral architecture
Deployed across 5 countries on 3 continents:
● Synchronous cluster across US West, US Central, and US East
● Each region has a master process for HA
● Read-replica clusters in Asia and Europe

Core concept 5: Row-level geo-partitioning
● Pin rows of a table or its indexes to specific geographies, e.g. US rows in a US region, India rows in an India region, and UK rows in a UK region
● Strong consistency
● Low read and write latency
(A YSQL sketch follows the summary below.)

Flexible deployment options in a single database

Deployment | Consistency | Read latency | Write latency | Used for
Multi-zone cluster | Strong | Low within region (1-10 ms) | Low within region (1-10 ms) | Zone-level resilience
Multi-region stretched cluster | Tunable (with follower reads) | High with strong consistency, or low with eventual consistency | 40-100 ms, always strongly consistent | Region-level resilience
xCluster async, single-direction | Eventual (timeline) | Low within region (1-10 ms) | Low within region (1-10 ms) | Backup and DR
xCluster async, bidirectional | Eventual (timeline) | Low within region (1-10 ms) | Low within region (1-10 ms) | Backup and DR
Read replicas | Strong in primary cluster; eventual in read-replica clusters | Low within primary cluster region (1-10 ms) | Low within region (1-10 ms) | Low-latency reads; not a DR solution (not an independent failure domain)
Geo-partitioning | Strong | Low within region (1-10 ms); high across regions (40-100 ms) | Low within region (1-10 ms) | Compliance

Summary of core concepts
1. Data is synchronously replicated within a Yugabyte cluster by default.
2. Nodes can be placed in different zones, different regions (stretched), or different clouds.
3. Reads and writes are handled by the tablet leader.
4. Follower reads trade off data freshness for lower latency.
5. xCluster replication asynchronously replicates data across clusters for backup/DR.
6. Read replicas enable low-latency reads from local clusters.
7. Geo-partitioning allows table rows and indexes to be pinned to specific geographies.
These options let you prioritize among resilience, data freshness, latency, and compliance, and achieve the combination your apps require.
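Here is the YSQL sketch referenced under core concept 5: row-level geo-partitioning via the documented tablespace-plus-LIST-partitioning pattern. Every name, cloud, region, and placement value below is an illustrative assumption, not something from the deck.

```sql
-- A tablespace pinned to one (hypothetical) US region, spread over 3 zones.
CREATE TABLESPACE us_region_ts WITH (replica_placement = '{
  "num_replicas": 3,
  "placement_blocks": [
    {"cloud": "aws", "region": "us-east-1", "zone": "us-east-1a", "min_num_replicas": 1},
    {"cloud": "aws", "region": "us-east-1", "zone": "us-east-1b", "min_num_replicas": 1},
    {"cloud": "aws", "region": "us-east-1", "zone": "us-east-1c", "min_num_replicas": 1}
  ]}');

-- Parent table partitioned by the geo column; the partition key must be
-- part of the primary key.
CREATE TABLE users (
  user_id bigint,
  geo     text,
  name    text,
  PRIMARY KEY (user_id, geo)
) PARTITION BY LIST (geo);

-- Rows with geo = 'US' are stored only on nodes matching us_region_ts.
CREATE TABLE users_us PARTITION OF users
  FOR VALUES IN ('US') TABLESPACE us_region_ts;

-- Repeat with region-local tablespaces for 'IND', 'UK', and so on.
```

Reads and writes for a given geo then stay within that region (the 1-10 ms row in the table above), while a query that touches other partitions pays the cross-region price.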
Thank You
Join us on Slack: yugabyte.com/slack
Star us on GitHub: github.com/yugabyte/yugabyte-db
© 2021 All Rights Reserved