Amazon Elasticache Deep Dive Powering Modern Applications with Low Latency and High Throughput

Amazon ElastiCache Deep Dive Powering modern applications with low latency and high throughput Michael Labib Sr. Manager, Non-Relational Databases © 2020, Amazon Web Services, Inc. or its Affiliates. Agenda • Introduction to Amazon ElastiCache • Redis Topologies & Features • ElastiCache Use Cases • Monitoring, Sizing & Best Practices © 2020, Amazon Web Services, Inc. or its Affiliates. Introduction to Amazon ElastiCache © 2020, Amazon Web Services, Inc. or its Affiliates. Purpose-built databases © 2020, Amazon Web Services, Inc. or its Affiliates. Purpose-built databases © 2020, Amazon Web Services, Inc. or its Affiliates. Modern real-time applications require Performance, Scale & Availability Users 1M+ Data volume Terabytes—petabytes Locality Global Performance Microsecond latency Request rate Millions per second Access Mobile, IoT, devices Scale Up-out-in E-Commerce Media Social Online Shared economy Economics Pay-as-you-go streaming media gaming Developer access Open API © 2020, Amazon Web Services, Inc. or its Affiliates. Amazon ElastiCache – Fully Managed Service Redis & Extreme Secure Easily scales to Memcached compatible performance and reliable massive workloads Fully compatible with In-memory data store Network isolation, encryption Scale writes and open source Redis and cache for microsecond at rest/transit, HIPAA, PCI, reads with sharding and Memcached response times FedRAMP, multi AZ, and and replicas automatic failover © 2020, Amazon Web Services, Inc. or its Affiliates. What is Redis? Initially released in 2009, Redis provides: • Complex data structures: Strings, Lists, Sets, Sorted Sets, Hash Maps, HyperLogLog, Geospatial, and Streams • High-availability through replication • Scalability through online sharding • Persistence via snapshot / restore • Multi-key atomic operations A high-speed, in-memory, non-Relational data store. • LUA scripting Customers love that Redis is easy to use. • Open Source © 2020, Amazon Web Services, Inc. or its Affiliates. What is Memcached? Initially released in 2003, Memcached provides: • Simple, in-memory, LRU cache • Simple key-value (string-string) store • Supports strings, objects • Multi-threaded Single-Node Instance • Sharding via client-side library Sharded Instance • Easy to Scale • No persistence • Open source Clients ClientsClients ClientsClients ClientsClients Clients © 2020, Amazon Web Services, Inc. or its Affiliates. The need for speed… ElastiCache + RDS ElastiCache + Aurora ElastiCache + Redshift ElastiCache + DynamoDB ElastiCache + DocumentDB ElastiCache + …. © 2020, Amazon Web Services, Inc. or its Affiliates. Redis Topologies & Features © 2020, Amazon Web Services, Inc. or its Affiliates. ElastiCache Redis: Distributed In-Memory Data Store Redis Node Client Client List String Client Hash Bitmap Set Client Stream Geospatial Sorted Set Client HyperLogLog Client Client Client © 2020, Amazon Web Services, Inc. or its Affiliates. ElastiCache Redis: Distributed In-Memory Data Store A=1 A=1 Redis Node WriteClient Client Read Client A:1 A=1 List String A=1 <1ms Read Client Hash Bitmap Set Read Client A=1 Stream Geospatial Sorted Set A=1 Read Client HyperLogLog Read Client A=1 A=1 Read Client Read Client © 2020, Amazon Web Services, Inc. or its Affiliates. Redis Cluster Mode – Enabled vs. Disabled Feature Redis Cluster (enabled) Redis Cluster (disabled) Recovery Time 10-20 sec (non-DNS) ~30+ sec (DNS) Writes affected on failed shard. Writes affected on entire data set. Failover Impact Reads available Reads available Up to 250* nodes (90 = 15 shards + 5 1 primary Node Scale replicas soft limit) 0-5 replicas (max. 6 nodes) 0–5 replicas per shard Primary Endpoint Storage 170 TB (635 GB x 250) 635 GB Slot 0 Max Connections 16.25 million (65,000 x 250) 390,000 (65,000 x 6) P Slot 1 … R Online Scaling Shards and read replicas Read replicas only Slot 16383 Migration Path Backup/Restore Snapshot Online Migration Tool • Throughput limited by 1 primary, 5 Configuration • Achieve greater throughput through replicas Endpoint Scalability and horizontal scaling • Horizontal Scale for Reads (Replicas) P Slot 0–5461 R Performance • Horizontal/Vertical scaling supported P Slot 5462–10922 R Supported • Vertical scaling for Replicas/Primary Slot 10923– also supported P R 16383 Cluster Resizing (zero-downtime) • Horizontal Scaling to add/remove Vertical Scaling Scaling Operation shards • Writes/Reads continue during scale up • Read Scalability to add/remove operation replicas © 2020, Amazon Web Services, Inc. or its Affiliates. Redis Cluster-mode disabled (Scaled Vertically) VPC Availability zone 1 Availability zone 2 Elastic Load Balancing Public subnet Public subnet Connect to Primary for Read/Writes and Replica’s for Reads Auto Scaling group Amazon EC2 Amazon EC2 Private subnet Private subnet Keyspace All keys remain on same single node Redis node Redis node Primary Endpoint Replica Endpoints © 2020, Amazon Web Services, Inc. or its Affiliates. ElastiCache for Redis Multi-AZ Auto-Failover § Chooses replica with lowest replication lag Automatic Failover to a read replica in case of § DNS endpoint is same primary node failure Multi-AZ ElastiCache Automates writes snapshots for persistence Use Primary Endpoint Primary ElastiCache for Redis reads Replica Replica Use Read Replicas ElastiCache ElastiCache or Reader Endpoint for Redis for Redis Availability Zone A Availability Zone B © 2020, Amazon Web Services, Inc. or its Affiliates. ElastiCache with Redis Multi-AZ Region Auto Scaling ElastiCache Cluster Read Replica Primary Availability Zone A Availability Zone B © 2020, Amazon Web Services, Inc. or its Affiliates. ElastiCache with Redis Multi-AZ Region Auto Scaling ElastiCache Cluster Read Replica Primary Availability Zone A Availability Zone B © 2020, Amazon Web Services, Inc. or its Affiliates. ElastiCache with Redis Multi-AZ Region Auto Scaling ElastiCache Cluster Read Replica Primary Availability Zone A Availability Zone B © 2020, Amazon Web Services, Inc. or its Affiliates. Redis Cluster-mode enabled (Scaled Horizontally) VPC Availability zone 1 Availability zone 2 Elastic Load Balancing Public subnet Public subnet Clients use hash value for ( CW Metric: CurrItems ) a key CRC16(key) mod 16384 Auto Scaling group Distribution Equal | Custom Amazon EC2 Amazon EC2 Cluster Map Private subnet Partitioned by Shard Private subnet Shard 1 Slot 0 - … Keyspace Configuration Shard 2 Slot … to … Endpoint Shard 3 Slot … to … Zero Downtime Scaling Redis node Redis node Redis node Redis node Shard 4 Slot … to 16383 © 2020, Amazon Web Services, Inc. or its Affiliates. To p o l o g y - Redis Cluster Mode Enabled Example: S S S 3 shards 2 replicas per shard Multi-AZ Shards, primaries, and read replicas • Add shards to scale reads/writes, increase in-memory capacity Redis Cluster • Add replicas to scale reads, increase availability • Able to specify availability zones. Multi-AZ default P R R P R R • Able to customize slot distributions, equal distribution Slots 0–5454 Slots 5455–10909 Slots 0–5454 Slots 5455–10909 Slots 0–5454 Slots 5455–10909 default R R P Slots10910–16363 Slots10910–16363 Slots10910–16363 Availability Zone A Availability Zone B Availability Zone C © 2020, Amazon Web Services, Inc. or its Affiliates. Cluster mode-enabled Failover CW Metric: ReplicationLag x15 Shard x5 async replication Cache node Cache node ElastiCache for Redis Primary Replica © 2020, Amazon Web Services, Inc. or its Affiliates. Cluster mode-enabled Failover Failover Detection Shard async replication Cache node Cache node ElastiCache for Redis Primary Replica © 2020, Amazon Web Services, Inc. or its Affiliates. Cluster mode-enabled Failover SNS Event: ElastiCache:CacheNodeReplaceComplete SNS Event: ElastiCache:FailoverComplete Shard Test with Failover API async replication Cache node Cache node ElastiCache for Redis Primary Replica Automatic Failover (with no DNS propagation) © 2020, Amazon Web Services, Inc. or its Affiliates. Online Re-Sharding – Zero Downtime Shard 1 Shard 2 Shard 3 0-5461 5462--10922 10923-16383 Simple API aws elasticache modify-replication-group-shard-configuration --replication-group-id rep-group-id --apply-immediately --node-group-count 5 Scale In || Out © 2020, Amazon Web Services, Inc. or its Affiliates. Zero downtime - Online re-sharding - scale out Shard 1 Shard 2 Shard 3 0-2909, 5462-5783, 0-5461 5462--10922 10923-1638314199 5095-5461 6876-9830 Uniform slot distribution across shards reads/ writes Amazon EC2 Shard 4 No Application Interruption Shard 5 2910-5094, 5784-6875, 9831--10922 14200-16383 © 2020, Amazon Web Services, Inc. or its Affiliates. Zero downtime - Online re-sharding - scale in Shard 1 Shard 2 Shard 3 0-5461 5462--10922 10923-16383 Uniform slot distribution across shards reads/ writes Amazon EC2 Shard 4 No Application Interruption Shard 5 © 2020, Amazon Web Services, Inc. or its Affiliates. Global Datastore (Cross Region Replication) Example for a worldwide application • One-click setup for existing clusters • Write locally, read globally • Enable cross-region disaster Secondary recovery (Passive) Region Read Primary • Leverage extreme performance with (active) region Read/Write Redis’ sub-millisecond latency Secondary (Passive) Region • Secure encryption in transit for Read cross-region traffic • Use with AWS Management Console, or latest AWS SDK or CLI © 2020, Amazon Web Services, Inc. or its Affiliates. Optimized M5 and R5 instances & Enhanced I/O • Scale up to of • Dynamic network processing to enhance © 2020, Amazon Web Services, Inc. or its Affiliates. Self-managing

Amazon Elasticache Deep Dive Powering Modern Applications with Low Latency and High Throughput

Redis and Memcached

Modeling and Analyzing Latency in the Memcached System

Timeline 1994 July Company Incorporated 1995 July Amazon

Amazon Dynamodb

Release 2.5.5 Ask Solem Contributors

An End-To-End Application Performance Monitoring Tool

WEB2PY Enterprise Web Framework (2Nd Edition)

Performance at Scale with Amazon Elasticache

Amazon Documentdb Deep Dive

A Motion Is Requested to Authorize the Execution of a Contract for Amazon Business Procurement Services Through the U.S. Communities Government Purchasing Alliance

Database Software Market: Billy Fitzsimmons +1 312 364 5112

Amazon Mechanical Turk Developer Guide API Version 2017-01-17 Amazon Mechanical Turk Developer Guide