
Build data-driven, high-performance, internet-scale applications

Julio Faerman @faermanj AWS Technical Evangelist

SUMMIT © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Characteristics of modern applications
Internet-scale and transactional

Users: 1M+
Data volume: TB–PB–EB
Locality: Global
Performance: Milliseconds–microseconds
Request rate: Millions
Access: Mobile, IoT, devices
Scale: Up-out-in
Economics: Pay-as-you-go
Developer access: Instant API access

Examples: ride hailing, media streaming, social media, dating

Modern applications require internet-scale performance

Common data categories and use cases

Relational
  Characteristics: Referential integrity, ACID transactions, schema-on-write
  Use cases: Lift and shift, ERP, CRM, finance

Key-value
  Characteristics: High throughput, low-latency reads and writes, endless scale
  Use cases: Real-time bidding, shopping cart, social, product catalog, customer preferences

Document
  Characteristics: Store documents and quickly access querying on any attribute
  Use cases: Content management, personalization, mobile

In-memory
  Characteristics: Query by key with microsecond latency
  Use cases: Leaderboards, real-time analytics, caching

Graph
  Characteristics: Quickly and easily create and navigate relationships between data
  Use cases: Fraud detection, social networking, recommendation engine

Time-series
  Characteristics: Collect, store, and process data sequenced by time
  Use cases: IoT applications, event tracking

Ledger
  Characteristics: Complete, immutable, and verifiable history of all changes to application data
  Use cases: Systems of record, supply chain, health care, registrations, financial

Purpose-built databases

Relational: Amazon RDS (Aurora, Community, Commercial)
Key-value: DynamoDB
Document: DocumentDB
In-memory: ElastiCache
Graph: Neptune
Time-series: Timestream
Ledger: Quantum Ledger Database

with Amazon DynamoDB

Purpose-built databases for internet-scale apps

“The world’s largest e-commerce business, Amazon.com, runs on nonrelational databases because of their scale, performance, and maintenance benefits.”

— Werner Vogels CTO, Amazon

Lyft: >1M rides/day, 8x traffic in peak hours

CHALLENGE: Needed a solution that scales and manages up to 8x more riders during peak times.

SOLUTION: DynamoDB stores GPS coordinates of all rides. With AWS, Lyft saves on infrastructure costs and enables massive growth of its ridesharing platform. There are now 23M people who use Lyft worldwide.

Amazon DynamoDB
Fast and flexible key-value database service for any scale

Performance at scale: Consistent, single-digit millisecond response times at any scale; build applications with virtually unlimited throughput

Serverless: No hardware provisioning, software patching, or upgrades; scales up or down automatically; continuously backs up your data

Comprehensive security: Encrypts all data by default and fully integrates with AWS Identity and Access Management for robust security

Global database for global users and apps: Build global applications with fast access to local data by easily replicating tables across multiple AWS Regions

DynamoDB transactions

Single API Call

Simplify your code by executing multiple, all-or-nothing actions within and across tables with a single API call
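
As a minimal sketch of what "a single API call" can look like, the Python snippet below assembles a request for the DynamoDB TransactWriteItems operation that debits a balance in one table and records an order in another. The table names (Accounts, Orders), key names, and attribute names are hypothetical and chosen only for illustration. The snippet only builds the request and does not contact AWS.

```python
from typing import Any, Dict


def build_order_transaction(customer_id: str, order_id: str, amount: int) -> Dict[str, Any]:
    """Assemble keyword arguments for a TransactWriteItems request.

    Table, key, and attribute names are hypothetical examples.
    """
    return {
        "TransactItems": [
            {
                # Debit the balance only if enough funds are available.
                "Update": {
                    "TableName": "Accounts",
                    "Key": {"CustomerId": {"S": customer_id}},
                    "UpdateExpression": "SET Balance = Balance - :amt",
                    "ConditionExpression": "Balance >= :amt",
                    "ExpressionAttributeValues": {":amt": {"N": str(amount)}},
                }
            },
            {
                # Record the order in a second table; reject duplicates.
                "Put": {
                    "TableName": "Orders",
                    "Item": {
                        "OrderId": {"S": order_id},
                        "CustomerId": {"S": customer_id},
                        "Amount": {"N": str(amount)},
                    },
                    "ConditionExpression": "attribute_not_exists(OrderId)",
                }
            },
        ]
    }


request = build_order_transaction("c-42", "o-1001", 25)
print(len(request["TransactItems"]), "actions in one request")
```

With boto3 installed and credentials configured, a request like this could be sent with `boto3.client("dynamodb").transact_write_items(**request)`. If either condition fails, neither write is applied.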

DynamoDB: Capacity managed for you

Govern max consumption

DynamoDB Accelerator (DAX)

Features
• Fully managed, highly available: Handles all software management, fault tolerant, replication across multi-AZs within a Region
• DynamoDB API compatible: Seamlessly caches DynamoDB API calls, no application rewrites required
• Write-through: DAX handles caching for writes
• Flexible: Configure DAX for one table or many
• Scalable: Scales out to any workload with up to 10 read replicas
• Manageability: Fully integrated AWS service: Amazon CloudWatch, Tagging for DynamoDB, AWS Console
• Security: Amazon VPC, AWS IAM, AWS CloudTrail, AWS Organizations

DynamoDB advancements over the last 21 months

February 2017: Time To Live (TTL)
April 2017: VPC endpoints
April 2017: DynamoDB Accelerator (DAX)
June 2017: Auto scaling
November 2017: On-demand backup
November 2017: Point-in-time recovery
November 2017: Global tables
March 2018: Encryption at rest
June 2018: Adaptive capacity
August 2018: 99.999% SLA
November 2018: Transactions (ACID)
November 2018: On-demand


SQL vs NoSQL

SQL | NoSQL
Optimized for storage | Optimized for compute
Normalized/relational | Denormalized/hierarchical
Ad hoc queries | Instantiated views
Scale vertically | Scale horizontally
Good for OLAP | Built for OLTP at scale

with Amazon Aurora

Old world commercial relational databases

Very expensive | Proprietary | Lock-in | Punitive licensing | You’ve got mail

Moving to open database engines

Amazon Aurora

MySQL- and PostgreSQL-compatible relational database built for the cloud
Performance and availability of commercial-grade databases at 1/10th the cost

Performance and scalability: 5x throughput of standard MySQL and 3x of standard PostgreSQL; scale-out up to 15 read replicas

Availability and durability: Fault-tolerant, self-healing storage; six copies of data across three AZs; continuous backup to S3

Highly secure: Network isolation, encryption at rest/transit

Fully managed: Managed by RDS: no hardware provisioning, software patching, setup, configuration, or backups

Aurora: Scale-out, distributed architecture

• Push log applicator to storage: “The log is the database”
• 4/6 write quorum and local tracking
• Write performance
• Read scale out
• AZ + 1 failure tolerance
• Instant database redo recovery

[Figure: one master and several replicas, each with SQL, transactions, and caching layers, on a shared storage volume spanning AZ1, AZ2, and AZ3]


Read scale out

MySQL read scaling
• Master: 70% write, 30% read
• Replica: single-threaded binlog apply; 70% write, 30% new reads
• Logical replication using complete changes
• Same write workload on the replica
• Independent storage (separate data volumes)

Amazon Aurora read scaling
• Master: 70% write, 30% read
• Replica: page cache update; 100% new reads
• Physical replication using delta changes
• No writes on the replica
• Shared multi-AZ storage

© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Aurora Multi-Master Architecture
• No pessimistic locking
• No global ordering
• No global commit-coordination
• Optimistic conflict resolution
• Decoupled system
• Microservices architecture

Cluster services: membership, replication, heartbeat, metadata

[Figure: orange master, blue master, and replica nodes (SQL, transactions, caching) running transactions T1 and T2 over a decoupled shared storage volume spanning AZ1, AZ2, and AZ3]

Driving down query latency – Parallel Query

• Parallel, distributed processing
• Push-down processing closer to data
• Reduces buffer pool pollution

[Figure: the database node pushes down predicates to the storage nodes and aggregates their results]

“AZ+1” failure tolerance

Why?
• In a large fleet, always some failures
• AZ failures have “shared fate”

3 copies across AZ 1, AZ 2, AZ 3 (2/3 read, 2/3 write): quorum breaks on AZ failure

How?
• 6 copies, 2 copies per AZ
• 2/3 quorum will not work

6 copies across AZ 1, AZ 2, AZ 3 (3/6 read, 4/6 write): quorum survives AZ failure
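
The quorum arithmetic behind these two layouts can be checked with a short Python sketch. It reads "AZ+1" as the loss of one whole Availability Zone plus one further copy, and it assumes the three-copy layout places one copy per AZ. The class and method names are illustrative only and are not part of any AWS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumScheme:
    azs: int
    copies_per_az: int
    write_quorum: int
    read_quorum: int

    @property
    def copies(self) -> int:
        return self.azs * self.copies_per_az

    def is_consistent(self) -> bool:
        # Every read set must overlap the latest write set,
        # and any two write sets must overlap each other.
        return (self.read_quorum + self.write_quorum > self.copies
                and 2 * self.write_quorum > self.copies)

    def surviving(self, failed_azs: int = 0, extra_failed_copies: int = 0) -> int:
        return self.copies - failed_azs * self.copies_per_az - extra_failed_copies

    def can_write(self, **failures: int) -> bool:
        return self.surviving(**failures) >= self.write_quorum

    def can_read(self, **failures: int) -> bool:
        return self.surviving(**failures) >= self.read_quorum


three_copy = QuorumScheme(azs=3, copies_per_az=1, write_quorum=2, read_quorum=2)
six_copy = QuorumScheme(azs=3, copies_per_az=2, write_quorum=4, read_quorum=3)

for name, scheme in (("2/3", three_copy), ("4/6", six_copy)):
    print(name,
          "| consistent:", scheme.is_consistent(),
          "| AZ down, can write:", scheme.can_write(failed_azs=1),
          "| AZ+1 down, can read:", scheme.can_read(failed_azs=1, extra_failed_copies=1))
```

Under these assumptions the three-copy scheme loses both read and write quorum once an AZ outage coincides with one more failed copy, while the six-copy scheme keeps writing through an AZ outage and can still read after AZ+1.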

Database backtrack

[Figure: timeline from t0 to t4 showing "Rewind to t1" and "Rewind to t3"; the segments abandoned by each rewind are marked invisible]

Backtrack brings the database to a point in time without requiring restore from backups
• Backtracking from an unintentional DML or DDL operation
• Backtrack is not destructive. You can backtrack multiple times to find the right point in time

Global replication
Faster disaster recovery and enhanced data locality

Performance Insights

Dashboard showing database load
• Easy: e.g. drag and drop
• Powerful: drill down using zoom in

Identifies source of bottlenecks
• Sort by top SQL
• Slice by host, user, wait events

Adjustable time frame
• Hour, day, week, month
• Up to 2 years of data; 7 days free

[Figure: load dashboard annotated with "CPU bottleneck", "Max vCPU", and "SQL w/ high CPU"]

Aurora Serverless

Responds to your application load automatically

Scale capacity up and down in < 10 seconds

New instance has warm buffer pool

Multi-tenant proxy is highly available

How does it work . . .

Region

Availability zone 1 App

Multi-tenant NLB / database proxy layer

Monitoring service Warm-pool of Aurora instances

Shared distributed storage volume

Introducing Web Service Data API

Access your database from Lambda applications

SQL statements packaged as HTTP requests Web Service Data API

Connection pooling managed behind proxy

Aurora Serverless
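
To make "SQL statements packaged as HTTP requests" concrete, here is a hedged Python sketch that assembles the arguments for a Data API call in the shape used by the boto3 `rds-data` client's `execute_statement` method. The helper name, ARNs, database name, and SQL are hypothetical placeholders, and the snippet only builds the request without contacting AWS. Check the current Data API documentation before relying on the exact field names.

```python
from typing import Any, Dict


def to_field(value: Any) -> Dict[str, Any]:
    """Map a Python value to a typed Data API field."""
    if value is None:
        return {"isNull": True}
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return {"booleanValue": value}
    if isinstance(value, int):
        return {"longValue": value}
    if isinstance(value, float):
        return {"doubleValue": value}
    return {"stringValue": str(value)}


def build_execute_statement(resource_arn: str, secret_arn: str, database: str,
                            sql: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Assemble keyword arguments for an execute_statement call."""
    return {
        "resourceArn": resource_arn,
        "secretArn": secret_arn,
        "database": database,
        "sql": sql,  # named parameters are written as :name inside the SQL text
        "parameters": [{"name": name, "value": to_field(value)}
                       for name, value in params.items()],
    }


request = build_execute_statement(
    resource_arn="arn:aws:rds:us-east-1:123456789012:cluster:example-cluster",   # placeholder
    secret_arn="arn:aws:secretsmanager:us-east-1:123456789012:secret:example",    # placeholder
    database="appdb",
    sql="SELECT id, status FROM rides WHERE rider_id = :rider_id AND active = :active",
    params={"rider_id": 42, "active": True},
)
print(request["parameters"])
```

From a Lambda function, a request like this could be passed to `boto3.client("rds-data").execute_statement(**request)`. The function then holds no database connection of its own, which matches the slide's point that pooling is managed behind the proxy.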

with Amazon ElastiCache

Internet-scale Apps Need Low Latency and High Concurrency

Users: 1M+
Data volume: TB–PB–EB
Locality: Global
Performance: Milliseconds–microseconds
Request rate: Millions
Access: Mobile, IoT, devices
Scale: Up-out-in
Economics: Pay as you go
Developer access: Instant API access

Examples: gaming leaderboards, financial trading, social media, ride hailing, dating, media streaming, session stores

Introducing Amazon ElastiCache
Fully managed, Redis- or Memcached-compatible, low-latency, in-memory data store

Extreme performance: In-memory data store and cache for sub-millisecond response times

Fully managed: AWS manages all hardware and software setup, configuration, monitoring

Easily scalable: Read scaling with replicas. Write and memory scaling with sharding. Non-disruptive scaling

What’s New: Redis & Memcached

Redis 5.0
• Redis Streams
• SortedSets now have LIST capabilities (POP and BLOCK)
• HyperLogLogs has an optimized algorithm
• Speed improvements (Jemalloc additions, etc.)
• Active defragmentation
• Added in-line HELP command for redis-cli
• Native TLS Integration, Redis (ElastiCache)
• More at https://aws.amazon.com/redis/Whats_New_Redis5

Memcached 1.5.10
• Automated slab rebalancing
• LRU crawler to background-reclaim memory
• Faster hash table lookups with murmur3 algorithm

New: Amazon ElastiCache 250 node support

When 9.5 TiB is not enough!

Example 1:

• Assume 125 shards made of 1 primary + 1 replica = 250 nodes
• Assume R5.24xlarge (635.61 GiB)
• Cluster memory: 635.61 GiB × 125 = ~80 TiB = ~88 TB

Example 2:

• Assume 250 shards made of 1 primary + 0 replicas = 250 nodes
• Assume R5.24xlarge (635.61 GiB)
• Cluster memory: 635.61 GiB × 250 = ~159 TiB = ~170 TB
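
The capacity arithmetic in both examples can be reproduced with a few lines of Python. This is only a sketch: the function name is illustrative and the 635.61 GiB per-node figure is taken from the slide. Only primaries add capacity, since each replica holds a copy of its primary's data. The slide's totals are rounded: an exact binary conversion gives roughly 77.6 TiB (85.3 TB) for Example 1 and 155.2 TiB (170.6 TB) for Example 2.

```python
GIB = 2 ** 30
TIB = 2 ** 40
TB = 10 ** 12

NODE_MEMORY_GIB = 635.61  # R5.24xlarge figure quoted on the slide


def cluster_memory(shards: int, node_gib: float = NODE_MEMORY_GIB) -> dict:
    """Total cluster memory for a given shard count, in several units."""
    total_gib = shards * node_gib
    total_bytes = total_gib * GIB
    return {
        "GiB": total_gib,
        "TiB": total_bytes / TIB,
        "TB": total_bytes / TB,
    }


example_1 = cluster_memory(shards=125)  # 125 primaries + 125 replicas = 250 nodes
example_2 = cluster_memory(shards=250)  # 250 primaries, no replicas = 250 nodes

for label, result in (("Example 1", example_1), ("Example 2", example_2)):
    print(label, {unit: round(value, 1) for unit, value in result.items()})
```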

Redis Overview

• Fast: <1 ms latency for most commands
• In-memory key-value store
• Open source
• Easy to learn
• Powerful: ~200 commands, Lua scripting, geospatial, Pub/Sub
• Various data structures: strings, lists, hashes, sets, sorted sets, bitmaps, streams, and HyperLogLogs
• Highly available: replication
• Atomic operations: supports transactions
• Backup/restore: enables snapshotting

Cluster sizing best practices
• In-memory storage
  • Recommended: memory needed + 25% reserved memory (for Redis) + some room for growth (optional 10%)
  • Optimize using eviction policies and TTLs
  • Scale up or out before reaching max-memory, using CloudWatch alarms
  • Use memory-optimized nodes for cost effectiveness (R5 support)
• Performance
  • Benchmark operations using the Redis benchmark tool
  • For more read IOPS: add replicas
  • For more write IOPS: add shards (scale out)
  • For more network IO: use network-optimized instances and scale out
  • Use pipelining for bulk reads/writes
  • Consider Big O time complexity for data structure commands
• Cluster isolation (apps sharing key space): choose a strategy that works for your workload
  • Identify what kind of isolation is needed based on the workload and environment
  • Isolation: No Isolation $ | Isolation by Purpose $$ | Full Isolation $$$
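
The memory recommendation above can be turned into a rough sizing helper. The sketch below assumes that both the 25% reserve and the optional 10% growth margin are applied to the size of the data set. That is one reading of the slide, not an official formula. The function names and the 2,000 GiB example workload are hypothetical.

```python
import math


def required_memory_gib(data_gib: float,
                        reserved_fraction: float = 0.25,
                        growth_fraction: float = 0.10) -> float:
    """Memory to provision: data + reserved memory + room for growth.

    Assumption: both fractions are applied to the data set size.
    """
    return data_gib * (1 + reserved_fraction + growth_fraction)


def shards_needed(data_gib: float, node_gib: float) -> int:
    """Smallest number of shards whose primaries hold the required memory."""
    return math.ceil(required_memory_gib(data_gib) / node_gib)


# Hypothetical workload: 2,000 GiB of data on 635.61 GiB nodes.
print(round(required_memory_gib(2000), 1), "GiB to provision")
print(shards_needed(2000, 635.61), "shards")
```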

GE’s Predix Platform Powered by ElastiCache Redis

GE is the world’s largest digital industrial company.

Using ElastiCache Redis with Open Service Broker, Predix Platform from GE Digital allows developers to easily create Redis clusters with standard, pre-configured parameters, sizing and network security.

Developers build container-based stateless applications on AWS and ElastiCache is used to manage session state for these applications.

The architecture makes it easy and simple for developers to build applications.

“We use [ElastiCache Redis] to make it super easy and simple for developers to use Amazon services. Amazon ElastiCache team implemented the Redis AUTH feature in four regions in two months enabling application level security.”
– Amulya Sharma, Senior Staff Software Engineer, GE Digital

[Figure: architecture diagram labeled Container Runtime VPC, ElastiCache, Server VPC, Broker, App, API, Control Plane, Data Plane, EC2]

Expedia’s Real-time Analytics with ElastiCache

Expedia is a leader in the $1 trillion travel industry, with an extensive portfolio that includes some of the world’s most trusted travel brands.

Expedia’s real-time analytics application collects data for its “test & learn” experiments on Expedia sites. The analytics application processes ~200 million messages daily.

“With ElastiCache Redis as caching layer, the write throughput on DynamoDB has been set to 3500, down from 35000, reducing the cost by 6x.”
– Kuldeep Chowhan, Engineering Manager, Expedia

[Figure: pipeline diagram labeled: real-time streams of lodging mark data; Kinesis Firehose (ingest near real-time data streams); EC2 (join/compare multiple data events); ElastiCache (Redis); S3 (staging); reference data on-premises; Redshift (historical queries on up to 2 years of data); Aurora (operational queries of real-time data)]

with Amazon Timestream

A sequence of data points recorded over a time interval

Amazon Timestream (Preview)
Fast, scalable, and fully managed time series database

1,000x faster at 1/10th the cost of relational databases: Collect fast-moving time-series data from multiple sources at the rate of millions of inserts per second

Trillions of daily events: Capable of processing trillions of events daily; the adaptive query processing engine maintains steady, predictable performance

Analytics optimized for time series data: Built-in analytics for interpolation, smoothing, and approximation to identify trends, patterns, and anomalies

Serverless: No servers to manage; time-consuming tasks such as hardware provisioning, software patching, setup, & configuration done for you

Our approach

Architect services ground-up for the cloud and for the explosion of data

Offer a portfolio of purpose-built services, optimized for your workloads

Help you innovate faster through managed services

Provide services that help you migrate existing apps and databases to the cloud

When to Use Which Services

Situation | Solution
Existing application | Use your existing engine on RDS
• MySQL | Amazon Aurora, RDS for MySQL
• PostgreSQL | Amazon Aurora, RDS for PostgreSQL
• MariaDB | Amazon Aurora, RDS for MariaDB
• Oracle | Use SCT to determine complexity; Amazon Aurora, RDS for Oracle
• SQL Server | Use SCT to determine complexity; Amazon Aurora, RDS for SQL Server
New application
• If you can avoid relational features | DynamoDB
• If you need relational features | Amazon Aurora
In-memory store/cache | Amazon ElastiCache
Time series data | Amazon Timestream
Track every application change, crypto verifiable; have a central trust authority | Amazon Quantum Ledger Database (QLDB)
Don’t have a trusted central authority | Amazon Managed Blockchain
Data Warehouse & BI | Amazon Redshift, Amazon Redshift Spectrum, and Amazon QuickSight
Ad hoc analysis of data in S3 | Amazon Athena and Amazon QuickSight
Apache Spark, Hadoop, HBase (needle in a haystack type queries) | Amazon EMR
Log analytics, operational monitoring, & search | Amazon Elasticsearch Service and Amazon Kinesis

Thank you!

Julio Faerman @faermanj
