
Introduction to Amazon DynamoDB - and some use case examples Pete Naylor Sr Database Specialist SA - AWS © 2020, Amazon Web Services, Inc. or its Affiliates. Agenda • DynamoDB’s place in the history of purpose-built databases • Key concepts • How do I use DynamoDB? - Basic data models and controls • Challenges with changing and imbalanced loads - And how DynamoDB deals with them • Integrated DynamoDB architecture and data flows © 2020, Amazon Web Services, Inc. or its Affiliates. Database evolution and DynamoDB © 2020, Amazon Web Services, Inc. or its Affiliates. DynamoDB history How did we get here? 2004 – Amazon.com operational database scaling woes https://www.cnet.com/news/amazon-com-hit-with-outages/ 2007 – Whitepaper describing internal distributed database solution http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf 2012 – Reimagined implementation made available via AWS as DynamoDB https://aws.amazon.com/about-aws/whats-new/2012/01/18/aws-announces-dynamodb/ 2018 – Amazon.com almost entirely migrated to Redshift, Aurora, DynamoDB https://twitter.com/ajassy/status/1060979175098437632 2019 – Continuing rapid innovation informed by unmatched experience http://amzn.to/2fAJXH9 © 2020, Amazon Web Services, Inc. or its Affiliates. DynamoDB history Motivations at Amazon (and elsewhere) 1. Reduce dependence on commercially licensed relational database engines 2. Minimize operational complexity and administrative overhead 3. Provide best possible customer experience across all data indexing needs “A one size fits all database doesn't fit anyone” - Werner Vogels © 2020, Amazon Web Services, Inc. or its Affiliates. Internet-scale applications Users 1M+ Data volume TB-PB-EB Locality Global Performance Milliseconds-microseconds Request rate Millions Access Mobile, IoT, devices Scale Incrementally based on traffic Economics Pay as you go Developer access Instant API access © 2020, Amazon Web Services, Inc. or its Affiliates. © 2020, Amazon Web Services, Inc. or its Affiliates. By: Arturo Pardavila III https://www.flickr.com/photos/apardavila/albums/72157674624352641 Snap (Snapchat) Database writes peak seconds after Chicago Cubs win the World Series. © 2020, Amazon Web Services, Inc. or its Affiliates. A migration from mainframe • Retail business ran on mainframe, which became a bottleneck • Migrated financial transaction data to DynamoDB • Unbound scale for customers and app developers “We built a secure and resilient cloud infrastructure that could solve the scalability and reliability problems with a serverless architecture.” Srini Uppalapati Capital One © 2020, Amazon Web Services, Inc. or its Affiliates. SQL and DynamoDB side by side Traditional SQL DynamoDB Optimized for storage Optimized for compute Normalized/relational Denormalized/hierarchical Ad hoc queries Instantiated views for known patterns Scale vertically Scale horizontally Good for OLAP Built for OLTP at scale © 2020, Amazon Web Services, Inc. or its Affiliates. Core Strengths of DynamoDB Sounds cool, but when should I use it? Considerations: Priorities • SQL / NoSQL • Relational / Non-Relational ü Security • Operational / Analytical ü Durability ü Scalability DynamoDB solves for: • Horizontal scaling ü Availability • Decoupled compute/storage ü Low Latency • Asynchronous roll-ups/aggregations ü Easy to Use • Availability/Durability/Latency ü Easy to Manage • Well-known access patterns • Schema flexibility © 2020, Amazon Web Services, Inc. or its Affiliates. Key concepts © 2020, Amazon Web Services, Inc. or its Affiliates. Scaling databases Traditional SQL NoSQL DB DB DB DB DB DB P1 P2 P3 Pn Scale up Scale out to many shards © 2020, Amazon Web Services, Inc. or its Affiliates. Scaling NoSQL databases Most NoSQL databases DynamoDB DB DB DB DB DB DB DB DB DB P1DB P1DB P1DB P1DB P1DB P1 DB DB DB DB P1DB P1DB P1DB P1DB P1 P1 DB DB DB P1DB P1DB P1DB P1DB P1DB P1 DB DB P1DB P1DB P1DB P1DB P1DB P1DB DB DB P25 P26 P27 P28 P29 P30 Servers and clusters DynamoDB: partitions Basic premise: There is a way to shard data that’s horizontally scalable. © 2020, Amazon Web Services, Inc. or its Affiliates. Incremental scaling with DynamoDB Workload: data volume, reads, writes DynamoDB resources: storage, read, and write capacity © 2020, Amazon Web Services, Inc. or its Affiliates. You work with tables… Table 1 Table 2 Table 3 DynamoDB does the rest under the hood… Server 1 Server N 1K WU or 3K RU T1.p1 T1.pn up to 10 GB © 2020, Amazon Web Services, Inc. or its Affiliates. Terminology Primary key – unique item identifier Items DynamoDB table Sort key conditions: all items Partition Sort key ==, <, >, >=, <= Attributes “begins with” key (optional) “between” - 1:1 relationships - 1:many relationships sorted results - distribute traffic - collect related items counts - collection identity - efficient filtering top/bottom N - sorting © 2020, Amazon Web Services, Inc. or its Affiliates. Global secondary index (GSI) Online indexing Alternate partition (+sort) key for alternate materialized view Capacity provisioned separately from table A1 A2 A3 A4 A5 Table (partition) Up to 20 GSIs per table A2 A1 (part.) (table key) KEYS_ONLY A5 A4 A1 A3 INCLUDE A3 GSIs (part.) (sort) (table key) (projected) A4 A5 A1 A2 A3 (part.) (sort) (table key) (projected) (projected) ALL © 2020, Amazon Web Services, Inc. or its Affiliates. Sharding/partitioning Denormalized Record Partition key DynamoDB table Hash.MIN = 0 Orders 00 Partition A OrderId: 1 CountryCode: 1 Hash(1) = 7B ASIN: [B00X4WHP5E] Keyspace 55 Partition B OrderId: 2 CountryCode : 1 Hash(2) = 48 ASIN: [B00OQVZDJM] AA OrderId: 3 Partition C CountryCode : 1 Hash(3) = CD ASIN: [B00U3FPN4U] FF Hash.MAX = FF Related data is stored together for efficient access © 2020, Amazon Web Services, Inc. or its Affiliates. A view “from a different angle” Three Replicas OrderId: 1 CustomerId: 1 Across 3 Availability ASIN: [B00X4WHP5E] Zones in the Region - one replica is elected to serve as “leader”. Hash(1) = 7B Availability Zone A Availability Zone B Availability Zone C Partition A Partition B Partition C Partition A Partition B Partition C PartitionPartition A A Partition B Partition C Host 1 Host 2 Host 3 Host 4 Host 5 Host 6 Host 7 Host 8 Host 9 CustomerOrdersTable © 2020, Amazon Web Services, Inc. or its Affiliates. Modeling data for DynamoDB © 2020, Amazon Web Services, Inc. or its Affiliates. Partition key • A good sharding (partitioning) scheme affords even distribution of both data and workload as they grow • Key concept: partition key as the dimension of scalability • Distribute traffic and data across partitions – horizontal scaling • Ideal scaling conditions: • The partition key is from a high cardinality set (that grows) • Requests are evenly spread over the key space • Requests are evenly spread over time © 2020, Amazon Web Services, Inc. or its Affiliates. Denormalization Traditional database DynamoDB CART CART CART_ITEM CartId: 1, CartId: 1, ItemId: 2, Qty: 1, Version: v3, Version: v3 CartId: 1 Items: [ {ID: 2, Qty: 1}, {ID: 5, Qty: 2}] Normalized schema: multiple items – one table per entity type Denormalized: one item with flexible schema © 2020, Amazon Web Services, Inc. or its Affiliates. Ensuring data consistency on updates Example: add/remove shopping cart items 1. get cart => vread = Version 2. update cart: Use ConditionExpression IF Version = vread in DynamoDB add/remove cart items ++Version ELSE go back to Step 1. Optimistic concurrency control © 2020, Amazon Web Services, Inc. or its Affiliates. Updating cart: data consistency using OCC 1. Get the cart: GetItem 2. Update the cart: conditional PutItem (or UpdateItem) { "TableName": “Cart”, { "TableName": “Cart”, "Key": {“CartId”: {“N”: “2”}} "Item": { } “CartID”: {“N”: “2”}, “Version”: {“N”:”4”}, “CartItems”: {…} }, CART "ConditionExpression": “Version = :ver”, "ExpressionAttributeValues": {":ver": {“N”: “3”}} CartID:2, Version: 3, } CartItems: [ ü Use conditions to implement optimistic {ID: 2, Qty: 1}, concurrency control ensuring data consistency {ID: 5, Qty: 2}] ü Single-item operations are ACID ü GetItem call can be eventually consistent © 2020, Amazon Web Services, Inc. or its Affiliates. Yes, you can have strongly consistent read-after-write and concurrency control with DynamoDB © 2020, Amazon Web Services, Inc. or its Affiliates. One-to-one relationships or key-values • Use a table or GSI with a partition key • Use GetItem or BatchGetItem API Example: Given a user or email, get attributes Users table Partition key Attributes UserId = bob Email = [email protected], JoinDate = 2011-11-15 UserId = fred Email = [email protected], JoinDate = 2011-12-01 Users-Email-GSI Partition key Attributes Email = [email protected] UserId = bob, JoinDate = 2011-11-15 Email = [email protected] UserId = fred, JoinDate = 2011-12-01 © 2020, Amazon Web Services, Inc. or its Affiliates. One-to-many relationships or parent-children • Use a table or GSI with a partition and sort key • Use the Query API to get multiple items Example: Given a device, find all readings between epoch X, Y Device-measurements Part. key Sort key Attributes DeviceId = 1 epoch = 5513A97C Temperature = 30, pressure = 90 DeviceId = 1 epoch = 5513A9DB Temperature = 30, pressure = 90 © 2020, Amazon Web Services, Inc. or its Affiliates. Many-to-many relationships • Use a table and GSI with the partition and sort key elements switched • Use the Query API Example: Given a user, find all games. Or given a game, find all users. User-Games-Table Game-Users-GSI Part. key Sort key Part. key Sort key UserId = bob GameId = Game1
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages41 Page
-
File Size-