©2016, Amazon Web Services, Inc. or its affiliates. All rights reserved
Amazon DynamoDB
Sean Shriver
NoSQL Solutions Architect
Amazon Web Services
September 2016
• Tables, API, data types, indexes
• Scaling
• Data modeling
• Scenarios and best practices
• DynamoDB Streams
• Reference architecture
Amazon DynamoDB
• Managed NoSQL database service
• Supports both document and key-value data models
• Highly scalable
• Consistent, single-digit millisecond latency at any scale
• Highly available—3x replication
• Simple and powerful API
Tables, Partitioning
Key-value access pattern
Determines data distribution Optional
Model 1:N relationships
Enables rich query capabilities
All items for a partition key
==, <, >, >=, <=
“begins with”
sorted results
top/bottom N values
paged responses
• CreateTable
• UpdateTable
• DeleteTable
• DescribeTable
• ListTables
• GetItem
• Query
• Scan
• BatchGetItem
• PutItem
• UpdateItem
• DeleteItem
• BatchWriteItem
• ListStreams
• DescribeStream
• GetShardIterator
• GetRecords
Table and item API
Stream API
Data types
• String (S)
• Number (N)
• Binary (B)
• String Set (SS)
• Number Set (NS)
• Binary Set (BS)
• Boolean (BOOL)
• Null (NULL)
• List (L)
• Map (M)
Used for storing nested JSON documents
Key space: 00-54, 55-A9, AA-FF
Partition table
• Partition key uniquely identifies an item
• Partition key is used for building an unordered hash index
• Table can be partitioned for scale
00 FF
Id = 1
Name = Jim
Hash (1) = 7B
Id = 2
Name = Andy
Dept = Engg
Hash (2) = 48
Id = 3
Name = Kim
Dept = Ops
Hash (3) = CD
Key Space
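As a rough illustration of how a hashed partition key lands in one partition of the key space, the sketch below splits the one-byte range 00-FF into three partitions (00-54, 55-A9, AA-FF) and places the three hash values quoted on the slide. The function name and the one-byte key space are assumptions made for illustration; DynamoDB's actual hash function is internal and is not reproduced here.

```python
from bisect import bisect_right

# First hash value owned by partition 2 and by partition 3 (illustrative boundaries).
PARTITION_STARTS = [0x55, 0xAA]


def partition_for(hash_value: int) -> int:
    """Return the 1-based partition number that owns a one-byte hash value."""
    if not 0x00 <= hash_value <= 0xFF:
        raise ValueError("hash value must fit in one byte")
    return bisect_right(PARTITION_STARTS, hash_value) + 1


# Hash values quoted on the slide for Id = 1, 2 and 3.
slide_hashes = {1: 0x7B, 2: 0x48, 3: 0xCD}
placement = {item_id: partition_for(h) for item_id, h in slide_hashes.items()}
```

Under these assumptions Id = 2 falls in partition 1, Id = 1 in partition 2 and Id = 3 in partition 3, so neighbouring Ids do not end up next to each other.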
Partitions are three-way replicated
Id = 2
Name = Andy
Dept = Engg
Id = 3
Name = Kim
Dept = Ops
Id = 1
Name = Jim
Id = 2
Name = Andy
Dept = Engg
Id = 3
Name = Kim
Dept = Ops
Id = 1
Name = Jim
Id = 2
Name = Andy
Dept = Engg
Id = 3
Name = Kim
Dept = Ops
Id = 1
Name = Jim
Replica 1
Replica 2
Replica 3
Partition 1 Partition 2 Partition N
Partition-sort key table
• Partition key and sort key together uniquely identify an Item
• Within unordered partition key-space, data is sorted by the sort key
• No limit on the number of items (∞) per partition key
– Except if you have local secondary indexes
00:0 FF:∞
Hash (2) = 48
Customer# = 2
Order# = 10
Item = Pen
Customer# = 2
Order# = 11
Item = Shoes
Customer# = 1
Order# = 10
Item = Toy
Customer# = 1
Order# = 11
Item = Boots
Hash (1) = 7B
Customer# = 3
Order# = 10
Item = Book
Customer# = 3
Order# = 11
Item = Paper
Hash (3) = CD
55 A9:∞54:∞ AA
Partition 1 Partition 2 Partition 3
Global secondary index (GSI)
• Alternate partition (+sort) key
• Index is across all table partition keys
(table key)
(table key)
(projected) ALL
(table key) KEYS_ONLY
provisioned separately
for GSIs
Online Indexing
A2 A3 A4 A5
Local secondary index (LSI)
• Alternate sort key attribute
• Index is local to a partition key
(table key)
A3 A4 A5
(table key)
(table key)
10 GB max per partition
key, i.e. LSIs limit the #
of sort keys!
• Throughput
– Provision any amount of throughput to a table
• Size
– Add any number of items to a table
• Max item size is 400 KB
• LSIs limit the number of items due to 10 GB limit
• Scaling is achieved through partitioning
• Provisioned at the table level
– Write capacity units (WCUs) are measured in 1 KB per second
– Read capacity units (RCUs) are measured in 4 KB per second
• RCUs measure strongly consistent reads
• Eventually consistent reads cost 1/2 of strongly consistent reads
• Read and write throughput limits are independent
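The capacity-unit rules above can be sketched as a small calculator. This is a minimal sketch for single-item reads and writes; the function names are hypothetical, and only the 4 KB, 1 KB and one-half figures come from the slide.

```python
import math

RCU_BLOCK_BYTES = 4 * 1024  # one RCU covers a read of up to 4 KB
WCU_BLOCK_BYTES = 1 * 1024  # one WCU covers a write of up to 1 KB


def read_units(item_bytes: int, strongly_consistent: bool = True) -> float:
    """Capacity units consumed by reading one item of the given size."""
    blocks = max(1, math.ceil(item_bytes / RCU_BLOCK_BYTES))
    # An eventually consistent read costs half of a strongly consistent one.
    return blocks if strongly_consistent else blocks / 2


def write_units(item_bytes: int) -> int:
    """Capacity units consumed by writing one item of the given size."""
    return max(1, math.ceil(item_bytes / WCU_BLOCK_BYTES))
```

For example, a 400 KB item (the maximum item size) would cost 100 RCUs for a strongly consistent read and 400 WCUs to write.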
Getting the most out of DynamoDB throughput
“To get the most out of
DynamoDB throughput, create
tables where the partition key
has a large number of distinct
values, and values are
requested fairly uniformly, as
randomly as possible.”
—DynamoDB Developer Guide
1. Key Choice: High key cardinality
2. Uniform Access: access is evenly spread over the key-space
3. Time: requests arrive evenly spaced in time
Example: Key Choice or Uniform Access
Example: Time
How does DynamoDB handle bursts?
• DynamoDB saves 300 seconds of unused
capacity per partition
Bursting is best effort!
Burst capacity is built-in
Provisioned Consumed
“Save up” unused capacity
Consume saved up capacity
Burst: 300 seconds
(1200 × 300 = 360k CU)
Burst capacity may not be sufficient
Provisioned Consumed Attempted
Throttled requests
Don’t completely depend on burst capacity… provision sufficient throughput
Burst: 300 seconds
(1200 × 300 = 360k CU)
What causes throttling?
• If sustained throughput goes beyond
provisioned throughput per partition
• From the example before:
– Table created with 5000 RCUs, 500 WCUs
– RCUs per partition = 1666.67
– WCUs per partition = 166.67
– If sustained throughput > (1666 RCUs or 166 WCUs) per key or
partition, DynamoDB may throttle requests
• Solution: Increase provisioned throughput
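The per-partition figures above follow from dividing table throughput evenly across partitions. The sketch below reproduces that arithmetic; the partition count of 3 is an assumption inferred from the quoted numbers (5000 / 1666.67), and the function names are illustrative. The burst helper reuses the 300-second figure from the burst-capacity slides.

```python
def per_partition(provisioned_units: float, partitions: int) -> float:
    """Throughput available to one partition when it is split evenly."""
    return provisioned_units / partitions


def burst_pool(unit_rate: float, retention_seconds: int = 300) -> float:
    """Unused capacity that can accumulate over the retention window."""
    return unit_rate * retention_seconds


rcus_each = per_partition(5000, 3)  # about 1666.67 RCUs per partition
wcus_each = per_partition(500, 3)   # about 166.67 WCUs per partition
saved_up = burst_pool(1200)         # 1200 x 300 = 360,000 capacity units
```

The point of the calculation is that a single hot key can be throttled at roughly one third of the table's provisioned throughput in this example, even while the table as a whole is under its limit.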
What causes throttling?
• Non-uniform workloads
– Hot keys/hot partitions
– Very large bursts
• Dilution of throughput across partitions caused
by mixing hot data with cold data
– Use a table per time period for storing time series data so WCUs
and RCUs are applied to the hot data set
Data Modeling
Store data based on how you will access it!
1:1 relationships or key-values
• Use a table or GSI with a partition key
• Use GetItem or BatchGetItem API
Example: Given a user or email, get attributes
Users Table
Partition key Attributes
UserId = bob Email = bob@gmail.com, JoinDate = 2011-11-15
UserId = fred Email = fred@yahoo.com, JoinDate = 2011-12-01
Partition key Attributes
Email = bob@gmail.com UserId = bob, JoinDate = 2011-11-15
Email = fred@yahoo.com UserId = fred, JoinDate = 2011-12-01
1:N relationships or parent-children
• Use a table or GSI with partition and sort key
• Use Query API
Example: Given a device, find all readings
between epoch X, Y
Part. Key Sort key Attributes
DeviceId = 1 epoch = 5513A97C Temperature = 30, pressure = 90
DeviceId = 1 epoch = 5513A9DB Temperature = 30, pressure = 90
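A Query for this access pattern could be expressed roughly as below. This is a sketch that only builds the request parameters in the low-level DynamoDB attribute-value format; it does not call AWS. The table name `Device-measurements`, the choice of a Number type for `DeviceId` and a String type for the hex `epoch`, and the function name are assumptions for illustration.

```python
def readings_query(device_id: int, epoch_from: str, epoch_to: str) -> dict:
    """Build Query parameters for all readings of one device in an epoch range."""
    return {
        "TableName": "Device-measurements",  # assumed table name
        # Equality on the partition key, range condition on the sort key.
        "KeyConditionExpression": "DeviceId = :d AND #e BETWEEN :x AND :y",
        "ExpressionAttributeNames": {"#e": "epoch"},
        "ExpressionAttributeValues": {
            ":d": {"N": str(device_id)},  # numbers are sent as strings
            ":x": {"S": epoch_from},      # assumes fixed-width hex strings,
            ":y": {"S": epoch_to},        # which sort in time order
        },
    }


params = readings_query(1, "5513A97C", "5513A9DB")
```

With an SDK such as boto3, a dictionary of this shape would typically be passed to the client's `query` call as keyword arguments.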
N:M relationships
• Use a table and GSI with partition and sort key
elements switched
• Use Query API
Example: Given a user, find all games. Or given a
game, find all users.
Part. Key Sort key
UserId = bob GameId = Game1
UserId = fred GameId = Game2
UserId = bob GameId = Game3
Part. Key Sort key
GameId = Game1 UserId = bob
GameId = Game2 UserId = fred
GameId = Game3 UserId = bob
Documents (JSON)
• Data types (M, L, BOOL, NULL)
introduced to support JSON
• Document SDKs
– Simple programming model
– Conversion to/from JSON
– Java, JavaScript, Ruby, .NET
• Cannot create an Index on
elements of a JSON object
stored in Map
– They need to be modeled as top-
level table attributes to be used in
LSIs and GSIs
• Set, Map, and List have no
element limit but depth is 32 levels
JavaScript DynamoDB
string S
number N
boolean BOOL
null NULL
array L
object M
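The same mapping can be sketched for Python values. This is an illustrative converter, not an SDK function: the name `to_attribute_value` is hypothetical, and set types (SS, NS, BS) and Binary are left out to keep it short.

```python
def to_attribute_value(value):
    """Convert a plain Python value into DynamoDB attribute-value notation."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # test bool before numbers: bool is an int subclass
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers are carried as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


document = to_attribute_value({"Name": "Jim", "Tags": ["a", 1, True, None]})
```

Nested lists and dictionaries become nested L and M values, which is how a JSON document ends up stored in a single item.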
Rich expressions
• Projection expression
– Query/Get/Scan: ProductReviews.FiveStar[0]
• Filter expression
– Query/Scan: #V > :num (#V is a place holder for keyword VIEWS)
• Conditional expression
– Put/Update/DeleteItem: attribute_not_exists (#pr.FiveStar)
• Update expression
– UpdateItem: set Replies = Replies + :num
Scenarios and Best Practices
Event Logging
Storing time series data
Time series tables
(Partition key)
(sort key)
Attribute1 …. Attribute N
(Partition key)
(sort key)
Attribute1 …. Attribute N
(Partition key)
(sort key)
Attribute1 …. Attribute N
(Partition key)
(sort key)
Attribute1 …. Attribute N
RCUs = 1000
WCUs = 100
RCUs = 10000
WCUs = 10000
RCUs = 100
WCUs = 1
RCUs = 10
WCUs = 1
Current table
Older tables
Don’t mix hot and cold data; archive cold data to Amazon S3
Use a table per time period
• Pre-create daily, weekly, monthly tables
• Provision required throughput for current table
• Writes go to the current table
• Turn off (or reduce) throughput for older tables
Dealing with time series data
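Routing writes to the current table can be as simple as deriving the table name from the event date. The sketch below assumes monthly tables named in the style of `Events_table_2015_April`; the helper name and the hard-coded month list (used to avoid locale-dependent formatting) are illustrative choices.

```python
from datetime import date

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def events_table_for(day: date) -> str:
    """Name of the monthly table that should receive events for this date."""
    return f"Events_table_{day.year}_{MONTH_NAMES[day.month - 1]}"


current_table = events_table_for(date(2015, 4, 10))
```

Because only the current table takes writes, its throughput can be provisioned high while older tables are dialed down or archived.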
Product Catalog
Popular items (read)
Partition 1
2000 RCUs
Partition K
2000 RCUs
Partition M
2000 RCUs
Partition 50
2000 RCU
Scaling bottlenecks
Product A Product B
ProductCatalog Table
100,000 RCUs / 50 partitions ≈ 2000 RCUs per partition
SELECT Id, Description, ...
FROM ProductCatalog
WHERE Id="POPULAR_PRODUCT"
Item Primary Key
Request Distribution Per Partition Key
DynamoDB Requests
Partition 1 Partition 2
ProductCatalog Table
SELECT Id, Description, ...
FROM ProductCatalog
WHERE Id="POPULAR_PRODUCT"
Item Primary Key
Request Distribution Per Partition Key
DynamoDB Requests Cache Hits
Messaging App
Large items
Filters vs. indexes
M:N Modeling—inbox and outbox
Messages App
FROM Messages
WHERE Recipient='David'
FROM Messages
WHERE Sender ='David'
Recipient Date Sender Message
David 2014-10-02 Bob …
… 48 more messages for David …
David 2014-10-03 Alice …
Alice 2014-09-28 Bob …
Alice 2014-10-01 Carol …
Large and small attributes mixed
(Many more messages)
Messages Table
50 items × 256 KB each
Large message bodies
FROM Messages
WHERE Recipient='David'
Computing inbox query cost
Items evaluated by query
Average item size
Conversion ratio
Eventually consistent reads
Recipient Date Sender Subject MsgId
David 2014-10-02 Bob Hi!… afed
David 2014-10-03 Alice RE: The… 3kf8
Alice 2014-09-28 Bob FW: Ok… 9d2b
Alice 2014-10-01 Carol Hi!... ct7r
Separate the bulk data
Inbox-GSI Messages Table
MsgId Body
9d2b …
3kf8 …
ct7r …
afed …
1. Query Inbox-GSI: 1 RCU
2. BatchGetItem Messages: 1600 RCU
(50 separate items at 256 KB)
(50 sequential items at 128 bytes)
Uniformly distributes large item reads
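The RCU figures on this slide can be reproduced with the sketch below. It assumes the usual rounding rules: a Query sums the sizes of the items it reads and rounds the total up to 4 KB blocks, while BatchGetItem rounds each item up separately. All reads are taken to be eventually consistent, and the function names are illustrative.

```python
import math

KB = 1024
BLOCK = 4 * KB  # size covered by one strongly consistent RCU


def query_rcus(item_count: int, item_bytes: int, eventually_consistent: bool = True) -> float:
    """Query: item sizes are summed, then rounded up to whole 4 KB blocks."""
    blocks = math.ceil(item_count * item_bytes / BLOCK)
    return blocks / 2 if eventually_consistent else blocks


def batch_get_rcus(item_count: int, item_bytes: int, eventually_consistent: bool = True) -> float:
    """BatchGetItem: each item is rounded up to whole 4 KB blocks on its own."""
    blocks = item_count * math.ceil(item_bytes / BLOCK)
    return blocks / 2 if eventually_consistent else blocks


original_inbox = query_rcus(50, 256 * KB)     # 50 large items read by one query
index_query = query_rcus(50, 128)             # 50 small index entries
body_fetch = batch_get_rcus(50, 256 * KB)     # bodies fetched by MsgId
```

Listing the inbox now costs about 1 RCU instead of 1600; the 1600 RCUs are only spent when the message bodies are actually fetched, and those reads are spread across partitions by MsgId.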
Inbox GSI
Simplified writes
MsgId: 123,
Body: ...,
Recipient: Steve,
Sender: David,
Date: 2014-10-23,
Global secondary
Outbox GSI
FROM Messages
WHERE Sender ='David'
Messaging app
Global secondary
Global secondary
• Reduce one-to-many item sizes
• Configure secondary index projections
• Use GSIs to model M:N relationship
between sender and recipient
Distribute large items
Querying many large items at once
Multiplayer Online Gaming
Query filters vs.
composite key indexes
GameId Date Host Opponent Status
d9bl3 2014-10-02 David Alice DONE
72f49 2014-09-30 Alice Bob PENDING
o2pnb 2014-10-08 Bob Carol IN_PROGRESS
b932s 2014-10-03 Carol Bob PENDING
ef9ca 2014-10-03 David Bob IN_PROGRESS
Games Table
Multiplayer online game data
Query for incoming game requests
• DynamoDB indexes provide partition and sort key
• What about queries for two equalities and a sort?
WHERE Opponent='Bob'
Secondary Index
Opponent Date GameId Status Host
Alice 2014-10-02 d9bl3 DONE David
Carol 2014-10-08 o2pnb IN_PROGRESS Bob
Bob 2014-09-30 72f49 PENDING Alice
Bob 2014-10-03 b932s PENDING Carol
Bob 2014-10-03 ef9ca IN_PROGRESS David
Approach 1: Query filter
Secondary Index
Approach 1: Query filter
Opponent Date GameId Status Host
Alice 2014-10-02 d9bl3 DONE David
Carol 2014-10-08 o2pnb IN_PROGRESS Bob
Bob 2014-09-30 72f49 PENDING Alice
Bob 2014-10-03 b932s PENDING Carol
Bob 2014-10-03 ef9ca IN_PROGRESS David
WHERE Opponent='Bob'
(filtered out)
Needle in a haystack
• Send back less data “on the wire”
• Simplify application code
• Simple SQL-like expressions
– AND, OR, NOT, ()
Use query filter
Your index isn’t entirely selective
Approach 2: Composite key
Secondary Index
Approach 2: Composite key
Opponent StatusDate GameId Host
Alice DONE_2014-10-02 d9bl3 David
Carol IN_PROGRESS_2014-10-08 o2pnb Bob
Bob IN_PROGRESS_2014-10-03 ef9ca David
Bob PENDING_2014-09-30 72f49 Alice
Bob PENDING_2014-10-03 b932s Carol
Opponent StatusDate GameId Host
Alice DONE_2014-10-02 d9bl3 David
Carol IN_PROGRESS_2014-10-08 o2pnb Bob
Bob IN_PROGRESS_2014-10-03 ef9ca David
Bob PENDING_2014-09-30 72f49 Alice
Bob PENDING_2014-10-03 b932s Carol
Secondary Index
Approach 2: Composite key
WHERE Opponent='Bob'
Needle in a sorted haystack
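The composite-key approach can be imitated in memory to show why it avoids the filter step. The sketch below builds a `StatusDate` value for each game from the Games table, keeps the index sorted by (Opponent, StatusDate), and answers a "begins with" query by jumping straight to the first matching entry. The helper names are hypothetical, and a sorted Python list stands in for the secondary index.

```python
from bisect import bisect_left

# (GameId, Date, Host, Opponent, Status) rows from the Games table.
games = [
    ("d9bl3", "2014-10-02", "David", "Alice", "DONE"),
    ("72f49", "2014-09-30", "Alice", "Bob", "PENDING"),
    ("o2pnb", "2014-10-08", "Bob", "Carol", "IN_PROGRESS"),
    ("b932s", "2014-10-03", "Carol", "Bob", "PENDING"),
    ("ef9ca", "2014-10-03", "David", "Bob", "IN_PROGRESS"),
]

# Index entries are (Opponent, StatusDate, GameId), sorted like the index would be.
index = sorted(
    (opponent, f"{status}_{date}", game_id)
    for game_id, date, _host, opponent, status in games
)


def query_begins_with(opponent: str, prefix: str) -> list:
    """Return GameIds for one opponent whose StatusDate starts with prefix."""
    start = bisect_left(index, (opponent, prefix))
    hits = []
    for opp, status_date, game_id in index[start:]:
        if opp != opponent or not status_date.startswith(prefix):
            break  # entries are sorted, so no later entry can match
        hits.append(game_id)
    return hits


pending_for_bob = query_begins_with("Bob", "PENDING")
```

Only the two matching entries are touched; with the query-filter approach all three of Bob's games would be read and one discarded.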
Sparse indexes
User Game Score Date Award
1 Bob G1 1300 2012-12-23
2 Bob G1 1450 2012-12-23
3 Jay G1 1600 2012-12-24
4 Mary G1 2000 2012-10-24 Champ
5 Ryan G2 123 2012-03-10
6 Jones G2 345 2012-03-20
Id User Score
Champ 4 Mary 2000
Scan sparse partition GSIs
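A sparse index contains only the items that have the index key attribute. The sketch below imitates that with the game-scores rows from the slide: the index on `Award` ends up with a single entry, so scanning it is far cheaper than scanning the table. The function name is illustrative and a list of dictionaries stands in for the table.

```python
scores = [
    {"Id": 1, "User": "Bob", "Game": "G1", "Score": 1300, "Date": "2012-12-23"},
    {"Id": 2, "User": "Bob", "Game": "G1", "Score": 1450, "Date": "2012-12-23"},
    {"Id": 3, "User": "Jay", "Game": "G1", "Score": 1600, "Date": "2012-12-24"},
    {"Id": 4, "User": "Mary", "Game": "G1", "Score": 2000, "Date": "2012-10-24",
     "Award": "Champ"},
    {"Id": 5, "User": "Ryan", "Game": "G2", "Score": 123, "Date": "2012-03-10"},
    {"Id": 6, "User": "Jones", "Game": "G2", "Score": 345, "Date": "2012-03-20"},
]


def build_sparse_index(items: list, key_attribute: str) -> list:
    """Keep only items that carry the index key attribute."""
    return [item for item in items if key_attribute in item]


award_index = build_sparse_index(scores, "Award")
```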
• Concatenate attributes to form useful
secondary index keys
• Take advantage of sparse indexes
Replace filter with indexes
You want to optimize a query as
much as possible
Status + Date
Real-Time Voting
Write-heavy items
Requirements for voting
• Allow each person to vote only once
• No changing votes
• Real-time aggregation
• Voter analytics, demographics
Real-time voting architecture
RawVotes Table
Voting App
Partition 1
1000 WCUs
Partition K
1000 WCUs
Partition M
1000 WCUs
Partition N
1000 WCUs
Votes Table
Candidate A Candidate B
Scaling bottlenecks
Provision 200,000 WCUs
Write sharding
Candidate A_2
Candidate B_1
Candidate B_2
Candidate B_3
Candidate B_5
Candidate B_4
Candidate B_7
Candidate B_6
Candidate A_1
Candidate A_3
Candidate A_4
Candidate A_7 Candidate B_8
Candidate A_6 Candidate A_8
Candidate A_5
Votes Table
Write sharding
Candidate A_2
Candidate B_1
Candidate B_2
Candidate B_3
Candidate B_5
Candidate B_4
Candidate B_7
Candidate B_6
Candidate A_1
Candidate A_3
Candidate A_4
Candidate A_7 Candidate B_8
UpdateItem: “CandidateA_” + rand(0, 10)
ADD 1 to Votes
Candidate A_6 Candidate A_8
Candidate A_5
Votes Table
Votes Table
Shard aggregation
Candidate A_2
Candidate B_1
Candidate B_2
Candidate B_3
Candidate B_5
Candidate B_4
Candidate B_7
Candidate B_6
Candidate A_1
Candidate A_3
Candidate A_4
Candidate A_5
Candidate A_6 Candidate A_8
Candidate A_7 Candidate B_8
Candidate A
Total: 2.5M
1. Sum
2. Store Voter
• Trade off read cost for write scalability
• Consider throughput per partition key and per partition
Shard write-heavy partition keys
Your write workload is not
horizontally scalable
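Write sharding and shard aggregation can be sketched together. In this in-memory imitation a `Counter` stands in for the Votes table, each vote increments a randomly chosen shard key, and a periodic process would sum the shards. The shard count of 8 matches the suffixes drawn on the slides and is an assumption; the function names are hypothetical.

```python
import random
from collections import Counter

SHARDS = 8  # assumed shard count (suffixes _1 to _8)

votes = Counter()  # stands in for the Votes table: partition key -> Votes


def record_vote(candidate: str, rng=random) -> str:
    """Add one vote to a randomly chosen shard of the candidate's counter."""
    shard_key = f"{candidate}_{rng.randint(1, SHARDS)}"
    votes[shard_key] += 1  # plays the role of UpdateItem ... ADD 1 to Votes
    return shard_key


def total_for(candidate: str) -> int:
    """Sum every shard of a candidate, as the periodic process would."""
    return sum(votes[f"{candidate}_{n}"] for n in range(1, SHARDS + 1))
```

Reading a total now takes one read per shard instead of one read in all, which is the read cost traded for write scalability.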
Correctness in voting
UserId Candidate Date
Alice A 2013-10-02
Bob B 2013-10-02
Eve B 2013-10-02
Chuck A 2013-10-02
RawVotes Table
Segment Votes
A_1 23
B_2 12
B_1 14
A_2 25
AggregateVotes Table
1. Record vote and de-dupe; retry 2. Increment candidate counter
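The two steps can be imitated in memory as below. In DynamoDB, step 1 would typically be a PutItem with a condition expression such as `attribute_not_exists(UserId)`, which fails with a conditional check error when the voter already has an item. Here plain dictionaries stand in for the two tables, and the exception class and function names are hypothetical.

```python
class ConditionalCheckFailed(Exception):
    """Raised when the item already exists (stand-in for the service error)."""


raw_votes = {}   # stands in for the RawVotes table, keyed by UserId
aggregate = {}   # stands in for the AggregateVotes table, keyed by Segment


def put_if_absent(user_id: str, candidate: str, date: str) -> None:
    """Step 1: record the vote only if this user has not voted yet."""
    if user_id in raw_votes:
        raise ConditionalCheckFailed(user_id)
    raw_votes[user_id] = {"Candidate": candidate, "Date": date}


def cast_vote(user_id: str, candidate: str, date: str, segment: str) -> bool:
    """Run both steps; return False when the vote is a duplicate."""
    try:
        put_if_absent(user_id, candidate, date)
    except ConditionalCheckFailed:
        return False
    aggregate[segment] = aggregate.get(segment, 0) + 1  # step 2: increment counter
    return True
```

Note that the two steps are separate writes: if the process stops between them, the raw vote exists but the counter was never incremented.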
Correctness in aggregation?
UserId Candidate Date
Alice A 2013-10-02
Bob B 2013-10-02
Eve B 2013-10-02
Chuck A 2013-10-02
RawVotes Table
Segment Votes
A_1 23
B_2 12
B_1 14
A_2 25
AggregateVotes Table
DynamoDB Streams
• Stream of updates to
a table
• Asynchronous
• Exactly once
• Strictly ordered
– Per item
• Highly durable
• Scale with table
• 24-hour lifetime
• Sub-second latency
DynamoDB Streams
View Type Destination
Old image—before update Name = John, Destination = Mars
New image—after update Name = John, Destination = Pluto
Old and new images Name = John, Destination = Mars
Name = John, Destination = Pluto
Keys only Name = John
View types
UpdateItem (Name = John, Destination = Pluto)
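What each view type carries can be sketched with the example update above. The view type names used here (`KEYS_ONLY`, `NEW_IMAGE`, `OLD_IMAGE`, `NEW_AND_OLD_IMAGES`) are the stream settings as I understand them; the function itself and the simplified record layout with plain dictionaries are illustrative only.

```python
VIEW_TYPES = ("KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES")


def stream_record(view_type: str, keys: dict, old_item: dict, new_item: dict) -> dict:
    """Return the parts of an update that a stream record would include."""
    if view_type not in VIEW_TYPES:
        raise ValueError(f"unknown view type: {view_type}")
    record = {"Keys": keys}  # the key attributes are always present
    if view_type in ("OLD_IMAGE", "NEW_AND_OLD_IMAGES"):
        record["OldImage"] = old_item
    if view_type in ("NEW_IMAGE", "NEW_AND_OLD_IMAGES"):
        record["NewImage"] = new_item
    return record


keys = {"Name": "John"}
before = {"Name": "John", "Destination": "Mars"}
after = {"Name": "John", "Destination": "Pluto"}
both = stream_record("NEW_AND_OLD_IMAGES", keys, before, after)
```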
Partition 1
Partition 2
Partition 3
Partition 4
Partition 5
Shard 1
Shard 2
Shard 3
Shard 4
Amazon Kinesis Client
Library Application
Client Application
DynamoDB Streams and
Amazon Kinesis Client Library
DynamoDB Streams
Open Source Cross-
Region Replication Library
Asia Pacific (Sydney) EU (Ireland) Replica
US East (N. Virginia)
Cross-region replication
DynamoDB Streams and AWS Lambda
Real-time voting architecture (improved)
Redshift Amazon EMR
Amazon Kinesis–
Enabled App
Voters RawVotes Table Voting App RawVotes
Analytics with
DynamoDB Streams
• Collect and de-dupe data in DynamoDB
• Aggregate data in-memory and flush periodically
Performing real-time aggregation
and analytics
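The "aggregate in memory and flush" idea can be sketched as a small consumer of stream records. This version flushes after a fixed number of records rather than on a timer, which is a simplification; the class name, the dictionary used as the aggregate store, and the threshold are all assumptions for illustration.

```python
from collections import Counter


class VoteAggregator:
    """Buffer per-candidate counts in memory and flush them in batches."""

    def __init__(self, flush_every: int, sink: dict):
        self.flush_every = flush_every
        self.sink = sink          # stands in for the AggregateVotes table
        self.pending = Counter()  # counts not yet written to the sink
        self.seen = 0

    def on_record(self, candidate: str) -> None:
        """Handle one new vote delivered by the stream."""
        self.pending[candidate] += 1
        self.seen += 1
        if self.seen >= self.flush_every:
            self.flush()

    def flush(self) -> None:
        """Write the buffered counts to the sink, one update per candidate."""
        for candidate, count in self.pending.items():
            self.sink[candidate] = self.sink.get(candidate, 0) + count
        self.pending.clear()
        self.seen = 0
```

Batching turns many single-vote increments into one update per candidate per flush, at the cost of totals lagging slightly behind the raw votes.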
Reference Architecture
Deep Dive on Amazon DynamoDB


  • 1. ©2016, Amazon Web Services, Inc. or its affiliates. All rights reserved Amazon DynamoDB Sean Shriver NoSQL Solutions Architect Amazon Web Services September 2016
  • 2. Agenda • Tables, API, data types, indexes • Scaling • Data modeling • Scenarios and best practices • DynamoDB Streams • Reference architecture
  • 3. Amazon DynamoDB • Managed NoSQL database service • Supports both document and key-value data models • Highly scalable • Consistent, single-digit millisecond latency at any scale • Highly available—3x replication • Simple and powerful API
  • 5. Table Table Items Attributes Partition Key Sort Key Mandatory Key-value access pattern Determines data distribution Optional Model 1:N relationships Enables rich query capabilities All items for a partition key ==, <, >, >=, <= “begins with” “between” sorted results counts top/bottom N values paged responses
  • 6. • CreateTable • UpdateTable • DeleteTable • DescribeTable • ListTables • GetItem • Query • Scan • BatchGetItem • PutItem • UpdateItem • DeleteItem • BatchWriteItem • ListStreams • DescribeStream • GetShardIterator • GetRecords Table and item API Stream API DynamoDB
  • 7. Data types • String (S) • Number (N) • Binary (B) • String Set (SS) • Number Set (NS) • Binary Set (BS) • Boolean (BOOL) • Null (NULL) • List (L) • Map (M) Used for storing nested JSON documents
  • 8. 00 55 A954 AA FF Partition table • Partition key uniquely identifies an item • Partition key is used for building an unordered hash index • Table can be partitioned for scale 00 FF Id = 1 Name = Jim Hash (1) = 7B Id = 2 Name = Andy Dept = Engg Hash (2) = 48 Id = 3 Name = Kim Dept = Ops Hash (3) = CD Key Space
  • 9. Partitions are three-way replicated Id = 2 Name = Andy Dept = Engg Id = 3 Name = Kim Dept = Ops Id = 1 Name = Jim Id = 2 Name = Andy Dept = Engg Id = 3 Name = Kim Dept = Ops Id = 1 Name = Jim Id = 2 Name = Andy Dept = Engg Id = 3 Name = Kim Dept = Ops Id = 1 Name = Jim Replica 1 Replica 2 Replica 3 Partition 1 Partition 2 Partition N
  • 10. Partition-sort key table • Partition key and sort key together uniquely identify an Item • Within unordered partition key-space, data is sorted by the sort key • No limit on the number of items (∞) per partition key – Except if you have local secondary indexes 00:0 FF:∞ Hash (2) = 48 Customer# = 2 Order# = 10 Item = Pen Customer# = 2 Order# = 11 Item = Shoes Customer# = 1 Order# = 10 Item = Toy Customer# = 1 Order# = 11 Item = Boots Hash (1) = 7B Customer# = 3 Order# = 10 Item = Book Customer# = 3 Order# = 11 Item = Paper Hash (3) = CD 55 A9:∞54:∞ AA Partition 1 Partition 2 Partition 3
  • 12. Global secondary index (GSI) • Alternate partition (+sort) key • Index is across all table partition keys GSIs A5 (part.) A4 (sort) A1 (table key) A3 (projected) Table INCLUDE A3 A4 (part.) A5 (sort) A1 (table key) A2 (projected) A3 (projected) ALL A2 (part.) A1 (table key) KEYS_ONLY RCUs/WCUs provisioned separately for GSIs Online Indexing A1 (partition) A2 A3 A4 A5
  • 13. Local secondary index (LSI) • Alternate sort key attribute • Index is local to a partition key A1 (partition) A3 (sort) A2 (table key) A1 (partition) A2 (sort) A3 A4 A5 LSIs A1 (partition) A4 (sort) A2 (table key) A3 (projected) Table KEYS_ONLY INCLUDE A3 A1 (partition) A5 (sort) A2 (table key) A3 (projected) A4 (projected) ALL 10 GB max per partition key, i.e. LSIs limit the # of sort keys!
  • 15. Scaling • Throughput – Provision any amount of throughput to a table • Size – Add any number of items to a table • Max item size is 400 KB • LSIs limit the number of items due to 10 GB limit • Scaling is achieved through partitioning
  • 16. Throughput • Provisioned at the table level – Write capacity units (WCUs) are measured in 1 KB per second – Read capacity units (RCUs) are measured in 4 KB per second • RCUs measure strongly consistent reads • Eventually consistent reads cost 1/2 of strongly consistent reads • Read and write throughput limits are independent WCU RCU
  • 17. Getting the most out of DynamoDB throughput “To get the most out of DynamoDB throughput, create tables where the partition key has a large number of distinct values, and values are requested fairly uniformly, as randomly as possible.” —DynamoDB Developer Guide 1. Key Choice: High key cardinality 2. Uniform Access: access is evenly spread over the key-space 3. Time: requests arrive evenly spaced in time
  • 18. Example: Key Choice or Uniform Access Partition Time Heat
  • 20. How does DynamoDB handle bursts? • DynamoDB saves 300 seconds of unused capacity per partition Bursting is best effort!
  • 21. Burst capacity is built-in 0 400 800 1200 1600 CapacityUnits Time Provisioned Consumed “Save up” unused capacity Consume saved up capacity Burst: 300 seconds (1200 × 300 = 360k CU)
  • 22. Burst capacity may not be sufficient 0 400 800 1200 1600 CapacityUnits Time Provisioned Consumed Attempted Throttled requests Don’t completely depend on burst capacity… provision sufficient throughput Burst: 300 seconds (1200 × 300 = 360k CU)
  • 23. What causes throttling? • If sustained throughput goes beyond provisioned throughput per partition • From the example before: – Table created with 5000 RCUs, 500 WCUs – RCUs per partition = 1666.67 – WCUs per partition = 166.67 – If sustained throughput > (1666 RCUs or 166 WCUs) per key or partition, DynamoDB may throttle requests • Solution: Increase provisioned throughput
  • 24. What causes throttling? • Non-uniform workloads – Hot keys/hot partitions – Very large bursts • Dilution of throughput across partitions caused by mixing hot data with cold data – Use a table per time period for storing time series data so WCUs and RCUs are applied to the hot data set
  • 25. Data Modeling Store data based on how you will access it!
  • 26. 1:1 relationships or key-values • Use a table or GSI with a partition key • Use GetItem or BatchGetItem API Example: Given a user or email, get attributes Users Table Partition key Attributes UserId = bob Email = bob@gmail.com, JoinDate = 2011-11-15 UserId = fred Email = fred@yahoo.com, JoinDate = 2011-12-01 Users-Email-GSI Partition key Attributes Email = bob@gmail.com UserId = bob, JoinDate = 2011-11-15 Email = fred@yahoo.com UserId = fred, JoinDate = 2011-12-01
  • 27. 1:N relationships or parent-children • Use a table or GSI with partition and sort key • Use Query API Example: Given a device, find all readings between epoch X, Y Device-measurements Part. Key Sort key Attributes DeviceId = 1 epoch = 5513A97C Temperature = 30, pressure = 90 DeviceId = 1 epoch = 5513A9DB Temperature = 30, pressure = 90
  • 28. N:M relationships • Use a table and GSI with partition and sort key elements switched • Use Query API Example: Given a user, find all games. Or given a game, find all users. User-Games-Table Part. Key Sort key UserId = bob GameId = Game1 UserId = fred GameId = Game2 UserId = bob GameId = Game3 Game-Users-GSI Part. Key Sort key GameId = Game1 UserId = bob GameId = Game2 UserId = fred GameId = Game3 UserId = bob
  • 29. Documents (JSON) • Data types (M, L, BOOL, NULL) introduced to support JSON • Document SDKs – Simple programming model – Conversion to/from JSON – Java, JavaScript, Ruby, .NET • Cannot create an Index on elements of a JSON object stored in Map – They need to be modeled as top- level table attributes to be used in LSIs and GSIs • Set, Map, and List have no element limit but depth is 32 levels Javascript DynamoDB string S number N boolean BOOL null NULL array L object M
  • 30. Rich expressions • Projection expression – Query/Get/Scan: ProductReviews.FiveStar[0] • Filter expression – Query/Scan: #V > :num (#V is a place holder for keyword VIEWS) • Conditional expression – Put/Update/DeleteItem: attribute_not_exists (#pr.FiveStar) • Update expression – UpdateItem: set Replies = Replies + :num
  • 31. Scenarios and Best Practices
  • 33. Time series tables Events_table_2015_April Event_id (Partition key) Timestamp (sort key) Attribute1 …. Attribute N Events_table_2015_March Event_id (Partition key) Timestamp (sort key) Attribute1 …. Attribute N Events_table_2015_February Event_id (Partition key) Timestamp (sort key) Attribute1 …. Attribute N Events_table_2015_January Event_id (Partition key) Timestamp (sort key) Attribute1 …. Attribute N RCUs = 1000 WCUs = 100 RCUs = 10000 WCUs = 10000 RCUs = 100 WCUs = 1 RCUs = 10 WCUs = 1 Current table Older tables Hot data Cold data Don’t mix hot and cold data; archive cold data to Amazon S3
  • 34. Use a table per time period • Pre-create daily, weekly, monthly tables • Provision required throughput for current table • Writes go to the current table • Turn off (or reduce) throughput for older tables Dealing with time series data
  • 36. Partition 1 2000 RCUs Partition K 2000 RCUs Partition M 2000 RCUs Partition 50 2000 RCUs Scaling bottlenecks Product A Product B Shoppers ProductCatalog Table 100,000 RCUs / 50 partitions ≈ 2000 RCUs per partition SELECT Id, Description, ... FROM ProductCatalog WHERE Id="POPULAR_PRODUCT"
  • 37. RequestsPerSecond Item Primary Key Request Distribution Per Partition Key DynamoDB Requests
  • 38. Partition 1 Partition 2 ProductCatalog Table User DynamoDB User SELECT Id, Description, ... FROM ProductCatalog WHERE Id="POPULAR_PRODUCT"
  • 39. RequestsPerSecond Item Primary Key Request Distribution Per Partition Key DynamoDB Requests Cache Hits
  • 40. Messaging App Large items Filters vs. indexes M:N Modeling—inbox and outbox
  • 41. Messages Table Messages App David SELECT * FROM Messages WHERE Recipient='David' LIMIT 50 ORDER BY Date DESC Inbox SELECT * FROM Messages WHERE Sender ='David' LIMIT 50 ORDER BY Date DESC Outbox
  • 42. Recipient Date Sender Message David 2014-10-02 Bob … … 48 more messages for David … David 2014-10-03 Alice … Alice 2014-09-28 Bob … Alice 2014-10-01 Carol … Large and small attributes mixed (Many more messages) David Messages Table 50 items × 256 KB each Large message bodies Attachments SELECT * FROM Messages WHERE Recipient='David' LIMIT 50 ORDER BY Date DESC Inbox
  • 43. Computing inbox query cost Items evaluated by query Average item size Conversion ratio Eventually consistent reads
  • 44. Recipient Date Sender Subject MsgId David 2014-10-02 Bob Hi!… afed David 2014-10-03 Alice RE: The… 3kf8 Alice 2014-09-28 Bob FW: Ok… 9d2b Alice 2014-10-01 Carol Hi!... ct7r Separate the bulk data Inbox-GSI Messages Table MsgId Body 9d2b … 3kf8 … ct7r … afed … David 1. Query Inbox-GSI: 1 RCU 2. BatchGetItem Messages: 1600 RCU (50 separate items at 256 KB) (50 sequential items at 128 bytes) Uniformly distributes large item reads
  • 46. Simplified writes David PutItem { MsgId: 123, Body: ..., Recipient: Steve, Sender: David, Date: 2014-10-23, ... } Inbox Global secondary index Messages Table
  • 47. Outbox GSI SELECT * FROM Messages WHERE Sender ='David' LIMIT 50 ORDER BY Date DESC
  • 49. • Reduce one-to-many item sizes • Configure secondary index projections • Use GSIs to model M:N relationship between sender and recipient Distribute large items Querying many large items at once Inbox Messages Outbox
  • 50. Multiplayer Online Gaming Query filters vs. composite key indexes
  • 51. GameId Date Host Opponent Status d9bl3 2014-10-02 David Alice DONE 72f49 2014-09-30 Alice Bob PENDING o2pnb 2014-10-08 Bob Carol IN_PROGRESS b932s 2014-10-03 Carol Bob PENDING ef9ca 2014-10-03 David Bob IN_PROGRESS Games Table Multiplayer online game data
  • 52. Query for incoming game requests • DynamoDB indexes provide partition and sort key • What about queries for two equalities and a sort? SELECT * FROM Game WHERE Opponent='Bob' AND Status='PENDING' ORDER BY Date DESC (partition) (sort) (???)
  • 53. Secondary Index Opponent Date GameId Status Host Alice 2014-10-02 d9bl3 DONE David Carol 2014-10-08 o2pnb IN_PROGRESS Bob Bob 2014-09-30 72f49 PENDING Alice Bob 2014-10-03 b932s PENDING Carol Bob 2014-10-03 ef9ca IN_PROGRESS David Approach 1: Query filter Bob
  • 54. Secondary Index Approach 1: Query filter Bob Opponent Date GameId Status Host Alice 2014-10-02 d9bl3 DONE David Carol 2014-10-08 o2pnb IN_PROGRESS Bob Bob 2014-09-30 72f49 PENDING Alice Bob 2014-10-03 b932s PENDING Carol Bob 2014-10-03 ef9ca IN_PROGRESS David SELECT * FROM Game WHERE Opponent='Bob' ORDER BY Date DESC FILTER ON Status='PENDING' (filtered out)
  • 55. Needle in a haystack Bob
  • 56. • Send back less data “on the wire” • Simplify application code • Simple SQL-like expressions – AND, OR, NOT, () Use query filter Your index isn’t entirely selective
  • 57. Approach 2: Composite key StatusDate DONE_2014-10-02 IN_PROGRESS_2014-10-08 IN_PROGRESS_2014-10-03 PENDING_2014-09-30 PENDING_2014-10-03 Status DONE IN_PROGRESS IN_PROGRESS PENDING PENDING Date 2014-10-02 2014-10-08 2014-10-03 2014-10-03 2014-09-30
  • 58. Secondary Index Approach 2: Composite key Opponent StatusDate GameId Host Alice DONE_2014-10-02 d9bl3 David Carol IN_PROGRESS_2014-10-08 o2pnb Bob Bob IN_PROGRESS_2014-10-03 ef9ca David Bob PENDING_2014-09-30 72f49 Alice Bob PENDING_2014-10-03 b932s Carol
  • 59. Opponent StatusDate GameId Host Alice DONE_2014-10-02 d9bl3 David Carol IN_PROGRESS_2014-10-08 o2pnb Bob Bob IN_PROGRESS_2014-10-03 ef9ca David Bob PENDING_2014-09-30 72f49 Alice Bob PENDING_2014-10-03 b932s Carol Secondary Index Approach 2: Composite key Bob SELECT * FROM Game WHERE Opponent='Bob' AND StatusDate BEGINS_WITH 'PENDING'
  • 60. Needle in a sorted haystack Bob
  • 61. Sparse indexes Id (Part.) User Game Score Date Award 1 Bob G1 1300 2012-12-23 2 Bob G1 1450 2012-12-23 3 Jay G1 1600 2012-12-24 4 Mary G1 2000 2012-10-24 Champ 5 Ryan G2 123 2012-03-10 6 Jones G2 345 2012-03-20 Game-scores-table Award (Part.) Id User Score Champ 4 Mary 2000 Award-GSI Scan sparse partition GSIs
  • 62. • Concatenate attributes to form useful secondary index keys • Take advantage of sparse indexes Replace filter with indexes You want to optimize a query as much as possible Status + Date
  • 64. Requirements for voting • Allow each person to vote only once • No changing votes • Real-time aggregation • Voter analytics, demographics
  • 66. Partition 1 1000 WCUs Partition K 1000 WCUs Partition M 1000 WCUs Partition N 1000 WCUs Votes Table Candidate A Candidate B Scaling bottlenecks Voters Provision 200,000 WCUs
  • 67. Write sharding Candidate A_2 Candidate B_1 Candidate B_2 Candidate B_3 Candidate B_5 Candidate B_4 Candidate B_7 Candidate B_6 Candidate A_1 Candidate A_3 Candidate A_4 Candidate A_7 Candidate B_8 Candidate A_6 Candidate A_8 Candidate A_5 Voter Votes Table
  • 68. Write sharding Candidate A_2 Candidate B_1 Candidate B_2 Candidate B_3 Candidate B_5 Candidate B_4 Candidate B_7 Candidate B_6 Candidate A_1 Candidate A_3 Candidate A_4 Candidate A_7 Candidate B_8 UpdateItem: “CandidateA_” + rand(0, 10) ADD 1 to Votes Candidate A_6 Candidate A_8 Candidate A_5 Voter Votes Table
  • 69. Votes Table Shard aggregation Candidate A_2 Candidate B_1 Candidate B_2 Candidate B_3 Candidate B_5 Candidate B_4 Candidate B_7 Candidate B_6 Candidate A_1 Candidate A_3 Candidate A_4 Candidate A_5 Candidate A_6 Candidate A_8 Candidate A_7 Candidate B_8 Periodic Process Candidate A Total: 2.5M 1. Sum 2. Store Voter
  • 70. • Trade off read cost for write scalability • Consider throughput per partition key and per partition Shard write-heavy partition keys Your write workload is not horizontally scalable
  • 71. Correctness in voting UserId Candidate Date Alice A 2013-10-02 Bob B 2013-10-02 Eve B 2013-10-02 Chuck A 2013-10-02 RawVotes Table Segment Votes A_1 23 B_2 12 B_1 14 A_2 25 AggregateVotes Table Voter 1. Record vote and de-dupe; retry 2. Increment candidate counter
  • 72. Correctness in aggregation? UserId Candidate Date Alice A 2013-10-02 Bob B 2013-10-02 Eve B 2013-10-02 Chuck A 2013-10-02 RawVotes Table Segment Votes A_1 23 B_2 12 B_1 14 A_2 25 AggregateVotes Table Voter
  • 74. • Stream of updates to a table • Asynchronous • Exactly once • Strictly ordered – Per item • Highly durable • Scale with table • 24-hour lifetime • Sub-second latency DynamoDB Streams
  • 75. View Type Destination Old image—before update Name = John, Destination = Mars New image—after update Name = John, Destination = Pluto Old and new images Name = John, Destination = Mars Name = John, Destination = Pluto Keys only Name = John View types UpdateItem (Name = John, Destination = Pluto)
  • 76. DynamoDB Streams and Amazon Kinesis Client Library. DynamoDB Client Application: Updates. Table: Partition 1, Partition 2, Partition 3, Partition 4, Partition 5. Stream: Shard 1, Shard 2, Shard 3, Shard 4. Amazon Kinesis Client Library Application: four KCL Workers
  • 77. Cross-region replication. DynamoDB Streams, Open Source Cross-Region Replication Library. Regions shown: Asia Pacific (Sydney), EU (Ireland), US East (N. Virginia). Label: Replica
  • 78. DynamoDB Streams and AWS Lambda
  • 79. Real-time voting architecture (improved): Voters, Voting App, RawVotes Table, RawVotes DynamoDB Stream, Your Amazon Kinesis-Enabled App, AggregateVotes Table, Amazon Redshift, Amazon EMR
  • 80. Real-time voting architecture: Voters, Voting App, RawVotes Table, RawVotes DynamoDB Stream, Your Amazon Kinesis-Enabled App, AggregateVotes Table, Amazon Redshift, Amazon EMR
  • 81. Real-time voting architecture: Voters, Voting App, RawVotes Table, RawVotes DynamoDB Stream, Your Amazon Kinesis-Enabled App, AggregateVotes Table, Amazon Redshift, Amazon EMR
  • 82. Real-time voting architecture: Voters, Voting App, RawVotes Table, RawVotes DynamoDB Stream, Your Amazon Kinesis-Enabled App, AggregateVotes Table, Amazon Redshift, Amazon EMR
  • 83. Real-time voting architecture: Voters, Voting App, RawVotes Table, RawVotes DynamoDB Stream, Your Amazon Kinesis-Enabled App, AggregateVotes Table, Amazon Redshift, Amazon EMR
  • 84. Analytics with DynamoDB Streams: performing real-time aggregation and analytics • Collect and de-dupe data in DynamoDB • Aggregate data in-memory and flush periodically
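The "aggregate in-memory and flush periodically" pattern on slide 84 can be sketched as below. Everything here is an assumption made for illustration: `VoteAggregator` is a hypothetical class, the record shape is a simplified version of a stream record, flushing is triggered by record count rather than by a timer, and `sink` is any callable that would write the deltas to the aggregate table.

```python
from collections import Counter

class VoteAggregator:
    """Buffers vote counts from stream records and flushes them in batches."""

    def __init__(self, flush_every, sink):
        self.flush_every = flush_every  # flush after this many buffered votes
        self.sink = sink                # callable taking {candidate: delta}
        self.pending = Counter()
        self.buffered = 0

    def on_record(self, record):
        """Count only newly inserted votes; ignore other event types."""
        if record.get("EventName") != "INSERT":
            return
        self.pending[record["NewImage"]["Candidate"]] += 1
        self.buffered += 1
        if self.buffered >= self.flush_every:
            self.flush()

    def flush(self):
        """Send the buffered deltas to the sink, then reset the buffer."""
        if self.pending:
            self.sink(dict(self.pending))
            self.pending.clear()
        self.buffered = 0
```

Batching this way turns many per-vote counter updates into one write per candidate per flush, at the cost of the aggregate lagging behind by up to one flush interval.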

Editor's Notes

  1. Consistent, single-digit millisecond latency at any scale. Fast, Consistent Performance [automatic partitioning, SSD technology]; Highly Scalable [store as much data as needed; limits are there for safety]; Fully Managed [choose key schema, table name, provisioned capacity]; Event-Driven Programming [Streams, Lambda/KCL]; Fine-Grained Access Control [item-level, IAM credentials]; Flexible [document, key-value]
  2. Online Indexing
  3. Think of this as a parallel table asynchronously populated by DynamoDB Eventually consistent 1 Table update = 0, 1 or 2 GSI updates
  4. Think of this as a parallel table asynchronously populated by DynamoDB Eventually consistent 1 Table update = 0, 1 or 2 GSI updates
  5. There is a limit in place to avoid runaway apps, but you can request a limit increase. See http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Limits.html for details. For every distinct hash key value, the total size of all table and index items cannot exceed 10 GB. Example error: This request would have caused the ReadCapacityUnits limit to be exceeded for the account in us-west-2. Current ReadCapacityUnits reserved by the account: 237. Limit: 2000. Requested: 5000. Refer to the Amazon DynamoDB Developer Guide for current limits and how to request higher limits. (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: LimitExceededException; Request ID: 0U3G4FLA19BQMELKS4G7RSTH83VV4KQNSO5AEMVJF66Q9ASUAAJG)
  6. Picture from: http://www.amazon.com/Black-Aluminum-Control-Amplifier-Wheel/dp/B005HU1ZHA Scan and Query: cumulative size of processed items, ceiling(Ʃ(item sizes) / 4 KB). BatchGetItem: ceiling(item1 size / 4 KB) + … + ceiling(itemN size / 4 KB). Consumed throughput is measured per operation. Provisioned throughput is divided between all partitions.
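The two formulas in this note differ in where the rounding happens, which a short worked example makes visible. This is a sketch under assumptions: the function names are hypothetical, sizes are in bytes, and any discount for eventually consistent reads is ignored.

```python
import math

READ_UNIT_BYTES = 4 * 1024  # one read unit covers up to 4 KB

def query_read_units(item_sizes):
    """Query/Scan: sum the item sizes first, then round up once."""
    return math.ceil(sum(item_sizes) / READ_UNIT_BYTES)

def batch_get_read_units(item_sizes):
    """BatchGetItem: round each item up to 4 KB, then add the results."""
    return sum(math.ceil(size / READ_UNIT_BYTES) for size in item_sizes)
```

For ten 1 KB items, the Query formula rounds 10 KB up to three read units, while the BatchGetItem formula charges one unit per item, ten in total.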
  7. Uneven access across key space
  8. Per-partition throughput: RCU 1,666.67 provisioned, 500,001 from 300 seconds of unused CU; WCU 166.67 provisioned, 50,000 from 300 seconds of unused CU. This is used when a partition runs out of provisioned throughput due to bursts. Best-effort delivery of burst capacity.
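The burst figures in this note appear to be the per-partition rate multiplied by the 300-second window. The sketch below assumes that reading; `burst_capacity` is a hypothetical helper, not a DynamoDB API.

```python
BURST_WINDOW_SECONDS = 300  # unused capacity is retained for up to 300 seconds

def burst_capacity(per_partition_rate):
    """Capacity units banked if a partition sits idle for the full window."""
    return per_partition_rate * BURST_WINDOW_SECONDS
```

With the note's rates, `burst_capacity(1666.67)` comes to about 500,001 read units and `burst_capacity(166.67)` to about 50,001 write units, so the note's write figure looks rounded.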
  9. Insufficient provisioning of RCU or WCU If sustained throughput goes beyond provisioned throughput per partition
  10. Key/value access patterns (Map, Dictionary…)
  11. http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.SpecifyingConditions.html ProjectionExpression: choose which attributes are returned. FilterExpression: remove items from the response. ConditionExpression: perform the operation only if the condition matches. UpdateExpression: list append, add or subtract something. Example function: attribute_not_exists (#pr.FiveStar). Tokens that begin with the : character are expression attribute values, which are placeholders for the actual value at runtime. "ConditionExpression": "ForumName <> :f and Subject <> :s", "ExpressionAttributeValues": { ":f": {"S": "Amazon DynamoDB"}, ":s": {"S": "How do I update multiple items?"}
  12. Touch on elasticity
  13. Warn: You still consume IO even if the item doesn’t exist
  14. Often some items in your table are accessed more frequently than others. For example, this graph illustrates how many requests per second were made for each item in your table. For collections like a product catalog, some items are substantially more popular than others. The same probably goes for tweets sent by a celebrity. Displaying those “hot items” on everyone’s Twitter feed is problematic because they cause an uneven request distribution. (ask audience) Can anyone think of some things that we can do to “cool off” those hot items?
  15. One thing we can do is cache those reads in the application. Since a tweet doesn’t change once you post it, you can cache it in memory or in something like Amazon ElastiCache. This is a technique used in high throughput applications with a traditional database as well.
  16. 50 messages to read from. Large items are not necessarily expensive on their own, but the cost adds up. Query comes from the same partition.
  17. Put the metadata into a GSI. 128 bytes versus 256KB
  18. Last section was 5:25
  19. This is your base table
  20. filter will drop items, but you still pay for reads
  21. you have to access a lot of items to find yours
  23. Previous section is 5:05
  24. Touch on elasticity
  25. 850 WCU and 450 RCU = 850 KB/s write (1 KB per write unit) and 1,800 KB/s read (4 KB per strongly consistent read unit)
  26. throttle, error
  27. { "Records":[ { "EventName":"INSERT", "EventVersion":"1.0", "EventSource":"aws:dynamodb", "Dynamodb":{ "NewImage":{ "Message":{ "S":"New item!" }, "Id":{ "N":"101" } }, "SizeBytes":26, "StreamViewType":"NEW_AND_OLD_IMAGES", "SequenceNumber":"111", "Keys":{ "Id":{ "N":"101" } } }, "EventID":"1", "eventSourceARN":"arn:aws:dynamodb:us-east-1:acct-id:table/ExampleTableWithStream/stream/stream-id/", "AwsRegion":"us-east-1" }, { "EventName":"MODIFY", "EventVersion":"1.0", "EventSource":"aws:dynamodb", "Dynamodb":{ "NewImage":{ "Message":{ "S":"This item has changed" }, "Id":{ "N":"101" } }, "SizeBytes":59, "StreamViewType":"NEW_AND_OLD_IMAGES", "SequenceNumber":"222", "OldImage":{ "Message":{ "S":"New item!" }, "Id":{ "N":"101" } }, "Keys":{ "Id":{ "N":"101" } } }, "EventID":"2", "eventSourceARN":"arn:aws:dynamodb:us-east-1:acct-id:table/ExampleTableWithStream/stream/stream-id/", "AwsRegion":"us-east-1" }, { "EventName":"REMOVE", "EventVersion":"1.0", "EventSource":"aws:dynamodb", "Dynamodb":{ "SizeBytes":38, "StreamViewType":"NEW_AND_OLD_IMAGES", "SequenceNumber":"333", "OldImage":{ "Message":{ "S":"This item has changed" }, "Id":{ "N":"101" } }, "Keys":{ "Id":{ "N":"101" } } }, "EventID":"3", "eventSourceARN":"arn:aws:dynamodb:eu-west-1:acct-id:table/ExampleTableWithStream/stream/stream-id/", "AwsRegion":"eu-west-1" } ] }
  28. Call out heat management benefit
  29. 4:04
  30. Kinesis Firehose: Elasticsearch, S3, Redshift FIXME