Explore Amazon DynamoDB capabilities and benefits in detail and learn how to get the most out of your DynamoDB database. We go over best practices for schema design with DynamoDB across multiple use cases, including gaming, AdTech, IoT, and others. We explore designing efficient indexes, scanning, and querying, and go into detail on a number of recently released features, including JSON document support, DynamoDB Streams, and more. We also provide lessons learned from operating DynamoDB at scale, including provisioning DynamoDB for IoT.
2. Plan
Getting Started: Amazon DynamoDB, Tables, Indexes, Partitioning
New Features: TTL, VPCe, DAX
Dating Website: DAX and GSIs
Serverless IoT: TTL, Streams, and DAX
Developer Resources
3. Amazon DynamoDB
Highly available
Consistent, single-digit millisecond latency at any scale
Fully managed
Secure
Integrates with AWS Lambda, Amazon Redshift, and more
8. Data types, table creation options, provisioned capacity

Data Types
Type            DynamoDB Type
String          String
Integer, Float  Number
Timestamp       Number or String
Blob            Binary
Boolean         Bool
Null            Null
List            List
Set             Set of String, Number, or Binary
Map             Map

CreateTable
TableName (required; unique to account and region)
PartitionKey, Type: AttributeName [S,N,B] (required; String, Number, Binary ONLY)
SortKey, Type: AttributeName [S,N,B] (optional)
Provisioned Reads: 1+ (per second)
Provisioned Writes: 1+ (per second)
LSI Schema, GSI Schema (optional)

Provisioned capacity
Read Capacity Unit (RCU): 1 RCU returns 4KB of data for strongly consistent reads, or double the data at the same cost for eventually consistent reads.
Write Capacity Unit (WCU): 1 WCU writes 1KB of data, and each item consumes 1 WCU minimum.
Capacity is per second, rounded up to the next whole number.
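The RCU/WCU arithmetic above can be sketched as a small sizing helper. A minimal sketch; the function names are my own, not an AWS API:

```python
import math

def rcus_per_read(item_size_bytes, strongly_consistent=True):
    """RCUs for one read: 1 RCU per 4 KB (rounded up) when strongly
    consistent; eventually consistent reads cost half as much."""
    units = max(1, math.ceil(item_size_bytes / 4096))
    return units if strongly_consistent else units / 2

def wcus_per_write(item_size_bytes):
    """WCUs for one write: 1 WCU per 1 KB (rounded up), 1 WCU minimum."""
    return max(1, math.ceil(item_size_bytes / 1024))

# A 6 KB item: 2 RCUs strongly consistent, 1.0 eventually consistent, 6 WCUs.
print(rcus_per_read(6 * 1024), rcus_per_read(6 * 1024, False), wcus_per_write(6 * 1024))
```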
10. CustomerOrdersTable

[Diagram: keyspace from Hash.MIN = 00 to Hash.MAX = FF, initially split evenly across Partitions A, B, and C (each 33.33% of keyspace and 33.33% of provisioned capacity). Over time, a partition that grows too large splits in two (e.g., Partition C into Partitions D and E, each at 16.66%), and a capacity increase can split every partition (A-C become A-F, each at 16.66%).]

Split for partition size: the desired size of a partition is 10GB* and when a partition surpasses this it can split.

Split for provisioned capacity: the desired capacity of a partition is expressed as 3w + 1r < 3000*, where w = WCU and r = RCU.

*=subject to change
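Those two split rules can be combined into a rough partition-count estimate. An illustrative sketch of the subject-to-change formulas above, not an official calculation:

```python
import math

def estimated_partitions(rcu, wcu, size_gb):
    """Rough partition count: each partition holds ~10 GB and sustains
    3w + 1r < 3000, so the table needs the max of both requirements."""
    by_capacity = math.ceil((3 * wcu + 1 * rcu) / 3000)
    by_size = math.ceil(size_gb / 10)
    return max(by_capacity, by_size, 1)

# 5000 RCU + 500 WCU on 8 GB: capacity dominates, ceil(6500 / 3000) = 3.
print(estimated_partitions(5000, 500, 8))  # → 3
```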
11. Partitioning

CustomerOrdersTable

[Diagram: Partitions A, B, and C, each with 1000 RCUs and 100 WCUs, replicated across Availability Zones A, B, and C on separate hosts. An item with OrderId: 1, CustomerId: 1, ASIN: [B00X4WHP5E] hashes to 7B (Hash(1) = 7B) and lands on Partition B in all three zones.]

Data is replicated to three Availability Zones by design: 3-way replication.
12. DynamoDB Streams

Ordered stream of item changes
Exactly once, strictly ordered by key
Highly durable, scalable
24 hour retention
Sub-second latency
Compatible with the Amazon Kinesis Client Library (KCL)

Shards have a lineage and automatically close after time or when the associated DynamoDB partition splits.

[Diagram: updates to Partitions A, B, and C of an Amazon DynamoDB table flow into DynamoDB Streams shards 1, 2, and 3; KCL workers in a Kinesis Client Library application call GetRecords to consume them.]
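A stream consumer (a KCL worker or a Lambda function) sees each change exactly once and in key order, so replaying records rebuilds table state. A minimal local sketch, assuming records in the shape DynamoDB Streams emits (eventName, dynamodb.Keys, dynamodb.NewImage):

```python
import json

def replay_stream(state, records):
    """Apply a batch of DynamoDB Streams records to a local dict keyed by
    the item's serialized key: INSERT/MODIFY set the new image, REMOVE deletes."""
    for rec in records:
        key = json.dumps(rec["dynamodb"]["Keys"], sort_keys=True)
        if rec["eventName"] == "REMOVE":
            state.pop(key, None)
        else:  # INSERT and MODIFY carry a NewImage
            state[key] = rec["dynamodb"]["NewImage"]
    return state

records = [
    {"eventName": "INSERT",
     "dynamodb": {"Keys": {"OrderId": {"N": "1"}},
                  "NewImage": {"OrderId": {"N": "1"}, "Status": {"S": "open"}}}},
    {"eventName": "REMOVE",
     "dynamodb": {"Keys": {"OrderId": {"N": "1"}}}},
]
print(replay_stream({}, records))  # the insert is undone by the remove: {}
```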
13. Time-To-Live (TTL)

An epoch timestamp marking when an item can be deleted by a background process, without consuming any provisioned capacity. Removes data that is no longer relevant.

[Diagram: the TTL job deletes an expired item (OrderId: 1, CustomerId: 1, MyTTL: 1492641900) from the CustomerActiveOrder table in Amazon DynamoDB; the deletion flows through DynamoDB Streams to Amazon Kinesis and Amazon Redshift.]
14. Time-To-Live (TTL)

TTL items are identifiable in DynamoDB Streams
Configuration protected by AWS Identity and Access Management (IAM), auditable with AWS CloudTrail
Eventual deletion, free to use
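Storing a TTL attribute is just writing an epoch-seconds number on the item. A small helper for the 90-day expiry used later in this deck; the helper name is mine:

```python
import time

def ttl_epoch(days_from_now):
    """Epoch-seconds value to store in the TTL attribute (e.g. MyTTL):
    DynamoDB deletes the item some time after this timestamp passes."""
    return int(time.time()) + days_from_now * 24 * 60 * 60

# A DynamoDB item (wire format) that expires 90 days from now.
item = {"OrderId": {"N": "1"}, "MyTTL": {"N": str(ttl_epoch(90))}}
print(item["MyTTL"])
```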
16. DynamoDB in the VPC

[Diagram: web app servers in private subnets across Availability Zones #1 and #2, each behind a security group, reach DynamoDB either through a DAX cluster or through a VPC endpoint; AWS Lambda connects the same way.]

DAX
o Microsecond latency, in-memory cache
o Millions of requests per second
o Fully managed, highly available
o Role-based access control
o No IGW or VPC endpoint required

VPC Endpoints
o DynamoDB-in-the-VPC
o IAM resource policy restricted
17. DynamoDB Accelerator (DAX)

FQDN endpoint for discovery
Supports the AWS Java SDK at launch, with more AWS SDKs to come
Cluster based, Multi-AZ
Separate query and item caches
18. DynamoDB table key choice
To get the most out of DynamoDB throughput, create tables where the partition key has a large
number of distinct values, and values are requested fairly uniformly, as randomly as possible.
Amazon DynamoDB Developer Guide
19. Elements of even access

[Diagram: heat maps of partitions over time, showing hot spots whenever one element is violated.]

1. Key choice: high key cardinality
2. Uniform access: access is evenly spread over the key-space
3. Time: requests arrive evenly spaced in time

Even access: all three at once.
20. Use burst sparingly

DynamoDB “saves” 300 seconds of unused capacity per partition. When consumed capacity drops below provisioned, the difference is saved up; a later spike can consume the saved capacity (e.g., 1200 CU × 300 s = 360k CU of burst).

[Charts: provisioned vs. consumed capacity units over time. In the first, consumption below the provisioned line saves up capacity that a later spike draws down. In the second, attempted throughput exceeds both provisioned and saved capacity, and requests are throttled.]

Don’t completely depend on burst capacity: provision sufficient throughput.
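The burst behavior sketched in those charts can be modeled as a token bucket capped at 300 seconds of provisioned capacity. An illustrative simulation only; real per-partition behavior is subject to change:

```python
def simulate_burst(provisioned, demand_per_second, burst_seconds=300):
    """Token-bucket sketch of per-partition burst: unused capacity accrues
    up to burst_seconds * provisioned; spikes draw it down; anything beyond
    provisioned + saved capacity is throttled."""
    bucket_cap = burst_seconds * provisioned
    saved = 0.0
    throttled = []
    for demand in demand_per_second:
        if demand <= provisioned:
            saved = min(bucket_cap, saved + (provisioned - demand))
            throttled.append(0.0)
        else:
            deficit = demand - provisioned
            used = min(deficit, saved)
            saved -= used
            throttled.append(deficit - used)
    return throttled

# Three idle seconds save 3 * 400 CU, enough to absorb one 1600-CU spike.
print(simulate_burst(400, [0, 0, 0, 1600]))  # [0.0, 0.0, 0.0, 0.0]
```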
21. What causes throttling?

Throttling occurs if sustained throughput goes beyond provisioned throughput on a partition. A throttle comes from a partition: in Amazon CloudWatch, if consumed capacity is well under provisioned and throttling still occurs, it must be partition throttling.

To diagnose, disable retries, write your own retry code, and log all throttled or returned keys.

Top items (hot keys skew access to a few partitions):
• Fire TV Stick
• Echo Dot – Black
• Amazon Fire TV
• Amazon Echo – Black
• Fire HD 8
• Echo Dot – White
• Kindle Paperwhite
• Fire Tablet with Alexa
• Fire HD 8 Tablet with A…
• Fire HD 8 Tablet with A…
22. Elastic is the new normal

[Chart: consumed read and write capacity units over time, with spikes of >200% and >300% above baseline.]
25. Dating Website
DESIGN PATTERNS: DynamoDB Accelerator and GSIs

Online dating website running on AWS
Users have people they like, and conversely people who like them
Hourly batch job matches users
Data stored in Likes and Matches tables
26. Schema Design

LIKES
Requirements:
1. Get all people I like
2. Get all people that like me
3. Expire likes after 90 days

Table keys: user_id_self (partition key), user_id_other (sort key), MyTTL (TTL attribute), … Attribute N
GSI Other: user_id_other (partition key), user_id_self (sort key)

MATCHES
Requirements:
1. Get my matches

Table keys: event_id (partition key), timestamp (sort key), UserIdLeft, UserIdRight, Attribute N
GSI Left: UserIdLeft (partition key), with event_id and timestamp (table keys) and UserIdRight
GSI Right: UserIdRight (partition key), with event_id and timestamp (table keys) and UserIdLeft
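GSI Other answers requirement 2 by inverting the table key: DynamoDB maintains a second view of Likes keyed on (user_id_other, user_id_self). A local model of that inversion (not a boto3 call; names follow the slide):

```python
def build_inverted_index(items, pk, sk):
    """Model a DynamoDB GSI locally: re-key the same items on (pk, sk).
    For 'GSI Other', user_id_other becomes the partition key."""
    index = {}
    for item in items:
        index.setdefault(item[pk], []).append(item)
    for entries in index.values():
        entries.sort(key=lambda i: i[sk])  # GSI entries sort on the sort key
    return index

likes = [
    {"user_id_self": "alice", "user_id_other": "bob"},
    {"user_id_self": "carol", "user_id_other": "bob"},
    {"user_id_self": "bob", "user_id_other": "alice"},
]
# "Get all people that like me" for bob = query GSI Other by user_id_other.
gsi_other = build_inverted_index(likes, "user_id_other", "user_id_self")
print([i["user_id_self"] for i in gsi_other["bob"]])  # ['alice', 'carol']
```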
27. Matchmaking

Requirements:
1. Get all new likes every hour
2. For each like, get the other user’s likes
3. Store matches in Matches table

[Diagram: matchmaking servers in an Auto Scaling group (public subnet, behind a security group) read the LIKES table across Partitions 1…N.]
28. Matchmaking

Requirements:
1. Get all new likes every hour
2. For each like, get the other user’s likes
3. Store matches in Matches table

[Diagram: the same batch job, but the hourly burst of reads against the LIKES table partitions causes a THROTTLE!]
29. Matchmaking

Requirements:
1. Get all new likes every hour
2. For each like, get the other user’s likes
3. Store matches in Matches table

Even access:
1. Key choice: high key cardinality
2. Uniform access: access is evenly spread over the key-space
3. Time: requests arrive evenly spaced in time

The hourly batch violates the third element: requests arrive in bursts, not evenly spaced in time.
30. Matchmaking

Requirements:
0. Write like to Likes table, then query by user id to warm the cache, then queue for batch processing
1. Get all new likes every hour
2. For each like, get the other user’s likes
3. Store matches in Matches table

[Diagram: matchmaking servers in an Auto Scaling group (private subnet, behind a security group) now read through a DAX cluster in front of the LIKES table’s Partitions 1…N.]
31. Takeaways: Dating Website
DESIGN PATTERNS: DynamoDB Accelerator and GSIs

Keep DAX warm by querying after writing
Use GSIs for many-to-many relationships
32. Serverless IoT
DESIGN PATTERNS: TTL, DynamoDB Streams, and DAX

Single Amazon DynamoDB table for storing sensor data
Tiered storage to archive old events to S3
Data stored in the Data table
33. Schema Design

DATA
Requirements:
1. Get all events for a device
2. Archive old events after 90 days

Keys: DeviceId (partition key), EventEpoch (sort key), MyTTL (TTL attribute), … Attribute N

USERDEVICES
Requirements:
1. Get all devices for a user

Keys: UserId (partition key), DeviceId (sort key), Attribute 1 … Attribute N
34. Serverless IoT

Single DynamoDB table for storing sensor data
Tiered storage to archive old events to S3
Data stored in the Data table

[Diagram: an expired DATA item (DeviceId: 1, EventEpoch: 1492641900, MyTTL: 1492736400) is removed by TTL at expiry; the record flows from Amazon DynamoDB through Amazon DynamoDB Streams to AWS Lambda, which archives it to an Amazon S3 bucket. The USERDEVICES table sits alongside.]
35. Serverless IoT

[Diagram: DATA table Partitions A, B, C, and D; one partition is throttling.]

Throttling: a noisy sensor produces data at a rate several times greater than the others.
36. Data

[Diagram: keyspace from Hash.MIN = 00 to Hash.MAX = FF, split at 3F, 7F, and BF into Partitions A, B, C, and D, each with 25.0% of the keyspace and 25.0% of the provisioned capacity.]
37. Serverless IoT: Naïve Sharding

Requirements:
1. Single DynamoDB table for storing sensor data
2. Tiered storage to archive old events to S3
3. Data stored in the Data table
0. Capable of dynamically sharding to overcome throttling

DATA
Requirements:
1. Get all events for a device
2. Archive old events after 90 days

Keys: DeviceId_ShardId (partition key), EventEpoch (sort key), MyTTL (TTL attribute), … Attribute N

SHARD
Requirements:
1. Get shard count for a given device
2. Always grow the count of shards

Keys: DeviceId (partition key), ShardCount (range: 0..1,000)

Naïve sharding: a sharding scheme where the number of shards is not predefined, and will grow over time but never contract. Contrast with a fixed shard count.
38. Serverless IoT: Naïve Sharding

Request path:
1. Read ShardCount from the SHARD table
2. Write to a random shard
3. If throttled, review shard count

Example items:
SHARD — DeviceId: 1, ShardCount: 10
DATA — DeviceId_ShardId: 1_3, EventEpoch: 1492641900, MyTTL: 1492736400 (Expiry)
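The request path above can be sketched end to end. A local model under the slide's naming (DeviceId_ShardId, ShardCount), not a boto3 call:

```python
import random

def write_shard_key(device_id, shard_count):
    """Step 2: compose the DATA-table partition key DeviceId_ShardId by
    picking a random shard in [0, shard_count)."""
    return f"{device_id}_{random.randrange(shard_count)}"

def all_shard_keys(device_id, shard_count):
    """Read path: to get every event for a device, query each shard key."""
    return [f"{device_id}_{i}" for i in range(shard_count)]

shard_count = 10  # step 1: read ShardCount from the SHARD table
print(write_shard_key("1", shard_count))  # e.g. '1_3'
print(all_shard_keys("1", 3))             # ['1_0', '1_1', '1_2']
```

Because the shard count only ever grows, readers can always enumerate 0..ShardCount-1 and be sure no shard is missed.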
39. Serverless IoT

Pick a random shard to write data to:
1. SHARD — DeviceId: 1, ShardCount: 10
2. DATA — DeviceId_ShardId: 1_Rand(0,10), EventEpoch: 1492641900, MyTTL: 1492736400

[Diagram: the write lands on one of the DATA table’s Partitions A–D, chosen by the random shard suffix.]
40. Serverless IoT

Single DynamoDB table for storing sensor data
Tiered storage to archive old events to S3
Data stored in the Data table
Capable of dynamically sharding to overcome throttling

[Diagram: writes land in the DATA table (DeviceId: 1, EventEpoch: 1492641900, MyTTL: 1492736400) in Amazon DynamoDB, with the SHARD table (DeviceId: 1, ShardCount: 10) read through DAX; expired items flow through Amazon DynamoDB Streams to AWS Lambda and on to an Amazon S3 bucket (+ Amazon Kinesis Firehose); the USERDEVICES table sits alongside.]
41. Takeaways: Serverless IoT
DESIGN PATTERNS: TTL, DynamoDB Streams, and DAX

Use naïve write sharding to dynamically expand shards
Use DAX for hot reads, especially from Lambda
Use TTL to create tiered storage