In this presentation you'll learn about the decisions that went into designing and building Amazon DynamoDB, and how it lets you stay focused on your application while enjoying single-digit millisecond latencies at any scale. We'll dive deep on how to model data, maintain maximum throughput, and drive analytics against your data, while profiling real-world use cases and tips and tricks from customers running on Amazon DynamoDB today.
Phil Fitzsimons, Solution Architect, AWS
Rob Greig, CTO, Royal Opera House
48. id = 100 date = 2012-05-16-09-00-10 total = 25.00
id = 101 date = 2012-05-15-15-00-11 total = 35.00
id = 101 date = 2012-05-16-12-00-10 total = 100.00
49. Table
id = 100 date = 2012-05-16-09-00-10 total = 25.00
id = 101 date = 2012-05-15-15-00-11 total = 35.00
id = 101 date = 2012-05-16-12-00-10 total = 100.00
50. Item
id = 100 date = 2012-05-16-09-00-10 total = 25.00
id = 101 date = 2012-05-15-15-00-11 total = 35.00
id = 101 date = 2012-05-16-12-00-10 total = 100.00
51. Attribute
id = 100 date = 2012-05-16-09-00-10 total = 25.00
id = 101 date = 2012-05-15-15-00-11 total = 35.00
id = 101 date = 2012-05-16-12-00-10 total = 100.00
52. Where is the schema?
Tables do not require a formal schema.
Each item is a hash with an arbitrary number of attributes.
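A minimal sketch of what "no formal schema" means in practice, using plain Python dicts as a stand-in for items. This is a toy model, not the DynamoDB API; the `coupon` attribute and the `attribute_names` helper are hypothetical and exist only for illustration.

```python
# Toy model: a table is a list of items, each item a plain dict of attributes.
# Nothing forces two items to share the same attribute names.
orders = [
    {"id": 100, "date": "2012-05-16-09-00-10", "total": 25.00},
    # "coupon" is a hypothetical extra attribute present on one item only.
    {"id": 101, "date": "2012-05-15-15-00-11", "total": 35.00, "coupon": "SPRING"},
]


def attribute_names(table):
    """Return the sorted union of attribute names seen across all items."""
    names = set()
    for item in table:
        names.update(item)
    return sorted(names)


print(attribute_names(orders))
```

The second item carries an attribute the first one lacks, and no table-level definition had to change to allow it.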
53. Indexing.
Items are indexed by primary and secondary keys.
Primary keys can be composite.
Secondary keys index on other attributes.
54. ID Date Total
id = 100 date = 2012-05-16-09-00-10 total = 25.00
id = 101 date = 2012-05-15-15-00-11 total = 35.00
id = 101 date = 2012-05-16-12-00-10 total = 100.00
id = 102 date = 2012-03-20-18-23-10 total = 20.00
id = 102 date = 2012-03-20-18-23-10 total = 120.00
55. Hash key
ID Date Total
id = 100 date = 2012-05-16-09-00-10 total = 25.00
id = 101 date = 2012-05-15-15-00-11 total = 35.00
id = 101 date = 2012-05-16-12-00-10 total = 100.00
id = 102 date = 2012-03-20-18-23-10 total = 20.00
id = 102 date = 2012-03-20-18-23-10 total = 120.00
56. Hash key Range key
ID Date Total
Composite primary key
id = 100 date = 2012-05-16-09-00-10 total = 25.00
id = 101 date = 2012-05-15-15-00-11 total = 35.00
id = 101 date = 2012-05-16-12-00-10 total = 100.00
id = 102 date = 2012-03-20-18-23-10 total = 20.00
id = 102 date = 2012-03-20-18-23-10 total = 120.00
57. Hash key Range key Secondary range key
ID Date Total
id = 100 date = 2012-05-16-09-00-10 total = 25.00
id = 101 date = 2012-05-15-15-00-11 total = 35.00
id = 101 date = 2012-05-16-12-00-10 total = 100.00
id = 102 date = 2012-03-20-18-23-10 total = 20.00
id = 102 date = 2012-03-20-18-23-10 total = 120.00
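The key roles on the last few slides can be sketched with a small in-memory model. This is an illustration only, assuming a hypothetical `ToyTable` class; it is not how DynamoDB is implemented and not its client API. The hash key picks out a group of items, the range key orders them, and a secondary range key offers an alternative ordering within the same hash key.

```python
class ToyTable:
    """Hypothetical in-memory table addressed by a composite (hash, range) key."""

    def __init__(self, hash_key, range_key):
        self.hash_key = hash_key
        self.range_key = range_key
        self.items = {}  # (hash value, range value) -> item

    def put(self, item):
        # An item with the same composite key replaces the earlier one.
        key = (item[self.hash_key], item[self.range_key])
        self.items[key] = item

    def get(self, hash_value, range_value):
        return self.items.get((hash_value, range_value))

    def query(self, hash_value, order_by=None):
        """Items for one hash key, sorted by the range key or by another attribute."""
        sort_attr = order_by or self.range_key
        matches = [i for (h, _), i in self.items.items() if h == hash_value]
        return sorted(matches, key=lambda i: i[sort_attr])


table = ToyTable(hash_key="id", range_key="date")
table.put({"id": 100, "date": "2012-05-16-09-00-10", "total": 25.00})
table.put({"id": 101, "date": "2012-05-16-12-00-10", "total": 100.00})
table.put({"id": 101, "date": "2012-05-15-15-00-11", "total": 35.00})

by_date = table.query(101)                     # ordered by the range key
by_total = table.query(101, order_by="total")  # ordered by a secondary attribute
```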
63. One API call, multiple items
BatchGetItem returns multiple items by key.
BatchWriteItem performs up to 25 put or delete operations.
Throughput is measured by I/O (the items read and written), not by the number of API calls.
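Because a single batch write carries at most 25 operations, a larger set of writes has to be split on the client side. A minimal sketch of that chunking, assuming a hypothetical request shape and helper name; no AWS call is made here.

```python
MAX_BATCH_WRITE = 25  # per-call limit stated on the slide


def chunk_requests(requests, size=MAX_BATCH_WRITE):
    """Yield consecutive slices of at most `size` requests."""
    for start in range(0, len(requests), size):
        yield requests[start:start + size]


# 60 hypothetical put requests; the dict shape is illustrative only.
puts = [{"put": {"id": n}} for n in range(60)]
batches = list(chunk_requests(puts))
print([len(b) for b in batches])
```

Batching reduces the number of round trips, but each item written still counts against provisioned throughput.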
65. Query vs Scan
Use Query for lookups against a composite key.
Use Scan for full table scans and exports.
Both support pages and limits.
Maximum response size is 1 MB.
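Since each response is capped in size, a caller reads a large result set page by page, resuming from where the previous page stopped. The loop below sketches that pattern against a plain list. `fetch_page` is a hypothetical stand-in for one Query or Scan call, and the integer resume position plays the role that the service's last-evaluated key plays in a real response.

```python
def fetch_page(rows, start=0, limit=2):
    """Return one page of rows plus the position to resume from (None when done)."""
    page = rows[start:start + limit]
    next_start = start + limit if start + limit < len(rows) else None
    return page, next_start


def fetch_all(rows, limit=2):
    """Keep requesting pages until no resume position is returned."""
    results, start = [], 0
    while start is not None:
        page, start = fetch_page(rows, start, limit)
        results.extend(page)
    return results
```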
66. Query patterns
Retrieve all items by hash key.
Range key conditions:
==, <, >, >=, <=, begins with, between.
Counts. Top and bottom n values.
Paged responses.
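The range key conditions above rely on items under one hash key being kept in range key order. A toy sketch over a sorted list of date strings, assuming all four dates belong to one hypothetical hash key; the helper names are illustrative and not part of any DynamoDB client.

```python
from bisect import bisect_left, bisect_right

# Range key values for a single (hypothetical) hash key, kept sorted.
dates = sorted([
    "2012-05-16-12-00-10",
    "2012-03-20-18-23-10",
    "2012-05-15-15-00-11",
    "2012-05-16-09-00-10",
])


def between(keys, low, high):
    """Keys k with low <= k <= high, found by binary search on the sorted list."""
    return keys[bisect_left(keys, low):bisect_right(keys, high)]


def begins_with(keys, prefix):
    """Keys that start with the given prefix."""
    return [k for k in keys if k.startswith(prefix)]


def top_n(keys, n):
    """The n largest keys, largest first."""
    return sorted(keys, reverse=True)[:n]


def bottom_n(keys, n):
    """The n smallest keys, smallest first."""
    return sorted(keys)[:n]
```

For example, `begins_with(dates, "2012-05-16")` selects every entry from that day, which is why timestamps formatted from most to least significant field work well as range keys.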
82. Uniform workload.
Data stored across multiple partitions.
Data is distributed across partitions primarily by hash key.
Provisioned throughput is divided evenly across partitions.
83. To achieve and maintain full
provisioned throughput, spread
workload evenly across hash keys.
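A back-of-the-envelope sketch of why uneven access costs throughput. It assumes a simplified model in which every partition receives an equal share of the provisioned units and the busiest partition is the first to hit its limit. The figures (1000 units, 4 partitions) and the function names are hypothetical.

```python
def per_partition_throughput(provisioned_units, partitions):
    """Equal share of provisioned throughput assigned to each partition."""
    return provisioned_units / partitions


def achievable_throughput(provisioned_units, traffic_share_by_partition):
    """Total request rate reachable before the busiest partition exceeds its share.

    traffic_share_by_partition lists the fraction of all requests that
    lands on each partition; the fractions should sum to 1.
    """
    partitions = len(traffic_share_by_partition)
    share = per_partition_throughput(provisioned_units, partitions)
    return share / max(traffic_share_by_partition)


even = achievable_throughput(1000, [0.25, 0.25, 0.25, 0.25])
skewed = achievable_throughput(1000, [0.7, 0.1, 0.1, 0.1])
print(even, round(skewed))
```

Under this model an even workload can use all 1000 provisioned units, while a workload that sends 70% of requests to one partition tops out at roughly 357.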
85. BEST PRACTICE 1:
Distinct values for hash keys.
Hash key elements should have a
high number of distinct values.
86. Lots of users with unique user_id.
Workload well distributed across hash key.
user_id = mza first_name = Matt last_name = Wood
user_id = jeffbarr first_name = Jeff last_name = Barr
user_id = werner first_name = Werner last_name = Vogels
user_id = simone first_name = Simone last_name = Brunozzi
...
87. BEST PRACTICE 2:
Avoid limited hash key values.
Hash key elements should have a
high number of distinct values.
88. Small number of status codes.
Uneven, non-uniform workload.
status = 200 date = 2012-04-01-00-00-01
status = 404 date = 2012-04-01-00-00-01
status = 404 date = 2012-04-01-00-00-01
status = 404 date = 2012-04-01-00-00-01
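The contrast between the two hash key choices can be sketched with a small simulation. It assumes a hypothetical placement rule (an MD5 digest of the hash key modulo four partitions), which is an illustration only and not DynamoDB's actual partitioning scheme. The request mixes are made up.

```python
import hashlib
from collections import Counter


def partition_for(hash_key, partitions=4):
    """Map a hash key to a partition number with a stable digest (assumed rule)."""
    digest = hashlib.md5(str(hash_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % partitions


def load_by_partition(hash_keys, partitions=4):
    """Count how many requests land on each partition."""
    return Counter(partition_for(k, partitions) for k in hash_keys)


# 1000 requests, each for a different user_id: many distinct hash key values.
user_requests = [f"user-{n}" for n in range(1000)]
# 1000 requests keyed by status code: only two distinct hash key values.
status_requests = [200] * 100 + [404] * 900

print(load_by_partition(user_requests))
print(load_by_partition(status_requests))
```

With unique user IDs the requests spread over every partition. With status codes at most two partitions ever receive traffic, and the one holding 404 takes at least 90% of it.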
89. BEST PRACTICE 3:
Model for even distribution.
Access by hash key value should be evenly
distributed across the dataset.
90. Large number of devices.
A small number are much more popular than the others.
Workload unevenly distributed.
mobile_id = 100 access_date = 2012-04-01-00-00-01
mobile_id = 100 access_date = 2012-04-01-00-00-02
mobile_id = 100 access_date = 2012-04-01-00-00-03
mobile_id = 100 access_date = 2012-04-01-00-00-04
...