AWS Webcast - Optimize your database for the cloud with DynamoDB – A Deep Dive into Global Secondary Indexes
1. Optimize Your Database for the Cloud
with DynamoDB
A Deep Dive into
Global Secondary Indexes (GSI)
David Pearson
Siva Raghupathy
© 2011 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified or distributed in whole or in part without the express consent of Amazon.com, Inc.
2. database service = automated operations, predictable performance, durable low latency, cost effective
3. Durable Low Latency
WRITES
Continuously replicated to 3 AZs
Quorum acknowledgment
Persisted to disk (custom SSD)
READS
Strongly or eventually consistent
No trade-off in latency
4. Recent Announcements
Secondary Indexes (Local and Global)
DynamoDB Local
• Disconnected development with full API support
• No network
• No usage costs
• No SLA
Fine-Grained Access Control
• Direct-to-DynamoDB access for mobile devices
Geospatial and Transaction Libraries
5. DynamoDB Concepts
table
6. DynamoDB Concepts
table
items
8. DynamoDB Concepts
hash
hash keys
mandatory for all items in a table
key-value access pattern
9. DynamoDB Concepts
partition 1 .. N
hash keys
mandatory for all items in a table
key-value access pattern
determines data distribution
10. DynamoDB Concepts
hash
range
range keys
model 1:N relationships
enable rich query capabilities
composite primary key
all items for a hash key
==, <, >, >=, <=
“begins with”
“between”
sorted results
counts
top / bottom N values
paged responses
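The range-key capabilities listed on this slide can be illustrated with a small in-memory model. This is only a sketch and does not call DynamoDB: the `query` helper, the sample items and their names are hypothetical.

```python
from typing import Callable, Optional

# Hypothetical in-memory "table": (hash key, range key) pairs.
ITEMS = [
    ("bob", "report-2013.pdf"),
    ("bob", "photo-1.jpg"),
    ("bob", "photo-2.jpg"),
    ("fred", "notes.txt"),
]

def query(hash_value: str,
          range_condition: Optional[Callable[[str], bool]] = None,
          forward: bool = True,
          limit: Optional[int] = None) -> list:
    """Return the range keys stored under one hash key, filtered, sorted and truncated."""
    matches = [r for h, r in ITEMS
               if h == hash_value and (range_condition is None or range_condition(r))]
    matches.sort(reverse=not forward)  # results are ordered by range key
    return matches if limit is None else matches[:limit]

all_bob = query("bob")                                    # all items for a hash key
photos = query("bob", lambda r: r.startswith("photo-"))   # "begins with"
between = query("bob", lambda r: "p" <= r <= "q")         # "between"
last_one = query("bob", forward=False, limit=1)           # top N values
count = len(query("bob"))                                 # counts
```

Every call names exactly one hash key value; only the condition on the range key varies.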
11. DynamoDB Concepts
local secondary indexes (LSI)
alternate range key + same hash key
index and table data is co-located (same partition)
12. LSI Attribute Projections
Table
• A1 (hash), A2 (range), A3, A4, A5
LSIs
• KEYS_ONLY: A1 (hash), A3 (range), A2 (table key)
• INCLUDE A3: A1 (hash), A4 (range), A2 (table key), A3 (projected)
• ALL: A1 (hash), A4 (range), A2 (table key), A3 (projected), A5 (projected)
13. DynamoDB Concepts
global secondary
indexes (GSI)
any attribute indexed as
new hash and/or range key
14. Local Secondary Index vs. Global Secondary Index
# | Local Secondary Index | Global Secondary Index
1 | Key = hash key and a range key | Key = hash or hash-and-range
2 | Hash key is the same attribute as that of the table. Range key can be any scalar table attribute | The index hash key and range key (if present) can be any scalar table attributes
3 | For each hash key, the total size of all indexed items must be 10 GB or less | No size restrictions for global secondary indexes
4 | Query over a single partition, as specified by the hash key value in the query | Query over the entire table, across all partitions
5 | Eventual consistency or strong consistency | Eventual consistency only
6 | Read and write capacity units consumed from the table | Every global secondary index has its own provisioned read and write capacity units
7 | Query will automatically fetch non-projected attributes from the table | Query can only request projected attributes. It will not fetch any attributes from the table
15. GSI Attribute Projections
Table
• A1 (hash), A2, A3, A4, A5
GSIs
• KEYS_ONLY: A2 (hash), A1 (table key)
• KEYS_ONLY: A5 (hash), A3 (range), A1 (table key)
• INCLUDE A3: A5 (hash), A4 (range), A1 (table key), A3 (projected)
• ALL: A4 (hash), A5 (range), A1 (table key), A2 (projected), A3 (projected)
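The three projection types can be sketched as a small function that lists which attributes end up in an index entry. The `index_attributes` helper is hypothetical and only models the rule from the slide (index keys, then table keys, then any projected attributes); it is not a DynamoDB API.

```python
def index_attributes(index_keys, table_keys, projection, table_attributes, include=()):
    """Return the attributes stored in an index entry, in order, without duplicates."""
    if projection == "KEYS_ONLY":
        extra = []
    elif projection == "INCLUDE":
        extra = list(include)
    elif projection == "ALL":
        extra = list(table_attributes)
    else:
        raise ValueError(f"unknown projection type: {projection}")
    result = []
    for name in list(index_keys) + list(table_keys) + extra:
        if name not in result:  # keys are always present, never repeated
            result.append(name)
    return result

TABLE_ATTRS = ["A1", "A2", "A3", "A4", "A5"]
keys_only = index_attributes(["A5", "A3"], ["A1"], "KEYS_ONLY", TABLE_ATTRS)
include_a3 = index_attributes(["A5", "A4"], ["A1"], "INCLUDE", TABLE_ATTRS, include=["A3"])
project_all = index_attributes(["A4", "A5"], ["A1"], "ALL", TABLE_ATTRS)
```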
16. GSI Query Pattern
Query covered by GSI
• Query GSI & get the attributes
Query not covered by GSI
• Query GSI to get the table key(s)
• BatchGetItem/GetItem from table
• 2 or more round trips to DynamoDB
Tip: If you need very low latency then project all required attributes into GSI
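Whether a read is covered by a GSI comes down to a set comparison between the requested attributes and the projected ones. The sketch below assumes a hypothetical `plan_gsi_read` helper that returns a description of the calls needed; the attribute names are borrowed from the file-sharing example later in the deck.

```python
def plan_gsi_read(requested, projected):
    """Return the list of DynamoDB calls needed to serve the requested attributes."""
    missing = sorted(set(requested) - set(projected))
    if not missing:
        return ["Query(GSI)"]  # covered: one round trip
    return [
        "Query(GSI) for table keys",
        "BatchGetItem(table) for " + ", ".join(missing),  # second round trip
    ]

# Attributes available in a hypothetical index: its keys, the table key and one projection.
TYPE_INDEX = ["UserId", "Type", "FileId", "Name"]

covered = plan_gsi_read(["Name"], TYPE_INDEX)
not_covered = plan_gsi_read(["Name", "S3key"], TYPE_INDEX)
```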
17. How do GSI updates work
[Diagram: Client, Table (primary table partitions), Global Secondary Index. Step 2: asynchronous update of the index (in progress).]
18. 1 Table update = 0, 1 or 2 GSI updates
Table operation | No. of GSI updates
Item not in index before or after update | 0
Update introduces a new indexed attribute | 1
Update deletes the indexed attribute | 1
Update changes the value of an indexed attribute from A to B | 2
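The counting rule can be written as a short function over the indexed attribute's value before and after the table write. This is a sketch: `gsi_updates` is a hypothetical helper, and the unchanged-key case is not on the slide; the handling shown for it (one write only if a projected attribute changed) is an assumption added for completeness.

```python
from typing import Optional

def gsi_updates(old_value: Optional[str],
                new_value: Optional[str],
                projected_changed: bool = False) -> int:
    """Index updates caused by one table write (None means the attribute is absent)."""
    if old_value is None and new_value is None:
        return 0  # item not in the index before or after
    if old_value is None or new_value is None:
        return 1  # one put (attribute introduced) or one delete (attribute removed)
    if old_value != new_value:
        return 2  # delete the old index entry, put the new one
    # Assumption (not on the slide): same key, so only projected data can change.
    return 1 if projected_changed else 0
```

For example, `gsi_updates("A", "B")` gives 2 and `gsi_updates(None, "A")` gives 1.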
19. GSI EXAMPLES
20. Example 1: Multi-tenant application for file storing and sharing
Access Patterns
1. Users should be able to query all the files they own
2. Search by File Name
3. Search by File Type
4. Search by Date Range
5. Keep track of Shared Files
6. Search by descending order of File Size
21. DynamoDB Data Model
Users
• Hash key = UserId (S)
• Attributes = User Name (S), Email (S), Address (SS), etc.
User_Files
• Hash key = UserId (S) – This is also the tenant id
• Range key = FileId (S)
• Attributes = Name (S), Type (S), Size (N), Date (S), SharedFlag (S), S3key (S)
22. Global Secondary Indexes
Table Name | Index Name | Attribute to Index | Projected Attribute
User_Files | NameIndex | Name | KEYS
User_Files | TypeIndex | Type | KEYS + Name
User_Files | DateIndex | Date | KEYS + Name
User_Files | SharedFlagIndex | SharedFlag | KEYS + Name
User_Files | SizeIndex | Size | KEYS + Name
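As a rough sketch, the five indexes could be described with the dictionary shape that the DynamoDB `CreateTable` request uses for `GlobalSecondaryIndexes`. The `gsi_definition` helper is hypothetical, the capacity values are placeholders, and the key schema (UserId as hash key, the indexed attribute as range key) is taken from the access-pattern slides that follow. Nothing here calls AWS.

```python
def gsi_definition(index_name, range_attribute, projected=(), read_units=5, write_units=5):
    """Build one GlobalSecondaryIndexes entry; capacity numbers are placeholders."""
    if projected:
        projection = {"ProjectionType": "INCLUDE", "NonKeyAttributes": list(projected)}
    else:
        projection = {"ProjectionType": "KEYS_ONLY"}
    return {
        "IndexName": index_name,
        "KeySchema": [
            {"AttributeName": "UserId", "KeyType": "HASH"},
            {"AttributeName": range_attribute, "KeyType": "RANGE"},
        ],
        "Projection": projection,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }

INDEXES = [
    gsi_definition("NameIndex", "Name"),                    # KEYS
    gsi_definition("TypeIndex", "Type", ["Name"]),          # KEYS + Name
    gsi_definition("DateIndex", "Date", ["Name"]),
    gsi_definition("SharedFlagIndex", "SharedFlag", ["Name"]),
    gsi_definition("SizeIndex", "Size", ["Name"]),
]
```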
23. Access Pattern 1
Find all files owned by a user
• Query (UserId = 2)
UserId (Hash) | FileId (Range) | Name | Date | Type | SharedFlag | Size | S3key
1 | 1 | File1 | 2013-04-23 | JPG | - | 1000 | bucket1
1 | 2 | File2 | 2013-03-10 | PDF | Y | 100 | bucket2
2 | 3 | File3 | 2013-03-10 | PNG | Y | 2000 | bucket3
2 | 4 | File4 | 2013-03-10 | DOC | - | 3000 | bucket4
3 | 5 | File5 | 2013-04-10 | TXT | - | 400 | bucket5
24. Access Pattern 2
Search by file name
• Query (IndexName = NameIndex, UserId = 1, Name = File1)
NameIndex
UserId (hash) | Name (range) | FileId
1 | File1 | 1
1 | File2 | 2
2 | File3 | 3
2 | File4 | 4
3 | File5 | 5
25. Access Pattern 3
Search for file name by file Type
• Query (IndexName = TypeIndex, UserId = 2, Type = DOC)
TypeIndex (FileId and Name are the projection)
UserId (hash) | Type (range) | FileId | Name
1 | JPG | 1 | File1
1 | PDF | 2 | File2
2 | DOC | 4 | File4
2 | PNG | 3 | File3
3 | TXT | 5 | File5
26. Access Pattern 4
Search for file name by date range
• Query (IndexName = DateIndex, UserId = 1, Date between 2013-03-01 and 2013-03-29)
DateIndex (FileId and Name are the projection)
UserId (hash) | Date (range) | FileId | Name
1 | 2013-03-10 | 2 | File2
1 | 2013-04-23 | 1 | File1
2 | 2013-03-10 | 3 | File3
2 | 2013-03-10 | 4 | File4
3 | 2013-04-10 | 5 | File5
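The date-range query on this slide could be expressed as request parameters for the DynamoDB `Query` operation. The sketch below only builds the parameter dictionary and makes no network call; `date_range_query` is a hypothetical helper, and the expression-based syntax (`KeyConditionExpression`) is an assumption about the API style, since the slide uses its own shorthand. Name placeholders are used because attribute names such as Date can collide with DynamoDB reserved words.

```python
def date_range_query(user_id: str, start: str, end: str) -> dict:
    """Build Query parameters for DateIndex: one user, dates in [start, end]."""
    return {
        "TableName": "User_Files",
        "IndexName": "DateIndex",
        "KeyConditionExpression": "#u = :user AND #d BETWEEN :start AND :end",
        "ExpressionAttributeNames": {"#u": "UserId", "#d": "Date"},
        "ExpressionAttributeValues": {
            ":user": {"S": user_id},   # UserId is a string (S) in the data model
            ":start": {"S": start},
            ":end": {"S": end},
        },
    }

params = date_range_query("1", "2013-03-01", "2013-03-29")
```

Because the dates are stored as strings in year-month-day order, a string comparison with BETWEEN matches chronological order. With the sample data, only File2 falls in this range for UserId 1.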
27. Access Pattern 5
Search for names of Shared files
• Query (IndexName = SharedFlagIndex, UserId = 1, SharedFlag = Y)
SharedFlagIndex (FileId and Name are the projection)
UserId (hash) | SharedFlag (range) | FileId | Name
1 | Y | 2 | File2
2 | Y | 3 | File3
28. Access Pattern 6
Query for file names by descending order of file size
• Query (IndexName = SizeIndex, UserId = 1, ScanIndexForward = false)
SizeIndex (FileId and Name are the projection; rows matched to the User_Files table in Access Pattern 1)
UserId (hash) | Size (range) | FileId | Name
1 | 100 | 2 | File2
3 | 400 | 5 | File5
1 | 1000 | 1 | File1
2 | 2000 | 3 | File3
2 | 3000 | 4 | File4
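The effect of `ScanIndexForward = false` can be sketched with an in-memory model of SizeIndex. The data comes from the User_Files table in Access Pattern 1; the `query_size_index` helper is hypothetical and does not call DynamoDB.

```python
# (UserId, FileId, Name, Size) rows from the User_Files table in Access Pattern 1.
FILES = [
    (1, 1, "File1", 1000),
    (1, 2, "File2", 100),
    (2, 3, "File3", 2000),
    (2, 4, "File4", 3000),
    (3, 5, "File5", 400),
]

def query_size_index(user_id: int, scan_index_forward: bool = True) -> list:
    """Return file names for one user, ordered by Size (the index range key)."""
    rows = [(size, name) for uid, _file_id, name, size in FILES if uid == user_id]
    rows.sort(reverse=not scan_index_forward)  # Size is numeric (N), so order is numeric
    return [name for _size, name in rows]

largest_first = query_size_index(1, scan_index_forward=False)
smallest_first = query_size_index(1)
```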
29. Example 2: Find top score for game G1
Game-scores-table
Id (hash key) | User | Game | Score | Date
1 | Bob | G1 | 1300 | 2012-12-23 18:00:00
2 | Bob | G1 | 1450 | 2012-12-23 19:00:00
3 | Jay | G1 | 1600 | 2012-12-24 20:00:00
4 | Mary | G1 | 2000 | 2012-10-24 17:00:00
5 | Ryan | G2 | 123 | 2012-03-10 15:00:00
6 | Jones | G2 | 345 | 2012-03-20 15:00:00
32. Query: Find top score for game G1
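The slides that define the index for this query are not part of this transcript, so the following is only a sketch under an assumption: a GSI with Game as hash key and Score as range key, read in descending order with a limit of 1. The `top_scores` helper and the in-memory data model are hypothetical; the rows are those of Game-scores-table.

```python
# (Id, User, Game, Score) rows from Game-scores-table.
SCORES = [
    (1, "Bob", "G1", 1300),
    (2, "Bob", "G1", 1450),
    (3, "Jay", "G1", 1600),
    (4, "Mary", "G1", 2000),
    (5, "Ryan", "G2", 123),
    (6, "Jones", "G2", 345),
]

def top_scores(game: str, n: int = 1) -> list:
    """Model a descending query on an assumed (Game, Score) index, limited to n entries."""
    entries = sorted((row for row in SCORES if row[2] == game),
                     key=lambda row: row[3], reverse=True)
    return [(user, score) for _id, user, _game, score in entries[:n]]

best_g1 = top_scores("G1")
```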
33. DATA MODELING WITH GSI
34. Modeling 1:1 relationships
Use a table with a Hash key or a GSI with a hash key
Example:
• Users: Hash key = UserID
• Users-email-GSI: Hash key = Email
Users Table
Hash key | Attributes
UserId = bob | Email = bob@gmail.com, JoinDate = 2011-11-15
UserId = fred | Email = fred@yahoo.com, JoinDate = 2011-12-01, Sex = M
Users-email-GSI
Hash key | Attributes
Email = bob@gmail.com | UserId = bob, JoinDate = 2011-11-15
Email = fred@yahoo.com | UserId = fred, JoinDate = 2011-12-01, Sex = M
35. Modeling 1:N relationships
Use a table with Hash and Range key or a GSI
Example:
• One (1) User can play many (N) Games
User-Games-Table
Hash Key | Attributes
UserId = bob | GameId = Game1, HighScore = 10500, ScoreDate = 2011-10-20
UserId = fred | GameId = Game2, HighScore = 12000, ScoreDate = 2012-01-10
UserId = bob | GameId = Game3, HighScore = 20000, ScoreDate = 2012-02-12
User-Games-GSI
Hash Key | Range key | Attributes
UserId = bob | GameId = Game1 | HighScore = 10500, ScoreDate = 2011-10-20
UserId = fred | GameId = Game2 | HighScore = 12000, ScoreDate = 2012-01-10
UserId = bob | GameId = Game3 | HighScore = 20000, ScoreDate = 2012-02-12
36. Modeling N:M relationships
Use GSI
• Example: 1 user plays multiple games and 1 game has multiple users
User-Games-Table
Hash Key | Range key
UserId = bob | GameId = Game1
UserId = fred | GameId = Game2
UserId = bob | GameId = Game3
Game-Users-GSI
Hash Key | Range key
GameId = Game1 | UserId = bob
GameId = Game2 | UserId = fred
GameId = Game3 | UserId = bob
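The N:M pattern amounts to keeping the same pairs under two key orders: the table answers "which games does a user play" and the GSI answers "which users play a game". A minimal in-memory sketch, with a hypothetical `build_index` helper and no DynamoDB calls:

```python
from collections import defaultdict

# (UserId, GameId) pairs from User-Games-Table.
USER_GAMES = [("bob", "Game1"), ("fred", "Game2"), ("bob", "Game3")]

def build_index(rows, hash_pos: int, range_pos: int) -> dict:
    """Group rows by the chosen hash attribute, with range values kept sorted."""
    index = defaultdict(list)
    for row in rows:
        index[row[hash_pos]].append(row[range_pos])
    return {key: sorted(values) for key, values in index.items()}

games_by_user = build_index(USER_GAMES, 0, 1)  # table: hash = UserId, range = GameId
users_by_game = build_index(USER_GAMES, 1, 0)  # GSI: hash = GameId, range = UserId
```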
37. Best Practices
Choose a GSI Hash Key with high cardinality
Employee-Table
• Id (hash), Name, Sex, DOB, Address
• Cardinality of Sex = 2 (M/F)
SexDOB-GSI
• Sex (hash), DOB, Id, Name, Address
Solution: Generate aliases for M/F by suffixing a known range of integers (say 1 to 100) and Query for each value M_1 to M_100
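The aliasing idea can be sketched as two small functions: one picks the suffixed key when writing an item, the other lists every key a reader has to query. Deriving the suffix from a checksum of the employee Id is an assumption made here so that the result is repeatable; the slide only asks for a known range of integers. Both helper names are hypothetical.

```python
import zlib

SHARDS = 100  # the "known range of integers" from the slide (1 to 100)

def sharded_key(sex: str, employee_id: int) -> str:
    """Pick the index hash key for one item, e.g. M_37 (suffix derived from the Id)."""
    suffix = zlib.crc32(str(employee_id).encode()) % SHARDS + 1
    return f"{sex}_{suffix}"

def keys_to_query(sex: str) -> list:
    """All hash key values that must be queried to read every item for one value."""
    return [f"{sex}_{n}" for n in range(1, SHARDS + 1)]

male_keys = keys_to_query("M")
```

Writes spread over 100 hash key values instead of 2, at the cost of issuing 100 queries and merging the results on the read side.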
38. Best Practices
Take advantage of Sparse Indexes
Game-scores-table
Id (hash) | User | Game | Score | Date | Award
1 | Bob | G1 | 1300 | 2012-12-23 | -
2 | Bob | G1 | 1450 | 2012-12-23 | -
3 | Jay | G1 | 1600 | 2012-12-24 | -
4 | Mary | G1 | 2000 | 2012-10-24 | Champ
5 | Ryan | G2 | 123 | 2012-03-10 | -
6 | Jones | G2 | 345 | 2012-03-20 | -
Award-GSI
Award (hash) | Id | User | Score
Champ | 4 | Mary | 2000
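A sparse index holds only the items that carry the indexed attribute. The sketch below models that with plain dictionaries; `build_sparse_index` is a hypothetical helper and the data is the Game-scores-table from this slide.

```python
GAME_SCORES = [
    {"Id": 1, "User": "Bob", "Game": "G1", "Score": 1300},
    {"Id": 2, "User": "Bob", "Game": "G1", "Score": 1450},
    {"Id": 3, "User": "Jay", "Game": "G1", "Score": 1600},
    {"Id": 4, "User": "Mary", "Game": "G1", "Score": 2000, "Award": "Champ"},
    {"Id": 5, "User": "Ryan", "Game": "G2", "Score": 123},
    {"Id": 6, "User": "Jones", "Game": "G2", "Score": 345},
]

def build_sparse_index(items, hash_attribute, projected):
    """Index entries exist only for items that have the index hash attribute."""
    return [
        {name: item[name] for name in (hash_attribute, *projected)}
        for item in items
        if hash_attribute in item
    ]

award_gsi = build_sparse_index(GAME_SCORES, "Award", ["Id", "User", "Score"])
```

Six table items produce a single index entry, so reading the index touches far less data than reading the table.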
39. Best Practices
Query GSI for quick item lookups
• Fewer read capacity units consumed
Mail Box-Table
• ID (hash key), Timestamp (range key), Attribute1, Attribute2, Attribute3, …., LargeAttachment
Mail Box-lookup-GSI
• ID (hash key), Timestamp (range key), Attribute1, Attribute2, Attribute3
40. Best Practices
Provision enough throughput for GSI
• one update to the table may result in two writes to an index
If GSIs do not have enough write capacity, table writes
will eventually be throttled down to what the "slowest"
index can consume
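A back-of-the-envelope estimate of the write capacity a GSI may need in the worst case can be sketched as follows. It assumes that one write capacity unit covers one write per second of up to 1 KB, and that every table update changes the indexed key and therefore costs two index writes. The `worst_case_gsi_wcu` helper and the sample numbers are hypothetical.

```python
import math

def worst_case_gsi_wcu(table_writes_per_second: int,
                       index_entry_bytes: int,
                       writes_per_update: int = 2) -> int:
    """Upper-bound write capacity units for one GSI under the stated assumptions."""
    units_per_write = max(1, math.ceil(index_entry_bytes / 1024))  # 1 KB per unit
    return table_writes_per_second * writes_per_update * units_per_write

small_entries = worst_case_gsi_wcu(100, 300)    # 100 writes/s, 300-byte index entries
large_entries = worst_case_gsi_wcu(50, 1500)    # 50 writes/s, 1500-byte index entries
```

Real workloads rarely change the indexed key on every write, so this is a ceiling to compare against observed consumption rather than a sizing rule.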
41. Debugging Throughput Issues
ProvisionedThroughputExceededException (HTTP
status code 400)
• "The level of configured provisioned throughput for one or more
global secondary indexes of the table was exceeded. Consider
increasing your provisioning level for the under-provisioned
global secondary indexes with the UpdateTable API"
42. Debugging Throughput Issues
GSI CloudWatch Metrics
• ProvisionedReadCapacityUnits vs. ConsumedReadCapacityUnits
• ProvisionedWriteCapacityUnits vs. ConsumedWriteCapacityUnits
• ReadThrottleEvents
• WriteThrottleEvents
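Comparing provisioned and consumed capacity is a simple ratio once both are in the same unit. The sketch below assumes the consumed metric was read as a sum over one period, so it is divided by the period length to get units per second; the `utilization` helper and the sample numbers are hypothetical, and no CloudWatch call is made.

```python
def utilization(consumed_sum: float, period_seconds: int, provisioned_units: float) -> float:
    """Fraction of provisioned capacity used during one metric period."""
    consumed_per_second = consumed_sum / period_seconds
    return consumed_per_second / provisioned_units

# Example: 27,000 write units consumed in a 5-minute period against 100 provisioned units.
write_utilization = utilization(27000, 300, 100)
near_limit = write_utilization >= 0.8  # hypothetical alert threshold
```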
43. Questions