SlideShare uma empresa Scribd logo
1 de 29
©	2017,	Amazon	Web	Services,	Inc.	or	its	Affiliates.	All	rights	reserved
Pop-up Loft
Amazon DynamoDB Design Workshop
Edin Zulich,
NoSQL Solutions Architect, AWS
Data Modeling
Hierarchical data structures as items
• Use composite sort key to define a hierarchy
• Highly selective result sets with sort queries
• Index anything, scales to any size
Primary	Key
Attributes
ProductID type
Items
1 bookID
title author genre publisher datePublished ISBN
Some	Book John Smith Science	Fiction Ballantine Oct-70 0-345-02046-4
2 albumID
title artist genre label studio released producer
Some Album Some Band Progressive	Rock Harvest Abbey	Road 3/1/73 Somebody
2 albumID:trackID
title length music vocals
Track 1 1:30 Mason Instrumental
2 albumID:trackID
title length music vocals
Track	2 2:43 Mason Mason
2 albumID:trackID
title length music vocals
Track 3 3:30 Smith Johnson
3 movieID
title genre writer producer
Some	Movie Scifi	Comedy Joe Smith 20th	Century	Fox
3 movieID:actorID
name character image
Some	Actor Joe img2.jpg
3 movieID:actorID
name character image
Some Actress Rita img3.jpg
3 movieID:actorID
name character image
Some Actor Frito img1.jpg
… or as documents (JSON)
• JSON data types (M, L, BOOL, NULL)
• Document SDKs available
• 400 KB maximum item size (limits hierarchical data structure)
Primary	Key
Attributes
ProductID
Items
1
id title author genre publisher datePublished ISBN
bookID Some	Book Some Guy Science	Fiction Ballantine Oct-70 0-345-02046-4
2
id title artist genre Attributes
albumID Some	Album Some Band Progressive	Rock
{	label:"Harvest",	studio:	"Abbey	Road",	published:	"3/1/73",	producer:	"Pink	
Floyd",	tracks:	[{title:	"Speak	to	Me",	length:	"1:30",	music:	"Mason",	vocals:	
"Instrumental"},{title:	”Breathe",	length:	”2:43",	music:	”Waters,	Gilmour,	
Wright",	vocals:	”Gilmour"},{title:	”On	the	Run",	length:	“3:30",	music:	”Gilmour,	
Waters",	vocals:	"Instrumental"}]}
3
id title genre writer Attributes
movieID Some	Movie Scifi	Comedy Joe Smith
{	producer:	"20th	Century	Fox",	actors:	[{	name:	"Luke	Wilson",	dob:	"9/21/71",	
character:	"Joe	Bowers",	image:	"img2.jpg"},{	name:	"Maya	Rudolph",	dob:	
"7/27/72",	character:	"Rita",	image:	"img1.jpg"},{	name:	"Dax	Shepard",	dob:	
"1/2/75",	character:	"Frito	Pendejo",	image:	"img3.jpg"}]
1:1 relationships or key-values
• Use a table or GSI with a partition key
• Use GetItem or BatchGetItem API
Example: Given a user or email, get attributes
Users	Table
Partition key Attributes
UserId	=	bob Email	=	bob@gmail.com,	JoinDate	=	2011-11-15
UserId	=	fred Email	=	fred@yahoo.com,	JoinDate	=	2011-12-01
Users-Email-GSI
Partition	key Attributes
Email	=	bob@gmail.com UserId	=	bob,	JoinDate	=	2011-11-15
Email	=	fred@yahoo.com UserId	=	fred,	JoinDate	=	2011-12-01
1:N relationships or parent-children
• Use a table or GSI with partition and sort key
• Use Query API
Example:
Given a device, find all readings between epoch X, Y
Device-measurements
Part.	Key Sort	key Attributes
DeviceId	=	1 epoch	=	5513A97C Temperature	=	30,	pressure	=	90
DeviceId	=	1 epoch	=	5513A9DB Temperature	=	30,	pressure	=	90
N:M relationships
• Use a table and GSI with partition and sort key elements
switched
• Use Query API
Example:
Given a user, find all games. Or given a game, find all
users.
User-Games-Table
Part.	Key Sort	key
UserId	=	bob GameId	=	Game1
UserId	=	fred GameId	=	Game2
UserId	=	bob GameId	=	Game3
Game-Users-GSI
Part.	Key Sort	key
GameId	=	Game1 UserId	=	bob
GameId	=	Game2 UserId	=	fred
GameId	=	Game3 UserId	=	bob
Data Modeling Exercise
• A shopping cart use case
• Attributes: cartId, dateTime, set of SKU’s, etc.
• Mostly under 1KB in size
• Up to 100 million carts in the table – 100GB total
• Accessed by cartId (key-value access pattern)
• 10K writes, and 10K reads per second
• TTL to expire “old” carts
• Also:
• Notify when there is a price change for a SKU in a cart
Cost estimation
• The table: cartId, dateTime, set of SKU’s, etc.
• Mostly under 1KB in size
• Up to 100 million carts in the table – 100GB total
• Accessed by cartId (key-value access pattern)
• 10K writes, and 10K reads per second
• Cost estimate
• Enter the storage, item size, reads, and writes
• Get an estimate of your monthly bill: ~$6300
• The price notification job
• E.g. 100 million items, scan/update over 24 hrs
• ~ 1200 RCU and WCU for 24 hrs: ~ $22
Use case: Time Series Data
sensor_data
DeviceId: 123
Timestamp: 1492641900
Temp: 172
AWS Lambda
Kinesis Stream
• Sensor data (temp) from 1000’s of
sensors
• Need to store for fast access
• Access by sensor ID + time range
• Store for 90 days
DynamoDB
Solution DeviceId: 123
Timestamp: 1492641900
Temp: 172
MyTTL: 1492736400
AWS Lambda
Kinesis Stream
üGroup multiple data points into
a single item
üSave on writes
üUse TTL to manage data lifetime
sensor_data
DynamoDB
Time Series Data Example Architecture
Sensors
AWS
Lambda
Amazon
S3 hosted
site
Amazon
DynamoDB
Amazon
Kinesis
Amazon
CloudWatch
alarm
Amazon
Cognito
Javascript SDK
request
Amazon
SNS
Python (boto) JavaScriptJava
Time Series Data
ü High volume ingest: Kinesis + DynamoDB
ü Fast access: DynamoDB
ü At a reasonable cost: <$10K/month for:
• 2.5TB of stored data points per month
• Ingest of 100K data points per second
All using a serverless architecture with managed services on AWS
Cost estimate
• Data rate: 100000 data points per second
• Data storage: 1 month’s worth = ~2.5TB
Kinesis: 100000 records -> 100 shards -> ~$5K per month
DynamoDB: 100000 WCU’s -> $50K per month
• Binned writes: 1000 WCU’s -> $500 per month (100x less…)
• Diff. over 1 year: $600K vs. $6K
We can save a lot by storing multiple data points per item
Separating hot and cold data
Storing time series data
Time Series Tables (Static TTL Timestamps)
Events_table_2015_April
Event_id
(Partition)
Timestamp
(Sort)
Attribute1 …. Attribute N
Events_table_2015_March
Event_id
(Partition)
Timestamp
(Sort)
Attribute1 …. Attribute N
Events_table_2015_Feburary
Event_id
(Partition)
Timestamp
(Sort)
Attribute1 …. Attribute N
Events_table_2015_January
Event_id
(Partition)
Timestamp
(Sort)
Attribute1 …. Attribute N
RCUs = 1000
WCUs = 1
RCUs = 10000
WCUs = 10000
RCUs = 100
WCUs = 1
RCUs = 10
WCUs = 1
Current table
Older tables
HotdataColddata
Don’t mix hot and cold data; archive cold data to Amazon S3
Use a table per time period
• Precreate daily, weekly, monthly tables
• Provision required throughput for current table
• Writes go to the current table
• Reduce write capacity for older tables
Dealing with static time series data
(timestamp never changes)
Using TTL to age out cold data
RCUs = 10000
WCUs = 10000
HotdataColddata
Use DynamoDB TTL and Streams to archive
Events_table_2016_April
Event_id
(Partition key)
Timestamp
(sort key)
myTTL
1489188093
…. Attribute N
Current table
S3
AWS Lambda
Sharding Writes
A Design Pattern for write-heavy
items
Example: Real-time voting
Votes Table
Voters
RawVotes Table
Voting App
Partition 1
1000 WCUs
Partition K
1000 WCUs
Partition M
1000 WCUs
Partition N
1000 WCUs
Votes Table
Candidate A Candidate B
Real-time voting: scaling bottlenecks
Voters
Provision 200,000 WCUs
Candidate A_2
Candidate B_1
Candidate B_2
Candidate B_3
Candidate B_5
Candidate B_4
Candidate B_7
Candidate B_6
Candidate A_1
Candidate A_3
Candidate A_4
Candidate A_7 Candidate B_8
UpdateItem: “CandidateA_” + rand(0, 10)
ADD 1 to Votes
Candidate A_6
Candidate A_8
Candidate A_5
Voter
Votes Table
Write-sharding
Example: a voting application
Votes Table
Shard aggregation
Candidate A_2
Candidate B_1
Candidate B_2
Candidate B_3
Candidate B_5
Candidate B_4
Candidate B_7
Candidate B_6
Candidate A_1
Candidate A_3
Candidate A_4
Candidate A_5
Candidate A_6 Candidate A_8
Candidate A_7 Candidate B_8
Periodic
process
Candidate A
Total: 2.5M
1. Sum
2. Store Voter
Tables
UserId Candidate Date
Alice A 2016-10-02
Bob B 2016-10-02
Eve B 2016-10-02
Chuck A 2016-10-02
RawVotes Table
Segment Votes
A_1 23
B_2 12
B_1 14
A_2 25
Votes Table
1. Record vote and de-dupe 2. Increment candidate counter
• Trade off read cost for write scalability
• Consider throughput per partition key
Shard write-heavy partition keys
Your write workload is not horizontally
scalable
Voting app, v2: Aggregation with Streams
Votes Table
Amazon
Redshift Amazon EMR
Your
Amazon Kinesis–
Enabled App
Voters RawVotes TableVoting App RawVotes
DynamoDB
Stream
Voting app, v2: Tables
UserId Candidate Date
Alice A 2016-10-02
Bob B 2016-10-02
Eve B 2016-10-02
Chuck A 2016-10-02
RawVotes Table
Candidate Votes
A 23
B 12
C 14
D 25
Votes Table
1. Record vote and de-dupe 2. Increment candidate counter
Analytics with DynamoDB Streams
• Collect and de-dupe data in DynamoDB
• Aggregate data in-memory and flush periodically
Performing real-time aggregation and
analytics
©	2017,	Amazon	Web	Services,	Inc.	or	its	Affiliates.	All	rights	reserved
Pop-up Loft
aws.amazon.com/activate
Everything and Anything Startups
Need to Get Started on AWS
Thank you!

Mais conteúdo relacionado

Mais procurados

게임을 위한 DynamoDB 사례 및 팁 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming
게임을 위한 DynamoDB 사례 및 팁 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming게임을 위한 DynamoDB 사례 및 팁 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming
게임을 위한 DynamoDB 사례 및 팁 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 GamingAmazon Web Services Korea
 
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB Day
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB DayGetting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB Day
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB DayAmazon Web Services Korea
 
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)Amazon Web Services
 
Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...
Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...
Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...Amazon Web Services
 
DynamoDB - What's new - DAT304 - re:Invent 2017
DynamoDB - What's new - DAT304 - re:Invent 2017DynamoDB - What's new - DAT304 - re:Invent 2017
DynamoDB - What's new - DAT304 - re:Invent 2017Amazon Web Services
 
Design Patterns using Amazon DynamoDB
 Design Patterns using Amazon DynamoDB Design Patterns using Amazon DynamoDB
Design Patterns using Amazon DynamoDBAmazon Web Services
 
Introduction to aws dynamo db
Introduction to aws dynamo dbIntroduction to aws dynamo db
Introduction to aws dynamo dbOmid Vahdaty
 
DAT102 Introduction to Amazon DynamoDB - AWS re: Invent 2012
DAT102 Introduction to Amazon DynamoDB - AWS re: Invent 2012DAT102 Introduction to Amazon DynamoDB - AWS re: Invent 2012
DAT102 Introduction to Amazon DynamoDB - AWS re: Invent 2012Amazon Web Services
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon RedshiftAmazon Web Services
 
(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014
(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014
(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014Amazon Web Services
 

Mais procurados (20)

게임을 위한 DynamoDB 사례 및 팁 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming
게임을 위한 DynamoDB 사례 및 팁 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming게임을 위한 DynamoDB 사례 및 팁 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming
게임을 위한 DynamoDB 사례 및 팁 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB Day
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB DayGetting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB Day
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB Day
 
Amazon DynamoDB 深入探討
Amazon DynamoDB 深入探討Amazon DynamoDB 深入探討
Amazon DynamoDB 深入探討
 
Introduction to Amazon DynamoDB
Introduction to Amazon DynamoDBIntroduction to Amazon DynamoDB
Introduction to Amazon DynamoDB
 
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
 
Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...
Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...
Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent...
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
Deep Dive - DynamoDB
Deep Dive - DynamoDBDeep Dive - DynamoDB
Deep Dive - DynamoDB
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
DynamoDB - What's new - DAT304 - re:Invent 2017
DynamoDB - What's new - DAT304 - re:Invent 2017DynamoDB - What's new - DAT304 - re:Invent 2017
DynamoDB - What's new - DAT304 - re:Invent 2017
 
AWS Webcast - Dynamo DB
AWS Webcast - Dynamo DBAWS Webcast - Dynamo DB
AWS Webcast - Dynamo DB
 
Introducing DynamoDB
Introducing DynamoDBIntroducing DynamoDB
Introducing DynamoDB
 
DynamoDB Deep Dive
DynamoDB Deep DiveDynamoDB Deep Dive
DynamoDB Deep Dive
 
Design Patterns using Amazon DynamoDB
 Design Patterns using Amazon DynamoDB Design Patterns using Amazon DynamoDB
Design Patterns using Amazon DynamoDB
 
Introduction to aws dynamo db
Introduction to aws dynamo dbIntroduction to aws dynamo db
Introduction to aws dynamo db
 
DAT102 Introduction to Amazon DynamoDB - AWS re: Invent 2012
DAT102 Introduction to Amazon DynamoDB - AWS re: Invent 2012DAT102 Introduction to Amazon DynamoDB - AWS re: Invent 2012
DAT102 Introduction to Amazon DynamoDB - AWS re: Invent 2012
 
Deep Dive: Amazon DynamoDB
Deep Dive: Amazon DynamoDBDeep Dive: Amazon DynamoDB
Deep Dive: Amazon DynamoDB
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift
 
(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014
(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014
(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014
 

Semelhante a DynamoDB Design Workshop

SRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBSRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBAmazon Web Services
 
(WRK302) Event-Driven Programming
(WRK302) Event-Driven Programming(WRK302) Event-Driven Programming
(WRK302) Event-Driven ProgrammingAmazon Web Services
 
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDBAWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDBAmazon Web Services
 
(DAT401) Amazon DynamoDB Deep Dive
(DAT401) Amazon DynamoDB Deep Dive(DAT401) Amazon DynamoDB Deep Dive
(DAT401) Amazon DynamoDB Deep DiveAmazon Web Services
 
Deploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSDeploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSAmazon Web Services
 
February 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDBFebruary 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDBAmazon Web Services
 
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...Amazon Web Services
 
SRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBSRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBAmazon Web Services
 
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...Amazon Web Services
 
SRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBSRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBAmazon Web Services
 
SRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal Health
SRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal HealthSRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal Health
SRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal HealthAmazon Web Services
 
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...Todd Whitehead
 

Semelhante a DynamoDB Design Workshop (20)

Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
SRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBSRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDB
 
(WRK302) Event-Driven Programming
(WRK302) Event-Driven Programming(WRK302) Event-Driven Programming
(WRK302) Event-Driven Programming
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDBAWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
 
AWS Data Collection & Storage
AWS Data Collection & StorageAWS Data Collection & Storage
AWS Data Collection & Storage
 
(DAT401) Amazon DynamoDB Deep Dive
(DAT401) Amazon DynamoDB Deep Dive(DAT401) Amazon DynamoDB Deep Dive
(DAT401) Amazon DynamoDB Deep Dive
 
Deploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSDeploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWS
 
February 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDBFebruary 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDB
 
DynamodbDB Deep Dive
DynamodbDB Deep DiveDynamodbDB Deep Dive
DynamodbDB Deep Dive
 
Data Collection and Storage
Data Collection and StorageData Collection and Storage
Data Collection and Storage
 
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...
 
SRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBSRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDB
 
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
 
SRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBSRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDB
 
SRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal Health
SRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal HealthSRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal Health
SRV405 Deep Dive Amazon Redshift & Redshift Spectrum at Cardinal Health
 
Amazon DynamoDB and DAX
Amazon DynamoDB and DAXAmazon DynamoDB and DAX
Amazon DynamoDB and DAX
 
Amazon DynamoDB and DAX
Amazon DynamoDB and DAXAmazon DynamoDB and DAX
Amazon DynamoDB and DAX
 
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
 

Mais de Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Mais de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

DynamoDB Design Workshop

  • 3. Hierarchical data structures as items • Use composite sort key to define a hierarchy • Highly selective result sets with sort queries • Index anything, scales to any size Primary Key Attributes ProductID type Items 1 bookID title author genre publisher datePublished ISBN Some Book John Smith Science Fiction Ballantine Oct-70 0-345-02046-4 2 albumID title artist genre label studio released producer Some Album Some Band Progressive Rock Harvest Abbey Road 3/1/73 Somebody 2 albumID:trackID title length music vocals Track 1 1:30 Mason Instrumental 2 albumID:trackID title length music vocals Track 2 2:43 Mason Mason 2 albumID:trackID title length music vocals Track 3 3:30 Smith Johnson 3 movieID title genre writer producer Some Movie Scifi Comedy Joe Smith 20th Century Fox 3 movieID:actorID name character image Some Actor Joe img2.jpg 3 movieID:actorID name character image Some Actress Rita img3.jpg 3 movieID:actorID name character image Some Actor Frito img1.jpg
  • 4. … or as documents (JSON) • JSON data types (M, L, BOOL, NULL) • Document SDKs available • 400 KB maximum item size (limits hierarchical data structure) Primary Key Attributes ProductID Items 1 id title author genre publisher datePublished ISBN bookID Some Book Some Guy Science Fiction Ballantine Oct-70 0-345-02046-4 2 id title artist genre Attributes albumID Some Album Some Band Progressive Rock { label:"Harvest", studio: "Abbey Road", published: "3/1/73", producer: "Pink Floyd", tracks: [{title: "Speak to Me", length: "1:30", music: "Mason", vocals: "Instrumental"},{title: ”Breathe", length: ”2:43", music: ”Waters, Gilmour, Wright", vocals: ”Gilmour"},{title: ”On the Run", length: “3:30", music: ”Gilmour, Waters", vocals: "Instrumental"}]} 3 id title genre writer Attributes movieID Some Movie Scifi Comedy Joe Smith { producer: "20th Century Fox", actors: [{ name: "Luke Wilson", dob: "9/21/71", character: "Joe Bowers", image: "img2.jpg"},{ name: "Maya Rudolph", dob: "7/27/72", character: "Rita", image: "img1.jpg"},{ name: "Dax Shepard", dob: "1/2/75", character: "Frito Pendejo", image: "img3.jpg"}]
  • 5. 1:1 relationships or key-values • Use a table or GSI with a partition key • Use GetItem or BatchGetItem API Example: Given a user or email, get attributes Users Table Partition key Attributes UserId = bob Email = bob@gmail.com, JoinDate = 2011-11-15 UserId = fred Email = fred@yahoo.com, JoinDate = 2011-12-01 Users-Email-GSI Partition key Attributes Email = bob@gmail.com UserId = bob, JoinDate = 2011-11-15 Email = fred@yahoo.com UserId = fred, JoinDate = 2011-12-01
  • 6. 1:N relationships or parent-children • Use a table or GSI with partition and sort key • Use Query API Example: Given a device, find all readings between epoch X, Y Device-measurements Part. Key Sort key Attributes DeviceId = 1 epoch = 5513A97C Temperature = 30, pressure = 90 DeviceId = 1 epoch = 5513A9DB Temperature = 30, pressure = 90
  • 7. N:M relationships • Use a table and GSI with partition and sort key elements switched • Use Query API Example: Given a user, find all games. Or given a game, find all users. User-Games-Table Part. Key Sort key UserId = bob GameId = Game1 UserId = fred GameId = Game2 UserId = bob GameId = Game3 Game-Users-GSI Part. Key Sort key GameId = Game1 UserId = bob GameId = Game2 UserId = fred GameId = Game3 UserId = bob
  • 8. Data Modeling Exercise • A shopping cart use case • Attributes: cartId, dateTime, set of SKU’s, etc. • Mostly under 1KB in size • Up to 100 million carts in the table – 100GB total • Accessed by cartId (key-value access pattern) • 10K writes, and 10K reads per second • TTL to expire “old” carts • Also: • Notify when there is a price change for a SKU in a cart
  • 9. Cost estimation • The table: cartId, dateTime, set of SKU’s, etc. • Mostly under 1KB in size • Up to 100 million carts in the table – 100GB total • Accessed by cartId (key-value access pattern) • 10K writes, and 10K reads per second • Cost estimate • Enter the storage, item size, reads, and writes • Get an estimate of your monthly bill: ~$6300 • The price notification job • E.g. 100 million items, scan/update over 24 hrs • ~ 1200 RCU and WCU for 24 hrs: ~ $22
  • 10. Use case: Time Series Data sensor_data DeviceId: 123 Timestamp: 1492641900 Temp: 172 AWS Lambda Kinesis Stream • Sensor data (temp) from 1000’s of sensors • Need to store for fast access • Access by sensor ID + time range • Store for 90 days DynamoDB
  • 11. Solution DeviceId: 123 Timestamp: 1492641900 Temp: 172 MyTTL: 1492736400 AWS Lambda Kinesis Stream üGroup multiple data points into a single item üSave on writes üUse TTL to manage data lifetime sensor_data DynamoDB
  • 12. Time Series Data Example Architecture Sensors AWS Lambda Amazon S3 hosted site Amazon DynamoDB Amazon Kinesis Amazon CloudWatch alarm Amazon Cognito Javascript SDK request Amazon SNS Python (boto) JavaScriptJava
  • 13. Time Series Data ü High volume ingest: Kinesis + DynamoDB ü Fast access: DynamoDB ü At a reasonable cost: <$10K/month for: • 2.5TB of stored data points per month • Ingest of 100K data points per second All using a serverless architecture with managed services on AWS
  • 14. Cost estimate • Data rate: 100000 data points per second • Data storage: 1 month’s worth = ~2.5TB Kinesis: 100000 records -> 100 shards -> ~$5K per month DynamoDB: 100000 WCU’s -> $50K per month • Binned writes: 1000 WCU’s -> $500 per month (100x less…) • Diff. over 1 year: $600K vs. $6K We can save a lot by storing multiple data points per item
  • 15. Separating hot and cold data Storing time series data
  • 16. Time Series Tables (Static TTL Timestamps) Events_table_2015_April Event_id (Partition) Timestamp (Sort) Attribute1 …. Attribute N Events_table_2015_March Event_id (Partition) Timestamp (Sort) Attribute1 …. Attribute N Events_table_2015_Feburary Event_id (Partition) Timestamp (Sort) Attribute1 …. Attribute N Events_table_2015_January Event_id (Partition) Timestamp (Sort) Attribute1 …. Attribute N RCUs = 1000 WCUs = 1 RCUs = 10000 WCUs = 10000 RCUs = 100 WCUs = 1 RCUs = 10 WCUs = 1 Current table Older tables HotdataColddata Don’t mix hot and cold data; archive cold data to Amazon S3
  • 17. Use a table per time period • Precreate daily, weekly, monthly tables • Provision required throughput for current table • Writes go to the current table • Reduce write capacity for older tables Dealing with static time series data (timestamp never changes)
  • 18. Using TTL to age out cold data RCUs = 10000 WCUs = 10000 HotdataColddata Use DynamoDB TTL and Streams to archive Events_table_2016_April Event_id (Partition key) Timestamp (sort key) myTTL 1489188093 …. Attribute N Current table S3 AWS Lambda
  • 19. Sharding Writes A Design Pattern for write-heavy items
  • 20. Example: Real-time voting Votes Table Voters RawVotes Table Voting App
  • 21. Partition 1 1000 WCUs Partition K 1000 WCUs Partition M 1000 WCUs Partition N 1000 WCUs Votes Table Candidate A Candidate B Real-time voting: scaling bottlenecks Voters Provision 200,000 WCUs
  • 22. Candidate A_2 Candidate B_1 Candidate B_2 Candidate B_3 Candidate B_5 Candidate B_4 Candidate B_7 Candidate B_6 Candidate A_1 Candidate A_3 Candidate A_4 Candidate A_7 Candidate B_8 UpdateItem: “CandidateA_” + rand(0, 10) ADD 1 to Votes Candidate A_6 Candidate A_8 Candidate A_5 Voter Votes Table Write-sharding Example: a voting application
  • 23. Votes Table Shard aggregation Candidate A_2 Candidate B_1 Candidate B_2 Candidate B_3 Candidate B_5 Candidate B_4 Candidate B_7 Candidate B_6 Candidate A_1 Candidate A_3 Candidate A_4 Candidate A_5 Candidate A_6 Candidate A_8 Candidate A_7 Candidate B_8 Periodic process Candidate A Total: 2.5M 1. Sum 2. Store Voter
  • 24. Tables UserId Candidate Date Alice A 2016-10-02 Bob B 2016-10-02 Eve B 2016-10-02 Chuck A 2016-10-02 RawVotes Table Segment Votes A_1 23 B_2 12 B_1 14 A_2 25 Votes Table 1. Record vote and de-dupe 2. Increment candidate counter
  • 25. • Trade off read cost for write scalability • Consider throughput per partition key Shard write-heavy partition keys Your write workload is not horizontally scalable
  • 26. Voting app, v2: Aggregation with Streams Votes Table Amazon Redshift Amazon EMR Your Amazon Kinesis– Enabled App Voters RawVotes TableVoting App RawVotes DynamoDB Stream
  • 27. Voting app, v2: Tables UserId Candidate Date Alice A 2016-10-02 Bob B 2016-10-02 Eve B 2016-10-02 Chuck A 2016-10-02 RawVotes Table Candidate Votes A 23 B 12 C 14 D 25 Votes Table 1. Record vote and de-dupe 2. Increment candidate counter
  • 28. Analytics with DynamoDB Streams • Collect and de-dupe data in DynamoDB • Aggregate data in-memory and flush periodically Performing real-time aggregation and analytics