5. AWS Storage options for digital media
File: Amazon EFS
Block: Amazon EBS, Amazon EC2 instance storage
Object: Amazon S3, Amazon Glacier
6. A Concept - the Content Lake
Inspired by the data lake (a term coined by James Dixon in 2010)
A single store for all of the digital content that you create and
acquire, in any form or factor
• Don’t assume any resolutions/formats (now or in the future)
• It is up to the consumer (the application consuming the content) to use the
appropriate infrastructure for processing
7. Amazon S3 : the Content Lake
• Durable, cost-effective and fast
• Highly scalable front-end
– Multi-part uploads (parallel writes)
– Range-gets (parallel reads)
• No need for capacity planning or
provisioning
• Use Amazon S3 with on-premises
storage in a hybrid model
• Secure
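Multi-part uploads and range-gets both rest on the same idea: cut one large object into byte ranges that can be moved in parallel. The sketch below only computes those ranges and the matching HTTP `Range` header values; the function names and the part size are illustrative assumptions, and no S3 call is made.

```python
def byte_ranges(object_size, part_size):
    """Return inclusive (start, end) byte offsets covering the whole object."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    return [
        (start, min(start + part_size, object_size) - 1)
        for start in range(0, object_size, part_size)
    ]


def range_headers(object_size, part_size):
    """Format each range as an HTTP Range header value (end offset is inclusive)."""
    return [f"bytes={start}-{end}" for start, end in byte_ranges(object_size, part_size)]


if __name__ == "__main__":
    # Hypothetical 10-byte object read in 4-byte pieces.
    print(range_headers(10, 4))
```

Each header can be handed to a separate worker for a parallel read; the same offsets can mark part boundaries for a parallel write.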
9. Hydrating the Content Lake
Amazon S3 (multi-part upload)
Direct Connect: N x 1G | 10G
Massively scalable front-end
10. Introducing AWS Import/Export Snowball
Scale and Speed
• Up to 50TB Capacity per device
• 10Gbps and 1Gbps connectivity
• Parallel data transfer lets you move petabytes in a week
Secure
• Tamper-resistant enclosure
• 256-bit encryption with KMS
• Secure data erasure
Simple
• Manage entire process through AWS Console
• Lightweight data transfer client
• Notifications
11. What is Snowball? Petabyte scale data transport
• E-ink shipping label
• Ruggedized case, “8.5G impact”
• All data encrypted end-to-end
• 50 TB
• 10G network
• Rain & dust resistant
• Tamper-resistant case & electronics
12. Can I drop it?
• No (please don’t)
• Snowball is its own box
• Has had many drop tests already
• Can handle 8.5G impacts
• Designed for shipping
14. What does it cost?
• $200 / job plus shipping
• Includes 10 days to fill the device at your site
• $15/day after the tenth day on site
• Standard Amazon S3 charges apply
• $0.03/GB to transfer data out
• $0.00/GB to transfer data in
15. How fast is that truck full of drives?
• Less than 1 day to transfer 250TB via 5x10G connections with 5
Snowballs, less than 1 week including shipping
• Number of days to transfer 250TB via the Internet at typical
utilizations
Days to transfer 250TB, by Internet connection speed
Utilization   1Gbps   500Mbps   300Mbps   150Mbps
25%              95       190       316       632
50%              47        95       158       316
75%              32        63       105       211
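The transfer-time figures can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming 1 TB = 1024 GB, 1 GB = 10^9 bytes, and 1 Mbps = 10^6 bits per second; those unit conventions are an assumption chosen because they match the slide's numbers, and the function name is illustrative.

```python
SECONDS_PER_DAY = 86400


def transfer_days(terabytes, link_mbps, utilization):
    """Days needed to push `terabytes` over a link at the given utilization."""
    bits = terabytes * 1024 * 1e9 * 8  # assumed: 1 TB = 1024 GB, 1 GB = 1e9 bytes
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for utilization in (0.25, 0.50, 0.75):
        row = [round(transfer_days(250, mbps, utilization)) for mbps in (1000, 500, 300, 150)]
        print(f"{utilization:.0%}", row)
    # Five Snowballs on five 10G links, treated as one 50 Gbps pipe at full rate.
    print(f"{transfer_days(250, 5 * 10_000, 1.0) * 24:.1f} hours")
```

Under the same assumptions, the five-Snowball case comes out at well under a day of copy time, which is consistent with the "less than 1 day" claim before shipping is added.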
16. What does it cost?
Example 1:
• 250TB loaded on to 5 Snowballs
• 8 days at your site
• 5 * $200 = $1,000 plus shipping
Example 2:
• 30TB exported on to 1 Snowball
• 8 days at your site
• $200 + 30TB * $0.03/GB = $1,121.60 plus shipping
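Both examples follow from the price list on the earlier cost slide. A rough calculator, assuming one job per 50TB device, 1 TB = 1024 GB, and that the extra-day fee applies per device (the last point is an assumption; the slides do not say). Shipping and standard Amazon S3 charges are left out.

```python
import math

JOB_FEE = 200.00            # per job
INCLUDED_DAYS = 10          # days on site covered by the job fee
EXTRA_DAY_FEE = 15.00       # per day after the tenth day (assumed per device)
TRANSFER_OUT_PER_GB = 0.03  # export only; import is $0.00/GB
DEVICE_CAPACITY_TB = 50


def snowball_cost(terabytes, days_on_site, export=False):
    """Return (devices needed, cost in dollars) excluding shipping and S3 charges."""
    devices = math.ceil(terabytes / DEVICE_CAPACITY_TB)
    extra_days = max(0, days_on_site - INCLUDED_DAYS)
    cost = devices * (JOB_FEE + extra_days * EXTRA_DAY_FEE)
    if export:
        cost += terabytes * 1024 * TRANSFER_OUT_PER_GB  # assumed: 1 TB = 1024 GB
    return devices, round(cost, 2)


if __name__ == "__main__":
    print(snowball_cost(250, 8))              # Example 1: import
    print(snowball_cost(30, 8, export=True))  # Example 2: export
```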
17. Edge Locations
Availability Zone
Region
Dallas (2)
St. Louis
Miami
Jacksonville
Los Angeles (2)
Seattle
Ashburn (3)
Newark
New York (3)
Dublin
London (2)
Amsterdam (2)
Stockholm
Frankfurt (2)
Paris (2)
Singapore (2)
Hong Kong (2)
Tokyo (2)
Sao Paulo
South Bend
San Jose
Palo Alto
Hayward
Osaka
Milan
Sydney
Madrid
Seoul
Mumbai
Chennai
Regional Lakes …
18. S3 cross-region replication
Automated, fast, and reliable asynchronous replication of data across AWS regions
Source (Virginia) to Destination (Oregon)
• Only replicates new PUTs. Once replication is configured, all new uploads into a
source bucket will be replicated
• Entire bucket or prefix based
• 1:1 replication between any 2
regions
Use cases
• Compliance - store data hundreds of miles apart
• Lower latency - distribute data to remote customers/partners
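As an illustration of "entire bucket or prefix based", the sketch below builds a replication configuration in the dictionary shape that boto3's `put_bucket_replication` accepts. The bucket name, role ARN, and rule ID are hypothetical placeholders, the AWS call itself is shown only as a comment, and versioning has to be enabled on both buckets before replication can be turned on.

```python
def replication_config(role_arn, destination_bucket, prefix="", rule_id="content-lake-replication"):
    """Build a one-rule replication configuration; an empty prefix means the entire bucket."""
    return {
        "Role": role_arn,  # IAM role that S3 assumes to copy objects
        "Rules": [
            {
                "ID": rule_id,
                "Prefix": prefix,
                "Status": "Enabled",
                "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
            }
        ],
    }


if __name__ == "__main__":
    # All names below are made up for the example.
    config = replication_config(
        "arn:aws:iam::123456789012:role/example-replication-role",
        "example-content-lake-oregon",
        prefix="masters/",
    )
    print(config)
    # With boto3 installed and credentials configured, this could be applied with:
    # boto3.client("s3").put_bucket_replication(
    #     Bucket="example-content-lake-virginia", ReplicationConfiguration=config)
```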
19. Consuming the Content Lake
Amazon S3 (range-gets)
Direct Connect: N x 1G | 10G
Massively scalable S3 front-end
Massively scalable compute on AWS Cloud (EBS, Instance Store)
On-Prem Apps
20. Object life cycle from hot to cold
S3 Standard
• Primary data
• 11 9’s of durability
• 2.75c – 3c per GB/month, $338 – $369 per TB/year
S3 – Infrequent Access
• Active Archives
• Mezzanine files
• 11 9’s of durability
• 1.25c per GB/month, $154 per TB/year
• 1c per GB for retrievals
Glacier
• Deep/offline archives
• WORM-compliant data
• 11 9’s of durability
• 0.7c per GB/month, $86 per TB/year
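The per-TB/year figures follow from the per-GB monthly prices. A small check, assuming 1 TB = 1024 GB and 12 billing months (the function name is illustrative):

```python
def per_tb_year(cents_per_gb_month):
    """Convert a price in cents per GB per month to dollars per TB per year."""
    return cents_per_gb_month / 100 * 1024 * 12  # assumed: 1 TB = 1024 GB


if __name__ == "__main__":
    tiers = [
        ("S3 Standard (low end)", 2.75),
        ("S3 Standard (high end)", 3.0),
        ("S3 Infrequent Access", 1.25),
        ("Glacier", 0.7),
    ]
    for name, cents in tiers:
        print(f"{name}: ${per_tb_year(cents):.0f} per TB/year")
```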
Data tiering using Life Cycle Policies
Actual customer quote: “$0.0125?! OMG I will take all your storage!!!”
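Tiering from hot to cold can be expressed as a lifecycle rule. The sketch below builds one in the dictionary shape that boto3's `put_bucket_lifecycle_configuration` accepts; the rule ID, prefix, and day counts are illustrative assumptions, not values from the slides, and the AWS call is shown only as a comment.

```python
def lifecycle_config(prefix, days_to_ia, days_to_glacier, rule_id="hot-to-cold"):
    """Build a rule that moves objects to Infrequent Access, then to Glacier."""
    if not 0 < days_to_ia < days_to_glacier:
        raise ValueError("expected 0 < days_to_ia < days_to_glacier")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_to_ia, "StorageClass": "STANDARD_IA"},
                    {"Days": days_to_glacier, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical policy: mezzanine files go to IA after 30 days, Glacier after 90.
    config = lifecycle_config("mezzanine/", 30, 90)
    print(config)
    # With boto3 installed and credentials configured, this could be applied with:
    # boto3.client("s3").put_bucket_lifecycle_configuration(
    #     Bucket="example-content-lake", LifecycleConfiguration=config)
```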
21. S3 capacity pricing: pay only for what you use!
1 PB raw storage
800 TB usable storage
600 TB allocated storage
400 TB application data
AWS Cloud Storage
22. Securing your data on S3
• AWS alignment with the latest MPAA cloud-based
application guidelines for content security –
August 2015
• VPC private endpoint for Amazon S3 – enables a
true private workflow capability
• Encryption & key management capabilities
• Amazon Glacier Vault for high-value
media/originals
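One way to picture the private-endpoint and encryption points together is a bucket policy. The sketch below is an assumption about how such a policy might look, not a recommendation from the slides: it denies requests that do not arrive through a given VPC endpoint and denies uploads that omit server-side encryption. The bucket name and endpoint ID are hypothetical.

```python
import json


def private_bucket_policy(bucket, vpc_endpoint_id):
    """Build a bucket policy restricting access to one VPC endpoint and encrypted uploads."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
                "Condition": {"StringNotEquals": {"aws:sourceVpce": vpc_endpoint_id}},
            },
            {
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # True when the request carries no server-side encryption header.
                "Condition": {"Null": {"s3:x-amz-server-side-encryption": "true"}},
            },
        ],
    }


if __name__ == "__main__":
    policy = private_bucket_policy("example-content-lake", "vpce-0example")
    print(json.dumps(policy, indent=2))
```

A policy like the first statement also blocks console access from outside the endpoint, so it is worth testing on a scratch bucket first.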
23. S3 versioning
Preserve, retrieve, and restore every version
of every object stored in your bucket
S3 automatically adds new versions and
preserves deleted objects with delete markers
Easily control the number of versions kept by
using lifecycle expiration policies
Easy to turn on in the AWS Management
Console
Diagram: with versioning enabled, a PUT of Key = photo.gif leaves both versions
in the bucket (Key = photo.gif, ID = 121212 and Key = photo.gif, ID = 111111)
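The versioning behaviour can be modelled with a toy in-memory store. This is not the S3 API: the class, its methods, and the numeric IDs are made up for illustration (the IDs simply mimic the two on the slide, while real version IDs are opaque strings). It shows that a PUT adds a version and a DELETE adds a marker rather than destroying data.

```python
import itertools

DELETE_MARKER = object()  # sentinel standing in for an S3 delete marker


class VersionedBucket:
    """Toy model of a versioning-enabled bucket."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, body), oldest first
        self._ids = itertools.count(111111, 10101)  # 111111, 121212, ...

    def put(self, key, body):
        version_id = next(self._ids)
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key):
        # Older versions stay in place; only a marker is appended.
        return self.put(key, DELETE_MARKER)

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            if not versions or versions[-1][1] is DELETE_MARKER:
                raise KeyError(key)
            return versions[-1][1]
        for vid, body in versions:
            if vid == version_id and body is not DELETE_MARKER:
                return body
        raise KeyError((key, version_id))


if __name__ == "__main__":
    bucket = VersionedBucket()
    first = bucket.put("photo.gif", b"original")
    bucket.put("photo.gif", b"retouched")
    bucket.delete("photo.gif")
    print(bucket.get("photo.gif", version_id=first))  # the original is still retrievable
```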
24. Amazon S3 event notifications
Delivers notifications to Amazon SNS, Amazon SQS, or AWS
Lambda when events occur in Amazon S3
S3
Events
SNS topic
SQS queue
Lambda function
Notifications
Foo() {
…
}
Support for notification when
objects are created via Put,
Post, Copy, or Multipart
Upload.
Support for notification when
objects are deleted, as well
as with filtering on prefixes
and suffixes for all types of
notifications.
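A notification rule with prefix and suffix filtering might look like the sketch below, which builds the dictionary shape that boto3's `put_bucket_notification_configuration` accepts and adds a local helper that mirrors the filter semantics. The Lambda ARN, prefix, and suffix are hypothetical, and no AWS call is made.

```python
def notification_config(lambda_arn, prefix, suffix):
    """Notify one Lambda function about every object created under prefix/suffix."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],  # Put, Post, Copy, multipart complete
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": prefix},
                            {"Name": "suffix", "Value": suffix},
                        ]
                    }
                },
            }
        ]
    }


def key_matches(key, prefix, suffix):
    """Local stand-in for the filter: both prefix and suffix must match."""
    return key.startswith(prefix) and key.endswith(suffix)


if __name__ == "__main__":
    arn = "arn:aws:lambda:us-east-1:123456789012:function:example-transcode"
    print(notification_config(arn, "ingest/", ".mov"))
    print(key_matches("ingest/trailer.mov", "ingest/", ".mov"))
```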
25. Reference Architecture – Content Processing Pipeline
(Using Lambda)
S3 multi-part API
S3 as backend storage for content files, accessible to
other processing tasks
Amazon Elastic Transcoder
S3 Notification: trigger a Lambda function to start a transcoding job
Ingest
S3 Notification: Lambda function to generate a signed URL to share the file
Update CMS or Metadata
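The "S3 notification triggers a Lambda function" step could be sketched as follows. The handler reads the bucket and key from the S3 event records and hands them to a `start_job` callable; that callable is a hypothetical stand-in for whatever submits the transcoding job, so the sketch runs without AWS access. Object keys arrive URL-encoded in S3 events, hence the decoding step.

```python
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Yield (bucket, key) pairs from an S3 event notification payload."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def handler(event, context, start_job=print):
    """Lambda-style entry point; start_job is a hypothetical job submitter."""
    started = []
    for bucket, key in uploaded_objects(event):
        start_job(bucket, key)
        started.append(key)
    return started


if __name__ == "__main__":
    # Trimmed, made-up event containing only the fields the handler reads.
    sample = {
        "Records": [
            {"s3": {"bucket": {"name": "example-ingest"},
                    "object": {"key": "masters/my+clip.mov"}}}
        ]
    }
    print(handler(sample, None))
```

Passing the job submitter in as a parameter keeps the parsing logic testable on its own; in a real function it would wrap the transcoding service client.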
26. Elastic File System - Rendering in the Cloud
• Designed to support petabyte
scale file systems
• Throughput scales linearly with
storage
• Same latency spec across each AZ
• Thousands of concurrent NFS
connections
• Works great for large I/O sizes
• Pay for only what you use not what
you provision
• Managed with multi-copy durability
27. Media Workloads (redefined)
EBS
Instance
Store
Amazon EBS/EFS/EC2 Instance Store
Process
Partner/Affiliate/
Service Provider
User Delivery/Consumption
VFX/Production
On-Prem Apps
Archive
Amazon Glacier (Life Cycle Policies)
Direct Connect
Content Access Transfer
Disposable Infrastructure
Auto-scaling
Workload specific
Amazon S3
EFS
28. Q&A
Learn more at: http://aws.amazon.com/s3/
http://aws.amazon.com/glacier/
http://aws.amazon.com/importexport/
henryz@amazon.com