© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Henry Zhang, Senior Product Manager, Amazon Glacier
October 2015
Amazon Glacier Deep Dive
STG312
Audio archives – SoundCloud
• World’s leading social sound platform
• Audio files transcoded and stored in multiple formats
• Stores PBs of data
• Transcoded files served from Amazon S3
• Originals moved to Amazon Glacier for long-term retention
Video archives – Sony Media Cloud (Ci)
Amazon Glacier
Tape replacement – King County
• Most populous county in Washington State
• Replace tape solution for backup from 17 agencies
• Meet compliance requirement
• Saved $1MM in first year, no more tape refresh or
management churn
Archive:
Data retained for the long term,
for compliance or potential
future reference
Data archiving needs are growing everywhere
• Media assets, 4K, 8K
• Health care / Life sciences
• Financial services
• Regulated industries
• Oil and gas / Geospatial
• Digital preservation
• Long-term backups
• Logs
Traditional archiving approaches
• Tape silos / Tape libraries
• Tape drives (LTO-X / DLT / etc.)
• Virtual tape libraries (VTLs)
• Tape out / Vaulting
• Specialized software & personnel
How can Amazon Glacier help with your archival?
Metered usage:
Pay as you go
No capital investment
No commitment
No risky capacity planning
Avoid risks of physical
media handling
Control your
geographic locality for
performance and
compliance
Amazon Glacier is a low-cost storage service for
archival data with long-term retention requirements.
$0.007/GB per month
3-5 hour data retrieval
Financial records
Medical PACS images
High Res Media Assets
How can Amazon Glacier help with your archival?
Extremely low-cost archive storage service, starting at $0.007/GB per month
Allows you to retrieve data within 3-5 hours
99.999999999% durability (7 orders of magnitude higher than 2 copies of tape)
No data migration, no hardware/infrastructure investments
Infinite scale and pay for what you use
Access to on-demand compute resource on AWS
Getting started – key concepts
• Account – Access AWS services, view billing/usage, manage security
• Vaults – Container for archives, up to 1,000 vaults per account per region
• Archives – Files and records, write-once, 40TB max, unlimited archives
• Inventory – Cold index of archive properties refreshed every 24 hours
Amazon Glacier – 3 ways to Access
• Direct Glacier API/SDK
• S3 lifecycle integration
• Third-party tools and gateways
Amazon Glacier concepts: Uploading data
1. Create vault (Films)
2. Configure access policies
   ArchiveApp user policy
   Effect: Allow
   Resource: arn:aws:glacier:<region>:<accountId>:vaults/Films
   Action: glacier:UploadArchive
3. Upload archives
   UploadArchive(data) -> Archive ID
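The user policy in step 2 can be written out as a full IAM policy document. The sketch below is one way to build it; the region and account ID are placeholders, and the region segment in the ARN follows the standard Glacier ARN format rather than the abbreviated form on the slide.

```python
import json


def upload_only_policy(region: str, account_id: str, vault: str) -> dict:
    """Build an IAM policy document that only allows uploads to one vault."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "glacier:UploadArchive",
                "Resource": f"arn:aws:glacier:{region}:{account_id}:vaults/{vault}",
            }
        ],
    }


# Placeholder region and account ID, for illustration only.
policy = upload_only_policy("us-west-2", "111122223333", "Films")
print(json.dumps(policy, indent=2))
```

Attaching a policy like this to the ArchiveApp user would let it upload archives without being able to read or delete them.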
Amazon Glacier concepts: Retrieving data
1. Initiate job
   ArchiveId: AE99F…
   Vault: Films -> Job ID
2. 3-5 hours for job completion
3. Job completion notification
4. Download output
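As a rough sketch of step 1, the parameters for an archive-retrieval job can be assembled as a plain dictionary. The archive ID and SNS topic ARN below are placeholders, and the helper name is hypothetical.

```python
from typing import Optional


def archive_retrieval_job(archive_id: str, sns_topic_arn: Optional[str] = None) -> dict:
    """Build the job parameters for an archive-retrieval job."""
    params = {"Type": "archive-retrieval", "ArchiveId": archive_id}
    if sns_topic_arn is not None:
        # Glacier publishes to this topic when the job completes (step 3).
        params["SNSTopic"] = sns_topic_arn
    return params


job = archive_retrieval_job(
    "EXAMPLE-ARCHIVE-ID",  # placeholder, not a real archive ID
    "arn:aws:sns:us-west-2:111122223333:glacier-jobs",  # hypothetical topic
)
print(job)
```

With boto3, a dictionary of this shape is what the Glacier client's `initiate_job` call takes as `jobParameters`; once the job has completed, `get_job_output` downloads the data (step 4).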
Amazon Glacier – Amazon S3 lifecycle archival
• Seamlessly move data from Amazon S3 to Amazon Glacier
• Automated lifecycle rules
• Transition based on object age or predefined date
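An age-based lifecycle rule might look like the sketch below. The rule ID, prefix, and 30-day threshold are illustrative assumptions, not values from the talk.

```python
def glacier_transition_rule(rule_id: str, prefix: str, days: int) -> dict:
    """Build an S3 lifecycle configuration that moves objects to Glacier by age."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Objects under the prefix transition once they are `days` old.
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


config = glacier_transition_rule("archive-originals", "originals/", 30)
print(config)
```

With boto3, this dictionary would be passed as `LifecycleConfiguration` to the S3 client's `put_bucket_lifecycle_configuration`. For a transition on a predefined date, the transition entry carries a `Date` instead of `Days`.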
Amazon Glacier – Backup software integration
• CommVault – Native Integration
with Amazon S3 & Amazon Glacier
• Deduplication & encryption
• Single console management
Amazon Glacier – Third-party tools and gateways
• Consumer grade: less than $50
– Example: CloudBerry, FastGlacier, Arq (Haystack Software)
• Small / medium business: $500 - $1,000
– Example: Synology, Veeam, QNAP
• Enterprise-grade gateway (price varies)
– Example: NetApp AltaVault
Best practices – Prepare your data
Use Archive descriptions
• Use Archive description field for
metadata.
• If local index is corrupted or
destroyed, use archive description
to reconstruct critical mappings.
• For example, create index entry,
add primary key to archive
description on upload.
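One way to apply this is to serialize the index's primary key into the description at upload time. The sketch below assumes a JSON encoding and hypothetical field names (`pk`, `file`); the 1,024-character limit is Glacier's documented cap on archive descriptions.

```python
import json

MAX_DESCRIPTION_LEN = 1024  # Glacier's limit on archive description length


def make_description(primary_key: str, filename: str) -> str:
    """Encode index fields into an archive description string."""
    # ensure_ascii keeps the output ASCII-only, as the description field requires.
    text = json.dumps({"pk": primary_key, "file": filename},
                      ensure_ascii=True, sort_keys=True)
    if len(text) > MAX_DESCRIPTION_LEN:
        raise ValueError("description exceeds 1,024 characters")
    return text


def parse_description(text: str) -> dict:
    """Recover the index fields from an archive description."""
    return json.loads(text)


desc = make_description("asset-000123", "master.wav")
print(desc)
print(parse_description(desc)["pk"])
```

Because the vault inventory lists each archive's description, a lost local index could be rebuilt by parsing the descriptions back into key-to-archive mappings.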
Small objects and object size overhead
• Every archive has 32KB of associated overhead
and some operations are charged per request
• For a 3.2 MB archive, overhead adds about 1% to storage cost
• For a 1 KB archive, about 97% of storage cost goes to overhead
• Solution is aggregation – aim for archives of at least several MB
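The percentages above follow from simple arithmetic on the 32 KB per-archive overhead. This sketch models storage overhead only (per-request charges are not included) and assumes decimal units, so 3.2 MB is treated as 3,200 KB.

```python
OVERHEAD_KB = 32  # per-archive overhead quoted above


def overhead_fraction(archive_kb: float) -> float:
    """Fraction of billed storage that is overhead for a single archive."""
    return OVERHEAD_KB / (archive_kb + OVERHEAD_KB)


for label, size_kb in [("1 KB", 1), ("3.2 MB", 3200), ("100 MB", 100_000)]:
    print(f"{label:>7}: {overhead_fraction(size_kb):.1%} overhead")
```

For a 1 KB archive the billed size is 33 KB, of which 32 KB is overhead, which is where the 97% figure comes from.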
Archive aggregation
[Diagram labels: an archive containing an index/directory of file offsets (File 1 offset, File 2 offset, File 3 offset, …) followed by File 1, File 2, …, each with its own checksum and metadata (Checksum 1, Checksum 2, Checksum 3, …); a local index is kept alongside.]
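A minimal sketch of the aggregation idea: concatenate small files into one blob and record each file's offset, length, and checksum. Here the index is kept only locally; embedding a copy of it in the archive, as the diagram suggests, is left out for brevity. The function names and index layout are assumptions.

```python
import hashlib
import json
from typing import Dict, Tuple


def pack(files: Dict[str, bytes]) -> Tuple[bytes, dict]:
    """Concatenate files into one blob and return it with a local index."""
    blob = bytearray()
    index = {}
    for name, data in files.items():
        index[name] = {
            "offset": len(blob),
            "length": len(data),
            "sha256": hashlib.sha256(data).hexdigest(),
        }
        blob += data
    return bytes(blob), index


def extract(blob: bytes, index: dict, name: str) -> bytes:
    """Slice one file back out of the blob and verify its checksum."""
    entry = index[name]
    data = blob[entry["offset"]: entry["offset"] + entry["length"]]
    if hashlib.sha256(data).hexdigest() != entry["sha256"]:
        raise ValueError(f"checksum mismatch for {name}")
    return data


blob, index = pack({"a.txt": b"hello", "b.txt": b"glacier"})
print(json.dumps(index, indent=2))
print(extract(blob, index, "b.txt"))
```

The blob would be uploaded as a single archive, so the 32 KB overhead is paid once for all the files it contains.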
Best practices – Optimize upload
Best practices: Multipart uploads
Improve throughput, reliability, and get idempotency with multipart uploads
1. InitiateMultipartUpload(partSize) → uploadId
2. UploadPart(uploadId, data)
3. CompleteMultipartUpload(uploadId) → archiveId
[Diagram: an archive split into parts that are uploaded in parallel]
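Before step 2, a client has to decide which byte range each part covers. The sketch below plans those ranges; it relies on Glacier's rule that the part size is 1 MiB multiplied by a power of two, up to 4 GiB, and formats each range the way the upload-part request expects. The helper names are hypothetical.

```python
MIB = 1024 * 1024
MAX_PART_SIZE = 4 * 1024 * MIB  # 4 GiB


def is_valid_part_size(part_size: int) -> bool:
    """Part size must be 1 MiB times a power of two, at most 4 GiB."""
    if part_size < MIB or part_size > MAX_PART_SIZE or part_size % MIB:
        return False
    units = part_size // MIB
    return (units & (units - 1)) == 0  # true only for powers of two


def plan_parts(total_size: int, part_size: int) -> list:
    """Return 'bytes start-end/*' ranges that cover total_size bytes."""
    if not is_valid_part_size(part_size):
        raise ValueError("part size must be 1 MiB * 2^n, at most 4 GiB")
    ranges = []
    for start in range(0, total_size, part_size):
        end = min(start + part_size, total_size) - 1  # inclusive end offset
        ranges.append(f"bytes {start}-{end}/*")
    return ranges


parts = plan_parts(total_size=10 * MIB, part_size=4 * MIB)
print(parts)
```

Because every part is identified by its range, parts can be sent in parallel and a failed part can be retried on its own without restarting the whole upload.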
Best practices: Data ingestion options
• AWS Direct Connect – Dedicated bandwidth between your site and AWS
• Internet – Transfer data in a secure SSL tunnel over the public Internet
• AWS Import/Export Snowball – Physical transfer of media into and out of AWS
Best practices – Cost management
Amazon Glacier – Data retrieval policies
• Provides transparency and cost control for data retrievals
• Governs all retrieval activities for an account in a region
• Synchronously accept/reject each retrieval request
• Accounts for inflight retrieval operations
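A retrieval policy that caps the retrieval rate could be expressed as below. This is a sketch of the policy document only; the 1 GiB/hour cap is an arbitrary example value.

```python
def bytes_per_hour_policy(max_bytes_per_hour: int) -> dict:
    """Build a data retrieval policy that caps retrievals at a byte rate."""
    if max_bytes_per_hour <= 0:
        raise ValueError("max_bytes_per_hour must be positive")
    return {
        "Rules": [
            {"Strategy": "BytesPerHour", "BytesPerHour": max_bytes_per_hour}
        ]
    }


retrieval_policy = bytes_per_hour_policy(1024 ** 3)  # example cap: 1 GiB per hour
print(retrieval_policy)
```

With boto3, the dictionary would be passed as `Policy` to the Glacier client's `set_data_retrieval_policy`. The other strategies the API accepts are `FreeTier` and `None`.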
Cost allocation with vault tags
Best practices – Security and compliance
Amazon Glacier – Audit logging with AWS CloudTrail
• Enable AWS CloudTrail in console
• Control plane events – Vault activities
• Data plane events – Archive activities
Vault access policies
• Manage access to a Vault in a single location – single IAM policy
– Grant/revoke access to internal business units/teams
– “Marketing_Vault” has a different access policy from “DevOps_Vault”
• Easily manage cross-account access for your business partner
– Simply add a section for your business partner in the same policy
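As an illustration of the cross-account case, the sketch below builds a vault access policy with one statement that gives a partner account read-style access. The account IDs, vault name, statement ID, and choice of actions are all assumptions.

```python
import json


def vault_access_policy(vault_arn: str, partner_account_id: str) -> str:
    """Return a vault access policy granting a partner account retrieval access."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "partner-read-access",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{partner_account_id}:root"},
                # Retrieval is a two-call flow: start a job, then fetch its output.
                "Action": ["glacier:InitiateJob", "glacier:GetJobOutput"],
                "Resource": vault_arn,
            }
        ],
    }
    return json.dumps(policy, indent=2)


access_policy = vault_access_policy(
    "arn:aws:glacier:us-west-2:111122223333:vaults/Marketing_Vault",  # placeholder
    "444455556666",  # placeholder partner account
)
print(access_policy)
```

Revoking the partner's access later means deleting that one statement from the vault's policy.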
Amazon Glacier Vault Lock allows you to easily
set compliance controls on individual vaults and
enforce them via a lockable policy.
Time-based retention
MFA Authentication
Controls govern all
records in a Vault
Immutable policy
Two-step locking
Compliance Storage with Vault Lock
Vault Lock for compliance storage
• Non-overwrite, non-erasable records
• Time-based retention with “ArchiveAgeInDays” control
• Policy lockdown (strong governance)
• Legal hold with vault-level tags
• Configure optional designated third-party access and grant
temporary access
Example control: 1 year record retention
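A one-year retention control can be sketched as a Deny statement on archive deletion, conditioned on archive age. The vault ARN and statement ID are placeholders; the `glacier:ArchiveAgeInDays` condition key is the control named earlier in the talk.

```python
import json


def retention_lock_policy(vault_arn: str, retention_days: int) -> str:
    """Return a Vault Lock policy that denies deleting archives younger than the retention period."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-delete-before-retention-ends",
                "Principal": "*",
                "Effect": "Deny",
                "Action": "glacier:DeleteArchive",
                "Resource": vault_arn,
                "Condition": {
                    "NumericLessThan": {
                        "glacier:ArchiveAgeInDays": str(retention_days)
                    }
                },
            }
        ],
    }
    return json.dumps(policy, indent=2)


lock_policy = retention_lock_policy(
    "arn:aws:glacier:us-west-2:111122223333:vaults/examplevault",  # placeholder
    365,
)
print(lock_policy)
```

Once the lock is completed, a statement like this applies to every principal, including the account owner, until each archive is at least 365 days old.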
Vault Lock: Two-step locking
Legal hold with vault-level tags
Example control: Legal hold
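A legal hold can be sketched as a Deny statement that takes effect while the vault carries a particular tag. The tag key `LegalHold`, its value `true`, and the vault ARN are assumptions chosen for illustration.

```python
import json


def legal_hold_policy(vault_arn: str, tag_key: str = "LegalHold") -> str:
    """Return a policy that denies archive deletion while the vault is tagged for legal hold."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-delete-under-legal-hold",
                "Principal": "*",
                "Effect": "Deny",
                "Action": "glacier:DeleteArchive",
                "Resource": vault_arn,
                "Condition": {
                    # Matches when the vault-level tag is set to "true".
                    "StringEquals": {f"glacier:ResourceTag/{tag_key}": "true"}
                },
            }
        ],
    }
    return json.dumps(policy, indent=2)


hold_policy = legal_hold_policy(
    "arn:aws:glacier:us-west-2:111122223333:vaults/examplevault"  # placeholder
)
print(hold_policy)
```

Placing or lifting the hold is then a tagging operation on the vault rather than a policy change, which matters once the policy itself has been locked.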
Vault lock best practices
Vault access policy
• Can be updated/deleted
Vault lock policy
• Lockable/Immutable policy
• Cannot be updated/deleted
after lockdown
Use vault access policy to:
• Designate third-party access
• Grant temporary read
permissions when necessary
Use vault lock policy to:
• Deploy regulatory controls such
as records retention
• Enforce data access through
multi-factor authentication only
Compliance/Governance Flexibility
Using vault lock policy with vault access policy
Vault Lock in the Glacier Console
Amazon Glacier received a third-party assessment
from Cohasset Associates on how Amazon Glacier
with Vault Lock can be used to meet the
requirements of SEC 17a-4(f) and CFTC 1.31(b)-(c).
Thank you!
Remember to complete
your evaluations!

(STG312) Amazon Glacier Deep Dive: Cold Data Storage in AWS

  • 1. Amazon Glacier Deep Dive (STG312) • Henry Zhang, Senior Product Manager, Amazon Glacier • October 2015 • © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 2. Audio archives – SoundCloud • World’s leading social sound platform • Audio files transcoded and stored in multiple formats • Stores PBs of data • Transcoded files served from Amazon S3 • Originals moved to Amazon Glacier for long-term retention
  • 3. Video archives – Sony Media Cloud (Ci) Amazon Glacier
  • 4. Tape replacement – King County • Most populous county in Washington State • Replace tape solution for backup from 17 agencies • Meet compliance requirement • Saved $1MM in first year, no more tape refresh or management churn
  • 5. Archive: Data retained for the long term, for compliance or potential future reference • Data archiving needs are growing everywhere • Media assets, 4K, 8K • Health care / Life sciences • Financial services • Regulated industries • Oil and gas / Geospatial • Digital preservation • Long-term backups • Logs
  • 6. Traditional archiving approaches • Tape silos / Tape libraries • Tape drives (LTO-X / DLT / etc.) • Virtual tape libraries (VTLs) • Tape out / Vaulting • Specialized software & personnel
  • 7. How can Amazon Glacier help with your archival? • Metered usage: Pay as you go • No capital investment • No commitment • No risky capacity planning • Avoid risks of physical media handling • Control your geographic locality for performance and compliance
  • 8. Amazon Glacier is a low-cost storage service for archival data with long-term retention requirements. • $0.007/GB per month • 3-5 hour data retrieval • Financial records • Medical PACS images • High-res media assets
  • 9. How can Amazon Glacier help with your archival? • Extremely low-cost archive storage service, starting at $0.007/GB per month • Allows you to retrieve data within 3-5 hours • 99.999999999% durability (7 orders of magnitude higher than 2 copies of tape) • No data migration, no hardware/infrastructure investments • Infinite scale and pay for what you use • Access to on-demand compute resources on AWS
  • 10. Getting started – key concepts • Account – Access AWS services, view billing/usage, manage security • Vaults – Container for archives, up to 1000 vaults per account • Archives – Files and records, write-once, 40TB max, unlimited archives • Inventory – Cold index of archive properties refreshed every 24 hours
  • 11. Amazon Glacier – 3 ways to access • Direct Glacier API/SDK • S3 lifecycle integration • Third-party tools and gateways
  • 12. Amazon Glacier concepts: Uploading data • Step 1: Create vault (films) • Step 2: Configure access policies (ArchiveApp user policy: Effect: Allow; Resource: arn:aws:glacier:<accountId>:vaults/Films; Action: glacier:UploadArchive) • Step 3: Upload archives: UploadArchive(data) -> Archive ID
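The user policy in step 2 can be sketched as a JSON policy document. This is a minimal illustration, not the deck's own code: the helper name `upload_policy`, the region `us-west-2`, and the account ID `123456789012` are placeholders. The slide abbreviates the resource as `arn:aws:glacier:<accountId>:vaults/Films`; a full Glacier vault ARN also carries a region segment, which is assumed here.

```python
import json


def upload_policy(region, account_id, vault):
    """Build an IAM policy document that allows uploads to a single vault."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "glacier:UploadArchive",
                # Region and account ID are placeholders for illustration.
                "Resource": f"arn:aws:glacier:{region}:{account_id}:vaults/{vault}",
            }
        ],
    }


policy = upload_policy("us-west-2", "123456789012", "Films")
print(json.dumps(policy, indent=2))
```

Scoping the resource to one vault keeps the archiving application from writing to any other vault in the account.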
  • 13. Amazon Glacier concepts: Retrieving data • Step 1: Initiate job (ArchiveId: AE99F…, Vault: Films) -> Job ID • Step 2: 3-5 hours for job completion • Step 3: Job completion notification • Step 4: Download output
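Step 1 of the retrieval flow amounts to submitting a small set of job parameters. The sketch below only builds that parameter document; it does not call AWS. The helper name, the archive ID `EXAMPLE-ARCHIVE-ID`, and the SNS topic ARN are hypothetical placeholders (the slide's own archive ID is truncated and is not reproduced here).

```python
import json


def archive_retrieval_job(archive_id, sns_topic_arn=None, description=None):
    """Build job parameters for an archive-retrieval job."""
    params = {"Type": "archive-retrieval", "ArchiveId": archive_id}
    if sns_topic_arn:
        # Optional: topic to notify when the job completes (step 3).
        params["SNSTopic"] = sns_topic_arn
    if description:
        params["Description"] = description
    return params


job = archive_retrieval_job(
    "EXAMPLE-ARCHIVE-ID",  # placeholder, not a real archive ID
    sns_topic_arn="arn:aws:sns:us-west-2:123456789012:retrieval-done",  # placeholder
    description="Restore original master",
)
print(json.dumps(job, indent=2))
```

Attaching a notification topic avoids polling during the multi-hour wait before the output can be downloaded.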
  • 14. Amazon Glacier – Amazon S3 lifecycle archival • Seamlessly move data from Amazon S3 to Amazon Glacier • Automated lifecycle rules • Transition based on object age or predefined date
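An age-based lifecycle rule of the kind described above can be sketched as a configuration document. The structure follows the S3 lifecycle configuration format as I understand it; the helper name, the `masters/` prefix, and the 90-day threshold are illustrative assumptions, not values from the talk.

```python
import json


def glacier_transition_rule(prefix, days):
    """Build a lifecycle configuration that moves objects under `prefix`
    to the GLACIER storage class once they are `days` old."""
    return {
        "Rules": [
            {
                "ID": f"archive-after-{days}-days",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


lifecycle = glacier_transition_rule("masters/", 90)  # hypothetical prefix and age
print(json.dumps(lifecycle, indent=2))
```

For a transition on a predefined date rather than object age, the transition entry would carry a date field instead of `Days`.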
  • 15. Amazon Glacier – Backup software integration • CommVault – Native integration with Amazon S3 & Amazon Glacier • Deduplication & encryption • Single console management
  • 16. Amazon Glacier – Third-party tools and gateways • Consumer grade: less than $50 (examples: CloudBerry, FastGlacier, Arq from Haystack Software) • Small / medium business: $500 - $1,000 (examples: Synology, Veeam, QNAP) • Enterprise-grade gateway, price varies (example: NetApp AltaVault)
  • 17. Best practices – Prepare your data
  • 18. Use Archive descriptions • Use Archive description field for metadata. • If local index is corrupted or destroyed, use archive description to reconstruct critical mappings. • For example, create index entry, add primary key to archive description on upload.
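One way to follow this advice is to serialize the index key into the description at upload time and parse it back from the vault inventory if the local index is lost. The sketch below is an assumption-laden illustration: the JSON layout, the field names `pk` and `file`, and the helper names are hypothetical, and the 1,024-character limit reflects my understanding of the description field.

```python
import json

MAX_DESCRIPTION_LEN = 1024  # assumed limit on the archive description field


def make_description(primary_key, filename):
    """Encode index metadata as a compact ASCII JSON string."""
    desc = json.dumps({"pk": primary_key, "file": filename},
                      separators=(",", ":"), ensure_ascii=True)
    if len(desc) > MAX_DESCRIPTION_LEN:
        raise ValueError("description too long for the archive description field")
    return desc


def rebuild_index(inventory):
    """Reconstruct primary key -> archive mapping from (archive_id, description) pairs."""
    index = {}
    for archive_id, desc in inventory:
        meta = json.loads(desc)
        index[meta["pk"]] = {"archive_id": archive_id, "file": meta["file"]}
    return index


desc = make_description("order-42", "scan.tif")
recovered = rebuild_index([("example-archive-id", desc)])
print(recovered)
```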
  • 19. Small objects and object size overhead • Every archive has 32KB of associated overhead and some operations are charged per request • For archive size of 3.2MB ~1% cost overheads • For 1KB archive, 97% of cost would go to overhead • Solution is aggregation – recommend minimum size on the order of at least MBs
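The percentages on this slide follow from treating the 32KB per-archive overhead as extra billed storage. A minimal sketch of that arithmetic, assuming decimal units (1MB = 1,000KB) and ignoring per-request charges:

```python
OVERHEAD_KB = 32  # per-archive overhead quoted on the slide


def overhead_fraction(archive_kb):
    """Fraction of billed storage that is overhead rather than payload."""
    return OVERHEAD_KB / (OVERHEAD_KB + archive_kb)


for label, size_kb in [("1 KB", 1), ("3.2 MB", 3200), ("100 MB", 100_000)]:
    print(f"{label:>7}: {overhead_fraction(size_kb):.2%} overhead")
```

At 1KB the overhead is 32/33 of the total, roughly 97%; at 3.2MB it is 32/3232, roughly 1%. This is why aggregating small files into archives of at least several MB is recommended.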
  • 20. Archive aggregation (diagram: files 1, 2, 3, … are packed into a single archive, each with its checksum and metadata; an index/directory records each file's offset, and a local index holds the same offsets)
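The aggregation idea can be sketched in a few lines: concatenate small files into one blob and record each file's offset, length, and checksum in an index. This is an illustrative layout, not the format shown in the deck; the function names are hypothetical, SHA-256 is an assumed checksum choice, and a real implementation might also embed the index in the archive itself.

```python
import hashlib


def build_archive(files):
    """Concatenate files (name -> bytes) into one blob and return (blob, index)."""
    blob = bytearray()
    index = []
    for name, data in files.items():
        index.append({
            "name": name,
            "offset": len(blob),       # where this file starts in the archive
            "length": len(data),
            "sha256": hashlib.sha256(data).hexdigest(),
        })
        blob += data
    return bytes(blob), index


def extract(blob, entry):
    """Read one file back out of the blob and verify its checksum."""
    data = blob[entry["offset"]:entry["offset"] + entry["length"]]
    if hashlib.sha256(data).hexdigest() != entry["sha256"]:
        raise ValueError(f"checksum mismatch for {entry['name']}")
    return data


blob, index = build_archive({"a.txt": b"hello", "b.bin": b"\x00\x01\x02"})
print(index)
```

Because the index stores byte offsets, a single file can later be fetched with a ranged retrieval instead of downloading the whole archive.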
  • 21. Best practices – Optimize upload
  • 22. Best practices: Multipart uploads • Improve throughput and reliability, and gain idempotency, with multipart uploads • 1. InitiateMultipartUpload(partSize) → uploadId • 2. UploadPart(uploadId, data), with parts uploaded in parallel • 3. CompleteMultipartUpload(uploadId) → archiveId
  • 23. Best practices: Data ingestion options • AWS Direct Connect: Dedicated bandwidth between your site and AWS • Internet: Transfer data in a secure SSL tunnel over the public Internet • AWS Import/Export Snowball: Physical transfer of media into and out of AWS
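Two client-side pieces of a multipart upload can be sketched without calling AWS: splitting the archive into part byte ranges, and computing the SHA-256 tree hash that Glacier uses as a checksum (hash each 1 MiB chunk, then hash adjacent pairs until one digest remains). The function names are hypothetical, and the part-size rule (1 MiB times a power of two) reflects my understanding of the service constraints.

```python
import hashlib

MIB = 1024 * 1024


def part_ranges(total_size, part_size):
    """Return inclusive (first_byte, last_byte) ranges for each part."""
    n = part_size // MIB
    if part_size % MIB != 0 or n < 1 or n & (n - 1):
        raise ValueError("part size must be 1 MiB times a power of two")
    return [(start, min(start + part_size, total_size) - 1)
            for start in range(0, total_size, part_size)]


def tree_hash(data):
    """SHA-256 tree hash over 1 MiB chunks, returned as a hex string."""
    hashes = [hashlib.sha256(data[i:i + MIB]).digest()
              for i in range(0, len(data), MIB)] or [hashlib.sha256(b"").digest()]
    while len(hashes) > 1:
        paired = []
        for i in range(0, len(hashes), 2):
            if i + 1 < len(hashes):
                paired.append(hashlib.sha256(hashes[i] + hashes[i + 1]).digest())
            else:
                paired.append(hashes[i])  # odd digest is carried up unchanged
        hashes = paired
    return hashes[0].hex()


print(part_ranges(5 * MIB, 2 * MIB))
```

Since each part covers a fixed byte range, a failed part can be retried on its own and parts can be sent in parallel, which is where the throughput and idempotency benefits come from.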
  • 24. Best practices – Cost management
  • 25. Amazon Glacier – Data retrieval policies • Provides transparency and cost control for data retrievals • Governs all retrieval activities for an account in a region • Synchronously accept/reject each retrieval request • Accounts for inflight retrieval operations
  • 26–29. Amazon Glacier – Data retrieval policies
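A retrieval-rate cap of the kind these slides describe can be sketched as a policy document with a bytes-per-hour rule. The policy shape follows the Glacier data retrieval policy format as I understand it. The `accepts` function is only a toy local model of the synchronous accept/reject behavior and inflight accounting, not the service's actual algorithm, and the 10 GiB limit is a hypothetical value.

```python
GIB = 1024 ** 3


def bytes_per_hour_policy(max_gib_per_hour):
    """Build a data retrieval policy that caps the hourly retrieval rate."""
    return {
        "Policy": {
            "Rules": [
                {"Strategy": "BytesPerHour",
                 "BytesPerHour": int(max_gib_per_hour * GIB)}
            ]
        }
    }


def accepts(policy, inflight_bytes, requested_bytes):
    """Toy model: accept a request only if it fits under the hourly cap
    together with retrievals already in flight."""
    limit = policy["Policy"]["Rules"][0]["BytesPerHour"]
    return inflight_bytes + requested_bytes <= limit


policy = bytes_per_hour_policy(10)  # hypothetical 10 GiB/hour cap
print(accepts(policy, 8 * GIB, 2 * GIB), accepts(policy, 8 * GIB, 3 * GIB))
```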
  • 30. Cost allocation with vault tags
  • 31. Best practices – Security and compliance
  • 32. Amazon Glacier – Audit logging with AWS CloudTrail • Enable AWS CloudTrail in console • Control plane events – Vault activities • Data plane events – Archive activities
  • 33. Vault access policies • Manage access to a vault in a single location – single IAM policy – Grant/revoke access to internal business units/teams – “Marketing_Vault” has a different access policy from “DevOps_Vault” • Easily manage cross-account access for your business partner – Simply add a section for your business partner in the same policy
  • 34. Compliance storage with Vault Lock • Amazon Glacier Vault Lock allows you to easily set compliance controls on individual vaults and enforce them via a lockable policy. • Time-based retention • MFA authentication • Controls govern all records in a vault • Immutable policy • Two-step locking
  • 35. Vault Lock for compliance storage • Non-overwrite, non-erasable records • Time-based retention with “ArchiveAgeInDays” control • Policy lockdown (strong governance) • Legal hold with vault-level tags • Configure optional designated third-party access and grant temporary access
  • 36–37. Example control: 1 year record retention
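The policy text for this example did not survive in the transcript. A one-year retention control is commonly written as a Deny on `glacier:DeleteArchive` conditioned on `glacier:ArchiveAgeInDays`; the sketch below builds such a document under that assumption. The helper name, `Sid`, region, account ID, and vault name are placeholders.

```python
import json


def retention_lock_policy(vault_arn, days):
    """Build a Vault Lock policy that denies deleting archives younger than `days`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": f"deny-delete-before-{days}-days",
                "Principal": "*",
                "Effect": "Deny",
                "Action": "glacier:DeleteArchive",
                "Resource": vault_arn,
                "Condition": {
                    "NumericLessThan": {"glacier:ArchiveAgeInDays": str(days)}
                },
            }
        ],
    }


lock = retention_lock_policy(
    "arn:aws:glacier:us-west-2:123456789012:vaults/Records",  # placeholder ARN
    365,
)
print(json.dumps(lock, indent=2))
```

Once a policy like this is locked, the deny rule applies to every principal, so even the account owner cannot delete an archive before it reaches the configured age.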
  • 39. Legal hold with vault-level tags
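A legal hold driven by a vault tag can be sketched as a Deny statement keyed on a resource-tag condition. The tag name `LegalHold`, the helper name, and the ARN are assumptions for illustration; the condition key form `glacier:ResourceTag/<TagKey>` is the form I understand Glacier policies to use.

```python
import json


def legal_hold_statement(vault_arn, tag_key="LegalHold"):
    """Deny archive deletion while the vault carries the legal-hold tag set to 'true'."""
    return {
        "Sid": "deny-delete-while-on-legal-hold",
        "Principal": "*",
        "Effect": "Deny",
        "Action": "glacier:DeleteArchive",
        "Resource": vault_arn,
        "Condition": {
            "StringLike": {f"glacier:ResourceTag/{tag_key}": "true"}
        },
    }


hold = legal_hold_statement(
    "arn:aws:glacier:us-west-2:123456789012:vaults/Records"  # placeholder ARN
)
print(json.dumps(hold, indent=2))
```

Because the hold is expressed through a tag, it can be applied or lifted by changing the tag value rather than by editing the policy.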
  • 41. Vault lock best practices
  • 42. Using vault lock policy with vault access policy • Vault access policy (flexibility): can be updated/deleted. Use it to designate third-party access and to grant temporary read permissions when necessary • Vault lock policy (compliance/governance): lockable/immutable policy that cannot be updated/deleted after lockdown. Use it to deploy regulatory controls such as records retention and to enforce data access through multi-factor authentication only
  • 43–59. Vault Lock in the Glacier Console
  • 60. Amazon Glacier received a third-party assessment from Cohasset Associates on how Amazon Glacier with Vault Lock can be used to meet the requirements of SEC 17a-4(f) and CFTC 1.31(b)-(c).