This session explores some of the key features of Amazon Glacier, including security, durability, and configuration for storing compliance and regulatory data. It covers best practices for managing your cold data, including ingest, retrieval, and security controls. Other topics include: how to optimize storage, upload, and retrieval costs; how to identify the most applicable workloads; and recommended optimizations based on a few sample use cases from a number of industry verticals.
2. Audio archives – SoundCloud
• World’s leading social sound platform
• Audio files transcoded and stored in multiple formats
• Stores PBs of data
• Transcoded files served from Amazon S3
• Originals moved to Amazon Glacier for long-term retention
4. Tape replacement – King County
• Most populous county in Washington State
• Replaced the tape backup solution for 17 agencies
• Meets compliance requirements
• Saved $1MM in the first year, with no more tape refresh or management churn
5. Archive: Data retained for the long term, for compliance or potential future reference
Data archiving needs are growing everywhere
• Media assets, 4K, 8K
• Health care / Life sciences
• Financial services
• Regulated industries
• Oil and gas / Geospatial
• Digital preservation
• Long-term backups
• Logs
7. How can Amazon Glacier help with your archival?
• Metered usage: pay as you go
• No capital investment
• No commitment
• No risky capacity planning
• Avoid risks of physical media handling
• Control your geographic locality for performance and compliance
8. Amazon Glacier is a low-cost storage service for archival data with long-term retention requirements.
$0.007/GB per month
3-5 hour data retrieval
Financial records
Medical PACS images
High-resolution media assets
9. How can Amazon Glacier help with your archival?
Extremely low-cost archive storage service, starting at $0.007/GB per month
Allows you to retrieve data within 3-5 hours
99.999999999% durability (7 orders of magnitude higher than 2 copies of tape)
No data migration, no hardware/infrastructure investments
Virtually unlimited scale; pay only for what you use
Access to on-demand compute resources on AWS
10. Getting started – key concepts
• Account – Access AWS services, view billing/usage, manage security
• Vaults – Container for archives, up to 1,000 vaults per account per region
• Archives – Files and records, write-once, 40TB max, unlimited archives
• Inventory – Cold index of archive properties refreshed every 24 hours
11. Amazon Glacier – 3 ways to access
• Direct Glacier API/SDK
• S3 lifecycle integration
• Third-party tools and gateways
12. Amazon Glacier concepts: Uploading data
1. Create vault (Films)
2. Configure access policies
ArchiveApp user policy
Effect: Allow
Resource: arn:aws:glacier:<region>:<accountId>:vaults/Films
Action: glacier:UploadArchive
3. Upload archives
UploadArchive(data) -> Archive ID
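The three upload steps can be sketched with the AWS SDK for Python. This is a minimal sketch rather than a definitive implementation: it assumes `client` is a `boto3.client("glacier")` object, and the helper names `upload_policy` and `create_and_upload`, the region, and the account ID are illustrative placeholders that do not come from the slides.

```python
import json


def upload_policy(region, account_id, vault_name):
    """Build an IAM user policy that only allows uploads to one vault."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "glacier:UploadArchive",
            "Resource": f"arn:aws:glacier:{region}:{account_id}:vaults/{vault_name}",
        }],
    }


def create_and_upload(client, vault_name, data, description=""):
    """Create the vault (step 1) and upload one archive (step 3).

    `client` is assumed to be boto3.client("glacier"). Returns the archive ID,
    which the caller must store in order to retrieve the archive later.
    """
    client.create_vault(vaultName=vault_name)
    response = client.upload_archive(
        vaultName=vault_name,
        archiveDescription=description,
        body=data,
    )
    return response["archiveId"]


if __name__ == "__main__":
    # Placeholder region and account ID, for illustration only (step 2).
    print(json.dumps(upload_policy("us-west-2", "123456789012", "Films"), indent=2))
```

The policy document would be attached to the ArchiveApp user through IAM; that step is left out here.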
13. Amazon Glacier concepts: Retrieving data
1. Initiate job
ArchiveId: AE99F…
Vault: Films -> Job ID
2. 3-5 hours for job completion
3. Job completion notification
4. Download output
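The retrieval flow can be sketched in the same way. The sketch below assumes `client` is a `boto3.client("glacier")` object; the function name `retrieve_archive` and the polling interval are illustrative. Polling stands in for the job completion notification; in practice a notification topic on the vault would avoid the loop.

```python
import time


def retrieve_archive(client, vault_name, archive_id, poll_seconds=900):
    """Run the four retrieval steps and return the archive bytes.

    `client` is assumed to be boto3.client("glacier").
    """
    # Step 1: initiate the retrieval job and keep the job ID.
    job = client.initiate_job(
        vaultName=vault_name,
        jobParameters={"Type": "archive-retrieval", "ArchiveId": archive_id},
    )
    job_id = job["jobId"]

    # Steps 2-3: wait until the job is reported as completed.
    while not client.describe_job(vaultName=vault_name, jobId=job_id)["Completed"]:
        time.sleep(poll_seconds)

    # Step 4: download the job output.
    output = client.get_job_output(vaultName=vault_name, jobId=job_id)
    return output["body"].read()
```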
14. Amazon Glacier – Amazon S3 lifecycle archival
• Seamlessly move data from Amazon S3 to Amazon Glacier
• Automated lifecycle rules
• Transition based on object age or predefined date
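An age-based lifecycle rule might look like the following. This is a sketch under assumptions: the rule ID, prefix, bucket name, and 30-day threshold are made-up examples, and the dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts.

```python
def glacier_lifecycle_rule(rule_id, prefix, days):
    """Return a lifecycle configuration that moves objects to Glacier by age."""
    return {
        "Rules": [{
            "ID": rule_id,
            "Filter": {"Prefix": prefix},
            "Status": "Enabled",
            "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
        }]
    }


config = glacier_lifecycle_rule("archive-originals", "originals/", 30)

# With boto3 this could be applied as follows (bucket name is a placeholder):
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-bucket", LifecycleConfiguration=config)
```

For a transition on a predefined date, the `Days` key in the transition would be replaced with a `Date` value.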
18. Use archive descriptions
• Use the archive description field for metadata.
• If the local index is corrupted or destroyed, use archive descriptions to reconstruct critical mappings.
• For example, when you create an index entry, add its primary key to the archive description on upload.
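One way to apply this is to store the primary key as a small JSON document in the description, then rebuild the mapping from a vault inventory. The sketch below is an assumption-laden illustration: the `pk` field name and both helper names are invented here, and `archive_list` is assumed to be the `ArchiveList` array from a JSON inventory job output.

```python
import json

MAX_DESCRIPTION_LENGTH = 1024  # Glacier's limit on archive descriptions


def make_description(primary_key, **extra):
    """Serialize index metadata into an ASCII archive description string."""
    text = json.dumps({"pk": primary_key, **extra}, sort_keys=True)
    if len(text) > MAX_DESCRIPTION_LENGTH:
        raise ValueError("description exceeds 1,024 characters")
    return text


def rebuild_index(archive_list):
    """Rebuild {primary key: archive ID} from inventory entries."""
    index = {}
    for entry in archive_list:
        try:
            meta = json.loads(entry["ArchiveDescription"])
        except (KeyError, TypeError, ValueError):
            continue  # skip archives without a parseable description
        if isinstance(meta, dict) and "pk" in meta:
            index[meta["pk"]] = entry["ArchiveId"]
    return index
```

Usage: pass `make_description("film-0001")` as the archive description at upload time, and call `rebuild_index` on the inventory if the local index is lost.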
19. Small objects and object size overhead
• Every archive has 32KB of associated overhead, and some operations are charged per request
• For a 3.2MB archive, the overhead is about 1% of the cost
• For a 1KB archive, 97% of the cost goes to overhead
• The solution is aggregation: aim for archive sizes of at least a few MB
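The overhead figures follow from simple arithmetic: the overhead share of billed storage is 32KB divided by the archive size plus 32KB. The sketch below reproduces the two numbers on the slide; the function names are illustrative, and per-request charges are ignored.

```python
OVERHEAD_KB = 32  # per-archive overhead quoted on the slide


def overhead_share(archive_kb):
    """Fraction of billed storage that is overhead for one archive."""
    return OVERHEAD_KB / (archive_kb + OVERHEAD_KB)


def min_archive_kb(target_share):
    """Smallest archive size (KB) that keeps overhead at or below target_share."""
    return OVERHEAD_KB * (1 - target_share) / target_share


print(f"{overhead_share(3200):.0%}")  # 3.2MB archive -> 1%
print(f"{overhead_share(1):.0%}")     # 1KB archive   -> 97%
```

Solving the same expression for size gives a rough aggregation target: keeping overhead under 1% needs archives of roughly 3.2MB or more.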
22. Best practices: Multipart uploads
Improve throughput and reliability, and get idempotency, with multipart uploads
1. InitiateMultipartUpload(partSize) → uploadId
2. UploadPart(uploadId, data)
3. CompleteMultipartUpload(uploadId) → archiveId
Diagram: parts uploaded in parallel and assembled into one archive
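The three calls above can be sketched as follows, assuming `client` is a `boto3.client("glacier")` object. The helper names are illustrative. The SHA-256 tree hash (1 MiB leaf chunks, hashes combined pairwise) is the checksum Glacier expects when completing an upload. Parts are sent sequentially here for brevity; each part call is independent, so a thread pool could send them in parallel.

```python
import hashlib

MIB = 1024 * 1024


def part_ranges(total_size, part_size):
    """Yield (start, end) byte offsets, end inclusive, for each part."""
    for start in range(0, total_size, part_size):
        yield start, min(start + part_size, total_size) - 1


def tree_hash(data):
    """SHA-256 tree hash over 1 MiB chunks, returned as a hex string."""
    hashes = [hashlib.sha256(data[i:i + MIB]).digest()
              for i in range(0, len(data), MIB)] or [hashlib.sha256(b"").digest()]
    while len(hashes) > 1:
        paired = [hashlib.sha256(hashes[i] + hashes[i + 1]).digest()
                  for i in range(0, len(hashes) - 1, 2)]
        if len(hashes) % 2:
            paired.append(hashes[-1])  # an odd trailing hash is promoted unchanged
        hashes = paired
    return hashes[0].hex()


def multipart_upload(client, vault_name, data, part_size=MIB):
    """Upload `data` in parts and return the archive ID.

    `client` is assumed to be boto3.client("glacier"); part_size should be
    1 MiB multiplied by a power of two.
    """
    upload_id = client.initiate_multipart_upload(
        vaultName=vault_name, partSize=str(part_size))["uploadId"]
    for start, end in part_ranges(len(data), part_size):
        client.upload_multipart_part(
            vaultName=vault_name, uploadId=upload_id,
            range=f"bytes {start}-{end}/*", body=data[start:end + 1])
    done = client.complete_multipart_upload(
        vaultName=vault_name, uploadId=upload_id,
        archiveSize=str(len(data)), checksum=tree_hash(data))
    return done["archiveId"]
```

Re-sending a part with the same byte range overwrites the earlier copy, which is what makes retries safe.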
23. Best practices: Data ingestion options
• AWS Direct Connect: dedicated bandwidth between your site and AWS
• Internet: transfer data in a secure SSL tunnel over the public Internet
• AWS Import/Export Snowball: physical transfer of media into and out of AWS
25. Amazon Glacier – Data retrieval policies
• Provides transparency and cost control for data retrievals
• Governs all retrieval activities for an account in a region
• Synchronously accepts or rejects each retrieval request
• Accounts for in-flight retrieval operations
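A retrieval policy is a small document with a single rule. The sketch below builds one in the shape that boto3's `set_data_retrieval_policy` accepts; the helper name and the 1 GiB per hour limit are illustrative assumptions.

```python
def retrieval_policy(max_bytes_per_hour=None, free_tier_only=False):
    """Build a data retrieval policy for one account in one region."""
    if free_tier_only:
        rule = {"Strategy": "FreeTier"}
    elif max_bytes_per_hour is not None:
        rule = {"Strategy": "BytesPerHour", "BytesPerHour": max_bytes_per_hour}
    else:
        rule = {"Strategy": "None"}  # no retrieval limit
    return {"Rules": [rule]}


policy = retrieval_policy(max_bytes_per_hour=1024 ** 3)  # cap at 1 GiB per hour

# With boto3 this could be applied as:
# boto3.client("glacier").set_data_retrieval_policy(Policy=policy)
```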
32. Amazon Glacier – Audit logging with AWS CloudTrail
• Enable AWS CloudTrail in the console
• Control plane events – vault activities
• Data plane events – archive activities
33. Vault access policies
• Manage access to a vault in a single location with a single policy
– Grant/revoke access to internal business units/teams
– “Marketing_Vault” can have a different access policy from “DevOps_Vault”
• Easily manage cross-account access for your business partners
– Simply add a statement for each business partner in the same policy
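A cross-account grant could be expressed as one statement in the vault access policy. The sketch below is illustrative only: the account IDs, region, statement ID, and helper name are placeholders, and the read-only action list is one possible choice.

```python
import json


def partner_read_statement(vault_arn, partner_account_id):
    """Statement that lets a partner account retrieve archives from one vault."""
    return {
        "Sid": f"partner-read-{partner_account_id}",
        "Effect": "Allow",
        "Principal": {"AWS": [f"arn:aws:iam::{partner_account_id}:root"]},
        "Action": ["glacier:InitiateJob", "glacier:GetJobOutput"],
        "Resource": [vault_arn],
    }


def vault_access_policy(statements):
    """Wrap statements into a policy document."""
    return {"Version": "2012-10-17", "Statement": list(statements)}


vault_arn = "arn:aws:glacier:us-west-2:123456789012:vaults/Marketing_Vault"
policy = vault_access_policy([partner_read_statement(vault_arn, "999999999999")])

# With boto3 this could be applied as:
# boto3.client("glacier").set_vault_access_policy(
#     vaultName="Marketing_Vault", policy={"Policy": json.dumps(policy)})
```

Revoking the partner's access then means removing that one statement and setting the policy again.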
34. Amazon Glacier Vault Lock allows you to easily set compliance controls on individual vaults and enforce them via a lockable policy.
Compliance Storage with Vault Lock
• Time-based retention
• MFA authentication
• Controls govern all records in a vault
• Immutable policy
• Two-step locking
35. Vault Lock for compliance storage
• Non-overwrite, non-erasable records
• Time-based retention with “ArchiveAgeInDays” control
• Policy lockdown (strong governance)
• Legal hold with vault-level tags
• Configure optional designated third-party access and grant temporary access
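A retention control and a legal hold can both be written as Deny statements in the Vault Lock policy. The sketch below assumes `client` is a `boto3.client("glacier")` object; the helper names, statement IDs, and the `LegalHold` tag key are illustrative choices.

```python
import json


def retention_lock_policy(vault_arn, retention_days):
    """Deny deletes of archives younger than retention_days or under legal hold."""
    def deny_delete(sid, condition):
        return {
            "Sid": sid,
            "Principal": "*",
            "Effect": "Deny",
            "Action": "glacier:DeleteArchive",
            "Resource": [vault_arn],
            "Condition": condition,
        }
    return {
        "Version": "2012-10-17",
        "Statement": [
            deny_delete("deny-delete-before-retention",
                        {"NumericLessThan": {"glacier:ArchiveAgeInDays": str(retention_days)}}),
            deny_delete("deny-delete-under-legal-hold",
                        {"StringLike": {"glacier:ResourceTag/LegalHold": "true"}}),
        ],
    }


def start_lock(client, vault_name, policy_doc):
    """Locking step 1: attach the policy in a testable, abortable state."""
    response = client.initiate_vault_lock(
        vaultName=vault_name, policy={"Policy": json.dumps(policy_doc)})
    return response["lockId"]


def finish_lock(client, vault_name, lock_id):
    """Locking step 2: make the policy immutable."""
    client.complete_vault_lock(vaultName=vault_name, lockId=lock_id)
```

The gap between the two steps is the window for testing the policy; after `finish_lock` the policy can no longer be changed or removed.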
42. Compliance/Governance Flexibility: using vault lock policy with vault access policy
Vault access policy
• Can be updated/deleted
Use vault access policy to:
• Designate third-party access
• Grant temporary read permissions when necessary
Vault lock policy
• Lockable/immutable policy
• Cannot be updated/deleted after lockdown
Use vault lock policy to:
• Deploy regulatory controls such as records retention
• Enforce data access through multi-factor authentication only
60. Amazon Glacier received a third-party assessment from Cohasset Associates on how Amazon Glacier with Vault Lock can be used to meet the requirements of SEC 17a-4(f) and CFTC 1.31(b)-(c).