Glacier and S3
Dave Thompson
AWS Meetup Michigan, Jan 2014
Who the @#%^ is Dave
Thompson?
• DevOps/SRE/Systems guy from MI by way of San
Francisco
• Current Employer: MuleSoft Inc
• Past Employers: Netflix, Domino’s Pizza, U of M
• Also contributing to the madness at RBN
… and what is he talking
about?
• Today, we’ll talk about a case study using Glacier
with S3, and the various surprises that I
encountered on the way.
Act 1: A New Project
Our Story So Far
• Client’s datacenter is going dark in a few months.
• Their app is data heavy… a little less than 1 BN
small files.
Our Story So Far (cont.)
• Client has migrated app servers to EC2
• Data has been uploaded to S3
Everything Goes According
to Plan!
• Files are uploaded to S3
• App updated to use S3 data
Act 2: The Public Cloud Strikes Back
Things take a
dark turn…
S3 latency is too high for the app.
Enter RBN!
The proposal: migrate the data from S3 to a cloud storage
solution (Zadara), and archive the files to Glacier.
Everything Goes According
to Plan (Again)!
• Files are copied to Zadara share
• S3 lifecycle configured to archive objects to Glacier
Except…
The Zadara share becomes
corrupted after the data is migrated.
Amazon Glacier: a Primer
• Glacier is an archival solution provided by AWS.
• It’s closely integrated with S3.
• Use cases for Glacier and S3 are different,
though…
S3 vs Glacier
• Unlike an S3 GET, a Glacier RETRIEVAL takes ~4
hours
• UPLOAD and RETRIEVAL API requests are 10x
more expensive on Glacier than comparable S3
requests
• Bandwidth charges for RETRIEVAL requests apply,
even inside us-east-1
S3 vs Glacier (cont.)
• This means that Glacier is optimized for
compressed archives (e.g. tarballs)
• S3 is about equally suited for smaller or larger files
• Automatically archiving S3 objects to Glacier can
thus lead to great sadness.
What a Twist!
~100MM files had already been
automatically archived to Glacier.
Act 3: Return of the Data
The New Plan
• Restore files from Glacier back to S3
• Migrate data from S3 to Zadara share
• Archive files back to Glacier in tar.gz chunks
• Create DynamoDB index from file name to Glacier
archive for future restore
but wait…
How much was this restore going to cost?
Task 0: Calculating Cost
• Glacier pricing model is… interesting
• Costs are fixed per UPLOAD and RETRIEVAL
request
• Cost for bandwidth based on the peak outbound
bandwidth consumed in a monthly billing period
• Monthly bandwidth equal to 5% of your total Glacier
usage is permitted free of charge
The Equation (Oh, boy. Okay, let’s do
this.)
• Let X equal the number of RETRIEVE API calls made.
• Let Y equal the amount to restore in GB.
• Let Z equal the total amount of data archived in GB.
• Let T equal the time to restore the data in hours.
• Then the cost can be expressed as:
(0.05 * (X / 1000)) + (((Y / T) - (Z * .05 / 30)) * .01 * 720)
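The equation can be sketched as a small Python function. This is only an illustration of the slide's formula, reading the bandwidth term as ((Y / T) - (Z * .05 / 30)) * .01 * 720. The function name, the parameter names, and the clamp at zero (so that a restore rate below the free allowance never produces a negative charge) are assumptions added here, not part of the talk.

```python
def glacier_restore_cost(x_requests, y_restore_gb, z_archived_gb, t_hours,
                         request_price_per_1000=0.05,
                         gb_hour_price=0.01,
                         hours_per_month=720):
    """Estimate restore cost using the formula from the slide.

    x_requests    -- X, number of RETRIEVE API calls
    y_restore_gb  -- Y, amount to restore in GB
    z_archived_gb -- Z, total amount of data archived in GB
    t_hours       -- T, time to restore the data in hours
    The three keyword defaults are the constants that appear in the slide.
    """
    # Fixed cost per request: 0.05 per 1000 calls.
    request_cost = request_price_per_1000 * (x_requests / 1000)

    # Restore rate implied by spreading Y over T hours.
    restore_rate = y_restore_gb / t_hours

    # Free allowance term exactly as written on the slide: 5% of Z over 30.
    free_allowance = z_archived_gb * 0.05 / 30

    # Assumption (not in the slide): never bill a negative rate.
    billable_rate = max(restore_rate - free_allowance, 0.0)

    bandwidth_cost = billable_rate * gb_hour_price * hours_per_month
    return request_cost + bandwidth_cost
```

With one request per file for the ~100MM archived files, the request term alone is 0.05 * (100,000,000 / 1000) = 5,000, before any bandwidth charge. Increasing T lowers the bandwidth term, which is why the restore duration matters so much in this pricing model.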
Task 1: Restore from Glacier
• Two m2.large instances running a Python daemon
• Multiple iterations, from single threaded to
multi-threaded to multiprocessing with threading
After iterating several times to get the speed we needed, I
started the process for the ‘last time’ on a Sunday evening.
ETA: ~5 days
This Page Intentionally
Left Blank
Protip:
Glacier is not optimized for RPS
Task 1: Restore from Glacier
(cont.)
Glacier team was not amused.
Task 1: Restore from Glacier
(cont.)
Restore continued at the ‘suggested’ rate, and
completed successfully a couple of weeks later.
Task 1 complete!
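One way to keep a threaded restore at a 'suggested' request rate is a shared rate limiter in front of the worker pool. The sketch below is not the daemon from the talk: `RateLimiter`, `restore_all`, and the injected `request_restore` callable are hypothetical names, and the actual restore call (for example an S3 restore request per object) is left to the caller.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimiter:
    """Hand out evenly spaced time slots, at most `rate` per second,
    shared by all threads."""

    def __init__(self, rate):
        self._interval = 1.0 / rate
        self._lock = threading.Lock()
        self._next_slot = time.monotonic()

    def wait(self):
        # Reserve the next free slot under the lock, then sleep outside it
        # so other threads can reserve their own slots meanwhile.
        with self._lock:
            now = time.monotonic()
            slot = max(self._next_slot, now)
            self._next_slot = slot + self._interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)


def restore_all(keys, request_restore, max_rps, workers=8):
    """Call request_restore(key) for every key, never starting more than
    max_rps calls per second. request_restore is a hypothetical callable
    supplied by the caller; results come back in input order."""
    limiter = RateLimiter(max_rps)

    def task(key):
        limiter.wait()
        return request_restore(key)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(task, keys))
```

The thread count controls how many requests can be in flight, while `max_rps` caps how fast new ones start, so the two can be tuned independently.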
Task 2: Migrate and Archive
Data
Now we just needed to migrate the data from S3 to Zadara
(again), create tarballs of the files, archive them to Glacier, and
create a DynamoDB index so you can look up individual files.
Easy!
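The tarball-plus-index step might look roughly like the sketch below. It is a simplified illustration, not the code used in the project: `build_tar_gz`, `archive_in_chunks`, and the `upload_archive` callable are hypothetical, files are passed in as in-memory bytes, and a plain dict stands in for the DynamoDB index that maps each file name to its archive.

```python
import io
import tarfile


def build_tar_gz(files):
    """Pack an iterable of (name, data_bytes) pairs into tar.gz bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def archive_in_chunks(files, chunk_size, upload_archive):
    """Group files into tar.gz chunks of up to chunk_size files each.

    upload_archive(blob) is a hypothetical callable that stores one
    archive and returns its archive id. Returns a dict mapping every
    file name to the id of the archive that contains it (a stand-in
    for the DynamoDB index).
    """
    index = {}
    batch = []

    def flush():
        if not batch:
            return
        archive_id = upload_archive(build_tar_gz(batch))
        for name, _ in batch:
            index[name] = archive_id
        batch.clear()

    for item in files:
        batch.append(item)
        if len(batch) >= chunk_size:
            flush()
    flush()  # upload the final, possibly smaller, chunk
    return index
```

To restore a single file later, look up its name in the index, retrieve that one archive, and extract the member with the same name.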
Task 2: Migrate and Archive
Data (cont.)
Back to IPython and Boto. Recent experience with Python
threading and multiprocessing was to prove helpful.
This Page Intentionally
Left Blank
Great Success!
And the whole thing only took about 10x as long
as the client initially estimated!
Lessons Learned
• Glacier is optimized for large, compressed files and
lower request rates.
• Be very careful about the S3 -> Glacier lifecycle
option.
• If you DoS an Amazon service, you get special
attention!
Questions have you?