Optimizing for Cost in the AWS Cloud - 5 Ways to Further Save - AWS Summit 2012 - NYC - Jinesh Varia

Optimizing for Cost in the Cloud

Jinesh Varia
@jinman
Technology Evangelist

Multiple dimensions of optimizations

Cost
Performance
Response time
Time to market
High-availability
Scalability
Security
Manageability
…

When you turn off your cloud resources,
you actually stop paying for them

Continuous optimization in your architecture results
in recurring savings in your next month’s bill

Elasticity is one of the fundamental
properties of the cloud that drives many of its
economic benefits

Optimizing for Cost…

#1 Use only what you need (use Auto Scaling Service, Modify-DB-Instance)

Turn off what you don’t need (automatically)

[Chart: Daily CPU Load. Load (0 to 14) by Hour of the day (1 to 24); annotated "25% Savings"]

Optimize by the time of day

[Diagram: 3-tier web application. Components shown: Amazon Route 53 (DNS); www.MyWebSite.com (dynamic data) behind an Elastic Load Balancer; media.MyWebSite.com (static data) on Amazon CloudFront and Amazon S3; Auto Scaling group : Web Tier and Auto Scaling group : App Tier on Amazon EC2; Amazon RDS in Availability Zone #1 and Availability Zone #2]

[Chart: Web Servers by Week of the year; annotated "50% Savings"]

Optimize during a year

Auto scaling : Types of Scaling
Scaling by Schedule
• Use Scheduled Actions in Auto Scaling Service
• Date
• Time
• Min and Max of Auto Scaling Group Size
• You can create up to 125 actions, scheduled up to 31 days into the
future, for each of your auto scaling groups. This gives you the ability
to scale up to four times a day for a month.
Scaling by Policy
• Scaling up Policy - Double the group size
• Scaling down Policy - Decrement by 1
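
The numbers in the bullets above can be checked with a few lines of Python. This is only a sketch: the limits (125 actions, 31 days) and the two policies (double the group, decrement by 1) come from the slide, while the function names and the min/max clamping are assumptions made for illustration and are not the Auto Scaling API.

```python
# Limits quoted on the slide.
MAX_ACTIONS = 125
MAX_DAYS_AHEAD = 31

# Whole scheduled actions available per day over the full window.
actions_per_day = MAX_ACTIONS // MAX_DAYS_AHEAD


def clamp(size, min_size, max_size):
    """Keep the group size inside the group's configured Min and Max."""
    return max(min_size, min(size, max_size))


def scale_up(size, min_size, max_size):
    """Scaling up policy from the slide: double the group size."""
    return clamp(size * 2, min_size, max_size)


def scale_down(size, min_size, max_size):
    """Scaling down policy from the slide: decrement by 1."""
    return clamp(size - 1, min_size, max_size)


print(actions_per_day)
print(scale_up(4, 2, 10), scale_down(4, 2, 10))
```

Doubling on the way up and stepping down by one instance is one way to get the "scale up quickly, scale down slowly" behaviour recommended later in the deck.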

Auto scaling Best Practices

Use Auto Scaling Tags
Use Auto scaling Alarms and Email Notifications
Scale up and down symmetrically
Scale up quickly and scale down slowly
Auto Scaling across Availability Zones
Leverage Suspend and Resume Processes

Example:

Scale up by 10%
if CPU utilization is greater than 60%
for 5 minutes,

Scale down by 10%
if CPU utilization is less than 30%
for 20 minutes.
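
The example rule above can be sketched as a small decision function. The thresholds (60% for 5 minutes, 30% for 20 minutes, 10% steps) come from the slide; the per-minute sample list, the `decide` name, the rounding of the 10% step, and the floor of one instance are assumptions, and this is not how the Auto Scaling service itself is configured.

```python
def decide(cpu_samples, size):
    """Return the new group size.

    cpu_samples: per-minute CPU utilization percentages, oldest first.
    size: current number of instances in the group.
    """
    # Assumed rounding: a 10% step is at least one instance.
    step = max(1, size * 10 // 100)

    # Scale up by 10% if CPU has been above 60% for the last 5 minutes.
    if len(cpu_samples) >= 5 and all(c > 60 for c in cpu_samples[-5:]):
        return size + step

    # Scale down by 10% if CPU has been below 30% for the last 20 minutes.
    if len(cpu_samples) >= 20 and all(c < 30 for c in cpu_samples[-20:]):
        return max(1, size - step)

    return size


print(decide([70] * 5, 10))   # sustained high load
print(decide([20] * 20, 10))  # sustained low load
```

The longer window on the way down is what makes the group slow to shrink: a brief dip in load is not enough to remove capacity.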

[Chart: RDS DB Servers by Days of the Month (1 to 29 shown); annotated "75% Savings"]

Optimize during a month

End of the month processing
Expand the cluster at the end of the month
• Expand/Shrink feature in Amazon Elastic MapReduce
Vertically Scale up at the end of the month
• Modify-DB-Instance (in Amazon RDS) (or a New RDS DB Instance )
• CloudFormation Script (in Amazon EC2)

Tip: Use “Reminder scripts”

• Disassociate your unused EIPs
• Delete unassociated EBS volumes
• Delete older EBS snapshots
• Leverage S3 Object Expiration

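A reminder script of this kind could look roughly like the sketch below. Everything in it is hypothetical: the inventory records, their field names, and the 90-day snapshot age are made up for illustration, and a real script would read the inventory from your own tooling rather than a hard-coded list.

```python
from datetime import date, timedelta

# Hypothetical inventory; field names are assumptions for this sketch.
inventory = [
    {"kind": "eip", "id": "eip-1", "attached": False},
    {"kind": "eip", "id": "eip-2", "attached": True},
    {"kind": "volume", "id": "vol-1", "attached": False},
    {"kind": "snapshot", "id": "snap-1", "created": date(2012, 1, 1)},
    {"kind": "snapshot", "id": "snap-2", "created": date(2012, 4, 1)},
]


def reminders(items, today, max_snapshot_age_days=90):
    """List resources that are probably costing money without being used."""
    cutoff = today - timedelta(days=max_snapshot_age_days)
    notes = []
    for item in items:
        if item["kind"] in ("eip", "volume") and not item["attached"]:
            notes.append(f"{item['id']}: unattached {item['kind']}")
        elif item["kind"] == "snapshot" and item["created"] < cutoff:
            notes.append(f"{item['id']}: snapshot older than {max_snapshot_age_days} days")
    return notes


found = reminders(inventory, today=date(2012, 4, 19))
for note in found:
    print(note)
```
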
Basic recommendations on Instance Type

Choose the EC2 instance type that best matches the resources
required by the application
• Start with memory requirements and architecture type (32-bit or 64-bit)
• Then choose the closest number of virtual cores required
Scaling across AZs
• Smaller sizes give more granularity for deploying to multiple AZs
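
The selection order described above (memory first, then the closest number of virtual cores) can be sketched as follows. The catalog is entirely hypothetical: the type names and sizes are placeholders, not real EC2 instance types or specifications.

```python
# Hypothetical catalog: (name, memory in GiB, virtual cores).
CATALOG = [
    ("type-a", 2, 1),
    ("type-b", 8, 2),
    ("type-c", 16, 4),
    ("type-d", 16, 2),
]


def pick(memory_gib, cores):
    """Pick a type with enough memory, then the closest core count that fits."""
    fits = [t for t in CATALOG if t[1] >= memory_gib and t[2] >= cores]
    if not fits:
        return None
    # Closest number of cores first, then the smallest memory among ties.
    return min(fits, key=lambda t: (t[2], t[1]))[0]


print(pick(4, 2))
print(pick(12, 2))
```
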

AWS Support – Trusted Advisor –
Your personal cloud assistant

Tip – Instance Optimizer

[Diagram: An Instance PUTs Custom Metrics (Free Memory, Free CPU, Free HDD) to Amazon CloudWatch at 1-min intervals for 2 weeks, feeding an Alarm]

“You could save a bunch of money by switching to a small instance.
Click on CloudFormation Script to save.”



#2 Invest time in Reserved Pricing analysis (EC2, RDS)

Your Best Option: Reserved + On-Demand

Save more when you reserve

On-demand Instances
• Pay as you go
• Starts from $0.02/Hour

Reserved Instances
• One time low upfront fee + Pay as you go
• $23 for 1 year term and $0.01/Hour
• 1-year and 3-year terms
• Heavy Utilization RI, Medium Utilization RI, Light Utilization RI

[Chart: Cost ($0 to $14,000) versus Utilization for an m2.xlarge running Linux in US-East Region over a 3 Year period. Series: On-Demand, Light Utilization, Medium Utilization, Heavy Utilization; the break-even point is marked]

Utilization   Sweet Spot              Feature                                          Savings over On-Demand
<10%          On-Demand               No Upfront Commitment
10% - 40%     Light Utilization RI    Ideal for Disaster Recovery                      Up to 56% (3-Year)
40% - 75%     Medium Utilization RI   Standard Reserved Capacity                       Up to 66% (3-Year)
>75%          Heavy Utilization RI    Lowest Total Cost; Ideal for Baseline Servers    Up to 71% (3-Year)
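
The utilization bands above translate directly into a lookup. This is a minimal sketch: the band edges come from the table, but the table does not say which band owns the exact 40% boundary, so assigning it to Medium here is an assumption, as is the function name.

```python
def sweet_spot(utilization):
    """Map expected utilization (0.0 to 1.0) to the pricing model in the table."""
    if utilization < 0.10:
        return "On-Demand"
    if utilization < 0.40:
        return "Light Utilization RI"
    if utilization <= 0.75:  # assumption: exactly 40% is treated as Medium
        return "Medium Utilization RI"
    return "Heavy Utilization RI"


for u in (0.05, 0.25, 0.60, 0.90):
    print(u, sweet_spot(u))
```
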

Recommendations

Steady State Usage Pattern
• For 100% utilization
• 3-Year Heavy RI (for maximum savings over on-demand)
Spiky Predictable Usage Pattern
• Baseline
• 3-Year Heavy RI (for maximum savings over on-demand)
• 1-Year Light RI (for lowest upfront commitment) + savings over on-demand
• Peak: On-Demand
Uncertain and unpredictable Usage Pattern
• Start out small with On-Demand Instances (risk-free and commitment-free)
• Switch to some combination of Reserved and On-Demand, if application is
successful
• If not successful, you walk away having spent a fraction of what you would
pay to buy your own technology infrastructure

Example: Simple 3-Tier Web Application

Description          Option 1      Option 2                        Option 3                                       Option 4
2 Web servers        2 On-Demand   2 On-Demand                     1 On-Demand and 1 Reserved Medium Utilization  1 On-Demand and 1 Reserved Light Utilization
2 App servers        2 On-Demand   2 On-Demand                     1 On-Demand and 1 Reserved Medium Utilization  1 On-Demand and 1 Reserved Light Utilization
2 Database servers   2 On-Demand   2 Reserved Medium Utilization   2 Reserved Medium Utilization                  2 Reserved Heavy Utilization

Example: Simple 3-Tier Web Application

Savings                                     Option 1     Option 2     Option 3     Option 4
                                            Calculator   Calculator   Calculator   Calculator
Monthly Cost                                $702.72      $374.78      $256.20      $238.63
One-Time Cost             1 Year Term       -            $1280.00     $1600.00     $1698.00
                          3 Year Term       -            $2000.00     $2500.00     $2612.60
Total Cost                1 Year Term (x12) $8432.64     $5777.36     $4674.40     $4561.56
                          3 Year Term (x36) $25297.92    $15492.08    $11723.20    $11203.28
Savings (Over Option 1)   1 Year Term       n/a          32%          44%          45%
                          3 Year Term       n/a          39%          54%          54%
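
The Total Cost rows follow from the other rows as monthly cost times the number of months plus the one-time cost for that term. The sketch below recomputes them; the dollar figures are the slide's, while the dictionary layout and function names are assumptions. The savings percentages it prints may differ from the slide's rounded figures by a point or two.

```python
# Monthly and one-time costs as listed in the table.
options = {
    "Option 1": {"monthly": 702.72, "upfront": {12: 0.00, 36: 0.00}},
    "Option 2": {"monthly": 374.78, "upfront": {12: 1280.00, 36: 2000.00}},
    "Option 3": {"monthly": 256.20, "upfront": {12: 1600.00, 36: 2500.00}},
    "Option 4": {"monthly": 238.63, "upfront": {12: 1698.00, 36: 2612.60}},
}


def total_cost(name, months):
    """Total cost over a 12- or 36-month term."""
    opt = options[name]
    return round(opt["monthly"] * months + opt["upfront"][months], 2)


def savings_pct(name, months):
    """Savings relative to the all-On-Demand Option 1, in percent."""
    baseline = total_cost("Option 1", months)
    return 100 * (1 - total_cost(name, months) / baseline)


for months in (12, 36):
    for name in options:
        print(months, name, total_cost(name, months), f"{savings_pct(name, months):.1f}%")
```
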




#3 Architect for Spot Instances (bidding strategies)

Optimize by using Spot Instances

On-demand Instances
• Pay as you go
• Starts from $0.02/Hour

Reserved Instances
• One time low upfront fee + Pay as you go
• $23 for 1 year term and $0.01/Hour
• 1-year and 3-year terms
• Heavy Utilization RI, Medium Utilization RI, Light Utilization RI

Spot Instances
• Requested Bid Price and Pay as you go
• $0.005/Hour as of today at 9 AM

What are Spot Instances?

[Diagram: A Region with two Availability Zones. Blocks of unused capacity are labeled "Sold at 50% Discount!", "54%", "56%", "59%", "66%", and "63%"]

What is the tradeoff?

[Diagram: The same Region with two Availability Zones. Most blocks are still labeled "Unused"; two are labeled "Reclaimed"]

Spot Use cases
Use Case                               Types of Applications
Batch Processing                       Generic background processing (scale out computing)
Hadoop                                 Hadoop/MapReduce processing type jobs (e.g. Search, Big Data, etc.)
Scientific Computing                   Scientific trials/simulations/analysis in chemistry, physics, and biology
Video and Image Processing/Rendering   Transform videos into specific formats
Testing                                Provide testing of software, web sites, etc.
Web/Data Crawling                      Analyzing data and processing it
Financial                              Hedge fund analytics, energy trading, etc.
HPC                                    Utilize HPC servers to do embarrassingly parallel jobs
Cheap Compute                          Backend servers for Facebook games

Save more money by using Spot Instances

Reserved Hourly Price > Spot Price < On-Demand Price

Spot: Example Customers

[Slide: example customers with savings figures of 57%, 50%, 63%, 50%, 56%, 50%, 66%, and 50%]

Typical Spot Bidding Strategies

[Chart: Bid Distribution (for last 3 months). Y-axis: Percentage of the Distribution (0% to 20%); X-axis: Bid Price as Percentage of the On-Demand Price. Annotated strategies: 1. Bid near the Reserved Hourly Price; 2. Bid above the Spot Price History; 3. Bid near On-Demand Price; 4. Bid above the On-Demand Price]

1. Bid Near the Reserved Hourly Price


66% Savings over
On-Demand

2. Bid above the Spot Price History

50% Savings over
On-Demand

3. Bid near the On-Demand Price

50% Savings over
On-Demand

4. Bid above the On-Demand Price

57% Savings over
On-Demand

Amazon EMR (Hadoop): Run Task Nodes on Spot

[Diagram: Amazon Elastic MapReduce Hadoop Cluster. Input: large datasets or log files uploaded directly to Amazon S3 (Source Data). Code/Scripts: Mapper, Reducer, HiveQL, Pig Latin, Cascading, run as multiple JobFlow Steps. Cluster: Name Node, Core Nodes (HDFS), Task Nodes. Output Data goes to Amazon S3; Service Metadata is kept in Amazon SimpleDB; BI Apps query with HiveQL or Pig Latin over JDBC/ODBC]

Amazon EMR: Reducing Cost with Spot

Scenario #1: Cost without Spot
Job Flow Duration: 14 Hours
4 instances * 14 hrs * $0.45 = $25.20

Scenario #2: Cost with Spot
Job Flow Duration: 7 Hours
4 instances * 7 hrs * $0.45 = $12.60 +
5 instances * 7 hrs * $0.225 = $7.875
Total = $20.475

Time Savings: 50%
Cost Savings: ~19%
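
The two scenarios above can be reproduced with a short calculation. The rates, node counts, and durations are the slide's; the function and variable names are assumptions for this sketch.

```python
ON_DEMAND_RATE = 0.45   # $ per instance-hour, from the slide
SPOT_RATE = 0.225       # $ per instance-hour, from the slide


def job_cost(on_demand_nodes, spot_nodes, hours):
    """Cost of a job flow that runs all nodes for the same number of hours."""
    return (on_demand_nodes * ON_DEMAND_RATE + spot_nodes * SPOT_RATE) * hours


without_spot = job_cost(on_demand_nodes=4, spot_nodes=0, hours=14)
with_spot = job_cost(on_demand_nodes=4, spot_nodes=5, hours=7)

time_savings = 1 - 7 / 14
cost_savings = 1 - with_spot / without_spot

print(f"without Spot: ${without_spot:.2f}")
print(f"with Spot:    ${with_spot:.3f}")
print(f"time savings: {time_savings:.0%}, cost savings: {cost_savings:.2%}")
```

Adding five Spot task nodes more than doubles the cluster, yet the bill drops because the job finishes in half the time and the extra nodes are billed at the lower Spot rate.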

Made for each other: MapReduce + Spot

Use Case: Web crawling/Search
using Hadoop type clusters. Use
Reserved Instances for their DB
workloads and Spot instances for
their indexing clusters. Launch
100’s of instances.
Bidding Strategy: Bid a little
above the On-Demand price to
prevent interruption.
Interruption Strategy: Restart
the cluster if interrupted

66% Savings over
On-Demand

Video Transcoding Application Example
[Diagram: Video transcoding architecture. Components shown: Website; Amazon SQS Input Queue and Output Queue; Amazon S3 Input Bucket and Output Bucket; Amazon EC2 (Job Manager); Amazon EC2 workers (On-demand + Spot); Amazon SimpleDB; Amazon CloudWatch; Job Completed Reports; Intranet Website on Amazon EC2]

Use of Amazon SQS in Spot Architectures

[Diagram: An Amazon EC2 Spot Instance reading from an Amazon SQS queue; the VisibilityTimeOut setting is highlighted]
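
Why the visibility timeout matters for Spot workers can be shown with a toy queue. This is an in-memory stand-in written for illustration, not the SQS API: the `ToyQueue` class, its methods, and the explicit `now` clock are all assumptions. The idea it models is that a received message is hidden for a while and reappears if the worker never deletes it, for example because its Spot Instance was reclaimed.

```python
class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self.messages = {}  # id -> (body, time the message becomes visible again)

    def send(self, msg_id, body):
        self.messages[msg_id] = (body, 0)

    def receive(self, now):
        """Return one visible message and hide it for the timeout period."""
        for msg_id, (body, visible_at) in self.messages.items():
            if visible_at <= now:
                self.messages[msg_id] = (body, now + self.visibility_timeout)
                return msg_id, body
        return None

    def delete(self, msg_id):
        """Called by a worker only after the job is fully done."""
        self.messages.pop(msg_id, None)


queue = ToyQueue(visibility_timeout=30)
queue.send("job-1", "transcode video A")

first = queue.receive(now=0)      # Spot worker A takes the job
hidden = queue.receive(now=10)    # nobody else can see it yet
retry = queue.receive(now=30)     # worker A was interrupted; the job reappears
queue.delete("job-1")             # worker B finishes and deletes the message
after = queue.receive(now=100)
```
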

Optimizing Video Transcoding Workloads

Free Offering
• Optimize for reducing cost
• Acceptable Delay Limits
Implementation
• Set Persistent Requests
• Use on-demand Instances if there is a delay
Maximum Bid Price < On-demand Rate
Get your set reduced price for your workload

Premium Offering
• Optimized for Faster response times
• No Delays
Implementation
• Invest in RIs
• Use on-demand for Elasticity
Maximum Bid Price >= On-demand Rate
Get Instant Capacity for higher price

Architecting for Spot Instances : Best Practices

Manage interruption
• Split up your work into small increments
• Checkpointing: Save your work frequently and periodically
Test Your Application
Track when Spot Instances Start and Stop
Spot Requests
• Use Persistent Requests for continuous tasks
• Choose maximum price for your requests
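
The "small increments" and "checkpointing" practices above can be sketched as a loop that saves its progress after every unit of work. The file format, the `process` function, and the `stop_after` parameter (used here only to simulate an interruption) are assumptions for illustration.

```python
import json
import os
import tempfile


def process(items, checkpoint_path, stop_after=None):
    """Sum items one at a time, saving progress after each increment.

    Returns the total, or None if the run was cut short.
    """
    state = {"next": 0, "total": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)  # resume where the previous run stopped

    done = 0
    while state["next"] < len(items):
        if stop_after is not None and done >= stop_after:
            return None  # simulated Spot interruption
        state["total"] += items[state["next"]]
        state["next"] += 1
        with open(checkpoint_path, "w") as f:
            json.dump(state, f)  # checkpoint after every increment
        done += 1
    return state["total"]


path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
work = [1, 2, 3, 4, 5]

interrupted = process(work, path, stop_after=2)  # stops after two items
resumed = process(work, path)                    # picks up at the third item
print(interrupted, resumed)
```

On a real Spot Instance the checkpoint would need to live somewhere that survives the instance, such as Amazon S3, rather than on a local temporary directory.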





#4 Leverage Application Services (ELB, SNS, SQS, SWF, SES)

Optimize by converting ancillary instances into
services

Monitoring: CloudWatch
Notifications: SNS
Queuing: SQS
SendMail: SES
Load Balancing: ELB
Workflow: SWF
Search: CloudSearch

Elastic Load Balancing

Software LB on EC2
Pros
• Application-tier load balancer
Cons
• SPOF
• Elasticity has to be implemented manually
• Not as cost-effective

Elastic Load Balancing
Pros
• Elastic and Fault-tolerant
• Auto scaling
• Monitoring included
Cons
• For Internet-facing traffic only

[Diagram: DNS, Elastic Load Balancer, Web Servers in an Availability Zone: $0.025 per hour]
[Diagram: DNS + software LB on an EC2 instance, Web Servers in an Availability Zone: $0.08 per hour (small instance)]

Application Services

Software on EC2
Pros
• Custom features
Cons
• Requires an instance
• SPOF
• Limited to one AZ
• DIY administration

SNS, SQS, SES, SWF
Pros
• Pay as you go
• Scalability
• Availability
• High performance

[Diagram: Producer, SQS queue, Consumers: $0.01 per 10,000 Requests ($0.000001 per Request)]
[Diagram: Producer, EC2 instance + software queue, Consumers: $0.08 per hour (small instance)]
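
The two prices above imply a break-even request volume, which the sketch below works out. The prices are the slide's; the 30-day month and the variable names are assumptions, and the comparison ignores anything else that would appear on a real bill, such as data transfer.

```python
SQS_PRICE_PER_REQUEST = 0.01 / 10_000   # $0.01 per 10,000 requests (slide)
INSTANCE_PRICE_PER_HOUR = 0.08          # small instance (slide)
HOURS_PER_MONTH = 24 * 30               # assumption: 30-day month

# Cost of keeping one queue instance running all month.
instance_monthly = INSTANCE_PRICE_PER_HOUR * HOURS_PER_MONTH

# Number of SQS requests per month that would cost the same.
break_even_requests = instance_monthly / SQS_PRICE_PER_REQUEST


def sqs_monthly(requests):
    """Monthly SQS cost for a given number of requests."""
    return requests * SQS_PRICE_PER_REQUEST


print(f"instance: ${instance_monthly:.2f} per month")
print(f"break-even: {break_even_requests:,.0f} requests per month")
print(f"1M requests on SQS: ${sqs_monthly(1_000_000):.2f}")
```
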





#5 Implement Caching (ElastiCache, CloudFront)

caching

Optimize for performance and cost
by page caching and edge-caching static content

When am I charged?
[Diagram: Clients in Paris, Singapore, and London, each next to an Edge Location, with Amazon Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) as origins]

When content is popular…
[Diagram: The same clients in Paris, Singapore, and London with their Edge Locations, and Amazon Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) as origins]

Architectural Recommendations

Use Amazon S3 + CloudFront to reduce both cost and latency for static data
• Depends on cache-hit ratio
For video streaming, use CloudFront; there is no need for a separate
streaming server running Adobe FMS
Use a managed caching service (Amazon ElastiCache)
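
The dependence on cache-hit ratio can be made concrete with a deliberately simplified model: every request pays an edge cost, and only cache misses also pay the origin cost. The model, the function names, and the unit-less example costs are all assumptions; they are not CloudFront or S3 prices.

```python
def cost_per_request(hit_ratio, edge_cost, origin_cost):
    """Average cost of one request when a cache sits in front of the origin."""
    return edge_cost + (1 - hit_ratio) * origin_cost


def break_even_hit_ratio(edge_cost, origin_cost):
    """Hit ratio above which the cache is cheaper than serving from origin only."""
    return edge_cost / origin_cost


# Unit-less illustrative costs: the origin is four times as expensive as the edge.
EDGE, ORIGIN = 1.0, 4.0

print(break_even_hit_ratio(EDGE, ORIGIN))
print(cost_per_request(0.90, EDGE, ORIGIN), "vs origin only:", ORIGIN)
```

Under this model the cache only pays for itself once the hit ratio exceeds the edge-to-origin cost ratio, which is why popular content benefits most.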

Number of ways to further save with AWS…




#4 Leverage Application Services (ELB, SNS, SQS, SWF, SES)

#5 Implement Caching (ElastiCache, CloudFront)

Thank you!

jvaria@amazon.com
Twitter: @jinman

Web Application Usage Patterns

• Steady State Usage Pattern (Example: Corporate Website)
• Spiky Predictable Usage Pattern (Example: Marketing Promotions Website)
• Uncertain unpredictable Usage Pattern (Example: Social game or Mobile Website)

Example: TCO of a 3-tier Web Application

[Diagram: 3-tier web application. Components shown: Amazon Route 53 (DNS); www.MyWebSite.com (dynamic data) behind an Elastic Load Balancer; media.MyWebSite.com (static data) on Amazon CloudFront and Amazon S3; Auto Scaling group : Web Tier and Auto Scaling group : App Tier on Amazon EC2; Amazon RDS in Availability Zone #1 and Availability Zone #2]
