This document discusses how to reduce spending on AWS through various techniques:
1. Paying for cloud resources only when they are used through the pay-as-you-go model avoids upfront costs and allows turning off unused capacity.
2. Using reserved instances when capacity needs are predictable provides significant discounts compared to on-demand pricing.
3. Architecting applications in a "cost aware" manner, such as leveraging caching, auto-scaling, managed services, and right-sizing instances can optimize costs.
4. Taking advantage of AWS's economies of scale through consolidated billing and free services helps lower overall spend. Planning workload usage of spot instances can achieve up to 85% savings.
15. Every Imaginable Use Case
• Free steak campaign Facebook page
• Mars exploration ops
• Consumer social app
• Ticket pricing optimization
• SAP & SharePoint
• Securities Trading Data Archiving
• Gene sequencing
• Marketing web site
• Interactive TV apps
• Financial markets analytics
• R&D data analysis
• Big data analytics
• Web site & media sharing
• Disaster recovery
• Media streaming
• Web and mobile apps
• Streaming webcasts
• Facebook app
16. Every Day…
AWS adds the equivalent server capacity to power Amazon when it was a global, $5.2B enterprise in 2003
2003: $5.2B retail business, 7,800 employees, a whole lot of servers…
17. The AWS Price Reduction Philosophy
Cycle: More AWS Usage → More Infrastructure → Economies of Scale → Lower Infrastructure Costs → Reduced Prices → More Customers
Surrounding the cycle: Ecosystem, Global Footprint, New Features, New Services, Infrastructure Innovation
19. AWS Pricing Philosophy
• Pay as you go
– No minimum commitments or long-term contracts required
– Capex -> Opex
– Turn off when you don't need it
• Pay less per unit when you use more
– Tiered Pricing and Volume Discounts
• Pay even less when you reserve
– Reserved pricing
• Pay even less as AWS grows
– Efficiencies, optimizations and economies of scale result in passing the savings back to you in the form of lower pricing
21. Cost Optimization using different purchase models
• On-Demand: Pay for compute capacity by the hour with no long-term commitments. For spiky workloads, or to define needs.
• Reserved: Make a low, one-time payment and receive a significant discount on the hourly charge. For committed utilization.
• Spot: Bid for unused capacity, charged at a Spot Price which fluctuates based on supply and demand. For time-insensitive or transient workloads.
• Free Tier: Get started on AWS with free usage & no commitment. For POCs and getting started.
25. Utilise the Free Tier
Scenario
• Small team with initial idea for Mobile app
• 3 months to get to launch
• Unknown customer/problem/solution
• No cash….
(Chart axes: Time, Scale)
26. Dev / Test Environment
• Internal testing with your team
• 2 Tier Web - Database Servers
• Use t1.micro Instances
Average Spend: $0 p/m
27. Alpha Release
• Release to small group of 'core testers'
• 2 Tier Web & Database Servers
• 2 x t1.micro Instances
Average Spend: $15 p/m
28. Beta Release / MVP
• First public release, limited audience
• Master / Slave DB setup
• m1.small Instances
• Auto-Scaling Instances (2 Instance minimum)
Average Spend: $235 p/m
29. Getting to MVP for $250
• 3 months dev/test/release
• Serving Beta customers
• Ready for full production and scale
Total Spend to MVP: $250 ($0 + $15 + $235)
30. Reserved Instance Pricing
Make a low, one-time payment and receive a significant discount on the hourly charge
For committed utilization
3 Versions
• Light Utilization RI
• Medium Utilization RI
• Heavy Utilization RI
2 Terms
• 1-year
• 3-year
31. Reserved Instance Pricing
Utilization | RI option | Savings over On-Demand
<10% | On-Demand | -
10% - 40% | Light Utilization RI | Up to 56%
40% - 75% | Medium Utilization RI | Up to 66%
>75% | Heavy Utilization RI | Up to 71%
37. • Most traffic happens in the afternoons and evenings, so they reduce the number of instances at night by 40%.
• At peak traffic, $52 an hour is spent on EC2; at night, during off-peak, the spend is as little as $15 an hour. Saving per hour = 71%
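The 71% figure follows directly from the two hourly spends on the slide. As a minimal sketch, the calculation below reproduces it and then extends it to a full day; the 8 off-peak hours are an assumption made for illustration and do not come from the slide.

```python
def saving_fraction(baseline: float, actual: float) -> float:
    """Fraction saved when spending `actual` instead of `baseline`."""
    return (baseline - actual) / baseline


PEAK_PER_HOUR = 52.0       # from the slide
OFF_PEAK_PER_HOUR = 15.0   # from the slide
OFF_PEAK_HOURS = 8         # assumption for illustration, not from the slide

# Saving for a single off-peak hour compared with a peak hour.
hourly = saving_fraction(PEAK_PER_HOUR, OFF_PEAK_PER_HOUR)

# Daily spend if the fleet stayed at peak size all day versus scaling down at night.
always_peak = 24 * PEAK_PER_HOUR
scaled = (24 - OFF_PEAK_HOURS) * PEAK_PER_HOUR + OFF_PEAK_HOURS * OFF_PEAK_PER_HOUR
daily = saving_fraction(always_peak, scaled)

print(f"off-peak hourly saving: {hourly:.0%}")
print(f"daily saving with {OFF_PEAK_HOURS} off-peak hours: {daily:.1%}")
```

The hourly saving is large, but the saving on the whole bill depends on how many hours per day the fleet actually runs at the reduced size.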
38. Save more money by using Spot Instances
Up to 85% savings over On Demand pricing
• Spot market for under-utilized capacity
• Requested Bid Price and Pay as you go
• Spot Price < On-Demand Price
39. Use Cases for Spot Pricing
Use Case | Types of Applications
Batch Processing | Generic background processing (scale out computing)
Hadoop | Hadoop/MapReduce processing type jobs (e.g. Search, Big Data, etc.)
Scientific Computing | Scientific trials/simulations/analysis in chemistry, physics, and biology
Video and Image Processing/Rendering | Transform videos into specific formats
Testing | Provide testing of software, web sites, etc.
Web/Data Crawling | Analyzing data and processing it
Financial | Hedge fund analytics, energy trading, etc.
HPC | Utilize HPC servers to do embarrassingly parallel jobs
Cheap Compute | Backend servers for Facebook games
41. Optimizing Video Transcoding Workloads for a FREEMIUM model
Free Offering
• Optimize for reducing cost
• Acceptable Delay Limits
Implementation
– Leverage spot pricing
– Maximum Bid Price < On-demand Rate
– Use on-demand Instances, if delay
Get strongly reduced price for your workload
Premium Offering
• Optimized for Faster response
• No Delays
Implementation
– Invest in Reserved Instances
– Use on-demand for Elasticity
Get Instant Capacity for higher price
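The freemium split above amounts to a small routing decision per transcoding job. The following is a minimal sketch of that decision only; the `Job` type, the function name, the parameters and the "wait" outcome are all hypothetical, and no AWS API is called.

```python
from dataclasses import dataclass


@dataclass
class Job:
    tier: str              # "free" or "premium"
    waited_minutes: float  # how long the job has already been queued


def choose_capacity(job: Job, spot_price: float, max_bid: float,
                    on_demand_rate: float, delay_limit_minutes: float,
                    reserved_free: bool) -> str:
    """Return which kind of capacity should run the job (illustrative policy)."""
    # The slide keeps the maximum bid below the on-demand rate.
    if max_bid >= on_demand_rate:
        raise ValueError("maximum bid should stay below the on-demand rate")
    if job.tier == "premium":
        # Premium: no delays. Use reserved capacity first, on-demand for elasticity.
        return "reserved" if reserved_free else "on-demand"
    # Free: optimize for cost, within the acceptable delay limit.
    if spot_price <= max_bid:
        return "spot"
    if job.waited_minutes >= delay_limit_minutes:
        return "on-demand"  # delay limit reached, fall back to on-demand
    return "wait"           # keep the job queued until the spot price drops


free_job = Job(tier="free", waited_minutes=5)
print(choose_capacity(free_job, spot_price=0.03, max_bid=0.05,
                      on_demand_rate=0.12, delay_limit_minutes=30,
                      reserved_free=False))
```

The prices in the usage line are placeholders; in practice the bid and the delay limit would be tuned to what the free tier can tolerate.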
43. “Give me 4 fault tolerant algorithms and I can pick the best one almost with my eyes closed. If you then ask me which one is best for the business, in terms of dollar costs, I would be clueless...”
Werner Vogels, CTO, Amazon
44. Cost optimization through 'Cost Aware Architecting'
Reduce Cost of… | …by leveraging:
Compute | 1. S3 & CloudFront for Caching & Offloading
Compute | 2. Auto-Scaling done Right
Compute | 3. Leverage Managed Services
Compute | 4. Sizing your Application for AWS
Storage | 5. Storing derivative objects in S3 'Reduced Redundancy'
Database | 6. Read Replicas and/or ElastiCache
Test & Dev | 7. Rapid proto-typing & Lean Dev/Test
45. 1. S3 & CloudFront for Caching & Offloading
• Reduce your compute demand and costs
• Improve end-user experience
• Increase reliability and durability
Cost Aware Architecting to Reduce costs of EC2
51. 2. Auto-Scaling done Right with Real Time reaction response
• Elastic Load Balancing and (event-driven) Auto Scaling
• Notification of pending news flash (with audible alarm)
• On-demand ramp up of capacity (6 mins.)
• Subscriber alert push delivered
• Mass response traffic handled (followed by ramp down)
Cost Aware Architecting to Reduce costs of EC2
53. 2. Auto-Scaling done Right with Real Time reaction response
Cost Aware Architecting to Reduce costs of EC2
Straits Times / Buuuk
58. 3. Leverage Managed Services
Cost Aware Architecting to Reduce costs of EC2
• Rabbit MQ, MSMQ vs. Simple Queue Service
• Cron vs. Simple Workflow Service
• Running a mail server vs. Simple Email Service
• Running a NoSQL cluster vs. DynamoDB
• Running MySQL on EC2 vs. Relational Database Service
• Memcached vs. ElastiCache
• Encoding Server vs. Elastic Transcoder
59. 4. Sizing your Application for AWS
Cost Aware Architecting to Reduce costs of EC2
60. Chart: EC2 instance types plotted by Memory (GB) against EC2 Compute Units, both axes scaled 1, 2, 4, … 256
• Micro: 613 MB, up to 2 ECUs (for short bursts)
• Small: 1.7 GB, 1 EC2 Compute Unit, 1 virtual core
• Medium: 3.7 GB, 2 EC2 Compute Units, 1 virtual core
• Large: 7.5 GB, 4 EC2 Compute Units, 2 virtual cores ($0.32/0.46)
• Extra Large: 15 GB, 8 EC2 Compute Units, 4 virtual cores
• M3 XL: 15 GB, 13 EC2 Compute Units, 4 virtual cores, EBS storage only
• M3 2XL: 30 GB, 26 EC2 Compute Units, 8 virtual cores, EBS storage only
• Hi-Mem XL: 17.1 GB, 6.5 EC2 Compute Units, 2 virtual cores
• Hi-Mem 2XL: 34.2 GB, 13 EC2 Compute Units, 4 virtual cores
• Hi-Mem 4XL: 68.4 GB, 26 EC2 Compute Units, 8 virtual cores
• High-CPU Med: 1.7 GB, 5 EC2 Compute Units, 2 virtual cores
• High-CPU XL: 7 GB, 20 EC2 Compute Units, 8 virtual cores
• High I/O 4XL: 60.5 GB, 35 EC2 Compute Units, 16 virtual cores, 2*1024 GB SSD-based local instance storage
• Cluster GPU 4XL: 22 GB, 33.5 EC2 Compute Units, 2 x NVIDIA Tesla “Fermi” M2050 GPUs
• Cluster Compute 4XL: 23 GB, 33.5 EC2 Compute Units
• Cluster Compute 8XL: 60.5 GB, 88 EC2 Compute Units
• High Storage 8XL: 117 GB, 35 EC2 Compute Units, 24 * 2 TB ephemeral drives, 10 GB Ethernet
• Hi-Mem Cluster Compute 8XL: 244 GB, 88 EC2 Compute Units, 16 virtual cores, 240 GB SSD
Chart label: 10 GB Inter-Instance Network
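Sizing an application against this chart means finding the smallest instance type that still meets its memory and compute needs. The sketch below uses a subset of the figures from the slide; ranking candidates by compute units and then memory is an assumption standing in for price, since the slide does not list prices per type, and the function name is hypothetical.

```python
from typing import Optional

# (memory in GB, EC2 Compute Units) for a subset of the types on the slide.
INSTANCE_TYPES = {
    "Small": (1.7, 1),
    "Medium": (3.7, 2),
    "Large": (7.5, 4),
    "Extra Large": (15, 8),
    "Hi-Mem XL": (17.1, 6.5),
    "Hi-Mem 2XL": (34.2, 13),
    "Hi-Mem 4XL": (68.4, 26),
    "High-CPU Med": (1.7, 5),
    "High-CPU XL": (7, 20),
}


def smallest_fit(memory_gb: float, compute_units: float) -> Optional[str]:
    """Return the smallest listed type meeting both requirements, or None."""
    candidates = [
        (ecu, mem, name)
        for name, (mem, ecu) in INSTANCE_TYPES.items()
        if mem >= memory_gb and ecu >= compute_units
    ]
    if not candidates:
        return None
    # Assumption: fewer compute units, then less memory, is a proxy for lower cost.
    return min(candidates)[2]


print(smallest_fit(memory_gb=6, compute_units=4))
print(smallest_fit(memory_gb=1, compute_units=5))
```

With real prices available, the sort key would be the hourly price rather than this proxy.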
61. 5. Storing derivative objects in S3 'Reduced Redundancy'
• Original vs. derived assets: 33% savings
• Single reference and consistency
• Control, accurate logs and tracking
Cost Aware Architecting to Reduce costs of S3
Reduced Redundancy Storage 'RRS'
62. 6. Read Replicas and/or ElastiCache ('Database Smarts')
• Scale out and share work
• Optimal performance, minimize load
• Enhance reliability, ensure data safety
• Cost reduction
Cost Aware Architecting to Reduce costs of DB
66. 7. Rapid proto-typing & Lean Dev/Test
• Inexpensive idea validation
• Seamless switch over and versioning
• Rapid dev / test agility
Cost Aware Architecting to Reduce costs of Test/Dev
67. Bringing this all Together
Enterprise software provider in APAC
Focused on SaaS for storage, security, collaboration, etc.
Backed by leading VCs in the region
Strong growth – winning customers globally
Focused on profitability & reducing unit costs
Worked closely with the AWS team to optimize its architecture
69. Did you know?
Free Usage Tier
• New Customers: Amazon EC2, Amazon RDS, Amazon ELB, Amazon S3, Amazon EBS
• For All Customers: Amazon SQS/SNS, Amazon DynamoDB, Amazon SES, Amazon SWF, and more…
Free Services
• AWS Elastic Beanstalk, AWS CloudFormation, AWS IAM, Auto Scaling, Consolidated Billing
Data Transfer
• No Charge for Inbound Data Transfer
• No Charge for Data Transfer between Instances within an Availability Zone
Traditional IT capacity planning, by the very nature of the logistics of acquiring hardware, installation, configuration and networking, has to take a forward looking view. Complex estimates of the utilisation of resources are made in order to handle the peaks you anticipate. Shown here in red is the level of resources a business needs to install in order to handle the peak needs of a service. Demand on that service might vary by the time of day, week, month or year, or be driven by exceptional demand driven by promotions or seasonal events.
Many usage patterns make capacity planning a complex science: on-and-off usage, where capacity is needed only at fixed times; fast growth, where an online service becomes so successful that traditional capacity has to be added in step changes; variable peaks, where you just don't know what demand will be or when, and best guess applies; and predictable peaks, such as commute times when customers use mobile devices to access your service.
Each of these examples is typified by wasted IT resources. Where you planned correctly, the IT resources will be over-provisioned so that services are not impacted and customers are not lost during high demand. In the worst cases, that capacity will not be enough, and customer dissatisfaction will result. Most businesses have a mix of differing patterns at play, and much time and resource is dedicated to planning and management to ensure services are always available. And when a new online service is really successful, you often can't ship in new capacity fast enough. Some say that's a nice problem to have, but those who have lived through it will tell you otherwise!
Only happens in the cloud
Elasticity with AWS enables your provisioned capacity to follow demand: it scales up when needed and down when not. And because you pay only for what is used, the savings can be significant.
You control how and when your service scales, so you can closely match increasing load in small increments, scale up fast when needed, and cool off and reduce the resources being used at any time of day. Even the most variable and complex demand patterns can be matched with the right amount of capacity - all automatically handled by AWS.
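The difference between provisioning for the peak and following demand can be put into numbers. As a minimal sketch, the hourly demand profile and the price of 1 per instance-hour below are invented for illustration; only the comparison itself reflects the argument above.

```python
from typing import List


def fixed_capacity_cost(demand: List[int], price_per_instance_hour: float) -> float:
    """Cost when enough instances for the peak hour run during every hour."""
    return max(demand) * len(demand) * price_per_instance_hour


def elastic_cost(demand: List[int], price_per_instance_hour: float) -> float:
    """Cost when the number of running instances follows demand each hour."""
    return sum(demand) * price_per_instance_hour


# Hypothetical instances needed per hour over one day (24 values).
hourly_demand = [2] * 6 + [4, 6] + [8] * 4 + [10] * 4 + [8, 8, 6, 6, 4, 4, 2, 2]

fixed = fixed_capacity_cost(hourly_demand, 1)
elastic = elastic_cost(hourly_demand, 1)
print(f"fixed: {fixed} instance-hours, elastic: {elastic} instance-hours")
print(f"unused capacity avoided: {(fixed - elastic) / fixed:.0%}")
```

The flatter the demand curve, the smaller the gap; the spikier it is, the more of the fixed capacity sits idle.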
Our strategy of pricing each service independently gives you tremendous flexibility to choose the services you need for each project and to pay only for what you use.
• Pay as you go
• Pay less per unit when you use more
• Pay even less when you reserve
• Pay even less as AWS grows
Personnel costs include the cost of the sizable IT infrastructure teams that are needed to handle the “heavy lifting”: managing heterogeneous hardware and the related supply chain, staying up-to-date on data center design, negotiating contracts, dealing with legacy software, operating data centers, moving facilities, scaling and managing physical growth, etc. These are all the things that an enterprise needs to do well if it wants to achieve low infrastructure costs in the areas discussed above. For example:
• Hardware procurement teams are needed, who have to spend a lot of time evaluating hardware, negotiating, holding hardware vendor meetings, managing delivery and installation, etc. It's expensive to have a staff with sufficient knowledge to do this well.
• Data center design and build teams are needed to create and maintain reliable and cost-effective facilities.
• Operations staff is needed 24/7/365 in each facility to manage MySQL databases. This staff is responsible for installing, patching, upgrades, migration, backups, snapshots and recovery of databases, ensuring availability, troubleshooting and performance enhancements.
• Networking teams are needed for running a highly available network. Expertise is needed to design, debug, scale, and operate the network and deal with the external relationships necessary to have cost-effective internet transit.
• Security personnel are needed at all phases of the design, build, and operations process.
While the number and types of services offered by AWS have increased dramatically, our philosophy on pricing has not changed: at the end of each month, you pay only for what you use, and you can start or stop using a product at any time. No long-term contracts are required.
• Pay as you go. No required minimum commitments, no long-term contracts. This flexibility minimizes the need for detailed resource planning.
• Pay per use. Pay only for what you use. With AWS, there's no need to pay up-front for excess capacity or get penalized for under-planning. For compute resources, you pay on an hourly basis from the time you launch a resource until the time you terminate it. For data storage and transfer, you pay on a per gigabyte basis. We charge based on the underlying infrastructure and services you consume.
• Pay less by using more. For storage and data transfer, pricing is tiered. The more you use, the less you pay per gigabyte.
• Pay even less when you reserve. For certain products, you can invest in reserved capacity. In that case, you pay a one-time low upfront fee, and your on-demand rate is reduced by 28% to 58%.
• Custom pricing. What if none of our pricing models work for your project? Custom pricing is available for high volume projects with unique requirements. For assistance, contact us to speak with a sales representative.
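Tiered pricing, where each additional block of gigabytes is billed at a lower rate, can be sketched as follows. The tier sizes and per-gigabyte prices are invented for illustration and are not AWS's actual rates.

```python
from typing import List, Optional, Tuple

# (tier size in GB, price per GB); None marks the open-ended last tier.
# All numbers are hypothetical.
TIERS: List[Tuple[Optional[int], float]] = [
    (1000, 0.10),
    (4000, 0.08),
    (None, 0.06),
]


def tiered_cost(gigabytes: float, tiers: List[Tuple[Optional[int], float]] = TIERS) -> float:
    """Bill each block of usage at its own tier's rate and sum the blocks."""
    remaining = gigabytes
    total = 0.0
    for size, price in tiers:
        if remaining <= 0:
            break
        used = remaining if size is None else min(remaining, size)
        total += used * price
        remaining -= used
    return total


for usage in (500, 1500, 6000):
    cost = tiered_cost(usage)
    print(f"{usage} GB -> {cost:.2f} total, {cost / usage:.4f} per GB")
```

The per-gigabyte average falls as usage grows, because only the gigabytes beyond each threshold are billed at the cheaper rate.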
We have a variety of purchase options that allow you to match your workload to the right model, and we’re happy to help you optimize your bill by working with you to choose the right mix of several of these.
• One of the fastest growing sites in history. Cites AWS for making it possible to handle 18 million visitors in March, a 50% increase from the previous month, with very little IT infrastructure.
• 12 employees as of last December. Using the cloud, a site can grow dramatically while maintaining a very small team. Looks like 31 employees as of now.
• Many companies still serve their static content, e.g. pictures, from their own servers.
• This means every user request costs computational effort, because the pictures are fetched from the server.
• A better way is to offload all your static content to S3 and have it delivered from our CDN, called CloudFront.
• This reduces the number of calls to your servers, so you can run fewer servers and thereby lower costs.
• CloudFront is fully automated: you don't have to write detailed configuration; your users just get the content from the nearest of our 35 global CloudFront POPs. Lower latency means an improved end-user experience.
• Durability goes up with S3: 11 9's. But also consistency, as all your data is in one location, not spread across all your servers (DRY).
Perx = mobile loyalty program: an iPhone app for loyalty in restaurants, bars etc. It is location based, so as you walk around it tells you where to get a deal.
• Logos for all the restaurants are static content. When they changed to S3 + CloudFront, user experience went up and users loved it. It is also easier to manage, as they only have to manage changes, new logos etc. ONCE.
• Then we started offering CloudFront for dynamic content. For Perx, that works for all the different offers that restaurants put out on a weekly basis: “What is the best deal today at Subway or Starbucks?” These can now be cached at the edge as well. Offloading dynamic calls from your servers again lowers the load on your servers, and your costs!
• Buuuk is the mobile app partner of Singapore Press Holdings (SPH), the publisher of the Straits Times and much more.
• They distribute their mobile app to millions and millions of users.
• They have Breaking News Alerts, which obviously drive an immediate surge in users and hence in required capacity.
• They react to this in real time. When SPH tells the system that there is a Breaking News announcement, the Buuuk system automatically issues a command to increase the number of EC2 instances and delays the news by 5 minutes.
• After 5 minutes, the system is fully scaled up and the announcement goes out. People receive it, and what do they do? They visit the news site by the millions at the same time. The surge in users can then be easily dealt with = customer satisfaction.
• From experience, they know this peak normally lasts less than an hour, so within an hour they scale down again and pay for only 1 hour of excess capacity.
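The Buuuk flow (scale up first, hold the alert while capacity warms up, send it, then scale back down after the peak) can be sketched as an ordered sequence of steps. Everything here is hypothetical: the function and parameter names, the fleet sizes, and the callables standing in for the real scaling and push-notification systems.

```python
from typing import Callable, List


def handle_breaking_news(set_capacity: Callable[[int], None],
                         send_alert: Callable[[], None],
                         wait: Callable[[int], None],
                         normal_size: int,
                         surge_size: int,
                         warmup_seconds: int = 300,
                         peak_seconds: int = 3600) -> None:
    """Event-driven scale-up around a breaking news alert."""
    set_capacity(surge_size)   # ramp up before anyone is notified
    wait(warmup_seconds)       # hold the alert while capacity comes online
    send_alert()               # push goes out, the traffic surge follows
    wait(peak_seconds)         # the peak normally lasts under an hour
    set_capacity(normal_size)  # ramp down so excess capacity stops costing money


# Dry run with stand-ins that only record what would happen.
events: List[str] = []
handle_breaking_news(
    set_capacity=lambda n: events.append(f"capacity={n}"),
    send_alert=lambda: events.append("alert sent"),
    wait=lambda seconds: events.append(f"wait {seconds}s"),
    normal_size=4,
    surge_size=20,
)
print(events)
```

Passing the side effects in as callables keeps the ordering logic testable without touching real infrastructure; a production version would replace them with calls to the actual scaling and notification services.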
• For much content (media files, etc.), derivatives are created: for example thumbnails, versions for iOS, Android, etc.
• These files can be re-generated from the original.
• Or: you do have the source files elsewhere.
• In that case, you can consider S3 Reduced Redundancy. Not 11 9's, but 4 9's: still 99.99% durability, which is 400 times more than a normal hard drive, BUT 33% cheaper than standard S3.
• Improved consistency and logging / tracking, as you know exactly where the content is, who calls it, how often, etc.
• If you add a number of read replicas, you can offload tasks such as reporting.
• You do not need to size your Master for the peak, and you can reduce the size of your entire database fleet.
• You can further offload certain activities to other services such as DynamoDB or ElastiCache.
• You can even make your EC2 instances smart, so they know the read replicas are there and when to query them.
• A first option = read replicas to deal with read calls to the database. All reads go there, not to the Master, so you avoid the Master needing to grow and grow.
• When you are really smart, you can even auto-scale the Read Replicas so that you only have them when usage increases.
• An additional offload is ElastiCache. Sometimes, 90% of calls can be offloaded to ElastiCache, as the calls are the same.
• CloudWatch can tell you the CPU utilization of your RDS instance, so if it's low, you can reduce the size.
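The 'database smarts' described in these notes come down to routing: writes go to the master, reads go to a cache first and then to a replica. The sketch below uses plain dictionaries as stand-ins for the master, the replicas and the cache; the class and its methods are hypothetical, and real replication is asynchronous rather than the immediate copy shown here.

```python
import itertools
from typing import Dict, List, Optional


class QueryRouter:
    """Send writes to the master; serve reads from a cache, then a replica."""

    def __init__(self, master: Dict[str, str], replicas: List[Dict[str, str]]):
        self.master = master
        self.replicas = replicas
        self._next_replica = itertools.cycle(replicas)  # round-robin over replicas
        self.cache: Dict[str, Optional[str]] = {}
        self.replica_reads = 0

    def write(self, key: str, value: str) -> None:
        self.master[key] = value
        for replica in self.replicas:
            replica[key] = value      # stands in for replication from the master
        self.cache.pop(key, None)     # drop the stale cached value

    def read(self, key: str) -> Optional[str]:
        if key in self.cache:
            return self.cache[key]    # repeated identical calls never reach the database
        value = next(self._next_replica).get(key)
        self.replica_reads += 1
        self.cache[key] = value
        return value


router = QueryRouter(master={}, replicas=[{}, {}])
router.write("offer:42", "2-for-1 coffee")
for _ in range(10):
    router.read("offer:42")
print(f"10 reads, {router.replica_reads} reached a replica, 0 reached the master")
```

Invalidating the cache entry on write is one simple consistency choice; a time-to-live on cached entries would be another.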
• Copy-paste your entire infrastructure and try out both versions.
• Leverage CloudFormation to describe a stack and create templates that automate the process of spinning up new stacks.
• You can copy-paste, then change a few things in the copied environment to test whether it works better.
• This allows for rapid dev & test and is often used for optimizing performance & conversion metrics: A/B TESTING.
• Used by Obama in his campaign, but also extensively in gaming.
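The copy-paste-and-tweak idea can be sketched by building a CloudFormation template as data, copying it, and changing one property for the B variant. The single-instance stack, the resource name `WebServer`, the instance types and the placeholder image ID are assumptions for illustration; a real stack would describe the whole environment.

```python
import copy
import json


def base_template(instance_type: str, image_id: str) -> dict:
    """Minimal CloudFormation template describing one EC2 instance."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Hypothetical single-instance stack for A/B testing",
        "Resources": {
            "WebServer": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "InstanceType": instance_type,
                    "ImageId": image_id,  # placeholder, not a real AMI ID
                },
            }
        },
    }


# Variant A is the current environment; variant B is a copy with one change.
stack_a = base_template("m1.small", "ami-PLACEHOLDER")
stack_b = copy.deepcopy(stack_a)
stack_b["Resources"]["WebServer"]["Properties"]["InstanceType"] = "m1.large"

print(json.dumps(stack_b, indent=2))
```

Each JSON document could then be launched as its own stack, so the two variants run side by side and the losing one is simply deleted.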