In this session, we will cover cloud economics: understanding the total cost of ownership (TCO) when building in the cloud, and how you trade upfront capital expenditure (CapEx) for operational expenditure (OpEx). We will also look at how TCO changes over time as you modernise your applications to make full use of the cloud's capabilities. Lastly, we will cover the different purchasing options, to help you reduce costs even further by identifying consistent, baseline workloads.
4. "Invention requires two things: the ability to try a lot of experiments, and not having to live with the collateral damage of failed experiments."
Andy Jassy
CEO, Amazon Web Services
5. You
You
Considerations when running a data centre
Spend time innovating and building new applications, not managing infrastructure
AWS
Self-managed Fully managed
Solving problems
Building applications
Focussing on product
Hardware lifecycles
Backup and recovery
Capacity planning
Industry compliance
Licensing
Networking
Cooling
Power
Physical Security
6. Challenge
They experienced service admin challenges with their original
provider and wanted to scale business to the next level.
Solution
They moved from self-managed MySQL to Amazon Aurora
MySQL. They use Aurora as the primary transactional database,
Amazon DynamoDB for personalized search, and Amazon
ElastiCache as in-memory store for sub-millisecond site rendering.
Result
"Initially, the appeal of AWS was the ease of managing and customizing the stack. It was great to be able to ramp up more servers without having to contact anyone and without having minimum usage commitments. AWS is the easy answer for any Internet business that wants to scale to the next level."
—Nathan Blecharczyk, Cofounder and CTO of Airbnb
MOVE TO MANAGED →
Amazon
Aurora
Amazon
ElastiCache
Amazon
DynamoDB
8. Amazon EC2 instance characteristics
M5d.xlarge
Instance family
Instance
generation
Instance size
Instance type
CPU
Memory
Storage
Network performance
Additional
capabilities
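The naming scheme above can be made concrete with a small parser. This is a simplified sketch of how an instance type name such as m5d.xlarge decomposes, not an official AWS API; the parsing rules gloss over some real-world suffixes:

```python
import re

def parse_instance_type(instance_type):
    """Split an EC2 instance type like 'm5d.xlarge' into its parts.

    Simplified: family letter(s), generation digit, optional additional
    capability suffix (e.g. 'd' for local NVMe storage), and size.
    """
    name, size = instance_type.lower().split('.')
    match = re.match(r'([a-z]+)(\d+)([a-z-]*)', name)
    family, generation, capabilities = match.groups()
    return {
        'family': family,
        'generation': int(generation),
        'capabilities': capabilities,
        'size': size,
    }

print(parse_instance_type('m5d.xlarge'))
# {'family': 'm', 'generation': 5, 'capabilities': 'd', 'size': 'xlarge'}
```

So an M5d.xlarge is an M-family (general purpose) instance, generation 5, with local NVMe storage, in the xlarge size.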
10. Amazon EC2 general-purpose instances
M5
instances
Balance of compute, memory, and network
resources
4:1 memory-to-vCPU ratio
C5
instances
High performance at a low price per
vCPU ratio
2:1 memory-to-vCPU ratio
R5
instances
Accelerate performance for workloads
that process large datasets in memory
8:1 memory-to-vCPU ratio
11. Broadest and deepest platform choice
Workloads + Capabilities + Options = 270+ instance types
Processors (AWS, Intel, AMD) · clock speeds (up to 4.0 GHz) · memory (up to 24 TiB) · storage (HDD and NVMe) · networking (up to 100 Gbps) · accelerators (GPUs and FPGA) · sizes (Nano to 32xlarge)
17. To optimize Amazon EC2, combine purchase options
Reserved: use for known, steady-state workloads
Spot: scale using Spot for fault-tolerant, flexible, stateless workloads
On-Demand: for new or stateful spiky workloads
18. Amazon EC2 cost optimisation for non-production workloads
Schedule | % running time
24 x 7 | 100.0
24 x 5 | 71.4
12 x 5 | 35.7
10 x 5 | 29.8
Up to 70% savings for non-production workloads
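The running-time percentages follow directly from hours per week out of a 168-hour week. A quick sketch of the arithmetic:

```python
def running_time_pct(hours_per_day, days_per_week):
    """Percentage of a 168-hour week an instance is running."""
    return round(hours_per_day * days_per_week / 168 * 100, 1)

for schedule in [(24, 7), (24, 5), (12, 5), (10, 5)]:
    print(schedule, running_time_pct(*schedule))
# (24, 7) 100.0
# (24, 5) 71.4
# (12, 5) 35.7
# (10, 5) 29.8
```

A development environment running 10 hours a day, 5 days a week runs less than 30% of the time, which is where the "up to 70% savings" figure comes from.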
20. A serverless business strategy
Serverless-First is the decision to opt for serverless technologies
in your application as a first choice.
No server management
Flexible, automated scaling
Pay for value
Automated high availability
What do we mean when we say serverless?
21. Computing evolution – A paradigm shift
(Chart axes: level of abstraction; focus on business logic)
PHYSICAL MACHINES
Requires “guess” planning
Lives for years on-premises
Heavy investments (capex)
Low innovation factor
Deploy in months
22. Computing evolution – A paradigm shift
(Chart axes: level of abstraction; focus on business logic)
VIRTUAL MACHINES
Hardware independence
Faster provisioning speed (minutes/hours)
Trade capex for opex
More scale
Elastic resources
Faster speed and agility
Reduced maintenance
23. Computing evolution – A paradigm shift
(Chart axes: level of abstraction; focus on business logic)
CONTAINERIZATION
Platform independence
Consistent runtime environment
Higher resource utilization
Easier and faster deployments
Isolation and sandboxing
Start speed (deploy in seconds)
24. Computing evolution – A paradigm shift
AWS Lambda
AWS Fargate
(Chart axes: level of abstraction; focus on business logic)
Continuous scaling
Fault tolerance built-in
Pay for value
Zero maintenance
SERVERLESS
25. Anatomy of an AWS Lambda function
Handler() function
Function to be executed
upon invocation
Event object
Data sent during Lambda
function invocation
Context object
Methods available to
interact with runtime
information (request ID,
log group, more)
import json

def lambda_handler(event, context):
    # TODO implement
    return {
        'statusCode': 200,
        'body': json.dumps('Hello World!')
    }
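Because the handler is just a function, it can be exercised locally without any AWS infrastructure. This is a minimal sketch; the sample event shape here is illustrative, not a real service payload:

```python
import json

def lambda_handler(event, context):
    # Echo back the "name" field from the event object, defaulting to
    # "World". The context object is unused in this sketch.
    name = event.get('name', 'World') if isinstance(event, dict) else 'World'
    return {
        'statusCode': 200,
        'body': json.dumps(f'Hello {name}!')
    }

# Invoke the handler directly, as the Lambda runtime would on invocation.
response = lambda_handler({'name': 'Airbnb'}, None)
print(response['statusCode'])  # 200
print(response['body'])        # "Hello Airbnb!"
```

This pattern (pass a dict as the event, assert on the returned dict) is also how Lambda handlers are typically unit tested.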
26. Anatomy of a serverless application
/orders
/forums
/search
/lists
/user
/...
Amazon API
Gateway
AWS Secrets
Manager /
Parameter Store
Amazon
DynamoDB
Import sdk
Import http-lib
Import ham-sandwich
Pre-handler-secret-getter()
Pre-handler-db-connect()
Function myhandler(event, context) {
  if (<event handling logic>) {
    result = SubfunctionA()
  } else {
    result = SubfunctionB()
  }
  return result;
}
Function Pre-handler-secret-getter() {
}
Function Pre-handler-db-connect() {
}
Function SubfunctionA(thing) {
  ## logic here
}
Function SubfunctionB(thing) {
  ## logic here
}
Dependencies, configuration
information, common helper functions
Common helper functions
Business logic sub-functions
Your handler
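In Python, the same anatomy looks roughly like this. The secret and database helpers are stubbed placeholders (a real function would call Secrets Manager / Parameter Store and open a real connection), but the structure is the point: dependencies and one-time initialization at module scope, business logic in sub-functions, and a thin routing handler:

```python
# Dependencies, configuration information, common helper functions.
# Module scope runs once per container, not on every invocation.

def get_secret():
    # Placeholder: a real function would fetch from Secrets Manager.
    return 'dummy-secret'

def connect_db(secret):
    # Placeholder: a real function would open a DynamoDB/RDS connection.
    return {'connected': True, 'secret': secret}

SECRET = get_secret()
DB = connect_db(SECRET)

# Business logic sub-functions.
def handle_order(event):
    return {'route': '/orders', 'db': DB['connected']}

def handle_search(event):
    return {'route': '/search', 'db': DB['connected']}

# Your handler: route the event to the right sub-function.
def handler(event, context):
    if event.get('path') == '/orders':
        return handle_order(event)
    return handle_search(event)

print(handler({'path': '/orders'}, None))
```

Keeping initialization outside the handler means subsequent invocations on a warm container skip the setup cost.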
28. AWS operational responsibility models
On premises → Cloud (less → more operational responsibility handled by AWS)
Compute: Virtual machine → Amazon EC2 → AWS Elastic Beanstalk → AWS Fargate → AWS Lambda
Databases: MySQL → MySQL on EC2 → Amazon RDS for MySQL → Amazon Aurora → Aurora Serverless → Amazon QLDB / Amazon DynamoDB
Storage: Storage → Amazon S3
Messaging: Enterprise service bus (ESB) → Amazon MQ → Amazon Kinesis → Amazon EventBridge / SNS / SQS
Analytics: Hadoop → Hadoop on EC2 → Amazon EMR → Amazon ES → Amazon Athena
29. You
You
Fully managed services on AWS
Spend time innovating and building new applications, not managing infrastructure
AWS
Self-managed Fully managed
Schema design
Query construction
Query optimization
Automatic failover
Backup and recovery
Isolation and security
Industry compliance
Push-button scaling
Automated patching
Advanced monitoring
Routine maintenance
Built-in best practices
30. Data model and store | Common use cases | AWS service(s)
Relational – referential integrity, ACID transactions, schema-on-write | Lift and shift, ERP, CRM, finance | Aurora, RDS
Key-value – high throughput, low-latency reads and writes, endless scale | Real-time bidding, shopping cart, social, product catalog, customer preferences | DynamoDB
Document – store documents and quickly access and query on any attribute | Content management, personalization, mobile | DocumentDB
In-memory – query by key with microsecond latency | Leaderboards, real-time analytics, caching | ElastiCache
Graph – quickly and easily create and navigate relationships between data | Fraud detection, social networking, recommendation engines | Neptune
Time-series – collect, store, and process data sequenced by time | IoT applications, event tracking | Timestream
Ledger – complete, immutable, and verifiable history of all changes to application data | Systems of record, supply chain, health care, registrations, financial | QLDB
32. Architecture evolution
Monolithic application
Does everything
Shared release pipeline
Rigid scaling
High impact of change
Hard to adopt new technologies
Microservices
Does one thing
Independent deployments
Independent scaling
Small impact of change
Choice of technology
When the impact of change is small, release velocity can increase
33. Staff productivity: AWS benchmarking insights
Source: n = 1,036 AWS customers. AWS Cloud Economics Benchmarking, 2019.
Cloud improves efficiency (on-premises vs in cloud), with larger gains for re-architected applications:
All customers: 57.9% increase in the number of VMs managed per admin; 67.7% increase in the number of TBs managed per admin.
Apps that were re-architected, re-factored, or re-platformed: 147.7% increase in VMs managed per admin over time; 153.5% increase in TBs managed per admin over time.
CIOs tell us that cost is not the key factor in their decision to move to the cloud. Agility and innovation are usually at the centre of their motivation. However, being at least cost neutral is table stakes before considering any migration to the cloud.
We typically see at least 20% savings with just a lift-and-shift. Over the first months, customers continue to optimize their EC2 instances for an additional 10–20% savings. By adopting higher-level services, customers optimize further, many seeing 60% or more in savings.
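These stages compound rather than add. A quick illustrative calculation, using the rough figures quoted above (not guarantees):

```python
def remaining_cost(*savings_stages):
    """Fraction of the original cost left after successive savings stages."""
    cost = 1.0
    for saving in savings_stages:
        cost *= (1 - saving)
    return cost

# 20% from lift-and-shift, then a further ~15% from EC2 optimization.
after_optimisation = remaining_cost(0.20, 0.15)
print(f'{1 - after_optimisation:.0%} total savings')  # 32% total savings
```

Note that a 20% saving followed by a 15% saving yields 32% overall, not 35%, because the second saving applies to the already-reduced bill.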
[CLICK]
With AWS services, you don't need to worry about administration tasks such as server provisioning, patching, setup, configuration, backups, or recovery. AWS continuously monitors your clusters to keep your workloads up and running with self-healing storage and automated scaling. You focus on high-value application development tasks such as schema design and query construction and optimization, leaving AWS to take care of operational tasks on your behalf.
You never have to over- or under-provision infrastructure to accommodate application growth, intermittent spikes, and performance requirements, or incur fixed capital costs such as software licensing and support, hardware refresh, and resources to maintain hardware. AWS does it all for you, so you can spend time innovating and building new applications, not managing infrastructure.
Here’s an example of a customer who’s all-in on AWS. Airbnb moved away from self-managing databases to fully managed AWS databases such as Aurora, DynamoDB, and ElastiCache.
https://aws.amazon.com/solutions/case-studies/airbnb/
Image source: free stock image from Pexels.com (no license fee)
Every server has four key computing resources: CPU, memory, storage, and network capabilities.
Some workloads are more CPU intensive, others more memory intensive.
So we created different SKUs, or families – that’s the first letter to the right.
As we added new technology to our instances, we realized we wanted to expose these innovations, so we introduced generations, indicating the CPU capabilities, chipsets, and network capabilities.
The last part is size – simple t-shirt sizing. Each size keeps the same ratios and chipset, but has twice the CPU, memory, and storage of the previous size, enabling you to scale up your workloads.
What does all of this mean?
More choices enables better performance for specific workloads
Faster processors from Intel, processor choice with Graviton (Arm) and AMD, and instances for accelerated computing with our partner NVIDIA.
Network offerings deliver up to 100 Gbps performance.
Elastic Graphics or Elastic Inference and of course Elastic Block Store for greater performance and storage flexibility.
We will have nearly 300 instances by the end of the year to support virtually every workload and business need.
Basic AS example – update slide
----
Auto Scaling is for reliability and performance.
AWS has been providing Auto Scaling for a decade.
Cost saving is a bonus – the cherry on top – after keeping customers happy.
So, when should you use Spot, On-Demand, or RIs?
Picking just one option is the wrong solution.
I have personally seen customers reduce costs from their initial AWS estimates by 75% using these techniques – and that was without Spot!
Experts in cost-optimization analysis take these cost savings and apply business logic.
Non-production can make up as much as 90% of the capacity of some workloads, and commonly over 50%.
It doesn’t need to scale dynamically in response to demand.
10 x 5 is a common development pattern.
Anything running less than 75% of the time is a candidate for better cost optimization than RIs alone.
https://aws.amazon.com/premiumsupport/knowledge-center/stop-start-instance-scheduler/
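The Instance Scheduler solution linked above handles this in production. As a toy sketch of the underlying idea, a 10 x 5 schedule is just a predicate over weekday and hour (the 08:00–18:00 window here is an assumed example, and a real scheduler would also handle time zones and holidays):

```python
from datetime import datetime

def should_run_10x5(now):
    """True when a 10x5 dev schedule (08:00-18:00, Mon-Fri) says 'run'."""
    # weekday(): Monday == 0 ... Sunday == 6
    return now.weekday() < 5 and 8 <= now.hour < 18

print(should_run_10x5(datetime(2024, 1, 8, 10, 0)))  # Monday 10:00 -> True
print(should_run_10x5(datetime(2024, 1, 6, 10, 0)))  # Saturday -> False
```

A scheduled job evaluates this predicate and issues stop/start calls for the tagged instances accordingly.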
A serverless strategy enables you to achieve the maximum value from the cloud. By serverless, we’re talking about applications built with technologies that eliminate the need to manage servers, scale automatically and nearly infinitely, use a pay-for-value pricing model, and are automatically highly available.
This strategy, often referred to as a serverless operational model, enables you to dramatically reduce the time spent on things that aren’t core to your business (like managing servers) and focus on the products that drive your business.
Many customers have made a decision to achieve this value through a “serverless-first” strategy – a decision to use serverless services as a preference, unless the use case or workload demands they choose otherwise. Customers like iRobot, Alma Media, and Fender are all serverless first.
One of the big reasons for the growth of this approach is the new role of applications themselves: over the past 20 years, the role that the application plays has dramatically changed from a supporting utility to the business itself. With this in mind, a serverless-first approach enables customers to achieve the maximum benefit from the cloud – faster time to market and maximal agility.
We’ve seen a paradigm shift on compute. Not too long ago, and in some places even now, there are “server huggers”. These are the people who love to maintain their own data centers. They’re not able to leverage all the innovation that’s happening in the cloud. Making a new service means building a new team to build, operate and maintain that service. You need to renew the hardware regularly to keep up with the pace and then of course you’ve an upfront capital expenditure before you can do any innovation.
Then there is a good set of customers that are moving to the cloud and leveraging Virtual Machines. There are five reasons customers move to the cloud.
Agility – spin up/down quickly
Breadth of functionality – constantly growing number of services
Cost – you pay for what you use
Global deployment – make your application available across the world by deploying to regions/AZs
Elasticity – no need to plan ahead of time
Of those, cost is typically the discussion starter, but we believe agility and innovation are at the top. You can set up infrastructure for an experiment fearlessly and terminate it if it does not work out. This increases the amount of innovation that happens in your company.
The third wave is containers, which give you a high degree of mobility. You’re running something on premises and want to keep those investments, yet be able to run in production in the cloud. Containers allow you to package your applications and move them between different computing environments. They start a lot more rapidly than VMs, and you get a rich toolchain. But you’re still managing servers, updating the operating system, and patching it.
And then there is serverless, where you don’t have to worry about any of that. Customers tell us containers are great, but managing the infrastructure is undifferentiated heavy lifting for them. Too much of their time is spent corralling infrastructure, and not enough time is spent on the code, the application, and the application architecture – which is really where they want to focus. This is exactly what serverless offers: you build code and let somebody else worry about the infrastructure.
You have heard us talk about shared responsibility in the past. Simply stated, a shared responsibility model implies there are parts of the system that AWS is responsible for and there are parts of the systems that you as a customer must take responsibility for. In many cases we provide tools and there is a rich ecosystem of open source and commercial products that makes it easier for all of you to own your side of the responsibility box.
There is no one hard line on where this boundary is drawn between the two parts of shared responsibility. On one side of the spectrum, you can leverage the power and flexibility of EC2 to run your own database. You can use something like RDS to simplify the management of the database, or use Aurora to completely offload the database storage infrastructure to an AWS-managed backend. Alternatively, you can move to a fully managed database like DynamoDB, where you create tables, put data in those tables, and query the data. There is no infrastructure to manage. QoS is part of the database service, with sufficient knobs and dials to give you the right level of control.
The question we keep asking ourselves is: how can we draw this line in a way that allows our customers to innovate on business problems, but makes the underlying infrastructure less visible? Any time you spend corralling infrastructure is undifferentiated heavy lifting. Over the last 12 years, as more and more people start in the cloud or move major applications to the cloud, this definition of “undifferentiated heavy lifting” has evolved.
Decisions about data models and stores last longer than expected, so the extra time you put into this pays off. First think of your data model: how the entities relate to each other, and how this data will be read and used in the future.
Then consider the access patterns: the read/write ratio, durability, security, cost, and availability requirements. With this information, this table will help you make this important choice.
If you are building a monolith or are familiar with relational databases, then RDS or Aurora is suitable for a large number of applications. It can scale further by using ElastiCache.
[CLICK] Microservices SHOULD have an independent data store for each service. Their narrower characteristics and scaling requirements make key-value or document stores like DynamoDB or DocumentDB a good fit.
[CLICK] For more specialized use cases, look at Neptune for graph databases, Timestream for time-series data, or QLDB for an immutable ledger.
Monoliths
Single monolithic app | Must deploy entire app | One database for entire app | Organized around technology layers | State in each runtime instance | One technology stack for entire app | In-process calls locally, SOAP externally
When you're working with a monolithic app, you have many developers all pushing changes through a shared release pipeline, which causes friction at many points of the lifecycle.
Upfront, during development, engineers need to coordinate their changes to make sure they're not making changes that will break someone else's code.
If you want to upgrade a shared library to take advantage of a new feature, you need to convince everyone else to upgrade at the same time – good luck with that!
And if you want to quickly push an important fix for your feature, you still need to merge it in with everyone else's in-process changes. This leads to "merge Fridays", or worse yet "merge weeks", where all the developers have to compile their changes and resolve any conflicts for the next release.
Even after development, you also face overhead when pushing changes through the delivery pipeline.
You need to re-build the entire app, run all of the test suites to make sure there are no regressions, and re-deploy the entire app.
To give you an idea of this overhead, Amazon had a central team whose sole job was to deploy this monolithic app into production.
Even if you're just making a one-line change in a tiny piece of code you own, you still need to go through this heavyweight process and wait to catch the next train leaving the station.
For a fast-growth company trying to innovate and compete, this overhead and sluggishness is unacceptable.
Microservices are minimal function services that are deployed separately but can interact together to achieve a broader use case.
Many smaller minimal function microservices | Can deploy each independently | Each has its own datastore | Organized around business capabilities | State is externalized | Choice of technology for each microservice | REST calls over HTTP, messaging
When the monolith became too big to scale efficiently, we made a couple of big changes: one was architectural, and the other was organizational.
The teams were decoupled, and they had the tools necessary to efficiently release on their own.
Teams independently architect, develop, deploy, and maintain each microservice.
Ownership is key – every service has an owner. Owners architect, owners implement, owners support in production, owners can fix things, and owners care.
Each microservice often has its own datastore, and each microservice is fully decentralized – no ESBs, no single database, no top-down anything
Any given instance of a microservice is stateless – state, config and data pushed externally
Microservices support polyglot – each microservice team is free to pick the best technology
DevOps principles – automated setup and developers owning production support
Use of containers, which allow for simple app packaging and fast startup time
Use of cloud for elasticity, platform and software services