For anyone starting to build a cloud service, it is important to know how to design for scalability so the service can grow with future workloads. In this session we introduce how to design a scalable cloud service, covering the relevant AWS services and best practices.
4. A scalable architecture
• Can support growth in users, traffic, data size
• Without practical limits
• Without a drop in performance
• Seamlessly - just by adding more resources
• Efficiently - in terms of cost per user
12. We need a bigger server
• Add larger & faster storage (EBS)
• Use the right instance type
• Easy to change instance sizes
• Not our long term strategy
• Will hit a limit eventually
• No fault tolerance
13. Separating web and DB
• More capacity
• Scale each tier individually
• Tailor instance for each tier
– Instance type
– Storage
• Security
– Security groups
– DB in a private VPC subnet
14. But how do I choose which DB technology I need?
SQL? NoSQL?
15. Why start with a Relational DB?
• SQL is versatile & feature-rich
• Lots of existing code, tools, knowledge
• Clear patterns for scalability*
• Reality: eventually you will have a polyglot data layer
– There will be workloads where NoSQL is a better fit
– Use the right tool for each workload
* for read-heavy apps
16. Key Insight: Relational Databases are Complex
• Our experience running Amazon.com taught us that relational databases can be a pain to manage and operate with high availability
• Poorly managed relational databases are a leading cause of lost sleep and downtime in the IT world!
• Especially for startups with small teams
19. Offload static content
• Amazon S3: highly available hosting that scales
– Static files (JavaScript, CSS, images)
– User uploads
• S3 URLs – serve directly from S3
• Let the web server focus on dynamic content
20. Amazon CloudFront
• Worldwide network of edge locations
• Cache on the edge
– Reduce latency
– Reduce load on origin servers
– Static and dynamic content
– Even a few seconds of caching of popular content can have a huge impact
• Connection optimizations
– Optimize transfer route
– Reuse connections
– Benefits even non-cacheable content
22. Database caching
• Faster response from RAM
• Reduce load on database
[Diagram: the application server checks Amazon ElastiCache first. 1. If the data is in the cache, return the result; 2. if not in the cache, read from the RDS database; 3. and store the result in the cache]
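The cache-aside flow above can be sketched in a few lines of Python. Here a plain dict stands in for ElastiCache and a stub function for the RDS query; both are hypothetical stand-ins for illustration, not AWS client code.

```python
# Stand-ins for ElastiCache and RDS; a real app would use a memcached/redis
# client and a SQL driver instead of these in-process objects.
cache = {}
db = {"user:42": {"name": "Alice"}}  # hypothetical table keyed like the cache

db_reads = 0  # count how often we fall through to the database

def db_query(key):
    """Simulate a (comparatively slow) database read."""
    global db_reads
    db_reads += 1
    return db.get(key)

def get_with_cache(key):
    """Cache-aside: 1) try the cache, 2) on a miss read the DB, 3) store it."""
    if key in cache:
        return cache[key]          # step 1: served from RAM
    value = db_query(key)          # step 2: read from the database
    if value is not None:
        cache[key] = value         # step 3: store for next time
    return value

first = get_with_cache("user:42")   # miss: one DB read
second = get_with_cache("user:42")  # hit: no extra DB read
```

The second call never touches the database, which is exactly the load reduction the slide describes.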
25. High Availability
[Diagram: www.example.com resolved by Amazon Route 53 (DNS service); Amazon CloudFront in front of a single web server in Availability Zone a, with an RDS DB instance, ElastiCache node 1, and an S3 bucket for static assets]
26. High Availability
[Diagram: Route 53, CloudFront, the S3 bucket for static assets, an RDS DB instance, ElastiCache node 1, and now a web server in each of Availability Zones a and b]
27. High Availability
[Diagram: Elastic Load Balancing added in front of the web servers in Availability Zones a and b; Route 53, CloudFront, S3 static assets, RDS DB instance, ElastiCache node 1]
28. Elastic Load Balancing
• Managed Load Balancing Service
• Fault tolerant
• Health Checks
• Distributes traffic across AZs
• Elastic – automatically scales its capacity
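The behavior above can be modeled as a small sketch: round-robin distribution that skips instances failing their health checks. The server names and health flags below are hypothetical.

```python
from itertools import cycle

# Hypothetical backend fleet; in reality the health flags would come
# from ELB health checks against each instance.
servers = {"web-1": True, "web-2": False, "web-3": True}  # name -> healthy?

def make_balancer(fleet):
    """Round-robin over healthy instances only, like an ELB with health checks."""
    order = cycle(sorted(fleet))
    def pick():
        for _ in range(len(fleet)):
            name = next(order)
            if fleet[name]:
                return name
        raise RuntimeError("no healthy instances")
    return pick

pick = make_balancer(servers)
routed = [pick() for _ in range(4)]  # web-2 never receives traffic
```

Traffic alternates between the two healthy servers; the unhealthy one is transparently skipped, which is what lets a web server fail without users noticing.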
29. High Availability
[Diagram: Route 53, CloudFront, Elastic Load Balancing across the web servers in Availability Zones a and b, S3 static assets, RDS DB instance, ElastiCache node 1]
30. High Availability
[Diagram: Route 53, CloudFront, Elastic Load Balancing across web servers in Availability Zones a and b, S3 static assets, ElastiCache node 1, and the RDS DB instance now joined by an RDS DB standby in Availability Zone b]
31. Data layer HA
[Diagram: Route 53, Elastic Load Balancing across web servers in Availability Zones a and b, S3 static assets; the RDS DB instance and ElastiCache node 1 in Availability Zone a, with the RDS DB standby in Availability Zone b]
32. Data layer HA
[Diagram: the same layout with ElastiCache node 2 added in Availability Zone b, so the cache also spans both Availability Zones]
33. User sessions
• Problem: Often stored on local disk
(not shared)
• Quickfix: ELB Session stickiness
• Solution: DynamoDB
[Diagram: Elastic Load Balancing in front of two web servers; a user logged in on one server appears logged out when a later request is routed to the other]
34. Amazon DynamoDB
• Managed document and key-value store
• Simple to launch and scale
• To millions of IOPS
• Both reads and writes
• Consistent, fast performance
• Durable: perfect for storage of session data
https://github.com/aws/aws-dynamodb-session-tomcat
http://docs.aws.amazon.com/aws-sdk-php/guide/latest/feature-dynamodb-session-handler.html
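The idea behind those handlers can be sketched as a shared session store that any web server can read. Here a Python dict with TTLs stands in for the DynamoDB table; the class and field names are illustrative, not the real handler API.

```python
import time
import uuid

class SessionStore:
    """Sketch of an external session store (DynamoDB in the real setup):
    a session written by one web server is visible to every other server."""

    def __init__(self, ttl_seconds=3600):
        self._items = {}          # session_id -> (expires_at, data)
        self._ttl = ttl_seconds

    def create(self, data):
        """Persist a new session and return its id."""
        sid = str(uuid.uuid4())
        self._items[sid] = (time.time() + self._ttl, dict(data))
        return sid

    def get(self, sid):
        """Return session data, or None if it is missing or expired."""
        entry = self._items.get(sid)
        if entry is None or entry[0] < time.time():
            return None
        return entry[1]

# One shared store: "server A" logs the user in, "server B" sees the session.
store = SessionStore()
sid = store.create({"user": "alice", "logged_in": True})
session_seen_by_other_server = store.get(sid)
```

Because the session lives outside any single instance, servers stay stateless and can be added or terminated freely.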
36. Replace guesswork with elastic IT
[Diagram: startups pre-AWS, with traditional capacity planning, either fall short of demand (unhappy customers) or over-provision (wasted $$$); on the AWS Cloud, capacity tracks demand]
37. Scaling the web tier
[Diagram: Route 53, Elastic Load Balancing across one web server each in Availability Zones a and b, S3 static assets, RDS DB instance with standby, ElastiCache nodes 1 and 2]
38. Scaling the web tier
[Diagram: Route 53, Elastic Load Balancing, S3 static assets, RDS DB instance with standby, ElastiCache nodes 1 and 2, and two more web servers added, one per Availability Zone]
39. Scaling the web tier
[Diagram: the four web servers, two per Availability Zone, all attached to Elastic Load Balancing; Route 53, S3 static assets, RDS DB instance with standby, ElastiCache nodes 1 and 2]
40. Automatic resizing of compute clusters based on demand
• Control: define minimum and maximum instance pool sizes and when scaling and cool-down occur.
• Integrated with Amazon CloudWatch: use metrics gathered by CloudWatch to drive scaling.
• Instance types: run Auto Scaling for On-Demand and Spot Instances. Compatible with VPC.

aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name MyGroup \
  --launch-configuration-name MyConfig \
  --min-size 4 \
  --max-size 200 \
  --availability-zones us-west-2b us-west-2c

[Diagram: an Amazon CloudWatch alarm triggers an Auto Scaling policy]
42. Sanlih E-Television Uses AWS to Support Online Strategy
Sanlih E-Television is a nationwide cable TV network delivering some of the most popular TV channels in Taiwan.
"I estimate that we've saved 30% by selecting AWS over other cloud service providers."
Andy Wang, Chief Information Officer, Sanlih E-Television
• Wanted to take advantage of online and streaming platforms to build on leading position in the market
• Had to ensure IT infrastructure could handle demand and deliver content
• Began running streaming service, website and mobile apps on AWS
• Successfully integrated internet and mobile into channel mix
• Saved time and money due to stability of AWS platform and competitive pricing of services
43. Netflix Delivers Billions of Hours of Content per Month Using AWS
Netflix is one of the world's leading Internet television networks, with over 57 million members in nearly 50 countries.
"Our success with AWS can be attributed to the scalability, elasticity, and global availability of AWS services."
Eva Tse, Director, Big Data Platform, Netflix
• Needed flexible IT infrastructure to experiment, analyze, and grow its business worldwide
• Using AWS to measure its users' streaming experiences through its analytics platform
• Reports a reduction from weeks to seconds in testing time for new features
• Operates a 10 PB data 'warehouse' on Amazon S3 comprised of hundreds of millions of objects
• Designed to deliver billions of hours of content monthly using tens of thousands of instances across three regions
45. What does this mean in practice?
• Only store transient data on local disk
• Needs to persist beyond a single HTTP request?
– Then store it elsewhere:
– User uploads: Amazon S3
– User sessions: Amazon DynamoDB
– Application data: Amazon RDS
46. Having decomposed into small, loosely coupled, stateless building blocks…
…you can now scale out with ease.
47. Having decomposed into small, loosely coupled, stateless building blocks…
…we can also scale back with ease.
48. Take the shortcut
• While this architecture is simple you still need
to deal with:
– Configuration details
– Deploying code to multiple instances
– Maintaining multiple environments (Dev, Test, Prod)
– Maintaining different versions of the application
• Solution: Use AWS Elastic Beanstalk
49. AWS Elastic Beanstalk (EB)
• Easily deploy, monitor, and scale three-tier web
applications and services.
• Infrastructure provisioned and managed by EB
• You maintain control.
• Preconfigured application containers
• Easily customizable.
• Support for multiple platforms
51.
[Diagram: the AWS platform. Your Applications sit on top of services for Compute (EC2, ELB, Auto Scaling, Lambda, ECS), Network (VPC, Direct Connect, Route 53), Storage (EBS, S3, Glacier, CloudFront), Database (RDS, DynamoDB, ElastiCache), Analytics (Kinesis, Data Pipeline, Redshift, EMR), Mobile (Cognito, Cognito Sync, Mobile Analytics, Push Notifications), Application services (SQS, SWF, AppStream, Elastic Transcoder, SES, CloudSearch, SNS), Deployment & Management (Elastic Beanstalk, OpsWorks, CloudFormation, CodeDeploy, CodePipeline, CodeCommit), Security & Administration (CloudWatch, Config, CloudTrail, IAM, Directory Service, KMS), and Enterprise Applications (WorkSpaces, WorkMail, WorkDocs), all running on the AWS Global Infrastructure]
52. AWS building blocks
Inherently Scalable & Highly Available (automated):
• Elastic Load Balancing
• Amazon CloudFront
• Amazon Route 53
• Amazon S3
• Amazon SQS
• Amazon SES
• Amazon CloudSearch
• AWS Lambda
• …
Scalable & Highly Available (configurable):
• Amazon DynamoDB
• Amazon Redshift
• Amazon RDS
• Amazon ElastiCache
• …
Scalable & Highly Available (with the right architecture):
• Amazon EC2
• Amazon VPC
53. Stay focused as you scale your team
[Diagram: with on-premise infrastructure, roughly 30% of effort goes to your business and 70% to managing the "undifferentiated heavy lifting"; with AWS cloud-based infrastructure, roughly 70% goes to your business and only 30% to configuring your cloud assets]
55. No limit
[Diagram: www.example.com via Amazon Route 53 (DNS service); Elastic Load Balancing across Availability Zones a and b; S3 bucket for static assets; RDS DB instance with a standby and four read replicas; ElastiCache nodes 1 through 4; DynamoDB; plus CloudSearch, Lambda, SES, and SQS]
56. A quick review
• Keep it simple and stateless
• Make use of managed self-scaling services
• Multi-AZ and AutoScale your EC2 infrastructure
• Use the right DB for each workload
• Cache data at multiple levels
• Simplify operations with deployment tools
So let's avoid this by building a scalable architecture.
A scalable architecture can grow without practical limits simply by adding more resources.
We also care about cost efficiency so this is something else our architecture should achieve.
Let's start from day 1: maybe a couple of developers working on their idea.
You will need a server to host your app for testing and sharing with friends and family or some early enthusiasts.
You sign up for AWS, and with a few clicks you have a server.
You set up that single server, an EC2 instance, to test your code and run a private beta.
You install your DB and web server of choice, you upload your code, and you are good to go for now.
Soon after that you are ready to open access to your product for a public beta.
If things go well you will soon need a bigger server and that is easy on AWS.
You can add more and faster storage with EBS, and you can stop the instance, change its size, and start it again with more RAM, CPU, etc.
Of course that is not our long-term strategy: you will eventually hit a limit. Plus, having everything in a single very large server is not great in terms of fault tolerance or cost efficiency.
So as a first step let's go ahead and move the database to its own dedicated instance.
We have 2 servers so instantly a lot more capacity.
But we can also select a different instance type tailored to each workload.
Of course this is also better in terms of security – e.g. we can really lock down access to the db server.
And this is usually the point where someone will ask me: which database should I use? There are two main types of popular databases: relational databases and NoSQL databases.
And my default answer is that you should start with a Relational database.
There will be exceptions and later on we will talk about those and how those technologies scale.
But Relational databases will work well for most apps. They offer more features and there are more developers that have experience writing apps for them.
So start with that; the reality is that you can always add NoSQL later for the right workloads.
But we know from experience that managing Relational databases is hard especially at scale.
Databases are a frequent cause of downtime in the IT world!
This is especially true for startups with limited resources
You won't have access to consultants to help you.
So instead of managing your database on your own on an EC2 instance, you can use Amazon's Relational Database Service, RDS.
And RDS solves that problem for you. With a few clicks you can have a DB server running MySQL, Oracle, SQL Server, or PostgreSQL.
AWS handles all the provisioning and hardware replacement, makes it easy to migrate to a larger server when you need that, and handles backups, security patches, etc., so that you can build your application on top of a robust database implementation.
Now we could start scaling those 2 tiers straight away.
But let’s take a step back and implement some quick wins early on in the process.
Low effort changes that will give us a lot of room to breathe and cost efficiency as we grow.
First we want to store any static assets, like CSS files and images, on Amazon Simple Storage Service, S3.
S3 not only stores those files but it can also act as a highly scalable hosting service.
Instead of serving those assets through your web server, you offload this task by serving them directly from S3 URLs.
This will reduce the load for your Web server that can now focus on generating dynamic content.
Secondly we want to use CloudFront, which is a Content Delivery Network.
It can reduce latency for users around the world by caching both static and dynamic content on the edge locations of the AWS global infrastructure.
In some cases even a few seconds of caching for very popular pages can result in a huge reduction of load for your Web server.
Even for non-cacheable content, CloudFront will provide network optimizations.
So what we are doing here is using Cloudfront to serve the whole application
We can specify a different origin depending on specific file path patterns. In this example we fetch content from s3 or ec2.
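That routing rule can be sketched as an ordered list of path patterns, the way CloudFront cache behaviors work: the first matching pattern decides the origin. The patterns and origin names below are made up for illustration.

```python
import fnmatch

# Hypothetical CloudFront-style behaviors: first matching pattern wins,
# with "*" as the default (dynamic content from the web servers).
behaviors = [
    ("/static/*", "s3-static-assets"),   # S3 origin for static files
    ("/images/*", "s3-static-assets"),
    ("*",         "ec2-web-servers"),    # default origin
]

def pick_origin(path):
    """Return the origin for a request path, like CloudFront cache behaviors."""
    for pattern, origin in behaviors:
        if fnmatch.fnmatch(path, pattern):
            return origin
    return None

static_origin = pick_origin("/static/app.css")
dynamic_origin = pick_origin("/checkout")
```

Static files go to S3 and everything else falls through to the web servers, so one CloudFront distribution can front the whole application.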
Then we can apply caching on one more layer – between the application server and the DB server.
Any frequent queries to the db where the results do not change very often can have their results cached and served from an in memory cache.
This will provide a better experience and reduce the load on your database.
You can install something like Memcached or Redis on a set of EC2 instances, but similarly to what we described for the database, you can use a managed service called ElastiCache that lets you run those engines without the operational overhead.
OK so you are done with the beta, have refined your product but you want to get more sincere feedback and iterate fast.
The best way to do this is to start offering paid membership of some sort. Paid customers will typically give you the best feedback. They are demanding and are the ones that already think your product is worth paying for. It is now very important that you introduce high availability to your architecture. A hardware failure should not impact your end users.
Here is the current architecture which has multiple single points of failure.
Eg if the Web server crashes your app won't work.
We add a second Web server from the same AMI but on a separate AZ. Each AWS region has multiple AZs that are physically distinct locations. This allows you to build extremely robust architectures that utilize multiple data centers.
Because we have multiple Web servers we need to distribute http requests with elastic load balancing.
And you don't need multiple ELB instances, because ELB is not a single server: it is itself a managed and fault-tolerant service.
ELB will also automatically scale its own capacity to process incoming requests depending on traffic.
For the database, assuming we are using RDS, you can enable the Multi-AZ feature, which will launch a secondary node in a different AZ.
In the event of a failure, RDS will automatically fail over to that instance, maintaining the hostname so that you don't need to manually modify your app config.
Similarly, for the cache we expand our cluster across two AZs.
In the case of Memcached, each of those nodes stores a portion of the keys, so the impact of a failure is reduced: only part of our cache will become cold.
In the case of Redis, we can easily configure ElastiCache to set up replication and automatic failover.
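The Memcached case can be illustrated with simple hash-based sharding: each key maps to exactly one node, so losing a node only cools that node's share of the cache. Node names are hypothetical, and real clients typically use smarter (often consistent) hashing.

```python
import hashlib

# Hypothetical ElastiCache Memcached fleet.
nodes = ["cache-node-1", "cache-node-2"]

def node_for(key, fleet):
    """Deterministically hash a cache key onto one node of the fleet."""
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return fleet[digest % len(fleet)]

# Spread a batch of keys and count how many land on each node.
keys = [f"user:{i}" for i in range(1000)]
per_node = {n: sum(1 for k in keys if node_for(k, nodes) == n) for n in nodes}
```

Each node holds roughly half of the keys, so a single node failure leaves the other half of the cache warm.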
A problem we have to face when moving from one to two servers is how we manage user sessions. Typically, most runtime environments store those on the local file system, which is not shared. A user who signs in on one server will be logged out on a subsequent HTTP request that might be serviced by server 2.
A quick fix here is to use an ELB feature called session stickiness. This will send a particular user to the same backend server every time. We will see later on why this is not our long-term solution and why it is better to move this to DynamoDB.
And DynamoDB is a managed NoSQL data store on AWS that stores your data durably in multiple AZs. It also has consistently fast performance, so it is ideal for the storage of session data.
In fact, for PHP and Tomcat environments there are drop-in replacement session handlers that you can use to achieve that.
Going further on our journey let's assume your startup has seen some good traction and is ready to invest on marketing campaigns which could help it go viral.
In traditional hosting environments that is a nice but difficult problem to have. You need to guess how many servers to buy or rent. And you might order too many. Or too few. In AWS you can go to the console and add more web servers as required.
You can add for example 2 more web servers
And attach them to Elastic Load Balancing
Elastic Load Balancing itself will scale automatically.
But this is not something you want to do manually.
Even during the same day you have variance in your capacity requirements so you want to automatically adjust the number of servers in your fleet to be as close as possible to your actual needs.
Auto Scaling is a service that allows you to do that.
You configure a minimum and a maximum number of servers, and you set a rule that defines when to add servers and when to remove them.
E.g. when CPU utilization is high for more than 5 minutes.
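The rule just described can be sketched as a pure function over a five-minute window of CPU readings. The thresholds and pool sizes are example values; CloudWatch alarms and Auto Scaling policies implement the real version of this.

```python
# Sketch of a scaling rule: scale out when CPU stays above a high threshold
# for the whole window, scale in when it stays below a low threshold.
def desired_change(cpu_samples, high=70.0, low=30.0):
    """Return +1 to add a server, -1 to remove one, 0 to hold,
    given a window of one-minute CPU readings."""
    if all(s > high for s in cpu_samples):
        return +1
    if all(s < low for s in cpu_samples):
        return -1
    return 0

def clamp(size, change, min_size=2, max_size=200):
    """Respect the group's configured minimum and maximum pool size."""
    return max(min_size, min(max_size, size + change))

# Five minutes of high CPU on a four-server fleet -> add one instance.
new_size = clamp(4, desired_change([85, 90, 88, 92, 87]))
```

Requiring the whole window to breach the threshold avoids flapping on short spikes, which is what the cool-down settings in Auto Scaling are for.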
STORY BACKGROUND
Sanlih E-Television is a leading cable TV company in Taiwan with about 25 percent of the national viewing audience. The network operates six channels: 24-hour news, drama, lifestyle and pop, international, finance, and music television.
Amazon EC2 to run the website, Amazon RDS and DynamoDB for database services, Amazon Kinesis for real-time application monitoring and clickstream analytics.
AWS used to support its Internet platforms strategy including TV, online news apps, e-commerce, and OTT content.
SOLUTION & BENEFITS
AWS services (EC2) for online campaigns related to its programs, including popular dramas, and for sending out news flashes to mobile devices
Adopting Amazon Elasticsearch Service and Amazon Elastic MapReduce (Amazon EMR) for deeper insights into customer engagement through the company’s multiple online channels.
Saved 30% over other cloud service providers, 50% over on-premises solutions
CONTENT TAGS
Main use case: Website/Web App
Additional use case(s): Big Data
Keywords (separated by commas): broadcast, TV, cable TV network, online platform, e-commerce, TV channels, multiplatform, new media, mobile, streaming services
All AWS Services used by the customer: Amazon EC2, Amazon RDS, Amazon DynamoDB, Amazon Elasticsearch Service, Amazon Kinesis, Amazon Elastic MapReduce
Benefits Realized: Options are: Flexibility, Lower Cost, Lower Time To Market, User Experience
STORY BACKGROUND
Netflix is one of the world's leading Internet television networks, with over 57 million members in nearly 50 countries.
The company is using AWS to measure and understand its users’ streaming experiences through its analytics platform. Also using AWS to deliver billions of hours of content per month to users worldwide.
By using AWS, Netflix can reduce its testing times from weeks to seconds and store more than 10 PB of information (hundreds of millions of objects) on Amazon S3.
SOLUTION
[Main use case]. Big Data
[Additional use cases]. Analytics and Business Intelligence (BI); Content Delivery; Database and Data Warehouse; Development and Test
[Keywords separated by commas]. EMR, Analytics, S3, Data Warehouse, Testing, User experience, Hadoop, DevOps
[List all AWS Services used by the customer]. Using Amazon EC2, Amazon EMR, Amazon S3, DynamoDB
BENEFITS
Reduced testing time from weeks to seconds by launching instances instead of procuring servers.
Netflix operates a 10 PB data ‘warehouse’ on Amazon S3 comprised of hundreds of millions of objects.
Designed to deliver billions of hours of content monthly using tens of thousands of instances across three regions.
Moving organization to a DevOps model to promote fast ways to test and experiment new features.
[Benefits Realized]. Availability, Better Performance, Lower Time To Market, Scalability/Elasticity, Speed, User Experience
This sounds very easy and it is as long as you have a stateless architecture on your Web servers.
What does this mean?
Anything that needs to persist beyond the life of a single HTTP request should be stored in shared storage, not on the web server itself.
E.g. in our example we have already done the hard work.
We store user uploads on S3, user sessions on DynamoDB, and everything else, perhaps, on an RDS database.
With that we can simply add more servers when we need them
They will immediately affect new and existing users – we are not using session stickiness.
but more importantly we can terminate any of them at any time - none of them stores any important data that I have not saved elsewhere.
And the architecture I described is simple, but you still need to learn about AWS Auto Scaling, deploy your app to multiple servers, maintain different environments for development, testing, and production, and keep multiple versions of your app; maybe you also want to do A/B testing.
With Elastic Beanstalk you just provide your code as a zip file, and this service will configure ELB, launch servers in an Auto Scaling group, and deploy your code. It is a free service: you only pay for the resources it launches for you. It supports multiple runtimes and is very customizable.
You can move a lot faster and hide some complexity by using an automated service like elastic beanstalk.
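As a concrete illustration, Elastic Beanstalk environments can be tuned with an `.ebextensions` configuration file zipped alongside your code; the sketch below uses real Beanstalk option namespaces, but the file name and values (pool sizes, WSGI path) are hypothetical examples.

```yaml
# .ebextensions/scaling.config (hypothetical example values)
option_settings:
  # Size of the Auto Scaling group Beanstalk manages for you
  aws:autoscaling:asg:
    MinSize: 2
    MaxSize: 8
  # Entry point for a Python application container
  aws:elasticbeanstalk:container:python:
    WSGIPath: application.py
```

On deployment, Beanstalk reads this file and provisions the load balancer and Auto Scaling group accordingly, so you never configure those resources by hand.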
Another characteristic of scalable architectures is loose coupling. You can use SQS, Amazon's queuing service, to achieve that.
If you have tasks that can be performed asynchronously you can place those in SQS instead of having your users wait for them to be performed. You can use SQS as a buffer that protects your backend systems from sudden spikes. Because the backend system can process the queue in its own pace – so you don’t need to scale up aggressively.
You also move latency out of highly responsive request paths. And can hide any performance or availability issues from your end users.
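A deque makes the buffering idea concrete; it stands in for an SQS queue (the task names are illustrative), showing the front end acknowledging instantly while a worker drains the backlog at its own pace.

```python
from collections import deque

# In-process stand-in for an SQS queue between front end and backend.
queue = deque()
processed = []

def handle_request(task):
    """Front end: enqueue the work and return immediately."""
    queue.append(task)
    return "accepted"

def worker_drain(batch_size=2):
    """Backend worker: pull messages at its own pace, not the spike's pace."""
    for _ in range(min(batch_size, len(queue))):
        processed.append(queue.popleft())

# A burst of five requests arrives at once...
responses = [handle_request(f"encode-video-{i}") for i in range(5)]
# ...but the worker consumes only what it can handle per pass.
worker_drain()
remaining = len(queue)  # the rest stays safely buffered
```

The users get an instant response while the backend never sees more than its batch size at a time, so it does not need to scale up aggressively for spikes.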
A few days ago the AWS Lambda service became available, and this even allows you to offload the processing of asynchronous tasks to a managed execution layer, so that you don't even need EC2 instances to run this code.
And now that we have loads of users it is important we increase our pace and add new features.
Many times when you add functionality you might need to introduce new components to your setup. Perhaps you want to implement advanced search features. Or you want to send push notifications or implement video transcoding.
In those cases your first question should be whether there is an aws service that already achieves that and is already designed to scale instead of figuring out how to implement it on your own on ec2.
We have seen how services like EC2 give you the freedom to architect in myriad ways, but your app needs to be built in a certain way to take advantage of their elasticity.
And it is important to realize that the higher level services – you can think of them as building blocks – are already implemented to scale so that you don't have to architect from scratch.
In fact some of those do this automatically for you.
These services are available with a few clicks. And as long as you can use such services you can keep the size of your team small and still achieve great outcomes for your customers.
Even later if you have lots of revenue and you can hire engineers it is always better if they focus on the things that differentiate you and not on how to manage a search cluster.
If we follow the same concept we can keep on scaling with no practical limits.
As a summary the main points from today’s session are the following:
You want to keep things as simple as possible and create a stateless web architecture.
Distribute your resources in multiple AZs and use Auto Scaling for your EC2 infrastructure.
But do try to use managed services on AWS as much as possible and select the right db for the right job.
Caching will help you be more efficient and automated deployment tools can help you be operationally efficient.
In terms of next steps there is a lot of documentation online but also I would highly recommend you sign up for AWS Business Support as it can be an extension of your team.