6. Cloud Computing Benefits
No Up-Front Capital Expense
Low Cost
Pay Only for What You Use
Self-Service Infrastructure
Easily Scale Up and Down
Improve Agility & Time-to-Market
Deploy
7. Cloud Computing Fault-Tolerance Benefits
No Up-Front HA Capital Expense
Low Cost Backups
Pay for DR Only When You Use It
Self-Service DR Infrastructure
Easily Deliver Fault-Tolerant Applications
Improve Agility & Time-to-Recovery
Deploy
8. AWS Cloud allows Overcast Redundancy
Have the shadow duplicate of your infrastructure ready to go when you need it…
…but only pay for what you actually use
9. Old Barriers to HA are now Surmountable
Cost
Complexity
Expertise
10. AWS Building Blocks: Two Strategies
Inherently fault-tolerant services
• S3
• SimpleDB
• DynamoDB
• CloudFront
• SWF, SQS, SNS, SES
• Route53
• Elastic Load Balancer
• Elastic Beanstalk
• ElastiCache
• Elastic MapReduce
• IAM
Services that are fault-tolerant with the right architecture
• Amazon EC2
• VPC
• EBS
• RDS
11. The Stack:
Resources
Deployment
Management
Configuration
Networking
Facilities
Geographies
12. The Stack:
EC2 Instances
Amazon Machine Images
CW Alarms - AutoScaling
CloudFormation - Beanstalk
Route53 – ElasticIP – ELB
Availability Zones
Regions
13. Regional Diversity
Use Regions for:
Latency
• Customers
• Data Vendors
• Staff
Compliance
Disaster Recovery
… and Fault Tolerance!
33. New! Storage Gateway
[Diagram: in your datacenter, clients and application servers use an AWS Storage Gateway VM running on an on-premises host with direct attached or storage area network disks. The gateway connects over SSL, via the Internet or Direct Connect, to the AWS Storage Gateway Service, backed by Amazon Simple Storage Service (S3). Amazon Elastic Compute Cloud (EC2) and Amazon Elastic Block Store (EBS) also appear on the AWS side.]
34. Test! Use a Chaos Monkey!
Prudent
Conservative
Professional
…and all the cool kids are doing it
http://techblog.netflix.com/2010/12/5-lessons-weve-learned-using-aws.html
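The Chaos Monkey idea can be tried on paper before touching real infrastructure. The following is a minimal sketch, not Netflix's tool: it simulates a fleet as a plain dictionary, terminates one random instance, and checks whether the survivors still meet a minimum capacity. The function name `chaos_round`, the instance ids, and the zone names are all hypothetical.

```python
import random


def chaos_round(fleet, min_healthy, rng):
    """Terminate one random instance and report whether capacity still holds.

    fleet: dict mapping a (hypothetical) instance id to its zone name.
    Returns (victim id, surviving fleet, True if enough instances remain).
    """
    victim = rng.choice(sorted(fleet))
    survivors = {iid: zone for iid, zone in fleet.items() if iid != victim}
    return victim, survivors, len(survivors) >= min_healthy


# Simulated fleet: four web servers spread over two zones (made-up names).
fleet = {"web-1": "zone-a", "web-2": "zone-a", "web-3": "zone-b", "web-4": "zone-b"}
rng = random.Random(0)  # seeded so a test run is repeatable

victim, survivors, still_ok = chaos_round(fleet, min_healthy=3, rng=rng)
print(f"terminated {victim}; capacity still met: {still_ok}")
```

Running rounds like this regularly, against a fleet sized with some headroom, is one way to check that the loss of any single instance is a non-event.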
Cloud computing is a better way to run your business. The cloud helps companies of all sizes become more agile. Instead of running your applications yourself, you can run them on the cloud, where IT infrastructure is offered as a service, like a utility.
With the cloud, your company saves money: there are no up-front capital expenses, as you don’t have to buy hardware for your projects. The massive scale and fast pace of innovation of the cloud drive costs down for you. In the cloud, you pay only for what you use, just like electricity.
The cloud can also help your company save time and improve agility. It’s faster to get started: you can build new environments in minutes, as you don’t need to wait for new servers to arrive. The elastic nature of the cloud makes it easy to scale up and down as needed. At the end of the day, you have more resources left for innovation, which allows you to focus on projects that can really impact your business, like building and deploying more applications.
“With the high growth nature of our business, we were looking for a cloud solution to enable us to scale fast. Think twice before buying your next server. Cloud computing is the way forward.” - Sami Lababidi, CTO, Playfish
AWS is useful for everything from low-end traditional DR to high-end HA, but…
AWS encourages a rethinking of traditional DR / HA practices
Everything in the cloud is “off-site” and (potentially) “multi-site”
Using multiple sites (multiple AZs) comes largely for free
Using multiple geographically distributed sites (multiple Regions) is significantly cheaper and easier
Tends to move the default design point away from “cold” Disaster Recovery toward “hot” High Availability
Makes it easier to stack multiple mechanisms
• e.g., basic HA within one Region, DR site in a second Region
Each item a
Fault Separation
Amazon EC2 provides customers the flexibility to place instances within multiple geographic regions as well as across multiple Availability Zones. Each Availability Zone is designed with fault separation. This means that Availability Zones are physically separated within a typical metropolitan region, on different flood plains, in seismically stable areas. In addition to discrete uninterruptible power supply (UPS) and onsite backup generation facilities, they are each fed via different grids from independent utilities to further reduce single points of failure. They are all redundantly connected to multiple tier-1 transit providers.
Note that although traffic flowing across the private networks between Availability Zones in a single region stays on AWS-controlled infrastructure, all communication between regions crosses public Internet infrastructure, so appropriate encryption methods should be used to protect sensitive data. Data is not replicated between regions unless the customer does so proactively.
Distinct physical locations
Low-latency network connections between AZs
Independent power, cooling, network, security
Always partition app stacks across 2 or more AZs
Elastic Load Balance across instances in multiple AZs
Don’t confuse AZs with Regions!
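The advice to partition app stacks across two or more AZs can be sketched as a simple placement check. This is an illustrative model only, with no AWS API calls: `spread_across_zones` assigns instances round-robin to zones, and `survives_zone_loss` checks whether losing any one zone still leaves enough instances. Both function names and the zone names are hypothetical.

```python
from collections import Counter
from itertools import cycle


def spread_across_zones(instance_ids, zones):
    """Assign instances to zones round-robin; refuse a single-zone layout."""
    if len(zones) < 2:
        raise ValueError("use at least two Availability Zones")
    return dict(zip(instance_ids, cycle(zones)))


def survives_zone_loss(placement, min_instances):
    """True if losing any single zone still leaves min_instances running."""
    per_zone = Counter(placement.values())
    total = sum(per_zone.values())
    return all(total - lost >= min_instances for lost in per_zone.values())


placement = spread_across_zones(
    ["app-1", "app-2", "app-3", "app-4"], ["zone-a", "zone-b"]
)
print(placement)
print(survives_zone_loss(placement, min_instances=2))
```

With four instances over two zones, the app keeps two instances after a zone failure, so a requirement of two is met while a requirement of three is not.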
Note: the question is not “do you need to automate your deployment?” or “should I use automation when I’m using the cloud?” The answer to both is YES! The question is: if you’re using a fully standard PHP or Java stack, why manage it yourself? Beanstalk does that well, with zero lock-in. If you need something more complex, consider CloudFormation (note that you can use BOTH!).
Three-Tier Web App has been “fork-lifted” to the cloud
Everything in a single Availability Zone
Load balanced at the Web tier and App tier using software load balancers
Master and Standby database
Elastic IP on front-end load balancer only
S3 used as DB backup instead of tape
How can you use AWS features to make this app more highly available?
Class exercise: Use AWS features to make this web app more Highly Available
Use two Availability Zones for failover
Enable CloudWatch for monitoring and alarms
Use Auto Scaling at Web and App tiers (across two zones)
Use regular EBS Snapshots, save configured EC2 instances as AMIs
Replace front-end load balancer with ELB
Use load balancer on EC2 between Web and App tier
• Replace when ELB offers internal load-balancing
Use Elastic IP addresses for load balancer and database
Push all static content to S3 and/or CloudFront. Less popular content should be served from S3 directly.
Use Route53 to control public DNS entries to dynamic and static content, and to get Zone Apex support for ELB
Push logs to S3
Put DB replica in second zone for failover
Consider using RDS with Multi-AZ deployment
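The Auto Scaling step in the exercise (alarm-driven scaling across two zones) can be illustrated with a small pure-Python model. This is a sketch under stated assumptions, not the Auto Scaling API: the 70% and 30% CPU thresholds, the one-instance step, and the names `next_capacity` and `split_per_zone` are all hypothetical choices made for illustration.

```python
def next_capacity(current, avg_cpu, minimum, maximum, high=70.0, low=30.0):
    """Step capacity up or down by one instance, clamped to [minimum, maximum].

    The high/low CPU thresholds are illustrative assumptions, standing in
    for whatever alarm thresholds the team actually configures.
    """
    if avg_cpu > high:
        desired = current + 1
    elif avg_cpu < low:
        desired = current - 1
    else:
        desired = current
    return max(minimum, min(maximum, desired))


def split_per_zone(capacity, zone_count):
    """Divide capacity as evenly as possible across zones."""
    base, extra = divmod(capacity, zone_count)
    return [base + (1 if i < extra else 0) for i in range(zone_count)]


capacity = next_capacity(current=4, avg_cpu=85.0, minimum=2, maximum=8)
print(capacity, split_per_zone(capacity, zone_count=2))
```

Keeping the minimum at two or more, split across two zones, means a scale-in can never collapse the tier back into a single zone.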
Avoid single points of failure
Assume everything fails, and design backwards
Goal: Applications should continue to function even if the underlying physical hardware fails or is removed or replaced.
Design your recovery process
Trade off business needs vs. cost of high availability
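"Assume everything fails" translates naturally into client code that expects any one backend to be gone. The sketch below is one simple way to express that, assuming redundant endpoints are modelled as plain Python callables; `call_with_failover`, `broken`, and `healthy` are hypothetical names, and a real client would also need timeouts and backoff.

```python
def call_with_failover(endpoints, request):
    """Try each (name, handler) pair in order; return the first success.

    endpoints: list of (name, callable) pairs standing in for redundant
    backends. Raises RuntimeError only if every endpoint fails.
    """
    errors = []
    for name, handler in endpoints:
        try:
            return name, handler(request)
        except ConnectionError as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all endpoints failed: " + "; ".join(errors))


def broken(_request):
    """Simulates a backend whose instance has been terminated."""
    raise ConnectionError("instance terminated")


def healthy(request):
    """Simulates a working backend."""
    return request.upper()


served_by, reply = call_with_failover(
    [("zone-a", broken), ("zone-b", healthy)], "ping"
)
print(served_by, reply)
```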
Multiple DNS Targets
Load Balanced across Availability Zones
Auto-scaled web-cache servers with health checks
Auto-scaled web servers with health checks
Comprehensive config, data, and AMI backup
Monitoring, alarming and logging
Mid-tier Load Balancing or Queueing
Spans Availability Zones
Auto-scaled App Servers with health checks
Comprehensive config, data, and AMI backup
Monitoring, alarming and logging
DB-Tier Load Balancing or Queueing
Auto-scaled Database cache servers with health checks
Redundant Relational Database systems
• Mirrored, log-shipped, async or sync replicated
• Designed to scale horizontally (sharding)
Durable NoSQL or KV-store Data Systems
• No SPOF design
• Supports automatic re-balancing, replication, and fault-recovery
Monitoring, alarming and logging
Multi-AZ Deployments
• Synchronous replication across AZs
• Automatic fail-over to standby replica
Automated Backups
• Enables point-in-time recovery of the DB instance
• Retention period configurable
Snapshots
• User-initiated full backup of DB
• New DB can be created from snapshots
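The interaction between the configurable retention period and point-in-time recovery can be shown with a little date arithmetic. This is a sketch of the reasoning only, not an RDS API call: the helper names are hypothetical, and the five-minute lag before the latest restorable time is an assumption used for illustration.

```python
from datetime import datetime, timedelta


def restore_window(now, retention_days, lag):
    """Return (earliest, latest) restorable times for the given retention."""
    return now - timedelta(days=retention_days), now - lag


def can_restore_to(target, now, retention_days, lag=timedelta(minutes=5)):
    """True if target falls inside the point-in-time recovery window.

    lag models the delay before the most recent changes become restorable;
    the five-minute default is an assumed value, not a guarantee.
    """
    earliest, latest = restore_window(now, retention_days, lag)
    return earliest <= target <= latest


now = datetime(2012, 6, 1, 12, 0)
print(can_restore_to(now - timedelta(days=3), now, retention_days=7))
print(can_restore_to(now - timedelta(days=8), now, retention_days=7))
```

Anything older than the retention period has to come from a user-initiated snapshot instead, which is why the two mechanisms are usually used together.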