Maintaining physical Disaster Recovery (DR) datacenters grows more cost-prohibitive each year. By moving your DR data center to the AWS cloud, you enable faster disaster recovery and greater resiliency without the cost of a second physical datacenter.
In this webinar, we covered:
- Architecting “pilot light” to “hot standby” DR environments
- Multi AWS Availability Zone DR strategies
- What cloud DR offers that on-premises options can’t
- Lessons learned from DR implementations on AWS
- Demo: Building a hot standby DR environment
Learn more at https://www.softnas.com/aws
5. Terminology
Business Continuity
Business Continuity ensures an
organization's critical business functions
continue to operate or recover quickly
despite serious incidents.
Disaster Recovery
Disaster Recovery (DR) enables the
recovery or continuation of vital technology
infrastructure and systems following a
natural or human-induced disaster.
Recovery Time Objective
RTO is the targeted duration within which a
business process must be restored after a
disaster or disruption.
Recovery Point Objective
RPO is the maximum targeted period in
which data might be lost from an IT
service due to a major incident.
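The two objectives can be checked against an incident timeline. The following is a minimal sketch, not part of the webinar: the function names and the timestamps are illustrative assumptions. It measures the data-loss window (compared with the RPO) and the downtime (compared with the RTO).

```python
from datetime import datetime, timedelta


def data_loss_window(last_recovery_point: datetime, incident: datetime) -> timedelta:
    """Time between the last usable copy of the data and the incident (compare to RPO)."""
    return incident - last_recovery_point


def downtime(incident: datetime, restored: datetime) -> timedelta:
    """Time between the incident and service restoration (compare to RTO)."""
    return restored - incident


def meets_objectives(last_recovery_point: datetime, incident: datetime,
                     restored: datetime, rpo: timedelta, rto: timedelta) -> bool:
    """True when both the data-loss window and the downtime are within target."""
    return (data_loss_window(last_recovery_point, incident) <= rpo
            and downtime(incident, restored) <= rto)


# Hypothetical timeline: backup at 02:00, incident at 09:30, service back at 12:00.
last_backup = datetime(2024, 1, 1, 2, 0)
incident_at = datetime(2024, 1, 1, 9, 30)
restored_at = datetime(2024, 1, 1, 12, 0)

loss = data_loss_window(last_backup, incident_at)
outage = downtime(incident_at, restored_at)
within_targets = meets_objectives(last_backup, incident_at, restored_at,
                                  rpo=timedelta(hours=12), rto=timedelta(hours=4))
```

With these assumed timestamps, the loss window is 7.5 hours and the outage is 2.5 hours, so a 12-hour RPO and 4-hour RTO would both be met.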
6. Keep your Primary Datacenter, but Shift DR to AWS
[Diagram: two replication setups compared]
Traditional DR: Users; Primary Datacenter; Replication; DR Datacenter
AWS DR: Users; Main Datacenter; Replication; AWS (Amazon S3, Import/Export, Amazon EC2, Amazon Route 53, SoftNAS Cloud, + additional services)
7. DR Datacenter vs. AWS
On-Premises
• High cost to build disaster recovery
sites or datacenters (CapEx)
• High cost of storage, backup,
archival and retrieval tools, and
processes (OpEx)
• Difficult planning, procurement and
deployment
• Challenging to verify DR plans
• Single level of DR across the
organization
AWS
• Low upfront investment
(CapEx)
• On-demand costs (OpEx)
• Consistent experience across AWS
environments
• Recovery automation
• Separate levels of DR per
application or business unit
9. DR Infrastructure Management
[Diagram: two infrastructure stacks, marked to show what you manage with a DR datacenter and what you manage with AWS DR]
DR Datacenter: Routers, Firewalls, Network, Application Licenses, Operating Systems, Hypervisor, Servers, SAN fabric, Primary Storage, Backup, Archive
AWS: Routers, Firewalls, Network, Application Licenses, Operating Systems, Hypervisor, Servers, SAN fabric, Snapshot Storage, Backup, Archive
Legend: What You Manage with a DR Datacenter; What You Manage with AWS DR
10. DR Services Mapping
Your Datacenter -> AWS
DNS -> Route 53
Load Balancers -> ELB/Appliance
Web/App Servers -> EC2/Auto scaling
Database Servers -> DB failover nodes
AD/Authentication -> AD failover nodes
Datacenters -> Availability zones
Disaster Recovery -> Multi-region
13. DR Architectures
Backup & Restore
Backup of on-premises data to AWS to use in a DR event
RPO: 24 hours; RTO: 24 hours; Cost: $
Pilot Light
Replicate data and minimal running services into AWS, ready to take over and flare up
RPO: 12 hours; RTO: 4 hours; Cost: $$
Hot Standby
Replicate data and services into AWS ready to take over
RPO: 1-4 hours; RTO: 15 min; Cost: $$$
Multi-Site
Replicated and load balanced environments that are both actively taking production traffic
RPO: <15 min; RTO: 0-5 min; Cost: $$$$
Scale across the four levels: from "Business continuity begins" to "Uninterrupted business continuity"
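One way to use the four levels is to pick the lowest-cost architecture whose RPO and RTO still meet an application's requirements. The sketch below is illustrative only: the class and function names are assumptions, and where the slide gives a range ("1-4 hours", "<15 min", "0-5 min") the upper bound is used.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class DrArchitecture:
    name: str
    rpo: timedelta      # upper bound of the RPO shown on the slide
    rto: timedelta      # upper bound of the RTO shown on the slide
    cost_level: int     # number of "$" signs on the slide


ARCHITECTURES = [
    DrArchitecture("Backup & Restore", timedelta(hours=24), timedelta(hours=24), 1),
    DrArchitecture("Pilot Light", timedelta(hours=12), timedelta(hours=4), 2),
    DrArchitecture("Hot Standby", timedelta(hours=4), timedelta(minutes=15), 3),
    DrArchitecture("Multi-Site", timedelta(minutes=15), timedelta(minutes=5), 4),
]


def cheapest_architecture(required_rpo: timedelta,
                          required_rto: timedelta) -> Optional[DrArchitecture]:
    """Return the lowest-cost architecture meeting both targets, or None if none does."""
    candidates = [a for a in ARCHITECTURES
                  if a.rpo <= required_rpo and a.rto <= required_rto]
    return min(candidates, key=lambda a: a.cost_level, default=None)


# Hypothetical requirement: lose at most 8 hours of data, be back within 8 hours.
choice = cheapest_architecture(timedelta(hours=8), timedelta(hours=8))
```

Under these assumed bounds, an 8-hour RPO rules out Backup & Restore and Pilot Light, so the sketch selects Hot Standby.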
14. Backup & Restore, Pilot Light, Hot Standby, Multi-Site
[Diagram: AWS services added through the levels of DR]
Storage: S3, SoftNAS Cloud, Glacier, EBS Volumes
Networking: Route 53, Direct Connect, VPN, Multiple Direct Connects, VPC
Compute: EC2, ELB, Auto Scaling
Deployment/Management: CloudFormation, IAM
16. Backup & Restore – How it Works
Advantages
• Simple to get started
• Cost effective (mostly backup storage)
Preparation Phase
• Start SoftNAS Cloud 30-day free trial
• Install and configure SoftNAS Cloud
• Describe procedure to restore from backup
on AWS
• Know which AMI to use, build your
own as needed
• Know how to switch to new system
• Know how to configure the
deployment
In Case of Disaster
• Retrieve backups from S3
• Bring up required infrastructure
• EC2 instances with prepared AMIs,
Load Balancing, etc.
• Restore system from backup
• Switch over to the new system
• Adjust DNS records to point to AWS
Objectives
• RTO: as long as it takes to bring up
infrastructure and restore system from
backups
• RPO: time since last backup
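The two objectives above reduce to simple arithmetic. The sketch below is a rough estimate under stated assumptions, not a measured result: the function names, the 1000 GB data size, the 100 MB/s restore throughput, and the provisioning and switchover times are all hypothetical inputs.

```python
def restore_hours(data_gb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to copy data_gb gigabytes at a sustained throughput (1 GB = 1000 MB)."""
    return data_gb * 1000 / throughput_mb_per_s / 3600


def estimated_rto_hours(provision_hours: float, data_gb: float,
                        throughput_mb_per_s: float, switchover_hours: float) -> float:
    """RTO estimate: bring up infrastructure, restore from backup, then switch over."""
    return (provision_hours
            + restore_hours(data_gb, throughput_mb_per_s)
            + switchover_hours)


def worst_case_rpo_hours(backup_interval_hours: float) -> float:
    """If the disaster strikes just before the next backup, a full interval of data is lost."""
    return backup_interval_hours


# Hypothetical inputs: 1 hour to provision, 1000 GB restored at 100 MB/s
# (10,000 seconds, about 2.8 hours), and 30 minutes to switch DNS over.
rto = estimated_rto_hours(provision_hours=1.0, data_gb=1000,
                          throughput_mb_per_s=100, switchover_hours=0.5)
rpo = worst_case_rpo_hours(backup_interval_hours=24)
```

The restore term usually dominates, which is why the data volume and the available bandwidth matter more to this RTO than the instance launch time.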
17. Pilot Light Architecture
[Diagram labels]
Route 53, ELB, CloudFormation
Corporate data center (On-premises, Active Production): Web Servers, App Servers, DB Server, 1 TB Data Volume
AWS region (AWS, Active Production): Web Servers, App Servers, DB Server, 1 TB Data Volume
Data Replication, Direct Connect
Monthly cost labels:
EC2 (m3.xlarge): $205/Month
EBS (GP2): $100/Month
EC2 (t2.medium): $0/Month
ELB (100GB Data): $0/Month
EC2 (t2.small): $0/Month
ELB (100GB Data): $0/Month
18. Pilot Light – How it Works
Advantages
• Very cost effective (fewer 24/7 resources)
Preparation Phase
• Enable replication of all critical data to AWS
• Prepare all required resources for
automatic start
• AMIs, Network Settings, Load
Balancing, etc.
• Reserved Instances
In Case of Disaster
• Automatically bring up resources around
the replicated core data set
• Scale the system as needed to handle
current production traffic
• Switch over to the new system
• Adjust DNS records to point to AWS
Objectives
• RTO: around 4 hours
• RPO: around 12 hours
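The "In Case of Disaster" steps above form an ordered runbook. A minimal sketch of how such a sequence could be automated follows; the `run_failover` helper and the placeholder actions are hypothetical, and a real runbook would replace each lambda with calls to whatever provisioning and DNS tooling is in use.

```python
from typing import Callable, List, Tuple

# A step is a human-readable name plus an action that returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_failover(steps: List[Step]) -> List[str]:
    """Run steps in order and return the names completed; stop at the first failure."""
    completed: List[str] = []
    for name, action in steps:
        if not action():
            raise RuntimeError(f"Failover step failed: {name}")
        completed.append(name)
    return completed


# Placeholder actions (assumptions): each always succeeds in this sketch.
pilot_light_steps: List[Step] = [
    ("Bring up resources around the replicated core data set", lambda: True),
    ("Scale the system to handle current production traffic", lambda: True),
    ("Switch over to the new system", lambda: True),
    ("Adjust DNS records to point to AWS", lambda: True),
]

done = run_failover(pilot_light_steps)
```

Stopping at the first failed step keeps DNS pointed at the old site until the AWS environment is actually ready to take traffic.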
19. Hot Standby Architecture
[Diagram labels]
Route 53, ELB, CloudFormation
Corporate data center (On-premises, Active Production): Web Servers, App Servers, DB Server, 1 TB Data Volume
AWS region (AWS, Active Production): Web Servers, App Servers, DB Server, 1 TB Data Volume
Data Replication, Direct Connect
Monthly cost labels:
EC2 (m3.xlarge): $205/Month
EBS (GP2): $100/Month
EC2 (t2.medium): $41/Month
ELB (100GB Data): $19/Month
EC2 (t2.small): $22/Month
ELB (100GB Data): $19/Month
R53 (1M Query): $4/Month
20. Hot Standby – How it Works
Advantages
• Handles production workloads well
Preparation Phase
• Enable replication of all critical data to
AWS
• Prepare all required resources for
automatic start
• AMIs, Network Settings, Load
Balancing, etc.
• Reserved Instances
In Case of Disaster
• Automatically bring up resources around
the replicated core data set
• Scale the system as needed to handle
current production traffic
• Switch over to the new system
• Adjust DNS records to point to AWS
Objectives
• RTO: around 15 minutes
• RPO: around 1-4 hours
21. Multi-site Architecture
[Diagram labels]
Route 53, ELB, CloudFormation
Corporate data center (On-premises, Active Production): Web Servers, App Servers, DB Server, 1 TB Data Volume
AWS region (AWS, Active Production): Web Servers, App Servers, DB Server, 1 TB Data Volume
Data Replication, Direct Connect
Monthly cost labels:
EC2 (m3.xlarge): $205/Month
EBS (GP2): $100/Month
EC2 (t2.medium): $82/Month
ELB (100GB Data): $19/Month
EC2 (t2.small): $44/Month
ELB (100GB Data): $19/Month
R53 (1M Query): $4/Month
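The monthly cost labels on slides 17, 19, and 21 can be totaled to compare the three architectures. The sketch below only adds up the figures as printed on those slides; they are the presenters' example numbers, not current AWS pricing, and the variable names are illustrative.

```python
# Monthly cost labels (USD) as printed on the Pilot Light, Hot Standby,
# and Multi-site architecture slides. Lists of pairs are used because
# the ELB label appears twice on each slide.
PILOT_LIGHT = [
    ("EC2 (m3.xlarge)", 205), ("EBS (GP2)", 100),
    ("EC2 (t2.medium)", 0), ("ELB (100GB Data)", 0),
    ("EC2 (t2.small)", 0), ("ELB (100GB Data)", 0),
]
HOT_STANDBY = [
    ("EC2 (m3.xlarge)", 205), ("EBS (GP2)", 100),
    ("EC2 (t2.medium)", 41), ("ELB (100GB Data)", 19),
    ("EC2 (t2.small)", 22), ("ELB (100GB Data)", 19),
    ("R53 (1M Query)", 4),
]
MULTI_SITE = [
    ("EC2 (m3.xlarge)", 205), ("EBS (GP2)", 100),
    ("EC2 (t2.medium)", 82), ("ELB (100GB Data)", 19),
    ("EC2 (t2.small)", 44), ("ELB (100GB Data)", 19),
    ("R53 (1M Query)", 4),
]


def monthly_total(items):
    """Sum the monthly cost of every line item."""
    return sum(cost for _, cost in items)


totals = {
    "Pilot Light": monthly_total(PILOT_LIGHT),
    "Hot Standby": monthly_total(HOT_STANDBY),
    "Multi-site": monthly_total(MULTI_SITE),
}
```

On the slide figures this gives $305, $410, and $473 per month respectively. The replicated data tier (the m3.xlarge instance and EBS volume) costs the same in all three, so the difference comes from how much of the web and app tier is kept running.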
22. Multi-site – How it Works
– Advantages
• Can take all production load at any moment
– Preparation
• Scales in and out fully with production load
– In Case of Disaster
• Immediately fail over all production load
• Adjust DNS records to point to AWS
– Objectives
• RTO: minutes
• RPO: minutes
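Immediate failover depends on deciding quickly, but not too eagerly, that the primary site is down. Route 53 supports health checks with failover routing; the sketch below makes no AWS calls and only models one possible decision rule. The `FailoverMonitor` class, the threshold of three consecutive failures, and the hostnames are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailoverMonitor:
    primary: str
    standby: str
    failure_threshold: int = 3      # consecutive failed checks before failing over
    consecutive_failures: int = 0

    @property
    def active(self) -> str:
        """Endpoint that DNS should currently point to."""
        if self.consecutive_failures >= self.failure_threshold:
            return self.standby
        return self.primary

    def record(self, primary_healthy: bool) -> str:
        """Record one health check result and return the endpoint to serve."""
        if primary_healthy:
            self.consecutive_failures = 0   # a healthy check fails back to primary
        else:
            self.consecutive_failures += 1
        return self.active


# Hypothetical hostnames for the on-premises site and the AWS site.
monitor = FailoverMonitor(primary="onprem.example.com", standby="aws.example.com")
history = [monitor.record(healthy) for healthy in (True, False, False, False, True)]
```

Requiring several consecutive failures avoids failing over on a single dropped check. This sketch fails back as soon as one check succeeds, which a production setup may prefer to make a manual step.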
23. Customer DR Example
Customer has a combination of Tier 1, Tier 2, and Tier 3 business applications. They did the following:
Tier 1 Apps
RPO & RTO <15 minutes
Multi-site DR
• Critical core elements of system already configured
• EC2 instances running for critical services
• Pre-configured AMIs for Tier-2 apps that can be quickly provisioned upon failure
• Cloud infrastructure load-balanced and configured for automatic failover
• Initial data synchronization using in-house backup software or FTP
• Incremental data replicated / synchronized using cloud NAS
Tier 2 Apps
RPO & RTO <4 hours
Pilot Light DR
• EC2 instances for all services running at all times
• In-house and cloud infrastructure load-balanced and configured for auto-failover
• Initial data synchronization using in-house backup software or FTP
• Incremental data replicated / synchronized using cloud NAS
Tier 3 Apps
RPO & RTO <8 hours
Backup & Restore
• All data replicated into S3 bucket
• Initial data synchronization using in-house backup software or FTP
• Pre-configured AMIs for Tier 1 and Tier 2 apps quickly provisioned upon failure
• Incremental data replicated / synchronized using cloud NAS
• EC2 instances spun up from objects within S3 buckets
Fast Performance: Fast disk-based storage and retrieval of files.
Compliance: Fast retrieval of files allows you to avoid fines for missing compliance deadlines.
Elasticity: Add any amount of data, quickly. Easily expire and delete without handling media.
Secure: Secure and durable cloud disaster recovery platform with industry-recognized certifications and audits.
Partners: AWS solution provider and system integration partners to help with your deployment.
Businesses are using the AWS cloud to enable faster disaster recovery of their critical IT systems without incurring the infrastructure expense of a second physical site. The AWS cloud supports many popular disaster recovery (DR) architectures, from “pilot light” environments that may be suitable for small customer workload data center failures to “hot standby” environments that enable rapid failover at scale.