VMworld 2013
Mauricio Barra, VMware
Learn more about VMworld and register at http://www.vmworld.com/index.jspa?src=socmed-vmworld-slideshare
Thomas McQuillan, UnitedHealth Group
VMworld 2013: VMware vCenter Site Recovery Manager – Solution Overview and Lessons from a Fortune 500 Health Care Company Implementation
1. vCenter Site Recovery Manager – Solution Overview and Lessons from a Fortune 500 Health Care Company Implementation
Mauricio Barra, VMware
Thomas McQuillan, UnitedHealth Group
Session BCO5733 (#BCO5733)
2. Agenda
• Context of BC/DR
• vCenter Site Recovery Manager 5.5
• Licensing, Pricing and Packaging
• UnitedHealth Group: An SRM implementation at scale from a Fortune 500 company
4. Uptime and Protection of Data Are Critical for Business
• Revenue: Continuously available services ensure revenue streams
• Productivity: Enables the workforce to work at full capacity
• Compliance: Guarantees responsiveness to auditing entities (SOX, ISO)
• Reputation: Protects relationships with customers and partners
5. Improving BC/DR Is at the Top of IT Initiatives
Among top 5 technology priorities in 2012:
• 40% report High Priority
• 20% report Critical Priority
Source: Forrester, “BC/DR Remain Priorities For 2012 But Take A Backseat To Cost-Saving And Efficiency Initiatives”, October 2011
#1 driver for virtualization:
• 57% report it’s “very important” to adopt x86 virtualization
Source: Forrester, “Server Virtualization Predictions For 2013”, March 2013
6. Legacy Disaster Recovery Solutions Are Not Adequate
• Expensive (>$10K per app)
• Complex recovery plans
• Unreliable failovers
[Figure labels: Apps, Hosts, Storage, Network; Software, Hosts, Storage, Facilities]
Failure to meet business requirements:
• Long RTOs: days to weeks
• Too much time and resources consumed
7. VMware Enables IT Business Continuity at All Levels
(Chart columns: Planned Downtime, Unplanned Downtime)
Local Application Availability
• vMotion
• Storage vMotion
• Fault Tolerance
• High Availability
• App HA
Site Application Availability
• Site Recovery Manager
• DR to the Cloud with SRM
Data Protection
• vSphere Replication
• vSphere Data Protection Advanced
• vSphere APIs for Data Protection (VADP)
9. Key Components of a Disaster Recovery Solution with SRM
Disaster Recovery: Ensuring recovery or continuation of operations at an alternate site in the case of an outage at the primary site
[Diagram: Protected Site and Recovery Site, each with vCenter Server, Site Recovery Manager, vSphere, and Storage]
10. Key Components of a Disaster Recovery Solution with SRM
Replication Options
• vSphere Replication
• Array-Based Replication (3rd party)
[Diagram: Protected Site (vCenter Server, Site Recovery Manager, vSphere, Storage) replicating to the recovery site]
11. Copy Individual Virtual Machines with vSphere Replication
Overview
• Only true hypervisor-based replication for vSphere
• Asynchronous RPOs (15 min to 24 hrs)
• Managed directly from vCenter Server
• Included with vSphere Essentials Plus and above
Benefits
• Reduce replication software costs
• Reduce storage costs using heterogeneous arrays
• Simpler VM-level replication
• SRM integration enables automated DR
[Diagram: vSphere Replication from Site A (Primary) to Site B (Recovery)]
12. What’s New with vSphere Replication
New feature: benefit
• Multiple point-in-time recovery: Choose to revert to a previous ‘known good point’ after failover
• Multiple vSphere Replication appliances per vCenter Server: Enables new topologies with up to 10 vSphere Replication appliances
• Support for Storage vMotion and Storage DRS: Leverage this vSphere functionality on VMs being replicated
• Support for Virtual SAN (public beta): Reduce storage costs replicating to and from Virtual SAN
• Dramatic speed improvement: New optimizations gain up to 5x speed improvement in replication
13. Site Recovery Manager Delivers Simple and Reliable DR
vCenter Site Recovery Manager
• DR orchestration solution that automates testing and execution of centralized recovery plans
• Leverages vSphere Replication or a broad range of array-based replication solutions
Benefits
• Up to 50% lower TCO for DR
• Set up recovery plans in minutes, not weeks
• Initiate orchestration with one click
• Test as frequently as needed
[Diagram: Site A (Primary) and Site B (Recovery), each with VMware vCenter Server, Site Recovery Manager, VMware vSphere, and servers, linked by array-based replication or vSphere Replication]
14. What’s New with Site Recovery Manager
New feature: benefit
• Multiple point-in-time recovery with vSphere Replication: Choose to revert to a previous ‘known good point’ after failover
• Support for Storage vMotion and Storage DRS: Leverage this vSphere functionality on VMs being replicated
• Support for Virtual SAN (public beta): Reduce storage costs using Virtual SAN with vSphere Replication
15. SRM Transforms Management of Recovery and Migration Plans
From Complex Runbooks…
• Weeks or months to set up recovery plans
• Unstructured and error-prone
• Quickly fall out of sync with app and infrastructure changes
…to Simple Recovery Plans
• Simple setup in minutes
• Defined workflows eliminate errors
• Simple to keep in sync with changes
16. Frequent Testing Reduces Recovery Risk
Traditional Disaster Recovery
• During the testing gap, organizations can’t be sure that they can recover the current IT environment
• A failover scenario may take days or weeks to complete, leaving the business at extreme risk
• Lack of confidence in DR process
[Chart: recovery risk versus time, with a testing gap between DR tests]
17. Frequent Testing Reduces Recovery Risk
SRM provides assurance that DR objectives will be met.
[Charts: recovery risk versus time for Traditional Disaster Recovery (testing gap between DR tests) and for Site Recovery Manager (frequent DR testing)]
18. SRM Automates Every Workflow of DR Orchestration
Non-disruptive Testing
• Automated testing in an isolated network
• Test as frequently as needed for predictable RTOs
Automated Failover
• 1-click initiation
• Automated execution of user-defined recovery plan
Automated Failback
• Automatically re-protect VMs from Site B to Site A
• Reverse original recovery plan
Planned Migrations
• Graceful shutdown of production VMs
• ‘Data sync’ ensures zero data loss
[Diagram: SRM orchestrating replication from the main site to the recovery site]
19. SRM’s Automation Reduces The Cost of Disaster Recovery
DR Costs per VM per Year
• Manual DR: $2,234 total ($1,757 DR management and testing, $477 replication)
• SRM only: $1,564 total, -30% ($800 DR management and testing, $288 SRM software, $477 replication)
• SRM + vSphere Replication: $1,087 total, -21% ($800 DR management and testing, $288 SRM software)
50% lower DR costs (not factoring cost of downtime)
• Over $1,100 savings per year for each protected VM
• Avg. cost of downtime is $145,000 per hour
• Planned migrations add 5% cost savings
Source: The Total Economic Impact of VMware vCenter Site Recovery Manager, Forrester, May 2013
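The per-VM figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the assignment of each dollar figure to a bar (management and testing, SRM software, replication) is an assumption inferred from the chart totals, and the function names are hypothetical. Under that assumption the computed totals land within $1 of the slide's $1,564 and $1,087, which suggests the slide rounds its components.

```python
# Per-VM annual cost components read from the slide's bar chart (USD).
# ASSUMPTION: which figure belongs to which bar is inferred from the totals.
COSTS = {
    "Manual DR": {"management_and_testing": 1757, "srm_software": 0, "replication": 477},
    "SRM only": {"management_and_testing": 800, "srm_software": 288, "replication": 477},
    "SRM + vSphere Replication": {"management_and_testing": 800, "srm_software": 288, "replication": 0},
}


def total(scenario: str) -> int:
    """Sum the cost components of one scenario."""
    return sum(COSTS[scenario].values())


def savings_vs_manual(scenario: str) -> float:
    """Fraction saved relative to the Manual DR baseline."""
    baseline = total("Manual DR")
    return (baseline - total(scenario)) / baseline


if __name__ == "__main__":
    for name in COSTS:
        print(f"{name}: ${total(name):,} per VM per year, "
              f"{savings_vs_manual(name):.0%} below manual DR")
```

Read this way, the slide's -30% and -21% steps are both measured against the manual baseline, and together they give the roughly 50% reduction quoted on the slide.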
20. DR2C Delivers SRM Benefits without Secondary Datacenter
DR to the Cloud with SRM: cost-efficient DR services
• Subscription-based
• Shared resources lower cost
• Providers offer variety of pricing, packaging, service levels and deployment options
[Diagram: main site (vSphere, vCenter Server, SRM) using vSphere Replication to a shared recovery site in a public cloud; label: Partner Ecosystem]
22. SRM Available a-la-Carte or with vCloud Suite Enterprise
Packaging: A-la-carte
• Licensing: Per VM
• Included with each license: SRM only; two editions (Standard or Enterprise); entitlement to protect a certain number of licensed virtual machines
Packaging: vCloud Suite Enterprise
• Licensing: Per CPU
• Included with each license: SRM, vSphere Ent+ and all the components of vCloud Suite Enterprise; entitlement to protect an unlimited number of virtual machines on licensed processors
23. SRM a-la-Carte Available in Two Editions
Licensing and Pricing
• Per protected virtual machine (license only): Standard $195; Enterprise $495
Scalability Limits
• Maximum protected VMs: Standard 75 VMs (1); Enterprise Unlimited (2)
Features (included in both Standard and Enterprise)
• Centralized recovery plans
• Non-disruptive testing
• Automated DR failover
• Automated failback
• Planned migration
• Array-based replication support
• vSphere Replication support
New in SRM 5.5 (included in both Standard and Enterprise)
• Multiple point-in-time recovery with VR
• Storage vMotion / Storage DRS support
• Virtual SAN (public beta) support
(1) Maximum of 75 VMs per site and per SRM instance
(2) Subject to the product’s technical scalability limits
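The a-la-carte pricing on this slide reduces to a simple calculation. The following is a minimal sketch, assuming a single site and a single SRM instance so that the Standard edition's 75-VM cap applies directly; the function names are hypothetical, and the prices are the license-only list prices from the slide (support costs are not covered).

```python
# License-only list prices and limits from the editions table (USD per protected VM).
EDITIONS = {
    "Standard": {"price_per_vm": 195, "max_protected_vms": 75},
    "Enterprise": {"price_per_vm": 495, "max_protected_vms": None},  # None = no licensing cap
}


def cheapest_edition(protected_vms: int) -> str:
    """Return the lowest-priced edition whose VM limit covers the requested count."""
    if protected_vms < 0:
        raise ValueError("protected_vms must not be negative")
    candidates = [
        name
        for name, edition in EDITIONS.items()
        if edition["max_protected_vms"] is None
        or protected_vms <= edition["max_protected_vms"]
    ]
    return min(candidates, key=lambda name: EDITIONS[name]["price_per_vm"])


def license_cost(protected_vms: int) -> tuple[str, int]:
    """Return (edition, total license cost) for one site and one SRM instance."""
    edition = cheapest_edition(protected_vms)
    return edition, protected_vms * EDITIONS[edition]["price_per_vm"]


if __name__ == "__main__":
    for count in (50, 75, 76):
        print(count, license_cost(count))
```

Crossing the 75-VM threshold moves every protected VM to the Enterprise price, so the cost jumps sharply at VM 76. Technical scalability limits for Enterprise are not modeled here.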
24. SRM Included in vCloud Suite Enterprise
vCloud Suite editions, price per CPU (license only, updated Q3 2013): Standard $4,995; Advanced $7,495; Enterprise $11,495
(Side labels: Cloud Management, Cloud Infrastructure)
vSphere Enterprise Plus (Standard, Advanced, Enterprise)
• Virtualized infrastructure with policy-based automation
Networking and Security: vCloud Net & Sec (Standard, Advanced, Enterprise)
• Scalable networking and virtualization-aware security
Virtualized Datacenters: vCD, vCC (Standard, Advanced, Enterprise)
• Virtualized datacenters and public cloud extensibility
Disaster Recovery Automation: SRM Enterprise
• Automated disaster recovery planning, testing, and execution
Operations Management: vCOPS Standard, vCOPS Advanced, vCOPS Enterprise
• Application Monitoring – OS, middleware, databases
• OS-level change, configuration and regulatory compliance management
• Extensibility – Adapters for 3rd party OS and application monitoring tools
• Extensibility – Adapters for 3rd party Infrastructure monitoring tools
• vSphere hardening, change and configuration management
• Application Awareness – Discovery dependency mapping
• Chargeback – Cost metering and reporting
• Operations Dashboard – Health Monitoring and Performance Analytics
• Capacity Management – Planning and Optimization
Cloud Automation: vCAC Std, vCAC Adv, vCAC Ent
• Application and data services – Application provisioning, changes and data
• Governance – Approvals, reclamation, cost profile and transparency
• Extensibility – Infrastructure integrations, workflows and customizations
• Infrastructure provisioning and management
31. Backup Protected (HOT)
[Diagram: Site A and Site B, each with vCenter HB and a backup proxy]
Recovery Methodology:
HOT = active infrastructure being repurposed for DR recovery (not active/standby)
RTO = 48 - 72 hrs
RPO = <48 hrs
1. Standard data protection.
2. Daily maintenance scripting captures VM configuration.
3. VMs are backed up, and 2nd-copy processes move copies of the backups to the 2nd site.
32. Backup Protected (HOT): steps 4 and 5
[Diagram: “Bad News” callout marking the DR event]
4. A DR event is experienced.
5. DR failover is invoked.
33. Backup Protected (HOT): step 6
6. Non-essential VMs are shut down and deleted.
34. Backup Protected (HOT): steps 7 and 8
7. Core infrastructure is reconfigured.
8. Backup proxies are reconfigured.
35. Backup Protected (HOT): step 9
9. VMs are restored via the backup proxies.
36. Backup Protected (HOT): steps 10 to 12
10. The VM NIC configuration script is run to restore VM NICs to the restored VMs.
11. VMs are started.
12. Application verification.
41. VMware SRM Protected
[Diagram: Site A and Site B, each with vCenter HB and VMware SRM; primary storage at one site and 2nd-copy storage at the other]
Recovery Methodology: Site Recovery Manager with vSphere and/or storage replication
• Storage Rep RPO <30 min
• vSphere Rep RPO <24 hr
• RTO Band 1 < 8 hrs
• RTO Band 2 = 8 - 24 hrs
• RTO Band 3 = 24 - 48 hrs
• RTO Band 4 = 48 - 72 hrs
1. VMs are SRM-protected based on RTO band.
2. VMs are replicated based on the RPO decision flow.
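The RTO bands and RPO figures on this slide can be read as a classification rule. The slides do not show the actual RPO decision flow, so the sketch below is an assumption built on simple logic: a method that delivers an RPO under X satisfies any requirement of X or looser. The function names are hypothetical, and the handling of the overlapping band edges (24 and 48 hours) is also an assumption, with each shared edge placed in the tighter band.

```python
def rto_band(rto_hours: float) -> int:
    """Map a required RTO in hours to the slide's band number (1 to 4).

    ASSUMPTION: the shared edges 24 and 48 belong to the tighter band.
    """
    if rto_hours <= 0:
        raise ValueError("RTO must be positive")
    if rto_hours < 8:
        return 1
    if rto_hours <= 24:
        return 2
    if rto_hours <= 48:
        return 3
    if rto_hours <= 72:
        return 4
    raise ValueError("an RTO above 72 hours is outside the four bands")


def replication_method(required_rpo_minutes: float) -> str:
    """Pick a replication method from the slide's two RPO figures.

    ASSUMPTION: vSphere Replication delivers an RPO under 24 hours and
    storage replication delivers an RPO under 30 minutes, so each one
    satisfies any requirement at or above its figure.
    """
    if required_rpo_minutes <= 0:
        raise ValueError("RPO must be positive")
    if required_rpo_minutes >= 24 * 60:
        return "vSphere Replication"
    if required_rpo_minutes >= 30:
        return "storage replication"
    raise ValueError("a requirement under 30 minutes is not covered by either figure")


if __name__ == "__main__":
    print(rto_band(6), replication_method(60))
    print(rto_band(36), replication_method(24 * 60))
```

A real decision flow would likely weigh cost and array compatibility as well; this sketch only encodes the thresholds stated on the slide.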
42. VMware SRM Protected: steps 3 and 4
[Diagram: “Bad News” callout marking the DR event]
3. A DR event is experienced.
4. DR failover is invoked.
43. VMware SRM Protected: step 5
5. Non-essential VMs are shut down (not deleted).
44. VMware SRM Protected: step 6
6. Core infrastructure is reconfigured.
45. VMware SRM Protected: step 7
7. SRM recovery plans are executed by RTO band.
46. VMware SRM Protected: step 8
8. Application verification.
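Step 7 of the SRM walkthrough runs recovery plans by RTO band. One way to picture that ordering is to group plans into waves, tightest band first. This is only a sketch of the idea, not how SRM itself schedules plans: the `RecoveryPlan` type, the `recovery_waves` function, and the plan names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryPlan:
    """A hypothetical record pairing a recovery plan name with its RTO band."""
    name: str
    rto_band: int  # 1 (under 8 hrs) through 4 (48 to 72 hrs)


def recovery_waves(plans: list[RecoveryPlan]) -> list[tuple[int, list[str]]]:
    """Group plans by RTO band and return the groups in ascending band order."""
    grouped: dict[int, list[str]] = defaultdict(list)
    for plan in plans:
        grouped[plan.rto_band].append(plan.name)
    return [(band, sorted(grouped[band])) for band in sorted(grouped)]


if __name__ == "__main__":
    # Illustrative plan names only; none of these come from the presentation.
    plans = [
        RecoveryPlan("reporting", 4),
        RecoveryPlan("claims-db", 1),
        RecoveryPlan("intranet", 3),
        RecoveryPlan("claims-web", 1),
    ]
    for band, names in recovery_waves(plans):
        print(f"RTO band {band}: {', '.join(names)}")
```

Running the tightest band first means the applications with the shortest recovery targets get infrastructure capacity before lower-priority workloads are brought up.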
50. Other VMware Activities Related to This Session
HOL: HOL-SDC-1305, Business Continuity and Disaster Recovery In Action
Group Discussions: BCO1004-GD, vCenter Heartbeat with Harry Smith