Virtual Infrastructure Disaster Recovery
Veeam Backup & Replication
Disaster Recovery
Agenda
• Replication Topology
• Replication Infrastructure Overview
• Requirements
• Replica Storage Calculation
• WAN Accelerator Storage Calculation
• Consideration
• Storage Performance
• Replication Time
• Network
• Infrastructure
Disaster Recovery
Replication Topology
[Diagram: Site A and Site B each contain a physical proxy server, a WAN accelerator, and a VMFS datastore; Veeam ONE appears at both sites and is marked optional at one. Legend: Veeam Proxy, WAN Accelerator, Veeam ONE, Veeam BR, Virtual Appliance Proxy, Replication Datastore. Link types: WAN, Storage Connections (Direct SAN), External Network, Internal Network, Replication Internal Network.]
Disaster Recovery
Backup Infrastructure Overview
Role: Veeam Backup & Replication Server
Role Type: Management Server
OS Type: Windows Server x64 – 2008 R2 And Above
Machine Type: Virtual
Per site: 1 Server Per Site
CPU: 4 Cores
Memory: Minimum 4 GB, Plus 500 MB Per Concurrent Job
Disk: Depends On SQL DB And Metadata Size, Minimum 80 GB
Database: Local SQL Server
Network: 1 x vNIC – 1Gb
Role: Veeam ONE Server
Role Type: Monitoring & Reporting Server
OS Type: Windows Server x64 – 2008 R2 And Above
Machine Type: Virtual
Per site: 1 Server Per Site
CPU: 4 Cores
Memory: Depends On Virtual Environment Size, Minimum 8 GB
Disk: Depends On SQL DB Size, Minimum 80 GB
Database: Local SQL Server
Network: 1 x vNIC – 1Gb
Role: Veeam WAN Accelerator
Role Type: Cache Server
OS Type: Windows Server x64 – 2008 R2 And Above
Machine Type: Physical
Per site: Recommended Two Per Site
CPU: 8 Cores
Memory: Minimum 12 GB
Disk: Calculated On Next Slides – Local / Remote
Database: No DB
Internal Network: 2 x NIC – 10 Gbps
External Network: WAN Connectivity Of 1 Mbps Or Faster
Role: Veeam Proxy
Role Type: Transport Machine
OS Type: Windows x64 – Windows 7 And Above
Machine Type: Physical And Virtual
Per site: Recommended Two Machines Per Site
CPU: 1 Core Per Each Concurrent Task
Memory: 200MB Per Each Concurrent Task
Disk: Physical Connectivity To SAN Fabrics*
Database: No DB
Network: 2 x NIC – 10 Gbps
*A physical proxy uses the physical SAN connectivity; a virtual appliance proxy uses the hot-add feature instead.
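The per-job and per-task sizing rules above can be turned into a quick estimate. This is a minimal sketch, assuming the backup server memory line means a 4 GB base plus 500 MB for each concurrent job; the function names are hypothetical and not part of any Veeam tool.

```python
def backup_server_memory_mb(concurrent_jobs: int) -> int:
    """Backup server RAM: 4 GB base plus 500 MB per concurrent job (assumed reading)."""
    base_mb = 4 * 1024
    return base_mb + 500 * concurrent_jobs


def proxy_resources(concurrent_tasks: int) -> tuple[int, int]:
    """Proxy sizing from the slide: 1 core and 200 MB RAM per concurrent task."""
    cores = concurrent_tasks
    memory_mb = 200 * concurrent_tasks
    return cores, memory_mb


if __name__ == "__main__":
    print("Backup server RAM for 4 jobs (MB):", backup_server_memory_mb(4))
    print("Proxy cores and RAM (MB) for 8 tasks:", proxy_resources(8))
```

These figures cover only the Veeam roles themselves; operating system overhead is not included.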
Disaster Recovery
Requirements – Replica Storage Calculation
Replica With Thick Disk
Thick VMDK (100 GB) + Memory Size (8 GB) + Change Rate (Incremental) * Retention (1 GB * 4) = Thick Replica Size (112 GB)
Plus 50% Working Space
Replica With Thin Disk
Average Utilization (80 GB Of A 100 GB Thick VMDK) + Memory Size (8 GB) + Change Rate (Incremental) * Retention (1 GB * 4) = Thin Replica Size (92 GB)
Plus 50% Working Space
• The change rate is measured over a period of time and can be retrieved from Veeam ONE Reporter.
• Working space must be guaranteed, because future replication runs and snapshot creation need it.
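The replica storage arithmetic on this slide can be reproduced with a short calculation. This is a sketch under one assumption: the 50% working space is treated as an extra 50% on top of the replica size, which the slide does not spell out. The function names are illustrative only.

```python
def replica_size_gb(disk_gb: float, memory_gb: float,
                    change_gb: float, retention_points: int) -> float:
    """Disk (or used space for thin disks) + VM memory + change rate * retention."""
    return disk_gb + memory_gb + change_gb * retention_points


def with_working_space(size_gb: float, working_fraction: float = 0.5) -> float:
    """Add working space as a fraction of the replica size (assumed interpretation)."""
    return size_gb * (1 + working_fraction)


if __name__ == "__main__":
    thick = replica_size_gb(disk_gb=100, memory_gb=8, change_gb=1, retention_points=4)
    thin = replica_size_gb(disk_gb=80, memory_gb=8, change_gb=1, retention_points=4)
    print("Thick replica (GB):", thick, "with working space:", with_working_space(thick))
    print("Thin replica (GB):", thin, "with working space:", with_working_space(thin))
```

For a thin replica, pass the average used space rather than the provisioned disk size.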
Disaster Recovery
Requirements – WAN Accelerator Storage Calculation
Source WAN Accelerator
20 GB Free Space Per 1 TB Of VMDK + 10% Guard = 22 GB Total
100 GB Minimum Recommendation
Target WAN Accelerator
10 GB Free Space Per OS Type + 20 GB Free Space Per 1 TB Of VMDK + 10% Guard = 33 GB Total
100 GB Minimum Recommendation
• The totals above are for 1 TB of VMDK and one OS type.
• Global cache is stored only on the target WAN accelerator, so no space needs to be provided for it on the source WAN accelerator.
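The WAN accelerator figures can be checked with a small calculation. This is a sketch, assuming the 10% guard is applied to the summed free-space requirement and that the 100 GB minimum recommendation acts as a floor; neither the function names nor the floor behavior come from Veeam documentation.

```python
MINIMUM_RECOMMENDED_GB = 100  # slide's minimum recommendation, treated here as a floor


def source_accelerator_gb(vmdk_tb: float, guard: float = 0.10) -> float:
    """20 GB per 1 TB of VMDK, plus a guard margin."""
    return round(20 * vmdk_tb * (1 + guard), 2)


def target_accelerator_gb(vmdk_tb: float, os_types: int, guard: float = 0.10) -> float:
    """10 GB per OS type (global cache) + 20 GB per 1 TB of VMDK, plus a guard margin."""
    return round((10 * os_types + 20 * vmdk_tb) * (1 + guard), 2)


def recommended_gb(calculated_gb: float) -> float:
    """Never go below the minimum recommendation (assumption)."""
    return max(calculated_gb, MINIMUM_RECOMMENDED_GB)


if __name__ == "__main__":
    src = source_accelerator_gb(vmdk_tb=1)
    tgt = target_accelerator_gb(vmdk_tb=1, os_types=1)
    print("Source (GB):", src, "recommended:", recommended_gb(src))
    print("Target (GB):", tgt, "recommended:", recommended_gb(tgt))
```

With 1 TB of VMDK and one OS type, the calculated values match the 22 GB and 33 GB totals on the slide, and both fall under the 100 GB minimum.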
Disaster Recovery
Consideration – Storage Performance
Four Active Snapshots Per Datastore
• Limits the impact on production storage performance and on virtual machine performance. The limit can be increased according to the load on the storage device.
Dedicated Storage
• Use dedicated storage space on a storage device, or dedicated LUNs, to limit the impact on recovery site storage performance and on virtual machine performance.
• The source is usually the bottleneck during data processing, so faster disks help reduce the impact on production virtual machines.
Direct SAN Access
• Offloads the first replication session by accessing the storage device directly through the HBA and SAN fabric.
• Data changes are transferred through another available transport mode (NBD or Hot-Add).
• Can be used to restore thick disks only; other disks must be converted first.
Virtual Appliance Proxy
• All LUNs must be accessible through the ESXi storage connections.
• Adding more SCSI controllers allows more concurrent replication tasks.
Disaster Recovery
Consideration – Replication Time
Average Rate Of Change Data Size: 20 GB
WAN Bandwidth: 10 Mbps
Bandwidth Usage: 90%
Replication Time: About 300 Minutes
• Replication time in minutes = ((average change size in MB) / (90% of WAN bandwidth in MB/s)) / 60. For the values above: (20 GB / 1.125 MB/s) / 60 is about 300 minutes.
• Replication time increases with data processing time and depends on storage performance.
WAN Replication Time
Data Change Processing Time
Needs To Be Measured In A Real Environment
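The replication time estimate can be worked through in code. This is a minimal sketch, assuming 1 GB = 1024 MB and 8 bits per byte; the function name and parameters are illustrative. It covers WAN transfer time only, not data processing time.

```python
def wan_replication_minutes(change_gb: float, wan_mbps: float,
                            usage: float = 0.9) -> float:
    """Estimate WAN transfer time in minutes for a given amount of changed data."""
    change_mb = change_gb * 1024                 # changed data in megabytes
    usable_mb_per_s = wan_mbps * usage / 8       # megabits/s -> megabytes/s at the usable share
    seconds = change_mb / usable_mb_per_s
    return seconds / 60


if __name__ == "__main__":
    minutes = wan_replication_minutes(change_gb=20, wan_mbps=10)
    print(f"Estimated WAN replication time: {minutes:.0f} minutes")
```

For 20 GB over a 10 Mbps link at 90% usage this gives roughly 303 minutes, which the slide rounds to 300.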
Disaster Recovery
Consideration – Network
Internal Network Bandwidth
• Network transport mode (NBD) performs well on 10 Gbps and faster links.
• 2 x 10 Gbps links should be available for each Veeam proxy.
Isolated Internal Network
• Use a dedicated network switch for backup and replication to isolate replication traffic.
• Use a dedicated virtual switch on the ESXi servers to isolate replication traffic.
Network Accessibility
• All replication infrastructure components should be reachable on the internal and external networks, from both the production site and the recovery site.
• All required ports must be open on the firewall and allowed in the ACLs; otherwise backup and replication jobs will fail.
External Network Bandwidth
• Continuously processing and replicating data between the production site and the recovery site needs a fast WAN link with low packet loss. A minimum of 1 Mbps of bandwidth is needed.
Disaster Recovery
Consideration – Infrastructure
Veeam Proxy
• Veeam proxies can be shared between Veeam backup servers; each backup server needs one source proxy on the source side and one target proxy on the target side.
Veeam WAN Accelerator
• Veeam WAN accelerators can be shared between Veeam backup servers; each backup server needs at least one source WAN accelerator and one target WAN accelerator.
Domain And Credential
• All Veeam replication components should either be joined to the same domain or all be left outside a domain. If the vCenters are joined to different domains, local credentials are preferred for the connection.
NBD Limitation
• Because the vSphere API limits each ESXi server to 7 concurrent network connections, Veeam proxies cannot process more than 7 jobs by default.
Disaster Recovery
Replication / Failover / Failback Process
Continuous Replication
• Communicate between the data movers on both sides
• Copy VM data on the source side by Veeam proxy
• Filter overlapping, zero data and swap data blocks
• Check metadata to detect changed blocks
• Copy data blocks, compress them and move them to the target side
• Decompress replica data and write it to the target datastore
Failover Process
• Roll back the VM replica to the required restore point
• Power on replica VM – change status to Normal
• Do network re-mapping and run the Re-IP rule
• Start writing changes to the delta snapshot
Failback Process
• Power off the original VM
• Create a failback snapshot of the original VM
• Check the difference between the replica and original VMs
• Transport the changed data to the original VM
• Power off the replica VM
• Check the difference again and send the changes to the original VM again
Disaster Recovery
Next Step
Before Production Implementation
Simulate replication under real conditions with some production servers
Measure storage space
Measure network bandwidth
Check limitations
Detect problems under real conditions