VMworld 2013
Kannan Mani, VMware
Brad Pinkston, VMware
6. 6
SRM Provides Broad Choice of Replication Options
vSphere Replication
Simple, cost-efficient replication for Tier 2 applications and smaller sites
Storage-based Replication
High-performance replication for business-critical applications in larger sites
[Diagram: Site A (Primary) and Site B (Recovery), each running vCenter Server, Site Recovery Manager, and vSphere, connected by vSphere Replication and by storage-based replication]
7. 7
vSphere Replication Complements Storage-Based Replication
Replication Provider: vSphere Replication (VMware)
Cost
• Low-end storage supported
• No additional replication software
Management
• VM granularity
• Managed directly in vCenter
Performance
• 15 min RPOs
• Scales to 500 VMs
• File-level consistency
• No automated failback, FT, linked clones, physical RDM
Replication Provider: Storage-based Replication
Cost
• Higher-end replicating storage
• Additional replication software
Management
• LUN – VM layout
• Storage team coordination
Performance
• Synchronous replication
• High data volumes
• Application consistency possible
8. 8
Oracle Data Guard
http://www.oracle.com/technetwork/database/features/availability/twp-dataguard-11gr2-1-131981.pdf
Oracle Data Guard provides the management, monitoring, and automation
software infrastructure to create and maintain one or more standby databases
to protect Oracle data from failures, disasters, errors, and data corruptions.
Data Guard is unique among Oracle replication solutions in supporting both
synchronous (zero data loss) and asynchronous (near-zero data loss)
configurations.
Administrators can choose either manual or automatic failover of production to a
standby system if the primary system fails, in order to maintain high availability
for mission-critical applications.
10. 10
Oracle Database (SAP) – Oracle Data Guard and SRM
[Diagram: Site A (Primary) and Site B (Recovery), each running vCenter Server, Site Recovery Manager, and vSphere. The primary SAP DB at Site A ships logs through Oracle Data Guard to the standby SAP DB at Site B. The SAP CS and SAP PAS virtual machines are protected with vSphere Replication.]
11. 11
Steps Tested
1 Oracle primary DB at Site A and standby DB at Site B with Data Guard
2 SAP application connected to the primary DB at Site A
3 Site A down - SAP application and Central Services VMs replicated to Site B using vSphere Replication
4 Fail over Oracle primary to standby using an SRM callout script run from the SAP application VM at Site B
5 Connect/resume the SAP application to the Oracle database at Site B
13. 13
SRM Callout Script – odgfail.sh (Example)
~ # cat odgfail.sh
#! /bin/sh
#######################################################################################
# file name : odgfail.sh
# location : /scripts
# called from : Application VM on Site B
#######################################################################################
echo "Job `basename $0`: started at `date`"
#
# Set up standard ORACLE environment variables
ORACLE_SID=stdby; export ORACLE_SID
ORACLE_BASE=/oracle; export ORACLE_BASE
ORACLE_HOME=/oracle/PRD/102_64; export ORACLE_HOME
PATH=/oracle/PRD/102_64/bin:.:/oracle/PRD:/usr/sap/PRD/SYS/exe/run:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin;export PATH
LD_LIBRARY_PATH=/usr/sap/PRD/SYS/exe/run:/oracle/client/10x_64/instantclient; export LD_LIBRARY_PATH
#
# Failover to Standby
$ORACLE_HOME/bin/sqlplus /nolog <<EOFarch1
connect / as sysdba
-- The primary database must already be shut down (for RAC, shut down all instances)
--Initiate failover to Standby Database:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE;
--Convert the physical standby database to the production role:
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
--Comment/Uncomment either of the 2 sets of commands below
--If the database was never opened read-only since the last time it was started,
--open new production database via:
ALTER DATABASE OPEN;
--If the physical standby database has been opened in read-only mode since the last time it was started,
--shutdown standby database and restart it
--SHUTDOWN IMMEDIATE
--STARTUP pfile=initSTDBY.ora
exit
EOFarch1
echo "Job `basename $0`: ended at `date`"
########################## end of script
~ #
17. 17
Solution Testing Findings
• Integration of RecoverPoint with vCenter Site Recovery Manager
enables DR testing to be carried out in isolated environments on
the recovery site so that production can remain active and
replication can continue uninterrupted. SRM also documents the
recovery process.
• RecoverPoint enables replication of entire virtualized Oracle
environments between data centers for disaster recovery.
• The RecoverPoint splitter supports replication across
heterogeneous storage platforms.
http://www.emc.com/collateral/hardware/white-papers/h8207-dr-oracle-vmaxe-recoverpoint-srm-wp.pdf
18. 18
EMC RA – Storage Replication Solution Overview
22. 22
Oracle DB on VMware Technical Best Practices
• Server selection
• Storage selection
• vSphere version
• vSphere operations
• Performance monitoring
• Guest operating system configuration
• Virtual storage presentation
• Workload and datastore fan-in ratios
• vCPU allocation
• Memory
• Network
• Security
• Cloning
• Disaster recovery
23. 23
General Best Practices
• Create a computing environment optimized for vSphere
• Enable required settings for ESX host BIOS – for example VT,
Turbo Mode, hyper-threading
• Disable unnecessary foreground and background processes on
guest operating system
• Create golden images of optimized operating systems using
vSphere cloning technologies
• Upgrade to vSphere ESX 5 for a 10–20% performance boost
• Allow vSphere to choose the best virtual machine monitor for the CPU
and guest operating system combination by setting the virtual machine's
CPU/MMU Virtualization option to Automatic
• Follow Oracle's recommended installation guidelines for the respective
operating system, the same as on physical hardware
• To minimize time drift in virtual machines, follow the guidelines in these KB articles:
Timekeeping best practices for Linux guests: http://kb.vmware.com/kb/1006427
Timekeeping best practices for Windows, including NTP: http://kb.vmware.com/kb/1318
24. 24
Virtual CPUs
Best Practices for vCPUs
• Do not over-allocate vCPUs – try to match the exact workload
• If the exact workload is unknown, start with fewer vCPUs initially and
increase later if necessary
• For larger production workloads, the total number of vCPUs assigned to all
virtual machines should be less than or equal to the total number of cores
on the ESX host
• Enable hyper-threading for Intel Core i7 processors
• For 5500 series processors, enabling hyper-threading is recommended
• If unsure of the workload, use hardware vendor recommended Oracle sizing
guidelines
• Avoid remote NUMA access by sizing the number of vCPUs to be no
greater than the number of cores on a NUMA node (processor socket)
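The two sizing rules above (total vCPUs no greater than host cores, and per-VM vCPUs no greater than the cores on one NUMA node) can be checked with simple arithmetic. This is a minimal sketch in POSIX shell. The function names are hypothetical, all inputs are typed in by hand, and nothing here queries vSphere.

```shell
#!/bin/sh
# Hypothetical helpers: every input is supplied by the caller.

# total_vcpus N1 [N2 ...]
# Prints the sum of the vCPU counts given as arguments.
total_vcpus() {
    sum=0
    for n in "$@"; do
        sum=$((sum + n))
    done
    echo "$sum"
}

# check_host HOST_CORES VCPU1 [VCPU2 ...]
# Prints "ok" when the total vCPUs of all VMs do not exceed the
# number of cores on the host, otherwise "overcommitted".
check_host() {
    cores=$1
    shift
    if [ "$(total_vcpus "$@")" -le "$cores" ]; then
        echo "ok"
    else
        echo "overcommitted"
    fi
}

# check_numa CORES_PER_NODE VM_VCPUS
# Prints "fits" when one VM's vCPUs fit within a single NUMA node,
# otherwise "spans" (the VM may incur remote NUMA access).
check_numa() {
    if [ "$2" -le "$1" ]; then
        echo "fits"
    else
        echo "spans"
    fi
}
```

For example, `check_host 16 4 4 8` reports on a 16-core host running VMs with 4, 4, and 8 vCPUs, and `check_numa 8 12` reports on a 12-vCPU VM placed on 8-core NUMA nodes.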
25. 25
Virtual Memory Best Practices
• Do not overcommit memory until vCenter reports that steady state
usage is below the amount of physical memory on the server
• Do not disable the balloon driver (installed with VMware Tools)
• Set the memory reservation to SGA size plus OS. (Reservation and
configured memory might be the same.)
• Enable hardware-assisted virtualization in the ESX host BIOS and on the VM
• Set CPU/MMU virtualization option to Automatic
• vSphere will choose the best Virtual Machine Monitor option based on the CPU and guest OS
• Use Large Memory Pages
• Consult Oracle Administration Guide for sizing of SGA
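The reservation rule above (SGA size plus OS) is simple arithmetic. This is a minimal sketch, assuming all sizes are in MB. The function names are hypothetical, and any figures passed in are illustrative inputs rather than sizing recommendations.

```shell
#!/bin/sh
# reservation_mb SGA_MB OS_MB
# Prints the memory reservation in MB: SGA size plus OS memory.
reservation_mb() {
    echo $(( $1 + $2 ))
}

# reservation_ok CONFIGURED_MB SGA_MB OS_MB
# Succeeds (exit status 0) when the VM's configured memory is at
# least as large as the computed reservation.
reservation_ok() {
    [ "$1" -ge "$(reservation_mb "$2" "$3")" ]
}
```

With an assumed 16384 MB SGA and 2048 MB for the OS, `reservation_mb 16384 2048` gives the value to enter as the reservation, and `reservation_ok` confirms that the configured memory covers it.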
26. 26
Network Best Practices
• Separate infrastructure traffic from virtual machine traffic for
security and isolation
• Use NIC teaming for availability and load balancing
• Take advantage of Network I/O Control (NIOC) to converge network and
storage traffic onto 10GbE
• For “chatty” virtual machines on the same host, connect them to the same vSwitch
so that their traffic does not cross the physical NIC
• Use VMXNET3 Paravirtualized network adapter drivers to increase
performance
• Reduces overhead versus vlance or E1000 emulation
• Must have VMware Tools to enable VMXNET3
• Use jumbo frames
• To configure, see iSCSI and Jumbo Frames configuration on ESX 3.x and ESX 4.x
http://kb.vmware.com/kb/1007654
• Separate RAC interconnect network to isolate it from other traffic
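One common way to confirm that jumbo frames work end to end is to send a non-fragmentable ping whose payload exactly fills the MTU. This minimal sketch computes that payload size, assuming IPv4 (20-byte IP header plus 8-byte ICMP header). The function name is hypothetical, and the 9000-byte MTU is an assumed value.

```shell
#!/bin/sh
# max_ping_payload MTU
# Prints the largest ICMP echo payload (in bytes) that fits in one
# unfragmented IPv4 packet: MTU - 20 (IP header) - 8 (ICMP header).
max_ping_payload() {
    echo $(( $1 - 20 - 8 ))
}

# Example with an assumed MTU of 9000: the payload is 8972 bytes.
# On an ESX host the path could then be tested with:
#   vmkping -d -s 8972 <target-ip>
# and from a Linux guest with:
#   ping -M do -s 8972 <target-ip>
```

If the ping fails at this size but succeeds at 1472 bytes (the payload for a standard 1500-byte MTU), some hop on the path is probably not configured for jumbo frames.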
27. 27
Storage Virtualization Concepts
• Storage array – consists of physical disks that are presented as
logical disks (storage array volumes or LUNs) to the ESX host
• Storage array LUNs – formatted as VMware vSphere® VMFS volumes
• Virtual disks – presented to
the guest operating system,
and can be partitioned and
used in guest file systems
28. 28
Storage Best Practices
• Use vSphere VMFS for single instance Oracle database
deployments
• For IP-based storage (iSCSI and NFS), enable jumbo frames
• Create dedicated data stores to service database workloads
• Align VMFS properly – Use vCenter to create VMFS partitions, because it
automatically aligns the partitions
• Use Oracle Automatic Storage Management (ASM)
• Follow your storage vendor’s best practices documentation when laying out
the Oracle database
• Use Paravirtualized SCSI adapters for Oracle datafiles with demanding
workloads
http://www.vmware.com/files/pdf/partners/oracle/Oracle_Databases_on_VMware_-_Best_Practices_Guide.pdf
30. 30
Benefits of Oracle Databases on VMware
Performance
• I/O is not an issue
• Scale up and out
• Newer hardware can increase performance
Rapid Provisioning
• Streamline activation, deployment, and validation of servers
• Avoid manual configuration errors
Server Consolidation
• Fully utilize hardware
• Maintain application isolation
Workload Management
• Scale dynamically and right-size infrastructure
• Zero downtime maintenance
• Migrate live databases
Business Continuity
High Availability
• VMware vSphere® vMotion®, VMware vSphere High Availability (HA),
VMware vSphere® Fault Tolerance (FT), VMware vSphere Distributed
Resource Scheduler (DRS)
• Without clustering or RAC
VMware vCenter Site Recovery Manager™
• Hardware reduction at failover site
• Comprehensive testing of DR solution
31. 31
Where Can I Learn More?
vCenter Site Recovery Manager
• Product Page – www.vmware.com/products/srm
• Overview, datasheet, webinars, docs, community links
Oracle Data Guard
• Overview – http://www.oracle.com/technetwork/database/features/availability/dataguardoverview-083155.html
Virtualizing Oracle with VMware
• External Solution Page – http://www.vmware.com/solutions/business-critical-apps/oracle-virtualization/oracle-database.html
Blog
• http://blogs.vmware.com/apps/oracle/
33. 33
Disaster Recovery Solution with Oracle
Data Guard and Site Recovery Manager
34. 34
Other VMware Activities Related to This Session
HOL:
HOL-SDC-1305
Business Continuity and Disaster Recovery In Action
Group Discussions:
BCO1003-GD
Disaster Recovery and Replication with Ken Werneburg
39. 39
Failover
A failover is performed when the production database fails and one of the standby databases is transitioned to take over
the production role, allowing business operations to continue. Once the failover is complete and applications have
resumed, the administrative staff can turn its attention to resolving the problems with the failed system. Failover may or
may not result in data loss depending on the Data Guard protection mode in effect at the time of the failover. There are
two distinct types of failover: manual failover and fast-start failover.
Steps after the primary database crashes (all performed at the standby site):
1 Initiate failover to the standby database:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE;
In rare circumstances, DBAs may not want to wait for the standby to finish applying redo from the current standby redo
log file before failing over. In that case they can issue ALTER DATABASE ACTIVATE STANDBY DATABASE to fail over
immediately. Any unapplied redo in the standby redo log is then lost.
2 Convert the physical standby database to the production role:
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
3 If the database has not been opened read-only since it was last started, open the new production database:
ALTER DATABASE OPEN;
If the physical standby database has been opened read-only since it was last started, shut it down and restart it:
SHUTDOWN IMMEDIATE
STARTUP
40. 40
Switchover
Switchover is a planned role reversal between the production database and one of its standby databases to avoid
downtime during scheduled maintenance on the production system or to test readiness for future role transitions. A
switchover guarantees no data loss.
Steps:
1 Primary site: Get the status of the primary database:
SELECT NAME, DB_UNIQUE_NAME, LOG_MODE,
OPEN_MODE, PROTECTION_MODE,
PROTECTION_LEVEL, DATABASE_ROLE,
SWITCHOVER_STATUS FROM V$DATABASE;
Ensure that both log_archive_dest_state_1 (local archiving) and
log_archive_dest_state_2 (archiving to standby) are enabled.
Standby site: Get the status of the standby database:
SELECT NAME, DB_UNIQUE_NAME, LOG_MODE,
OPEN_MODE, PROTECTION_MODE,
PROTECTION_LEVEL, DATABASE_ROLE,
SWITCHOVER_STATUS FROM V$DATABASE;
Ensure that log_archive_dest_state_1 (local archiving) is
enabled and log_archive_dest_state_2 (archiving to
primary) is disabled. Ensure there are no gaps in redo on the
standby database.
2 Primary site: Verify that a switchover operation is possible:
SELECT SWITCHOVER_STATUS FROM V$DATABASE;
If the output is 'SESSIONS ACTIVE', either disconnect all sessions
manually or append the WITH SESSION SHUTDOWN clause when
performing step 3.
3 Primary site: Convert the current primary database to the new physical
standby:
ALTER DATABASE COMMIT TO SWITCHOVER TO
PHYSICAL STANDBY WITH SESSION SHUTDOWN;
41. 41
Switchover (cont’d)
4 Primary site: Shut down the former primary and mount it as a standby
database:
SHUTDOWN IMMEDIATE
STARTUP NOMOUNT PFILE=initPRD.ora
ALTER DATABASE MOUNT STANDBY DATABASE;
Defer the remote archive destination on the old primary:
ALTER SYSTEM SET log_archive_dest_state_2=DEFER;
Standby site: Verify that the old physical standby can be converted to
the new primary:
SELECT SWITCHOVER_STATUS FROM V$DATABASE;
5 Standby site: Convert the old physical standby to the new primary:
ALTER DATABASE COMMIT TO SWITCHOVER TO
PRIMARY WITH SESSION SHUTDOWN;
If the physical standby database has not been opened in
read-only mode since the last time it was started:
ALTER DATABASE OPEN;
Otherwise, shut down and restart the new primary database:
SHUTDOWN IMMEDIATE
STARTUP PFILE=initSTDBY.ora
6 Primary site (new standby): Start managed recovery on the new standby database:
ALTER DATABASE RECOVER MANAGED STANDBY
DATABASE DISCONNECT FROM SESSION;
Standby site (new primary): Enable remote archiving on the new primary to the new
standby:
ALTER SYSTEM SET
log_archive_dest_state_2=ENABLE;
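The decision in steps 2 and 3 of the switchover (which statement to run on the primary, given its SWITCHOVER_STATUS) can be sketched as a small shell helper. This is a sketch under stated assumptions, not part of the tested solution. The function name is hypothetical, and the status value is passed in by hand rather than queried from V$DATABASE. It prints the statement only and does not connect to a database. It uses the clause spelling WITH SESSION SHUTDOWN, as in step 2.

```shell
#!/bin/sh
# primary_switchover_sql STATUS
# Prints the statement to run on the primary for the given
# SWITCHOVER_STATUS value. Returns non-zero (and prints a message
# to stderr) for any status this sketch does not handle.
primary_switchover_sql() {
    case "$1" in
        "TO STANDBY")
            # No active sessions: the plain statement is enough.
            echo "ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY;"
            ;;
        "SESSIONS ACTIVE")
            # Active sessions remain: let the statement close them.
            echo "ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;"
            ;;
        *)
            echo "switchover not attempted, status: $1" >&2
            return 1
            ;;
    esac
}
```

A callout script in the style of odgfail.sh could pass the printed statement to sqlplus after reading the status. Any status other than the two handled here is treated as a reason to stop and investigate.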