1. How can Hypervisors leverage
Advanced Storage features?
Author: Shriram Pore, Solutions Architect, Calsoft Inc.
Presenter: Dr. Anupam Bhide, CEO, Founder Calsoft Inc.
2012 Storage Developer Conference. © Calsoft Inc. All Rights Reserved.
2. Introduction
Useful for storage vendors who are considering implementing hypervisor storage APIs like VAAI:
- Understand how hypervisors interact with storage today
- Limitations in that interaction today
- The need for a standard that is both hypervisor and storage agnostic
- Some areas that today's hypervisor-specific standards do not cover
3. Virtual Environment Challenges
Challenges
- Virtual environments (hypervisors) with NAS/SAN arrays have high storage bandwidth usage (IP/FC/etc.) compared to DAS
- SAN/NAS arrays have mature technologies such as snapshot, clone, server copy, range locking, etc.
- Hypervisors can leverage these mature technologies and improve storage and network utilization
- However, hypervisors have themselves developed sophisticated storage virtualization layers
Solution
The goal here is to offload file/bulk-block operations to NAS/SAN arrays/servers, reducing storage bandwidth and increasing I/O performance, storage utilization, etc.
4. Hypervisors' use of storage
Classification: how hypervisors use storage
- Hypervisor using local disks (e.g. VMware VMFS on a local disk)
- Hypervisor creating a proprietary file system on SAN (e.g. VMware VMFS on a SAN array)
- Hypervisor using LUNs of a SAN array in RDM mode
- Hypervisor using a NAS storage box over NFS/CIFS
Better integration can have big benefits in many cases, e.g. vendors have reported 99% savings in network bandwidth for cloning operations and 10x to 20x gains in efficiency.
5. Need for a standard to optimize hypervisor interaction with storage
Why the need?
- Storage is the biggest bottleneck for hypervisor performance
- Many common hypervisor operations can be optimized by delegating operations to storage boxes
- Storage vendors cannot afford to conform to multiple hypervisor standards for this delegation
- The standard needs to be both hypervisor independent and storage independent
- Identify inefficient file/block operations on VMs in virtual environments
- Define a set of SAN/NAS primitives (standard APIs) based on that identification, including capability exchange
- Levels of compliance depend on interface implementations for storage and hypervisor
(Diagram: hypervisors ESX(i), Hyper-V and Xen sit above an interface layer exposing a SNIA interface for SAN and a SNIA interface for NAS, which in turn sits above SAN and NAS storage, with compliance at each interface.)
6. SAN based Architecture
Layers, top to bottom:
- Application (VM) and Application (Web Services)
- Virtual Disk Library: a set of files (disk metadata)
- user/kernel boundary
- Hypervisor proprietary FS, e.g. VMFS: files and file segments
- Block Layer (block devices, logical volumes): logical blocks
- SCSI Device Access: physical device blocks
- SAN array: vendor-specific SCSI extensions
7. Representative SAN Primitives and Use Cases
Block Zeroing
  Goal: avoid multiple WRITEs; use SCSI WRITE SAME.
  Use case: the Block Zero feature speeds up deployment of thick-provisioned eager-zeroed virtual disks.
Full Copy
  Goal: avoid the multiple READs and WRITEs, and the network bandwidth, needed to copy a complete VMDK; use SCSI EXTENDED COPY.
  Use case: the Full Copy feature speeds up storage vMotion and cloning of VMs.
Hardware Assisted Locking
  Goal: block/extent-level locking; use SCSI ATOMIC TEST & SET.
  Use case: useful for operations that require VM locking (power on/off, etc.) and cluster-wide operations like vMotion and storage vMotion. A power-on storm (100s of VMs on the same LUN being powered up) has huge latency, which hardware-assisted locking resolves.
Thin Provisioning
  Goal: errors to indicate soft and hard out-of-space conditions; pause the VM on hard errors; use UNMAP for space reclamation.
  Use case: Dead Space Reclamation enables reclamation of blocks from a thin-provisioned LUN on SAN arrays.
Offloaded Data Transfer to the arrays
(illustrations from VMware SAN VAAI and Hyper-V ODX)
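To make "use SCSI WRITE SAME" concrete at the wire level, the sketch below builds a WRITE SAME(16) CDB (opcode 0x93) with the UNMAP bit set, the form used for block zeroing or deallocation on a thin-provisioned LUN. The helper name is ours; a real initiator would hand this CDB to a SCSI passthrough layer.

```python
import struct

def write_same16_cdb(lba: int, num_blocks: int, unmap: bool = False) -> bytes:
    """Build a 16-byte SCSI WRITE SAME(16) CDB (opcode 0x93).

    With the UNMAP bit set, a thin-provisioning-aware target may
    deallocate the range instead of physically writing zeros.
    """
    flags = 0x08 if unmap else 0x00  # bit 3 of byte 1 is the UNMAP bit
    return struct.pack(">BBQIBB",
                       0x93,        # operation code: WRITE SAME(16)
                       flags,       # protection/ANCHOR/UNMAP flag byte
                       lba,         # 64-bit starting logical block address
                       num_blocks,  # 32-bit number of logical blocks
                       0x00,        # group number
                       0x00)        # control byte

cdb = write_same16_cdb(lba=0x1000, num_blocks=2048, unmap=True)
assert len(cdb) == 16 and cdb[0] == 0x93
```

The same pattern applies to the other primitives: EXTENDED COPY is opcode 0x83, UNMAP is 0x42, and COMPARE AND WRITE (the atomic test-and-set used for hardware-assisted locking) is 0x89.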
8. Types of VMware virtual disks
- Thick provisioned: all backing storage immediately allocated, but not zeroed immediately
- Thin provisioned: backing storage not fully allocated
- Thick provisioned with eager zeroing: like thick provisioned, but also zeroed immediately
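Assuming an ordinary POSIX file system stands in for the datastore, a minimal sketch of the difference between thin and eager-zeroed provisioning:

```python
import os
import tempfile

def create_thin(path: str, size: int) -> None:
    # Thin provisioned: set the logical size without writing data;
    # most file systems allocate no backing blocks for the hole.
    with open(path, "wb") as f:
        f.truncate(size)

def create_eager_zeroed(path: str, size: int, chunk: int = 1 << 20) -> None:
    # Eager zeroed: physically write zeros so every block is both
    # allocated and zeroed before the VM touches the disk.
    zeros = b"\0" * chunk
    with open(path, "wb") as f:
        remaining = size
        while remaining:
            n = min(remaining, chunk)
            f.write(zeros[:n])
            remaining -= n

# Plain thick (allocated up front but not zeroed) needs fallocate-style
# support, e.g. os.posix_fallocate on Linux; omitted here for portability.
d = tempfile.mkdtemp()
thin_path = os.path.join(d, "thin.vmdk")
eager_path = os.path.join(d, "eager.vmdk")
create_thin(thin_path, 1 << 20)
create_eager_zeroed(eager_path, 1 << 20)
```

Both files report the same logical size; only the eager-zeroed one is guaranteed to have every block written, which is exactly the work the Block Zero primitive offloads to the array.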
9. Efficiencies gained
Block zeroing
- The storage vendor can lazily zero the extent in the background
- Can implement proprietary mechanisms to mark the extent as zeroed out
- In thin-provisioned LUNs, the storage vendor can unmap the extent
Full copy
- Doing the copy within the storage box is more efficient
- Furthermore, writeable snapshot (i.e. clone) technology can be used to do the full copy without actually copying blocks
- Dedupe performance can be improved by recognizing that two extents are just copies of each other
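A toy sketch of the writeable-snapshot idea above: a "full copy" that copies only block references, deferring any data movement until a clone block is overwritten. The class and names are illustrative, not any vendor's implementation.

```python
class CowStore:
    """Toy block store where a full copy is a writable snapshot:
    the clone shares block references until a block is rewritten."""

    def __init__(self):
        self.blocks = {}    # block-id -> data
        self.volumes = {}   # volume name -> list of block-ids
        self._next = 0

    def write(self, vol: str, idx: int, data: bytes) -> None:
        # Every write allocates a fresh block, so sharing volumes
        # never see each other's updates (copy-on-write).
        bid = self._next
        self._next += 1
        self.blocks[bid] = data
        blocks = self.volumes.setdefault(vol, [])
        while len(blocks) <= idx:
            blocks.append(None)
        blocks[idx] = bid

    def clone(self, src: str, dst: str) -> None:
        # O(metadata) "full copy": share block references, move no data.
        self.volumes[dst] = list(self.volumes[src])

store = CowStore()
store.write("vmdk-A", 0, b"boot")
store.write("vmdk-A", 1, b"data")
store.clone("vmdk-A", "vmdk-B")            # instant clone, zero data copied
store.write("vmdk-B", 1, b"new")           # COW: only this block diverges
```

Because clone blocks are literal references to the source blocks, the array also knows the two extents are copies of each other for free, which is the dedupe observation above.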
10. Efficiencies gained
Atomic Test and Set
- Many VM operations, such as powering VMs on or off, require acquiring or releasing a VM-specific lock.
- Today this requires either SCSI-2 reservations or SCSI-3 persistent group reservations, both inefficient.
- Storage vendors can optimize this by implementing atomic test and set.
UNMAP
- Thin-provisioned LUNs can re-use deleted space
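A toy model of why atomic test and set helps: the lock lives in a single on-disk sector, and only a miscomparing host retries, instead of every host serializing on a whole-LUN reservation. The class below is illustrative, not a real target implementation; the internal mutex merely stands in for the array's own atomicity guarantee.

```python
import threading

class Lun:
    """Toy LUN supporting atomic test-and-set (in SCSI terms,
    COMPARE AND WRITE) on a per-sector basis."""

    def __init__(self):
        self._sectors = {}
        self._mutex = threading.Lock()  # stands in for array-side atomicity

    def compare_and_write(self, lba: int, expected: bytes, new: bytes) -> bool:
        with self._mutex:
            if self._sectors.get(lba, b"\0" * len(expected)) != expected:
                return False  # miscompare: another host holds the lock
            self._sectors[lba] = new
            return True

lun = Lun()
FREE, HOST_A, HOST_B = b"\0" * 8, b"host-A\0\0", b"host-B\0\0"
assert lun.compare_and_write(7, FREE, HOST_A)       # host A takes the lock
assert not lun.compare_and_write(7, FREE, HOST_B)   # host B miscompares, retries
assert lun.compare_and_write(7, HOST_A, FREE)       # host A releases
```

In a power-on storm, each VM's lock is a different sector, so hundreds of hosts contend only on the sectors they actually need.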
11. Integration of Hypervisors and SAN capabilities
Steps involved:
- Identify: inefficient bulk-block operations on VMs in virtual environments/hypervisors.
- Define: a set of SAN primitives (unused existing SCSI commands to be overloaded, or new SCSI commands to be adopted) based on that identification, including capability exchange (supported primitives); these can be added to SNIA specifications for standardization.
- Implement: SAN vendors implement the SCSI commands to deliver the desired functionality, each using its own technologies to achieve maximum performance.
- Call: the hypervisor invokes the SCSI commands to leverage SAN capabilities.
- Adopt: SAN arrays/servers adopt proven, efficient technologies for increased performance and reduced network bandwidth usage (features like Dynamic LUN Provisioning, Thin Provisioning, concurrent provisioning, space reclamation, dynamic snapshots, LUN migration, etc.).
12. NAS based Architecture – Plugin Approach
Layers, top to bottom:
- Application (VM) and Application (Web Services)
- Virtual Disk Library: a set of files (disk metadata), with a vendor-specific NFS plugin API or custom RPC plugin alongside
- user/kernel boundary
- NAS proprietary FS over NFS/CIFS: files and file segments
- Block Layer (block devices, logical volumes): logical blocks
- SCSI Device Access: physical device blocks
Plugin approach:
- The plugin approach allows vendors to use their own communication mechanism
- Many vendors use unused NFS commands
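A sketch of what such a plugin surface might look like on the hypervisor side. The interface and method names below are hypothetical, not any shipping vendor API; the in-memory implementation is a loopback stand-in so the framework can be exercised without a real filer.

```python
from abc import ABC, abstractmethod

class NasPlugin(ABC):
    """Hypothetical vendor plugin interface; how each method talks to
    the filer (unused NFS ops, custom RPC) is the vendor's choice."""

    @abstractmethod
    def reserve_space(self, path: str, nbytes: int) -> bool: ...

    @abstractmethod
    def clone_file(self, src: str, dst: str, lazy: bool = True) -> None: ...

    @abstractmethod
    def extended_stats(self, path: str) -> dict: ...

class InMemoryPlugin(NasPlugin):
    """Loopback implementation that only records the requests."""

    def __init__(self):
        self.reserved = {}
        self.clones = []

    def reserve_space(self, path, nbytes):
        self.reserved[path] = nbytes
        return True

    def clone_file(self, src, dst, lazy=True):
        self.clones.append((src, dst, lazy))

    def extended_stats(self, path):
        return {"reserved_bytes": self.reserved.get(path, 0)}

p = InMemoryPlugin()
assert p.reserve_space("/vmfs/vm1/disk.vmdk", 40 << 30)   # thick-provision 40 GiB
p.clone_file("/vmfs/gold.vmdk", "/vmfs/vm2/disk.vmdk")    # lazy clone from template
```

The three methods mirror the NAS primitives on the next slide: file space reservation, file cloning (full and lazy), and extended statistics.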
13. Representative NAS Primitives and Use Cases
File Space Reservation
  Goal: monitor space utilization in a sparse file and guarantee adequate space for VMs; allow future I/O operations of the VM NOT to fail due to space unavailability.
  Use cases: thick provisioning is the normal standard in enterprise deployments of hypervisors. With NAS file systems, POSIX lseek is the only way for hypervisors to efficiently create large files, but no backing storage is created with lseek. The File Space Reservation and Extended Statistics features enable hypervisors to quickly reserve space on the NAS server over NFS/CIFS to create a thick-provisioned virtual disk for a VM.
File Cloning (Full and Lazy)
  Goal: clone VMs in a faster and more storage-efficient way; can also be used for faster, storage-efficient snapshots and restores (VM snapshot).
  Use cases: VM cloning/deployment and storage vMotion are examples of operations that are offloaded to the NAS server. Lazy clones are typically used for instant operations, e.g. VM linked clones and VDI. Full clones are used for operations across datastores.
Extended Statistics
  Goal: retrieve accurate space utilization of VMs, data that cannot be retrieved using NFS calls; monitor space utilization in a sparse file and guarantee adequate space for VMs.
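The lseek limitation above can be seen with any POSIX file system: seeking past end-of-file yields the full logical size but, on most file systems, almost no allocated blocks, which is exactly what a thick-provisioned disk must not be.

```python
import os
import tempfile

# Create a 1 GiB virtual-disk file the only way plain NFS offers:
# seek past the end and write a single byte. The file's logical size
# is 1 GiB, but it is sparse, so no backing storage is reserved.
path = os.path.join(tempfile.mkdtemp(), "disk.vmdk")
size = 1 << 30
fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
os.lseek(fd, size - 1, os.SEEK_SET)
os.write(fd, b"\0")           # one real byte pins the logical size
os.close(fd)

st = os.stat(path)
assert st.st_size == size
# On most file systems st.st_blocks * 512 is tiny here; a File Space
# Reservation primitive is what would make allocation match st.st_size,
# and Extended Statistics is what reports the true allocation back.
```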
14. Efficiencies gained
File Space Reservation
- Thick-provisioned semantics are hard to achieve on NAS due to NFS protocol limitations
Cloning – Full and Lazy
- Doing the cloning within the NAS box is more efficient
- Furthermore, file-level writeable snapshot (i.e. clone) technology can be used to do the lazy copy without actually copying blocks
- Dedupe performance can be improved by recognizing that two extents are just copies of each other
Extended Statistics
- Understand exactly how much space a lazy clone uses
- Understand how much space a VMDK file is actually using
15. Additional Observations
Delegate creation of VM snapshots to the storage box; these snapshots need to be hypervisor-consistent.
For NAS, it is inefficient for hypervisors to do file snapshots themselves.
Flash-based storage boxes do not write in place and can do snapshots much more efficiently.
VDI applications could use the writeable snapshots (clones) already provided by most storage vendors.
16. Additional Observations
With storage-box-provided snapshots, backups can be taken without involving the server.
A primitive for disaster recovery that can use replication or remote mirroring (similar to SRA/SRM).
UNMAP even for thick provisioning on flash-based SAN arrays helps reduce the size of book-keeping data structures.
Storage arrays can demultiplex I/O streams from the hypervisor and provide VM-level I/O stream QoS (vVols).
17. Microsoft Offloaded Data Transfer (ODX)
Available from Windows Server 2012 (pre-release name "Windows Server 8") onwards
Works for both SAN storage (VHDs) and NAS using the SMB protocol
Provides full-copy semantics for both SAN and NAS
The protocol works as follows:
- Send an offload read request to the source device
- The device returns a token
- Send an offload write request with the token to the destination device
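A toy model of that token exchange; names and structure are illustrative (the real protocol is defined by Microsoft's ODX specification). The key point is that the token, not the data, crosses the host: the copy itself happens behind the copy manager.

```python
import secrets

class OdxCopyManager:
    """Toy model of the ODX exchange: offload-read returns an opaque
    token representing the data; offload-write with that token moves
    the data device-side, so no payload crosses the host."""

    def __init__(self):
        self.files = {}      # path -> contents
        self._tokens = {}    # opaque token -> captured data

    def offload_read(self, path: str, offset: int, length: int) -> str:
        token = secrets.token_hex(16)  # opaque point-in-time token
        self._tokens[token] = self.files[path][offset:offset + length]
        return token

    def offload_write(self, token: str, path: str, offset: int) -> None:
        data = self._tokens.pop(token)  # token is single-use here
        buf = bytearray(self.files.get(path, b""))
        buf[offset:offset + len(data)] = data
        self.files[path] = bytes(buf)

mgr = OdxCopyManager()
mgr.files["src.vhd"] = b"0123456789"
tok = mgr.offload_read("src.vhd", 2, 4)      # host sees only the token
mgr.offload_write(tok, "dst.vhd", 0)         # data moves device-side
```

In the real protocol, source and destination can be different devices as long as they share a copy manager that can resolve the token.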
18. Author Biography
Shriram Pore – Solutions Architect, Calsoft Inc.
- A veteran of the storage industry
- More than 11 years of experience architecting and developing products
- Key strength lies in quickly understanding product requirements and translating them into architectural and engineering specs for implementation
- Architected, designed and implemented solutions for NAS-VMware integrated backup/recovery and cloning of VMs
- Led FS and replication teams on the file server, and also led the CORE component team from a system-management perspective (configuring, provisioning and managing various facilities of the file server)
- Master of Computer Science, University of Pune, India
- Bachelor of Computer Science, University of Pune, India
19. Presenter Biography
Dr. Anupam Bhide – CEO, Co-Founder, Calsoft Inc.
- A storage industry veteran with more than 21 years of industry experience
- Senior Architect in the RDBMS development group at Oracle Corp; designed some of the key features of Oracle8
- Founding member of the DB2/6000 Parallel Edition team at IBM Research Center
- Visiting Faculty at the University of California, Berkeley
- Ph.D. in Computer Science, University of California, Berkeley
- BS in Computer Science, Indian Institute of Technology, Bombay; MS, University of Wisconsin-Madison
20. Thank You
Questions & Answers
Contact info
Dr. Anupam Bhide
CEO, Co-Founder, Calsoft Inc.
Email: anupam@calsoftinc.com
Phone: +1 (408) 834 7086
Twitter: @Calsoftinc
21. Integration of Hypervisors and NAS capabilities
Steps involved:
- Identify: inefficient file operations on VMs in virtual environments.
- Define: a set of NAS primitives (standard APIs) based on that identification, including capability exchange (supported primitives); these may be added to SNIA specs for standardization.
- Provide: a framework to implement the APIs as a plugin on hypervisors.
- Implement: the NAS vendor implements the plugin, using its own technologies to achieve maximum performance.
- Call: the hypervisor calls into the plugin via the defined framework APIs to leverage NAS capabilities.
- Modify: modify NFS/CIFS to accommodate the desired NAS primitives, or else provide a new interface that can be easily integrated into the plugin with no security compromise.
- Adopt: the NAS server adopts proven, efficient technologies for increased performance and reduced network bandwidth usage (e.g. file clone, server copy, FS snapshots, Thin Versioning, de-dupe, etc.).