2. Heterogeneous Workloads
Virtualization Comes of Age
The maturation of virtualization platforms, highlighted by the
recent releases of VMware vSphere 5.1 and Windows Server 2012
Hyper-V, has sharpened the focus of data center managers on
increasing virtual machine (VM) density and virtualizing Tier-1
application workloads to maximize their virtualization ROI.
Survey chart: “The average number of VMs per server in my environment”
This report examines the ability of Cisco UCS and HP
BladeSystem to provide the useable I/O bandwidth and I/O flexibility
needed to support the growth of VM density and
heterogeneous workloads.
VM Density Driving Need for Useable I/O Bandwidth
IT Brand Pulse surveys indicate that IT pros project VM density will
almost double from 13 VMs/server to 24 VMs/server within the
next 24 months—driving a need for more useable
I/O bandwidth.
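To put the projected density growth in per-VM terms, here is a minimal sketch that assumes a server's I/O bandwidth is split evenly across its VMs. The 20 Gb/s server figure (two 10GbE ports) and the function name are illustrative assumptions, not survey data; only the 13 and 24 VMs/server values come from the survey cited above.

```python
def per_vm_share_gbps(server_bandwidth_gbps: float, vm_count: int) -> float:
    """Return the even share of a server's I/O bandwidth available to each VM."""
    if vm_count <= 0:
        raise ValueError("vm_count must be positive")
    return server_bandwidth_gbps / vm_count


# Assumption for illustration only: a server with two 10GbE ports (20 Gb/s).
SERVER_BANDWIDTH_GBPS = 20.0

today = per_vm_share_gbps(SERVER_BANDWIDTH_GBPS, 13)      # current density from the survey
projected = per_vm_share_gbps(SERVER_BANDWIDTH_GBPS, 24)  # projected density from the survey

print(f"Density growth: {24 / 13:.2f}x")
print(f"Per-VM share today: {today:.2f} Gb/s, projected: {projected:.2f} Gb/s")
```

Under this assumption, holding server bandwidth constant while density nearly doubles cuts each VM's share roughly in half, which is the pressure the survey result points to.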
Heterogeneous Workloads Driving Need for Flexibility
More VMs/server means more diversity in the applications the
server I/O system must support. This drives the need for flexibility
in how blade server I/O bandwidth and policies are provisioned to
different applications.
Survey chart: “What I need most to increase the density of VMs per physical server is more:”
Converged Infrastructures Demand I/O Flexibility
The convergence of compute, storage and networking into a
shared resource pool is gaining acceptance and deployment in
enterprise data centers. By definition, this sharing of resources by
multiple applications and lines of business also drives the need
for flexible, high-throughput, low-latency blade enclosure I/O.
41.1% of IT professionals surveyed said “server virtualization” was the
application most driving the adoption of 10GbE in their data center.
Document # APP2013011 v16 August, 2013
3. Anatomy of Blade Server Chassis I/O
Application Performance Depends on a Healthy Network
Every blade server chassis has an entire network embedded inside to carry east-west traffic between servers,
and north-south traffic to top-of-rack, end-of-row, and core switches upstream. The I/O performance of
heterogeneous applications running on virtualized blade servers can differ significantly based on the ability of the blade server network to:
Scale without oversubscription
Provide native support for multiple network protocols
Deliver network traffic with low latency by minimizing route hops
Figure: Blade server chassis with 16 blade servers and 2 switches in the chassis, and 1 LOM adapter and 1 mezzanine adapter on each server. Labeled components:
Ethernet and Fibre Channel uplinks to LANs and SANs
Embedded switches and/or pass-through modules
Ethernet and Fibre Channel downlinks to midplane and server adapters
Midplane
Ethernet or Fibre Channel mezzanine adapters
Ethernet LAN-on-Motherboard (LOM) adapters
LOM
LAN-on-Motherboard is an Ethernet adapter which includes the first
network ports configured on a server.
4. Useable I/O Bandwidth
Oversubscription Reduces Available Bandwidth
Oversubscription occurs when the I/O capacity of the adapter ports connected to a switch port exceeds the
capacity of the switch port. The oversubscription ratio is the sum of the capacity of the adapter ports divided
by the capacity of the switch port.
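The definition above can be sketched as a small calculation. This is a minimal illustration, not a vendor tool: the function names and the example port speeds are hypothetical.

```python
from typing import Iterable


def oversubscription_ratio(adapter_port_gbps: Iterable[float],
                           switch_port_gbps: float) -> float:
    """Sum of adapter port capacities divided by the switch port capacity."""
    if switch_port_gbps <= 0:
        raise ValueError("switch_port_gbps must be positive")
    return sum(adapter_port_gbps) / switch_port_gbps


def format_ratio(ratio: float) -> str:
    """Render a ratio in the N:1 form used in this report."""
    return f"{ratio:g}:1"


# Hypothetical example: four 10 Gb/s adapter ports sharing one 10 Gb/s switch port.
ratio = oversubscription_ratio([10, 10, 10, 10], 10)
print(format_ratio(ratio))
```

A ratio of 1:1 or lower means the switch port can carry every connected adapter port at full rate; anything higher means the adapters can offer more traffic than the switch port can forward.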
Oversubscription also results from the imbalance between outgoing bandwidth and incoming bandwidth
available from a blade server chassis to top-of-rack, end-of-row and core switches. Severe oversubscription
reduces useable bandwidth, increases network latency, slows user response time, and makes it difficult to
deliver deterministic application performance.
Blade server designers aim to maximize useable bandwidth and minimize oversubscription, but not all blade server
architectures are optimized for these considerations.
Multi-Tiered Networks Increase Applications Latency
As data traverses a network fabric, each network tier in the path adds delay, which degrades
application performance. As a result, designers also architect for minimal tiering to reduce this latency.
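The additive effect of tiers can be sketched as a sum over the hops in a path. The tier names and per-hop latency values below are hypothetical placeholders chosen only to show the mechanics; they are not measurements of any product.

```python
from typing import List, Tuple

Hop = Tuple[str, float]  # (tier name, latency in microseconds)


def path_latency_us(hops: List[Hop]) -> float:
    """Total one-way latency of a path: the sum of each tier's latency."""
    return sum(latency for _, latency in hops)


# Hypothetical values for illustration only.
flat_path: List[Hop] = [("server adapter", 1.0), ("in-chassis switch", 1.0)]
tiered_path: List[Hop] = flat_path + [("upstream switch", 2.0), ("in-chassis switch", 1.0)]

print(f"Flat path:   {path_latency_us(flat_path):.1f} us")
print(f"Tiered path: {path_latency_us(tiered_path):.1f} us")
```

Because every added tier contributes its own term to the sum, removing a tier is the only way to remove its latency from the path entirely.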
Cisco UCS Hierarchical Network Architecture
The Cisco UCS design supports 8 blade servers per chassis and defaults to a hierarchical, multi-tiered, “north-south” networking architecture in Cisco’s preferred End Host Mode. This mode creates a
logical system in which server-to-server traffic between hosts in the same fabric travels outside the chassis to the
Fabric Interconnect.
This mode requires two fabric interconnects and two fabric extenders, with each extender providing up to
160Gb/s of total bandwidth in an active-active configuration. A fully loaded blade with a VIC 1240 quad-port
10GbE adapter plus a quad-port mezzanine extender requires 80Gb/s, for a total of 640Gb/s in an 8-blade
enclosure. The result is up to 4:1 oversubscription and reduced available bandwidth.
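The 4:1 figure follows directly from the numbers in the paragraph above. The sketch below reproduces that arithmetic; the function and parameter names are illustrative, and the 160 Gb/s value is used as the report states it, as the upper bound of extender bandwidth.

```python
from typing import Tuple


def chassis_oversubscription(ports_per_blade: int, port_gbps: int,
                             blades: int, uplink_gbps: int) -> Tuple[int, float]:
    """Return (total blade demand in Gb/s, oversubscription ratio)."""
    demand_gbps = ports_per_blade * port_gbps * blades
    return demand_gbps, demand_gbps / uplink_gbps


# Figures from the text: quad-port VIC 1240 plus quad-port mezzanine extender
# (8 x 10GbE per blade), 8 blades per chassis, up to 160 Gb/s of extender bandwidth.
demand_gbps, ratio = chassis_oversubscription(
    ports_per_blade=8, port_gbps=10, blades=8, uplink_gbps=160
)
print(f"Demand: {demand_gbps} Gb/s, oversubscription: {ratio:g}:1")
```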
Oversubscription is exacerbated when the servers are on different fabrics, which requires server-to-server traffic to
travel north of the fabric interconnects to the end-of-row aggregation switch. This architecture prevents
network loops but also prevents server-to-server traffic flow between interconnects, forcing it to
travel all the way to the end-of-row switch. As a result, the fabric interconnects become a north-south
bottleneck with up to 10:1 oversubscription, significantly limiting useable I/O bandwidth.
Oversubscription
The ratio of the aggregate bandwidth of all ingress (incoming)
ports to the switching bandwidth. High ratios
can increase latency and reduce scalability.
5. Applications Latency
Cisco UCS Network Architecture Latency
The scenarios described also illustrate the additional network tiers built into the UCS architecture,
which can increase total latency, with a commensurate reduction in application performance and
less predictable application behavior. Network latency is calculated in greater detail later in
this report.
Figure: Single and Multi-fabric Networking with Cisco UCS. Labeled elements: 16 UCS blade servers in 2 chassis, fabric extenders, fabric interconnects, Fabric 1, Fabric 2, two end-of-row switches, and an oversubscription callout.
North-south traffic through the fabric interconnects and end-of-row switches is required for single or
multi-fabric traffic flow.
Latency
Latency is the time between the start and completion
of one action, measured in microseconds (µs).
6. HP ProLiant Gen8 Blade Server
The Ultimate in Useable Bandwidth
HP ProLiant Gen8 blade servers offer the flexibility of network architectures optimized for server-to-server
“east-west” traffic or “north-south” traffic in/outbound from the blade enclosure. This is accomplished via
configuration options for the HP FlexFabric server network adapters and HP Virtual Connect FlexFabric
modules.
If the application demands server-to-server “east-west” traffic within the enclosure, an active-standby configuration is used, with the upstream network switch connected to a single port on each HP
Virtual Connect FlexFabric module. However, if the application demands predominantly server-to-core
“north-south” traffic, an active-active configuration enables two uplink ports on the network switch,
increasing overall bandwidth and reducing oversubscription.
A c7000 enclosure with 16 blades, each with two 10GbE FlexFabric adapters and two mezzanine 10GbE
adapters, requires 960Gb/s of total bandwidth. The enclosure’s available internal midplane bandwidth of
7.2Tb/s, about 7x the required bandwidth, results in zero oversubscription for east-west traffic flow.
For traditional north-south traffic, HP Virtual Connect FlexFabric modules with 16 downlink and 8 uplink ports
result in 2:1 oversubscription, which is similar to or lower than Cisco’s, depending on configuration.
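The two ratios in this section can be checked with a short calculation. This sketch takes the 960 Gb/s requirement and 7.2 Tb/s midplane figure as stated in the text rather than deriving them, and it assumes downlink and uplink ports run at the same speed so the port counts alone give the north-south ratio. The constant names are illustrative.

```python
# Figures as stated in the text.
REQUIRED_GBPS = 960    # total blade bandwidth requirement for a 16-blade c7000
MIDPLANE_GBPS = 7200   # 7.2 Tb/s of internal midplane bandwidth
DOWNLINK_PORTS = 16    # per Virtual Connect FlexFabric module
UPLINK_PORTS = 8       # per Virtual Connect FlexFabric module

# East-west: how many times the midplane capacity covers the blade requirement.
east_west_headroom = MIDPLANE_GBPS / REQUIRED_GBPS

# North-south: assumes equal port speeds, so the ratio of port counts applies.
north_south_ratio = DOWNLINK_PORTS / UPLINK_PORTS

print(f"East-west headroom: {east_west_headroom:g}x")
print(f"North-south oversubscription: {north_south_ratio:g}:1")
```

A headroom above 1x means the midplane is not the constraint for server-to-server traffic, which is what the zero-oversubscription claim for east-west flow rests on.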
The 7.2 Tb/s useable bandwidth between device bays and interconnect bays allows server-to-server traffic to stay
within a single c7000 enclosure
Virtual Connect
HP’s “wire-once” interconnect solution for cloud and
virtualized data centers delivers flatter networks and reduced cable costs.
7. HP ProLiant Gen8 Blade Server
Network Latency Comparison
The HP Virtual Connect FlexFabric architecture flexibly supports both single and separate fabric traffic pattern
implementations. However, configurations architected with separate fabrics address the enterprise
requirements for resiliency through redundancy.
The HP solution delivers lower latencies for both configurations, based on calculations from
specifications provided by HP and Cisco.
Total Data Latencies For HP and Cisco Architectures—Lower is better
HP Virtual Connect Network Architecture delivers data with more than 50% lower latency
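The "more than 50% lower" claim can be checked against the single-system latency ranges listed in this report's feature comparison (1.5 to 3.0 µsec for HP, 3.2 to 7.4 µsec for Cisco). The sketch below assumes the low ends and the high ends of the two ranges describe matching configurations; that pairing is an assumption made for illustration, and the function name is hypothetical.

```python
def percent_lower(candidate_us: float, baseline_us: float) -> float:
    """Percentage by which candidate latency is lower than baseline latency."""
    if baseline_us <= 0:
        raise ValueError("baseline_us must be positive")
    return (1 - candidate_us / baseline_us) * 100


# Ranges from the report's feature comparison, in microseconds.
hp_low, hp_high = 1.5, 3.0
cisco_low, cisco_high = 3.2, 7.4

# Assumed pairing: low end with low end, high end with high end.
low_end = percent_lower(hp_low, cisco_low)
high_end = percent_lower(hp_high, cisco_high)

print(f"Low end:  {low_end:.1f}% lower")
print(f"High end: {high_end:.1f}% lower")
```

Both results come out above 50% under this pairing, which is consistent with the chart caption.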
10GbE
High performance 10GbE ports are now available for
blade servers in LAN-on-Motherboard (LOM) and mezzanine adapter form factors.
8. I/O Flexibility
Support for Only One Network Wire Reduces I/O Flexibility
Based on IT Brand Pulse surveys, 40% of IT organizations are not converging with FCoE. These IT
professionals, who have been too busy to look at FCoE or who say they have no plans to converge their LANs
and SANs, will continue to deploy parallel Ethernet and Fibre Channel infrastructure.
The modular design of blade servers makes them inherently flexible. But not all blade server platforms are
equal when it comes to hosting multiple heterogeneous virtualized workloads and delivering I/O flexibility.
While some blade server designs accommodate Ethernet and native Fibre Channel connectivity, the Cisco
UCS design only supports Ethernet connectivity.
Wanted: Parallel Ethernet & Fibre Channel Networks
In 2013, the prevalent data center network architecture remains a parallel network architecture, including a mix of specialized NIC, iSCSI, and
Fibre Channel host adapters, as well as Ethernet and Fibre Channel switched fabrics. Cisco UCS blade servers support only Ethernet
connectivity. Adoption of FCoE technology is required to access installed Fibre Channel resources.
40% of IT Professionals are not converging their networking infrastructure
to Fibre Channel over Ethernet.
9. HP ProLiant Gen8 Blade Server
The Ultimate in I/O Flexibility
HP ProLiant Gen8 blade servers are designed for I/O flexibility with a choice of FlexFabric converged
networking or parallel Ethernet and Fibre Channel networks. The ProLiant Gen8 blade servers are also fully
compliant with Windows Server 2012 Virtual Fibre Channel—an innovation that will play an important role in
the virtualization of Tier-1 workloads with Microsoft Hyper-V.
Figure: HP BladeSystem c7000 enclosure with ProLiant Gen8 blade servers. Callouts:
HP Virtual Connect FlexFabric 10Gb/24-port module supports connectivity to native Fibre Channel 3PAR storage at a lower cost than using Fibre Channel switches
HP Virtual Connect FlexFabric 10Gb/24-port module supports LAN, NAS, iSCSI and FCoE connectivity
HP 659818-B21 Mezzanine FC Adapter: native Fibre Channel server adapter, over 12 million ports shipped on this stack, complete enterprise OS support including Solaris
HP FlexFabric 10Gb 2-port 554FLB Adapter: Ethernet LAN on Motherboard (LOM), supports LAN, NAS, iSCSI and FCoE connectivity, HP FlexFabric Ready
Flexibility
Choice of I/O Convergence or Divergence meets the needs of
heterogeneous workloads on a single blade platform.
10. Fibre Channel without Switches
Direct Attached Storage Reduces CAPEX and OPEX
Complementing the Fibre Channel (FC) connectivity options of HP ProLiant blade servers, HP’s
Flat SAN technology enables enterprise-class direct-attached connectivity to HP 3PAR Storage Systems without
requiring a Storage Area Network (SAN) fabric. This scalable solution provides connectivity for up to 192 FC
ports and 192 petabytes of storage capacity. Benefits of this architecture include:
Reduced CAPEX on SAN fabric switches and associated software licenses
Reduced ongoing management and support OPEX of multiple touch points
Latency reduction of up to 55% by removing the fabric switching layer
Up to 2.5x faster Fibre Channel storage provisioning
These benefits accrue while maintaining the flexibility to simultaneously include fabric-attached storage for
traditional SAN connectivity from the same Virtual Connect module.
Figure: Simultaneous Direct Attached & SAN-Attached Fibre Channel Storage. Labeled elements: FC switch, HP StorageWorks EVA Storage, HP 3PAR Storage, and HP ProLiant Gen8 blade servers in a c7000 enclosure.
Flexibility
Choice of Direct-Attached or Fabric-Attached or Simultaneous
Direct and Fabric-Attached Fibre Channel Storage Connectivity
11. Advantage — HP ProLiant Gen8
Conclusions
For IT organizations that want to scale up the density of their heterogeneous VM workloads, HP ProLiant
Gen8 blade servers offer more useable bandwidth and superior I/O flexibility.
Feature | HP BL460c blade server with 554FLB or 554M | Cisco B200 M3 blade server with VIC 1240
Servers per Chassis | 16 | 8
Servers per Zero Oversubscription Traffic Domain | 16 | 0
Oversubscription | None to 2:1 oversubscription | 4:1 to 10:1 oversubscription
Single System Latencies | 1.5 to 3.0 µsec | 3.2 to 7.4 µsec
Support for native Fibre Channel & 10GbE | Yes | No
Fibre Channel over Ethernet (FCoE) | Yes | Yes
iSCSI | Yes | No – software only
TCP offload engine (TOE) | Yes | Yes
Hardware offload | |
Proven
Emulex Fibre Channel technology is proven with a 15 year history of
deployment of 12 million ports in mission-critical environments.
12. Resources
Related Links
To learn more about the companies, technologies and products mentioned in this report, visit the following
web pages:
HP FlexFabric Adapters Provided by Emulex
SFP+ Copper Latency Substantiation
HP BladeSystem
UCS Fabric Interconnect Latency
HP Virtual Connect Technology
Nexus 5548 Switch Latency
HP Virtual Connect Traffic Flow
UCS Fabric Expander Latency
HP BladeSystem and Cisco UCS Comparison
HP Virtual Connect Latency
Cisco Fabric Extender
Cisco UCS Adapters
Cisco UCS Ethernet Switching Modes
HP Direct Connection Flat SAN Storage
IT Brand Pulse
About the Author
Rahul Shah, Director, IT Brand Pulse Labs
Rahul Shah has over 20 years of experience in senior engineering and product management positions with semiconductor, storage networking and IP networking manufacturers
including QLogic and Lantronix. At IT Brand Pulse, Rahul is responsible for managing the
delivery of technical services ranging from hands-on testing to product launch collateral.
You can contact Rahul at rahul.shah@itbrandpulse.com.