VXLAN Practice Guide
1. A Practice Guide to vCNS and VXLAN
Technical Overview and Design Guide
Prasenjit Sarkar – VMware
Hongjun Ma – HP
Andy Grant – HP
2. Agenda
What we will focus on:
High-level overview of how VXLAN works
VXLAN implementation using vCNS, including
• Infrastructure Components
• Packet Flow
Deployment Prerequisites
Network Considerations
• Multicast requirements
• Multicast implementation
VTEP Performance and Overhead
• HP Virtual Connect & load-balancing
3. VXLAN Introduction
Target Audience
Architects, Engineers, Consultants, Admins responsible for Data Center Infrastructure and
VMware virtualization technologies
What is VXLAN
VXLAN - Virtual eXtensible Local Area Network is a network overlay that encapsulates
layer 2 traffic within layer 3
• Submitted to the IETF by Cisco, VMware, Citrix, Red Hat, Broadcom, & Arista
• Coined 'network virtualization' or 'virtual wires' by VMware
Competing Solutions?
NVGRE - Network Virtualization using Generic Routing Encapsulation
• Submitted to IETF by Microsoft, Arista, Intel, Dell, HP, Broadcom, Emulex
STT - Stateless Transport Tunneling
• Submitted to IETF by Nicira (VMware)
4. VXLAN Introduction
Why VXLAN?
• Ability to manage overlapping addresses between multiple tenants
• Decoupling of the virtual topology provided by the tunnels from the physical topology of the network
• Support for virtual machine mobility independent of the physical network
• Support for essentially unlimited numbers of virtual networks (in contrast to VLANs, for example)
• Decoupling of the network service provided to servers from the technology used in the physical network (e.g. providing an L2 service over an L3 fabric)
• Isolating the physical network from the addressing of the virtual networks, thus avoiding issues such as MAC table size in physical switches
• VXLAN provides up to 16 million virtual networks, in contrast to the 4094 limit of VLANs (see the quick check after this list)
• Application agnostic; all work is performed in the ESXi host
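To put the two limits side by side: the VLAN ID is a 12-bit field, while the VXLAN Network Identifier (VNI) is 24 bits. A minimal sketch of the arithmetic behind the "16 million vs 4094" claim (illustrative, not from the original deck):

```python
# VLAN vs VXLAN address space: 12-bit VLAN ID vs 24-bit VNI.
vlan_ids = 2**12 - 2   # 4094 usable IDs (0 and 4095 are reserved)
vnis = 2**24           # 16,777,216 possible VXLAN segments

print(f"Usable VLAN IDs: {vlan_ids}")   # 4094
print(f"VXLAN VNIs:      {vnis}")       # 16777216
```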
Where are we today?
• VXLAN is still in experimental status at the IETF
• Primarily targeted at vCloud environments, but a standalone product is available
5. VXLAN Introduction
How VXLAN?
• VMware vSphere ESXi 5.1 AND
– vCloud Networking Security 5.1 Edge
OR
– Cisco Nexus 1000V
VMware vCloud Networking and Security Edge
• Available vCNS deployment options
– Standalone (licensed per VM)
– AutoDeploy
• Deploying VXLAN through Auto Deploy
– vCloud Director 5.1 (licensed in vCloud Suite)
• Currently tested to support 5000 VXLAN segments
– vCloud Networking and Security 5.1 Edge configuration limits and throughput
Cisco Nexus 1000V
• Currently tested to support 2000 VXLAN segments
– Deploying the VXLAN Feature in Cisco Nexus 1000V Series Switches
7. vCloud Networking and Security - Edge
What is vCloud Networking and Security Edge?
Part of the VMware vCloud Networking and Security suite
• Previously known as the vShield suite.
• Provides gateway services including
– VPN
– DHCP
– DNS
– NAT
– Firewall (5-tuple)
– VXLAN & inter-VXLAN routing
– Load-Balancing (Advanced License)
– High Availability (Advanced License)
Licensing Options
– Standalone per-VM Standard or Advanced licensing
– Bundled with vCloud Suite
8. VXLAN: How it works
• Encapsulation
– Performed by a kernel module installed on the ESXi host
• Acts as the VXLAN Tunnel End Point, or VTEP
– Adds a 24-bit identifier and 50 bytes to the packet size (see the header sketch below)
• MAC in UDP + IP
– Why MAC in IP is better than vCNI (MAC in MAC)
• Multicast
– Where it is used, and how this impacts scalability
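As a rough illustration of the encapsulation (my own sketch, not VMware's kernel code), the VXLAN header defined in the IETF draft is 8 bytes, and the full outer Ethernet + IPv4 + UDP + VXLAN stack accounts for the 50-byte overhead:

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header (per the IETF draft layout).

    Byte 0: flags, with the I bit (0x08) set to mark a valid VNI.
    Bytes 1-3: reserved. Bytes 4-6: the 24-bit VNI. Byte 7: reserved.
    """
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!B3s3sB", 0x08, b"\x00\x00\x00",
                       vni.to_bytes(3, "big"), 0x00)

# Where the 50 bytes come from:
overhead = 14 + 20 + 8 + 8   # outer Ethernet + outer IPv4 + UDP + VXLAN header
assert overhead == 50
assert len(vxlan_header(5001)) == 8
```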
9. vCNS + Edge + VXLAN: Prerequisites
What is vCloud Networking and Security Edge?
Part of the VMware vCloud Networking and Security suite
• Previously known as the vShield suite.
• Highly integrated with vCloud but vCD is not necessary with standalone licenses.
VXLAN + vCNS Edge requires:
• Physical network components:
– MTU increase (1550 minimum; see the arithmetic below this list)
– Multicast enabled (depending on topology; more to come)
• VMware components:
– vDS 5.1 (implies vSphere Enterprise Plus licensing & vCenter)
– A vCNS Manager
– A vCNS Edge
VMware recommends:
• a single vDS across all clusters
• isolating your VTEP traffic from VM VLANs
• EtherChannel or LACP to your host for the VXLAN transport Port Group
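The 1550-byte minimum falls straight out of the 50-byte encapsulation overhead on a standard 1500-byte VM-facing MTU. A quick illustrative check (not from the deck):

```python
# Why the physical transport network needs at least a 1550-byte MTU:
inner_payload = 1500           # standard VM-facing MTU
inner_eth     = 14             # inner Ethernet header carried inside the tunnel
vxlan, udp, outer_ip = 8, 8, 20

outer_mtu = inner_payload + inner_eth + vxlan + udp + outer_ip
print(outer_mtu)               # 1550 -> minimum MTU for the transport network
```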
10. Multicast
What needs to be enabled on HP or Cisco switches?
What are the multicast design considerations?
• Limits of physical network hardware platforms using multicast
– Cisco Nexus 7000 supports 15,000 L2 IGMP entries
(http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9402/ps9512/brochure_mulitcast_with_cisco_nexus_7000.pdf)
– Cisco Nexus 7000 supports 32,000 MC entries (15K vPC)
(http://www.cisco.com/en/US/docs/switches/datacenter/sw/verified_scalability/b_Cisco_Nexus_7000_Series_NXOS_Verified_Scalability_Guide.html#reference_04BA8513CF3140D2A2A6C5E5B4E7C60C)
– Check HP gear limits.
– So what do these limits mean?
– VMware recommends one VXLAN 'virtual wire' per multicast segment; does that mean we can only support up to 15K or 32K virtual wires? (A toy sketch of this mapping follows.)
• If we don't follow this recommendation, how does this impact a VM broadcast flooding other VTEPs with multicast traffic?
• Is it better to use IGMP snooping/querier (L2 topology) or PIM with an L3 topology?
– How does this impact Data Center Interconnects (DCI) and stretched VXLAN implementations?
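To make the trade-off concrete, here is a hypothetical VNI-to-group mapping (the function name and addresses are mine, not VMware's). With one multicast group per virtual wire, the switch's IGMP/multicast table size caps the number of wires; sharing groups lifts the cap but means VTEPs receive flooded traffic for VNIs they do not host:

```python
import ipaddress

def vni_to_group(vni: int, base: str = "239.1.0.0", pool_size: int = 15000) -> str:
    """Map a 24-bit VNI onto one of `pool_size` multicast groups."""
    base_addr = int(ipaddress.IPv4Address(base))
    return str(ipaddress.IPv4Address(base_addr + (vni % pool_size)))

# While the VNI count stays under the pool size, every wire gets its own group:
print(vni_to_group(5001))    # 239.1.19.137
# Beyond the pool size, wires share groups and flooded traffic crosses wires:
print(vni_to_group(20001))   # 239.1.19.137 again -> shared with VNI 5001
```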
11. VXLAN Logical View
Packet flow across virtual wires on the same layer 2 VXLAN transport network
[Diagram: VXLAN fabric over the vDS, Layer 2 transport]
• Multicast configuration options: IGMP snooping/querier
• Design considerations? E.g. broadcast storms
12. VXLAN Logical View
Packet flow across virtual wires on different layer 3 VXLAN transport networks
[Diagram: VXLAN fabric over the vDS, Layer 3 transport]
• Multicast configuration options: PIM
• Design considerations? E.g. broadcast storms
13. High Level Physical Deployment
[Diagram: VXLAN fabric spanning four ESXi hosts on a vSphere Distributed Switch, each host with a VTEP]
Solution Components
• vDS 5.1
• VXLAN virtual fabric
• VTEP (vmk adapter in a dedicated Port Group)
• vCNS Edge 5.1
• vCNS Manager 5.1
14. Physical Deployment – A Closer Look
[Diagram: two ESXi hosts on a vSphere Distributed Switch, each with a VTEP, joined by the VXLAN fabric]
• vCNS Manager manages the vCNS deployment and supports many Edge devices
• VTEP is a single vmkernel interface per host, automatically created on the VXLAN vDS Port Group
• LACP, EtherChannel or (static) failover are the only supported load-balancing methods
• VLAN 'trunking' or virtual switch tagging (VST) is not recommended; dedicate 'access' physical uplinks to VXLAN Port Groups
• vCNS Edge virtual appliance provides gateway services
15. Physical Deployment – Intra-Host Packet Flow
[Diagram: two ESXi hosts with VTEPs on a vSphere Distributed Switch]
VM Packet Flow
1. VM sends a packet to a destination on the same virtual wire
2. Packet hits the vDS and is forwarded to the local destination VM
16. Physical Deployment – Inter-Host Packet Flow
[Diagram: two ESXi hosts with VTEPs on a vSphere Distributed Switch]
VM Packet Flow
1. VM sends a packet to a remote destination on the same virtual wire
2. The destination VM is remote, so the packet must traverse the VXLAN network
3. ESXi host encapsulates the packet and transmits it via the VTEP vmkernel adapter (a sketch of the forwarding decision follows)
4. Target ESXi host running the destination VM receives the packet on its VTEP and forwards it to the VM
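Step 3 hinges on the sending VTEP knowing which host owns the destination MAC. A simplified sketch of that decision (my own illustration with made-up addresses; the actual kernel module's internals are not public):

```python
# Simplified VTEP forwarding decision for an inner frame on a virtual wire.
# Known inner MACs are unicast-tunneled to the owning VTEP; broadcasts and
# unknown unicasts are sent to the wire's multicast group instead.
forwarding_table = {
    # (vni, inner destination MAC) -> remote VTEP IP, learned from traffic
    (5001, "00:50:56:aa:bb:01"): "10.0.1.11",
    (5001, "00:50:56:aa:bb:02"): "10.0.1.12",
}
vni_group = {5001: "239.1.19.137"}   # one multicast group per virtual wire

def outer_destination(vni: int, inner_dst_mac: str) -> str:
    if inner_dst_mac == "ff:ff:ff:ff:ff:ff":
        return vni_group[vni]                       # broadcast: flood the group
    remote = forwarding_table.get((vni, inner_dst_mac))
    return remote if remote else vni_group[vni]     # unknown unicast: also flood

print(outer_destination(5001, "00:50:56:aa:bb:02"))  # 10.0.1.12 (unicast tunnel)
print(outer_destination(5001, "ff:ff:ff:ff:ff:ff"))  # 239.1.19.137 (multicast)
```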
17. Physical Deployment – Routed Packet Flow
[Diagram: ESXi hosts with VTEPs on a vSphere Distributed Switch]
VM Packet Flow
1. VM transmits a packet to a remote destination
2. VTEP kernel module in the ESXi host encapsulates the packet and transmits it on the VXLAN network
3. ESXi host running the Edge device receives the packet and processes it through the rule engine
4. Packet is processed by firewall/NAT/routing rules and sent out the external interface of the Edge device
5. Packet hits the physical network infrastructure
18. Comparison of vSphere NIC Teaming
Load Distribution vs Load Balancing vs Active/Standby
vCNS Edge supports LACP & EtherChannel or Failover (aka Active/Standby) NIC teaming options
• Load Distribution (of IP flows), e.g. LACP: example link loads 90% / 20%. Attempts to evenly distribute IP traffic flows (conversations); bandwidth is NOT a consideration. (A toy flow-hashing sketch follows this slide.)
• Load Balancing (bandwidth), e.g. LBT: example link loads 55% / 40%. Attempts to evenly distribute bandwidth capacity.
• Active/Standby: example link loads 100% / 0%. Single active link, no automatic load distribution/balancing.
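Why flow-based distribution can leave links unevenly loaded: the uplink is picked by hashing the conversation's addresses, and per-link bandwidth never enters the decision. A toy illustration (uplink names and flows are made up):

```python
import zlib

# Flow-based load DISTRIBUTION: hash each conversation onto an uplink.
# Two heavy flows can hash onto the same link while another link sits
# idle, because throughput is never consulted.
uplinks = ["vmnic0", "vmnic1"]

def pick_uplink(src_ip: str, dst_ip: str) -> str:
    flow = f"{src_ip}->{dst_ip}".encode()
    return uplinks[zlib.crc32(flow) % len(uplinks)]

flows = [("10.0.1.11", "10.0.2.20"), ("10.0.1.12", "10.0.2.20"),
         ("10.0.1.13", "10.0.2.21")]
for src, dst in flows:
    print(f"{src} -> {dst} on {pick_uplink(src, dst)}")
```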
19. VXLAN with HP Virtual Connect Interconnects
Virtual Connect Advantage
East/West fencing (VTEP) traffic stays in the VC domain using cross-connect or stacking links, reducing North/South bandwidth requirements.
Virtual Connect Disadvantage
Virtual Connect does not support downstream server EtherChannel or LACP connectivity.
• Limited to the vCNS Teaming Policy of "Failover"
– Effectively an Active/Standby configuration
– Cuts North/South bandwidth efficiency in half due to the idle link
– This is not as bad as it sounds, due to the East/West traffic savings using cross-connects/stacking links
Possible Solutions?
• VC Tunnel Mode? – Does it pass link aggregation control traffic? Looks to be a NO
• Multiple Edge devices using alternating Active/Standby teaming on the VXLAN Port Group?
– Static load-distribution sucks!
• Other?
20. VXLAN Performance
Encapsulation Overhead
VXLAN introduces an additional layer of packet processing at the hypervisor level. For each packet on the VXLAN network, the hypervisor needs to add protocol headers on the sender side (encapsulation) and remove these headers (decapsulation) on the receiver side. This costs the CPU additional work for each packet.
Apart from this CPU overhead, some of the offload capabilities of the NIC cannot be used because the inner packet is no longer accessible. The physical NIC hardware offload capabilities (for example, checksum offloading and TCP segmentation offload (TSO)) were designed for standard (non-encapsulated) packet headers, and some of these capabilities cannot be used for encapsulated packets. In such a case, a VXLAN-enabled packet will require CPU resources to perform a task that otherwise would have been done more efficiently by the physical NIC hardware. There are certain NIC offload capabilities that can be used with VXLAN, but they depend on the physical NIC and the driver being used. As a result, performance may vary based on the hardware used when VXLAN is configured. (A quick look at the fixed per-packet byte overhead follows.)
http://www.vmware.com/files/pdf/techpaper/VMware-vSphere-VXLAN-Perf.pdf
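Besides the CPU cost, the fixed 50 bytes of encapsulation also consume a slice of wire bandwidth, and the slice grows as packets shrink. A quick illustrative calculation (mine, not from the referenced paper):

```python
# Share of wire bytes consumed by the fixed 50-byte VXLAN encapsulation,
# for a few original Ethernet frame sizes. Smaller packets pay more.
for frame in (1514, 512, 64):
    overhead_pct = 50 / (frame + 50) * 100
    print(f"{frame:>5}-byte frame: {overhead_pct:4.1f}% encapsulation overhead")
```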
21. VXLAN Isn’t Perfect
Compared to MAC-in-MAC encapsulation (vCNI), VXLAN (MAC in UDP) moves in the right direction for broadcast scalability:
• "Broadcasts on internal networks ('protected' with vCDNI) get translated into global broadcasts. This behavior totally destroys scalability. In VLAN-based designs, the number of hosts and VMs affected by a broadcast is limited by the VLAN configuration... unless you stretch VLANs all across the data center (but then you ask for trouble)." – Ivan Pepelnjak
VXLAN-fenced networks communicate via the VXLAN vmk adapter, which uses only a single NetQueue NIC queue. This limits scalability by concentrating CPU pressure for that traffic on a single pCPU in the host.
vCNS Teaming Policy in conjunction with Virtual Connect: VC has no downstream EtherChannel/LACP support, so VXLAN will always effectively be Active/Passive going out of the chassis. You will be limited to the bandwidth of a single upstream link per vCNS Edge device (typically per cluster).
The lack of control plane virtualization and the reliance on the physical network for MAC propagation introduce limits imposed by multicast:
– Multicast administrator expertise (not your typical data center protocol)
– Multicast segment support limits of physical network infrastructure