2. About Me
• Adam Johnson @adjohn
• Based in San Francisco, CA
• Founding member of Midokura (since 2010)
• Runs Technical Services at Midokura
• Deploying network virtualization overlays
(NVOs) in production with our customers for
the last 3 years at service providers,
enterprises, and web-scale companies
Confidential
3. About Midokura
• Founded in 2010, Midokura is a global
company with offices in Tokyo, San
Francisco and Barcelona
• Pioneer in network virtualization –
provides networking software using an
overlay approach. Pedigree derives from
Amazon, Cisco, VMware and Google
• Received $17M first round of funding in
April 2013 from Innovation Network
Corporation of Japan, NTT and NEC
• Named by CRN as among the top 10
networking stories of 2013 and also
among the 10 coolest startups in the world
• Won Nokia’s Silicon Valley Innovation
Challenge – 2014
• Named AlwaysOn award winner for
the second consecutive year
• Significant contributor to
OpenStack Networking (Neutron)
• First SDN vendor to be certified for
Red Hat OpenStack environment
• Early member of the OpenDaylight
Project (ODP)
• Broad and deep technical
partnerships with network switch
vendors, software companies and
solution providers
4. Agenda
• A bit of background on NVOs
• Evaluating NVOs for performance and
efficiency
• Performance challenges with overlays
• Performance advantages with overlays
• Q&A
5. A bit of background on network
virtualization overlays
6. Why Overlays?
We’re living in a virtual world
MAC or IP scaling issues
• A ToR switch supports 16k TCAM entries, i.e. 16k vNICs in our case
• 1 VM has 1 vNIC, 30 VMs / server = 533 servers
Now let’s add Docker or other containers to the mix
• 1 container has 1 vNIC, 100 containers / server = 160
servers
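The arithmetic behind these two figures can be sketched as below. This is a minimal illustration, assuming "16k" means 16,000 table entries (which is what reproduces the 533 and 160 on the slide) and that every vNIC consumes exactly one entry; `servers_per_tor` is a hypothetical helper name, not something from the talk.

```python
def servers_per_tor(table_entries: int, vnics_per_server: int) -> int:
    """Number of fully loaded servers one ToR table can hold, one entry per vNIC."""
    return table_entries // vnics_per_server


# Assumption: "16k" is taken as 16,000 entries to match the slide's figures.
TOR_TABLE_ENTRIES = 16_000

vm_servers = servers_per_tor(TOR_TABLE_ENTRIES, 30)          # 30 VMs per server
container_servers = servers_per_tor(TOR_TABLE_ENTRIES, 100)  # 100 containers per server

print(f"VMs:        {vm_servers} servers per ToR table")
print(f"Containers: {container_servers} servers per ToR table")
```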
7. Why Overlays?
Are 4000 VLANs enough? Not even close!
In an ideal world, each app could/should get its
own isolated network
Think micro-segmentation
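To put the VLAN limit in numbers: the 802.1Q VLAN ID is a 12-bit field, while the VXLAN network identifier (VNI) used by many overlays is 24 bits. The sketch below only compares the two ID spaces; the constant names are illustrative, and real deployments are usually limited by other factors well before the ID space runs out.

```python
VLAN_ID_BITS = 12    # 802.1Q VLAN ID field width
VXLAN_VNI_BITS = 24  # VXLAN network identifier field width

# VLAN IDs 0 and 4095 are reserved, leaving 4094 usable VLANs.
usable_vlans = 2 ** VLAN_ID_BITS - 2
vxlan_segments = 2 ** VXLAN_VNI_BITS

print(f"Usable VLANs:   {usable_vlans}")
print(f"VXLAN segments: {vxlan_segments}")
print(f"Ratio:          about {vxlan_segments // usable_vlans}x")
```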
8. Why Overlays?
Manually provisioning networks slows everything
down
Storage and compute can be provisioned
automatically in seconds or minutes.
Networking can take days or weeks
This is not acceptable when release cycles are
shortened to 2-4 weeks
9. So how do overlays help?
Logical network configuration does not affect the
physical network.
– MACs and IPs of the overlay are invisible to the
underlay network.
• ToR only needs to hold the hypervisors' IPs/
MACs, which is much more feasible
– Creating or modifying networks and services
requires no physical fabric reconfiguration
• Only need to change the physical fabric when adding
new racks
10. So how do overlays help?
Centralized configuration and management of
networks.
– API, CLI, GUIs
– Automation via orchestration (OpenStack)
– Config management friendly: Chef, Puppet
11. How do Overlays work?
[Diagram. Physical view: two physical servers, each hosting VMs on a vSwitch or agent, with NICs connected to ToR and core switches (the physical network). Logical view: a provider router, Tenant A/B/C routers, and Tenant A/B/C networks, on top of the physical network.]
15. What to look for when evaluating NVOs
Raw throughput with iperf?
This only tests the dataplane, which should be
roughly identical between NVO solutions
This is not enough
16. What to look for when evaluating NVOs
Need to test control plane performance
- Flow setups per second
- Add complexity with networking services
* Stateful firewall rules
* NAT
* Load Balancers
* Routing
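One rough way to exercise flow setup rather than raw throughput is to open many short-lived TCP connections, since each new connection is a new flow the overlay has to set up. The sketch below is an illustration under assumptions, not a tool from the talk: `start_local_listener` and `measure_flow_setup_rate` are hypothetical helper names, connections are opened serially from a single client, and the loopback listener exists only so the example runs on its own. In a real evaluation the client and listener would sit on VMs attached to the overlay, with firewall, NAT, load balancing or routing enabled in turn.

```python
import socket
import threading
import time


def start_local_listener():
    """Start a throwaway TCP listener on loopback; returns (socket, port)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(128)

    def accept_loop():
        while True:
            try:
                conn, _ = server.accept()
            except OSError:  # listener socket was closed
                return
            conn.close()

    threading.Thread(target=accept_loop, daemon=True).start()
    return server, server.getsockname()[1]


def measure_flow_setup_rate(host, port, flows):
    """Open `flows` short-lived TCP connections in series; return connections per second."""
    start = time.perf_counter()
    for _ in range(flows):
        # Each connection gets a fresh source port, so it is a new 5-tuple flow.
        with socket.create_connection((host, port), timeout=5):
            pass
    elapsed = time.perf_counter() - start
    return flows / elapsed


if __name__ == "__main__":
    listener, port = start_local_listener()
    rate = measure_flow_setup_rate("127.0.0.1", port, 200)
    print(f"{rate:.0f} new connections per second")
```

Running the same script with and without each network service enabled gives a like-for-like comparison; a serial client understates what a parallel load generator would show.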
17. Not all NVOs are built the same
If you believe marketing-speak, all NVOs are nearly
identical.
Reality sets in once you deploy:
- Centralized controller vs. decentralized control
plane
- How are higher-layer services handled?
* Distributed vs. middle boxes
- External connectivity?
* Active/standby GW vs. distributed all-active
* L2 or L3?
* How are failures handled?
* HW or SW GW?
18. Tips for evaluating NVOs
Deep dives on architecture
Ask the tough questions
Talk to the users
Bake off
20. Encapsulation Overhead
VXLAN adds 50 bytes of overhead. With a standard-size
MTU, this equates to roughly 6% overhead
Jumbo frames can be used to significantly reduce
the overhead and increase performance
Great article on this topic from Packet Pushers:
http://packetpushers.net/vxlan-udp-ip-ethernet-bandwidth-overheads/
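The 50 bytes break down into the outer Ethernet, IPv4, UDP and VXLAN headers. The sketch below shows how the relative cost shrinks with jumbo frames. It is a simplified model, assuming an untagged outer frame and counting only the encapsulation headers against the encapsulated packet; other accounting choices (for example also counting inner TCP/IP headers or Ethernet framing on the wire) give different percentages, so the numbers are not expected to match the slide's figure exactly. `encap_overhead_pct` is a hypothetical helper name.

```python
# VXLAN encapsulation headers in bytes (outer frame without a VLAN tag).
VXLAN_HEADERS = {
    "outer_ethernet": 14,
    "outer_ipv4": 20,
    "outer_udp": 8,
    "vxlan": 8,
}
VXLAN_OVERHEAD_BYTES = sum(VXLAN_HEADERS.values())  # 50


def encap_overhead_pct(inner_bytes: int, overhead: int = VXLAN_OVERHEAD_BYTES) -> float:
    """Encapsulation bytes as a percentage of the encapsulated packet size."""
    return 100 * overhead / (inner_bytes + overhead)


for inner in (1500, 9000):  # standard vs. jumbo-sized inner packets
    print(f"inner size {inner}: {encap_overhead_pct(inner):.2f}% encapsulation overhead")
```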
21. Moving up the stack
L2 is easy, L3+ is where things get tricky
* The middle-box approach adds extra hops and ties
traffic to the physical network (traffic trombones)
* Distributed everything is the answer
How about stateful services like NAT, FW?
* Heavily used in IaaS use cases
* Difficult to distribute, but it can be done
22. First packet lag blues
Initial flow setup requires simulation and
programming of the dataplane.
An overlay may not be suitable for latency-sensitive
applications with a high number of short-lived
flows. Long-lived flows are fine.
Need to compare latency with and without NVO to
be sure:
– Distributed NVOs can reduce physical hops. When
using L3+ services, this may end up reducing latency
and physical network traffic.
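Why short-lived flows suffer and long-lived flows do not can be seen with a simple amortization model: the one-time setup cost is spread over every packet in the flow. The figures below (1000 µs setup, 50 µs per-packet forwarding) are made-up placeholders chosen only to show the shape of the effect, not measurements of any NVO, and `mean_packet_latency_us` is a hypothetical helper name.

```python
def mean_packet_latency_us(setup_us: float, forward_us: float, packets: int) -> float:
    """Average per-packet latency when a one-time flow setup cost is spread over a flow."""
    return (setup_us + packets * forward_us) / packets


# Hypothetical placeholder costs, not measured values.
SETUP_US = 1000.0   # first-packet simulation and dataplane programming
FORWARD_US = 50.0   # steady-state forwarding once the flow is installed

for packets in (2, 100, 10_000):
    avg = mean_packet_latency_us(SETUP_US, FORWARD_US, packets)
    print(f"{packets:>6} packets per flow: {avg:.1f} us average per packet")
```

With these placeholder values a two-packet flow averages 550 µs per packet, while a 10,000-packet flow is within a fraction of a microsecond of the steady-state cost.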
23. Software switches good enough?
Software switches are here to stay!
Encapsulation overhead?
NIC vendors (Mellanox, Intel) now offer offloading options
Testing with a Mellanox ConnectX-3 40GbE NIC with
VXLAN offloading can achieve 35+ Gbps
24. Software switches good enough?
Throughput limitations?
It’s the kernel, stupid.
Userland, here we come:
Intel DPDK (Data Plane Development Kit) – dpdk.org
Snabb Switch – github.com/snabbco/snabbswitch
• Written in Lua!
• Claims 60 Gbps through a VM appliance
26. Increasing performance with NVOs
Single-virtual-hop networking reduces physical
network traffic and, in some cases, lowers latency
Massive scale of IPs and MACs
Massive scale of isolated networks
Extremely complex/long rule sets for firewalls –
think thousands per network.