Deeper Dive in Docker Overlay Networks
Laurent Bernaille (@lbernail)
CTO, D2SI
Agenda
Reminder on the Docker Overlay
VXLAN Control Plane options
Using BGP as a dynamic Control Plane
What can we do with this?
Reminder on the Docker overlay
The Docker Overlay network
docker0:~$ docker network create --driver overlay --subnet 192.168.0.0/24 dockercon
d099dcc709daddbc0e143c24e7091bef6b13bdc3abb379473af4582bf1e112b1
docker1:~$ docker network ls
NETWORK ID     NAME        DRIVER    SCOPE
d099dcc709da   dockercon   overlay   global
docker0:~$ docker run -d --ip 192.168.0.100 --net dockercon --name C0 debian sleep infinity
docker1:~$ docker run -it --rm --net dockercon debian
root@950d67e96db7:/# ping 192.168.0.100
PING 192.168.0.100 (192.168.0.100): 56 data bytes
64 bytes from 192.168.0.100: seq=0 ttl=64 time=1.153 ms
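Docker Overlay: Data plane
[Diagram: on docker0 (10.0.0.10), container C0 (eth0, 192.168.0.100) is connected by a veth pair to bridge br0, which also holds the vxlan interface; docker1 (10.0.0.11) has the same layout for container C1 (eth0, 192.168.0.Y). A ping from C1 to C0 leaves through the host's eth0, encapsulated as follows.]
Outer IP: src 10.0.0.11, dst 10.0.0.10
UDP: src X, dst 4789
VXLAN: VNI
Original L2: src 192.168.0.Y, dst 192.168.0.100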
What is VXLAN?
• Tunneling technology over UDP (L2 in UDP)
• Developed for cloud SDN to create multi-tenancy
• Without the need for L2 connectivity
• Without the normal VLAN limit (4096 VLAN IDs)
• Easy to encrypt: IPsec
• Overhead: 50 bytes
• In Linux
• Started with Open vSwitch
• Native with kernel >= 3.7 (>= 3.16 for namespace support)
[Diagram: VXLAN packet layout: outer IP packet / UDP (dst: 4789) / VXLAN header / original L2 frame]
VXLAN: Virtual eXtensible LAN
VNI: VXLAN Network Identifier
VTEP: VXLAN Tunnel Endpoint
[Diagram: hosts docker0 (10.0.0.10) and docker1 (10.0.1.10) in 10.0.0.0/16]
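The 50-byte overhead quoted above can be checked by adding up the encapsulation headers. The sketch below assumes an IPv4 outer header without options and no 802.1Q tag; the function names are made up for the illustration. For the MTU calculation it is the inner Ethernet header that counts against the underlay MTU rather than the outer one, but both are 14 bytes, so the result is the same.

```shell
#!/bin/sh
# Header sizes in bytes (assumptions: IPv4 without options, untagged Ethernet).
ETH_HDR=14
IPV4_HDR=20
UDP_HDR=8
VXLAN_HDR=8

# Total encapsulation overhead added by VXLAN.
vxlan_overhead() {
    echo $((ETH_HDR + IPV4_HDR + UDP_HDR + VXLAN_HDR))
}

# Largest MTU usable inside the overlay for a given underlay MTU.
overlay_mtu() {
    underlay_mtu=$1
    overhead=$(vxlan_overhead)
    echo $((underlay_mtu - overhead))
}

vxlan_overhead      # 50
overlay_mtu 1500    # 1450
```

With a standard 1500-byte underlay this gives 1450, which is the MTU set on the veth pair when containers are attached later in the deck.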
Let's build an overlay "manually"
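Overlay namespaces
[Diagram: on docker0 (10.0.0.10) and docker1 (10.0.1.10), an overlay namespace holds bridge br42 and the vxlan42 interface, which sends its traffic through the host's eth0]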
Creating the overlay namespace (setup_vxlan script)
# create overlay NS
ip netns add overns
# create bridge in NS
ip netns exec overns ip link add dev br42 type bridge
ip netns exec overns ip addr add dev br42 192.168.0.1/24
# create VXLAN interface
ip link add dev vxlan42 type vxlan id 42 proxy dstport 4789
# move it to NS
ip link set vxlan42 netns overns
# add it to bridge
ip netns exec overns ip link set vxlan42 master br42
# bring all interfaces up
ip netns exec overns ip link set vxlan42 up
ip netns exec overns ip link set br42 up
[Diagram: on docker0 (10.0.0.10), container C0 (eth0, 192.168.0.10) is attached by a veth pair to br42, next to vxlan42; on docker1 (10.0.1.10), container C1 (eth0, 192.168.0.20) is attached the same way]
Attach containers (plumb script)
Create containers and attach them
docker0
# Create container without net
docker run -d --net=none --name=demo debian sleep infinity
# Get NS for container
ctn_ns_path=$(docker inspect --format="{{ .NetworkSettings.SandboxKey}}" demo)
ctn_ns=${ctn_ns_path##*/}
# Create veth
ip link add dev veth1 mtu 1450 type veth peer name veth2 mtu 1450
# Send veth1 to overlay NS
ip link set dev veth1 netns overns
# Attach it to overlay bridge
ip netns exec overns ip link set veth1 master br42
ip netns exec overns ip link set veth1 up
# Send veth2 to container
ip link set dev veth2 netns $ctn_ns
# Rename & Configure
ip netns exec $ctn_ns ip link set dev veth2 name eth0 address 02:42:c0:a8:00:10
ip netns exec $ctn_ns ip addr add dev eth0 192.168.0.10/24
ip netns exec $ctn_ns ip link set dev eth0 up
docker1
Same with 192.168.0.20 / 02:42:c0:a8:00:20
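The namespace name is obtained from the container's SandboxKey with a shell parameter expansion. A minimal sketch of what that expansion does, using a hypothetical SandboxKey value (a real one comes from `docker inspect`):

```shell
#!/bin/sh
# Hypothetical SandboxKey value, for illustration only.
ctn_ns_path=/var/run/docker/netns/950d67e96db7

# "##*/" removes the longest prefix ending in "/", leaving the last path component.
ctn_ns=${ctn_ns_path##*/}

echo "$ctn_ns"    # 950d67e96db7
```

Note that `ip netns exec` looks names up under /var/run/netns, so the commands above presumably rely on Docker's namespace directory being visible there; that step is not shown in the slides.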
Does it ping?
docker0:~$ docker exec -it demo ping 192.168.0.20
PING 192.168.0.20 (192.168.0.20): 56 data bytes
92 bytes from 192.168.0.10: Destination Host Unreachable
docker0:~$ sudo ip netns exec overns ip neighbor show
docker0:~$ sudo ip netns exec overns ip neighbor add 192.168.0.20 lladdr 02:42:c0:a8:00:20 dev vxlan42
docker0:~$ sudo ip netns exec overns bridge fdb add 02:42:c0:a8:00:20 dev vxlan42 self dst 10.0.1.10 vni 42 port 4789
docker1: Same with 192.168.0.10, 02:42:c0:a8:00:10 and 10.0.0.10
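Each remote container therefore needs two static entries: a neighbor (ARP) entry mapping its overlay IP to its MAC, and an FDB entry mapping that MAC to the host (VTEP) where it runs. The sketch below only prints the commands for one peer instead of running them; the `peer_entries` helper is hypothetical, while the namespace, interface, VNI and port are the ones used above.

```shell
#!/bin/sh
# Print (do not run) the two static entries needed to reach one remote container.
peer_entries() {
    ip_addr=$1   # overlay IP of the remote container
    mac=$2       # MAC address of the remote container
    vtep=$3      # underlay IP of the host running it
    echo "ip netns exec overns ip neighbor add $ip_addr lladdr $mac dev vxlan42"
    echo "ip netns exec overns bridge fdb add $mac dev vxlan42 self dst $vtep vni 42 port 4789"
}

# Entries to install on docker1 so that it can reach the container on docker0.
peer_entries 192.168.0.10 02:42:c0:a8:00:10 10.0.0.10
```

Maintaining these pairs by hand for every container on every host is exactly the job a control plane takes over.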
Result
[Diagram: C0 (192.168.0.10) on docker0 (10.0.0.10) pings C1 (192.168.0.20) on docker1 (10.0.1.10) through br42 and vxlan42; on each host the static ARP and FDB entries resolve the remote container]
VXLAN Control Plane options
VXLAN Control Plane options - 1: Multicast
Use a multicast group to send traffic for unknown L3/L2 addresses
PROS: simple and efficient
CONS: Multicast connectivity not always available (on public clouds for instance)
[Diagram: three vxlan endpoints joined to a multicast group (239.x.x.x)]
ARP: Who has 192.168.0.2?
L2 discovery: where is 02:42:c0:a8:00:02?
VXLAN Control Plane options - 2: Point-to-point
Configure a remote IP address where to send traffic for unknown addresses
PROS: simple, no need for multicast, very good for two hosts
CONS: difficult to manage with more than 2 hosts
[Diagram: two vxlan endpoints, each configured with the other as remote IP; everything is sent to the remote IP]
VXLAN Control Plane options - 3: User-Land
Do nothing, provide ARP / FDB information from outside
PROS: very flexible
CONS: requires a daemon and a centralized database of addresses
[Diagram: manual mode: next to each vxlan endpoint, a daemon modifies the ARP/FDB tables, using consul/swarm as the address database]
ARP: Do you know 192.168.0.2?
L2: where is 02:42:c0:a8:00:02?
Docker Overlay control plane (3: User-land)
[Diagram: same data plane as in the reminder, with C0 (192.168.0.100) on docker0 and C1 (192.168.0.Y) on docker1 (hosts shown as 10.0.0.10 and 10.0.1.10). On each host, dockerd populates the ARP and FDB entries and exchanges them with the other hosts through Serf / Gossip; a NAT path is also shown for traffic leaving the overlay.]
Packet: IP (src 10.0.0.11, dst 10.0.0.10) / UDP (src X, dst 4789) / VXLAN (VNI) / Original L2 (src 192.168.0.Y, dst 192.168.0.100)
"Deep Dive in Docker Overlay Networks", Dockercon Austin 2017
Slides, video, and blog posts
That was a lot of information
Using BGP as a dynamic control plane
VXLAN Control Plane - Option 4: BGP-EVPN
Rely on BGP eVPN address family to distribute L2 and L3 data
PROS: BGP is a standard to distribute addresses, supported by SDN vendors
CONS: limited Linux implementations, requires some BGP knowledge
[Diagram: a bgpd daemon runs next to each vxlan endpoint; endpoint data is distributed with BGP]
BGP in one slide
● Routing Protocol between network entities ("Autonomous Systems", AS)
Google ASN: 15169 / Amazon ASN: 16509
(both actually have more than one)
● BGP is an EGP: Exterior Gateway Protocol
IGP: Interior Gateway Protocol (OSPF, EIGRP, IS-IS)
IGP: next hop is the IP of a router
BGP: next hop is an Autonomous System
● BGP is what makes the Internet work
● BGP scales very well
500 000+ prefixes for a full Internet table
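A quick BGP example
[Diagram: five Autonomous Systems (AS 1 to AS 5) linked by eBGP sessions, with iBGP sessions inside an AS. AS 1 originates 20.0.0.0/16 and each AS prepends itself when it propagates the route.]
Paths seen for the prefix:
20.0.0.0/16: AS1
20.0.0.0/16: AS4-AS1
20.0.0.0/16: AS2-AS1
20.0.0.0/16: AS5-AS4-AS1
Shortest PATH?
AS: Autonomous System
eBGP: external (different AS)
iBGP: internal (same AS)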
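The "Shortest PATH?" question can be illustrated by comparing AS path lengths. This is only a sketch of that single criterion: real BGP best-path selection evaluates several other attributes as well, and the tie-break used here (first candidate wins) is an assumption of the example, not BGP behavior. Both function names are hypothetical.

```shell
#!/bin/sh
# Number of ASes in a dash-separated path such as "AS5-AS4-AS1".
as_path_len() {
    printf '%s\n' "$1" | tr '-' '\n' | wc -l | tr -d ' '
}

# Print the candidate path with the fewest ASes (first one wins on a tie).
shortest_as_path() {
    best=""
    best_len=0
    for path in "$@"; do
        len=$(as_path_len "$path")
        if [ -z "$best" ] || [ "$len" -lt "$best_len" ]; then
            best=$path
            best_len=$len
        fi
    done
    echo "$best"
}

# An AS hearing the prefix over two different paths prefers the shorter one.
shortest_as_path AS5-AS4-AS1 AS2-AS1    # AS2-AS1
```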
iBGP
Distribute BGP information within an Autonomous System
iBGP requires a full mesh between all peers
n peers => n * (n-1) / 2 connections
50 peers => 1225 (49 on each host)
Route reflectors (RR) simulate the mesh
More scalable and simpler
Possible to have more than one RR
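The session counts can be reproduced with a little arithmetic. The full-mesh formula is the one from the slide; the route-reflector count is a simplified model that assumes every client keeps one session to each reflector and ignores sessions between the reflectors themselves. Function names are made up for the example.

```shell
#!/bin/sh
# Sessions needed for a full iBGP mesh between n peers.
full_mesh_sessions() {
    n=$1
    echo $((n * (n - 1) / 2))
}

# Sessions needed when each of n clients peers with each of r route reflectors
# (assumption: reflector-to-reflector sessions are not counted).
rr_sessions() {
    n=$1
    r=$2
    echo $((n * r))
}

full_mesh_sessions 50   # 1225
rr_sessions 50 2        # 100
```

With two reflectors, each host maintains 2 sessions instead of 49.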
BGP EVPN
● Part of MP-BGP (multi-protocol BGP: not only IP prefixes)
● Announce VXLAN information instead of IP prefixes
L3: IP addresses of VXLAN endpoints (VTEP)
L2: Location of MAC addresses
● BUM (Broadcast, Unknown unicast, Multicast) traffic unicasted to all VTEPs
● Get the scalability of BGP
Environment
[Diagram: network 10.0.0.0/16 with two route reflectors, RR1 (quagga-rr, 10.0.0.5) and RR2 (quagga-rr, 10.0.1.5), and two Docker hosts, docker0 (10.0.0.10) and docker1 (10.0.1.10)]
# privileged so that Quagga can modify routing/forwarding
docker0:~$ docker run -t -d --privileged --name quagga -p 179:179 --hostname docker0 \
  -v $(pwd)/quagga:/etc/quagga cumulusnetworks/quagga
BGP configuration on Docker0
router bgp 65000
  bgp router-id 10.0.0.10
  no bgp default ipv4-unicast
  neighbor reflectors peer-group
  neighbor reflectors remote-as 65000
  neighbor reflectors capability extended-nexthop
  neighbor 10.0.0.5 peer-group reflectors
  neighbor 10.0.1.5 peer-group reflectors
  address-family evpn
    neighbor reflectors activate
    advertise-all-vni
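Since every Docker host peers with the same two reflectors, the client configuration can be templated. The sketch below assumes that only the router-id differs between hosts, which the slides do not state explicitly; `render_client_conf` is a hypothetical helper that simply prints the configuration shown above with a different router-id.

```shell
#!/bin/sh
# Render the client-side BGP configuration for one Docker host.
# Assumption: only the router-id changes from host to host.
render_client_conf() {
    router_id=$1
    cat <<EOF
router bgp 65000
  bgp router-id $router_id
  no bgp default ipv4-unicast
  neighbor reflectors peer-group
  neighbor reflectors remote-as 65000
  neighbor reflectors capability extended-nexthop
  neighbor 10.0.0.5 peer-group reflectors
  neighbor 10.0.1.5 peer-group reflectors
  address-family evpn
    neighbor reflectors activate
    advertise-all-vni
EOF
}

# Configuration for docker1.
render_client_conf 10.0.1.10
```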
BGP configuration on Route Reflectors
router bgp 65000
  bgp router-id 10.0.0.5
  bgp cluster-id 111.111.111.111
  no bgp default ipv4-unicast
  neighbor docker peer-group
  neighbor docker remote-as 65000
  bgp listen range 10.0.0.0/16 peer-group docker
  address-family evpn
    neighbor docker activate
    neighbor docker route-reflector-client
Creating our BGP clients on Docker hosts
What we have so far
[Diagram: 10.0.0.0/16 with RR1 (quagga-rr, 10.0.0.5) and RR2 (quagga-rr, 10.0.1.5); docker0 (10.0.0.10) and docker1 (10.0.1.10) each run a quagga container attached to the host's eth0]
Let's look at the BGP data
docker0:~$ docker exec -it quagga vtysh
docker0# show run
docker0# show bgp neighbors
docker0# show bgp evpn summary
BGP router identifier 10.0.0.10, local AS number 65000 vrf-id 0
Peers 2, using 42 KiB of memory
Neighbor           V  AS     MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd
quagga0(10.0.0.5)  4  65000  42       43       0       0    0     00:02:01  0
quagga1(10.0.1.5)  4  65000  42       43       0       0    0     00:02:01  0
docker0# show bgp evpn route
No EVPN prefixes exist
Configuring VXLAN interfaces
# nolearning: only learn through EVPN
sudo ./setup_vxlan 42 container:quagga dstport 4789 nolearning
[Diagram: same environment; on docker0 and docker1, br42 and vxlan42 now sit in the quagga container's namespace, next to its eth0]
Let's look at the BGP data
docker0:~$ docker exec -it quagga vtysh
docker0# show bgp evpn route
BGP table version is 0, local router ID is 10.0.0.10
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 10.0.0.10:1
*> [3]:[0]:[32]:[10.0.0.10]
                    10.0.0.10                          32768 i
Route Distinguisher: 10.0.1.10:1
*>i[3]:[0]:[32]:[10.0.1.10]
                    10.0.1.10                0    100      0 i
docker0# show evpn mac vni all
Let's add containers and try pinging
[Diagram: same environment; a demo container is attached to br42 on each host: 192.168.0.10 on docker0, 192.168.0.20 on docker1]
docker0:~$ sudo ./plumb br42@quagga demo 192.168.0.10/24@192.168.0.1 02:42:c0:a8:00:10
docker1:~$ sudo ./plumb br42@quagga demo 192.168.0.20/24@192.168.0.1 02:42:c0:a8:00:20
What about BGP?
docker0:~$ docker exec -it quagga vtysh
docker0# show bgp evpn route
BGP table version is 0, local router ID is 10.0.0.10
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
Route Distinguisher: 10.0.1.10:1
*>i[2]:[0]:[0]:[48]:[02:42:c0:a8:00:20]
                    10.0.1.10                0    100      0 i
* i[3]:[0]:[32]:[10.0.1.10]
                    10.0.1.10                0    100      0 i
docker0# show evpn mac vni all
VNI 42 #MACs (local and remote) 2
MAC                Type    Intf/Remote VTEP  VLAN
02:42:c0:a8:00:10  local   veth0pldemo
02:42:c0:a8:00:20  remote  10.0.1.10
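The type-2 routes carry the MAC addresses that end up in the MAC table. As a reading aid for the prefix format `[2]:[ESI]:[EthTag]:[MAClen]:[MAC]`, here is a small sketch that pulls the MAC out of such a prefix; `evpn_type2_mac` is a hypothetical helper, not a Quagga command.

```shell
#!/bin/sh
# Extract the MAC address from an EVPN type-2 prefix printed as
# [2]:[ESI]:[EthTag]:[MAClen]:[MAC]; prints nothing for other route types.
evpn_type2_mac() {
    printf '%s\n' "$1" |
        sed -n 's/^\[2]:\[[^]]*]:\[[^]]*]:\[[^]]*]:\[\([0-9a-fA-F:]*\)]$/\1/p'
}

evpn_type2_mac "[2]:[0]:[0]:[48]:[02:42:c0:a8:00:20]"    # 02:42:c0:a8:00:20
evpn_type2_mac "[3]:[0]:[32]:[10.0.1.10]"                # (no output: type-3 route)
```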
Overview
[Diagram: control plane: the quagga containers on docker0 (10.0.0.10) and docker1 (10.0.1.10) peer with RR1 (10.0.0.5) and RR2 (10.0.1.5). Data plane: the demo containers (192.168.0.10 and 192.168.0.20) exchange traffic through br42 and vxlan42, directly between the two hosts.]
What's interesting about this setup?
● Standard VXLAN address distribution (used on many routers)
● Full management of BUM traffic
ARP queries
Broadcasts (DHCP)
Multicast (Discovery, keepalived)
● BUM traffic is unicasted (not efficient)
Possible optimizations: ARP suppression (Cumulus Quagga)
What can we do with this?
What if we want a second Overlay?
[Diagram: same environment; on each host a second bridge br66 with vxlan66 is added in the quagga namespace, with a demo66 container attached: 192.168.66.10 on docker0, 192.168.66.20 on docker1]
docker0:~$ sudo ./setup_vxlan 66 container:quagga dstport 4789 nolearning
docker0:~$ docker run -d --net=none --name=demo66 debian sleep infinity
docker0:~$ sudo ./plumb br66@quagga demo66 192.168.66.10/24 02:42:c0:a8:66:10
What about BGP?
docker0:~$ docker exec -it quagga vtysh
docker0# show evpn vni
Number of VNIs: 2
VNI  VxLAN IF  VTEP IP  # MACs  # ARPs  # Remote VTEPs
42   vxlan42   0.0.0.0  2       0       1
66   vxlan66   0.0.0.0  2       0       1
docker0# show evpn mac vni all
VNI 42 #MACs (local and remote) 2
MAC                Type    Intf/Remote VTEP  VLAN
02:42:c0:a8:00:10  local   veth0pldemo
02:42:c0:a8:00:20  remote  10.0.1.10
VNI 66 #MACs (local and remote) 2
MAC                Type    Intf/Remote VTEP  VLAN
02:42:c0:a8:66:10  local   veth0pldemo66
02:42:c0:a8:66:20  remote  10.0.1.10
Taking advantage of broadcast: DHCP
[Diagram: same environment; on docker0 a dhcp container (192.168.0.254) joins demo (192.168.0.10) on br42; on docker1 a demodhcp container joins demo (192.168.0.20) on br42, with its address still to be assigned (shown as "192.168.0.10?")]
Configuring DHCP
docker0:~$ docker run -d --net=none --name dhcp -v "$(pwd)/dhcp":/data networkboot/dhcpd eth0
docker0:~$ sudo ./plumb br42@quagga dhcp 192.168.0.254/24
docker1:~$ docker run -d --net=none --name=demodhcp debian sleep infinity
docker1:~$ sudo ./plumb br42@quagga demodhcp dhcp
docker1:~$ docker exec -it demodhcp ping 192.168.0.10
PING 192.168.0.10 (192.168.0.10): 56 data bytes
64 bytes from 192.168.0.10: icmp_seq=0 ttl=47 time=1.566 ms
DHCP configuration
subnet 192.168.0.0 netmask 255.255.255.0 {
  range 192.168.0.100 192.168.0.200;
  option routers 192.168.0.1;
  option domain-name-servers 8.8.8.8;
}
Getting out of our Docker environment
[Diagram: same environment with dhcp (192.168.0.254), the demo containers (192.168.0.10 and 192.168.0.20) and the DHCP client (192.168.0.100). A new host, gateway0 (10.0.0.20), runs quagga, br42 and vxlan42 directly on the host, with interface vethgw (192.168.0.1) connected to the bridge.]
Getting out of our Docker environment
gateway0:~$ ./setup_vxlan 42 host dstport 4789 nolearning
gateway0:~$ ip link add dev vethbr type veth peer name vethgw
gateway0:~$ ip link set vethbr master br42
gateway0:~$ ip addr add 192.168.0.1/24 dev vethgw
gateway0:~$ ping 192.168.0.10
PING 192.168.0.10 (192.168.0.10): 56 data bytes
64 bytes from 192.168.0.10: icmp_seq=0 ttl=47 time=0.866 ms
[Diagram: on gateway0, vxlan42 and vethbr are attached to br42; vethbr is paired with vethgw (192.168.0.1)]
Getting out of VXLAN / Quagga
[Diagram: same environment; gateway0 (10.0.0.20) routes between 10.0.0.0/16 and 192.168.0.0/24 (10.0.0.0/16 <=> 192.168.0.0/24) through its eth0, so that a non-VXLAN host (10.0.0.30) can reach the overlay, and NATs traffic going to other destinations]
Getting out of VXLAN / Quagga
gateway0:~$ echo 1 | sudo tee /proc/sys/net/ipv4/ip_forward
gateway0:~$ iptables -t nat -A POSTROUTING ! -d 10.0.0.0/16 -s 192.168.0.0/24 -o eth0 -j MASQUERADE
docker1:~$ docker exec -it demodhcp ping 192.168.0.1 <= Local (VXLAN)
docker1:~$ docker exec -it demodhcp ping 10.0.0.30 <= Routed
docker1:~$ docker exec -it demodhcp ping 8.8.8.8 <= NATed
simple1:~$ ping 192.168.0.1
simple1:~$ ping 192.168.0.10
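The three pings above exercise three different paths. The sketch below classifies a destination the way the gateway setup treats it, based on the two subnets used in this deck and the MASQUERADE rule (`! -d 10.0.0.0/16 -s 192.168.0.0/24`); the helper names are hypothetical and the address parsing assumes plain dotted-decimal IPv4 without leading zeros.

```shell
#!/bin/sh
# Convert a dotted IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 belongs to network $2 with prefix length $3.
in_subnet() {
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# How traffic from an overlay container (192.168.0.0/24) to $1 is handled.
classify_destination() {
    if in_subnet "$1" 192.168.0.0 24; then
        echo "local (VXLAN)"
    elif in_subnet "$1" 10.0.0.0 16; then
        echo "routed"
    else
        echo "NATed"
    fi
}

classify_destination 192.168.0.1   # local (VXLAN)
classify_destination 10.0.0.30     # routed
classify_destination 8.8.8.8       # NATed
```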
Another nice thing we can do
[Diagram: same environment, with gateway0 (10.0.0.20) routing and NATing and the non-VXLAN host (10.0.0.30). In addition, a QEMU virtual machine is attached to the overlay through tap0 and gets an address in the 192.168.0.10x range with dhclient.]
What could a real-life setup look like?
[Diagram: eight Docker hosts, each running quagga, exchange VXLAN traffic and peer with two route reflectors (RR1, RR2). A BGP/EVPN router connects them to four standard hosts through regular routing, exchanging routes from the non-VXLAN infrastructure and routes to the VXLAN networks.]
How does it compare to other solutions?
Solution              Data plane           Control plane
Swarm Classic         VXLAN                External KV Store (Consul / Etcd)
SwarmKit              VXLAN                Swarmkit (Raft / Gossip implementation)
Flannel host-gw       Routing              Etcd / Kubernetes API
Flannel VXLAN         VXLAN                Etcd / Kubernetes API
Calico                Routing / IPIP       Etcd / BGP (IP prefixes)
Weave Classic         Custom               Custom
Weave Fast Datapath   VXLAN                Custom
Contiv                VXLAN, Routing, L2   Etcd / BGP (IP and maybe eVPN)
Disclaimer: almost no experience with any (from documentation and discussions mostly)
Perspectives
● FRRouting
Quagga fork
Cumulus has switched to FRRouting and merged EVPN support
● Open vSwitch
Alternative to Linux native bridge and VXLAN
(Possibly) better performance and more features
Not sure how Quagga/FRRouting would integrate with Open vSwitch
● Performance
Measure impact of VXLAN
Test VXLAN acceleration when available on NICs
● CNI plugin (to test on Kubernetes and mostly for learning)
Thank you!
Questions?
https://github.com/lbernail/dockercon2017
@lbernail

Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 

Deeper Dive in Docker Overlay Networks

  • 1. Deeper Dive in Docker Overlay Networks
      Laurent Bernaille (@lbernail), CTO, D2SI
  • 2. Agenda
      • Reminder on the Docker Overlay
      • VXLAN Control Plane options
      • Using BGP as a dynamic Control Plane
      • What can we do with this?
  • 3. Reminder on the Docker overlay
  • 4. The Docker Overlay network
      docker0:~$ docker network create --driver overlay --subnet 192.168.0.0/24 dockercon
      d099dcc709daddbc0e143c24e7091bef6b13bdc3abb379473af4582bf1e112b1
      docker1:~$ docker network ls
      NETWORK ID     NAME        DRIVER    SCOPE
      d099dcc709da   dockercon   overlay   global
      docker0:~$ docker run -d --ip 192.168.0.100 --net dockercon --name C0 debian sleep infinity
      docker1:~$ docker run -it --rm --net dockercon debian
      root@950d67e96db7:/# ping 192.168.0.100
      PING 192.168.0.100 (192.168.0.100): 56 data bytes
      64 bytes from 192.168.0.100: seq=0 ttl=64 time=1.153 ms
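  • 5. Docker Overlay: Data plane
      [Diagram: on docker0 (10.0.0.10), container C0 (eth0, 192.168.0.100) is connected through a veth pair to a namespace holding bridge br0 and a vxlan interface. docker1 (10.0.0.11) has the same layout for container C1 (eth0, 192.168.0.Y), which sends the ping.]
      Encapsulated packet:
        IP: src 10.0.0.11, dst 10.0.0.10
        UDP: src X, dst 4789
        VXLAN: VNI
        Original L2: src 192.168.0.Y, dst 192.168.0.100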
  • 6. What is VXLAN?
      • Tunneling technology over UDP (L2 in UDP)
      • Developed for cloud SDN to create multi-tenancy
        • Without the need for L2 connectivity
        • Without the normal VLAN limit (4096 VLAN IDs)
      • Easy to encrypt: IPsec
      • Overhead: 50 bytes
      • In Linux
        • Started with Open vSwitch
        • Native with kernel >= 3.7, and >= 3.16 for namespace support
      Packet layout: Outer IP packet | UDP (dst: 4789) | VXLAN header | Original L2
      VXLAN: Virtual eXtensible LAN
      VNI: VXLAN Network Identifier
      VTEP: VXLAN Tunnel Endpoint
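The 50-byte overhead can be checked with a little arithmetic. This is a minimal sketch that assumes an IPv4 underlay without IP options or VLAN tags and a 1500-byte underlay MTU; neither assumption is stated on the slide, and the function name is made up for illustration. The result matches the MTU of 1450 that the talk gives to its veth pairs.

```shell
#!/bin/sh
# Header sizes in bytes for VXLAN over IPv4
# (assumption: no VLAN tag, no IP options).
INNER_ETH=14    # Ethernet header of the encapsulated frame
VXLAN_HDR=8
UDP_HDR=8
OUTER_IPV4=20

# Largest inner IP packet that fits in one underlay packet of the given MTU.
vxlan_inner_mtu() {
    underlay_mtu=$1
    echo $(( underlay_mtu - OUTER_IPV4 - UDP_HDR - VXLAN_HDR - INNER_ETH ))
}

vxlan_inner_mtu 1500    # prints 1450
```

With jumbo frames on the underlay the same function gives the larger inner MTU, for example `vxlan_inner_mtu 9000`.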
  • 9. Creating the overlay namespace (setup_vxlan script)
      ip netns add overns                                            # create overlay NS
      ip netns exec overns ip link add dev br42 type bridge          # create bridge in NS
      ip netns exec overns ip addr add dev br42 192.168.0.1/24
      ip link add dev vxlan42 type vxlan id 42 proxy dstport 4789    # create VXLAN interface
      ip link set vxlan42 netns overns                               # move it to NS
      ip netns exec overns ip link set vxlan42 master br42           # add it to bridge
      ip netns exec overns ip link set vxlan42 up                    # bring all interfaces up
      ip netns exec overns ip link set br42 up
  • 11. Create containers and attach them (plumb script)
      On docker0:
      docker run -d --net=none --name=demo debian sleep infinity     # create container without net
      ctn_ns_path=$(docker inspect --format="{{ .NetworkSettings.SandboxKey}}" demo)   # get NS for container
      ctn_ns=${ctn_ns_path##*/}
      ip link add dev veth1 mtu 1450 type veth peer name veth2 mtu 1450   # create veth
      ip link set dev veth1 netns overns                             # send veth1 to overlay NS
      ip netns exec overns ip link set veth1 master br42             # attach it to overlay bridge
      ip netns exec overns ip link set veth1 up
      ip link set dev veth2 netns $ctn_ns                            # send veth2 to container
      ip netns exec $ctn_ns ip link set dev veth2 name eth0 address 02:42:c0:a8:00:10   # rename & configure
      ip netns exec $ctn_ns ip addr add dev eth0 192.168.0.10/24
      ip netns exec $ctn_ns ip link set dev eth0 up
      On docker1: same with 192.168.0.20 / 02:42:c0:a8:00:20
  • 12. Does it ping?
      docker0:~$ docker exec -it demo ping 192.168.0.20
      PING 192.168.0.20 (192.168.0.20): 56 data bytes
      92 bytes from 192.168.0.10: Destination Host Unreachable
      docker0:~$ sudo ip netns exec overns ip neighbor show
      docker0:~$ sudo ip netns exec overns ip neighbor add 192.168.0.20 lladdr 02:42:c0:a8:00:20 dev vxlan42
      docker0:~$ sudo ip netns exec overns bridge fdb add 02:42:c0:a8:00:20 dev vxlan42 self dst 10.0.1.10 vni 42 port 4789
      docker1: same with 192.168.0.10, 02:42:c0:a8:00:10 and 10.0.0.10
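Each remote container needs two static entries: an ARP entry mapping its overlay IP to its MAC, and an FDB entry mapping that MAC to the host (VTEP) where it lives. The sketch below only prints the two commands for one remote endpoint instead of running them, so it needs no root access. `static_entries` is a hypothetical helper, not one of the talk's scripts, and its default namespace, device and VNI are simply the values used on the slides.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# static_entries is a hypothetical helper, not part of the talk's scripts.
# Usage: static_entries OVERLAY_IP MAC VTEP_IP [NETNS] [DEVICE] [VNI]
static_entries() {
    ip_addr=$1; mac=$2; vtep=$3
    netns=${4:-overns}; dev=${5:-vxlan42}; vni=${6:-42}
    # ARP entry: overlay IP -> container MAC
    echo "ip netns exec $netns ip neighbor add $ip_addr lladdr $mac dev $dev"
    # FDB entry: container MAC -> host (VTEP) that owns it
    echo "ip netns exec $netns bridge fdb add $mac dev $dev self dst $vtep vni $vni port 4789"
}

# Entries docker1 needs in order to reach the container on docker0.
static_entries 192.168.0.10 02:42:c0:a8:00:10 10.0.0.10
```

Maintaining such entries by hand for every container is what the control plane options that follow automate.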
  • 15. VXLAN Control Plane options - 1: Multicast
      Use a multicast group (239.x.x.x) to send traffic for unknown L3/L2 addresses
      ARP: who has 192.168.0.2?
      L2 discovery: where is 02:42:c0:a8:00:02?
      PROS: simple and efficient
      CONS: multicast connectivity is not always available (on public clouds, for instance)
  • 16. VXLAN Control Plane options - 2: Point-to-point
      Configure a remote IP address to which traffic for unknown addresses is sent (everything goes to the remote IP)
      PROS: simple, no need for multicast, very good for two hosts
      CONS: difficult to manage with more than 2 hosts
  • 17. VXLAN Control Plane options - 3: User-land
      Do nothing, provide ARP / FDB information from outside (manually, or with a daemon modifying ARP/FDB)
      ARP: do you know 192.168.0.2?
      L2: where is 02:42:c0:a8:00:02?
      PROS: very flexible
      CONS: requires a daemon and a centralized database of addresses
  • 18. Docker Overlay control plane (3: User-land)
      [Diagram: the data plane of slide 5, with docker0 at 10.0.0.10 and docker1 at 10.0.1.10. On each host, dockerd fills in the ARP and FDB entries of the overlay namespace. The daemons share endpoint data through consul/swarm and Serf / Gossip. Each host also has a NAT path through eth0.]
      Encapsulated packet:
        IP: src 10.0.0.11, dst 10.0.0.10
        UDP: src X, dst 4789
        VXLAN: VNI
        Original L2: src 192.168.0.Y, dst 192.168.0.100
  • 19. That was a lot of information
      "Deep Dive in Docker Overlay Networks", Dockercon Austin 2017: slides, video, blog posts
  • 20. Using BGP as a dynamic control plane
  • 21. VXLAN Control Plane - Option 4: BGP-EVPN
      Rely on the BGP EVPN address family to distribute L2 and L3 data (a bgpd runs next to each vxlan interface; endpoint data is distributed with BGP)
      PROS: BGP is a standard way to distribute addresses, supported by SDN vendors
      CONS: limited Linux implementations, requires some BGP knowledge
  • 22. BGP in one slide
      ● Routing protocol between network entities ("Autonomous Systems", AS)
        Google ASN: 15169 / Amazon ASN: 16509 (both actually have more than one)
      ● BGP is an EGP: Exterior Gateway Protocol
        IGP: Interior Gateway Protocol (OSPF, EIGRP, IS-IS)
        IGP: next hop is the IP of a router
        BGP: next hop is an Autonomous System
      ● BGP is what makes the Internet work
      ● BGP scales very well
        500 000+ prefixes for a full Internet table
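  • 23. A quick BGP example
      [Diagram: five autonomous systems (AS 1 to AS 5) connected by eBGP, with iBGP inside an AS. 20.0.0.0/16 originates in AS 1 and is announced with growing AS paths: "AS1", then "AS2-AS1" and "AS4-AS1", then "AS5-AS4-AS1". Where several paths are received, the question is which one is the shortest.]
      AS: Autonomous System
      eBGP: external (different AS)
      iBGP: internal (same AS)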
  • 24. iBGP
      Distributes BGP information within an Autonomous System
      iBGP requires a full mesh between all peers
        n peers => n * (n-1) / 2 connections
        50 peers => 1225 (49 on each host)
      Route reflectors (RR) simulate the mesh
        More scalable and simpler
        Possible to have more than one RR
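The session counts are easy to reproduce. In this small sketch the function names are illustrative, and the route-reflector count assumes that every client peers with every reflector and that sessions between the reflectors themselves are left out.

```shell
#!/bin/sh
# Sessions needed for a full iBGP mesh between n peers.
full_mesh_sessions() {
    n=$1
    echo $(( n * (n - 1) / 2 ))
}

# Sessions needed when each of n clients peers only with r route reflectors
# (assumption: sessions between the reflectors themselves are not counted).
reflector_sessions() {
    n=$1; r=$2
    echo $(( n * r ))
}

full_mesh_sessions 50      # prints 1225, as on the slide
reflector_sessions 50 2    # prints 100
```

With two reflectors, each host keeps 2 sessions instead of 49.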
  • 25. BGP EVPN
      ● Part of MP-BGP (multi-protocol BGP: not only IP prefixes)
      ● Announce VXLAN information instead of IP prefixes
        L3: IP addresses of VXLAN endpoints (VTEP)
        L2: location of MAC addresses
      ● BUM (Broadcast, Unknown unicast, Multicast) traffic is unicasted to all VTEPs
      ● Get the scalability of BGP
  • 27. Creating our BGP clients on Docker hosts
      docker0:~$ docker run -t -d --privileged --name quagga -p 179:179 --hostname docker0 -v $(pwd)/quagga:/etc/quagga cumulusnetworks/quagga
      (modify routing/forwarding)
      BGP configuration on docker0:
        router bgp 65000
          bgp router-id 10.0.0.10
          no bgp default ipv4-unicast
          neighbor reflectors peer-group
          neighbor reflectors remote-as 65000
          neighbor reflectors capability extended-nexthop
          neighbor 10.0.0.5 peer-group reflectors
          neighbor 10.0.1.5 peer-group reflectors
          address-family evpn
            neighbor reflectors activate
            advertise-all-vni
      BGP configuration on route reflectors:
        router bgp 65000
          bgp router-id 10.0.0.5
          bgp cluster-id 111.111.111.111
          no bgp default ipv4-unicast
          neighbor docker peer-group
          neighbor docker remote-as 65000
          bgp listen range 10.0.0.0/16 peer-group docker
          address-family evpn
            neighbor docker activate
            neighbor docker route-reflector-client
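Only the router-id differs between Docker hosts, so the client configuration lends itself to templating. The sketch below prints a client configuration for one host, using only the statements shown on the slide. `client_bgp_config` is a hypothetical helper, and how the output would be written into the mounted /etc/quagga directory is left open.

```shell
#!/bin/sh
# Hypothetical helper: prints a client bgpd configuration for one Docker host.
# Only statements that appear on the slide are emitted.
# Usage: client_bgp_config ROUTER_ID REFLECTOR_IP...
client_bgp_config() {
    router_id=$1
    shift
    echo "router bgp 65000"
    echo " bgp router-id $router_id"
    echo " no bgp default ipv4-unicast"
    echo " neighbor reflectors peer-group"
    echo " neighbor reflectors remote-as 65000"
    echo " neighbor reflectors capability extended-nexthop"
    # One neighbor statement per route reflector.
    for rr in "$@"; do
        echo " neighbor $rr peer-group reflectors"
    done
    echo " address-family evpn"
    echo "  neighbor reflectors activate"
    echo "  advertise-all-vni"
}

# Configuration for docker1, peering with both route reflectors.
client_bgp_config 10.0.1.10 10.0.0.5 10.0.1.5
```

The reflectors need no per-host entries at all, because `bgp listen range` accepts any peer from 10.0.0.0/16.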
  • 28. What we have so far
      [Diagram: underlay network 10.0.0.0/16. Route reflectors RR1 (10.0.0.5) and RR2 (10.0.1.5) each run a quagga-rr container. docker0 (10.0.0.10) and docker1 (10.0.1.10) each run a quagga container using the host's eth0.]
  • 29. Let's look at the BGP data
      docker0:~$ docker exec -it quagga vtysh
      docker0# show run
      docker0# show bgp neighbors
      docker0# show bgp evpn summary
      BGP router identifier 10.0.0.10, local AS number 65000 vrf-id 0
      Peers 2, using 42 KiB of memory
      Neighbor           V  AS     MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd
      quagga0(10.0.0.5)  4  65000  42       43       0       0    0     00:02:01  0
      quagga1(10.0.1.5)  4  65000  42       43       0       0    0     00:02:01  0
      docker0# show bgp evpn route
      No EVPN prefixes exist
  • 30. Configuring VXLAN interfaces
      sudo ./setup_vxlan 42 container:quagga dstport 4789 nolearning
      (nolearning: only learn through EVPN)
      [Diagram: as on slide 28, with bridge br42 and interface vxlan42 added on docker0 and docker1.]
  • 31. Let's look at the BGP data
      docker0:~$ docker exec -it quagga vtysh
      docker0# show bgp evpn route
      BGP table version is 0, local router ID is 10.0.0.10
      EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
      EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
         Network                     Next Hop    Metric  LocPrf  Weight  Path
      Route Distinguisher: 10.0.0.10:1
      *> [3]:[0]:[32]:[10.0.0.10]    10.0.0.10                   32768   i
      Route Distinguisher: 10.0.1.10:1
      *>i[3]:[0]:[32]:[10.0.1.10]    10.0.1.10   0       100     0       i
      docker0# show evpn mac vni all
  • 32. Let's add containers and try pinging
      docker0:~$ sudo ./plumb br42@quagga demo 192.168.0.10/24@192.168.0.1 02:42:c0:a8:00:10
      docker1:~$ sudo ./plumb br42@quagga demo 192.168.0.20/24@192.168.0.1 02:42:c0:a8:00:20
      [Diagram: as on slide 30, with a demo container attached to br42 on each host: 192.168.0.10 on docker0, 192.168.0.20 on docker1.]
  • 33. What about BGP?
      docker0:~$ docker exec -it quagga vtysh
      docker0# show bgp evpn route
      BGP table version is 0, local router ID is 10.0.0.10
      Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
      Origin codes: i - IGP, e - EGP, ? - incomplete
      EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
      EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
      Route Distinguisher: 10.0.1.10:1
      *>i[2]:[0]:[0]:[48]:[02:42:c0:a8:00:20]  10.0.1.10  0  100  0  i
      * i[3]:[0]:[32]:[10.0.1.10]              10.0.1.10  0  100  0  i
      docker0# show evpn mac vni all
      VNI 42 #MACs (local and remote) 2
      MAC                Type    Intf/Remote VTEP  VLAN
      02:42:c0:a8:00:10  local   veth0pldemo
      02:42:c0:a8:00:20  remote  10.0.1.10
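A type-2 route carries exactly what the static FDB entry of slide 12 encoded by hand: a MAC address and the VTEP (next hop) behind which it lives. The sketch below pulls that pair out of one line of `show bgp evpn route` output. It assumes the line format shown above; other Quagga or FRRouting versions may print the prefix differently, and `parse_type2` is a made-up name.

```shell
#!/bin/sh
# Prints "MAC VTEP" for an EVPN type-2 route line and nothing for other lines.
# Assumes the format [2]:[ESI]:[EthTag]:[MAClen]:[MAC] followed by the next hop.
parse_type2() {
    printf '%s\n' "$1" | sed -n \
        's/^[^[]*\[2\]:\[[^]]*\]:\[[^]]*\]:\[[^]]*\]:\[\([0-9a-fA-F:]*\)\] *\([0-9.]*\).*/\1 \2/p'
}

parse_type2 '*>i[2]:[0]:[0]:[48]:[02:42:c0:a8:00:20] 10.0.1.10 0 100 0 i'
# prints: 02:42:c0:a8:00:20 10.0.1.10
```

The pair matches the "remote" row that `show evpn mac vni all` reports for VNI 42.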
  • 34. Overview
      [Diagram: as on slide 32. Control plane: the quagga containers on docker0 and docker1 peer with the route reflectors RR1 (10.0.0.5) and RR2 (10.0.1.5). Data plane: VXLAN traffic flows between vxlan42 on docker0 (10.0.0.10) and vxlan42 on docker1 (10.0.1.10).]
  • 35. What's interesting about this setup?
      ● Standard VXLAN address distribution (used on many routers)
      ● Full management of BUM traffic
        ARP queries
        Broadcasts (DHCP)
        Multicast (discovery, keepalived)
      ● BUM traffic is unicasted (not efficient)
        Possible optimizations: ARP suppression (Cumulus Quagga)
  • 36. What can we do with this?
  • 37. What if we want a second Overlay?
      docker0:~$ sudo ./setup_vxlan 66 container:quagga dstport 4789 nolearning
      docker0:~$ docker run -d --net=none --name=demo66 debian sleep infinity
      docker0:~$ sudo ./plumb br66@quagga demo66 192.168.66.10/24 02:42:c0:a8:66:10
      [Diagram: as on slide 32, with a second bridge br66 and interface vxlan66 on each host and a demo66 container attached to br66: 192.168.66.10 on docker0, 192.168.66.20 on docker1.]
  • 38. What about BGP?
      docker0:~$ docker exec -it quagga vtysh
      docker0# show evpn vni
      Number of VNIs: 2
      VNI  VxLAN IF  VTEP IP  # MACs  # ARPs  # Remote VTEPs
      42   vxlan42   0.0.0.0  2       0       1
      66   vxlan66   0.0.0.0  2       0       1
      docker0# show evpn mac vni all
      VNI 42 #MACs (local and remote) 2
      MAC                Type    Intf/Remote VTEP  VLAN
      02:42:c0:a8:00:10  local   veth0pldemo
      02:42:c0:a8:00:20  remote  10.0.1.10
      VNI 66 #MACs (local and remote) 2
      MAC                Type    Intf/Remote VTEP  VLAN
      02:42:c0:a8:66:10  local   veth0pldemo66
      02:42:c0:a8:66:20  remote  10.0.1.10
  • 39. Taking advantage of broadcast: DHCP
      [Diagram: as on slide 32, with a dhcp container (192.168.0.254) attached to br42 on docker0 and a demodhcp container on docker1 whose address is still to be assigned (labelled "192.168.0.10?").]
  • 40. Configuring DHCP
      docker0:~$ docker run -d --net=none --name dhcp -v "$(pwd)/dhcp":/data networkboot/dhcpd eth0
      docker0:~$ sudo ./plumb br42@quagga dhcp 192.168.0.254/24
      docker1:~$ docker run -d --net=none --name=demodhcp debian sleep infinity
      docker1:~$ sudo ./plumb br42@quagga demodhcp dhcp
      docker1:~$ docker exec -it demodhcp ping 192.168.0.10
      PING 192.168.0.10 (192.168.0.10): 56 data bytes
      64 bytes from 192.168.0.10: icmp_seq=0 ttl=47 time=1.566 ms
      DHCP configuration:
        subnet 192.168.0.0 netmask 255.255.255.0 {
          range 192.168.0.100 192.168.0.200;
          option routers 192.168.0.1;
          option domain-name-servers 8.8.8.8;
        }
  • 41. Getting out of our Docker environment
      [Diagram: as on slide 39, with the DHCP client now at 192.168.0.100, plus a third host, gateway0 (10.0.0.20), running quagga, br42 and vxlan42, with a vethgw interface at 192.168.0.1.]
  • 42. Getting out of our Docker environment
      gateway0:~$ ./setup_vxlan 42 host dstport 4789 nolearning
      gateway0:~$ ip link add dev vethbr type veth peer name vethgw
      gateway0:~$ ip link set vethbr master br42
      gateway0:~$ ip addr add 192.168.0.1/24 dev vethgw
      gateway0:~$ ping 192.168.0.10
      PING 192.168.0.10 (192.168.0.10): 56 data bytes
      64 bytes from 192.168.0.10: icmp_seq=0 ttl=47 time=0.866 ms
      [Diagram: on gateway0, vxlan42 and vethbr are attached to br42; vethgw (192.168.0.1) is the other end of the veth pair.]
  • 43. Getting out of VXLAN / Quagga
      [Diagram: as on slide 41. gateway0 (10.0.0.20) routes between 10.0.0.0/16 and 192.168.0.0/24 and NATs other traffic out through eth0. A non-VXLAN host sits at 10.0.0.30.]
  • 44. Getting out of VXLAN / Quagga
      gateway0:~$ echo 1 | sudo tee /proc/sys/net/ipv4/ip_forward
      gateway0:~$ iptables -t nat -A POSTROUTING ! -d 10.0.0.0/16 -s 192.168.0.0/24 -o eth0 -j MASQUERADE
      docker1:~$ docker exec -it demodhcp ping 192.168.0.1    # local (VXLAN)
      docker1:~$ docker exec -it demodhcp ping 10.0.0.30      # routed
      docker1:~$ docker exec -it demodhcp ping 8.8.8.8        # NATed
      simple1:~$ ping 192.168.0.1
      simple1:~$ ping 192.168.0.10
  • 45. Another nice thing we can do
      [Diagram: as on slide 43, with a QEMU guest connected through tap0 and running dhclient, which gets the address 192.168.0.10x.]
  • 46. What could a real-life setup look like?
      [Diagram: a set of Docker hosts, each running quagga, peer over BGP/EVPN with route reflectors RR1 and RR2 and exchange VXLAN traffic. A BGP/EVPN router connects them by routing to a set of standard hosts: it carries routes from the non-VXLAN infrastructure and routes to the VXLAN networks.]
  • 47. How does it compare to other solutions?
      Solution              Data plane           Control plane
      Swarm Classic         VXLAN                External KV store (Consul / Etcd)
      SwarmKit              VXLAN                SwarmKit (Raft / Gossip implementation)
      Flannel host-gw       Routing              Etcd / Kubernetes API
      Flannel VXLAN         VXLAN                Etcd / Kubernetes API
      Calico                Routing / IPIP       Etcd / BGP (IP prefixes)
      Weave Classic         Custom               Custom
      Weave Fast Datapath   VXLAN                Custom
      Contiv                VXLAN, Routing, L2   Etcd / BGP (IP and maybe eVPN)
      Disclaimer: almost no experience with any (from documentation and discussions mostly)
  • 48. Perspectives
      ● FRRouting
        Quagga fork
        Cumulus has switched to FRRouting and merged EVPN support
      ● Open vSwitch
        Alternative to the Linux native bridge and VXLAN
        (Possibly) better performance and more features
        Not sure how Quagga/FRRouting would integrate with Open vSwitch
      ● Performance
        Measure impact of VXLAN
        Test VXLAN acceleration when available on NICs
      ● CNI plugin (to test on Kubernetes and mostly for learning