Linux has been a first-class network citizen for many years and doesn't fall short of commercial solutions. In fact, many of those are built on it, and it is the foundation for nearly all cloud solutions out there.
This talk touches on methods and features for setting up Layer 3 network separation and walks through and showcases:
* Policy-based routing
* VRFs (with and without MPLS)
* Network Namespaces
We will compare features and options and go through a number of use cases, covering Linux as a router, VPN server, load balancer, etc.
A basic understanding of networking, routing, and how the Internet works certainly helps; there will be some aha moments either way.
Who's who
Maximilian Wilhelm
Networker
OpenSource Hacker
Fanboy of
(Debian) Linux
(Linux) networking
Occupation:
By day: Network Engineer at Cloudflare
By night: Infrastructure Archmage, Freifunk Hochstift
In between: Freelance Infrastructure Architect for hire
Contact
@BarbarossaTM
max@sdn.clinic
Motivation
Use cases
Policy-based routing
Route IPv4 traffic leaving the network to CGN boxes
Route non-interactive traffic across cheaper link
VRFs
Keep Internet and internal routing domains separated
Provide LB/proxy to internal services but don't expose hosts completely
Provide overlays for customers / different routing domains
NetNS
Full-blown separation for applications (-> containers)
vEth + NetNS for debugging purposes
Routing
Every device speaking IP has a routing table
German translation according to IBM: "Leitwegtabelle"
Packets are forwarded according to longest prefix match
Default Gateway or Gateway of last resort used if no entry matches
Hot Potato principle
Packets are forwarded to the next hop without knowledge of its routing table
Asymmetric routing
Path to destination and return path don't have to be identical
Routing table
Possible routing table of your laptop when using company VPN:
Prefix            Iface  Next-hop
10.0.0.0/8        tun0   10.23.42.1
10.23.42.0/25     tun0   -
192.168.178.0/24  wlan0  -
0.0.0.0/0         wlan0  192.168.178.1
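Longest prefix match against this table can be sketched in a few lines of Python. This is only an illustration of the selection rule, not how the kernel implements its lookup; the `lookup` helper and the sample destinations are made up for the example.

```python
import ipaddress

# Routing table from the slide: (prefix, interface, next hop).
# None as next hop means the prefix is directly connected.
ROUTES = [
    ("10.0.0.0/8", "tun0", "10.23.42.1"),
    ("10.23.42.0/25", "tun0", None),
    ("192.168.178.0/24", "wlan0", None),
    ("0.0.0.0/0", "wlan0", "192.168.178.1"),
]


def lookup(destination):
    """Return the matching entry with the longest prefix, or None."""
    dst = ipaddress.ip_address(destination)
    matches = [r for r in ROUTES if dst in ipaddress.ip_network(r[0])]
    if not matches:
        return None
    return max(matches, key=lambda r: ipaddress.ip_network(r[0]).prefixlen)


# Hypothetical destinations, chosen to hit each entry once.
for target in ("10.23.42.8", "10.1.2.3", "192.168.178.20", "1.1.1.1"):
    print(target, "->", lookup(target))
```

The default route 0.0.0.0/0 matches every destination but has prefix length 0, so it only wins when nothing more specific matches.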
Source address selection
With every routing decision for a locally originated connection, a source address is selected based on the routing table.
Usually the (primary) IP configured on the outgoing interface
May be explicitly set to any IP
For example IP on loopback interface
Prefix            Iface  Next-hop       Src address
10.0.0.0/8        tun0   10.23.42.1     -
10.23.42.0/25     tun0   -              10.23.42.8
192.168.178.0/24  wlan0  -              192.168.178.5
0.0.0.0/0         wlan0  192.168.178.1  -
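A simplified model of this selection, in Python: take the source address from the matching route if one is set, otherwise fall back to the primary address of the outgoing interface. The real kernel logic considers more (address scope, for example), and the `PRIMARY_ADDRESS` mapping is an assumption inferred from the connected routes in the table.

```python
import ipaddress

# (prefix, iface, next_hop, src) as in the table; None marks an empty cell.
ROUTES = [
    ("10.0.0.0/8", "tun0", "10.23.42.1", None),
    ("10.23.42.0/25", "tun0", None, "10.23.42.8"),
    ("192.168.178.0/24", "wlan0", None, "192.168.178.5"),
    ("0.0.0.0/0", "wlan0", "192.168.178.1", None),
]

# Assumption: primary address of each interface, inferred from the table.
PRIMARY_ADDRESS = {"tun0": "10.23.42.8", "wlan0": "192.168.178.5"}


def select_source(destination):
    """Pick the source address for a locally originated packet."""
    dst = ipaddress.ip_address(destination)
    # The default route guarantees at least one match in this table.
    matches = [r for r in ROUTES if dst in ipaddress.ip_network(r[0])]
    best = max(matches, key=lambda r: ipaddress.ip_network(r[0]).prefixlen)
    prefix, iface, next_hop, src = best
    # An explicit src on the route wins; otherwise use the interface's primary IP.
    return src if src is not None else PRIMARY_ADDRESS[iface]


print(select_source("10.1.2.3"))
print(select_source("1.1.1.1"))
```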
Source address selection - ICMP errors
What IP will answer on errors?
# icmp_errors_use_inbound_ifaddr - BOOLEAN
#
# If zero, icmp error messages are sent with the primary address of
# the exiting interface.
#
# If non-zero, the message will be sent with the primary address of
# the interface that received the packet that caused the icmp error.
# This is the behaviour many network administrators will expect from
# a router. And it can make debugging complicated network layouts
# much easier.
#
# Note that if no primary address exists for the interface selected,
# then the primary address of the first non-loopback interface that
# has one will be used regardless of this setting.
#
# Default: 0
net.ipv4.icmp_errors_use_inbound_ifaddr = 1
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.rst
IPv6: It's complicated, see RFC6724
Routing tables
Every Linux box has a number of routing tables
$ ip route help
Usage: ip route { list | flush } SELECTOR
...
SELECTOR := ... [ table TABLE_ID ]
...
TABLE_ID := [ local | main | default | all | NUMBER ]
By default routing table main is used
So ip route show and ip route show table main show the same thing
Default Routing Tables on Linux
Table local
Contains all routes to
Locally connected IPs
Broadcast addresses
Table main
Contains "usual" routes
Locally connected subnets
Routes to remote subnets
Table default
Usually empty
Default Routing Tables on Linux
Table local
$ ip route show table local
broadcast 127.0.0.0 dev lo proto kernel scope link src 127.0.0.1
local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
local 127.0.0.1 dev lo proto kernel scope host src 127.0.0.1
broadcast 127.255.255.255 dev lo proto kernel scope link src 127.0.0.1
broadcast 192.168.178.0 dev wlan0 proto kernel scope link src 192.168.178.42
local 192.168.178.42 dev wlan0 proto kernel scope host src 192.168.178.42
broadcast 192.168.178.255 dev wlan0 proto kernel scope link src 192.168.178.42
Table main
$ ip route show [table main]
default via 192.168.178.1 dev wlan0 proto dhcp metric 600
192.168.178.0/24 dev wlan0 proto kernel scope link src 192.168.178.42 metric 600
Table default
$ ip route show table default
$
What happens on link-down?
By default Linux will try to use routes with link down
Behaviour can be controlled via sysctl
# ip r
default via 192.168.178.1 dev eth2
192.168.178.0/24 dev eth2 proto kernel scope link src 192.168.178.42
# echo 1 > /proc/sys/net/ipv4/conf/eth2/ignore_routes_with_linkdown
# ip r
default via 192.168.178.1 dev eth2 dead linkdown
192.168.178.0/24 dev eth2 proto kernel scope link src 192.168.178.42 dead linkdown
# ping 1.1.1.1
connect: Network is unreachable
Policy-based routing (PBR)
Available since Linux 2.2 (1999)
Allows influencing the routing decision based on, e.g.:
Ingress interface
Source address
Source/destination port
Something netfilter can match
Drawbacks
Take care to close loopholes
Rule for IPv4
Rule for IPv6
Rule for incoming interface
ICMP errors might still get routed by main table
Default routing policy on every Linux box
Remember the routing tables from before?
$ ip rule
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
PBR rules
$ ip rule help
Usage: ip rule { add | del } SELECTOR ACTION
ip rule { flush | save | restore }
ip rule [ list [ SELECTOR ]]
SELECTOR := [ not ] [ from PREFIX ] [ to PREFIX ] [ tos TOS ] [ fwmark FWMARK[/MASK] ]
[ iif STRING ] [ oif STRING ] [ pref NUMBER ] [ l3mdev ]
[ uidrange NUMBER-NUMBER ]
[ ipproto PROTOCOL ]
[ sport [ NUMBER | NUMBER-NUMBER ]
[ dport [ NUMBER | NUMBER-NUMBER ] ]
ACTION := [ table TABLE_ID ]
[ protocol PROTO ]
[ nat ADDRESS ]
[ realms [SRCREALM/]DSTREALM ]
[ goto NUMBER ]
SUPPRESSOR
SUPPRESSOR := [ suppress_prefixlength NUMBER ]
[ suppress_ifgroup DEVGROUP ]
TABLE_ID := [ local | main | default | NUMBER ]
PBR rules - examples
Half our users are special
# ip rule add from 192.168.178.0/25 table 178
Web traffic is special
# ip rule add dport 80 table 80
# ip rule add dport 443 table 80
Packets arriving at eth0 are special
# ip rule add iif eth0 table 23
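How such rules interact with the default policy can be modelled roughly as follows: rules are tried in ascending priority order, and the first matching rule whose table yields a route wins. This Python sketch is a toy model, not kernel code. The table contents, the next hop 192.0.2.1 on eth1 and the priority 32765 are illustrative assumptions; only the rule "from 192.168.178.0/25 table 178" is taken from the slide.

```python
import ipaddress

# Assumed table contents for illustration only.
TABLES = {
    "local": {},
    178: {"0.0.0.0/0": "via 192.0.2.1 dev eth1"},
    "main": {"0.0.0.0/0": "via 192.168.178.1 dev wlan0"},
    "default": {},
}


def from_prefix(prefix):
    """Build a selector matching packets whose source lies in prefix."""
    network = ipaddress.ip_network(prefix)
    return lambda packet: ipaddress.ip_address(packet["src"]) in network


def match_all(packet):
    """Selector for 'from all' rules."""
    return True


# (priority, selector, table); 32765 is an assumed priority for the added rule.
RULES = [
    (0, match_all, "local"),
    (32765, from_prefix("192.168.178.0/25"), 178),
    (32766, match_all, "main"),
    (32767, match_all, "default"),
]


def table_lookup(table, destination):
    """Longest prefix match within one table; None if nothing matches."""
    dst = ipaddress.ip_address(destination)
    matches = [p for p in table if dst in ipaddress.ip_network(p)]
    if not matches:
        return None
    best = max(matches, key=lambda p: ipaddress.ip_network(p).prefixlen)
    return table[best]


def route(packet):
    """Walk the rules by priority; a table without a match falls through."""
    for priority, selector, table_id in sorted(RULES, key=lambda rule: rule[0]):
        if not selector(packet):
            continue
        hit = table_lookup(TABLES[table_id], packet["dst"])
        if hit is not None:
            return table_id, hit
    return None


print(route({"src": "192.168.178.10", "dst": "1.1.1.1"}))
print(route({"src": "192.168.178.200", "dst": "1.1.1.1"}))
```

The fall-through is the loophole to watch for: if table 178 has no matching route, the "special" users silently end up in table main.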
Virtual Routing and Forwarding (VRFs)
Independent routing instances, provides Layer 3 separation
Commonly used for
(OOB) mgmt access
L3-VPNs, usually in combination with MPLS
VRFs on Linux
VRF interface is master for “real” (member) interfaces
Maps to a (numeric) routing table
Netfilter rules shared across VRFs
Introduced in Kernel 4.[345] (use >= 4.9)
Configuring VRFs
By hand
ip link add vrf_external type vrf table 1023
ip link set eth0 master vrf_external # Option 1: generic
ip link set eth0 vrf vrf_external # Option 2: VRF specific
ifupdown2 / ifupdown-ng
auto eth0
iface eth0
address 2001:db8:23:42::2/64
gateway 2001:db8:23:42::1
vrf vrf_external
auto vrf_external
iface vrf_external
vrf-table 1023
Device routes move from table main and local to table 1023
VRFs: Under the hood - IPv4
A VRF is like a routing table with benefits:
$ ip r s vrf vrf_external
default via 192.0.2.1 dev eth0 metric 1
192.0.2.0/24 dev eth0 proto kernel scope link src 192.0.2.42
$ ip r s table 1023
default via 192.0.2.1 dev eth0 metric 1
broadcast 192.0.2.0 dev eth0 proto kernel scope link src 192.0.2.42
192.0.2.0/24 dev eth0 proto kernel scope link src 192.0.2.42
local 192.0.2.42 dev eth0 proto kernel scope host src 192.0.2.42
broadcast 192.0.2.255 dev eth0 proto kernel scope link src 192.0.2.42
VRFs: Under the hood - IPv6
A VRF is like a routing table with benefits:
$ ip -6 r s vrf vrf_external
anycast 2001:db8:23:42:: dev eth0 proto kernel metric 0 pref medium
2001:db8:23:42::/64 dev eth0 proto kernel metric 256 linkdown pref medium
anycast fe80:: dev eth0 proto kernel metric 0 pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
multicast ff00::/8 dev eth0 proto kernel metric 256 pref medium
default via 2001:db8:23:42::1 dev eth0 metric 1 pref medium
$ ip -6 r s table 1023
anycast 2001:db8:23:42:: dev eth0 proto kernel metric 0 pref medium
local 2001:db8:23:42::2 dev eth0 proto kernel metric 0 pref medium
2001:db8:23:42::/64 dev eth0 proto kernel metric 256 linkdown pref medium
anycast fe80:: dev eth0 proto kernel metric 0 pref medium
local fe80::222:19ff:fe65:b835 dev eth0 proto kernel metric 0 pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
multicast ff00::/8 dev eth0 proto kernel metric 256 pref medium
default via 2001:db8:23:42::1 dev eth0 metric 1 pref medium
VRFs: Under the hood - Plumbing
Remember PBR? Setting up a VRF adds a global VRF rule:
$ ip rule
0: from all lookup local
1000: from all lookup [l3mdev-table]
32766: from all lookup main
32767: from all lookup default
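The effect of the l3mdev rule can be sketched as a small Python model: it only matches when the interface in question is enslaved to a VRF, and then points the lookup at that VRF's table. The mapping of eth0 to table 1023 follows the earlier configuration example; eth1 is a hypothetical interface outside any VRF, and the helper name is made up.

```python
# Assumption: eth0 is a member of vrf_external (table 1023); eth1 is not in a VRF.
VRF_TABLE_OF_IFACE = {"eth0": 1023}

# Rule list as shown by `ip rule` above: (priority, lookup target).
RULES = [
    (0, "local"),
    (1000, "l3mdev"),
    (32766, "main"),
    (32767, "default"),
]


def tables_to_consult(iface):
    """Return the routing tables tried, in order, for traffic tied to iface."""
    order = []
    for priority, target in sorted(RULES):
        if target == "l3mdev":
            table = VRF_TABLE_OF_IFACE.get(iface)
            if table is None:
                continue  # rule does not match: interface is not in a VRF
            order.append(table)
        else:
            order.append(target)
    return order


print(tables_to_consult("eth0"))
print(tables_to_consult("eth1"))
```

Note that in this model a miss in the VRF table continues with table main. The kernel's VRF documentation suggests adding an unreachable default route with a high metric to the VRF table to prevent that.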
Connecting VRFs
Requires vEth pair
Like a virtual network cable within the box
A end in main VRF, Z end in VRF “foo”
Usual routing
Static
Bird talking BGP to itself
Drawback:
ARP didn't work recently (didn't debug :))
Static entries helped
ND worked though
Connecting VRFs
By hand
# ip link add VETH_END1 type veth peer name VETH_END2
ifupdown2* / ifupdown-ng
iface veth_ext2int
link-type veth
veth-peer-name veth_int2ext
vrf vrf_external
iface veth_int2ext
link-type veth
veth-peer-name veth_ext2int
* Merged with PR25, unsure if it still works
Leaking Routes
Similar to vendor boxes
Leaking VRF -> GRT (global routing table; eth2 is part of the GRT):
# ip route add default via 192.0.2.1 dev eth2 vrf vrf_foo
Leaking GRT -> VRF
# ip route add 198.51.100.0/24 dev vrf_foo
VRF awareness for applications
By default applications only use main table
Packets received in VRF table reach application
Reply sent out via main table
There's help:
# tcp_l3mdev_accept - BOOLEAN
#
# Enables child sockets to inherit the L3 master device index.
# Enabling this option allows a "global" listen socket to work
# across L3 master domains (e.g., VRFs) with connected sockets
# derived from the listen socket to be bound to the L3 domain in
# which the packets originated. Only valid when the kernel was
# compiled with CONFIG_NET_L3_MASTER_DEV.
#
# Default: 0 (disabled)
net.ipv4.tcp_l3mdev_accept = 1
This switch has influence on IPv6, too!
Real World Applications for VRFs
LB / Web proxy/frontend
External interface is part of vrf_external
tcp_l3mdev_accept set to 1
nginx as reverse proxy
Listens on IP in GRT + IP in vrf_external
Uses main table for connections to internal services
Can serve queries from external + internal clients
External interface in VRF
External interface is part of vrf_external
GRE / OpenVPN tunnels send / receive encapsulated packets via the VRF
Local tunnel endpoint is in GRT
No risk of leaking stuff from GRT by accident
Real World Applications - Tunnels / GRE
Outer and/or inner side of tunnel can be part of a VRF
Outer side in VRF
# ip link add DEVICE type gre remote ADDR local ADDR dev PHYS_DEV
If PHYS_DEV is within a VRF, all encapsulated packets are sent/received in the VRF
Inner side in VRF
Pushing the inner side of a tunnel into a VRF is equally simple:
# ip link set DEVICE master VRF
Real World Applications - Tunnel / OpenVPN
Pushing the inner side of an OpenVPN tunnel into a VRF is as simple as before.
Sending/receiving encapsulated packets into/from a VRF needs application support.
My patch from October 2016 finally made it into OpenVPN 2.5 :)
# openvpn --config your_config.cfg --bind-dev VRF
This is used to glue remote POPs of Freifunk Hochstift together
Real World Applications - VRFs + MPLS
The plumbing:
# modprobe mpls_iptunnel # Activate MPLS
# sysctl -w net.mpls.platform_labels=1000000 # Set max. label
# sysctl -w net.mpls.conf.ethX.input=1 # Activate MPLS decap on ethX
Encap traffic with MPLS label 2342, send it to neighbor on ethX (in GRT)
# ip route add 192.0.2.0/24 encap mpls 2342 via inet6 2001:db8:42::1 dev ethX vrf vrf_x
Decap traffic with label 4223 and send it to VRF vrf_x
# ip -M route add 4223 dev vrf_x
Swap labels on the path (100 -> 200)
# ip -M route add 100 as 200 via inet6 2001:db8:4711::1
Network Namespaces (NetNS)
Layer 1 separation
An interface is part of exactly one NetNS
Similar to VRFs on vendor gear
Own set of routing tables
VRFs and PBR available within NetNS
Own set of netfilter rules
Processes can be bound to a NetNS
Introduced in Kernel 2.6.29
Configuring Network Namespaces
$ ip netns help
Usage: ip netns list
ip netns add NAME
ip netns set NAME NETNSID
ip [-all] netns delete [NAME]
ip netns identify [PID]
ip netns pids NAME
ip [-all] netns exec [NAME] cmd ...
ip netns monitor
ip netns list-id
NETNSID := auto | POSITIVE-INT
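For the "vEth + NetNS for debugging" use case, the typical command sequence can be generated with a small Python helper. It only builds and prints the ip(8) command lines and does not execute anything, since running them needs root. The namespace name, interface names and the address 192.0.2.2/24 are hypothetical.

```python
import shlex


def veth_netns_commands(netns, host_if, ns_if, ns_addr):
    """Return ip(8) commands for a namespace connected via a veth pair."""
    return [
        # Create the namespace and a veth pair in the current namespace.
        ["ip", "netns", "add", netns],
        ["ip", "link", "add", host_if, "type", "veth", "peer", "name", ns_if],
        # Move one end of the pair into the new namespace.
        ["ip", "link", "set", ns_if, "netns", netns],
        ["ip", "link", "set", host_if, "up"],
        # Configure the inner end from within the namespace.
        ["ip", "netns", "exec", netns, "ip", "addr", "add", ns_addr, "dev", ns_if],
        ["ip", "netns", "exec", netns, "ip", "link", "set", ns_if, "up"],
    ]


# Hypothetical names and documentation address, for illustration only.
for command in veth_netns_commands("debug", "veth_host", "veth_debug", "192.0.2.2/24"):
    print(shlex.join(command))
```

To apply the commands, each list could be passed to `subprocess.run(command, check=True)` as root.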
Links
Further Reading
Contemporary Linux Networking - DENOG9 (2017)
https://www.slideshare.net/BarbarossaTM/contemporary-linux-networking
VRFs
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/vrf.rst
https://de.slideshare.net/CumulusNetworks/operationalizing-vrf-in-the-data-center
OpenVPN and VRFs
https://blog.sdn.clinic/2018/12/openvpn-and-vrfs/
MPLS Lab – Playing with static LSPs and VRFs on Linux
https://blog.sdn.clinic/2022/01/mpls-lab-playing-with-static-lsps-and-vrfs-on-linux/