SR-IOV ixgbe driver limitations and improvement
2016-07-15
Hiroshi Shimamoto
3 © NEC Corporation 2015 NEC Group Internal Use Only3 © NEC Corporation 2016
Who am I
Hiroshi Shimamoto
▌Working for NEC
▌Main activities in open source community
Carrier Grade Linux
Virtualization
Packet Processing, DPDK
to make Linux systems usable for telecom carriers
Outline
SR-IOV and ixgbe implementation
SR-IOV ixgbe limitations for NFV
Addressing the issues
Future work and possible security issues
▌What is SR-IOV
A device virtualization technology
Most people here probably know it
▌An SR-IOV enabled PCI device provides a PF (Physical Function) and VFs (Virtual Functions)
A VF is used directly in a VM
[Figure: an SR-IOV PCI device exposes one PF and VF 0, VF 1, VF 2, ...; each VF is assigned to a VM by PCI pass through]
Why SR-IOV?
▌I/O in virtual machines is a performance bottleneck
▌Especially packet switching between VM and host
▌Technologies were introduced to address network performance
PCI pass through
SR-IOV
DPDK
Comparison of SR-IOV and related technologies
▌Brief pros and cons of each technology
      PCI pass through      SR-IOV                DPDK
pros  Hardware performance  Multiple VMs          Multiple VMs
                            Hardware performance  Software flexibility
                                                  No special driver in guest
cons  Single VM             Device support        Consumes much CPU power
                            VF driver in guest
How SR-IOV is implemented in the Intel 82599
▌Target device: Intel 82599 10GbE Controller, with the ixgbe driver for the PF and the ixgbevf driver for the VFs
▌The 82599 has 64 VMDq (Virtual Machine Device Queues) queue pools
Each of these is called a pool
For SR-IOV, each VF has an associated pool
The 82599 switches a packet to a pool (VF)
▌The current ixgbe driver maps VFs to pools as follows
VF   pool
0    0
1    1
:    :
N    N
PF   N+1
Intel 82599 10GbE controller datasheet URL
http://www.intel.com/content/www/us/en/embedded/products/networking/82599-10-gbe-controller-datasheet.html
Packet switching in Intel 82599
▌In SR-IOV mode, the 82599 chip switches each packet to the corresponding pool
ref. datasheet 7.10.3 Packet Switching
[Figure: inside the 82599 NIC, a packet switch sits between Rx/Tx and the pools of the PF and VF 0, VF 1, VF 2, ...]
Rx packet switching
▌A packet is received from the network
▌Steps
1. Exact unicast or multicast match
2. Broadcast
3. Unicast hash*
4. Multicast hash
5. Multicast promiscuous*
6. VLAN group
7. Default pool
8. Ethertype filters
9. PFVFRE
10. Mirroring*
Note*: the driver did not support these features
ref. datasheet 7.10.3.3 Rx Packet Switching
Tx packet switching
▌The switch receives a packet from the device side (PF and/or VF)
▌Steps
1. Exact unicast or multicast match
2. Broadcast
3. Unicast hash*
4. Multicast hash
5. Multicast promiscuous*
6. Filter source pool
7. VLAN groups
8. Forwarding to the network
9. PFVFRE
10. Mirroring*
11. PFVFRE
Note*: the driver did not support these features
ref. datasheet 7.10.3.4 Tx Packet Switching
Using Intel 82599 NIC SR-IOV as a switch
▌Typical use case
[Figure: VM0 and VM1 each run a guest OS with a VF driver bound to VF 0 and VF 1 of the Intel 82599 NIC; the NIC's internal switch connects the VFs (plus the PF, VF 2, ...) to each other and to the network]
NFV (Network Function Virtualization)
▌NFV uses virtualization technology to implement network functions on general purpose hardware, improving flexibility, agility, and efficiency
▌Network functions are virtualized
→ A VNF (Virtual Network Function) is realized on a VM
[Figure: network functions (IMS, EPC, L3SW, GW, Data Base, App Server) move from dedicated/specific hardware to VMs on a virtualization layer over general purpose hardware, under orchestration/resource management]
SR-IOV ixgbe driver limitations for NFV
▌With the Intel 82599 and SR-IOV, there are 3 critical limitations for NFV
→ VLAN filtering
→ Multicast addresses
→ Unicast promiscuous
These come from both hardware limitations and software (driver) limitations
▌Explained with 2 use cases
Router
Layer 2 switch
[Figure: NFV use case; each VNF runs on a VM whose guest OS uses a VF driver bound to a VF (VF 0, VF 1, ...) of the Intel 82599 NIC, which acts as the switch]
NFV use case 1
▌Network Function: Router
VLAN is used to virtualize the network
[Figure: a VNF router sits between the access network (user equipment side) and the private network, terminating many VLANs (VLAN:1, VLAN:2, VLAN:3, ..., VLAN:N) and IPs]
VLAN filtering limitation
▌The Intel 82599 has a hardware VLAN filter
▌The problem comes from
→ The VLAN filter has only 64 entries
→ The driver always enables the VLAN filter if SR-IOV is enabled
→ As a result, only 64 VLANs can be used with SR-IOV
Example: the VF device returns an error when adding a VLAN
▌Try to create 64 VLANs on the VM
# for i in `seq 1 64`; do
> echo "vlan $i"
> ip link add link ens6 name ens6.$i type vlan id $i
> done
vlan 1
vlan 2
:
vlan 63
vlan 64
RTNETLINK answers: Permission denied
Note
The ixgbe driver uses the first entry for VLAN 0,
so the actual number of usable VLANs is 63
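The limit can be illustrated with a small model of the filter table. This is only a sketch under stated assumptions: the struct and function names are hypothetical and are not the driver's, and the table is reduced to a flat array with entry 0 reserved for VLAN 0 as the note describes.

```c
#include <stdbool.h>
#include <stdint.h>

#define VLAN_FILTER_ENTRIES 64  /* table size stated on the slide */

/* Hypothetical model of the hardware VLAN filter table. */
struct vlan_filter_table {
    uint16_t vid[VLAN_FILTER_ENTRIES];
    bool     used[VLAN_FILTER_ENTRIES];
};

/* Entry 0 is taken for VLAN 0, as the note describes. */
static void vlan_table_init(struct vlan_filter_table *t)
{
    for (int i = 0; i < VLAN_FILTER_ENTRIES; i++) {
        t->vid[i] = 0;
        t->used[i] = false;
    }
    t->used[0] = true;
}

/* Returns 0 on success, -1 when no free entry is left. */
static int vlan_table_add(struct vlan_filter_table *t, uint16_t vid)
{
    int free_slot = -1;

    for (int i = 0; i < VLAN_FILTER_ENTRIES; i++) {
        if (t->used[i] && t->vid[i] == vid)
            return 0;               /* already registered */
        if (!t->used[i] && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1;                  /* table full: the guest sees an error */
    t->used[free_slot] = true;
    t->vid[free_slot] = vid;
    return 0;
}

/* Adds VLAN IDs 1..n in order and returns how many were accepted. */
static int vlan_table_add_range(struct vlan_filter_table *t, int n)
{
    int ok = 0;

    for (int vid = 1; vid <= n; vid++)
        if (vlan_table_add(t, (uint16_t)vid) == 0)
            ok++;
    return ok;
}
```

In this model, calling vlan_table_add_range() with n = 64 on a freshly initialized table accepts 63 IDs and rejects the last one, which matches the shell session in the example.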
Enabling VLAN filter automatically in driver (latest)
▌upstream
static void ixgbe_vlan_promisc_enable(struct ixgbe_adapter *adapter)
{
        :
        switch (hw->mac.type) {
        case ixgbe_mac_82599EB:
        case ixgbe_mac_X540:
        case ixgbe_mac_X550:
        case ixgbe_mac_X550EM_x:
        case ixgbe_mac_x550em_a:
        default:
                /* with VMDq enabled, the hardware VLAN filter stays on */
                if (adapter->flags & IXGBE_FLAG_VMDQ_ENABLED)
                        break;
                /* fall through */
        case ixgbe_mac_82598EB:
                /* legacy case, we can just disable VLAN filtering */
                vlnctrl = IXGBE_READ_REG(hw, IXGBE_VLNCTRL);
                vlnctrl &= ~(IXGBE_VLNCTRL_VFE | IXGBE_VLNCTRL_CFIEN);
                IXGBE_WRITE_REG(hw, IXGBE_VLNCTRL, vlnctrl);
                return;
        }
Enabling VLAN filter automatically in driver (old)
▌previous version
void ixgbe_set_rx_mode(struct net_device *netdev)
{
        :
        :
        vlnctrl &= ~(IXGBE_VLNCTRL_VFE | IXGBE_VLNCTRL_CFIEN);
        if (netdev->flags & IFF_PROMISC) {
                hw->addr_ctrl.user_set_promisc = true;
                fctrl |= (IXGBE_FCTRL_UPE | IXGBE_FCTRL_MPE);
                vmolr |= IXGBE_VMOLR_MPE;
                /* Only disable hardware filter vlans in promiscuous mode
                 * if SR-IOV and VMDQ are disabled - otherwise ensure
                 * that hardware VLAN filters remain enabled.
                 */
                if (adapter->flags & (IXGBE_FLAG_VMDQ_ENABLED |
                                      IXGBE_FLAG_SRIOV_ENABLED))
                        vlnctrl |= (IXGBE_VLNCTRL_VFE | IXGBE_VLNCTRL_CFIEN);
Multicast address limitation
▌The interface (PF-VF mailbox API) limits the number of addresses
▌Only the first 30 multicast addresses can be registered
▌Addresses beyond the first 30 are silently dropped
→ Multicast promiscuous mode is a solution
        /* Each entry in the list uses 1 16 bit word. We have 30
         * 16 bit words available in our HW msg buffer (minus 1 for the
         * msg type). That's 30 hash values if we pack 'em right. If
         * there are more than 30 MC addresses to add then punt the
         * extras for now and then add code to handle more than 30 later.
         * It would be unusual for a server to request that many multi-cast
         * addresses except for in large enterprise network environments.
         */
        cnt = netdev_mc_count(netdev);
        if (cnt > 30)
                cnt = 30;
        msgbuf[0] = IXGBE_VF_SET_MULTICAST;
        msgbuf[0] |= cnt << IXGBE_VT_MSGINFO_SHIFT;
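The truncation can be sketched as a standalone packing routine. The constants below are assumptions for illustration (a 16-word mailbox, message ID 0x03, a count field starting at bit 16, and the first hash in the low half of each word); the names are hypothetical and the real values should be checked against the driver headers.

```c
#include <stddef.h>
#include <stdint.h>

#define MBX_WORDS          16     /* assumed mailbox size in 32-bit words */
#define MAX_MC_HASHES      30     /* (16 - 1) words * 2 hashes per word */
#define MSG_SET_MULTICAST  0x03u  /* assumed message ID */
#define MSGINFO_SHIFT      16     /* assumed position of the count field */

/*
 * Builds a SET_MULTICAST-style message from a list of hash values.
 * Returns the number of hashes that did not fit and were dropped.
 */
static size_t pack_mc_hashes(uint32_t msgbuf[MBX_WORDS],
                             const uint16_t *hashes, size_t n)
{
    size_t cnt = n > MAX_MC_HASHES ? MAX_MC_HASHES : n;

    for (size_t i = 0; i < MBX_WORDS; i++)
        msgbuf[i] = 0;

    /* Word 0 carries the message type and the (capped) entry count. */
    msgbuf[0] = MSG_SET_MULTICAST | ((uint32_t)cnt << MSGINFO_SHIFT);

    /* Two 16-bit hashes share one 32-bit word, low half first. */
    for (size_t i = 0; i < cnt; i++) {
        uint32_t v = hashes[i];

        msgbuf[1 + i / 2] |= v << (16 * (i % 2));
    }
    return n - cnt;
}
```

A caller that passes 32 hashes gets a return value of 2: the last two entries never reach the PF, and nothing in the message tells the PF that the list was cut.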
Why so many multicast addresses?
▌Our use case needs many IPv6 addresses on a VF
For Neighbor Discovery, each IPv6 address requires a corresponding multicast address
Unicast address:                  2001:0000:0000:0000:0000:0000:0012:3456
Solicited node multicast address: ff02:0000:0000:0000:0000:0001:ff12:3456
Multicast MAC:                    33:33:ff:12:34:56
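The mapping above can be written out as two small helpers. This is a sketch with hypothetical function names; the rules themselves are the standard ones (the solicited-node address keeps the low 24 bits of the unicast address, and the Ethernet multicast MAC is 33:33 followed by the low 32 bits of the IPv6 multicast address).

```c
#include <stdint.h>
#include <string.h>

/*
 * Solicited-node multicast address: ff02::1:ffXX:XXXX, where XX:XXXX
 * are the low 24 bits of the unicast address.
 */
static void solicited_node_mcast(const uint8_t unicast[16], uint8_t mcast[16])
{
    static const uint8_t prefix[13] = {
        0xff, 0x02,                     /* ff02 */
        0, 0, 0, 0, 0, 0, 0, 0, 0,      /* nine zero bytes */
        0x01, 0xff                      /* ...:0001:ff.. */
    };

    memcpy(mcast, prefix, sizeof(prefix));
    memcpy(mcast + 13, unicast + 13, 3);
}

/*
 * Ethernet multicast MAC for an IPv6 multicast address: 33:33 followed
 * by the low 32 bits of the address.
 */
static void ipv6_mcast_to_mac(const uint8_t mcast[16], uint8_t mac[6])
{
    mac[0] = 0x33;
    mac[1] = 0x33;
    memcpy(mac + 2, mcast + 12, 4);
}
```

Because only the low 24 bits survive, addresses that differ in those bits each add a separate multicast MAC for the VF to register, which is how a VF with many IPv6 addresses reaches the 30-entry limit.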
Example: cannot receive multicast packets
▌Some multicast packets are not passed to the VF, so Neighbor Discovery fails
▌Assign 30 IPv6 addresses on the VM (each in a different network)
▌Ping from another physical machine to the VM
# for i in `seq 1 30`; do
> ip -6 addr add 2001:$i::1:$i/64 dev ens6
> done
# for i in `seq 1 30`; do ping6 -w 1 -c 1 -I eth2 2001:$i::1:$i; done
:
PING 2001:28::1:28(2001:28::1:28) from 2001:28::1 eth2: 56 data bytes
--- 2001:28::1:28 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1000ms
NFV use case 2
▌Network Function: L2 switch
[Figure: a VNF L2 switch sits between the access network (user equipment side) and the private network, which contains hosts MAC:A and MAC:B. Left to right, frames with dst:A and frames with dst:B must both be receivable through the same VF]
Unicast promiscuous
▌Only packets that match a registered MAC address can be received
▌The VNF should be able to handle every MAC address in the network
▌It is hard to register all MAC addresses in an L2 network
→ We want a unicast promiscuous feature in the VF
SR-IOV limitations for NFV and current status
▌VLAN filtering
Only 64 VLANs can be used
→ Proposed an option to disable the hardware VLAN filter, but it was not accepted
▌Multicast addresses
Only 30 multicast addresses can be used
→ Implemented VF multicast promiscuous mode in the ixgbe/ixgbevf drivers in Linux 4.4
▌Unicast promiscuous
Only a single unicast MAC address can be used
→ Hardware limitation
Addressing the multicast address limitation
▌The 82599 has a hardware feature for this
VF multicast promiscuous mode
▌But the driver didn't support that feature
▌Implemented a new PF-VF mailbox API in ixgbe and ixgbevf
▌First proposal: automatically enable VF multicast promiscuous mode when the number of addresses exceeds 30
▌Suggested alternative: implement an xcast mode in the VF
The ALLMULTI flag means that every multicast packet is received by this device
→ Accepted in Linux 4.4
PF-VF mailbox APIs
▌Mailbox APIs are used for communication between the PF and the VFs
version  API                description
legacy   RESET              VF requests reset
         SET_MAC_ADDR       VF requests PF to set MAC addr
         SET_MULTICAST      VF requests PF to set MC addr
         SET_VLAN           VF requests PF to set VLAN
1.0      SET_LPE            VF requests PF to set VMOLR.LPE
         SET_MACVLAN        VF requests PF for unicast filter
         API_NEGOTIATE      negotiate API version
1.1      GET_QUEUES         get queue configuration
1.2      GET_RETA           VF requests for RETA
         GET_RSS_KEY        get RSS key
         UPDATE_XCAST_MODE  VF requests PF to set MC mode
Implementation (PF ixgbe)
▌Handle UPDATE_XCAST_MODE API
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
:
@@ -1066,6 +1122,9 @@ static int ixgbe_rcv_msg_from_vf(struct ixgbe_adapter *adapter, u32 vf)
         case IXGBE_VF_GET_RSS_KEY:
                 retval = ixgbe_get_vf_rss_key(adapter, msgbuf, vf);
                 break;
+        case IXGBE_VF_UPDATE_XCAST_MODE:
+                retval = ixgbe_update_vf_xcast_mode(adapter, msgbuf, vf);
+                break;
         default:
                 e_err(drv, "Unhandled Msg %8.8x\n", msgbuf[0]);
                 retval = IXGBE_ERR_MBX;
Implementation (VF ixgbevf)
▌Request to PF (from ixgbevf_set_rx_mode)
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -1894,9 +1894,17 @@ static void ixgbevf_set_rx_mode(struct net_device *netdev)
 {
         struct ixgbevf_adapter *adapter = netdev_priv(netdev);
         struct ixgbe_hw *hw = &adapter->hw;
+        unsigned int flags = netdev->flags;
+        int xcast_mode;
+
+        xcast_mode = (flags & IFF_ALLMULTI) ? IXGBEVF_XCAST_MODE_ALLMULTI :
+                     (flags & (IFF_BROADCAST | IFF_MULTICAST)) ?
+                     IXGBEVF_XCAST_MODE_MULTI : IXGBEVF_XCAST_MODE_NONE;
         spin_lock_bh(&adapter->mbx_lock);
+        hw->mac.ops.update_xcast_mode(hw, netdev, xcast_mode);
+
         /* reprogram multicast list */
         hw->mac.ops.update_mc_addr_list(hw, netdev);
Security consideration
▌Enabling VF multicast promiscuous mode raises security and performance concerns
→ The VF can see all multicast packets that pass through this device
→ It can hurt performance
The NIC duplicates packets and does DMA to each pool
→ Allow only trusted VFs to enable multicast promiscuous mode
static int ixgbe_update_vf_xcast_mode(struct ixgbe_adapter *adapter,
                                      u32 *msgbuf, u32 vf)
{
        :
        /* an untrusted VF is limited to MULTI mode at most */
        if (xcast_mode > IXGBEVF_XCAST_MODE_MULTI &&
            !adapter->vfinfo[vf].trusted) {
                xcast_mode = IXGBEVF_XCAST_MODE_MULTI;
        }
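The VF-side request and the PF-side trust check can be combined into one self-contained sketch. The flag values, enum, and function names here are hypothetical stand-ins, not the kernel's definitions; only the decision logic follows the snippets shown in the slides.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the interface flags used in the text. */
#define FLAG_BROADCAST 0x1u
#define FLAG_MULTICAST 0x2u
#define FLAG_ALLMULTI  0x4u

/* Modes are ordered so that a larger value means a wider receive filter. */
enum xcast_mode {
    XCAST_MODE_NONE = 0,
    XCAST_MODE_MULTI,
    XCAST_MODE_ALLMULTI
};

/* VF side: derive the requested mode from the interface flags. */
static enum xcast_mode vf_requested_mode(unsigned int flags)
{
    if (flags & FLAG_ALLMULTI)
        return XCAST_MODE_ALLMULTI;
    if (flags & (FLAG_BROADCAST | FLAG_MULTICAST))
        return XCAST_MODE_MULTI;
    return XCAST_MODE_NONE;
}

/* PF side: an untrusted VF is limited to MULTI at most. */
static enum xcast_mode pf_granted_mode(enum xcast_mode requested, bool trusted)
{
    if (requested > XCAST_MODE_MULTI && !trusted)
        return XCAST_MODE_MULTI;
    return requested;
}
```

With this ordering, an untrusted VF that sets ALLMULTI is quietly granted MULTI instead of being refused, so the guest keeps working with the address-list behavior it had before.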
Implementing functionality to trust a VF
▌Added a new operation, ndo_set_vf_trust, to net_device_ops
▌Also added support for the VF trust operation to iproute2 (ip command)
# ip link set dev enp3s0f0 vf 1 trust on
(dmesg)
kernel: ixgbe 0000:03:00.0 enp3s0f0: VF 1 is trusted
kernel: ixgbevf 0000:03:10.2: NIC Link is Down
kernel: ixgbe 0000:03:00.0 enp3s0f0: VF Reset msg received from vf 1
kernel: ixgbevf 0000:03:10.2: NIC Link is Up 10 Gbps
# ip link set dev enp3s0f0 vf 1 trust off
(dmesg)
kernel: ixgbe 0000:03:00.0 enp3s0f0: VF 1 is not trusted
kernel: ixgbevf 0000:03:10.2: NIC Link is Down
kernel: ixgbe 0000:03:00.0 enp3s0f0: VF Reset msg received from vf 1
kernel: ixgbevf 0000:03:10.2: NIC Link is Up 10 Gbps
Note
When the trusted state is changed, the target VF is reset
Future work and possible security issues
▌Still, there are limitations
→ VLAN filtering
→ Unicast promiscuous
▌Possible security issues
→ VLAN filter is not isolated
→ Multicast hash table is not handled strictly
VLAN filtering
▌Disabling hardware VLAN filtering may solve this issue
▌But
→ It could break existing network features (DCB, FCoE)
→ A broadcast or multicast storm could degrade performance
→ Security issue: BC/MC packets can be seen by every VF
▌It may be okay if the network and VMs are well managed
▌Another point is that there is currently no suitable knob for this
Which command would be the right one to turn the VLAN filter off in general?
Unicast promiscuous
▌Supporting this feature in hardware would be best
▌Unfortunately, the 82599 has no VF unicast promiscuous feature
Hardware support in X540 and X550
▌Later NIC chips, X540 and X550, support VLAN promiscuous and unicast promiscuous mode per VF
▌The issues could be solved with an X540/X550 chip
▌A framework for X540/X550 that keeps the same semantics on the 82599 may be needed
PF VM L2 Control Register
Field     Bit(s)  Description
Reserved  21:0    Reserved (previously 23:0)
UPE       22      Unicast Promiscuous Enable (new feature bit)
VPE       23      VLAN Promiscuous Enable (new feature bit)
AUPE      24      Accept Untagged Packet Enable
ROMPE     25      Receive Overflow Multicast Packets
ROPE      26      Receive MAC Filters Overflow
BAM       27      Broadcast Accept
MPE       28      Multicast Promiscuous Enable
Reserved  31:29   Reserved
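The register layout listed on this slide translates directly into bit masks. The sketch below uses the bit positions from the slide; the macro, struct, and function names are hypothetical and are not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as listed for the PF VM L2 Control Register. */
#define L2CTL_UPE    (UINT32_C(1) << 22)  /* unicast promiscuous (new) */
#define L2CTL_VPE    (UINT32_C(1) << 23)  /* VLAN promiscuous (new) */
#define L2CTL_AUPE   (UINT32_C(1) << 24)  /* accept untagged packets */
#define L2CTL_ROMPE  (UINT32_C(1) << 25)  /* receive overflow multicast */
#define L2CTL_ROPE   (UINT32_C(1) << 26)  /* receive MAC filters overflow */
#define L2CTL_BAM    (UINT32_C(1) << 27)  /* broadcast accept */
#define L2CTL_MPE    (UINT32_C(1) << 28)  /* multicast promiscuous */

struct l2ctl_flags {
    bool upe, vpe, aupe, rompe, rope, bam, mpe;
};

/* Splits a raw register value into named per-pool flags. */
static struct l2ctl_flags l2ctl_decode(uint32_t reg)
{
    struct l2ctl_flags f;

    f.upe   = (reg & L2CTL_UPE) != 0;
    f.vpe   = (reg & L2CTL_VPE) != 0;
    f.aupe  = (reg & L2CTL_AUPE) != 0;
    f.rompe = (reg & L2CTL_ROMPE) != 0;
    f.rope  = (reg & L2CTL_ROPE) != 0;
    f.bam   = (reg & L2CTL_BAM) != 0;
    f.mpe   = (reg & L2CTL_MPE) != 0;
    return f;
}

/* Returns the register value with the given bit(s) set or cleared. */
static uint32_t l2ctl_update(uint32_t reg, uint32_t mask, bool on)
{
    return on ? (reg | mask) : (reg & ~mask);
}
```

On hardware where bits 22 and 23 are still reserved, code like this would have to leave UPE and VPE untouched, which is one reason a common framework across chip generations needs care.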
Ideas
▌Mirroring
▌Unicast hash
These hardware features are not yet used or supported by the driver
Possible security issues
▌In the current ixgbe/ixgbevf implementation, there are 2 issues to consider
→ VLAN filter is not isolated
Single VLAN filter table
No limit on requests for new VLANs from a VF
If one VM requests 64 VLANs, other VMs can never use different VLANs
→ Single multicast hash table
The multicast hash table is used to switch multicast packets to VFs
The SET_MULTICAST API only sets hash values; there is no unset functionality
Manipulating IP address assignment can give a VF multicast promiscuous behavior
VLAN filter is not isolated
▌If one VM uses all VLAN filter entries, other VMs can't create new VLANs
(VM0) # for i in `seq 1 64`; do
> echo "vlan $i"
> ip link add link ens6 name ens6.$i type vlan id $i
> done
vlan 1
vlan 2
:
vlan 63
vlan 64
RTNETLINK answers: Permission denied
(VM1) # ip link add link ens6 name ens6.64 type vlan id 64
RTNETLINK answers: Permission denied
Also, if a VM does not shut down gracefully,
its registered VLAN filter entries remain
Single multicast hash table
▌The ixgbe driver uses the multicast hash table (MTA register)
▌The Multicast Table Array is a 4 Kbit (4096-bit) bitmap
i. The 82599 makes a 12-bit hash value from the MAC address
ii. If the corresponding bit in the MTA is set, it is a hit
iii. The VF capability PFVML2FLT.ROMPE is checked and the packet is transferred
→ If every bit in the MTA is set, every multicast packet matches and is transferred to the pool
▌An MTA bit is set by the SET_MULTICAST API, and there is no unset API
▌The PFVML2FLT.ROMPE bit is always set on receiving a SET_MULTICAST API message
▌A long-running host may accumulate unnecessary bits in the MTA
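The set-only behavior of the table can be modeled in a few lines. This is a sketch under assumptions: the hash below takes the upper nibble of MAC byte 4 plus all of byte 5, which is believed to correspond to the driver's default filter type but is not confirmed here, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define MTA_BITS   4096
#define MTA_WORDS  (MTA_BITS / 32)

/* Hypothetical model of the shared Multicast Table Array. */
struct mta {
    uint32_t word[MTA_WORDS];
};

/* 12-bit hash of a MAC address (assumed filter type, see lead-in). */
static uint16_t mta_hash(const uint8_t mac[6])
{
    return (uint16_t)(((mac[4] >> 4) | ((uint16_t)mac[5] << 4)) & 0xFFF);
}

/* Sets the bit for this address. There is deliberately no clear operation. */
static void mta_set(struct mta *t, const uint8_t mac[6])
{
    uint16_t h = mta_hash(mac);

    t->word[h >> 5] |= UINT32_C(1) << (h & 0x1F);
}

/* True when a packet to this address would match the table. */
static bool mta_hit(const struct mta *t, const uint8_t mac[6])
{
    uint16_t h = mta_hash(mac);

    return ((t->word[h >> 5] >> (h & 0x1F)) & 1u) != 0;
}
```

Since nothing ever clears a bit, an address that was registered once keeps matching after the guest removes it, and any other address with the same 12-bit hash matches as well.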
Example: a VF can receive all multicast packets
▌Invoke SET_MULTICAST from the VF for each IP address
▌Assign 30 IP addresses again
▌Ping from another physical machine to the VM
# for i in `seq 1 30`; do
> ip -6 addr add 2001:$i::1:$i/64 dev ens6
> ip -6 addr del 2001:$i::1:$i/64 dev ens6
> done
# for i in `seq 1 30`; do
> ip -6 addr add 2001:$i::1:$i/64 dev ens6
> done
# for i in `seq 1 30`; do ping6 -w 1 -c 1 -I eth2 2001:$i::1:$i; done
:
PING 2001:30::1:30(2001:30::1:30) from 2001:30::1 eth2: 56 data bytes
64 bytes from 2001:30::1:30: icmp_seq=1 ttl=64 time=0.947 ms
--- 2001:30::1:30 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.947/0.947/0.947/0.000 ms
Summary
▌SR-IOV in Intel 82599 and ixgbe driver implementation
▌SR-IOV ixgbe driver limitations for NFV
VLAN filtering
Multicast addresses
Unicast promiscuous
▌Addressing Multicast addresses issue
Implement new mailbox API and ndo VF trust
▌Future work and possible security issues
▌Questions?
▌E-Mail: h-shimamoto@ct.jp.nec.com
 
Power vc for powervm deep dive tips &amp; tricks
Power vc for powervm deep dive tips &amp; tricksPower vc for powervm deep dive tips &amp; tricks
Power vc for powervm deep dive tips &amp; trickssolarisyougood
 
2014/09/02 Cisco UCS HPC @ ANL
2014/09/02 Cisco UCS HPC @ ANL2014/09/02 Cisco UCS HPC @ ANL
2014/09/02 Cisco UCS HPC @ ANLdgoodell
 
如何用k8s打造國產5G NFV平臺? 剖析經濟部5G核網技術的關鍵
如何用k8s打造國產5G NFV平臺?剖析經濟部5G核網技術的關鍵如何用k8s打造國產5G NFV平臺?剖析經濟部5G核網技術的關鍵
如何用k8s打造國產5G NFV平臺? 剖析經濟部5G核網技術的關鍵Jace Liang
 
Approaching hyperconvergedopenstack
Approaching hyperconvergedopenstackApproaching hyperconvergedopenstack
Approaching hyperconvergedopenstackIkuo Kumagai
 
Netsft2017 day in_life_of_nfv
Netsft2017 day in_life_of_nfvNetsft2017 day in_life_of_nfv
Netsft2017 day in_life_of_nfvIntel
 
The Next Step of OpenStack Evolution for NFV Deployments
The Next Step ofOpenStack Evolution for NFV DeploymentsThe Next Step ofOpenStack Evolution for NFV Deployments
The Next Step of OpenStack Evolution for NFV DeploymentsDirk Kutscher
 
Pushing Packets - How do the ML2 Mechanism Drivers Stack Up
Pushing Packets - How do the ML2 Mechanism Drivers Stack UpPushing Packets - How do the ML2 Mechanism Drivers Stack Up
Pushing Packets - How do the ML2 Mechanism Drivers Stack UpJames Denton
 
SR-IOV+KVM on Debian/Stable
SR-IOV+KVM on Debian/StableSR-IOV+KVM on Debian/Stable
SR-IOV+KVM on Debian/Stablejuet-y
 
Openstack v4 0
Openstack v4 0Openstack v4 0
Openstack v4 0sprdd
 
KVM Security Groups Under the Hood - Wido den Hollander - Your.Online
KVM Security Groups Under the Hood - Wido den Hollander - Your.OnlineKVM Security Groups Under the Hood - Wido den Hollander - Your.Online
KVM Security Groups Under the Hood - Wido den Hollander - Your.OnlineShapeBlue
 
PLNOG 5: Piotr Szołkowski - Data Center i nie tylko...
PLNOG 5: Piotr Szołkowski - Data Center i nie tylko...PLNOG 5: Piotr Szołkowski - Data Center i nie tylko...
PLNOG 5: Piotr Szołkowski - Data Center i nie tylko...PROIDEA
 

Semelhante a SR-IOV ixgbe Driver Limitations and Improvement (20)

10.) vxlan
10.) vxlan10.) vxlan
10.) vxlan
 
SR-IOV, KVM and Emulex OneConnect 10Gbps cards on Debian/Stable
SR-IOV, KVM and Emulex OneConnect 10Gbps cards on Debian/StableSR-IOV, KVM and Emulex OneConnect 10Gbps cards on Debian/Stable
SR-IOV, KVM and Emulex OneConnect 10Gbps cards on Debian/Stable
 
Icnd210 s02l01
Icnd210 s02l01Icnd210 s02l01
Icnd210 s02l01
 
SR-IOV, KVM and Intel X520 10Gbps cards on Debian/Stable
SR-IOV, KVM and Intel X520 10Gbps cards on Debian/StableSR-IOV, KVM and Intel X520 10Gbps cards on Debian/Stable
SR-IOV, KVM and Intel X520 10Gbps cards on Debian/Stable
 
82599 sriov vm configuration notes
82599 sriov vm configuration notes82599 sriov vm configuration notes
82599 sriov vm configuration notes
 
20141102 VyOS 1.1.0 and NIFTY Cloud New Features
20141102 VyOS 1.1.0 and NIFTY Cloud New Features20141102 VyOS 1.1.0 and NIFTY Cloud New Features
20141102 VyOS 1.1.0 and NIFTY Cloud New Features
 
VXLAN: Enhancements and Network Integration
VXLAN: Enhancements and Network Integration VXLAN: Enhancements and Network Integration
VXLAN: Enhancements and Network Integration
 
Virtualizing the Network to enable a Software Defined Infrastructure (SDI)
Virtualizing the Network to enable a Software Defined Infrastructure (SDI)Virtualizing the Network to enable a Software Defined Infrastructure (SDI)
Virtualizing the Network to enable a Software Defined Infrastructure (SDI)
 
Power vc for powervm deep dive tips &amp; tricks
Power vc for powervm deep dive tips &amp; tricksPower vc for powervm deep dive tips &amp; tricks
Power vc for powervm deep dive tips &amp; tricks
 
2014/09/02 Cisco UCS HPC @ ANL
2014/09/02 Cisco UCS HPC @ ANL2014/09/02 Cisco UCS HPC @ ANL
2014/09/02 Cisco UCS HPC @ ANL
 
如何用k8s打造國產5G NFV平臺? 剖析經濟部5G核網技術的關鍵
如何用k8s打造國產5G NFV平臺?剖析經濟部5G核網技術的關鍵如何用k8s打造國產5G NFV平臺?剖析經濟部5G核網技術的關鍵
如何用k8s打造國產5G NFV平臺? 剖析經濟部5G核網技術的關鍵
 
Approaching hyperconvergedopenstack
Approaching hyperconvergedopenstackApproaching hyperconvergedopenstack
Approaching hyperconvergedopenstack
 
Netsft2017 day in_life_of_nfv
Netsft2017 day in_life_of_nfvNetsft2017 day in_life_of_nfv
Netsft2017 day in_life_of_nfv
 
The Next Step of OpenStack Evolution for NFV Deployments
The Next Step ofOpenStack Evolution for NFV DeploymentsThe Next Step ofOpenStack Evolution for NFV Deployments
The Next Step of OpenStack Evolution for NFV Deployments
 
Pushing Packets - How do the ML2 Mechanism Drivers Stack Up
Pushing Packets - How do the ML2 Mechanism Drivers Stack UpPushing Packets - How do the ML2 Mechanism Drivers Stack Up
Pushing Packets - How do the ML2 Mechanism Drivers Stack Up
 
SR-IOV+KVM on Debian/Stable
SR-IOV+KVM on Debian/StableSR-IOV+KVM on Debian/Stable
SR-IOV+KVM on Debian/Stable
 
Ccnav5.org ccna 3-v50_final_exam_2014
Ccnav5.org ccna 3-v50_final_exam_2014Ccnav5.org ccna 3-v50_final_exam_2014
Ccnav5.org ccna 3-v50_final_exam_2014
 
Openstack v4 0
Openstack v4 0Openstack v4 0
Openstack v4 0
 
KVM Security Groups Under the Hood - Wido den Hollander - Your.Online
KVM Security Groups Under the Hood - Wido den Hollander - Your.OnlineKVM Security Groups Under the Hood - Wido den Hollander - Your.Online
KVM Security Groups Under the Hood - Wido den Hollander - Your.Online
 
PLNOG 5: Piotr Szołkowski - Data Center i nie tylko...
PLNOG 5: Piotr Szołkowski - Data Center i nie tylko...PLNOG 5: Piotr Szołkowski - Data Center i nie tylko...
PLNOG 5: Piotr Szołkowski - Data Center i nie tylko...
 

Mais de LF Events

Feature rich BTRFS is Getting Richer with Encryption
Feature rich BTRFS is Getting Richer with EncryptionFeature rich BTRFS is Getting Richer with Encryption
Feature rich BTRFS is Getting Richer with EncryptionLF Events
 
KASan in a Bare-Metal Hypervisor
 KASan in a Bare-Metal Hypervisor  KASan in a Bare-Metal Hypervisor
KASan in a Bare-Metal Hypervisor LF Events
 
Efficient kernel backporting
Efficient kernel backportingEfficient kernel backporting
Efficient kernel backportingLF Events
 
Raspberry pi Update - Encourage your IOT
Raspberry pi Update - Encourage your IOTRaspberry pi Update - Encourage your IOT
Raspberry pi Update - Encourage your IOTLF Events
 
Introduction to Open-O
Introduction to Open-OIntroduction to Open-O
Introduction to Open-OLF Events
 
CNCF and Fujitsu
CNCF and FujitsuCNCF and Fujitsu
CNCF and FujitsuLF Events
 
Linxu conj2016 96boards
Linxu conj2016 96boardsLinxu conj2016 96boards
Linxu conj2016 96boardsLF Events
 
Taking over to the Next Generation
Taking over to the Next GenerationTaking over to the Next Generation
Taking over to the Next GenerationLF Events
 
Learning From Real Practice of Providing Highly Available Hybrid Cloud Servic...
Learning From Real Practice of Providing Highly Available Hybrid Cloud Servic...Learning From Real Practice of Providing Highly Available Hybrid Cloud Servic...
Learning From Real Practice of Providing Highly Available Hybrid Cloud Servic...LF Events
 
Generating a Reproducible and Maintainable Embedded Linux Environment with Po...
Generating a Reproducible and Maintainable Embedded Linux Environment with Po...Generating a Reproducible and Maintainable Embedded Linux Environment with Po...
Generating a Reproducible and Maintainable Embedded Linux Environment with Po...LF Events
 
Secure IOT Gateway
Secure IOT GatewaySecure IOT Gateway
Secure IOT GatewayLF Events
 
Trading Derivatives on Hyperledger
Trading Derivatives on HyperledgerTrading Derivatives on Hyperledger
Trading Derivatives on HyperledgerLF Events
 
Introducing Oracle Linux and Securing It With ksplice
Introducing Oracle Linux and Securing It With kspliceIntroducing Oracle Linux and Securing It With ksplice
Introducing Oracle Linux and Securing It With kspliceLF Events
 
Containers: Don't Skeu Them Up, Use Microservices Instead
Containers: Don't Skeu Them Up, Use Microservices InsteadContainers: Don't Skeu Them Up, Use Microservices Instead
Containers: Don't Skeu Them Up, Use Microservices InsteadLF Events
 

Mais de LF Events (14)

Feature rich BTRFS is Getting Richer with Encryption
Feature rich BTRFS is Getting Richer with EncryptionFeature rich BTRFS is Getting Richer with Encryption
Feature rich BTRFS is Getting Richer with Encryption
 
KASan in a Bare-Metal Hypervisor
 KASan in a Bare-Metal Hypervisor  KASan in a Bare-Metal Hypervisor
KASan in a Bare-Metal Hypervisor
 
Efficient kernel backporting
Efficient kernel backportingEfficient kernel backporting
Efficient kernel backporting
 
Raspberry pi Update - Encourage your IOT
Raspberry pi Update - Encourage your IOTRaspberry pi Update - Encourage your IOT
Raspberry pi Update - Encourage your IOT
 
Introduction to Open-O
Introduction to Open-OIntroduction to Open-O
Introduction to Open-O
 
CNCF and Fujitsu
CNCF and FujitsuCNCF and Fujitsu
CNCF and Fujitsu
 
Linxu conj2016 96boards
Linxu conj2016 96boardsLinxu conj2016 96boards
Linxu conj2016 96boards
 
Taking over to the Next Generation
Taking over to the Next GenerationTaking over to the Next Generation
Taking over to the Next Generation
 
Learning From Real Practice of Providing Highly Available Hybrid Cloud Servic...
Learning From Real Practice of Providing Highly Available Hybrid Cloud Servic...Learning From Real Practice of Providing Highly Available Hybrid Cloud Servic...
Learning From Real Practice of Providing Highly Available Hybrid Cloud Servic...
 
Generating a Reproducible and Maintainable Embedded Linux Environment with Po...
Generating a Reproducible and Maintainable Embedded Linux Environment with Po...Generating a Reproducible and Maintainable Embedded Linux Environment with Po...
Generating a Reproducible and Maintainable Embedded Linux Environment with Po...
 
Secure IOT Gateway
Secure IOT GatewaySecure IOT Gateway
Secure IOT Gateway
 
Trading Derivatives on Hyperledger
Trading Derivatives on HyperledgerTrading Derivatives on Hyperledger
Trading Derivatives on Hyperledger
 
Introducing Oracle Linux and Securing It With ksplice
Introducing Oracle Linux and Securing It With kspliceIntroducing Oracle Linux and Securing It With ksplice
Introducing Oracle Linux and Securing It With ksplice
 
Containers: Don't Skeu Them Up, Use Microservices Instead
Containers: Don't Skeu Them Up, Use Microservices InsteadContainers: Don't Skeu Them Up, Use Microservices Instead
Containers: Don't Skeu Them Up, Use Microservices Instead
 

Último

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 

Último (20)

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

SR-IOV ixgbe Driver Limitations and Improvement

  • 1. SR-IOV ixgbe driver limitations and improvement 2016-07-15 Hiroshi Shimamoto
  • 2.
  • 3. Who am I
       Hiroshi Shimamoto
       ▌Working for NEC
       ▌Main activities in open source community
         Carrier Grade Linux
         Virtualization
         Packet Processing, DPDK
         to make Linux systems usable for telecom carriers
  • 4. Outline
       SR-IOV and ixgbe implementation
       SR-IOV ixgbe limitations for NFV
       Addressing the issues
       Future work and possible security issues
  • 5. What is SR-IOV
       ▌SR-IOV is a device virtualization technology
         Most people here probably know it
       ▌An SR-IOV enabled PCI device provides a PF (Physical Function) and VFs (Virtual Functions)
         A VF is used directly in a VM
       [Figure: a PCI device with SR-IOV exposes a PF and VF 0, VF 1, VF 2, ...; each VF is handed to a VM by PCI pass through]
  • 6. Why SR-IOV?
       ▌I/O in virtual machines is a performance bottleneck
       ▌Especially packet switching between VM and host
       ▌Technologies were introduced to address network performance
         PCI pass through
         SR-IOV
         DPDK
  • 7. Comparison of SR-IOV and other technologies
       ▌Brief pros and cons of each technology
              PCI pass through        SR-IOV                  DPDK
       pros   Hardware performance    Multiple VMs            Multiple VMs
                                      Hardware performance    Software flexibility
                                                              No special driver in guest
       cons   Single VM               Device support          Consumes much CPU power
                                      VF driver in guest
  • 8. How SR-IOV is implemented in Intel 82599
       ▌Target device: Intel 82599 10GbE Controller, with the ixgbe driver for the PF and the ixgbevf driver for the VF
       ▌There are 64 VMDq (Virtual Machine Device queue) in 82599
         Each of these queues is called a pool
         For SR-IOV, each VF has an associated pool
         82599 switches a packet to a pool (VF)
       ▌The current ixgbe driver mapping
         VF    pool
         0     0
         1     1
         :     :
         N     N
         PF    N+1
       Intel 82599 10GbE controller datasheet URL
       http://www.intel.com/content/www/us/en/embedded/products/networking/82599-10-gbe-controller-datasheet.html
  • 9. Packet switching in Intel 82599
       ▌In SR-IOV mode
       ▌82599 chip switches a packet to the corresponding pool
       ref. datasheet 7.10.3 Packet Switching
       [Figure: 82599 NIC with a packet switch between Rx/Tx and the pools (VF 0, VF 1, VF 2, ..., PF)]
  • 10. Rx packet switching
       ▌Receive a packet from outside
       ▌Steps
         1. Exact unicast or multicast match
         2. Broadcast
         3. Unicast hash*
         4. Multicast hash
         5. Multicast promiscuous*
         6. VLAN group
         7. Default pool
         8. Ethertype filters
         9. PFVFRE
         10. Mirroring*
       Note: * the driver did not support these features
       ref. datasheet 7.10.3.3 Rx Packet Switching
       [Figure: packet switch delivering to VF 0, VF 1, VF 2, ..., PF]
  • 11. Tx packet switching
       ▌Receive a packet from a device (PF and/or VF)
       ▌Steps
         1. Exact unicast or multicast match
         2. Broadcast
         3. Unicast hash*
         4. Multicast hash
         5. Multicast promiscuous*
         6. Filter source pool
         7. VLAN groups
         8. Forwarding to the network
         9. PFVFRE
         10. Mirroring*
         11. PFVFRE
       Note: * the driver did not support these features
       ref. datasheet 7.10.3.4 Tx Packet Switching
       [Figure: packet switch receiving from VF 0, VF 1, VF 2, ..., PF]
  • 12. Use Intel 82599 NIC SR-IOV as switch
       ▌Typical use case
       [Figure: VM0 and VM1 each run a guest OS with a VF driver; VM0's NIC is VF 0 and VM1's NIC is VF 1 on the Intel 82599, whose NIC switch connects VF 0, VF 1, VF 2, ... and the PF]
  • 13. NFV (Network Function Virtualization)
       ▌NFV uses virtualization technology to implement network functions on general purpose hardware, to improve flexibility, agility and efficiency
       ▌The network function is virtualized
         → A VNF (Virtual Network Function) is realized on a VM
       [Figure: IMS, EPC, L3SW, GW, Data Base and App Server move from dedicated/specific hardware to VMs on a virtualization layer over general purpose hardware, with orchestration/resource management]
  • 14. SR-IOV ixgbe driver limitations for NFV
       ▌With Intel 82599 and SR-IOV, there are 3 critical limitations for NFV
         VLAN filtering
         Multicast addresses
         Unicast promiscuous
         They come from hardware limitations and software (driver) limitations
       ▌Explained with 2 use cases
         Router
         Layer 2 switch
       [Figure: NFV use case, where each VNF runs on a VM whose guest OS VF driver uses VF 0 or VF 1 on the Intel 82599 NIC switch]
  • 15. NFV use case 1
       ▌Network Function: Router
         VLAN is used to virtualize the network
       [Figure: user equipment reaches the VNF through the access network on VLAN:1, VLAN:2, VLAN:3, ..., VLAN:N; the VNF terminates many VLANs and IPs and connects to the private network]
  • 16. VLAN filtering limitation
       ▌Intel 82599 has a hardware VLAN filter
       ▌The problem comes from
         The VLAN filter has only 64 entries
         The driver always enables the VLAN filter if SR-IOV is enabled
         → Only 64 VLANs can be used with SR-IOV
  • 17. Example: VF device returns an error when adding a VLAN
       ▌Try to create 64 VLANs on the VM
         # for i in `seq 1 64`; do
         > echo "vlan $i"
         > ip link add link ens6 name ens6.$i type vlan id $i
         > done
         vlan 1
         vlan 2
         :
         vlan 63
         vlan 64
         RTNETLINK answers: Permission denied
       Note: the ixgbe driver uses the first entry for VLAN 0, so the actual number of usable VLANs is 63
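The arithmetic behind the failure on VLAN 64 can be sketched with a toy model of a 64-entry filter table whose first entry is reserved for VLAN 0. This is only an illustration of the counting described in the note; `vlan_filter_table`, `vlan_filter_init` and `vlan_filter_add` are hypothetical names, not ixgbe functions, and the real driver manages hardware registers rather than an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VLAN_FILTER_ENTRIES 64  /* entry count stated on the slide */

/* Hypothetical software model of the filter table (not driver code). */
struct vlan_filter_table {
    bool     used[VLAN_FILTER_ENTRIES];
    uint16_t vid[VLAN_FILTER_ENTRIES];
};

/* Clear the table and reserve entry 0 for VLAN 0, as the note describes. */
static void vlan_filter_init(struct vlan_filter_table *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < VLAN_FILTER_ENTRIES; i++) {
        t->used[i] = false;
        t->vid[i] = 0;
    }
    t->used[0] = true;
}

/* Return the entry index that holds vid, or -1 when no entry is free. */
static int vlan_filter_add(struct vlan_filter_table *t, uint16_t vid)
{
    int free_slot = -1;

    assert(t != NULL);
    for (int i = 0; i < VLAN_FILTER_ENTRIES; i++) {
        if (t->used[i] && t->vid[i] == vid)
            return i;                 /* already registered */
        if (!t->used[i] && free_slot < 0)
            free_slot = i;            /* remember the first free entry */
    }
    if (free_slot < 0)
        return -1;                    /* table full: the add is refused */
    t->used[free_slot] = true;
    t->vid[free_slot] = vid;
    return free_slot;
}
```

In this model, adding VLANs 1 through 63 fills entries 1 through 63, and the 64th add has nowhere to go, which corresponds to the error printed after `vlan 64` above.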
  • 18. Enabling VLAN filter automatically in driver (latest)
       ▌upstream
         static void ixgbe_vlan_promisc_enable(struct ixgbe_adapter *adapter)
         {
                 :
                 switch (hw->mac.type) {
                 case ixgbe_mac_82599EB:
                 case ixgbe_mac_X540:
                 case ixgbe_mac_X550:
                 case ixgbe_mac_X550EM_x:
                 case ixgbe_mac_x550em_a:
                 default:
                         if (adapter->flags & IXGBE_FLAG_VMDQ_ENABLED)
                                 break;
                         /* fall through */
                 case ixgbe_mac_82598EB:
                         /* legacy case, we can just disable VLAN filtering */
                         vlnctrl = IXGBE_READ_REG(hw, IXGBE_VLNCTRL);
                         vlnctrl &= ~(IXGBE_VLNCTRL_VFE | IXGBE_VLNCTRL_CFIEN);
                         IXGBE_WRITE_REG(hw, IXGBE_VLNCTRL, vlnctrl);
                         return;
                 }
  • 19. Enabling VLAN filter automatically in driver (old)
       ▌previous version
         void ixgbe_set_rx_mode(struct net_device *netdev)
         {
                 :
                 :
                 vlnctrl &= ~(IXGBE_VLNCTRL_VFE | IXGBE_VLNCTRL_CFIEN);
                 if (netdev->flags & IFF_PROMISC) {
                         hw->addr_ctrl.user_set_promisc = true;
                         fctrl |= (IXGBE_FCTRL_UPE | IXGBE_FCTRL_MPE);
                         vmolr |= IXGBE_VMOLR_MPE;
                         /* Only disable hardware filter vlans in promiscuous mode
                          * if SR-IOV and VMDQ are disabled - otherwise ensure
                          * that hardware VLAN filters remain enabled.
                          */
                         if (adapter->flags & (IXGBE_FLAG_VMDQ_ENABLED |
                                               IXGBE_FLAG_SRIOV_ENABLED))
                                 vlnctrl |= (IXGBE_VLNCTRL_VFE | IXGBE_VLNCTRL_CFIEN);
  • 20. Multicast address limitation
       ▌The interface (PF-VF mailbox API) limits the number of addresses
       ▌Only the first 30 multicast addresses can be registered
       ▌Addresses beyond the limit are silently dropped
         → Multicast promiscuous mode is a solution
         /* Each entry in the list uses 1 16 bit word. We have 30
          * 16 bit words available in our HW msg buffer (minus 1 for the
          * msg type). That's 30 hash values if we pack 'em right. If
          * there are more than 30 MC addresses to add then punt the
          * extras for now and then add code to handle more than 30 later.
          * It would be unusual for a server to request that many multi-cast
          * addresses except for in large enterprise network environments.
          */
         cnt = netdev_mc_count(netdev);
         if (cnt > 30)
                 cnt = 30;
         msgbuf[0] = IXGBE_VF_SET_MULTICAST;
         msgbuf[0] |= cnt << IXGBE_VT_MSGINFO_SHIFT;
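To make the silent drop visible, the capping logic in the excerpt above can be restated as a small standalone function that also reports how many addresses were left out. This is a sketch, not driver code: `mc_request` and `build_mc_request` are hypothetical, and the numeric values of `MC_MSG_TYPE` and `MC_MSGINFO_SHIFT` are placeholders standing in for `IXGBE_VF_SET_MULTICAST` and `IXGBE_VT_MSGINFO_SHIFT`.

```c
#include <assert.h>
#include <stdint.h>

#define MC_MSG_MAX_ENTRIES 30u   /* limit stated on the slide */
#define MC_MSG_TYPE        0x03u /* placeholder for IXGBE_VF_SET_MULTICAST */
#define MC_MSGINFO_SHIFT   16    /* placeholder for IXGBE_VT_MSGINFO_SHIFT */

/* Hypothetical summary of one SET_MULTICAST request. */
struct mc_request {
    uint32_t     header;   /* first mailbox word: type plus entry count */
    unsigned int sent;     /* entries that fit into the message */
    unsigned int dropped;  /* entries beyond the limit, never sent to the PF */
};

/* Cap the entry count at the mailbox limit and pack the header word. */
static struct mc_request build_mc_request(unsigned int requested)
{
    struct mc_request req;

    req.sent = requested > MC_MSG_MAX_ENTRIES ? MC_MSG_MAX_ENTRIES : requested;
    req.dropped = requested - req.sent;
    req.header = MC_MSG_TYPE | ((uint32_t)req.sent << MC_MSGINFO_SHIFT);
    assert(req.sent <= MC_MSG_MAX_ENTRIES);
    return req;
}
```

With 45 addresses requested, this model sends 30 and drops 15, and nothing in the header tells the PF that anything was left out.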
  • 21. Why so many multicast addresses?
       ▌Our case is to support many IPv6 addresses on a VF
         For Neighbor Discovery, each IPv6 address requires a corresponding multicast address
         Unicast address                     2001:0000:0000:0000:0000:0000:0012:3456
         Solicited node multicast address    ff02:0000:0000:0000:0000:0001:ff12:3456
         Multicast MAC                       33:33:ff:12:34:56
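The mapping on this slide follows the standard IPv6 rules: the solicited-node group is ff02::1:ff followed by the low 24 bits of the unicast address, and the Ethernet multicast MAC is 33:33 followed by the low 32 bits of the group address. A minimal sketch of both steps is below; the function names are made up for illustration and addresses are plain 16-byte arrays in network byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build ff02::1:ffXX:XXXX from the low 24 bits of a unicast IPv6 address. */
static void solicited_node_mcast(const uint8_t unicast[16], uint8_t mcast[16])
{
    assert(unicast != NULL && mcast != NULL);
    memset(mcast, 0, 16);
    mcast[0]  = 0xff;                      /* ff02: link-local multicast */
    mcast[1]  = 0x02;
    mcast[11] = 0x01;                      /* ...:0001:ffXX:XXXX */
    mcast[12] = 0xff;
    memcpy(&mcast[13], &unicast[13], 3);   /* low 24 bits of the unicast address */
}

/* Build the Ethernet MAC 33:33:xx:xx:xx:xx from the low 32 bits of the group. */
static void ipv6_mcast_to_mac(const uint8_t mcast[16], uint8_t mac[6])
{
    assert(mcast != NULL && mac != NULL);
    mac[0] = 0x33;
    mac[1] = 0x33;
    memcpy(&mac[2], &mcast[12], 4);
}
```

Feeding in the slide's unicast address reproduces its multicast MAC. Every unicast address with different low 24 bits yields a different MAC, so a VF that carries many IPv6 addresses needs roughly one multicast filter entry per address.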
  • 22. Example: Cannot receive multicast packet
       ▌Some multicast packets are not passed to the VF, so Neighbor Discovery fails
       ▌Assign 30 IPv6 addresses on the VM (different network addresses)
         # for i in `seq 1 30`; do
         > ip -6 addr add 2001:$i::1:$i/64 dev ens6
         > done
       ▌Ping from another physical machine to the VM
         # for i in `seq 1 30`; do ping6 -w 1 -c 1 -I eth2 2001:$i::1:$i; done
         :
         PING 2001:28::1:28(2001:28::1:28) from 2001:28::1 eth2: 56 data bytes
         --- 2001:28::1:28 ping statistics ---
         2 packets transmitted, 0 received, 100% packet loss, time 1000ms
  • 23. NFV use case 2
       ▌Network Function: L2 switch
       [Figure: user equipment, access network, VNF and private network, left to right; hosts MAC:A and MAC:B sit behind the VNF, and frames with dst:A and dst:B arrive at its VF]
       For left-to-right traffic, frames destined to both MAC:A and MAC:B must be receivable through this VF
  • 24. Unicast promiscuous
       ▌Only packets which match a registered MAC address can be received
       ▌The VNF should be able to handle every MAC address in the network
       ▌It is hard to register all MAC addresses in an L2 network
         → We want a unicast promiscuous feature in the VF
  • 25. SR-IOV limitations for NFV and current status
       ▌VLAN filtering
         Only 64 VLANs can be used
         → Proposed to add an option to disable the hardware VLAN filter, but not accepted
       ▌Multicast addresses
         Only 30 multicast addresses can be used
         → Implemented VF multicast promiscuous mode in the ixgbe/ixgbevf drivers in Linux 4.4
       ▌Unicast promiscuous
         Only a single unicast MAC address can be used
         → Hardware limitation
  • 26. Addressing Multicast addresses limitation
    ▌The 82599 has a hardware feature: VF Multicast promiscuous mode
    ▌But the driver did not support that feature
    ▌Implement a new PF-VF mailbox API in ixgbe and ixgbevf
    ▌First attempt: automatically enable VF multicast promiscuous mode when the number of addresses exceeds 30
    ▌Suggested alternative: implement an xcast mode in the VF
      The ALLMULTI flag means that every multicast packet is received on this device
      → Accepted in Linux 4.4
  • 27. PF-VF mailbox APIs
    ▌There are mailbox APIs for communication between PF and VF
      version  API                description
      legacy   RESET              VF requests reset
               SET_MAC_ADDR       VF requests PF to set MAC addr
               SET_MULTICAST      VF requests PF to set MC addr
               SET_VLAN           VF requests PF to set VLAN
      1.0      SET_LPE            VF requests PF to set VMOLR.LPE
               SET_MACVLAN        VF requests PF for unicast filter
               API_NEGOTIATE      negotiate API version
      1.1      GET_QUEUES         get queue configuration
      1.2      GET_RETA           VF requests for RETA
               GET_RSS_KEY        get RSS key
               UPDATE_XCAST_MODE  VF requests PF to set MC mode
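The version column on slide 27 suggests a simple gate: a request is only meaningful once the negotiated API version is at least the version that introduced it. The sketch below models that idea in plain C. All enum and function names are hypothetical, numeric opcodes are deliberately left out, and treating API_NEGOTIATE as always allowed is an assumption made so that negotiation can start at all; it is not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative names only; these are not the kernel's definitions. */
enum mbx_api_version { API_LEGACY = 0, API_10, API_11, API_12 };

enum mbx_request {
    REQ_RESET, REQ_SET_MAC_ADDR, REQ_SET_MULTICAST, REQ_SET_VLAN, /* legacy */
    REQ_SET_LPE, REQ_SET_MACVLAN, REQ_API_NEGOTIATE,              /* 1.0 */
    REQ_GET_QUEUES,                                               /* 1.1 */
    REQ_GET_RETA, REQ_GET_RSS_KEY, REQ_UPDATE_XCAST_MODE,         /* 1.2 */
    REQ_COUNT
};

/* Minimum API version for each request, following the slide's table. */
enum mbx_api_version mbx_min_version(enum mbx_request req)
{
    assert(req < REQ_COUNT);
    if (req <= REQ_SET_VLAN)
        return API_LEGACY;
    if (req <= REQ_API_NEGOTIATE)
        return API_10;
    if (req == REQ_GET_QUEUES)
        return API_11;
    return API_12;
}

/* Assumption: the negotiation request itself is always accepted. */
bool mbx_request_allowed(enum mbx_request req, enum mbx_api_version negotiated)
{
    if (req == REQ_API_NEGOTIATE)
        return true;
    return negotiated >= mbx_min_version(req);
}
```

Under this model a VF that negotiated only version 1.1 cannot use UPDATE_XCAST_MODE, so a new ixgbevf talking to an old ixgbe PF has to fall back to the plain multicast address list.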
  • 28. Implementation (PF ixgbe)
    ▌Handle UPDATE_XCAST_MODE API
      --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
      +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
      :
      @@ -1066,6 +1122,9 @@ static int ixgbe_rcv_msg_from_vf(struct ixgbe_adapter *adapter, u32 vf)
              case IXGBE_VF_GET_RSS_KEY:
                      retval = ixgbe_get_vf_rss_key(adapter, msgbuf, vf);
                      break;
      +       case IXGBE_VF_UPDATE_XCAST_MODE:
      +               retval = ixgbe_update_vf_xcast_mode(adapter, msgbuf, vf);
      +               break;
              default:
                      e_err(drv, "Unhandled Msg %8.8x\n", msgbuf[0]);
                      retval = IXGBE_ERR_MBX;
  • 29. Implementation (VF ixgbevf)
    ▌Request to PF (from ixgbevf_set_rx_mode)
      --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
      +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
      @@ -1894,9 +1894,17 @@ static void ixgbevf_set_rx_mode(struct net_device *netdev)
       {
              struct ixgbevf_adapter *adapter = netdev_priv(netdev);
              struct ixgbe_hw *hw = &adapter->hw;
      +       unsigned int flags = netdev->flags;
      +       int xcast_mode;
      +
      +       xcast_mode = (flags & IFF_ALLMULTI) ? IXGBEVF_XCAST_MODE_ALLMULTI :
      +                    (flags & (IFF_BROADCAST | IFF_MULTICAST)) ?
      +                    IXGBEVF_XCAST_MODE_MULTI : IXGBEVF_XCAST_MODE_NONE;
       
              spin_lock_bh(&adapter->mbx_lock);
       
      +       hw->mac.ops.update_xcast_mode(hw, netdev, xcast_mode);
      +
              /* reprogram multicast list */
              hw->mac.ops.update_mc_addr_list(hw, netdev);
  • 30. Security consideration
    ▌Enabling VF Multicast promiscuous mode raises security issues
      → The VF can see all multicast packets through this device
      → It can hurt performance
         The NIC duplicates packets and does DMA to each pool
      → Allow only a trusted VF to enable Multicast promiscuous mode
      static int ixgbe_update_vf_xcast_mode(struct ixgbe_adapter *adapter,
                                            u32 *msgbuf, u32 vf)
      {
              :
              if (xcast_mode > IXGBEVF_XCAST_MODE_MULTI &&
                  !adapter->vfinfo[vf].trusted) {
                      xcast_mode = IXGBEVF_XCAST_MODE_MULTI;
              }
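The two halves of the xcast mode handling (the VF choosing a mode from its interface flags, and the PF clamping the request for untrusted VFs) can be modeled outside the kernel as below. This is a sketch under assumptions: the flag and mode constants are hypothetical stand-ins for the kernel's IFF_* and IXGBEVF_XCAST_MODE_* values, and only the ordering NONE < MULTI < ALLMULTI, which the comparison on slide 30 relies on, is taken from the slides.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for IFF_* flags and IXGBEVF_XCAST_MODE_* values. */
enum { FLAG_BROADCAST = 1 << 0, FLAG_MULTICAST = 1 << 1, FLAG_ALLMULTI = 1 << 2 };
enum xcast_mode { XCAST_MODE_NONE = 0, XCAST_MODE_MULTI, XCAST_MODE_ALLMULTI };

/* VF side: derive the requested mode from the interface flags. */
enum xcast_mode vf_pick_xcast_mode(unsigned int flags)
{
    if (flags & FLAG_ALLMULTI)
        return XCAST_MODE_ALLMULTI;
    if (flags & (FLAG_BROADCAST | FLAG_MULTICAST))
        return XCAST_MODE_MULTI;
    return XCAST_MODE_NONE;
}

/* PF side: an untrusted VF is never granted more than MULTI. */
enum xcast_mode pf_grant_xcast_mode(enum xcast_mode requested, bool trusted)
{
    assert(requested <= XCAST_MODE_ALLMULTI);
    if (requested > XCAST_MODE_MULTI && !trusted)
        return XCAST_MODE_MULTI;
    return requested;
}
```

Keeping the policy on the PF side means a guest cannot grant itself multicast promiscuous mode by patching its own VF driver; the host administrator decides through the trust setting.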
  • 31. Implement functionality to trust VF
    ▌Add a new operation “set_vf_trust” in net_device_ops
    ▌Also add support for the VF trust operation in iproute2 (ip command)
      # ip link set dev enp3s0f0 vf 1 trust on
      (dmesg)
      kernel: ixgbe 0000:03:00.0 enp3s0f0: VF 1 is trusted
      kernel: ixgbevf 0000:03:10.2: NIC Link is Down
      kernel: ixgbe 0000:03:00.0 enp3s0f0: VF Reset msg received from vf 1
      kernel: ixgbevf 0000:03:10.2: NIC Link is Up 10 Gbps
      # ip link set dev enp3s0f0 vf 1 trust off
      (dmesg)
      kernel: ixgbe 0000:03:00.0 enp3s0f0: VF 1 is not trusted
      kernel: ixgbevf 0000:03:10.2: NIC Link is Down
      kernel: ixgbe 0000:03:00.0 enp3s0f0: VF Reset msg received from vf 1
      kernel: ixgbevf 0000:03:10.2: NIC Link is Up 10 Gbps
      Note: when the trusted state is changed, the target VF is reset
  • 32. Future work and possible security issues
    ▌Still, there are limitations
      → VLAN filtering
      → Unicast promiscuous
    ▌Possible security issues
      → VLAN filter is not isolated
      → Multicast hash table is not handled strictly
  • 33. VLAN filtering
    ▌Disabling hardware VLAN filtering may solve this issue
    ▌But
      → It could break existing network features (DCB, FCoE)
      → A broadcast (multicast) storm could cause performance degradation
      → Security issue: BC/MC packets can be seen in every VF
    ▌Maybe okay if the network and VMs are well managed
    ▌Another point is that there is currently no suitable knob to do it
      Which command would be the right general way to turn the VLAN filter off?
  • 34. Unicast promiscuous
    ▌Supporting this feature in hardware would be best
    ▌Unfortunately, the 82599 has no VF unicast promiscuous feature
  • 35. Hardware support in X540 and X550
    ▌Later NIC chips, X540 and X550, support VLAN promiscuous and Unicast promiscuous mode per VF
    ▌The issues could be solved with an X540/X550 chip
    ▌A framework for X540/X550 that keeps the same semantics for 82599 may be needed
      PF VM L2 Control Register
      Field     Bit(s)             Description
      Reserved  21:0 (was 23:0)    Reserved
      UPE       22                 Unicast Promiscuous Enable (new feature bit)
      VPE       23                 VLAN Promiscuous Enable (new feature bit)
      AUPE      24                 Accept Untagged Packet Enable
      ROMPE     25                 Receive Overflow Multicast Packets
      ROPE      26                 Receive MAC Filters Overflow
      BAM       27                 Broadcast Accept
      MPE       28                 Multicast Promiscuous Enable
      Reserved  31:29              Reserved
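The register layout on slide 35 translates directly into bit masks. The following is a small sketch of how per-VF promiscuous bits could be composed from that table; the macro and function names are illustrative rather than the driver's, and whether a given chip honors UPE/VPE is exactly the hardware dependency the slide describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions from the slide's table; macro names are illustrative. */
#define L2CTL_UPE    (UINT32_C(1) << 22)  /* Unicast Promiscuous Enable */
#define L2CTL_VPE    (UINT32_C(1) << 23)  /* VLAN Promiscuous Enable */
#define L2CTL_AUPE   (UINT32_C(1) << 24)  /* Accept Untagged Packet Enable */
#define L2CTL_ROMPE  (UINT32_C(1) << 25)  /* Receive Overflow Multicast Packets */
#define L2CTL_ROPE   (UINT32_C(1) << 26)  /* Receive MAC Filters Overflow */
#define L2CTL_BAM    (UINT32_C(1) << 27)  /* Broadcast Accept */
#define L2CTL_MPE    (UINT32_C(1) << 28)  /* Multicast Promiscuous Enable */
#define L2CTL_RESERVED_MASK (UINT32_C(0x003FFFFF) | UINT32_C(0xE0000000))

/* Return a new register value with the three promiscuous bits set or
 * cleared as requested; every other bit is preserved. */
uint32_t l2ctl_set_promisc(uint32_t reg, bool unicast, bool vlan, bool multicast)
{
    uint32_t updated = reg & ~(L2CTL_UPE | L2CTL_VPE | L2CTL_MPE);

    if (unicast)
        updated |= L2CTL_UPE;
    if (vlan)
        updated |= L2CTL_VPE;
    if (multicast)
        updated |= L2CTL_MPE;

    /* Reserved bits must come through unchanged. */
    assert((updated & L2CTL_RESERVED_MASK) == (reg & L2CTL_RESERVED_MASK));
    return updated;
}
```

On an 82599, bits 22 and 23 fall inside the reserved range, so a common framework would have to refuse or emulate unicast and VLAN promiscuous requests there.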
  • 36. Ideas
    ▌Mirroring
    ▌Unicast hash
      These features are not used/supported in the driver
  • 37. Possible security issues
    ▌In the current ixgbe/ixgbevf implementation, there are two issues to consider
      → VLAN filter is not isolated
         Single VLAN filter table
         No limit on new VLAN requests from a VF
         If one VM requests 64 VLANs, other VMs can never use different VLANs
      → Single multicast hash table
         The multicast hash table is used to switch multicast packets to VFs
         The SET_MULTICAST API only sets hash values; there is no unset functionality
         Manipulating IP address assignment can give a VF multicast promiscuous behavior
  • 38. VLAN filter is not isolated
    ▌If one VM uses all VLANs, other VMs cannot create a new VLAN
      (VM0)
      # for i in `seq 1 64`; do
      > echo "vlan $i"
      > ip link add link ens6 name ens6.$i type vlan id $i
      > done
      vlan 1
      vlan 2
      :
      vlan 63
      vlan 64
      RTNETLINK answers: Permission denied
      (VM1)
      # ip link add link ens6 name ens6.64 type vlan id 64
      RTNETLINK answers: Permission denied
      Also, if a VM does not shut down gracefully, its registered VLAN filter entries remain
  • 39. Single multicast hash table
    ▌The ixgbe driver uses the multicast hash table (MTA register)
    ▌The Multicast Table Array is a 4 Kbit bitmap
      i. The 82599 makes a 12-bit hash value from the MAC address
      ii. If the corresponding bit in the MTA is set, it is a hit
      iii. The VF capability PFVML2FLT.ROMPE is checked and the packet is transferred
      → If every bit in the MTA is set, every multicast packet matches and is transferred to the pool
    ▌MTA bits are set by the SET_MULTICAST API, and there is no unset API
    ▌The PFVML2FLT.ROMPE bit is always set on receiving a SET_MULTICAST API message
    ▌A long-lived host may accumulate unnecessary bits in the MTA
  • 40. Example: Can receive all multicast packets
    ▌Invoke SET_MULTICAST from the VF for each IP address
      # for i in `seq 1 30`; do
      > ip -6 addr add 2001:$i::1:$i/64 dev ens6
      > ip -6 addr del 2001:$i::1:$i/64 dev ens6
      > done
    ▌Assign the 30 IP addresses again
      # for i in `seq 1 30`; do
      > ip -6 addr add 2001:$i::1:$i/64 dev ens6
      > done
    ▌Ping from another physical machine to the VM
      # for i in `seq 1 30`; do ping6 -w 1 -c 1 -I eth2 2001:$i::1:$i; done
      :
      PING 2001:30::1:30(2001:30::1:30) from 2001:30::1 eth2: 56 data bytes
      64 bytes from 2001:30::1:30: icmp_seq=1 ttl=64 time=0.947 ms
      --- 2001:30::1:30 ping statistics ---
      1 packets transmitted, 1 received, 0% packet loss, time 0ms
      rtt min/avg/max/mdev = 0.947/0.947/0.947/0.000 ms
  • 41. Summary
    ▌SR-IOV in Intel 82599 and ixgbe driver implementation
    ▌SR-IOV ixgbe driver limitations for NFV
      VLAN filtering
      Multicast addresses
      Unicast promiscuous
    ▌Addressing the Multicast addresses issue
      Implemented a new mailbox API and the ndo VF trust operation
    ▌Future work and possible security issues
    ▌Questions?
    ▌E-Mail: h-shimamoto@ct.jp.nec.com