SR-IOV config to reproduce (the important part is that not all VFs are "passthrough: true"):

/etc/iov/ixl1.conf
---------
PF { device : "ixl1"; num_vfs : 4; }
DEFAULT { passthrough : true; }
# VF for use by host
VF-0 { passthrough : false; }
---------

Starting a bhyve VM causes the following errors and the host to lose network connectivity (it only happens if a VM is started):

kernel: ixl1: Malicious Driver Detection event 1 on RX queue 776, pf number 64 (PF-1), (VF-0)
syslogd: last message repeated 352 times
kernel: iavf0: ARQ Critical Error detected
kernel: iavf0: ASQ Critical Error detected
kernel: iavf0: WARNING: Stopping VF!
kernel: iavf0: Unable to send opcode DISABLE_QUEUES to PF, ASQ is not alive
kernel: iavf0: 2<DISABLE_QUEUES> timed out
kernel: iavf0: Unable to send opcode RESET_VF to PF, ASQ is not alive
kernel: ixl1: Malicious Driver Detection event 1 on RX queue 776, pf number 64 (PF-1), (VF-0)
syslogd: last message repeated 3797 times
kernel: ixl1: kernel: Malicious Driver Detection event 1 on RX queue 776, pf number 64 (PF-1), (VF-0)
kernel: ixl1: Malicious Driver Detection event 1 on RX queue 776, pf number 64 (PF-1), (VF-0)
syslogd: last message repeated 835 times

"nvmupdate -v" output:
-----------
Intel(R) Ethernet NVM Update Tool
NVMUpdate version 1.32.20.30
Copyright (C) 2013 - 2018 Intel Corporation.
Version: QV SDK - 2.32.20.28
         ixl - 2.1.0-k
-----------

This problem does not occur when all VFs are in passthrough mode (and the host does not use ixl1), for example:

/etc/iov/ixl1.conf
---------
PF { device : "ixl1"; num_vfs : 4; }
DEFAULT { passthrough : true; }
---------

Intel NIC model: X710-DA2
dev.ixl.1.fw_version: fw 6.0.48442 api 1.7 nvm 6.01 etid 800035cf oem 1.262.0
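For reference, a config file like the one above is normally applied with iovctl(8). A minimal sketch, assuming the file is saved as /etc/iov/ixl1.conf and the ixl driver has SR-IOV support enabled:

```
# Dry run: validate the config without creating any VFs
iovctl -C -n -f /etc/iov/ixl1.conf

# Create the VFs described in the config
iovctl -C -f /etc/iov/ixl1.conf

# Tear the VFs down again later
iovctl -D -d ixl1
```

The same file can also be applied at boot via iovctl_files in rc.conf, as shown in a later comment.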
What is the output of ifconfig -a before you start the VM?
Thanks for asking, here is the output. The relevant interface is ixl1, as seen in the ixl1.conf above; iavf0 is VF-0 for the host, which works fine until the first VM is started. The other interfaces (ixl0, igb1) are not used.

ifconfig -a

ixl0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether xx:xx:xx:xx:xx:xx
        media: Ethernet autoselect
        status: no carrier
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
ixl1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether xx:xx:xx:xx:xx:xx
        media: Ethernet autoselect (10Gbase-LR <full-duplex,rxpause,txpause>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether xx:xx:xx:xx:xx:xx
        inet x.x.x.x netmask 0xffffffe0 broadcast x.x.x.x
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether xx:xx:xx:xx:xx:xx
        inet x.x.x.x netmask 0xfffffff8 broadcast x.x.x.x
        media: Ethernet autoselect
        status: no carrier
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
iavf0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether xx:xx:xx:xx:xx:xx
        inet6 a:b:c:2 prefixlen 64
        inet6 a:b:c%iavf0 prefixlen 64 scopeid 0x6
        inet6 a:b:c:3 prefixlen 64
        inet6 a:b:c:4 prefixlen 64
        inet6 a:b:c:5 prefixlen 64
        inet6 a:b:c:6 prefixlen 64
        inet6 a:b:c:7 prefixlen 64
        inet6 a:b:c:8 prefixlen 64
        inet6 a:b:c:9 prefixlen 64
        inet6 a:b:c:a prefixlen 64
        inet6 a:b:c:b prefixlen 64
        inet6 a:b:c:c prefixlen 64
        inet6 a:b:c:d prefixlen 64
        inet6 a:b:c:e prefixlen 64
        inet6 a:b:c:f prefixlen 64
        inet6 a:b:c:10 prefixlen 64
        inet6 a:b:c:11 prefixlen 64
        inet6 a:b:c:12 prefixlen 64
        inet6 a:b:c:13 prefixlen 64
        inet6 a:b:c:14 prefixlen 64
        inet6 a:b:c:15 prefixlen 64
        inet6 a:b:c:16 prefixlen 64
        inet6 a:b:c:17 prefixlen 64
        inet6 a:b:c:18 prefixlen 64
        inet6 a:b:c:19 prefixlen 64
        inet6 a:b:c:1a prefixlen 64
        inet6 a:b:c:1b prefixlen 64
        inet6 a:b:c:1c prefixlen 64
        inet6 a:b:c:1d prefixlen 64
        inet6 a:b:c:1e prefixlen 64
        inet6 a:b:c:1f prefixlen 64
        inet6 a:b:c:20 prefixlen 64
        inet6 a:b:c:21 prefixlen 64
        inet6 a:b:c:1:1 prefixlen 64
        inet6 a:b:c::0 prefixlen 64
        inet6 a:b:c::1 prefixlen 64
        inet6 a:b:c::2 prefixlen 64
        inet6 a:b:c::3 prefixlen 64
        inet6 a:b:c::4 prefixlen 64
        inet6 a:b:c::5 prefixlen 64
        inet6 a:b:c::6 prefixlen 64
        inet6 a:b:c::7 prefixlen 64
        inet6 a:b:c::8 prefixlen 64
        inet6 a:b:c::9 prefixlen 64
        inet x.x.x.2 netmask 0xffffff80 broadcast x.x.x.127
        inet x.x.x.3 netmask 0xffffffff broadcast x.x.x.3
        inet x.x.x.4 netmask 0xffffffff broadcast x.x.x.4
        inet x.x.x.5 netmask 0xffffffff broadcast x.x.x.5
        inet x.x.x.6 netmask 0xffffffff broadcast x.x.x.6
        inet x.x.x.7 netmask 0xffffffff broadcast x.x.x.7
        inet x.x.x.8 netmask 0xffffffff broadcast x.x.x.8
        inet x.x.x.9 netmask 0xffffffff broadcast x.x.x.9
        inet x.x.x.10 netmask 0xffffffff broadcast x.x.x.10
        inet x.x.x.11 netmask 0xffffffff broadcast x.x.x.11
        inet x.x.x.12 netmask 0xffffffff broadcast x.x.x.12
        inet x.x.x.13 netmask 0xffffffff broadcast x.x.x.13
        inet x.x.x.14 netmask 0xffffffff broadcast x.x.x.14
        inet x.x.x.15 netmask 0xffffffff broadcast x.x.x.15
        inet x.x.x.16 netmask 0xffffffff broadcast x.x.x.16
        inet x.x.x.17 netmask 0xffffffff broadcast x.x.x.17
        inet x.x.x.18 netmask 0xffffffff broadcast x.x.x.18
        inet x.x.x.19 netmask 0xffffffff broadcast x.x.x.19
        inet x.x.x.20 netmask 0xffffffff broadcast x.x.x.20
        inet x.x.x.21 netmask 0xffffffff broadcast x.x.x.21
        inet x.x.x.22 netmask 0xffffffff broadcast x.x.x.22
        inet x.x.x.23 netmask 0xffffffff broadcast x.x.x.23
        inet x.x.x.24 netmask 0xffffffff broadcast x.x.x.24
        inet x.x.x.25 netmask 0xffffffff broadcast x.x.x.25
        inet x.x.x.26 netmask 0xffffffff broadcast x.x.x.26
        inet x.x.x.27 netmask 0xffffffff broadcast x.x.x.27
        inet x.x.x.28 netmask 0xffffffff broadcast x.x.x.28
        inet x.x.x.29 netmask 0xffffffff broadcast x.x.x.29
        inet x.x.x.30 netmask 0xffffffff broadcast x.x.x.30
        inet x.x.x.31 netmask 0xffffffff broadcast x.x.x.31
        inet x.x.x.32 netmask 0xffffffff broadcast x.x.x.32
        inet x.x.x.33 netmask 0xffffffff broadcast x.x.x.33
        inet x.x.x.64 netmask 0xffffff80 broadcast x.x.x.127
        inet x.x.x.65 netmask 0xffffffff broadcast x.x.x.65
        inet x.x.x.66 netmask 0xffffffff broadcast x.x.x.66
        inet x.x.x.67 netmask 0xffffffff broadcast x.x.x.67
        inet x.x.x.126 netmask 0xffffff80 broadcast x.x.x.127
        media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
Thanks, this helps. What does /etc/rc.conf look like for your VMs?
Hi, here is some more information.

rc.conf:
[...]
vm_enable="YES"
vm_dir="zfs:zroot/vm123"
iovctl_files="/etc/iov/ixl1.conf"

(VMs are not started automatically because of the problem; the issue is triggered by manually starting the VM via "vm -f install ...")

debian9-test.conf:
--
loader="grub"
cpu=1
memory=4096M
disk0_type="ahci-hd"
disk0_name="disk0.img"
grub_run_partition="1"
grub_run_dir="/boot/grub"
passthru0="2/0/81"
--

pciconf -l
[...]
ppt0@pci0:2:0:81: ...

"vm passthru" output:
DEVICE   BHYVE ID   READY   DESCRIPTION
[...]
iavf0    2/0/80     No      Ethernet Virtual Function 700 Series
ppt0     2/0/81     Yes     Ethernet Virtual Function 700 Series
ppt1     2/0/82     Yes     Ethernet Virtual Function 700 Series
ppt2     2/0/83     Yes     Ethernet Virtual Function 700 Series
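For completeness, a sketch of how the ppt devices listed above are typically reserved for bhyve passthrough. This is an assumption about the setup (vm-bhyve can also manage this, and the actual loader.conf was not posted); the IDs simply mirror the "vm passthru" output:

```
# /boot/loader.conf (hypothetical; IDs match the "vm passthru" output above)
vmm_load="YES"
pptdevs="2/0/81 2/0/82 2/0/83"
```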
Can you tell me how many queues ixl1 is using? That information should be printed out when the ixl(4) driver is loaded.
Also, I see the VM is Debian 9. Can you get the i40evf driver version #? It should be in dmesg when the i40evf driver loads at boot.
(In reply to Eric Joyner from comment #5)

8 queues:

kernel: ixl1: PF-ID[1]: VFs 64, MSIX 129, VF MSIX 5, QPs 768, I2C
kernel: ixl1: using 1024 tx descriptors and 1024 rx descriptors
kernel: ixl1: msix_init qsets capped at 64
kernel: ixl1: pxm cpus: 8 queue msgs: 128 admincnt: 1
kernel: ixl1: using 8 rx queues 8 tx queues
kernel: ixl1: Using MSIX interrupts with 9 vectors
kernel: ixl1: Allocating 8 queues for PF LAN VSI; 8 queues active
kernel: ixl1: PCI Express Bus: Speed 8.0GT/s Width x8
kernel: ixl1: netmap queues/slots: TX 8/1024, RX 8/1024
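For anyone collecting the same information, these lines can usually be pulled from the saved boot messages after the fact rather than from a live console; a small sketch:

```
# Queue/MSI-X lines logged when ixl(4) attached
grep ixl1 /var/run/dmesg.boot | grep -iE 'queue|msix'

# Firmware/NVM version via sysctl (same value quoted in the original report)
sysctl dev.ixl.1.fw_version
```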
(In reply to Jeff Pieper from comment #6)

It is not specific to Debian guests; I tried FreeBSD 11.2 and OpenBSD 6.4 guests as well.
Hi, is there any update on this? thanks!
(In reply to Finn from comment #9) We're still investigating the issue.
any update on this? thanks!
(In reply to Finn from comment #11) No, sorry. We haven't been able to pin down a root cause for the issue. We've also had to do work on internal projects, so we haven't been able to spend much time investigating this. That said, we haven't forgotten about this issue; we do want to get this fixed.
ixl0: Malicious Driver Detection event 1 on RX queue 16, pf number 0 (PF-0), (VF-0)
ixl0: Malicious Driver Detection event 1 on RX queue 19, pf number 0 (PF-0), (VF-0)
ixl0: Malicious Driver Detection event 1 on RX queue 16, pf number 0 (PF-0), (VF-0)

Host: 12.1-RELEASE-p5, 2 guests running 12.1-RELEASE

PF { device: ixl0; num_vfs: 5; }
DEFAULT { passthrough: true; allow-set-mac: true; allow-promisc: true; }
VF-0 { mac-addr: "58:9c:fc:0b:xx:xx"; passthrough: false; }

vm passthru:
iavf0   25/0/16   No    Ethernet Virtual Function 700 Series
ppt0    25/0/17   Yes   Ethernet Virtual Function 700 Series
ppt1    25/0/18   Yes   Ethernet Virtual Function 700 Series
ppt2    25/0/19   Yes   Ethernet Virtual Function 700 Series
ppt3    25/0/20   Yes   Ethernet Virtual Function 700 Series
When I pass through first and then bring up the iavf0 interface, I get these errors:

iavf0: Unable to send opcode CONFIG_PROMISCUOUS_MODE to PF, ASQ is not alive
iavf0: Unable to send opcode ENABLE_QUEUES to PF, ASQ is not alive
iavf0: ARQ Critical Error detected
iavf0: ASQ Critical Error detected
iavf0: WARNING: Stopping VF!
iavf0: Unable to send opcode RESET_VF to PF, ASQ is not alive
iavf0: 1<ENABLE_QUEUES> timed out

Maybe this helps?
I can reproduce this issue on 13.0-RELEASE, even with the latest intel-ixl-kmod port.
So I tried the following experiment:

1. start the bhyve VM with 2 passthrough devices
2. launch dhclient to get an IP on VF-0

I got the following errors:
```
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode CONFIG_RSS_KEY to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode SET_RSS_HENA to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode CONFIG_RSS_LUT to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode CONFIG_IRQ_MAP to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode CONFIG_PROMISCUOUS_MODE to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode ENABLE_QUEUES to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: ARQ Critical Error detected
Apr 30 22:40:55 pollen1 kernel: iavf0: ASQ Critical Error detected
Apr 30 22:40:55 pollen1 kernel: iavf0: WARNING: Stopping VF!
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode RESET_VF to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: 1<ENABLE_QUEUES> timed out
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode ADD_ETH_ADDR to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode DEL_ETH_ADDR to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode ADD_ETH_ADDR to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode DEL_ETH_ADDR to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode ADD_ETH_ADDR to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode DEL_ETH_ADDR to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode ADD_ETH_ADDR to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode DEL_ETH_ADDR to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode ADD_ETH_ADDR to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode DEL_ETH_ADDR to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode ADD_ETH_ADDR to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 bird[3179]: KIF: Invalid interface address 0.0.0.0 for iavf0
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode DEL_ETH_ADDR to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 bird[3179]: KIF: Invalid interface address 0.0.0.0 for iavf0
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode ADD_ETH_ADDR to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode DEL_ETH_ADDR to PF, ASQ is not alive
Apr 30 22:40:55 pollen1 kernel: iavf0: Unable to send opcode ADD_ETH_ADDR to PF, ASQ is not alive
Apr 30 22:41:14 pollen1 dhclient[6008]: connection closed
```

The configuration is the following:
```
PF { device: "ixl0"; num_vfs: 8; }
DEFAULT { passthrough: true; allow-set-mac: true; allow-promisc: true; }
VF-0 { passthrough: false; mac-addr: "xx:xx:xx:xx:xx:xx"; }
VF-1 { passthrough: false; mac-addr: "xx:xx:xx:xx:xx:xx"; }
VF-2 { passthrough: false; mac-addr: "xx:xx:xx:xx:xx:xx"; }
```
Nobody is assigned to this bug? It seems strange that SR-IOV gets so little attention these days.
Is there anything that can be done about it? Is any further information needed? It's surprising that nobody has come up with a solution to such an obvious issue in three years.
(In reply to benoitc from comment #19)

Could you please include:

- The FreeBSD versions you reproduced this on (uname -a output)
- pciconf -lv output (as an attachment)
- The /var/run/dmesg.boot output of the FreeBSD versions it is reproducible on (as attachments)
- Your minimal steps to reproduce (with and without bhyve?)

If you haven't yet, a test with a 14-CURRENT snapshot image to check whether it reproduces there would also help.

Thanks!
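For anyone gathering the same data, a rough sketch of commands that produce the requested outputs (the output file names are just examples):

```
uname -a > uname.txt
pciconf -lv > pciconf.txt
cp /var/run/dmesg.boot dmesg.boot.txt
```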
Created attachment 234362 [details] pciconf.txt
Created attachment 234363 [details] dmesg.txt
Created attachment 234364 [details] dmesg.boot
I reproduced it on FreeBSD 13.0 (all patches). The latest uname is:

FreeBSD flower01.domain.tld 13.1-RELEASE FreeBSD 13.1-RELEASE releng/13.1-n250148-fc952ac2212 GENERIC amd64

I attached the requested files for one of the machines using the ixl driver; I will attach the logs for the machine using the ix driver ASAP, but so far it is the same issue.

To reproduce it:

* Set up the NIC so that the 2 ports are configured with 1 VF with passthrough set to true and 3 others set to false.
* Start the system using the VF exposed to the host.
* Start a bhyve VM using one of the ppt devices associated with the card.

At this point the message "Malicious Driver Detection event 1 on RX queue" starts being printed to the logs and stderr.
Just chiming in to confirm I have this issue as well. X710-DA2 13.2-RELEASE-p2 GENERIC
(In reply to Kubilay Kocak from comment #20)

I've been able to reproduce on 14.1-RELEASE:

FreeBSD 14.1-RELEASE (GENERIC) releng/14.1-n267679-10e31f0946d8

I have tried various combinations of VFs marked as passthrough or not, and also with the PF marked for passthrough. In all cases I see the messages as in the original post shortly after the VM starts.

To reproduce:

1. Configure 2 VFs on each port, mark those on port 1 for passthrough
2. Host networking uses a VF on port 2
3. Start a bhyve VM with one of the VFs on port 1 as a passthrough device

I've used nvmupdate to update to the most recent firmware as of 2024-06-07. System is a minisforum MS-01 with the onboard NICs.
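For illustration only, the iovctl config files implied by steps 1 and 2 might look roughly like this; the device names ixl0/ixl1 for "port 1"/"port 2" are an assumption and should be adjusted to match pciconf -lv:

```
# /etc/iov/ixl0.conf -- port 1 (hypothetical name), both VFs for bhyve passthrough
PF { device : "ixl0"; num_vfs : 2; }
DEFAULT { passthrough : true; }

# /etc/iov/ixl1.conf -- port 2 (hypothetical name), VF-0 kept for host networking
PF { device : "ixl1"; num_vfs : 2; }
DEFAULT { passthrough : true; }
VF-0 { passthrough : false; }
```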
Created attachment 251280 [details] output of pciconf -lv
Created attachment 251281 [details] dmesg.boot
Created attachment 251282 [details] output of ifconfig
Reconfirming this problem exists (14.1-RELEASE) and is not card-specific. In addition to the X710, I have reproduced the failure with Chelsio and Mellanox cards as well.

Additional reports: 273372, 278058

The last one is interesting. If I pass through a VF to bhyve, and in that VM use the VF for a vnet jail, it works fine. But if I mix passthrough and non-passthrough (vnet jail) in the host, it fails every time.
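For context, the working case described above (VF passed through to a bhyve guest, then handed to a vnet jail inside that guest) needs nothing unusual; a minimal jail.conf sketch with hypothetical names, assuming the VF shows up as iavf0 inside the guest:

```
# /etc/jail.conf inside the bhyve guest (hypothetical jail name and paths)
testjail {
    path = "/jails/testjail";
    host.hostname = "testjail";
    vnet;                          # give the jail its own network stack
    vnet.interface = "iavf0";      # move the VF into the jail's vnet
    exec.start = "/bin/sh /etc/rc";
    exec.stop = "/bin/sh /etc/rc.shutdown";
    mount.devfs;
}
```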
^Triage: mfc-stable12 is now OBE.
I did eventually find a solution for this. I found a forum post which clued me into the fact that the intel-ixl-kmod port has an SR-IOV option, which is not enabled in the package. Installing the port with SR-IOV support enabled solves the problem.

1. Configure intel-ixl-kmod port with SR-IOV support, and "make install"
2. Add if_ixl_updated_load="YES" to /boot/loader.conf
3. Reboot
4. Configure VFs with iovctl
5. Start VM
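A rough sketch of that workaround as shell commands, assuming a ports tree at /usr/ports and the option/module names described above:

```
# Build the port with the SR-IOV option ticked in the config dialog
cd /usr/ports/net/intel-ixl-kmod
make config
make install clean

# Load the updated driver at boot instead of the in-tree if_ixl
echo 'if_ixl_updated_load="YES"' >> /boot/loader.conf

# After rebooting, create the VFs and start the VM as before
iovctl -C -f /etc/iov/ixl1.conf
```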
^Triage: clear unneeded flags. Nothing has yet been committed to be merged. To submitter: is this problem still present in recent versions of FreeBSD?
The SR-IOV issue is present on both 14.1-RELEASE-p5 and STABLE. It is card-independent: I can trigger it on Intel, Chelsio, and Mellanox cards. Passthrough alone works, and non-passthrough alone works, but if you start one of each, the host device and its VFs will drop.