Bug 211062

Summary: ix(4): SR-IOV virtual function driver fails to attach Intel 10-Gigabit X540-AT2 (0x1528): Failed to attach pci0:129:0:129: Input/output error
Product: Base System
Reporter: Ultima <Ultima1252>
Component: kern
Assignee: freebsd-net (Nobody) <net>
Status: Closed FIXED
Severity: Affects Many People
CC: erj, freebsd, j, jhb, lxv, naito.yuichiro, net, ozkan.kirik, pi, piotr.pietruszewski, pkubaj, rahudev2, rstone, torstenb, tsuroerusu, ultima
Priority: Normal
Keywords: IntelNetworking, needs-qa
Version: 12.2-RELEASE
Flags: koobs: maintainer-feedback? (freebsd)
       koobs: mfc-stable13?
       koobs: mfc-stable12?
Hardware: Any
OS: Any
See Also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263568
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229852

Description Ultima 2016-07-12 23:08:10 UTC
Currently, when attempting to attach the VF driver for the X540-AT[1-2] (ix) adapter, the driver fails to attach and emits the error messages below.

12.0-CURRENT FreeBSD 12.0-CURRENT #17 r302480 amd64

# pciconf -lv
ix0@pci0:129:0:0:	class=0x020000 card=0x00001458 chip=0x15288086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Controller 10-Gigabit X540-AT2'
    class      = network
    subclass   = ethernet
ix1@pci0:129:0:1:	class=0x020000 card=0x00001458 chip=0x15288086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Controller 10-Gigabit X540-AT2'
    class      = network
    subclass   = ethernet
none189@pci0:129:0:129:	class=0x020000 card=0x00001458 chip=0x15158086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'X540 Ethernet Controller Virtual Function'
    class      = network
    subclass   = ethernet
none190@pci0:129:0:131:	class=0x020000 card=0x00001458 chip=0x15158086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'X540 Ethernet Controller Virtual Function'
    class      = network
    subclass   = ethernet
ppt0@pci0:129:0:133:	class=0x020000 card=0x00001458 chip=0x15158086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'X540 Ethernet Controller Virtual Function'
    class      = network
    subclass   = ethernet
ppt1@pci0:129:0:135:	class=0x020000 card=0x00001458 chip=0x15158086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'X540 Ethernet Controller Virtual Function'
    class      = network
    subclass   = ethernet
...so on to ppt28

# devctl attach pci0:129:0:129
devctl: Failed to attach pci0:129:0:129: Input/output error

dmesg:
ixv0: <Intel(R) PRO/10GbE Virtual Function Network Driver, Version - 1.4.6-k> at device 0.129 on pci9
ixv0: Using MSIX interrupts with 2 vectors
ixv0: ixgbe_reset_hw() failed with error -100
device_attach: ixv0 attach returned 5

iovctl.conf:
PF {
        device : ix1;
        num_vfs : 31;
}

DEFAULT {
        passthrough : true;
}
VF-0 {
        passthrough : false;
}
VF-1 {
        passthrough : false;
}
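
For reference, the VFs above were created and attached roughly as follows (a sketch of the usual iovctl(8)/devctl(8) sequence; the config path is an assumption):

# iovctl -C -f /etc/iovctl.conf      # create the VFs defined above on ix1
# pciconf -l | grep 0x15158086       # the new VF PCI functions (chip 0x1515) should now be listed
# devctl attach pci0:129:0:129       # attach the host-side VF driver; this is the step that fails with EIO
# iovctl -D -d ix1                   # destroy the VFs again when finished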
Comment 1 Ultima 2016-07-13 21:51:46 UTC
Just got a little more information on this. The ppt devices appear to be broken as well. I started bhyve with a Linux guest, using a passthru ppt device:

#!/bin/sh
IMAGES="/home/user/bhyve_images"
MAP="/home/user/bhyve_map"
#grub-bhyve \
#       -m ${MAP}/antergos.map -r hd0 \
#       -M 2G vm0
bhyveload \
        -S -m 20G -d ${IMAGES}/antergos-2016.06.18-x86_64.iso vm0
bhyve \
        -c 16 -m 20G -w -H -S \
        -s 0,hostbridge \
        -s 3,ahci-cd,${IMAGES}/antergos-2016.06.18-x86_64.iso \
        -s 5,ahci-hd,/dev/zvol/tank/bhyve/antergos \
        -s 6,passthru,129/0/133 \
        -l bootrom,${IMAGES}/BHYVE_UEFI_20160526.fd \
        -s 29,fbuf,tcp=0.0.0.0:5900,w=1440,h=900 \
        -s 30,xhci,tablet \
        -s 31,lpc -l com1,stdio \
        vm0

I'm still learning how to use bhyve, so this may not be an optimal vm startup.

bhyveload fails to start, but I use it to load the memory so that bhyve can start. bhyve itself starts successfully.

Once in the VM, I checked dmesg; this is the output from the VF driver trying to attach.

ixgbevf: Intel(R) 10 Gigabit PCI Express Virtual Function Network Driver - version 2.12.1-k
ixgbevf: Copyright (c) 2009 - 2015 Intel Corporation.
ixgbevf 0000:00:06.0: enabling device (0004 -> 0006)
...snip, unrelated...
ixgbevf 0000:00:06.0: PF still in reset state. Is the PF interface up?
ixgbevf 0000:00:06.0: Assigning random MAC address
ixgbevf 0000:00:06.0: aa:f2:ab:a9:b1:f3
ixgbevf 0000:00:06.0: MAC: 2
ixgbevf 0000:00:06.0: Intel(R) X540 Virtual Function
IPv6: ADDRCONF(NETDEV_UP): eth0 link is not ready
...snip, unrelated...
ixgbevf: Unable to start - perhaps the PF driver isn't up yet
vboxguest: PCI device not found, probably running on physical hardware.
vboxguest: PCI device not found, probably running on physical hardware.
vboxguest: PCI device not found, probably running on physical hardware.

I'm not sure if the last 3 entries are related.
Comment 2 John Baldwin freebsd_committer freebsd_triage 2016-07-26 16:05:04 UTC
This is not a generic SR-IOV bug, but specific to the Intel ix(4) driver.
Comment 3 Ultima 2016-07-26 16:53:19 UTC
(In reply to John Baldwin from comment #2)
I guess we will just have to wait for Intel to fix their driver then.

Thanks for looking into this John.
Comment 4 Richard Gallamore freebsd_committer freebsd_triage 2017-09-01 21:13:58 UTC
Just checking in with an update as there is a new driver version. Everything is the same except I changed num_vfs to 4 and there is a new error message.

12.0-CURRENT FreeBSD 12.0-CURRENT #3 r323109 amd64

dmesg:
ixv0: <Intel(R) PRO/10GbE Virtual Function Network Driver, Version - 1.5.13-k> at device 0.129 on pci10
ixv0: ...reset_hw() failure: Reset Failed!
device_attach: ixv0 attach returned 5
Comment 5 rahu 2017-09-09 14:21:02 UTC
MARKED AS SPAM
Comment 6 Eric Joyner freebsd_committer freebsd_triage 2017-09-11 17:15:52 UTC
(In reply to Richard Gallamore from comment #4)

Err, so what's your setup again? It's hard to follow from the previous comments. Are you using a Linux host with a FreeBSD guest? Or just running the VFs alongside the PFs on the same OS?
Comment 7 Richard Gallamore freebsd_committer freebsd_triage 2017-09-11 18:00:08 UTC
(In reply to Eric Joyner from comment #6)
This is a FreeBSD host using the iovctl program to create VFs on the X540-AT2 network card. I haven't attempted to test the VFs with a guest in over a year, but I doubt it will work, because the error is similar to (or possibly the same as) before, just with a new error message. Also, Ultima was my old Bugzilla account, so sorry if this causes some confusion.

In my previous tests (over a year ago), once the iovctl command was invoked and the VFs spawned, the network port no longer functioned until iovctl removed the VFs. I am pretty sure I tested both with and without pf back then, but I can't say for certain. The recent test was with pf enabled.

When I have some time I'll do a more complete test, including the errors reported by a bhyve guest using a VF, if that will help. Here are the relevant configuration files.

loader.conf:
hw.ix.num_queues="4"

iovctl.conf:
PF {
        device : ix1;
        num_vfs : 4;
}

DEFAULT {
        passthrough : true;
}
VF-0 {
        passthrough : false;
}
VF-1 {
        passthrough : false;
}

pf.conf: (omitted jail specific rules and variables to keep private)
set block-policy drop
set skip on { lo, bridge, tap } # skip on bridge and tap, they can cause issues with bhyve
scrub all no-df max-mss 1440 random-id reassemble tcp

block on ix0 all

pass in proto tcp to $host port { $host_tcp } modulate state
pass in proto udp to $host port { $host_udp } modulate state
pass in proto tcp from $nfs_clients to $host port { $host_nfs_ports } modulate state
pass in proto udp from $nfs_clients to $host port { $host_nfs_ports } modulate state
pass out all modulate state

pass in inet proto icmp all icmp-type echoreq
pass in inet6 proto ipv6-icmp all icmp6-type { 1, 2, 3, 4, 128, 129, 133, 134, 135, 136, 137 }
Comment 8 Richard Gallamore freebsd_committer freebsd_triage 2017-09-12 19:21:33 UTC
Booting up a Windows 10 guest:

nice -n -20 bhyve \
        -c 4 -m 8G -w -H -S \
        -s 0,hostbridge \
        -s 3,ahci-cd,${IMAGES}/null.iso \
        -s 5,ahci-hd,${ZVOL_DIR}/${NAME} \
        -s 6:0,passthru,129/0/133 \
        -l bootrom,${IMAGES}/BHYVE_UEFI.fd \
        -s 29,fbuf,tcp=${VNC},w=1600,h=900 \
        -s 30,xhci,tablet \
        -s 31,lpc -l com1,stdio \
        ${NAME}

pciconf -lv:
ppt0@pci0:129:0:133:    class=0x020000 card=0x00001458 chip=0x15158086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'X540 Ethernet Controller Virtual Function'
    class      = network
    subclass   = ethernet

While Windows 10 is booting, an error is received:
Assertion failed: (error == 0), function modify_bar_registration, file /usr/src/head/src/usr.sbin/bhyve/pci_emul.c, line 491.
Abort trap

Booting a Debian 8 (jessie) guest:
nice -n -20 bhyve \
        -c 2 -m 4G -w -H -S \
        -s 0,hostbridge \
        -s 5,ahci-hd,${ZVOL_DIR}/${NAME} \
        -s 6:0,passthru,129/0/133 \
        -l bootrom,${IMAGES}/BHYVE_UEFI.fd \
        -s 29,fbuf,tcp=${VNC},w=1440,h=900 \
        -s 30,xhci,tablet \
        -s 31,lpc -l com1,stdio \
        ${NAME}

This boots successfully; however, dmesg | grep ixgbe shows errors:
ixgbevf: Intel(R) 10 Gigabit PCI Express Virtual Function Network Driver - version 2.12.1-k
ixgbevf: Copyright (c) 2009 - 2012 Intel corporation.
ixgbevf: 0000:00:06.0: enabling device (0004 -> 0006)
ixgbevf: 0000:00:06.0: PF still in reset state. Is the PF interface up?
ixgbevf: 0000:00:06.0: Assigning random MAC address
ixgbevf: 0000:00:06.0: irq 48 for MSI/MSI-X
ixgbevf: 0000:00:06.0: irq 49 for MSI/MSI-X

# ip link set eth0 up
ixgbevf: Unable to start - perhaps the PF Driver isn't up yet
RTNETLINK answers: Network is down


Hope this helps.
Comment 9 Piotr Pietruszewski 2017-12-21 13:14:24 UTC
(In reply to Richard Gallamore from comment #8)

It is likely that this bug is fixed in the newest driver provided by Intel at https://downloadcenter.intel.com/download/14688/Intel-Network-Adapters-Driver-for-PCIe-10-Gigabit-Network-Connections-Under-FreeBSD- . Feedback on whether this resolves the problem would be greatly appreciated.
Comment 10 Richard Gallamore freebsd_committer freebsd_triage 2017-12-22 02:13:13 UTC
(In reply to Piotr Pietruszewski from comment #9)
Hello Piotr,

Thank you very much for the link. After compiling and installing the driver, everything appears to work, though I have noticed an error; I'm not sure if it is a false positive, and I also think the VF count is short by one. I have done basic testing (pinging) on the VFs and they seem to work fine.
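
For reference, building and loading the downloaded driver went roughly like this (a sketch; the tarball name, layout, and module name are assumptions about how Intel's FreeBSD driver packages are typically structured):

# tar xf ix-3.2.17.tar.gz      # hypothetical tarball name for the 3.2.17 driver mentioned below
# cd ix-3.2.17/src
# make && make install         # assumed to build and install if_ix.ko under /boot/modules
# kldload if_ix                # or load at boot via if_ix_load="YES" in loader.conf, as shown below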

Setting hw.ix.num_queues no longer seems to matter, so I removed it from /boot/loader.conf. I'm not sure if this was intentional. One error I have found so far:
Dec 21 15:50:42 S1 kernel: ix1: CRITICAL: ECC ERROR!!  Please Reboot!!

Not sure if it's a false positive. It seems to happen after the third invocation of iovctl following a reboot, but I'm not entirely sure of the trigger.

64 VFs per port should be available. However, with num_vfs <= 63, everything appeared normal other than the occasional error mentioned previously.

With num_vfs >= 64, iovctl returns an error:
iovctl: Failed to configure SR-IOV: No space left on device.

Is 64 supposed to be the maximum number of VFs, or am I mistaken? Or does the first interface (ix1) count as one of the VFs?

FreeBSD S1 12.0-CURRENT FreeBSD 12.0-CURRENT #2 r327068: Thu Dec 21 13:00:34 PST 2017

# cat /boot/loader.conf
if_ix_load="YES"

# cat /etc/iovctl.conf
PF {
        device : ix1;
        num_vfs : 32;
}

DEFAULT {
        passthrough : true;
}
VF-0 {
        passthrough : false;
}
VF-1 {
        passthrough : false;
}

It would be nice to get this into head, but after seeing r327031 [1], I don't see this happening anytime soon.

I also want to mention that I tested the head driver (4.0.0-k) before trying 3.2.17, and SR-IOV returns similar or the same errors as mentioned in earlier posts in this thread.

[1] https://svnweb.freebsd.org/base?view=revision&revision=327031
Comment 11 Eric Joyner freebsd_committer freebsd_triage 2018-02-24 03:13:50 UTC
(In reply to Richard Gallamore from comment #10)

In regards to the 63 VF limit, the card supports up to 64 (fixed) queue pools, but the current implementation always assigns one to the PF interface, so you get the 63 VF limit. If it were changed to not give the PF interface any queues, then you could have 64 VFs.
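
So, given the numbers above, the largest configuration that should currently work on this card is something along these lines (sketch):

PF {
        device : ix1;
        num_vfs : 63;   # 64 queue pools on the card, minus the one always reserved for the PF
}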
Comment 12 Troels Just 2019-05-05 08:28:57 UTC
I have just run into this problem on FreeBSD 11.2 for one of my use cases. Has this problem of VFs not attaching been fixed in 11-STABLE for 11.3, 12.0-RELEASE or 12-STABLE?
Comment 13 Ozkan KIRIK 2021-01-19 09:04:32 UTC
Is it possible to merge this to stable/12?
Comment 14 Ozkan KIRIK 2021-01-19 16:51:43 UTC
FreeBSD 13.0-CURRENT #0 3cc0c0d66a0-c255241(main)-dirty: Thu Dec 24 06:21:50 UTC 2020     root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

# sysctl dev.ix.0.iflib.driver_version
dev.ix.0.iflib.driver_version: 4.0.1-k

# cat /etc/iovctl.conf
PF {
  device: "ix0";
  num_vfs: 2;
}
DEFAULT {
  passthrough: false;
}

# iovctl -C -f /etc/iovctl.conf

# dmesg
ixv0: <Intel(R) PRO/10GbE Virtual Function Network Driver> at device 0.128 on pci4
ixv0: ...reset_hw() failure: Reset Failed!
ixv0: IFDI_ATTACH_PRE failed 5
device_attach: ixv0 attach returned 5
ixv0: <Intel(R) PRO/10GbE Virtual Function Network Driver> at device 0.130 on pci4
ixv0: ...reset_hw() failure: Reset Failed!
ixv0: IFDI_ATTACH_PRE failed 5
device_attach: ixv0 attach returned 5

# pciconf -lvc ix0
ix0@pci0:3:0:0:	class=0x020000 rev=0x00 hdr=0x00 vendor=0x8086 device=0x15ad subvendor=0x15d9 subdevice=0x15ad
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection X552/X557-AT 10GBASE-T'
    class      = network
    subclass   = ethernet
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks 
    cap 11[70] = MSI-X supports 64 messages, enabled
                 Table in map 0x20[0x0], PBA in map 0x20[0x2000]
    cap 10[a0] = PCI-Express 2 endpoint max data 256(512) FLR
                 max read 4096
                 link x1(x1) speed 2.5(2.5) ASPM disabled(L0s/L1)
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
    ecap 0003[140] = Serial 1 0000c9ffff000000
    ecap 000e[150] = ARI 1
    ecap 0010[160] = SR-IOV 1 IOV enabled, Memory Space enabled, ARI enabled
                     2 VFs configured out of 64 supported
                     First VF RID Offset 0x0080, VF RID Stride 0x0002
                     VF Device ID 0x15a8
                     Page Sizes: 4096 (enabled), 8192, 65536, 262144, 1048576, 4194304
    iov bar  [184] = type Memory, range 64, base 0xfb100000, size 16384, enabled
    iov bar  [190] = type Memory, range 64, base 0xfb108000, size 16384, enabled
    ecap 000d[1b0] = ACS 1
    ecap 0018[1c0] = LTR 1


Is it possible to fix this?
Comment 15 xygzen 2021-10-18 21:35:49 UTC
If it helps any - I was getting the exact same issue at one cloud provider.

When I enabled SR-IOV in the BIOS, it showed a cryptic message about needing to enable ASPM (Active State Power Management) for SR-IOV to work properly. Nothing I did allowed it to work.

When I switched providers, there was an ASPM setting in the BIOS of the new machine that wasn't there previously. Both were Supermicro boards, so it looks like not all motherboards support this functionality, or maybe a newer version of the BIOS exposes the setting correctly.

I still needed to set hw.pci.honor_msi_blacklist=0 in /boot/loader.conf and to use the latest IX and IXV drivers from Intel:

IX - v3.3.25 - https://www.intel.com/content/www/us/en/download/14303/intel-network-adapters-driver-for-pcie-10-gigabit-network-connections-under-freebsd.html

IXV - v1.5.28 - https://www.intel.com/content/www/us/en/download/645984/intel-network-adapter-virtual-function-driver-for-pcie-10-gigabit-network-connections-under-freebsd.html

Once I ran iovctl -C -f /etc/iovctl.conf, the VF driver correctly attached to ixv0 (no passthrough), and the PCI devices for the remaining VFs (ixv1-3) were configured correctly for passthrough without drivers attached.

Setting iovctl_files="/etc/iovctl.conf" in rc.conf and moving the IP configuration from ix0 to ixv0 got this up and running automatically on reboot.
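
For reference, the host-side pieces described above amount to roughly the following (a sketch; the address on ixv0 is a placeholder):

/boot/loader.conf:
hw.pci.honor_msi_blacklist="0"

/etc/rc.conf:
iovctl_files="/etc/iovctl.conf"            # create the VFs at boot
ifconfig_ixv0="inet 192.0.2.10/24"         # placeholder address; previously configured on ix0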

If you're looking for a good cloud host with support for this I can highly recommend https://www.zare.com

Hope that helps!
Comment 16 xygzen 2021-10-19 23:14:55 UTC
Further update: to get this to work without crashing, there is a patch that needs to be applied to the Intel VT-d code in the kernel.

A thread on it is here: https://forums.freebsd.org/threads/pci-passthrough-bhyve-usb-xhci.65235/

And the patch is here: https://bz-attachments.freebsd.org/attachment.cgi?id=195225
Comment 17 Kubilay Kocak freebsd_committer freebsd_triage 2022-04-26 00:43:09 UTC
Patch referenced in comment 16 is from bug 229852
Comment 18 Piotr Kubaj freebsd_committer freebsd_triage 2023-05-05 15:35:50 UTC
12.2-RELEASE is not supported. The fix has been committed to newer versions.