Bug 211062 - [ixv] sr-iov virtual function driver fails to attach
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-net mailing list
URL:
Keywords: IntelNetworking, needs-patch
Depends on:
Blocks:
 
Reported: 2016-07-12 23:08 UTC by Ultima
Modified: 2019-05-05 08:28 UTC (History)
Description Ultima 2016-07-12 23:08:10 UTC
Currently, attempting to attach the VF driver (ixv) to a virtual function of an X540-AT[1-2] card (PF driver ix) fails, and the driver prints the error messages shown below.

12.0-CURRENT FreeBSD 12.0-CURRENT #17 r302480 amd64

# pciconf -lv
ix0@pci0:129:0:0:	class=0x020000 card=0x00001458 chip=0x15288086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Controller 10-Gigabit X540-AT2'
    class      = network
    subclass   = ethernet
ix1@pci0:129:0:1:	class=0x020000 card=0x00001458 chip=0x15288086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Controller 10-Gigabit X540-AT2'
    class      = network
    subclass   = ethernet
none189@pci0:129:0:129:	class=0x020000 card=0x00001458 chip=0x15158086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'X540 Ethernet Controller Virtual Function'
    class      = network
    subclass   = ethernet
none190@pci0:129:0:131:	class=0x020000 card=0x00001458 chip=0x15158086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'X540 Ethernet Controller Virtual Function'
    class      = network
    subclass   = ethernet
ppt0@pci0:129:0:133:	class=0x020000 card=0x00001458 chip=0x15158086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'X540 Ethernet Controller Virtual Function'
    class      = network
    subclass   = ethernet
ppt1@pci0:129:0:135:	class=0x020000 card=0x00001458 chip=0x15158086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'X540 Ethernet Controller Virtual Function'
    class      = network
    subclass   = ethernet
...so on to ppt28

# devctl attach pci0:129:0:129
devctl: Failed to attach pci0:129:0:129: Input/output error

dmesg:
ixv0: <Intel(R) PRO/10GbE Virtual Function Network Driver, Version - 1.4.6-k> at device 0.129 on pci9
ixv0: Using MSIX interrupts with 2 vectors
ixv0: ixgbe_reset_hw() failed with error -100
device_attach: ixv0 attach returned 5

iovctl.conf:
PF {
        device : ix1;
        num_vfs : 31;
}

DEFAULT {
        passthrough : true;
}
VF-0 {
        passthrough : false;
}
VF-1 {
        passthrough : false;
}
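
For anyone reproducing this with a different VF count: the iovctl.conf above follows a fixed pattern (one PF section, a DEFAULT section, then one VF-N section per VF the host keeps). A minimal sketch that prints a file in that layout is below. The function name gen_iovctl_conf and its arguments are hypothetical, chosen only for illustration.

```sh
#!/bin/sh
# Hypothetical helper: print an iovctl.conf in the layout used in this report.
#   $1 = PF device (e.g. ix1)
#   $2 = number of VFs to create
#   $3 = space-separated VF indices the host keeps (passthrough : false)
gen_iovctl_conf() {
        pf=$1; num_vfs=$2; host_vfs=$3
        printf 'PF {\n        device : %s;\n        num_vfs : %s;\n}\n\n' "$pf" "$num_vfs"
        printf 'DEFAULT {\n        passthrough : true;\n}\n'
        # One section per VF that should stay on the host instead of ppt.
        for i in $host_vfs; do
                printf 'VF-%s {\n        passthrough : false;\n}\n' "$i"
        done
}

# Reproduces the configuration quoted above.
gen_iovctl_conf ix1 31 "0 1"
```

The output can be redirected to a file and loaded with iovctl -C -f <file>; iovctl -D -f <file> removes the VFs again.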
Comment 1 Ultima 2016-07-13 21:51:46 UTC
I have a little more information on this. The ppt devices appear to be broken as well. I started a Linux guest under bhyve, passing through one of the ppt devices:

#!/bin/sh
IMAGES="/home/user/bhyve_images"
MAP="/home/user/bhyve_map"
#grub-bhyve \
#       -m ${MAP}/antergos.map -r hd0 \
#       -M 2G vm0
bhyveload \
        -S -m 20G -d ${IMAGES}/antergos-2016.06.18-x86_64.iso vm0
bhyve \
        -c 16 -m 20G -w -H -S \
        -s 0,hostbridge \
        -s 3,ahci-cd,${IMAGES}/antergos-2016.06.18-x86_64.iso \
        -s 5,ahci-hd,/dev/zvol/tank/bhyve/antergos \
        -s 6,passthru,129/0/133 \
        -l bootrom,${IMAGES}/BHYVE_UEFI_20160526.fd \
        -s 29,fbuf,tcp=0.0.0.0:5900,w=1440,h=900 \
        -s 30,xhci,tablet \
        -s 31,lpc -l com1,stdio \
        vm0

I'm still learning how to use bhyve, so this may not be an optimal vm startup.
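
For reference, the passthru argument in the script is the pciconf selector rewritten as bus/slot/function (ppt0@pci0:129:0:133 becomes 129/0/133). A small sketch of that conversion follows. The function name pci_to_bhyve is hypothetical, and the sketch assumes the device is in PCI domain 0, since the domain number is simply dropped.

```sh
#!/bin/sh
# Hypothetical helper: turn a pciconf selector (pciDOMAIN:BUS:SLOT:FUNC,
# optionally prefixed with "name@" and followed by ":") into the
# BUS/SLOT/FUNC form used by bhyve's passthru device.
# Assumption: the device lives in PCI domain 0; the domain is discarded.
pci_to_bhyve() {
        sel=${1#*@}             # drop a leading "ppt0@" style prefix, if any
        sel=${sel#pci}          # drop the literal "pci"
        sel=${sel%:}            # drop a trailing ":" as printed by pciconf -l
        old_ifs=$IFS; IFS=:
        set -- $sel             # split into: domain bus slot func
        IFS=$old_ifs
        [ $# -eq 4 ] || { echo "bad selector" >&2; return 1; }
        echo "$2/$3/$4"
}

pci_to_bhyve ppt0@pci0:129:0:133:
```

The result can be substituted into the -s 6,passthru,... argument for whichever ppt device is being tested.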

bhyveload fails to start, but I use it to load the memory so that bhyve can start. bhyve itself starts successfully.

Once in the VM, I checked dmesg; this is the output from the VF driver trying to attach.

ixgbevf: Intel(R) 10 Gigabite PCI express Virtual Function Network Driver - version 2.12.1-k
ixgbevf: Copyright (c) 2009 - 2015 Intel Corporation.
ixgbevf 0000:00:06:0 enabling device (0004) -> 0006)
...snip, unrelated...
ixgbevf 0000:00:06.0: PF still in reset state. Is the PF interface up?
ixgbevf 0000:00:06.0: Assigning random MAC address
ixgbevf 0000:00:06.0: aa:f2:ab:a9:b1:f3
ixgbevf 0000:00:06.0: MAC: 2
ixgbevf 0000:00:06.0: Intel(R) X540 Virtual Function
IPv6: ADDRCONF(NETDEV_UP): eth0 link is not ready
...snip, unrelated...
ixgbevf: Unable to start - perhaps the PF driver isn't up yet
vboxguest: PCI device not found, probably running on physical hardware.
vboxguest: PCI device not found, probably running on physical hardware.
vboxguest: PCI device not found, probably running on physical hardware.

I'm not sure if the last 3 entries are related.
Comment 2 John Baldwin freebsd_committer freebsd_triage 2016-07-26 16:05:04 UTC
This is not a generic SR-IOV bug, but specific to the Intel ix(4) driver.
Comment 3 Ultima 2016-07-26 16:53:19 UTC
(In reply to John Baldwin from comment #2)
I guess I will just have to wait for Intel to fix their driver then.

Thanks for looking into this John.
Comment 4 Richard Gallamore freebsd_committer 2017-09-01 21:13:58 UTC
Just checking in with an update as there is a new driver version. Everything is the same except I changed num_vfs to 4 and there is a new error message.

12.0-CURRENT FreeBSD 12.0-CURRENT #3 r323109 amd64

dmesg:
ixv0: <Intel(R) PRO/10GbE Virtual Function Network Driver, Version - 1.5.13-k> at device 0.129 on pci10
ixv0: ...reset_hw() failure: Reset Failed!
device_attach: ixv0 attach returned 5
Comment 6 Eric Joyner freebsd_committer 2017-09-11 17:15:52 UTC
(In reply to Richard Gallamore from comment #4)

Err, so what's your setup again? It's hard to follow from the previous comments. Are you using a linux host with a freebsd guest? Just running the VFs alongside the PFs on the same OS?
Comment 7 Richard Gallamore freebsd_committer 2017-09-11 18:00:08 UTC
(In reply to Eric Joyner from comment #6)
This is a FreeBSD host using the iovctl program to create VFs on the X540-AT2 network card. I haven't attempted to test the VFs with a guest in over a year, but I doubt it will work, because the failure looks similar to, or possibly the same as, the earlier one, just with a new error message. Also, Ultima was my old Bugzilla account, so sorry if this causes some confusion.

In my previous tests (over a year ago), once the iovctl command was invoked and the VFs were created, the network port stopped functioning until iovctl removed the VFs. I am pretty sure I tested both with and without pf in those tests, but I can't say for certain. The recent test was with pf enabled.

When I have some time, I'll do a more complete test and provide the errors from a bhyve guest using a VF, if that will help. Here are the relevant configuration files.

loader.conf:
hw.ix.num_queues="4"

iovctl.conf:
PF {
        device : ix1;
        num_vfs : 4;
}

DEFAULT {
        passthrough : true;
}
VF-0 {
        passthrough : false;
}
VF-1 {
        passthrough : false;
}

pf.conf: (omitted jail specific rules and variables to keep private)
set block-policy drop
set skip on { lo, bridge, tap } # skip on bridge and tap, they can cause issues with bhyve
scrub all no-df max-mss 1440 random-id reassemble tcp

block on ix0 all

pass in proto tcp to $host port { $host_tcp } modulate state
pass in proto udp to $host port { $host_udp } modulate state
pass in proto tcp from $nfs_clients to $host port { $host_nfs_ports } modulate state
pass in proto udp from $nfs_clients to $host port { $host_nfs_ports } modulate state
pass out all modulate state

pass in inet proto icmp all icmp-type echoreq
pass in inet6 proto ipv6-icmp all icmp6-type { 1, 2, 3, 4, 128, 129, 133, 134, 135, 136, 137 }
Comment 8 Richard Gallamore freebsd_committer 2017-09-12 19:21:33 UTC
Booting a Windows 10 guest:

nice -n -20 bhyve \
        -c 4 -m 8G -w -H -S \
        -s 0,hostbridge \
        -s 3,ahci-cd,${IMAGES}/null.iso \
        -s 5,ahci-hd,${ZVOL_DIR}/${NAME} \
        -s 6:0,passthru,129/0/133 \
        -l bootrom,${IMAGES}/BHYVE_UEFI.fd \
        -s 29,fbuf,tcp=${VNC},w=1600,h=900 \
        -s 30,xhci,tablet \
        -s 31,lpc -l com1,stdio \
        ${NAME}

pciconf -lv:
ppt0@pci0:129:0:133:    class=0x020000 card=0x00001458 chip=0x15158086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'X540 Ethernet Controller Virtual Function'
    class      = network
    subclass   = ethernet

While Windows 10 is booting, the following error is received:
Assertion failed: (error == 0), function modify_bar_registration, file /usr/src/head/src/usr.sbin/bhyve/pci_emul.c, line 491.
Abort trap

Booting a Debian 8 (jessie) guest:
nice -n -20 bhyve \
        -c 2 -m 4G -w -H -S \
        -s 0,hostbridge \
        -s 5,ahci-hd,${ZVOL_DIR}/${NAME} \
        -s 6:0,passthru,129/0/133 \
        -l bootrom,${IMAGES}/BHYVE_UEFI.fd \
        -s 29,fbuf,tcp=${VNC},w=1440,h=900 \
        -s 30,xhci,tablet \
        -s 31,lpc -l com1,stdio \
        ${NAME}

This boots successfully; however, the output of # dmesg | grep ixgbe shows errors:
ixgbevf: Intel(R) 10 Gigabit PCI Express Virtual Function Network Driver - version 2. 12.1-k
ixgbevf: Copyright (c) 2009 - 2012 Intel corporation.
ixgbevf: 0000:00:06.0: enabling device (0004 -> 0006)
ixgbevf: 0000:00:06.0: PF still in reset state. Is the PF interface up?
ixgbevf: 0000:00:06.0: Assigning random MAC address
ixgbevf: 0000:00:06.0: irq 48 for MSI/MSI-X
ixgbevf: 0000:00:06.0: irq 49 for MSI/MSI-X

# ip link set eth0 up
ixgbevf: Unable to start - perhaps the PF Driver isn't up yet
RTNETLINK answers: Network is down


Hope this helps.
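
To make later retests easier to compare, the failure signatures quoted so far in this report can be pulled out of a dmesg dump with one filter. This is only a sketch: the function name vf_attach_errors is hypothetical, and the patterns are just the strings seen in this thread, not a complete list of what ixv or ixgbevf can print.

```sh
#!/bin/sh
# Hypothetical helper: read dmesg text on stdin and print only the lines that
# match failure messages quoted in this report (FreeBSD host and Linux guest).
vf_attach_errors() {
        grep -E -e 'PF still in reset state' \
                -e 'Unable to start - perhaps the PF' \
                -e 'reset_hw\(\) (failed|failure)' \
                -e 'attach returned [0-9]+'
}

# Demo on the host-side lines from the original description.
vf_attach_errors <<'EOF'
ixv0: Using MSIX interrupts with 2 vectors
ixv0: ixgbe_reset_hw() failed with error -100
device_attach: ixv0 attach returned 5
EOF
```

On a live system the same function would be used as dmesg | vf_attach_errors, on either the host or the guest.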
Comment 9 Piotr Pietruszewski 2017-12-21 13:14:24 UTC
(In reply to Richard Gallamore from comment #8)

It is likely that this bug is fixed in the newest driver provided by Intel at https://downloadcenter.intel.com/download/14688/Intel-Network-Adapters-Driver-for-PCIe-10-Gigabit-Network-Connections-Under-FreeBSD- . Feedback on whether it resolves the problem would be greatly appreciated.
Comment 10 Richard Gallamore freebsd_committer 2017-12-22 02:13:13 UTC
(In reply to Piotr Pietruszewski from comment #9)
Hello Piotr,

Thank you very much for the link. After compiling and installing the driver, everything appears to work, though I have noticed an error that may be a false positive, and I think the VF count is short by one. I have done basic testing (pinging) on the VFs and they seem to work fine.

Setting hw.ix.num_queues seems to no longer matter so I removed it from /boot/loader.conf. I'm not sure if this was intentional. One error I found so far:
Dec 21 15:50:42 S1 kernel: ix1: CRITICAL: ECC ERROR!!  Please Reboot!!

Not sure if it's a false positive. It seems to happen after the third invocation of iovctl after a reboot, but I am not entirely sure of the trigger.

64 VFs per port should be available. However, everything appeared normal only with num_vfs <= 63, other than the occasional error mentioned above.

num_vfs >= 64 returns this error:
iovctl: Failed to configure SR-IOV: No space left on device.

Is 64 supposed to be the maximum number of VFs, or am I mistaken? Or does the first interface (ix1) count as one of the VFs?

FreeBSD S1 12.0-CURRENT FreeBSD 12.0-CURRENT #2 r327068: Thu Dec 21 13:00:34 PST 2017

# cat /boot/loader.conf
if_ix_load="YES"

# cat /etc/iovctl.conf
PF {
        device : ix1;
        num_vfs : 32;
}

DEFAULT {
        passthrough : true;
}
VF-0 {
        passthrough : false;
}
VF-1 {
        passthrough : false;
}

It would be nice to get this in head but after seeing r327031[1], I don't see this happening anytime soon.

I also want to mention that I tested the head driver (4.0.0-k) before 3.2.17, and with it SR-IOV returns errors similar or identical to those mentioned in earlier posts in this thread.

[1] https://svnweb.freebsd.org/base?view=revision&revision=327031
Comment 11 Eric Joyner freebsd_committer 2018-02-24 03:13:50 UTC
(In reply to Richard Gallamore from comment #10)

In regards to the 63 VF limit, the card supports up to 64 (fixed) queue pools, but the current implementation always assigns one to the PF interface, so you get the 63 VF limit. If it were changed to not give the PF interface any queues, then you could have 64 VFs.
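
Put as arithmetic, the limit is the number of queue pools minus the pools reserved for the PF. The sketch below only illustrates that subtraction; max_vfs and its arguments are hypothetical names and do not correspond to anything in the driver.

```sh
#!/bin/sh
# Hypothetical illustration of the VF limit described above:
#   usable VFs = queue pools on the port - pools assigned to the PF
max_vfs() {
        pools=$1; pf_pools=$2
        echo $((pools - pf_pools))
}

max_vfs 64 1    # current implementation: one pool goes to the PF
max_vfs 64 0    # if the PF interface were given no queues
```

With one pool held by the PF, a request for 64 VFs exceeds what is left, which would be consistent with the "No space left on device" error reported for num_vfs >= 64.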
Comment 12 tsuroerusu 2019-05-05 08:28:57 UTC
I have just run into this problem on FreeBSD 11.2 for one of my use cases. Has this problem of VFs not attaching been fixed in 11-STABLE for 11.3, 12.0-RELEASE or 12-STABLE?