Summary: | ixl: panic on attach of X722 on-motherboard interfaces | ||||||
---|---|---|---|---|---|---|---|
Product: | Base System | Reporter: | Garrett Wollman <wollman> | ||||
Component: | kern | Assignee: | freebsd-net (Nobody) <net> | ||||
Status: | New --- | ||||||
Severity: | Affects Only Me | CC: | benoitc, krzysztof.galazka, pen, sm, vmaffione, zlei | ||||
Priority: | --- | Keywords: | IntelNetworking, crash | ||||
Version: | 12.2-RELEASE | ||||||
Hardware: | amd64 | ||||||
OS: | Any | ||||||
Description
Garrett Wollman
2021-01-12 23:49:32 UTC
(In reply to Garrett Wollman from comment #0)
vsi->ctx is set at the beginning of ixl_if_attach_pre and the timer is started at the end of ixl_if_attach_post, so it looks a bit strange. I haven't seen anything like that before. I'm trying to get a reproduction. Is there a chance for a core dump? Also, could you please check the FW version (sysctl dev.ixl.0.fw_version) and provide the exact model of the motherboard?

(In reply to Krzysztof Galazka from comment #1)
Since these are production file servers, I removed the ixl driver so that I could complete the upgrade within the scheduled window, and won't be able to tell you the firmware version any time soon. The machine is an iXsystems IXC-4224P-IXN, and the motherboard is a Supermicro X11DPH-i rev 1.10. According to our inventory, these servers were delivered in October 2018.

Created attachment 234817
fatal-trap
Sorry, I just submitted the image without text by mistake. This is happening on a Supermicro server running 13.1. We have servers with the same spec: 3 of them run FreeBSD 13.0 + the Intel driver from ports and they don't show any issues at all (SRIOV and PT are working). One of the servers has been wiped and installed with 13.1. Without the Intel driver from ports, it works well. As soon as the server is booted with the Intel driver from ports, it starts having issues. Sometimes it just freezes, and sometimes it triggers a trap 12 and the system reboots. The trap always occurs when the interface gets activated (ifconfig ixl2 up).

Something interesting: when compiling the driver from ports, it provides three options related to netmap, auto/on/off.
1 - When compiling the driver with netmap off, the problem disappears.
2 - When compiling the driver with netmap auto, the driver compiles but triggers the trap when booted and the interface is activated.
3 - When compiling the driver with netmap on, the driver fails to compile, which probably explains the aforementioned behaviour.

I have done some tests, added some IOV, and things seem to work. Still, somebody should fix the netmap support (as I'm not sure what the correct way to fix it is). I have sent an email to freebsd@intel.com.

Just to make sure I understand correctly: these issues are not affecting the iflib(4)-based ixl driver that comes with the stock kernel, are they? Is the issue only related to the Intel drivers from ports?

I have the same chipset Intel card on HPE (both media cards, with 1Gb and 10Gb). When IOV is enabled using the stock kernel ixl driver of FreeBSD, I get a panic at boot. When using the intel-ixl-kmod driver it works, but once SR-IOV is enabled I get bug #234073. When I upgrade to the latest intel-ixl-kmod, the kernel panics and reboots.
```
dmesg |grep ixl
module_register: cannot register pci/ixl from kernel; already loaded from if_ixl_updated.ko
Module pci/ixl failed to register: 17
ixl0: <Intel(R) Ethernet Connection 700 Series PF Driver, Version - 1.12.2> mem 0xeb000000-0xebffffff,0xef000000-0xef007fff at device 0.0 numa-domain 0 on pci9
ixl0: using 1024 tx descriptors and 1024 rx descriptors
ixl0: fw 5.5.67510 api 1.12 nvm 5.50 etid 80003373 oem 1.268.0
ixl0: The driver for the device detected a newer version of the NVM image than expected.
ixl0: Please install the most recent version of the network driver.
ixl0: PF-ID[0]: VFs 32, MSI-X 129, VF MSI-X 5, QPs 384, MDIO shared
ixl0: Using MSI-X interrupts with 9 vectors
ixl0: Allocating 8 queues for PF LAN VSI; 8 queues active
ixl0: Ethernet address: b4:7a:f1:dd:c6:00
ixl0: SR-IOV ready
ixl0: The device is not iWARP enabled
ixl1: <Intel(R) Ethernet Connection 700 Series PF Driver, Version - 1.12.2> mem 0xec000000-0xecffffff,0xef008000-0xef00ffff at device 0.1 numa-domain 0 on pci9
ixl1: using 1024 tx descriptors and 1024 rx descriptors
ixl1: fw 5.5.67510 api 1.12 nvm 5.50 etid 80003373 oem 1.268.0
ixl1: The driver for the device detected a newer version of the NVM image than expected.
ixl1: Please install the most recent version of the network driver.
ixl1: PF-ID[1]: VFs 32, MSI-X 129, VF MSI-X 5, QPs 384, MDIO shared
ixl1: Using MSI-X interrupts with 9 vectors
ixl1: Allocating 8 queues for PF LAN VSI; 8 queues active
ixl1: Ethernet address: b4:7a:f1:dd:c6:01
ixl1: SR-IOV ready
ixl1: The device is not iWARP enabled
ixl2: <Intel(R) Ethernet Connection 700 Series PF Driver, Version - 1.12.2> mem 0xed000000-0xedffffff,0xef010000-0xef017fff at device 0.2 numa-domain 0 on pci9
ixl2: using 1024 tx descriptors and 1024 rx descriptors
ixl2: fw 5.5.67510 api 1.12 nvm 5.50 etid 80003373 oem 1.268.0
ixl2: The driver for the device detected a newer version of the NVM image than expected.
ixl2: Please install the most recent version of the network driver.
ixl2: PF-ID[2]: VFs 32, MSI-X 129, VF MSI-X 5, QPs 384, I2C
ixl2: Using MSI-X interrupts with 9 vectors
ixl2: Allocating 8 queues for PF LAN VSI; 8 queues active
ixl2: Ethernet address: b4:7a:f1:dd:c6:02
ixl2: SR-IOV ready
ixl2: The device is not iWARP enabled
ixl3: <Intel(R) Ethernet Connection 700 Series PF Driver, Version - 1.12.2> mem 0xee000000-0xeeffffff,0xef018000-0xef01ffff at device 0.3 numa-domain 0 on pci9
ixl3: using 1024 tx descriptors and 1024 rx descriptors
ixl3: fw 5.5.67510 api 1.12 nvm 5.50 etid 80003373 oem 1.268.0
ixl3: The driver for the device detected a newer version of the NVM image than expected.
ixl3: Please install the most recent version of the network driver.
ixl3: PF-ID[3]: VFs 32, MSI-X 129, VF MSI-X 5, QPs 384, I2C
ixl3: Using MSI-X interrupts with 9 vectors
ixl3: Allocating 8 queues for PF LAN VSI; 8 queues active
ixl3: Ethernet address: b4:7a:f1:dd:c6:03
ixl3: SR-IOV ready
ixl3: The device is not iWARP enabled
ixl2: Link is up, 10 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: False, Flow Control: None
ixl2: link state changed to UP
ixl3: Link is up, 10 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: False, Flow Control: None
ixl3: link state changed to UP
```

```
pciconf -vl ixl2
ixl2@pci0:100:0:2: class=0x020000 rev=0x09 hdr=0x00 vendor=0x8086 device=0x37d3 subvendor=0x1590 subdevice=0x0219
    vendor = 'Intel Corporation'
    device = 'Ethernet Connection X722 for 10GbE SFP+'
    class = network
    subclass = ethernet
```

Tried it (112.35) with netmap support disabled and got a kernel panic:

```
Fatal trap 12: page fault while in kernel mode
cpuid = 10; apic id = 0a
fault virtual address= 0x54
fault code= supervisor read data, page not present
instruction pointer= 0x20:0xffffffff82f04b8a
stack pointer = 0x28:0xfffffe020d8108a0
frame pointer = 0x28:0xfffffe020d8108e0
code segment= base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process= 21330 (iovctl)
trap number= 12
panic: page fault
cpuid = 10
time = 1655897957
KDB: stack backtrace:
#0 0xffffffff80c69465 at kdb_backtrace+0x65
#1 0xffffffff80c1bb1f at vpanic+0x17f
#2 0xffffffff80c1b993 at panic+0x43
#3 0xffffffff810afdf5 at trap_fatal+0x385
#4 0xffffffff810afe4f at trap_pfault+0x4f
#5 0xffffffff81087528 at calltrap+0x8
#6 0xffffffff82efe716 at ixl_reconfigure_filters+0x66
#7 0xffffffff82f24b90 at ixl_vf_setup_vsi+0x3b0
#8 0xffffffff82f2469e at ixl_add_vf+0x1ee
#9 0xffffffff8086e407 at pci_iov_ioctl+0x1497
#10 0xffffffff80ab4e46 at devfs_ioctl+0xc6
#11 0xffffffff80d0cde4 at vn_ioctl+0x1a4
#12 0xffffffff80ab54fe at devfs_ioctl_f+0x1e
#13 0xffffffff80c897cb at kern_ioctl+0x25b
#14 0xffffffff80c894d1 at sys_ioctl+0xf1
#15 0xffffffff810b06ec at amd64_syscall+0x10c
#16 0xffffffff81087e3b at fast_syscall_common+0xf8
Uptime: 13s
```

Hi Vincenzo,

I think there are multiple issues that may need to be handled separately.

Current setup:
OS: 13.1-RELEASE
Card: Ethernet Connection X722 for 10GBASE-T

With stock kernel ixl driver:
1- NO-SRIOV - the Ethernet card works, the port comes up and connectivity looks OK.
2- SRIOV - when iovctl -C -f /etc/iovctl.conf is executed, the kernel panics.

With ixl driver from ports:
Version: ixl 1.12.2
boot loader: ixl_updated_load="YES"
1- NO-SRIOV - when the Ethernet ports come up (ifconfig up) there is a panic. This seems to be related to a problem with NETMAP; please see: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=264809.

Recompiling the driver from ports with netmap support disabled does the trick for both NO-SRIOV and SRIOV (at least for me).
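For anyone triaging similar reports, the panic text posted above can be picked apart mechanically. What follows is only a minimal Python sketch, not part of any FreeBSD tool. The helper names (`parse_frames`, `first_frame_after_trap`, `fault_address`) are hypothetical, the sample is an excerpt copied from the trace in this report, and the code assumes the `#N 0xADDR at symbol+0xOFF` frame layout and the `calltrap` entry seen in that trace.

```python
import re

# Matches KDB frames such as "#6 0xffffffff82efe716 at ixl_reconfigure_filters+0x66".
FRAME_RE = re.compile(
    r"#(\d+)\s+(0x[0-9a-fA-F]+)\s+at\s+([A-Za-z_][\w.]*)\+(0x[0-9a-fA-F]+)"
)
FAULT_RE = re.compile(r"fault virtual address\s*=\s*(0x[0-9a-fA-F]+)")


def parse_frames(text):
    """Return (index, address, symbol, offset) tuples in order of appearance."""
    return [
        (int(idx), int(addr, 16), sym, int(off, 16))
        for idx, addr, sym, off in FRAME_RE.findall(text)
    ]


def first_frame_after_trap(frames, trap_entry="calltrap"):
    """Return the frame recorded right after the trap entry, or None.

    Assumption: trap_entry names the trap handler entry as it appears in
    the posted trace; other platforms or releases may differ.
    """
    for pos, frame in enumerate(frames):
        if frame[2] == trap_entry and pos + 1 < len(frames):
            return frames[pos + 1]
    return None


def fault_address(text):
    """Return the reported fault virtual address as an int, or None."""
    match = FAULT_RE.search(text)
    return int(match.group(1), 16) if match else None


# Excerpt copied from the panic in this report.
SAMPLE = """
fault virtual address= 0x54
#4 0xffffffff810afe4f at trap_pfault+0x4f
#5 0xffffffff81087528 at calltrap+0x8
#6 0xffffffff82efe716 at ixl_reconfigure_filters+0x66
#7 0xffffffff82f24b90 at ixl_vf_setup_vsi+0x3b0
#8 0xffffffff82f2469e at ixl_add_vf+0x1ee
#9 0xffffffff8086e407 at pci_iov_ioctl+0x1497
"""

if __name__ == "__main__":
    frame = first_frame_after_trap(parse_frames(SAMPLE))
    print("first frame after trap: %s+%#x" % (frame[2], frame[3]))
    print("fault address: %#x" % fault_address(SAMPLE))
```

On the excerpt above this reports `ixl_reconfigure_filters+0x66` and a fault address of `0x54`. Two cautions apply. The reported instruction pointer (0xffffffff82f04b8a) does not match the address of frame #6, so the faulting instruction may lie in a function called from there rather than in `ixl_reconfigure_filters` itself. And while a fault address this small is often what a read through a NULL structure pointer looks like, that is an inference and not something the log states.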