Bug 272828 - Intel XL710 : kernel panic with SR-IOV - intel-ixl-kmod-1.12.40
Summary: Intel XL710 : kernel panic with SR-IOV - intel-ixl-kmod-1.12.40
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.2-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-net (Nobody)
URL:
Keywords: IntelNetworking
Depends on:
Blocks:
 
Reported: 2023-07-30 21:27 UTC by Santiago Martinez
Modified: 2023-08-22 21:07 UTC
CC List: 4 users

See Also:


Attachments
ixl panic screenshot (320.13 KB, image/png)
2023-07-30 21:27 UTC, Santiago Martinez

Description Santiago Martinez 2023-07-30 21:27:27 UTC
Created attachment 243717
ixl panic screenshot

Hi there, the intel_ixl_updated (intel-ixl-kmod-1.12.40) driver makes the kernel panic when trying to create virtual functions. Without SR-IOV it seems to be stable. Please see the attached image for the panic.

Release : 13.2-p1
Hardware : Supermicro (happens on multiple machines).


dmesg:
ixl0: <Intel(R) Ethernet Connection 700 Series PF Driver, Version - 1.12.40> mem 0xb0800000-0xb0ffffff,0xb1808000-0xb180ffff irq 118 at device 0.0 on pci14
ixl0: using 1024 tx descriptors and 1024 rx descriptors
ixl0: fw 7.2.60285 api 1.9 nvm 7.20 etid 80008322 oem 1.266.0
ixl0: PF-ID[0]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
ixl0: Using MSI-X interrupts with 9 vectors
ixl0: Allocating 8 queues for PF LAN VSI; 8 queues active
ixl0: Ethernet address: 3c:ec:ef:32:c3:ec
ixl0: PCI Express Bus: Speed 8.0GT/s Width x8
ixl0: SR-IOV ready
ixl0: The device is not iWARP enabled


sysctl output:
dev.ixl.0.fw_version: fw 7.2.60285 api 1.9 nvm 7.20 etid 80008322 oem 1.266.0
dev.ixl.0.current_speed: 40 Gbps
dev.ixl.0.supported_speeds: 32
dev.ixl.0.advertise_speed: 32
dev.ixl.0.fc: 0
dev.ixl.0.%parent: pci14
dev.ixl.0.%pnpinfo: vendor=0x8086 device=0x1583 subvendor=0x15d9 subdevice=0x084a class=0x020000
dev.ixl.0.%location: slot=0 function=0 dbsf=pci0:65:0:0
dev.ixl.0.%driver: ixl
dev.ixl.0.%desc: Intel(R) Ethernet Connection 700 Series PF Driver, Version - 1.12.40

pciconf -lv ixl0
ixl0@pci0:65:0:0:       class=0x020000 rev=0x02 hdr=0x00 vendor=0x8086 device=0x1583 subvendor=0x15d9 subdevice=0x084a
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Controller XL710 for 40GbE QSFP+'
    class      = network
    subclass   = ethernet
Comment 1 Aleksandr Fedorov freebsd_committer freebsd_triage 2023-07-31 08:45:26 UTC
This panic is very similar to https://reviews.freebsd.org/D35649
Comment 2 Santiago Martinez 2023-08-06 15:54:26 UTC
Hi, sorry for my late reply. I have tried the patch but it is still crashing.
Same exact crash output.
Santi
Comment 3 Santiago Martinez 2023-08-06 16:18:49 UTC
I have just tested with intel-ixl-kmod-1.12.3 and it does work with SR-IOV.
At least it does not crash while reconfiguring the filters.
Comment 4 Santiago Martinez 2023-08-06 17:15:21 UTC
BT output:

#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:396
#2  0xffffffff807b871a in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:484
#3  0xffffffff807b8bbe in vpanic (fmt=<optimized out>, ap=ap@entry=0xfffffe05db485730) at /usr/src/sys/kern/kern_shutdown.c:923
#4  0xffffffff807b89f3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:847
#5  0xffffffff80ba3127 in trap_fatal (frame=0xfffffe05db485820, eva=124) at /usr/src/sys/amd64/amd64/trap.c:942
#6  0xffffffff80ba317f in trap_pfault (frame=0xfffffe05db485820, usermode=false, signo=<optimized out>, ucode=<optimized out>) at /usr/src/sys/amd64/amd64/trap.c:761
#7  <signal handler called>
#8  if_getcapenable (ifp=0x0) at /usr/src/sys/net/if.c:4171
#9  0xffffffff822406d4 in ixl_reconfigure_filters () from /home/smartinez/ixl-1.12.40/src/if_ixl.ko
#10 0xffffffff82266c0c in ixl_vf_setup_vsi () from /home/smartinez/ixl-1.12.40/src/if_ixl.ko
#11 0xffffffff8226671e in ixl_add_vf () from /home/smartinez/ixl-1.12.40/src/if_ixl.ko
#12 0xffffffff8059fb22 in PCI_IOV_ADD_VF (dev=0xfffffe05d068c000, vfnum=0, config=0xf8) at ./pci_iov_if.h:60
#13 pci_iov_enumerate_vfs (dinfo=<optimized out>, config=0xfffff8010b253740, first_rid=<optimized out>, rid_stride=1) at /usr/src/sys/dev/pci/pci_iov.c:665
#14 pci_iov_config (cdev=<optimized out>, arg=<optimized out>) at /usr/src/sys/dev/pci/pci_iov.c:761
#15 pci_iov_ioctl (dev=<optimized out>, cmd=<optimized out>, data=<optimized out>, fflag=<optimized out>, td=<optimized out>) at /usr/src/sys/dev/pci/pci_iov.c:986
#16 0xffffffff8064c7c6 in devfs_ioctl (ap=0xfffffe05db485ba8) at /usr/src/sys/fs/devfs/devfs_vnops.c:944
#17 0xffffffff808ac104 in vn_ioctl (fp=0xfffff801629fa1e0, com=18446735282067515392, data=0xfffffe05db485d50, active_cred=0xfffff801de9ea700, td=0x10) at /usr/src/sys/kern/vfs_vnops.c:1697
#18 0xffffffff8064ce7e in devfs_ioctl_f (fp=0x0, com=18446735282067515392, data=0xf8, cred=0xfffffe0670b10cc0, td=0x10) at /usr/src/sys/fs/devfs/devfs_vnops.c:875
#19 0xffffffff8082719d in fo_ioctl (fp=0xfffff801629fa1e0, com=18446735282067515392, data=0xf8, active_cred=0xfffffe0670b10cc0, td=0xfffffe05d839e900) at /usr/src/sys/sys/file.h:361
#20 kern_ioctl (td=0x10, td@entry=0xfffffe05d839e900, fd=<optimized out>, com=18446735282067515392, com@entry=2148560906, data=0xf8 <error: Cannot access memory at address 0xf8>, 
    data@entry=0xfffffe05db485d50 "") at /usr/src/sys/kern/sys_generic.c:803
#21 0xffffffff80826e80 in sys_ioctl (td=0xfffffe05d839e900, uap=0xfffffe05d839ece8) at /usr/src/sys/kern/sys_generic.c:711
#22 0xffffffff80ba3a1c in syscallenter (td=0xfffffe05d839e900) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:190
#23 amd64_syscall (td=0xfffffe05d839e900, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1183
#24 <signal handler called>
#25 0x00000e55d9e4bb8a in ?? ()
Backtrace stopped: Cannot access memory at address 0xe55d67c9fe8
Comment 5 Santiago Martinez 2023-08-09 13:01:28 UTC
Hi everyone, does anyone know who the right person/contact at Intel is for the FreeBSD drivers?
Comment 6 Krzysztof Galazka freebsd_committer freebsd_triage 2023-08-09 13:23:43 UTC
(In reply to Santiago Martinez from comment #5)
Hi,

That would be Eric Joyner (erj) and me. At first glance it seems that using ixl_reconfigure_filters for VFs may not be a good idea. I'll look into it.
Comment 7 Santiago Martinez 2023-08-09 14:16:05 UTC
Thanks a lot, Krzysztof. Let me know if there is anything I can help with.
Comment 8 Santiago Martinez 2023-08-14 15:35:25 UTC
(In reply to Santiago Martinez from comment #7)
Hi Krzysztof, 

Today I was checking and it seems it is related to https://reviews.freebsd.org/D35649. I think last time I applied the patch but loaded the old driver. Sorry for that.

Adding a check for vsi->ifp being NULL does work, because then if_getcapenable() is not called when ifp is NULL. I have also added a print line so I can follow it. The output looks like this (load driver, create VF, delete VF):

[3097] ixl0: <Intel(R) Ethernet Connection 700 Series PF Driver, Version - 1.12.40> mem 0xb0800000-0xb0ffffff,0xb1808000-0xb180ffff irq 118 at device 0.0 on pci14
[3097] ixl0: using 1024 tx descriptors and 1024 rx descriptors
[3097] ixl0: fw 7.2.60285 api 1.9 nvm 7.20 etid 80008322 oem 1.266.0
[3097] ixl0: PF-ID[0]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
[3097] ixl0: Using MSI-X interrupts with 9 vectors
[3097] ixl0: Allocating 8 queues for PF LAN VSI; 8 queues active
[3097] ixl0: Ethernet address: 3c:ec:ef:32:c3:ec
[3097] ixl0: Link is up, 40 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: False, Flow Control: None
[3097] ixl0: link state changed to UP
[3097] ixl0: PCI Express Bus: Speed 8.0GT/s Width x8
[3097] ixl0: SR-IOV ready
[3097] ixl0: The device is not iWARP enabled
[3097] ixl0: link state changed to DOWN
[3097] ixl1: <Intel(R) Ethernet Connection 700 Series PF Driver, Version - 1.12.40> mem 0xb0000000-0xb07fffff,0xb1800000-0xb1807fff irq 118 at device 0.1 on pci14
[3097] ixl1: using 1024 tx descriptors and 1024 rx descriptors
[3097] ixl1: fw 7.2.60285 api 1.9 nvm 7.20 etid 80008322 oem 1.266.0
[3097] ixl1: PF-ID[1]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
[3097] ixl1: Using MSI-X interrupts with 9 vectors
[3097] ixl1: Allocating 8 queues for PF LAN VSI; 8 queues active
[3097] ixl1: Ethernet address: 3c:ec:ef:32:c3:ed
[3097] ixl1: PCI Express Bus: Speed 8.0GT/s Width x8
[3097] ixl1: SR-IOV ready
[3097] ixl1: The device is not iWARP enabled
[3098] ixl0: Link is up, 40 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: False, Flow Control: None
[3098] ixl0: link state changed to UP
[3098] ixl1: Link is up, 40 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: False, Flow Control: None
[3098] ixl1: link state changed to UP
[3098] vsi-ipf is null
[3098] vsi-ipf is null
[3098] vsi-ipf is null
[3098] vsi-ipf is null
[3098] vsi-ipf is null
[3098] vsi-ipf is null
[3098] vsi-ipf is null
[3098] vsi-ipf is null
[3098] vsi-ipf is null
[3098] vsi-ipf is null
[3098] vsi-ipf is null
[3098] vsi-ipf is null
[3098] vsi-ipf is null
[3098] vsi-ipf is null
[3098] vsi-ipf is null
[3098] vsi-ipf is null
[3098] ppt0 at device 0.16 on pci14
[3098] ppt1 at device 0.17 on pci14
[3098] ppt2 at device 0.18 on pci14
[3098] ppt3 at device 0.19 on pci14
[3098] ppt4 at device 0.20 on pci14
[3098] ppt5 at device 0.21 on pci14
[3098] ppt6 at device 0.22 on pci14
[3098] ppt7 at device 0.23 on pci14
[3098] ppt0: detached
[3098] ppt1: detached
[3098] ppt2: detached
[3098] ppt3: detached
[3098] ppt4: detached
[3098] ppt5: detached
[3098] ppt6: detached
[3098] ppt7: detached
[3099] ixl0: detached
[3099] pci14: <network, ethernet> at device 0.0 (no driver attached)
[3100] ixl1: detached
[3100] pci14: <network, ethernet> at device 0.1 (no driver attached)
[3100] Warning: memory type ixl leaked memory on destroy (8 allocations, 256 bytes leaked).
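
For readers following along, the kind of NULL check described in this comment can be sketched roughly as below. This is only an illustration, not the actual patch from D35649 or the driver source: the types and names (mock_ifnet, mock_vsi, mock_getcapenable, mock_vsi_wants_vlan_filter, MOCK_CAP_VLAN_HWFILTER) are hypothetical stand-ins for the kernel's struct ifnet, the driver's VSI structure and if_getcapenable(). It assumes, as the backtrace in comment 4 suggests, that a VF VSI reaches the filter code with no ifp attached.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real code uses the kernel's struct ifnet
 * and the driver's own VSI structure. */
struct mock_ifnet {
	int if_capenable;	/* enabled capability bits */
};

struct mock_vsi {
	struct mock_ifnet *ifp;	/* assumed NULL for a VF VSI */
};

/* Hypothetical capability bit, for illustration only. */
#define MOCK_CAP_VLAN_HWFILTER 0x1

/* Like the accessor in frame #8 of the backtrace, this dereferences ifp
 * unconditionally, so the caller must not pass NULL. */
static int
mock_getcapenable(const struct mock_ifnet *ifp)
{
	return (ifp->if_capenable);
}

/* Returns 1 if HW VLAN filtering is enabled on the VSI's interface,
 * 0 otherwise. The NULL guard is the point of the sketch. */
static int
mock_vsi_wants_vlan_filter(const struct mock_vsi *vsi)
{
	if (vsi->ifp == NULL)
		return (0);	/* no interface: skip the capability lookup */
	return ((mock_getcapenable(vsi->ifp) & MOCK_CAP_VLAN_HWFILTER) != 0);
}
```

With the guard in place, a VSI without an ifp simply skips the capability lookup instead of faulting on a NULL pointer dereference as in frame #8 of the backtrace.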
Comment 9 Santiago Martinez 2023-08-22 21:07:49 UTC
Hi Krzysztof, did you have a chance to review the code?

I saw there is a new driver release, but I haven't seen any change related to the SR-IOV panic.

Best regards.