Loading mlx4en on CURRENT (main-n250453-7ac82c96fe7) hangs kldload. Pressing ^C allows the module to continue and to load properly. This happens both from a shell and when loading the module via 'kld_list'.

The mlx4en part of dmesg is:

mlx4_core0: <mlx4_core> mem 0xfce00000-0xfcefffff,0xe0000000-0xe07fffff irq 54 at device 0.0 on pci9
mlx4_core: Mellanox ConnectX core driver v3.7.0 (July 2021)
mlx4_core: Initializing mlx4_core
mlx4_core0: Unable to determine PCI device chain minimum BW
mlx4_en mlx4_core0: Activating port:1
mlxen0: Ethernet address: ec:0d:9a:e1:34:20
mlx4_en: mlx4_core0: Port 1: Using 16 TX rings
mlxen0: link state changed to DOWN
mlx4_en: mlx4_core0: Port 1: Using 16 RX rings
mlx4_en: mlxen0: Using 16 TX rings
mlx4_en: mlxen0: Using 16 RX rings
mlx4_en: mlxen0: Initializing port
mlx4_en mlx4_core0: Activating port:2
mlxen1: Ethernet address: ec:0d:9a:e1:34:21
mlx4_en: mlx4_core0: Port 2: Using 16 TX rings
mlxen1: link state changed to DOWN
mlx4_en: mlx4_core0: Port 2: Using 16 RX rings
mlx4_en: mlxen1: Using 16 TX rings
mlx4_en: mlxen1: Using 16 RX rings
mlx4_en: mlxen1: Initializing port
mlx4_en: mlxen1: Link Up
mlxen1: link state changed to UP
mlx4_en: mlxen0: Link Up
mlxen0: link state changed to UP

This happens on my Ryzen 3700X system, but it has also happened previously on other systems. 13-RELEASE doesn't have this problem.
Can you run procstat -akk when this happens? Have you updated the firmware on this device? --HPS
Created attachment 229410 [details] procstat -akk
Created attachment 229411 [details] mstflint query output
Updated to the latest firmware for this card (2.42.5000); kldload still hangs.
Did you run the procstat -akk as root?
Created attachment 229432 [details] procstat -akk run as root
Comment on attachment 229432 [details] procstat -akk run as root

@kib: The LinuxKPI can load modules inside kldload:

1050 100580 kldload - mi_switch+0x155 sleepq_switch+0x119 sleepq_catch_signals+0x266 sleepq_wait_sig+0x9 _sleep+0x294 kern_kldload+0xd5 mlx4_request_modules+0x9e mlx4_load_one+0x2f8d mlx4_init_one+0x4cc linux_pci_attach_device+0x42e device_attach+0x3c1 device_probe_and_attach+0x70 pci_driver_added+0xf3 devclass_driver_added+0x39 devclass_add_driver+0x147 _linux_pci_register_driver+0xcf

This is a regression after:

commit e266a0f7f001c7886eab56d8c058d92d87010400
Author: Konstantin Belousov <kib@FreeBSD.org>
Date: Thu May 20 17:50:43 2021 +0300

    kern linker: do not allow more than one kldload and kldunload syscalls simultaneously

    kld_sx is dropped e.g. for executing sysinits, which allows user
    to initiate kldunload while module is not yet fully initialized.

    Reviewed by: markj
    Differential revision: https://reviews.freebsd.org/D30456
    Sponsored by: The FreeBSD Foundation
    MFC after: 1 week

Maybe allow recursion here? Or use a taskqueue to load the module?

--HPS
Please try https://reviews.freebsd.org/D32972
Tom Jones: ping
kldload doesn't hang with this, tested on e383c423c492781bd7e7a0de9dfe433e4d6a4eed
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4f924a786ae08af496dfe55230f8fe1e2ca16150

commit 4f924a786ae08af496dfe55230f8fe1e2ca16150
Author: Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2021-11-12 19:45:06 +0000
Commit: Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2021-11-28 08:36:09 +0000

    linker_kldload_busy(): allow recursion

    Some drivers recursively loads modules by explicit calls to kldload
    during initialization, which might occur during kldload.

    PR: 259748
    Reported and tested by: thj
    Reviewed by: markj
    Sponsored by: Nvidia networking
    MFC after: 1 week
    Differential revision: https://reviews.freebsd.org/D32972

 sys/kern/kern_linker.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)
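
For readers following the fix, the general idea of a "busy" gate that tolerates recursion by its current owner can be sketched in user space as below. This is only an illustration using POSIX threads, not the code committed to sys/kern/kern_linker.c; the names busy_gate, busy_gate_enter and busy_gate_exit are hypothetical. Without the owner check, a thread that enters the gate a second time (as kldload did when the driver attach path called kern_kldload again) would sleep waiting for itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space analog of a busy gate; not kernel code. */
struct busy_gate {
	pthread_mutex_t	lock;
	pthread_cond_t	cv;
	bool		owned;	/* true while some thread holds the gate */
	pthread_t	owner;	/* valid only when owned is true */
	unsigned	depth;	/* recursion depth of the owner */
};

static void
busy_gate_init(struct busy_gate *g)
{
	pthread_mutex_init(&g->lock, NULL);
	pthread_cond_init(&g->cv, NULL);
	g->owned = false;
	g->depth = 0;
}

static void
busy_gate_enter(struct busy_gate *g)
{
	pthread_mutex_lock(&g->lock);
	if (g->owned && pthread_equal(g->owner, pthread_self())) {
		/* Recursive entry by the owner: count it, do not sleep. */
		g->depth++;
	} else {
		/* Any other thread waits until the gate is released. */
		while (g->owned)
			pthread_cond_wait(&g->cv, &g->lock);
		g->owned = true;
		g->owner = pthread_self();
		g->depth = 1;
	}
	pthread_mutex_unlock(&g->lock);
}

static void
busy_gate_exit(struct busy_gate *g)
{
	pthread_mutex_lock(&g->lock);
	/* Only the owner may leave the gate. */
	assert(g->owned && pthread_equal(g->owner, pthread_self()));
	assert(g->depth > 0);
	if (--g->depth == 0) {
		/* Outermost exit: release the gate and wake waiters. */
		g->owned = false;
		pthread_cond_broadcast(&g->cv);
	}
	pthread_mutex_unlock(&g->lock);
}
```

In this sketch a nested busy_gate_enter() from the owning thread returns immediately, and the gate is released to other threads only when the outermost busy_gate_exit() runs.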
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=2c52eba4f46e2cc9a4fda3a9e6e81e06fb8daf57

commit 2c52eba4f46e2cc9a4fda3a9e6e81e06fb8daf57
Author: Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2021-11-12 19:45:06 +0000
Commit: Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2021-12-05 01:02:57 +0000

    linker_kldload_busy(): allow recursion

    PR: 259748

    (cherry picked from commit 4f924a786ae08af496dfe55230f8fe1e2ca16150)

 sys/kern/kern_linker.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)