Bug 266480 - Panic "sleeping thread" with qlnxe driver
Summary: Panic "sleeping thread" with qlnxe driver
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.3-STABLE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-09-18 09:38 UTC by Leonardo Secci
Modified: 2023-01-24 07:03 UTC

See Also:


Attachments
Dump info (17.10 KB, application/x-compressed-tar)
2022-09-18 09:38 UTC, Leonardo Secci

Description Leonardo Secci 2022-09-18 09:38:39 UTC
Created attachment 236659 [details]
Dump info

Dear Support Team,

I am experiencing problems using the qlnxe driver with a QLogic 41000 series device.

A crash occurs simply by enabling or disabling a VLAN interface created on a NIC.

I assume that the operation triggers a series of hotplug events, which lead to race conditions that cause conflicts.

Attached is the crash report.

Best regards
Comment 1 Zhenlei Huang freebsd_committer freebsd_triage 2022-11-21 09:39:31 UTC
From the stack dump:
> Sleeping thread (tid 100919, pid 62482) owns a non-sleepable lock
> KDB: stack backtrace of thread 100919:
> sched_switch() at sched_switch+0x630/frame 0xfffffe00c741ddf0
> mi_switch() at mi_switch+0xd4/frame 0xfffffe00c741de20
> sleepq_timedwait() at sleepq_timedwait+0x2f/frame 0xfffffe00c741de60
> _sleep() at _sleep+0x1c8/frame 0xfffffe00c741dee0
> pause_sbt() at pause_sbt+0xf1/frame 0xfffffe00c741df10
> qlnx_stop() at qlnx_stop+0x4b5/frame 0xfffffe00c741dfa0
> qlnx_init_locked() at qlnx_init_locked+0x2a/frame 0xfffffe00c741e070
> qlnx_ioctl() at qlnx_ioctl+0x53a/frame 0xfffffe00c741e0d0
> ifhwioctl() at ifhwioctl+0x596/frame 0xfffffe00c741e150
> ifioctl() at ifioctl+0x4bc/frame 0xfffffe00c741e210
> kern_ioctl() at kern_ioctl+0x2b7/frame 0xfffffe00c741e270
> sys_ioctl() at sys_ioctl+0x101/frame 0xfffffe00c741e340
> amd64_syscall() at amd64_syscall+0x387/frame 0xfffffe00c741e470
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00c741e470
> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800b54d4a, rsp = 0x7fffffffd198, rbp = 0x7fffffffd210 ---
> panic: sleeping thread
> cpuid = 3
> time = 1663318115
> KDB: enter: panic

`qlnx_stop()` tried to sleep while holding a mutex. That is prohibited, since a mutex is not a sleepable lock.

See mutex(9):
> Sleeping
>     Sleeping while holding a mutex (except for Giant) is never safe and
>     should be avoided.  There are numerous assertions which will fail if this
>     is attempted.
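To make the rule concrete, here is a minimal userspace sketch of it. This is a toy model, not kernel code: `toy_thread`, `toy_mtx_lock()`, `toy_mtx_unlock()` and `toy_pause()` are hypothetical names invented for illustration, and the check is collapsed into the sleep call itself for simplicity. The kernel's actual detection path is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy userspace model; none of these names are FreeBSD kernel APIs. */
struct toy_thread {
	int nonsleepable_held;	/* number of mutex-like locks currently held */
};

static void
toy_mtx_lock(struct toy_thread *td)
{
	td->nonsleepable_held++;
}

static void
toy_mtx_unlock(struct toy_thread *td)
{
	assert(td->nonsleepable_held > 0);
	td->nonsleepable_held--;
}

/*
 * Returns true if the sleep would be allowed in this model, false if the
 * thread still holds a non-sleepable lock (the situation in the backtrace).
 */
static bool
toy_pause(const struct toy_thread *td, const char *wmesg)
{
	if (td->nonsleepable_held > 0) {
		printf("%s: sleep refused, %d non-sleepable lock(s) held\n",
		    wmesg, td->nonsleepable_held);
		return (false);
	}
	return (true);
}
```

In this model, calling `toy_pause()` between `toy_mtx_lock()` and `toy_mtx_unlock()` is refused, while the same call after the unlock is allowed. That corresponds to `qlnx_stop()` reaching `pause_sbt()` with the lock still held.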

The driver calls `qlnx_mdelay()`, whose behavior differs between cold boot and hot boot:
> #define qlnx_mdelay(fn, msecs)  \
>        {\
>                if (cold) \
>                        DELAY((msecs * 1000)); \
>                else  \
>                        pause(fn, qlnx_ms_to_hz(msecs)); \
>        }

I guess this happens when your box is hot rebooted.

Can you please cold reboot your box (`shutdown -p`, then power it back on) and try again?
Comment 2 Zhenlei Huang freebsd_committer freebsd_triage 2022-11-21 10:02:42 UTC
(In reply to Zhenlei Huang from comment #1)

> I guess this happens when your box is hot rebooted.

> Can you please cold reboot your box (`shutdown -p`, then power it back on) and try again?

Sorry, that probably will not help, as it is too late.
`sys/x86/x86/autoconf.c` sets `cold` to 0 right after device probing finishes, before rc(8) has a chance to run (rc(8) is called by init(8), which runs after device probing).
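The ordering described above can be sketched as a small userspace model. Everything here is hypothetical and for illustration only: `toy_cold` stands in for the kernel's `cold` variable, `toy_finish_autoconf()` for the point where device probing ends, and `toy_mdelay_kind()` for the branch taken inside `qlnx_mdelay()`.

```c
#include <assert.h>
#include <stdio.h>

/* Toy model of the `cold` flag; these are not FreeBSD kernel symbols. */
enum toy_delay_kind { TOY_BUSY_WAIT, TOY_SLEEP };

static int toy_cold = 1;	/* 1 while devices are still being probed */

/* Mirrors the if/else in the quoted macro: DELAY() vs. pause(). */
static enum toy_delay_kind
toy_mdelay_kind(void)
{
	return (toy_cold ? TOY_BUSY_WAIT : TOY_SLEEP);
}

/* Stands in for the end of device probing, where `cold` is cleared. */
static void
toy_finish_autoconf(void)
{
	toy_cold = 0;
}

/* An ioctl issued from userland can only arrive after probing is done. */
static enum toy_delay_kind
toy_ioctl_delay(void)
{
	assert(toy_cold == 0);
	return (toy_mdelay_kind());
}
```

Under these assumptions, any delay requested from the ioctl path takes the sleeping branch regardless of how the machine was booted, which is why a cold reboot is not expected to change the outcome.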
Comment 3 jjcomput 2023-01-24 07:03:47 UTC
Yeah, I'm hitting this exact bug as well. Has anyone been able to find a workaround? I did notice something peculiar, though: I was able to make interface changes on the console (through the iLO remote console) without causing the kernel panic before going through the pfSense first-time setup. Once I went through that, it started panicking on every change, just as described here, whether from the console or not.

Since this was a fresh install anyway, I'm just going to reinstall pfSense and try to set them all in the console before touching the web interface, somehow skipping the first-time setup, since it will overwrite the changes made prior and leave you stuck. Will report back.

QL41164 on the qlnxe driver