Bug 277432 - iwlwifi panic: Sleeping thread owns a non-sleepable lock
Summary: iwlwifi panic: Sleeping thread owns a non-sleepable lock
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: wireless
Version: 13.3-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-wireless (Nobody)
URL:
Keywords: crash
Depends on:
Blocks: iwlwifi
Reported: 2024-03-02 10:35 UTC by weiss
Modified: 2024-03-03 04:47 UTC (History)
Description weiss 2024-03-02 10:35:46 UTC
FreeBSD  13.3-RC1 FreeBSD 13.3-RC1 releng/13.3-n257425-8997b0270dae GENERIC amd64

Running a 13.2 userland, wireless with WPA2-PSK:
Mar  2 10:13:14 laptop kernel: iwlwifi0: <iwlwifi> mem 0xec000000-0xec001fff irq 18 at device 0.0 on pci2
Mar  2 10:13:14 laptop kernel: iwlwifi0: Detected crf-id 0xbadcafe, cnv-id 0x10 wfpm id 0x80000000
Mar  2 10:13:14 laptop kernel: iwlwifi0: PCI dev 24fd/0050, rev=0x230, rfid=0xd55555d5
Mar  2 10:13:14 laptop kernel: iwlwifi0: successfully loaded firmware image 'iwlwifi-8265-36.ucode'
Mar  2 10:13:14 laptop kernel: iwlwifi0: loaded firmware version 36.ca7b901d.0 8265-36.ucode op_mode iwlmvm
Mar  2 10:13:14 laptop kernel: iwlwifi0: Detected Intel(R) Dual Band Wireless AC 8265, REV=0x230
Mar  2 10:13:14 laptop kernel: iwlwifi0: base HW address: 24:ee:9a:b6:36:ea, OTP minor version: 0x0

After about an hour of usage I got this panic:


Mar  2 11:09:10 laptop kernel: iwlwifi0: linuxkpi_ieee80211_connection_loss: vif 0xfffffe0093b72e80 vap 0xfffffe0093b72010 state RUN
Mar  2 11:09:10 laptop kernel: wlan0: link state changed to DOWN
Mar  2 11:09:10 laptop wpa_supplicant[350]: wlan0: CTRL-EVENT-DISCONNECTED bssid=a0:f3:c1:74:98:ec reason=0
Mar  2 11:09:10 laptop wpa_supplicant[350]: ioctl[SIOCS80211, op=20, val=0, arg_len=7]: Can't assign requested address
Mar  2 11:09:10 laptop kernel: iwlwifi0: Couldn't drain frames for staid 0, status 0x8
Mar  2 11:09:10 laptop kernel: iwlwifi0: lkpi_sta_run_to_init:2173: mo_sta_state(NOTEXIST) failed: -5
Mar  2 11:09:10 laptop kernel: iwlwifi0: lkpi_iv_newstate: error -5 during state transition 5 (RUN) -> 0 (INIT)
Mar  2 11:09:11 laptop kernel: iwlwifi0: lkpi_sta_scan_to_auth:1033: lvif 0xfffffe0093b72000 vap 0xfffffe0093b72010 iv_bss 0xfffffe00d21f9000 lvif_bss 0xfffff8000715c800 lvif_bss->ni 0xfffffe00cafe2000 synched 0
Mar  2 11:09:11 laptop kernel: iwlwifi0: lkpi_iv_newstate: error 16 during state transition 1 (SCAN) -> 2 (AUTH)
Mar  2 11:10:00 laptop syslogd: kernel boot file is /boot/kernel/kernel
Mar  2 11:10:00 laptop kernel: Sleeping thread (tid 100876, pid 0) owns a non-sleepable lock
Mar  2 11:10:00 laptop kernel: KDB: stack backtrace of thread 100876:
Mar  2 11:10:00 laptop kernel: #0 0xffffffff80c11dbf at mi_switch+0xbf
Mar  2 11:10:00 laptop kernel: #1 0xffffffff80c11520 at _sleep+0x1f0
Mar  2 11:10:00 laptop kernel: #2 0xffffffff80c67151 at taskqueue_thread_loop+0xb1
Mar  2 11:10:00 laptop kernel: #3 0xffffffff80bc094d at fork_exit+0x7d
Mar  2 11:10:00 laptop kernel: #4 0xffffffff8109ab9e at fork_trampoline+0xe
Mar  2 11:10:00 laptop kernel: panic: sleeping thread
Mar  2 11:10:00 laptop kernel: cpuid = 1
Mar  2 11:10:00 laptop kernel: time = 1709374162
Mar  2 11:10:00 laptop kernel: KDB: stack backtrace:
Mar  2 11:10:00 laptop kernel: #0 0xffffffff80c514c5 at kdb_backtrace+0x65
Mar  2 11:10:00 laptop kernel: #1 0xffffffff80c04e22 at vpanic+0x152
Mar  2 11:10:00 laptop kernel: #2 0xffffffff80c04cc3 at panic+0x43
Mar  2 11:10:00 laptop kernel: #3 0xffffffff80c69673 at propagate_priority+0x293
Mar  2 11:10:00 laptop kernel: #4 0xffffffff80c6a1c4 at turnstile_wait+0x314
Mar  2 11:10:00 laptop kernel: #5 0xffffffff80be1bbb at __mtx_lock_sleep+0x17b
Mar  2 11:10:00 laptop kernel: #6 0xffffffff80e58960 at linuxkpi_ieee80211_find_sta+0xd0
Mar  2 11:10:00 laptop kernel: #7 0xffffffff80e58a0f at linuxkpi_ieee80211_find_sta_by_ifaddr+0x7f
Mar  2 11:10:00 laptop kernel: #8 0xffffffff831312b8 at iwl_mvm_rx_rx_mpdu+0x1e8
Mar  2 11:10:00 laptop kernel: #9 0xffffffff8315eeb4 at iwl_pcie_rx_handle+0x444
Mar  2 11:10:00 laptop kernel: #10 0xffffffff8315e78d at iwl_pcie_napi_poll+0x2d
Mar  2 11:10:00 laptop kernel: #11 0xffffffff80e674cf at lkpi_napi_task+0xf
Mar  2 11:10:00 laptop kernel: #12 0xffffffff80c65ed2 at taskqueue_run_locked+0x182
Mar  2 11:10:00 laptop kernel: #13 0xffffffff80c67162 at taskqueue_thread_loop+0xc2
Mar  2 11:10:00 laptop kernel: #14 0xffffffff80bc094d at fork_exit+0x7d
Mar  2 11:10:00 laptop kernel: #15 0xffffffff8109ab9e at fork_trampoline+0xe
Mar  2 11:10:00 laptop kernel: Uptime: 56m26s
Comment 1 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-03-02 10:53:36 UTC
I believe the first half of the problem is a duplicate of PR 275255, in a more recent incarnation after other changes.
It seems that with the "older" iwlwifi firmware API, something else we do already removes the station from the firmware, so when we then try to remove it explicitly we get an error; from there on the state is out of sync and follow-up errors happen.


The second half (locking) looks like a real issue that shouldn't happen, but I wonder how much of it is also an error-recovery path, or the result of something having gone out of sync (in net80211) that shouldn't have. That part will need separate investigation.
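
To illustrate the out-of-sync sequence suspected for the first half of the problem, here is a minimal userland sketch in C. It is a toy model only, not the LinuxKPI or iwlwifi code: every name in it (toy_fw, toy_drv, toy_fw_remove_sta, toy_teardown_strict, toy_simulate_disconnect) is hypothetical, and the use of -EIO is an assumption chosen to mirror the "failed: -5" in the log above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names exist in the driver. */

/* What the "firmware" currently knows. */
struct toy_fw {
	bool has_sta;
};

/* What the driver-side state machine believes. */
struct toy_drv {
	bool thinks_sta_present;
};

/* Explicit removal fails if the station is already gone (assumed -EIO). */
static int
toy_fw_remove_sta(struct toy_fw *fw)
{
	if (!fw->has_sta)
		return (-EIO);
	fw->has_sta = false;
	return (0);
}

/* Strict teardown: on error, bail out and leave driver state untouched. */
static int
toy_teardown_strict(struct toy_fw *fw, struct toy_drv *drv)
{
	int error;

	error = toy_fw_remove_sta(fw);
	if (error != 0)
		return (error);
	drv->thinks_sta_present = false;
	return (0);
}

/*
 * Scenario: the station is associated, some earlier step implicitly
 * removes it from the firmware, then the explicit teardown runs.
 */
static int
toy_simulate_disconnect(struct toy_fw *fw, struct toy_drv *drv)
{
	fw->has_sta = true;
	drv->thinks_sta_present = true;

	fw->has_sta = false;	/* implicit removal by "something else" */

	return (toy_teardown_strict(fw, drv));
}
```

Calling toy_simulate_disconnect() in this model returns an error while the driver-side flag still says the station is present, which is the kind of disagreement that could make later transitions (such as SCAN -> AUTH) fail. Whether the real code path behaves this way is exactly what needs investigating.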