Bug 274382

Summary: iwlwifi Invalid TXQ id (FIXED)
Product: Base System Reporter: Bjoern A. Zeeb <bz>
Component: wirelessAssignee: Bjoern A. Zeeb <bz>
Status: In Progress ---    
Severity: Affects Some People CC: bz, cc, dch, emaste, fbugzilla, gothmog, hrs, justinlbouchard, mark, mav, maxim.usatov, nunziotocci2000, olce, osa, portmaster, rkoberman, thj, vlad.shabanov, wireless
Priority: --- Keywords: tracking
Version: CURRENTFlags: bz: mfc-stable14+
bz: mfc-stable13+
Hardware: Any   
OS: Any   
See Also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271979
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271988
Bug Depends on:    
Bug Blocks: 273620    
Attachments:
Description Flags
dmesg including wifi panic
none
Second dmesg with panic at end, different hardware
none
iwl_mvm_tx_mpdu log messages
none
repro after applying D43879 on top of D43389, D43725, D43753 none

Description Bjoern A. Zeeb freebsd_committer freebsd_triage 2023-10-09 19:28:42 UTC
iwlwifi PR to collect all the reports of "invalid TXQ id".

If you get here please:

(0) read through the reports and see if your version was already covered.

(1) add PCI ID information and firmware version of your card (pciconf -lv and dmesg will tell you).

(2) please report any (wifi related) lines leading up to the warning message from your kernel message buffer (dmesg).
Comment 1 commit-hook freebsd_committer freebsd_triage 2023-10-09 19:50:59 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ad134f8ab768e3b1d2db203f1cc963bdf0eea857

commit ad134f8ab768e3b1d2db203f1cc963bdf0eea857
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2023-10-09 19:47:39 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2023-10-09 19:50:02 +0000

    iwlwifi: re-enable "Invalid TXQ id" logging

    Various reports recently hit the "Invalid TXQ id" in iwlwifi again.
    Unconditionally enable logging and add a note to report to a specific
    PR in the log message for now.
    Along with 018d93ece16b this will hopefully help us to understand what
    is going on.

    Sponsored by:   The FreeBSD Foundation
    PR:     274382

 sys/contrib/dev/iwlwifi/mvm/tx.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
Comment 2 Bjoern A. Zeeb freebsd_committer freebsd_triage 2023-10-19 20:09:37 UTC
There is one report in
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271979#c24
Comment 3 Cheng Cui freebsd_committer freebsd_triage 2023-11-09 19:34:57 UTC
Hit this on a first touch of enabling 'Wi-Fi 6 AX200' on my mini PC.

root@n1_iwl_vm:~ # service netif restart
Nov  9 10:21:34 n1_iwl_vm dhclient[1370]: My address (192.168.1.154) was deleted, dhclient exiting
Stopping dhclient.
Waiting for PIDS: 1448.
Stopping wpa_supplicant.
Waiting for PIDS: 1330.
Stopping Network: lo0 vtnet0 wlan0.
lo0: flags=8048<LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
vtnet0: flags=1008802<BROADCAST,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
        options=80028<VLAN_MTU,JUMBO_MTU,LINKSTATE>
        ether 58:9c:fc:0a:c9:c4
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
wlan0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=0
        ether e0:2e:0b:92:e5:82
        groups: wlan
        ssid "" channel 149 (5745 MHz 11a)
        regdomain FCC country US authmode OPEN privacy OFF txpower 23 bmiss 7
        mcastrate 6 mgmtrate 6 scanvalid 60 wme
        parent interface: iwlwifi0
        media: IEEE 802.11 Wireless Ethernet autoselect (autoselect)
        status: no carrier
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
Destroyed wlan(4) interfaces: wlan0.
Created wlan(4) interfaces: wlan0.
Starting wpa_supplicant.
Nov  9 10:21:35 n1_iwl_vm wpa_supplicant[1836]: ioctl[SIOCS80211, op=20, val=0, arg_len=7]: Invalid argument
Starting dhclient.
wlan0: no link ...Nov  9 10:21:35 n1_iwl_vm syslogd: last message repeated 1 times
Nov  9 10:21:35 n1_iwl_vm wpa_supplicant[1837]: ioctl[SIOCS80211, op=103, val=0, arg_len=128]: Operation now in progress
..iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe00b1258408 skb 0xfffff800aa6d6800 { len 30 } info 0xfffffe00745dcce8 sta 0xfffff80007afd880 (if you see this please report to PR 274382)
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe00b1258408 skb 0xfffff800aa6d6800 { len 30 } info 0xfffffe00745dcce8 sta 0xfffff80007afd880 (if you see this please report to PR 274382)
panic: lkpi_sta_auth_to_scan: lsta 0xfffff80007cbe000 state not NONE: 0, nstate 1 arg 1

cpuid = 6
time = 1699543297
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00b0eb5b70
vpanic() at vpanic+0x132/frame 0xfffffe00b0eb5ca0
panic() at panic+0x43/frame 0xfffffe00b0eb5d00
lkpi_sta_auth_to_scan() at lkpi_sta_auth_to_scan+0x2c8/frame 0xfffffe00b0eb5d80
lkpi_iv_newstate() at lkpi_iv_newstate+0x253/frame 0xfffffe00b0eb5df0
ieee80211_newstate_cb() at ieee80211_newstate_cb+0x1e7/frame 0xfffffe00b0eb5e40
taskqueue_run_locked() at taskqueue_run_locked+0xab/frame 0xfffffe00b0eb5ec0
taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame 0xfffffe00b0eb5ef0
fork_exit() at fork_exit+0x82/frame 0xfffffe00b0eb5f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00b0eb5f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 100167 ]
Stopped at      kdb_enter+0x32: movq    $0,0xe2aee3(%rip)
db>
Comment 4 Bjoern A. Zeeb freebsd_committer freebsd_triage 2023-11-09 19:44:38 UTC
(In reply to Cheng Cui from comment #3)

Which version of FreeBSD?
Given you get a panic on the state machine I assume it is main?

I would have expected something else being logged.  If you can reliably reproduce this, let me know and I'll send you a patch offline to see if we can at least try to catch the latter (panic cause).
Comment 5 Cheng Cui freebsd_committer freebsd_triage 2023-11-09 19:59:51 UTC
(In reply to Bjoern A. Zeeb from comment #4)
cc@n1_iwl_vm:~ % uname -a
FreeBSD n1_iwl_vm 15.0-CURRENT FreeBSD 15.0-CURRENT #0 main-f7d16a627: Tue Nov  7 17:31:17 EST 2023     cc@n1_iwl_vm:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
cc@n1_iwl_vm:~ %

It looks I am reporting two issues in this thread:
1. (if you see this please report to PR 274382)
2. panic: lkpi_sta_auth_to_scan: lsta 0xfffff80007cbe000 state not NONE: 0, nstate 1 arg 1

I can easily reproduce the #2 panic. Please send me the patch offline and I can try to dig deeper. thanks
Comment 6 Cheng Cui freebsd_committer freebsd_triage 2023-11-09 21:57:04 UTC
(In reply to Cheng Cui from comment #5)

I found the master bug (Bug 271979) is tracking this panic issue.
panic: lkpi_sta_auth_to_scan: lsta 0xfffff80007756800 state not NONE: 0, nstate 1 arg 1


I will update my traces there.
Comment 7 Bjoern A. Zeeb freebsd_committer freebsd_triage 2023-11-09 23:46:15 UTC
(In reply to Cheng Cui from comment #6)

See comment https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271979#c29 for a possible explanation.
Comment 8 Alexander Motin freebsd_committer freebsd_triage 2023-11-10 14:39:46 UTC
I've updated my laptop to yesterday's main and lost my WiFi that worked before.  dmesg output references this PR.  Here are dmesg and pciconf outputs: https://people.freebsd.org/~mav/mav.log .  Please let me know if I can provide or test something.  Thanks.
Comment 9 commit-hook freebsd_committer freebsd_triage 2023-11-29 16:39:42 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=3cd4af3825b564a6bf249b0f8d2e5dd1368ca49a

commit 3cd4af3825b564a6bf249b0f8d2e5dd1368ca49a
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2023-10-09 19:47:39 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2023-11-29 16:36:11 +0000

    iwlwifi: re-enable "Invalid TXQ id" logging

    Various reports recently hit the "Invalid TXQ id" in iwlwifi again.
    Unconditionally enable logging and add a note to report to a specific
    PR in the log message for now.
    Along with 018d93ece16b this will hopefully help us to understand what
    is going on.

    Sponsored by:   The FreeBSD Foundation
    PR:     274382

    (cherry picked from commit ad134f8ab768e3b1d2db203f1cc963bdf0eea857)

 sys/contrib/dev/iwlwifi/mvm/tx.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
Comment 10 commit-hook freebsd_committer freebsd_triage 2023-11-30 00:39:13 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=d27f3b8c7e6486ed0750286d7ec65e61935ce409

commit d27f3b8c7e6486ed0750286d7ec65e61935ce409
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2023-10-09 19:47:39 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2023-11-30 00:36:57 +0000

    iwlwifi: re-enable "Invalid TXQ id" logging

    Various reports recently hit the "Invalid TXQ id" in iwlwifi again.
    Unconditionally enable logging and add a note to report to a specific
    PR in the log message for now.
    Along with 018d93ece16b this will hopefully help us to understand what
    is going on.

    Sponsored by:   The FreeBSD Foundation
    PR:     274382

    (cherry picked from commit ad134f8ab768e3b1d2db203f1cc963bdf0eea857)

 sys/contrib/dev/iwlwifi/mvm/tx.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
Comment 11 Jeremy Beker 2023-12-12 20:07:33 UTC
Created attachment 247013 [details]
dmesg including wifi panic
Comment 12 Jeremy Beker 2023-12-12 20:08:26 UTC
Just the panic from the previous attachment. (sorry, didn't realize it would add as two comments)

wlan0: ieee80211_new_state_locked: pending SCAN -> AUTH transition lost
Invalid TXQ id
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe01d5b9b448 skb 0xfffff8000292d800 { len 30 } info 0xfffffe0118462ce8 sta 0xfffff800545e2880 (if you see this please report to PR 274382)
panic: lkpi_sta_auth_to_scan: lsta 0xfffff800547f4800 state not NONE: 0, nstate 1 arg 1

cpuid = 3
time = 1702411289
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe01d54d8b70
vpanic() at vpanic+0x132/frame 0xfffffe01d54d8ca0
panic() at panic+0x43/frame 0xfffffe01d54d8d00
lkpi_sta_auth_to_scan() at lkpi_sta_auth_to_scan+0x2c8/frame 0xfffffe01d54d8d80
lkpi_iv_newstate() at lkpi_iv_newstate+0x253/frame 0xfffffe01d54d8df0
ieee80211_newstate_cb() at ieee80211_newstate_cb+0x1e7/frame 0xfffffe01d54d8e40
taskqueue_run_locked() at taskqueue_run_locked+0xab/frame 0xfffffe01d54d8ec0
taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame 0xfffffe01d54d8ef0
fork_exit() at fork_exit+0x82/frame 0xfffffe01d54d8f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe01d54d8f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
Comment 13 Bjoern A. Zeeb freebsd_committer freebsd_triage 2023-12-12 22:56:32 UTC
(In reply to Jeremy Beker from comment #12)

The panic is a duplicate of 271979 I believe.
But this one is a good order of events to explain the problem of this PR it seems.

Thanks a lot for reporting it!  I'll get back here hopefully before the holidays.
Comment 14 Jeremy Beker 2023-12-14 18:12:55 UTC
Created attachment 247048 [details]
Second dmesg with panic at end, different hardware

Here is another panic from a different Dell laptop. Looks similar but I figured more data is better.

Thanks for all your work on this! Have a great holiday!
Comment 15 Hiroki Sato freebsd_committer freebsd_triage 2024-01-07 17:02:47 UTC
I got an "Invalid TXQ id" message on my machine running stable/14 as of Jan 7.  It caused no panic, but I could get no association after the driver was loaded. The output of uname+pciconf+dmesg is the following:

https://people.freebsd.org/~hrs/lenovo_x1_nano_gen1-iwlwifi-20240108.txt

The above log was captured with the driver loaded automatically by devd(8).  When I disabled loading the driver by devmatch_blocklist="if_iwlwifi" in rc.conf and tried to load it manually after entering the multi-user mode, it worked.  "kldload", "ifconfig wlan0 create", and "sh /etc/rc.d/wpa_supplicant start wlan0."  The log was the following:

https://people.freebsd.org/~hrs/lenovo_x1_nano_gen1-iwlwifi-20240108-kldload.txt

The "no buffer space" error reported by dhclient was not persistent in this case.  Eventually, DHCP packets are successfully exchanged.
Comment 16 rkoberman 2024-01-07 21:06:21 UTC
Just to update the status of my recent testing (at the request of bz@), I have determined that the error seems tied to some interaction between in the lkpi. If loading either i915drm or if_iwlwifi is delayed in loading automatically during boot and loaded from the command line, the problem has not occurred in multiple system boots.
x". 

If I manually load i915kms, all I need to do is kldload it and "startx". If I load if_iwlwifi, I need to "service netif start wlan0" followed by "service local_unbound start". All other network pieces (dhclient and wpa_supplicant) are started by netif.

Normally, i915kms is loaded from the kld_list in /etc/rc.conf. It is "cpuctl acpi_ibm acpi_video i915kms filemon". I know that the modules are loaded in their order in the list, but I doubt that it is in any way significant.
Comment 17 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-03 21:38:16 UTC
Tracking myself:

iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe00b09dc448 skb 0xfffff80001b59000 { len 30 } info 0xfffffe000edf5ce8 sta 0xfffff8002320f080 (if you see this please report to PR 274382)
[1357.633531] Invalid TXQ id
iwlwifi0: lkpi_sta_a_to_a:1531: lvif 0xfffffe00b0d7b000 vap 0xfffffe00b0d7b010 iv_bss 0xfffffe00b1e20000 lvif_bss 0xfffff8002320f000 lvif_bss->ni 0xfffffe00b1176000 synched 0
iwlwifi0: lkpi_iv_newstate: error 95 during state transition 2 (AUTH) -> 2 (AUTH)

        
Fatal trap 9: general protection fault while in kernel mode
cpuid = 1; apic id = 01
instruction pointer     = 0x20:0xffffffff80cf7991
stack pointer           = 0x28:0xfffffe00909efd00
frame pointer           = 0x28:0xfffffe00909efd10
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (iwlwifi0 net80211 t)
rdi: fffffe00b1176000 rsi: fffff80001b5caf0 rdx: 0000000000000006
rcx: fffffe008ca59c80  r8: fffff80164d36b78  r9: 000000002320f000
rax: deadc0dedeadc0de rbx: fffffe00b1176000 rbp: fffffe00909efd10
r10: 0000000000000000 r11: 0000000000010000 r12: fffff8002320f000
r13: fffffe00b0d7bdc0 r14: deadc0dedeadc0de r15: fffffe00b09dc380
trap number             = 9
panic: general protection fault
cpuid = 1
time = 1706993398
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00909efa40
vpanic() at vpanic+0x135/frame 0xfffffe00909efb70
panic() at panic+0x43/frame 0xfffffe00909efbd0
trap_fatal() at trap_fatal+0x40f/frame 0xfffffe00909efc30
calltrap() at calltrap+0x8/frame 0xfffffe00909efc30
--- trap 0x9, rip = 0xffffffff80cf7991, rsp = 0xfffffe00909efd00, rbp = 0xfffffe00909efd10 ---
node_free() at node_free+0x11/frame 0xfffffe00909efd10
lkpi_sta_auth_to_scan() at lkpi_sta_auth_to_scan+0x27f/frame 0xfffffe00909efd80
lkpi_iv_newstate() at lkpi_iv_newstate+0x253/frame 0xfffffe00909efdf0
ieee80211_newstate_cb() at ieee80211_newstate_cb+0x2a4/frame 0xfffffe00909efe40
taskqueue_run_locked() at taskqueue_run_locked+0xab/frame 0xfffffe00909efec0
taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame 0xfffffe00909efef0
fork_exit() at fork_exit+0x82/frame 0xfffffe00909eff30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00909eff30
--- trap 0xafafafaf, rip = 0xafafafafafafafaf, rsp = 0xafafafafafafafaf, rbp = 0xafafafafafafafaf ---
KDB: enter: panic
[ thread pid 0 tid 100134 ]
Stopped at      kdb_enter+0x33: movq    $0,0xe3a3e2(%rip)


..
        iv_nstate 0x1 SCAN iv_nstate_b 4 iv_nstate_n 0
         [0] iv_nstates 0x2 AUTH _task 0xfffffe00b0d7b360 _args 192
         [1] iv_nstates 0x2 AUTH _task 0xfffffe00b0d7b380 _args 192
         [2] iv_nstates 0x2 AUTH _task 0xfffffe00b0d7b3a0 _args 192
         [3] iv_nstates 0x1 SCAN _task 0xfffffe00b0d7b3c0 _args 1
         [4] iv_nstates 0x2 AUTH _task 0xfffffe00b0d7b3e0 _args 192
         [5] iv_nstates 0 INIT _task 0xfffffe00b0d7b400 _args 0
         [6] iv_nstates 0 INIT _task 0xfffffe00b0d7b420 _args 3
         [7] iv_nstates 0x1 SCAN _task 0xfffffe00b0d7b440 _args 0
..

Note the deadc0dedeadc0de in rax;  from the log it looks like we did a bss swap;  trying to send on auth still.  Going A-A multiple times.  The txq is no longer valid (so likely was the sta.  We try to free our local cache and it is gone already in auth_to_scan.  Need to find that missing ni reference (possibly on the m->m_pkthdr.rcvif or our pkthdr cache?
Comment 18 rkoberman 2024-02-03 22:04:22 UTC
Just to update to my current status, since my update to head on Jan 13 I have not had a single incident of "Invalid TXQ id". Possibly related, drm-61-kmod no longer fails to load firmware on my Alder Lake laptop and iwlwifi had been running almost flawlessly, except for an odd outage of a few seconds on two occasions when the interface lost its association with the AP.
Comment 19 Ed Maste freebsd_committer freebsd_triage 2024-02-09 18:51:08 UTC
this reproduces for me in my WIP branch which is main from Jan 31 + wulf@'s drm6.6 branch + linuxkpi updates

--
iwlwifi0: WRT: Invalid buffer destination
iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
iwlwifi0: WFPM_AUTH_KEY_0: 0x90
iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
<6>wlan0: Ethernet address: f4:26:79:6a:7d:7c
<6>lo0: link state changed to UP
<118>Feb  9 13:14:06 framework wpa_supplicant[1795]: ioctl[SIOCS80211, op=20, val=0, arg_len=7]: Invalid argument
<118>Feb  9 13:14:06 framework syslogd: last message repeated 1 times
<118>Feb  9 13:14:06 framework wpa_supplicant[1796]: ioctl[SIOCS80211, op=103, val=0, arg_len=128]: Operation now in progress
<4>Invalid TXQ id
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe0142b96448 skb 0xfffff8000b4e3000 { len 30 } info 0xfffffe00e4e5dce0 sta 0xfffff80029024880 (if you see this please report to PR 274382)
panic: lkpi_sta_auth_to_scan: lsta 0xfffff8000b6c0800 state not NONE: 0, nstate 1 arg 1
cpuid = 7
time = 1707502448
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0141a0eb80
vpanic() at vpanic+0x135/frame 0xfffffe0141a0ecb0
panic() at panic+0x43/frame 0xfffffe0141a0ed10
lkpi_sta_auth_to_scan() at lkpi_sta_auth_to_scan+0x259/frame 0xfffffe0141a0ed80
lkpi_iv_newstate() at lkpi_iv_newstate+0x256/frame 0xfffffe0141a0edf0
ieee80211_newstate_cb() at ieee80211_newstate_cb+0x235/frame 0xfffffe0141a0ee40
taskqueue_run_locked() at taskqueue_run_locked+0x1c7/frame 0xfffffe0141a0eec0
taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame 0xfffffe0141a0eef0
fork_exit() at fork_exit+0x87/frame 0xfffffe0141a0ef30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0141a0ef30
--- trap 0xdf61d4b9, rip = 0x390b869d14fc8eac, rsp = 0x48765a79c3d4e72f, rbp = 0xd2dbe161c7e7a77e ---
KDB: enter: panic
Comment 20 Cheng Cui freebsd_committer freebsd_triage 2024-02-09 19:50:44 UTC
(In reply to Ed Maste from comment #19)

Hi Ed,

You can try my workaround mentioned in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276562#c9, to bypass this "service netif restart" panic. Hope it will be helpful.
Comment 21 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-09 20:44:55 UTC
(In reply to Ed Maste from comment #19)

Can you try the patches pending in Phabricator?  D43389, D43725 and D43753?  I am waiting on review...
Comment 22 Cheng Cui freebsd_committer freebsd_triage 2024-02-09 20:58:00 UTC
(In reply to Bjoern A. Zeeb from comment #21)
I am about to send the same recommendation of applying the combined D43389, D43725 and D43753.

I have been testing it for a few hours now, so far so good.
Comment 23 Ed Maste freebsd_committer freebsd_triage 2024-02-09 21:13:37 UTC
I'll grab the patches from Phab, or grab from a public git repo, and try later today.
Comment 24 Ed Maste freebsd_committer freebsd_triage 2024-02-10 04:27:07 UTC
With the three patches applied iwlwifi worked fine for several hours.

dmesg snippet:

iwlwifi0: <iwlwifi> mem 0x7a200000-0x7a203fff at device 0.0 on pci2
iwlwifi0: Detected crf-id 0x400410, cnv-id 0x400410 wfpm id 0x80000000
iwlwifi0: PCI dev 2725/0024, rev=0x420, rfid=0x10d000
iwlwifi0: successfully loaded firmware image 'iwlwifi-ty-a0-gf-a0-83.ucode'
iwlwifi0: api flags index 2 larger than supported by driver
iwlwifi0: TLV_FW_FSEQ_VERSION: FSEQ Version: 0.0.2.41
iwlwifi0: loaded firmware version 83.e8f84e98.0 ty-a0-gf-a0-83.ucode op_mode iwlmvm
iwlwifi0: Detected Intel(R) Wi-Fi 6 AX210 160MHz, REV=0x420
iwlwifi0: WRT: Invalid buffer destination
iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
iwlwifi0: WFPM_AUTH_KEY_0: 0x90
iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
iwlwifi0: successfully loaded firmware image 'iwlwifi-ty-a0-gf-a0.pnvm'
iwlwifi0: loaded PNVM version 181407b3
iwlwifi0: Detected RF GF, rfid=0x10d000
iwlwifi0: base HW address: f4:26:79:6a:7d:7c
iwlwifi0: WRT: Invalid buffer destination
iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
iwlwifi0: WFPM_AUTH_KEY_0: 0x90
iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
wlan0: Ethernet address: xx:xx:xx:xx:xx:xx
Invalid TXQ id
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe0142af3448 skb 0xfffff80001e32000 { len 30 } info 0xfffffe00e4e5dce0 sta 0xfffff80001d52080 (if you see this please report to PR 274382)
Comment 25 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-10 14:39:07 UTC
(In reply to Ed Maste from comment #24)

Did you have an accompanying log from wpa_supplicant of that time?
Was there a re-assoc event or something?
Comment 26 Ed Maste freebsd_committer freebsd_triage 2024-02-10 18:43:37 UTC
(In reply to Bjoern A. Zeeb from comment #25)
For reference this is on my Framework laptop (i7-1165G7 Tiger Lake CPU, 8086:2725 "Wi-Fi 6 AX210/AX211/AX411 160MHz"). It occurs on on startup, with wpa_supplicant started from rc.

I tried `service wpa_supplicant stop` and then running wpa_supplicant from the cmdline without -B. I was not able to reproduce the message again in this context. The only notable output from this is wpa_supplicant reporting:
ioctl[SIOCS80211, op=20, val=0, arg_len=7]: Invalid argument
ioctl[SIOCS80211, op=20, val=0, arg_len=7]: Invalid argument
Comment 27 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-10 18:48:29 UTC
(In reply to Ed Maste from comment #26)

Thanks for the update.  I'll try to add a lot more logging to this one once other things are in.  It seems we are trying to send an ASSOC REQ with a sta which no longer is valid.  This is likely just another side effect of net80211 assumptions from 10+ years ago along with deferred contexts.


For the latter, wpa_supplicant has been logging these for years on all WNICs.  We should probably just clear the noise at this point.
Comment 28 rkoberman 2024-02-10 21:13:45 UTC
(In reply to Ed Maste from comment #26)
This was a problem I was seeing with the lkpi prior to the update from drm-515-kmod to drm-61-kmod. I see nothing likely to be involved between the two. I was seeing this problem while running n266789-06c083f79dd7 and drm-515-kmod from Dec. 22 and n267556-69748e62e82a drm-61-kmod from Jan. 15. Aside from fixing the serious limitations of running my Alder Lake graphics on 515, it also fixed the invalid TXQ and crashing issues I was seeing prior to the update to 61.

I had previously noted that the stack backtraces from the many panics tied to the invalid TXQ always included drm items near the top of the stack.

Have you tried drm-61-kmod? It looks like using drm-66 (I assume from git as it is not in ports yet) has likely led to the return to the issues I had with drm-515.

I have looked at commits to iwlwifi and the lkpi in the relevant time period and I see nothing that looks likely in either place. I am now running a system from Feb. 6 with no issues with either drm-61 or iwlwifi.
Comment 29 Cheng Cui freebsd_committer freebsd_triage 2024-02-14 14:57:23 UTC
The patches from D43389, D43725, D43753 in combination seem to work and
prevented the crash "panic: lkpi_sta_auth_to_assoc" from "service netif
restart" or simply booting the kernel. The patches does not prevent "please
report to PR 274382" from showing up.

However, I observed this strange difference among my three freebsd VMs.

vm1: Feb 12 main kernel, keeps hitting the crash without the patches
vm2: Feb 12 main kernel, and never hit the crash
vm3: Feb 14 main kernel, and never hit the crash, on a second physical host

The only different between vm1 and vm2 (they are on the same physical host) is
that vm1 was created last November, 2023 with World rebuilt that time; but vm2
was created last week using the Feb 8 freebsd-current ISO image without
rebuilding the World. The vm3 was also created in Feb without rebuilding the
World.

But I can't explain why the old World but new kernel will cause the problem.
And I suspect that is the drm-515-kmod (as rkoberman@gmail.com mentioned) the
culprit?

I remember Ed told me before that the ports tree has drm-kmod, but could
somebody help explain my suspicion, before I rebuild the drm-kmod in vm1 to
verify myself?
Comment 30 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-14 15:29:08 UTC
(In reply to Cheng Cui from comment #29)

> [..] The patches does not prevent "please report to PR 274382" from showing up.

Can you reliably reproduce it?  There's some more debugging to be added to see in which state we are with the sta.

> However, I observed this strange difference among my three freebsd VMs.
> 
> vm1: Feb 12 main kernel, keeps hitting the crash without the patches
> vm2: Feb 12 main kernel, and never hit the crash
> vm3: Feb 14 main kernel, and never hit the crash, on a second physical host

I think that basically explains the observations from a lot of people incl. the problem which made it hard to track some of this down.  Some people do see the problems while others don't.  Some people see if depending on timing and (radio) environment and others don't.  This is why some people have been reporting "very happy" recently and others still saw crashes.  It's not a deterministic setting and depends on many different things from user space to radio.  This is why the reproducer based on your observations last year was so important.
Comment 31 Cheng Cui freebsd_committer freebsd_triage 2024-02-14 15:38:37 UTC
Created attachment 248461 [details]
iwl_mvm_tx_mpdu log messages
Comment 32 Cheng Cui freebsd_committer freebsd_triage 2024-02-14 15:39:41 UTC
(In reply to Bjoern A. Zeeb from comment #30)

It is reliably reproduced. I have attached the "iwl_mvm_tx_mpdu log messages" from console after running the below script. Please let me know if there is any more log knob needed to setup.

root@n1iwlvm:~ # cat test.tcsh 
#!/bin/tcsh -f

set cnt = $argv[1]
set i = 0
while ($i < $cnt)
        @ i = $i + 1
        echo "($i)"
        service netif restart && sleep 10
        ifconfig wlan0
        echo "run scp now"
        scp /tmp/data.txt root@n2fbsd:/tmp/
end
root@n1iwlvm:~ #
Comment 33 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-14 15:53:30 UTC
(In reply to Cheng Cui from comment #32)

Can you apply  https://reviews.freebsd.org/D43879

it's not been tested or even compiled so you may have to make it work but you'll get the idea.  It should dump extra sta/lsta/ni and vap/lvif state for us so we can see if this is just another result of the iv_bss being switched under us while we still have packets queued for tx.
Comment 34 Cheng Cui freebsd_committer freebsd_triage 2024-02-14 16:50:59 UTC
Created attachment 248462 [details]
repro after applying D43879 on top of D43389, D43725, D43753
Comment 35 Cheng Cui freebsd_committer freebsd_triage 2024-02-14 16:54:04 UTC
(In reply to Bjoern A. Zeeb from comment #33)

Please find the extra debug info in attached "repro after applying D43879 on top of D43389, D43725, D43753".

Meanwhile, I have an idea to create a new VM but based on old (last year) fbsd-15-current ISO image to see if I can reproduce the story.
Comment 36 Cheng Cui freebsd_committer freebsd_triage 2024-02-15 01:02:29 UTC
(In reply to Cheng Cui from comment #35)

Fortunately, I have kept my World build log and did the same way this time. And
finally I figured out how to reproduce the "panic: lkpi_sta_auth_to_scan" crash.

I created vm4 from the FreeBSD-13.2-RELEASE iso, and rebuilt World/kernel from the Feb 12 src main. Now, the crash happens every time, and I have 5/5 crashes/reboots.

These vm3 and vm4 are on the same physical host and share the same physical WiFi interface.

vm3: installed from FreeBSD-15.0-CURRENT (20240201), has not hit a crash
vm4: installed from FreeBSD-13.2-RELEASE, rebuilt World/kernel from Feb 12 src main, hit a crash every time on booting up wlan0
Comment 37 Ed Maste freebsd_committer freebsd_triage 2024-02-15 01:52:06 UTC
On boot I get at startup (same Framework laptop case as comment #26):

Invalid TXQ id
iwl_mvm_tx_mpdu:1208: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe0142c3a448 skb 0xfffff800029a6000 { len 30 } info 0xfffffe00e4e5dcc0 sta 0xfffff80001d14880 (if you see this please report to PR 274382)
iwlwifi0: iwl_mvm_tx_mpdu:1210: lvif 0xfffffe01424e4000 vap 0xfffffe01424e4010 iv_bss 0xfffffe0143a83000 lvif_bss 0xfffff80001d14800 lvif_bss->ni 0xfffffe0143a7b000 synched 1
wlan0: link state changed to UP
Comment 38 Cheng Cui freebsd_committer freebsd_triage 2024-02-15 16:07:56 UTC
After applying D43879 and D43909 with the latest Feb 15 main kernel, still
seeing the "Invalid TXQ" print from the vm4 I managed to reproduce the story.

root@n2fbsdtmp:~ # uname -a
FreeBSD n2fbsdtmp 15.0-CURRENT FreeBSD 15.0-CURRENT #2 main-n268309-2c5ff9118c1e-dirty: Thu Feb 15 11:00:46 EST 2024     root@n2fbsdtmp:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
root@n2fbsdtmp:~ # 

=== booting print ===

Created wlan(4) interfaces: wlan0.
lo0: link state changed to UP
vtnet0: link state changed to UP
Starting wpa_supplicant.
Starting dhclient.
wlan0: no link ....Invalid TXQ idiwl_mvm_tx_mpdu:1208: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe006dbf9448 skb 0xfffff8000584e800 { len 30 } info 0xfffffe0063b4cce8 sta 0xfffff800231e5880 (if you see this please report to PR 274382)
iwlwifi0: iwl_mvm_tx_mpdu:1210: lvif 0xfffffe00ab4a5000 vap 0xfffffe00ab4a5010 iv_bss 0xfffffe00ab79f000 lvif_bss 0xfffff800231e5800 lvif_bss->ni 0xfffffe00ab574000 synched 1
iwl_mvm_tx_mpdu:1210 lsta 0xfffff800231e5800 ni 0xfffffe00ab574000 sta 0xfffff800231e5880
	txq_task txq len 0 mtx
	kc 0 state 1 added_to_drv 1 in_mgd 1
0xfffffe00ab574000: mac 3e:60:f9:06:d1:9b refcnt 1
	authmode 1 flags 0x20004
	associd 0x0 txpower 100 vlan 0
	txseq 0 rxseq 0 fragno 0 rxfragstamp 0
	rssi 58 noise -96 intval 100 capinfo 0x531
	bssid 3e:60:f9:06:d1:9b essid "" channel 2422:0x480
	inact 2 inact_reload 2 txrate 72
	htcap 19ef htparam 17 htctlchan 3 ht2ndchan 1
	htopmode 2 htstbc 0 htchw 0
	vhtcap 0 freq1 0 freq2 0 vhtbasicmcs 0
.......... giving up
/etc/rc.d/dhclient: WARNING: failed to start dhclient
Comment 39 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-15 19:11:16 UTC
(In reply to Cheng Cui from comment #38)

sta is added to FW and in state IEEE80211_STA_NONE  sta->lsta->ni is 0xfffffe00ab574000

IEEE80211_STA_NONE can be from scan-to-auth, assoc-down, run-init.
Given the fc type/subtype this should be an AUTH frame to it should be scan_to_auth.

However:
vap->iv_bss 0xfffffe00ab79f000 != lvif_bss->ni 0xfffffe00ab574000 but synched is 1.


Can you apply the following change to give us an idea of how the updates to iv_bss go (untested):

--- sys/compat/linuxkpi/common/src/linux_80211.c
+++ sys/compat/linuxkpi/common/src/linux_80211.c
@@ -2531,6 +2531,11 @@ lkpi_iv_update_bss(struct ieee80211vap *vap, struct ieee80211_node *ni)
        lvif = VAP_TO_LVIF(vap);
 
        LKPI_80211_LVIF_LOCK(lvif);
+       ic_printf(vap->iv_ic, "%s:%d: lvif %p vap %p iv_bss %p lvif_bss %p "
+          "lvif_bss->ni %p synched %d, ni %p\n", __func__, __LINE__,
+          lvif, vap, vap->iv_bss, lvif->lvif_bss,
+          (lvif->lvif_bss != NULL) ? lvif->lvif_bss->ni : NULL,
+          lvif->lvif_bss_synched, ni);
        lvif->lvif_bss_synched = false;
        LKPI_80211_LVIF_UNLOCK(lvif);
 

Almost also want to dump vap->iv_state here and in lkpi_sta_dump(), if not the entire history in the latter (taken from ieee80211_ddb.c):

        int i;
        printf("iv_nstate %#x %s iv_nstate_b %d iv_nstate_n %d\n",
            vap->iv_nstate, ieee80211_state_name[vap->iv_nstate], /* historic */
            vap->iv_nstate_b, vap->iv_nstate_n);
        for (i = 0; i < NET80211_IV_NSTATE_NUM; i++) {
                printf("\t [%d] iv_nstates %#x %s _task %p _args %d\n", i,
                    vap->iv_nstates[i], ieee80211_state_name[vap->iv_nstates[i]],
                    &vap->iv_nstate_task[i], vap->iv_nstate_args[i]);
        }
Comment 40 Cheng Cui freebsd_committer freebsd_triage 2024-02-15 20:18:40 UTC
(In reply to Bjoern A. Zeeb from comment #39)

Thanks. With these extra prints. The dmesg after booting shows:
[1.427073] pci0: driver added
[1.632542] wlan0: bpf attached
[1.632744] iwlwifi0: lkpi_iv_update_bss:2535: lvif 0xfffffe00ab4ae000 vap 0xfffffe00ab4ae010 iv_bss 0 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab4b2000
[1.633722] wlan0: bpf attached
[1.633851] wlan0: Ethernet address: e0:2e:0b:8f:77:08
[1.652204] lo0: link state changed to UP
[1.676088] vtnet0: link state changed to UP
[3.027620] iwlwifi0: lkpi_iv_update_bss:2535: lvif 0xfffffe00ab4ae000 vap 0xfffffe00ab4ae010 iv_bss 0xfffffe00ab4b2000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab791000
[3.029964] iwlwifi0: lkpi_iv_update_bss:2535: lvif 0xfffffe00ab4ae000 vap 0xfffffe00ab4ae010 iv_bss 0xfffffe00ab791000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab79b000
[3.032791] Invalid TXQ id
[3.032883] iwl_mvm_tx_mpdu:1208: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe00ab420448 skb 0xfffff8000357e800 { len 30 } info 0xfffffe0063b4cce8 sta 0xfffff800039fe880 (if you see this please report to PR 274382)
[3.034103] iwlwifi0: iwl_mvm_tx_mpdu:1210: lvif 0xfffffe00ab4ae000 vap 0xfffffe00ab4ae010 iv_bss 0xfffffe00ab79b000 lvif_bss 0xfffff800039fe800 lvif_bss->ni 0xfffffe00ab791000 synched 1
[3.035176] iv_nstate 0x2 AUTH iv_nstate_b 3 iv_nstate_n 0
[3.035470] 	 [0] iv_nstates 0x1 SCAN _task 0xfffffe00ab4ae360 _args 0
[3.035834] 	 [1] iv_nstates 0x2 AUTH _task 0xfffffe00ab4ae380 _args 192
[3.036207] 	 [2] iv_nstates 0x2 AUTH _task 0xfffffe00ab4ae3a0 _args 192
[3.036581] 	 [3] iv_nstates 0 INIT _task 0xfffffe00ab4ae3c0 _args 0
[3.036930] 	 [4] iv_nstates 0 INIT _task 0xfffffe00ab4ae3e0 _args 0
[3.037281] 	 [5] iv_nstates 0 INIT _task 0xfffffe00ab4ae400 _args 0
[3.037631] 	 [6] iv_nstates 0 INIT _task 0xfffffe00ab4ae420 _args 0
[3.037983] 	 [7] iv_nstates 0 INIT _task 0xfffffe00ab4ae440 _args 0
[3.038332] iwl_mvm_tx_mpdu:1210 lsta 0xfffff800039fe800 ni 0xfffffe00ab791000 sta 0xfffff800039fe880
[3.038880] 	txq_task txq len 0 mtx
[3.039032] 	kc 0 state 1 added_to_drv 1 in_mgd 1
[3.039268] 0xfffffe00ab791000: mac 82:d2:94:c5:6f:ea refcnt 1
[3.039582] 	authmode 1 flags 0x20004
[3.039746] 	associd 0x0 txpower 100 vlan 0
[3.039945] 	txseq 0 rxseq 0 fragno 0 rxfragstamp 0
[3.040193] 	rssi 32 noise -96 intval 200 capinfo 0x1431
[3.040471] 	bssid 82:d2:94:c5:6f:ea essid "" channel 2437:0x480
[3.040797] 	inact 2 inact_reload 2 txrate 72
[3.041009] 	htcap 9ad htparam 3 htctlchan 6 ht2ndchan 0
[3.041286] 	htopmode 0 htstbc 0 htchw 0
[3.041468] 	vhtcap 338959b2 freq1 0 freq2 0 vhtbasicmcs fffc
[13.132873] iwlwifi0: lkpi_iv_update_bss:2535: lvif 0xfffffe00ab4ae000 vap 0xfffffe00ab4ae010 iv_bss 0xfffffe00ab79b000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab791000
[14.340433] iwlwifi0: lkpi_iv_update_bss:2535: lvif 0xfffffe00ab4ae000 vap 0xfffffe00ab4ae010 iv_bss 0xfffffe00ab791000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab7db000
[14.397601] wlan0: link state changed to UP
root@n2fbsdtmp:~ #
Comment 41 Ed Maste freebsd_committer freebsd_triage 2024-02-16 02:21:37 UTC
dmesg snippit from the Framework laptop which produces one 'Invalid TXQ id' at boot, with the additional debugging:

iwlwifi0: lkpi_iv_update_bss:2574: lvif 0xfffffe01424ae000 vap 0xfffffe01424ae010 iv_bss 0xfffffe0143343000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe0143b40000
iv_nstate 0x1 SCAN iv_nstate_b 1 iv_nstate_n 0
	 [0] iv_nstates 0x1 SCAN _task 0xfffffe01424ae360 _args 0
	 [1] iv_nstates 0 INIT _task 0xfffffe01424ae380 _args 0
	 [2] iv_nstates 0 INIT _task 0xfffffe01424ae3a0 _args 0
	 [3] iv_nstates 0 INIT _task 0xfffffe01424ae3c0 _args 0
	 [4] iv_nstates 0 INIT _task 0xfffffe01424ae3e0 _args 0
	 [5] iv_nstates 0 INIT _task 0xfffffe01424ae400 _args 0
	 [6] iv_nstates 0 INIT _task 0xfffffe01424ae420 _args 0
	 [7] iv_nstates 0 INIT _task 0xfffffe01424ae440 _args 0
iwlwifi0: lkpi_iv_update_bss:2574: lvif 0xfffffe01424ae000 vap 0xfffffe01424ae010 iv_bss 0xfffffe0143b40000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe0143b50000
iv_nstate 0x2 AUTH iv_nstate_b 2 iv_nstate_n 0
	 [0] iv_nstates 0x1 SCAN _task 0xfffffe01424ae360 _args 0
	 [1] iv_nstates 0x2 AUTH _task 0xfffffe01424ae380 _args 192
	 [2] iv_nstates 0 INIT _task 0xfffffe01424ae3a0 _args 0
	 [3] iv_nstates 0 INIT _task 0xfffffe01424ae3c0 _args 0
	 [4] iv_nstates 0 INIT _task 0xfffffe01424ae3e0 _args 0
	 [5] iv_nstates 0 INIT _task 0xfffffe01424ae400 _args 0
	 [6] iv_nstates 0 INIT _task 0xfffffe01424ae420 _args 0
	 [7] iv_nstates 0 INIT _task 0xfffffe01424ae440 _args 0
ieee80211_get_he_iftype_cap_vif:2703: XXX LKPI80211 TODO
ieee80211_sta_recalc_aggregates:2189: XXX LKPI80211 TODO
Invalid TXQ id
iwl_mvm_tx_mpdu:1208: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe0142c31448 skb 0xfffff8000250b000 { len 30 } info 0xfffffe00e4e5dcc0 sta 0xfffff80001d6a880 (if you see this please report to PR 274382)
iwlwifi0: iwl_mvm_tx_mpdu:1210: lvif 0xfffffe01424ae000 vap 0xfffffe01424ae010 iv_bss 0xfffffe0143b50000 lvif_bss 0xfffff80001d6a800 lvif_bss->ni 0xfffffe0143b40000 synched 1
iv_nstate 0x2 AUTH iv_nstate_b 3 iv_nstate_n 0
	 [0] iv_nstates 0x1 SCAN _task 0xfffffe01424ae360 _args 0
	 [1] iv_nstates 0x2 AUTH _task 0xfffffe01424ae380 _args 192
	 [2] iv_nstates 0x2 AUTH _task 0xfffffe01424ae3a0 _args 192
	 [3] iv_nstates 0 INIT _task 0xfffffe01424ae3c0 _args 0
	 [4] iv_nstates 0 INIT _task 0xfffffe01424ae3e0 _args 0
	 [5] iv_nstates 0 INIT _task 0xfffffe01424ae400 _args 0
	 [6] iv_nstates 0 INIT _task 0xfffffe01424ae420 _args 0
	 [7] iv_nstates 0 INIT _task 0xfffffe01424ae440 _args 0
XXX-TODO linuxkpi_ieee80211_iterate_keys:4728: UNIMPLEMENTED
iwlwifi0: lkpi_iv_update_bss:2574: lvif 0xfffffe01424ae000 vap 0xfffffe01424ae010 iv_bss 0xfffffe0143b50000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe0143b58000
iv_nstate 0 INIT iv_nstate_b 6 iv_nstate_n 0
	 [0] iv_nstates 0x1 SCAN _task 0xfffffe01424ae360 _args 0
	 [1] iv_nstates 0x2 AUTH _task 0xfffffe01424ae380 _args 192
	 [2] iv_nstates 0x2 AUTH _task 0xfffffe01424ae3a0 _args 192
	 [3] iv_nstates 0x1 SCAN _task 0xfffffe01424ae3c0 _args 1
	 [4] iv_nstates 0x1 SCAN _task 0xfffffe01424ae3e0 _args 1
	 [5] iv_nstates 0 INIT _task 0xfffffe01424ae400 _args 3
	 [6] iv_nstates 0 INIT _task 0xfffffe01424ae420 _args 0
	 [7] iv_nstates 0 INIT _task 0xfffffe01424ae440 _args 0
iwlwifi0: lkpi_iv_update_bss:2574: lvif 0xfffffe01424ae000 vap 0xfffffe01424ae010 iv_bss 0xfffffe0143b58000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe0143b40000
iv_nstate 0x1 SCAN iv_nstate_b 7 iv_nstate_n 0
	 [0] iv_nstates 0x1 SCAN _task 0xfffffe01424ae360 _args 0
	 [1] iv_nstates 0x2 AUTH _task 0xfffffe01424ae380 _args 192
	 [2] iv_nstates 0x2 AUTH _task 0xfffffe01424ae3a0 _args 192
	 [3] iv_nstates 0x1 SCAN _task 0xfffffe01424ae3c0 _args 1
	 [4] iv_nstates 0x1 SCAN _task 0xfffffe01424ae3e0 _args 1
	 [5] iv_nstates 0 INIT _task 0xfffffe01424ae400 _args 3
	 [6] iv_nstates 0x1 SCAN _task 0xfffffe01424ae420 _args 0
	 [7] iv_nstates 0 INIT _task 0xfffffe01424ae440 _args 0
Comment 42 commit-hook freebsd_committer freebsd_triage 2024-02-16 15:51:26 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=fa4e4257943650c0b5f58c01bb0bdfadea61dfb2

commit fa4e4257943650c0b5f58c01bb0bdfadea61dfb2
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2024-02-14 21:56:48 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-02-16 15:50:11 +0000

    LinuxKPI: 802.11: lsta txq locking cleanup

    Rename the LSTA lock to LSTA_TXQ lock as that is really what it is and
    put down the full set of macros.  Replace the init and destroy with the
    macro invocation rather than direct code.

    Put locking around the txq_ready unset and check. Move the taskq_enqueue
    call under lock to be sure we do not call it anymore after txq_ready
    got unset.

    Leave a comment related to the node reference which is passed into the
    TX path on the recvif mbuf pointer.

    Fixes:          0936c648ad0ee
    PR:             274382 (possibly)
    MFC after:      1 day
    Reviewed by:    emaste
    Differential Revision: https://reviews.freebsd.org/D43909

 sys/compat/linuxkpi/common/src/linux_80211.c | 31 ++++++++++++++++++----------
 sys/compat/linuxkpi/common/src/linux_80211.h | 14 +++++++++++--
 2 files changed, 32 insertions(+), 13 deletions(-)
Comment 43 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-16 21:24:00 UTC
Anyone who's seen this, can you please test:
https://reviews.freebsd.org/D43944
Comment 44 commit-hook freebsd_committer freebsd_triage 2024-02-18 21:11:59 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=035424535b09c57971df2064ad7678e90bd8cb3a

commit 035424535b09c57971df2064ad7678e90bd8cb3a
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2024-02-14 21:56:48 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-02-18 18:31:18 +0000

    LinuxKPI: 802.11: lsta txq locking cleanup

    Rename the LSTA lock to LSTA_TXQ lock as that is really what it is and
    put down the full set of macros.  Replace the init and destroy with the
    macro invocation rather than direct code.

    Put locking around the txq_ready unset and check. Move the taskq_enqueue
    call under lock to be sure we do not call it anymore after txq_ready
    got unset.

    Leave a comment related to the node reference which is passed into the
    TX path on the recvif mbuf pointer.

    Fixes:          0936c648ad0ee
    PR:             274382 (possibly)
    Reviewed by:    emaste
    Differential Revision: https://reviews.freebsd.org/D43909

    (cherry picked from commit fa4e4257943650c0b5f58c01bb0bdfadea61dfb2)

 sys/compat/linuxkpi/common/src/linux_80211.c | 31 ++++++++++++++++++----------
 sys/compat/linuxkpi/common/src/linux_80211.h | 14 +++++++++++--
 2 files changed, 32 insertions(+), 13 deletions(-)
Comment 45 commit-hook freebsd_committer freebsd_triage 2024-02-19 08:09:08 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=763b10806cd4ebdcbd2b6753d4f50ec088dc57fb

commit 763b10806cd4ebdcbd2b6753d4f50ec088dc57fb
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2024-02-14 21:56:48 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-02-19 08:02:02 +0000

    LinuxKPI: 802.11: lsta txq locking cleanup

    Rename the LSTA lock to LSTA_TXQ lock as that is really what it is and
    put down the full set of macros.  Replace the init and destroy with the
    macro invocation rather than direct code.

    Put locking around the txq_ready unset and check. Move the taskq_enqueue
    call under lock to be sure we do not call it anymore after txq_ready
    got unset.

    Leave a comment related to the node reference which is passed into the
    TX path on the recvif mbuf pointer.

    Fixes:          0936c648ad0ee
    PR:             274382 (possibly)
    Reviewed by:    emaste
    Differential Revision: https://reviews.freebsd.org/D43909

    (cherry picked from commit fa4e4257943650c0b5f58c01bb0bdfadea61dfb2)

 sys/compat/linuxkpi/common/src/linux_80211.c | 31 ++++++++++++++++++----------
 sys/compat/linuxkpi/common/src/linux_80211.h | 14 +++++++++++--
 2 files changed, 32 insertions(+), 13 deletions(-)
Comment 46 commit-hook freebsd_committer freebsd_triage 2024-02-19 16:10:45 UTC
A commit in branch releng/13.3 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=401dbf9cfc0cfb2a02355d41bf73e146bd3dfb74

commit 401dbf9cfc0cfb2a02355d41bf73e146bd3dfb74
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2024-02-14 21:56:48 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-02-19 16:09:22 +0000

    LinuxKPI: 802.11: lsta txq locking cleanup

    Rename the LSTA lock to LSTA_TXQ lock as that is really what it is and
    put down the full set of macros.  Replace the init and destroy with the
    macro invocation rather than direct code.

    Put locking around the txq_ready unset and check. Move the taskq_enqueue
    call under lock to be sure we do not call it anymore after txq_ready
    got unset.

    Leave a comment related to the node reference which is passed into the
    TX path on the recvif mbuf pointer.

    Approved by:    re (cperciva)
    Fixes:          0936c648ad0ee
    PR:             274382 (possibly)
    Reviewed by:    emaste
    Differential Revision: https://reviews.freebsd.org/D43909

    (cherry picked from commit fa4e4257943650c0b5f58c01bb0bdfadea61dfb2)
    (cherry picked from commit 763b10806cd4ebdcbd2b6753d4f50ec088dc57fb)

 sys/compat/linuxkpi/common/src/linux_80211.c | 31 ++++++++++++++++++----------
 sys/compat/linuxkpi/common/src/linux_80211.h | 14 +++++++++++--
 2 files changed, 32 insertions(+), 13 deletions(-)
Comment 47 Cheng Cui freebsd_committer freebsd_triage 2024-02-19 17:52:16 UTC
(In reply to Bjoern A. Zeeb from comment #43)

It's better to have a separate patch for the extra print.

With D43944 and D43879 on top of the latest main kernel, the "Invalid TXQ id" still shows up:

root@n2fbsdtmp:~ # uname -a
FreeBSD n2fbsdtmp 15.0-CURRENT FreeBSD 15.0-CURRENT #5 main-n268402-ef75877fc2d9-dirty: Mon Feb 19 12:42:38 EST 2024     root@n2fbsdtmp:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64


=== dmesg snip ===

[1.666880] vtnet0: link state changed to UP
[3.235969] iwlwifi0: lkpi_iv_update_bss:2537: lvif 0xfffffe00ab48d000 vap 0xfffffe00ab48d010 iv_bss 0xfffffe00ab4b8000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab568000
[3.237959] iwlwifi0: lkpi_iv_update_bss:2537: lvif 0xfffffe00ab48d000 vap 0xfffffe00ab48d010 iv_bss 0xfffffe00ab568000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab787000
[3.241222] Invalid TXQ id
[3.241326] iwl_mvm_tx_mpdu:1208: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe006dbfa448 skb 0xfffff80004935800 { len 30 } info 0xfffffe0063b4ccf8 sta 0xfffff800068e5080 (if you see this please report to PR 274382)
[3.242673] iwlwifi0: iwl_mvm_tx_mpdu:1210: lvif 0xfffffe00ab48d000 vap 0xfffffe00ab48d010 iv_bss 0xfffffe00ab787000 lvif_bss 0xfffff800068e5000 lvif_bss->ni 0xfffffe00ab568000 synched 1
[3.243664] iv_nstate 0x2 AUTH iv_nstate_b 3 iv_nstate_n 0
[3.243932] 	 [0] iv_nstates 0x1 SCAN _task 0xfffffe00ab48d360 _args 0
[3.244268] 	 [1] iv_nstates 0x2 AUTH _task 0xfffffe00ab48d380 _args 192
[3.244612] 	 [2] iv_nstates 0x2 AUTH _task 0xfffffe00ab48d3a0 _args 192
[3.244960] 	 [3] iv_nstates 0 INIT _task 0xfffffe00ab48d3c0 _args 0
[3.245283] 	 [4] iv_nstates 0 INIT _task 0xfffffe00ab48d3e0 _args 0
[3.245610] 	 [5] iv_nstates 0 INIT _task 0xfffffe00ab48d400 _args 0
[3.245937] 	 [6] iv_nstates 0 INIT _task 0xfffffe00ab48d420 _args 0
[3.246264] 	 [7] iv_nstates 0 INIT _task 0xfffffe00ab48d440 _args 0
[3.246590] iwl_mvm_tx_mpdu:1210 lsta 0xfffff800068e5000 ni 0xfffffe00ab568000 sta 0xfffff800068e5080
[3.247102] 	txq_task txq len 0 mtx
[3.247243] 	kc 0 state 1 added_to_drv 1 in_mgd 1
[3.247464] 0xfffffe00ab568000: mac 3e:60:f9:06:d1:9b refcnt 1
[3.247766] 	authmode 1 flags 0x20004
[3.247919] 	associd 0x0 txpower 100 vlan 0
[3.248105] 	txseq 0 rxseq 0 fragno 0 rxfragstamp 0
[3.248337] 	rssi 58 noise -96 intval 100 capinfo 0x531
[3.248591] 	bssid 3e:60:f9:06:d1:9b essid "" channel 2422:0x480
[3.248895] 	inact 2 inact_reload 2 txrate 72
[3.249092] 	htcap 19ef htparam 17 htctlchan 3 ht2ndchan 1
[3.249362] 	htopmode 2 htstbc 0 htchw 0
[3.249531] 	vhtcap 0 freq1 0 freq2 0 vhtbasicmcs 0
[3.249774] iwl_mvm_tx_mpdu:1208: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe006dbfa448 skb 0xfffff80004935800 { len 30 } info 0xfffffe0063b4ccf8 sta 0xfffff800068e5080 (if you see this please report to PR 274382)
[3.250894] iwlwifi0: iwl_mvm_tx_mpdu:1210: lvif 0xfffffe00ab48d000 vap 0xfffffe00ab48d010 iv_bss 0xfffffe00ab787000 lvif_bss 0xfffff800068e5000 lvif_bss->ni 0xfffffe00ab568000 synched 1
[3.251886] iv_nstate 0x2 AUTH iv_nstate_b 3 iv_nstate_n 0
[3.252154] 	 [0] iv_nstates 0x1 SCAN _task 0xfffffe00ab48d360 _args 0
[3.252495] 	 [1] iv_nstates 0x2 AUTH _task 0xfffffe00ab48d380 _args 192
[3.252847] 	 [2] iv_nstates 0x2 AUTH _task 0xfffffe00ab48d3a0 _args 192
[3.253198] 	 [3] iv_nstates 0 INIT _task 0xfffffe00ab48d3c0 _args 0
[3.253536] 	 [4] iv_nstates 0 INIT _task 0xfffffe00ab48d3e0 _args 0
[3.253860] 	 [5] iv_nstates 0 INIT _task 0xfffffe00ab48d400 _args 0
[3.254187] 	 [6] iv_nstates 0 INIT _task 0xfffffe00ab48d420 _args 0
[3.254520] 	 [7] iv_nstates 0 INIT _task 0xfffffe00ab48d440 _args 0
[3.254855] iwl_mvm_tx_mpdu:1210 lsta 0xfffff800068e5000 ni 0xfffffe00ab568000 sta 0xfffff800068e5080
[3.255390] 	txq_task txq len 0 mtx
[3.255537] 	kc 0 state 1 added_to_drv 1 in_mgd 1
[3.255763] 0xfffffe00ab568000: mac 3e:60:f9:06:d1:9b refcnt 1
[3.256065] 	authmode 1 flags 0x20004
[3.256223] 	associd 0x0 txpower 100 vlan 0
[3.256417] 	txseq 0 rxseq 0 fragno 0 rxfragstamp 0
[3.256658] 	rssi 58 noise -96 intval 100 capinfo 0x531
[3.256922] 	bssid 3e:60:f9:06:d1:9b essid "" channel 2422:0x480
[3.257239] 	inact 2 inact_reload 2 txrate 72
[3.257440] 	htcap 19ef htparam 17 htctlchan 3 ht2ndchan 1
[3.257715] 	htopmode 2 htstbc 0 htchw 0
[3.257892] 	vhtcap 0 freq1 0 freq2 0 vhtbasicmcs 0
[13.337998] iwlwifi0: lkpi_iv_update_bss:2537: lvif 0xfffffe00ab48d000 vap 0xfffffe00ab48d010 iv_bss 0xfffffe00ab787000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab568000
[14.556865] iwlwifi0: lkpi_iv_update_bss:2537: lvif 0xfffffe00ab48d000 vap 0xfffffe00ab48d010 iv_bss 0xfffffe00ab568000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab7b1000
[14.609154] wlan0: link state changed to UP
root@n2fbsdtmp:~ #
Comment 48 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-19 22:49:27 UTC
(In reply to Cheng Cui from comment #47)

I am currently trying to track it down with Ed testing.
Comment 49 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-20 17:53:17 UTC
(In reply to Bjoern A. Zeeb from comment #48)

Thanks Ed for all the traces in private.

I believe at this point we have to bite the bullet and do a full pass over net80211 and all the wireless drivers and adjust the KPI.

This will likely also help us to go into the direction of untangling some of 713db49d06deee90dd358b2e4b9ca05368a5eaf6.

But it'll take a few days, careful testing, and hope that we'll not have to do it a 2nd time at a later point.

But it'll also bring up the question of how to then support 14 and 13.  And the only likely answer is to repeat what we've just done: __FreeBSD_version bump and just do it?
Comment 50 mark 2024-02-24 20:18:43 UTC
well i now have a new issue. wifi was working for a bit then with me not changing anything my ability to resolve ip’s just stopped working. i’m still connected to the network but can’t get on the web. thinking maybe my isp dns was down or something i switched dns info did a networking restart and now im getting messages telling me to post here. 

what’s odd is im still getting that not associated error even though i am ifconfig reports no issues. 

this is. just a test system for me to learn bsd and mach so im willing to be a guinea pig and test stuff.
Comment 51 mark 2024-02-24 20:22:08 UTC
(In reply to mark from comment #50)
uh… as mysteriously as it stopped working it has now started working again. clearly something is unhappy.
Comment 52 Ed Maste freebsd_committer freebsd_triage 2024-02-24 20:29:08 UTC
(In reply to mark from comment #50)
> well i now have a new issue. wifi was working for a bit then with me not changing anything

What version of FreeBSD are you trying? I.e., `uname -a`
Comment 53 mark 2024-02-25 01:28:27 UTC
not in front of my laptop right now but i’m using the latest staging of 14.0 because i was having a issue with iwlwifi.
Comment 54 Cheng Cui freebsd_committer freebsd_triage 2024-02-27 15:07:08 UTC
If the remaining patches are only D43944 and D43967 for this PR, it looks the problem still exists (see below) after applying the two patches. I will test again on the other VM which can also reproduce the "Invalid TXQ", and confirm.


=== dmesg snip ===
...
[1.210188] iwlwifi0: loaded firmware version 77.2df8986f.0 cc-a0-77.ucode op_mode iwlmvm
[1.210931] iwlwifi0: Detected Intel(R) Wi-Fi 6 AX200 160MHz, REV=0x340
[1.326459] iwlwifi0: Detected RF HR B3, rfid=0x10a100
[1.391440] iwlwifi0: base HW address: e0:2e:0b:92:e5:82
[1.415420] iwlwifi0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[1.415929] iwlwifi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
[1.416247] iwlwifi0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[1.416974] pci0: driver added
[1.622653] wlan0: bpf attached
[1.622921] wlan0: bpf attached
[1.623065] wlan0: Ethernet address: e0:2e:0b:92:e5:82
[1.641814] lo0: link state changed to UP
[1.665468] vtnet0: link state changed to UP
[3.237436] iwlwifi0: lkpi_sta_scan_to_auth:1328: lvif 0xfffffe00ab4b8000 vap 0xfffffe00ab4b8010 iv_bss 0xfffffe00ab53f000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab535000 lsta 0xfffff80039355000
[3.238720] iwlwifi0: 
[3.238757] Invalid TXQ id
[3.238786] lkpi_iv_newstate: error 95 during state transition 2 (AUTH) -> 2 (AUTH)
[3.238909] iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe006dbfc4c8 skb 0xfffff8000357e800 { len 30 } info 0xfffffe0063b4ccf8 sta 0xfffff80039355080 (if you see this please report to PR 274382)
[14.614268] wlan0: link state changed to UP

root@n1iwlvm:~ # uname -a
FreeBSD n1iwlvm 15.0-CURRENT FreeBSD 15.0-CURRENT #77 PR274382-cce3a70a77-dirty: Tue Feb 27 09:56:44 EST 2024     root@n1iwlvm:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
Comment 55 Cheng Cui freebsd_committer freebsd_triage 2024-02-27 15:52:17 UTC
(In reply to Cheng Cui from comment #54)

Same in the other VM. D43944 and D43967 on top of latest main does not fix the `Invalid TXQ id` problem.

=== dmesg snip ===
...
[1.417096] pci0: driver added
[1.618199] wlan0: bpf attached
[1.618601] wlan0: bpf attached
[1.618813] wlan0: Ethernet address: e0:2e:0b:8f:77:08
[1.637810] lo0: link state changed to UP
[1.661634] vtnet0: link state changed to UP
[2.963496] iwlwifi0: lkpi_sta_scan_to_auth:1327: lvif 0xfffffe00ab4c0000 vap 0xfffffe00ab4c0010 iv_bss 0xfffffe00ab7b3000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab576000 lsta 0xfffff80060131000
[2.964666] iwlwifi0: 
[2.964687] Invalid TXQ id
[2.964726] lkpi_iv_newstate: error 95 during state transition 2 (AUTH) -> 2 (AUTH)
[2.964837] iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe006dbfb448 skb 0xfffff8000637d800 { len 30 } info 0xfffffe0063b4ccf8 sta 0xfffff80060131080 (if you see this please report to PR 274382)
[14.336602] wlan0: link state changed to UP

root@n2fbsdtmp:~ # uname -a
FreeBSD n2fbsdtmp 15.0-CURRENT FreeBSD 15.0-CURRENT #6 PR274382-n268402-ef75877fc2d9-dirty: Tue Feb 27 10:41:03 EST 2024     root@n2fbsdtmp:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
root@n2fbsdtmp:~ #
Comment 56 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-27 20:21:12 UTC
(In reply to Cheng Cui from comment #54)

See Comment 49 -- https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=274382#c49
Comment 57 Cheng Cui freebsd_committer freebsd_triage 2024-02-28 15:36:29 UTC
(In reply to Bjoern A. Zeeb from comment #56)

What I understand after reading the comment #49, is that there is still remaining work to do to fix this `Invalid TXQ id` problem. And if I understand correctly, D43944 and D43967 still needs to be reviewed and be committed in main, and then MFC.

Given D43944 and D43967 do not solve the problem, I have to test them for regression. Currently, I have tested them on fbsd-15-current VMs in [1], [2] and [3] below. No regression found. They are all using fbsd-15-current kernels, but only [3] can reproduce all the problems in this PR.

[1] rebuilt kernel to fbsd-15-current based on fbsd-15-current iso
[2] rebuilt kernel/World to fbsd-15-current based on fbsd-14.0-release iso
[3] rebuilt kernel/World to fbsd-15-current based on fbsd-13.2-release iso

I am not sure the regression test policy on stable/14 and stable/13, and also not clear how many patches have been committed there. If there is a branch tracking list for each patch, that would be very helpful. Therefore, I am intending to cover regression tests in these two stable branches, which means I need to create VMs and start from there.

Please let me know if there is any concern. Also please let me know if I missed any new patches for this PR.
Comment 58 Maxim Usatov 2024-03-12 09:32:49 UTC
Just upgraded from 13.2 to 13.3 and seeing the message:

Mar 12 09:26:52 freebsd kernel: Intel(R) Wireless WiFi based driver for FreeBSD
Mar 12 09:26:52 freebsd kernel: Autoloading module: ig4
Mar 12 09:26:52 freebsd kernel: iwlwifi0: <iwlwifi> mem 0x60531cc000-0x60531cffff at device 20.3 on pci0
Mar 12 09:26:52 freebsd kernel: iwlwifi0: Detected crf-id 0x400410, cnv-id 0x80400 wfpm id 0x80000020
Mar 12 09:26:52 freebsd kernel: iwlwifi0: PCI dev 51f0/0090, rev=0x370, rfid=0x2010d000
Mar 12 09:26:52 freebsd kernel: iwlwifi0: successfully loaded firmware image 'iwlwifi-so-a0-gf-a0-83.ucode'
Mar 12 09:26:52 freebsd kernel: iwlwifi0: api flags index 2 larger than supported by driver
Mar 12 09:26:52 freebsd kernel: iwlwifi0: TLV_FW_FSEQ_VERSION: FSEQ Version: 0.0.2.41
Mar 12 09:26:52 freebsd kernel: iwlwifi0: loaded firmware version 83.e8f84e98.0 so-a0-gf-a0-83.ucode op_mode iwlmvm
Mar 12 09:26:52 freebsd kernel: iwlwifi0: Detected Intel(R) Wi-Fi 6E AX211 160MHz, REV=0x370
Mar 12 09:26:52 freebsd kernel: iwlwifi0: WRT: Invalid buffer destination
Mar 12 09:26:52 freebsd kernel: iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
Mar 12 09:26:52 freebsd kernel: iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
Mar 12 09:26:52 freebsd kernel: iwlwifi0: WFPM_AUTH_KEY_0: 0x90
Mar 12 09:26:52 freebsd kernel: iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
Mar 12 09:26:52 freebsd kernel: iwlwifi0: RFIm is deactivated, reason = 4
Mar 12 09:26:52 freebsd kernel: iwlwifi0: successfully loaded firmware image 'iwlwifi-so-a0-gf-a0.pnvm'
Mar 12 09:26:52 freebsd kernel: iwlwifi0: loaded PNVM version 181407b3
Mar 12 09:26:52 freebsd kernel: iwlwifi0: Detected RF GF, rfid=0x2010d000
Mar 12 09:26:52 freebsd kernel: iwlwifi0: base HW address: 8c:17:59:72:2c:1b

. . . 

Mar 12 09:26:52 freebsd kernel: wlan0: link state changed to UP
Mar 12 09:26:52 freebsd kernel: Starting Network: lo0 em0 wlan0.
Mar 12 09:26:52 freebsd kernel: lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
Mar 12 09:26:52 freebsd kernel: 	options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
Mar 12 09:26:52 freebsd kernel: 	inet6 ::1 prefixlen 128
Mar 12 09:26:52 freebsd kernel: 	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
Mar 12 09:26:52 freebsd kernel: 	inet 127.0.0.1 netmask 0xff000000
Mar 12 09:26:52 freebsd kernel: 	groups: lo
Mar 12 09:26:52 freebsd kernel: 	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
Mar 12 09:26:52 freebsd kernel: em0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
Mar 12 09:26:52 freebsd kernel: 	options=4e504bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
Mar 12 09:26:52 freebsd kernel: 	ether 9c:2d:cd:ae:d1:4f
Mar 12 09:26:52 freebsd kernel: 	inet 192.168.200.200 netmask 0xffffff00 broadcast 192.168.200.255
Mar 12 09:26:52 freebsd kernel: 	media: Ethernet autoselect (100baseTX <full-duplex>)
Mar 12 09:26:52 freebsd kernel: 	status: active
Mar 12 09:26:52 freebsd kernel: 	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
Mar 12 09:26:52 freebsd kernel: wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
Mar 12 09:26:52 freebsd kernel: 	ether 8c:17:59:72:2c:1b
Mar 12 09:26:52 freebsd kernel: 	groups: wlan
Mar 12 09:26:52 freebsd kernel: 	ssid R320 channel 3 (2422 MHz 11g) bssid 28:b3:71:17:08:98
Mar 12 09:26:52 freebsd kernel: 	regdomain FCC country US authmode WPA2/802.11i privacy ON
Mar 12 09:26:52 freebsd kernel: 	deftxkey UNDEF AES-CCM 2:128-bit txpower 30 bmiss 7 scanvalid 60
Mar 12 09:26:52 freebsd kernel: 	protmode CTS wme roaming MANUAL
Mar 12 09:26:52 freebsd kernel: 	parent interface: iwlwifi0
Mar 12 09:26:52 freebsd kernel: 	media: IEEE 802.11 Wireless Ethernet OFDM/36Mbps mode 11g
Mar 12 09:26:52 freebsd kernel: 	status: associated
Mar 12 09:26:52 freebsd kernel: 	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

. . . 

Mar 12 09:26:55 freebsd kernel: wlan0: link state changed to DOWN
Mar 12 09:26:55 freebsd devd[2260]: Processing event '!system=IFNET subsystem=wlan0 type=LINK_DOWN'
Mar 12 09:26:55 freebsd wpa_supplicant[350]: wlan0: CTRL-EVENT-DISCONNECTED bssid=28:b3:71:17:08:98 reason=0
Mar 12 09:26:55 freebsd wpa_supplicant[350]: wlan0: Trying to associate with 28:b3:71:17:08:98 (SSID='R320' freq=2422 MHz)
Mar 12 09:26:55 freebsd kernel: Invalid TXQ id
Mar 12 09:26:55 freebsd kernel: iwl_mvm_tx_mpdu:1204: fc 000000 tid 8 txq_id 65535 mvm 0xfffffe010e48f408 skb 0xfffff80059974800 { len 81 } info 0xfffffe001cbcfcc0 sta 0xfffff80001b9c880 (if you see this please report to PR 274382)
Comment 59 commit-hook freebsd_committer freebsd_triage 2024-04-15 17:52:30 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=1665ef979d3adab05d178480077674ab5978e4f0

commit 1665ef979d3adab05d178480077674ab5978e4f0
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2024-02-16 21:14:02 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-04-15 17:44:00 +0000

    LinuxKPI: 802.11: disable early queue wakeup workaround; better classify

    We used to call lkpi_wake_tx_queues() for all queues early on in order
    to make sure they are ready.  Turns out whatever logic in iwlwifi
    startup is working correctly these days under LinuxKPI so this is no
    longer needed.  As the comment indicated the "workaround" did not
    always work.  Disable it for now.

    The second part of the improvement is to properly classify Non-QOS
    non-data frames (mgmt and ctl frames for now, unless we have to be
    even more careful in the future about certain sub-types) as TID 7/VO.
    Contrary to net80211 we must not promote PAE frames.

    PR:             274382
    MFC after:      3 day
    Reviewed by:    cc
    Differential Revision: https://reviews.freebsd.org/D43944

 sys/compat/linuxkpi/common/src/linux_80211.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)
Comment 60 nunziotocci2000 2024-05-12 23:12:07 UTC
root@frameworkbsd:~ # uname -a
FreeBSD frameworkbsd 14.1-STABLE FreeBSD 14.1-STABLE stable/14-n267607-7e10c2d27a53 GENERIC amd64
root@frameworkbsd:~ # dmesg | grep iwl 
iwlwifi0: <iwlwifi> mem 0x90b00000-0x90b03fff at device 0.0 on pci1
iwlwifi0: Detected crf-id 0x400410, cnv-id 0x400410 wfpm id 0x80000000
iwlwifi0: PCI dev 2725/0024, rev=0x420, rfid=0x10d000
iwlwifi0: successfully loaded firmware image 'iwlwifi-ty-a0-gf-a0-83.ucode'
iwlwifi0: api flags index 2 larger than supported by driver
iwlwifi0: TLV_FW_FSEQ_VERSION: FSEQ Version: 0.0.2.41
iwlwifi0: loaded firmware version 83.e8f84e98.0 ty-a0-gf-a0-83.ucode op_mode iwlmvm
iwlwifi0: Detected Intel(R) Wi-Fi 6 AX210 160MHz, REV=0x420
iwlwifi0: WRT: Invalid buffer destination: 0
iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
iwlwifi0: WFPM_AUTH_KEY_0: 0x90
iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
iwlwifi0: successfully loaded firmware image 'iwlwifi-ty-a0-gf-a0.pnvm'
iwlwifi0: loaded PNVM version 181407b3
iwlwifi0: Detected RF GF, rfid=0x10d000
iwlwifi0: base HW address: a0:02:a5:c4:b8:8c
iwlwifi0: WRT: Invalid buffer destination: 0
iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
iwlwifi0: WFPM_AUTH_KEY_0: 0x90
iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe01d6a7f4c8 skb 0xfffff8000259d800 { len 30 } info 0xfffffe0115fe1cd8 sta 0xfffff80011bd5080 (if you see this please report to PR 274382)
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe01d6a7f4c8 skb 0xfffff8000259d800 { len 30 } info 0xfffffe0115fe1cd8 sta 0xfffff80011bd5080 (if you see this please report to PR 274382)


Wi-Fi is working currently and I have not experienced any panics related to wifi.

If you need me to inspect anything then let me know.
Comment 61 rkoberman 2024-05-17 17:18:51 UTC
Just rebooted my system after my network would not restart after the link to my hotspot went down. The hotspot outage was unrelated to FreeBSD, but I was usable to get the interface to associate when the hotspot was restored.

On reboot the network came up, but logged the following:
May 16 18:04:02 ptavv kernel: iwlwifi0: WRT: Invalid buffer destination: 0
May 16 18:04:02 ptavv kernel: iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
May 16 18:04:02 ptavv kernel: iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
May 16 18:04:02 ptavv kernel: iwlwifi0: WFPM_AUTH_KEY_0: 0x90
May 16 18:04:02 ptavv kernel: iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
May 16 18:04:02 ptavv kernel: iwlwifi0: RFIm is deactivated, reason = 4
May 16 18:04:02 ptavv kernel: wlan0: Ethernet address: 70:a8:d3:df:3c:de
May 16 18:04:02 ptavv kernel: lo0: link state changed to UP
May 16 18:04:02 ptavv kernel: iwlwifi0: lkpi_iv_newstate: error 95 during state transition 2 (AUTH) -> 2 (AUTH)
May 16 18:04:02 ptavv kernel: Invalid TXQ id
May 16 18:04:02 ptavv kernel: iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe0154bfc4c8 skb 0xfffff80047b30800 { len 30 } info 0xfffffe00d5e9ece8 sta 0xfffff8003b3b8880 (if you see this please report to PR 274382)

This is not my usual network. I am using my phone as a hotspot as my system has been unable to associate with hotel networks for some time.

I am running a rather old system, main-n269145-3e1c8a35f741, due to recent kernel build issues reported by others.
Comment 62 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-05-17 18:11:14 UTC
(In reply to rkoberman from comment #61)

Hi,

thanks for reporting.  I have another question though:

> This is not my usual network. I am using my phone as a hotspot as my system has been unable to associate with hotel networks for some time.

Not at all?  None?  What's the exact issue?  Do you have logs?
Comment 63 Chris Hutchinson 2024-05-20 04:24:06 UTC
On a recently built kernel to test recent changes against sound:
FreeBSD 15.0-CURRENT #1 main-n270118-94b09d388b81-dirty:
Mon May 13 12:57:41 PDT 2024
root@fbsd15:/usr/obj/usr/src/amd64.amd64/sys/LENO15ND-20240510 amd64

iwlwifi0@pci0:0:20:3:   class=0x028000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x51f0 subvendor=0x8086 subdevice=0x0074
    vendor     = 'Intel Corporation'
    device     = 'Alder Lake-P PCH CNVi WiFi'
    class      = network

I'm receiving "Invalid TXQ id" (among other errors)
excerpts from the log:
May 17 20:27:21 fbsd15 kernel: iwlwifi0: Detected crf-id 0x1300504, cnv-id 0x80400 wfpm id 0x80000030
May 17 20:27:21 fbsd15 kernel: iwlwifi0: PCI dev 51f0/0074, rev=0x370, rfid=0x10a100
May 17 20:27:21 fbsd15 kernel: iwlwifi-so-a0-hr-b0-83.ucode: could not load firmware image, error 2
May 17 20:27:21 fbsd15 kernel: Trying to load binary firmware from /boot/firmware/iwlwifi-so-a0-hr-b0-83.ucode
May 17 20:27:21 fbsd15 kernel: iwlwifi-so-a0-hr-b0-83.ucode: could not load binary firmware /boot/firmware/iwlwifi-so-a0-hr-b0-83.ucode either
May 17 20:27:21 fbsd15 kernel: iwlwifi-so-a0-hr-b0-83.ucode: could not load firmware image, error 2
May 17 20:27:21 fbsd15 kernel: Trying to load binary firmware from /boot/firmware/iwlwifi-so-a0-hr-b0-83.ucode
May 17 20:27:21 fbsd15 kernel: iwlwifi-so-a0-hr-b0-83.ucode: could not load binary firmware /boot/firmware/iwlwifi-so-a0-hr-b0-83.ucode either
May 17 20:27:21 fbsd15 kernel: iwlwifi-so-a0-hr-b0-83_ucode: could not load firmware image, error 2
May 17 20:27:21 fbsd15 kernel: Trying to load binary firmware from /boot/firmware/iwlwifi-so-a0-hr-b0-83_ucode
May 17 20:27:21 fbsd15 kernel: iwlwifi-so-a0-hr-b0-83_ucode: could not load binary firmware /boot/firmware/iwlwifi-so-a0-hr-b0-83_ucode either
May 17 20:27:21 fbsd15 kernel: iwlwifi_so_a0_hr_b0_83_ucode: could not load firmware image, error 2
May 17 20:27:21 fbsd15 kernel: Trying to load binary firmware from /boot/firmware/iwlwifi_so_a0_hr_b0_83_ucode
May 17 20:27:21 fbsd15 kernel: iwlwifi_so_a0_hr_b0_83_ucode: could not load binary firmware /boot/firmware/iwlwifi_so_a0_hr_b0_83_ucode either
May 17 20:27:21 fbsd15 kernel: iwlwifi0: could not load firmware image 'iwlwifi-so-a0-hr-b0-83.ucode'
May 17 20:27:21 fbsd15 kernel: iwlwifi0: File size way too small!
May 17 20:27:21 fbsd15 kernel: iwlwifi-so-a0-hr-b0-82.ucode: could not load firmware image, error 2
May 17 20:27:21 fbsd15 kernel: Trying to load binary firmware from /boot/firmware/iwlwifi-so-a0-hr-b0-82.ucode
May 17 20:27:21 fbsd15 kernel: iwlwifi-so-a0-hr-b0-82.ucode: could not load binary firmware /boot/firmware/iwlwifi-so-a0-hr-b0-82.ucode either
May 17 20:27:21 fbsd15 kernel: iwlwifi-so-a0-hr-b0-82.ucode: could not load firmware image, error 2
May 17 20:27:21 fbsd15 kernel: Trying to load binary firmware from /boot/firmware/iwlwifi-so-a0-hr-b0-82.ucode
May 17 20:27:21 fbsd15 kernel: iwlwifi-so-a0-hr-b0-82.ucode: could not load binary firmware /boot/firmware/iwlwifi-so-a0-hr-b0-82.ucode either
May 17 20:27:21 fbsd15 kernel: iwlwifi-so-a0-hr-b0-82_ucode: could not load firmware image, error 2
May 17 20:27:21 fbsd15 kernel: Trying to load binary firmware from /boot/firmware/iwlwifi-so-a0-hr-b0-82_ucode
May 17 20:27:21 fbsd15 kernel: iwlwifi-so-a0-hr-b0-82_ucode: could not load binary firmware /boot/firmware/iwlwifi-so-a0-hr-b0-82_ucode either
May 17 20:27:21 fbsd15 kernel: iwlwifi_so_a0_hr_b0_82_ucode: could not load firmware image, error 2
May 17 20:27:21 fbsd15 kernel: Trying to load binary firmware from /boot/firmware/iwlwifi_so_a0_hr_b0_82_ucode
May 17 20:27:21 fbsd15 kernel: iwlwifi_so_a0_hr_b0_82_ucode: could not load binary firmware /boot/firmware/iwlwifi_so_a0_hr_b0_82_ucode either
May 17 20:27:21 fbsd15 kernel: iwlwifi0: could not load firmware image 'iwlwifi-so-a0-hr-b0-82.ucode'
May 17 20:27:21 fbsd15 kernel: iwlwifi0: File size way too small!
May 17 20:27:21 fbsd15 kernel: firmware: 'iwlwifi-so-a0-hr-b0-81.ucode' version 81: 1539288 bytes loaded at 0xffffffff82940000
May 17 20:27:21 fbsd15 kernel: iwlwifi0: successfully loaded firmware image 'iwlwifi-so-a0-hr-b0-81.ucode'
May 17 20:27:21 fbsd15 kernel: iwlwifi0: api flags index 2 larger than supported by driver
May 17 20:27:21 fbsd15 kernel: iwlwifi0: TLV_FW_FSEQ_VERSION: FSEQ Version: 0.0.2.41
May 17 20:27:21 fbsd15 kernel: iwl-debug-yoyo.bin: could not load firmware image, error 2
May 17 20:27:21 fbsd15 kernel: Trying to load binary firmware from /boot/firmware/iwl-debug-yoyo.bin
May 17 20:27:21 fbsd15 kernel: iwl-debug-yoyo.bin: could not load binary firmware /boot/firmware/iwl-debug-yoyo.bin either
May 17 20:27:21 fbsd15 kernel: iwl-debug-yoyo.bin: could not load firmware image, error 2
May 17 20:27:21 fbsd15 kernel: Trying to load binary firmware from /boot/firmware/iwl-debug-yoyo.bin
May 17 20:27:21 fbsd15 kernel: iwl-debug-yoyo.bin: could not load binary firmware /boot/firmware/iwl-debug-yoyo.bin either
May 17 20:27:21 fbsd15 kernel: iwl-debug-yoyo_bin: could not load firmware image, error 2
May 17 20:27:21 fbsd15 kernel: Trying to load binary firmware from /boot/firmware/iwl-debug-yoyo_bin
May 17 20:27:21 fbsd15 kernel: iwl-debug-yoyo_bin: could not load binary firmware /boot/firmware/iwl-debug-yoyo_bin either
May 17 20:27:21 fbsd15 kernel: iwl_debug_yoyo_bin: could not load firmware image, error 2
May 17 20:27:21 fbsd15 kernel: Trying to load binary firmware from /boot/firmware/iwl_debug_yoyo_bin
May 17 20:27:21 fbsd15 kernel: iwl_debug_yoyo_bin: could not load binary firmware /boot/firmware/iwl_debug_yoyo_bin either
May 17 20:27:21 fbsd15 kernel: iwlwifi0: loaded firmware version 81.31fc9ae6.0 so-a0-hr-b0-81.ucode op_mode iwlmvm
May 17 20:27:21 fbsd15 kernel: iwlwifi0: Detected Intel(R) Wi-Fi 6 AX201 160MHz, REV=0x370
May 17 20:27:21 fbsd15 kernel: ugen1.4: <vendor 0x8087 product 0x0026> at usbus1
May 17 20:27:21 fbsd15 kernel: iwlwifi0: WRT: Invalid buffer destination: 0
May 17 20:27:21 fbsd15 kernel: iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
May 17 20:27:21 fbsd15 kernel: iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
May 17 20:27:21 fbsd15 kernel: iwlwifi0: WFPM_AUTH_KEY_0: 0x90
May 17 20:27:21 fbsd15 kernel: iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x10
May 17 20:27:21 fbsd15 kernel: iwlwifi0: RFIm is deactivated, reason = 4
May 17 20:27:21 fbsd15 kernel: iwlwifi0: Detected RF HR B5, rfid=0x10a100
May 17 20:27:21 fbsd15 kernel: iwlwifi0: base HW address: 3c:e9:f7:ec:63:d7
May 17 20:27:21 fbsd15 kernel: iwlwifi0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
May 17 20:27:21 fbsd15 kernel: iwlwifi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
May 17 20:27:21 fbsd15 kernel: iwlwifi0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps

...

May 17 20:27:21 fbsd15 kernel: iwlwifi0: WRT: Invalid buffer destination: 0
May 17 20:27:21 fbsd15 kernel: iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
May 17 20:27:21 fbsd15 kernel: iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
May 17 20:27:21 fbsd15 kernel: iwlwifi0: WFPM_AUTH_KEY_0: 0x90
May 17 20:27:21 fbsd15 kernel: iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x10
May 17 20:27:21 fbsd15 kernel: iwlwifi0: RFIm is deactivated, reason = 4
May 17 20:27:21 fbsd15 kernel: wlan0: bpf attached
May 17 20:27:21 fbsd15 syslogd: last message repeated 1 times
May 17 20:27:21 fbsd15 kernel: wlan0: Ethernet address: 3c:e9:f7:ec:63:d7
May 17 20:27:21 fbsd15 kernel: lo1: bpf attached
May 17 20:27:21 fbsd15 kernel: lo2: bpf attached
May 17 20:27:21 fbsd15 kernel: lo3: bpf attached
May 17 20:27:21 fbsd15 kernel: lo0: link state changed to UP
May 17 20:27:21 fbsd15 kernel: Invalid TXQ id
May 17 20:27:21 fbsd15 kernel: iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe00f597f4c8 skb 0xfffff8000ca6c800 { len 30 } info 0xfffffe00f0694cd8 sta 0xfffff80001c60080 (if you see this please report to PR 274382)
May 17 20:27:21 fbsd15 syslogd: last message repeated 1 times
May 17 20:27:21 fbsd15 kernel: wlan0: link state changed to UP

I have the verbose boot log in it's entirety, and would be more
than happy to attach it if that would help you.

--Chris
Comment 64 rkoberman 2024-05-28 21:41:46 UTC
(In reply to Bjoern A. Zeeb from comment #62)
While I am not certain, I suspect that this issue was the result of a typo in /etc/wpa_supplicant. I will be unable to test this until I try to connect again from a hotel, probably not until September. If it does not turn out to be my fat-fingering, I'll open a new ticket ad attempt to collect more details by adding debugging to the wpa_supplicant. Since "list aps" does show a long lists of SSIDs, I really suspect either a typo on my part or a wps_supplicant issue.
Comment 65 Olivier Certner freebsd_committer freebsd_triage 2024-05-31 20:20:05 UTC
Just got:

May 31 14:47:05 host kernel: iwl_mvm_tx_mpdu:1204: fc 000000 tid 8 txq_id 65535 mvm 0xfffffe00dc0f94c8 skb 0xfffff80219365000 { len 60 } info 0xfffffe00c67bdcb8 sta 0xfffff80051578080 (if you see this please report to PR 274382)
May 31 14:47:06 host kernel: iwlwifi0: Not associated and the session protection is over already...
May 31 14:47:06 host kernel: iwlwifi0: linuxkpi_ieee80211_connection_loss: vif 0xfffffe011d5e2e80 vap 0xfffffe011d5e2010 state AUTH

on a recent 14-STABLE (9a8a26aefb36, May 12th).

It's connecting to an open network (so, no WPA/wpa_supplicant at all).

I had had several "iwlwifi0: linuxkpi_ieee80211_connection_loss: (snip)" on this network at some point, but it seems it was rather the network, and those were gone for a while before I got the above.

Searching in 'messages', I see it's the second time that I get "(if you see this please report to PR 274382)".  Before the first occurence, there is a line "Invalid TXQ id" with same timestamp, but there's nothing before the second one.

Let me know if you need more info, or if I could help debugging (only one day left accessing this network).
Comment 66 Tom Jones freebsd_committer freebsd_triage 2024-06-06 12:10:37 UTC
iwlwifi0: <iwlwifi> mem 0x7a200000-0x7a203fff at device 0.0 on pci2
iwlwifi0: Detected crf-id 0x400410, cnv-id 0x400410 wfpm id 0x80000000
iwlwifi0: PCI dev 2725/0024, rev=0x420, rfid=0x10d000
iwlwifi0: successfully loaded firmware image 'iwlwifi-ty-a0-gf-a0-83.ucode'
iwlwifi0: api flags index 2 larger than supported by driver
iwlwifi0: TLV_FW_FSEQ_VERSION: FSEQ Version: 0.0.2.41
iwl-debug-yoyo.bin: could not load binary firmware /boot/firmware/iwl-debug-yoyo.bin either
iwl-debug-yoyo.bin: could not load binary firmware /boot/firmware/iwl-debug-yoyo.bin either
iwl-debug-yoyo_bin: could not load binary firmware /boot/firmware/iwl-debug-yoyo_bin either
iwl_debug_yoyo_bin: could not load binary firmware /boot/firmware/iwl_debug_yoyo_bin either
iwlwifi0: loaded firmware version 83.e8f84e98.0 ty-a0-gf-a0-83.ucode op_mode iwlmvm
iwlwifi0: Detected Intel(R) Wi-Fi 6 AX210 160MHz, REV=0x420
iwlwifi0: WRT: Invalid buffer destination: 0
iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
iwlwifi0: WFPM_AUTH_KEY_0: 0x90
iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
iwlwifi0: successfully loaded firmware image 'iwlwifi-ty-a0-gf-a0.pnvm'
iwlwifi0: loaded PNVM version 181407b3
iwlwifi0: Detected RF GF, rfid=0x10d000
iwlwifi0: base HW address: 04:cf:4b:1e:ca:e9
iwlwifi0: WRT: Invalid buffer destination: 0
iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
iwlwifi0: WFPM_AUTH_KEY_0: 0x90
iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
iwlwifi0: linuxkpi_ieee80211_beacon_loss: vif 0xfffffe0159b6de80 vap 0xfffffe0159b6d010 state RUN
iwlwifi0: linuxkpi_ieee80211_connection_loss: vif 0xfffffe0159b6de80 vap 0xfffffe0159b6d010 state RUN
iwl_mvm_tx_mpdu:1204: fc 0x00a0 tid 8 txq_id 65535 mvm 0xfffffe015ab784c8 skb 0xfffff80193d9e800 { len 26 } info 0xfffffe00daf15cb8 sta 0xfffff80002178080 (if you see this please report to PR 274382)
[tj@displacementactivity] $ uname -a
FreeBSD displacementactivity 15.0-CURRENT FreeBSD 15.0-CURRENT main-n270504-ce75bfcac9cf GENERIC-NODEBUG amd64


I am getting this regularly after a couple hours of up time on my Framework 13 12th Generation Intel. My network is a couple of Eero 6E access points and there is a quite a lot of other traffic around (I'm across the road from a primary school).
Comment 67 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-06-06 15:29:27 UTC
Can people please try the patch from:

https://reviews.freebsd.org/D45508

If you don't use arc there's a "Download Raw Diff" link on the right side.
Comment 68 Tom Jones freebsd_committer freebsd_triage 2024-06-06 19:47:52 UTC
With the patch applied I've been running for 4 hours without any issue. I'm not sure if the environment to reproduce has existed as it is later in the day.
Comment 69 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-06-06 21:15:52 UTC
(In reply to Tom Jones from comment #68)

Thanks Tom for testing so quickly and reporting back.

I'll likely put the change into main the next days and by then others also have a bit of time to test.
Comment 70 rkoberman 2024-06-07 20:52:09 UTC
This is more of an "it didn't seem to screw up the AX211" report. No problems. orksfine.With the patched code I get the following on loading:
Jun  6 16:25:41 ptavv kernel: Intel(R) Wireless WiFi based driver for FreeBSD
Jun  6 16:25:41 ptavv kernel: iwlwifi0: <iwlwifi> mem 0x603d1cc000-0x603d1cffff at device 20.3 on pci0
Jun  6 16:25:41 ptavv kernel: iwlwifi0: Detected crf-id 0x400410, cnv-id 0x80400 wfpm id 0x80000020
Jun  6 16:25:41 ptavv kernel: iwlwifi0: PCI dev 51f0/0090, rev=0x370, rfid=0x2010d000
Jun  6 16:25:41 ptavv kernel: iwlwifi0: successfully loaded firmware image 'iwlwifi-so-a0-gf-a0-83.ucode'
Jun  6 16:25:41 ptavv kernel: iwlwifi0: api flags index 2 larger than supported by driver
Jun  6 16:25:41 ptavv kernel: iwlwifi0: TLV_FW_FSEQ_VERSION: FSEQ Version: 0.0.2.41
Jun  6 16:25:41 ptavv kernel: iwl-debug-yoyo.bin: could not load binary firmware /boot/firmware/iwl-debug-yoyo.bin either
Jun  6 16:25:41 ptavv syslogd: last message repeated 1 times
Jun  6 16:25:41 ptavv kernel: iwl-debug-yoyo_bin: could not load binary firmware /boot/firmware/iwl-debug-yoyo_bin either
Jun  6 16:25:41 ptavv kernel: iwl_debug_yoyo_bin: could not load binary firmware /boot/firmware/iwl_debug_yoyo_bin either
Jun  6 16:25:41 ptavv kernel: iwlwifi0: loaded firmware version 83.e8f84e98.0 so-a0-gf-a0-83.ucode op_mode iwlmvm
Jun  6 16:25:41 ptavv kernel: iwlwifi0: Detected Intel(R) Wi-Fi 6E AX211 160MHz, REV=0x370
Jun  6 16:25:41 ptavv kernel: iwlwifi0: WRT: Invalid buffer destination: 0
Jun  6 16:25:41 ptavv kernel: iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
Jun  6 16:25:41 ptavv kernel: iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
Jun  6 16:25:41 ptavv kernel: iwlwifi0: WFPM_AUTH_KEY_0: 0x90
Jun  6 16:25:41 ptavv kernel: iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
Jun  6 16:25:41 ptavv kernel: iwlwifi0: RFIm is deactivated, reason = 4
Jun  6 16:25:41 ptavv kernel: iwlwifi0: successfully loaded firmware image 'iwlwifi-so-a0-gf-a0.pnvm'
Jun  6 16:25:41 ptavv kernel: iwlwifi0: loaded PNVM version 181407b3
Jun  6 16:25:41 ptavv kernel: iwlwifi0: Detected RF GF, rfid=0x2010d000
Jun  6 16:25:41 ptavv kernel: iwlwifi0: base HW address: 70:a8:d3:df:3c:de

on startup I get:
Jun  6 16:25:41 ptavv kernel: iwlwifi0: WRT: Invalid buffer destination: 0
Jun  6 16:25:41 ptavv kernel: iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
Jun  6 16:25:41 ptavv kernel: iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
Jun  6 16:25:41 ptavv kernel: iwlwifi0: WFPM_AUTH_KEY_0: 0x90
Jun  6 16:25:41 ptavv kernel: iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
Jun  6 16:25:41 ptavv kernel: iwlwifi0: RFIm is deactivated, reason = 4
Jun  6 16:25:41 ptavv kernel: wlan0: Ethernet address: 70:a8:d3:df:3c:de
Jun  6 16:25:41 ptavv kernel: lo0: link state changed to UP
Jun  6 16:25:41 ptavv kernel: iwlwifi0: lkpi_iv_newstate: error 95 during state transition 2 (AUTH) -> 2 (AUTH)
Jun  6 16:25:41 ptavv kernel: iwlwifi0: lkpi_iv_newstate: error 95 during state transition 2 (AUTH) -> 3 (ASSOC)
Jun  6 16:25:41 ptavv kernel: iwlwifi0: Not associated and the session protection is over already...
Jun  6 16:25:41 ptavv kernel: iwlwifi0: linuxkpi_ieee80211_connection_loss: vif 0xfffffe0167875e80 vap 0xfffffe0167875010 state AUTH
Jun  6 16:25:41 ptavv kernel: wlan0: link state changed to UP

While it does not seem to improve anything, i does not seem to have had any negatives, as expected.
Comment 71 commit-hook freebsd_committer freebsd_triage 2024-06-07 21:52:26 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=886653492945f7e945eb9bdaf5bc2ae26df96236

commit 886653492945f7e945eb9bdaf5bc2ae26df96236
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2024-06-05 22:54:36 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-06-07 21:47:42 +0000

    LinuxKPI: 802.11: make sure we can send DISASSOC or DEAUTH frames

    The "Invalid TXQ" error from iwlwifi seems to be triggered by a
    frame being sent for a sta which is no longer known to the driver/fw.

    While we make sure to trigger the sending of the frame in net80211
    early enough (by calling (*iv_newstate)() early on rather than at
    the end), TX in LinuxKPI is run in a deferred task.  When we drop the
    net80211 ic lock again and re-acquire the LHW lock the packet may not
    yet have made it to the driver.
    Work around this between the (ic and lhw) locks by making sure
    (a) no new packets get queued after we return from (*iv_newstate)(),
    and (b) the TX task has run or gets cancelled and we manually push
    any remaining packets out (or let lsta_free() clean them up).
    The disabled packet queuing now also needs to be re-enabled in
    scan_to_auth() in case an lsta is staying in service or gets re-used.

    Also make sure that any following lkpi_wake_tx_queues() calls no
    longer ignore queues which have not seen a prior dequeue.
    This former workaround "feature" (ltxq->seen_dequeue) should be
    fully garbage collected in a later change on its own.

    Sponsored by:   The FreeBSD Foundation
    MFC after:      3 days
    PR:             274382
    Tested by:      emaste, lwhsu, thj, rkoberman at gmail.com
    Accepted by:    adrian
    Differential Revision: https://reviews.freebsd.org/D45508

 sys/compat/linuxkpi/common/src/linux_80211.c | 95 +++++++++++++++++++++++++---
 1 file changed, 86 insertions(+), 9 deletions(-)
Comment 72 justinlbouchard 2024-06-08 17:23:17 UTC
Hey, i'm on FreeBSD 14.1 and i've gotten the error which said to report to this PR. Since i'm not sure what info is needed here, i'll just say the basics.
ax200 wifi 6
on a 5ghz wifi
Comment 73 Ed Maste freebsd_committer freebsd_triage 2024-06-10 20:45:05 UTC
(In reply to justinlbouchard from comment #72)

This issue may be fixed as of Bjoern's commit referenced in comment #71 (it is for me). It should be merged into stable/14 before long, and after that will be available in stable/14 snapshot builds. If you're able to try one of those snapshots that would be much appreciated, otherwise check again on FreeBSD 14.2 when it is out.
Comment 74 Vladislav Shabanov 2024-06-11 20:22:44 UTC
Hello.
I have the same problem with very fresh 15-CURRENT (edbd489d09babebdc6c03924a912013be584c409).

The system starts OK and wifi works without problem. I can download data from the web, surf in browser, etc. But when I try to upload something bug ( > 50 Mb), iwlwifi crashed. The easiest way to reproduce:
    cat /dev/zero | nc -v 192.168.1.xx 9999
    (with nc -v -l 9999 > /dev/null on that host)

The crash happened both in GENERIC and GENERIC-NODEBUG. I was unable to reproduce this bug in FreeBSD-15.0-CURRENT-amd64-20240530-d2f1f71ec8c6-270474, that snapshot seems stable.

pciconf -lv:
iwlwifi0@pci0:87:0:0:   class=0x028000 rev=0x1a hdr=0x00 vendor=0x8086 device=0x2725 subvendor=0x8086 subdevice=0x0024
    vendor     = 'Intel Corporation'
    device     = 'Wi-Fi 6E(802.11ax) AX210/AX1675* 2x2 [Typhoon Peak]'
    class      = network

Dmesg at system start:
kernel: iwlwifi0: <iwlwifi> mem 0x6e200000-0x6e203fff at device 0.0 on pci3
kernel: iwlwifi0: Detected crf-id 0x400410, cnv-id 0x400410 wfpm id 0x80000000
kernel: iwlwifi0: PCI dev 2725/0024, rev=0x420, rfid=0x10d000
kernel: iwlwifi0: successfully loaded firmware image 'iwlwifi-ty-a0-gf-a0-83.ucode'
kernel: iwlwifi0: api flags index 2 larger than supported by driver
kernel: iwlwifi0: TLV_FW_FSEQ_VERSION: FSEQ Version: 0.0.2.41
kernel: iwl-debug-yoyo.bin: could not load binary firmware /boot/firmware/iwl-debug-yoyo.bin either
syslogd: last message repeated 1 times
kernel: iwl-debug-yoyo_bin: could not load binary firmware /boot/firmware/iwl-debug-yoyo_bin either
kernel: iwl_debug_yoyo_bin: could not load binary firmware /boot/firmware/iwl_debug_yoyo_bin either
kernel: iwlwifi0: loaded firmware version 83.e8f84e98.0 ty-a0-gf-a0-83.ucode op_mode iwlmvm
kernel: iwlwifi0: Detected Intel(R) Wi-Fi 6 AX210 160MHz, REV=0x420
kernel: iwlwifi0: WRT: Invalid buffer destination: 0
kernel: iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
kernel: iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
kernel: iwlwifi0: WFPM_AUTH_KEY_0: 0x90
kernel: iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
kernel: iwlwifi0: successfully loaded firmware image 'iwlwifi-ty-a0-gf-a0.pnvm'
kernel: iwlwifi0: loaded PNVM version 181407b3
kernel: iwlwifi0: Detected RF GF, rfid=0x10d000
kernel: iwlwifi0: base HW address: f8:b5:4d:6e:ce:c7
kernel: acpi_wmi0: <ACPI-WMI mapping> on acpi0
kernel: acpi_wmi0: Embedded MOF found
kernel: ACPI: \_SB.WFDE.WQCC: 1 arguments were passed to a non-method ACPI object (Buffer) (20230628/nsarguments-3>
kernel: acpi_wmi1: <ACPI-WMI mapping> on acpi0
kernel: acpi_wmi1: Embedded MOF found
kernel: ACPI: \_SB.WFTE.WQCC: 1 arguments were passed to a non-method ACPI object (Buffer) (20230628/nsarguments-3>
kernel: acpi_wmi2: <ACPI-WMI mapping> on acpi0
kernel: iwlwifi0: WRT: Invalid buffer destination: 0
kernel: iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
kernel: iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
kernel: iwlwifi0: WFPM_AUTH_KEY_0: 0x90
kernel: iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
kernel: wlan0: Ethernet address: f8:b5:4d:6e:ce:c7
kernel: lo0: link state changed to UP
kernel: uhid0 on uhub1
...........................................

The crash:
kernel: iwlwifi0: Queue 3 is stuck 68 77
Jun 11 13:59:44 vsGB kernel: iwlwifi0:   need_update 0 frozen 0 ampdu 0 now 2146999990 stuck_timer.expires 2146999908 frozen_expiry_rem>
kernel: iwlwifi0: Microcode SW error detected. Restarting 0x0.
kernel: iwlwifi0: Start IWL Error Log Dump:
kernel: iwlwifi0: Transport status: 0x0000004A, valid: 6
kernel: iwlwifi0: Loaded firmware version: 83.e8f84e98.0 ty-a0-gf-a0-83.ucode
kernel: iwlwifi0: 0x00000084 | NMI_INTERRUPT_UNKNOWN
kernel: iwlwifi0: 0x00808203 | trm_hw_status0
kernel: iwlwifi0: 0x00000000 | trm_hw_status1
kernel: iwlwifi0: 0x004DC410 | branchlink2
kernel: iwlwifi0: 0x00008C84 | interruptlink1
kernel: iwlwifi0: 0x00008C84 | interruptlink2
kernel: iwlwifi0: 0x00016AD0 | data1
kernel: iwlwifi0: 0x01000000 | data2
kernel: iwlwifi0: 0x00000000 | data3
kernel: iwlwifi0: 0xF580CBCA | beacon time
kernel: iwlwifi0: 0xA4E9D443 | tsf low
kernel: iwlwifi0: 0x00000002 | tsf hi
kernel: iwlwifi0: 0x00000AA7 | time gp1
kernel: iwlwifi0: 0x063BB9C1 | time gp2
kernel: iwlwifi0: 0x00000001 | uCode revision type
kernel: iwlwifi0: 0x00000053 | uCode version major
kernel: iwlwifi0: 0xE8F84E98 | uCode version minor
kernel: iwlwifi0: 0x00000420 | hw version
kernel: iwlwifi0: 0x00C80002 | board version
kernel: iwlwifi0: 0x02DD001C | hcmd
kernel: iwlwifi0: 0x24023000 | isr0
kernel: iwlwifi0: 0x00048000 | isr1
kernel: iwlwifi0: 0x48F00002 | isr2
kernel: iwlwifi0: 0x00C3009C | isr3
kernel: iwlwifi0: 0x00200000 | isr4
kernel: iwlwifi0: 0x02DC001C | last cmd Id
kernel: iwlwifi0: 0x00016AD0 | wait_event
kernel: iwlwifi0: 0x000000D4 | l2p_control
kernel: iwlwifi0: 0x00019C14 | l2p_duration
kernel: iwlwifi0: 0x00000007 | l2p_mhvalid
kernel: iwlwifi0: 0x00810048 | l2p_addr_match
kernel: iwlwifi0: 0x00000009 | lmpm_pmg_sel
kernel: iwlwifi0: 0x00000000 | timestamp
kernel: iwlwifi0: 0x0000E808 | flow_handler
kernel: iwlwifi0: Start IWL Error Log Dump:
kernel: iwlwifi0: Transport status: 0x0000004A, valid: 7
kernel: iwlwifi0: 0x20000066 | NMI_INTERRUPT_HOST
kernel: iwlwifi0: 0x00000000 | umac branchlink1
kernel: iwlwifi0: 0x8046DA58 | umac branchlink2
kernel: iwlwifi0: 0x8048DF3E | umac interruptlink1
kernel: iwlwifi0: 0x8048DF3E | umac interruptlink2
kernel: iwlwifi0: 0x01000000 | umac data1
kernel: iwlwifi0: 0x8048DF3E | umac data2
kernel: iwlwifi0: 0x00000000 | umac data3
kernel: iwlwifi0: 0x00000053 | umac major
kernel: iwlwifi0: 0xE8F84E98 | umac minor
kernel: iwlwifi0: 0x063BB9BE | frame pointer
kernel: iwlwifi0: 0xC0886258 | stack pointer
kernel: iwlwifi0: 0x009F010C | last host cmd
kernel: iwlwifi0: 0x00000400 | isr status reg
kernel: iwlwifi0: IML/ROM dump:
kernel: iwlwifi0: 0x00000B03 | IML/ROM error/state
kernel: iwlwifi0: 0x000086BE | IML/ROM data1
kernel: iwlwifi0: 0x00000090 | IML/ROM WFPM_AUTH_KEY_0
kernel: iwlwifi0: Fseq Registers:
kernel: iwlwifi0: 0x60000000 | FSEQ_ERROR_CODE
kernel: iwlwifi0: 0x80440007 | FSEQ_TOP_INIT_VERSION
kernel: iwlwifi0: 0x0000A652 | FSEQ_OTP_VERSION
kernel: iwlwifi0: 0x00000002 | FSEQ_TOP_CONTENT_VERSION
kernel: iwlwifi0: 0x4552414E | FSEQ_ALIVE_TOKEN
kernel: iwlwifi0: 0x00400410 | FSEQ_CNVI_ID
kernel: iwlwifi0: 0x00400410 | FSEQ_CNVR_ID
kernel: iwlwifi0: 0x00400410 | CNVI_AUX_MISC_CHIP
kernel: iwlwifi0: 0x00400410 | CNVR_AUX_MISC_CHIP
kernel: iwlwifi0: 0x00009061 | CNVR_SCU_SD_REGS_SD_REG_DIG_DCDC_VTRIM
kernel: iwlwifi0: 0x00000061 | CNVR_SCU_SD_REGS_SD_REG_ACTIVE_VDIG_MIRROR
kernel: iwlwifi0: 0x00080009 | FSEQ_PREV_CNVIO_INIT_VERSION
kernel: iwlwifi0: 0x00440007 | FSEQ_WIFI_FSEQ_VERSION
kernel: iwlwifi0: 0x5AE2C71E | FSEQ_BT_FSEQ_VERSION
kernel: iwlwifi0: 0x000000DC | FSEQ_CLASS_TP_VERSION
kernel: iwlwifi0: UMAC CURRENT PC: 0x8048da0c
kernel: iwlwifi0: LMAC1 CURRENT PC: 0xd0
kernel: iwlwifi0: WRT: Collecting data: ini trigger 4 fired (delay=0ms).
.........

Another crash:
kernel: iwlwifi0: Queue 3 is stuck 41 45
kernel: iwlwifi0:   need_update 0 frozen 0 ampdu 0 now 2146994036 stuck_timer.expires 2146994017 frozen_expiry_rem>
kernel: iwlwifi0: Microcode SW error detected. Restarting 0x0.
.........

If you need detailed logs, let me know.

With best regards,
    Vladislav
Comment 75 Ed Maste freebsd_committer freebsd_triage 2024-06-11 20:47:52 UTC
(In reply to Vladislav Shabanov from comment #74)
I don't see the Invalid TXQ message in your logs - can you confirm (or explain what you mean by being the same problem)?
Comment 76 Vladislav Shabanov 2024-06-11 22:46:57 UTC
(In reply to Ed Maste from comment #75)

Sorry. Maybe, my comment is not relevant to initial bug report. But I started to resolve the wifi card issue with 'Invalid TXQ id' message and after updates I have no this lines but have 'Queue 3 stuck'. For me, it looks like the patches against 'Invalid TXQ id' fix one problem but trigger another one.

Maybe, I should create another bug report?

Details:

06.06.2024: FreeBSD 15.0-CURRENT #0 main-n270474-d2f1f71ec8c6 (snapshot). 
.......
wpa_supplicant[378]: wlan0: CTRL-EVENT-DISCONNECTED bssid=c8:be:19:ec:2d:31 reason=3 locally_generated=1
wpa_supplicant[378]: wlan0: CTRL-EVENT-DSCP-POLICY clear_all
kernel: wlan0: link state changed to DOWN
kernel: Invalid TXQ id
kernel: iwl_mvm_tx_mpdu:1204: fc 0x00a0 tid 8 txq_id 65535 mvm 0xfffffe01a92634c8 skb 0xfffff80150f26800 { len 26 } info 0xfffffe010a355cc0 sta 0xfffff80150b61880 (if you see this please report to PR 274382)
....

I updated the src, recompiled and got the following:

07.06.2024: FreeBSD 15.0-CURRENT #1 main-n270672-2b887687edc2
...
kernel: iwlwifi0: Queue 3 is stuck 156 158
kernel: iwlwifi0:   need_update 0 frozen 0 ampdu 0 now 18446744071562759140 stuck_timer.expires 18446744071562759103 frozen_expiry_remainder 0 wd_timeout 10000
kernel: iwlwifi0: Microcode SW error detected. Restarting 0x0.
.....

A day later I updated again, but the problem persists:

10.06.2014: FreeBSD 15.0-CURRENT #2 main-n270710-edbd489d09ba
.....
kernel: iwlwifi0: Queue 3 is stuck 77 78
kernel: iwlwifi0:   need_update 0 frozen 0 ampdu 0 now 2147187279 stuck_timer.expires 2147187265 frozen_expiry_remainder 0 wd_timeout 10000
kernel: iwlwifi0: Microcode SW error detected. Restarting 0x0.

When I booted again main-n270474-d2f1f71ec8c6 from LiveCD, I was unable to reproduce wlan0 crash on outgoing traffic using netcat (the 'Queue 3 is stuck' issue). But with latest commits it's very easy to do so.
Comment 77 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-06-12 12:20:29 UTC
(In reply to Vladislav Shabanov from comment #76)

Please open a new bug report with this information.
I would also be curious if anything got logged before the Queue stuck message.
If so please attach it there to or provide the full dmesg output.
Comment 78 commit-hook freebsd_committer freebsd_triage 2024-06-12 16:42:28 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=2ab1b827d49f473e30a91382d2ce03e268af45c5

commit 2ab1b827d49f473e30a91382d2ce03e268af45c5
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2024-06-05 22:54:36 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-06-12 13:59:46 +0000

    LinuxKPI: 802.11: make sure we can send DISASSOC or DEAUTH frames

    The "Invalid TXQ" error from iwlwifi seems to be triggered by a
    frame being sent for a sta which is no longer known to the driver/fw.

    While we make sure to trigger the sending of the frame in net80211
    early enough (by calling (*iv_newstate)() early on rather than at
    the end), TX in LinuxKPI is run in a deferred task.  When we drop the
    net80211 ic lock again and re-acquire the LHW lock the packet may not
    yet have made it to the driver.
    Work around this between the (ic and lhw) locks by making sure
    (a) no new packets get queued after we return from (*iv_newstate)(),
    and (b) the TX task has run or gets cancelled and we manually push
    any remaining packets out (or let lsta_free() clean them up).
    The disabled packet queuing now also needs to be re-enabled in
    scan_to_auth() in case an lsta is staying in service or gets re-used.

    Also make sure that any following lkpi_wake_tx_queues() calls no
    longer ignore queues which have not seen a prior dequeue.
    This former workaround "feature" (ltxq->seen_dequeue) should be
    fully garbage collected in a later change on its own.

    Sponsored by:   The FreeBSD Foundation
    PR:             274382
    Tested by:      emaste, lwhsu, thj, rkoberman at gmail.com
    Accepted by:    adrian
    Differential Revision: https://reviews.freebsd.org/D45508

    (cherry picked from commit 886653492945f7e945eb9bdaf5bc2ae26df96236)

 sys/compat/linuxkpi/common/src/linux_80211.c | 95 +++++++++++++++++++++++++---
 1 file changed, 86 insertions(+), 9 deletions(-)
Comment 79 Cheng Cui freebsd_committer freebsd_triage 2024-06-14 11:16:47 UTC
I happened to have a VM that can reproduce the "Invalid TXQ" by the method
listed as [3] in comment# 57. After commit 886653492945, the "Invalid TXQ" is
gone.

=== console output before commit 886653492945 === 
Starting wpa_supplicant.
Starting dhclient.
wlan0: no link ....Invalid TXQ idiwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe00bde044c8 skb 0xfffff800070c0800 { len 30 } info 0xfffffe007482cce8 sta 0xfffff80007efd080 (if you see this please report to PR 274382)
.......... giving up
/etc/rc.d/dhclient: WARNING: failed to start dhclient
wlan0: link state changed to UP

root@n2fbsdtmp:~ # uname -a
FreeBSD n2fbsdtmp 15.0-CURRENT FreeBSD 15.0-CURRENT #7 main-n268591-cf4d9bf8b388: Wed Feb 28 11:09:22 EST 2024     root@n2fbsdtmp:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64


=== console output after commit 886653492945 ===
Starting dhclient.
wlan0: no link ....iwlwifi0: lkpi_sta_scan_to_auth:1388: lvif 0xfffffe00be9ef000 vap 0xfffffe00be9ef010 iv_bss 0xfffffe00bed00000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00becf8000 lsta 0xfffff80026f3f800
iwlwifi0: lkpi_iv_newstate: error 95 during state transition 2 (AUTH) -> 2 (AUTH)
.iwlwifi0: Not associated and the session protection is over already...
iwlwifi0: linuxkpi_ieee80211_connection_loss: vif 0xfffffe00be9efe80 vap 0xfffffe00be9ef010 state AUTH
......... giving up
/etc/rc.d/dhclient: WARNING: failed to start dhclient
wlan0: link state changed to UP

root@n2fbsdtmp:~ # uname -a
FreeBSD n2fbsdtmp 15.0-CURRENT FreeBSD 15.0-CURRENT #8 n270683-886653492945: Fri Jun 14 07:01:23 EDT 2024     root@n2fbsdtmp:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
Comment 80 commit-hook freebsd_committer freebsd_triage 2024-06-14 18:43:21 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=07da0ad63aa52999c0a381440104122a770decb5

commit 07da0ad63aa52999c0a381440104122a770decb5
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2024-06-05 22:54:36 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-06-14 14:55:16 +0000

    LinuxKPI: 802.11: make sure we can send DISASSOC or DEAUTH frames

    The "Invalid TXQ" error from iwlwifi seems to be triggered by a
    frame being sent for a sta which is no longer known to the driver/fw.

    While we make sure to trigger the sending of the frame in net80211
    early enough (by calling (*iv_newstate)() early on rather than at
    the end), TX in LinuxKPI is run in a deferred task.  When we drop the
    net80211 ic lock again and re-acquire the LHW lock the packet may not
    yet have made it to the driver.
    Work around this between the (ic and lhw) locks by making sure
    (a) no new packets get queued after we return from (*iv_newstate)(),
    and (b) the TX task has run or gets cancelled and we manually push
    any remaining packets out (or let lsta_free() clean them up).
    The disabled packet queuing now also needs to be re-enabled in
    scan_to_auth() in case an lsta is staying in service or gets re-used.

    Also make sure that any following lkpi_wake_tx_queues() calls no
    longer ignore queues which have not seen a prior dequeue.
    This former workaround "feature" (ltxq->seen_dequeue) should be
    fully garbage collected in a later change on its own.

    Sponsored by:   The FreeBSD Foundation
    PR:             274382
    Tested by:      emaste, lwhsu, thj, rkoberman at gmail.com
    Accepted by:    adrian
    Differential Revision: https://reviews.freebsd.org/D45508

    (cherry picked from commit 886653492945f7e945eb9bdaf5bc2ae26df96236)

 sys/compat/linuxkpi/common/src/linux_80211.c | 95 +++++++++++++++++++++++++---
 1 file changed, 86 insertions(+), 9 deletions(-)
Comment 81 commit-hook freebsd_committer freebsd_triage 2024-06-14 18:43:27 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=b5e3a31ac050dd376d19682fe0e595692ec3379f

commit b5e3a31ac050dd376d19682fe0e595692ec3379f
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2024-02-16 21:14:02 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-06-14 14:55:16 +0000

    LinuxKPI: 802.11: disable early queue wakeup workaround; better classify

    We used to call lkpi_wake_tx_queues() for all queues early on in order
    to make sure they are ready.  Turns out whatever logic in iwlwifi
    startup is working correctly these days under LinuxKPI so this is no
    longer needed.  As the comment indicated the "workaround" did not
    always work.  Disable it for now.

    The second part of the improvement is to properly classify Non-QOS
    non-data frames (mgmt and ctl frames for now, unless we have to be
    even more careful in the future about certain sub-types) as TID 7/VO.
    Contrary to net80211 we must not promote PAE frames.

    PR:             274382
    Reviewed by:    cc
    Differential Revision: https://reviews.freebsd.org/D43944

    (cherry picked from commit 1665ef979d3adab05d178480077674ab5978e4f0)

 sys/compat/linuxkpi/common/src/linux_80211.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)
Comment 82 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-06-14 20:03:35 UTC
We believe this is fixed in main, stable/13 and stable/14.

If you still see the warning from the driver (still there for a few more weeks) and you run a release  before (not including) 14.2 or 13.4 then there is no need to report it anymore.
If you have a chance in these cases to try a stable branch after the commits it would be great.

Thanks to everyone who reported the problem, provided debug information or tested the change(s).
Comment 83 Sergey A. Osokin freebsd_committer freebsd_triage 2024-09-15 17:23:10 UTC
$ freebsd-version
14.1-RELEASE-p4
$ dmesg | grep report
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe015572d4c8 skb 0xfffff80035b83000 { len 30 } info 0xfffffe00e1253cd8 sta 0xfffff80035ecc080 (if you see this please report to PR 274382)
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe015572d4c8 skb 0xfffff80035b83000 { len 30 } info 0xfffffe00e1253cd8 sta 0xfffff80035ecc080 (if you see this please report to PR 274382)
Comment 84 Sergey A. Osokin freebsd_committer freebsd_triage 2024-09-16 13:27:22 UTC
Re-open as it's reported on 14.1-p4.
Comment 85 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-09-16 15:09:43 UTC
Sorry but 14.1-RC1 is long before the fix and not supported anymore.
Please see comment 82 [ https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=274382#c82 ] for when this is expected to be gone.

On stable/14 you need 2ab1b827d49f473e30a91382d2ce03e268af45c5 or later.
Comment 86 Sergey A. Osokin freebsd_committer freebsd_triage 2024-09-16 17:10:18 UTC
This is about 14.1-p4 (the recent stable security release), not 14.1-RC1.

Re-open the case.
Comment 87 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-09-16 20:41:43 UTC
(In reply to Sergey A. Osokin from comment #86)

Pardon, my fault about RC1.

But the fact remains, the bug was fixed in main, and merged to stable/14 and stable/13 and that is when we close PRs as FIXED.

What is it you want me to do more about it?
Comment 88 Padukajorat 2024-09-18 18:45:14 UTC
I am getting below messages while system boots.
'''
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe00955134c8 skb 0xfffff8002d716800 { len 30 } info 0xfffffe008277dcd8 sta 0xfffff80007da1880 (if you see this please report to PR 274382)
''''
14.1-RELEASE-p4
Intel(R) Wireless-AC 9560 160MHz
'''
Comment 89 Sergey A. Osokin freebsd_committer freebsd_triage 2024-09-18 20:11:02 UTC
(In reply to Bjoern A. Zeeb from comment #87)

I'm looking for a way how to fix the issue, probably create an errata for the FreeBSD 14.1-RELEASE.

I also found that the wlan0 network interface (based on iwlwifi) stops working after suspend/resume.  There are a couple of screens of diagnostic messages in the /var/log/messages.  Should I report those messages here or it's better to create a new case?

Thank you.
Comment 90 Sergey A. Osokin freebsd_committer freebsd_triage 2024-10-16 15:59:01 UTC
(In reply to Bjoern A. Zeeb from comment #85)

I'm talking about 14.1-p5, which is latest security release for 14.1, not sure why are you switching to 14.1-RC.

In case that's the bug for release, it'd be good to create an ERRATA, like we usually do for any other bugs in the release.