Bug 274382 - iwlwifi Invalid TXQ id
Summary: iwlwifi Invalid TXQ id
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: wireless (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Some People
Assignee: Bjoern A. Zeeb
URL:
Keywords: tracking
Depends on:
Blocks: iwlwifi
  Show dependency treegraph
 
Reported: 2023-10-09 19:28 UTC by Bjoern A. Zeeb
Modified: 2024-05-12 23:12 UTC (History)
12 users (show)

See Also:


Attachments
dmesg including wifi panic (12.36 KB, text/plain)
2023-12-12 20:07 UTC, Jeremy Beker
no flags Details
Second dmesg with panic at end, different hardware (13.80 KB, text/plain)
2023-12-14 18:12 UTC, Jeremy Beker
no flags Details
iwl_mvm_tx_mpdu log messages (72.22 KB, text/plain)
2024-02-14 15:38 UTC, Cheng Cui
no flags Details
repro after applying D43879 on top of D43389, D43725, D43753 (6.87 KB, text/plain)
2024-02-14 16:50 UTC, Cheng Cui
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Bjoern A. Zeeb freebsd_committer freebsd_triage 2023-10-09 19:28:42 UTC
iwlwifi PR to collect all the reports of "invalid TXQ id".

If you get here please:

(0) read through the reports and see if your version was already covered.

(1) add PCI ID information and firmware version of your card (pciconf -lv and dmesg will tell you).

(2) please report any (wifi related) lines leading up to the warning message from your kernel message buffer (dmesg).
Comment 1 commit-hook freebsd_committer freebsd_triage 2023-10-09 19:50:59 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ad134f8ab768e3b1d2db203f1cc963bdf0eea857

commit ad134f8ab768e3b1d2db203f1cc963bdf0eea857
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2023-10-09 19:47:39 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2023-10-09 19:50:02 +0000

    iwlwifi: re-enable "Invalid TXQ id" logging

    Various reports recently hit the "Invalid TXQ id" in iwlwifi again.
    Unconditionally enable logging and add a note to report to a specific
    PR in the log message for now.
    Along with 018d93ece16b this will hopefully help us to understand what
    is going on.

    Sponsored by:   The FreeBSD Foundation
    PR:     274382

 sys/contrib/dev/iwlwifi/mvm/tx.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
Comment 2 Bjoern A. Zeeb freebsd_committer freebsd_triage 2023-10-19 20:09:37 UTC
There is one report in
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271979#c24
Comment 3 Cheng Cui freebsd_committer freebsd_triage 2023-11-09 19:34:57 UTC
Hit this on a first touch of enabling 'Wi-Fi 6 AX200' on my mini PC.

root@n1_iwl_vm:~ # service netif restart
Nov  9 10:21:34 n1_iwl_vm dhclient[1370]: My address (192.168.1.154) was deleted, dhclient exiting
Stopping dhclient.
Waiting for PIDS: 1448.
Stopping wpa_supplicant.
Waiting for PIDS: 1330.
Stopping Network: lo0 vtnet0 wlan0.
lo0: flags=8048<LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
vtnet0: flags=1008802<BROADCAST,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
        options=80028<VLAN_MTU,JUMBO_MTU,LINKSTATE>
        ether 58:9c:fc:0a:c9:c4
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
wlan0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=0
        ether e0:2e:0b:92:e5:82
        groups: wlan
        ssid "" channel 149 (5745 MHz 11a)
        regdomain FCC country US authmode OPEN privacy OFF txpower 23 bmiss 7
        mcastrate 6 mgmtrate 6 scanvalid 60 wme
        parent interface: iwlwifi0
        media: IEEE 802.11 Wireless Ethernet autoselect (autoselect)
        status: no carrier
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
Destroyed wlan(4) interfaces: wlan0.
Created wlan(4) interfaces: wlan0.
Starting wpa_supplicant.
Nov  9 10:21:35 n1_iwl_vm wpa_supplicant[1836]: ioctl[SIOCS80211, op=20, val=0, arg_len=7]: Invalid argument
Starting dhclient.
wlan0: no link ...Nov  9 10:21:35 n1_iwl_vm syslogd: last message repeated 1 times
Nov  9 10:21:35 n1_iwl_vm wpa_supplicant[1837]: ioctl[SIOCS80211, op=103, val=0, arg_len=128]: Operation now in progress
..iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe00b1258408 skb 0xfffff800aa6d6800 { len 30 } info 0xfffffe00745dcce8 sta 0xfffff80007afd880 (if you see this please report to PR 274382)
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe00b1258408 skb 0xfffff800aa6d6800 { len 30 } info 0xfffffe00745dcce8 sta 0xfffff80007afd880 (if you see this please report to PR 274382)
panic: lkpi_sta_auth_to_scan: lsta 0xfffff80007cbe000 state not NONE: 0, nstate 1 arg 1

cpuid = 6
time = 1699543297
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00b0eb5b70
vpanic() at vpanic+0x132/frame 0xfffffe00b0eb5ca0
panic() at panic+0x43/frame 0xfffffe00b0eb5d00
lkpi_sta_auth_to_scan() at lkpi_sta_auth_to_scan+0x2c8/frame 0xfffffe00b0eb5d80
lkpi_iv_newstate() at lkpi_iv_newstate+0x253/frame 0xfffffe00b0eb5df0
ieee80211_newstate_cb() at ieee80211_newstate_cb+0x1e7/frame 0xfffffe00b0eb5e40
taskqueue_run_locked() at taskqueue_run_locked+0xab/frame 0xfffffe00b0eb5ec0
taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame 0xfffffe00b0eb5ef0
fork_exit() at fork_exit+0x82/frame 0xfffffe00b0eb5f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00b0eb5f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 100167 ]
Stopped at      kdb_enter+0x32: movq    $0,0xe2aee3(%rip)
db>
Comment 4 Bjoern A. Zeeb freebsd_committer freebsd_triage 2023-11-09 19:44:38 UTC
(In reply to Cheng Cui from comment #3)

Which version of FreeBSD?
Given you get a panic on the state machine I assume it is main?

I would have expected something else being logged.  If you can reliably reproduce this, let me know and I'll send you a patch offline to see if we can at least try to catch the latter (panic cause).
Comment 5 Cheng Cui freebsd_committer freebsd_triage 2023-11-09 19:59:51 UTC
(In reply to Bjoern A. Zeeb from comment #4)
cc@n1_iwl_vm:~ % uname -a
FreeBSD n1_iwl_vm 15.0-CURRENT FreeBSD 15.0-CURRENT #0 main-f7d16a627: Tue Nov  7 17:31:17 EST 2023     cc@n1_iwl_vm:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
cc@n1_iwl_vm:~ %

It looks I am reporting two issues in this thread:
1. (if you see this please report to PR 274382)
2. panic: lkpi_sta_auth_to_scan: lsta 0xfffff80007cbe000 state not NONE: 0, nstate 1 arg 1

I can easily reproduce the #2 panic. Please send me the patch offline and I can try to dig deeper. thanks
Comment 6 Cheng Cui freebsd_committer freebsd_triage 2023-11-09 21:57:04 UTC
(In reply to Cheng Cui from comment #5)

I found the master bug (Bug 271979) is tracking this panic issue.
panic: lkpi_sta_auth_to_scan: lsta 0xfffff80007756800 state not NONE: 0, nstate 1 arg 1


I will update my traces there.
Comment 7 Bjoern A. Zeeb freebsd_committer freebsd_triage 2023-11-09 23:46:15 UTC
(In reply to Cheng Cui from comment #6)

See comment https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271979#c29 for a possible explanation.
Comment 8 Alexander Motin freebsd_committer freebsd_triage 2023-11-10 14:39:46 UTC
I've updated my laptop to yesterday's main and lost my WiFi that worked before.  dmesg output references this PR.  Here are dmesg and pciconf outputs: https://people.freebsd.org/~mav/mav.log .  Please let me know if I can provide or test something.  Thanks.
Comment 9 commit-hook freebsd_committer freebsd_triage 2023-11-29 16:39:42 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=3cd4af3825b564a6bf249b0f8d2e5dd1368ca49a

commit 3cd4af3825b564a6bf249b0f8d2e5dd1368ca49a
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2023-10-09 19:47:39 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2023-11-29 16:36:11 +0000

    iwlwifi: re-enable "Invalid TXQ id" logging

    Various reports recently hit the "Invalid TXQ id" in iwlwifi again.
    Unconditionally enable logging and add a note to report to a specific
    PR in the log message for now.
    Along with 018d93ece16b this will hopefully help us to understand what
    is going on.

    Sponsored by:   The FreeBSD Foundation
    PR:     274382

    (cherry picked from commit ad134f8ab768e3b1d2db203f1cc963bdf0eea857)

 sys/contrib/dev/iwlwifi/mvm/tx.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
Comment 10 commit-hook freebsd_committer freebsd_triage 2023-11-30 00:39:13 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=d27f3b8c7e6486ed0750286d7ec65e61935ce409

commit d27f3b8c7e6486ed0750286d7ec65e61935ce409
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2023-10-09 19:47:39 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2023-11-30 00:36:57 +0000

    iwlwifi: re-enable "Invalid TXQ id" logging

    Various reports recently hit the "Invalid TXQ id" in iwlwifi again.
    Unconditionally enable logging and add a note to report to a specific
    PR in the log message for now.
    Along with 018d93ece16b this will hopefully help us to understand what
    is going on.

    Sponsored by:   The FreeBSD Foundation
    PR:     274382

    (cherry picked from commit ad134f8ab768e3b1d2db203f1cc963bdf0eea857)

 sys/contrib/dev/iwlwifi/mvm/tx.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
Comment 11 Jeremy Beker 2023-12-12 20:07:33 UTC
Created attachment 247013 [details]
dmesg including wifi panic
Comment 12 Jeremy Beker 2023-12-12 20:08:26 UTC
Just the panic from the previous attachment. (sorry, didn't realize it would add as two comments)

wlan0: ieee80211_new_state_locked: pending SCAN -> AUTH transition lost
Invalid TXQ id
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe01d5b9b448 skb 0xfffff8000292d800 { len 30 } info 0xfffffe0118462ce8 sta 0xfffff800545e2880 (if you see this please report to PR 274382)
panic: lkpi_sta_auth_to_scan: lsta 0xfffff800547f4800 state not NONE: 0, nstate 1 arg 1

cpuid = 3
time = 1702411289
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe01d54d8b70
vpanic() at vpanic+0x132/frame 0xfffffe01d54d8ca0
panic() at panic+0x43/frame 0xfffffe01d54d8d00
lkpi_sta_auth_to_scan() at lkpi_sta_auth_to_scan+0x2c8/frame 0xfffffe01d54d8d80
lkpi_iv_newstate() at lkpi_iv_newstate+0x253/frame 0xfffffe01d54d8df0
ieee80211_newstate_cb() at ieee80211_newstate_cb+0x1e7/frame 0xfffffe01d54d8e40
taskqueue_run_locked() at taskqueue_run_locked+0xab/frame 0xfffffe01d54d8ec0
taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame 0xfffffe01d54d8ef0
fork_exit() at fork_exit+0x82/frame 0xfffffe01d54d8f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe01d54d8f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
Comment 13 Bjoern A. Zeeb freebsd_committer freebsd_triage 2023-12-12 22:56:32 UTC
(In reply to Jeremy Beker from comment #12)

The panic is a duplicate of 271979 I believe.
But this one is a good order of events to explain the problem of this PR it seems.

Thanks a lot for reporting it!  I'll get back here hopefully before the holidays.
Comment 14 Jeremy Beker 2023-12-14 18:12:55 UTC
Created attachment 247048 [details]
Second dmesg with panic at end, different hardware

Here is another panic from a different Dell laptop. Looks similar but I figured more data is better.

Thanks for all your work on this! Have a great holiday!
Comment 15 Hiroki Sato freebsd_committer freebsd_triage 2024-01-07 17:02:47 UTC
I got an "Invalid TXQ id" message on my machine running stable/14 as of Jan 7.  It caused no panic, but I could get no association after the driver was loaded. The output of uname+pciconf+dmesg is the following:

https://people.freebsd.org/~hrs/lenovo_x1_nano_gen1-iwlwifi-20240108.txt

The above log was captured with the driver loaded automatically by devd(8).  When I disabled loading the driver by devmatch_blocklist="if_iwlwifi" in rc.conf and tried to load it manually after entering the multi-user mode, it worked.  "kldload", "ifconfig wlan0 create", and "sh /etc/rc.d/wpa_supplicant start wlan0."  The log was the following:

https://people.freebsd.org/~hrs/lenovo_x1_nano_gen1-iwlwifi-20240108-kldload.txt

The "no buffer space" error reported by dhclient was not persistent in this case.  Eventually, DHCP packets are successfully exchanged.
Comment 16 rkoberman 2024-01-07 21:06:21 UTC
Just to update the status of my recent testing (at the request of bz@), I have determined that the error seems tied to some interaction between in the lkpi. If loading either i915drm or if_iwlwifi is delayed in loading automatically during boot and loaded from the command line, the problem has not occurred in multiple system boots.
x". 

If I manually load i915kms, all I need to do is kldload it and "startx". If I load if_iwlwifi, I need to "service netif start wlan0" followed by "service local_unbound start". All other network pieces (dhclient and wpa_supplicant) are started by netif.

Normally, i915kms is loaded from the kld_list in /etc/rc.conf. It is "cpuctl acpi_ibm acpi_video i915kms filemon". I know that the modules are loaded in their order in the list, but I doubt that it is in any way significant.
Comment 17 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-03 21:38:16 UTC
Tracking myself:

iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe00b09dc448 skb 0xfffff80001b59000 { len 30 } info 0xfffffe000edf5ce8 sta 0xfffff8002320f080 (if you see this please report to PR 274382)
[1357.633531] Invalid TXQ id
iwlwifi0: lkpi_sta_a_to_a:1531: lvif 0xfffffe00b0d7b000 vap 0xfffffe00b0d7b010 iv_bss 0xfffffe00b1e20000 lvif_bss 0xfffff8002320f000 lvif_bss->ni 0xfffffe00b1176000 synched 0
iwlwifi0: lkpi_iv_newstate: error 95 during state transition 2 (AUTH) -> 2 (AUTH)

        
Fatal trap 9: general protection fault while in kernel mode
cpuid = 1; apic id = 01
instruction pointer     = 0x20:0xffffffff80cf7991
stack pointer           = 0x28:0xfffffe00909efd00
frame pointer           = 0x28:0xfffffe00909efd10
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (iwlwifi0 net80211 t)
rdi: fffffe00b1176000 rsi: fffff80001b5caf0 rdx: 0000000000000006
rcx: fffffe008ca59c80  r8: fffff80164d36b78  r9: 000000002320f000
rax: deadc0dedeadc0de rbx: fffffe00b1176000 rbp: fffffe00909efd10
r10: 0000000000000000 r11: 0000000000010000 r12: fffff8002320f000
r13: fffffe00b0d7bdc0 r14: deadc0dedeadc0de r15: fffffe00b09dc380
trap number             = 9
panic: general protection fault
cpuid = 1
time = 1706993398
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00909efa40
vpanic() at vpanic+0x135/frame 0xfffffe00909efb70
panic() at panic+0x43/frame 0xfffffe00909efbd0
trap_fatal() at trap_fatal+0x40f/frame 0xfffffe00909efc30
calltrap() at calltrap+0x8/frame 0xfffffe00909efc30
--- trap 0x9, rip = 0xffffffff80cf7991, rsp = 0xfffffe00909efd00, rbp = 0xfffffe00909efd10 ---
node_free() at node_free+0x11/frame 0xfffffe00909efd10
lkpi_sta_auth_to_scan() at lkpi_sta_auth_to_scan+0x27f/frame 0xfffffe00909efd80
lkpi_iv_newstate() at lkpi_iv_newstate+0x253/frame 0xfffffe00909efdf0
ieee80211_newstate_cb() at ieee80211_newstate_cb+0x2a4/frame 0xfffffe00909efe40
taskqueue_run_locked() at taskqueue_run_locked+0xab/frame 0xfffffe00909efec0
taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame 0xfffffe00909efef0
fork_exit() at fork_exit+0x82/frame 0xfffffe00909eff30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00909eff30
--- trap 0xafafafaf, rip = 0xafafafafafafafaf, rsp = 0xafafafafafafafaf, rbp = 0xafafafafafafafaf ---
KDB: enter: panic
[ thread pid 0 tid 100134 ]
Stopped at      kdb_enter+0x33: movq    $0,0xe3a3e2(%rip)


..
        iv_nstate 0x1 SCAN iv_nstate_b 4 iv_nstate_n 0
         [0] iv_nstates 0x2 AUTH _task 0xfffffe00b0d7b360 _args 192
         [1] iv_nstates 0x2 AUTH _task 0xfffffe00b0d7b380 _args 192
         [2] iv_nstates 0x2 AUTH _task 0xfffffe00b0d7b3a0 _args 192
         [3] iv_nstates 0x1 SCAN _task 0xfffffe00b0d7b3c0 _args 1
         [4] iv_nstates 0x2 AUTH _task 0xfffffe00b0d7b3e0 _args 192
         [5] iv_nstates 0 INIT _task 0xfffffe00b0d7b400 _args 0
         [6] iv_nstates 0 INIT _task 0xfffffe00b0d7b420 _args 3
         [7] iv_nstates 0x1 SCAN _task 0xfffffe00b0d7b440 _args 0
..

Note the deadc0dedeadc0de in rax;  from the log it looks like we did a bss swap;  trying to send on auth still.  Going A-A multiple times.  The txq is no longer valid (so likely was the sta.  We try to free our local cache and it is gone already in auth_to_scan.  Need to find that missing ni reference (possibly on the m->m_pkthdr.rcvif or our pkthdr cache?
Comment 18 rkoberman 2024-02-03 22:04:22 UTC
Just to update to my current status, since my update to head on Jan 13 I have not had a single incident of "Invalid TXQ id". Possibly related, drm-61-kmod no longer fails to load firmware on my Alder Lake laptop and iwlwifi had been running almost flawlessly, except for an odd outage of a few seconds on two occasions when the interface lost its association with the AP.
Comment 19 Ed Maste freebsd_committer freebsd_triage 2024-02-09 18:51:08 UTC
this reproduces for me in my WIP branch which is main from Jan 31 + wulf@'s drm6.6 branch + linuxkpi updates

--
iwlwifi0: WRT: Invalid buffer destination
iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
iwlwifi0: WFPM_AUTH_KEY_0: 0x90
iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
<6>wlan0: Ethernet address: f4:26:79:6a:7d:7c
<6>lo0: link state changed to UP
<118>Feb  9 13:14:06 framework wpa_supplicant[1795]: ioctl[SIOCS80211, op=20, val=0, arg_len=7]: Invalid argument
<118>Feb  9 13:14:06 framework syslogd: last message repeated 1 times
<118>Feb  9 13:14:06 framework wpa_supplicant[1796]: ioctl[SIOCS80211, op=103, val=0, arg_len=128]: Operation now in progress
<4>Invalid TXQ id
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe0142b96448 skb 0xfffff8000b4e3000 { len 30 } info 0xfffffe00e4e5dce0 sta 0xfffff80029024880 (if you see this please report to PR 274382)
panic: lkpi_sta_auth_to_scan: lsta 0xfffff8000b6c0800 state not NONE: 0, nstate 1 arg 1
cpuid = 7
time = 1707502448
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0141a0eb80
vpanic() at vpanic+0x135/frame 0xfffffe0141a0ecb0
panic() at panic+0x43/frame 0xfffffe0141a0ed10
lkpi_sta_auth_to_scan() at lkpi_sta_auth_to_scan+0x259/frame 0xfffffe0141a0ed80
lkpi_iv_newstate() at lkpi_iv_newstate+0x256/frame 0xfffffe0141a0edf0
ieee80211_newstate_cb() at ieee80211_newstate_cb+0x235/frame 0xfffffe0141a0ee40
taskqueue_run_locked() at taskqueue_run_locked+0x1c7/frame 0xfffffe0141a0eec0
taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame 0xfffffe0141a0eef0
fork_exit() at fork_exit+0x87/frame 0xfffffe0141a0ef30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0141a0ef30
--- trap 0xdf61d4b9, rip = 0x390b869d14fc8eac, rsp = 0x48765a79c3d4e72f, rbp = 0xd2dbe161c7e7a77e ---
KDB: enter: panic
Comment 20 Cheng Cui freebsd_committer freebsd_triage 2024-02-09 19:50:44 UTC
(In reply to Ed Maste from comment #19)

Hi Ed,

You can try my workaround mentioned in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276562#c9, to bypass this "service netif restart" panic. Hope it will be helpful.
Comment 21 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-09 20:44:55 UTC
(In reply to Ed Maste from comment #19)

Can you try the patches pending in Phabricator?  D43389, D43725 and D43753?  I am waiting on review...
Comment 22 Cheng Cui freebsd_committer freebsd_triage 2024-02-09 20:58:00 UTC
(In reply to Bjoern A. Zeeb from comment #21)
I am about to send the same recommendation of applying the combined D43389, D43725 and D43753.

I have been testing it for a few hours now, so far so good.
Comment 23 Ed Maste freebsd_committer freebsd_triage 2024-02-09 21:13:37 UTC
I'll grab the patches from Phab, or grab from a public git repo, and try later today.
Comment 24 Ed Maste freebsd_committer freebsd_triage 2024-02-10 04:27:07 UTC
With the three patches applied iwlwifi worked fine for several hours.

dmesg snippet:

iwlwifi0: <iwlwifi> mem 0x7a200000-0x7a203fff at device 0.0 on pci2
iwlwifi0: Detected crf-id 0x400410, cnv-id 0x400410 wfpm id 0x80000000
iwlwifi0: PCI dev 2725/0024, rev=0x420, rfid=0x10d000
iwlwifi0: successfully loaded firmware image 'iwlwifi-ty-a0-gf-a0-83.ucode'
iwlwifi0: api flags index 2 larger than supported by driver
iwlwifi0: TLV_FW_FSEQ_VERSION: FSEQ Version: 0.0.2.41
iwlwifi0: loaded firmware version 83.e8f84e98.0 ty-a0-gf-a0-83.ucode op_mode iwlmvm
iwlwifi0: Detected Intel(R) Wi-Fi 6 AX210 160MHz, REV=0x420
iwlwifi0: WRT: Invalid buffer destination
iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
iwlwifi0: WFPM_AUTH_KEY_0: 0x90
iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
iwlwifi0: successfully loaded firmware image 'iwlwifi-ty-a0-gf-a0.pnvm'
iwlwifi0: loaded PNVM version 181407b3
iwlwifi0: Detected RF GF, rfid=0x10d000
iwlwifi0: base HW address: f4:26:79:6a:7d:7c
iwlwifi0: WRT: Invalid buffer destination
iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
iwlwifi0: WFPM_AUTH_KEY_0: 0x90
iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
wlan0: Ethernet address: xx:xx:xx:xx:xx:xx
Invalid TXQ id
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe0142af3448 skb 0xfffff80001e32000 { len 30 } info 0xfffffe00e4e5dce0 sta 0xfffff80001d52080 (if you see this please report to PR 274382)
Comment 25 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-10 14:39:07 UTC
(In reply to Ed Maste from comment #24)

Did you have an accompanying log from wpa_supplicant of that time?
Was there a re-assoc event or something?
Comment 26 Ed Maste freebsd_committer freebsd_triage 2024-02-10 18:43:37 UTC
(In reply to Bjoern A. Zeeb from comment #25)
For reference this is on my Framework laptop (i7-1165G7 Tiger Lake CPU, 8086:2725 "Wi-Fi 6 AX210/AX211/AX411 160MHz"). It occurs on on startup, with wpa_supplicant started from rc.

I tried `service wpa_supplicant stop` and then running wpa_supplicant from the cmdline without -B. I was not able to reproduce the message again in this context. The only notable output from this is wpa_supplicant reporting:
ioctl[SIOCS80211, op=20, val=0, arg_len=7]: Invalid argument
ioctl[SIOCS80211, op=20, val=0, arg_len=7]: Invalid argument
Comment 27 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-10 18:48:29 UTC
(In reply to Ed Maste from comment #26)

Thanks for the update.  I'll try to add a lot more logging to this one once other things are in.  It seems we are trying to send an ASSOC REQ with a sta which no longer is valid.  This is likely just another side effect of net80211 assumptions from 10+ years ago along with deferred contexts.


For the latter, wpa_supplicant has been logging these for years on all WNICs.  We should probably just clear the noise at this point.
Comment 28 rkoberman 2024-02-10 21:13:45 UTC
(In reply to Ed Maste from comment #26)
This was a problem I was seeing with the lkpi prior to the update from drm-515-kmod to drm-61-kmod. I see nothing likely to be involved between the two. I was seeing this problem while running n266789-06c083f79dd7 and drm-515-kmod from Dec. 22 and n267556-69748e62e82a drm-61-kmod from Jan. 15. Aside from fixing the serious limitations of running my Alder Lake graphics on 515, it also fixed the invalid TXQ and crashing issues I was seeing prior to the update to 61.

I had previously noted that the stack backtraces from the many panics tied to the invalid TXQ always included drm items near the top of the stack.

Have you tried drm-61-kmod? It looks like using drm-66 (I assume from git as it is not in ports yet) has likely led to the return to the issues I had with drm-515.

I have looked at commits to iwlwifi and the lkpi in the relevant time period and I see nothing that looks likely in either place. I am now running a system from Feb. 6 with no issues with either drm-61 or iwlwifi.
Comment 29 Cheng Cui freebsd_committer freebsd_triage 2024-02-14 14:57:23 UTC
The patches from D43389, D43725, D43753 in combination seem to work and
prevented the crash "panic: lkpi_sta_auth_to_assoc" from "service netif
restart" or simply booting the kernel. The patches does not prevent "please
report to PR 274382" from showing up.

However, I observed this strange difference among my three freebsd VMs.

vm1: Feb 12 main kernel, keeps hitting the crash without the patches
vm2: Feb 12 main kernel, and never hit the crash
vm3: Feb 14 main kernel, and never hit the crash, on a second physical host

The only different between vm1 and vm2 (they are on the same physical host) is
that vm1 was created last November, 2023 with World rebuilt that time; but vm2
was created last week using the Feb 8 freebsd-current ISO image without
rebuilding the World. The vm3 was also created in Feb without rebuilding the
World.

But I can't explain why the old World but new kernel will cause the problem.
And I suspect that is the drm-515-kmod (as rkoberman@gmail.com mentioned) the
culprit?

I remember Ed told me before that the ports tree has drm-kmod, but could
somebody help explain my suspicion, before I rebuild the drm-kmod in vm1 to
verify myself?
Comment 30 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-14 15:29:08 UTC
(In reply to Cheng Cui from comment #29)

> [..] The patches does not prevent "please report to PR 274382" from showing up.

Can you reliably reproduce it?  There's some more debugging to be added to see in which state we are with the sta.

> However, I observed this strange difference among my three freebsd VMs.
> 
> vm1: Feb 12 main kernel, keeps hitting the crash without the patches
> vm2: Feb 12 main kernel, and never hit the crash
> vm3: Feb 14 main kernel, and never hit the crash, on a second physical host

I think that basically explains the observations from a lot of people incl. the problem which made it hard to track some of this down.  Some people do see the problems while others don't.  Some people see if depending on timing and (radio) environment and others don't.  This is why some people have been reporting "very happy" recently and others still saw crashes.  It's not a deterministic setting and depends on many different things from user space to radio.  This is why the reproducer based on your observations last year was so important.
Comment 31 Cheng Cui freebsd_committer freebsd_triage 2024-02-14 15:38:37 UTC
Created attachment 248461 [details]
iwl_mvm_tx_mpdu log messages
Comment 32 Cheng Cui freebsd_committer freebsd_triage 2024-02-14 15:39:41 UTC
(In reply to Bjoern A. Zeeb from comment #30)

It is reliably reproduced. I have attached the "iwl_mvm_tx_mpdu log messages" from console after running the below script. Please let me know if there is any more log knob needed to setup.

root@n1iwlvm:~ # cat test.tcsh 
#!/bin/tcsh -f

set cnt = $argv[1]
set i = 0
while ($i < $cnt)
        @ i = $i + 1
        echo "($i)"
        service netif restart && sleep 10
        ifconfig wlan0
        echo "run scp now"
        scp /tmp/data.txt root@n2fbsd:/tmp/
end
root@n1iwlvm:~ #
Comment 33 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-14 15:53:30 UTC
(In reply to Cheng Cui from comment #32)

Can you apply  https://reviews.freebsd.org/D43879

it's not been tested or even compiled so you may have to make it work but you'll get the idea.  It should dump extra sta/lsta/ni and vap/lvif state for us so we can see if this is just another result of the iv_bss being switched under us while we still have packets queued for tx.
Comment 34 Cheng Cui freebsd_committer freebsd_triage 2024-02-14 16:50:59 UTC
Created attachment 248462 [details]
repro after applying D43879 on top of D43389, D43725, D43753
Comment 35 Cheng Cui freebsd_committer freebsd_triage 2024-02-14 16:54:04 UTC
(In reply to Bjoern A. Zeeb from comment #33)

Please find the extra debug info in attached "repro after applying D43879 on top of D43389, D43725, D43753".

Meanwhile, I have an idea to create a new VM but based on old (last year) fbsd-15-current ISO image to see if I can reproduce the story.
Comment 36 Cheng Cui freebsd_committer freebsd_triage 2024-02-15 01:02:29 UTC
(In reply to Cheng Cui from comment #35)

Fortunately, I have kept my World build log and did the same way this time. And
finally I figured out how to reproduce the "panic: lkpi_sta_auth_to_scan" crash.

I created vm4 from the FreeBSD-13.2-RELEASE iso, and rebuilt World/kernel from the Feb 12 src main. Now, the crash happens every time, and I have 5/5 crashes/reboots.

These vm3 and vm4 are on the same physical host and share the same physical WiFi interface.

vm3: installed from FreeBSD-15.0-CURRENT (20240201), has not hit a crash
vm4: installed from FreeBSD-13.2-RELEASE, rebuilt World/kernel from Feb 12 src main, hit a crash every time on booting up wlan0
Comment 37 Ed Maste freebsd_committer freebsd_triage 2024-02-15 01:52:06 UTC
On boot I get at startup (same Framework laptop case as comment #26):

Invalid TXQ id
iwl_mvm_tx_mpdu:1208: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe0142c3a448 skb 0xfffff800029a6000 { len 30 } info 0xfffffe00e4e5dcc0 sta 0xfffff80001d14880 (if you see this please report to PR 274382)
iwlwifi0: iwl_mvm_tx_mpdu:1210: lvif 0xfffffe01424e4000 vap 0xfffffe01424e4010 iv_bss 0xfffffe0143a83000 lvif_bss 0xfffff80001d14800 lvif_bss->ni 0xfffffe0143a7b000 synched 1
wlan0: link state changed to UP
Comment 38 Cheng Cui freebsd_committer freebsd_triage 2024-02-15 16:07:56 UTC
After applying D43879 and D43909 with the latest Feb 15 main kernel, still
seeing the "Invalid TXQ" print from the vm4 I managed to reproduce the story.

root@n2fbsdtmp:~ # uname -a
FreeBSD n2fbsdtmp 15.0-CURRENT FreeBSD 15.0-CURRENT #2 main-n268309-2c5ff9118c1e-dirty: Thu Feb 15 11:00:46 EST 2024     root@n2fbsdtmp:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
root@n2fbsdtmp:~ # 

=== booting print ===

Created wlan(4) interfaces: wlan0.
lo0: link state changed to UP
vtnet0: link state changed to UP
Starting wpa_supplicant.
Starting dhclient.
wlan0: no link ....Invalid TXQ idiwl_mvm_tx_mpdu:1208: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe006dbf9448 skb 0xfffff8000584e800 { len 30 } info 0xfffffe0063b4cce8 sta 0xfffff800231e5880 (if you see this please report to PR 274382)
iwlwifi0: iwl_mvm_tx_mpdu:1210: lvif 0xfffffe00ab4a5000 vap 0xfffffe00ab4a5010 iv_bss 0xfffffe00ab79f000 lvif_bss 0xfffff800231e5800 lvif_bss->ni 0xfffffe00ab574000 synched 1
iwl_mvm_tx_mpdu:1210 lsta 0xfffff800231e5800 ni 0xfffffe00ab574000 sta 0xfffff800231e5880
	txq_task txq len 0 mtx
	kc 0 state 1 added_to_drv 1 in_mgd 1
0xfffffe00ab574000: mac 3e:60:f9:06:d1:9b refcnt 1
	authmode 1 flags 0x20004
	associd 0x0 txpower 100 vlan 0
	txseq 0 rxseq 0 fragno 0 rxfragstamp 0
	rssi 58 noise -96 intval 100 capinfo 0x531
	bssid 3e:60:f9:06:d1:9b essid "" channel 2422:0x480
	inact 2 inact_reload 2 txrate 72
	htcap 19ef htparam 17 htctlchan 3 ht2ndchan 1
	htopmode 2 htstbc 0 htchw 0
	vhtcap 0 freq1 0 freq2 0 vhtbasicmcs 0
.......... giving up
/etc/rc.d/dhclient: WARNING: failed to start dhclient
Comment 39 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-15 19:11:16 UTC
(In reply to Cheng Cui from comment #38)

sta is added to FW and in state IEEE80211_STA_NONE  sta->lsta->ni is 0xfffffe00ab574000

IEEE80211_STA_NONE can be from scan-to-auth, assoc-down, run-init.
Given the fc type/subtype this should be an AUTH frame to it should be scan_to_auth.

However:
vap->iv_bss 0xfffffe00ab79f000 != lvif_bss->ni 0xfffffe00ab574000 but synched is 1.


Can you apply the following change to give us an idea of how the updates to iv_bss go (untested):

--- sys/compat/linuxkpi/common/src/linux_80211.c
+++ sys/compat/linuxkpi/common/src/linux_80211.c
@@ -2531,6 +2531,11 @@ lkpi_iv_update_bss(struct ieee80211vap *vap, struct ieee80211_node *ni)
        lvif = VAP_TO_LVIF(vap);
 
        LKPI_80211_LVIF_LOCK(lvif);
+       ic_printf(vap->iv_ic, "%s:%d: lvif %p vap %p iv_bss %p lvif_bss %p "
+          "lvif_bss->ni %p synched %d, ni %p\n", __func__, __LINE__,
+          lvif, vap, vap->iv_bss, lvif->lvif_bss,
+          (lvif->lvif_bss != NULL) ? lvif->lvif_bss->ni : NULL,
+          lvif->lvif_bss_synched, ni);
        lvif->lvif_bss_synched = false;
        LKPI_80211_LVIF_UNLOCK(lvif);
 

Almost also want to dump vap->iv_state here and in lkpi_sta_dump(), if not the entire history in the latter (taken from ieee80211_ddb.c):

        int i;
        printf("iv_nstate %#x %s iv_nstate_b %d iv_nstate_n %d\n",
            vap->iv_nstate, ieee80211_state_name[vap->iv_nstate], /* historic */
            vap->iv_nstate_b, vap->iv_nstate_n);
        for (i = 0; i < NET80211_IV_NSTATE_NUM; i++) {
                printf("\t [%d] iv_nstates %#x %s _task %p _args %d\n", i,
                    vap->iv_nstates[i], ieee80211_state_name[vap->iv_nstates[i]],
                    &vap->iv_nstate_task[i], vap->iv_nstate_args[i]);
        }
Comment 40 Cheng Cui freebsd_committer freebsd_triage 2024-02-15 20:18:40 UTC
(In reply to Bjoern A. Zeeb from comment #39)

Thanks. With these extra prints. The dmesg after booting shows:
[1.427073] pci0: driver added
[1.632542] wlan0: bpf attached
[1.632744] iwlwifi0: lkpi_iv_update_bss:2535: lvif 0xfffffe00ab4ae000 vap 0xfffffe00ab4ae010 iv_bss 0 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab4b2000
[1.633722] wlan0: bpf attached
[1.633851] wlan0: Ethernet address: e0:2e:0b:8f:77:08
[1.652204] lo0: link state changed to UP
[1.676088] vtnet0: link state changed to UP
[3.027620] iwlwifi0: lkpi_iv_update_bss:2535: lvif 0xfffffe00ab4ae000 vap 0xfffffe00ab4ae010 iv_bss 0xfffffe00ab4b2000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab791000
[3.029964] iwlwifi0: lkpi_iv_update_bss:2535: lvif 0xfffffe00ab4ae000 vap 0xfffffe00ab4ae010 iv_bss 0xfffffe00ab791000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab79b000
[3.032791] Invalid TXQ id
[3.032883] iwl_mvm_tx_mpdu:1208: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe00ab420448 skb 0xfffff8000357e800 { len 30 } info 0xfffffe0063b4cce8 sta 0xfffff800039fe880 (if you see this please report to PR 274382)
[3.034103] iwlwifi0: iwl_mvm_tx_mpdu:1210: lvif 0xfffffe00ab4ae000 vap 0xfffffe00ab4ae010 iv_bss 0xfffffe00ab79b000 lvif_bss 0xfffff800039fe800 lvif_bss->ni 0xfffffe00ab791000 synched 1
[3.035176] iv_nstate 0x2 AUTH iv_nstate_b 3 iv_nstate_n 0
[3.035470] 	 [0] iv_nstates 0x1 SCAN _task 0xfffffe00ab4ae360 _args 0
[3.035834] 	 [1] iv_nstates 0x2 AUTH _task 0xfffffe00ab4ae380 _args 192
[3.036207] 	 [2] iv_nstates 0x2 AUTH _task 0xfffffe00ab4ae3a0 _args 192
[3.036581] 	 [3] iv_nstates 0 INIT _task 0xfffffe00ab4ae3c0 _args 0
[3.036930] 	 [4] iv_nstates 0 INIT _task 0xfffffe00ab4ae3e0 _args 0
[3.037281] 	 [5] iv_nstates 0 INIT _task 0xfffffe00ab4ae400 _args 0
[3.037631] 	 [6] iv_nstates 0 INIT _task 0xfffffe00ab4ae420 _args 0
[3.037983] 	 [7] iv_nstates 0 INIT _task 0xfffffe00ab4ae440 _args 0
[3.038332] iwl_mvm_tx_mpdu:1210 lsta 0xfffff800039fe800 ni 0xfffffe00ab791000 sta 0xfffff800039fe880
[3.038880] 	txq_task txq len 0 mtx
[3.039032] 	kc 0 state 1 added_to_drv 1 in_mgd 1
[3.039268] 0xfffffe00ab791000: mac 82:d2:94:c5:6f:ea refcnt 1
[3.039582] 	authmode 1 flags 0x20004
[3.039746] 	associd 0x0 txpower 100 vlan 0
[3.039945] 	txseq 0 rxseq 0 fragno 0 rxfragstamp 0
[3.040193] 	rssi 32 noise -96 intval 200 capinfo 0x1431
[3.040471] 	bssid 82:d2:94:c5:6f:ea essid "" channel 2437:0x480
[3.040797] 	inact 2 inact_reload 2 txrate 72
[3.041009] 	htcap 9ad htparam 3 htctlchan 6 ht2ndchan 0
[3.041286] 	htopmode 0 htstbc 0 htchw 0
[3.041468] 	vhtcap 338959b2 freq1 0 freq2 0 vhtbasicmcs fffc
[13.132873] iwlwifi0: lkpi_iv_update_bss:2535: lvif 0xfffffe00ab4ae000 vap 0xfffffe00ab4ae010 iv_bss 0xfffffe00ab79b000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab791000
[14.340433] iwlwifi0: lkpi_iv_update_bss:2535: lvif 0xfffffe00ab4ae000 vap 0xfffffe00ab4ae010 iv_bss 0xfffffe00ab791000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab7db000
[14.397601] wlan0: link state changed to UP
root@n2fbsdtmp:~ #
Comment 41 Ed Maste freebsd_committer freebsd_triage 2024-02-16 02:21:37 UTC
dmesg snippit from the Framework laptop which produces one 'Invalid TXQ id' at boot, with the additional debugging:

iwlwifi0: lkpi_iv_update_bss:2574: lvif 0xfffffe01424ae000 vap 0xfffffe01424ae010 iv_bss 0xfffffe0143343000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe0143b40000
iv_nstate 0x1 SCAN iv_nstate_b 1 iv_nstate_n 0
	 [0] iv_nstates 0x1 SCAN _task 0xfffffe01424ae360 _args 0
	 [1] iv_nstates 0 INIT _task 0xfffffe01424ae380 _args 0
	 [2] iv_nstates 0 INIT _task 0xfffffe01424ae3a0 _args 0
	 [3] iv_nstates 0 INIT _task 0xfffffe01424ae3c0 _args 0
	 [4] iv_nstates 0 INIT _task 0xfffffe01424ae3e0 _args 0
	 [5] iv_nstates 0 INIT _task 0xfffffe01424ae400 _args 0
	 [6] iv_nstates 0 INIT _task 0xfffffe01424ae420 _args 0
	 [7] iv_nstates 0 INIT _task 0xfffffe01424ae440 _args 0
iwlwifi0: lkpi_iv_update_bss:2574: lvif 0xfffffe01424ae000 vap 0xfffffe01424ae010 iv_bss 0xfffffe0143b40000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe0143b50000
iv_nstate 0x2 AUTH iv_nstate_b 2 iv_nstate_n 0
	 [0] iv_nstates 0x1 SCAN _task 0xfffffe01424ae360 _args 0
	 [1] iv_nstates 0x2 AUTH _task 0xfffffe01424ae380 _args 192
	 [2] iv_nstates 0 INIT _task 0xfffffe01424ae3a0 _args 0
	 [3] iv_nstates 0 INIT _task 0xfffffe01424ae3c0 _args 0
	 [4] iv_nstates 0 INIT _task 0xfffffe01424ae3e0 _args 0
	 [5] iv_nstates 0 INIT _task 0xfffffe01424ae400 _args 0
	 [6] iv_nstates 0 INIT _task 0xfffffe01424ae420 _args 0
	 [7] iv_nstates 0 INIT _task 0xfffffe01424ae440 _args 0
ieee80211_get_he_iftype_cap_vif:2703: XXX LKPI80211 TODO
ieee80211_sta_recalc_aggregates:2189: XXX LKPI80211 TODO
Invalid TXQ id
iwl_mvm_tx_mpdu:1208: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe0142c31448 skb 0xfffff8000250b000 { len 30 } info 0xfffffe00e4e5dcc0 sta 0xfffff80001d6a880 (if you see this please report to PR 274382)
iwlwifi0: iwl_mvm_tx_mpdu:1210: lvif 0xfffffe01424ae000 vap 0xfffffe01424ae010 iv_bss 0xfffffe0143b50000 lvif_bss 0xfffff80001d6a800 lvif_bss->ni 0xfffffe0143b40000 synched 1
iv_nstate 0x2 AUTH iv_nstate_b 3 iv_nstate_n 0
	 [0] iv_nstates 0x1 SCAN _task 0xfffffe01424ae360 _args 0
	 [1] iv_nstates 0x2 AUTH _task 0xfffffe01424ae380 _args 192
	 [2] iv_nstates 0x2 AUTH _task 0xfffffe01424ae3a0 _args 192
	 [3] iv_nstates 0 INIT _task 0xfffffe01424ae3c0 _args 0
	 [4] iv_nstates 0 INIT _task 0xfffffe01424ae3e0 _args 0
	 [5] iv_nstates 0 INIT _task 0xfffffe01424ae400 _args 0
	 [6] iv_nstates 0 INIT _task 0xfffffe01424ae420 _args 0
	 [7] iv_nstates 0 INIT _task 0xfffffe01424ae440 _args 0
XXX-TODO linuxkpi_ieee80211_iterate_keys:4728: UNIMPLEMENTED
iwlwifi0: lkpi_iv_update_bss:2574: lvif 0xfffffe01424ae000 vap 0xfffffe01424ae010 iv_bss 0xfffffe0143b50000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe0143b58000
iv_nstate 0 INIT iv_nstate_b 6 iv_nstate_n 0
	 [0] iv_nstates 0x1 SCAN _task 0xfffffe01424ae360 _args 0
	 [1] iv_nstates 0x2 AUTH _task 0xfffffe01424ae380 _args 192
	 [2] iv_nstates 0x2 AUTH _task 0xfffffe01424ae3a0 _args 192
	 [3] iv_nstates 0x1 SCAN _task 0xfffffe01424ae3c0 _args 1
	 [4] iv_nstates 0x1 SCAN _task 0xfffffe01424ae3e0 _args 1
	 [5] iv_nstates 0 INIT _task 0xfffffe01424ae400 _args 3
	 [6] iv_nstates 0 INIT _task 0xfffffe01424ae420 _args 0
	 [7] iv_nstates 0 INIT _task 0xfffffe01424ae440 _args 0
iwlwifi0: lkpi_iv_update_bss:2574: lvif 0xfffffe01424ae000 vap 0xfffffe01424ae010 iv_bss 0xfffffe0143b58000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe0143b40000
iv_nstate 0x1 SCAN iv_nstate_b 7 iv_nstate_n 0
	 [0] iv_nstates 0x1 SCAN _task 0xfffffe01424ae360 _args 0
	 [1] iv_nstates 0x2 AUTH _task 0xfffffe01424ae380 _args 192
	 [2] iv_nstates 0x2 AUTH _task 0xfffffe01424ae3a0 _args 192
	 [3] iv_nstates 0x1 SCAN _task 0xfffffe01424ae3c0 _args 1
	 [4] iv_nstates 0x1 SCAN _task 0xfffffe01424ae3e0 _args 1
	 [5] iv_nstates 0 INIT _task 0xfffffe01424ae400 _args 3
	 [6] iv_nstates 0x1 SCAN _task 0xfffffe01424ae420 _args 0
	 [7] iv_nstates 0 INIT _task 0xfffffe01424ae440 _args 0
Comment 42 commit-hook freebsd_committer freebsd_triage 2024-02-16 15:51:26 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=fa4e4257943650c0b5f58c01bb0bdfadea61dfb2

commit fa4e4257943650c0b5f58c01bb0bdfadea61dfb2
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2024-02-14 21:56:48 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-02-16 15:50:11 +0000

    LinuxKPI: 802.11: lsta txq locking cleanup

    Rename the LSTA lock to LSTA_TXQ lock as that is really what it is and
    put down the full set of macros.  Replace the init and destroy with the
    macro invocation rather than direct code.

    Put locking around the txq_ready unset and check. Move the taskq_enqueue
    call under lock to be sure we do not call it anymore after txq_ready
    got unset.

    Leave a comment related to the node reference which is passed into the
    TX path on the recvif mbuf pointer.

    Fixes:          0936c648ad0ee
    PR:             274382 (possibly)
    MFC after:      1 day
    Reviewed by:    emaste
    Differential Revision: https://reviews.freebsd.org/D43909

 sys/compat/linuxkpi/common/src/linux_80211.c | 31 ++++++++++++++++++----------
 sys/compat/linuxkpi/common/src/linux_80211.h | 14 +++++++++++--
 2 files changed, 32 insertions(+), 13 deletions(-)
Comment 43 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-16 21:24:00 UTC
Anyone who's seen this, can you please test:
https://reviews.freebsd.org/D43944
Comment 44 commit-hook freebsd_committer freebsd_triage 2024-02-18 21:11:59 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=035424535b09c57971df2064ad7678e90bd8cb3a

commit 035424535b09c57971df2064ad7678e90bd8cb3a
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2024-02-14 21:56:48 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-02-18 18:31:18 +0000

    LinuxKPI: 802.11: lsta txq locking cleanup

    Rename the LSTA lock to LSTA_TXQ lock as that is really what it is and
    put down the full set of macros.  Replace the init and destroy with the
    macro invocation rather than direct code.

    Put locking around the txq_ready unset and check. Move the taskq_enqueue
    call under lock to be sure we do not call it anymore after txq_ready
    got unset.

    Leave a comment related to the node reference which is passed into the
    TX path on the recvif mbuf pointer.

    Fixes:          0936c648ad0ee
    PR:             274382 (possibly)
    Reviewed by:    emaste
    Differential Revision: https://reviews.freebsd.org/D43909

    (cherry picked from commit fa4e4257943650c0b5f58c01bb0bdfadea61dfb2)

 sys/compat/linuxkpi/common/src/linux_80211.c | 31 ++++++++++++++++++----------
 sys/compat/linuxkpi/common/src/linux_80211.h | 14 +++++++++++--
 2 files changed, 32 insertions(+), 13 deletions(-)
Comment 45 commit-hook freebsd_committer freebsd_triage 2024-02-19 08:09:08 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=763b10806cd4ebdcbd2b6753d4f50ec088dc57fb

commit 763b10806cd4ebdcbd2b6753d4f50ec088dc57fb
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2024-02-14 21:56:48 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-02-19 08:02:02 +0000

    LinuxKPI: 802.11: lsta txq locking cleanup

    Rename the LSTA lock to LSTA_TXQ lock as that is really what it is and
    put down the full set of macros.  Replace the init and destroy with the
    macro invocation rather than direct code.

    Put locking around the txq_ready unset and check. Move the taskq_enqueue
    call under lock to be sure we do not call it anymore after txq_ready
    got unset.

    Leave a comment related to the node reference which is passed into the
    TX path on the recvif mbuf pointer.

    Fixes:          0936c648ad0ee
    PR:             274382 (possibly)
    Reviewed by:    emaste
    Differential Revision: https://reviews.freebsd.org/D43909

    (cherry picked from commit fa4e4257943650c0b5f58c01bb0bdfadea61dfb2)

 sys/compat/linuxkpi/common/src/linux_80211.c | 31 ++++++++++++++++++----------
 sys/compat/linuxkpi/common/src/linux_80211.h | 14 +++++++++++--
 2 files changed, 32 insertions(+), 13 deletions(-)
Comment 46 commit-hook freebsd_committer freebsd_triage 2024-02-19 16:10:45 UTC
A commit in branch releng/13.3 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=401dbf9cfc0cfb2a02355d41bf73e146bd3dfb74

commit 401dbf9cfc0cfb2a02355d41bf73e146bd3dfb74
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2024-02-14 21:56:48 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-02-19 16:09:22 +0000

    LinuxKPI: 802.11: lsta txq locking cleanup

    Rename the LSTA lock to LSTA_TXQ lock as that is really what it is and
    put down the full set of macros.  Replace the init and destroy with the
    macro invocation rather than direct code.

    Put locking around the txq_ready unset and check. Move the taskq_enqueue
    call under lock to be sure we do not call it anymore after txq_ready
    got unset.

    Leave a comment related to the node reference which is passed into the
    TX path on the recvif mbuf pointer.

    Approved by:    re (cperciva)
    Fixes:          0936c648ad0ee
    PR:             274382 (possibly)
    Reviewed by:    emaste
    Differential Revision: https://reviews.freebsd.org/D43909

    (cherry picked from commit fa4e4257943650c0b5f58c01bb0bdfadea61dfb2)
    (cherry picked from commit 763b10806cd4ebdcbd2b6753d4f50ec088dc57fb)

 sys/compat/linuxkpi/common/src/linux_80211.c | 31 ++++++++++++++++++----------
 sys/compat/linuxkpi/common/src/linux_80211.h | 14 +++++++++++--
 2 files changed, 32 insertions(+), 13 deletions(-)
Comment 47 Cheng Cui freebsd_committer freebsd_triage 2024-02-19 17:52:16 UTC
(In reply to Bjoern A. Zeeb from comment #43)

It's better to have a separate patch for the extra print.

With D43944 and D43879 on top of the latest main kernel, the "Invalid TXQ id" still shows up:

root@n2fbsdtmp:~ # uname -a
FreeBSD n2fbsdtmp 15.0-CURRENT FreeBSD 15.0-CURRENT #5 main-n268402-ef75877fc2d9-dirty: Mon Feb 19 12:42:38 EST 2024     root@n2fbsdtmp:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64


=== dmesg snip ===

[1.666880] vtnet0: link state changed to UP
[3.235969] iwlwifi0: lkpi_iv_update_bss:2537: lvif 0xfffffe00ab48d000 vap 0xfffffe00ab48d010 iv_bss 0xfffffe00ab4b8000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab568000
[3.237959] iwlwifi0: lkpi_iv_update_bss:2537: lvif 0xfffffe00ab48d000 vap 0xfffffe00ab48d010 iv_bss 0xfffffe00ab568000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab787000
[3.241222] Invalid TXQ id
[3.241326] iwl_mvm_tx_mpdu:1208: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe006dbfa448 skb 0xfffff80004935800 { len 30 } info 0xfffffe0063b4ccf8 sta 0xfffff800068e5080 (if you see this please report to PR 274382)
[3.242673] iwlwifi0: iwl_mvm_tx_mpdu:1210: lvif 0xfffffe00ab48d000 vap 0xfffffe00ab48d010 iv_bss 0xfffffe00ab787000 lvif_bss 0xfffff800068e5000 lvif_bss->ni 0xfffffe00ab568000 synched 1
[3.243664] iv_nstate 0x2 AUTH iv_nstate_b 3 iv_nstate_n 0
[3.243932] 	 [0] iv_nstates 0x1 SCAN _task 0xfffffe00ab48d360 _args 0
[3.244268] 	 [1] iv_nstates 0x2 AUTH _task 0xfffffe00ab48d380 _args 192
[3.244612] 	 [2] iv_nstates 0x2 AUTH _task 0xfffffe00ab48d3a0 _args 192
[3.244960] 	 [3] iv_nstates 0 INIT _task 0xfffffe00ab48d3c0 _args 0
[3.245283] 	 [4] iv_nstates 0 INIT _task 0xfffffe00ab48d3e0 _args 0
[3.245610] 	 [5] iv_nstates 0 INIT _task 0xfffffe00ab48d400 _args 0
[3.245937] 	 [6] iv_nstates 0 INIT _task 0xfffffe00ab48d420 _args 0
[3.246264] 	 [7] iv_nstates 0 INIT _task 0xfffffe00ab48d440 _args 0
[3.246590] iwl_mvm_tx_mpdu:1210 lsta 0xfffff800068e5000 ni 0xfffffe00ab568000 sta 0xfffff800068e5080
[3.247102] 	txq_task txq len 0 mtx
[3.247243] 	kc 0 state 1 added_to_drv 1 in_mgd 1
[3.247464] 0xfffffe00ab568000: mac 3e:60:f9:06:d1:9b refcnt 1
[3.247766] 	authmode 1 flags 0x20004
[3.247919] 	associd 0x0 txpower 100 vlan 0
[3.248105] 	txseq 0 rxseq 0 fragno 0 rxfragstamp 0
[3.248337] 	rssi 58 noise -96 intval 100 capinfo 0x531
[3.248591] 	bssid 3e:60:f9:06:d1:9b essid "" channel 2422:0x480
[3.248895] 	inact 2 inact_reload 2 txrate 72
[3.249092] 	htcap 19ef htparam 17 htctlchan 3 ht2ndchan 1
[3.249362] 	htopmode 2 htstbc 0 htchw 0
[3.249531] 	vhtcap 0 freq1 0 freq2 0 vhtbasicmcs 0
[3.249774] iwl_mvm_tx_mpdu:1208: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe006dbfa448 skb 0xfffff80004935800 { len 30 } info 0xfffffe0063b4ccf8 sta 0xfffff800068e5080 (if you see this please report to PR 274382)
[3.250894] iwlwifi0: iwl_mvm_tx_mpdu:1210: lvif 0xfffffe00ab48d000 vap 0xfffffe00ab48d010 iv_bss 0xfffffe00ab787000 lvif_bss 0xfffff800068e5000 lvif_bss->ni 0xfffffe00ab568000 synched 1
[3.251886] iv_nstate 0x2 AUTH iv_nstate_b 3 iv_nstate_n 0
[3.252154] 	 [0] iv_nstates 0x1 SCAN _task 0xfffffe00ab48d360 _args 0
[3.252495] 	 [1] iv_nstates 0x2 AUTH _task 0xfffffe00ab48d380 _args 192
[3.252847] 	 [2] iv_nstates 0x2 AUTH _task 0xfffffe00ab48d3a0 _args 192
[3.253198] 	 [3] iv_nstates 0 INIT _task 0xfffffe00ab48d3c0 _args 0
[3.253536] 	 [4] iv_nstates 0 INIT _task 0xfffffe00ab48d3e0 _args 0
[3.253860] 	 [5] iv_nstates 0 INIT _task 0xfffffe00ab48d400 _args 0
[3.254187] 	 [6] iv_nstates 0 INIT _task 0xfffffe00ab48d420 _args 0
[3.254520] 	 [7] iv_nstates 0 INIT _task 0xfffffe00ab48d440 _args 0
[3.254855] iwl_mvm_tx_mpdu:1210 lsta 0xfffff800068e5000 ni 0xfffffe00ab568000 sta 0xfffff800068e5080
[3.255390] 	txq_task txq len 0 mtx
[3.255537] 	kc 0 state 1 added_to_drv 1 in_mgd 1
[3.255763] 0xfffffe00ab568000: mac 3e:60:f9:06:d1:9b refcnt 1
[3.256065] 	authmode 1 flags 0x20004
[3.256223] 	associd 0x0 txpower 100 vlan 0
[3.256417] 	txseq 0 rxseq 0 fragno 0 rxfragstamp 0
[3.256658] 	rssi 58 noise -96 intval 100 capinfo 0x531
[3.256922] 	bssid 3e:60:f9:06:d1:9b essid "" channel 2422:0x480
[3.257239] 	inact 2 inact_reload 2 txrate 72
[3.257440] 	htcap 19ef htparam 17 htctlchan 3 ht2ndchan 1
[3.257715] 	htopmode 2 htstbc 0 htchw 0
[3.257892] 	vhtcap 0 freq1 0 freq2 0 vhtbasicmcs 0
[13.337998] iwlwifi0: lkpi_iv_update_bss:2537: lvif 0xfffffe00ab48d000 vap 0xfffffe00ab48d010 iv_bss 0xfffffe00ab787000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab568000
[14.556865] iwlwifi0: lkpi_iv_update_bss:2537: lvif 0xfffffe00ab48d000 vap 0xfffffe00ab48d010 iv_bss 0xfffffe00ab568000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab7b1000
[14.609154] wlan0: link state changed to UP
root@n2fbsdtmp:~ #
Comment 48 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-19 22:49:27 UTC
(In reply to Cheng Cui from comment #47)

I am currently trying to track it down with Ed testing.
Comment 49 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-20 17:53:17 UTC
(In reply to Bjoern A. Zeeb from comment #48)

Thanks Ed for all the traces in private.

I believe at this point we have to bite the bullet and do a full pass over net80211 and all the wireless drivers and adjust the KPI.

This will likely also help us to go into the direction of untangling some of 713db49d06deee90dd358b2e4b9ca05368a5eaf6.

But it'll take a few days, careful testing, and hope that we'll not have to do it a 2nd time at a later point.

But it'll also bring up the question of how to then support 14 and 13.  And the only likely answer is to repeat what we've just done: __FreeBSD_version bump and just do it?
Comment 50 mark 2024-02-24 20:18:43 UTC
well i now have a new issue. wifi was working for a bit then with me not changing anything my ability to resolve ip’s just stopped working. i’m still connected to the network but can’t get on the web. thinking maybe my isp dns was down or something i switched dns info did a networking restart and now im getting messages telling me to post here. 

what’s odd is im still getting that not associated error even though i am ifconfig reports no issues. 

this is. just a test system for me to learn bsd and mach so im willing to be a guinea pig and test stuff.
Comment 51 mark 2024-02-24 20:22:08 UTC
(In reply to mark from comment #50)
uh… as mysteriously as it stopped working it has now started working again. clearly something is unhappy.
Comment 52 Ed Maste freebsd_committer freebsd_triage 2024-02-24 20:29:08 UTC
(In reply to mark from comment #50)
> well i now have a new issue. wifi was working for a bit then with me not changing anything

What version of FreeBSD are you trying? I.e., `uname -a`
Comment 53 mark 2024-02-25 01:28:27 UTC
not in front of my laptop right now but i’m using the latest staging of 14.0 because i was having a issue with iwlwifi.
Comment 54 Cheng Cui freebsd_committer freebsd_triage 2024-02-27 15:07:08 UTC
If the remaining patches are only D43944 and D43967 for this PR, it looks the problem still exists (see below) after applying the two patches. I will test again on the other VM which can also reproduce the "Invalid TXQ", and confirm.


=== dmesg snip ===
...
[1.210188] iwlwifi0: loaded firmware version 77.2df8986f.0 cc-a0-77.ucode op_mode iwlmvm
[1.210931] iwlwifi0: Detected Intel(R) Wi-Fi 6 AX200 160MHz, REV=0x340
[1.326459] iwlwifi0: Detected RF HR B3, rfid=0x10a100
[1.391440] iwlwifi0: base HW address: e0:2e:0b:92:e5:82
[1.415420] iwlwifi0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[1.415929] iwlwifi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
[1.416247] iwlwifi0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[1.416974] pci0: driver added
[1.622653] wlan0: bpf attached
[1.622921] wlan0: bpf attached
[1.623065] wlan0: Ethernet address: e0:2e:0b:92:e5:82
[1.641814] lo0: link state changed to UP
[1.665468] vtnet0: link state changed to UP
[3.237436] iwlwifi0: lkpi_sta_scan_to_auth:1328: lvif 0xfffffe00ab4b8000 vap 0xfffffe00ab4b8010 iv_bss 0xfffffe00ab53f000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab535000 lsta 0xfffff80039355000
[3.238720] iwlwifi0: 
[3.238757] Invalid TXQ id
[3.238786] lkpi_iv_newstate: error 95 during state transition 2 (AUTH) -> 2 (AUTH)
[3.238909] iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe006dbfc4c8 skb 0xfffff8000357e800 { len 30 } info 0xfffffe0063b4ccf8 sta 0xfffff80039355080 (if you see this please report to PR 274382)
[14.614268] wlan0: link state changed to UP

root@n1iwlvm:~ # uname -a
FreeBSD n1iwlvm 15.0-CURRENT FreeBSD 15.0-CURRENT #77 PR274382-cce3a70a77-dirty: Tue Feb 27 09:56:44 EST 2024     root@n1iwlvm:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
Comment 55 Cheng Cui freebsd_committer freebsd_triage 2024-02-27 15:52:17 UTC
(In reply to Cheng Cui from comment #54)

Same in the other VM. D43944 and D43967 on top of latest main does not fix the `Invalid TXQ id` problem.

=== dmesg snip ===
...
[1.417096] pci0: driver added
[1.618199] wlan0: bpf attached
[1.618601] wlan0: bpf attached
[1.618813] wlan0: Ethernet address: e0:2e:0b:8f:77:08
[1.637810] lo0: link state changed to UP
[1.661634] vtnet0: link state changed to UP
[2.963496] iwlwifi0: lkpi_sta_scan_to_auth:1327: lvif 0xfffffe00ab4c0000 vap 0xfffffe00ab4c0010 iv_bss 0xfffffe00ab7b3000 lvif_bss 0 lvif_bss->ni 0 synched 0, ni 0xfffffe00ab576000 lsta 0xfffff80060131000
[2.964666] iwlwifi0: 
[2.964687] Invalid TXQ id
[2.964726] lkpi_iv_newstate: error 95 during state transition 2 (AUTH) -> 2 (AUTH)
[2.964837] iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe006dbfb448 skb 0xfffff8000637d800 { len 30 } info 0xfffffe0063b4ccf8 sta 0xfffff80060131080 (if you see this please report to PR 274382)
[14.336602] wlan0: link state changed to UP

root@n2fbsdtmp:~ # uname -a
FreeBSD n2fbsdtmp 15.0-CURRENT FreeBSD 15.0-CURRENT #6 PR274382-n268402-ef75877fc2d9-dirty: Tue Feb 27 10:41:03 EST 2024     root@n2fbsdtmp:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
root@n2fbsdtmp:~ #
Comment 56 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-27 20:21:12 UTC
(In reply to Cheng Cui from comment #54)

See Comment 49 -- https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=274382#c49
Comment 57 Cheng Cui freebsd_committer freebsd_triage 2024-02-28 15:36:29 UTC
(In reply to Bjoern A. Zeeb from comment #56)

What I understand after reading the comment #49, is that there is still remaining work to do to fix this `Invalid TXQ id` problem. And if I understand correctly, D43944 and D43967 still needs to be reviewed and be committed in main, and then MFC.

Given D43944 and D43967 do not solve the problem, I have to test them for regression. Currently, I have tested them on fbsd-15-current VMs in [1], [2] and [3] below. No regression found. They are all using fbsd-15-current kernels, but only [3] can reproduce all the problems in this PR.

[1] rebuilt kernel to fbsd-15-current based on fbsd-15-current iso
[2] rebuilt kernel/World to fbsd-15-current based on fbsd-14.0-release iso
[3] rebuilt kernel/World to fbsd-15-current based on fbsd-13.2-release iso

I am not sure the regression test policy on stable/14 and stable/13, and also not clear how many patches have been committed there. If there is a branch tracking list for each patch, that would be very helpful. Therefore, I am intending to cover regression tests in these two stable branches, which means I need to create VMs and start from there.

Please let me know if there is any concern. Also please let me know if I missed any new patches for this PR.
Comment 58 Maxim Usatov 2024-03-12 09:32:49 UTC
Just upgraded from 13.2 to 13.3 and seeing the message:

Mar 12 09:26:52 freebsd kernel: Intel(R) Wireless WiFi based driver for FreeBSD
Mar 12 09:26:52 freebsd kernel: Autoloading module: ig4
Mar 12 09:26:52 freebsd kernel: iwlwifi0: <iwlwifi> mem 0x60531cc000-0x60531cffff at device 20.3 on pci0
Mar 12 09:26:52 freebsd kernel: iwlwifi0: Detected crf-id 0x400410, cnv-id 0x80400 wfpm id 0x80000020
Mar 12 09:26:52 freebsd kernel: iwlwifi0: PCI dev 51f0/0090, rev=0x370, rfid=0x2010d000
Mar 12 09:26:52 freebsd kernel: iwlwifi0: successfully loaded firmware image 'iwlwifi-so-a0-gf-a0-83.ucode'
Mar 12 09:26:52 freebsd kernel: iwlwifi0: api flags index 2 larger than supported by driver
Mar 12 09:26:52 freebsd kernel: iwlwifi0: TLV_FW_FSEQ_VERSION: FSEQ Version: 0.0.2.41
Mar 12 09:26:52 freebsd kernel: iwlwifi0: loaded firmware version 83.e8f84e98.0 so-a0-gf-a0-83.ucode op_mode iwlmvm
Mar 12 09:26:52 freebsd kernel: iwlwifi0: Detected Intel(R) Wi-Fi 6E AX211 160MHz, REV=0x370
Mar 12 09:26:52 freebsd kernel: iwlwifi0: WRT: Invalid buffer destination
Mar 12 09:26:52 freebsd kernel: iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
Mar 12 09:26:52 freebsd kernel: iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
Mar 12 09:26:52 freebsd kernel: iwlwifi0: WFPM_AUTH_KEY_0: 0x90
Mar 12 09:26:52 freebsd kernel: iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
Mar 12 09:26:52 freebsd kernel: iwlwifi0: RFIm is deactivated, reason = 4
Mar 12 09:26:52 freebsd kernel: iwlwifi0: successfully loaded firmware image 'iwlwifi-so-a0-gf-a0.pnvm'
Mar 12 09:26:52 freebsd kernel: iwlwifi0: loaded PNVM version 181407b3
Mar 12 09:26:52 freebsd kernel: iwlwifi0: Detected RF GF, rfid=0x2010d000
Mar 12 09:26:52 freebsd kernel: iwlwifi0: base HW address: 8c:17:59:72:2c:1b

. . . 

Mar 12 09:26:52 freebsd kernel: wlan0: link state changed to UP
Mar 12 09:26:52 freebsd kernel: Starting Network: lo0 em0 wlan0.
Mar 12 09:26:52 freebsd kernel: lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
Mar 12 09:26:52 freebsd kernel: 	options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
Mar 12 09:26:52 freebsd kernel: 	inet6 ::1 prefixlen 128
Mar 12 09:26:52 freebsd kernel: 	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
Mar 12 09:26:52 freebsd kernel: 	inet 127.0.0.1 netmask 0xff000000
Mar 12 09:26:52 freebsd kernel: 	groups: lo
Mar 12 09:26:52 freebsd kernel: 	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
Mar 12 09:26:52 freebsd kernel: em0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
Mar 12 09:26:52 freebsd kernel: 	options=4e504bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
Mar 12 09:26:52 freebsd kernel: 	ether 9c:2d:cd:ae:d1:4f
Mar 12 09:26:52 freebsd kernel: 	inet 192.168.200.200 netmask 0xffffff00 broadcast 192.168.200.255
Mar 12 09:26:52 freebsd kernel: 	media: Ethernet autoselect (100baseTX <full-duplex>)
Mar 12 09:26:52 freebsd kernel: 	status: active
Mar 12 09:26:52 freebsd kernel: 	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
Mar 12 09:26:52 freebsd kernel: wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
Mar 12 09:26:52 freebsd kernel: 	ether 8c:17:59:72:2c:1b
Mar 12 09:26:52 freebsd kernel: 	groups: wlan
Mar 12 09:26:52 freebsd kernel: 	ssid R320 channel 3 (2422 MHz 11g) bssid 28:b3:71:17:08:98
Mar 12 09:26:52 freebsd kernel: 	regdomain FCC country US authmode WPA2/802.11i privacy ON
Mar 12 09:26:52 freebsd kernel: 	deftxkey UNDEF AES-CCM 2:128-bit txpower 30 bmiss 7 scanvalid 60
Mar 12 09:26:52 freebsd kernel: 	protmode CTS wme roaming MANUAL
Mar 12 09:26:52 freebsd kernel: 	parent interface: iwlwifi0
Mar 12 09:26:52 freebsd kernel: 	media: IEEE 802.11 Wireless Ethernet OFDM/36Mbps mode 11g
Mar 12 09:26:52 freebsd kernel: 	status: associated
Mar 12 09:26:52 freebsd kernel: 	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

. . . 

Mar 12 09:26:55 freebsd kernel: wlan0: link state changed to DOWN
Mar 12 09:26:55 freebsd devd[2260]: Processing event '!system=IFNET subsystem=wlan0 type=LINK_DOWN'
Mar 12 09:26:55 freebsd wpa_supplicant[350]: wlan0: CTRL-EVENT-DISCONNECTED bssid=28:b3:71:17:08:98 reason=0
Mar 12 09:26:55 freebsd wpa_supplicant[350]: wlan0: Trying to associate with 28:b3:71:17:08:98 (SSID='R320' freq=2422 MHz)
Mar 12 09:26:55 freebsd kernel: Invalid TXQ id
Mar 12 09:26:55 freebsd kernel: iwl_mvm_tx_mpdu:1204: fc 000000 tid 8 txq_id 65535 mvm 0xfffffe010e48f408 skb 0xfffff80059974800 { len 81 } info 0xfffffe001cbcfcc0 sta 0xfffff80001b9c880 (if you see this please report to PR 274382)
Comment 59 commit-hook freebsd_committer freebsd_triage 2024-04-15 17:52:30 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=1665ef979d3adab05d178480077674ab5978e4f0

commit 1665ef979d3adab05d178480077674ab5978e4f0
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2024-02-16 21:14:02 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-04-15 17:44:00 +0000

    LinuxKPI: 802.11: disable early queue wakeup workaround; better classify

    We used to call lkpi_wake_tx_queues() for all queues early on in order
    to make sure they are ready.  Turns out whatever logic in iwlwifi
    startup is working correctly these days under LinuxKPI so this is no
    longer needed.  As the comment indicated the "workaround" did not
    always work.  Disable it for now.

    The second part of the improvement is to properly classify Non-QOS
    non-data frames (mgmt and ctl frames for now, unless we have to be
    even more careful in the future about certain sub-types) as TID 7/VO.
    Contrary to net80211 we must not promote PAE frames.

    PR:             274382
    MFC after:      3 day
    Reviewed by:    cc
    Differential Revision: https://reviews.freebsd.org/D43944

 sys/compat/linuxkpi/common/src/linux_80211.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)
Comment 60 nunziotocci2000 2024-05-12 23:12:07 UTC
root@frameworkbsd:~ # uname -a
FreeBSD frameworkbsd 14.1-STABLE FreeBSD 14.1-STABLE stable/14-n267607-7e10c2d27a53 GENERIC amd64
root@frameworkbsd:~ # dmesg | grep iwl 
iwlwifi0: <iwlwifi> mem 0x90b00000-0x90b03fff at device 0.0 on pci1
iwlwifi0: Detected crf-id 0x400410, cnv-id 0x400410 wfpm id 0x80000000
iwlwifi0: PCI dev 2725/0024, rev=0x420, rfid=0x10d000
iwlwifi0: successfully loaded firmware image 'iwlwifi-ty-a0-gf-a0-83.ucode'
iwlwifi0: api flags index 2 larger than supported by driver
iwlwifi0: TLV_FW_FSEQ_VERSION: FSEQ Version: 0.0.2.41
iwlwifi0: loaded firmware version 83.e8f84e98.0 ty-a0-gf-a0-83.ucode op_mode iwlmvm
iwlwifi0: Detected Intel(R) Wi-Fi 6 AX210 160MHz, REV=0x420
iwlwifi0: WRT: Invalid buffer destination: 0
iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
iwlwifi0: WFPM_AUTH_KEY_0: 0x90
iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
iwlwifi0: successfully loaded firmware image 'iwlwifi-ty-a0-gf-a0.pnvm'
iwlwifi0: loaded PNVM version 181407b3
iwlwifi0: Detected RF GF, rfid=0x10d000
iwlwifi0: base HW address: a0:02:a5:c4:b8:8c
iwlwifi0: WRT: Invalid buffer destination: 0
iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
iwlwifi0: WFPM_AUTH_KEY_0: 0x90
iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe01d6a7f4c8 skb 0xfffff8000259d800 { len 30 } info 0xfffffe0115fe1cd8 sta 0xfffff80011bd5080 (if you see this please report to PR 274382)
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe01d6a7f4c8 skb 0xfffff8000259d800 { len 30 } info 0xfffffe0115fe1cd8 sta 0xfffff80011bd5080 (if you see this please report to PR 274382)


Wi-Fi is working currently and I have not experienced any panics related to wifi.

If you need me to inspect anything then let me know.