Bug 279717 - iwlwifi stops working with 'Queue 3 is stuck NN MM', Intel AX210 160MHz, REV=0x420
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: wireless
Version: 15.0-CURRENT
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: Bjoern A. Zeeb
URL:
Keywords:
Depends on:
Blocks: iwlwifi
Reported: 2024-06-13 18:40 UTC by Vladislav Shabanov
Modified: 2024-06-14 20:25 UTC (History)

See Also:


Attachments
full dmesg (21.60 KB, text/plain)
2024-06-13 18:40 UTC, Vladislav Shabanov

Description Vladislav Shabanov 2024-06-13 18:40:05 UTC
Created attachment 251439
full dmesg

After upgrading FreeBSD 15.0-CURRENT from main-n270474-d2f1f71ec8c6 to main-n270672-2b887687edc2, I have an easily reproducible crash in iwlwifi.

The wireless driver stops working every time I try to upload a large file to any host. To reproduce, it is enough to run:
    cat /dev/zero | nc -v 192.168.1.xx 9999
    (with nc -v -l 9999 > /dev/null on that host)
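
Where nc is not available, or where the amount of data should be bounded, the same kind of upload load can be approximated from Python. This is only a sketch: the names send_zeros and upload_zeros are made up for illustration, the 64 KiB chunk size is an arbitrary choice, and whether a bounded transfer is enough to trigger the stuck queue has not been tested.

```python
import socket


def send_zeros(sock, total_bytes, chunk_size=64 * 1024):
    """Send total_bytes zero bytes through sock and return the count sent.

    sock only needs a sendall() method, so a real socket or a test double
    both work.
    """
    chunk = bytes(chunk_size)  # a buffer of zero bytes, like /dev/zero
    sent = 0
    while sent < total_bytes:
        n = min(chunk_size, total_bytes - sent)
        sock.sendall(chunk[:n])
        sent += n
    return sent


def upload_zeros(host, port, total_bytes):
    """Connect to host:port over TCP and push total_bytes zero bytes."""
    with socket.create_connection((host, port), timeout=30) as sock:
        return send_zeros(sock, total_bytes)
```

With the nc listener from above running on the target host, calling upload_zeros with that host's address, port 9999 and a size such as 1 << 30 would push about 1 GiB over wlan0, assuming the default route goes through the wireless interface.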

pciconf -lv:
iwlwifi0@pci0:87:0:0:   class=0x028000 rev=0x1a hdr=0x00 vendor=0x8086 device=0x2725 subvendor=0x8086 subdevice=0x0024
    vendor     = 'Intel Corporation'
    device     = 'Wi-Fi 6E(802.11ax) AX210/AX1675* 2x2 [Typhoon Peak]'
    class      = network

dmesg:
........
kernel: iwlwifi0: <iwlwifi> mem 0x6e200000-0x6e203fff at device 0.0 on pci3
kernel: iwlwifi0: Detected crf-id 0x400410, cnv-id 0x400410 wfpm id 0x80000000
kernel: iwlwifi0: PCI dev 2725/0024, rev=0x420, rfid=0x10d000
kernel: iwlwifi0: successfully loaded firmware image 'iwlwifi-ty-a0-gf-a0-83.ucode'
kernel: iwlwifi0: api flags index 2 larger than supported by driver
kernel: iwlwifi0: TLV_FW_FSEQ_VERSION: FSEQ Version: 0.0.2.41
kernel: iwl-debug-yoyo.bin: could not load binary firmware /boot/firmware/iwl-debug-yoyo.bin either
kernel: iwl-debug-yoyo_bin: could not load binary firmware /boot/firmware/iwl-debug-yoyo_bin either
kernel: iwl_debug_yoyo_bin: could not load binary firmware /boot/firmware/iwl_debug_yoyo_bin either
kernel: iwlwifi0: loaded firmware version 83.e8f84e98.0 ty-a0-gf-a0-83.ucode op_mode iwlmvm
kernel: iwlwifi0: Detected Intel(R) Wi-Fi 6 AX210 160MHz, REV=0x420
kernel: iwlwifi0: WRT: Invalid buffer destination: 0
kernel: iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
kernel: iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
kernel: iwlwifi0: WFPM_AUTH_KEY_0: 0x90
kernel: iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
kernel: iwlwifi0: successfully loaded firmware image 'iwlwifi-ty-a0-gf-a0.pnvm'
kernel: iwlwifi0: loaded PNVM version 181407b3
kernel: iwlwifi0: Detected RF GF, rfid=0x10d000
kernel: iwlwifi0: base HW address: f8:b5:4d:6e:ce:c7
kernel: acpi_wmi0: <ACPI-WMI mapping> on acpi0
kernel: acpi_wmi0: Embedded MOF found
kernel: ACPI: \_SB.WFDE.WQCC: 1 arguments were passed to a non-method ACPI object (Buffer) (20230628/nsarguments-361)
kernel: acpi_wmi1: <ACPI-WMI mapping> on acpi0
kernel: acpi_wmi1: Embedded MOF found
kernel: ACPI: \_SB.WFTE.WQCC: 1 arguments were passed to a non-method ACPI object (Buffer) (20230628/nsarguments-361)
kernel: acpi_wmi2: <ACPI-WMI mapping> on acpi0
kernel: iwlwifi0: WRT: Invalid buffer destination: 0
kernel: iwlwifi0: WFPM_UMAC_PD_NOTIFICATION: 0x20
kernel: iwlwifi0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
kernel: iwlwifi0: WFPM_AUTH_KEY_0: 0x90
kernel: iwlwifi0: CNVI_SCU_SEQ_DATA_DW9: 0x0
......
kernel: wlan0: link state changed to UP
......
dhclient[4750]: New IP Address (ue0): 192.168.2.187
......
dhclient[4754]: New Subnet Mask (ue0): 255.255.255.0
dhclient[4759]: New Broadcast Address (ue0): 192.168.2.255
......
dhclient[4764]: New Routers (ue0): 192.168.2.1
......
kernel: drmn1: [drm] *ERROR* PPS state mismatch
syslogd: last message repeated 2 times
kernel: [drm ERROR :nv_drm_gem_export_dmabuf_memory_ioctl] [nvidia-drm] [GPU ID 0x00000100] Failed to get memory to export from DMA-BUF GEM object: 0x00000001
......
kernel: iwlwifi0: Queue 3 is stuck 77 78
kernel: iwlwifi0:   need_update 0 frozen 0 ampdu 0 now 2147187279 stuck_timer.expires 2147187265 frozen_expiry_remainder 0 wd_timeout 10000
kernel: iwlwifi0: Microcode SW error detected. Restarting 0x0.
kernel: iwlwifi0: Start IWL Error Log Dump:
kernel: iwlwifi0: Transport status: 0x0000004A, valid: 6
kernel: iwlwifi0: Loaded firmware version: 83.e8f84e98.0 ty-a0-gf-a0-83.ucode
kernel: iwlwifi0: 0x00000084 | NMI_INTERRUPT_UNKNOWN
kernel: iwlwifi0: 0x00808203 | trm_hw_status0
kernel: iwlwifi0: 0x00000000 | trm_hw_status1
kernel: iwlwifi0: 0x004DC410 | branchlink2
kernel: iwlwifi0: 0x00008C84 | interruptlink1
kernel: iwlwifi0: 0x00008C84 | interruptlink2
kernel: iwlwifi0: 0x00016AD0 | data1
kernel: iwlwifi0: 0x01000000 | data2
kernel: iwlwifi0: 0x00000000 | data3
kernel: iwlwifi0: 0xBB402C4A | beacon time
kernel: iwlwifi0: 0xDF3563CD | tsf low
kernel: iwlwifi0: 0x00000456 | tsf hi
kernel: iwlwifi0: 0x00000161 | time gp1
kernel: iwlwifi0: 0x11514EE0 | time gp2
kernel: iwlwifi0: 0x00000001 | uCode revision type
kernel: iwlwifi0: 0x00000053 | uCode version major
kernel: iwlwifi0: 0xE8F84E98 | uCode version minor
kernel: iwlwifi0: 0x00000420 | hw version
kernel: iwlwifi0: 0x00C80002 | board version
kernel: iwlwifi0: 0x0402001C | hcmd
kernel: iwlwifi0: 0x24023000 | isr0
kernel: iwlwifi0: 0x00048000 | isr1
kernel: iwlwifi0: 0x48F00002 | isr2
kernel: iwlwifi0: 0x00C100CC | isr3
kernel: iwlwifi0: 0x00200000 | isr4
kernel: iwlwifi0: 0x0401001C | last cmd Id
kernel: iwlwifi0: 0x00016AD0 | wait_event
kernel: iwlwifi0: 0x000000D4 | l2p_control
kernel: iwlwifi0: 0x00019C14 | l2p_duration
kernel: iwlwifi0: 0x00000007 | l2p_mhvalid
kernel: iwlwifi0: 0x00810048 | l2p_addr_match
kernel: iwlwifi0: 0x00000009 | lmpm_pmg_sel
kernel: iwlwifi0: 0x00000000 | timestamp
kernel: iwlwifi0: 0x0000B8D8 | flow_handler
kernel: iwlwifi0: Start IWL Error Log Dump:
kernel: iwlwifi0: Transport status: 0x0000004A, valid: 7
kernel: iwlwifi0: 0x20000066 | NMI_INTERRUPT_HOST
kernel: iwlwifi0: 0x00000000 | umac branchlink1
kernel: iwlwifi0: 0x8046DA58 | umac branchlink2
kernel: iwlwifi0: 0x8048DF3E | umac interruptlink1
kernel: iwlwifi0: 0x8048DF3E | umac interruptlink2
kernel: iwlwifi0: 0x01000000 | umac data1
kernel: iwlwifi0: 0x8048DF3E | umac data2
kernel: iwlwifi0: 0x00000000 | umac data3
kernel: iwlwifi0: 0x00000053 | umac major
kernel: iwlwifi0: 0xE8F84E98 | umac minor
kernel: iwlwifi0: 0x11514EDD | frame pointer
kernel: iwlwifi0: 0xC0886258 | stack pointer
kernel: iwlwifi0: 0x00E0010C | last host cmd
kernel: iwlwifi0: 0x00000400 | isr status reg
kernel: iwlwifi0: IML/ROM dump:
kernel: iwlwifi0: 0x00000B03 | IML/ROM error/state
kernel: iwlwifi0: 0x0000868D | IML/ROM data1
kernel: iwlwifi0: 0x00000090 | IML/ROM WFPM_AUTH_KEY_0
kernel: iwlwifi0: Fseq Registers:
kernel: iwlwifi0: 0x60000000 | FSEQ_ERROR_CODE
kernel: iwlwifi0: 0x80440007 | FSEQ_TOP_INIT_VERSION
kernel: iwlwifi0: 0x00080009 | FSEQ_CNVIO_INIT_VERSION
kernel: iwlwifi0: 0x0000A652 | FSEQ_OTP_VERSION
kernel: iwlwifi0: 0x00000002 | FSEQ_TOP_CONTENT_VERSION
kernel: iwlwifi0: 0x4552414E | FSEQ_ALIVE_TOKEN
kernel: iwlwifi0: 0x00400410 | FSEQ_CNVI_ID
kernel: iwlwifi0: 0x00400410 | FSEQ_CNVR_ID
kernel: iwlwifi0: 0x00400410 | CNVI_AUX_MISC_CHIP
kernel: iwlwifi0: 0x00400410 | CNVR_AUX_MISC_CHIP
kernel: iwlwifi0: 0x00009061 | CNVR_SCU_SD_REGS_SD_REG_DIG_DCDC_VTRIM
kernel: iwlwifi0: 0x00000061 | CNVR_SCU_SD_REGS_SD_REG_ACTIVE_VDIG_MIRROR
kernel: iwlwifi0: 0x00080009 | FSEQ_PREV_CNVIO_INIT_VERSION
kernel: iwlwifi0: 0x00440007 | FSEQ_WIFI_FSEQ_VERSION
kernel: iwlwifi0: 0x1AEAC71E | FSEQ_BT_FSEQ_VERSION
kernel: iwlwifi0: 0x000000DC | FSEQ_CLASS_TP_VERSION
kernel: iwlwifi0: UMAC CURRENT PC: 0x8048da0c
kernel: iwlwifi0: LMAC1 CURRENT PC: 0xd0
kernel: iwlwifi0: WRT: Collecting data: ini trigger 4 fired (delay=0ms).

[vs@vsGB ~]$ ifconfig -a
lo0: flags=1008049<UP,LOOPBACK,RUNNING,MULTICAST,LOWER_UP> metric 0 mtu 16384
	options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
	inet 127.0.0.1 netmask 0xff000000
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
	groups: lo
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=0
	ether f8:b5:4d:6e:ce:c7
	inet 192.168.2.187 netmask 0xffffff00 broadcast 192.168.2.255
	groups: wlan
	ssid HomeLan5 channel 56 (5280 MHz 11a) bssid 50:ff:20:5a:41:72
	regdomain NONE country RU authmode WPA2/802.11i privacy ON
	deftxkey UNDEF AES-CCM 2:128-bit txpower 24 bmiss 7 mcastrate 6
	mgmtrate 6 scanvalid 60 wme roaming MANUAL
	parent interface: iwlwifi0
	media: IEEE 802.11 Wireless Ethernet OFDM/54Mbps mode 11a
	status: associated
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
pflog0: flags=1000141<UP,RUNNING,PROMISC,LOWER_UP> metric 0 mtu 33152
	options=0
	groups: pflog

It is still reproducible on main-n270710-edbd489d09ba.

There is nothing in the log before 'Queue 3 is stuck': the system boots, connects to the network, and starts daemons. Right after login I started netcat and got this crash.
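
To compare this signature across runs or against other reports, the two 'Queue N is stuck' lines can be pulled out of a saved dmesg with a small parser. The sketch below makes assumptions: the function names are hypothetical, and the labels read_ptr and write_ptr for the two numbers after "is stuck" are a guess at what the driver prints, since the log itself does not label them. For the log above it gives queue 3, values 77 and 78, and now exceeding stuck_timer.expires by 14 (2147187279 - 2147187265; the log does not state the unit).

```python
import re

# First line of the report, e.g. "iwlwifi0: Queue 3 is stuck 77 78".
STUCK_RE = re.compile(r"Queue (\d+) is stuck (\d+) (\d+)")
# "name value" pairs on the follow-up debug line.
PAIR_RE = re.compile(r"([A-Za-z_][\w.]*) (-?\d+)")


def parse_stuck_queues(lines):
    """Return one dict per 'Queue N is stuck' event found in lines."""
    events = []
    pending = None  # event still waiting for its debug line
    for line in lines:
        m = STUCK_RE.search(line)
        if m:
            pending = {
                "queue": int(m.group(1)),
                # Assumed meaning of the two numbers; not labeled in the log.
                "read_ptr": int(m.group(2)),
                "write_ptr": int(m.group(3)),
                "timer": {},
            }
            events.append(pending)
            continue
        start = line.find("need_update")
        if pending is not None and start >= 0:
            # Parse only from "need_update" on, so syslog prefixes are ignored.
            pending["timer"] = {
                key: int(value) for key, value in PAIR_RE.findall(line[start:])
            }
            pending = None
    return events


def overdue(event):
    """Return now - stuck_timer.expires, or None if either field is missing."""
    timer = event["timer"]
    if "now" not in timer or "stuck_timer.expires" not in timer:
        return None
    return timer["now"] - timer["stuck_timer.expires"]
```

Feeding it the lines of /var/log/messages (or the attached full dmesg) should list every stuck-queue event with its timer fields, which makes it easier to see whether it is always queue 3.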

Experiments: I switched the extra modules (i915kpi, nvidia_drm, drm_61_kmod) on and off. Nothing changed; the crash is reproducible without these modules.
Comment 1 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-06-14 20:25:30 UTC
Thanks for the dedicated PR and the extra information.
I'll try to repro it and see what I can find out.

For (personal) reference: in 2022 we put the extra debug information into the driver in e674ddec0b4138274539587fe9336b577ff1242a. While we just fixed the Invalid TXQ issue, people have mostly been silent since then about the "Stuck Queue" part.