Bug 278935 - [iwlwifi] fails to establish a connection, or fails to maintain it (intel 8265)
Summary: [iwlwifi] fails to establish a connection, or fails to maintain it (intel 8265)
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: wireless
Version: 14.0-STABLE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-wireless (Nobody)
URL:
Keywords:
Depends on:
Blocks: iwlwifi
Reported: 2024-05-12 13:57 UTC by Dave Cottlehuber
Modified: 2024-05-13 21:18 UTC

See Also:


Attachments
fail on start 14.0p6 (78.31 KB, text/plain)
2024-05-12 13:59 UTC, Dave Cottlehuber
recover 14.0p6 (12.49 KB, text/plain)
2024-05-12 14:02 UTC, Dave Cottlehuber
full dmesg 14.0p6 failed scan (16.29 KB, text/plain)
2024-05-12 14:08 UTC, Dave Cottlehuber
dmesg 14.1-BETA2 fails on start iwlwifi up, switches, dies (83.15 KB, text/plain)
2024-05-13 12:36 UTC, Dave Cottlehuber
dmesg 14.1-BETA2 kldunload to restart, fails again (31.76 KB, text/plain)
2024-05-13 12:38 UTC, Dave Cottlehuber

Description Dave Cottlehuber freebsd_committer freebsd_triage 2024-05-12 13:57:14 UTC
This is my old Dell XPS 13 laptop running 14.0-RELEASE, with an Intel 8265 NIC swapped in.
It has been pretty solid for many years on iwm, including suspend/resume.

Switching to iwlwifi periodically during last year's development has generally worked
fine, but recently I've had problems where the laptop either can't connect
to the network at all, or connects and then drops the connection minutes later.

BAD STARTUP:

wlan0: flags=8c43<UP,BROADCAST,RUNNING,DRV_OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=0
	ether 00:28:f8:d0:91:52
	inet 172.16.2.21 netmask 0xffffff00 broadcast 172.16.2.255
	inet6 fe80::228:f8ff:fed0:9152%wlan0 prefixlen 64 scopeid 0x2
	groups: wlan
	ssid "" channel 1 (2412 MHz 11g)
	regdomain ETSI2 country AT authmode WPA1+WPA2/802.11i privacy ON
	deftxkey UNDEF txpower 30 bmiss 7 scanvalid 60 protmode CTS wme
	roaming MANUAL
	parent interface: iwlwifi0
	media: IEEE 802.11 Wireless Ethernet autoselect (autoselect)
	status: no carrier
	nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>

Sometimes it works:

GOOD STARTUP:

wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=0
	ether 00:28:f8:d0:91:52
	inet 172.16.2.21 netmask 0xffffff00 broadcast 172.16.2.255
	inet6 fe80::228:f8ff:fed0:9152%wlan0 prefixlen 64 scopeid 0x2
	groups: wlan
	ssid skunkwerks channel 1 (2412 MHz 11g) bssid 80:2a:a8:84:e2:a3
	regdomain ETSI2 country AT authmode WPA2/802.11i privacy ON
	deftxkey UNDEF AES-CCM 2:128-bit txpower 30 bmiss 7 scanvalid 60
	protmode CTS wme roaming MANUAL
	parent interface: iwlwifi0
	media: IEEE 802.11 Wireless Ethernet OFDM/24Mbps mode 11g
	status: associated
	nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>

And then it stops a minute or two later. Sometimes it recovers, though:

RECOVERED:
wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=0
	ether 00:28:f8:d0:91:52
	inet 172.16.2.21 netmask 0xffffff00 broadcast 172.16.2.255
	inet6 fe80::228:f8ff:fed0:9152%wlan0 prefixlen 64 scopeid 0x2
	groups: wlan
	ssid skunkwerks channel 1 (2412 MHz 11g) bssid 80:2a:a8:5a:bd:3f
	regdomain ETSI2 country AT authmode WPA2/802.11i privacy ON
	deftxkey UNDEF AES-CCM 2:128-bit txpower 30 bmiss 7 scanvalid 60
	protmode CTS wme roaming MANUAL
	parent interface: iwlwifi0
	media: IEEE 802.11 Wireless Ethernet OFDM/54Mbps mode 11g
	status: associated
	nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>

NB: there are a few other moving parts here, including the Ubiquiti APs
self-updating their firmware with security patches.

I've attached ifconfig, dmesg, from the relevant phases with most of the 
wifi debug flags turned on.

I don't know what I'm looking for, but it's as if the firmware gets stuck
and just won't work. Switching to iwm sometimes seems to reset it, but not
always.
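
For comparing the BAD STARTUP and GOOD STARTUP outputs above: the flags words 8c43 and 8843 differ in a single bit. The following is a minimal Python sketch, assuming the bit values from FreeBSD's net/if.h for only the flags that appear in these outputs. The table `IFF_BITS` and the helper `decode_flags` are illustrative names, not part of any tool.

```python
# Bit values for the interface flags that appear in the outputs above.
# Assumption: values as defined in FreeBSD's <net/if.h>; only flags seen
# in this report are listed, so other bits would be silently ignored.
IFF_BITS = {
    0x0001: "UP",
    0x0002: "BROADCAST",
    0x0040: "RUNNING",
    0x0400: "DRV_OACTIVE",
    0x0800: "SIMPLEX",
    0x8000: "MULTICAST",
}


def decode_flags(word):
    """Return the names of the known flag bits set in `word`, lowest bit first."""
    return [name for bit, name in sorted(IFF_BITS.items()) if word & bit]


bad = 0x8C43   # flags word from the BAD STARTUP output
good = 0x8843  # flags word from the GOOD STARTUP output

print(decode_flags(bad))
print(decode_flags(good))
print(decode_flags(bad ^ good))  # bits that differ between the two states
```

The XOR isolates DRV_OACTIVE, which, as far as I understand it, is the driver marking its transmit path as busy. That would fit the "stuck" impression, but it is only a hint and not a diagnosis.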
Comment 1 Dave Cottlehuber freebsd_committer freebsd_triage 2024-05-12 13:59:26 UTC
Created attachment 250596 [details]
fail on start 14.0p6
Comment 2 Dave Cottlehuber freebsd_committer freebsd_triage 2024-05-12 14:02:01 UTC
Created attachment 250597 [details]
recover 14.0p6
Comment 3 Dave Cottlehuber freebsd_committer freebsd_triage 2024-05-12 14:08:40 UTC
Created attachment 250598 [details]
full dmesg 14.0p6 failed scan
Comment 4 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-05-12 21:42:45 UTC
#c1 and #c3 are firmware crashes. They stem from known net80211 problems that no one had fixed until we tried some quick, hackish improvements in Jan/Feb for 13.3-R; in theory iwn/iwm/rtwn/... all suffer from these too, but there they are just a bit harder to trigger.

A lot of fixes are in stable/14 or 14.1-BETA<n>.  If you can try either of those that would be great.

There is one problem left (well, many), specific to the 8xxx/9xxx cards. It will be visible then and will look like "failure to remove station from firmware" followed by errors, similar to your #c3 comment.
This is no longer seen on AX200 or later chipsets and is internal to iwlwifi; there is a separate bug open for it, and I should really go and figure that out as well.
Comment 5 Dave Cottlehuber freebsd_committer freebsd_triage 2024-05-13 12:27:17 UTC
now on BETA2, still seeing same issue: connects, then drops off and can't reattach
until `kldunload if_iwlwifi`. Very reproducible at least :D
Comment 6 Dave Cottlehuber freebsd_committer freebsd_triage 2024-05-13 12:30:56 UTC
now on BETA2, still seeing same issue: connects, then drops off and can't reattach
until `kldunload if_iwlwifi`. Very reproducible at least :D

I was able to catch the UP state and noticed that it stops directly after switching
media type:

## ifconfig diff, still associated, but dies immediately after
-|        media: IEEE 802.11 Wireless Ethernet OFDM/36Mbps mode 11g
+|        media: IEEE 802.11 Wireless Ethernet OFDM/54Mbps mode 11g
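
A diff like the one above can be produced from two saved ifconfig snapshots. This is a minimal Python sketch, assuming the snapshots are plain text in the format pasted in this report. The helper names `summarize` and `changes` are hypothetical, and only the `media` and `status` lines are compared.

```python
def summarize(ifconfig_text):
    """Extract the 'media' and 'status' values from one ifconfig snapshot."""
    fields = {}
    for line in ifconfig_text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep and key in ("media", "status"):
            fields[key] = value
    return fields


def changes(before, after):
    """Return {field: (old, new)} for every field whose value differs."""
    old, new = summarize(before), summarize(after)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }


# Two shortened snapshots using lines taken from this report.
snap_a = """wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
\tmedia: IEEE 802.11 Wireless Ethernet OFDM/36Mbps mode 11g
\tstatus: associated
"""
snap_b = """wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
\tmedia: IEEE 802.11 Wireless Ethernet OFDM/54Mbps mode 11g
\tstatus: associated
"""

for field, (old, new) in changes(snap_a, snap_b).items():
    print(f"-| {field}: {old}")
    print(f"+| {field}: {new}")
```

Only `media` shows up as changed here, while `status` stays `associated`, matching what was observed just before the link died.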
Comment 7 Dave Cottlehuber freebsd_committer freebsd_triage 2024-05-13 12:36:32 UTC
Created attachment 250614 [details]
dmesg 14.1-BETA2 fails on start iwlwifi up, switches, dies
Comment 8 Dave Cottlehuber freebsd_committer freebsd_triage 2024-05-13 12:38:07 UTC
Created attachment 250615 [details]
dmesg 14.1-BETA2 kldunload to restart, fails again

iwlwifi state changes below

wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=0
        ether 00:28:f8:d0:91:52
        inet 172.16.2.21 netmask 0xffffff00 broadcast 172.16.2.255
        inet6 fe80::228:f8ff:fed0:9152%wlan0 prefixlen 64 scopeid 0x2
        groups: wlan
        ssid skunkwerks channel 1 (2412 MHz 11g) bssid 80:2a:a8:84:e2:a3
        regdomain ETSI2 country AT authmode WPA2/802.11i privacy ON
        deftxkey UNDEF AES-CCM 2:128-bit txpower 30 bmiss 7 scanvalid 60
        protmode CTS wme roaming MANUAL
        parent interface: iwlwifi0
        media: IEEE 802.11 Wireless Ethernet OFDM/36Mbps mode 11g
        status: associated
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>

wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=0
        ether 00:28:f8:d0:91:52
        inet 172.16.2.21 netmask 0xffffff00 broadcast 172.16.2.255
        inet6 fe80::228:f8ff:fed0:9152%wlan0 prefixlen 64 scopeid 0x2
        groups: wlan
        ssid skunkwerks channel 1 (2412 MHz 11g) bssid 80:2a:a8:84:e2:a3
        regdomain ETSI2 country AT authmode WPA2/802.11i privacy ON
        deftxkey UNDEF AES-CCM 2:128-bit txpower 30 bmiss 7 scanvalid 60
        protmode CTS wme roaming MANUAL
        parent interface: iwlwifi0
        media: IEEE 802.11 Wireless Ethernet OFDM/54Mbps mode 11g
        status: associated
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>

wlan0: flags=8c43<UP,BROADCAST,RUNNING,DRV_OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=0
        ether 00:28:f8:d0:91:52
        inet 172.16.2.21 netmask 0xffffff00 broadcast 172.16.2.255
        inet6 fe80::228:f8ff:fed0:9152%wlan0 prefixlen 64 scopeid 0x2
        groups: wlan
        ssid "" channel 1 (2412 MHz 11g)
        regdomain ETSI2 country AT authmode WPA1+WPA2/802.11i privacy ON
        deftxkey UNDEF txpower 30 bmiss 7 scanvalid 60 protmode CTS wme
        roaming MANUAL
        parent interface: iwlwifi0
        media: IEEE 802.11 Wireless Ethernet autoselect (autoselect)
        status: no carrier
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>

wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=0
        ether 00:28:f8:d0:91:52
        inet 172.16.2.21 netmask 0xffffff00 broadcast 172.16.2.255
        inet6 fe80::228:f8ff:fed0:9152%wlan0 prefixlen 64 scopeid 0x2
        groups: wlan
        ssid skunkwerks channel 1 (2412 MHz 11g) bssid 80:2a:a8:84:e2:a3
        regdomain ETSI2 country AT authmode WPA2/802.11i privacy ON
        deftxkey UNDEF AES-CCM 2:128-bit txpower 30 bmiss 7 scanvalid 60
        protmode CTS wme roaming MANUAL
        parent interface: iwlwifi0
        media: IEEE 802.11 Wireless Ethernet OFDM/54Mbps mode 11g
        status: associated
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>

wlan0: flags=8c43<UP,BROADCAST,RUNNING,DRV_OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=0
        ether 00:28:f8:d0:91:52
        inet 172.16.2.21 netmask 0xffffff00 broadcast 172.16.2.255
        inet6 fe80::228:f8ff:fed0:9152%wlan0 prefixlen 64 scopeid 0x2
        groups: wlan
        ssid skunkwerks channel 1 (2412 MHz 11g)
        regdomain ETSI2 country AT authmode WPA1+WPA2/802.11i privacy ON
        deftxkey UNDEF txpower 30 bmiss 7 scanvalid 60 protmode CTS wme
        roaming MANUAL
        parent interface: iwlwifi0
        media: IEEE 802.11 Wireless Ethernet autoselect (autoselect)
        status: no carrier
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
Comment 9 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-05-13 21:18:25 UTC
(In reply to Dave Cottlehuber from comment #8)

So two problems here:

(1) by the time you actually make it UP you get kicked:
wlan0: [80:2a:a8:84:e2:a3] recv disassociate (reason: 34 (too many frames need to be acknowledged))

(2) when that session then tries to go back up and ends up going down entirely you hit PR 275255 and then it's game over until possible manual intervention:
iwlwifi0: Couldn't drain frames for staid 0, status 0x8
iwlwifi0: lkpi_sta_auth_to_scan:1429: mo_sta_state(NOTEXIST) failed: -5

Where status 8 is ADD_STA_MODIFY_NON_EXISTING_STA: driver requested to modify a station that doesn't exist.

But given we are only trying to remove it from the FW after that ("mo_sta_state(NOTEXIST)"), something else must have nuked the station already.

Let's try to fix (2) first as that'll help to keep it at least going.
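
For anyone sifting through the attached dmesg files for the same two signatures, here is a minimal Python sketch that picks out the log lines quoted above. The regular expressions are written against exactly those two quoted lines and are an assumption about the format elsewhere. The helper name `classify` is hypothetical, and only the status value explained in this comment is mapped.

```python
import re

# Only the status value explained in this comment is listed; other values
# are deliberately reported as "unknown" rather than guessed.
ADD_STA_STATUS = {0x8: "ADD_STA_MODIFY_NON_EXISTING_STA"}

DISASSOC_RE = re.compile(r"recv disassociate \(reason: (\d+) \((.*)\)\)")
DRAIN_RE = re.compile(
    r"Couldn't drain frames for staid (\d+), status (0x[0-9a-fA-F]+)"
)


def classify(line):
    """Return a tuple describing a known failure line, or None otherwise."""
    m = DISASSOC_RE.search(line)
    if m:
        # (kind, reason code, reason text)
        return ("disassociate", int(m.group(1)), m.group(2))
    m = DRAIN_RE.search(line)
    if m:
        status = int(m.group(2), 16)
        # (kind, station id, status name)
        return ("drain-failed", int(m.group(1)),
                ADD_STA_STATUS.get(status, "unknown"))
    return None


sample = [
    "wlan0: [80:2a:a8:84:e2:a3] recv disassociate "
    "(reason: 34 (too many frames need to be acknowledged))",
    "iwlwifi0: Couldn't drain frames for staid 0, status 0x8",
]
for entry in sample:
    print(classify(entry))
```

Feeding each line of a saved dmesg through `classify` and keeping the non-None results gives a quick timeline of problem (1) versus problem (2).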