Bug 277336 - Occasional iwlwifi instability
Summary: Occasional iwlwifi instability
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: wireless (show other bugs)
Version: 15.0-CURRENT
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-wireless (Nobody)
URL:
Keywords:
Depends on:
Blocks: iwlwifi
  Show dependency treegraph
 
Reported: 2024-02-26 18:26 UTC by rkoberman
Modified: 2024-03-16 18:59 UTC (History)
2 users (show)

See Also:


Attachments
Log of error messages (5.06 KB, text/plain)
2024-03-16 04:25 UTC, rkoberman
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description rkoberman 2024-02-26 18:26:18 UTC
For the past few weeks I have been seeing my Alder Lake system network occasionally drop off-line for sub-second intervals. While these have not bee noticeable, reporting them might aid in troubleshooting some more serious issues.

The following two log entries are always pairs with identical time stamps:
Not associated and the session protection is over already...
linuxkpi_ieee80211_connection_loss: vif 0xfffffe0166d6bdc0 vap 0xfffffe0166d6b010 state ASSOC

If this is expected behavior, feel free to just close the ticket,
Comment 1 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-02-27 00:30:18 UTC
So you are trying to associate (or re-associate).  For as-to-why in that moment is unclear.  wpa suplicant log in syslog may tell you more.  Possibly have to up the -dd there.

When that happens iwlwifi says it hasn't seen the necessary state update in time and calls connection loss which will take the session down and the next attempt will likely succeed given you say "sub-second".

Could it be you are in a noisy radio environment?  Just checking before further digging in.

Also which chipset+firmware is this?
Seems the firmware is not enabling a lot of debugging (seen that for other bits as well and doesn't seem us).

I haven't tested but can you try adding the following one line change.  Will give us at least an idea about why iwlwifi thinks to notify the lost connection. My first thought always is missing beacons but in this case probably a time event?

--- sys/contrib/dev/iwlwifi/mvm/utils.c
+++ sys/contrib/dev/iwlwifi/mvm/utils.c
@@ -730,6 +730,7 @@ void iwl_mvm_connection_loss(struct iwl_mvm *mvm, struct ieee80211_vif *vif,
        iwl_fw_dbg_collect_trig(&mvm->fwrt, trig, "%s", errmsg);
 
 out:
+       printf("%s: %s\n", __func__, errmsg);
        ieee80211_connection_loss(vif);
 }
Comment 2 rkoberman 2024-02-27 06:24:31 UTC
I really did not look at this properly. When I looked at the full log, instead of just doing a grep for iwlwifi in kernel messages. When I look at the whole text of the incident, the outage was much more significant with the network down for 1:33.

Feb 23 10:03:07 ptavv wpa_supplicant[414]: wlan0: WPA: Group rekeying completed with c0:89:ab:60:ae:68 [GTK=TKIP]
Feb 23 10:03:07 ptavv wpa_supplicant[414]: wlan0: CTRL-EVENT-DISCONNECTED bssid=c0:89:ab:60:ae:68 reason=0
Feb 23 10:03:07 ptavv kernel: wlan0: link state changed to DOWN
Feb 23 10:03:08 ptavv wpa_supplicant[414]: wlan0: Trying to associate with c0:89:ab:60:ae:68 (SSID='babcom' freq=2412 MHz)
Feb 23 10:03:08 ptavv wpa_supplicant[414]: ioctl[SIOCS80211, op=21, val=0, arg_len=42]: No such file or directory
Feb 23 10:03:08 ptavv wpa_supplicant[414]: wlan0: Association request to the driver failed
Feb 23 10:03:08 ptavv kernel: iwlwifi0: Not associated and the session protection is over already...
Feb 23 10:03:08 ptavv kernel: iwlwifi0: linuxkpi_ieee80211_connection_loss: vif 0xfffffe0166d6bdc0 vap 0xfffffe0166d6b010 state AUTH
Feb 23 10:03:13 ptavv wpa_supplicant[414]: wlan0: Authentication with c0:89:ab:60:ae:68 timed out.
Feb 23 10:03:13 ptavv wpa_supplicant[414]: BSSID c0:89:ab:60:ae:68 ignore list count incremented to 2, ignoring for 10 seconds
Feb 23 10:03:13 ptavv wpa_supplicant[414]: wlan0: CTRL-EVENT-DISCONNECTED bssid=c0:89:ab:60:ae:68 reason=3 locally_generated=1
Feb 23 10:03:13 ptavv wpa_supplicant[414]: BSSID c0:89:ab:60:ae:68 ignore list count incremented to 3, ignoring for 60 seconds
Feb 23 10:03:13 ptavv wpa_supplicant[414]: wlan0: CTRL-EVENT-DSCP-POLICY clear_all
Feb 23 10:03:14 ptavv wpa_supplicant[414]: wlan0: Trying to associate with c0:89:ab:60:ae:68 (SSID='babcom' freq=2412 MHz)
Feb 23 10:03:15 ptavv kernel: iwlwifi0: Not associated and the session protection is over already...
Feb 23 10:03:15 ptavv kernel: iwlwifi0: linuxkpi_ieee80211_connection_loss: vif 0xfffffe0166d6bdc0 vap 0xfffffe0166d6b010 state ASSOC
Feb 23 10:03:24 ptavv wpa_supplicant[414]: wlan0: Authentication with c0:89:ab:60:ae:68 timed out.
Feb 23 10:03:24 ptavv wpa_supplicant[414]: wlan0: CTRL-EVENT-DISCONNECTED bssid=c0:89:ab:60:ae:68 reason=3 locally_generated=1
Feb 23 10:03:24 ptavv wpa_supplicant[414]: BSSID c0:89:ab:60:ae:68 ignore list count incremented to 2, ignoring for 10 seconds
Feb 23 10:03:24 ptavv wpa_supplicant[414]: wlan0: CTRL-EVENT-DSCP-POLICY clear_all
Feb 23 10:03:25 ptavv wpa_supplicant[414]: wlan0: Trying to associate with c0:89:ab:60:ae:68 (SSID='babcom' freq=2412 MHz)
Feb 23 10:03:26 ptavv kernel: iwlwifi0: Not associated and the session protection is over already...
Feb 23 10:03:26 ptavv kernel: iwlwifi0: linuxkpi_ieee80211_connection_loss: vif 0xfffffe0166d6bdc0 vap 0xfffffe0166d6b010 state ASSOC
Feb 23 10:03:35 ptavv wpa_supplicant[414]: wlan0: Authentication with c0:89:ab:60:ae:68 timed out.
Feb 23 10:03:35 ptavv wpa_supplicant[414]: wlan0: CTRL-EVENT-DISCONNECTED bssid=c0:89:ab:60:ae:68 reason=3 locally_generated=1
Feb 23 10:03:35 ptavv wpa_supplicant[414]: BSSID c0:89:ab:60:ae:68 ignore list count incremented to 2, ignoring for 10 seconds
Feb 23 10:03:35 ptavv wpa_supplicant[414]: wlan0: CTRL-EVENT-SSID-TEMP-DISABLED id=1 ssid="babcom" auth_failures=1 duration=10 reason=CONN_FAILED
Feb 23 10:03:35 ptavv wpa_supplicant[414]: wlan0: CTRL-EVENT-DSCP-POLICY clear_all
Feb 23 10:03:46 ptavv wpa_supplicant[414]: wlan0: CTRL-EVENT-SSID-REENABLED id=1 ssid="babcom"
Feb 23 10:03:46 ptavv wpa_supplicant[414]: wlan0: Trying to associate with c0:89:ab:60:ae:68 (SSID='babcom' freq=2412 MHz)
Feb 23 10:03:47 ptavv kernel: iwlwifi0: Not associated and the session protection is over already...
Feb 23 10:03:47 ptavv kernel: iwlwifi0: linuxkpi_ieee80211_connection_loss: vif 0xfffffe0166d6bdc0 vap 0xfffffe0166d6b010 state ASSOC
Feb 23 10:03:56 ptavv wpa_supplicant[414]: wlan0: Authentication with c0:89:ab:60:ae:68 timed out.
Feb 23 10:03:56 ptavv wpa_supplicant[414]: wlan0: CTRL-EVENT-DISCONNECTED bssid=c0:89:ab:60:ae:68 reason=3 locally_generated=1
Feb 23 10:03:56 ptavv wpa_supplicant[414]: BSSID c0:89:ab:60:ae:68 ignore list count incremented to 2, ignoring for 10 seconds
Feb 23 10:03:56 ptavv wpa_supplicant[414]: wlan0: CTRL-EVENT-SSID-TEMP-DISABLED id=1 ssid="babcom" auth_failures=2 duration=35 reason=CONN_FAILED
Feb 23 10:03:56 ptavv wpa_supplicant[414]: wlan0: CTRL-EVENT-DSCP-POLICY clear_all
Feb 23 10:04:32 ptavv wpa_supplicant[414]: wlan0: CTRL-EVENT-SSID-REENABLED id=1 ssid="babcom"
Feb 23 10:04:32 ptavv wpa_supplicant[414]: wlan0: Trying to associate with c0:89:ab:60:ae:68 (SSID='babcom' freq=2412 MHz)
Feb 23 10:04:32 ptavv wpa_supplicant[414]: wlan0: Associated with c0:89:ab:60:ae:68
Feb 23 10:04:32 ptavv dhclient[32989]: send_packet: No buffer space available
Feb 23 10:04:32 ptavv wpa_supplicant[414]: wlan0: WPA: Key negotiation completed with c0:89:ab:60:ae:68 [PTK=CCMP GTK=TKIP]
Feb 23 10:04:32 ptavv wpa_supplicant[414]: wlan0: CTRL-EVENT-CONNECTED - Connection to c0:89:ab:60:ae:68 completed [id=1 id_str=]
Feb 23 10:04:32 ptavv kernel: wlan0: link state changed to UP
Feb 23 10:04:40 ptavv dhclient[32989]: unknown dhcp option value 0x7d
Feb 23 10:04:40 ptavv dhclient[23875]: New IP Address (wlan0): 192.168.1.5
Feb 23 10:04:40 ptavv dhclient[23879]: New Subnet Mask (wlan0): 255.255.255.0
Feb 23 10:04:40 ptavv dhclient[23883]: New Broadcast Address (wlan0): 192.168.1.255
Feb 23 10:04:40 ptavv dhclient[23887]: New Routers (wlan0): 192.168.1.1

Here are the details on my interface:
iwlwifi0@pci0:0:20:3:   class=0x028000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x51f0 subvendor=0x8086 subdevice=0x0090
    vendor     = 'Intel Corporation'
    device     = 'Alder Lake-P PCH CNVi WiFi

I have rebuilt with the printf. The issue does not happen frequently, but I'll post an update when it happens.
Comment 3 rkoberman 2024-03-16 04:25:37 UTC
Created attachment 249208 [details]
Log of error messages

Finally had another incident after applying the patch. Log attached.
Comment 4 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-03-16 18:59:23 UTC
(In reply to rkoberman from comment #3)

Ignoring the analysis of what happened there, there is no iwl_mvm_connection_loss in the log, so different issue.

Whatever this Arris equipment is... you are still doing tkip for the group key there?

From what you probably read from the log yourself:
you are doing a re-key, connection is dropped, you try to get back, no, no, no, no, wpa_supplicant puts the BSSID on the blacklist temporary, another pause, wpa_supplicant tries again,  back on the AP.

Hard to say what as really going on, if this was your side, the AP, both?

Do you have any chance to know what the AP thought at the same time?