Bug 124753

Summary: [ieee80211] net80211 discards power-save queue packets early
Product: Base System Reporter: Joseph Lee <nugundam>
Component: wireless Assignee: freebsd-wireless (Nobody) <wireless>
Status: Open ---    
Severity: Affects Only Me CC: bz
Priority: Normal    
Version: Unspecified   
Hardware: Any   
OS: Any   
Bug Depends on:    
Bug Blocks: 277512    

Description Joseph Lee 2008-06-19 09:20:01 UTC
Using a Windows Mobile 6.1 device in WiFi power-saving mode, and running FreeBSD 7.0-stable in hostap mode with hostapd, there's a discard timing discrepancy.

The WM6.1 device polls (PS-Poll) for queued packets every 20 seconds, while FreeBSD discards packets from the queue every 15 seconds.

I wonder if it's related to the time interval specified in /usr/src/sys/net80211/ieee80211_node.h which has IEEE80211_INACT_WAIT set at 15 seconds.

Looking at ieee80211_power.c, it should be aging frames at:
/*
         * Tag the frame with its expiry time and insert
         * it in the queue.  The aging interval is 4 times
         * the listen interval specified by the station. 
         * Frames that sit around too long are reclaimed
         * using this information.
         */
        /* TU -> secs.  XXX handle overflow? */
        age = IEEE80211_TU_TO_MS((ni->ni_intval * ic->ic_bintval) << 2) / 1000;

but queued packets don't seem to last 4 times the 'listen interval'.

The end result is that the WM6.1 device is fully associated but never receives packets unless it is configured not to use power-saving mode.  This has only been a problem since the recent upgrade from FreeBSD 6.2-stable to FreeBSD 7.0-stable.

Changing the debugging code, I can see that my ni_intval is 3 and ic_bintval is 100, so the math works out to:
((3 * 100) << 2) = 1200
TU_TO_MS(1200) = 1200 * 1024 / 1000 = 1228
1228 / 1000 = 1 (age)

So age is 1, and every 15 seconds, 15 is subtracted from the age.  This means queued packets are discarded every 15 seconds no matter what.  I then tried changing the math to TU_TO_MS((3 * 100) * 100), which gives an age of 30; that is properly decremented to 15 after one cycle, and the frame is discarded on the next.  However, my WM6.1 device now times out before it polls for queued packets.  This may now be a WM6.1 issue, but it is separate from what looks like a useless formula for aging queued packets.
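
The truncation described above can be reproduced outside the kernel. The following is a minimal sketch, not the net80211 code: TU_TO_MS mirrors the integer arithmetic shown above, SWEEP_SECS is the 15-second value cited for IEEE80211_INACT_WAIT, and sweeps_until_discard() is a hypothetical helper that models the sweep only as it is described in this report (subtract 15 per sweep, drop the frame once nothing is left).

```c
#include <assert.h>

/* 1 TU = 1024 us; integer conversion as used in the arithmetic above. */
#define TU_TO_MS(x)	(((x) * 1024) / 1000)

/* Sweep period in seconds (the 15 s IEEE80211_INACT_WAIT value cited above). */
#define SWEEP_SECS	15

/* Age in whole seconds, using the expression quoted from ieee80211_power.c. */
static int
ps_age_secs(int listen_intval, int beacon_intval_tu)
{
	return (TU_TO_MS((listen_intval * beacon_intval_tu) << 2) / 1000);
}

/*
 * Hypothetical model of the aging sweep as described in this report:
 * each sweep subtracts sweep_secs, and the frame is dropped once the
 * remaining age is no longer positive.  Returns the number of the
 * sweep at which the frame is dropped.
 */
static int
sweeps_until_discard(int age_secs, int sweep_secs)
{
	int sweeps = 0;

	assert(sweep_secs > 0);
	do {
		age_secs -= sweep_secs;
		sweeps++;
	} while (age_secs > 0);
	return (sweeps);
}
```

With ni_intval 3 and ic_bintval 100, ps_age_secs(3, 100) evaluates to 1, and sweeps_until_discard(1, SWEEP_SECS) returns 1, so the frame is gone at the first sweep. An age of 30 survives one sweep and is dropped at the second.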

Is there a non-hardcoded way to tune this discard interval?

How-To-Repeat: wlandebug -i <interface> +power

Then associate a recent Windows Mobile 6.1 device to the FreeBSD box running hostapd, and set the WM6.1 device to battery conservation mode.  Then try to browse with IE, or Opera.

Or, ping the WM6.1 device from the FreeBSD device, and watch how none of the packets ever come back.

With the formula changed to *100 instead of <<2, only the first 49 packets are lost.
Comment 1 Remko Lodder freebsd_committer freebsd_triage 2008-06-19 11:29:47 UTC
Responsible Changed
From-To: freebsd-i386->freebsd-net

reassign to networking team.
Comment 2 Joseph Lee 2009-02-20 19:12:16 UTC
ath0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 2290
        ether 00:11:95:8d:17:89
        inet6 fe80::211:95ff:fe8d:1789%ath0 prefixlen 64 scopeid 0x2 
        inet 192.168.5.1 netmask 0xffffff00 broadcast 192.168.5.255
        media: IEEE 802.11 Wireless Ethernet autoselect <hostap> (autoselect <hostap>)
        status: associated
        ssid AP channel 1 (2412 Mhz 11g) bssid 00:11:95:8d:17:89
        authmode WPA privacy MIXED deftxkey 2 TKIP 2:128-bit TKIP 3:128-bit
        txpower 31.5 scanvalid 60 bgscan bgscanintvl 300 bgscanidle 250
        roam:rssi11g 7 roam:rate11g 5 protmode CTS wme burst hidessid
        dtimperiod 1

I've noticed with tcpdump that every time the mobile station queries for power-saved packets, there's a couple of arp who-has packets sent out:

10:30:59.744056 arp who-has AP tell mobile
10:30:59.744104 arp who-has AP tell mobile

Also, packet requests never make it up to the tcpdump level.  Setting bintval to 25 (instead of the default 100) allows packets to be queued longer, but they are still not passed on:

Here's a debug dump from exactly when the WiFi is turned on, on the mobile device with bintval @ 25:
Feb 20 10:37:01 AP kernel: ath0: [00:18:41:c0:06:54] power save mode on, 1 sta's in ps mode
Feb 20 10:37:01 AP kernel: ath0: [00:18:41:c0:06:54] save frame with age 0, 1 now queued
Feb 20 10:37:01 AP kernel: ath0: [00:18:41:c0:06:54] save frame with age 0, 2 now queued
Feb 20 10:37:01 AP kernel: ath0: [00:18:41:c0:06:54] power save mode off, 0 sta's in ps mode
Feb 20 10:37:01 AP kernel: ath0: [00:18:41:c0:06:54] flush ps queue, 2 packets queue
Feb 20 10:37:01 AP kernel: ath0: [00:18:41:c0:06:54] power save mode on, 1 sta's in ps mode
Feb 20 10:37:01 AP kernel: ath0: [00:18:41:c0:06:54] save frame with age 0, 1 now queued
Feb 20 10:37:06 AP kernel: ath0: [00:18:41:c0:06:54] save frame with age 0, 2 now queued
Feb 20 10:37:06 AP kernel: ath0: [00:18:41:c0:06:54] discard frame, age 0
Feb 20 10:37:06 AP kernel: ath0: [00:18:41:c0:06:54] discard frame, age 0
Feb 20 10:37:06 AP kernel: ath0: [00:18:41:c0:06:54] discard 2 frames for age
Feb 20 10:37:07 AP kernel: ath0: [00:18:41:c0:06:54] save frame with age 0, 1 now queued
Feb 20 10:37:16 AP kernel: ath0: [00:18:41:c0:06:54] save frame with age 0, 2 now queued
Feb 20 10:37:21 AP kernel: ath0: [00:18:41:c0:06:54] discard frame, age 0
Feb 20 10:37:21 AP kernel: ath0: [00:18:41:c0:06:54] discard frame, age 0
Feb 20 10:37:21 AP kernel: ath0: [00:18:41:c0:06:54] discard 2 frames for age
Feb 20 10:37:22 AP kernel: ath0: [00:18:41:c0:06:54] save frame with age 0, 1 now queued
Feb 20 10:37:25 AP kernel: ath0: [00:18:41:c0:06:54] save frame with age 0, 2 now queued
Feb 20 10:37:31 AP kernel: ath0: [00:18:41:c0:06:54] save frame with age 0, 3 now queued
Feb 20 10:37:36 AP kernel: ath0: [00:18:41:c0:06:54] discard frame, age 0
Feb 20 10:37:36 AP last message repeated 2 times
Feb 20 10:37:36 AP kernel: ath0: [00:18:41:c0:06:54] discard 3 frames for age
Feb 20 10:37:39 AP kernel: ath0: [00:18:41:c0:06:54] save frame with age 0, 1 now queued
Feb 20 10:37:45 AP kernel: ath0: [00:18:41:c0:06:54] save frame with age 0, 2 now queued
Feb 20 10:37:51 AP kernel: ath0: [00:18:41:c0:06:54] discard frame, age 0
Feb 20 10:37:51 AP kernel: ath0: [00:18:41:c0:06:54] discard frame, age 0
Feb 20 10:37:51 AP kernel: ath0: [00:18:41:c0:06:54] discard 2 frames for age
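
The "age 0" in the log is consistent with the same integer truncation once bintval is lowered to 25. The sketch below assumes the station's listen interval is still 3, as in the original description (this comment does not state it); ps_age_ms() and min_listen_intval_for_nonzero_age() are hypothetical helper names, not net80211 functions.

```c
#include <assert.h>

/* 1 TU = 1024 us; same integer conversion as in the original description. */
#define TU_TO_MS(x)	(((x) * 1024) / 1000)

/* Aging interval in milliseconds, before the truncating divide by 1000. */
static int
ps_age_ms(int listen_intval, int beacon_intval_tu)
{
	return (TU_TO_MS((listen_intval * beacon_intval_tu) << 2));
}

/* Smallest listen interval whose computed age is at least one whole second. */
static int
min_listen_intval_for_nonzero_age(int beacon_intval_tu)
{
	int li = 1;

	assert(beacon_intval_tu > 0);
	while (ps_age_ms(li, beacon_intval_tu) / 1000 == 0)
		li++;
	return (li);
}
```

Under these assumptions, ps_age_ms(3, 25) is 307 ms, which truncates to an age of 0 seconds. With bintval 25, the listen interval would have to be at least 10 before the computed age reaches 1; with bintval 100 the threshold is 3.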

I do not know what the ARP requests are for.

Thanks.
Joseph
Comment 3 Joseph Lee 2009-06-16 17:34:11 UTC
Seems to be FINALLY noticed and fixed by 
http://thread.gmane.org/gmane.os.freebsd.current/110707

Thanks.
Comment 4 horuzhy 2010-12-02 15:07:32 UTC
I had the exact same problem with Atheros 9285 on 8.1-STABLE, as well as on 9-CURRENT.
Changing the kernel source as described in
http://thread.gmane.org/gmane.os.freebsd.current/110707 didn't help.
What else can I do to make it work properly?
Comment 5 Adrian Chadd freebsd_committer freebsd_triage 2010-12-04 00:31:29 UTC
Responsible Changed
From-To: freebsd-net->adrian

I'll take care of this; I've been knee-deep in this code for some time. :/
Comment 6 Adrian Chadd freebsd_committer freebsd_triage 2011-04-11 12:14:55 UTC
Responsible Changed
From-To: adrian->freebsd-wireless

Punt to wireless-testing list
Comment 7 Eitan Adler freebsd_committer freebsd_triage 2018-05-28 19:45:28 UTC
batch change:

For bugs that match the following
-  Status Is In progress 
AND
- Untouched since 2018-01-01.
AND
- Affects Base System OR Documentation

DO:

Reset to open status.


Note:
I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.