163689 – [ath] TX timeouts when sending probe/mgmt frames during scanning

Status:	Closed FIXED

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	wireless
Version:	10.0-CURRENT
Hardware:	Any Any

Importance:	Normal Affects Only Me
Assignee:	freebsd-wireless (Nobody)

URL:
Keywords:

Depends on:
Blocks:

Reported:	2011-12-29 01:10 UTC by Adrian Chadd
Modified:	2019-01-19 20:14 UTC
CC List:	1 user

See Also:

Description Adrian Chadd freebsd_committer

2011-12-29 01:10:12 UTC

When aggregation is enabled, frames that are queued to the software queue require a call to ath_txq_sched() in order to schedule them to the hardware.

This is currently done by implication. To clarify: the only time frames are queued to the software queue is when the hardware queue is currently busy or paused. Otherwise, the frames are dispatched directly to the hardware. ath_txq_sched() is then called by the TXQ completion code in ath_tx_processq().

Unfortunately, during channel scanning, some frames make it to the software queue with no subsequent frames in the hardware queue. With nothing in the hardware queue to complete, ath_tx_processq() never runs and thus ath_txq_sched() is never called.
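
To make the stall easier to see, here is a minimal toy model of the scheduling described above. It is only a sketch: every toy_* name is hypothetical and none of this is the actual ath(4) code. It also assumes, purely for illustration, that the frame ends up on the software queue because the queue is paused during the scan.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the ath(4) structures. */
struct toy_txq {
	int hw_depth;	/* frames currently owned by the "hardware" */
	int sw_depth;	/* frames parked on the software queue */
	bool paused;	/* assumption: queue is held off during a scan */
};

/* Stands in for ath_txq_sched(): push software-queued frames to hardware. */
static void
toy_txq_sched(struct toy_txq *q)
{
	if (q->paused)
		return;
	q->hw_depth += q->sw_depth;
	q->sw_depth = 0;
}

/* Transmit one frame: direct dispatch if idle, else software queue. */
static void
toy_tx(struct toy_txq *q)
{
	if (q->hw_depth == 0 && !q->paused)
		q->hw_depth++;
	else
		q->sw_depth++;
}

/* Stands in for ath_tx_processq(): the only caller of the scheduler. */
static void
toy_tx_complete(struct toy_txq *q)
{
	assert(q->hw_depth > 0);
	q->hw_depth--;
	toy_txq_sched(q);
}

/* True if frames are stranded: software-queued with nothing in flight. */
static bool
toy_stalled(const struct toy_txq *q)
{
	return (q->sw_depth > 0 && q->hw_depth == 0);
}
```

In this model, if a frame is software-queued while the queue is paused and the last in-flight frame then completes, toy_stalled() stays true even after the pause is lifted, because no further completion will ever call the scheduler.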

This results in TX timeouts, along with:

TODS 00:03:7f:0b:62:88->00:19:e0:66:66:68(00:19:e0:66:66:68) data QoS [TID 0] 0M
 c819 3a01 0019 e066 6668 0003 7f0b 6288 0019 e066 6668 1009 0000 adde
ath1: ath_tx_tid_drain: node 0xc77a2000: tid 0: txq_depth=0, txq_aggr_depth=0, sched=1, paused=0, hwq_depth=0, incomp=0, baw_head=77, baw_tail=77 txa_start=1740, ni_txseqs=1740
TODS 00:03:7f:0b:62:88->00:19:e0:66:66:68(00:19:e0:66:66:68) data QoS [TID 0] 0M
 c819 3a01 0019 e066 6668 0003 7f0b 6288 0019 e066 6668 c00a 0000 adde
ar5416PerCalibrationN: NF calibration didn't finish; delaying CCA
ath1: ath_tx_tid_drain: node 0xc77a2000: tid 0: txq_depth=0, txq_aggr_depth=0, sched=1, paused=0, hwq_depth=0, incomp=0, baw_head=100, baw_tail=100 txa_start=1763, ni_txseqs=1763
TODS 00:03:7f:0b:62:88->00:19:e0:66:66:68(00:19:e0:66:66:68) data QoS [TID 0] 0M
 c819 3a01 0019 e066 6668 0003 7f0b 6288 0019 e066 6668 600c 0000 adde
ath1: ath_tx_tid_drain: node 0xc77a2000: tid 0: txq_depth=0, txq_aggr_depth=0, sched=1, paused=0, hwq_depth=0, incomp=0, baw_head=85, baw_tail=85 txa_start=1876, ni_txseqs=1876
TODS 00:03:7f:0b:62:88->00:19:e0:66:66:68(00:19:e0:66:66:68) data QoS [TID 0] 0M
 c819 3a01 0019 e066 6668 0003 7f0b 6288 0019 e066 6668 c017 0000 adde
ath1: ath_tx_tid_drain: node 0xc77a2000: tid 0: txq_depth=0, txq_aggr_depth=0, sched=1, paused=0, hwq_depth=0, incomp=0, baw_head=85, baw_tail=85 txa_start=1876, ni_txseqs=1876
TODS 00:03:7f:0b:62:88->00:19:e0:66:66:68(00:19:e0:66:66:68) data QoS [TID 0] 0M
 c819 3a01 0019 e066 6668 0003 7f0b 6288 0019 e066 6668 e01a 0000 adde
ath1: ath_tx_tid_drain: node 0xc77a2000: tid 0: txq_depth=0, txq_aggr_depth=0, sched=1, paused=0, hwq_depth=0, incomp=0, baw_head=85, baw_tail=85 txa_start=1876, ni_txseqs=1876
TODS 00:03:7f:0b:62:88->00:19:e0:66:66:68(00:19:e0:66:66:68) data QoS [TID 0] 0M
 c819 3a01 0019 e066 6668 0003 7f0b 6288 0019 e066 6668 001e 0000 adde
ath1: ath_tx_tid_drain: node 0xc77a2000: tid 0: txq_depth=0, txq_aggr_depth=0, sched=1, paused=0, hwq_depth=0, incomp=0, baw_head=7, baw_tail=7 txa_start=1926, ni_txseqs=1926
TODS 00:03:7f:0b:62:88->00:19:e0:66:66:68(00:19:e0:66:66:68) data QoS [TID 0] 0M
 c819 3a01 0019 e066 6668 0003 7f0b 6288 0019 e066 6668 4021 0000 adde

This only reliably occurs once aggregation is established.

Fix: 

It's a little more complicated than it needs to be.

The log above shows only the data frames from ping. The trouble is that probe frames, and anything else that is low traffic, are affected as well.

I've also seen it with probe frames when the air is extremely busy/crowded. The ping case is just the easiest way to trigger it reliably.

Calling ath_txq_sched() at the right point fixes it, but at present there is no clean, non-hackish way to do so. It needs the txq in question, which currently isn't available in ath_start() or ath_raw_xmit(). Furthermore, right now it is only called from the taskqueue, once per ath_tx_processq() call, so we don't have to worry about it running in parallel. A fix that calls it from the transmit path may no longer be able to rely on that assumption.
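
One possible shape for such a fix, again as a toy model rather than driver code: kick the scheduler whenever a frame is parked on the software queue (or the queue is resumed) and nothing is in flight. All toy_* names are hypothetical, and the sketch is single-threaded, so it deliberately leaves out the serialization against the completion path that a real change would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the ath(4) structures. */
struct toy_txq {
	int hw_depth;	/* frames currently owned by the "hardware" */
	int sw_depth;	/* frames parked on the software queue */
	bool paused;	/* assumption: queue is held off during a scan */
};

/* Stands in for ath_txq_sched(): push software-queued frames to hardware. */
static void
toy_txq_sched(struct toy_txq *q)
{
	assert(q->sw_depth >= 0);
	if (q->paused)
		return;
	q->hw_depth += q->sw_depth;
	q->sw_depth = 0;
}

/*
 * Park a frame on the software queue.  If nothing is in flight, no
 * completion will ever schedule it, so kick the scheduler right away.
 */
static void
toy_swq_add(struct toy_txq *q)
{
	q->sw_depth++;
	if (q->hw_depth == 0)
		toy_txq_sched(q);
}

/* Lift the pause and kick, for frames queued while paused. */
static void
toy_txq_resume(struct toy_txq *q)
{
	q->paused = false;
	if (q->hw_depth == 0)
		toy_txq_sched(q);
}
```

With a busy hardware queue the frame simply waits for the next completion, as before. The extra kick only matters in the idle case, which is the case that currently times out.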

How-To-Repeat:

* Associate to an 11n enabled access point.
* Pass some TX traffic to ensure that aggregation is established (wlandebug +11n first, so you get told of this).
* Then start a ping on the station, whilst running "ifconfig wlanX scan".
* See it log these errors.

Comment 1 Mark Linimon freebsd_committer

2011-12-29 03:49:20 UTC

Responsible Changed
From-To: freebsd-bugs->freebsd-wireless

Comment 2 dfilter service freebsd_committer

2012-01-01 01:09:06 UTC

Author: adrian
Date: Sun Jan  1 01:08:51 2012
New Revision: 229165
URL: http://svn.freebsd.org/changeset/base/229165

Log:
  If frames are dumped out of the queue, let's at least see what they are.
  
  This shows that the majority of the weird traffic I see here are probe
  frames that haven't been sent out, but I can also trigger this condition
  by doing ICMP w/ -i 0.3 - enough to trigger the TX during actual scanning,
  but not fast enough to stop scanning from occurring.
  
  PR:		kern/163689

Modified:
  head/sys/dev/ath/if_ath_tx.c

Modified: head/sys/dev/ath/if_ath_tx.c
==============================================================================
--- head/sys/dev/ath/if_ath_tx.c	Sun Jan  1 00:23:32 2012	(r229164)
+++ head/sys/dev/ath/if_ath_tx.c	Sun Jan  1 01:08:51 2012	(r229165)
@@ -2405,6 +2405,12 @@ ath_tx_tid_drain(struct ath_softc *sc, s
 			     tid->hwq_depth, tid->incomp, tid->baw_head,
 			     tid->baw_tail, tap == NULL ? -1 : tap->txa_start,
 			     ni->ni_txseqs[tid->tid]);
+
+			/* XXX Dump the frame, see what it is? */
+			ieee80211_dump_pkt(ni->ni_ic,
+			    mtod(bf->bf_m, const uint8_t *),
+			    bf->bf_m->m_len, 0, -1);
+
 			t = 1;
 		}
 

Comment 3 Eitan Adler freebsd_committer

2018-05-28 19:45:51 UTC

batch change:

For bugs that match the following
-  Status Is In progress 
AND
- Untouched since 2018-01-01.
AND
- Affects Base System OR Documentation

DO:

Reset to open status.


Note:
I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.

Comment 4 Oleksandr Tymoshenko freebsd_committer

2019-01-19 20:14:20 UTC

There was a commit referencing this bug, but it's still not closed and has been inactive for some time. Closing as fixed. Please re-open it if the issue hasn't been completely resolved.