Bug 188576 - [ath] traffic hangs in station mode when downgrading from AMPDU TX or reassociating
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: wireless
Version: Unspecified
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-wireless (Nobody)
 
Reported: 2014-04-14 00:30 UTC by Adrian Chadd
Modified: 2014-08-06 08:49 UTC

Description Adrian Chadd freebsd_committer freebsd_triage 2014-04-14 00:30:00 UTC
Whenever an ath(4) 11n station reassociates or downgrades from aggregation to no aggregation, there's a chance that it'll hang and refuse to queue more frames.

The session needs to be fully torn down (e.g. ifconfig wlanX down) for things to go back to normal.

Fix: 

I actually have debugged this a little already.

So the problem seems to be that there's more than one entry point into ath_tx_tid_cleanup(). It's likely either a couple of calls from the reassociation path, or one from reassociation and one from aggregation teardown. I'll go figure that bit out soon.

But what it leads to is thus:

* the caller causes ath_tx_tid_pause();
* ath_tx_tid_cleanup() is called;
* the first time this happens it sees there's 1 or more frames to clean up, so it sets tid->cleanup_inprogress;
* the caller then checks if that's set to 1 - if so, it assumes that it should wait until the cleanup is finished;
* otherwise it calls ath_tx_tid_resume().

If tid->cleanup_inprogress is set to 1 then the normal TX completion path will eventually call ath_tx_comp_cleanup_unaggr() or ath_tx_comp_cleanup_aggr() which will clear the flag and resume the TID.

If a second path through ath_tx_tid_cleanup() occurs, then:

* the caller pauses;
* ath_tx_tid_cleanup() is called;
* tid->cleanup_inprogress is already set to 1, but there's no code to check whether this call actually set it or not - so the caller sees the flag and doesn't call ath_tx_tid_resume().

So the TID has been paused twice, but once the frames complete ath_tx_tid_resume() is only called once. There's still a pending pause reference and thus traffic never resumes.
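
The sequence above can be modelled in a few lines of standalone C. This is only a sketch of the reference-count imbalance, not driver code: struct model_tid and every model_* function are hypothetical stand-ins for the ath(4) TID state and for ath_tx_tid_pause(), ath_tx_tid_resume(), ath_tx_tid_cleanup() and the TX completion path. The guarded caller shows one possible way to balance the count; it is an assumption for illustration and not necessarily what was committed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-TID state; not the real driver struct. */
struct model_tid {
	int  paused;             /* pause reference count; traffic flows only at 0 */
	bool cleanup_inprogress; /* set while in-flight frames still need cleanup */
	int  inflight;           /* frames still owned by the hardware */
};

static void model_pause(struct model_tid *t) { t->paused++; }

static void model_resume(struct model_tid *t)
{
	assert(t->paused > 0);   /* a resume must match an earlier pause */
	t->paused--;
}

/* Models the cleanup call: flag cleanup if frames are still outstanding. */
static void model_cleanup(struct model_tid *t)
{
	if (t->inflight > 0)
		t->cleanup_inprogress = true;
}

/* Caller as described in the report: resume only if no cleanup is pending. */
static void model_caller_buggy(struct model_tid *t)
{
	model_pause(t);
	model_cleanup(t);
	if (!t->cleanup_inprogress)
		model_resume(t);
}

/* Assumed guard: only the call that started the cleanup keeps its pause. */
static void model_caller_guarded(struct model_tid *t)
{
	bool already = t->cleanup_inprogress;

	model_pause(t);
	model_cleanup(t);
	if (already || !t->cleanup_inprogress)
		model_resume(t);
}

/* Models TX completion: clears the flag and resumes exactly once. */
static void model_tx_complete_all(struct model_tid *t)
{
	t->inflight = 0;
	if (t->cleanup_inprogress) {
		t->cleanup_inprogress = false;
		model_resume(t);
	}
}

/* Two cleanup callers, then completion; returns leftover pause references. */
int model_run(bool guarded)
{
	struct model_tid t = { 0, false, 2 };

	for (int i = 0; i < 2; i++) {
		if (guarded)
			model_caller_guarded(&t);
		else
			model_caller_buggy(&t);
	}
	model_tx_complete_all(&t);
	return t.paused;
}
```

With the buggy caller, model_run(false) leaves one pause reference behind, which corresponds to the hung TID. With the guarded caller, model_run(true) returns the count to zero.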
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2014-04-14 00:36:00 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-wireless

Over to maintainer(s).
Comment 2 Adrian Chadd freebsd_committer freebsd_triage 2014-08-06 08:49:30 UTC
I believe this is fixed in FreeBSD-HEAD.

I found and squished these until I had it constantly doing AMPDU upgrade/downgrade due to high packet loss whilst doing constant traffic.

I'll re-open this if it pops up again.