Bug 198298

Summary: [ath] ath_edma_rxbuf_alloc()/ath_edma_rxfifo_alloc() causes system lockup when RX buffers are exhausted
Product: Base System
Reporter: Glen Barber <gjb>
Component: kern
Assignee: freebsd-wireless (Nobody) <wireless>
Status: New
Severity: Affects Only Me
CC: adrian
Priority: ---
Version: CURRENT
Hardware: Any
OS: Any

Description Glen Barber freebsd_committer freebsd_triage 2015-03-05 02:47:23 UTC
syslog(8) logs the following just before the system becomes unresponsive:

Mar  4 20:48:15 nucleus kernel: ath0: ath_edma_rxfifo_alloc: Q1: alloc failed: i=0, nbufs=128?
Mar  4 20:48:15 nucleus kernel: ath0: ath_edma_rxbuf_alloc: nothing on rxbuf?!
Mar  4 20:48:15 nucleus kernel: ath0: ath_edma_rxfifo_alloc: Q1: alloc failed: i=0, nbufs=128?
Mar  4 20:48:15 nucleus kernel: ath0: ath_edma_rxbuf_alloc: nothing on rxbuf?!

When setting dev.ath.0.txagg to '1', dmesg(8) shows:

Mar  4 20:57:44 nucleus kernel: no tx bufs (empty list): 0
Mar  4 20:57:44 nucleus kernel: no tx bufs (was busy): 0
Mar  4 20:57:44 nucleus kernel: aggr single packet: 0
Mar  4 20:57:44 nucleus kernel: aggr single packet w/ BAW closed: 0
Mar  4 20:57:44 nucleus kernel: aggr non-baw packet: 0
Mar  4 20:57:44 nucleus kernel: aggr aggregate packet: 0
Mar  4 20:57:44 nucleus kernel: aggr single packet low hwq: 0
Mar  4 20:57:44 nucleus kernel: aggr single packet RTS aggr limited: 0
Mar  4 20:57:44 nucleus kernel: aggr sched, no work: 41
Mar  4 20:57:44 nucleus kernel: 0:          0  1:          0  2:          0  3:          0 
Mar  4 20:57:44 nucleus kernel: 4:          0  5:          0  6:          0  7:          0 
Mar  4 20:57:44 nucleus kernel: 8:          0  9:          0 10:          0 11:          0 
Mar  4 20:57:44 nucleus kernel: 12:          0 13:          0 14:          0 15:          0 
Mar  4 20:57:44 nucleus kernel: 16:          0 17:          0 18:          0 19:          0 
Mar  4 20:57:44 nucleus kernel: 20:          0 21:          0 22:          0 23:          0 
Mar  4 20:57:44 nucleus kernel: 24:          0 25:          0 26:          0 27:          0 
Mar  4 20:57:44 nucleus kernel: 28:          0 29:          0 30:          0 31:          0 
Mar  4 20:57:44 nucleus kernel: 32:          0 33:          0 34:          0 35:          0 
Mar  4 20:57:44 nucleus kernel: 36:          0 37:          0 38:          0 39:          0 
Mar  4 20:57:44 nucleus kernel: 40:          0 41:          0 42:          0 43:          0 
Mar  4 20:57:44 nucleus kernel: 44:          0 45:          0 46:          0 47:          0 
Mar  4 20:57:44 nucleus kernel: 48:          0 49:          0 50:          0 51:          0 
Mar  4 20:57:44 nucleus kernel: 52:          0 53:          0 54:          0 55:          0 
Mar  4 20:57:44 nucleus kernel: 56:          0 57:          0 58:          0 59:          0 
Mar  4 20:57:44 nucleus kernel: 60:          0 61:          0 62:          0 63:          0 
Mar  4 20:57:44 nucleus kernel: 
Mar  4 20:57:44 nucleus kernel: HW TXQ 0: axq_depth=0, axq_aggr_depth=0, axq_fifo_depth=0, holdingbf=0
Mar  4 20:57:44 nucleus kernel: HW TXQ 1: axq_depth=0, axq_aggr_depth=0, axq_fifo_depth=0, holdingbf=0
Mar  4 20:57:44 nucleus kernel: HW TXQ 2: axq_depth=0, axq_aggr_depth=0, axq_fifo_depth=0, holdingbf=0
Mar  4 20:57:44 nucleus kernel: HW TXQ 3: axq_depth=0, axq_aggr_depth=0, axq_fifo_depth=0, holdingbf=0
Mar  4 20:57:44 nucleus kernel: HW TXQ 8: axq_depth=0, axq_aggr_depth=0, axq_fifo_depth=0, holdingbf=0
Mar  4 20:57:44 nucleus kernel: Total TX buffers: 512; Total TX buffers busy: 0 (512)
Mar  4 20:57:44 nucleus kernel: Total mgmt TX buffers: 32; Total mgmt TX buffers busy: 0
Mar  4 20:57:44 nucleus kernel: 0: fifolen: 16/16; head=0; tail=0; m_pending=0, m_holdbf=0
Mar  4 20:57:44 nucleus kernel: 1: fifolen: 128/128; head=50; tail=50; m_pending=0, m_holdbf=0
Mar  4 20:57:44 nucleus kernel: Total RX buffers in free list: 368 buffers
Comment 1 Adrian Chadd freebsd_committer freebsd_triage 2015-03-20 05:06:08 UTC
Which laptop hardware is this?
Comment 2 Adrian Chadd freebsd_committer freebsd_triage 2015-03-20 05:15:31 UTC
I don't know why it's locking up your system - maybe this condition is causing an interrupt storm from ath(4) and I'm not handling it right.

The EDMA NICs try to be smart, and I'm tempted to rip out the smart bits and replace them with not-smart bits:

* the interrupt handler pulls out completed descriptors, pushes in fresh ones, and puts the completed descriptors into a pending queue
* .. and then schedules a taskqueue.

* Then when the RX taskqueue runs, it handles RX of each packet and returns now-free descriptors to the hardware.

It's possible that something is making that taskqueue not run and the RX path runs out of descriptors.
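To make that hypothesis concrete, here is a minimal user-space sketch of the flow described above. It is not the ath(4) code: the names (rx_sim, rx_sim_intr, rx_sim_task, rx_sim_run) and the tiny buffer counts are made up for illustration, and buffers are modelled as plain counters. It only shows that if the deferred task never runs, the interrupt side eventually finds the free list empty when it tries to refill the FIFO.

```c
#include <assert.h>

#define NBUFS    8  /* total RX buffers (hypothetical, kept small on purpose) */
#define FIFO_LEN 4  /* RX FIFO depth (hypothetical) */

struct rx_sim {
	int nfree;          /* buffers on the free list */
	int nfifo;          /* buffers currently handed to the hardware FIFO */
	int npending;       /* completed buffers waiting for the deferred task */
	int alloc_failures; /* refill attempts that found the free list empty */
};

void
rx_sim_init(struct rx_sim *s)
{
	s->nfifo = FIFO_LEN;
	s->nfree = NBUFS - FIFO_LEN;
	s->npending = 0;
	s->alloc_failures = 0;
}

/* Interrupt side: pull out completed buffers, push fresh ones in. */
void
rx_sim_intr(struct rx_sim *s, int ndone)
{
	if (ndone > s->nfifo)
		ndone = s->nfifo;
	s->nfifo -= ndone;
	s->npending += ndone;   /* completed buffers go to the pending queue */

	while (s->nfifo < FIFO_LEN) {
		if (s->nfree == 0) {
			s->alloc_failures++;  /* nothing left to refill with */
			break;
		}
		s->nfree--;
		s->nfifo++;
	}
	/* A real driver would schedule the deferred task here. */
	assert(s->nfree + s->nfifo + s->npending == NBUFS);
}

/* Deferred side: handle pending frames, then return the buffers. */
void
rx_sim_task(struct rx_sim *s)
{
	s->nfree += s->npending;
	s->npending = 0;
	while (s->nfifo < FIFO_LEN && s->nfree > 0) {
		s->nfree--;
		s->nfifo++;
	}
	assert(s->nfree + s->nfifo + s->npending == NBUFS);
}

/* Fire nintr interrupts of 2 completed frames each; return refill failures. */
int
rx_sim_run(int nintr, int task_runs)
{
	struct rx_sim s;
	int i;

	rx_sim_init(&s);
	for (i = 0; i < nintr; i++) {
		rx_sim_intr(&s, 2);
		if (task_runs)
			rx_sim_task(&s);
	}
	return (s.alloc_failures);
}
```

With these toy numbers, rx_sim_run(100, 1) reports no refill failures because the task keeps returning buffers, while rx_sim_run(3, 0) already reports one: every buffer is then either in the FIFO or stuck on the pending queue. Whether that is what actually happens on the reporter's machine is still an open question.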

Now, as to why .. hm. I'd like to figure that out.