Bug 217721 - axge(4) hangs while link goes offline
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.0-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-net (Nobody)
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2017-03-12 09:27 UTC by Eugene Lozovoy
Modified: 2017-03-28 06:30 UTC

See Also:


Attachments
axge patch that works for me (898 bytes, patch)
2017-03-12 09:27 UTC, Eugene Lozovoy
axge patch that works for me v2 (899 bytes, patch)
2017-03-13 14:27 UTC, Eugene Lozovoy

Description Eugene Lozovoy 2017-03-12 09:27:35 UTC
Created attachment 180740
axge patch that works for me

How to reproduce:
1. Disconnect the link from the axge network card.
2. Run "while true; do wake ue0 11:22:33:44:55:66; done".
3. Observe a lot of "No buffer space available" messages.
4. Reconnect the link.
5. Run "wake" (or something else) one more time.
6. The "No buffer space available" message still appears.

# usbconfig
ugen0.1: <XHCI root HUB 0x8086> at usbus0, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=SAVE (0mA)
ugen0.2: <AX88179 ASIX Elec. Corp.> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (248mA)

# uname -a
FreeBSD zinc 11.0-RELEASE-p8 FreeBSD 11.0-RELEASE-p8 #0: Wed Feb 22 06:12:04 UTC 2017     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64


I'm not familiar with FreeBSD kernel internals, but the attached patch has worked for me for a long time with no observed side effects.

I suspect axe(4) has the same problem.
Comment 1 Hans Petter Selasky freebsd_committer freebsd_triage 2017-03-13 14:04:26 UTC
Hi,

Your patch doesn't compile. Has it been tested?

--HPS
Comment 2 Eugene Lozovoy 2017-03-13 14:27:52 UTC
Created attachment 180777
axge patch that works for me v2

Fixed in v2.
Comment 3 Pyun YongHyeon freebsd_committer freebsd_triage 2017-03-16 05:49:06 UTC
(In reply to Eugene Lozovoy from comment #2)
Wouldn't you be able to send packets again if you waited some more
time (i.e. link establishment time + time taken to empty queued
packets)?

I guess your patch dequeues packets from the if_snd queue and skips
the packet write when the link is not UP.  This will quickly empty
the if_snd queue if the link is not UP.  So if your intention is to
transmit packets regardless of link state, the patch will work in
that case.  BTW, I think you also want to free the dequeued packets;
otherwise you would end up exhausting mbufs (i.e. an mbuf leak).
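
[Editorial sketch] The leak described above can be illustrated with a small user-space model. This is not the axge(4) code or the attached patch: `struct pkt`, `struct sendq`, `POOL_SIZE` and every function below are hypothetical stand-ins for mbufs and the if_snd queue, and the fixed-size pool only mimics a finite mbuf supply. In the kernel, releasing a dequeued packet would be done with m_freem(9).

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8	/* hypothetical, deliberately tiny "mbuf" pool */

/* Hypothetical stand-in for an mbuf; not the kernel's struct mbuf. */
struct pkt {
	struct pkt *next;
	int in_use;
};

/* Hypothetical stand-in for the if_snd queue. */
struct sendq {
	struct pkt *head;
	struct pkt *tail;
};

static struct pkt pool[POOL_SIZE];

static void pool_reset(void)
{
	for (int i = 0; i < POOL_SIZE; i++)
		pool[i].in_use = 0;
}

/* Returns a free packet, or NULL when the pool is exhausted. */
static struct pkt *pkt_alloc(void)
{
	for (int i = 0; i < POOL_SIZE; i++) {
		if (!pool[i].in_use) {
			pool[i].in_use = 1;
			pool[i].next = NULL;
			return &pool[i];
		}
	}
	return NULL;
}

static void pkt_free(struct pkt *p)
{
	p->in_use = 0;
}

static void sendq_enqueue(struct sendq *q, struct pkt *p)
{
	p->next = NULL;
	if (q->tail != NULL)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
}

static struct pkt *sendq_dequeue(struct sendq *q)
{
	struct pkt *p = q->head;

	if (p == NULL)
		return NULL;
	q->head = p->next;
	if (q->head == NULL)
		q->tail = NULL;
	return p;
}

/*
 * Link is down: dequeue everything and skip the write.  When
 * free_dropped is 0 the dequeued packets are forgotten without being
 * released (the leak); otherwise each one goes back to the pool.
 */
static void drain_link_down(struct sendq *q, int free_dropped)
{
	struct pkt *p;

	while ((p = sendq_dequeue(q)) != NULL) {
		if (free_dropped)
			pkt_free(p);
	}
}

/*
 * Queue and drain one packet per round with the link down.  Returns
 * how many rounds completed before allocation failed, capped at
 * max_rounds.
 */
int rounds_completed(int free_dropped, int max_rounds)
{
	struct sendq q = { NULL, NULL };
	int done = 0;

	pool_reset();
	while (done < max_rounds) {
		struct pkt *p = pkt_alloc();

		if (p == NULL)
			break;	/* pool exhausted by leaked packets */
		sendq_enqueue(&q, p);
		drain_link_down(&q, free_dropped);
		done++;
	}
	return done;
}
```

In this model, calling rounds_completed(0, 100) stops after POOL_SIZE rounds because no packet is ever returned to the pool, while rounds_completed(1, 100) completes all 100 rounds.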

Traditional drivers try very hard not to drop TX packets since TX
is a more expensive operation than RX.  Suppose you unplug the UTP
cable in the middle of a TCP operation or ARP resolution and plug it
in again after some time.  If the driver ignores link state, it will
quickly drop all queued packets and the upper stack has to retransmit
all of them when the link is UP.  If the application is using UDP
(e.g. NFS over UDP), it will also consume lots of CPU cycles.  If the
driver keeps packets in the if_snd queue, it can send them again when
the link is UP.  Many drivers honor link state and don't drop packets
when the link is not available.  This approach has a side effect:
queued packets are sent out later, and those packets could be
meaningless to the receiver.  For instance, if the link down time is
longer than the TCP timeout, the receiver may already have dropped the
connection.  However, given that link DOWN is not a frequent event, I
guess the current behavior would be slightly better than just dropping
packets.
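
[Editorial sketch] The two policies discussed above (hold packets in the queue while the link is down, or drop them) can be compared with a minimal user-space model. Everything here is an assumption made for illustration: `struct model_if`, `SENDQ_MAX`, and the `model_*` functions are hypothetical names, packets are reduced to counters, and the model assumes transmission is restarted as soon as the link comes back. The only fact taken from outside the model is that "No buffer space available" is the message for ENOBUFS, returned here when the queue is full.

```c
#include <assert.h>
#include <errno.h>

#define SENDQ_MAX 4	/* hypothetical queue depth, kept small for clarity */

/* Hypothetical interface model; packets are only counted. */
struct model_if {
	int link_up;		/* 1 when the link is established */
	int queued;		/* packets waiting in the send queue */
	int sent;		/* packets written to the wire */
	int dropped;		/* packets discarded by the driver */
	int drop_when_down;	/* policy: 1 = drop, 0 = hold */
};

/* Transmit start routine: push queued packets out if possible. */
static void model_start(struct model_if *ifp)
{
	if (!ifp->link_up) {
		if (ifp->drop_when_down) {
			/* Drop policy: empty the queue immediately. */
			ifp->dropped += ifp->queued;
			ifp->queued = 0;
		}
		/* Hold policy: leave packets queued for later. */
		return;
	}
	ifp->sent += ifp->queued;
	ifp->queued = 0;
}

/* Queue one packet; returns 0, or ENOBUFS when the queue is full. */
int model_output(struct model_if *ifp)
{
	if (ifp->queued >= SENDQ_MAX)
		return ENOBUFS;
	ifp->queued++;
	model_start(ifp);
	return 0;
}

/* Link state change; a link-up event restarts transmission. */
void model_link_change(struct model_if *ifp, int up)
{
	ifp->link_up = up;
	if (up)
		model_start(ifp);
}
```

With the hold policy and the link down, the first SENDQ_MAX calls to model_output() succeed and later calls return ENOBUFS until model_link_change(ifp, 1) flushes the queue. With the drop policy, model_output() never fails while the link is down, but every packet is counted in `dropped` and the upper layers would have to retransmit.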
Comment 4 Eugene Lozovoy 2017-03-16 06:40:52 UTC
(In reply to Pyun YongHyeon from comment #3)

>Wouldn't you be able to send packets again if you waited some more
>time (i.e. link establishment time + time taken to empty queued
>packets)?
No, I waited ~10 minutes, but only "ifconfig down && ifconfig up" solved the "No buffer space available" problem.

> However, given that link DOWN is not a frequent event, I guess
> the current behavior would be slightly better than just dropping
> packets.
I'm using the axge card to share the network with my home PC, so the link goes DOWN every night, and every morning I get "No buffer space available". My axge interface is a member of a bridge, but the bug reproduces both with and without the bridge.
Comment 5 Pyun YongHyeon freebsd_committer freebsd_triage 2017-03-16 07:07:24 UTC
(In reply to Eugene Lozovoy from comment #4)
Hmm, then this looks like a different issue to me.
I think axge(4) in HEAD has some fixes that have not been merged to
stable/11.  Could you try that?  I guess replacing if_axge.c and
if_axgereg.h with the files in HEAD would be OK to build on
11.0-RELEASE (make sure to make backups, though).
Comment 6 Eugene Lozovoy 2017-03-26 08:57:52 UTC
(In reply to Pyun YongHyeon from comment #5)
Same issue with if_axge.c@304336 and if_axgereg.h@304458.
Comment 7 Pyun YongHyeon freebsd_committer freebsd_triage 2017-03-28 06:30:25 UTC
(In reply to Eugene Lozovoy from comment #6)
Added Kevin (the driver author) to the CC list.
Let's see whether he has a better idea about the issue.