Bug 210488 - ue0 axge AX88179 Ierrs errors under heavy network load
Summary: ue0 axge AX88179 Ierrs errors under heavy network load
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 10.3-RELEASE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-net mailing list
URL:
Keywords: patch
Duplicates: 210464
Depends on:
Blocks:
 
Reported: 2016-06-23 09:12 UTC by mtatarin76
Modified: 2016-07-07 08:31 UTC
CC List: 2 users

See Also:


Description mtatarin76 2016-06-23 09:12:09 UTC
Hi,

I am seeing a lot of Ierrs on my USB NIC (ue0, axge driver, AX88179 USB 3.0 Gigabit Ethernet).

If I switch the port from 1 Gb to 100 Mb, the Ierrs rate decreases,
but under heavy network load the errors come back.

I have tested extensively, and the problem appears to be in the axge driver.

[2.3.1-RELEASE][root@xxxx]/root: netstat -i
Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
ue0    1500 <Link#7>      xx:xx:xx:xx:xx:xx  2779130  2493     0  3466584     0     0

uname -a
FreeBSD atom.home.spb 10.3-RELEASE-p3 FreeBSD 10.3-RELEASE-p3 #2 1988fec(RELENG_2_3_1): Wed May 25 14:15:09 CDT 2016     root@ce23-i386-builder:/builder/pfsense-231/tmp/obj/builder/pfsense-231/tmp/FreeBSD-src/sys/pfSense  i386


What tests or logs are needed to troubleshoot this further?

I can reproduce the problem by running a speedtest.net test:
the Ierrs appear as soon as the speedtest.net upload test starts, and only during that test.
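
To put a number on the reproduction, the Ierrs counter can be sampled before and after a load test. The following is only a sketch: the helper names ierrs_for and ierrs_delta are made up for illustration, and the parsing assumes the FreeBSD 10.x `netstat -i` column layout shown above (Ierrs in the sixth column of the <Link#N> row, with a MAC address present).

```shell
#!/bin/sh
# Hypothetical helpers, not part of FreeBSD.

# Print the Ierrs counter of one interface, reading `netstat -i` output on stdin.
# Assumes columns: Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll
# and that the link-level row has an Address field (true for ue0).
ierrs_for() {
    awk -v ifn="$1" '$1 == ifn && $3 ~ /^<Link#/ { print $6; exit }'
}

# Print how many input errors were added between two samples.
ierrs_delta() {
    echo $(( $2 - $1 ))
}
```

One possible use: run `before=$(netstat -i | ierrs_for ue0)`, start the upload test, run `after=$(netstat -i | ierrs_for ue0)` once it finishes, then `ierrs_delta "$before" "$after"`.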

Thank you in advance,

BR, Mikhail.
Comment 1 mtatarin76 2016-06-23 09:12:44 UTC
*** Bug 210464 has been marked as a duplicate of this bug. ***
Comment 2 Pyun YongHyeon freebsd_committer 2016-06-24 05:26:20 UTC
(In reply to mtatarin76 from comment #0)
I have a WIP version which addresses a couple of issues encountered in the past. I'm not sure what would cause Ierrs for external sites though.
You can get the patch from the following URL.
https://people.freebsd.org/~yongari/axge/axge.tso.diff

The patch adds TSO support and fixes a couple of RX issues.
Let me know whether the patch makes a difference for you.
Comment 3 Pyun YongHyeon freebsd_committer 2016-07-07 05:08:03 UTC
(In reply to Pyun YongHyeon from comment #2)
I was able to reproduce the issue, but I'm still not sure why it
happens.  The currently known workaround is to enable
Ethernet flow control like this.

#ifconfig ue0 media auto media-opt flow

The command above re-establishes the link with the link partner
and enables Ethernet flow control.  Check the current media with
ifconfig(8) after executing the command.  You should see
"rxpause" and "txpause" in the media row of the ifconfig output when
everything goes right.
Note that the link partner must also support flow control; otherwise
the command above has no effect.
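
The check described above can be scripted. This is a minimal sketch, assuming the media row looks like the ifconfig(8) output quoted in this report; has_pause is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) only if the "media:" row of the
# ifconfig output on stdin lists both rxpause and txpause.
has_pause() {
    awk '
        /^[[:space:]]*media:/ { found = ($0 ~ /rxpause/ && $0 ~ /txpause/) }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, `ifconfig ue0 | has_pause && echo "flow control negotiated"`. If the link partner does not support flow control, the helper is expected to fail because neither option shows up in the media row.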

The driver probably needs some change to the RX FIFO configuration when
flow control is not active.  Unfortunately, there is no publicly
available data sheet, so I have to guess at that configuration.
If I manage to find a clue, I'll let you know.
Comment 4 mtatarin76 2016-07-07 07:15:35 UTC
(In reply to Pyun YongHyeon from comment #3)

Please take a look at the error message the command prints:

[2.3.1-RELEASE][root@xxx]/root: ifconfig ue0 media auto media-opt flow
ifconfig: media-opt: bad value
Comment 5 mtatarin76 2016-07-07 07:17:55 UTC
(In reply to Pyun YongHyeon from comment #3)
and ifconfig output as well:

ue0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8000b<RXCSUM,TXCSUM,VLAN_MTU,LINKSTATE>
        ether xx:xx:xx:xx:xx:xx
        inet6 xxxx::xxx:xxxx:xxxx:xxxx%ue0 prefixlen 64 scopeid 0x7
        inet xxx.xxx.xxx.xxx netmask 0xffffff00 broadcast xxx.xxx.xxx.xxx
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
Comment 6 Pyun YongHyeon freebsd_committer 2016-07-07 07:28:48 UTC
(In reply to mtatarin76 from comment #4)
Oops, sorry, there was a typo. The command should be:
#ifconfig ue0 media auto mediaopt flow
Comment 7 mtatarin76 2016-07-07 07:40:52 UTC
(In reply to Pyun YongHyeon from comment #6)

Thank you very much for the help!!!

Your workaround has solved my problem!

I double-checked under heavy load and I no longer see any Ierrs.

ifconfig output after running ifconfig ue0 media auto mediaopt flow:

ue0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8000b<RXCSUM,TXCSUM,VLAN_MTU,LINKSTATE>
        ether xx:xx:xx:xx:xx:xx
        inet6 xxxx::xxx:xxxx:xxxx:xxxx%ue0 prefixlen 64 scopeid 0x7
        inet xxx.xxx.xxx.xxx netmask 0xffffff00 broadcast xxx.xxx.xxx.xxx
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect <flowcontrol> (100baseTX <full-duplex,flowcontrol,rxpause,txpause>)
        status: active

Since you were able to reproduce this problem,
how can we get a patch made and included in a future FreeBSD release as a permanent solution? Could you help with this?

Thank you very much once again!
Comment 8 Pyun YongHyeon freebsd_committer 2016-07-07 08:19:41 UTC
(In reply to mtatarin76 from comment #7)
You can probably put the flow control options in the '/etc/start_if.ue0'
file as a workaround (not tested, though).
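
A sketch of what that could look like, using the corrected command from comment 6. It assumes a stock FreeBSD system where the rc network scripts source /etc/start_if.<interface> before configuring the interface; whether pfSense honors this file is not confirmed here. write_start_if is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: write a start_if file that enables flow control
# for the given interface. The target directory defaults to /etc and can
# be overridden, e.g. to inspect the result before installing it.
write_start_if() {
    ifn="$1"
    dir="${2:-/etc}"
    printf 'ifconfig %s media auto mediaopt flow\n' "$ifn" > "$dir/start_if.$ifn"
}
```

Running `write_start_if ue0` as root would leave a one-line /etc/start_if.ue0 containing `ifconfig ue0 media auto mediaopt flow`.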

Enabling flow control by default in the driver would be a POLA
violation, so I wanted to mitigate the issue when flow control is
not active.  However, it seems there is no good way to mitigate
the issue without increasing the H/W RX FIFO size.  The hardware has a
fixed-size buffer, so we may have to rely on some other mechanism, such as
flow control, to mitigate that.
The issue should happen frequently when the controller operates at
high-speed (USB 2.0) and the established link is 1000baseT (i.e. the
network bottleneck is the USB 2.0 bus).  I contacted the vendor to get
more information on the issue, so let's see how it goes.  If the
vendor confirms my theory, documenting it in axge(4) would be the best
option.
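
To illustrate the bottleneck theory with rough numbers: a 1000baseT link can deliver up to 1000 Mbit/s, while USB 2.0 high-speed signals at 480 Mbit/s (usable throughput is lower), so a fixed RX FIFO fills at the difference between the two rates. The sketch below is only arithmetic. The FIFO size of the AX88179 is not public, so the 65536-byte figure in the usage note is a purely hypothetical value, and fifo_fill_us is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: time in microseconds until a FIFO of the given size
# overflows when data arrives faster than it can be drained.
# Arguments: FIFO size in bytes, ingress rate in Mbit/s, drain rate in Mbit/s.
# 1 Mbit/s equals 1 bit per microsecond, so bits / (Mbit/s) gives microseconds.
fifo_fill_us() {
    fifo_bytes=$1
    in_mbps=$2
    out_mbps=$3
    if [ "$in_mbps" -le "$out_mbps" ]; then
        echo "never"
        return 0
    fi
    echo $(( fifo_bytes * 8 / (in_mbps - out_mbps) ))
}
```

With the hypothetical 64 KiB FIFO, `fifo_fill_us 65536 1000 480` prints 1008, i.e. roughly one millisecond of sustained line-rate input before frames would be dropped. With flow control active, pause frames ask the link partner to stop sending before that point.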

I guess you may see better results when you plug the controller into a
USB 3.0 port.  Due to lack of H/W, I wasn't able to test that.

Thanks for testing!
Comment 9 mtatarin76 2016-07-07 08:31:42 UTC
Thank you once again for your efforts!

I have already put this workaround in an rc boot shell script, and it works.
I'm happy with it now!

I am currently using a USB 3 port.

It would be good if you could get some confirmation and an explanation from the vendor as well. I would be interested in any further updates.