Bug 213814 - AWS/EC2: no egress traffic stats on ixv(4)
Summary: AWS/EC2: no egress traffic stats on ixv(4)
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 11.0-STABLE
Hardware: amd64 Any
Importance: Normal Affects Some People
Assignee: freebsd-net (Nobody)
URL: https://reviews.freebsd.org/D11058
Keywords: IntelNetworking, needs-qa, regression
Depends on:
Blocks:
 
Reported: 2016-10-26 17:39 UTC by pete
Modified: 2018-06-03 18:04 UTC
CC List: 11 users

See Also:
koobs: mfc-stable11?


Description pete 2016-10-26 17:39:49 UTC
I have only observed this on 11.0-RELEASE on AWS, as that is the only platform I have access to, but I believe it affects all versions.

uname:
$ uname -ar
FreeBSD redis-prod0.skippy.com 11.0-RELEASE-p1 FreeBSD 11.0-RELEASE-p1 #0 r306420: Thu Sep 29 01:43:23 UTC 2016     root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64


Problem description:
If you invoke systat with the "-ifstat" flag, egress traffic is not updated for non-loopback interfaces in the UI.  Ingress updates correctly, and both directions update as expected on lo0.

I have verified that I am not seeing this problem on 10.3-RELEASE ec2 instances.  Interestingly enough I'm not seeing this issue on 12-CURRENT either.
Comment 1 Bradley T. Hughes freebsd_committer freebsd_triage 2016-11-01 09:24:51 UTC
I see this also on 11.0-RELEASE-p1 with the ixv0 interface:

$ uname -a
FreeBSD ip-172-30-0-105 11.0-RELEASE-p1 FreeBSD 11.0-RELEASE-p1 #0 r306420: Thu Sep 29 01:43:23 UTC 2016     root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64

$ systat -ifstat
...
           ixv0  in      0.000 KB/s          0.000 KB/s          633.016 MB
                 out     0.000 KB/s          0.000 KB/s            0.000 KB
...

netstat(1) also shows zero outgoing packets/bytes on the interface:
$ netstat -bn -I ixv0
Name    Mtu Network       Address              Ipkts Ierrs Idrop     Ibytes    Opkts Oerrs     Obytes  Coll
ixv0   1500 <Link#1>      02:f1:96:c7:46:19        0     0     0  663770534        0     0          0     0
ixv0      - 172.30.0.0/24 172.30.0.105        942591     -     -  648858505  1050933     -  352207042     -
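
The symptom in the netstat(1) output above (zero Opkts/Obytes on the link-level row while the address-level row keeps counting) can be checked mechanically. The following is a minimal Python sketch, assuming the 12-column `netstat -bn -I <ifname>` layout shown in this comment; the function names (parse_netstat, egress_counters_stuck) are hypothetical helpers, not part of any FreeBSD tool.

```python
# Sample copied from the netstat output in this comment.
SAMPLE = """\
Name    Mtu Network       Address              Ipkts Ierrs Idrop     Ibytes    Opkts Oerrs     Obytes  Coll
ixv0   1500 <Link#1>      02:f1:96:c7:46:19        0     0     0  663770534        0     0          0     0
ixv0      - 172.30.0.0/24 172.30.0.105        942591     -     -  648858505  1050933     -  352207042     -
"""


def parse_netstat(text):
    """Return one dict per data row, keyed by the header column names."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip rows that do not line up with the header
        rows.append(dict(zip(header, fields)))
    return rows


def to_int(value):
    """netstat prints '-' where a counter does not apply; treat it as 0."""
    return int(value) if value.isdigit() else 0


def egress_counters_stuck(rows):
    """True if a link-level row exists and reports zero Obytes while an
    address-level row reports outbound bytes."""
    link = [r for r in rows if r["Network"].startswith("<Link")]
    addr = [r for r in rows if not r["Network"].startswith("<Link")]
    if not link:
        return False
    link_out = sum(to_int(r["Obytes"]) for r in link)
    addr_out = sum(to_int(r["Obytes"]) for r in addr)
    return link_out == 0 and addr_out > 0


if __name__ == "__main__":
    print(egress_counters_stuck(parse_netstat(SAMPLE)))
```

On a live system the text would come from running netstat rather than from SAMPLE; the check only compares the Obytes column, so it does not distinguish a missing packet counter from a missing byte counter.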
Comment 2 pete 2016-11-01 17:07:19 UTC
It was suggested that I test out r308126 to see if this addresses the issue.  I have rebuilt the kernel on one of my systems using this patch but am seeing the same issue:

$ uname -ar
FreeBSD netfront-test.skippy.com 11.0-RELEASE-p1 FreeBSD 11.0-RELEASE-p1 #0 r308150M: Mon Oct 31 21:50:43 UTC 2016     pewright@netfront-test.skippy.com:/usr/obj/usr/home/pewright/svn/11.0.1/sys/GENERIC  amd64
Comment 3 ota 2016-11-02 06:24:32 UTC
      Interface           Traffic               Peak                Total
          wlan0  in      0.063 KB/s          3.513 KB/s          291.085 MB
                 out     0.065 KB/s          1.246 KB/s           17.089 MB

            lo0  in      0.000 KB/s          0.000 KB/s           48.340 KB
                 out     0.000 KB/s          0.000 KB/s           48.340 KB

% uname -v
FreeBSD 11.0-RELEASE-p1

I don't seem to have any issues.
Comment 4 pete 2016-11-02 17:52:10 UTC
(In reply to ota from comment #3)
It looks like you are using a wireless interface on 11, which works for me as well.  I believe this issue is isolated to systems running under the Xen hypervisor (on AWS, for example).
Comment 5 Allan Jude freebsd_committer freebsd_triage 2016-12-10 05:27:22 UTC
I am also seeing this problem on AWS

It isn't specific to the tools

systat -ifstat
netstat -I ixv0 1
sysutils/nload

all show the same behaviour
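
A plausible reason all three tools agree is that each derives its rate from the same cumulative per-interface byte counters; that is an assumption here, not something confirmed in this report. Under that assumption, a counter stuck at zero yields a zero rate in every tool. A minimal Python sketch of the calculation, with a hypothetical function name and an assumed 1024-byte KB:

```python
def rate_kb_per_s(prev_bytes, curr_bytes, interval_s):
    """Rate in KB/s between two samples of a cumulative byte counter.

    Assumes 1 KB = 1024 bytes (an assumption, not taken from systat).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = curr_bytes - prev_bytes
    if delta < 0:
        delta = 0  # counter was reset or wrapped; report no traffic
    return delta / 1024.0 / interval_s


if __name__ == "__main__":
    # A counter that never moves (as on the affected ixv0) gives 0 KB/s.
    print(rate_kb_per_s(0, 0, 1))
    # A counter that advanced by 2048 bytes over 2 seconds gives 1 KB/s.
    print(rate_kb_per_s(1024, 3072, 2))
```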
Comment 6 Andreas Andersson 2017-04-01 05:37:14 UTC
This is also consistent on smaller instances such as the t2's. And this appeared in 11.0-RELEASE. It was working perfectly fine in 10.1, 10.2 and 10.3.

It happens both with ixv and xn drivers.

It's not happening with Linux or Windows instances, which makes me believe the problem is in the FreeBSD drivers.
Comment 7 Andrey V. Elsukov freebsd_committer freebsd_triage 2017-04-03 05:49:40 UTC
The ixv driver lacks statistics counter updates.
I'm not sure and am unable to test, but you can probably add several IXGBE_SET_XXX() macros in ixv_update_stats(), as in ixgbe_update_stats_counters().
Also, if_setgetcounterfn() should probably be used to set a custom if_get_counter() handler, and the IFCAP_HWSTATS capability added to if_capabilities.
Comment 8 pete 2017-04-20 00:07:53 UTC
A couple of observations, as I've had a few cycles to look at the ixgbe/if_ix.c and ixgbe/if_ixv.c source.

1) It looks like both drivers are only populating rx_bytes SYSCTL statistic, but rx and tx packets counters are defined (and are updating as per testing on my dev machines).

- I've found that other drivers such as ixl do populate rx and tx byte counters.

2) the if_ixv code does *not* have any code to populate the OS statistics structure as per previous comment.  There *is* code in the if_ix.c though, so perhaps it is easy to port that to the if_ixv device?  My suspicion is that adding this functionality will fix userland tools.

Since I'm pretty green when it comes to hacking device drivers, I'm going to play with adding tx_bytes to SYSCTL and see how that goes.  If anyone with more experience hacking on drivers wants to take a stab at getting stats in there, I'd be more than happy to test :)
Comment 9 Andreas Andersson 2017-05-21 07:08:26 UTC
Any news on this? I would be happy to try out patches.
Comment 10 pete 2017-05-22 15:48:59 UTC
(In reply to a.andersson.thn from comment #9)
Unfortunately I have not made much progress on my end. Like you, I'm keen to test patches, as this is affecting my production systems in AWS.  For now I'm using CloudWatch metrics for network utilization as a workaround, but I would prefer to have the OS report this data correctly.
Comment 11 Kubilay Kocak freebsd_committer freebsd_triage 2017-06-04 06:48:15 UTC
Re-assign to more appropriate ML (freebsd-net), cc'ing original ML (virtualization).
Comment 12 Jeff Pieper 2017-06-15 15:59:31 UTC
This is fixed in https://reviews.freebsd.org/D11058
Comment 13 Kubilay Kocak freebsd_committer freebsd_triage 2017-06-16 01:50:31 UTC
CC release engineering for 11.1-RELEASE
Comment 14 Andreas Andersson 2017-09-03 05:21:25 UTC
This still does not work.
Comment 15 pete 2017-09-11 23:15:53 UTC
(In reply to Jeff Pieper from comment #12)
The committed fix does *not* work on 11.1-RELEASE on AWS:

$ uname -ar
FreeBSD snap-prod2.iad0.tribdev.com 11.1-RELEASE-p1 FreeBSD 11.1-RELEASE-p1 #0: Wed Aug  9 11:55:48 UTC 2017     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

I've confirmed on multiple C4 instances running the same official AMI.
Comment 16 Allan Jude freebsd_committer freebsd_triage 2017-09-13 00:26:49 UTC
Change title back. xn(4) was an unrelated issue that only looked similar.
Comment 17 Ben Hood 2017-09-20 17:05:52 UTC
(In reply to pete from comment #15)

I've just encountered this missing TX issue on AWS with the xn0 driver:

$ uname -v
FreeBSD 11.0-RELEASE-p2 #0: Mon Oct 24 06:55:27 UTC 2016     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC

$ systat -ifstat 1
Interface           Traffic               Peak                Total
            xn0  in      1.439 KB/s          2.444 KB/s           20.686 GB
                 out     0.000 KB/s          0.000 KB/s            0.000 KB

$ netstat -I xn0 -b
Name    Mtu Network       Address              Ipkts Ierrs Idrop     Ibytes    Opkts Oerrs     Obytes  Coll
xn0    1500 <Link#2>      06:d0:71:b3:c1:47 296475044     0     0 22211709752 582065784     0          0     0
xn0       - 10.10.0.0/20  10.10.10.20       295927017     -     - 18077218113 581504705     - 32914263942     -

I'm glad to see that other people have reproduced this issue and that it's not only something that I have experienced.
Comment 18 Sean Bruno freebsd_committer freebsd_triage 2018-06-03 18:04:04 UTC
xn(4) seems to be working for me on a RootBSD VM.  ixlv(4) seems to be working for others, and this should no longer be a problem in 12-CURRENT and the 11.2-RELEASE images.  Please retest at your earliest convenience if you wish.