Bug 268490 - [igb] [lagg] [vlan]: Intel i210 performance severely degraded
Summary: [igb] [lagg] [vlan]: Intel i210 performance severely degraded
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.4-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-net (Nobody)
URL:
Keywords: needs-qa, regression
Depends on:
Blocks:
 
Reported: 2022-12-20 18:01 UTC by Daniel Duerr
Modified: 2023-05-05 13:41 UTC
CC: 10 users

See Also:


Attachments
iperf mtu 9000, pcap (321.60 KB, image/png)
2023-04-15 19:07 UTC, Santiago Martinez

Description Daniel Duerr 2022-12-20 18:01:52 UTC
I have a ZFS file server running ctld to provide an iSCSI target to VMware hosts. It runs on a Supermicro motherboard with dual Intel i210 ports. The igb ports are configured in an LACP lagg with the switch, and vlan is then used to create 4 VLAN interfaces on top of that. The system was rock-solid stable in production for over a year on 12.2-STABLE. Upgrading it to 12.4-RELEASE two weeks ago rendered it unusable as an iSCSI target. I've tried disabling all the HW offload options to no avail.

The ctld iSCSI target lives on VLAN 8 (172.27.6.135). When I try to connect to the target from VMware's initiator, VMware basically hangs and I start seeing interface errors accumulate on this machine. Once I saw this behavior, I assumed it was a lower-level network issue and not specific to ctld at all. I then removed the VMware iSCSI configuration and focused on diagnosing the lower-level network.

Here's a pair of iperf tests performed from a neighboring FreeBSD 12.4 machine on the same VLAN segment (172.27.6.130) which has an igb interface without lagg+vlan. The first test is to this machine's lagg0.8 interface (172.27.6.135), and the second test is to a third machine (172.27.6.129) which also has an igb interface without lagg+vlan:

$ for ip in 172.27.6.135 172.27.6.129; do iperf -c ${ip}; done
------------------------------------------------------------
Client connecting to 172.27.6.135, TCP port 5001
TCP window size: 32.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.130 port 22350 connected with 172.27.6.135 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-20.54 sec  43.1 KBytes  17.2 Kbits/sec
------------------------------------------------------------
Client connecting to 172.27.6.129, TCP port 5001
TCP window size: 35.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.130 port 11072 connected with 172.27.6.129 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.01 sec  1.15 GBytes   984 Mbits/sec

Here's this machine's config from /etc/rc.conf:

ifconfig_igb0="mtu 9000 media 1000baseTX mediaopt full-duplex -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso -vlanhwtag -vlanhwcsum -vlanhwfilter up"
ifconfig_igb1="mtu 9000 media 1000baseTX mediaopt full-duplex -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso -vlanhwtag -vlanhwcsum -vlanhwfilter up"
cloned_interfaces="lagg0"
ifconfig_lagg0="up laggproto lacp laggport igb0 laggport igb1 lacp_fast_timeout"
vlans_lagg0="6 7 8 10"
ifconfig_lagg0_6="inet 172.27.6.10/26 mtu 1500"
ifconfig_lagg0_7="inet 172.27.6.123/26 mtu 1500"
ifconfig_lagg0_8="inet 172.27.6.135/28 mtu 9000"
ifconfig_lagg0_10="inet 172.27.6.251/26 mtu 1500"

Here's this machine's relevant ifconfig output:

igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=800028<VLAN_MTU,JUMBO_MTU>
	ether 00:25:90:d6:e6:72
	media: Ethernet 1000baseT <full-duplex>
	status: active
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=800028<VLAN_MTU,JUMBO_MTU>
	ether 00:25:90:d6:e6:72
	hwaddr 00:25:90:d6:e6:73
	media: Ethernet 1000baseT <full-duplex>
	status: active
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=800028<VLAN_MTU,JUMBO_MTU>
	ether 00:25:90:d6:e6:72
	laggproto lacp lagghash l2,l3,l4
	laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
	laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
	groups: lagg
	media: Ethernet autoselect
	status: active
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lagg0.8: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	ether 00:25:90:d6:e6:72
	inet 172.27.6.135 netmask 0xfffffff0 broadcast 172.27.6.143
	groups: vlan
	vlan: 8 vlanpcp: 0 parent interface: lagg0
	media: Ethernet autoselect
	status: active
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
Comment 1 Daniel Duerr 2022-12-20 18:10:18 UTC
FWIW, the upgrade to 12.4-RELEASE negatively affected all of my igb network configurations that had worked previously on 12.2-STABLE. In one case, removing the use of VLANs worked around this underlying issue -- thankfully I had enough individual igb ports to do so. But in this case, I can't do this because I only have 2 ports to work with.
Comment 2 Ed Maste freebsd_committer freebsd_triage 2022-12-21 02:36:13 UTC
Kevin, any ideas?
Comment 3 Kevin Bowling freebsd_committer freebsd_triage 2023-02-09 07:14:35 UTC
Daniel, can you please bisect the commit range?  There isn't that much that has happened in sys/dev/e1000/* so it should hopefully be found in a half dozen builds bisecting stable/12.
Comment 4 Daniel Duerr 2023-02-09 14:17:32 UTC
(In reply to Kevin Bowling from comment #3)

I'm happy to help with that, Kevin. But I'm not familiar with how to do it and I do not have a 12.2-STABLE machine available from which to compare. Any guidance you can give me would be much appreciated. Thanks!
Comment 5 Kevin Bowling freebsd_committer freebsd_triage 2023-02-10 12:23:18 UTC
(In reply to Daniel Duerr from comment #4)

1) Clone FreeBSD src, check out 12.4:
pkg install git-lite
git clone https://git.freebsd.org/src.git /usr/src
cd /usr/src
git checkout releng/12.4

2) Start the bisecting using the 12.4 release branch as a known bad point, 12.2 as known good and limit the search to the e1000 driver commits:
cd /usr/src
git bisect start releng/12.4 releng/12.2 -- sys/dev/e1000

3) Build and install the kernel:
cd /usr/src
make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG
make installkernel KERNCONF=GENERIC-NODEBUG

4) Reboot the system.  You will be in the new kernel (confirm with uname -a, it will show you the build date and git hash).

5) Perform some test like your iperf.

6) If the performance is good:
cd /usr/src
git bisect good
Repeat step 3 & 4

7) If the performance is bad:
cd /usr/src
git bisect bad
Repeat step 3 & 4

8) Keep repeating steps 3 through 7.  After a handful of repetitions, you will run out of commits to test and it will spit out the first bad hash.  Post that here.

9) To restore your system to a desired kernel, say 12.4 with security patches:
git checkout releng/12.4
Repeat step 3 & 4
Comment 6 Kevin Bowling freebsd_committer freebsd_triage 2023-02-10 12:27:25 UTC
Correction for step 9: before running 'git checkout <branch>', perform 'git bisect reset' to take the repo out of bisecting mode.
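
Folding that correction in, the restore sequence from step 9 becomes (just restating steps 3, 4, and 9 with the fix, nothing new):

cd /usr/src
git bisect reset
git checkout releng/12.4
make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG
make installkernel KERNCONF=GENERIC-NODEBUG
reboot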
Comment 7 Daniel Duerr 2023-02-13 14:44:13 UTC
(In reply to Kevin Bowling from comment #5)

Thank you for the detailed instructions, Kevin.  Makes sense.

FWIW, after completing step 1 I got an error on step 2:

[root@nfs src]# git bisect start releng/12.4 releng/12.2 -- sys/dev/e1000
fatal: 'releng/12.2' does not appear to be a valid revision

Suspecting there might be tags instead, I used `git tag -l` to see a list of them. This command works, as the release tags are valid revisions:

[root@nfs src]# git bisect start release/12.4.0 release/12.2.0 -- sys/dev/e1000
Bisecting: a merge base must be tested
[68cfeeb1d3c428e3c3881f45bc3a20a252b37d0e] MFC r365284:

Please let me know if this seems like an OK substitute for your proposed command in step 2.
Comment 8 Daniel Duerr 2023-02-13 15:32:43 UTC
(In reply to Kevin Bowling from comment #5)

Hi Kevin, sorry to say I'm having a difficult time getting the kernel to compile. I have a pretty decent amount of experience building custom kernels, but cannot for the life of me get past this error:

env NM='nm' NMFLAGS='' sh /usr/src/sys/kern/genassym.sh ia32_genassym.o > ia32_assym.h
cc -target x86_64-unknown-freebsd12.2 --sysroot=/usr/obj/usr/src/amd64.amd64/tmp -B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin -c -x assembler-with-cpp -DLOCORE -O2 -pipe -fno-strict-aliasing  -g -nostdinc  -I. -I/usr/src/sys -I/usr/src/sys/contrib/ck/include -I/usr/src/sys/contrib/libfdt -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common  -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -MD  -MF.depend.acpi_wakecode.o -MTacpi_wakecode.o -fdebug-prefix-map=./machine=/usr/src/sys/amd64/include -fdebug-prefix-map=./x86=/usr/src/sys/x86/include -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float  -fno-asynchronous-unwind-tables -ffreestanding -fwrapv -fstack-protector -gdwarf-2 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wcast-qual -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__ -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas -Wno-error-tautological-compare -Wno-error-empty-body -Wno-error-parentheses-equality -Wno-error-unused-function -Wno-error-pointer-sign -Wno-error-shift-negative-value -Wno-address-of-packed-member  -mno-aes -mno-avx  -std=iso9899:1999  -Werror /usr/src/sys/amd64/acpica/acpi_wakecode.S
error: unknown -Werror warning specifier: '-Wno-error-tautological-compare' [-Werror,-Wunknown-warning-option]
error: unknown -Werror warning specifier: '-Wno-error-empty-body' [-Werror,-Wunknown-warning-option]
error: unknown -Werror warning specifier: '-Wno-error-parentheses-equality' [-Werror,-Wunknown-warning-option]
error: unknown -Werror warning specifier: '-Wno-error-unused-function' [-Werror,-Wunknown-warning-option]
error: unknown -Werror warning specifier: '-Wno-error-pointer-sign' [-Werror,-Wunknown-warning-option]
error: unknown -Werror warning specifier: '-Wno-error-shift-negative-value' [-Werror,-Wunknown-warning-option]
*** Error code 1

I get this whether I build the GENERIC-NODEBUG config (which I created following a guide I found online) or the stock GENERIC config. I have not hit this issue before; I'm not sure why it is happening now and can't find much info on it.
Comment 9 Kevin Bowling freebsd_committer freebsd_triage 2023-02-13 22:38:48 UTC
(In reply to Daniel Duerr from comment #8)
Try one of the methods from https://groups.google.com/g/bsdmailinglist/c/Wz3lSE20hWU, either using WITHOUT_SYSTEM_COMPILER=yes or cherry-picking the format commits as needed.
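
For example, the first approach could presumably be applied via src.conf (a sketch; /etc/src.conf is the usual home for WITHOUT_* build knobs):

echo 'WITHOUT_SYSTEM_COMPILER=yes' >> /etc/src.conf
cd /usr/src
make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG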
Comment 10 Daniel Duerr 2023-02-13 22:40:08 UTC
(In reply to Kevin Bowling from comment #9)

Thanks Kevin. I did manage to get myself unblocked but forgot to respond back to you here. I needed to do a `make buildworld` and now `make kernel` works as expected.

I'll report back once I finish the bisect.
Comment 11 Kevin Bowling freebsd_committer freebsd_triage 2023-02-13 22:48:52 UTC
You can save some time by avoiding buildworld; we are only interested in the e1000 driver changes in the kernel.
Comment 12 Daniel Duerr 2023-02-15 13:19:03 UTC
(In reply to Kevin Bowling from comment #3)

Ok, I completed the bisect process you outlined above -- thanks again for the great instructions.  Here's the output:

6486e9dd8d24b0195facd23d8ca82e17e180cffb is the first bad commit
commit 6486e9dd8d24b0195facd23d8ca82e17e180cffb
Author: Eric Joyner <erj@FreeBSD.org>
Date:   Mon Sep 21 22:52:57 2020 +0000

    MFC r365774 and r365776
    
    These two commits fix issues in em(4)/igb(4):
    - Fix define and includes with RSS option enabled
    - Properly retain promisc flag in init
    
    PR:             249191, 248869
    MFC after:      1 day

 sys/dev/e1000/if_em.c | 2 +-
 sys/dev/e1000/if_em.h | 5 +++++
 2 files changed, 6 insertions(+), 1 deletion(-)

On the first commit I tested, performance was normal/awesome:

[root@nfs dd]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 54394
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.00 sec  1.10 GBytes   942 Mbits/sec
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 53705
[ ID] Interval       Transfer     Bandwidth
[  2] 0.00-10.00 sec  1.09 GBytes   940 Mbits/sec
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 64390
[ ID] Interval       Transfer     Bandwidth
[  3] 0.00-10.00 sec  1.12 GBytes   958 Mbits/sec

Every commit after that, starting with the first bad commit listed above, performance was terrible:

[root@nfs dd]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 64164
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 65279
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 42346
recv failed: Connection reset by peer
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-79.38 sec  60.0 Bytes  6.05 bits/sec
^CWaiting for server threads to complete. Interrupt again to force quit.
[  2] 0.00-56.63 sec  60.0 Bytes  8.48 bits/sec
[  3] 0.00-36.26 sec  60.0 Bytes  13.2 bits/sec
[SUM] 0.00-95.55 sec   180 Bytes  15.1 bits/sec
Comment 13 Franco Fichtner 2023-02-15 14:04:46 UTC
If it's about the missing promisc flag, which was set invisibly by default before this commit, it's easy to test from the bad state:

# ifconfig igb0 promisc
# ifconfig igb1 promisc

And try the iperf again...


Cheers,
Franco
Comment 14 Daniel Duerr 2023-02-15 14:14:29 UTC
(In reply to Franco Fichtner from comment #13)

Thank you, Franco. I tested what you suggested here but I do not see a change. Here's my `ifconfig` output before (the PROMISC flag is missing as you suggest):

igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=800028<VLAN_MTU,JUMBO_MTU>
        ether 00:25:90:d6:e6:72
        media: Ethernet 1000baseT <full-duplex>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=800028<VLAN_MTU,JUMBO_MTU>
        ether 00:25:90:d6:e6:72
        hwaddr 00:25:90:d6:e6:73
        media: Ethernet 1000baseT <full-duplex>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

And here are the baseline `iperf -s` test numbers to match:

[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-79.38 sec  60.0 Bytes  6.05 bits/sec
[  2] 0.00-76.85 sec  60.0 Bytes  6.25 bits/sec
[  3] 0.00-56.37 sec  60.0 Bytes  8.52 bits/sec
[SUM] 0.00-98.92 sec   180 Bytes  14.6 bits/sec

I then made the change you suggested:

[root@nfs dd]# ifconfig igb0 promisc
[root@nfs dd]# ifconfig igb1 promisc

Which is reflected in the `ifconfig` output:

igb0: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 9000
        options=800028<VLAN_MTU,JUMBO_MTU>
        ether 00:25:90:d6:e6:72
        media: Ethernet 1000baseT <full-duplex>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb1: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 9000
        options=800028<VLAN_MTU,JUMBO_MTU>
        ether 00:25:90:d6:e6:72
        hwaddr 00:25:90:d6:e6:73
        media: Ethernet 1000baseT <full-duplex>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

But the `iperf -s` numbers are still terrible:

[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-68.47 sec  60.0 Bytes  7.01 bits/sec
[  2] 0.00-48.60 sec  60.0 Bytes  9.88 bits/sec
[  3] 0.00-28.25 sec  60.0 Bytes  17.0 bits/sec
[SUM] 0.00-70.48 sec   180 Bytes  20.4 bits/sec

Do I need to adjust the `lagg` interface as well to replicate your test, given that I'm running these ports inside a `lagg`?
Comment 15 Franco Fichtner 2023-02-15 15:11:59 UTC
It could be required additionally, or just on the lagg; certainly worth a try in those combinations.
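
For instance, presumably:

# ifconfig lagg0 promisc

either in addition to, or instead of, the flag on igb0/igb1.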
Comment 16 Daniel Duerr 2023-02-15 15:29:13 UTC
(In reply to Franco Fichtner from comment #15)

Thanks. Unfortunately I cannot get any change in behavior with this flag.

[root@nfs dd]# ifconfig
igb0: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 9000
        options=800028<VLAN_MTU,JUMBO_MTU>
        ether 00:25:90:d6:e6:72
        media: Ethernet 1000baseT <full-duplex>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb1: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 9000
        options=800028<VLAN_MTU,JUMBO_MTU>
        ether 00:25:90:d6:e6:72
        hwaddr 00:25:90:d6:e6:73
        media: Ethernet 1000baseT <full-duplex>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=800028<VLAN_MTU,JUMBO_MTU>
        ether 00:25:90:d6:e6:72
        laggproto lacp lagghash l2,l3,l4
        laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        groups: lagg
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lagg0.8: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 9000
        ether 00:25:90:d6:e6:72
        inet 172.27.6.135 netmask 0xfffffff0 broadcast 172.27.6.143
        groups: vlan
        vlan: 8 vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

Also, I don't believe this machine's interfaces would ever have been in PROMISC mode before. I get that there was a change to some implicit defaults, but if I had seen PROMISC and/or PPROMISC in the `ifconfig` output before, I would have thought something was wrong. :) Just saying: even if this workaround had worked, I wouldn't have considered it a normal running state.
Comment 17 Franco Fichtner 2023-02-15 15:41:15 UTC
Well, before this patch igb defaulted to promisc even if it didn't report it to ifconfig. It was debuggable using a VM that asked for privileged network access during boot, in those early 12.x days.

There is something else at play here and the patch likely only surfaced this.


Cheers,
Franco
Comment 18 Daniel Duerr 2023-02-15 15:56:24 UTC
(In reply to Franco Fichtner from comment #17)

Thanks Franco. I agree that whatever I am experiencing here seems a little different from what you were describing. I've tried more combinations of `ifconfig promisc` and none change the behavior I'm seeing. Yet the kernel built from the earliest 12.2 commit works as expected.
Comment 19 Kevin Bowling freebsd_committer freebsd_triage 2023-02-15 19:42:47 UTC
As a sanity check, can you revert 6486e9dd8d24b0195facd23d8ca82e17e180cffb, or manually change the em_if_set_promisc() call back to its old form, on top of 12.4?
Comment 20 Daniel Duerr 2023-02-16 15:13:37 UTC
(In reply to Kevin Bowling from comment #19)

Sure. I started with an up-to-date `releng/12.4` branch with no changes. The `git revert 6486e9dd8d24b0195facd23d8ca82e17e180cffb` command did not apply cleanly and needed manual conflict resolution. I think I resolved it properly -- here's the diff between my local version and origin/releng/12.4:

[root@nfs src]# git diff origin/releng/12.4 sys/dev/e1000/if_em.c
diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
index a9ab2fb21535..0f20449db6ec 100644
--- a/sys/dev/e1000/if_em.c
+++ b/sys/dev/e1000/if_em.c
@@ -1361,7 +1361,7 @@ em_if_init(if_ctx_t ctx)
        em_setup_vlan_hw_support(ctx);
 
        /* Don't lose promiscuous settings */
-       em_if_set_promisc(ctx, if_getflags(ifp));
+       em_if_set_promisc(ctx, IFF_PROMISC);
        e1000_clear_hw_cntrs_base_generic(&sc->hw);
 
        /* MSI-X configuration for 82574 */

I then proceeded to build and install a GENERIC kernel from this source. After rebooting on the new kernel, I re-ran the `iperf` tests and the performance is still terrible:

[root@nfs dd]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 50946
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 27230
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 53346
recv failed: Connection reset by peer
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-79.38 sec  60.0 Bytes  6.05 bits/sec
recv failed: Connection reset by peer
[  2] 0.00-79.38 sec  60.0 Bytes  6.05 bits/sec
recv failed: Connection reset by peer
[  3] 0.00-79.38 sec  60.0 Bytes  6.05 bits/sec
[SUM] 0.00-122.08 sec   180 Bytes  11.8 bits/sec

Did I do the revert wrong? Or do we need to look at other changes that were made in the em driver?
Comment 21 Franco Fichtner 2023-02-16 15:16:12 UTC
Can you go to your good early kernel state and apply the patch in question?

If it's still good there, you will in any case have to do another bisect, but apply the patch on top every time to find the underlying issue.


Cheers,
Franco
Comment 22 Daniel Duerr 2023-02-16 15:19:03 UTC
(In reply to Franco Fichtner from comment #21)

Sure. To clarify: you are asking me to do another `git bisect` like I did before, but this time applying this latest patch to each subsequent version -- the ones that were bad in the earlier bisect -- in order to see where it then breaks?
Comment 23 mike 2023-02-16 15:24:56 UTC

Just a small datapoint re: i210. I have a Supermicro board (X11SSH-F) with
 pciconf -lvcb igb0
igb0@pci0:2:0:0:        class=0x020000 card=0x153315d9 chip=0x15338086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I210 Gigabit Network Connection'
    class      = network
    subclass   = ethernet

that terminates an iSCSI volume on ZFS across the internet via an IPsec tunnel. Tracking RELENG_12, I have not seen any performance differences. The remote connection is limited to 500Mb, so it maxes out there. However, I am not using lagg or vlans on that interface. This is 12.4-STABLE.
Comment 24 Franco Fichtner 2023-02-16 15:28:15 UTC
Hi Daniel,

Yes, at least once on your original good state, to verify that the commit you found merely unmasks the real cause of the degradation. If that is the case, the initial old commit you started the bisect from would not lose performance with this patch applied.

Or it would, but then this might reach back even further. Yet I'm only speculating about it.


Cheers,
Franco
Comment 25 Daniel Duerr 2023-02-16 15:31:01 UTC
(In reply to mike from comment #23)

Thanks Mike. In line with your points, I was able to work around this issue on all my other servers with 12.4-RELEASE by removing the use of vlan and just running straight igb. So there is something about vlan combined with these if_em driver changes that is causing my issues here.

Unfortunately on this particular server, I am port-limited and require some redundancy, hence the need to have the igb + lagg + vlan combination work. And in case anyone wonders, no I did not try igb + lagg without vlan because that would make my port limitation even worse than just going straight igb like I did on all the other servers (which have 4-6 igb ports each).
Comment 26 Daniel Duerr 2023-02-16 15:32:36 UTC
(In reply to Franco Fichtner from comment #24)

Thanks, I think I understand. Would it be acceptable to just checkout the first "bad" commit from my bisect above and then apply the patch to that and test it?
Comment 27 Franco Fichtner 2023-02-16 15:35:42 UTC
Not the first bad commit: the first commit you started the bisect on, which is the one that has reliably good performance.
Comment 28 Daniel Duerr 2023-02-16 15:37:23 UTC
(In reply to Franco Fichtner from comment #27)

Ok, thanks. So you want me to go back to the first good commit (which already works without any patches) and then you want me to apply the patch to that version to make sure it still works as it already does? Sorry, I'm a little confused at the goal here.
Comment 29 Kevin Bowling freebsd_committer freebsd_triage 2023-02-24 06:04:59 UTC
I think the intention there is to try to figure out what coincidental commit could be at fault, because the offending commit just enables functionality and is not necessarily a smoking gun. I'm not sure if there is a streamlined way to do this kind of bisect; hopefully 'git cherry-pick' and 'git reset' will suffice in the build/test loop.

Right now the likely suspects are flag-management changes in e1000 or some change in lagg(4) that I am not privy to.
Comment 30 Daniel Duerr 2023-03-16 13:39:35 UTC
(In reply to Kevin Bowling from comment #29)

Thanks Kevin. I'm happy to do whatever is needed here; I'm just unclear on exactly what you're asking me to do. I don't understand what change I'm trying to apply to the commit _that already works in the first place_.

Also, is there any chance this issue would _not_ occur in 13.x? Or would you all be more motivated to fix it if it _did_ occur in 13.x? I'm not stuck on 12.x and would prefer to take the easiest path forward and put everyone's efforts where they matter most.
Comment 31 Kevin Bowling freebsd_committer freebsd_triage 2023-03-16 19:39:53 UTC
(In reply to Daniel Duerr from comment #30)
The idea is to bisect in somewhat of a reverse fashion, where you introduce the one-line change during each bisection step:
-       em_if_set_promisc(ctx, if_getflags(ifp));
+       em_if_set_promisc(ctx, IFF_PROMISC);

and see if that yields a different resulting commit hash. Instead of walking backwards to see where things break, which you did and which identified this line of code, we want to see whether this one-line change results in bad performance when applied earlier, before and during other code changes. If I understand correctly, applying this one-line change to 12.2 did not result in poor performance. If you have already done this and the performance is poor on 12.2 with the change, there is no need to bisect further.
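
Concretely, each bisection step would then look something like this (a sketch; /root/promisc.diff is a hypothetical patch file holding the one-line change):

cd /usr/src
git apply /root/promisc.diff
make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG
make installkernel KERNCONF=GENERIC-NODEBUG
reboot

Then run the iperf test, drop the overlay with 'git checkout -- sys/dev/e1000/if_em.c', and mark the commit with 'git bisect good' or 'git bisect bad' as appropriate.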

The drivers are very similar between 12, 13, and 14 so if it is a driver regression it probably affects all of these.  If it is some cross cut with the network stack there could be something different but I am not aware of major differences in this area.
Comment 32 Daniel Duerr 2023-03-17 14:58:52 UTC
(In reply to Kevin Bowling from comment #31)

Thanks Kevin, this helps. What I will do is rerun the original `git bisect` I had done above, but this time I will apply that patch to each commit and see if I can get a different outcome on the subsequent commits that had been "bad" in the original bisect.  Make sense?
Comment 33 Kevin Bowling freebsd_committer freebsd_triage 2023-03-17 17:24:39 UTC
(In reply to Daniel Duerr from comment #32)
You got it
Comment 34 Daniel Duerr 2023-03-17 18:05:39 UTC
(In reply to Kevin Bowling from comment #33)

Hi Kevin,

Here's my first progress report:

### Round 1: Restart the `git bisect` and confirm the commit 68cfeeb1d3c4 works

[root@nfs ~]# cd /usr/src
[root@nfs ~]# git checkout releng/12.4
[root@nfs ~]# git bisect start release/12.4.0 release/12.2.0 -- sys/dev/e1000
Bisecting: a merge base must be tested
[68cfeeb1d3c428e3c3881f45bc3a20a252b37d0e] MFC r365284:
[root@nfs ~]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG
[root@nfs ~]# reboot
[root@nfs ~]# uname -a
FreeBSD nfs.tidepool.cloud 12.2-PRERELEASE FreeBSD 12.2-PRERELEASE 68cfeeb1d3c4(HEAD) GENERIC-NODEBUG  amd64
[root@nfs ~]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 10616
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.00 sec  1.15 GBytes   985 Mbits/sec
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 20446
[ ID] Interval       Transfer     Bandwidth
[  2] 0.00-10.00 sec  1.15 GBytes   988 Mbits/sec
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 10068
[ ID] Interval       Transfer     Bandwidth
[  3] 0.00-10.00 sec  1.15 GBytes   985 Mbits/sec

### Round 2: Apply the change to sys/dev/e1000/if_em.c in commit 68cfeeb1d3c4 and see if it still works

[root@nfs src]# vi sys/dev/e1000/if_em.c
[root@nfs src]# git diff
diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
index 558a75ac015e..42faacfc3eea 100644
--- a/sys/dev/e1000/if_em.c
+++ b/sys/dev/e1000/if_em.c
@@ -1338,7 +1338,7 @@ em_if_init(if_ctx_t ctx)
        }
 
        /* Don't lose promiscuous settings */
-       em_if_set_promisc(ctx, IFF_PROMISC);
+       em_if_set_promisc(ctx, if_getflags(ifp));
        e1000_clear_hw_cntrs_base_generic(&adapter->hw);
 
        /* MSI-X configuration for 82574 */

[root@nfs ~]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG
[root@nfs ~]# reboot
[root@nfs ~]# uname -a
FreeBSD nfs.tidepool.cloud 12.2-PRERELEASE FreeBSD 12.2-PRERELEASE #15 68cfeeb1d3c4(HEAD)-dirty: Fri Mar 17 10:50:14 PDT 2023     toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  amd64
[root@nfs ~]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 47216
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 37030
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 41145
^CWaiting for server threads to complete. Interrupt again to force quit.
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-76.70 sec  60.0 Bytes  6.26 bits/sec
[  2] 0.00-56.38 sec  60.0 Bytes  8.51 bits/sec
[  3] 0.00-36.23 sec  60.0 Bytes  13.2 bits/sec
[SUM] 0.00-78.70 sec   180 Bytes  18.3 bits/sec

In summary, the first good commit in the bisect works and when I apply the change, it stops working.

As a next step, I was thinking I'd (a) revert the manual change back to the "good" state, (b) do a `git bisect good` to proceed to the next (first "bad") commit, and (c) manually reverse the change the other way to see if the bad commit then works. Does that make sense?
Comment 35 Kevin Bowling freebsd_committer freebsd_triage 2023-03-17 20:49:05 UTC
(In reply to Daniel Duerr from comment #34)
If, after applying the diff it is bad, mark it bad, then undo the change and proceed to the next commit.  Only mark a commit as good if it works as expected _with_ the change over top.
Comment 36 Daniel Duerr 2023-03-20 17:22:22 UTC
(In reply to Kevin Bowling from comment #35)

Hi Kevin,

Okay, I finished another `git bisect` for you as we discussed, but this time I manually reverted the `if_getflags(ifp)` change each time to see how that would affect the results. Also, on the first commit (the one that works as-is), I manually applied the `if_getflags(ifp)` change and confirmed it broke it. I then reverted that, did a `git bisect good` on the first commit, and proceeded with the rest of the process. It definitely produces a different result. Here's the log:

### Step 1: Restart the `git bisect` and confirm the first commit still works

[root@nfs src]# git checkout releng/12.4
[root@nfs src]# git bisect start release/12.4.0 release/12.2.0 -- sys/dev/e1000
Bisecting: a merge base must be tested
[68cfeeb1d3c428e3c3881f45bc3a20a252b37d0e] MFC r365284:
[root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG
[root@nfs src]# reboot
[root@nfs src]# uname -a
FreeBSD nfs.tidepool.cloud 12.2-PRERELEASE FreeBSD 12.2-PRERELEASE 68cfeeb1d3c4(HEAD) GENERIC-NODEBUG  amd64
[root@nfs src]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 10616
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.00 sec  1.15 GBytes   985 Mbits/sec
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 20446
[ ID] Interval       Transfer     Bandwidth
[  2] 0.00-10.00 sec  1.15 GBytes   988 Mbits/sec
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 10068
[ ID] Interval       Transfer     Bandwidth
[  3] 0.00-10.00 sec  1.15 GBytes   985 Mbits/sec

### Step 2: Recreate the if_getflags(ifp) change to sys/dev/e1000/if_em.c on the first commit and see if it breaks it

[root@nfs src]# vi sys/dev/e1000/if_em.c
[root@nfs src]# git diff
diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
index 558a75ac015e..42faacfc3eea 100644
--- a/sys/dev/e1000/if_em.c
+++ b/sys/dev/e1000/if_em.c
@@ -1338,7 +1338,7 @@ em_if_init(if_ctx_t ctx)
        }
 
        /* Don't lose promiscuous settings */
-       em_if_set_promisc(ctx, IFF_PROMISC);
+       em_if_set_promisc(ctx, if_getflags(ifp));
        e1000_clear_hw_cntrs_base_generic(&adapter->hw);
 
        /* MSI-X configuration for 82574 */
[root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG
[root@nfs src]# reboot
[root@nfs src]# uname -a
FreeBSD nfs.tidepool.cloud 12.2-PRERELEASE FreeBSD 12.2-PRERELEASE #15 68cfeeb1d3c4(HEAD)-dirty: Fri Mar 17 10:50:14 PDT 2023     toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  amd64
[root@nfs src]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 47216
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 37030
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 41145
^CWaiting for server threads to complete. Interrupt again to force quit.
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-76.70 sec  60.0 Bytes  6.26 bits/sec
[  2] 0.00-56.38 sec  60.0 Bytes  8.51 bits/sec
[  3] 0.00-36.23 sec  60.0 Bytes  13.2 bits/sec
[SUM] 0.00-78.70 sec   180 Bytes  18.3 bits/sec
[root@nfs src]# git checkout -- sys/dev/e1000/if_em.c
[root@nfs src]# git diff
[root@nfs src]# git bisect good

### Step 3: Advance to next commit (originally first bad), manually reverse the if_getflags(ifp) change to sys/dev/e1000/if_em.c and see if it now works

[root@nfs src]# vi sys/dev/e1000/if_em.c
[root@nfs src]# git diff
diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
index ce13d57da60b..938c30a03f49 100644
--- a/sys/dev/e1000/if_em.c
+++ b/sys/dev/e1000/if_em.c
@@ -1360,7 +1360,7 @@ em_if_init(if_ctx_t ctx)
        }
 
        /* Don't lose promiscuous settings */
-       em_if_set_promisc(ctx, if_getflags(ifp));
+       em_if_set_promisc(ctx, IFF_PROMISC);
        e1000_clear_hw_cntrs_base_generic(&adapter->hw);
 
        /* MSI-X configuration for 82574 */
[root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG
[root@nfs src]# reboot
[root@nfs src]# uname -a
FreeBSD nfs.tidepool.cloud 12.2-STABLE FreeBSD 12.2-STABLE #18 n233898-355177efed6c-dirty: Fri Mar 17 14:27:50 PDT 2023     toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  amd64
[root@nfs src]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 17746
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 17750
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 17751
^CWaiting for server threads to complete. Interrupt again to force quit.
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-76.54 sec  60.0 Bytes  6.27 bits/sec
[  2] 0.00-56.22 sec  60.0 Bytes  8.54 bits/sec
[  3] 0.00-36.32 sec  60.0 Bytes  13.2 bits/sec
[SUM] 0.00-79.15 sec   180 Bytes  18.2 bits/sec
[root@nfs src]# git checkout -- sys/dev/e1000/if_em.c
[root@nfs src]# git bisect bad
Bisecting: 16 revisions left to test after this (roughly 4 steps)
[ded3123049a592ec1f9c5b757e3f0f98f104d6cf] e1000: fix build after 92804cf3dc48 (orig c1655b0f)

### Step 4: Advance to next commit, manually reverse the if_getflags(ifp) change to sys/dev/e1000/if_em.c and see if it now works

[root@nfs src]# vi sys/dev/e1000/if_em.c
[root@nfs src]# git diff
diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
index bcf7e0e9ec56..a9c00e58d880 100644
--- a/sys/dev/e1000/if_em.c
+++ b/sys/dev/e1000/if_em.c
@@ -1362,7 +1362,7 @@ em_if_init(if_ctx_t ctx)
        }
 
        /* Don't lose promiscuous settings */
-       em_if_set_promisc(ctx, if_getflags(ifp));
+       em_if_set_promisc(ctx, IFF_PROMISC);
        e1000_clear_hw_cntrs_base_generic(&adapter->hw);
 
        /* MSI-X configuration for 82574 */
[root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG
[root@nfs src]# reboot
[root@nfs src]# uname -a
FreeBSD nfs.tidepool.cloud 12.2-STABLE FreeBSD 12.2-STABLE #19 n233610-ded3123049a5-dirty: Fri Mar 17 19:17:26 PDT 2023     toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  amd64
[root@nfs src]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 39123
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 51144
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 22030
recv failed: Connection reset by peer
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-79.38 sec  60.0 Bytes  6.05 bits/sec
recv failed: Connection reset by peer
[  2] 0.00-79.38 sec  60.0 Bytes  6.05 bits/sec
recv failed: Connection reset by peer
[  3] 0.00-79.38 sec  60.0 Bytes  6.05 bits/sec
[SUM] 0.00-122.26 sec   180 Bytes  11.8 bits/sec
[root@nfs src]# git checkout -- sys/dev/e1000/if_em.c
[root@nfs src]# git bisect bad
Bisecting: 7 revisions left to test after this (roughly 3 steps)
[60b1634944ed4c19c1db5d1c5f9ed9c83ed6585b] e1000: Improve device name strings

### Step 5: Advance to next commit, manually reverse the if_getflags(ifp) change to sys/dev/e1000/if_em.c and see if it now works

[root@nfs src]# vi sys/dev/e1000/if_em.c
[root@nfs src]# git diff
diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
index f284de275066..919f687e5992 100644
--- a/sys/dev/e1000/if_em.c
+++ b/sys/dev/e1000/if_em.c
@@ -1363,7 +1363,7 @@ em_if_init(if_ctx_t ctx)
        }
 
        /* Don't lose promiscuous settings */
-       em_if_set_promisc(ctx, if_getflags(ifp));
+       em_if_set_promisc(ctx, IFF_PROMISC);
        e1000_clear_hw_cntrs_base_generic(&adapter->hw);
 
        /* MSI-X configuration for 82574 */
[root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG
[root@nfs src]# reboot
[root@nfs src]# uname -a
FreeBSD nfs.tidepool.cloud 12.2-STABLE FreeBSD 12.2-STABLE #20 n233210-60b1634944ed-dirty: Fri Mar 17 20:02:39 PDT 2023     toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  amd64
[root@nfs src]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 13641
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 44007
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 65487
^CWaiting for server threads to complete. Interrupt again to force quit.
[ ID] Interval       Transfer     Bandwidth
[  2] 0.00-48.64 sec  60.0 Bytes  9.87 bits/sec
[  3] 0.00-28.22 sec  60.0 Bytes  17.0 bits/sec
[  1] 0.00-72.92 sec  60.0 Bytes  6.58 bits/sec
[SUM] 0.00-72.92 sec   180 Bytes  19.7 bits/sec
[root@nfs src]# git checkout -- sys/dev/e1000/if_em.c
[root@nfs src]# git bisect bad
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[c9c1838988faa8bcb74af30384ab45a483562727] e1000: Add support for [Tiger, Alder, Meteor] Lake

### Step 6: Advance to next commit, manually reverse the if_getflags(ifp) change to sys/dev/e1000/if_em.c and see if it now works

[root@nfs src]# vi sys/dev/e1000/if_em.c
[root@nfs src]# git diff
diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
index 79a9d8fdcfe9..839454c20fd7 100644
--- a/sys/dev/e1000/if_em.c
+++ b/sys/dev/e1000/if_em.c
@@ -1363,7 +1363,7 @@ em_if_init(if_ctx_t ctx)
        }
 
        /* Don't lose promiscuous settings */
-       em_if_set_promisc(ctx, if_getflags(ifp));
+       em_if_set_promisc(ctx, IFF_PROMISC);
        e1000_clear_hw_cntrs_base_generic(&adapter->hw);
 
        /* MSI-X configuration for 82574 */
[root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG
[root@nfs src]# reboot
[root@nfs src]# uname -a
FreeBSD nfs.tidepool.cloud 12.2-STABLE FreeBSD 12.2-STABLE #21 n233050-c9c1838988fa-dirty: Sat Mar 18 07:55:11 PDT 2023     toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  amd64
[root@nfs src]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 23361
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.00 sec  1.02 GBytes   877 Mbits/sec
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 23362
[ ID] Interval       Transfer     Bandwidth
[  2] 0.00-10.00 sec  1.07 GBytes   917 Mbits/sec
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 23363
[ ID] Interval       Transfer     Bandwidth
[  3] 0.00-10.00 sec  1.15 GBytes   986 Mbits/sec
[root@nfs src]# git checkout -- sys/dev/e1000/if_em.c
[root@nfs src]# git bisect good
Bisecting: 1 revision left to test after this (roughly 1 step)
[94c02a765cb7f68c80844acb5898be90dc4069c5] e1000: disable hw.em.sbp debug setting

### Step 7: Advance to next commit, manually reverse the if_getflags(ifp) change to sys/dev/e1000/if_em.c and see if it now works

[root@nfs src]# vi sys/dev/e1000/if_em.c
[root@nfs src]# git diff
diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
index 2c13f7750af2..0ff2bd00d6b0 100644
--- a/sys/dev/e1000/if_em.c
+++ b/sys/dev/e1000/if_em.c
@@ -1363,7 +1363,7 @@ em_if_init(if_ctx_t ctx)
        }
 
        /* Don't lose promiscuous settings */
-       em_if_set_promisc(ctx, if_getflags(ifp));
+       em_if_set_promisc(ctx, IFF_PROMISC);
        e1000_clear_hw_cntrs_base_generic(&adapter->hw);
 
        /* MSI-X configuration for 82574 */
[root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG
[root@nfs src]# reboot
[root@nfs src]# uname -a
FreeBSD nfs.tidepool.cloud 12.2-STABLE FreeBSD 12.2-STABLE #21 n233050-c9c1838988fa-dirty: Sat Mar 18 07:55:11 PDT 2023     toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  amd64
[root@nfs src]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 41958
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.00 sec  1.14 GBytes   982 Mbits/sec
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 41959
[ ID] Interval       Transfer     Bandwidth
[  2] 0.00-10.00 sec  1.15 GBytes   987 Mbits/sec
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 41960
[ ID] Interval       Transfer     Bandwidth
[  3] 0.00-10.00 sec  1.15 GBytes   983 Mbits/sec
[root@nfs src]# git checkout -- sys/dev/e1000/if_em.c
[root@nfs src]# git bisect good
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[1a132077c2cb500410079f9120c3f676d15f7931] e1000: fix em_mac_min and 82547 packet buffer

### Step 8: Advance to next commit, manually reverse the if_getflags(ifp) change to sys/dev/e1000/if_em.c and see if it now works

[root@nfs src]# vi sys/dev/e1000/if_em.c
[root@nfs src]# git diff
diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
index ce60b1f5d437..e8f215dfa089 100644
--- a/sys/dev/e1000/if_em.c
+++ b/sys/dev/e1000/if_em.c
@@ -1363,7 +1363,7 @@ em_if_init(if_ctx_t ctx)
        }
 
        /* Don't lose promiscuous settings */
-       em_if_set_promisc(ctx, if_getflags(ifp));
+       em_if_set_promisc(ctx, IFF_PROMISC);
        e1000_clear_hw_cntrs_base_generic(&adapter->hw);
 
        /* MSI-X configuration for 82574 */
[root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG
[root@nfs src]# reboot
[root@nfs src]# uname -a
FreeBSD nfs.tidepool.cloud 12.2-STABLE FreeBSD 12.2-STABLE #23 n233156-1a132077c2cb-dirty: Mon Mar 20 09:41:24 PDT 2023     toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  amd64
[root@nfs src]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 20023
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 20024
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 15792
^CWaiting for server threads to complete. Interrupt again to force quit.
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-72.73 sec  60.0 Bytes  6.60 bits/sec
[  2] 0.00-52.37 sec  60.0 Bytes  9.17 bits/sec
[  3] 0.00-32.30 sec  60.0 Bytes  14.9 bits/sec
[SUM] 0.00-74.75 sec   180 Bytes  19.3 bits/sec
[root@nfs src]# git checkout -- sys/dev/e1000/if_em.c
[root@nfs src]# git bisect bad
1a132077c2cb500410079f9120c3f676d15f7931 is the first bad commit
commit 1a132077c2cb500410079f9120c3f676d15f7931
Author: Kevin Bowling <kbowling@FreeBSD.org>
Date:   Thu Apr 15 09:58:36 2021 -0700

    e1000: fix em_mac_min and 82547 packet buffer
    
    The boundary differentiating "lem" vs "em" class devices was wrong
    after the iflib conversion of lem(4).
    
    The Packet Buffer size for 82547 class chips was not set correctly
    after the iflib conversion of lem(4).
    
    These changes restore functionality on an 82547 for the submitter.
    
    PR:             236119
    Reported by:    Jeff Gibbons <jgibbons@protogate.com>
    Reviewed by:    markj
    MFC after:      1 month
    Differential Revision:  https://reviews.freebsd.org/D29766
    
    (cherry picked from commit bb1b375fa7487ee5c3843121a0621ac8379c18e6)

 sys/dev/e1000/if_em.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)
Comment 37 Kevin Bowling freebsd_committer freebsd_triage 2023-04-10 15:59:47 UTC
(In reply to Daniel Duerr from comment #36)
Thanks for doing this. This commit doesn't make sense to me either; it shouldn't have any effect on the I210. Can you do a sanity check and tell me whether stable/12 of recent vintage, with 'git revert 1a132077c2cb500410079f9120c3f676d15f7931' applied, performs correctly?
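
In other words, presumably something like (a sketch of the requested sanity check):

cd /usr/src
git checkout stable/12
git pull
git revert 1a132077c2cb500410079f9120c3f676d15f7931

then build, install, and run the iperf test as before.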
Comment 38 Daniel Duerr 2023-04-12 13:45:09 UTC
(In reply to Kevin Bowling from comment #37)

Ok. As you requested, I've recreated my local branch from a fresh checkout of origin/releng/12.4 with no local changes, and performed a `git revert 1a132077c2cb500410079f9120c3f676d15f7931` to back out that last bad commit. The commit reverted cleanly without conflict, but now I am unable to build the kernel as I get an error:

/usr/src/sys/dev/e1000/if_em.c:2544:7: error: use of undeclared identifier 'adapter'
                if (adapter->hw.mac.max_frame_size > 8192)
                    ^
1 error generated.
*** [if_em.o] Error code 1

make[2]: stopped in /usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG
--- em_txrx.o ---
ctfconvert -L VERSION -g em_txrx.o
--- if_de.o ---
ctfconvert -L VERSION -g if_de.o
--- modules-all ---
ctfconvert -L VERSION -g ah_regdomain.o
*** [modules-all] Error code 2

make[2]: stopped in /usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG
2 errors

make[2]: stopped in /usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG

Not sure where to go next.
Comment 39 Kevin Bowling freebsd_committer freebsd_triage 2023-04-12 16:45:25 UTC
(In reply to Daniel Duerr from comment #38)
Change the reference 'adapter' to 'sc' on the line indicated; it's just a simple rename.
Comment 40 Daniel Duerr 2023-04-13 15:30:08 UTC
(In reply to Kevin Bowling from comment #39)
Thanks Kevin. I've run two passes on my local branch of origin/releng/12.4 with a `git revert 1a132077c2cb500410079f9120c3f676d15f7931` to back out that last bad commit. In summary, the performance is still dismal in both of the following cases.

My gut says the problem is in the `vlan` driver, not the `em` (igb) driver. On other machines that had this same exact problem, I was not using lagg but I was using vlan. Getting rid of vlan worked around this problem.

Anyways, here's the log for you:

## Round 1: fix adapter -> sc

[root@nfs src]# git diff
diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
index e3da4a2f3d20..3d80d7dc1a3b 100644
--- a/sys/dev/e1000/if_em.c
+++ b/sys/dev/e1000/if_em.c
@@ -2541,7 +2541,7 @@ em_reset(if_ctx_t ctx)
                pba = E1000_PBA_34K;
                break;
        default:
-               if (adapter->hw.mac.max_frame_size > 8192)
+               if (sc->hw.mac.max_frame_size > 8192)
                        pba = E1000_PBA_40K; /* 40K for Rx, 24K for Tx */
                else
                        pba = E1000_PBA_48K; /* 48K for Rx, 16K for Tx */
[root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG
[root@nfs src]# reboot

[root@nfs src]# uname -a
FreeBSD nfs.tidepool.cloud 12.4-RELEASE-p1 FreeBSD 12.4-RELEASE-p1 #25 releng/12.4-n235814-4f54a7f1b95c-dirty: Thu Apr 13 06:23:53 PDT 2023     toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  amd64
[root@nfs src]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 17974
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 49417
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 64045
^CWaiting for server threads to complete. Interrupt again to force quit.
[ ID] Interval       Transfer     Bandwidth
[  2] 0.00-28.43 sec  60.0 Bytes  16.9 bits/sec
[  3] 0.00-8.09 sec  60.0 Bytes  59.4 bits/sec
[  1] 0.00-52.52 sec  60.0 Bytes  9.14 bits/sec
[SUM] 0.00-52.52 sec   180 Bytes  27.4 bits/sec

## Round 2: fix adapter -> sc, restore em_if_set_promisc(ctx, IFF_PROMISC)

[root@nfs src]# git diff
diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
index e3da4a2f3d20..4a45a0f84ce8 100644
--- a/sys/dev/e1000/if_em.c
+++ b/sys/dev/e1000/if_em.c
@@ -1361,7 +1361,7 @@ em_if_init(if_ctx_t ctx)
        em_setup_vlan_hw_support(ctx);
 
        /* Don't lose promiscuous settings */
-       em_if_set_promisc(ctx, if_getflags(ifp));
+       em_if_set_promisc(ctx, IFF_PROMISC);
        e1000_clear_hw_cntrs_base_generic(&sc->hw);
 
        /* MSI-X configuration for 82574 */
@@ -2541,7 +2541,7 @@ em_reset(if_ctx_t ctx)
                pba = E1000_PBA_34K;
                break;
        default:
-               if (adapter->hw.mac.max_frame_size > 8192)
+               if (sc->hw.mac.max_frame_size > 8192)
                        pba = E1000_PBA_40K; /* 40K for Rx, 24K for Tx */
                else
                        pba = E1000_PBA_48K; /* 48K for Rx, 16K for Tx */
[root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG
[root@nfs src]# reboot

[root@nfs src]# uname -a
FreeBSD nfs.tidepool.cloud 12.4-RELEASE-p1 FreeBSD 12.4-RELEASE-p1 #26 releng/12.4-n235814-4f54a7f1b95c-dirty: Thu Apr 13 06:52:12 PDT 2023     toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  amd64
[root@nfs src]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 34226
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 42851
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 23908
recv failed: Connection reset by peer
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-79.38 sec  60.0 Bytes  6.05 bits/sec
recv failed: Connection reset by peer
[  2] 0.00-79.37 sec  60.0 Bytes  6.05 bits/sec
recv failed: Connection reset by peer
[  3] 0.00-79.37 sec  60.0 Bytes  6.05 bits/sec
[SUM] 0.00-122.44 sec   180 Bytes  11.8 bits/sec
Comment 41 Zhenlei Huang freebsd_committer freebsd_triage 2023-04-14 01:55:36 UTC
Hi Daniel,

Can you please try taking down one of the lagg members, either igb0 or igb1, and run the iperf test again?

```
# ifconfig igb0 down
```
Comment 42 Daniel Duerr 2023-04-14 12:35:44 UTC
(In reply to Zhenlei Huang from comment #41)

Hi Zhenlei,

Sure. Here's the result of that test, using the same kernel I had built for Kevin in my previous comment yesterday. The performance is still dismal.

[root@nfs src]# uname -a
FreeBSD nfs.tidepool.cloud 12.4-RELEASE-p1 FreeBSD 12.4-RELEASE-p1 #26 releng/12.4-n235814-4f54a7f1b95c-dirty: Thu Apr 13 06:52:12 PDT 2023     toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  amd64
[root@nfs src]# ifconfig lagg0 
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=800028<VLAN_MTU,JUMBO_MTU>
	ether 00:25:90:d6:e6:72
	laggproto lacp lagghash l2,l3,l4
	laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
	laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
	groups: lagg
	media: Ethernet autoselect
	status: active
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
[root@nfs src]# ifconfig igb0 down
[root@nfs src]# ifconfig lagg0
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=800028<VLAN_MTU,JUMBO_MTU>
	ether 00:25:90:d6:e6:72
	laggproto lacp lagghash l2,l3,l4
	laggport: igb0 flags=0<>
	laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
	groups: lagg
	media: Ethernet autoselect
	status: active
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
[root@nfs src]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 26356
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 26357
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 30833
recv failed: Connection reset by peer
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-79.38 sec  60.0 Bytes  6.05 bits/sec
recv failed: Connection reset by peer
[  2] 0.00-79.38 sec  60.0 Bytes  6.05 bits/sec
recv failed: Connection reset by peer
[  3] 0.00-79.38 sec  60.0 Bytes  6.05 bits/sec
[SUM] 0.00-122.03 sec   180 Bytes  11.8 bits/sec
Comment 43 Santiago Martinez 2023-04-14 14:55:42 UTC
Based on the first comment:

- the iperf test towards 172.27.6.135 is not working, and that interface has its MTU set to 9000.
- the iperf test towards 172.27.6.129 is working fine with the MTU set to 1500.

Can you please try setting lagg0.8 to an MTU of 1500 and retry?
I'm just wondering whether you have an MTU mismatch between the two hosts.
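If it helps, a quick way to probe for an MTU mismatch is a don't-fragment ping between the two hosts (a diagnostic sketch; the addresses are the ones from this thread, and the sizes assume a 20-byte IP header plus an 8-byte ICMP header):

```
# from 172.27.6.129: largest ICMP payload that fits in a 9000-byte MTU
# 8972 = 9000 - 20 (IP header) - 8 (ICMP header)
ping -D -s 8972 172.27.6.135

# a 1500-MTU-sized probe for comparison
ping -D -s 1472 172.27.6.135
```

If the large probe fails while the small one passes, something on the path is not actually carrying 9000-byte frames.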
Comment 44 Marek Zarychta 2023-04-14 21:34:50 UTC
The bug was introduced somewhere in 13.0-CURRENT (see bug 260260).
Since early stable/13 I have been working around this by setting the MTU to 9004 on the affected igb(4)/em(4) interfaces; the 9004 is inherited by lagg(4), while the vlan(4) children are assigned MTU 9000.
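For reference, a minimal rc.conf sketch of that workaround (interface names match this report; the address is a placeholder, so this is not a drop-in config):

```
ifconfig_igb0="mtu 9004 up"    # phy at 9004 leaves room for the 4-byte 802.1Q tag
ifconfig_igb1="mtu 9004 up"
ifconfig_lagg0="up laggproto lacp laggport igb0 laggport igb1"   # inherits 9004
ifconfig_lagg0_8="inet <addr>/<prefix> mtu 9000"                 # vlan child keeps 9000
```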
Comment 45 Santiago Martinez 2023-04-14 22:24:56 UTC
Yep, that makes sense.
Comment 46 Daniel Duerr 2023-04-14 22:30:39 UTC
(In reply to Santiago Martinez from comment #43)

Hi Santiago,

Both 172.27.6.129 and 172.27.6.135 are on the same MTU 9000 storage subnet; I was careful to ensure this for my testing. All of my iperf tests, both failures and successes, have been between these two IPs on a properly configured MTU 9000 VLAN whose hosts all use MTU 9000.

Also, FWIW, this storage subnet was configured for MTU 9000 long before this issue started occurring. It is a mature, known-good config.

Best,
Daniel
Comment 47 Daniel Duerr 2023-04-14 22:32:16 UTC
(In reply to Marek Zarychta from comment #44)

Hi Marek, are you suggesting I try your MTU 9004 workaround here? Please note my previous comment to Santiago about how this is _not_ a mixed-MTU subnet.
Comment 48 Marek Zarychta 2023-04-14 22:36:41 UTC
(In reply to Daniel Duerr from comment #47)
Please feel free to apply the workaround. It's still a bug, but it can't be hit with the common MTU 1500 setting.
Comment 49 Santiago Martinez 2023-04-15 18:59:33 UTC
Hi Daniel, 

From the config you posted it seems to be different, but maybe I misunderstood the first message. Anyhow, the behavior Marek is describing is not limited to those cards: you can either increase the MTU on the physical interface or reduce the MTU on the logical one.

I can see the same issue with ixl(4), bnxt(4), and so on.

Here is output from my lab machines:
ixl2: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 9000

ixl2.1011: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 9000
BR_DCN_PF: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000

Connecting to host 192.168.11.42, port 5201
[  5] local 192.168.11.41 port 42631 connected to 192.168.11.42 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.01   sec  69.9 KBytes   568 Kbits/sec    2   8.73 KBytes       
[  5]   1.01-2.04   sec  0.00 Bytes  0.00 bits/sec    2   8.73 KBytes       
[  5]   2.04-3.04   sec  0.00 Bytes  0.00 bits/sec    1   8.73 KBytes       
[  5]   3.04-4.00   sec  0.00 Bytes  0.00 bits/sec    0   8.73 KBytes       
[  5]   4.00-5.05   sec  0.00 Bytes  0.00 bits/sec    0   8.73 KBytes       
^C[  5]   5.05-5.82   sec  0.00 Bytes  0.00 bits/sec    1   8.73 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -

ifconfig ixl2.1011 mtu 8974
ifconfig BR_DCN_PF  mtu 8974
iperf3 -c 192.168.11.42

Connecting to host 192.168.11.42, port 5201
[  5] local 192.168.11.41 port 48630 connected to 192.168.11.42 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   120 MBytes  1.00 Gbits/sec    0   1.41 MBytes       
[  5]   1.00-2.00   sec   118 MBytes   990 Mbits/sec    0   1.75 MBytes
Comment 50 Santiago Martinez 2023-04-15 19:07:28 UTC
Created attachment 241517 [details]
iperf mtu 9000, pcap

In the attached image you can see the issue: if the logical interface is set to an MTU of 9000, the resulting frame size on the phy is 9014, which explains why it fails.
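(For what it's worth, the arithmetic lines up if the capture shows untagged frames: 9014 bytes = 9000 bytes of IP payload + the 14-byte Ethernet header, and the 4-byte 802.1Q tag would push a tagged frame to 9018 bytes on the wire, consistent with the parent needing MTU 9004 as in comment 44.)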
Comment 51 Santiago Martinez 2023-04-17 20:10:50 UTC
Hi Daniel, just wondering whether the workaround (the MTU change) solved the issue?
Take care.
Comment 52 Daniel Duerr 2023-04-17 22:03:27 UTC
(In reply to Santiago Martinez from comment #51)
Hi Santiago,

Thanks for the follow-up, apologies for the delayed response.

I've recreated the original problem on a clean 12.4-RELEASE-p1 kernel build from source:

[root@nfs ~]# cd /usr/src
[root@nfs src]# uname -a
FreeBSD nfs.tidepool.cloud 12.4-RELEASE-p1 FreeBSD 12.4-RELEASE-p1 releng/12.4-n235813-52442e904dfc GENERIC-NODEBUG  amd64
[root@nfs src]# ifconfig igb0 | grep mtu
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
[root@nfs src]# ifconfig igb1 | grep mtu
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
[root@nfs src]# ifconfig lagg0 | grep mtu
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
[root@nfs src]# ifconfig lagg0.8 | grep mtu
lagg0.8: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
[root@nfs src]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 26020
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 29025
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 33549
recv failed: Connection reset by peer
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-79.37 sec  60.0 Bytes  6.05 bits/sec
recv failed: Connection reset by peer
[  2] 0.00-79.38 sec  60.0 Bytes  6.05 bits/sec
recv failed: Connection reset by peer
[  3] 0.00-79.38 sec  60.0 Bytes  6.05 bits/sec
[SUM] 0.00-122.10 sec   180 Bytes  11.8 bits/sec

You can see the performance is dismal, as expected. Next, I tried your workaround by reducing the MTU on the logical interface, as you suggested:

[root@nfs src]# ifconfig lagg0.8 mtu 8974
[root@nfs src]# ifconfig igb0 | grep mtu
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
[root@nfs src]# ifconfig igb1 | grep mtu
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
[root@nfs src]# ifconfig lagg0 | grep mtu
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
[root@nfs src]# ifconfig lagg0.8 | grep mtu
lagg0.8: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 8974
[root@nfs src]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 56026
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.00 sec  1.14 GBytes   981 Mbits/sec
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 61809
[ ID] Interval       Transfer     Bandwidth
[  2] 0.00-10.00 sec   977 MBytes   819 Mbits/sec
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 28069
[ ID] Interval       Transfer     Bandwidth
[  3] 0.00-10.00 sec  1.12 GBytes   963 Mbits/sec

Your workaround also works for me, and the speeds are back to normal (great) again. It sounds like you expected this based on your knowledge of the other MTU bug. Should I chime in on that bug and provide feedback there? And should I make this MTU reduction on the logical interface permanent in my rc.conf for the time being?
Comment 53 Santiago Martinez 2023-05-05 11:33:12 UTC
Hi Daniel, sorry for my late reply; I went on holiday but am back online now.

Thanks for confirming. I think at the moment the best thing is to make the change permanent in rc.conf (or another rc file).
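Something like this, a sketch of the child-MTU variant you tested (the address is a placeholder; adjust to your config):

```
# igb0/igb1/lagg0 stay at mtu 9000; the vlan child is pinned at the 8974 that tested clean
ifconfig_lagg0_8="inet <addr>/<prefix> mtu 8974"
```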

Regarding the "bug", I think we need input from the net@ people. Usually a NOS performs this calculation automatically and adjusts the interfaces (parent or children).

At the very least we should have a way to warn the user; maybe ifconfig could emit a warning.
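In the meantime the check is easy to script by hand (a sketch, assuming the lagg0/lagg0.8 names from this report and that the failure mode is the 4-byte 802.1Q tag overflowing the parent MTU, as the 9004 workaround suggests):

```
#!/bin/sh
# warn when a vlan child's MTU leaves no room for the 802.1Q tag on its parent
parent_mtu=$(ifconfig lagg0 | awk '/mtu/ { print $NF; exit }')
child_mtu=$(ifconfig lagg0.8 | awk '/mtu/ { print $NF; exit }')
if [ "$child_mtu" -gt $((parent_mtu - 4)) ]; then
    echo "warning: lagg0.8 mtu ${child_mtu} exceeds lagg0 mtu ${parent_mtu} minus the 4-byte vlan tag"
fi
```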
Comment 54 Daniel Duerr 2023-05-05 13:41:05 UTC
(In reply to Santiago Martinez from comment #53)
No problem, and thank you for helping me get a workaround in place. I'm happy to have the issue resolved, even if only by workaround.