I have a ZFS file server running ctld to provide an iSCSI target to VMware hosts. It runs on a Supermicro motherboard which has dual Intel i210 ports on it. The igb ports are configured in a LACP lagg with the switch, and vlan is then used to create 4 VLAN interfaces on top of that. The system had been rock-solid stable in production for over a year on 12.2-STABLE. Upgrading it to 12.4-RELEASE 2 weeks ago rendered it unusable as an iSCSI target. I've tried disabling all the HW options to no avail.

The ctld iSCSI target lives on VLAN 8 (172.27.6.135). When I try to connect to the target from VMware's initiator, VMware basically hangs and I start seeing interface errors accumulate on this machine. Once I saw this behavior, I assumed it was a lower-level network issue and not specific to ctld at all. I then removed the VMware iSCSI configuration and focused on diagnosing the lower-level network.

Here's a pair of iperf tests performed on a neighboring FreeBSD 12.4 machine on the same VLAN segment (172.27.6.130) which has an igb interface without lagg+vlan. The first test is to this machine's lagg0.8 interface (172.27.6.135), and the second test is to a third machine (172.27.6.129) which also has an igb interface without lagg+vlan:

$ for ip in 172.27.6.135 172.27.6.129; do iperf -c ${ip}; done
------------------------------------------------------------
Client connecting to 172.27.6.135, TCP port 5001
TCP window size: 32.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.130 port 22350 connected with 172.27.6.135 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-20.54 sec 43.1 KBytes  17.2 Kbits/sec
------------------------------------------------------------
Client connecting to 172.27.6.129, TCP port 5001
TCP window size: 35.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.130 port 11072 connected with 172.27.6.129 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.01 sec 1.15 GBytes  984 Mbits/sec

Here's this machine's config from /etc/rc.conf:

ifconfig_igb0="mtu 9000 media 1000baseTX mediaopt full-duplex -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso -vlanhwtag -vlanhwcsum -vlanhwfilter up"
ifconfig_igb1="mtu 9000 media 1000baseTX mediaopt full-duplex -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso -vlanhwtag -vlanhwcsum -vlanhwfilter up"
cloned_interfaces="lagg0"
ifconfig_lagg0="up laggproto lacp laggport igb0 laggport igb1 lacp_fast_timeout"
vlans_lagg0="6 7 8 10"
ifconfig_lagg0_6="inet 172.27.6.10/26 mtu 1500"
ifconfig_lagg0_7="inet 172.27.6.123/26 mtu 1500"
ifconfig_lagg0_8="inet 172.27.6.135/28 mtu 9000"
ifconfig_lagg0_10="inet 172.27.6.251/26 mtu 1500"

Here's this machine's relevant ifconfig output:

igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=800028<VLAN_MTU,JUMBO_MTU>
        ether 00:25:90:d6:e6:72
        media: Ethernet 1000baseT <full-duplex>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=800028<VLAN_MTU,JUMBO_MTU>
        ether 00:25:90:d6:e6:72
        hwaddr 00:25:90:d6:e6:73
        media: Ethernet 1000baseT <full-duplex>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=800028<VLAN_MTU,JUMBO_MTU>
        ether 00:25:90:d6:e6:72
        laggproto lacp lagghash l2,l3,l4
        laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        groups: lagg
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lagg0.8: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        ether 00:25:90:d6:e6:72
        inet 172.27.6.135 netmask 0xfffffff0 broadcast 172.27.6.143
        groups: vlan
        vlan: 8 vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
FWIW, the upgrade to 12.4-RELEASE negatively affected all of my igb network configurations that had worked previously on 12.2-STABLE. In one case, removing the use of VLANs worked around this underlying issue -- thankfully I had enough individual igb ports to do so. But in this case, I can't do this because I only have 2 ports to work with.
Kevin, any ideas?
Daniel, can you please bisect the commit range? There isn't that much that has happened in sys/dev/e1000/* so it should hopefully be found in a half dozen builds bisecting stable/12.
(In reply to Kevin Bowling from comment #3) I'm happy to help with that, Kevin. But I'm not familiar with how to do it and I do not have a 12.2-STABLE machine available from which to compare. Any guidance you can give me would be much appreciated. Thanks!
(In reply to Daniel Duerr from comment #4)

1) Clone the FreeBSD src tree and check out 12.4:

pkg install git-lite
git clone https://git.freebsd.org/src.git /usr/src
cd /usr/src
git checkout releng/12.4

2) Start the bisect using the 12.4 release branch as a known bad point and 12.2 as known good, and limit the search to the e1000 driver commits:

cd /usr/src
git bisect start releng/12.4 releng/12.2 -- sys/dev/e1000

3) Build and install the kernel:

cd /usr/src
make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG
make installkernel KERNCONF=GENERIC-NODEBUG

4) Reboot the system. You will be in the new kernel (confirm with uname -a; it will show you the build date and git hash).

5) Perform some test like your iperf.

6) If the performance is good:

cd /usr/src
git bisect good

Repeat steps 3 & 4.

7) If the performance is bad:

cd /usr/src
git bisect bad

Repeat steps 3 & 4.

8) Keep repeating steps 3 through 7. After a handful of repetitions, you will run out of commits to test and it will spit out the first bad hash. Post that here.

9) To restore your system to a desired kernel, say 12.4 with security patches:

git checkout releng/12.4

Repeat steps 3 & 4.
Correction for step 9: before running 'git checkout <branch>', perform 'git bisect reset' to take the repository out of bisect mode.
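So with that correction, step 9 would look roughly like this (a sketch, assuming the goal is to return to the patched 12.4 release branch before rebuilding):

cd /usr/src
git bisect reset
git checkout releng/12.4

Then repeat steps 3 & 4 to rebuild and install the kernel.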
(In reply to Kevin Bowling from comment #5)

Thank you for the detailed instructions, Kevin. Makes sense. FWIW, after completing step #1 I got an error on step #2:

[root@nfs src]# git bisect start releng/12.4 releng/12.2 -- sys/dev/e1000
fatal: 'releng/12.2' does not appear to be a valid revision

Suspecting there might be tags instead, I used `git tag -l` to see a list of them. This command works, since the release tags are valid revisions:

[root@nfs src]# git bisect start release/12.4.0 release/12.2.0 -- sys/dev/e1000
Bisecting: a merge base must be tested
[68cfeeb1d3c428e3c3881f45bc3a20a252b37d0e] MFC r365284:

Please let me know if this seems like an OK compromise to your proposed command in step #2.
(In reply to Kevin Bowling from comment #5)

Hi Kevin, sorry to say I'm having a difficult time getting the kernel to compile. I have a pretty decent amount of experience building custom kernels, but cannot for the life of me get past this error:

env NM='nm' NMFLAGS='' sh /usr/src/sys/kern/genassym.sh ia32_genassym.o > ia32_assym.h
cc -target x86_64-unknown-freebsd12.2 --sysroot=/usr/obj/usr/src/amd64.amd64/tmp -B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin -c -x assembler-with-cpp -DLOCORE -O2 -pipe -fno-strict-aliasing -g -nostdinc -I. -I/usr/src/sys -I/usr/src/sys/contrib/ck/include -I/usr/src/sys/contrib/libfdt -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -MD -MF.depend.acpi_wakecode.o -MTacpi_wakecode.o -fdebug-prefix-map=./machine=/usr/src/sys/amd64/include -fdebug-prefix-map=./x86=/usr/src/sys/x86/include -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fwrapv -fstack-protector -gdwarf-2 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wcast-qual -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__ -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas -Wno-error-tautological-compare -Wno-error-empty-body -Wno-error-parentheses-equality -Wno-error-unused-function -Wno-error-pointer-sign -Wno-error-shift-negative-value -Wno-address-of-packed-member -mno-aes -mno-avx -std=iso9899:1999 -Werror /usr/src/sys/amd64/acpica/acpi_wakecode.S
error: unknown -Werror warning specifier: '-Wno-error-tautological-compare' [-Werror,-Wunknown-warning-option]
error: unknown -Werror warning specifier: '-Wno-error-empty-body' [-Werror,-Wunknown-warning-option]
error: unknown -Werror warning specifier: '-Wno-error-parentheses-equality' [-Werror,-Wunknown-warning-option]
error: unknown -Werror warning specifier: '-Wno-error-unused-function' [-Werror,-Wunknown-warning-option]
error: unknown -Werror warning specifier: '-Wno-error-pointer-sign' [-Werror,-Wunknown-warning-option]
error: unknown -Werror warning specifier: '-Wno-error-shift-negative-value' [-Werror,-Wunknown-warning-option]
*** Error code 1

I get this whether I try to build the GENERIC-NODEBUG config (which I created following a guide I found online) or the GENERIC config. I have not had this issue before and am not sure why it is happening now, but I can't seem to find much info on it.
(In reply to Daniel Duerr from comment #8) Try one of the methods from https://groups.google.com/g/bsdmailinglist/c/Wz3lSE20hWU, either using WITHOUT_SYSTEM_COMPILER=yes or cherry-picking the format commits as needed.
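For what it's worth, the first method would presumably amount to something like the following sketch, assuming WITHOUT_SYSTEM_COMPILER can be set in /etc/src.conf and that the kernel-toolchain target is enough to rebuild the bootstrap compiler before buildkernel:

echo 'WITHOUT_SYSTEM_COMPILER=yes' >> /etc/src.conf
cd /usr/src
make kernel-toolchain
make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG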
(In reply to Kevin Bowling from comment #9) Thanks Kevin. I did manage to get myself unblocked but forgot to respond back to you here. I needed to do a `make buildworld` and now `make kernel` works as expected. I'll report back once I finish the bisect.
You can save some time by avoiding buildworld; we are only interested in the e1000 driver changes in the kernel.
(In reply to Kevin Bowling from comment #3)

Ok, I completed the bisect process you outlined above -- thanks again for the great instructions. Here's the output:

6486e9dd8d24b0195facd23d8ca82e17e180cffb is the first bad commit
commit 6486e9dd8d24b0195facd23d8ca82e17e180cffb
Author: Eric Joyner <erj@FreeBSD.org>
Date:   Mon Sep 21 22:52:57 2020 +0000

    MFC r365774 and r365776

    These two commits fix issues in em(4)/igb(4):
    - Fix define and includes with RSS option enabled
    - Properly retain promisc flag in init

    PR:        249191, 248869
    MFC after: 1 day

 sys/dev/e1000/if_em.c | 2 +-
 sys/dev/e1000/if_em.h | 5 +++++
 2 files changed, 6 insertions(+), 1 deletion(-)

On the first commit I tested, performance was normal/awesome:

[root@nfs dd]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 54394
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.00 sec 1.10 GBytes  942 Mbits/sec
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 53705
[ ID] Interval       Transfer     Bandwidth
[  2] 0.00-10.00 sec 1.09 GBytes  940 Mbits/sec
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 64390
[ ID] Interval       Transfer     Bandwidth
[  3] 0.00-10.00 sec 1.12 GBytes  958 Mbits/sec

Every commit after that, starting with the first bad commit listed above, performance was terrible:

[root@nfs dd]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 64164
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 65279
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 42346
recv failed: Connection reset by peer
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-79.38 sec 60.0 Bytes  6.05 bits/sec
^CWaiting for server threads to complete. Interrupt again to force quit.
[  2] 0.00-56.63 sec 60.0 Bytes  8.48 bits/sec
[  3] 0.00-36.26 sec 60.0 Bytes  13.2 bits/sec
[SUM] 0.00-95.55 sec 180 Bytes  15.1 bits/sec
If it's about the missing promisc flag, which was set invisibly by default before this commit, it's easy to test from the bad state:

# ifconfig igb0 promisc
# ifconfig igb1 promisc

And try the iperf again...

Cheers,
Franco
(In reply to Franco Fichtner from comment #13)

Thank you, Franco. I tested what you suggested here but I do not see a change. Here's my `ifconfig` output before (the PROMISC flag is missing as you suggest):

igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=800028<VLAN_MTU,JUMBO_MTU>
        ether 00:25:90:d6:e6:72
        media: Ethernet 1000baseT <full-duplex>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=800028<VLAN_MTU,JUMBO_MTU>
        ether 00:25:90:d6:e6:72
        hwaddr 00:25:90:d6:e6:73
        media: Ethernet 1000baseT <full-duplex>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

And here are the baseline `iperf -s` test numbers to match:

[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-79.38 sec 60.0 Bytes  6.05 bits/sec
[  2] 0.00-76.85 sec 60.0 Bytes  6.25 bits/sec
[  3] 0.00-56.37 sec 60.0 Bytes  8.52 bits/sec
[SUM] 0.00-98.92 sec 180 Bytes  14.6 bits/sec

I then made the change you suggested:

[root@nfs dd]# ifconfig igb0 promisc
[root@nfs dd]# ifconfig igb1 promisc

Which is reflected in the `ifconfig` output:

igb0: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 9000
        options=800028<VLAN_MTU,JUMBO_MTU>
        ether 00:25:90:d6:e6:72
        media: Ethernet 1000baseT <full-duplex>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb1: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 9000
        options=800028<VLAN_MTU,JUMBO_MTU>
        ether 00:25:90:d6:e6:72
        hwaddr 00:25:90:d6:e6:73
        media: Ethernet 1000baseT <full-duplex>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

But the `iperf -s` numbers are still terrible:

[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-68.47 sec 60.0 Bytes  7.01 bits/sec
[  2] 0.00-48.60 sec 60.0 Bytes  9.88 bits/sec
[  3] 0.00-28.25 sec 60.0 Bytes  17.0 bits/sec
[SUM] 0.00-70.48 sec 180 Bytes  20.4 bits/sec

Do I need to adjust the `lagg` interface to replicate your test, given I'm running inside of a `lagg` interface?
It could be required on the lagg additionally, or only there; certainly worth trying those combinations.
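For reference, trying it on the lagg and on the vlan child as well would be something like the following, using the interface names from this setup (a sketch only):

# ifconfig lagg0 promisc
# ifconfig lagg0.8 promisc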
(In reply to Franco Fichtner from comment #15) Thanks. Unfortunately I cannot get any change in behavior with this flag. [root@nfs dd]# ifconfig igb0: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 9000 options=800028<VLAN_MTU,JUMBO_MTU> ether 00:25:90:d6:e6:72 media: Ethernet 1000baseT <full-duplex> status: active nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL> igb1: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 9000 options=800028<VLAN_MTU,JUMBO_MTU> ether 00:25:90:d6:e6:72 hwaddr 00:25:90:d6:e6:73 media: Ethernet 1000baseT <full-duplex> status: active nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL> lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 9000 options=800028<VLAN_MTU,JUMBO_MTU> ether 00:25:90:d6:e6:72 laggproto lacp lagghash l2,l3,l4 laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> groups: lagg media: Ethernet autoselect status: active nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL> lagg0.8: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 9000 ether 00:25:90:d6:e6:72 inet 172.27.6.135 netmask 0xfffffff0 broadcast 172.27.6.143 groups: vlan vlan: 8 vlanpcp: 0 parent interface: lagg0 media: Ethernet autoselect status: active nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL> Also, I don't believe this machine's interfaces would ever have been in PROMISC mode before. I get that there was a change about some implicit defaults, but if I had seen PROMISC and or PPROMISC in the `ifconfig` output before, I would have thought something was wrong. :). Just saying, as even if this workaround had worked, I wouldn't have considered this a normal running state.
Well, before this patch igb was defaulting to promisc even if it didn't report it to ifconfig. It was debuggable using a VM asking for privileged network access during boot in those early 12.x days. There is something else at play here and the patch likely only surfaced this. Cheers, Franco
(In reply to Franco Fichtner from comment #17)

Thanks Franco. I agree that whatever I am experiencing here seems a little different than what you were talking about. I've tried more combinations of `ifconfig promisc` and none seem to change the behavior I'm seeing. Yet the kernel build with the earliest 12.2 commit works as expected.
As a sanity check, can you revert 6486e9dd8d24b0195facd23d8ca82e17e180cffb or manually change the em_if_set_promisc() call back as it was on the top of 12.4?
(In reply to Kevin Bowling from comment #19)

Sure. I started with an up to date `releng/12.4` branch with no changes. The `git revert 6486e9dd8d24b0195facd23d8ca82e17e180cffb` command could not apply cleanly without manual conflict resolution. I think I resolved it properly -- here's the diff between my local version and origin/releng/12.4:

[root@nfs src]# git diff origin/releng/12.4 sys/dev/e1000/if_em.c
diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
index a9ab2fb21535..0f20449db6ec 100644
--- a/sys/dev/e1000/if_em.c
+++ b/sys/dev/e1000/if_em.c
@@ -1361,7 +1361,7 @@ em_if_init(if_ctx_t ctx)
 	em_setup_vlan_hw_support(ctx);
 
 	/* Don't lose promiscuous settings */
-	em_if_set_promisc(ctx, if_getflags(ifp));
+	em_if_set_promisc(ctx, IFF_PROMISC);
 	e1000_clear_hw_cntrs_base_generic(&sc->hw);
 
 	/* MSI-X configuration for 82574 */

I then proceeded to build and install a GENERIC kernel from this source. After rebooting on the new kernel, I re-ran the `iperf` tests and the performance is still terrible:

[root@nfs dd]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 50946
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 27230
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 53346
recv failed: Connection reset by peer
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-79.38 sec 60.0 Bytes  6.05 bits/sec
recv failed: Connection reset by peer
[  2] 0.00-79.38 sec 60.0 Bytes  6.05 bits/sec
recv failed: Connection reset by peer
[  3] 0.00-79.38 sec 60.0 Bytes  6.05 bits/sec
[SUM] 0.00-122.08 sec 180 Bytes  11.8 bits/sec

Did I do the revert wrong? Or do we need to look at other changes that were made in the em driver?
Can you go back to your good early kernel state and apply the patch in question? If it's still good there, you will in any case have to do a bisect, but apply the patch on top every time to find the underlying issue.

Cheers,
Franco
(In reply to Franco Fichtner from comment #21) Sure. To clarify, you are asking me to do another `git bisect` like I did before, but this time to apply this latest patch to each subsequent version -- the ones that were bad in the earlier bisect I did -- in order to see where it then breaks?
Just a small datapoint re: i210. I have a Supermicro board (X11SSH-F) with:

pciconf -lvcb igb0
igb0@pci0:2:0:0:        class=0x020000 card=0x153315d9 chip=0x15338086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I210 Gigabit Network Connection'
    class      = network
    subclass   = ethernet

that terminates an iSCSI vol on ZFS across the internet via ipsec tunnel. Tracking RELENG_12 has not seen any performance differences for me. The remote connection is limited to 500Mb, so it maxes out there. However, I am not using lagg nor vlans on that interface. This is 12.4-STABLE.
Hi Daniel,

Yes, at least do it once in your original state to verify that the commit you found merely unmasks the real cause of the degradation. If that is the case, the initial old commit you started the bisect from would not drop performance with this patch applied. Or it would, but then the issue might reach back even further. I'm only speculating about it, though.

Cheers,
Franco
(In reply to mike from comment #23) Thanks Mike. In line with your points, I was able to work around this issue on all my other servers with 12.4-RELEASE by removing the use of vlan and just running straight igb. So there is something about vlan combined with these if_em driver changes that is causing my issues here. Unfortunately on this particular server, I am port-limited and require some redundancy, hence the need to have the igb + lagg + vlan combination work. And in case anyone wonders, no I did not try igb + lagg without vlan because that would make my port limitation even worse than just going straight igb like I did on all the other servers (which have 4-6 igb ports each).
(In reply to Franco Fichtner from comment #24) Thanks, I think I understand. Would it be acceptable to just checkout the first "bad" commit from my bisect above and then apply the patch to that and test it?
Not the first bad commit, the first commit you started the bisect on which is the one that has reliably good performance.
(In reply to Franco Fichtner from comment #27) Ok, thanks. So you want me to go back to the first good commit (which already works without any patches) and then you want me to apply the patch to that version to make sure it still works as it already does? Sorry, I'm a little confused at the goal here.
I think the intention there is to try to figure out which coincidental commit could be at fault, because the offending commit just enables functionality and is not necessarily a smoking gun. I'm not sure if there is a streamlined way to do this kind of bisect; hopefully 'git cherry-pick' and 'git reset' will suffice in the build/test loop. Right now the likely suspects could be flag-management changes in e1000 or some change in lagg(4) I am not privy to.
(In reply to Kevin Bowling from comment #29) Thanks Kevin. I'm happy to do whatever is needed here, I'm just unclear on exactly what you guys are asking me to do here. I don't understand what change I'm trying to apply to the commit _that already works in the first place_. Also, is there any chance this issue would _not_ occur in 13.x? Or, would you all be more motivated to fix if it _did_ occur in 13.x? I'm not stuck on 12.x and would prefer to take the easiest path forward and put everyone's efforts where it matters most.
(In reply to Daniel Duerr from comment #30)

The idea is to bisect in somewhat of a reverse fashion, where you introduce the one-line change during each bisection step:

-	em_if_set_promisc(ctx, if_getflags(ifp));
+	em_if_set_promisc(ctx, IFF_PROMISC);

and see if that triggers a different resultant commit hash. Instead of walking backwards to see where things break, which you did, and which identified this line of code, we want to see when this line change results in bad performance if it were applied earlier, before and during other code changes. If I understand correctly, applying this one line change to 12.2 did not result in poor performance. If you have already done this and the performance is poor on 12.2 with the change, there is no need to bisect further.

The drivers are very similar between 12, 13, and 14, so if it is a driver regression it probably affects all of these. If it is some cross cut with the network stack there could be something different, but I am not aware of major differences in this area.
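To make the loop concrete, one bisection iteration would look roughly like the sketch below. It assumes the one-line change is made by hand at each step, since the surrounding code drifts and a stored patch may not apply cleanly everywhere:

cd /usr/src
# hand-edit sys/dev/e1000/if_em.c in em_if_init():
#   change em_if_set_promisc(ctx, if_getflags(ifp));
#   back to em_if_set_promisc(ctx, IFF_PROMISC);
make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG
make installkernel KERNCONF=GENERIC-NODEBUG
reboot
# after the reboot, run the iperf test, then:
git checkout -- sys/dev/e1000/if_em.c
git bisect good    # or 'git bisect bad' if performance was poor with the change applied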
(In reply to Kevin Bowling from comment #31) Thanks Kevin, this helps. What I will do is rerun the original `git bisect` I had done above, but this time I will apply that patch to each commit and see if I can get a different outcome on the subsequent commits that had been "bad" in the original bisect. Make sense?
(In reply to Daniel Duerr from comment #32) You got it
(In reply to Kevin Bowling from comment #33) Hi Kevin, Here's my first progress report: ### Round 1: Restart the `git bisect` and confirm the commit 68cfeeb1d3c4 works [root@nfs ~]# cd /usr/src [root@nfs ~]# git checkout releng/12.4 [root@nfs ~]# git bisect start release/12.4.0 release/12.2.0 -- sys/dev/e1000 Bisecting: a merge base must be tested [68cfeeb1d3c428e3c3881f45bc3a20a252b37d0e] MFC r365284: [root@nfs ~]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG [root@nfs ~]# reboot [root@nfs ~]# uname -a FreeBSD nfs.tidepool.cloud 12.2-PRERELEASE FreeBSD 12.2-PRERELEASE 68cfeeb1d3c4(HEAD) GENERIC-NODEBUG amd64 [root@nfs ~]# iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 64.0 KByte (default) ------------------------------------------------------------ [ 1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 10616 [ ID] Interval Transfer Bandwidth [ 1] 0.00-10.00 sec 1.15 GBytes 985 Mbits/sec [ 2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 20446 [ ID] Interval Transfer Bandwidth [ 2] 0.00-10.00 sec 1.15 GBytes 988 Mbits/sec [ 3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 10068 [ ID] Interval Transfer Bandwidth [ 3] 0.00-10.00 sec 1.15 GBytes 985 Mbits/sec ### Round 2: Apply the change to sys/dev/e1000/if_em.c in commit 68cfeeb1d3c4 and see if it still works [root@nfs src]# vi sys/dev/e1000/if_em.c [root@nfs src]# git diff diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c index 558a75ac015e..42faacfc3eea 100644 --- a/sys/dev/e1000/if_em.c +++ b/sys/dev/e1000/if_em.c @@ -1338,7 +1338,7 @@ em_if_init(if_ctx_t ctx) } /* Don't lose promiscuous settings */ - em_if_set_promisc(ctx, IFF_PROMISC); + em_if_set_promisc(ctx, if_getflags(ifp)); e1000_clear_hw_cntrs_base_generic(&adapter->hw); /* MSI-X configuration for 82574 */ [root@nfs ~]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG [root@nfs ~]# reboot [root@nfs ~]# uname -a FreeBSD nfs.tidepool.cloud 12.2-PRERELEASE FreeBSD 12.2-PRERELEASE #15 68cfeeb1d3c4(HEAD)-dirty: Fri Mar 17 10:50:14 PDT 2023 toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64 [root@nfs ~]# iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 64.0 KByte (default) ------------------------------------------------------------ [ 1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 47216 [ 2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 37030 [ 3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 41145 ^CWaiting for server threads to complete. Interrupt again to force quit. [ ID] Interval Transfer Bandwidth [ 1] 0.00-76.70 sec 60.0 Bytes 6.26 bits/sec [ 2] 0.00-56.38 sec 60.0 Bytes 8.51 bits/sec [ 3] 0.00-36.23 sec 60.0 Bytes 13.2 bits/sec [SUM] 0.00-78.70 sec 180 Bytes 18.3 bits/sec In summary, the first good commit in the bisect works and when I apply the change, it stops working. As a next step, I was thinking I'd (a) revert the manual change back to the "good" state, (b) so a `git bisect good` to proceed to the next (first "bad") commit, and (c) manually reverse the change the other way to see if the bad commit then works. Does that make sense?
(In reply to Daniel Duerr from comment #34) If, after applying the diff it is bad, mark it bad, then undo the change and proceed to the next commit. Only mark a commit as good if it works as expected _with_ the change over top.
(In reply to Kevin Bowling from comment #35) Hi Kevin, Okay, I finished another `git bisect` for you as we discussed, but this time I manually reverted the `if_getflags(ifp)` change each time to see how that would affect the results. Also, on the first commit (the one that works as-is), I manually applied the `if_getflags(ifp)` change and confirmed it broke it. I then reverted that, did a `git bisect good` on the first commit, and proceeded with the rest of the process. It definitely produces a different result. Here's the log: ### Step 1: Restart the `git bisect` and confirm the first commit still works [root@nfs src]# git checkout releng/12.4 [root@nfs src]# git bisect start release/12.4.0 release/12.2.0 -- sys/dev/e1000 Bisecting: a merge base must be tested [68cfeeb1d3c428e3c3881f45bc3a20a252b37d0e] MFC r365284: [root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG [root@nfs src]# reboot [root@nfs src]# uname -a FreeBSD nfs.tidepool.cloud 12.2-PRERELEASE FreeBSD 12.2-PRERELEASE 68cfeeb1d3c4(HEAD) GENERIC-NODEBUG amd64 [root@nfs src]# iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 64.0 KByte (default) ------------------------------------------------------------ [ 1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 10616 [ ID] Interval Transfer Bandwidth [ 1] 0.00-10.00 sec 1.15 GBytes 985 Mbits/sec [ 2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 20446 [ ID] Interval Transfer Bandwidth [ 2] 0.00-10.00 sec 1.15 GBytes 988 Mbits/sec [ 3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 10068 [ ID] Interval Transfer Bandwidth [ 3] 0.00-10.00 sec 1.15 GBytes 985 Mbits/sec ### Step 2: Recreate the if_getflags(ifp) change to sys/dev/e1000/if_em.c on the first commit and see if it breaks it [root@nfs src]# vi sys/dev/e1000/if_em.c [root@nfs src]# git diff diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c index 558a75ac015e..42faacfc3eea 100644 --- a/sys/dev/e1000/if_em.c +++ b/sys/dev/e1000/if_em.c @@ -1338,7 +1338,7 @@ em_if_init(if_ctx_t ctx) } /* Don't lose promiscuous settings */ - em_if_set_promisc(ctx, IFF_PROMISC); + em_if_set_promisc(ctx, if_getflags(ifp)); e1000_clear_hw_cntrs_base_generic(&adapter->hw); /* MSI-X configuration for 82574 */ [root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG [root@nfs src]# reboot [root@nfs src]# uname -a FreeBSD nfs.tidepool.cloud 12.2-PRERELEASE FreeBSD 12.2-PRERELEASE #15 68cfeeb1d3c4(HEAD)-dirty: Fri Mar 17 10:50:14 PDT 2023 toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64 [root@nfs src]# iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 64.0 KByte (default) ------------------------------------------------------------ [ 1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 47216 [ 2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 37030 [ 3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 41145 ^CWaiting for server threads to complete. Interrupt again to force quit. 
[ ID] Interval Transfer Bandwidth [ 1] 0.00-76.70 sec 60.0 Bytes 6.26 bits/sec [ 2] 0.00-56.38 sec 60.0 Bytes 8.51 bits/sec [ 3] 0.00-36.23 sec 60.0 Bytes 13.2 bits/sec [SUM] 0.00-78.70 sec 180 Bytes 18.3 bits/sec [root@nfs src]# git checkout -- sys/dev/e1000/if_em.c [root@nfs src]# git diff [root@nfs src]# git bisect good ### Step 3: Advance to next commit (originally first bad), manually reverse the if_getflags(ifp) change to sys/dev/e1000/if_em.c and see if it now works [root@nfs src]# vi sys/dev/e1000/if_em.c [root@nfs src]# git diff diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c index ce13d57da60b..938c30a03f49 100644 --- a/sys/dev/e1000/if_em.c +++ b/sys/dev/e1000/if_em.c @@ -1360,7 +1360,7 @@ em_if_init(if_ctx_t ctx) } /* Don't lose promiscuous settings */ - em_if_set_promisc(ctx, if_getflags(ifp)); + em_if_set_promisc(ctx, IFF_PROMISC); e1000_clear_hw_cntrs_base_generic(&adapter->hw); /* MSI-X configuration for 82574 */ [root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG [root@nfs src]# reboot [root@nfs src]# uname -a FreeBSD nfs.tidepool.cloud 12.2-STABLE FreeBSD 12.2-STABLE #18 n233898-355177efed6c-dirty: Fri Mar 17 14:27:50 PDT 2023 toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64 [root@nfs src]# iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 64.0 KByte (default) ------------------------------------------------------------ [ 1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 17746 [ 2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 17750 [ 3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 17751 ^CWaiting for server threads to complete. Interrupt again to force quit. 
[ ID] Interval Transfer Bandwidth [ 1] 0.00-76.54 sec 60.0 Bytes 6.27 bits/sec [ 2] 0.00-56.22 sec 60.0 Bytes 8.54 bits/sec [ 3] 0.00-36.32 sec 60.0 Bytes 13.2 bits/sec [SUM] 0.00-79.15 sec 180 Bytes 18.2 bits/sec [root@nfs src]# git checkout -- sys/dev/e1000/if_em.c [root@nfs src]# git bisect bad Bisecting: 16 revisions left to test after this (roughly 4 steps) [ded3123049a592ec1f9c5b757e3f0f98f104d6cf] e1000: fix build after 92804cf3dc48 (orig c1655b0f) ### Step 4: Advance to next commit, manually reverse the if_getflags(ifp) change to sys/dev/e1000/if_em.c and see if it now works [root@nfs src]# vi sys/dev/e1000/if_em.c [root@nfs src]# git diff diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c index bcf7e0e9ec56..a9c00e58d880 100644 --- a/sys/dev/e1000/if_em.c +++ b/sys/dev/e1000/if_em.c @@ -1362,7 +1362,7 @@ em_if_init(if_ctx_t ctx) } /* Don't lose promiscuous settings */ - em_if_set_promisc(ctx, if_getflags(ifp)); + em_if_set_promisc(ctx, IFF_PROMISC); e1000_clear_hw_cntrs_base_generic(&adapter->hw); /* MSI-X configuration for 82574 */ [root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG [root@nfs src]# reboot [root@nfs src]# uname -a FreeBSD nfs.tidepool.cloud 12.2-STABLE FreeBSD 12.2-STABLE #19 n233610-ded3123049a5-dirty: Fri Mar 17 19:17:26 PDT 2023 toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64 [root@nfs src]# iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 64.0 KByte (default) ------------------------------------------------------------ [ 1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 39123 [ 2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 51144 [ 3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 22030 recv failed: Connection reset by peer [ ID] Interval Transfer Bandwidth [ 1] 0.00-79.38 sec 60.0 Bytes 6.05 bits/sec recv failed: Connection reset by peer [ 2] 0.00-79.38 sec 60.0 Bytes 6.05 bits/sec recv failed: Connection reset by peer [ 3] 0.00-79.38 sec 60.0 Bytes 6.05 bits/sec [SUM] 0.00-122.26 sec 180 Bytes 11.8 bits/sec [root@nfs src]# git checkout -- sys/dev/e1000/if_em.c [root@nfs src]# git bisect bad Bisecting: 7 revisions left to test after this (roughly 3 steps) [60b1634944ed4c19c1db5d1c5f9ed9c83ed6585b] e1000: Improve device name strings ### Step 5: Advance to next commit, manually reverse the if_getflags(ifp) change to sys/dev/e1000/if_em.c and see if it now works [root@nfs src]# vi sys/dev/e1000/if_em.c [root@nfs src]# git diff diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c index f284de275066..919f687e5992 100644 --- a/sys/dev/e1000/if_em.c +++ b/sys/dev/e1000/if_em.c @@ -1363,7 +1363,7 @@ em_if_init(if_ctx_t ctx) } /* Don't lose promiscuous settings */ - em_if_set_promisc(ctx, if_getflags(ifp)); + em_if_set_promisc(ctx, IFF_PROMISC); e1000_clear_hw_cntrs_base_generic(&adapter->hw); /* MSI-X configuration for 82574 */ [root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG [root@nfs src]# reboot [root@nfs src]# uname -a FreeBSD nfs.tidepool.cloud 12.2-STABLE FreeBSD 12.2-STABLE #20 n233210-60b1634944ed-dirty: Fri Mar 17 20:02:39 PDT 2023 toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64 [root@nfs src]# iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 
64.0 KByte (default) ------------------------------------------------------------ [ 1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 13641 [ 2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 44007 [ 3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 65487 ^CWaiting for server threads to complete. Interrupt again to force quit. [ ID] Interval Transfer Bandwidth [ 2] 0.00-48.64 sec 60.0 Bytes 9.87 bits/sec [ 3] 0.00-28.22 sec 60.0 Bytes 17.0 bits/sec [ 1] 0.00-72.92 sec 60.0 Bytes 6.58 bits/sec [SUM] 0.00-72.92 sec 180 Bytes 19.7 bits/sec [root@nfs src]# git checkout -- sys/dev/e1000/if_em.c [root@nfs src]# git bisect bad Bisecting: 3 revisions left to test after this (roughly 2 steps) [c9c1838988faa8bcb74af30384ab45a483562727] e1000: Add support for [Tiger, Alder, Meteor] Lake ### Step 6: Advance to next commit, manually reverse the if_getflags(ifp) change to sys/dev/e1000/if_em.c and see if it now works [root@nfs src]# vi sys/dev/e1000/if_em.c [root@nfs src]# git diff diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c index 79a9d8fdcfe9..839454c20fd7 100644 --- a/sys/dev/e1000/if_em.c +++ b/sys/dev/e1000/if_em.c @@ -1363,7 +1363,7 @@ em_if_init(if_ctx_t ctx) } /* Don't lose promiscuous settings */ - em_if_set_promisc(ctx, if_getflags(ifp)); + em_if_set_promisc(ctx, IFF_PROMISC); e1000_clear_hw_cntrs_base_generic(&adapter->hw); /* MSI-X configuration for 82574 */ [root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG [root@nfs src]# reboot [root@nfs src]# uname -a FreeBSD nfs.tidepool.cloud 12.2-STABLE FreeBSD 12.2-STABLE #21 n233050-c9c1838988fa-dirty: Sat Mar 18 07:55:11 PDT 2023 toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64 [root@nfs src]# iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 64.0 KByte (default) ------------------------------------------------------------ [ 1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 23361 [ ID] Interval Transfer Bandwidth [ 1] 0.00-10.00 sec 1.02 GBytes 877 Mbits/sec [ 2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 23362 [ ID] Interval Transfer Bandwidth [ 2] 0.00-10.00 sec 1.07 GBytes 917 Mbits/sec [ 3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 23363 [ ID] Interval Transfer Bandwidth [ 3] 0.00-10.00 sec 1.15 GBytes 986 Mbits/sec [root@nfs src]# git checkout -- sys/dev/e1000/if_em.c [root@nfs src]# git bisect good Bisecting: 1 revision left to test after this (roughly 1 step) [94c02a765cb7f68c80844acb5898be90dc4069c5] e1000: disable hw.em.sbp debug setting ### Step 7: Advance to next commit, manually reverse the if_getflags(ifp) change to sys/dev/e1000/if_em.c and see if it now works [root@nfs src]# vi sys/dev/e1000/if_em.c [root@nfs src]# git diff diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c index 2c13f7750af2..0ff2bd00d6b0 100644 --- a/sys/dev/e1000/if_em.c +++ b/sys/dev/e1000/if_em.c @@ -1363,7 +1363,7 @@ em_if_init(if_ctx_t ctx) } /* Don't lose promiscuous settings */ - em_if_set_promisc(ctx, if_getflags(ifp)); + em_if_set_promisc(ctx, IFF_PROMISC); e1000_clear_hw_cntrs_base_generic(&adapter->hw); /* MSI-X configuration for 82574 */ [root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG [root@nfs src]# reboot [root@nfs src]# uname -a FreeBSD nfs.tidepool.cloud 12.2-STABLE FreeBSD 
12.2-STABLE #21 n233050-c9c1838988fa-dirty: Sat Mar 18 07:55:11 PDT 2023 toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64 [root@nfs src]# iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 64.0 KByte (default) ------------------------------------------------------------ [ 1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 41958 [ ID] Interval Transfer Bandwidth [ 1] 0.00-10.00 sec 1.14 GBytes 982 Mbits/sec [ 2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 41959 [ ID] Interval Transfer Bandwidth [ 2] 0.00-10.00 sec 1.15 GBytes 987 Mbits/sec [ 3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 41960 [ ID] Interval Transfer Bandwidth [ 3] 0.00-10.00 sec 1.15 GBytes 983 Mbits/sec [root@nfs src]# git checkout -- sys/dev/e1000/if_em.c [root@nfs src]# git bisect good Bisecting: 0 revisions left to test after this (roughly 0 steps) [1a132077c2cb500410079f9120c3f676d15f7931] e1000: fix em_mac_min and 82547 packet buffer ### Step 8: Advance to next commit, manually reverse the if_getflags(ifp) change to sys/dev/e1000/if_em.c and see if it now works [root@nfs src]# vi sys/dev/e1000/if_em.c [root@nfs src]# git diff diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c index ce60b1f5d437..e8f215dfa089 100644 --- a/sys/dev/e1000/if_em.c +++ b/sys/dev/e1000/if_em.c @@ -1363,7 +1363,7 @@ em_if_init(if_ctx_t ctx) } /* Don't lose promiscuous settings */ - em_if_set_promisc(ctx, if_getflags(ifp)); + em_if_set_promisc(ctx, IFF_PROMISC); e1000_clear_hw_cntrs_base_generic(&adapter->hw); /* MSI-X configuration for 82574 */ [root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG [root@nfs src]# reboot [root@nfs src]# uname -a FreeBSD nfs.tidepool.cloud 12.2-STABLE FreeBSD 12.2-STABLE #23 n233156-1a132077c2cb-dirty: Mon Mar 20 09:41:24 PDT 2023 toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64 [root@nfs src]# iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 64.0 KByte (default) ------------------------------------------------------------ [ 1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 20023 [ 2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 20024 [ 3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 15792 ^CWaiting for server threads to complete. Interrupt again to force quit. [ ID] Interval Transfer Bandwidth [ 1] 0.00-72.73 sec 60.0 Bytes 6.60 bits/sec [ 2] 0.00-52.37 sec 60.0 Bytes 9.17 bits/sec [ 3] 0.00-32.30 sec 60.0 Bytes 14.9 bits/sec [SUM] 0.00-74.75 sec 180 Bytes 19.3 bits/sec [root@nfs src]# git checkout -- sys/dev/e1000/if_em.c [root@nfs src]# git bisect bad 1a132077c2cb500410079f9120c3f676d15f7931 is the first bad commit commit 1a132077c2cb500410079f9120c3f676d15f7931 Author: Kevin Bowling <kbowling@FreeBSD.org> Date: Thu Apr 15 09:58:36 2021 -0700 e1000: fix em_mac_min and 82547 packet buffer The boundary differentiating "lem" vs "em" class devices was wrong after the iflib conversion of lem(4). The Packet Buffer size for 82547 class chips was not set correctly after the iflib conversion of lem(4). These changes restore functionality on an 82547 for the submitter. 
PR: 236119 Reported by: Jeff Gibbons <jgibbons@protogate.com> Reviewed by: markj MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D29766 (cherry picked from commit bb1b375fa7487ee5c3843121a0621ac8379c18e6) sys/dev/e1000/if_em.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)
(In reply to Daniel Duerr from comment #36)

Thanks for doing this. This commit doesn't make sense to me either; it shouldn't have any effect on the I210. Can you do a sanity check and tell me whether stable/12 of recent vintage with 'git revert 1a132077c2cb500410079f9120c3f676d15f7931' performs correctly?
(In reply to Kevin Bowling from comment #37)

Ok. As you requested, I've recreated my local branch from a fresh checkout of origin/releng/12.4 with no local changes, and performed a `git revert 1a132077c2cb500410079f9120c3f676d15f7931` to back out that last bad commit. The commit reverted cleanly without conflict, but now I am unable to build the kernel as I get an error:

/usr/src/sys/dev/e1000/if_em.c:2544:7: error: use of undeclared identifier 'adapter'
        if (adapter->hw.mac.max_frame_size > 8192)
            ^
1 error generated.
*** [if_em.o] Error code 1

make[2]: stopped in /usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG
--- em_txrx.o ---
ctfconvert -L VERSION -g em_txrx.o
--- if_de.o ---
ctfconvert -L VERSION -g if_de.o
--- modules-all ---
ctfconvert -L VERSION -g ah_regdomain.o
*** [modules-all] Error code 2

make[2]: stopped in /usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG
2 errors
make[2]: stopped in /usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG

Not sure where to go next.
(In reply to Daniel Duerr from comment #38)

Change the reference 'adapter' to 'sc' on the line indicated; it's just a simple rename.
(In reply to Kevin Bowling from comment #39) Thanks Kevin. I've run two passes on my local branch of origin/releng/12.4 with a `git revert 1a132077c2cb500410079f9120c3f676d15f7931` to back out that last bad commit. In summary, the performance is still dismal in both of the following cases. My gut says the problem is in the `vlan` driver, not the `em` (igb) driver. On other machines that had this same exact problem, I was not using lagg but I was using vlan. Getting rid of vlan worked around this problem. Anyways, here's the log for you: ## Round 1: fix adapter -> sc [root@nfs src]# git diff diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c index e3da4a2f3d20..3d80d7dc1a3b 100644 --- a/sys/dev/e1000/if_em.c +++ b/sys/dev/e1000/if_em.c @@ -2541,7 +2541,7 @@ em_reset(if_ctx_t ctx) pba = E1000_PBA_34K; break; default: - if (adapter->hw.mac.max_frame_size > 8192) + if (sc->hw.mac.max_frame_size > 8192) pba = E1000_PBA_40K; /* 40K for Rx, 24K for Tx */ else pba = E1000_PBA_48K; /* 48K for Rx, 16K for Tx */ [root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG [root@nfs src]# reboot [root@nfs src]# uname -a FreeBSD nfs.tidepool.cloud 12.4-RELEASE-p1 FreeBSD 12.4-RELEASE-p1 #25 releng/12.4-n235814-4f54a7f1b95c-dirty: Thu Apr 13 06:23:53 PDT 2023 toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64 [root@nfs src]# iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 64.0 KByte (default) ------------------------------------------------------------ [ 1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 17974 [ 2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 49417 [ 3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 64045 ^CWaiting for server threads to complete. Interrupt again to force quit. 
[ ID] Interval Transfer Bandwidth [ 2] 0.00-28.43 sec 60.0 Bytes 16.9 bits/sec [ 3] 0.00-8.09 sec 60.0 Bytes 59.4 bits/sec [ 1] 0.00-52.52 sec 60.0 Bytes 9.14 bits/sec [SUM] 0.00-52.52 sec 180 Bytes 27.4 bits/sec ## Round 2: fix adapter -> sc, restore em_if_set_promisc(ctx, IFF_PROMISC) [root@nfs src]# git diff diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c index e3da4a2f3d20..4a45a0f84ce8 100644 --- a/sys/dev/e1000/if_em.c +++ b/sys/dev/e1000/if_em.c @@ -1361,7 +1361,7 @@ em_if_init(if_ctx_t ctx) em_setup_vlan_hw_support(ctx); /* Don't lose promiscuous settings */ - em_if_set_promisc(ctx, if_getflags(ifp)); + em_if_set_promisc(ctx, IFF_PROMISC); e1000_clear_hw_cntrs_base_generic(&sc->hw); /* MSI-X configuration for 82574 */ @@ -2541,7 +2541,7 @@ em_reset(if_ctx_t ctx) pba = E1000_PBA_34K; break; default: - if (adapter->hw.mac.max_frame_size > 8192) + if (sc->hw.mac.max_frame_size > 8192) pba = E1000_PBA_40K; /* 40K for Rx, 24K for Tx */ else pba = E1000_PBA_48K; /* 48K for Rx, 16K for Tx */ [root@nfs src]# make -j `sysctl -n hw.ncpu` buildkernel KERNCONF=GENERIC-NODEBUG && make installkernel KERNCONF=GENERIC-NODEBUG [root@nfs src]# reboot [root@nfs src]# uname -a FreeBSD nfs.tidepool.cloud 12.4-RELEASE-p1 FreeBSD 12.4-RELEASE-p1 #26 releng/12.4-n235814-4f54a7f1b95c-dirty: Thu Apr 13 06:52:12 PDT 2023 toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64 [root@nfs src]# iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 64.0 KByte (default) ------------------------------------------------------------ [ 1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 34226 [ 2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 42851 [ 3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 23908 recv failed: Connection reset by peer [ ID] Interval Transfer Bandwidth [ 1] 0.00-79.38 sec 60.0 Bytes 6.05 bits/sec recv failed: Connection reset by peer [ 2] 0.00-79.37 sec 60.0 Bytes 6.05 bits/sec recv failed: Connection reset by peer [ 3] 0.00-79.37 sec 60.0 Bytes 6.05 bits/sec [SUM] 0.00-122.44 sec 180 Bytes 11.8 bits/sec
Hi Daniel,

Can you please try taking down one of the lagg members, either igb0 or igb1, and run the iperf test again?

```
# ifconfig igb0 down
```
(In reply to Zhenlei Huang from comment #41) Hi Zhenlei, Sure. Here's the result of that test, using the same kernel I had built for Kevin in my previous comment yesterday. The performance is still dismal. [root@nfs src]# uname -a FreeBSD nfs.tidepool.cloud 12.4-RELEASE-p1 FreeBSD 12.4-RELEASE-p1 #26 releng/12.4-n235814-4f54a7f1b95c-dirty: Thu Apr 13 06:52:12 PDT 2023 toor@nfs.tidepool.cloud:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64 [root@nfs src]# ifconfig lagg0 lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000 options=800028<VLAN_MTU,JUMBO_MTU> ether 00:25:90:d6:e6:72 laggproto lacp lagghash l2,l3,l4 laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> groups: lagg media: Ethernet autoselect status: active nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL> [root@nfs src]# ifconfig igb0 down [root@nfs src]# ifconfig lagg0 lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000 options=800028<VLAN_MTU,JUMBO_MTU> ether 00:25:90:d6:e6:72 laggproto lacp lagghash l2,l3,l4 laggport: igb0 flags=0<> laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> groups: lagg media: Ethernet autoselect status: active nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL> [root@nfs src]# iperf -s ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 64.0 KByte (default) ------------------------------------------------------------ [ 1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 26356 [ 2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 26357 [ 3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 30833 recv failed: Connection reset by peer [ ID] Interval Transfer Bandwidth [ 1] 0.00-79.38 sec 60.0 Bytes 6.05 bits/sec recv failed: Connection reset by peer [ 2] 0.00-79.38 sec 60.0 Bytes 6.05 bits/sec recv failed: Connection reset by peer [ 3] 0.00-79.38 sec 60.0 Bytes 6.05 bits/sec [SUM] 0.00-122.03 sec 180 Bytes 11.8 bits/sec
Based on the first comment:

- the iperf3 towards 172.27.6.135 is not working and has mtu set to 9000.
- the iperf3 towards 172.27.6.129 is working fine with mtu set to 1500.

Please can you try setting lagg0.8 to an mtu of 1500 and retry? Just wondering if you have MTU issues between the two hosts.
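In practice that would presumably just be a temporary change on the vlan child before re-running the iperf test, e.g.:

# ifconfig lagg0.8 mtu 1500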
The bug was introduced somewhere in 13.0-CURRENT (see bug 260260). Since early stable/13 I have been working around this by setting the MTU to 9004 on faulty igb/em(4) interfaces; 9004 is inherited by lagg(4), but the vlan(4) children are assigned an MTU of 9000.
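Translated to the rc.conf from the original report, that variant of the workaround would presumably amount to raising the physical-port MTUs by 4 bytes while leaving the vlan child at 9000 (a sketch, not tested on this particular setup):

ifconfig_igb0="mtu 9004 media 1000baseTX mediaopt full-duplex -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso -vlanhwtag -vlanhwcsum -vlanhwfilter up"
ifconfig_igb1="mtu 9004 media 1000baseTX mediaopt full-duplex -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso -vlanhwtag -vlanhwcsum -vlanhwfilter up"
ifconfig_lagg0_8="inet 172.27.6.135/28 mtu 9000"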
Yep, that makes sense...
(In reply to Santiago Martinez from comment #43)

Hi Santiago,

Both 172.27.6.129 and 172.27.6.135 are on the same MTU 9000 storage subnet. I was careful to ensure this for my testing. All of my iperf tests, both failures and successes, have been between these 2 IPs on a properly configured MTU 9000 vlan which only has hosts on it using MTU 9000. Also, and FWIW, this storage subnet was configured to MTU 9000 long before this issue ever started occurring. It is a mature, known-good config.

Best, Daniel
(In reply to Marek Zarychta from comment #44) Hi Marek- are you suggesting I try your MTU 9004 workaround here? Please note my previous comment to Santiago about how this is _not_ a mixed MTU subnet.
(In reply to Daniel Duerr from comment #47) Please feel free to apply the workaround. It's still a bug but can't be hit with a common MTU 1500 setting.
Hi Daniel, from the config posted it seems to be different, but maybe I misunderstood the first message. Anyhow, the behavior that Marek is describing is not limited only to those cards. You can increase the MTU on the phy or reduce the MTU on the logical. I can see the same issue with IXL, BNXT and so on.

This is an output from my lab machines:

ixl2: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 9000
ixl2.1011: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 9000
BR_DCN_PF: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000

Connecting to host 192.168.11.42, port 5201
[  5] local 192.168.11.41 port 42631 connected to 192.168.11.42 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.01   sec  69.9 KBytes   568 Kbits/sec    2   8.73 KBytes
[  5]   1.01-2.04   sec  0.00 Bytes   0.00 bits/sec     2   8.73 KBytes
[  5]   2.04-3.04   sec  0.00 Bytes   0.00 bits/sec     1   8.73 KBytes
[  5]   3.04-4.00   sec  0.00 Bytes   0.00 bits/sec     0   8.73 KBytes
[  5]   4.00-5.05   sec  0.00 Bytes   0.00 bits/sec     0   8.73 KBytes
^C[  5]   5.05-5.82   sec  0.00 Bytes   0.00 bits/sec     1   8.73 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -

ifconfig ixl2.1011 mtu 8974
ifconfig BR_DCN_PF mtu 8974
iperf3 -c 192.168.11.42
Connecting to host 192.168.11.42, port 5201
[  5] local 192.168.11.41 port 48630 connected to 192.168.11.42 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   120 MBytes  1.00 Gbits/sec    0   1.41 MBytes
[  5]   1.00-2.00   sec   118 MBytes   990 Mbits/sec    0   1.75 MBytes
Created attachment 241517 [details]
iperf mtu 9000, pcap

In the attached image you can see the issue: if the logical interface is set to an MTU of 9000, the resulting frame size on the phy is 9014, which explains why it fails.
Hi Daniel, just wondering if the workaround (mtu change) solved the issue? Take care.
(In reply to Santiago Martinez from comment #51)

Hi Santiago,

Thanks for the follow-up, apologies for the delayed response. I've recreated the original problem on a clean 12.4-RELEASE-p1 kernel build from source:

[root@nfs ~]# cd /usr/src
[root@nfs src]# uname -a
FreeBSD nfs.tidepool.cloud 12.4-RELEASE-p1 FreeBSD 12.4-RELEASE-p1 releng/12.4-n235813-52442e904dfc GENERIC-NODEBUG amd64
[root@nfs src]# ifconfig igb0 | grep mtu
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
[root@nfs src]# ifconfig igb1 | grep mtu
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
[root@nfs src]# ifconfig lagg0 | grep mtu
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
[root@nfs src]# ifconfig lagg0.8 | grep mtu
lagg0.8: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
[root@nfs src]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 26020
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 29025
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 33549
recv failed: Connection reset by peer
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-79.37 sec 60.0 Bytes  6.05 bits/sec
recv failed: Connection reset by peer
[  2] 0.00-79.38 sec 60.0 Bytes  6.05 bits/sec
recv failed: Connection reset by peer
[  3] 0.00-79.38 sec 60.0 Bytes  6.05 bits/sec
[SUM] 0.00-122.10 sec 180 Bytes  11.8 bits/sec

You can see the performance is dismal, as expected. Now, I've tried your MTU workaround by reducing the MTU on the logical like you said:

[root@nfs src]# ifconfig lagg0.8 mtu 8974
[root@nfs src]# ifconfig igb0 | grep mtu
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
[root@nfs src]# ifconfig igb1 | grep mtu
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
[root@nfs src]# ifconfig lagg0 | grep mtu
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
[root@nfs src]# ifconfig lagg0.8 | grep mtu
lagg0.8: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 8974
[root@nfs src]# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  1] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 56026
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.00 sec 1.14 GBytes  981 Mbits/sec
[  2] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 61809
[ ID] Interval       Transfer     Bandwidth
[  2] 0.00-10.00 sec  977 MBytes  819 Mbits/sec
[  3] local 172.27.6.135 port 5001 connected with 172.27.6.129 port 28069
[ ID] Interval       Transfer     Bandwidth
[  3] 0.00-10.00 sec 1.12 GBytes  963 Mbits/sec

Your workaround appears to also work for me, and the speeds are normal (great) again. Sounds like you expected to see this based on your knowledge of the other bug with MTU. Should I chime in on the other bug and provide any feedback there? And, should I make this MTU reduction on the logical interface permanent in my rc.conf for the time being?
Hi Daniel, sorry for my late reply; I went on holiday but am back online now. Thanks for confirming. I think at the moment the best thing will be to make the change permanent in rc.conf or another rc file.

Regarding the "bug", I think we need input from the net@ people. Usually a NOS performs this calculation automatically and adjusts the interfaces (parent or children). At the very least we should have a way to warn the user, perhaps by having ifconfig emit a warning.
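For the variant Daniel actually tested (lowering the vlan child's MTU to 8974), making it permanent would presumably just mean changing the corresponding line in /etc/rc.conf from the original report:

ifconfig_lagg0_8="inet 172.27.6.135/28 mtu 8974"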
(In reply to Santiago Martinez from comment #53) No problem, thank you for helping me get a workaround in place here. I'm happy to have the issues resolved, even if by workaround.