today i added a vlan subinterface to a bridge, and dmesg reported: bridge1: can't disable some capabilities on igb0.203: 0x400 this is IFCAP_LRO. bridge(4) says: "on all interfaces added to the bridge ... The LRO capability is always disabled." wikipedia(Large receive offload) says: "LRO should not operate on machines acting as routers, as it breaks the end-to-end principle and can significantly impact performance." sys/net/if_vlan.c says: /* * If the parent interface can do LRO and checksum offloading on * VLANs, then guess it may do LRO on VLANs. False positive here * cost nothing, while false negative may lead to some confusions. */ if (p->if_capabilities & IFCAP_VLAN_HWCSUM) cap |= p->if_capabilities & IFCAP_LRO; if (p->if_capenable & IFCAP_VLAN_HWCSUM) ena |= p->if_capenable & IFCAP_LRO; to get LRO to not be sticky-on for a vlan subinterface, i would have to remove LRO from the parent interface. i don't think this is functioning as intended, because i have other non-bridged subinterfaces for which LRO is appropriate. i am not posting a candidate fix because too many layers are being crossed here and there's not an obvious way to relax this deep constraint. someone with recent kernel architecture chops should take a look.
@Alexander Anyone else you suggest might be suitable/able to advise on this issue?
Maybe CC trouble as well, I think someone mentioned that he's been doing some bridge work lately.
I'm pretty sure I came across this on a truenas system where the physical interface is on the trunk and various vlans are derived from that for service jails. I had absolutely terrible throughput for one service and ended up just disabling LRO altogether.
Created attachment 242067 [details] untested patch
(In reply to paul vixie from comment #4) Would you kindly upload a git format-patch style patch or otherwise provide the email and name you would like used for Git's --author
(In reply to Ed Maste from comment #5) since it doesn't seem unreasonable to you, i'll do what you ask, but first i will test it. i think the patch is unreasonable and that it would be better to strike these two "if" statements than to containerize them to the non-bridge case. however, what i'm actually proposing is less controversial in that it's unable to break a working configuration. removing the LRO logic (those two if statements) could indeed break working configurations.
For what it's worth, I was able to test the patch and observe that LRO is disabled when adding a VLAN subinterface to a bridge, and re-enabled when removing it. Unfortunately, as the PRs Ed linked above show, the e1000 driver (what I have in my machine) doesn't actually support LRO even though it claims to, so I can't test that it meaningfully changes performance.
Created attachment 242133 [details] second patch, same content, but in "git format-patch" format
(In reply to Naman Sood from comment #7) i've now tested this patch and it does what i need. the LRO option remains on for the physical interface and for the vlan subinterfaces which are not added to a bridge. i did not performance-test the physical interface or other subinterfaces to see what impact LRO was having, but input coalescence is a theoretical boon and i'm glad to not have to turn off LRO on the physical interface in order to protect the bridge members from having LRO turned on inappropriately. the main motivation is to avoid input coalescence on the bridge member, since the other bridge members are tap(4) interfaces each belonging to a bhyve guest (where the interface shows up as a vtnet(4) interface.) input coalescence in this case is not just a performance problem it damages the input flow and makes the bhyve guest unusable. i was working around this by turning off LRO on the physical interface but this is a price too high: bhyve guests whose bridge is connected not by a physical interface but by a vlan subinterface must "just work". i am attaching a second patch, this time in "git format-patch" format, since this was requested. it looks the same to me. can this patch be reviewed for inclusion in 13.3 and also pulled into the 13.2 release in the next maintenance patch?
> can this patch be reviewed for inclusion in 13.3 and also pulled into the 13.2 > release in the next maintenance patch? In order to make it into a 13.2 errata update the first two steps will be committing to main and MFCing to stable/13. I'll handle that (likely after a couple of weeks). In order to help facilitate an errata update, would you be able to provide content for the errata notice? See https://www.freebsd.org/security/errata-template.txt, and in particular sections I through IV.
Created attachment 242136 [details] partial FreeBSD-EN-ERRATA_TEMPLATE as requested as requested
(In reply to Ed Maste from comment #10) done, see ticket attachments.
Is there anything blocking merging this patch to main? It would be nice to get this fix into the tree before 14.0 branches.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=5f11a33ceeb385477cb22d9ad5941061c5a26be9 commit 5f11a33ceeb385477cb22d9ad5941061c5a26be9 Author: Paul Vixie <paul@redbarn.org> AuthorDate: 2023-08-11 18:17:16 +0000 Commit: Kristof Provost <kp@FreeBSD.org> CommitDate: 2023-08-11 22:50:37 +0000 if_vlan: do not enable LRO for bridge interaces If the parent interface is not a bridge and can do LRO and checksum offloading on VLANs, then guess it may do LRO on VLANs. False positive here cost nothing, while false negative may lead to some confusions. According to Wikipedia: "LRO should not operate on machines acting as routers, as it breaks the end-to-end principle and can significantly impact performance." The same reasoning applies to machines acting as bridges. PR: 254596 MFC after: 3 weeks sys/net/if_vlan.c | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-)
The above patch was reverted, a different fix will land momentarily.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=b1a39c31a3569bd045a0f40057c3773fc8166f6d commit b1a39c31a3569bd045a0f40057c3773fc8166f6d Author: Kevin Bowling <kbowling@FreeBSD.org> AuthorDate: 2023-08-12 16:31:22 +0000 Commit: Kevin Bowling <kbowling@FreeBSD.org> CommitDate: 2023-08-12 16:39:23 +0000 vlan: Respect IFCAP_LRO mask vlan_capabilities(), used by the IFCAP ioctl, was not respecting the IFCAP_LRO bit if it was masked by the requestor. This prevented if_bridge(4) from automasking LRO with a message like: bridge0: can't disable some capabilities on em3.11: 0x400 This also prevented manually disabling LRO from any vlan interface. PR: 254596 Reported by: Paul Vixie <paul@redbarn.org> MFC after: 1 week sys/net/if_vlan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=f73eb35106cc65d19af0f2e82a111308c5090c3a commit f73eb35106cc65d19af0f2e82a111308c5090c3a Author: Kevin Bowling <kbowling@FreeBSD.org> AuthorDate: 2023-08-12 16:31:22 +0000 Commit: Kevin Bowling <kbowling@FreeBSD.org> CommitDate: 2023-08-18 00:59:30 +0000 vlan: Respect IFCAP_LRO mask vlan_capabilities(), used by the IFCAP ioctl, was not respecting the IFCAP_LRO bit if it was masked by the requestor. This prevented if_bridge(4) from automasking LRO with a message like: bridge0: can't disable some capabilities on em3.11: 0x400 This also prevented manually disabling LRO from any vlan interface. PR: 254596 Reported by: Paul Vixie <paul@redbarn.org> (cherry picked from commit b1a39c31a3569bd045a0f40057c3773fc8166f6d) sys/net/if_vlan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
A commit in branch stable/12 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=50b52fe14b8949539e7fe583d4578b1f2c96dc59 commit 50b52fe14b8949539e7fe583d4578b1f2c96dc59 Author: Kevin Bowling <kbowling@FreeBSD.org> AuthorDate: 2023-08-12 16:31:22 +0000 Commit: Kevin Bowling <kbowling@FreeBSD.org> CommitDate: 2023-08-18 01:00:07 +0000 vlan: Respect IFCAP_LRO mask vlan_capabilities(), used by the IFCAP ioctl, was not respecting the IFCAP_LRO bit if it was masked by the requestor. This prevented if_bridge(4) from automasking LRO with a message like: bridge0: can't disable some capabilities on em3.11: 0x400 This also prevented manually disabling LRO from any vlan interface. PR: 254596 Reported by: Paul Vixie <paul@redbarn.org> (cherry picked from commit b1a39c31a3569bd045a0f40057c3773fc8166f6d) sys/net/if_vlan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Passing to Ed per above comment about EN.
^Triage: to committer: was the EN for 12.3 ever completed?
^Triage: 12.3 has since gone EOL. The idea of an EN is now OBE.