Bug 273046 - xn: xen netfront does LRO even if packet forwarding is enabled
Summary: xn: xen netfront does LRO even if packet forwarding is enabled
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 13.2-RELEASE
Hardware: Any Any
: --- Affects Some People
Assignee: dfr
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-08-10 14:57 UTC by dfr
Modified: 2024-11-25 15:12 UTC (History)
5 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description dfr 2023-08-10 14:57:38 UTC
On a test VM running on an XCP-NG host, I noticed that large-receive was happening on the VM's interface for traffic that was intended to be routed to a vnet-enabled jail. The resulting large coalesced packets were too large to be routed to the jail, causing retransmits and very slow performance inside the vnet jail.

In my setup which uses podman to manage setting up network bridging for the vnet jail, net.inet.ip.forwarding is set quite late and after network interfaces are configured so I tried adding net.inet.ip.forwarding=1 to sysctl.conf to see if that fixed the problem but the netfront driver was still coalescing segments on receive - this is clearly visible in packet captures as oversized packets.

My impression from reading iflib.c was that LRO should not happen if packet forwarding is enabled but netfront is missing this logic.

Based on user reports, this problem may also be present in the virtio vtnet driver - one user reported slow podman network performance for a VM running on a Proxmox host.
Comment 1 dfr 2023-08-12 11:57:01 UTC
I tried to fix this today by adding logic to stop requesting LRO from the backend when forwarding is enabled. Unfortunately, the code which sets up features for the backend runs long before the value of net.inet.ip.forwarding is set (via sysctl.conf in my test). If LRO is disabled manually, the driver re-initialises the backend state.

There seems to be no way for the driver to detect changes to the forwarding state. The linux kernel seems to actively disable LRO on all devices when forwarding is enabled.

Also, we have confirmed that a similar issue of slow network throughput for vmm jails affects the vtnet driver for VMs running as guests on Proxmox hosts which can also be mitigated by disabling LRO. Perhaps a better assignee for this should be freebsd-net?
Comment 2 Roger Pau Monné freebsd_committer freebsd_triage 2023-08-12 12:01:36 UTC
Can you confirm that manually disabling LRO on the interface using ifconfig solves the issue?

How do other (non virt maybe?) drivers deal with this?

I'm currently on PTO, will try to take a look when I'm back (on the 20th of August).
Comment 3 dfr 2023-08-12 12:11:29 UTC
Manually disabling LRO using something like 'ifconfig xn0 -lro' does solve the issue.

I can see logic in sys/netinet/tcp_lro.c which checks for forwarding in tcp_lro_rx_common but I think that code is only active for software LRO. Iflib also checks before calling tcp_lro_rx but that seems also to be only effective for software LRO.

So far I haven't been able to figure out how any other driver manages this. I'm not even 100% sure which drivers support hardware LRO where coalescing happens inside the NIC.
Comment 4 dfr 2023-08-12 13:11:27 UTC
The netfront driver also has a tunable which seems to be intended for disabling LRO but it doesn't seeem to work as intended - I'll probably just fix that to make an easy workaround for the problem.
Comment 5 Roger Pau Monné freebsd_committer freebsd_triage 2023-08-15 17:51:28 UTC
I once tried to fix this, but my knowledge of networking internals is not that good.  I came up with:

https://reviews.freebsd.org/D6656
https://reviews.freebsd.org/D6612
https://reviews.freebsd.org/D6611

I was hoping that someone from the net side would pick those up.
Comment 6 commit-hook freebsd_committer freebsd_triage 2023-10-20 10:52:24 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=da4b0d6eb06d730487d48e15d2d5e10c56266fd9

commit da4b0d6eb06d730487d48e15d2d5e10c56266fd9
Author:     Doug Rabson <dfr@FreeBSD.org>
AuthorDate: 2023-08-12 13:19:47 +0000
Commit:     Doug Rabson <dfr@FreeBSD.org>
CommitDate: 2023-10-20 10:50:20 +0000

    netfront: fix the support for disabling LRO at boot time

    The driver has a tunable hw.xn.enable_lro which is intended to control
    whether LRO is enabled. This is currently non-functional - even if its
    set to zero, the driver still requests LRO support from the backend.
    This change fixes the feature so that if enable_lro is set to zero, LRO
    no longer appears in the interface capabilities and LRO is not requested
    from the backend.

    PR:             273046
    MFC after:      1 week
    Reviewed by:    royger
    Differential Revision: https://reviews.freebsd.org/D41439

 sys/dev/xen/netfront/netfront.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
Comment 7 commit-hook freebsd_committer freebsd_triage 2023-11-12 11:31:32 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=61ba094af4b4798bf2887c30af5c14d66ead706f

commit 61ba094af4b4798bf2887c30af5c14d66ead706f
Author:     Doug Rabson <dfr@FreeBSD.org>
AuthorDate: 2023-08-12 13:19:47 +0000
Commit:     Doug Rabson <dfr@FreeBSD.org>
CommitDate: 2023-11-12 10:37:16 +0000

    netfront: fix the support for disabling LRO at boot time

    The driver has a tunable hw.xn.enable_lro which is intended to control
    whether LRO is enabled. This is currently non-functional - even if its
    set to zero, the driver still requests LRO support from the backend.
    This change fixes the feature so that if enable_lro is set to zero, LRO
    no longer appears in the interface capabilities and LRO is not requested
    from the backend.

    PR:             273046
    MFC after:      1 week
    Reviewed by:    royger
    Differential Revision: https://reviews.freebsd.org/D41439

    (cherry picked from commit da4b0d6eb06d730487d48e15d2d5e10c56266fd9)

 sys/dev/xen/netfront/netfront.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
Comment 8 Mark Linimon freebsd_committer freebsd_triage 2024-11-25 05:11:46 UTC
^Triage: committed and MFCed.
Comment 9 Ed Maste freebsd_committer freebsd_triage 2024-11-25 14:10:11 UTC
The underlying issue likely still exists - I believe the referenced commit just makes it so that there's a viable workaround.
Comment 10 Roger Pau Monné freebsd_committer freebsd_triage 2024-11-25 15:12:36 UTC
I'm indeed not aware of the underlying issue being fixed.  As a better mitigation it seems like netfront should disable LRO if `net.inet.ip.forwarding` is set (suggested by dfr).  Not sure how other interfaces with LRO deal with this.