Created attachment 266610 [details] non-working rc.conf (14.3 and 15.0)

I have a Honeycomb LX2 board running multiple jails. It was running fine with dpni0 holding the IP address and the bridge connecting dpni0 and the epair devices. However, this no longer works with 15.0-RELEASE: the bridge must be assigned the IP address, and everything follows from that. With net.link.bridge.inherit_mac=1, I can access the system remotely, but can't access any of the vnet jails connected to the bridge. I'm attaching 2 trimmed rc.conf's, one working (rc.conf.up_good) on 14.3-RELEASE, the other non-working on both 14.3 and 15.0 (rc.conf). If I simply s/dpni0/ue0/g on the config, it works correctly with a USB ethernet adapter, which led to this bug report.
Created attachment 266611 [details] working rc.conf (14.3)
(In reply to Justin Hibbits from comment #0) Could you try replacing this:

ifconfig_dpni0="up -tso -vlanhwtso -rxcsum -txcsum -rxcsum6 -txcsum6"

with this:

ifconfig_dpni0="up -tso -vlanhwtso"

in your rc.conf for 15.0-RELEASE? epair was modified in https://reviews.freebsd.org/rG39d4094173f9a49ff52f5f4408e4dbd5d6ef0409 which caused a regression I fixed in https://reviews.freebsd.org/D53436. It landed before 15.0-RELEASE. It isn't available for 14.3 though (and I _must not_ forget to MFC the fix to stable/14).
(In reply to Dmitry Salychev from comment #2) If you want to track the L3/L4 checksum validation/generation status, look for "L3/L4 checksum validation enabled/disabled" in dmesg.
Hmmm... it looks like neither stable/14 nor stable/15 had my patch. I've MFC'ed it to both already. The workaround is to disable and then re-enable checksum generation/validation at runtime with ifconfig:

ifconfig dpniX -rxcsum -txcsum -rxcsum6 -txcsum6

and

ifconfig dpniX rxcsum txcsum rxcsum6 txcsum6

Try a verbose boot and you should see checksums disabled/enabled in dmesg after those two commands. You should be able to communicate with your jails after that.
(In reply to Dmitry Salychev from comment #4) I did this, and your suggestion in comment #2, but no change. I can access the jails from the host, but not remotely, and the jails cannot access beyond the host, with errors including openvpn connections to a remote gateway failing. I also set debug.bootverbose after booting, before I did the checksum disable/enable dance, and didn't see any messages.
Are we sure this is related to dpni and not to the entire bridge rework? Adding ivy@ just as the extra pair of eyes on the bridge setup.
(In reply to Bjoern A. Zeeb from comment #6) I'm pretty certain. Replacing `dpni0` with `ue0` (axge driver) in the rc.conf gets the machine working correctly.
(In reply to Justin Hibbits from comment #7) Can you tcpdump a little bit of TCP and UDP traffic to/from jails? If my assumption is correct, checksums of the packets coming to the host from jails should be incorrect at least.
(In reply to Bjoern A. Zeeb from comment #6) I'm in contact with Justin about this privately, and from what we've tested so far, it doesn't appear to be an issue with bridge(4) specifically. Justin: sorry, I haven't had time to follow up on this the last few days, but I'll try to have a look today.
(In reply to Dmitry Salychev from comment #8) I did a little testing, and trying to ssh out from a jail to my desktop did show invalid checksums. I verified ARP entries are in the cache on my desktop and tried fetching files from the web server on one of my jails, and didn't see any activity to it while watching dpni0, which seemed odd. One other odd datapoint, pings over the vlan work fine, I'm able to ping other devices on that vlan, but pings on the main LAN aren't working, so vlans work, but epairs don't it appears. I haven't tried building a new kernel with your recent MFC, so will try to give that a shot in the next few days when I have time. I'll also have to check how to build kernel packages now, so I can install it as a package, and it gets managed in the NWO.
Ok, so now we are on epairs. Adding tuexen. Given its date I assume 39d4094173f9a and the ones before made it into 15.0. Can you disable checksum offloading on epairs and see if things start to work then? I haven't followed this closely and we still don't know if it's dpni(4) which is at fault here, but if we find the actual change which causes the trouble, it would certainly help to know what to look for in dpni to fix ;-)
(In reply to Bjoern A. Zeeb from comment #11) It would be very good to know if things work if you disable txcsum and rxcsum on the epair before adding it to the bridge. Justin: It is also useful to capture traffic on the remote host when the communication with a jail does not work. This allows us to determine if it is a problem related to checksums.
I have no access to such hardware, I just looked at the code in sys/dev/dpaa2. I also looked again at D53436. Is hardware checksum offloading actually working? I do not see any modification of hwassist and none of the constants like CSUM_TCP or CSUM_UDP. Just wondering...
(In reply to Michael Tuexen from comment #13) It does for sure. I checked https://reviews.freebsd.org/D53436 on my Honeycomb before submitting.
(In reply to Michael Tuexen from comment #12) When I attempt to SSH from a jail to my desktop I do see incorrect checksums from tcpdump running on my desktop. When I ping from the jail to my desktop, with tcpdump on the desktop side I see both ICMP echo requests and ICMP echo replies. However, when I tcpdump on bridge0 or dpni0, I only see the echo requests, not any replies. I verified ARP entries are in my desktop's ARP cache, and the ARP entry for my desktop is in my jail's ARP cache.
(In reply to Justin Hibbits from comment #15) This is running with 22f8973d1 cherry-picked to 15.0-RELEASE as well.
(In reply to Dmitry Salychev from comment #14) Great. Could you point me to the code which sets the if_hwassist flags? The code is complex and I failed to find it...
(In reply to Michael Tuexen from comment #17) https://cgit.freebsd.org/src/tree/sys/dev/dpaa2/dpaa2_ni.c#n562
(In reply to Dmitry Salychev from comment #18) That sets if_capabilities, but not if_hwassist. if_hwassist is what the network stack looks at. That is why I am looking how that is set.
(In reply to Michael Tuexen from comment #19) I don't set if_hwassist at all. What's the logic behind these flags?
(In reply to Dmitry Salychev from comment #20) Let us focus on TCP/IPv4 checksum offloading on the sending side. To enable this you would set IFCAP_TXCSUM in if_capabilities. This can be controlled by the user via ifconfig. The semantic is that the interface has enabled some sort of transmit checksum offloading. What this exactly means is controlled by if_hwassist. If if_hwassist contains CSUM_IP_TCP, then transmit checksum offloading for TCP/IPv4 is supported; CSUM_IP means support for IPv4 header checksum offloading, CSUM_IP_UDP means UDP/IPv4 transmit checksum offloading and CSUM_IP_SCTP means checksum offloading for SCTP/IPv4.

When TCP/IPv4 packets are sent by the local stack, CSUM_IP_TCP | CSUM_IP is set in m_pkthdr.csum_flags and the pseudo header checksum is computed. When the IPv4 layer determines the outgoing interface, it compares the if_hwassist of the outgoing interface with m_pkthdr.csum_flags to see whether the interface supports the requested computation in hardware or not. If the hardware does not support the requested operation, it will be performed in software. The driver is required to keep if_capabilities semantically in sync with if_hwassist. So in your case, where you added IFCAP_TXCSUM to if_capabilities but keep if_hwassist 0, all checksum operations will be performed in software.

When adding an interface to a bridge, IFCAP_TXCSUM will be synced in the sense that it will be turned off if at least one interface has it turned off.

For testing dpni, I would suggest adding CSUM_IP_UDP | CSUM_IP_TCP | CSUM_IP to if_hwassist when adding IFCAP_TXCSUM. Then test if UDP and TCP work from the host to external hosts via a dpni interface. No bridge involved. Once that is working, we can focus on the interaction between dpni and bridge.
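The comparison between m_pkthdr.csum_flags and if_hwassist described above can be modelled in user space. This is only a minimal sketch, not the kernel code: the X_CSUM_* bit values and the x_ifnet / x_pkt structures are made up for illustration (the real CSUM_* constants live in sys/mbuf.h), and tx_split_offload() is a hypothetical helper that mirrors the idea of masking the requested checksums with what the interface announces.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values only; NOT the kernel's CSUM_* definitions. */
#define X_CSUM_IP	0x1u	/* IPv4 header checksum */
#define X_CSUM_IP_UDP	0x2u	/* UDP/IPv4 checksum */
#define X_CSUM_IP_TCP	0x4u	/* TCP/IPv4 checksum */

struct x_ifnet { uint32_t if_hwassist; };	/* what the NIC announces */
struct x_pkt { uint32_t csum_flags; };		/* what the stack requested */

/*
 * Hypothetical helper: returns the checksums that have to be computed in
 * software and leaves only the hardware-supported requests on the packet.
 */
static uint32_t
tx_split_offload(const struct x_ifnet *ifp, struct x_pkt *pkt)
{
	uint32_t sw = pkt->csum_flags & ~ifp->if_hwassist;

	pkt->csum_flags &= ifp->if_hwassist;
	return (sw);
}

/* The case discussed above: if_hwassist stays 0, so everything is software. */
void
tx_example(void)
{
	struct x_ifnet dpni = { 0 };
	struct x_pkt seg = { X_CSUM_IP | X_CSUM_IP_TCP };

	assert(tx_split_offload(&dpni, &seg) == (X_CSUM_IP | X_CSUM_IP_TCP));
	assert(seg.csum_flags == 0);
}
```

With if_hwassist set to X_CSUM_IP | X_CSUM_IP_UDP | X_CSUM_IP_TCP the same call returns 0 and the request stays on the packet for the hardware to handle.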
(In reply to Michael Tuexen from comment #21) I haven't disappeared, just struggling to find time. Thanks for the detailed explanation. I'll try and get back here.
(In reply to Dmitry Salychev from comment #22) I have a patch for fixing TXCSUM, trying to create one for the RXCSUM side. I fail to find the information for a received frame whether a L3/L4 checksum validation was performed and what the result is. Could you provide any hint?
Created attachment 267339 [details] Fix for transmit checksum offloading
Dmitry, could you have a look at review D54805? Just a cleanup to make the patch for fixing the TXCSUM shorter.
(In reply to Michael Tuexen from comment #23) Generally, L3/L4 checksums validation/generation is completely transparent to the driver and DPAA2 doesn't differentiate between v4 and v6: https://cgit.freebsd.org/src/tree/sys/dev/dpaa2/dpaa2_ni.c#n1592. I simply send a command to the DPAA2 HW (DPNI object in its terms) to enable or disable L3 and L4 checksums validation (on RX): https://cgit.freebsd.org/src/tree/sys/dev/dpaa2/dpaa2_ni.c#n1615 and generation (on TX): https://cgit.freebsd.org/src/tree/sys/dev/dpaa2/dpaa2_ni.c#n1631. i.e. if any of the IFCAP_RXCSUM or IFCAP_RXCSUM_IPV6 is set -> I enable checksums validation for both L3/L4 and v4/v6. Same logic is applicable to the IFCAP_TXCSUM or IFCAP_TXCSUM_IPV6.
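The "any of the v4/v6 capabilities enables both" logic described above can be written down as a small sketch. It is not the driver code: the X_IFCAP_* bit values, struct x_csum_cfg and cap_to_csum_cfg() are hypothetical names used only to illustrate the mapping from enabled capabilities to the two hardware switches (validation on RX, generation on TX).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; NOT the kernel's IFCAP_* definitions. */
#define X_IFCAP_RXCSUM		0x1u
#define X_IFCAP_TXCSUM		0x2u
#define X_IFCAP_RXCSUM_IPV6	0x4u
#define X_IFCAP_TXCSUM_IPV6	0x8u

struct x_csum_cfg {
	bool rx_validate;	/* L3/L4 validation on receive, v4 and v6 */
	bool tx_generate;	/* L3/L4 generation on transmit, v4 and v6 */
};

/* Hypothetical helper: one hardware switch per direction. */
static struct x_csum_cfg
cap_to_csum_cfg(uint32_t capenable)
{
	struct x_csum_cfg cfg;

	cfg.rx_validate =
	    (capenable & (X_IFCAP_RXCSUM | X_IFCAP_RXCSUM_IPV6)) != 0;
	cfg.tx_generate =
	    (capenable & (X_IFCAP_TXCSUM | X_IFCAP_TXCSUM_IPV6)) != 0;
	return (cfg);
}

/* Enabling only rxcsum6 still turns validation on for both v4 and v6. */
void
cap_example(void)
{
	struct x_csum_cfg cfg = cap_to_csum_cfg(X_IFCAP_RXCSUM_IPV6);

	assert(cfg.rx_validate && !cfg.tx_generate);
}
```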
(In reply to Dmitry Salychev from comment #26) I understand.

For the transmit side: we need to let the network stack know which checksums can be offloaded. If it doesn't know, it computes the checksums and there is no performance gain. See review D54809 for a fix.

For the receive side: we need to let the network stack know that the checksum was checked. If we don't, the network stack does the check in software again and we don't gain anything. So we need something like:

diff --git a/sys/dev/dpaa2/dpaa2_ni.c b/sys/dev/dpaa2/dpaa2_ni.c
index eda5bab78bde..df29ee9e639f 100644
--- a/sys/dev/dpaa2/dpaa2_ni.c
+++ b/sys/dev/dpaa2/dpaa2_ni.c
@@ -3178,6 +3178,16 @@ dpaa2_ni_rx(struct dpaa2_channel *ch, struct dpaa2_ni_fq *fq, struct dpaa2_fd *f
 	mtx_unlock(&bch->dma_mtx);
 	m->m_flags |= M_PKTHDR;
+	if ((fas_status & DPAA2_NI_FAS_L3CV) != 0) {
+		m->m_flags |= CSUM_IP_CHECKED;
+		if ((fas_status & DPAA2_NI_FAS_L3CE) == 0)
+			m->m_flags |= CSUM_IP_VALID;
+	}
+	if ((fas_status & DPAA2_NI_FAS_L4CV) != 0 &&
+	    (fas_status & DPAA2_NI_FAS_L4CE) == 0) {
+		m->m_flags |= CSUM_DATA_VALID | CSUM_PSEUDO_HDR;
+		m->m_pkthdr.csum_data = 0xffff;
+	}

I just have not figured out how to get the fas_status... I expect it in the HW or SW annotation frame. But I don't know which one, how to access it and what the layout is. Could you point me to a spec? I use the DPAA2 User Manual, but that is missing this information (or at least I don't find it there).
(In reply to Michael Tuexen from comment #27) The patch should be:

diff --git a/sys/dev/dpaa2/dpaa2_ni.c b/sys/dev/dpaa2/dpaa2_ni.c
index eda5bab78bde..8ee8f2718972 100644
--- a/sys/dev/dpaa2/dpaa2_ni.c
+++ b/sys/dev/dpaa2/dpaa2_ni.c
@@ -3178,6 +3178,16 @@ dpaa2_ni_rx(struct dpaa2_channel *ch, struct dpaa2_ni_fq *fq, struct dpaa2_fd *f
 	mtx_unlock(&bch->dma_mtx);
 	m->m_flags |= M_PKTHDR;
+	if ((fas_status & DPAA2_NI_FAS_L3CV) != 0) {
+		m->m_pkthdr.csum_flags |= CSUM_IP_CHECKED;
+		if ((fas_status & DPAA2_NI_FAS_L3CE) == 0)
+			m->m_pkthdr.csum_flags |= CSUM_IP_VALID;
+	}
+	if ((fas_status & DPAA2_NI_FAS_L4CV) != 0 &&
+	    (fas_status & DPAA2_NI_FAS_L4CE) == 0) {
+		m->m_pkthdr.csum_flags |= CSUM_DATA_VALID | CSUM_PSEUDO_HDR;
+		m->m_pkthdr.csum_data = 0xffff;
+	}
 	m->m_data = buf_data;
 	m->m_len = buf_len;
 	m->m_pkthdr.len = buf_len;
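To make the intended receive-side mapping easier to check in isolation, here is a user-space sketch of the same decision table. Everything in it is an assumption for illustration: the X_FAS_* bits are placeholders (the real frame annotation status layout is not known at this point in the thread), the X_CSUM_* values are not the kernel's, and rx_csum_from_fas() is a hypothetical helper rather than part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder status bits; the real annotation layout may differ. */
#define X_FAS_L3CV	0x1u	/* L3 checksum validation performed */
#define X_FAS_L3CE	0x2u	/* L3 checksum error */
#define X_FAS_L4CV	0x4u	/* L4 checksum validation performed */
#define X_FAS_L4CE	0x8u	/* L4 checksum error */

/* Illustrative values only; NOT the kernel's CSUM_* definitions. */
#define X_CSUM_IP_CHECKED	0x10u
#define X_CSUM_IP_VALID		0x20u
#define X_CSUM_DATA_VALID	0x40u
#define X_CSUM_PSEUDO_HDR	0x80u

struct x_rx_csum {
	uint32_t csum_flags;
	uint16_t csum_data;
};

/* Hypothetical helper mirroring the patch's two if-blocks. */
static struct x_rx_csum
rx_csum_from_fas(uint32_t fas_status)
{
	struct x_rx_csum r = { 0, 0 };

	if ((fas_status & X_FAS_L3CV) != 0) {
		r.csum_flags |= X_CSUM_IP_CHECKED;
		if ((fas_status & X_FAS_L3CE) == 0)
			r.csum_flags |= X_CSUM_IP_VALID;
	}
	if ((fas_status & X_FAS_L4CV) != 0 &&
	    (fas_status & X_FAS_L4CE) == 0) {
		r.csum_flags |= X_CSUM_DATA_VALID | X_CSUM_PSEUDO_HDR;
		r.csum_data = 0xffff;
	}
	return (r);
}

/* An L4 error leaves the L4 flags clear, so software re-checks the frame. */
void
rx_example(void)
{
	struct x_rx_csum r = rx_csum_from_fas(X_FAS_L3CV | X_FAS_L4CV | X_FAS_L4CE);

	assert(r.csum_flags == (X_CSUM_IP_CHECKED | X_CSUM_IP_VALID));
	assert(r.csum_data == 0);
}
```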
(In reply to Michael Tuexen from comment #27) You need the LX2160A Family Data Path Acceleration Architecture, Second Generation (DPAA2) Low-Level Hardware Reference Manual (LX2160ADPAA2RM, Rev. 0, 06/2020) from NXP. It cannot be downloaded without an account. Creating an account is free, but it could take a couple of days to be confirmed. I can share my copy if you prefer.

What you're asking for is located in the hardware (in this case Wire rate IO Processor, WRIOP) frame annotation, which is a part of the Frame Annotation located at FD[ADDRESS], i.e. there:

static int
dpaa2_ni_rx(struct dpaa2_channel *ch, struct dpaa2_ni_fq *fq, struct dpaa2_fd *fd,
    struct dpaa2_ni_rx_ctx *ctx)
{
	bus_addr_t paddr = (bus_addr_t)fd->addr;
	struct dpaa2_fa *fa = (struct dpaa2_fa *)PHYS_TO_DMAP(paddr); // <----
	struct dpaa2_buf *buf = fa->buf;
	struct dpaa2_channel *bch = (struct dpaa2_channel *)buf->opt;
	struct dpaa2_ni_softc *sc = device_get_softc(bch->ni_dev);
	struct dpaa2_bp_softc *bpsc;
	struct mbuf *m;
	device_t bpdev;
	bus_addr_t released[DPAA2_SWP_BUFS_PER_CMD];
	void *buf_data;
	int buf_len, error, released_n = 0;

However, dpaa2_fa describes a 64-byte software annotation which is not touched by the WRIOP at all. If you want the HW annotation, you'd need to extend dpaa2_fa according to "7.34 WRIOP Frame Annotation (FA)", and "Figure 7-22. WRIOP hardware frame annotation offsets in single buffer frame" in particular, as we're receiving everything in single buffer frames at the moment.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f31336b3e3146fed9cc517fef8e877c17496f9e0

commit f31336b3e3146fed9cc517fef8e877c17496f9e0
Author:     Michael Tuexen <tuexen@FreeBSD.org>
AuthorDate: 2026-01-23 07:59:57 +0000
Commit:     Michael Tuexen <tuexen@FreeBSD.org>
CommitDate: 2026-01-23 07:59:57 +0000

    dpnaa2: announce transmit checksum support

    Let the network stack know that the NIC supports checksum offloading
    for the IPv4 header checksum and the TCP and UDP transport checksum.
    This avoids the computation in software and therefore provides the
    expected performance gain.

    PR: 292006
    Reviewed by: dsl, Timo Völker
    MFC after: 3 days
    Differential Revision: https://reviews.freebsd.org/D54809

 sys/dev/dpaa2/dpaa2_ni.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
(In reply to Dmitry Salychev from comment #29) Thank you for the reference. I could create an account and download the spec. Will try to implement it.

There is one additional issue I observed. I use the ten64 board and when I ssh into it, start building a kernel, I observe TCP payload corruption. I think building world is not relevant, it is more that the system is under load. This happens:

(a) If TXCSUM is disabled, the software computes the TCP checksum based on the correct payload, the TCP segment is sent to the external peer and the peer detects the payload corruption via a TCP checksum error. Retransmissions will fix it. The application will only see uncorrupted payload and if the payload is SSH, the SSH session will work perfectly.

(b) If TXCSUM is enabled, the hardware computes the TCP checksum based on the corrupted payload and the TCP peer will accept it, since the checksum is correct for the corrupted payload. In case the payload is SSH, the SSH session detects this and will terminate the connection.

So it looks like the memory handling on the transmit side is missing a boundary or so, which only shows up under load. Does this ring a bell?
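The difference between cases (a) and (b) can be reproduced with a plain Internet checksum (RFC 1071) in user space. This is a simplified sketch under stated assumptions: it sums only a payload buffer (a real TCP checksum also covers the pseudo header and the TCP header), the single flipped bit stands in for whatever corruption happens on the transmit path, and ck_sum16(), ck_case_a_detected() and ck_case_b_accepted() are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* RFC 1071 style 16-bit one's-complement checksum over a byte buffer. */
static uint16_t
ck_sum16(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if ((len & 1) != 0)
		sum += (uint32_t)buf[len - 1] << 8;	/* pad odd byte */
	while ((sum >> 16) != 0)
		sum = (sum & 0xffff) + (sum >> 16);	/* fold carries */
	return ((uint16_t)~sum);
}

/* (a) Checksum computed in software first, payload corrupted afterwards. */
bool
ck_case_a_detected(void)
{
	uint8_t wire[] = "ssh payload";
	uint16_t cksum = ck_sum16(wire, sizeof(wire));

	wire[3] ^= 0x40;	/* simulated corruption on the TX path */
	return (ck_sum16(wire, sizeof(wire)) != cksum);	/* peer rejects */
}

/* (b) Payload corrupted first, checksum computed over it afterwards. */
bool
ck_case_b_accepted(void)
{
	const uint8_t good[] = "ssh payload";
	uint8_t wire[sizeof(good)];
	uint16_t cksum;

	memcpy(wire, good, sizeof(good));
	wire[3] ^= 0x40;	/* simulated corruption on the TX path */
	cksum = ck_sum16(wire, sizeof(wire));	/* "hardware" checksum */
	return (ck_sum16(wire, sizeof(wire)) == cksum &&
	    memcmp(wire, good, sizeof(good)) != 0);	/* peer accepts bad data */
}

void
ck_example(void)
{
	assert(ck_case_a_detected());
	assert(ck_case_b_accepted());
}
```

In (a) the receiver drops the segment and a retransmission repairs the stream; in (b) the checksum matches the corrupted bytes, so only a higher layer such as SSH can notice.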
(In reply to Michael Tuexen from comment #31) Yes, it definitely does. I noticed it about 3 months ago, but had hoped that the DPAA2 drivers had nothing to do with it. It seems that they do.
(In reply to Michael Tuexen from comment #31) https://reviews.freebsd.org/D56144 should fix it. It looks like we're consistent with the TX path, aren't we?
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=968164eb650fd986f293512a3faac5c1c9e4d51f

commit 968164eb650fd986f293512a3faac5c1c9e4d51f
Author:     Dmitry Salychev <dsl@FreeBSD.org>
AuthorDate: 2026-03-28 18:57:45 +0000
Commit:     Dmitry Salychev <dsl@FreeBSD.org>
CommitDate: 2026-03-29 18:23:51 +0000

    dpaa2: Perform bus_dma pre-write sync before enqueue operation

    Without a proper synchronization payload of the egress TCP segments
    can be corrupted as tuexen@ described in
    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=292006#c31.

    This patch is indirectly related to 292006 because a properly enabled
    and announced support for the TX checksum offloading hides potentially
    corrupted frame payload.

    PR: 292006
    Reported by: tuexen@
    Reviewed by: ...
    Tested by: dsl@
    Differential Revision: <https://reviews.freebsd.org/D###>
    MFC after: 3 days

 sys/dev/dpaa2/dpaa2_ni.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=5812415bee55a9063508b02fda9418b0eadb0bb4

commit 5812415bee55a9063508b02fda9418b0eadb0bb4
Author:     Dmitry Salychev <dsl@FreeBSD.org>
AuthorDate: 2026-03-28 18:57:45 +0000
Commit:     Dmitry Salychev <dsl@FreeBSD.org>
CommitDate: 2026-03-29 18:34:09 +0000

    dpaa2: Perform bus_dma pre-write sync before enqueue operation

    Without a proper synchronization payload of the egress TCP segments
    can be corrupted as tuexen@ described in
    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=292006#c31.

    This patch is indirectly related to 292006 because a properly enabled
    and announced support for the TX checksum offloading hides potentially
    corrupted frame payload.

    NOTE: Returned back with updated placeholders.

    PR: 292006
    Reported by: tuexen@
    Reviewed by: tuexen@
    Tested by: dsl@, tuexen@
    Differential Revision: https://reviews.freebsd.org/D56144
    MFC after: 3 days

 sys/dev/dpaa2/dpaa2_ni.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
A commit in branch stable/15 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=26b27a82880347828205700a5d54a0516a3f15b1

commit 26b27a82880347828205700a5d54a0516a3f15b1
Author:     Dmitry Salychev <dsl@FreeBSD.org>
AuthorDate: 2026-03-28 18:57:45 +0000
Commit:     Dmitry Salychev <dsl@FreeBSD.org>
CommitDate: 2026-04-04 09:28:09 +0000

    dpaa2: Perform bus_dma pre-write sync before enqueue operation

    Without a proper synchronization payload of the egress TCP segments
    can be corrupted as tuexen@ described in
    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=292006#c31.

    This patch is indirectly related to 292006 because a properly enabled
    and announced support for the TX checksum offloading hides potentially
    corrupted frame payload.

    PR: 292006
    Reported by: tuexen@
    Reviewed by: tuexen@
    Tested by: dsl@, tuexen@
    Differential Revision: https://reviews.freebsd.org/D56144
    MFC after: 3 days

    (cherry picked from commit 5812415bee55a9063508b02fda9418b0eadb0bb4)

 sys/dev/dpaa2/dpaa2_ni.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=c417ed80fd2fed7ae0561a5bf772408c910f2fec

commit c417ed80fd2fed7ae0561a5bf772408c910f2fec
Author:     Dmitry Salychev <dsl@FreeBSD.org>
AuthorDate: 2026-03-28 18:57:45 +0000
Commit:     Dmitry Salychev <dsl@FreeBSD.org>
CommitDate: 2026-04-04 09:44:07 +0000

    dpaa2: Perform bus_dma pre-write sync before enqueue operation

    Without a proper synchronization payload of the egress TCP segments
    can be corrupted as tuexen@ described in
    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=292006#c31.

    This patch is indirectly related to 292006 because a properly enabled
    and announced support for the TX checksum offloading hides potentially
    corrupted frame payload.

    PR: 292006
    Reported by: tuexen@
    Reviewed by: tuexen@
    Tested by: dsl@, tuexen@
    Differential Revision: https://reviews.freebsd.org/D56144
    MFC after: 3 days

    (cherry picked from commit 5812415bee55a9063508b02fda9418b0eadb0bb4)

 sys/dev/dpaa2/dpaa2_ni.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=8e994533806d8aa0ae4582a52d811ede2b19bb26

commit 8e994533806d8aa0ae4582a52d811ede2b19bb26
Author:     Dmitry Salychev <dsl@FreeBSD.org>
AuthorDate: 2026-01-25 16:53:57 +0000
Commit:     Dmitry Salychev <dsl@FreeBSD.org>
CommitDate: 2026-04-08 19:48:11 +0000

    dpaa2: Extract frame-specific routines to dpaa2_frame.[h,c]

    As soon as we need information from the hardware frame annotation to
    make sure that checksums of the ingress frames were verified by the
    DPAA2 HW, I've decided to make a preparation and extracted all of the
    frame related routines into the separate dpaa2_frame.[h,c] along with
    some clean up and improvements, e.g. no more dpaa2_fa, but dpaa2_swa
    and dpaa2_hwa structures to describe software and hardware frame
    annotations respectively, dpaa2_fa_get_swa/dpaa2_fa_get_hwa to obtain
    those annotations from the frame descriptor.

    The next step is to implement dpaa2_fa_get_hwa.

    PR: 292006
    Approved by: tuexen
    MFC after: 1 week
    Differential Revision: https://reviews.freebsd.org/D56315

 sys/conf/files.arm64              |   1 +
 sys/dev/dpaa2/dpaa2_buf.c         |   9 +-
 sys/dev/dpaa2/dpaa2_buf.h         |   2 +
 sys/dev/dpaa2/dpaa2_frame.c (new) | 165 ++++++++++++++++++++++++++++++
 sys/dev/dpaa2/dpaa2_frame.h (new) | 174 +++++++++++++++++++++++++++++++
 sys/dev/dpaa2/dpaa2_ni.c          | 210 ++++++++++++--------------------------
 sys/dev/dpaa2/dpaa2_ni.h          |   3 +-
 sys/dev/dpaa2/dpaa2_swp.h         |  51 +--------
 sys/dev/dpaa2/dpaa2_types.h       |   5 +
 sys/modules/dpaa2/Makefile        |   1 +
 10 files changed, 420 insertions(+), 201 deletions(-)