it's possible i'm doing something wrong here since i thought this problem should be fixed by 25dbba4fc6e152a05e091180b2e031ab495ba337, but i'm using f5aff1871d3273b3cd3621ea5d3e37cdd807e66f (with a couple of local patches, including the fix for bug 284866) and i'm still seeing the issue. the problem is that traceroute over af-to NAT64 returns garbage once it hits pf: # traceroute6 64:ff9b::1.1.1.1 traceroute6 to 64:ff9b::1.1.1.1 (64:ff9b::101:101) from 2a00:1098:6b:200::1, 64 hops max, 28 byte packets 1 uk-myb-1.le-fay.org (2a00:1098:6b:100::1) 0.544 ms 0.411 ms 0.305 ms 2 uk-aai-1.le-fay.org (2001:8b0:aab5:100::1) 6.738 ms 7.558 ms 7.520 ms 3 64:ff9b::101:101 (64:ff9b::101:101) 12.666 ms 12.443 ms 11.981 ms 4 64:ff9b::101:101 (64:ff9b::101:101) 12.904 ms 11.460 ms 13.006 ms 5 64:ff9b::101:101 (64:ff9b::101:101) 14.095 ms 13.377 ms 13.012 ms 6 64:ff9b::101:101 (64:ff9b::101:101) 12.984 ms 13.523 ms 14.175 ms 7 64:ff9b::101:101 (64:ff9b::101:101) 13.939 ms 13.436 ms 13.025 ms on the router's external interface, i see the correct outgoing and incoming traffic: 09:42:39.937871 IP (tos 0x0, ttl 1, id 4079, offset 0, flags [none], proto UDP (17), length 40) 81.187.47.193.25587 > 1.1.1.1.33441: UDP, length 12 09:42:39.942958 IP (tos 0x0, ttl 64, id 13105, offset 0, flags [none], proto ICMP (1), length 56) 90.155.53.128 > 81.187.47.193: ICMP time exceeded in-transit, length 36 IP (tos 0x0, id 4079, offset 0, flags [none], proto UDP (17), length 40) 81.187.47.193.25587 > 1.1.1.1.33441: UDP, length 12 but the translated outgoing error on the internal interface is wrong: 09:42:39.937819 IP6 (hlim 1, next-header UDP (17) payload length: 20) 2a00:1098:6b:200::1.25587 > 64:ff9b::101:101.33441: [udp sum ok] UDP, length 12 09:42:39.942997 IP6 (hlim 64, next-header ICMPv6 (58) payload length: 56) 64:ff9b::101:101 > 2a00:1098:6b:200::1: [icmp6 sum ok] ICMP6, time exceeded in-transit for 64:ff9b::101:101 specifically the source address is 64:ff9b::101:101, when it should be 64:ff9b::5a9b:3580 (64:ff9b::90.155.53.128). the af-to rule is fairly straightforward: anchor on ep.uk-aai-1 { pass from fe80::/10 to any pass in from <lf> to any pass in inet6 proto ipv6-icmp from <dn42> to any icmp6-type echoreq pass in proto tcp from <dn42> to <dn42-fuchsia> port { smtp, domain, http, https } pass in proto udp from <dn42> to <dn42-fuchsia> port domain pass in from any to 64:ff9b::/96 af-to inet from 81.187.47.193/32 }
I can confirm that bug, and I see why it's broken. This even affects OpenBSD. I'm testing a patch along these lines: diff --git a/sys/netpfil/pf/pf.c b/sys/netpfil/pf/pf.c index f3c9ea7a2fb1..ac4bab45ffda 100644 --- a/sys/netpfil/pf/pf.c +++ b/sys/netpfil/pf/pf.c @@ -8109,8 +8109,18 @@ pf_test_state_icmp(struct pf_kstate **state, struct pf_pdesc *pd, nk->port[didx], 1, pd->af, nk->af); m_copyback(pd2.m, pd2.off, sizeof(uh), (c_caddr_t)&uh); - PF_ACPY(&pd->nsaddr, - &nk->addr[pd2.sidx], nk->af); + if (pd->af == AF_INET) { + struct pf_addr prefix, nsaddr; + int prefixlen = in6_mask2len( + (struct in6_addr *)&(*state)->rule->dst.addr.v.a.mask, NULL); + if (prefixlen < 32) + prefixlen = 96; + PF_ACPY(&prefix, &nk->addr[pd2.sidx], nk->af); + PF_ACPY(&nsaddr, pd->src, pd->af); + inet_nat64(AF_INET6, pd->src, &nsaddr, &prefix, + prefixlen); + PF_ACPY(&pd->nsaddr, &nsaddr, AF_INET6); + } PF_ACPY(&pd->ndaddr, &nk->addr[pd2.didx], nk->af); pd->naf = nk->af; (Though that only fixes the issue for UDP ICMP payloads, and can stand some cleanup.)
would you like me to test this patch here? i have a non-critical NAT64 gateway i can update to test.
(In reply to Lexi Winter from comment #2) It's a bit early for that, I think. It needs more cleanup and needs to also be implemented for other protocols (TCP, ICMP and SCTP).
i enjoy living on the edge, so i applied it anyway for shits and giggles. seems to work for UDP traceroute: root@rose:~ # traceroute6 64:ff9b::1.1.1.1 traceroute6 to 64:ff9b::1.1.1.1 (64:ff9b::101:101) from 2a00:1098:6b:200::2, 64 hops max, 28 byte packets 1 uk-myb-2.le-fay.org (2a00:1098:6b:200::1) 0.093 ms 0.097 ms 0.042 ms 2 uk-myb-1.le-fay.org (2a00:1098:6b:100::1) 0.770 ms 0.329 ms 0.273 ms 3 64:ff9b::ac10:300 (64:ff9b::ac10:300) 0.606 ms 0.469 ms 0.467 ms 4 64:ff9b::ac10:20b (64:ff9b::ac10:20b) 0.503 ms 0.438 ms 0.409 ms 5 64:ff9b::5d5d:850a (64:ff9b::5d5d:850a) 0.388 ms 64:ff9b::5d5d:850b (64:ff9b::5d5d:850b) 0.392 ms 0.396 ms 6 64:ff9b::5d5d:8501 (64:ff9b::5d5d:8501) 0.780 ms 0.616 ms 0.636 ms 7 64:ff9b::539:514b (64:ff9b::539:514b) 2.228 ms 64:ff9b::c342:e1b3 (64:ff9b::c342:e1b3) 9.416 ms 13.658 ms 8 64:ff9b::8d65:4785 (64:ff9b::8d65:4785) 1.645 ms 64:ff9b::8d65:475d (64:ff9b::8d65:475d) 2.334 ms 64:ff9b::8d65:476b (64:ff9b::8d65:476b) 2.420 ms 9 64:ff9b::101:101 (64:ff9b::101:101) 1.634 ms 1.866 ms 1.609 ms (the weird routing is expected, this ISP does L4 ECMP routing.) however ICMP traceroute doesn't work at all, which i think is a regression: root@rose:~ # traceroute6 -I 64:ff9b::1.1.1.1 traceroute6 to 64:ff9b::1.1.1.1 (64:ff9b::101:101) from 2a00:1098:6b:200::2, 64 hops max, 20 byte packets 1 uk-myb-2.le-fay.org (2a00:1098:6b:200::1) 0.078 ms 0.055 ms 0.046 ms 2 uk-myb-1.le-fay.org (2a00:1098:6b:100::1) 0.370 ms 0.266 ms 0.241 ms 3 * * * 4 * * * 5 * * * 6 *^C ping is fine though: root@rose:~ # ping 64:ff9b::1.1.1.1 PING(56=40+8+8 bytes) 2a00:1098:6b:200::2 --> 64:ff9b::101:101 16 bytes from 64:ff9b::101:101, icmp_seq=0 hlim=56 time=2.089 ms 16 bytes from 64:ff9b::101:101, icmp_seq=1 hlim=56 time=2.066 ms
There's a more complete patch in https://reviews.freebsd.org/D49143 along with test cases in https://reviews.freebsd.org/D49144 The current version of the fix is based on a proposed OpenBSD patch. I'm likely to go with the OpenBSD version, if only to reduce diff between OpenBSD and FreeBSD pf.
as mentioned on IRC, applied D49143 on top of 8cebb0630046a8eb10c551a856397ed230e73833 but it doesn't seem to work. FreeBSD yarrow.eden.le-fay.org 15.0-CURRENT FreeBSD 15.0-CURRENT #6 lf/main-n269114-4eb4ddc3c02f: Fri Feb 28 12:00:07 GMT 2025 srcmastr@hemlock.eden.le-fay.org:/build/obj/freebsd/build/src/freebsd/lf/main/amd64.amd64/sys/LF-PF amd64 UDP traceroute works: root@yarrow:~ # traceroute6 64:ff9b::1.1.1.1 traceroute6 to 64:ff9b::1.1.1.1 (64:ff9b::101:101) from 2a00:1098:6b:100::2, 64 hops max, 28 byte packets 1 uk-myb-1.le-fay.org (2a00:1098:6b:100::1) 0.209 ms 0.123 ms 0.108 ms 2 64:ff9b::ac10:300 (64:ff9b::ac10:300) 0.448 ms 0.343 ms 0.339 ms 3 64:ff9b::ac10:20b (64:ff9b::ac10:20b) 0.284 ms 64:ff9b::ac10:20a (64:ff9b::ac10:20a) 0.394 ms 0.300 ms 4 64:ff9b::5d5d:850a (64:ff9b::5d5d:850a) 0.271 ms 0.353 ms 0.380 ms 5 64:ff9b::5d5d:8501 (64:ff9b::5d5d:8501) 0.511 ms 0.495 ms 64:ff9b::5d5d:8506 (64:ff9b::5d5d:8506) 0.654 ms 6 64:ff9b::c342:e1b3 (64:ff9b::c342:e1b3) 2.757 ms 1.474 ms * 7 64:ff9b::8d65:475d (64:ff9b::8d65:475d) 9.995 ms 64:ff9b::8d65:473d (64:ff9b::8d65:473d) 2.095 ms 64:ff9b::8d65:4787 (64:ff9b::8d65:4787) 12.876 ms 8 64:ff9b::101:101 (64:ff9b::101:101) 1.492 ms 1.263 ms 1.477 ms ICMP does not: root@yarrow:~ # traceroute6 -I 64:ff9b::1.1.1.1 traceroute6 to 64:ff9b::1.1.1.1 (64:ff9b::101:101) from 2a00:1098:6b:100::2, 64 hops max, 20 byte packets 1 uk-myb-1.le-fay.org (2a00:1098:6b:100::1) 0.216 ms 0.177 ms 0.128 ms 2 * * * even though the incoming packets seem fine: 14:01:00.708206 IP6 (hlim 1, next-header ICMPv6 (58) payload length: 20) 2a00:1098:6b:100::2 > 64:ff9b::101:101: [icmp6 sum ok] ICMP6, echo request, id 3038, seq 1 14:01:00.708270 IP6 (hlim 64, next-header ICMPv6 (58) payload length: 68) 2a00:1098:6b:100::1 > 2a00:1098:6b:100::2: [icmp6 sum ok] ICMP6, time exceeded in-transit for 64:ff9b::101:101 14:01:00.713707 IP6 (hlim 1, next-header ICMPv6 (58) payload length: 20) 2a00:1098:6b:100::2 > 64:ff9b::101:101: [icmp6 sum ok] ICMP6, echo request, id 3038, seq 2 14:01:00.713766 IP6 (hlim 64, next-header ICMPv6 (58) payload length: 68) 2a00:1098:6b:100::1 > 2a00:1098:6b:100::2: [icmp6 sum ok] ICMP6, time exceeded in-transit for 64:ff9b::101:101 14:01:00.713893 IP6 (hlim 1, next-header ICMPv6 (58) payload length: 20) 2a00:1098:6b:100::2 > 64:ff9b::101:101: [icmp6 sum ok] ICMP6, echo request, id 3038, seq 3 14:01:00.713933 IP6 (hlim 64, next-header ICMPv6 (58) payload length: 68) 2a00:1098:6b:100::1 > 2a00:1098:6b:100::2: [icmp6 sum ok] ICMP6, time exceeded in-transit for 64:ff9b::101:101 14:01:00.714056 IP6 (hlim 2, next-header ICMPv6 (58) payload length: 20) 2a00:1098:6b:100::2 > 64:ff9b::101:101: [icmp6 sum ok] ICMP6, echo request, id 3038, seq 4 14:01:00.714496 IP6 (class 0xc0, hlim 63, next-header ICMPv6 (58) payload length: 68) 64:ff9b::ac10:300 > 2a00:1098:6b:100::2: [icmp6 sum ok] ICMP6, time exceeded in-transit for 64:ff9b::101:101 14:01:05.727039 IP6 (hlim 2, next-header ICMPv6 (58) payload length: 20) 2a00:1098:6b:100::2 > 64:ff9b::101:101: [icmp6 sum ok] ICMP6, echo request, id 3038, seq 5 14:01:05.727525 IP6 (class 0xc0, hlim 63, next-header ICMPv6 (58) payload length: 68) 64:ff9b::ac10:300 > 2a00:1098:6b:100::2: [icmp6 sum ok] ICMP6, time exceeded in-transit for 64:ff9b::101:101 14:01:10.749036 IP6 (hlim 2, next-header ICMPv6 (58) payload length: 20) 2a00:1098:6b:100::2 > 64:ff9b::101:101: [icmp6 sum ok] ICMP6, echo request, id 3038, seq 6 14:01:10.749507 IP6 (class 0xc0, hlim 63, next-header ICMPv6 (58) payload length: 68) 64:ff9b::ac10:300 > 2a00:1098:6b:100::2: [icmp6 sum ok] ICMP6, time exceeded in-transit for 64:ff9b::101:101 14:01:15.784896 IP6 (hlim 3, next-header ICMPv6 (58) payload length: 20) 2a00:1098:6b:100::2 > 64:ff9b::101:101: [icmp6 sum ok] ICMP6, echo request, id 3038, seq 7 14:01:15.785237 IP6 (class 0xc0, hlim 62, next-header ICMPv6 (58) payload length: 68) 64:ff9b::ac10:20b > 2a00:1098:6b:100::2: [icmp6 sum ok] ICMP6, time exceeded in-transit for 64:ff9b::101:101 the only difference i see between hop 1 and hop 2 is the packets for hop 2 have a traffic class set, but i'm not sure why that would make any difference.
(In reply to Lexi Winter from comment #6) Can you apply https://reviews.freebsd.org/D49231 on top as well? With the entire series my Windows client can now traceroute over nat64.
the test patch didn't apply here: Patching file tests/sys/netpfil/pf/nat64.py using Plan A... No such line 237 in input file, ignoring Hunk #1 failed at 238. Hunk #2 failed at 282. 2 out of 2 hunks failed while patching tests/sys/netpfil/pf/nat64.py however the pf.c patch is fine. for reference, this is the diff between my local branch and freebsd/main: https://www.le-fay.org/tmp/30d/netpfil.diff i'm rebuilding now.
the test patch didn't apply because i didn't have D49144. looks much better now, both ICMP and UDP traceroute work: root@uk-jmp-2:~ # traceroute6 -I 64:ff9b::1.1.1.1 traceroute6 to 64:ff9b::1.1.1.1 (64:ff9b::101:101) from 2001:ba8:404a:100::1, 64 hops max, 20 byte packets 1 uk-myb-2.le-fay.org (2a00:1098:6b:200::1) 1.547 ms 0.985 ms 1.119 ms 2 uk-myb-1.le-fay.org (2a00:1098:6b:100::1) 1.987 ms 1.978 ms 1.704 ms 3 64:ff9b::ac10:300 (64:ff9b::ac10:300) 2.044 ms 2.539 ms 2.114 ms 4 64:ff9b::ac10:20b (64:ff9b::ac10:20b) 2.142 ms 1.928 ms 2.099 ms 5 64:ff9b::5d5d:850b (64:ff9b::5d5d:850b) 1.744 ms 2.146 ms 1.447 ms 6 64:ff9b::5d5d:8501 (64:ff9b::5d5d:8501) 1.353 ms 1.398 ms 1.358 ms 7 64:ff9b::539:514b (64:ff9b::539:514b) 3.650 ms 2.611 ms 2.834 ms 8 64:ff9b::8d65:4787 (64:ff9b::8d65:4787) 2.565 ms 3.320 ms 2.981 ms 9 64:ff9b::101:101 (64:ff9b::101:101) 2.465 ms 2.113 ms 2.297 ms root@uk-jmp-2:~ # traceroute6 64:ff9b::1.1.1.1 traceroute6 to 64:ff9b::1.1.1.1 (64:ff9b::101:101) from 2001:ba8:404a:100::1, 64 hops max, 28 byte packets 1 uk-myb-2.le-fay.org (2a00:1098:6b:200::1) 1.003 ms 0.866 ms 0.796 ms 2 uk-myb-1.le-fay.org (2a00:1098:6b:100::1) 1.039 ms 1.020 ms 1.127 ms 3 64:ff9b::ac10:300 (64:ff9b::ac10:300) 1.307 ms 1.163 ms 1.146 ms 4 64:ff9b::ac10:20a (64:ff9b::ac10:20a) 1.246 ms 64:ff9b::ac10:20b (64:ff9b::ac10:20b) 1.329 ms 1.101 ms 5 64:ff9b::5d5d:850a (64:ff9b::5d5d:850a) 1.042 ms 1.105 ms 1.209 ms 6 64:ff9b::5d5d:8501 (64:ff9b::5d5d:8501) 1.313 ms 1.479 ms 1.329 ms 7 64:ff9b::c342:e1b3 (64:ff9b::c342:e1b3) 10.437 ms 12.894 ms 2.782 ms 8 64:ff9b::8d65:4785 (64:ff9b::8d65:4785) 2.225 ms 64:ff9b::8d65:473d (64:ff9b::8d65:473d) 27.034 ms 64:ff9b::8d65:4785 (64:ff9b::8d65:4785) 2.668 ms 9 64:ff9b::101:101 (64:ff9b::101:101) 1.944 ms 2.210 ms 2.240 ms also UDP and TCP in quick testing. i'll leave it running for a bit and see if any other problems crop up.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=2f77491169cacafd269fb653bec11087d85af035 commit 2f77491169cacafd269fb653bec11087d85af035 Author: Kristof Provost <kp@FreeBSD.org> AuthorDate: 2025-02-25 13:20:18 +0000 Commit: Kristof Provost <kp@FreeBSD.org> CommitDate: 2025-03-05 09:37:57 +0000 pf tests: test ICMP error translation with nat64 Ensure that when we translate an ICMPv4 to ICMPv6 message we set the correct source IP address. PR: 284944 Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D49144 tests/sys/netpfil/pf/nat64.py | 94 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 92 insertions(+), 2 deletions(-)
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=e9a490b10d6cf2006bc2442648f8b2de39aacc5f commit e9a490b10d6cf2006bc2442648f8b2de39aacc5f Author: Kristof Provost <kp@FreeBSD.org> AuthorDate: 2025-03-04 16:13:38 +0000 Commit: Kristof Provost <kp@FreeBSD.org> CommitDate: 2025-03-05 09:37:56 +0000 pf: fix ICMP source address translation for nat64 While handling an ICMP error related to another state (e.g. TTL expired, port closed, fragmentation needed, ...) we can't just use the state's source address as the ICMP source address. We have to translate the IPv4 address back to an IPv6 nat64 address. Failing to do so breaks things like traceroute, where the intermediate router generates an ICMP error message, and the traceroute tool uses the source address to build the path. Use the state's source address to figure out the prefix, and copy the IPv4 IP address to the last bytes to construct the mapped IPv6 address. PR: 284944 Sponsored by: Rubicon Communications, LLC ("Netgate") In collaboration with: sashan <sashan@openbsd.org>, dac07517c7 Differential Revision: https://reviews.freebsd.org/D49143 sys/netpfil/pf/pf.c | 72 ++++++++++++++++++++++++++++++++++++++++------------- 1 file changed, 55 insertions(+), 17 deletions(-)