I have an issue with pf in FreeBSD 9.3. It looks like there is something wrong with pf's NAT when it processes ICMP packets of type 3 (destination unreachable).

Here is what I see on the LAN interface:

16:46:10.334993 IP (tos 0xc0, ttl 64, id 63254, offset 0, flags [none], proto ICMP (1), length 289)
    10.12.0.198 > 84.47.xx.yy: ICMP 10.12.0.198 udp port 8293 unreachable, length 269
        IP (tos 0x0, ttl 60, id 34284, offset 0, flags [none], proto UDP (17), length 261)
    84.47.xx.yy.53 > 10.12.0.198.8293: 37288 2/4/4 www.jdm022.com. CNAME sbsfe-p8.geo.mf0.yahoodns.net., sbsfe-p8.geo.mf0.yahoodns.net. A 98.138.19.143 (233)

I.e. some server (84.47.xx.yy) sends a UDP packet to the client (10.12.0.198, port 8293). This port is closed on the client, so the client sends an ICMP "port unreachable" packet to the server 84.47.xx.yy. This ICMP packet contains the header of the UDP packet that was sent to the client's closed port:

84.47.xx.yy.53 > 10.12.0.198.8293: 37288 2/4/4 www.jdm022.com. CNAME sbsfe-p8.geo.mf0.yahoodns.net., sbsfe-p8.geo.mf0.yahoodns.net. A 98.138.19.143 (233)

And this is what I see on the external WAN interface:

16:46:10.335012 IP (tos 0xc0, ttl 63, id 63254, offset 0, flags [none], proto ICMP (1), length 289)
    10.12.0.198 > 84.47.xx.yy: ICMP 213.208.kkk.zz udp port 61534 unreachable, length 269
        IP (tos 0x0, ttl 60, id 34284, offset 0, flags [none], proto UDP (17), length 261)
    84.47.xx.yy.53 > 213.208.kkk.zz.61534: 37288 2/4/4 www.jdm022.com. CNAME sbsfe-p8.geo.mf0.yahoodns.net., sbsfe-p8.geo.mf0.yahoodns.net. A 98.138.19.143 (233)

As you can see, pf translated the UDP header that is included in the ICMP packet: "ICMP 213.208.kkk.zz udp port 61534 unreachable". 213.208.kkk.zz is the IP of my external WAN interface, where NAT works. But pf did not change the ICMP packet itself. So I have an outgoing ICMP "port unreachable" packet with source address 10.12.0.198 ON the EXTERNAL interface.

I also found that pf can't block this kind of packet. A rule like

block out quick on $wan_if proto icmp from 10.12/16 to any icmp-type 3 code 3

does not work at all, so I have to use IPFW to block those ICMP packets.

Here is my NAT rule:

nat on $wan_if from <clients> to any -> 213.208.kkk.zz

The table <clients> is defined like this:

table <clients> { 10.12/16, 10.13/16 }

I also found a mention of this issue in OpenBSD pf:
http://openbsd-archive.7691.n7.nabble.com/system-6564-pf-not-nating-does-not-see-icmp4-port-unreachable-packets-from-machine-behind-pf-td187997.html

They said that this bug was fixed in 2011. But in FreeBSD 9.3 it is still not fixed?

My system:

FreeBSD vpn2-lesnoy.isp.local 9.3-RELEASE-p2 FreeBSD 9.3-RELEASE-p2 #0: Mon Sep 15 16:44:27 UTC 2014 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

I checked whether I can reproduce this issue with CURRENT. Well, CURRENT has the same problem. Here is my test lab:

# uname -a
FreeBSD test-BSD-01.hyperv.local 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r285351: Fri Jul 10 14:49:08 MSK 2015 root@test-BSD-01.hyperv.local:/usr/obj/usr/src/sys/GENERIC amd64

Here is the dump on the LAN interface:

# tcpdump -npi hn1 host 172.16.129.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on hn1, link-type EN10MB (Ethernet), capture size 262144 bytes
11:43:25.506775 IP 172.16.129.18.29490 > 208.67.220.220.53: 9125+ A? freebsd.org. (29)
11:43:25.570851 IP 208.67.220.220.53 > 172.16.129.18.29490: 9125 1/0/0 A 8.8.178.110 (45)
11:43:25.571635 IP 172.16.129.18 > 208.67.220.220: ICMP 172.16.129.18 udp port 29490 unreachable, length 36

Dump on the external WAN interface at the same moment:

# tcpdump -npi hn0 \(udp and port 53\) or icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on hn0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:43:30.741672 IP 213.208.xx.yy.55677 > 208.67.220.220.53: 1319+ A? ya.ru. (23)
11:43:30.795961 IP 208.67.220.220.53 > 213.208.xx.yy.55677: 1319 3/0/0 A 93.158.134.3, A 213.180.193.3, A 213.180.204.3 (71)
11:43:30.796700 IP 172.16.129.18 > 208.67.220.220: ICMP 213.208.xx.yy udp port 55677 unreachable, length 36

Here is my /etc/pf.conf:

nat on hn0 from 172.16.129.18 to any -> hn0
pass in all
pass out all
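The symptom in both captures is a packet leaving the WAN interface with an RFC 1918 source address (10.12.0.198 and 172.16.129.18). For anyone who wants to flag such leaked packets when post-processing a capture, here is a minimal C sketch. The function name is hypothetical, it is not part of pf, and the address is assumed to be in host byte order.

```c
#include <stdint.h>

/*
 * Return 1 if addr (IPv4, host byte order) lies in one of the RFC 1918
 * private ranges: 10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16.
 * Hypothetical helper for illustration only; it is not part of pf.
 */
int
is_rfc1918(uint32_t addr)
{
	if ((addr & 0xff000000u) == 0x0a000000u)	/* 10.0.0.0/8 */
		return (1);
	if ((addr & 0xfff00000u) == 0xac100000u)	/* 172.16.0.0/12 */
		return (1);
	if ((addr & 0xffff0000u) == 0xc0a80000u)	/* 192.168.0.0/16 */
		return (1);
	return (0);
}
```

For example, is_rfc1918(0x0a0c00c6) (10.12.0.198) and is_rfc1918(0xac108112) (172.16.129.18) are true, while is_rfc1918(0xd043dcdc) (208.67.220.220) is false.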
I have the exact same problem on: FreeBSD r1 10.2-RELEASE-p5 FreeBSD 10.2-RELEASE-p5 #0: Sun Oct 11 14:19:57 CEST 2015
See https://lists.freebsd.org/pipermail/freebsd-pf/2016-May/008047.html for a patch.
This patch is not fully tested. releng/10.3.

--- sys/netpfil/pf/pf.c.orig 2016-05-21 17:57:29.420602000 +0300
+++ sys/netpfil/pf/pf.c 2016-05-22 00:54:16.043961000 +0300
@@ -4793,8 +4793,7 @@ pf_test_state_icmp(struct pf_state **sta
                     &nk->addr[pd2.didx], pd2.af) ||
                     nk->port[pd2.didx] != th.th_dport)
                         pf_change_icmp(pd2.dst, &th.th_dport,
-                            NULL, /* XXX Inbound NAT? */
-                            &nk->addr[pd2.didx],
+                            saddr, &nk->addr[pd2.didx],
                             nk->port[pd2.didx], NULL,
                             pd2.ip_sum, icmpsum,
                             pd->ip_sum, 0, pd2.af);
@@ -4866,8 +4865,7 @@ pf_test_state_icmp(struct pf_state **sta
                     &nk->addr[pd2.didx], pd2.af) ||
                     nk->port[pd2.didx] != uh.uh_dport)
                         pf_change_icmp(pd2.dst, &uh.uh_dport,
-                            NULL, /* XXX Inbound NAT? */
-                            &nk->addr[pd2.didx],
+                            saddr, &nk->addr[pd2.didx],
                             nk->port[pd2.didx], &uh.uh_sum,
                             pd2.ip_sum, icmpsum,
                             pd->ip_sum, 1, pd2.af);
@@ -4934,8 +4932,7 @@ pf_test_state_icmp(struct pf_state **sta
                     &nk->addr[pd2.didx], pd2.af) ||
                     nk->port[pd2.didx] != iih.icmp_id)
                         pf_change_icmp(pd2.dst, &iih.icmp_id,
-                            NULL, /* XXX Inbound NAT? */
-                            &nk->addr[pd2.didx],
+                            saddr, &nk->addr[pd2.didx],
                             nk->port[pd2.didx], NULL,
                             pd2.ip_sum, icmpsum,
                             pd->ip_sum, 0, AF_INET);
@@ -4987,8 +4984,7 @@ pf_test_state_icmp(struct pf_state **sta
                     &nk->addr[pd2.didx], pd2.af) ||
                     nk->port[pd2.didx] != iih.icmp6_id)
                         pf_change_icmp(pd2.dst, &iih.icmp6_id,
-                            NULL, /* XXX Inbound NAT? */
-                            &nk->addr[pd2.didx],
+                            saddr, &nk->addr[pd2.didx],
                             nk->port[pd2.didx], NULL,
                             pd2.ip_sum, icmpsum,
                             pd->ip_sum, 0, AF_INET6);
@@ -5027,8 +5023,7 @@ pf_test_state_icmp(struct pf_state **sta
                 if (PF_ANEQ(pd2.dst,
                     &nk->addr[pd2.didx], pd2.af))
-                        pf_change_icmp(pd2.src, NULL,
-                            NULL, /* XXX Inbound NAT? */
+                        pf_change_icmp(pd2.dst, NULL, saddr,
                             &nk->addr[pd2.didx], 0, NULL,
                             pd2.ip_sum, icmpsum,
                             pd->ip_sum, 0, pd2.af);
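The effect of replacing NULL with saddr can be illustrated with a toy model. This is not the kernel code: the struct and function names are made up, addresses are plain 32-bit values, and all checksum fixups are omitted. It only assumes that the third argument of pf_change_icmp() is the address in the outer IP header and that NULL means "leave the outer header alone".

```c
#include <stddef.h>
#include <stdint.h>

/* Toy view of an ICMP error packet; not pf's real data structures. */
struct toy_icmp_err {
	uint32_t outer_src;	/* source address of the ICMP error itself */
	uint32_t inner_dst;	/* destination address in the quoted header */
	uint16_t inner_dport;	/* destination port in the quoted UDP header */
};

/*
 * Simplified stand-in for pf_change_icmp(): always rewrite the quoted
 * (inner) address, rewrite the inner port if one is given, and rewrite
 * the outer address only when the caller passes a non-NULL pointer.
 * Checksum updates are left out entirely.
 */
void
toy_change_icmp(uint32_t *inner_addr, uint16_t *inner_port,
    uint32_t *outer_addr, uint32_t new_addr, uint16_t new_port)
{
	*inner_addr = new_addr;
	if (inner_port != NULL)
		*inner_port = new_port;
	if (outer_addr != NULL)
		*outer_addr = new_addr;
}
```

Calling toy_change_icmp() with NULL for outer_addr reproduces the WAN capture from the original report: the quoted header is translated, but the outer source stays private. Passing &pkt.outer_src, which corresponds to passing saddr in the patch, translates both.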
A commit references this bug:

Author: kp
Date: Mon May 23 12:41:29 UTC 2016
New revision: 300501
URL: https://svnweb.freebsd.org/changeset/base/300501

Log:
  pf: Fix ICMP translation

  Fix ICMP source address rewriting in rdr scenarios.

  PR: 201519
  Submitted by: Max <maximos@als.nnov.ru>
  MFC after: 1 week

Changes:
  head/sys/netpfil/pf/pf.c
(In reply to Max from comment #3) Awesome work Max! I'll try to MFC this to stable/10 next week.
(In reply to Kristof Provost from comment #5) Line 5017 in https://svnweb.freebsd.org/base/head/sys/netpfil/pf/pf.c?annotate=300501&pathrev=300501#l5017 should be "pf_change_icmp(pd2.dst, NULL, saddr,", not "pf_change_icmp(pd2.src, NULL, saddr,".
A commit references this bug:

Author: kp
Date: Mon May 23 13:59:49 UTC 2016
New revision: 300508
URL: https://svnweb.freebsd.org/changeset/base/300508

Log:
  pf: Fix more ICMP mistranslation

  In the default case fix the substitution of the destination address.

  PR: 201519
  Submitted by: Max <maximos@als.nnov.ru>
  MFC after: 1 week

Changes:
  head/sys/netpfil/pf/pf.c
A commit references this bug:

Author: kp
Date: Mon May 30 01:21:44 UTC 2016
New revision: 300979
URL: https://svnweb.freebsd.org/changeset/base/300979

Log:
  MFC 300501, 300508

  pf: Fix ICMP translation

  Fix ICMP source address rewriting in rdr scenarios.

  pf: Fix more ICMP mistranslation

  In the default case fix the substitution of the destination address.

  PR: 201519
  Submitted by: Max <maximos@als.nnov.ru>

Changes:
_U  stable/10/
  stable/10/sys/netpfil/pf/pf.c
After upgrading my router/firewall from 9.3-STABLE svn 299225 to 10.3-STABLE svn 303269, I found that NATed traceroutes from the internal network to an external system displayed the IPv4 address/name of the final destination system instead of the IPv4 addresses/names of the intermediate systems/routers. I reverted 300979 and got the correct traceroute address/name display back. So I dare say that the bug cannot be closed.
(In reply to clbuisson from comment #9) I'm afraid I don't understand what the problem is. Can you add a description of your network setup, the traceroute output and a network capture (please specify where in the network the capture was made)?
There is nothing complicated in my setup!

1. An internal network with "private" IPv4 addresses
2. A gateway/router/firewall connected to this internal network and to the Internet (ADSL), NATing the traffic between 1 and 3
3. The Internet with any system, for example www.freebsd.org

On a system on the internal network, if I do

traceroute www.freebsd.org

I get
- first line: the internal address/name of the gateway (OK)
- a number of lines, one for each intermediate router on the Internet, but labelled with the address/name of www.freebsd.org (!OK)
- last line: the address/name of www.freebsd.org (OK)

Details seem irrelevant (anyone can find the address of www.freebsd.org ...), and the effect of outgoing NAT on UDP or ICMP (in the case of traceroute -I) is assumed to be known.

It is clear that the bug is in the NAT of the ICMP TIME_EXCEEDED received from the Internet (invalid substitution of the address of the responding router with the address of the traceroute target).
(In reply to clbuisson from comment #11) I'm unable to reproduce the described behaviour on my system. Please make a network capture so we can look in detail at what's going wrong.
(In reply to clbuisson from comment #11) Please show your network diagram (L1 and L2), as well as the route to the external IP. I'm on FreeBSD 10.3-STABLE r302074 and a bunch of miracles are happening with traceroute here too :( Though I also use carp and route-to with several uplinks ...
(In reply to Vladislav V. Prodan from comment #13) I've been talking to clbuisson@orange.fr in private, and it looks like there is indeed something wrong in 10.3, but not in 11 or 12. Right now I have no idea why.
I can confirm that the patches break traceroute output on 10.3. Can this be reopened?
Yes, it's on the top of my list.
I suspect I know what the cause is. stable/10 does not include the fix for bug 204005, so PF_ANEQ() doesn't work correctly there. Merging r289932 and r289940 should fix the problem. I'm currently building a version of stable/10 with the fix; if I'm correct, this will be fixed soon.
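For readers unfamiliar with PF_ANEQ(): it answers "are these two addresses different?" for a given address family. The sketch below assumes (this thread does not spell it out) that the defect tracked in 204005 was a comparison that looked at all 128 bits of the address storage even for IPv4, where only the first 32 bits are meaningful. All names here are hypothetical stand-ins, not the pf macros or types.

```c
#include <stdint.h>

enum toy_af { TOY_INET, TOY_INET6 };	/* stand-ins for AF_INET/AF_INET6 */

/* 128-bit storage shared by IPv4 and IPv6; IPv4 uses only w[0]. */
struct toy_addr {
	uint32_t w[4];
};

/* Assumed faulty behaviour: all four words compared regardless of family. */
int
toy_aneq_broken(const struct toy_addr *a, const struct toy_addr *b)
{
	return (a->w[0] != b->w[0] || a->w[1] != b->w[1] ||
	    a->w[2] != b->w[2] || a->w[3] != b->w[3]);
}

/* Family-aware comparison: IPv4 looks at the first word only. */
int
toy_aneq(const struct toy_addr *a, const struct toy_addr *b, enum toy_af af)
{
	if (af == TOY_INET)
		return (a->w[0] != b->w[0]);
	return (toy_aneq_broken(a, b));
}
```

Under that assumption, two identical IPv4 addresses with different leftover bytes in the unused words would compare as "not equal". The condition guarding pf_change_icmp(pd2.dst, ..., saddr, ...) would then fire for packets that need no translation and overwrite the outer source address, which would be consistent with the traceroute output described in comment #11.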
This should be fixed as of r304293 in stable/10. Can one of the affected users confirm so we can close?
Running now with a patched kernel: the first (quick) tests are positive! Thank you for your work.
Looks good, thanks!