Created attachment 239212 [details] Script to setup a test. Run this, then try 'telnet 10.123.0.1 8080' I am using rules that look like this: rdr log (all) inet proto tcp from any to ! 10.123.0.2 port = 8080 -> 10.123.0.2 port 80 to allow access to workloads running in jails/containers which is a common pattern for container runtimes like containerd or podman. Typically, a workload (e.g. nginx) runs in a container on its usual port and is 'published' to the host by mapping a host port to the container's address and port. In the example above, the container is in a vnet jail with address 10.123.0.2 and the intention is to publish the container's port 80 as the host's port 8080. This type of rule works well for traffic originating on some other machine with destination set to the host's IP and port 8080 but when traffic originates on the host itself, the rule matches but traffic doesn't flow. As far as I can tell, the pf rule matches outgoing packets (e.g. SYN for new connections) as being 'received' by lo0 where the packet is rewritten to the container IP/port and travels to the container. The return packet (e.g. SYN+ACK) is not received and packet traces show the host rejecting with RST. My hypothesis is that a state entry is created for the outgoing packet and registed with lo0 but the reply is received on a different interface (in my case a bridge) and the reverse translation is not performed.
Packet trace showing the problem: # telnet 10.123.0.1 8080 Trying 10.123.0.1... 13:03:58.531882 IP (tos 0x10, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 60) 10.123.0.1.13733 > 10.123.0.2.http: Flags [S], cksum 0x3465 (incorrect -> 0x5ae4), seq 2886169781, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 3565974209 ecr 0], length 0 13:03:58.531897 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60) 10.123.0.2.http > 10.123.0.1.13733: Flags [S.], cksum 0x9ccf (correct), seq 1244013501, ack 2886169782, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 640967696 ecr 3565974209], length 0 13:03:58.531908 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) 10.123.0.1.13733 > 10.123.0.2.http: Flags [R], cksum 0x4435 (correct), seq 2886169782, win 0, length 0 13:03:59.557528 IP (tos 0x10, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 60) 10.123.0.1.13733 > 10.123.0.2.http: Flags [S], cksum 0x3465 (incorrect -> 0x56dd), seq 2886169781, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 3565975240 ecr 0], length 0 13:03:59.557595 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60) 10.123.0.2.http > 10.123.0.1.13733: Flags [S.], cksum 0x94c1 (correct), seq 1244013501, ack 2886169782, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 640968727 ecr 3565975240], length 0 13:03:59.557655 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) 10.123.0.1.13733 > 10.123.0.2.http: Flags [R], cksum 0x4435 (correct), seq 2886169782, win 0, length 0 13:04:01.760495 IP (tos 0x10, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 60) 10.123.0.1.13733 > 10.123.0.2.http: Flags [S], cksum 0x3465 (incorrect -> 0x4e42), seq 2886169781, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 3565977443 ecr 0], length 0 13:04:01.760564 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60) 10.123.0.2.http > 10.123.0.1.13733: Flags [S.], cksum 0x838b (correct), seq 1244013501, ack 2886169782, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 640970930 ecr 3565977443], length 0 13:04:01.760625 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) 10.123.0.1.13733 > 10.123.0.2.http: Flags [R], cksum 0x4435 (correct), seq 2886169782, win 0, length 0
Created attachment 239234 [details] second test scenario showing redirect failing if both ends are on the same bridge
Further testing show more strangeness. If I add a second vnet jail on the same bridge and attempt the telnet that should redirect, it also fails. The rule matches when the host receives the SYN and is re-written and directed to the first jail. The SYN+ACK reply doesn't match the state entry and is transmitted without reversing the redirect. I've added a second test script to set this up - run the script then 'jexec testvnet2 telnet 10.123.0.1 8080'
The second scenario with two vnet jails on the same bridge is possibly the bridge 'helping' by delivering the SYN+ACK reply directly, without allowing PF to re-write. Adding debug printfs to pf seems to show that it never sees the reply.
Turns out I was missing 'sysctl net.link.bridge.pfil_member=1' for the two jails redirecting via a bridge scenario so ignore that part. Adding this doesn't affect the original scenario with the host originating the redirected connection.
Created attachment 239274 [details] possible fix for redirects initiated by localhost Redirect rules are triggered on PF_IN events to allow the rule to replace the destination address+port and also on PF_OUT events to reverse the replacement for packets flowing back towards the original source address. If the source is a local address, this second event is not triggered since the return packet is delivered to the local protocol stack. A possible fix is to simulate the PF_OUT event for packets destined for local processing, allowing the second part of the redirect to be applied. This does conflict with source address validation in 14-current which I'm disabling for testing. That could be mitigated by relaxing source address validation to allow packets with non-local source addresses pre-filtering.
(In reply to dfr from comment #6) I've been poking at this a bit as well (briefly, because I'm supposed to be on vacation), and I believe your diagnosis may be correct. To summarise what I've done so far: after turning your script into a atf-sh test case I can reproduce the problem, and poke at things with dtrace. The test script sets up a vnet jail (IP 192.0.2.1) with an rdr rule redirecting traffic for 192.0.2.1:8080 to 192.0.2.2:8080. The following D script: #!/usr/sbin/dtrace -s fbt:pf:pf_state_insert:entry { s = (struct pf_state_key *)arg2; printf("%x -> %x, %d:%d af %d proto %d\n", s->addr[0].pfa.v4.s_addr, s->addr[1].pfa.v4.s_addr, ntohs(s->port[0]), ntohs(s->port[1]), s->af, s->proto); s = (struct pf_state_key *)arg3; printf("%x -> %x, %d:%d af %d proto %d", s->addr[0].pfa.v4.s_addr, s->addr[1].pfa.v4.s_addr, ntohs(s->port[0]), ntohs(s->port[1]), s->af, s->proto); stack(); } fbt:pf:pf_find_state:entry { s = (struct pf_state_key *)arg1; printf("%x -> %x, %d:%d af %d proto %d", s->addr[0].pfa.v4.s_addr, s->addr[1].pfa.v4.s_addr, ntohs(s->port[0]), ntohs(s->port[1]), s->af, s->proto); } fbt:pf:pf_find_state:return { printf("=> %p", arg1); } produces dtrace: script '/home/kp/pf.dtrace' matched 3 probes CPU ID FUNCTION:NAME 5 80283 pf_find_state:entry 10200c0 -> 10200c0, 8080:40477 af 2 proto 6 5 80284 pf_find_state:return => 0 6 80283 pf_find_state:entry 10200c0 -> 10200c0, 40477:8080 af 2 proto 6 6 80284 pf_find_state:return => 0 6 80537 pf_state_insert:entry 10200c0 -> 10200c0, 40477:8080 af 2 proto 6 10200c0 -> 20200c0, 40477:8080 af 2 proto 6 pf.ko`pf_test_rule+0x282b pf.ko`pf_check_in+0x25 kernel`pfil_mbuf_in+0x55 kernel`ip_input+0x3c6 kernel`swi_net+0x191 kernel`ithread_loop+0x279 kernel`fork_exit+0x80 kernel`0xffffffff810a44ae 6 80283 pf_find_state:entry 20200c0 -> 10200c0, 8080:40477 af 2 proto 6 6 80284 pf_find_state:return => 0 So I think that matches your diagnosis, in that we're creating states (both the original 192.0.2.1 to 192.0.2.1 and rdr'd 192.0.2.1 to 192.0.2.2 traffic, but only on PF_IN. When the reply comes in we don't find a corresponding state, and things don't work. I'm rather wary of inserting special case handling for this in pf_test(). I wonder if we don't need to add a pfil call in the if_lo handling somewhere, to make that traffic match the normal (i.e. non-loopback) traffic patter.
I was also wary of adding complexity to pf_test although there is some precedent here - pf_route does recurse back to pf_test. The other idea I had was to add a second packet filter call from ip_input and ip6_input at the point where the packet is accepted as local but before its handed off to tcp (or udp) but that seemed a riskier option. Perhaps the PF_OUT test for packets with local destination could be performed in pf_route which would leave pf_test alone?
(In reply to dfr from comment #8) We don't hit pf_route() in this scenario, so I don't think that'd help. I'm going to have to find a bit of time to sit down and concentrate on this before I can say something more useful. I hope to do so next week.
You are right - in this case, r->rt will be NULL. I'm not particularly advocating for my attempted fix but I do think this feature is useful - I originally wanted this to work for 'publishing' ports in podman/docker containers to the host which is a very common pattern in the Linux world. Along the way, I noticed that bastillebsd uses the same kind of redirect rule and is also affected. Thanks for taking the time to look at this - its been on my TODO list for months and I'm just happy to see some progress.
I was testing this again today with a back-port of my patch to stable/13 and it seems to work fairly well so far. It doesn't work if the source address is 127.0.0.1 since the redirected packet still has a 127.0.0.1 as source address but I can ignore that for now.
(In reply to dfr from comment #11) I've poked at it for a little bit more, but I think I've mostly succeeded in confusing myself more. I think part of the issue is that we only create the state on the inbound side (so after we've sent the packet out on lo0 and done the outbound processing), and then end up sending it out again. In any event, this is going to need careful analysis and thought, and that's just not going to happen until my vacation is over. I'll keep this on my todo list, but expect no progress before March.
Oh. Wait. The setup in comment #1 tries to use 'rdr' to redirect an outbound packet. But the pf.conf man page says: "Then either the rdr rules are evaluated on an inbound packet or the nat rules on an outbound packet." So I don't think that's expected to work. You should be able (and indeed, in my test script I can) make this work using 'nat' rather than 'rdr'.
For my use-case, I need to be able to change both destination address and port and currently the nat rule only allows changing the address. Also, I'm not sure that nat will work here since it re-writes the source address and I need to change the destination address. As I understand the current situation with rdr, for new local connections, a PF_IN event is triggered when the packet leaves the local network stack and this matches the rdr rule, re-writing destination address and port and setting a state to match the reply. Unfortunately for reply packets no corresponding PF_OUT event is triggered when the packet is delivered to the local network stack so the reverse re-write does not happen. This is why my suggested change works since it simulates the PF_OUT event for packets which will be processed locally.
(In reply to dfr from comment #14) Right, but rdr very much expects to be used on inbound traffic only. I believe the relevant code to be in pf_get_translation(), where we only look at the RDR ruleset if direction != PF_OUT (i.e. it's PF_IN). So I think we have three choices: 1) extend nat (or binat) to be able to change the port and destination address (rather than source address). 2) teach rdr to work on PF_OUT 3) Build on the work in https://reviews.freebsd.org/D38025 and use OpenBSD's rdr-to, where the man page at least seems to suggest it can also work out outbound traffic.
As far as I understand the code (which is not very well TBH), both nat and rdr rules rely on both PF_IN and PF_OUT events but differ on which event triggers state creation: PF_IN for rdr and PF_OUT for nat with the opposite event matching the other direction to reverse the re-write. For connections initiated locally, this symmetry is broken since we generate PF_IN events for the SYN causing the re-write and creating the state but since there is no corresponding PF_OUT for the SYN+ACK reply, the connection fails. This is why I think we need PF_OUT for packets which will be delivered to the local stack rather than routed onward. I thought about whether it makes sense for rdr state creation to happen on PF_OUT but wouldn't that have other problems since the un-redirected destination address may direct the packet to the wrong outgoing interface? I haven't looked closely at upstream OpenBSD pf but it seems they have changed the rule language considerably which would likely be a problem for FreeBSD unless there is a compability mode. I seem to remember that they support tables for rdr target addresses which may be useful for managing a pool of load-balanced replicas. Anyway, for now I have a workable solution for my two (related) use-cases: 'publishing' container ports to the host for podman and mapping service VIPs to containers for kubernetes. I'm looking forward to seeing the right solution for this merged into main (and maybe stable/13).
(In reply to dfr from comment #16) > As far as I understand the code (which is not very well TBH), both nat and rdr rules rely on both PF_IN and PF_OUT events but differ on which event triggers state creation: PF_IN for rdr and PF_OUT for nat with the opposite event matching the other direction to reverse the re-write. Sort of. I find it more useful to think about when the packets pass through which parts of the stack. That does imply a direction (i.e. PF_IN/PF_OUT). In the normal case, when we're thinking of rdr, we'd be dealing with incoming packets, arriving through ip_input(), triggering a PF_IN pf_test(). For the failing case described here we're dealing with packets that are generated locally, and the first time they hit the firewall is from ip_output() (i.e. pf_test(PF_OUT)). > For connections initiated locally, this symmetry is broken since we generate PF_IN events for the SYN causing the re-write and creating the state but since there is no corresponding PF_OUT for the SYN+ACK reply, the connection fails. This is why I think we need PF_OUT for packets which will be delivered to the local stack rather than routed onward. For locally generated connections we're only getting the PF_IN after we've already had a PF_OUT. We can't usefully do anything on SYN+ACK packets, and if we're trying to add an artificial PF_OUT we're just papering over a problem that started earlier. > I thought about whether it makes sense for rdr state creation to happen on PF_OUT but wouldn't that have other problems since the un-redirected destination address may direct the packet to the wrong outgoing interface? I'm not sure I follow. > I haven't looked closely at upstream OpenBSD pf but it seems they have changed the rule language considerably which would likely be a problem for FreeBSD unless there is a compability mode. I seem to remember that they support tables for rdr target addresses which may be useful for managing a pool of load-balanced replicas. Kajetan intends to keep the old syntax supported, at least for a while. I still need to look at his work in detail, which I don't expect to be able to do before the end of February. > I'm looking forward to seeing the right solution for this merged into main (and maybe stable/13). To be clear, I'm not making any promises about the time and effort I can spend on this.
(In reply to Kristof Provost from comment #17) > Sort of. I find it more useful to think about when the packets pass through which parts of the stack. That does imply a direction (i.e. PF_IN/PF_OUT). In the normal case, when we're thinking of rdr, we'd be dealing with incoming packets, arriving through ip_input(), triggering a PF_IN pf_test(). For the failing case described here we're dealing with packets that are generated locally, and the first time they hit the firewall is from ip_output() (i.e. pf_test(PF_OUT)). I like this description of the situation. In this case though, locally generated packets are seen by the firewall via ip_input, causing PF_IN event and creating the state: (kgdb) bt #0 pf_create_state (r=0xfffffe00041c6df8, nr=0xfffff80027b0d800, a=0x0, nsn=0x0, nk=0xfffff80027b03a50, sk=0xfffff80027b03aa8, m=0xfffff800276b0200, off=20, sport=13975, dport=20480, rewrite=0xfffffe000378cae8, sm=0xfffffe000378cce8, tag=-1, bproto_sum=19927, bip_sum=0, hdrlen=20, pd=<optimized out>, kif=<optimized out>) at ../../../netpfil/pf/pf.c:4533 #1 pf_test_rule (rm=rm@entry=0xfffffe000378ccd8, sm=sm@entry=0xfffffe000378cce8, direction=direction@entry=1, kif=kif@entry=0xfffff800039b3200, m=m@entry=0xfffff800276b0200, off=off@entry=20, pd=0xfffffe000378cc00, am=0xfffffe000378ccb8, rsm=0xfffffe000378cca8, inp=0x0) at ../../../netpfil/pf/pf.c:4483 #2 0xffffffff80e6fc82 in pf_test (dir=dir@entry=1, pflags=65536, ifp=0xfffff800038f2800, m0=m0@entry=0xfffffe000378cdb0, inp=0x0) at ../../../netpfil/pf/pf.c:7217 #3 0xffffffff80e90f15 in pf_check_in (m=0xfffffe000378cdb0, ifp=0x0, flags=41175, ruleset=<optimized out>, inp=0x5000) at ../../../netpfil/pf/pf_ioctl.c:6463 #4 0xffffffff80d506e5 in pfil_mbuf_common (pch=<optimized out>, p=..., p@entry=..., ifp=0xfffff800038f2800, ifp@entry=0xfffffe000378cd80, flags=65536, inp=inp@entry=0x0) at ../../../net/pfil.c:214 #5 pfil_mbuf_in (head=0xfffff8000389de00, p=p@entry=..., ifp=ifp@entry=0xfffff800038f2800, inp=inp@entry=0x0) at ../../../net/pfil.c:226 #6 0xffffffff80dcfbb6 in ip_input (m=0xfffff800276b0200) at ../../../netinet/ip_input.c:613 #7 0xffffffff80d4d261 in netisr_process_workstream_proto (nwsp=0xffffffff825866c0, proto=1) at ../../../net/netisr.c:929 #8 swi_net (arg=0xffffffff825866c0) at ../../../net/netisr.c:976 In this case, the interface is lo0 and direction is PF_IN. The problem (at least as I understand it) is that the replies don't generate the corresponding PF_OUT and the state isn't matched. The return packets cause a PF_IN from the incoming interface (e.g. bridge0) which does nothing and then they are delivered to the local tcp stack without reversing the redirect and are immediately rejected with RST. >> I thought about whether it makes sense for rdr state creation to happen on PF_OUT but wouldn't that have other problems since the un-redirected destination address may direct the packet to the wrong outgoing interface? > > I'm not sure I follow. What I'm trying to say is that for connection attempts to some local address which would match the rdr rule, there will not be a PF_OUT event to trigger for the same reason as what I'm suggesting above for the current PF_IN triggered rule. The packet won't be re-written but instead will be delivered to the local tcp and rejected. > To be clear, I'm not making any promises about the time and effort I can spend on this. Also to be clear, I'm not asking for you to do the work necessarily. I am perfectly happy to do it if someone is prepared to review the change. I do have the time to work on this and the motivation since this problem is a blocker for my projects. I am going to spend some time to see if ipfirewall is also affected. I have a feeling that it might be for similar reasons.
Testing with ipfirewall's ipnat module shows the exact same pattern where the rule matches the initial connection attempt and translates the destination address and port but the reply does not reverse the translation. Adding a call to pfil_mbuf_out() in ip_input() right before the packet is handed off to upper-layer protocols fixes the problem for both pf and ipnat.
(In reply to dfr from comment #190 It looks like a feature, not a bug. Maybe you can rework your PF rules and/or either use multiple fibs with rtables keyword or pass with route-to keyword.
I don't think its a feature - if we are not supporting rdr rules where the initial source address is local, then the rule shouldn't match at all. As it is, for both pf and ipnat, the 'outgoing' rule matches, rewriting the destination address and port and creating state but reply packets never match that state because the firewall does not see the reply packet 'leaving' the network to be delivered to local tcp/udp/whatever.
(In reply to dfr from comment #21) >I don't think its a feature You are right. It's a missing feature, we lack more of them. But should one always submit a PR after finding a missing feature? Probably yes, if it's a patch with the implementation of that feature.
The patch attached to this bug only covers pf. Moving the proposed fix to ip_input/ip6_input seems to fix both pf and ipnat (haven't checked ipfw yet). I will keep testing and update with a new patch
Created attachment 239803 [details] Better fix for the problem, covering both pf and ipfilter This approach to fixing the issue moves the change to ip*_input, which works with both pf and ipfilter.
(In reply to dfr from comment #24) Am I misreading that, or is that now calling the out pfil hook for all incoming traffic?
(In reply to Kristof Provost from comment #25) Yes, its calling the hook for everything which matches a local address. This is partly why its conditional on a sysctl. It is similar to the previous fix which put the extra out filter in pf_test* but moving it up a level allows it to work for ipfilter as well (and possibly ipfw although I have not tested that).
(In reply to dfr from comment #26) While I can't articulate a scenario off the top of my head I can almost guarantee that this extra call into the firewall(s) is going to cause issues. It's going to confuse rulesets and break setups.
(In reply to Kristof Provost from comment #27) This is another reason for being conditional on the sysctl. I am open to alternative suggestions for this but I think we do need to treat the hand-off to local L4+ processing as a potential filtering event. I think that Linux iptables make this clearer, allowing filters to register for NF_INET_LOCAL_IN or NF_INET_LOCAL_OUT specifically.
maybe we should move this to Phabricator and/or a mailing list to get broader view?
(In reply to Mina Galić from comment #29) That would probably get more eyes on this. That said, my view is still that the latest proposed patch is the wrong approach. We should implement support for this use case in the firewall, not by adding this extra pfil hook. I still hope to look at this once my holiday is over, starting in about two weeks.
I've finally had time to get back to this today and I've opened a review (https://reviews.freebsd.org/D40256) with my proposal which adds optional pfil hooks to capture the missing event for messages delivered locally. The only difference with the patch attached here is that its rebased and I added a unit test. I still think this is a viable approach, especially since it fixes the issue for both pf and ipfilter (and possibly ipfw but I've not tried to test that).
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=5ab151574c8a1824c6cd8eded28506cb983284bc commit 5ab151574c8a1824c6cd8eded28506cb983284bc Author: Doug Rabson <dfr@FreeBSD.org> AuthorDate: 2023-05-24 13:11:37 +0000 Commit: Doug Rabson <dfr@FreeBSD.org> CommitDate: 2023-05-31 10:11:05 +0000 netinet*: Fix redirects for connections from localhost Redirect rules use PFIL_IN and PFIL_OUT events to allow packet filter rules to change the destination address and port for a connection. Typically, the rule triggers on an input event when a packet is received by a router and the destination address and/or port is changed to implement the redirect. When a reply packet on this connection is output to the network, the rule triggers again, reversing the modification. When the connection is initiated on the same host as the packet filter, it is initially output via lo0 which queues it for input processing. This causes an input event on the lo0 interface, allowing redirect processing to rewrite the destination and create state for the connection. However, when the reply is received, no corresponding output event is generated; instead, the packet is delivered to the higher level protocol (e.g. tcp or udp) without reversing the redirect, the reply is not matched to the connection and the packet is dropped (for tcp, a connection reset is also sent). This commit fixes the problem by adding a second packet filter call in the input path. The second call happens right before the handoff to higher level processing and provides the missing output event to allow the redirect's reply processing to perform its rewrite. This extra processing is disabled by default and can be enabled using pfilctl: pfilctl link -o pf:default-out inet-local pfilctl link -o pf:default-out6 inet6-local PR: 268717 Reviewed-by: kp, melifaro MFC-after: 2 weeks Differential Revision: https://reviews.freebsd.org/D40256 sys/netinet/ip_input.c | 22 ++++++++- sys/netinet/ip_var.h | 4 ++ sys/netinet6/ip6_input.c | 19 ++++++++ sys/netinet6/ip6_var.h | 4 ++ tests/sys/netpfil/common/Makefile | 1 + tests/sys/netpfil/{pf => common}/rdr.sh | 84 +++++++++++++++++++++++++++++---- tests/sys/netpfil/common/utils.subr | 4 ++ tests/sys/netpfil/pf/Makefile | 1 - 8 files changed, 127 insertions(+), 12 deletions(-)
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=4a6b92849e619c40ca9a78d38339798f2735ec56 commit 4a6b92849e619c40ca9a78d38339798f2735ec56 Author: Doug Rabson <dfr@FreeBSD.org> AuthorDate: 2023-06-02 15:58:33 +0000 Commit: Doug Rabson <dfr@FreeBSD.org> CommitDate: 2023-06-03 10:07:56 +0000 pf: Fix tests broken by enabling inet-local filtering Summary: Three of the pf dummynet tests were using filter rules which matched both the intended epair interface as well as lo0 which now receives PFIL_OUT events for messages delivered to the local network stack (if enabled). This commit changes the rules to match only for the expected epair interface. PR: 268717 Reviewed-by: kp MFC-after: 2 weeks Differential Revision: https://reviews.freebsd.org/D40393 tests/sys/netpfil/common/dummynet.sh | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=3a1f834b5228986a7c14fd60da13cf2700e80996 commit 3a1f834b5228986a7c14fd60da13cf2700e80996 Author: Doug Rabson <dfr@FreeBSD.org> AuthorDate: 2023-06-20 13:01:58 +0000 Commit: Doug Rabson <dfr@FreeBSD.org> CommitDate: 2023-06-20 14:34:01 +0000 pf: Add code to enable filtering for locally delivered packets This is disabled by default since it potentially changes the behavior of existing filter rule sets. To enable this extra filter for packets being delivered locally, use: sysctl net.pf.filter_local=1 service pf restart PR: 268717 Reviewed-by: kp MFC-after: 2 weeks Differential Revision: https://reviews.freebsd.org/D40373 UPDATING | 12 ++++++++++++ sys/netpfil/pf/pf_ioctl.c | 20 ++++++++++++++++++++ tests/sys/netpfil/common/utils.subr | 3 +-- tests/sys/netpfil/pf/fragmentation_compat.sh | 3 ++- tests/sys/netpfil/pf/fragmentation_pass.sh | 3 ++- tests/sys/netpfil/pf/killstate.sh | 24 ++++++++++++++++-------- tests/sys/netpfil/pf/map_e.sh | 3 ++- tests/sys/netpfil/pf/pass_block.sh | 3 ++- tests/sys/netpfil/pf/pfsync.sh | 1 + tests/sys/netpfil/pf/route_to.sh | 3 ++- tests/sys/netpfil/pf/set_skip.sh | 2 +- tests/sys/netpfil/pf/table.sh | 6 ++++-- 12 files changed, 65 insertions(+), 18 deletions(-)
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=b22299c457b21d77fc5770b9f1a9043487b25ed9 commit b22299c457b21d77fc5770b9f1a9043487b25ed9 Author: Doug Rabson <dfr@FreeBSD.org> AuthorDate: 2023-05-24 13:11:37 +0000 Commit: Doug Rabson <dfr@FreeBSD.org> CommitDate: 2023-07-14 10:07:58 +0000 netinet*: Fix redirects for connections from localhost Redirect rules use PFIL_IN and PFIL_OUT events to allow packet filter rules to change the destination address and port for a connection. Typically, the rule triggers on an input event when a packet is received by a router and the destination address and/or port is changed to implement the redirect. When a reply packet on this connection is output to the network, the rule triggers again, reversing the modification. When the connection is initiated on the same host as the packet filter, it is initially output via lo0 which queues it for input processing. This causes an input event on the lo0 interface, allowing redirect processing to rewrite the destination and create state for the connection. However, when the reply is received, no corresponding output event is generated; instead, the packet is delivered to the higher level protocol (e.g. tcp or udp) without reversing the redirect, the reply is not matched to the connection and the packet is dropped (for tcp, a connection reset is also sent). This commit fixes the problem by adding a second packet filter call in the input path. The second call happens right before the handoff to higher level processing and provides the missing output event to allow the redirect's reply processing to perform its rewrite. This extra processing is disabled by default and can be enabled using pfilctl: pfilctl link -o pf:default-out inet-local pfilctl link -o pf:default-out6 inet6-local PR: 268717 Reviewed-by: kp, melifaro MFC-after: 2 weeks Differential Revision: https://reviews.freebsd.org/D40256 (cherry picked from commit 5ab151574c8a1824c6cd8eded28506cb983284bc) sys/netinet/ip_input.c | 22 ++++++++- sys/netinet/ip_var.h | 4 ++ sys/netinet6/ip6_input.c | 19 ++++++++ sys/netinet6/ip6_var.h | 4 ++ tests/sys/netpfil/common/Makefile | 1 + tests/sys/netpfil/{pf => common}/rdr.sh | 84 +++++++++++++++++++++++++++++---- tests/sys/netpfil/common/utils.subr | 4 ++ tests/sys/netpfil/pf/Makefile | 1 - 8 files changed, 127 insertions(+), 12 deletions(-)
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=6dfb2c2dce0ffabd783ec24b8d4d128993363f72 commit 6dfb2c2dce0ffabd783ec24b8d4d128993363f72 Author: Doug Rabson <dfr@FreeBSD.org> AuthorDate: 2023-06-20 13:01:58 +0000 Commit: Doug Rabson <dfr@FreeBSD.org> CommitDate: 2023-07-14 10:07:58 +0000 pf: Add code to enable filtering for locally delivered packets This is disabled by default since it potentially changes the behavior of existing filter rule sets. To enable this extra filter for packets being delivered locally, use: sysctl net.pf.filter_local=1 service pf restart PR: 268717 Reviewed-by: kp MFC-after: 2 weeks Differential Revision: https://reviews.freebsd.org/D40373 (cherry picked from commit 3a1f834b5228986a7c14fd60da13cf2700e80996) UPDATING | 12 ++++++++++++ sys/netpfil/pf/pf_ioctl.c | 20 ++++++++++++++++++++ tests/sys/netpfil/common/utils.subr | 3 +-- tests/sys/netpfil/pf/fragmentation.sh | 3 ++- tests/sys/netpfil/pf/killstate.sh | 24 ++++++++++++++++-------- tests/sys/netpfil/pf/map_e.sh | 3 ++- tests/sys/netpfil/pf/pass_block.sh | 3 ++- tests/sys/netpfil/pf/pfsync.sh | 1 + tests/sys/netpfil/pf/route_to.sh | 3 ++- tests/sys/netpfil/pf/set_skip.sh | 2 +- tests/sys/netpfil/pf/table.sh | 6 ++++-- 11 files changed, 63 insertions(+), 17 deletions(-)
Closing this - for 13.2 and later, redirecting from localhost mostly works as long as the target address for a connect attempt is not 127.0.0.1 or ::1. Fixing that probably needs a more flexible binat since in this case, the kernel will typically choose a localhost source address and the rewrite needs to change that to the outgoing interface address. Since that is a feature request for pf, probably best in a new bug.