Bug 255685 - PF: JAIL: fail to connect from jail to jail service when pf enabled
Summary: PF: JAIL: fail to connect from jail to jail service when pf enabled
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.0-RELEASE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-jail (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-05-07 16:13 UTC by Emmanuel Vadot
Modified: 2022-02-01 17:53 UTC (History)
CC List: 7 users

See Also:


Attachments
script to reproduce the issue (2.51 KB, application/x-shellscript)
2021-05-07 16:13 UTC, Emmanuel Vadot

Description Emmanuel Vadot freebsd_committer freebsd_triage 2021-05-07 16:13:11 UTC
Created attachment 224752
script to reproduce the issue

After upgrading some of my servers to 13.0-RELEASE I ran into some weird behavior: I couldn't connect (at least over TCP) to a service running in a jail from the jail itself.
The jails use IP aliases, not much else.

With a simple pf.conf that just does "block in", it's not possible to connect either from the host to the jail or even from the jail to itself.

I've attached a simple script that can reproduce the issue.
Obviously don't run it on a production machine as it will screw your pf.conf and jail.conf :)
There are a few variables at the beginning that should be updated (like the IP address of the machine, etc.).

For reasons yet unknown, the quirk rule that I added on my servers, which fixes the issue there, doesn't work when I try to reproduce the problem on a local machine with a reduced test case. I'll dig more into this later.
Comment 1 Emmanuel Vadot freebsd_committer freebsd_triage 2021-05-07 16:23:54 UTC
Forgot to say that the script works on a freshly installed 12.2-RELEASE machine locally.
After upgrading to 13.0 it doesn't work anymore.
Comment 2 Kristof Provost freebsd_committer freebsd_triage 2021-05-11 20:26:34 UTC
At this point I believe this isn't a pf bug, but a change in routing behaviour.
In 13 we route the alias address via em0, while on 12 we route it via lo0. That means that on 12 the ssh traffic bypasses pf (because of "set skip on lo0"), and on 13 it doesn't.

On 12:

Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
127.0.0.1          link#2             UH          lo0
192.168.1.100      link#1             UHS         lo0
192.168.1.100/32   link#1             U           em0

On 13:

Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            192.168.183.1      UGS         em0
127.0.0.1          link#2             UH          lo0
192.168.1.100      link#1             UH          em0
192.168.183.0/24   link#1             U           em0
192.168.183.14     link#1             UHS         lo0

(Look at the 192.168.1.100 route entry)

Also, if I try to add a link route (after deleting the 192.168.1.100 route):
sudo route add 192.168.1.100 -link lo0
route: writing to routing socket: Network is unreachable
add host 192.168.1.100: gateway lo0 fib 0: Network is unreachable

tl;dr: this looks like a routing issue, not a pf bug.
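
To check which of the two cases a given machine falls into, it is enough to look up the Netif column for the alias address. The following is a minimal sketch and not part of the original comment: route_netif is a hypothetical helper name, and it assumes the five-column "netstat -rn" layout shown in the tables above (Destination, Gateway, Flags, Netif, Expire).

```shell
# route_netif ADDR: read "netstat -rn" style output on stdin and print the
# interface (4th column, Netif) of the route whose Destination is exactly ADDR.
# Hypothetical helper; assumes the column layout shown in the tables above.
route_netif() {
    awk -v addr="$1" '$1 == addr { print $4; exit }'
}

# Usage on a live system (the address is the example alias from this report):
#   netstat -rn -f inet | route_netif 192.168.1.100
```

If this prints lo0, traffic to the address is covered by a skip on lo0; if it prints the physical interface (em0 in the 13 table above), the traffic is filtered by pf.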
Comment 3 Pietro Cerutti freebsd_committer freebsd_triage 2021-05-12 06:00:00 UTC
I was able to fix the connectivity problem between jails (and the host) by putting the jails on a dedicated interface:

## /etc/rc.conf
cloned_interfaces="${cloned_interfaces} lo1"

## /etc/pf.conf
JAIL_IF="lo1"
set skip on $JAIL_IF
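
The snippets above cover only the interface and pf side; how the jail addresses end up on lo1 is not shown. One possible jail.conf counterpart is sketched below. The jail name "myjail" and the address are placeholders chosen for illustration, not taken from this comment. The "interface|address/mask" form of ip4.addr is the one documented in jail(8).

```
## /etc/jail.conf (sketch; jail name and address are placeholders)
myjail {
    # jail(8) adds this alias to lo1 when the jail starts
    # and removes it when the jail stops
    ip4.addr = "lo1|192.168.1.100/32";
}
```

An address that lives only on a loopback interface may need additional rdr/nat rules in pf to be reachable from outside the host, which this comment does not cover.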
Comment 4 Laurent Frigault 2021-12-01 16:48:07 UTC
I just hit the same regression.

It looks more like a jail issue to me.

When an IP address is added to a jail, the local route to that IP via lo0 is no longer added. This causes problems when upgrading to 13.0 and violates POLA.
Comment 5 Laurent Frigault 2022-02-01 15:27:46 UTC
(In reply to Laurent Frigault from comment #4)
I'd like to add a clarification.

On 13.0 :
When adding an IPv4 address with a non-/32 mask to a jail, 2 routes are added:
10.10.10.0/24      link#1             U          bge0
10.10.10.10        link#1             UHS         lo0


When adding an IPv4 address with a /32 mask, only one route is added, via the physical interface, and the second route via lo0 is missing:
192.168.249.247    link#1             UH         bge0

Before 13.0, both routes were always added, even with a /32 mask:

192.168.249.195    link#1             UHS         0        5    lo0 =>
192.168.249.195/32 link#1             U           0        0   bce0
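
To find the addresses affected on a 13.0 host, one can scan the routing table for link-level host routes that do not point at a loopback interface. This is a minimal sketch under stated assumptions: non_loopback_host_routes is a hypothetical helper name, and it assumes the 13.0 "netstat -rn" column layout (Destination, Gateway, Flags, Netif) shown above.

```shell
# non_loopback_host_routes: read "netstat -rn" style output on stdin and print
# "destination interface" for every host route (H in the Flags column) with a
# link#N gateway whose interface is not a loopback (lo0, lo1, ...).
# Hypothetical helper; assumes the 13.0 column layout shown above.
non_loopback_host_routes() {
    awk 'NF >= 4 && $2 ~ /^link#/ && $3 ~ /H/ && $4 !~ /^lo[0-9]+$/ { print $1, $4 }'
}

# Usage on a live system:
#   netstat -rn -f inet | non_loopback_host_routes
```

Every line printed is an address whose local traffic leaves lo0 out of the path and is therefore not covered by a skip on lo0.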
Comment 6 Laurent Frigault 2022-02-01 17:53:17 UTC
(In reply to Laurent Frigault from comment #5)

man ifconfig still says:
..
     alias   Establish an additional network address for this interface.  This
             is sometimes useful when changing network numbers, and one wishes
             to accept packets addressed to the old interface.  If the address
             is on the same subnet as the first network address for this
             interface, a non-conflicting netmask must be given.  Usually
             0xffffffff is most appropriate.

but it looks like since 13.0 we can add aliases with a non-/32 mask even if there is already an IP in the same non-/32 subnet, and this works with jail IPs too.

example:
host configuration:
ifconfig_bge0_alias0="inet 192.168.249.240 netmask 255.255.255.128"

jail configuration:
    ip4.addr += "192.168.249.247/25";

# netstat -rn |fgrep 192.168.    
192.168.249.128/25 link#1             U          bge0
192.168.249.240    link#1             UHS         lo0
192.168.249.247    link#1             UHS         lo0

The lo0 host routes are back and the 2 IPs can talk to each other via lo0.
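
The host configuration above uses a dotted netmask (255.255.255.128) while the jail configuration uses the prefix form (/25); both describe the same subnet. A small sketch of the conversion follows. mask_to_prefix is a hypothetical helper name, and it assumes a valid contiguous netmask, which it does not validate.

```shell
# mask_to_prefix MASK: print the prefix length of a dotted-quad IPv4 netmask.
# Hypothetical helper; assumes a valid contiguous netmask.
# The body runs in a subshell so that changing IFS does not leak to the caller.
mask_to_prefix() (
    IFS=.
    set -- $1          # split the mask into its four octets
    bits=0
    for octet in "$@"; do
        case $octet in
            255) bits=$((bits + 8)) ;;
            254) bits=$((bits + 7)) ;;
            252) bits=$((bits + 6)) ;;
            248) bits=$((bits + 5)) ;;
            240) bits=$((bits + 4)) ;;
            224) bits=$((bits + 3)) ;;
            192) bits=$((bits + 2)) ;;
            128) bits=$((bits + 1)) ;;
            0)   ;;
        esac
    done
    echo "$bits"
)

mask_to_prefix 255.255.255.128
mask_to_prefix 255.255.255.255
```

The second call corresponds to the 0xffffffff mask that the manual page recommends, that is, a /32.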

This change may be related to https://www.freebsd.org/releases/13.0R/relnotes/
...
Duplicate routes installation issue for /32 or /128 interface aliases has been fixed. 81728a538d24
...

Maybe the ifconfig manual page should be updated to remove
"Usually 0xffffffff is most appropriate" from the alias item.