Bug 260973 - pf: firewall rules stop matching when vnet jails share interface names with the host
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.0-STABLE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
 
Reported: 2022-01-06 09:14 UTC by Thomas Steen Rasmussen / Tykling
Modified: 2022-02-14 19:36 UTC (History)

Description Thomas Steen Rasmussen / Tykling 2022-01-06 09:14:12 UTC
Hello,

I've been building a new vnet jail host on FreeBSD 13 and am hitting a weird issue: pf stops permitting traffic that it clearly has rules to allow once interfaces inside vnet jails are renamed to the same name as the host interface the pf rule refers to.

This is on FreeBSD nuc1.servers.bornhack.org 13.0-STABLE FreeBSD 13.0-STABLE #1 stable/13-d208638c5: Wed Jan  5 13:32:08 UTC 2022     root@nuc1.servers.bornhack.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

The complete ruleset is pretty complex, but I've managed to boil it down to a few lines:

[tykling@nuc1 ~]$ cat testpf.conf 
block log all
set skip on lo0
pass in quick on { em0 } proto { tcp } from { 85.235.250.87 } to { (em0) } port { 22 }
[tykling@nuc1 ~]$ 

The host has an em0 interface:

[tykling@nuc1 ~]$ ifconfig em0
em0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=481049b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,VLAN_HWFILTER,NOMAP>
        ether 1c:69:7a:ab:fe:be
        inet 85.209.118.130/28 broadcast 85.209.118.143
        inet6 fe80::1e69:7aff:feab:febe%em0/64 scopeid 0x1
        inet6 2a09:94c4:55d1:7680::82/64
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
[tykling@nuc1 ~]$ 

The issue seems to be triggered by renaming epair interfaces inside vnet jails to the same name as an interface on the host.

The above pf ruleset works and keeps working if I don't start any vnet jails. It also keeps working if I start vnet jails but don't rename interfaces. It also keeps working if I start vnet jails but rename the interfaces to something other than em0.

States established before the issue occurs keep working (I am working remotely via ssh on the server), but new connections seem to just ignore the pass rule on em0, and the traffic gets blocked even though a rule should permit it:

06:08:46.357935 rule 0/0(match): block in on em0: 85.235.250.87.40108 > 85.209.118.130.22: Flags [S], seq 909787121, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 799486870 ecr 0], length 0
06:08:47.358590 rule 0/0(match): block in on em0: 85.235.250.87.40108 > 85.209.118.130.22: Flags [S], seq 909787121, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 799487870 ecr 0], length 0
06:08:49.557897 rule 0/0(match): block in on em0: 85.235.250.87.40108 > 85.209.118.130.22: Flags [S], seq 909787121, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 799490070 ecr 0], length 0

A wild guess as to the reason might be a race leading to some confusion over which em0 interface is which?

Some more observations:
- It didn't seem to happen with just one vnet jail when I tried narrowing it down. Enabling and starting three more made the problem occur almost instantly.
- Rebooting with four jails plus the above ruleset enabled means never getting any contact to the server at all (ie. the problem manifests from boot).
- Results with two jails were less consistent. The number of jails/interface renames seems to play a role in whether or not the issue is triggered.
- A "service jail restart" will trigger it almost instantly if it doesn't happen right away.
- Renaming interfaces to something other than "em0" also works without any issues.

I hope it can be reproduced. I've included the jail.conf file for one of the jails below:

[tykling@nuc1 ~]$ cat /var/run/jail.syslog1_servers_bornhack_org.conf
# Generated by rc.d/jail at 2022-01-06 08:19:08
syslog1_servers_bornhack_org {
        host.hostname = "syslog1.servers.bornhack.org";
        path = "/usr/jails/syslog1.servers.bornhack.org";
        vnet;
        vnet.interface = "epair2b";
        exec.clean;
        exec.system_user = "root";
        exec.jail_user = "root";
        exec.prestart += "ifconfig epair2a destroy 2>/dev/null || true && ifconfig epair2 create up && ifconfig epair2a up && ifconfig bridge1 addm epair2a up";
        exec.start += "/sbin/ifconfig epair2b name em0 && ifconfig em0 10.1.0.3/24 && ifconfig em0 inet6 2a09:94c4:55d1:76A0::3/64";
        exec.start += "route add -inet default 10.1.0.1";
        exec.start += "route add -inet6 default 2a09:94c4:55d1:76A0::1";
        exec.poststop += "ifconfig bridge1 deletem epair2a && ifconfig epair2a destroy";
        exec.start += "/bin/sh /etc/rc";
        exec.stop = "/bin/sh /etc/rc.shutdown jail";
        exec.consolelog = "/var/log/jail_syslog1_servers_bornhack_org_console.log";
        mount.fstab = "/etc/fstab.syslog1_servers_bornhack_org";
        allow.set_hostname = 0;
        allow.sysvipc = 0;
        enforce_statfs = "2";
}
[tykling@nuc1 ~]$

The interesting sections I guess are:
- in exec.prestart (on the host) where the epair interface is destroyed, recreated and added to a bridge
- and in exec.start (inside the jail) where the interface is renamed to em0 and then configured with v4 and v6.
I've included the whole thing in case it is useful to someone.

I hope someone is able to reproduce this. If not, I will have to narrow it down further, so please let me know. I have run out of time for now.

Thanks! :)
Comment 1 Thomas Steen Rasmussen / Tykling 2022-01-06 09:49:42 UTC
Maybe related: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=185619

Also, I forgot to mention: at some point yesterday, while trying 100 things, I saw em0 on the host with multiple ether and hwaddr entries. The MAC addresses looked like the ones you see on epair interfaces. I have a screenshot of it if anyone is interested.
Comment 2 Thomas Steen Rasmussen / Tykling 2022-01-06 09:56:56 UTC
This statement

- Rebooting with four jails plus the above ruleset enabled means never getting any contact to the server at all (ie. the problem manifests from boot).

is not true; my testing was off. The problem only shows up when vnet jails with the same interface names as on the host are stopped/restarted. This also explains why I had such a hard time reproducing it right after a reboot. It only happens when a jail has been started and is then stopped (or restarted).

This fits the problem description in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=185619 perfectly
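
A minimal sketch of how one might check the host for this condition after stopping or restarting jails. It assumes (not verified in this report) that an interface returned to the host vnet under an already-used name shows up as a repeated entry in the output of `ifconfig -l`. The helper name dup_ifnames is made up for illustration.

```shell
#!/bin/sh
# Print every interface name that occurs more than once in a
# whitespace-separated list read from stdin.
# Assumption: a name collision in the host vnet is visible as a
# repeated name in `ifconfig -l` output.
dup_ifnames() {
    # One name per line, drop empty lines, then keep only repeated names.
    tr -s ' \t' '\n\n' | sed '/^$/d' | sort | uniq -d
}

# Usage on the host:
#   ifconfig -l | dup_ifnames
```

If this prints em0 after a "service jail restart", the host vnet contains more than one interface with that name.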
Comment 3 Kristof Provost freebsd_committer freebsd_triage 2022-02-14 15:37:06 UTC
With the disclaimer that this is entirely from memory and may be incorrect or outdated:

I'm aware of several somewhat related issues around interface naming. One is the issue in this bug: when an interface is moved between vnets (e.g. when the jail it lives in goes away), there's no check for name collisions.
That's non-trivial to solve, because the relevant code paths often have no way to return an error if there's a name collision, and the locking around interface names is also unclear (and likely wrong in several places).

There's a loosely related issue with interface groups as well (see #218895, #202178). Now that interfaces can be renamed, it's possible to have an interface group and an interface with the same name (and the interface need not even be a member of the group). This has previously triggered panics in pf, as it assumes that interfaces and interface groups share a namespace. Historically that was the case, in that interface names always ended with a number and group names never did. The former is no longer true, but the latter is still enforced. This issue too is difficult to solve, for the same reasons as the problem described in this bug (lack of error paths, unclear locking).
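
As a rough illustration of that naming convention (not something pf or the kernel does), the following sketch lists interface names that do not end in a digit and could therefore also be valid group names. The function name names_without_trailing_digit is hypothetical.

```shell
#!/bin/sh
# Read whitespace-separated interface names from stdin and print those
# that do NOT end in a digit. Group names may not end in a digit, so
# these are the interface names that could coincide with a group name.
names_without_trailing_digit() {
    tr -s ' \t' '\n\n' | while IFS= read -r name; do
        case "$name" in
            '')     ;;                      # skip empty lines
            *[0-9]) ;;                      # conventional name, e.g. em0
            *)      printf '%s\n' "$name" ;;
        esac
    done
}

# Usage on the host:
#   ifconfig -l | names_without_trailing_digit
```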

When I looked at it last I estimated this to be a significant (plausibly multi-month) effort to fix. I do not expect to work on these problems any time soon.
Comment 4 Thomas Steen Rasmussen / Tykling 2022-02-14 19:05:25 UTC
(In reply to Kristof Provost from comment #3)

Thank you for the input. The issue I was hitting is the first one you mention - also described in #185619 - and I've been able to work around it in my own setup by inventing some interface names inside the jails which are never used on the host (in my case the jail interfaces are called jail0, jail1 etc).

Also, this is not strictly needed, but one could add an exec.stop entry before rc.shutdown to rename the interfaces back to their original epairNb name which shouldn't be in use in the parent vnet.

Both of these are workarounds of course, and they don't begin to consider nested jails with overlapping interface names.
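
For reference, the two workarounds could look roughly like this in jail.conf. This is an abbreviated, untested sketch adapted from the syslog1 jail configuration in the description. Only the lines relevant to the rename are shown, and the exec.stop rename line is an assumption about how the "rename back" step might be written.

```
syslog1_servers_bornhack_org {
        vnet;
        vnet.interface = "epair2b";
        # Workaround 1: rename to a name that is never used on the host
        # (jail0 instead of em0).
        exec.start += "/sbin/ifconfig epair2b name jail0 && ifconfig jail0 10.1.0.3/24 && ifconfig jail0 inet6 2a09:94c4:55d1:76A0::3/64";
        exec.start += "/bin/sh /etc/rc";
        # Workaround 2 (optional): restore the original epair name before
        # rc.shutdown runs, so the interface returns to the parent vnet
        # under a name that should be free there.
        exec.stop = "/sbin/ifconfig jail0 name epair2b";
        exec.stop += "/bin/sh /etc/rc.shutdown jail";
}
```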

Kristof, do you know the code well enough to say if it would be possible to deny the initial interface rename action if a parent vnet is using the same name?
Comment 5 Kristof Provost freebsd_committer freebsd_triage 2022-02-14 19:36:50 UTC
> Kristof, do you know the code well enough to say if it would be possible to deny the initial interface rename action if a parent vnet is using the same name?


That runs into the same problems as dealing with it when the interface is returned to the parent vnet, and doesn't account for possible renames after the interface is moved to a child vnet.