Bug 268976 - Traffic will not route across two bridges on the same /8
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.1-RELEASE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-net (Nobody)
 
Reported: 2023-01-15 22:34 UTC by rtyler
Modified: 2023-07-29 03:18 UTC (History)

Description rtyler 2023-01-15 22:34:21 UTC
When setting up a network topology with FreeBSD vnet jails, I found that I was unable to route traffic between jails attached to two different bridge interfaces. It appears that if bridge0 and bridge1 share the same /8, traffic will not route between them correctly.

Using the following topology as an example:

+-------+
| world |
+-------+
   |
  vtnet0
   |
  pf/nat

  +---------------+        +-------------------+
  | dmz (bridge0) |        | private (bridge1) |
  +---------------+        +-------------------+

  * http                           * db
  * git

When bridge0 is 10.10.1.1/24 and bridge1 is 10.200.2.1/24, traffic will *not* route properly between the `http` and the `db` jails.

However, if bridge1 is `192.168.100.1/24`, traffic routes properly between the two jails. Every configuration I tried that placed bridge1 under 10.x.x.x resulted in traffic not routing properly.


Below are some configuration files from the test VM:

jail.conf
------------------------------------
persist;
mount.devfs;
path = "/jails/$name";
host.hostname = $name;

exec.start = "/bin/sh /etc/rc";
exec.stop = "/bin/sh /etc/rc.shutdown jail";
exec.clean;

vnet;

$dmz = "bridge0";
$dmz_gw = "10.10.1.1";
$private = "bridge1";
#$private_gw = "10.10.2.1";
$private_gw = "192.168.100.1";

http {
	$id = "0";
	$ip = "10.10.1.80";

	vnet.interface = "epair${id}b";

	exec.prestart = "ifconfig epair${id} create up";
	exec.prestart += "ifconfig epair${id}a up descr vnet-${name}";
	exec.prestart += "ifconfig ${dmz} addm epair${id}a up";

	exec.start = "/sbin/ifconfig epair${id}b ${ip}";
	exec.start += "/sbin/route add default ${dmz_gw}";
	exec.start += "/bin/sh /etc/rc";

	exec.poststop = "ifconfig ${dmz} deletem epair${id}a";
	exec.poststop += "ifconfig epair${id}a destroy";
}

db {	
	$id = "1";
	# For reproducing the bug
	#$ip = "10.10.2.32";
	$ip = "192.168.100.32";

	vnet.interface = "epair${id}b";

	exec.prestart = "ifconfig epair${id} create up";
	exec.prestart += "ifconfig epair${id}a up descr vnet-${name}";
	exec.prestart += "ifconfig ${private} addm epair${id}a up";

	exec.start = "/sbin/ifconfig epair${id}b ${ip}";
	exec.start += "/sbin/route add default ${private_gw}";
	exec.start += "/bin/sh /etc/rc";

	exec.poststop = "ifconfig ${private} deletem epair${id}a";
	exec.poststop += "ifconfig epair${id}a destroy";
}   	

git {	
	$id = "2";
	$ip = "10.10.1.3";

	vnet.interface = "epair${id}b";

	exec.prestart = "ifconfig epair${id} create up";
	exec.prestart += "ifconfig epair${id}a up descr vnet-${name}";
	exec.prestart += "ifconfig ${dmz} addm epair${id}a up";

	exec.start = "/sbin/ifconfig epair${id}b ${ip}";
	exec.start += "/sbin/route add default ${dmz_gw}";
	exec.start += "/bin/sh /etc/rc";

	exec.poststop = "ifconfig ${dmz} deletem epair${id}a";
	exec.poststop += "ifconfig epair${id}a destroy";
}   	
------------------------------------

rc.conf
------------------------------------
hostname="vnet-test"
ifconfig_vtnet0="DHCP"
#ifconfig_vtnet0_ipv6="inet6 accept_rtadv"
sshd_enable="YES"
ntpdate_enable="YES"
ntpd_enable="YES"
powerd_enable="YES"
# Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
dumpdev="AUTO"
zfs_enable="YES"
sendmail_enable="NONE"

# Networking and Jails
jail_enable="YES"
pf_enable="YES"
gateway_enable="YES"
cloned_interfaces="bridge0 bridge1"
ifconfig_bridge0="inet 10.10.1.1/24"
ifconfig_bridge1="inet 192.168.100.1/24"
# Using this network results in not being able to route
# Make sure to update /etc/jail.conf for the db jail when changing
#ifconfig_bridge1="inet 10.10.2.1/24"
------------------------------------

pf.conf
------------------------------------
extif="vtnet0"
dmz="bridge0"
private="bridge1"

scrub in all fragment reassemble

nat on $extif from $dmz:network to any -> ($extif)
nat on $extif from $private:network to any -> ($extif)
------------------------------------
Comment 1 Zhenlei Huang freebsd_committer freebsd_triage 2023-01-16 09:30:21 UTC
> db {
>	# For reproducing the bug
>	#$ip = "10.10.2.32";
>	$ip = "192.168.100.32";
> ...
>	exec.start = "/sbin/ifconfig epair${id}b ${ip}";
>	exec.start += "/sbin/route add default ${private_gw}";
> ...
> }

The netmask assigned to the epair interface in jails is apparently wrong.

Your dmz (bridge0) network is 10.10.1.1/24, but you ran `/sbin/ifconfig epair${id}b 10.10.2.32` without a netmask / prefixlen, so the netmask / prefixlen ends up as `255.0.0.0` or `/8`, which is the default for the classful address `10.x.x.x`. For `192.168.100.32` the default prefixlen is 24.

Try classless (CIDR) addresses, for example for db: `$ip = "10.10.2.32/24"`.
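
To make the default concrete, here is a minimal Python sketch of the historical classful rule described above. The function name `classful_prefixlen` is made up for illustration; this is not FreeBSD's code, only the class A/B/C boundaries applied to the addresses from this report.

```python
import ipaddress


def classful_prefixlen(addr: str) -> int:
    """Return the historical classful default prefix length for an IPv4 address."""
    first_octet = int(ipaddress.IPv4Address(addr)) >> 24
    if first_octet < 128:   # class A: 0.x.x.x - 127.x.x.x
        return 8
    if first_octet < 192:   # class B: 128.x.x.x - 191.x.x.x
        return 16
    if first_octet < 224:   # class C: 192.x.x.x - 223.x.x.x
        return 24
    # Class D (multicast) and class E have no default network mask.
    raise ValueError(f"{addr} has no classful default mask")


for addr in ("10.10.2.32", "192.168.100.32"):
    print(addr, "->", f"/{classful_prefixlen(addr)}")
# 10.10.2.32 -> /8
# 192.168.100.32 -> /24
```

This matches the symptom: only the 10.x.x.x addresses pick up a mask wider than the intended /24.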

Good luck!
Comment 2 rtyler 2023-01-24 01:57:59 UTC
Changing the `ifconfig` invocation to explicitly specify the prefixlen definitely solved the problem, e.g.:

    exec.start = "/sbin/ifconfig epair${id}b ${ip}/24";
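
Why the prefix length matters for routing can be sketched in a few lines of Python. This is a simplified model, assuming a host with one interface and one default gateway; `next_hop` is a hypothetical helper, not a FreeBSD API. A destination inside the interface's own network is treated as on-link and is never handed to the gateway.

```python
import ipaddress


def next_hop(local: str, gateway: str, dest: str) -> str:
    """Pick the next hop for dest, given local as 'address/prefixlen'.

    Returns dest itself when it is on-link, otherwise the default gateway.
    """
    iface = ipaddress.ip_interface(local)
    if ipaddress.ip_address(dest) in iface.network:
        return dest      # on-link: resolved directly on the local segment
    return gateway       # off-link: forwarded via the default route


# db jail with the implicit /8: the http jail looks on-link, so the packet
# never reaches the gateway on bridge1.
print(next_hop("10.10.2.32/8", "10.10.2.1", "10.10.1.80"))   # 10.10.1.80
# db jail with an explicit /24: the packet goes to the gateway.
print(next_hop("10.10.2.32/24", "10.10.2.1", "10.10.1.80"))  # 10.10.2.1
```

With the /8, the db jail tries to reach 10.10.1.80 directly on bridge1, where the http jail is not attached. The same applies in the other direction for the http jail.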


While the behavior was confusing, I'm not sure what the appropriate behavior from FreeBSD should be in the case of a missing prefixlen.

A warning or error certainly would have saved me tons of time :)
Comment 3 Mina Galić freebsd_triage 2023-01-24 07:07:23 UTC
given that CIDR was introduced 30 years ago, a small warning that we're being transported back 31 years in time would be appropriate.

this behaviour isn't quite "POLA", or at least not for young folks, under 40.
Comment 4 Mike Karels freebsd_committer freebsd_triage 2023-01-24 15:07:05 UTC
When I do this operation manually on 13.1 or 14.0, I get this warning from ifconfig:

# ifconfig epair0a 126.1
ifconfig: WARNING: setting interface address without mask is deprecated,
default mask may not be correct.

In addition, the following message is logged by the kernel (should appear on the console and /var/log/messages):

epair0a: set address: WARNING: network mask should be specified; using historical default

Maybe the jail setup process is hiding these?
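
If the warnings are indeed not visible during jail startup, one workaround is to check the command strings before they run. The following is a rough Python sketch of such a check; `missing_mask` is a hypothetical helper, not part of FreeBSD or jail(8), and it only recognizes full dotted-quad IPv4 addresses (a short form such as `126.1` is not detected).

```python
import ipaddress
from typing import List


def missing_mask(command: str) -> List[str]:
    """Return IPv4 addresses in an ifconfig command line that carry no mask."""
    tokens = command.split()
    # An explicit netmask or prefixlen keyword covers the address.
    if "netmask" in tokens or "prefixlen" in tokens:
        return []
    bare = []
    for tok in tokens:
        if "/" in tok:
            continue  # CIDR notation (or a path such as /sbin/ifconfig)
        try:
            ipaddress.IPv4Address(tok)
        except ValueError:
            continue  # not a dotted-quad address (interface name, flag, ...)
        bare.append(tok)
    return bare


print(missing_mask("/sbin/ifconfig epair1b 10.10.2.32"))     # ['10.10.2.32']
print(missing_mask("/sbin/ifconfig epair1b 10.10.2.32/24"))  # []
```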
Comment 5 Mina Galić freebsd_triage 2023-05-21 09:37:45 UTC
I think my expectation is that when I assign an address without a netmask, the netmask is going to be /32, not some primordial horror.
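
For comparison, Python's standard `ipaddress` module behaves the way this comment expects: an address given without a prefix length is treated as a single host (/32). The snippet below only illustrates that convention with the db jail's address from this report; it says nothing about what ifconfig does.

```python
import ipaddress

# No prefix length given: ipaddress assumes a host address (/32).
iface = ipaddress.ip_interface("10.10.2.32")
print(iface.with_prefixlen)  # 10.10.2.32/32

# With a /32, no other address counts as on-link, not even the gateway.
peer_on_link = ipaddress.ip_address("10.10.1.80") in iface.network
gateway_on_link = ipaddress.ip_address("10.10.2.1") in iface.network
print(peer_on_link, gateway_on_link)  # False False
```

Under a /32 default the misconfiguration would surface immediately, since even the gateway is outside the interface's network, rather than silently overlapping with the other bridge.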