Bug 271474 - Possible to "lose" a tap(4) interface in a jail
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: misc
Version: 13.2-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-net (Nobody)
Reported: 2023-05-17 21:54 UTC by Joshua Kinard
Modified: 2024-04-20 17:26 UTC
Description Joshua Kinard 2023-05-17 21:54:07 UTC
So after some trial and error, I have discovered that it is possible to literally "lose" a tap(4) interface inside of a jail during shutdown if the conditions are right.  I am trying to run bhyve within a VNET jail, and thought I would be OK to create 'tap0' inside the jail itself via /etc/rc.conf rather than at the host level and passing it in via VNET.

This kinda works for the most part, but in the process of trying to work out how to get bhyve to run inside of a jail, I had to start/stop the jail several times.  In one of those instances, I forgot to run "bhyvectl --vm=foo --destroy" before I ran "jail -r foojail".  The jail shut down like it should, but when I then ran "jail -c foojail", I noticed in the console output that it was unable to create the 'tap0' interface because it already existed.

Dropping into the jail, 'ifconfig' showed no visible 'tap0' interface.  Same at the host level.  Somewhere in kernel memory, though, there was a 'tap0' interface with no way to get at it.  Doing a proper shutdown of the bhyve instance did not release it, nor did restarting the jail.  Only a full system reboot got things back to normal.

That's the problem description, anyways.  As for a fix, I honestly don't have a good suggestion.  Starting and stopping of jails is really just being clever with shell scripting in /etc/jail.conf, so it would be really hard to handle a case where jail shutdown can catch a failure to stop an interface and return that back to 'jail' and have it abort the shutdown.  For me, I decided to play it safe and moved the creation of the tap(4) interface to the host level and then just pass it to the jail as a vnet.interface parameter.
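
The workaround in the last sentence can be sketched as a small host-side wrapper. This is only an illustration, assuming the jail's entry in /etc/jail.conf already contains `vnet.interface = "tap0";`. The `run` helper and the `DRY_RUN` variable are hypothetical names introduced here so the sequence can be previewed without touching the system.

```sh
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the tap on the host first, then start the jail. jail.conf is
# assumed to hand the interface over via vnet.interface.
start_jail_with_host_tap() {
    tap=$1
    jail=$2
    run ifconfig "$tap" create || return 1
    run jail -c "$jail"
}

# Stop the jail, then destroy the tap from the host once the jail has
# released it.
stop_jail_with_host_tap() {
    tap=$1
    jail=$2
    run jail -r "$jail" || return 1
    run ifconfig "$tap" destroy
}
```

Running `DRY_RUN=1 start_jail_with_host_tap tap0 foojail` prints the two commands instead of executing them, which makes it easy to check the ordering first.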
Comment 1 Meyser+bugs.freebsd.org 2023-05-18 05:36:46 UTC
/etc/rc.d/netif is NOT invoked inside vnetjails. (novnetjail Keyword)
so cloned interfaces are NOT destroyed during shutdown.

After changing exec.stop in /etc/jail.conf to

exec.stop = "/bin/sh /etc/rc.d/netif stop; /bin/sh /etc/rc.shutdown";

cloned interfaces are destroyed before shutdown.

Perhaps this works in your case too.
Comment 2 Joshua Kinard 2023-05-21 17:59:55 UTC
(In reply to Meyser+bugs.freebsd.org from comment #1)
> /etc/rc.d/netif is NOT invoked inside vnetjails. (novnetjail Keyword)
> so cloned interfaces are NOT destroyed during shutdown.
> 
> After changing exec.stop in /etc/jail.conf to
> 
> exec.stop = "/bin/sh /etc/rc.d/netif stop; /bin/sh /etc/rc.shutdown";
> 
> cloned interfaces are destroyed before shutdown.
> 
> Perhaps this works in your case too.
It looks like "nojailvnet" actually means "allow vnet jails", according to the rc(8) manpage.  The wording for that rc keyword could've been chosen better, IMHO, but in my case, I am using a VNET-enabled jail to run bhyve in.  So I don't think this is the exact cause.
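
To check which keywords a given rc.d script actually declares, the `# KEYWORD:` comment lines can be read directly. A minimal sketch follows; `rc_keywords` and `has_rc_keyword` are hypothetical helper names made up for this example, not part of rc.subr.

```sh
#!/bin/sh
# Print every keyword declared on "# KEYWORD:" lines of an rc.d script,
# one keyword per line.
rc_keywords() {
    sed -n 's/^#[[:space:]]*KEYWORD:[[:space:]]*//p' "$1" |
        tr -s ' \t' '\n\n' |
        sed '/^$/d'
}

# Succeed if the script ($1) declares the exact keyword ($2).
has_rc_keyword() {
    rc_keywords "$1" | grep -Fxq -- "$2"
}
```

For example, `rc_keywords /etc/rc.d/netif` lists the keywords of the script discussed here, and `has_rc_keyword /etc/rc.d/netif shutdown` tells whether it declares "shutdown".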

Rather than keep running this experiment on my production appliance, I dug an older appliance out of a box, set up a clone of the production appliance's filesystem on it to use as a toy, and tried to reproduce the case where tap0 was getting "lost".  So far, I have not been able to find a reproducible cause.  Stopping the jail while bhyve was running and then trying to restart it partly reproduced my initial scenario, in that netif was unable to create tap0 on jail start-up, but simply destroying the "dead" bhyve instance from the host level and then restarting the jail cleared that, and tap0 could be re-created.

So I am a bit stumped how I originally triggered this fluke in a way that was unrecoverable w/o a reboot.  That said, dropping the jail w/ bhyve running will still create a point where the handle for tap0 is still held by bhyve and thus, it can't be recreated next time the jail is restarted until the bhyve session is destroyed at the host level.  So that might be a good debugging point for someone to look into the tap(4) driver to see if there's a way to mitigate this.
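
The ordering problem described here (the jail is removed while bhyve still holds tap0) can be avoided by destroying the VM first. A minimal sketch, assuming the VM shows up as a node under /dev/vmm on the host, and using the names "foo" and "foojail" from the original report. `run`, `DRY_RUN`, and `VMM_DIR` are hypothetical helpers added for illustration.

```sh
#!/bin/sh
# Directory where VM device nodes are assumed to appear on the host.
VMM_DIR=${VMM_DIR:-/dev/vmm}

# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Destroy the VM (if it still exists) before removing the jail, so that
# nothing keeps the tap device open across the jail shutdown.
stop_jail_after_vm() {
    vm=$1
    jail=$2
    if [ -e "$VMM_DIR/$vm" ]; then
        run bhyvectl --vm="$vm" --destroy || return 1
    fi
    run jail -r "$jail"
}
```

With `DRY_RUN=1`, `stop_jail_after_vm foo foojail` only prints the commands it would run, so the order can be verified before using it for real.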
Comment 3 matthias+freebsd+bugzilla 2023-05-21 18:48:49 UTC
(In reply to Joshua Kinard from comment #2)

OK, since 2020 /etc/rc.d/netif has grown a "nojailvnet" keyword (which confused me), but it is still missing the "shutdown" keyword, so it is not run on shutdown.

Cloned interfaces that are active members of a bridge inside the jail triggered the problem.
Comment 4 Joshua Kinard 2023-05-21 21:46:27 UTC
(In reply to matthias+freebsd+bugzilla from comment #3)
I started and stopped the jail multiple times using the normal commands and it destroyed tap0 and recreated tap0 each time as long as I had the VM fully destroyed.  Could be a race-like condition and so not easily triggerable.  I may try again later tonight or tomorrow to see if I can re-create it, and may also re-try on my production appliance.

FWIW, I use cloned_interfaces to create the bridge interface, too, but I have learned to list bridge interfaces last in that variable after all member interfaces have been created, and that does seem to avoid some issues that I saw on one of my other appliances.
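
The "bridges last" ordering can be expressed as a small shell function. This is just a sketch of the idea; `bridges_last` is a hypothetical name, and the interface names in the usage note are examples only.

```sh
#!/bin/sh
# Reorder a cloned_interfaces-style list so that bridge interfaces come
# after all other (member) interfaces, keeping the relative order otherwise.
bridges_last() {
    members=
    bridges=
    for ifn in "$@"; do
        case $ifn in
            bridge*) bridges="$bridges $ifn" ;;
            *)       members="$members $ifn" ;;
        esac
    done
    # Unquoted on purpose: word splitting drops the leading spaces.
    echo $members $bridges
}
```

Calling `bridges_last bridge0 tap0 tap1` yields a list with tap0 and tap1 first and bridge0 at the end, which is the order that could then be assigned to cloned_interfaces.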
Comment 5 Stephen Fox 2024-04-20 17:26:28 UTC
I ran into this issue as well in a similar scenario (trying to run a bhyve VM from a jail). While trying to understand it, I have been doing a lot of "ls /dev" and "ls /dev/tapN", and I realized that "ls -l /dev/tapN" creates an entry in "/dev":

```
root@x:/etc/jail.conf.d # ifconfig tap4141
ifconfig: interface tap4141 does not exist
root@x:/etc/jail.conf.d # ls -l /dev | grep tap4141
root@x:/etc/jail.conf.d # ls -l /dev/tap4141
crw-------  1 uucp dialer 0x70 Apr 20 12:53 /dev/tap4141
root@x:/etc/jail.conf.d # ifconfig tap4141
tap4141: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=80000<LINKSTATE>
	ether 58:9c:fc:10:97:4a
	groups: tap
	media: Ethernet 1000baseT <full-duplex>
	status: no carrier
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
```

Needless to say - this is not the behavior I expected.
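
Going by the transcript above, listing /dev and filtering the output did not create the interface, while looking up /dev/tapN by name did. A check that avoids the by-name lookup could look roughly like this; `dev_listed` is a hypothetical helper, and whether this is safe in every case is an assumption based only on that transcript.

```sh
#!/bin/sh
# Report whether a node is listed in a devfs directory without looking the
# name up directly: list the directory and match the exact entry name.
dev_listed() {
    name=$1
    dir=${2:-/dev}
    ls "$dir" | grep -Fxq -- "$name"
}
```

For example, `dev_listed tap4141 && echo present` checks for the node without running "ls -l /dev/tap4141".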

The steps to reproduce the issue described by Joshua appear to be:

```
service jail start lose-tap-example
jexec lose-tap-example ls -l /dev/tap41
service jail stop lose-tap-example
```

Here is the jail configuration file ("/etc/jail.conf.d/lose-tap-example.conf"):

```
lose-tap-example {
  path = "/zroot/jails/${name}";
  mount.devfs;
  vnet;

  exec.start += "/bin/sh /etc/rc";
  exec.stop += "/bin/sh /etc/rc.shutdown";
}
```

The host system and jail versions:

```
# freebsd-version -uk
14.0-RELEASE-p5
14.0-RELEASE-p5
# jexec lose-tap-example freebsd-version -u
14.0-RELEASE-p6
```

And here is what it looks like from a shell for more context:

```
root@x:~ # ls -l /dev | grep tap
root@x:~ # service jail start lose-tap-example
Starting jails: lose-tap-example.
root@x:~ # jexec lose-tap-example ifconfig
lo0: flags=1008049<UP,LOOPBACK,RUNNING,MULTICAST,LOWER_UP> metric 0 mtu 16384
	options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
	inet 127.0.0.1 netmask 0xff000000
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
	groups: lo
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
root@x:~ # jexec lose-tap-example ls -l /dev/tap41
ls: /dev/tap41: No such file or directory
root@x:~ # jexec lose-tap-example ifconfig
lo0: flags=1008049<UP,LOOPBACK,RUNNING,MULTICAST,LOWER_UP> metric 0 mtu 16384
	options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
	inet 127.0.0.1 netmask 0xff000000
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
	groups: lo
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
tap41: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=80000<LINKSTATE>
	ether 52:72:e6:7e:7c:ab
	groups: tap
	media: Ethernet 1000baseT <full-duplex>
	status: no carrier
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
root@x:~ # service jail stop lose-tap-example
Stopping jails: lose-tap-example.
root@x:~ # ifconfig
vtnet0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
	options=80028<VLAN_MTU,JUMBO_MTU,LINKSTATE>
	ether (...)
	inet (...)  netmask 0xffffff00 broadcast (...)
	media: Ethernet autoselect (10Gbase-T <full-duplex>)
	status: active
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=1008049<UP,LOOPBACK,RUNNING,MULTICAST,LOWER_UP> metric 0 mtu 16384
	options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
	inet 127.0.0.1 netmask 0xff000000
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
	groups: lo
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
root@x:~ # ifconfig tap41 create
ifconfig: interface tap41 already exists
root@x:~ # rm /dev/tap41
root@x:~ # ifconfig tap41 create
ifconfig: interface tap41 already exists
```
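
The end state of that session (the interface is not visible, yet creating it fails with "already exists") can be detected with a small probe. This is a sketch under the assumption that the two ifconfig messages shown above are the ones to look for; `check_lost_tap` is a hypothetical name. Note that the probe creates and immediately destroys the interface when the name turns out to be free.

```sh
#!/bin/sh
# Classify an interface name as "visible", "free", or "lost".
# "lost" means: ifconfig cannot show it, but creating it fails with
# "already exists", as in the session above.
check_lost_tap() {
    ifname=$1
    if ifconfig "$ifname" >/dev/null 2>&1; then
        echo "visible"
        return 0
    fi
    if err=$(ifconfig "$ifname" create 2>&1); then
        # The name was free; remove the probe interface again.
        ifconfig "$ifname" destroy >/dev/null 2>&1
        echo "free"
        return 0
    fi
    case $err in
        *"already exists"*) echo "lost"; return 1 ;;
        *) echo "error: $err"; return 2 ;;
    esac
}
```

Run on the host after stopping the jail, `check_lost_tap tap41` would report "lost" in the situation shown in the transcript.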