Bug 235704 - [net] [patch] tun(4) can't be destroyed on a VNET jail if it's renamed
Summary: [net] [patch] tun(4) can't be destroyed on a VNET jail if it's renamed
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 12.0-RELEASE
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-net (Nobody)
URL:
Keywords: patch, vimage
Depends on:
Blocks:
 
Reported: 2019-02-12 20:56 UTC by genneko217
Modified: 2019-03-19 00:31 UTC (History)
4 users (show)

See Also:


Attachments
diff to /usr/src (564 bytes, patch)
2019-02-12 20:56 UTC, genneko217
no flags Details | Diff
Reproduction test log (941 bytes, text/plain)
2019-02-12 20:57 UTC, genneko217
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description genneko217 2019-02-12 20:56:45 UTC
Created attachment 201965 [details]
diff to /usr/src

If a tun(4) interface is created on a VNET jail and renamed to the name
other than "tunX", the interface cannot be destroyed thereafter. If it
is renamed back to "tunX", it becomes destroyable again. This issue only
occurs on VNET jails and doesn't seem to happen on a host or a/non-VNET
environment.


How to Reproduce:

1. Boot a VIMAGE-enabled FreeBSD system.
   (I used 12.0-RELEASE-p3 with GENERIC kernel. See attached log.txt)
2. Create a VNET jail.
3. Create a tun(4) interface on the VNET jail,
   then rename it to the name other than "tunX".
4. Try to destroy the tun interface on the VNET jail.

Expected Result: Step 4 removes the interface successfully.
Actual Result: Step 4 fails with "SIOCIFDESTROY: Invalid argument".


Minimal Set of Commands to Reproduce:
(Assume the second command creates 'tun0')

jail -c -n test vnet persist
jexec test ifconfig tun create
jexec test ifconfig tun0 name wg0
jexec test ifconfig wg0 destroy


Workaround:

The problem can be worked around by renaming the undestroyable tun
interface to "tunX" (X can be any number which is not being used by
other tun interfaces on the system).

jexec test ifconfig wg0 name tun0
jexec test ifconfig tun0 destroy


Possible Fix:

See attached patch for if_clone_destroy() in sys/net/if_clone.c.

I wrote and tested it on releng/12.0 and confirmed it solved this
particular problem, but because of my lack of knowledge and experience
(this is my first kernel debugging, report and patch submission ever)
I'm not sure it's reasonable and doesn't break any other stuff.


Background:

I found this issue when I was experimenting with the WireGuard VPN
(net/wireguard package) on VNET jails.

On FreeBSD, WireGuard uses a userspace program (net/wireguard-go) to
create its tunnel interface. The program first generates a tun interface
by opening /dev/tun, then renames it to the one specified on the command
line.

When I used 'wg0' as the WireGuard interface name on a VNET jail, I
noticed that it couldn't be destroyed as 'service wireguard stop' got
stuck waiting the interface to be destroyed.

Further investigation shows me that it's not specific to WireGuard.
Thus I create this report.
Comment 1 genneko217 2019-02-12 20:57:46 UTC
Created attachment 201966 [details]
Reproduction test log
Comment 2 Kristof Provost freebsd_committer freebsd_triage 2019-02-12 21:27:08 UTC
Excellent bug report. Bjoern knows this code a bit better than me and will hopefully be able to review your patch.
Comment 3 genneko217 2019-02-13 00:04:44 UTC
First, thank you so much for looking at my report.

What follows is my humble analysis on the issue and my patch.
Although this may be redundant or even inappropriate to include in bug
reports, as a newbie, please let me summarize my thoughts.

"ifconfig name destroy" issues a SIOCIFDESTROY ioctl on the interface
"name", which finally calls if_clone_destroy(name) in sys/net/if_clone.c.

https://svnweb.freebsd.org/base/releng/12.0/sys/net/if_clone.c?revision=340470&view=markup#l247

This function has two LIST_FOREACH loops which search for the 'cloner'
used to create the interface "name".

The first loop iterates the curvnet's cloner list and looks for a cloner
whose name matches the driver name of the interface (ifp->if_dname.
"tun" in this case). Because tun(4) seems to have only a single global
cloner and doesn't seem to have per-VNET ones for some reason (related
to the legacy devfs cloning functionality?), the cloner "tun" is not in
the list and not found in the first loop. Thus the second loop is
executed to look for the cloner "tun" in the global vnet0's cloner list.

In the second loop, however, matching code uses the interface name (the
function's argument "name". "wg0" in this case) instead of the driver
name. As the interface name can be changed as seen in this report, there
are cases where the second loop cannot find the cloner "tun" which is
actually there. In those cases, the function exits with EINVAL, which
matches the Actual Result of this report.

On the other hand, a tun(4) interface on a host or non-VNET environment
can be destroyed because the curvnet that the first loop iterates is the
global vnet0's cloner list, which does contain the cloner "tun" and
comparison is done against the driver name "tun" (not the interface name
which can be changed to something like "wg0").

With those, I thought it's better for the second loop to use the same
logic (comparing ifc->ifc_name with ifp->if_dname) as the first loop
because the interface name (name) is changeable while the driver name
(ifp->if_dname) seems not.

I also noticed that with this change, it's no longer necessary to have
separate matching codes depending on cloner types (SIMPLE or ADVANCED)
because they are for matching full interface name including a unit
number (and ADVANCED cloner even has a custom match function to handle
non-starndard interface names such as epairXa/b or vlanX.Y) but now it
only requires a simple, perfect match to perform lookups by driver name.
Comment 4 Kristof Provost freebsd_committer freebsd_triage 2019-02-19 18:29:19 UTC
I've proposed a slightly different patch in https://reviews.freebsd.org/D19248
Comment 5 genneko217 2019-02-20 12:14:10 UTC
Thank you for working on this issue.
I've briefly tested the patch and confirmed the reported issue was solved
with it.

I also confirmed that it changed the way unit numbers are assigned to if_tun.
While if_tun's unit numbers are assigned sequentially across a host and all
jails on it before the patch, it's now assigned independently so that vnet
jails and the host can have interfaces with the same name.

But now I began to wonder how the interfaces with the same name, say tun0,
are related to /dev/tun0 on the system.

When I did a quick testing with the following commands, the fourth command
removed tun0 interface from the host and also deleted /dev/tun0 while tun0
interface on the vnet jail "test" is still there. Then the fifth command
caused a kernel panic at destroy_devl in sys/kern/kern_conf.c.

# Assume there's no if_tun on the system.
jail -c -n test vnet persist
ifconfig tun create
jexec test ifconfig tun create
ifconfig tun0 destroy
jexec test ifconfig tun0 destroy
Comment 6 Kristof Provost freebsd_committer freebsd_triage 2019-02-20 12:55:33 UTC
Thanks. You're right, the unit numbers are used to find the device node (or vice versa), so they need to be system-unique.

I'll take a look at what we can do about that.
Comment 7 Kristof Provost freebsd_committer freebsd_triage 2019-02-27 09:53:33 UTC
I've updated the patch on https://reviews.freebsd.org/D19248 to fix the unit number issue.
Comment 8 genneko217 2019-02-28 15:11:04 UTC
(In reply to Kristof Provost from comment #7)

The updated patch fixed the issue.
I think it's also a good example of virtualizing an interface.
Comment 9 genneko217 2019-02-28 15:27:49 UTC
Sorry to bother you again and again, but let me ask one more thing.

Because some interface types (tap, vxlan etc.) still don't have
virtualized cloners, I think similar problems could occur to them.
(I know this is truly a corner case, though.) 

if_clone_destroy() in sys/net/if_clone.c has the following code to
find the cloner for an interface to destroy.

What I was wondering is why the line 263-267 (the first loop) and
the line 271-278 (the second loop) use different code.
Can't they be the same?
(My original patch came from this question.)

   261		/* Find the cloner for this interface */
   262		IF_CLONERS_LOCK();
   263		LIST_FOREACH(ifc, &V_if_cloners, ifc_list) {
-> 264			if (strcmp(ifc->ifc_name, ifp->if_dname) == 0) {
   265				break;
   266			}
   267		}
   268	#ifdef VIMAGE
   269		if (ifc == NULL && !IS_DEFAULT_VNET(curvnet)) {
   270			CURVNET_SET_QUIET(vnet0);
   271			LIST_FOREACH(ifc, &V_if_cloners, ifc_list)
   272				if (ifc->ifc_type == SIMPLE) {
-> 273					if (ifc_simple_match(ifc, name))
   274						break;
   275				} else {
-> 276					if (ifc->ifc_match(ifc, name))
   277						break;
   278				}
   279			CURVNET_RESTORE();
   280		}
   281	#endif
Comment 10 Kristof Provost freebsd_committer freebsd_triage 2019-03-02 14:56:49 UTC
(In reply to genneko217 from comment #9)
That's a very good question that I don't have a great answer for.
Looking at the commit that introduced it I'm inclined to think that might be a mistake.

I've cc'd Gleb because he might know more.
Comment 11 commit-hook freebsd_committer freebsd_triage 2019-03-05 13:21:54 UTC
A commit references this bug:

Author: kp
Date: Tue Mar  5 13:21:08 UTC 2019
New revision: 344794
URL: https://svnweb.freebsd.org/changeset/base/344794

Log:
  tun: VIMAGE fix for if_tun cloner

  The if_tun cloner is not virtualised, but if_clone_attach() does use a
  virtualised list of cloners.
  The result is that we can't find the if_tun cloner when we try to remove
  a renamed tun interface. Virtualise the cloner, and move the final
  cleanup into a sysuninit so that we're sure this happens after all of
  the vnet_sysuninits

  Note that we need unit numbers to be system-unique (rather than unique
  per vnet, as is done by if_clone_simple()). The unit number is used to
  create the corresponding /dev/tunX device node, and this node must match
  with the interface.
  Switch to if_clone_advanced() so that we have control over the unit
  numbers.

  Reproduction scenario:
  	jail -c -n foo persist vnet
  	jexec test ifconfig tun create
  	jexec test ifconfig tun0 name wg0
  	jexec test ifconfig wg0 destroy

  PR:		235704
  Reviewed by:	bz, hrs, hselasky
  MFC after:	2 weeks
  Differential Revision:	https://reviews.freebsd.org/D19248

Changes:
  head/sys/net/if_tun.c
Comment 12 commit-hook freebsd_committer freebsd_triage 2019-03-05 15:50:00 UTC
A commit references this bug:

Author: kp
Date: Tue Mar  5 15:49:31 UTC 2019
New revision: 344797
URL: https://svnweb.freebsd.org/changeset/base/344797

Log:
  tun tests: Test renaming and destroying a tun interface in a vnet jail

  There was a problem destroying renamed tun interfaces in vnet jails. This was
  fixed in r344794. Test the previously failing scenario.

  PR:		235704
  MFC after:	2 weeks

Changes:
  head/tests/sys/net/Makefile
  head/tests/sys/net/if_tun_test.sh
Comment 13 commit-hook freebsd_committer freebsd_triage 2019-03-19 00:27:52 UTC
A commit references this bug:

Author: kp
Date: Tue Mar 19 00:27:46 UTC 2019
New revision: 345285
URL: https://svnweb.freebsd.org/changeset/base/345285

Log:
  MFC r344794:

  tun: VIMAGE fix for if_tun cloner

  The if_tun cloner is not virtualised, but if_clone_attach() does use a
  virtualised list of cloners.
  The result is that we can't find the if_tun cloner when we try to remove
  a renamed tun interface. Virtualise the cloner, and move the final
  cleanup into a sysuninit so that we're sure this happens after all of
  the vnet_sysuninits

  Note that we need unit numbers to be system-unique (rather than unique
  per vnet, as is done by if_clone_simple()). The unit number is used to
  create the corresponding /dev/tunX device node, and this node must match
  with the interface.
  Switch to if_clone_advanced() so that we have control over the unit
  numbers.

  Reproduction scenario:
  	jail -c -n foo persist vnet
  	jexec test ifconfig tun create
  	jexec test ifconfig tun0 name wg0
  	jexec test ifconfig wg0 destroy

  PR:		235704
  Reviewed by:	bz, hrs, hselasky
  Differential Revision:	https://reviews.freebsd.org/D19248

Changes:
_U  stable/12/
  stable/12/sys/net/if_tun.c
Comment 14 commit-hook freebsd_committer freebsd_triage 2019-03-19 00:28:57 UTC
A commit references this bug:

Author: kp
Date: Tue Mar 19 00:27:48 UTC 2019
New revision: 345286
URL: https://svnweb.freebsd.org/changeset/base/345286

Log:
  MFC r344794:

  tun: VIMAGE fix for if_tun cloner

  The if_tun cloner is not virtualised, but if_clone_attach() does use a
  virtualised list of cloners.
  The result is that we can't find the if_tun cloner when we try to remove
  a renamed tun interface. Virtualise the cloner, and move the final
  cleanup into a sysuninit so that we're sure this happens after all of
  the vnet_sysuninits

  Note that we need unit numbers to be system-unique (rather than unique
  per vnet, as is done by if_clone_simple()). The unit number is used to
  create the corresponding /dev/tunX device node, and this node must match
  with the interface.
  Switch to if_clone_advanced() so that we have control over the unit
  numbers.

  Reproduction scenario:
  	jail -c -n foo persist vnet
  	jexec test ifconfig tun create
  	jexec test ifconfig tun0 name wg0
  	jexec test ifconfig wg0 destroy

  PR:		235704
  Reviewed by:	bz, hrs, hselasky
  Differential Revision:	https://reviews.freebsd.org/D19248

Changes:
_U  stable/11/
  stable/11/sys/net/if_tun.c
Comment 15 commit-hook freebsd_committer freebsd_triage 2019-03-19 00:30:01 UTC
A commit references this bug:

Author: kp
Date: Tue Mar 19 00:29:18 UTC 2019
New revision: 345287
URL: https://svnweb.freebsd.org/changeset/base/345287

Log:
  MFC r344797:

  tun tests: Test renaming and destroying a tun interface in a vnet jail

  There was a problem destroying renamed tun interfaces in vnet jails. This was
  fixed in r344794. Test the previously failing scenario.

  PR:		235704

Changes:
_U  stable/12/
  stable/12/tests/sys/net/Makefile
  stable/12/tests/sys/net/if_tun_test.sh