Bug 185619 - [VNET] Name conflict not checked when a child vnet goes away and returns its interface(s) back to the parent
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: Unspecified
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-01-09 22:30 UTC by Eugene M. Kim
Modified: 2024-02-22 10:09 UTC
Description Eugene M. Kim 2014-01-09 22:30:00 UTC
Each vnet has its own namespace for network interfaces.  As a result, two network interfaces may have the same name if they belong to distinct vnets.

When one of these interfaces tries to move into the other's vnet, the name conflict should - and does - block the operation, except in one case: When a child vnet goes away and returns its interfaces to its parent vnet, the name conflict is not checked and the parent vnet ends up having both interfaces of the same name.  This confuses various tools such as ifconfig(8).
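
The asymmetry described above can be illustrated with a toy model. This is only a sketch: `Vnet`, `move`, `destroy` and `NameConflict` are hypothetical names made up for illustration and do not correspond to kernel functions. It shows an explicit move that checks for a name conflict, and a return path on vnet destruction that does not.

```python
class NameConflict(Exception):
    """Raised by the toy model when a name already exists in the target vnet."""


class Vnet:
    def __init__(self, parent=None):
        self.parent = parent
        self.ifaces = []  # each interface is a dict with "name" and "ether"

    def names(self):
        return [iface["name"] for iface in self.ifaces]


def move(iface, src, dst):
    """Explicit move between vnets: a name conflict blocks the operation."""
    if iface["name"] in dst.names():
        raise NameConflict(iface["name"])
    src.ifaces.remove(iface)
    dst.ifaces.append(iface)


def destroy(child):
    """Child vnet goes away: interfaces return to the parent unchecked."""
    child.parent.ifaces.extend(child.ifaces)
    child.ifaces.clear()


parent = Vnet()
child = Vnet(parent)
a = {"name": "epair0a", "ether": "02:ff:40:00:04:0a"}
b = {"name": "epair0b", "ether": "02:ff:90:00:05:0b"}
parent.ifaces += [a, b]

a["name"] = "jnet"      # renamed in the parent vnet
move(b, parent, child)  # no conflict yet: "epair0b" is not in the child
b["name"] = "jnet"      # renamed inside the child vnet
destroy(child)          # returns to the parent without a conflict check
print(parent.names())   # the parent now holds two interfaces named "jnet"
```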

Fix: 

One of the following would fix the problem (among other approaches I cannot think of):

Option 1: Give the returned interface a random, unique name.

Option 2: When injecting an interface into a child vnet, leave a "shadow" of its name in the parent vnet.  Don't let other interfaces in the parent vnet take the shadowed name, and give the shadowed name to the moved interface when it returns from the child vnet.

Option 3: Block destruction of a vnet if doing so would cause a name conflict in the parent vnet.

Option 3 opens a bigger problem and is probably impractical: the block would have to cascade to the triggering event, such as jail destruction, and be handled there, and blocking jail destruction is probably a bad idea.

Option 1 is simpler, but the resulting behavior is random/nondeterministic and makes interface tracking harder.

Option 2 is more predictable and deterministic, at the cost of more complex implementation.  And it doesn't cover the case of pseudo-interfaces created locally inside a vnet, because such interfaces have no shadowed name in the parent vnet; falling back to option 1 would be one way to solve this.
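
As a rough sketch of how Option 2 with a fallback to Option 1 could behave, the following toy model keeps a set of shadowed names in the parent vnet. Everything here is an assumption made for illustration: `ParentVnet`, `lend`, `take_back` and the `if` + random-hex fallback name are hypothetical, and no locking or real kernel data structure is modelled.

```python
import secrets


class ParentVnet:
    """Toy model of Option 2; all names here are hypothetical."""

    def __init__(self):
        self.names = set()    # names of interfaces currently in the parent
        self.shadows = set()  # names reserved for interfaces lent to a child

    def is_taken(self, name):
        return name in self.names or name in self.shadows

    def add(self, name):
        if self.is_taken(name):
            raise ValueError(f"{name}: name in use or shadowed")
        self.names.add(name)

    def rename(self, old, new):
        if self.is_taken(new):
            raise ValueError(f"{new}: name in use or shadowed")
        self.names.remove(old)
        self.names.add(new)

    def lend(self, name):
        """Inject an interface into a child vnet, leaving a shadow behind."""
        self.names.remove(name)
        self.shadows.add(name)
        return name  # the shadowed name travels with the interface

    def take_back(self, shadow=None):
        """Return an interface: restore its shadowed name, else use Option 1."""
        if shadow is not None and shadow in self.shadows:
            self.shadows.remove(shadow)
            name = shadow
        else:
            # Option 1 fallback: random name, retried until it is unique.
            name = "if" + secrets.token_hex(4)
            while self.is_taken(name):
                name = "if" + secrets.token_hex(4)
        self.names.add(name)
        return name


parent = ParentVnet()
parent.add("epair0a")
parent.add("epair0b")
shadow = parent.lend("epair0b")    # moved to a child, later renamed there
parent.rename("epair0a", "jnet")   # allowed: "jnet" is not shadowed
print(parent.take_back(shadow))    # comes back under its shadowed name
print(parent.take_back())          # no shadow: gets a random unique name
```

Whatever the child renamed the interface to no longer matters on return, which is what makes this variant predictable; only interfaces without a shadow end up with an arbitrary name.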
How-To-Repeat: The first scenario shown below renames two epair(4) interfaces as "jnet" (one renamed in a parent vnet, another renamed in a child vnet), then destroys the child vnet to bring its jnet interface back to the parent.  ifconfig(8) output merges these two interfaces into one block (shown by two MAC addresses).

root@hydrogen:~ # jail -c name=test vnet persist
root@hydrogen:~ # ifconfig epair create
epair0a
root@hydrogen:~ # ifconfig epair0a
epair0a: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8<VLAN_MTU>
	ether 02:ff:40:00:04:0a
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
	status: active
root@hydrogen:~ # ifconfig epair0b
epair0b: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8<VLAN_MTU>
	ether 02:ff:90:00:05:0b
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
	status: active
root@hydrogen:~ # ifconfig epair0a name jnet
root@hydrogen:~ # ifconfig epair0b vnet test
root@hydrogen:~ # jexec test ifconfig epair0b name jnet
root@hydrogen:~ # jail -r test
root@hydrogen:~ # ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
	inet6 ::1 prefixlen 128 
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3 
	inet 127.0.0.1 netmask 0xff000000 
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
jnet: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8<VLAN_MTU>
	ether 02:ff:40:00:04:0a
	ether 02:ff:90:00:05:0b
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
	status: active
root@hydrogen:~ # ifconfig jnet destroy
root@hydrogen:~ # 

The second scenario shown below creates two vnets and two epair(4) pairs (one pair for each vnet), injects the "b" end of each pair into the corresponding vnet, renames it as "jnet" there, and then destroys the two vnets, leaving the parent vnet with both jnet interfaces.  At the end, "ifconfig jnet destroy" can be run twice: the first command picks and destroys one of the two pairs, and the second destroys the other.

root@hydrogen:~ # ifconfig epair create
epair0a
root@hydrogen:~ # ifconfig epair create
epair1a
root@hydrogen:~ # jail -c name=test1 vnet persist
root@hydrogen:~ # jail -c name=test2 vnet persist
root@hydrogen:~ # ifconfig epair0b vnet test1
root@hydrogen:~ # jexec test1 ifconfig epair0b name jnet
root@hydrogen:~ # ifconfig epair1b vnet test2
root@hydrogen:~ # jexec test2 ifconfig epair1b name jnet
root@hydrogen:~ # jail -r test1
root@hydrogen:~ # jail -r test2
root@hydrogen:~ # ifconfig 
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
	ether 74:d0:2b:13:66:fc
	inet 10.0.0.11 netmask 0xffffff00 broadcast 10.0.0.255 
	inet6 fe80::76d0:2bff:fe13:66fc%em0 prefixlen 64 scopeid 0x1 
	inet6 2001:470:1f05:155:76d0:2bff:fe13:66fc prefixlen 64 autoconf 
	inet6 2002:43bc:72e6:1:76d0:2bff:fe13:66fc prefixlen 64 autoconf 
	nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
em1: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
	ether 74:d0:2b:13:6b:43
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: no carrier
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
	inet6 ::1 prefixlen 128 
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3 
	inet 127.0.0.1 netmask 0xff000000 
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
epair0a: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8<VLAN_MTU>
	ether 02:ff:40:00:04:0a
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
	status: active
epair1a: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8<VLAN_MTU>
	ether 02:ff:40:00:06:0a
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
	status: active
jnet: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8<VLAN_MTU>
	ether 02:ff:90:00:05:0b
	ether 02:ff:90:00:07:0b
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
	status: active
root@hydrogen:~ # ifconfig jnet destroy
root@hydrogen:~ # ifconfig jnet destroy
root@hydrogen:~ #
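
In both transcripts the symptom is a single ifconfig(8) block carrying two "ether" lines. A small helper along the following lines could flag such blocks in captured output. It is a sketch under assumptions: `merged_interfaces` is a made-up name, the parsing relies only on the layout visible in the transcripts above (an unindented "name:" line followed by indented detail lines), and a block with more than one "ether" line is treated purely as the symptom seen in this report.

```python
def merged_interfaces(ifconfig_output):
    """Map interface name -> ether addresses for blocks listing more than one."""
    ethers = {}
    current = None
    for line in ifconfig_output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented line starts a new interface block: "name: flags=..."
            current = line.split(":", 1)[0]
            ethers.setdefault(current, [])
        elif current is not None:
            fields = line.split()
            if fields[0] == "ether":
                ethers[current].append(fields[1])
    return {name: macs for name, macs in ethers.items() if len(macs) > 1}


sample = (
    "lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384\n"
    "\tinet 127.0.0.1 netmask 0xff000000\n"
    "jnet: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500\n"
    "\toptions=8<VLAN_MTU>\n"
    "\tether 02:ff:40:00:04:0a\n"
    "\tether 02:ff:90:00:05:0b\n"
)
print(merged_interfaces(sample))
```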
Comment 1 Edoardo Spadolini 2017-01-19 12:22:42 UTC
Stumbled upon this on 11.0-RELEASE-p7; my experience matches the first scenario in the how-to-repeat perfectly.
Comment 2 Eitan Adler freebsd_committer freebsd_triage 2018-05-20 23:49:58 UTC
For bugs matching the following conditions:
- Status == In Progress
- Assignee == "bugs@FreeBSD.org"
- Last Modified Year <= 2017

Do
- Set Status to "Open"
Comment 3 Thomas Steen Rasmussen / Tykling 2022-01-06 19:33:05 UTC
This seems like it might be the same issue as https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260973
Comment 4 Zhenlei Huang freebsd_committer freebsd_triage 2023-01-18 10:05:20 UTC
(In reply to Eugene M. Kim from comment #0)
> Option 2 is more predictable and deterministic, at the cost of more complex 
> implementation.  And it doesn't cover the case of pseudo-interfaces created locally 
> inside a vnet, because such interfaces have no shadowed name in the parent vnet; 
> falling back to option 1 would be one way to solve this.

For pseudo-interfaces created locally inside a vnet (n), their `home-vnet` is that same vnet (n), so they are destroyed when the vnet is destroyed and no name conflict arises.

I think that is a design defect, as FreeBSD assumes an interface name is unique within one vnet (namespace) while also allowing interfaces to be renamed.

A combination of Option 1 and Option 2:
> Option 1: Give the returned interface a random, unique name.
> Option 2: When injecting an interface into a child vnet, leave a "shadow" of its
> name in the parent vnet.  Don't let other interfaces in the parent vnet take the 
> shadowed name, and give the shadowed name to the moved interface when it returns 
> from the child vnet.

I think we could give each interface (physical or cloned) a globally unique, unchangeable name (called xname) on creation, and redefine the current name as an alias whose uniqueness is guaranteed only within its vnet (namespace). (In practice an interface may have multiple aliases.)

On vnet destroy an interface returns to its home-vnet; if the alias name conflicts, we can remove the alias.
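
A minimal sketch of the xname/alias idea, as a toy model only. The class and method names (`Iface`, `Vnet`, `set_alias`, `adopt`) are hypothetical, the xname format is simplified (no a/b suffix as real epair(4) ends have), and each interface carries at most one alias here to keep the example short.

```python
import itertools

_unit = itertools.count()  # global unit counter shared by all vnets


class Iface:
    def __init__(self, kind):
        self.xname = f"{kind}{next(_unit)}"  # globally unique, never changes
        self.alias = None                    # per-vnet, can be renamed


class Vnet:
    def __init__(self):
        self.ifaces = []

    def lookup(self, name):
        """Find interfaces in this vnet by xname or alias."""
        return [i for i in self.ifaces if name in (i.xname, i.alias)]

    def set_alias(self, iface, alias):
        if any(other is not iface for other in self.lookup(alias)):
            raise ValueError(f"{alias}: already used in this vnet")
        iface.alias = alias

    def adopt(self, iface):
        """Receive an interface, e.g. when its child vnet is destroyed."""
        if iface.alias is not None and self.lookup(iface.alias):
            iface.alias = None  # conflict: drop the alias, keep the xname
        self.ifaces.append(iface)


home, child = Vnet(), Vnet()
a, b = Iface("epair"), Iface("epair")
home.ifaces.append(a)
child.ifaces.append(b)
home.set_alias(a, "jnet")
child.set_alias(b, "jnet")   # fine: a different namespace

child.ifaces.remove(b)       # child vnet is destroyed
home.adopt(b)                # alias "jnet" conflicts, so it is dropped
print([(i.xname, i.alias) for i in home.ifaces])
```

After the return, "jnet" still resolves to exactly one interface in the home vnet, and the returned interface remains reachable through its xname.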

That may require KPI/ABI changes.

Linux has a similar mechanism called `altname`, see https://lwn.net/Articles/794289/

CC @kp and @melifaro
Comment 5 Kristof Provost freebsd_committer freebsd_triage 2023-01-18 21:36:16 UTC
(In reply to Zhenlei Huang from comment #4)
I currently don't have any strong opinions on the best path to take here.

My initial thought was to, on return-to-home-vnet, check for name conflicts and to rename if there was one. That's somewhat unpredictable though.

On the other hand, tracking globally unique names risks significant complexity (because some interfaces are created in a vnet, i.e. not all interfaces have vnet0 as their home vnet), and also risks leaking information between vnets (i.e. vnet1 creates an epair interface, and now knows there are 5 other epairs on the system, because it got epair6a/b). That's probably not hugely important though.

I will point out that I recall looking at related issues and discovering that the locking and error handling around interface renaming is either beyond me or just plain incorrect.
Comment 6 c433li 2024-02-21 22:08:51 UTC
Just encountered this issue on 14.0-RELEASE.

> Option 1: Give the returned interface a random, unique name.

Since jids are never recycled, does this approach really have to be non-deterministic? I mean, it seems to me that we could make up some sort of convention that interfaces recycled from a destroyed vnet be given a special name such as `<if_type>_recycle_<jid>`, and for hierarchical jails we can append the nested jids to it, such as `<if_type>_recycle_<jid>_<nested_jid>`.

It is still possible to have a naming conflict if the user insists on renaming their interface to one of these "special" names, but this approach can eliminate the majority of these conflicts without architectural changes.
Comment 7 Kristof Provost freebsd_committer freebsd_triage 2024-02-22 10:09:41 UTC
(In reply to c433li from comment #6)
IFNAMSIZ is 16, which leaves no more than 15 characters for an interface name, so renaming the interface is also fraught. It also requires an additional check for conflicts after the rename, which in turn requires correct lock handling (which is currently absent). This doesn't actually avoid the problem it tries to avoid.
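
To make the length constraint concrete, here is a small worked example of the proposed `<if_type>_recycle_<jid>` convention against the 15-character limit. It is only a sketch: `recycle_name` is a hypothetical helper, and using "epair" and "lo" as the `<if_type>` values is an assumption, since the proposal does not define what `<if_type>` expands to.

```python
IFNAMSIZ = 16                # value quoted in the comment above
MAX_NAME_LEN = IFNAMSIZ - 1  # one byte goes to the terminating NUL


def recycle_name(if_type, *jids):
    """Build '<if_type>_recycle_<jid>[_<nested_jid>...]' (proposed convention)."""
    return "_".join([if_type, "recycle", *map(str, jids)])


def fits(name):
    return len(name) <= MAX_NAME_LEN


for name in (
    recycle_name("epair", 1),   # 15 characters
    recycle_name("epair", 12),  # 16 characters
    recycle_name("lo", 3, 7),   # 14 characters, nested jail
):
    print(f"{name!r}: {len(name)} chars, {'fits' if fits(name) else 'too long'}")
```

With an "epair" prefix the fixed part `epair_recycle_` already uses 14 characters, so only single-digit jids fit.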