Bug 229957

Summary: [epair] MAC addresses all the same, no randomness
Product: Base System
Reporter: O. Hartmann <ohartmann>
Component: kern
Assignee: freebsd-net (Nobody) <net>
Status: Closed FIXED
Severity: Affects Many People
CC: eugen, ohartmann, pizzamig, sascha.folie, wollman
Priority: ---
Keywords: ipfilter, regression, vimage
Version: CURRENT
Flags: eugen: mfc-stable11-
eugen: mfc-stable10-
Hardware: Any
OS: Any
See Also: https://reviews.freebsd.org/D15329
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=176671
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=184149
Attachments:
Description Flags
proposed fix for if_epair.c none

Description O. Hartmann 2018-07-22 10:05:14 UTC
On recent CURRENT (FreeBSD 12.0-CURRENT #248 r336596: Sun Jul 22 09:31:53 CEST 2018 amd64), running vnet jails (VIMAGE kernel) that each own a private epair pseudo NIC, with the external end of each epair being a member of a bridge(4), results in weird behaviour: CURRENT assigns the very same MAC address to ALL external ends of the epairs (the b-part in my case)! Each vnet jail then gets the corresponding internal end of its epair (the a-part in my case), but since I group the external ends on bridge(4) devices, the MAC addresses collide, and this results in very unpredictable, weird behaviour on FreeBSD (nothing is reported to the console/log/kernel so far).

This is my ifconfig result after creating a bunch of epairs:


epair3b: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:48:e2:b0:8c:0b
        groups: epair 
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
epair52b: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:48:e2:b0:8c:0b
        groups: epair 
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
epair10013b: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:48:e2:b0:8c:0b
        groups: epair 
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
epair17b: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:48:e2:b0:8c:0b
        groups: epair 
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
epair10015b: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:48:e2:b0:8c:0b
        groups: epair 
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active


One of the jails:

root@ns01:~ # ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet 127.0.0.1 netmask 0xff000000 
        groups: lo 
enc0: flags=0<> metric 0 mtu 1536
        groups: enc 
epair3a: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:48:e2:b0:8c:0a
        inet 192.168.0.3 netmask 0xffffff00 broadcast 192.168.0.255 
        groups: epair 
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active

And another jail on the very same bridge pseudo device:

root@db01:~ # ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet 127.0.0.1 netmask 0xff000000 
        groups: lo 
enc0: flags=0<> metric 0 mtu 1536
        groups: enc 
epair52a: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:48:e2:b0:8c:0a
        inet 192.168.0.52 netmask 0xffffff00 broadcast 192.168.0.255 
        groups: epair 
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active


There was an earlier issue with broken epair code on CURRENT (and prior) which was supposedly "fixed" some time ago, but it seems the "fix" broke more than it fixed.

I tried to solve the problem by manually assigning MAC addresses via the ifconfig "ether" option to guarantee different MACs in each collision domain, but now there is weird behaviour when using IPFW. While my setup, including the workaround with the manually set "ether" on each epair, works perfectly on 11.2-RELENG, I have trouble on CURRENT: despite an OPEN firewall (ipfw) setting in each jail, pinging the host that owns the physical NIC (which is a member of the bridge) from any jail that is also a member of the bridge does not work until the host has first pinged that jail's address. I cannot say whether this is an IPFW issue or is also related to the broken epair handling.
Comment 1 Eugene Grosbein freebsd_committer freebsd_triage 2018-07-22 17:03:03 UTC
Please describe how you create your jail and, more importantly, how you create epairs and move their parts to distinct VNETs. The order is important, too.

After https://svnweb.freebsd.org/base?view=revision&revision=334094 there may be no randomness in assigning the MAC address to an epair if the current hostid is not zero while the epair is being created. Instead, the MAC address is determined by the hostid and the interface index. If you create a single epair within some VNET and it gets some interface index and MAC, but you then move it to another VNET and create the next epair, the new one may get the same interface index (?) and the same hostid, and therefore the same MAC address.
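
The collision described above can be modelled with a small sketch. It assumes only what this comment states, namely that the MAC is a deterministic function of hostid and interface index. The function name derive_mac is hypothetical, and cksum(1) is merely a stand-in for whatever hash if_epair.c really uses.

```sh
#!/bin/sh
# Toy model, NOT the kernel algorithm: derive a MAC address from
# (hostid, ifindex) with a deterministic checksum. cksum(1) is only a
# stand-in for the real hash.
# Arguments: hostid, interface index, last octet ("0a" or "0b").
derive_mac() {
    hostid=$1
    ifindex=$2
    side=$3
    # 32-bit checksum over "hostid:ifindex"
    sum=$(printf '%s:%s' "$hostid" "$ifindex" | cksum | cut -d ' ' -f 1)
    # 02 marks a locally administered address; the four middle octets
    # come from the checksum, the last octet names the epair side.
    printf '02:%02x:%02x:%02x:%02x:%s\n' \
        $(( (sum >> 24) & 255 )) $(( (sum >> 16) & 255 )) \
        $(( (sum >> 8) & 255 )) $(( sum & 255 )) "$side"
}
```

Calling derive_mac 12345678 7 0b twice prints the same address both times. That is the situation that would arise if a second epair were created with the same hostid after the interface index of the first one had become free again.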
Comment 2 Eugene Grosbein freebsd_committer freebsd_triage 2018-07-22 17:39:16 UTC
Assuming you create epairs in the "host" environment (not within some jail), please temporarily wrap your epair creation procedure as follows to verify whether the mentioned change is really the root of the problem:

savedid=$(sysctl -n kern.hostid)
sysctl kern.hostid=0
<insert your epair creation here>
sysctl kern.hostid=$savedid

This way, epair(4) should add some randomness to the created MAC due to the zero hostid. Please report whether this changes the results.
Comment 3 Eugene Grosbein freebsd_committer freebsd_triage 2018-07-22 18:06:25 UTC
Created attachment 195377 [details]
proposed fix for if_epair.c

Please test the proposed patch. Apply it to /usr/src, then rebuild the kernel, or just the if_epair.ko module if you use it.

The patch makes sure that if_index is not re-used to generate MAC in similar cases.
Comment 4 O. Hartmann 2018-07-22 18:37:20 UTC
(In reply to Eugene Grosbein from comment #1)

epair(4) interfaces are created in /etc/jail.conf within the "common" portion of the config file via

[...]
exec.prestart=          "";
exec.prestart+=         "ifconfig ${if_vnet} create";
exec.prestart+=         "ifconfig ${if_vnet}a ether ${epair_ether_base}:0a";
exec.prestart+=         "ifconfig ${if_vnet}b ether ${epair_ether_base}:0b";
exec.prestart+=         "ifconfig ${if_vnet}b up";
exec.prestart+=         "ifconfig ${if_home_bridge} addm ${if_vnet}b up";
[...]

Each jail definition sets a number of variables, among them ${if_vnet}, which holds a literal of the form "epairXXX"; the prestart section then appends a and b.

${epair_ether_base} is set to something "hand-randomised", meaning that I try to give each epair a distinct MAC after I ran into these problems. So, to answer your question: epairs are created as if walking through a for-loop, creating each epair, putting its a-part into the vnet jail and then going on with the next one. I find this much more convenient than creating all necessary epairs at once and moving them into the vnets afterwards.
Comment 5 O. Hartmann 2018-07-22 19:18:29 UTC
(In reply to Eugene Grosbein from comment #2)

Indeed, if sysctl kern.hostid=0 is set, MAC randomisation is perfect. But offhand I do not see any way to expand sysctl -n kern.hostid within /etc/jail.conf so that I can keep the value in a variable and restore the hostid as suggested.
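
One possible way around this, sketched here as an assumption rather than a tested recipe, is to move the save/zero/restore sequence from comment #2 into a small helper and call that from exec.prestart instead of calling ifconfig directly. The function name create_epair_zero_hostid is hypothetical; only sysctl kern.hostid and "ifconfig ... create" come from this PR. Note that the hostid is changed system-wide while the function runs.

```sh
#!/bin/sh
# Hypothetical helper: create one epair while kern.hostid is
# temporarily zero, then restore the previous hostid.
# Argument: the epair name without the a/b suffix, e.g. epair3.
create_epair_zero_hostid() {
    ifname=$1
    # Remember the current hostid so it can be restored afterwards.
    savedid=$(sysctl -n kern.hostid) || return 1
    sysctl kern.hostid=0 >/dev/null || return 1
    ifconfig "$ifname" create
    rc=$?
    # Restore the hostid even if the epair could not be created.
    sysctl kern.hostid="$savedid" >/dev/null
    return $rc
}
```

A script containing this function and ending in a call such as create_epair_zero_hostid "$1" could then replace the "ifconfig ${if_vnet} create" line in exec.prestart, with ${if_vnet} passed as its argument.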
Comment 6 Eugene Grosbein freebsd_committer freebsd_triage 2018-07-22 19:23:27 UTC
(In reply to O. Hartmann from comment #5)

You do not need to play with zero hostid anymore, as we have determined the real source of the problem. Instead, try the attached patch.
Comment 7 O. Hartmann 2018-07-22 19:54:13 UTC
Applied your patch as suggested:

FreeBSD 12.0-CURRENT #254 r336614M: Sun Jul 22 21:33:36 CEST 2018 amd64

It seems that all epairs created via jail.conf as described in this PR, now with the patch applied and without the additional "ether" option of ifconfig(8), have sufficiently randomised MACs:

[...]

 epair3b: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:48:e2:b0:8c:0b
        groups: epair 
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
epair52b: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:82:ad:29:54:0b
        groups: epair 
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
epair10013b: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:21:ef:9a:64:0b
        groups: epair 
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
epair17b: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:2e:ef:e7:ab:0b
        groups: epair 
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
epair10015b: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:24:93:13:48:0b
        groups: epair 
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
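
A quick way to double-check such output for collisions is to filter the ether lines and print any address that occurs more than once. This is a minimal sketch: the function name find_duplicate_macs is hypothetical, and it assumes the ifconfig output format shown above.

```sh
#!/bin/sh
# Read ifconfig output on stdin and print every MAC address that
# appears on more than one "ether" line.
find_duplicate_macs() {
    awk '$1 == "ether" { print $2 }' | sort | uniq -d
}
```

On the host, "ifconfig | find_duplicate_macs" prints nothing when all MACs are distinct. Fed the listing from the original description, it would print 02:48:e2:b0:8c:0b once.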
Comment 8 Eugene Grosbein freebsd_committer freebsd_triage 2018-07-23 04:39:19 UTC
Adding pizzamig@ and wollman@, who discussed/prepared/committed the code in question, to the CC: list so they have a chance to take a look at this problem and the proposed fix.
Comment 9 commit-hook freebsd_committer freebsd_triage 2018-07-23 07:12:19 UTC
A commit references this bug:

Author: eugen
Date: Mon Jul 23 07:11:58 UTC 2018
New revision: 336628
URL: https://svnweb.freebsd.org/changeset/base/336628

Log:
  epair(4): make sure we do not duplicate MAC addresses
  in case of reused if_index.

  PR:		229957
  Tested by:	O. Hartmann <ohartmann@walstatt.org>
  Approved by:	avg (mentor)

Changes:
  head/sys/net/if_epair.c
Comment 10 Eugene Grosbein freebsd_committer freebsd_triage 2018-07-23 07:54:25 UTC
MFC is not applicable as the stable branches have different code.