Bug 205156 - [Hyper-V] NICs' (hn0, hn1) MAC addresses can appear in an uncertain way across reboot
Summary: [Hyper-V] NICs' (hn0, hn1) MAC addresses can appear in an uncertain way acros...
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: amd64 Any
: --- Affects Some People
Assignee: Xin LI
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-12-09 12:29 UTC by Dexuan Cui
Modified: 2016-05-09 06:16 UTC (History)
4 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Dexuan Cui 2015-12-09 12:29:37 UTC
FreeBSD VMs  of 10.2 and the head with multiple NICs (e.g., 2 NICs) have such an issue: 

occasionally hn0 and hn1 can swap their MAC address!

e.g., normally we have this

[root@decui-bsd102 ~/bsd.git/sys]# ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
        inet 127.0.0.1 netmask 0xff000000
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
hn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=31b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,TSO6>
        ether 00:15:5d:4c:a1:06
        inet6 fe80::215:5dff:fe4c:a106%hn0 prefixlen 64 scopeid 0x2
        inet 10.156.76.89 netmask 0xfffffc00 broadcast 10.156.79.255
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
hn1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=31b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,TSO6>
        ether 00:15:5d:4c:a1:0f
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

But sometimes, after a VM reboot, hn0's MAC can become 00:15:5d:4c:a1:0f and hn1's can become 00:15:5d:4c:a1:06.

This causes confusion to users.

This issue was discussed in http://www.cluster21.comdex.phpw.opensubscriber.com/message/freebsd-net@freebsd.org/17554434.html

I thought the host could enumerate the NIC devices to the VM in an unpredictable way, but I was wrong: the fact is that: if the VM's devices in the Hyper-V Manager's Setting remain the same, the order in which the host enumerates the devices should be the same.

Later I found a race condition in the VM's VMBus driver code and as a result, the "second" 
 NIC device can be firstly registered, as hn0.

I'm making a patch for this.
Comment 1 Xin LI freebsd_committer freebsd_triage 2015-12-29 07:55:22 UTC
Patched in head/ as r292859.  MFC reminder.
Comment 2 commit-hook freebsd_committer freebsd_triage 2015-12-29 07:55:56 UTC
A commit references this bug:

Author: delphij
Date: Tue Dec 29 07:54:56 UTC 2015
New revision: 292859
URL: https://svnweb.freebsd.org/changeset/base/292859

Log:
  hyperv: vmbus: remove the per-channel control_work_queue

  Now vmbus_channel_on_offer() -> vmbus_channel_process_offer() can
  safely run on the global hv_vmbus_g_connection.work_queue now.

  We remove the per-channel control_work_queue to achieve the proper
  serialization of the message handling.

  I removed the bogus TODO in vmbus_channel_on_offer(): a vmbus offer
  can only come from the parent partition, i.e., the host.

  PR:		kern/205156
  Submitted by:	Dexuan Cui <decui microsoft com>
  Reviewed by:	Howard Su <howard0su gmail com>, delphij
  MFC after:	2 weeks
  Differential Revision:	https://reviews.freebsd.org/D4597

Changes:
  head/sys/dev/hyperv/include/hyperv.h
  head/sys/dev/hyperv/vmbus/hv_channel_mgmt.c
Comment 3 Dexuan Cui 2015-12-29 08:14:11 UTC
To fix this bug, we need
https://reviews.freebsd.org/D4596
and
https://reviews.freebsd.org/D4597
Comment 4 Xin LI freebsd_committer freebsd_triage 2015-12-29 08:21:34 UTC
(In reply to Dexuan Cui from comment #3)
Ah sorry for that.  I have committed D4596 as well.
Comment 5 Dexuan Cui 2015-12-29 08:26:47 UTC
Thanks a lot, Xin! I saw the other revision:

r292861 | delphij | 2015-12-29 16:19:43 +0800 (Tue, 29 Dec 2015) | 35 lines

hyperv: vmbus: run non-blocking message handlers in vmbus_msg_swintr()

We'll remove the per-channel control_work_queue because it can't properly
do serialization of message handling, e.g., when there are 2 NIC devices,
vmbus_channel_on_offer() -> hv_queue_work_item() has a race condition:
for an SMP VM, vmbus_channel_process_offer() can run concurrently on
different CPUs and if the second NIC's
vmbus_channel_process_offer() -> hv_vmbus_child_device_register() runs
first, the second NIC's name will be hn0 and the first NIC's name will
be hn1!

We can fix the race condition by removing the per-channel control_work_queue
and run all the message handlers in the global
hv_vmbus_g_connection.work_queue -- we'll do this in the next patch.

With the coming next patch, we have to run the non-blocking handlers
directly in the kernel thread vmbus_msg_swintr(), because the special
handling of sub-channel: when a sub-channel (e.g., of the storvsc driver)
is received and being handled in vmbus_channel_on_offer() running on the
global hv_vmbus_g_connection.work_queue, vmbus_channel_process_offer()
invokes channel->sc_creation_callback, i.e., storvsc_handle_sc_creation,
and the callback will invoke hv_vmbus_channel_open() -> hv_vmbus_post_message
and expect a further reply from the host, but the handling of the further
messag can't be done because the current message's handling hasn't finished
yet; as result, hv_vmbus_channel_open() -> sema_timedwait() will time out
and th device can't work.

Also renamed the handler type from hv_pfn_channel_msg_handler to
vmbus_msg_handler: the 'pfn' and 'channel' in the old name make no sense.

Submitted by:   Dexuan Cui <decui microsoft com>
Reviewed by:    royger
MFC after:      2 weeks
Differential Revision:  https://reviews.freebsd.org/D4596
Comment 6 commit-hook freebsd_committer freebsd_triage 2016-01-13 08:23:20 UTC
A commit references this bug:

Author: delphij
Date: Wed Jan 13 08:22:54 UTC 2016
New revision: 293820
URL: https://svnweb.freebsd.org/changeset/base/293820

Log:
  MFC r292861:

  hyperv: vmbus: run non-blocking message handlers in vmbus_msg_swintr()

  We'll remove the per-channel control_work_queue because it can't properly
  do serialization of message handling, e.g., when there are 2 NIC devices,
  vmbus_channel_on_offer() -> hv_queue_work_item() has a race condition:
  for an SMP VM, vmbus_channel_process_offer() can run concurrently on
  different CPUs and if the second NIC's
  vmbus_channel_process_offer() -> hv_vmbus_child_device_register() runs
  first, the second NIC's name will be hn0 and the first NIC's name will
  be hn1!

  We can fix the race condition by removing the per-channel control_work_queue
  and run all the message handlers in the global
  hv_vmbus_g_connection.work_queue -- we'll do this in the next patch.

  With the coming next patch, we have to run the non-blocking handlers
  directly in the kernel thread vmbus_msg_swintr(), because the special
  handling of sub-channel: when a sub-channel (e.g., of the storvsc driver)
  is received and being handled in vmbus_channel_on_offer() running on the
  global hv_vmbus_g_connection.work_queue, vmbus_channel_process_offer()
  invokes channel->sc_creation_callback, i.e., storvsc_handle_sc_creation,
  and the callback will invoke hv_vmbus_channel_open() -> hv_vmbus_post_message
  and expect a further reply from the host, but the handling of the further
  messag can't be done because the current message's handling hasn't finished
  yet; as result, hv_vmbus_channel_open() -> sema_timedwait() will time out
  and th device can't work.

  Also renamed the handler type from hv_pfn_channel_msg_handler to
  vmbus_msg_handler: the 'pfn' and 'channel' in the old name make no sense.

  Submitted by: Dexuan Cui <decui microsoft com>
  Reviewed by:  royger
  Differential Revision:        https://reviews.freebsd.org/D4596

  MFC r292859:

  hyperv: vmbus: remove the per-channel control_work_queue

  Now vmbus_channel_on_offer() -> vmbus_channel_process_offer() can
  safely run on the global hv_vmbus_g_connection.work_queue now.

  We remove the per-channel control_work_queue to achieve the proper
  serialization of the message handling.

  I removed the bogus TODO in vmbus_channel_on_offer(): a vmbus offer
  can only come from the parent partition, i.e., the host.

  PR:           kern/205156
  Submitted by: Dexuan Cui <decui microsoft com>
  Reviewed by:  Howard Su <howard0su gmail com>, delphij
  Differential Revision:        https://reviews.freebsd.org/D4597

Changes:
_U  stable/10/
  stable/10/sys/dev/hyperv/include/hyperv.h
  stable/10/sys/dev/hyperv/vmbus/hv_channel_mgmt.c
  stable/10/sys/dev/hyperv/vmbus/hv_vmbus_drv_freebsd.c
  stable/10/sys/dev/hyperv/vmbus/hv_vmbus_priv.h
Comment 7 Dexuan Cui 2016-03-23 12:43:13 UTC
The issue has been fixed in 10.3 and 11-CURRENT.
I don't think we want to make errata for 10.1 and 10.2.
So, I suppose we can close the bug.