Bug 207446

Summary: Hang bringing up vtnet(4) on >8 cpu GCE VMs
Product: Base System Reporter: Andy Carrel <wac>
Component: kernAssignee: Steven Hartland <smh>
Status: Closed FIXED    
Severity: Affects Only Me CC: gonzo, jonolson, smh, swills, wac
Priority: --- Keywords: patch
Version: 10.3-BETA2   
Hardware: amd64   
OS: Any   
Description Flags
patch to sys/dev/virtio/network/if_vtnetvar.h
patch to vtnet(4)
Fix vtnet hang with max_virtqueue_pairs > VTNET_MAX_QUEUE_PAIRS none

Description Andy Carrel 2016-02-23 21:30:45 UTC
Created attachment 167336 [details]
patch to sys/dev/virtio/network/if_vtnetvar.h

FreeBSD on Google Compute Engine currently requires "hw.vtnet.mq_disable=1" on virtual machines of more than 8 CPUs to avoid a hang when bringing up the vtnet interface.

I have prepared a patch which raises the VTNET_MAX_QUEUE_PAIRS to 64 which solves this hang on GCE VMs with 64 or fewer CPUs. 

With this patch, the release engineer should be able to remove the hw.vtnet.mq_disable line from the /boot/loader.conf that is placed in images intended for Google Compute Engine and users should see the benefits of multiqueue as a result.

The current limit of 8 is in CURRENT as well as earlier release branches (e.g. stable/10).
Comment 1 Glen Barber freebsd_committer 2016-02-23 21:47:42 UTC
Comment 2 Glen Barber freebsd_committer 2016-02-23 21:49:37 UTC
Fix assignment to correct person(s).
Comment 3 Bryan Venteicher freebsd_committer 2016-02-23 22:31:46 UTC
When I last looked at this, it appears to be a VirtIO specification violation in the GCE implementation: it assumes the number of Tx/Rx queues will be the same as the number of CPUS. I'm not sure what the correct escalation path is for GCE.
Comment 4 Jon Olson 2016-02-23 23:07:49 UTC
As far as I know we're compliance with the VIRTIO spec here. We do advertise max queues == to number of vCPUs, but there is not requirement that a guest configure/use all of them to take advantage of multiqueue support. From the (draft) spec change introducing multi-queue (edited for readability from https://github.com/rustyrussell/virtio-spec/commit/67023431c8796bc430ec0a79b15bab57e2e0f1f6): 

Only receiveq0, transmitq0 and controlq are used by default. To use more queues driver must negotiate the VIRTIO_NET_F_MQ feature; initialize up to `max_virtqueue_pairs` of each of transmit and receive queues; execute VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command specifying the number of the transmit and receive queues that is going to be used and wait until the device consumes the controlq buffer and acks this command.

This is what we do -- by default we service only rx0, tx0, and cq. Upon receiving VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET we will begin servicing the number of queues requested up to the value from the `max_virtqueue_pairs`. Larger values (and zero) should be NAK'd by the virtual device (and no change in the number of queue serviced will occur).

If a guest kernel does not wish to use more than, e.g., 8 tx/rx virtqueue pairs it need not configure more than the first eight and the control queue (always at index 2*max_queue_pairs + 1, per the spec).

As I said, I think we're in compliance with the spec here, but certainly if not I'll treat it as a bug.
Comment 5 Bryan Venteicher freebsd_committer 2016-02-24 15:28:14 UTC
(In reply to Jon Olson from comment #4)

The issue appears that GCE doesn't work when the guest driver requests fewer than device's advertised max number of queue pairs. The FreeBSD driver will request at most 8 queue pairs. The driver ends up hanging when calling into the host (IIRC, during the VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command).
Comment 6 Andy Carrel 2016-02-26 22:17:36 UTC
After further investigation it looks like the driver is accidentally using driver's vtnet_max_vq_pairs*2 + 1 for the control virtqueue instead of device's max_virtqueue_pairs*2 + 1.

I'm about to attach a patch to current which propagates the device's max_virtqueue_pairs number in order to make sure the control virtqueue winds up in the correct place per the virtio spec. "vt_device_max_vq_pairs"  The patch also exposes this as a read-only sysctl dev.vtnet.X.device_max_vq_pairs.

e.g. # sysctl -a | grep vq_pair
dev.vtnet.0.act_vq_pairs: 3
dev.vtnet.0.max_vq_pairs: 3
dev.vtnet.0.device_max_vq_pairs: 16

I've tested the patch successfully with a VM that supports 16 max_virtqueue_pairs with vtnet_max_vq_pairs at the default of 8, as well as hw.vtnet.mq_max_pairs=3, and with hw.vtnet.mq_disable=1.

It'd be nice to include the original patch that raises VTNET_MAX_QUEUE_PAIRS as well though since that should have some performance advantages on many cpu-ed VMs.
Comment 7 Andy Carrel 2016-02-26 22:22:30 UTC
Created attachment 167464 [details]
patch to vtnet(4)
Comment 8 Steven Hartland freebsd_committer 2016-06-10 01:05:23 UTC
Created attachment 171249 [details]
Fix vtnet hang with max_virtqueue_pairs > VTNET_MAX_QUEUE_PAIRS

While the analysis seems sound the patch to add vt_device_max_vq_pairs doesn't seem like the right thing to me.

I believe the block which limits the max queues in vtnet_setup_features should be used to limit npairs passed to vtnet_ctrl_mq_cmd, so something like the attached (untested).
Comment 9 Andy Carrel 2016-06-10 15:01:48 UTC
I definitely prefer the requested vs. max terminology in your patch.

Ultimately we just need to make sure that the initialization here...

>  891         if (sc->vtnet_flags & VTNET_FLAG_CTRL_VQ) {
>  892                 VQ_ALLOC_INFO_INIT(&info[idx], 0, NULL, NULL,
>  893                     &sc->vtnet_ctrl_vq, "%s ctrl", device_get_nameunit(dev));
>  894         }
...is using idx that is equal to the max_vq_pairs number provided by the host, rather than a number that is limited by factors we take into account in the guest.

I'll try to build and test this weekend.
Comment 10 Andy Carrel 2016-06-11 02:30:14 UTC
The patch boots and functions successfully when applied to a releng/10.3 kernel on GCE, which fixes the problems I was observing originally. It also does the right thing with /boot/loader.conf values like hw.vtnet.mq_max_pairs=77 (using the 16 pairs that my test VM supported).

Thanks Steven!
Comment 11 commit-hook freebsd_committer 2016-08-11 21:14:08 UTC
A commit references this bug:

Author: smh
Date: Thu Aug 11 21:13:58 UTC 2016
New revision: 303971
URL: https://svnweb.freebsd.org/changeset/base/303971

  Fix vtnet hang with max_virtqueue_pairs > VTNET_MAX_QUEUE_PAIRS

  Correctly limit npairs passed to vtnet_ctrl_mq_cmd. This ensures that
  VQ_ALLOC_INFO_INIT is called with the correct value, preventing the system
  from hanging when max_virtqueue_pairs > VTNET_MAX_QUEUE_PAIRS.

  Add new sysctl requested_vq_pairs which allow the user to configure
  the requested number of virtqueue pairs. The actual value will still take
  into account the system limits.

  Also missing sysctls for the current tunables so their values can be seen.

  PR:		207446
  Reported by:	Andy Carrel
  MFC after:	3 days
  Relnotes:	Yes
  Sponsored by:	Multiplay

Comment 12 Oleksandr Tymoshenko freebsd_committer freebsd_triage 2019-01-21 18:41:39 UTC
There is a commit referencing this PR, but it's still not closed and has been inactive for some time. Closing the PR as fixed but feel free to re-open it if the issue hasn't been completely resolved.