Bug 237166

Summary: vmx(4) override_ntxds/nrxds tunable has no effect
Product: Base System
Reporter: ncrogers
Component: kern
Assignee: Patrick Kelsey <pkelsey>
Status: Closed Not A Bug
Severity: Affects Only Me
CC: pkelsey
Priority: ---
Version: 12.0-STABLE
Hardware: amd64
OS: Any

Description ncrogers 2019-04-09 17:18:41 UTC
Since the MFC of r343291 to 12-STABLE in r344027 (Convert vmx(4) to being an iflib driver.), I am no longer able to increase the number of tx/rx descriptors via the usual iflib sysctls.

For example the following loader.conf:
dev.vmx.0.iflib.override_ntxqs=2
dev.vmx.0.iflib.override_nrxqs=2
dev.vmx.0.iflib.override_ntxds=4096
dev.vmx.0.iflib.override_nrxds=2048

Yields bootup messages:

Apr  9 10:03:22 rxg kernel: vmx0: <VMware VMXNET3 Ethernet Adapter> port 0x4000-0x400f mem 0xfea03000-0xfea03fff,0xfea02000-0xfea02fff,0xfea00000-0xfea01fff at device 0.0 on pci3
Apr  9 10:03:22 rxg kernel: vmx0: Using 512 tx descriptors and 256 rx descriptors
Apr  9 10:03:22 rxg kernel: vmx0: Using 2 rx queues 2 tx queues
Apr  9 10:03:22 rxg kernel: vmx0: Using MSI-X interrupts with 3 vectors
Apr  9 10:03:22 rxg kernel: vmx0: Ethernet address: 00:0c:29:2b:73:76
Apr  9 10:03:22 rxg kernel: vmx0: netmap queues/slots: TX 2/512, RX 2/512
Apr  9 10:03:22 rxg kernel: vmx0: link state changed to UP

Notice that ntxqs/nrxqs are decreased from the default of 8 (or the number of CPU cores, if lower); however, ntxd/nrxd remain at the defaults.
Comment 1 ncrogers 2019-04-09 19:08:02 UTC
Here's what happens when trying to explicitly specify the number of descriptors for each queue. In general, whatever is set in loader.conf for override_nrxds/override_ntxds, the resulting sysctl output has additional ",0" entries appended to it for vmx1, but not for vmx0.

test# grep vmx /boot/loader.conf
dev.vmx.0.iflib.override_ntxds=4096,4096
dev.vmx.0.iflib.override_nrxds=2048,2048
dev.vmx.1.iflib.override_ntxds=4096
dev.vmx.1.iflib.override_nrxds=2048
dev.vmx.0.iflib.override_ntxqs=2
dev.vmx.0.iflib.override_nrxqs=2
dev.vmx.1.iflib.override_ntxqs=2
dev.vmx.1.iflib.override_nrxqs=2

test# sysctl -a | grep override
dev.vmx.1.iflib.override_nrxds: 2048,0,0
dev.vmx.1.iflib.override_ntxds: 4096,0
dev.vmx.1.iflib.override_qs_enable: 0
dev.vmx.1.iflib.override_nrxqs: 2
dev.vmx.1.iflib.override_ntxqs: 2
dev.vmx.0.iflib.override_nrxds: 0,0,0
dev.vmx.0.iflib.override_ntxds: 0,0
dev.vmx.0.iflib.override_qs_enable: 0
dev.vmx.0.iflib.override_nrxqs: 2
dev.vmx.0.iflib.override_ntxqs: 2

test# grep vmx /var/run/dmesg.boot 
vmx0: <VMware VMXNET3 Ethernet Adapter> port 0x4000-0x400f mem 0xfea03000-0xfea03fff,0xfea02000-0xfea02fff,0xfea00000-0xfea01fff at device 0.0 on pci3
vmx0: Using 512 tx descriptors and 256 rx descriptors
vmx0: Using 2 rx queues 2 tx queues
vmx0: Using MSI-X interrupts with 3 vectors
vmx0: Ethernet address: 00:0c:29:2b:73:76
vmx0: netmap queues/slots: TX 2/512, RX 2/512
vmx1: <VMware VMXNET3 Ethernet Adapter> port 0x3000-0x300f mem 0xfe203000-0xfe203fff,0xfe202000-0xfe202fff,0xfe200000-0xfe201fff at device 0.0 on pci4
vmx1: Using 512 tx descriptors and 256 rx descriptors
vmx1: Using 2 rx queues 2 tx queues
vmx1: Using MSI-X interrupts with 3 vectors
vmx1: Ethernet address: 00:0c:29:2b:73:80
vmx1: netmap queues/slots: TX 2/512, RX 2/512
vmx0: link state changed to UP
vmx1: link state changed to UP
test#
Comment 2 Patrick Kelsey freebsd_committer freebsd_triage 2019-04-11 00:04:51 UTC
There are a couple of things going on here that need to be sorted out in your configuration in order for things to work as expected.

The first thing to understand has to do with how iflib.override_n{t,r}xds tunables work.  In the iflib model, what we think of as a NIC queue is modeled as a 'queue set' that itself is made up of 'queues'.  So, NIC queue -> iflib queue set -> one or more iflib queues.  The reason for this model is that there are NIC devices that have multiple descriptor rings per NIC queue.  An iflib 'queue' models what we might otherwise call a descriptor ring.

vmx(4) is a device that uses multiple descriptor rings per NIC queue.  Specifically, a vmx tx queue consists of a completion ring and a buffer ring, and a vmx rx queue consists of a completion ring and two buffer rings (let's just accept that latter detail as a decision neither of us got to make).

The iflib.override_n{t,r}xds tunables contain a string that specifies the number of descriptors for each iflib queue in the corresponding type of iflib queue set, and the number of queues in each queue set depends on the device.

As vmx has 2 queues in tx queue sets and 3 queues in rx queue sets, you see default values for iflib.override_n{t,r}xds of "0,0" and "0,0,0" respectively.  Specifically, the tx queue set overrides are "<completion queue>,<buffer queue>" and the rx queue set overrides are "<completion queue>,<buffer queue 0>,<buffer queue 1>".
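To make the positional layout concrete, here is a minimal Python sketch that maps an override string onto the vmx ring names described above. The names `parse_override`, `VMX_TX_RINGS`, and `VMX_RX_RINGS` are hypothetical and exist only for illustration; this is not iflib's parser. Padding missing positions with 0 is an assumption that mirrors the sysctl output in comment 1, where a value of 2048 showed up as "2048,0,0".

```python
# Illustrative only: ring order as described in this comment, not iflib code.
VMX_TX_RINGS = ("completion", "buffer")
VMX_RX_RINGS = ("completion", "buffer0", "buffer1")


def parse_override(value, ring_names):
    """Map a comma-separated override string onto named rings.

    Missing trailing positions are padded with 0 (assumption, based on the
    sysctl output shown in comment 1).
    """
    counts = [int(part) for part in value.split(",")]
    if len(counts) > len(ring_names):
        raise ValueError("too many values for this queue set")
    counts += [0] * (len(ring_names) - len(counts))
    return dict(zip(ring_names, counts))


if __name__ == "__main__":
    print(parse_override("0,4096", VMX_TX_RINGS))
    print(parse_override("2048", VMX_RX_RINGS))
```

Read this way, a bare "2048" for the rx queue set lands in the completion position, not in a buffer position.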

The second thing to understand is that in loader.conf(5) you are supposed to quote your values.  You can get away without quotes for some things, but a string value containing a comma isn't one of them.
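If it helps to check a file for this, here is a small Python sketch that flags loader.conf-style lines whose value contains a comma but is not wrapped in double quotes. The function name `unquoted_comma_values` is hypothetical, and the comment handling is deliberately naive (it treats everything after a `#` as a comment, which would be wrong for a quoted value containing `#`). It is not how the loader itself parses the file.

```python
def unquoted_comma_values(lines):
    """Return names of settings whose value has a comma but no surrounding quotes."""
    problems = []
    for line in lines:
        # Naive comment stripping; fine for simple tunable lines.
        stripped = line.split("#", 1)[0].strip()
        if "=" not in stripped:
            continue
        name, value = stripped.split("=", 1)
        value = value.strip()
        quoted = len(value) >= 2 and value[0] == '"' and value[-1] == '"'
        if "," in value and not quoted:
            problems.append(name.strip())
    return problems


if __name__ == "__main__":
    sample = [
        "dev.vmx.0.iflib.override_ntxds=4096,4096",
        'dev.vmx.0.iflib.override_nrxds="0,2048,0"',
        "dev.vmx.0.iflib.override_ntxqs=2",
    ]
    for name in unquoted_comma_values(sample):
        print("needs quotes:", name)
```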

The third of the couple things to understand is that the vmx driver will enforce some required relationships between the number of descriptors in each queue in a given type of queue set.

For tx queue sets, the number of descriptors in each queue must be equal (one completion descriptor per buffer descriptor), so the driver will always set the number of completion descriptors to whatever the number of configured buffer queue descriptors is.

For rx queue sets, the number of descriptors in each buffer queue needs to be the same, and the number of descriptors in the completion queue needs to be the sum of the number of descriptors in the buffer queues.  So the driver will always set the number of descriptors in the second buffer queue equal to the number of descriptors in the first buffer queue, and then set the number of descriptors in the completion queue to twice that value.

That's a long-winded way of saying that for vmx, in each override specification, only the second number has any effect on anything.
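The relationships above can be sketched in a few lines of Python. This is an illustration of the rules as stated here, not the driver's code. The function names are hypothetical. Treating an override of 0 as "use the default" is an assumption, and so are the default values of 512 and 256, which are taken from the boot messages in this report and assumed to describe the buffer rings.

```python
# Assumed defaults, taken from the boot messages in this report.
TX_DEFAULT = 512
RX_DEFAULT = 256


def effective_tx(override, default=TX_DEFAULT):
    """override is (completion, buffer); only the buffer count is honored."""
    _completion, buffer = override
    n = buffer or default  # assumption: 0 means "use the default"
    return (n, n)  # completion ring is forced to match the buffer ring


def effective_rx(override, default=RX_DEFAULT):
    """override is (completion, buffer0, buffer1); only buffer0 is honored."""
    _completion, buffer0, _buffer1 = override
    n = buffer0 or default  # assumption: 0 means "use the default"
    return (2 * n, n, n)  # completion = sum of the two equal buffer rings


if __name__ == "__main__":
    print(effective_tx((0, 4096)))     # (4096, 4096)
    print(effective_tx((4096, 0)))     # (512, 512)
    print(effective_rx((0, 2048, 0)))  # (4096, 2048, 2048)
```

The second call matches what vmx1 showed in comment 1: the override was stored as "4096,0", the buffer position was 0, and the driver stayed at 512 tx descriptors.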
Comment 3 ncrogers 2019-04-11 22:12:39 UTC
Thanks again for the explanation.

For anyone else running into this, the following loader.conf entries will correctly max out the tx/rx descriptors:

dev.vmx.0.iflib.override_ntxds="0,4096"
dev.vmx.0.iflib.override_nrxds="0,2048,0"

I suppose the manual needs updating...