I've set up a simple throughput test on my ESXi 6.7:

    10.6.0.10        vmx1        vmx2       10.6.10.10
  [ debian 1 ] <==> [ freebsd ] <==> [ debian 2 ]
  iperf server        router          iperf client

freebsd is version 13-RELEASE-p4 and both debians are version 11. When I run the iperf test I get suspiciously low throughput:

root@debian2:~# iperf3 -t 30 -i 10 -c 10.6.0.10
Connecting to host 10.6.0.10, port 5201
[  5] local 10.6.10.10 port 56006 connected to 10.6.0.10 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  2.24 GBytes  1.93 Gbits/sec   79   2.36 MBytes
[  5]  10.00-20.00  sec  2.24 GBytes  1.93 Gbits/sec    2   2.30 MBytes
[  5]  20.00-30.00  sec  2.27 GBytes  1.95 Gbits/sec    1   2.21 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec  6.76 GBytes  1.94 Gbits/sec   82            sender
[  5]   0.00-30.00  sec  6.76 GBytes  1.93 Gbits/sec                 receiver

iperf Done.

While the test is running, kernel{if_io_tqg_X} CPU usage is very high (hitting 100% on one CPU). If I run the test with more parallel sessions (iperf -P X), I get higher throughput, with more kernel{if_io_tqg_X} threads utilizing the CPUs (this is expected, since parallel sessions are spread across CPUs). Now if I replace the router with a Linux VM (tested with CentOS 7), I get more than 10 Gbit/s with a single session (all three VMs are running on the same ESXi host), with the CPU on the Linux router doing almost nothing. LRO and TSO are disabled on both vmx adapters.
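For reference, the offload settings can be confirmed with ifconfig and persisted in rc.conf on the FreeBSD router. A minimal sketch, assuming the topology above; the 10.6.0.1/10.6.10.1 router addresses are illustrative guesses, not taken from the report:

```shell
# /etc/rc.conf — router side (addresses are assumptions; adjust to your setup)
ifconfig_vmx1="inet 10.6.0.1/24 -tso -lro"    # toward debian 1 (iperf server)
ifconfig_vmx2="inet 10.6.10.1/24 -tso -lro"   # toward debian 2 (iperf client)
gateway_enable="YES"                          # enable IPv4 forwarding
```

A running interface can be checked with `ifconfig vmx1 | grep options`: TSO4/TSO6 and LRO should be absent from the options list if the flags took effect.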
root@freebsd:~ # dmesg | grep vmx
vmx0: <VMware VMXNET3 Ethernet Adapter> port 0x4000-0x400f mem 0xfe903000-0xfe903fff,0xfe902000-0xfe902fff,0xfe900000-0xfe901fff at device 0.0 on pci3
vmx0: Using 4096 TX descriptors and 2048 RX descriptors
vmx0: Using 4 RX queues 4 TX queues
vmx0: Using MSI-X interrupts with 5 vectors
vmx0: Ethernet address: 00:50:56:a7:85:e4
vmx0: netmap queues/slots: TX 4/4096, RX 4/2048
vmx1: <VMware VMXNET3 Ethernet Adapter> port 0x3000-0x300f mem 0xfe103000-0xfe103fff,0xfe102000-0xfe102fff,0xfe100000-0xfe101fff at device 0.0 on pci4
vmx1: Using 4096 TX descriptors and 2048 RX descriptors
vmx1: Using 4 RX queues 4 TX queues
vmx1: Using MSI-X interrupts with 5 vectors
vmx1: Ethernet address: 00:50:56:a7:1f:01
vmx1: netmap queues/slots: TX 4/4096, RX 4/2048
vmx2: <VMware VMXNET3 Ethernet Adapter> port 0x2000-0x200f mem 0xfd903000-0xfd903fff,0xfd902000-0xfd902fff,0xfd900000-0xfd901fff at device 0.0 on pci5
vmx2: Using 4096 TX descriptors and 2048 RX descriptors
vmx2: Using 4 RX queues 4 TX queues
vmx2: Using MSI-X interrupts with 5 vectors
vmx2: Ethernet address: 00:50:56:a7:39:49
vmx2: netmap queues/slots: TX 4/4096, RX 4/2048
vmx0: link state changed to UP
vmx1: link state changed to UP
vmx2: link state changed to UP

Are there any more tweaks I can try to improve this?

regards
Michal
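One class of tweak worth noting: vmx(4) in FreeBSD 13 is an iflib driver, so the queue and descriptor counts shown in the dmesg above can be overridden per device at boot via the standard iflib tunables. A hedged sketch; the values are illustrative, not known-good, and the guest needs enough vCPUs (and a vNIC configured with enough queues) for larger counts to take effect:

```shell
# /boot/loader.conf — per-device iflib overrides for vmx0 (illustrative values)
dev.vmx.0.iflib.override_nrxqs=8      # RX queue count (dmesg above shows 4)
dev.vmx.0.iflib.override_ntxqs=8      # TX queue count (dmesg above shows 4)
dev.vmx.0.iflib.override_nrxds=4096   # RX descriptors per ring (default above: 2048)
```

Whether this helps here is unclear, since the report points at a single kernel{if_io_tqg_X} thread saturating one CPU for a single flow, which queue counts alone do not fix.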
I am also seeing the same issue on FreeBSD 13 with the vmx interface. Disabling TSO and LRO improves things only a little; I also tried disabling hw.pci.honor_msi_blacklist. Is there any update or fix for this?
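For anyone trying to reproduce the tweak mentioned above: hw.pci.honor_msi_blacklist is a loader tunable, so it has to be set at boot, not at runtime. A minimal sketch:

```shell
# /boot/loader.conf
hw.pci.honor_msi_blacklist="0"   # allow MSI/MSI-X even on chipsets FreeBSD blacklists
```

After a reboot, `dmesg | grep MSI-X` should still show the vmx devices using MSI-X vectors, as in the dmesg output above.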
Adding more to this. Installed today; the system is running FreeBSD 13.1-RELEASE-p2. The VMXNET3 driver is slow on all throughput tests, typically 300 Mb/s or more slower than any Linux VM on a 1 Gb link. The host is ESXi 6.7 U3 on a Dell R720, 2 x 2650 v2, 128 GB RAM. The VM has 4 vCPUs and 8 GB of RAM, with 2 NICs (1 LAN, 1 WAN). The ESXi host runs other VMs, and all the Linux VMs using the VMXNET3 driver get full wire speed across the 1 Gb LAN link. I tried all the tweaks mentioned, to no avail, so I will not spam this report with endless dismal iperf3 tests. Can this be acknowledged and prioritised as a bug? VMware ESXi is one of the most prolific virtualisation platforms; it would be great to see FreeBSD running well on it.

Cheers,
Tony
This issue is still present in 13.2-RELEASE-p10. Is really nobody interested in solving this?