We are observing a strange problem on VMware when the kernel is compiled with the RSS option. The problem appears as the libc resolver not being able to resolve DNS names. For example, "host google.com" resolves the name, but "ping google.com" fails with "Host name lookup failure".

A packet capture shows that both commands send exactly the same requests and get the same replies, but in the case of ping the host answers the DNS reply with an ICMP port unreachable message. For example:

10.180.106.180.29707 > 10.180.106.5.53: 50740+ A? s3.us-east-2.amazonaws.com. (55)
10.180.106.5.53 > 10.180.106.180.29707: 50740 1/0/0 A 52.219.80.194 (71)
10.180.106.180 > 10.180.106.5: ICMP 10.180.106.180 udp port 29707 unreachable, length 107

One difference between host(1)'s code and the libc resolver's code might be that the latter performs connect(2) on the datagram / UDP socket used for DNS queries.

On the kernel side, I see that in_pcblookup_mbuf() fails to find the PCB matching the reply packet. Potentially interesting fields from the mbuf are:

uint32_t flowid = 0xc48d76be
uint8_t rsstype = 0x81 // M_HASHTYPE_RSS_IPV4
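For anyone trying to reproduce this from userland, the connect(2)-on-UDP pattern mentioned above can be sketched as follows. This is only an illustration on the loopback interface, not the libc resolver's code; the function name, the payload and the 2-second timeout are made up for the example.

```python
import socket


def connected_udp_roundtrip(payload: bytes = b"query") -> tuple:
    """Send one datagram over a connect()-ed UDP socket and read the reply.

    Hypothetical helper: a local "server" socket stands in for the DNS server.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.settimeout(2.0)
        client.settimeout(2.0)
        server.bind(("127.0.0.1", 0))        # let the OS pick a free port

        # connect() on a datagram socket only records the peer address;
        # from now on the socket is tied to this remote address and port.
        client.connect(server.getsockname())
        client.send(payload)                 # no destination needed

        data, addr = server.recvfrom(512)
        server.sendto(data.upper(), addr)    # reply to the client's port

        reply = client.recv(512)             # delivered via the connected socket
        return client.getpeername(), reply
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(connected_udp_roundtrip())
```

A tool that uses sendto()/recvfrom() on an unconnected socket would skip the connect() step, which is the suspected difference between the two commands.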
I see that base r343291, in addition to converting vmx to iflib, enabled previously ifdef-ed out code that sets the packet's rsstype based on the hardware-reported rss_type. Before that commit, rsstype was always set to M_HASHTYPE_OPAQUE_HASH.
I see that vmxnet3_reinit_rss_shared_data() uses an RSS key that's different from the system RSS key defined in sys/net/rss_config.c. I think that the different keys can result in in_pcblookup_mbuf() failure because of mismatching hash values.
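To illustrate why mismatched keys matter, here is a minimal sketch of the Toeplitz hash over the IPv4 2-tuple. It is not the kernel's or the driver's code. STAND_IN_SYSTEM_KEY is the widely published RSS verification key, used only as a stand-in for the system key; DRIVER_KEY is a made-up placeholder, not the key from vmxnet3_reinit_rss_shared_data(). The addresses come from the capture in the report, and the field order (source address, then destination address) is an assumption about what the device hashes.

```python
import socket


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return the 32-bit Toeplitz hash of data under key (bits MSB first)."""
    if len(key) < len(data) + 4:
        raise ValueError("key must be at least 4 bytes longer than the input")
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i in range(len(data) * 8):
        if data[i // 8] & (0x80 >> (i % 8)):
            # XOR in the 32-bit window of the key that starts at bit offset i.
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


# Stand-in for the system-wide key (published RSS verification key).
STAND_IN_SYSTEM_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)
# Hypothetical driver-private key, invented for this example.
DRIVER_KEY = bytes(range(1, 41))


def ipv4_2tuple(src: str, dst: str) -> bytes:
    """Hash input for the IPv4-only hash type: source then destination address."""
    return socket.inet_aton(src) + socket.inet_aton(dst)


if __name__ == "__main__":
    # The DNS reply from the capture: 10.180.106.5 -> 10.180.106.180
    tup = ipv4_2tuple("10.180.106.5", "10.180.106.180")
    print(hex(toeplitz_hash(STAND_IN_SYSTEM_KEY, tup)))
    print(hex(toeplitz_hash(DRIVER_KEY, tup)))
```

If the stack derives its expected value from one key while the flowid in the mbuf was computed with another, the two numbers have no reason to agree, which would be consistent with the lookup failure described above.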
(In reply to Andriy Gapon from comment #2)

When I converted the vmxnet3 driver to iflib, I enabled the RSS code based on iflib internals and looking sideways at the bnxt driver, and not so much by thinking through the RSS code's fundamental requirements. What I saw in the bnxt driver was that it was setting the RSS key using arc4rand() in bnxt_attach_pre(), and that it is always using the hash value for the flowid in bnxt_pkt_get_l2(). That led me to believe that the RSS key value did not have to be anything specific, which is why the vmxnet3 code behaves, with respect to this issue, functionally the same as bnxt does. If I am not missing something further, perhaps this same issue exists for the bnxt driver as well.
A commit references this bug:

Author: avg
Date: Thu Jan 23 11:05:03 UTC 2020
New revision: 357042
URL: https://svnweb.freebsd.org/changeset/base/357042

Log:
  vmxnet3: add support for RSS kernel option

  We observe at least one problem: if a UDP socket is connect(2)-ed, then a
  received packet that matches the connection cannot be matched to the
  corresponding PCB because of an incorrect flow ID. That was observed for DNS
  requests from the libc resolver. We got this problem because FreeBSD r343291
  enabled code that can set rsstype of received packets to values other than
  M_HASHTYPE_OPAQUE_HASH. Earlier that code was under 'ifdef notyet'.

  The essence of this change is to use the system-wide RSS key instead of some
  historic hardcoded key when the software RSS is enabled and it is configured
  to use Toeplitz algorithm (the default). In all other cases, the driver
  reports the opaque hash type for received packets while still using Toeplitz
  algorithm with the internal key.

  PR: 242890
  Reviewed by: pkelsey
  Sponsored by: Panzura
  Differential Revision: https://reviews.freebsd.org/D23147

Changes:
  head/sys/dev/vmware/vmxnet3/if_vmx.c
  head/sys/dev/vmware/vmxnet3/if_vmxvar.h
  head/sys/modules/vmware/vmxnet3/Makefile
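The decision described in the commit log can be summarised in a few lines. This is a sketch of the logic as worded in the log, not the actual if_vmx.c code; all names below (RssPolicy, choose_rss_policy, mbuf_hash_type and the string constants) are hypothetical.

```python
from dataclasses import dataclass

HASH_TOEPLITZ = "toeplitz"                 # label for the default algorithm
HASHTYPE_OPAQUE = "M_HASHTYPE_OPAQUE_HASH"
HASHTYPE_RSS_IPV4 = "M_HASHTYPE_RSS_IPV4"


@dataclass(frozen=True)
class RssPolicy:
    key_source: str         # "system" or "driver-internal"
    report_hw_type: bool    # pass the hardware-reported hash type to the stack?


def choose_rss_policy(kernel_rss_enabled: bool, hash_algorithm: str) -> RssPolicy:
    """Pick the key and reporting mode as the commit log describes them."""
    if kernel_rss_enabled and hash_algorithm == HASH_TOEPLITZ:
        # Software RSS with Toeplitz: hash with the system-wide key so the
        # stack can trust the reported hash type.
        return RssPolicy(key_source="system", report_hw_type=True)
    # All other cases: keep Toeplitz with the internal key, but tell the
    # stack that the hash is opaque.
    return RssPolicy(key_source="driver-internal", report_hw_type=False)


def mbuf_hash_type(policy: RssPolicy, hw_reported_type: str) -> str:
    """Hash type a received packet would be tagged with under this policy."""
    return hw_reported_type if policy.report_hw_type else HASHTYPE_OPAQUE


if __name__ == "__main__":
    for enabled in (True, False):
        policy = choose_rss_policy(enabled, HASH_TOEPLITZ)
        print(enabled, policy, mbuf_hash_type(policy, HASHTYPE_RSS_IPV4))
```

The point of the opaque fallback is that the flowid can still be used for spreading load, while the stack does not treat it as a hash computed with its own key.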
^Triage: Assign to committer resolving

Assume the iflib changes didn't end up in stable/11, so mark this not for merging there. If they did and the plan is to also MFC base r357042 there, please set the mfc-stable11 flag to ? accordingly.

@Andriy/Patrick Would we want a new issue for bnxt if it turns out it's affected too, or will we track that here, and close when both drivers' fixes are committed/merged?
A commit references this bug:

Author: avg
Date: Thu Feb 27 15:08:44 UTC 2020
New revision: 358386
URL: https://svnweb.freebsd.org/changeset/base/358386

Log:
  MFC r357042: vmxnet3: add support for RSS kernel option

  We observe at least one problem: if a UDP socket is connect(2)-ed, then a
  received packet that matches the connection cannot be matched to the
  corresponding PCB because of an incorrect flow ID. That was observed for DNS
  requests from the libc resolver. We got this problem because FreeBSD r343291
  enabled code that can set rsstype of received packets to values other than
  M_HASHTYPE_OPAQUE_HASH. Earlier that code was under 'ifdef notyet'.

  The essence of this change is to use the system-wide RSS key instead of some
  historic hardcoded key when the software RSS is enabled and it is configured
  to use Toeplitz algorithm (the default). In all other cases, the driver
  reports the opaque hash type for received packets while still using Toeplitz
  algorithm with the internal key.

  PR: 242890
  Sponsored by: Panzura

Changes:
_U  stable/12/
  stable/12/sys/dev/vmware/vmxnet3/if_vmx.c
  stable/12/sys/dev/vmware/vmxnet3/if_vmxvar.h
  stable/12/sys/modules/vmware/vmxnet3/Makefile