Bug 261746 - hn(4): Communication stops when enabling SR-IOV secondary mlx5en(4) interface (640FLR-SFP28) on Windows Server 2022
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.0-STABLE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: Hans Petter Selasky
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-02-06 14:04 UTC by Michael
Modified: 2022-02-11 10:16 UTC
CC: 3 users

See Also:
koobs: maintainer-feedback+
koobs: mfc-stable13?
koobs: mfc-stable12-


Attachments
Permanent patch to try (1.51 KB, patch)
2022-02-09 15:33 UTC, Hans Petter Selasky
no flags

Description Michael 2022-02-06 14:04:45 UTC
Hypervisor: Windows Server 2022 (21H2, 20348.502)
Network adapters:
    1. Mellanox ConnectX-3 EN NIC for OCP; 10GbE; dual-port SFP+; PCIe3.0 x8; IPMI disabled; R6 (Firmware version: 2.42.5000, Driver version: 5.50.14740.1)
    2. Mellanox ConnectX-4 Lx  -  HPE Ethernet 10/25Gb 2-port 640FLR-SFP28 Adapter (Firmware version: 14.31.1200, Driver version: 2.80.25134.0) 

Guest: FreeBSD-14.0-CURRENT-amd64-20220203-e2fe58d61b7-252875-disc1.iso
Generation: 2 (Configuration Version: 10.0)
No changes were made (the system was installed out of the box).

FreeBSD 14.0-CURRENT #0 main-n252875-e2fe58d61b7: Thu Feb  3 05:57:35 UTC 2022
    root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
. . .
VT(efifb): resolution 1024x768
Hyper-V Version: 10.0.20348 [SP0]
  Features=0x2e7f<VPRUNTIME,TMREFCNT,SYNIC,SYNTM,APIC,HYPERCALL,VPINDEX,REFTSC,IDLE,TMFREQ>
  PM Features=0x20 [C2]
  Features3=0xe0bed7b2<DEBUG,XMMHC,IDLE,NUMA,TMFREQ,SYNCMC,CRASH,NPIEP>
Timecounter "Hyper-V" frequency 10000000 Hz quality 2000
CPU: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz (2597.02-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x406f1  Family=0x6  Model=0x4f  Stepping=1
  Features=0x1f83fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2,SS,HTT>
  Features2=0xfefa3203<SSE3,PCLMULQDQ,SSSE3,FMA,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND,HV>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x121<LAHF,ABM,Prefetch>
  Structured Extended Features=0x1c2fb9<FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,NFPUSG,RDSEED,ADX,SMAP>
  Structured Extended Features3=0xbc000400<MD_CLEAR,IBPB,STIBP,L1DFL,ARCH_CAP,SSBD>
  XSAVE Features=0x1<XSAVEOPT>
Hypervisor: Origin = "Microsoft Hv"
real memory  = 8388608000 (8000 MB)
avail memory = 8083980288 (7709 MB)
Event timer "LAPIC" quality 100
ACPI APIC Table: <VRTUAL MICROSFT>
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
FreeBSD/SMP: 2 package(s) x 2 core(s) x 2 hardware threads
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
random: unblocking device.
ioapic0 <Version 1.1> irqs 0-23
Launching APs: 6 5 7 3 4 2 1
Timecounter "Hyper-V-TSC" frequency 10000000 Hz quality 3000
random: entropy device external interface
kbd0 at kbdmux0
efirtc0: <EFI Realtime Clock>
efirtc0: registered as a time-of-day clock, resolution 1.000000s
smbios0: <System Management BIOS> at iomem 0xf7fd8000-0xf7fd801e
smbios0: Version: 3.1, BCD Revision: 3.1
aesni0: <AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS>
acpi0: <VRTUAL MICROSFT>
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_syscontainer0: <System Container> on acpi0
vmbus0: <Hyper-V Vmbus> on acpi_syscontainer0
vmgenc0: <VM Generation Counter> on acpi0
vmbus_res0: <Hyper-V Vmbus Resource> irq 5 on acpi0
Timecounters tick every 10.000 msec
usb_needs_explore_all: no devclass
vmbus0: version 4.0
hvet0: <Hyper-V event timer> on vmbus0
Event timer "Hyper-V" frequency 10000000 Hz quality 1000
hvheartbeat0: <Hyper-V Heartbeat> on vmbus0
hvkvp0: <Hyper-V KVP> on vmbus0
hvshutdown0: <Hyper-V Shutdown> on vmbus0
hvtimesync0: <Hyper-V Timesync> on vmbus0
hvtimesync0: RTT
hvvss0: <Hyper-V VSS> on vmbus0
storvsc0: <Hyper-V SCSI> on vmbus0
hvkbd0: <Hyper-V KBD> on vmbus0
hn0: <Hyper-V Network Interface> on vmbus0
hn0: Ethernet address: 00:15:5d:d0:8b:40
hn1: <Hyper-V Network Interface> on vmbus0
hn0: link state changed to UP
hn1: Ethernet address: 00:15:5d:d0:8b:41
hn1: link state changed to UP

root@frw05v04:~ # ifconfig
. . .
hn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8051b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,LRO,LINKSTATE>
        ether 00:15:5d:d0:8b:40
        inet 172.27.0.24 netmask 0xffffff00 broadcast 172.27.0.255
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
hn1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8051b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,LRO,LINKSTATE>
        ether 00:15:5d:d0:8b:41
        inet 172.27.172.24 netmask 0xffffff00 broadcast 172.27.172.255
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

We verify that both virtual network adapters (hn0 and hn1) have network connectivity. All OK.

Enable SR-IOV on the first network adapter:

hn0: got notify, nvs type 128
pcib0: <Hyper-V PCI Express Pass Through> on vmbus0
pci0: <PCI bus> on pcib0
pci0: <network, ethernet> at device 2.0 (no driver attached)

root@frw05v04:~ # kldload mlx4en
mlx4_core0: <mlx4_core> at device 2.0 on pci0
mlx4_core: Mellanox ConnectX core driver v3.7.0 (July 2021)
mlx4_core: Initializing mlx4_core
mlx4_core0: Detected virtual function - running in slave mode
mlx4_core0: Sending reset
mlx4_core0: Sending vhcr0
mlx4_core0: HCA minimum page size:512
mlx4_core0: Timestamping is not supported in slave mode
mlx4_en mlx4_core0: Activating port:1
mlxen0: Ethernet address: 00:15:5d:d0:8b:40
mlx4_en: mlx4_core0: Port 1: Using 8 TX rings
mlx4_en: mlx4_core0: Port 1: Using 8 RX rings
mlx4_en: mlxen0: Using 8 TX rings
mlx4_en: mlxen0: Using 8 RX rings
mlxen0: link state changed to DOWN
hn0: link state changed to DOWN
mlx4_en: mlxen0: Initializing port
mlxen0: tso6 disabled due to -txcsum6.
hn0: disable IPV6 mbuf hash delivery
mlxen0: link state changed to UP
mlx4_core0: going promisc on 1
hn0: link state changed to UP
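
(For reference, loading of the VF drivers can also be made persistent across reboots by listing the modules in /boot/loader.conf; the module names below match the kldload above plus its mlx5en(4) counterpart - just a sketch:)

mlx4en_load="YES"
mlx5en_load="YES"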

We verify that there is network connectivity through the first network adapter. Everything is OK.

Enable SR-IOV on the second network adapter:

hn1: got notify, nvs type 128
pcib1: <Hyper-V PCI Express Pass Through> on vmbus0
pci1: <PCI bus> on pcib1
mlx5_core0: <mlx5_core> at device 2.0 on pci1
mlx5: Mellanox Core driver 3.7.0 (July 2021)
mlx5_core0: WARN: mlx5_init_once:962:(pid 0): Unable to find vendor specific capabilities
mce0: Ethernet address: 00:15:5d:d0:8b:41
mce0: link state changed to DOWN
hn1: link state changed to DOWN
mlx5_core0: WARN: mlx5_fwdump_prep:94:(pid 0): Unable to find vendor-specific capability, error 2
mce0: ERR: mlx5e_ioctl:3542:(pid 0): tso6 disabled due to -txcsum6.
mce0: link state changed to UP
hn1: disable IPV6 mbuf hash delivery
hn1: link state changed to UP

We make sure that there is a network connection through the first network adapter. Communication has stopped through this network adapter, both from the VM and from outside to the VM.

root@frw05v04:~ # ifconfig
. . .
hn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8051b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,LRO,LINKSTATE>
        ether 00:15:5d:d0:8b:40
        inet 172.27.0.24 netmask 0xffffff00 broadcast 172.27.0.255
        media: Ethernet autoselect (10Gbase-CX4 <full-duplex,rxpause,txpause>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
hn1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8051b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,LRO,LINKSTATE>
        ether 00:15:5d:d0:8b:41
        inet 172.27.172.24 netmask 0xffffff00 broadcast 172.27.172.255
        media: Ethernet 10GBase-CR1 <full-duplex,rxpause,txpause>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
mlxen0: flags=8a43<UP,BROADCAST,RUNNING,ALLMULTI,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8c05bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWTSO,LINKSTATE>
        ether 00:15:5d:d0:8b:40
        media: Ethernet autoselect (10Gbase-CX4 <full-duplex,rxpause,txpause>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
mce0: flags=8a43<UP,BROADCAST,RUNNING,ALLMULTI,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8805bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,LINKSTATE>
        ether 00:15:5d:d0:8b:41
        media: Ethernet 10GBase-CR1 <full-duplex,rxpause,txpause>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

Windows message log:
    "FRW_05_v4": The network adapter (1EE87C8C-AD86-4162-AAB5-866E08A8045A--8C222756-00C4-4BA1-A16E-B593B25E8453) has allocated a virtual function. (Virtual Function ID: 0; LUID: 0:15455; Virtual Machine ID: 1EE87C8C-AD86-4162-AAB5-866E08A8045A)
    "FRW_05_v4": Virtual PCI device successfully added to virtual machine: "Bus {56158769-787C-4343-AC67-D9C7AAE1AACA} Slot 2". PnpID = "PCI\VEN_15B3&DEV_1016&SUBSYS_00D31590&REV_80", FunctionType = PhysicalOrVirtual. (Virtual Machine ID 1EE87C8C-AD86-4162-AAB5-866E08A8045A)
    "FRW_05_v4": The network adapter (1EE87C8C-AD86-4162-AAB5-866E08A8045A--8C222756-00C4-4BA1-A16E-B593B25E8453) has assigned a virtual function. (Virtual Function ID: 0; LUID: 0:15455) (Virtual Machine ID: 1EE87C8C-AD86-4162-AAB5-866E08A8045A)
    "FRW_05_v4": The virtual machine successfully negotiated PCI virtual bus protocol version 0x10001 on "Channel {56158769-787C-4343-AC67-D9C7AAE1AACA}". (Virtual Machine ID 1EE87C8C-AD86-4162-AAB5-866E08A8045A)
    "FRW_05_v4": Guest OS enabled virtual PCI device: "Bus {56158769-787C-4343-AC67-D9C7AAE1AACA} Slot 2". (Virtual Machine ID 1EE87C8C-AD86-4162-AAB5-866E08A8045A)
    HPE Ethernet 10/25Gb 2-port 640FLR-SFP28 Adapter #4: VF #0 attached to VM FRW_05_v4 was loaded with driver version: FreeBSD,mlx5_core,14.0.51,3.7.0.

We turn off SR-IOV on the second network adapter, and communication through it resumes.

OK. We recompile the GENERIC kernel with the additional option "options RATELIMIT" and set boot_verbose="YES" in /boot/loader.conf.
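A rough sketch of these steps (assuming the source tree is in /usr/src; the kernel config name and the -j value are arbitrary):

cd /usr/src/sys/amd64/conf
printf 'include GENERIC\nident GENERIC-RATELIMIT\noptions RATELIMIT\n' > GENERIC-RATELIMIT
cd /usr/src
make -j8 buildkernel KERNCONF=GENERIC-RATELIMIT
make installkernel KERNCONF=GENERIC-RATELIMIT
echo 'boot_verbose="YES"' >> /boot/loader.conf
shutdown -r now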

. . .
Hyper-V Version: 10.0.20348 [SP0]
  Features=0x2e7f<VPRUNTIME,TMREFCNT,SYNIC,SYNTM,APIC,HYPERCALL,VPINDEX,REFTSC,IDLE,TMFREQ>
  PM Features=0x20 [C2]
  Features3=0xe0bed7b2<DEBUG,XMMHC,IDLE,NUMA,TMFREQ,SYNCMC,CRASH,NPIEP>
  Recommends: 000e2e24 00000fff
  Limits: Vcpu:1024 Lcpu:1024 Int:10416
  HW Features: 0000004f, AMD: 00000000
. . .
TSC timecounter disables C2 and C3.
TSC timecounter discards lower 1 bit(s)
No TSX change made
TCP_ratelimit: Is now initialized
hyperv: Hypercall created
. . .
acpi_syscontainer0: <System Container> on acpi0
vmbus0: <Hyper-V Vmbus> on acpi_syscontainer0
vmgenc0: <VM Generation Counter> on acpi0
vmbus_res0: <Hyper-V Vmbus Resource> irq 5 on acpi0
ACPI: Enabled 1 GPEs in block 00 to 0F
Device configuration finished.
. . .
vmbus0: intrhook
vmbus_res0: walking _CRS, pass=0
acpi0: walking _CRS, pass=0
acpi0: _CRS: not found, pass=0
acpi_syscontainer0: walking _CRS, pass=0
vmbus0: decoding 3 range 0xfe0000000-0xfffffffff
vmbus_res0: walking _CRS, pass=1
acpi0: walking _CRS, pass=1
acpi0: _CRS: not found, pass=1
acpi_syscontainer0: walking _CRS, pass=1
vmbus0: decoding 3 range 0xf8000000-0xfed3ffff
vmbus0: fb: fb_addr: 0xf8000000, size: 0x800000, actual size needed: 0xc0000
vmbus0: allocated type 3 (0xf8000000-0xf87fffff) for rid 0 of vmbus0
vmbus0: successfully reserved memory for framebuffer starting at 0xf8000000, size 0x800000
vmbus0: vmbus IDT vector 252
vmbus0: smp_started = 1
vmbus0: version 4.0
hvet0: <Hyper-V event timer> on vmbus0
Event timer "Hyper-V" frequency 10000000 Hz quality 1000
vmbus0: chan2 subidx0 offer
vmbus0: chan2 assigned to cpu0 [vcpu0]
vmbus0: chan7 subidx0 offer
vmbus0: chan7 assigned to cpu0 [vcpu0]
vmbus0: chan8 subidx0 offer
hvheartbeat0: <Hyper-V Heartbeat>vmbus0: chan8 assigned to cpu0 [vcpu0]
vmbus0:  on vmbus0
hvheartbeat0: chan9 subidx0 offer
chan7 update cpu0 flag_cnt to 1
vmbus0: chan9 assigned to cpu0 [vcpu0]
vmbus0: chan10 subidx0 offer
vmbus0: chan10 assigned to cpu0 [vcpu0]
vmbus0: chan11 subidx0 offer
vmbus0: chan11 assigned to cpu0 [vcpu0]
vmbus0: chan14 subidx0 offer
vmbus0: chan14 assigned to cpu0 [vcpu0]
vmbus0: chan1 subidx0 offer
vmbus0: chan1 assigned to cpu0 [vcpu0]
vmbus0: chan3 subidx0 offer
vmbus0: chan3 assigned to cpu0 [vcpu0]
vmbus0: chan4 subidx0 offer
vmbus0: chan4 assigned to cpu0 [vcpu0]
vmbus0: chan5 subidx0 offer
vmbus0: chan5 assigned to cpu0 [vcpu0]
vmbus0: chan6 subidx0 offer
vmbus0: chan6 assigned to cpu0 [vcpu0]
vmbus0: chan12 subidx0 offer
vmbus0: chan12 assigned to cpu0 [vcpu0]
vmbus0: chan13 subidx0 offer
vmbus0: chan13 assigned to cpu0 [vcpu0]
vmbus0: chan15 subidx0 offer
vmbus0: chan15 assigned to cpu0 [vcpu0]
hvheartbeat0: gpadl_conn(chan7) succeeded
hvheartbeat0: chan7 opened
hvkvp0: <Hyper-V KVP> on vmbus0
hvkvp0: gpadl_conn(chan8) succeeded
hvkvp0: chan8 opened
hvshutdown0: <Hyper-V Shutdown> on vmbus0
hvshutdown0: gpadl_conn(chan9) succeeded
hvshutdown0: sel framework version: 3.0
hvshutdown0: hvshutdown0: supp framework version: 1.0
hvshutdown0: supp framework version: 3.0
chan9 opened
hvtimesync0: <Hyper-V Timesync>hvshutdown0: sel message version: 3.0
hvshutdown0: supp message version: 1.0
hvshutdown0: supp message version: 3.0
hvshutdown0: supp message version: 3.1
hvshutdown0: supp message version: 3.2
 on vmbus0
hvtimesync0: gpadl_conn(chan10) succeeded
hvtimesync0: sel framework version: 3.0
hvtimesync0: hvtimesync0: supp framework version: 1.0
hvtimesync0: supp framework version: 3.0
chan10 opened
hvvss0: <Hyper-V VSS>hvtimesync0: sel message version: 4.0
 on vmbus0
hvtimesync0: supp message version: 1.0
hvtimesync0: supp message version: 3.0
hvtimesync0: supp message version: 4.0
hvtimesync0: RTT
hvvss0: gpadl_conn(chan11) succeeded
hvtimesync0: apply sync request, hv: 1644152790950713500, vm: 1220771799
hvvss0: chan11 opened
storvsc0: Enlightened SCSI device detected
 last message repeated 1 times
storvsc0: <Hyper-V SCSI> on vmbus0
storvsc ringbuffer size: 262144, max_io: 512
storvsc0: chan14 assigned to cpu0 [vcpu0]
storvsc0: gpadl_conn(chan14) succeeded
storvsc0: chan14 opened
storvsc0: max chans 2, multi-chan capable
storvsc0: chan16 subidx1 offer
storvsc0: chan16 assigned to cpu0 [vcpu0]
storvsc0: chan16 assigned to cpu1 [vcpu1]
storvsc0: chan16 update cpu1 flag_cnt to 1
storvsc0: gpadl_conn(chan16) succeeded
storvsc0: chan16 opened
Storvsc create multi-channel success!
(probe0:storvsc0:0:0:0): Down reving Protocol Version from 4 to 2?
hvkbd0: <Hyper-V KBD>(probe1:storvsc0:0:1:0): Down reving Protocol Version from 4 to 2?
. . .
hvheartbeat0: sel framework version: 3.0
hvheartbeat0: supp framework version: 1.0
hvheartbeat0: supp framework version: 3.0
hvheartbeat0: sel message version: 3.0
hvheartbeat0: supp message version: 3.0
hvheartbeat0: supp message version: 3.1
. . .
hn1: <Hyper-V Network Interface> on vmbus0
hn0: link state changed to UP
hn1: LRO: entry count 128
hn1: link RX ring 0 to chan15
hn1: link TX ring 0 to chan15
hn1: chan15 assigned to cpu0 [vcpu0]
hn1: gpadl_conn(chan15) succeeded
hn1: chan15 opened
hn1: NVS version 0x60001, NDIS version 6.30
hn1: nvs ndis conf done
hn1: gpadl_conn(chan15) succeeded
 last message repeated 1 times
hn1: chimney sending buffer 6144/2560
hn1: RNDIS ver 1.0, aggpkt size 4026531839, aggpkt cnt 8, aggpkt align 8
hn1: hwcaps rev 3
hn1: hwcaps csum: ip4 tx 0x155/0x2 rx 0x155/0x2, ip6 tx 0x55/0x2 rx 0x55/0x2
hn1: hwcaps lsov2: ip4 maxsz 62780 minsg 2 encap 0x2, ip6 maxsz 62780 minsg 2 encap 0x2 opts 0x5
hn1: hwcaps rsc: ip4 1 ip6 1
hn1: NDIS TSO szmax 62780 sgmin 2
hn1: offload csum: ip4 4, tcp4 4, udp4 4, tcp6 4, udp6 4
hn1: offload lsov2: ip4 2, ip6 2
hn1: offload rsc: ip4 2, ip6 2
hn1: offload config done
hn1: 64 RX rings
hn1: RSS indirect table size 128
hn1: RSS caps 0x301
hn1: RX rings offered 64, requested 8
hn1: chan24 subidx1 offer
hn1: chan24 assigned to cpu0 [vcpu0]
hn1: chan25 subidx2 offer
hn1: chan25 assigned to cpu0 [vcpu0]
hn1: chan26 subidx3 offer
hn1: chan26 assigned to cpu0 [vcpu0]
hn1: chan27 subidx4 offer
hn1: chan27 assigned to cpu0 [vcpu0]
hn1: chan28 subidx5 offer
hn1: chan28 assigned to cpu0 [vcpu0]
hn1: chan29 subidx6 offer
hn1: chan29 assigned to cpu0 [vcpu0]
hn1: chan30 subidx7 offer
hn1: chan30 assigned to cpu0 [vcpu0]
hn1: 8 TX ring, 8 RX ring
hn1: link RX ring 1 to chan24
hn1: link TX ring 1 to chan24
hn1: chan24 assigned to cpu1 [vcpu1]
hn1: gpadl_conn(chan24) succeeded
hn1: chan24 opened
hn1: link RX ring 2 to chan25
hn1: link TX ring 2 to chan25
hn1: chan25 assigned to cpu2 [vcpu2]
hn1: gpadl_conn(chan25) succeeded
hn1: chan25 opened
hn1: link RX ring 3 to chan26
hn1: link TX ring 3 to chan26
hn1: chan26 assigned to cpu3 [vcpu3]
hn1: gpadl_conn(chan26) succeeded
hn1: chan26 opened
hn1: link RX ring 4 to chan27
hn1: link TX ring 4 to chan27
hn1: chan27 assigned to cpu4 [vcpu4]
hn1: gpadl_conn(chan27) succeeded
hn1: chan27 opened
hn1: link RX ring 5 to chan28
hn1: link TX ring 5 to chan28
hn1: chan28 assigned to cpu5 [vcpu5]
hn1: gpadl_conn(chan28) succeeded
hn1: chan28 opened
hn1: link RX ring 6 to chan29
hn1: link TX ring 6 to chan29
hn1: chan29 assigned to cpu6 [vcpu6]
hn1: gpadl_conn(chan29) succeeded
hn1: chan29 opened
hn1: link RX ring 7 to chan30
hn1: link TX ring 7 to chan30
hn1: chan30 assigned to cpu7 [vcpu7]
hn1: gpadl_conn(chan30) succeeded
hn1: chan30 opened
hn1: 7 sub-channels attached
hn1: setup default RSS key
hn1: setup default RSS indirect table
hn1: RSS indirect table size 128, hash 0x00005701
hn1: RSS config done
hn1: TX agg size 6144, pkts 8, align 8
hn1: set RX filter 0x00000000 done
hn1: RNDIS mtu 1500
hn1: support HASHVAL pktinfo
hn1: TSO size max 62762
hn1: bpf attached
hn1: Ethernet address: 00:15:5d:d0:8b:41
hn1: TSO segcnt 31 segsz 4096
vmbus0: device scan, probe and attach done
hn1: link state changed to UP
. . .
GEOM: new disk da0
efirtc0: providing initial system time
start_init: trying /sbin/init
lo0: link state changed to UP
hn0: set RX filter 0x00000009 done
hn0: set RX filter 0x0000000d done
hn1: set RX filter 0x00000009 done
hn1: set RX filter 0x0000000d done
hvkvp0: sel framework version: 3.0
hvkvp0: supp framework version: 1.0
hvkvp0: supp framework version: 3.0
hvkvp0: sel message version: 4.0
hvkvp0: supp message version: 3.0
hvkvp0: supp message version: 4.0
hvkvp0: supp message version: 5.0
hvvss0: sel framework version: 3.0
hvvss0: supp framework version: 1.0
hvvss0: supp framework version: 3.0
hvvss0: sel message version: 5.0
hvvss0: supp message version: 5.0
hvvss0: supp message version: 6.0
hvvss0: supp message version: 7.0
hvtimesync0: apply sample request, hv: 1644152795952966900, vm: 1644152797140655199

We verify that there is network connectivity through the first network adapter. Everything is OK.

Enable SR-IOV on the second network adapter:

hn1: got notify, nvs type 128
vmbus0: chan31 subidx0 offer
vmbus0: chan31 assigned to cpu0 [vcpu0]
pcib0: <Hyper-V PCI Express Pass Through> on vmbus0
vmbus0: allocated type 3 (0xfe0000000-0xfe0001fff) for rid 0 of pcib0
pcib0: gpadl_conn(chan31) succeeded
pcib0: chan31 opened
vmbus_pcib: initialize bar 0 by writing all 1s
vmbus_pcib: initialize bar 1 by writing all 1s
pci0: <PCI bus> on pcib0
pci0: domain=3, physical bus=0
found->        vendor=0x15b3, dev=0x1016, revid=0x80
       domain=3, bus=0, slot=2, func=0
       class=02-00-00, hdrtype=0x00, mfdev=0
       cmdreg=0x0000, statreg=0x0010, cachelnsz=0 (dwords)
       lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
       MSI-X supports 8 messages in map 0x10
       map[10]: type Prefetchable Memory, range 64, base 0xfffffffffff00000, size 20, memory disabled
mlx5_core0: <mlx5_core> at device 2.0 on pci0
mlx5: Mellanox Core driver 3.7.0 (July 2021)vmbus0: allocated type 3 (0xfe0100000-0xfe01fffff) for rid 10 of mlx5_core0
mlx5_core0: Lazy allocation of 0x100000 bytes rid 0x10 type 3 at 0xfe0100000
mlx5_core0: WARN: mlx5_init_once:962:(pid 0): Unable to find vendor specific capabilities
mlx5_core0: attempting to allocate 8 MSI-X vectors (8 supported)
msi: routing MSI-X IRQ 24 to local APIC 0 vector 50
msi: routing MSI-X IRQ 25 to local APIC 2 vector 48
msi: routing MSI-X IRQ 26 to local APIC 0 vector 51
msi: routing MSI-X IRQ 27 to local APIC 2 vector 49
msi: routing MSI-X IRQ 28 to local APIC 0 vector 52
msi: routing MSI-X IRQ 29 to local APIC 2 vector 50
msi: routing MSI-X IRQ 30 to local APIC 0 vector 53
msi: routing MSI-X IRQ 31 to local APIC 2 vector 51
mlx5_core0: using IRQs 24-31 for MSI-X
mce0: bpf attached
mce0: Ethernet address: 00:15:5d:d0:8b:41
mce0: link state changed to DOWN
hn1: link state changed to DOWN
mlx5_core0: WARN: mlx5_fwdump_prep:94:(pid 0): Unable to find vendor-specific capability, error 2
mce0: ERR: mlx5e_ioctl:3542:(pid 0): tso6 disabled due to -txcsum6.
hn1: delayed initialize mce0
hn1: try bringing up mce0
mce0: link state changed to UP
hn1: disable IPV6 mbuf hash delivery
hn1: disable RSS
hn1: RSS indirect table size 128, hash 0x00005701
hn1: link state changed to UP
hn1: RSS config done
hn1: reconfig RSS
hn1: RSS indirect table size 128, hash 0x00005701
hn1: RSS config done

Communication through this network adapter has stopped, both from the VM outward and from outside toward the VM.

Turn off SR-IOV on the second network adapter:

mce0: link state changed to DOWN
hn1: link state changed to DOWN
hn1: link state changed to UP
mlx5_core0: detached
pcib0: chan31 revoked
pcib0: chan31 detached
pci0: detached
hn1: got notify, nvs type 128
pcib0: chan31 closed
pcib0: detached
vmbus0: chan31 freed

Communication through the second network adapter resumes.

Any ideas?
Comment 1 Michael 2022-02-06 14:28:20 UTC
I made a mistake here:
> "We make sure that there is a network connection through the first network adapter. Communication has stopped through this network adapter, both from the VM and from outside to the VM."

Network communication problems occur only when SR-IOV is enabled on the second network adapter, hn1 (ConnectX-4 Lx, mlx5en VF).
Comment 2 Michael 2022-02-06 15:10:27 UTC
Hypervisor: Windows Server 2019 (1809, 17763.2510)
Network adapters:
    1. Mellanox ConnectX-3 EN NIC for OCP; 10GbE; dual-port SFP+; PCIe3.0 x8; IPMI disabled; R6 (Firmware version: 2.42.5000, Driver version: 5.50.14740.1)
    2. Mellanox ConnectX-4 Lx - HPE Ethernet 10/25Gb 2-port 640FLR-SFP28 Adapter (Firmware version: 14.26.1040, Driver version: 2.80.25134.0)

Guest: FreeBSD-14.0-CURRENT-amd64-20220203-e2fe58d61b7-252875-disc1.iso
Generation: 2 (Configuration Version: 9.0)
No changes were made (the system was installed out of the box).

Same behavior: with SR-IOV enabled on the ConnectX-3 VF (mlx4en) communication works; with SR-IOV enabled on the ConnectX-4 VF (mlx5en) it does not.
Comment 3 Kubilay Kocak freebsd_committer freebsd_triage 2022-02-06 23:02:33 UTC
Thank you for your report, Michael. Are you able to test reproducibility with FreeBSD 12 and 13 images?
Comment 4 Wei Hu 2022-02-07 05:20:56 UTC
The logs look normal to me. Several questions:

1. If only the second SR-IOV NIC (Mellanox CX-4) causes the problem, what happens if you enable only this NIC and not the first one (Mellanox CX-3)? Does the second interface work in this case?

2. How do you enable and disable the SR-IOV interfaces?

3. How do you verify that the second interface stopped working? 'sysctl -a | grep mce' gives a lot of stats on the mce interface. Do any of the numbers change after putting some traffic on this interface? For example, something like the sketch below.
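(A sketch; 172.27.172.1 is only a placeholder for some host reachable through mce0:)

sysctl -a | grep mce > /tmp/mce.before
ping -c 10 172.27.172.1
sysctl -a | grep mce > /tmp/mce.after
diff /tmp/mce.before /tmp/mce.after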
Comment 5 Michael 2022-02-07 15:08:56 UTC
  First, I did a little investigation to find out which commit breaks the connection when SR-IOV and the mlx5en VF are used. It is commit e059c120b4223fd5ec3af9def21c0519f439fe57.
  With the GENERIC kernel at the previous commit, a8e715d21b963251e449187c98292fff77dc7576, everything works as it should: the SR-IOV VF works for both ConnectX-3 (mlx4en) and ConnectX-4 (mlx5en).

root@frw05v5:/usr/src # git checkout e059c120b4223fd5ec3af9def21c0519f439fe57
Previous HEAD position was a8e715d21b9 mlx5en: Add race protection for SQ remap
HEAD is now at e059c120b42 mlx5en: Create and destroy all flow tables and rules when the network interface attaches and detaches.

  After the e059c120b42 checkout, the mlx5en VF network connection breaks.
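
  (For reference, this narrowing-down can also be done mechanically with git bisect; a sketch, using the bad commit above and the release/13.0.0 tag as a known-good starting point, since the 13.0 images work - see below. Each step needs a kernel rebuild, install, reboot and retest:)

cd /usr/src
git bisect start
git bisect bad e059c120b4223fd5ec3af9def21c0519f439fe57
git bisect good release/13.0.0
# rebuild and install the kernel, reboot, retest SR-IOV, then mark the step:
# git bisect good    (or: git bisect bad)
git bisect reset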

> How do you verify that the second interface stopped working?
    The working state is checked in a simple way: from the VM we ping an IP address outside the VM (on the same subnet, of course), and from outside we ping the IP address of the VM network interface in question.
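    For example (172.27.172.1 is just a placeholder for a host on the same subnet; 172.27.172.24 is the hn1 address from the original report):

# from inside the VM:
ping -c 3 172.27.172.1
# from a host outside the VM on the same subnet:
ping -c 3 172.27.172.24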

> If only the second SR-IOV NIC (Mellanox CX-4) causes the problem, what happens if you enable only this NIC
> and not the first one (Mellanox CX-3)? Does the second interface work in this case?
    I cited the first network interface as an example to show that SR-IOV VF technology is operational on the hypervisor. And no: even with only the ConnectX-4 mlx5en network interface enabled, the behavior is as described at the beginning.

> How do you enable and disable the SR-IOV interfaces?
    Hyper-V Manager -> right-click the VM -> Settings -> Network Adapter -> Hardware Acceleration -> check the "Enable SR-IOV" checkbox.
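    (Inside the guest, the VF appearing or disappearing after toggling that checkbox can also be confirmed with pciconf, for example:)

pciconf -lv | grep -B 1 Mellanox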

Secondly.

FreeBSD-12.3-STABLE-amd64-20220203-r371543-disc1.iso - no changes made (system installed out of the box) - everything is OK: the SR-IOV VF works for both ConnectX-3 (mlx4en) and ConnectX-4 (mlx5en).

FreeBSD-13.0-STABLE-amd64-20220203-40b816bd4f0-249223-disc1.iso - no changes made (system installed out of the box) - everything is OK: the SR-IOV VF works for both ConnectX-3 (mlx4en) and ConnectX-4 (mlx5en).
Comment 6 Wei Hu 2022-02-07 16:25:15 UTC
I was able to reproduce this on a VM in Azure as well. The following commit broke the CX-4 VF driver on Hyper-V:

commit e059c120b4223fd5ec3af9def21c0519f439fe57
Author: Hans Petter Selasky <hselasky@FreeBSD.org>
Date:   Tue Feb 1 16:20:12 2022 +0100

    mlx5en: Create and destroy all flow tables and rules when the network interface attaches and detaches.


Adding HPS for comment and further investigation.
Comment 7 Hans Petter Selasky freebsd_committer freebsd_triage 2022-02-08 15:22:38 UTC
I'm sorry for the breakage.

I'll look into it ASAP.

Possibly I should have waited before pushing this patch to 13-stable.

Let's hope there is a quick fix.
Comment 8 Hans Petter Selasky freebsd_committer freebsd_triage 2022-02-08 16:13:24 UTC
I will try to reproduce later today.

Meanwhile:

Does manually re-adding the IP address for the hn/mce interface or setting the link up/down change anything?
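
I mean roughly the following (a sketch, using the hn1 address from the original report; adjust to the actual address):

ifconfig hn1 inet 172.27.172.24 -alias
ifconfig hn1 inet 172.27.172.24 netmask 255.255.255.0
ifconfig hn1 down && ifconfig hn1 up
ifconfig mce0 down && ifconfig mce0 up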

--HPS
Comment 9 Michael 2022-02-08 17:39:36 UTC
> Does manually re-adding the IP address for the hn/mce interface or setting
> the link up/down change anything?

No. Up/down doesn't change anything.

root@frw05v5:~ # ifconfig
. . .
hn1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8051b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,LRO,LINKSTATE>
        ether 00:15:5d:d0:8b:43
        inet 172.27.172.23 netmask 0xffffff00 broadcast 172.27.172.255
        media: Ethernet 10GBase-CR1 <full-duplex,rxpause,txpause>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
mce0: flags=8a43<UP,BROADCAST,RUNNING,ALLMULTI,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8805bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,LINKSTATE>
        ether 00:15:5d:d0:8b:43
        media: Ethernet 10GBase-CR1 <full-duplex,rxpause,txpause>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
root@frw05v5:~ # ifconfig hn1 down
root@frw05v5:~ # ifconfig hn1 up
root@frw05v5:~ # ifconfig mce0 down
root@frw05v5:~ # ifconfig mce0 up
Comment 10 Kubilay Kocak freebsd_committer freebsd_triage 2022-02-08 22:36:33 UTC
^Triage: Update Version to reflect earliest affected branch/version. Original report is for CURRENT.
Comment 11 Hans Petter Selasky freebsd_committer freebsd_triage 2022-02-09 15:20:57 UTC
Hi,

Can you verify executing the following two commands gets communication back?

ifconfig mce0 promisc
ifconfig mce0 -promisc

--HPS
Comment 12 Hans Petter Selasky freebsd_committer freebsd_triage 2022-02-09 15:33:27 UTC
Created attachment 231684 [details]
Permanent patch to try
Comment 13 Michael 2022-02-09 16:59:28 UTC
  Yes, after this command:

root@frw05v04:~ # ifconfig mce0 promisc

  the connection comes back, and after this command:

root@frw05v04:~ # ifconfig mce0 -promisc

  the connection continues to work.
Comment 14 Michael 2022-02-09 18:53:12 UTC
> Created attachment 231684 [details]
> Permanent patch to try

The patch also fixes network communication over the SR-IOV VF of the mlx5en network adapter.
Comment 15 commit-hook freebsd_committer freebsd_triage 2022-02-10 10:18:54 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=04f407a3e5e7bf452768201ace260b575f1a7924

commit 04f407a3e5e7bf452768201ace260b575f1a7924
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2022-02-10 10:12:21 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2022-02-10 10:17:42 +0000

    mlx5en: Make sure the NIC IP addresses are written to firmware on link up.

    Fixes e059c120b4223fd5ec3af9def21c0519f439fe57 .

    PR:             261746
    MFC after:      1 day
    Sponsored by:   NVIDIA Networking

 sys/dev/mlx5/mlx5_en/mlx5_en_flow_table.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
Comment 16 Hans Petter Selasky freebsd_committer freebsd_triage 2022-02-10 10:20:27 UTC
Will be MFC'ed tomorrow.
Comment 17 commit-hook freebsd_committer freebsd_triage 2022-02-11 10:16:20 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=16635c7b213a8da75bd54cf81abb984f69b0bbc5

commit 16635c7b213a8da75bd54cf81abb984f69b0bbc5
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2022-02-10 10:12:21 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2022-02-11 10:15:00 +0000

    mlx5en: Make sure the NIC IP addresses are written to firmware on link up.

    Fixes e059c120b4223fd5ec3af9def21c0519f439fe57 .

    PR:             261746
    Sponsored by:   NVIDIA Networking

    (cherry picked from commit 04f407a3e5e7bf452768201ace260b575f1a7924)

 sys/dev/mlx5/mlx5_en/mlx5_en_flow_table.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)