Bug 222996 - FreeBSD 11.1-12 on Hyper-V with PCI Express Pass Through
Summary: FreeBSD 11.1-12 on Hyper-V with PCI Express Pass Through
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 12.0-RELEASE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-virtualization mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-10-14 07:42 UTC by Dmitry
Modified: 2019-09-04 11:11 UTC
CC: 5 users

See Also:


Attachments
Log (24.49 KB, image/png)
2017-10-14 07:42 UTC, Dmitry
pciconf -lbv (10.46 KB, image/png)
2018-06-28 13:25 UTC, Dmitry

Description Dmitry 2017-10-14 07:42:34 UTC
Created attachment 187158 [details]
Log

On Windows Server 2016 Hyper-V I have set up PCI passthrough of Intel NICs to a Gen 2 VM guest running FreeBSD 11.1-RELEASE-p1 (pfSense 2.4), and I have trouble with the NIC setup (boot log in the attachment). I have also installed FreeBSD 12, but it is the same thing.
To verify that PCI passthrough itself works, I tested another OS, Ubuntu 17.04 (same VM configuration, just a different disk image): it works without any issues.
Maybe I need to change something in the OS configuration files?
Comment 1 Sepherosa Ziehau 2017-10-15 01:50:18 UTC
Do you mean pass-through or SR-IOV?

For SR-IOV, we have got mlx4_en (ConnectX-3) working, which is also what is used in Azure.  Last time we tested, ixgbe's VF did not work.

PCI pass-through requires some extra per-VM configuration through PowerShell; I will check with Dexuan next Monday.
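For reference, the per-VM DDA setup on the host side is usually done with commands along these lines (a sketch based on Microsoft's DDA documentation; $instanceId and $vmName are placeholders, and the location path is device-specific):

# On the Hyper-V host: detach the device from the host and assign it to the VM
$locationPath = (Get-PnpDeviceProperty -KeyName DEVPKEY_Device_LocationPaths -InstanceId $instanceId).Data[0]
Dismount-VMHostAssignableDevice -LocationPath $locationPath -Force
Add-VMAssignableDevice -LocationPath $locationPath -VMName $vmName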
Comment 2 Dmitry 2017-10-15 05:35:48 UTC
No, I do not mean SR-IOV.
I already have the extra PowerShell configuration for PCI passthrough, and it works for other OSes in a Generation 2 Hyper-V VM, just not FreeBSD.
I have reconfigured the VM as a legacy Generation 1 VM, and PCI passthrough works great in FreeBSD 11.1, but in the new UEFI Generation 2 VM passthrough does not work.
Comment 3 Sepherosa Ziehau 2017-10-15 07:34:12 UTC
OK, I see.  Please post the bootverbose dmesg, if possible.
Comment 4 Dmitry 2017-11-08 07:05:16 UTC
I have noticed that in a Generation 1 Hyper-V 2016 VM, PCI passthrough works without issues; in a Generation 2 VM it does not work.
Comment 5 Dexuan Cui 2018-06-19 21:38:36 UTC
Not sure why I didn't see the bug in time. Sorry. :-)

Here the error in the attachment in comment #1 is: "Setup of Shared code failed, error -2".

#define E1000_ERR_PHY                   2

It looks like dev/e1000/if_em.c: e1000_setup_init_funcs() -> e1000_init_phy_params() fails.
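In call-chain form (simplified and illustrative, not the exact sources), the -2 comes back like this:

error = e1000_setup_init_funcs(hw, TRUE);   /* called from the em attach path in if_em.c */
  -> e1000_init_phy_params(hw)              /* shared-code dispatch */
    -> hw->phy.ops.init_params(hw)          /* e1000_init_phy_params_82571 for this family */
      -> e1000_get_phy_id_82571(hw)         /* PHY ID read over the MDIC register */
         which fails and propagates -E1000_ERR_PHY, i.e. -2, printed by em as "error -2".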

I'm not sure why it works with Gen-1 VM, but not with Gen-2.

What's the PCI Device ID of the Intel NIC? I'm not sure if I have the same NIC here.
Comment 6 Dmitry 2018-06-20 04:56:58 UTC
Hi, it is an Intel® 82583V Gigabit Ethernet NIC; I don't know how to find the PCI Device ID...
Comment 7 Dexuan Cui 2018-06-20 23:59:28 UTC
(In reply to Dmitry from comment #6)
After you successfully assign the NIC to a Linux VM (or a Gen-1 FreeBSD VM), please run "lspci -vvv -x" (Linux) or "pciconf -lbv" (FreeBSD).
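On FreeBSD the PCI IDs are in the "chip" field of the pciconf output. The line below is only illustrative (confirm the values from your own output):

em0@pci0:0:0:0: class=0x020000 card=0x00008086 chip=0x150c8086 rev=0x00 hdr=0x00

The low 16 bits of "chip" are the vendor ID (0x8086) and the high 16 bits are the device ID (0x150c in this example).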
Comment 8 Dmitry 2018-06-28 13:23:31 UTC
(In reply to Dexuan Cui from comment #7)

Here it is.
Comment 9 Dmitry 2018-06-28 13:25:01 UTC
Created attachment 194707 [details]
pciconf -lbv
Comment 10 Dmitry 2018-10-03 18:19:46 UTC
Hi, is there any good news? I have tested the FreeBSD 11.2 release, and there is the same problem with Intel NIC passthrough on Gen 2 Hyper-V (Microsoft Windows Server 2016).
Comment 11 Dexuan Cui 2018-10-03 18:23:33 UTC
+Wei
Comment 12 Dmitry 2018-12-25 14:13:30 UTC
I have tried it on Windows Server 2019, and the same problem is still here. On Gen 1 everything works OK; on Gen 2 I get:

pcib0: <Hyper-V PCI Express Pass Through> on vmbus0
pci0: <PCI bus> on pcib0
em0: <Intel(R) PRO/1000 Network Connection 7.6.1-k> at device 0.0 on pci0
em0: Setup of Shared code failed, error -2
device_attach: em0 attach returned 6

I have tried to boot FreeBSD 12, and get:

pcib0: <Hyper-V PCI Express Pass Through> on vmbus0
pcib0: failed to get resources for cfg window
device_attach: pcib0 attach returned 6

Is there still no hope of getting Hyper-V Gen 2 PCI passthrough working on FreeBSD in the near future?
Comment 13 Dmitry 2018-12-25 14:31:53 UTC
FreeBSD 13-CURRENT behaves the same as FreeBSD 12.
Comment 14 Dexuan Cui 2018-12-25 18:02:49 UTC
(In reply to Dmitry from comment #13)
@Wei, can you please take a look?
Comment 15 Dmitry 2018-12-25 18:57:56 UTC
I can give SSH access to the Gen 2 VM with FreeBSD 11.1, if needed.
Comment 16 Wei Hu 2019-01-14 09:40:03 UTC
There are two issues. 

1. For FreeBSD 12 and the current head branch, passthrough support on Hyper-V Gen 2 VMs was broken by commit r330113. No devices work on these releases. I have asked Andrew@freebsd.org to take a look at this problem.

2. On 11.2 and earlier releases, passthrough generally works on Gen 2 FreeBSD VMs. I have checked one Intel NIC using the igb driver, and it works fine. However, Dmitry reported that it doesn't work with the Intel 82583V NIC, which uses the em driver on FreeBSD. I don't have such a NIC, so I am using a VM that Dmitry provided to debug this issue. It looks like it fails in the routine that reads the NIC's MDI control register:

/**
 *  e1000_read_phy_reg_mdic - Read MDI control register
 *  @hw: pointer to the HW structure
 *  @offset: register offset to be read
 *  @data: pointer to the read data
 *
 *  Reads the MDI control register in the PHY at offset and stores the
 *  information read to data.
 **/
s32 e1000_read_phy_reg_mdic(struct e1000_hw *hw, u32 offset, u16 *data)
{
        struct e1000_phy_info *phy = &hw->phy;
        u32 i, mdic = 0;

        ...
        /* Poll the ready bit to see if the MDI read completed
         * Increasing the time out as testing showed failures with
         * the lower time out
         */
        for (i = 0; i < (E1000_GEN_POLL_TIMEOUT * 10); i++) {
                usec_delay_irq(50);
                mdic = E1000_READ_REG(hw, E1000_MDIC);
                if (mdic & E1000_MDIC_READY)
                        break;
        }
        if (!(mdic & E1000_MDIC_READY)) {
                DEBUGOUT("MDI Read did not complete\n");        <--- saw this message when driver's debug flag is on.
                return -E1000_ERR_PHY;
        }
        ...
}

It looks like the register status never turns ready. Here is some dmesg output with the driver's debug flag on:

em0: Lazy allocation of 0x20000 bytes rid 0x10 type 3 at 0xf8000000
em0: in em_allocate_pci_resource, bus tag is 0x1, handle is 0xfffff800f8000000
e1000_set_mac_type
e1000_init_mac_ops_generic
e1000_init_phy_ops_generic
e1000_init_nvm_ops_generic
e1000_init_function_pointers_82571
e1000_init_mac_params_82571
e1000_init_nvm_params_82571
e1000_init_phy_params_82571
e1000_get_phy_id_82571
e1000_read_phy_reg_bm2
e1000_get_hw_semaphore_82574
e1000_get_hw_semaphore_82573
e1000_read_phy_reg_mdic
MDI Read did not complete             <-- 
e1000_put_hw_semaphore_82574
e1000_put_hw_semaphore_82573
Error getting PHY ID

Will update more.
Comment 17 Wei Hu 2019-01-18 07:50:55 UTC
It turns out that either the address range between 0xf8000000 and 0xf87fffff on Hyper-V doesn't work for MMIO, or FreeBSD has a bug that incorrectly assigns device MMIO addresses into this range. When I manually assign the em0 MMIO address at or above 0xf8800000, the passthrough NIC works fine.

On the same Gen 2 Linux VM, the device was assigned an MMIO address above this range. On a Gen 1 FreeBSD VM, the same card was assigned an address above 4 GB. So both of these work fine.

The workaround I am using is to intercept the memory allocation request and change the starting address to 0xf8800000 whenever a request falls into this problem range. This works fine at the customer's site. However, we still need to root-cause why this only happens on FreeBSD Gen 2 VMs and not on Linux Gen 2 VMs.
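In code form the workaround looks roughly like this (a sketch only, not the actual patch; the hook point and names are assumptions, placed in the vmbus PCI bridge's resource allocation path):

/* Problematic low-MMIO window observed on this host. */
#define HV_BAD_MMIO_START  0xf8000000UL
#define HV_BAD_MMIO_END    0xf87fffffUL

static rman_res_t
vmbus_pcib_adjust_mmio_start(rman_res_t start)
{
        /* Any MMIO request starting inside the bad window is moved up
         * to 0xf8800000, where the passthrough NIC is known to work. */
        if (start >= HV_BAD_MMIO_START && start <= HV_BAD_MMIO_END)
                return (HV_BAD_MMIO_END + 1);
        return (start);
}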
Comment 18 Michael 2019-08-14 12:02:49 UTC
(In reply to Wei Hu from comment #17)

How do I manually assign the MMIO address at or above 0xf8800000?

I have Windows Server 2019 Hyper-V and an Intel X710. SR-IOV does not work :(

The host has the latest Intel drivers for the X710 network adapter (i40ea68.sys 1.10.128.0) and is running Windows Server 2019. The guest is a second-generation virtual machine running FreeBSD-CURRENT built today with the GENERIC kernel.

hn0 without SR-IOV function
hn1 with SR-IOV

Log messages from virtual machine:

Aug 14 10:56:18 r03 kernel: ---<<BOOT>>---
Aug 14 10:56:18 r03 kernel: Copyright (c) 1992-2019 The FreeBSD Project.
Aug 14 10:56:18 r03 kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Aug 14 10:56:18 r03 kernel:     The Regents of the University of California. All rights reserved.
Aug 14 10:56:18 r03 kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
Aug 14 10:56:18 r03 kernel: FreeBSD 13.0-CURRENT a2166b0cec5-c261904(master) R03 amd64
Aug 14 10:56:18 r03 kernel: FreeBSD clang version 8.0.1 (tags/RELEASE_801/final 366581) (based on LLVM 8.0.1)
Aug 14 10:56:18 r03 kernel: SRAT: Ignoring memory at addr 0x1fc000000
...
Aug 14 10:56:18 r03 kernel: SRAT: Ignoring memory at addr 0xc0000000000
Aug 14 10:56:18 r03 kernel: VT(efifb): resolution 1024x768
Aug 14 10:56:18 r03 kernel: Hyper-V Version: 10.0.17763 [SP0]
Aug 14 10:56:18 r03 kernel:   Features=0x2e7f<VPRUNTIME,TMREFCNT,SYNIC,SYNTM,APIC,HYPERCALL,VPINDEX,REFTSC,IDLE,TMFREQ>
Aug 14 10:56:18 r03 kernel:   PM Features=0x0 [C2]
Aug 14 10:56:18 r03 kernel:   Features3=0xbed7b2<DEBUG,XMMHC,IDLE,NUMA,TMFREQ,SYNCMC,CRASH,NPIEP>
Aug 14 10:56:18 r03 kernel: Timecounter "Hyper-V" frequency 10000000 Hz quality 2000
Aug 14 10:56:18 r03 kernel: module iavf already present!
Aug 14 10:56:18 r03 kernel: CPU: Intel(R) Xeon(R) CPU E5-2623 v3 @ 3.00GHz (2996.53-MHz K8-class CPU)
Aug 14 10:56:18 r03 kernel:   Origin="GenuineIntel"  Id=0x306f2  Family=0x6  Model=0x3f  Stepping=2
...
Aug 14 10:56:18 r03 kernel: vmbus0: <Hyper-V Vmbus> on acpi_syscontainer0
Aug 14 10:56:18 r03 kernel: vmbus_res0: <Hyper-V Vmbus Resource> irq 5 on acpi0
...
Aug 14 10:56:18 r03 kernel: vmbus0: version 4.0
Aug 14 10:56:18 r03 kernel: hvet0: <Hyper-V event timer> on vmbus0
Aug 14 10:56:18 r03 kernel: Event timer "Hyper-V" frequency 10000000 Hz quality 1000
Aug 14 10:56:18 r03 kernel: hvkbd0: <Hyper-V KBD> on vmbus0
Aug 14 10:56:18 r03 kernel: hvheartbeat0: <Hyper-V Heartbeat> on vmbus0
Aug 14 10:56:18 r03 kernel: hvkvp0: <Hyper-V KVP> on vmbus0
Aug 14 10:56:18 r03 kernel: hvshutdown0: <Hyper-V Shutdown> on vmbus0
Aug 14 10:56:18 r03 kernel: hvtimesync0: <Hyper-V Timesync> on vmbus0
Aug 14 10:56:18 r03 kernel: hvtimesync0: RTT
Aug 14 10:56:18 r03 kernel: hvvss0: <Hyper-V VSS> on vmbus0
Aug 14 10:56:18 r03 kernel: storvsc0: <Hyper-V SCSI> on vmbus0
Aug 14 10:56:18 r03 kernel: hn0: <Hyper-V Network Interface> on vmbus0
Aug 14 10:56:18 r03 kernel: hn0: Ethernet address: 00:15:5d:00:88:29
Aug 14 10:56:18 r03 kernel: hn0: link state changed to UP
Aug 14 10:56:18 r03 kernel: hn1: <Hyper-V Network Interface> on vmbus0
Aug 14 10:56:18 r03 kernel: hn1: got notify, nvs type 128
Aug 14 10:56:18 r03 kernel: hn1: Ethernet address: 00:15:5d:00:88:2a
Aug 14 10:56:18 r03 kernel: hn1: link state changed to UP
Aug 14 10:56:18 r03 kernel: pcib0: <Hyper-V PCI Express Pass Through> on vmbus0
Aug 14 10:56:18 r03 kernel: pci0: <PCI bus> on pcib0
Aug 14 10:56:18 r03 kernel: pci0: <network, ethernet> at device 2.0 (no driver attached)
Aug 14 10:56:18 r03 kernel: lo0: link state changed to UP
...

root@r03:/# pciconf -lvb
none0@pci2:0:2:0:       class=0x020000 card=0x00018086 chip=0x15718086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Virtual Function 700 Series'
    class      = network
    subclass   = ethernet
    bar   [10] = type Prefetchable Memory, range 64, base r, size 65536, disabled
    bar   [1c] = type Prefetchable Memory, range 64, base r, size 16384, disabled
Comment 19 Wei Hu 2019-08-14 12:29:05 UTC
(In reply to Michael from comment #18)
This bug is about a PCI passthrough device, not SR-IOV. For SR-IOV, we currently only support Mellanox NICs on Hyper-V.

For the PCI passthrough breakage, I have not checked in a fix yet. I think we can just reserve the MMIO range (0xf8000000, 0xf8800000) so it won't get assigned to other devices. If you are talking about PCI passthrough, maybe you can provide a reproducible test machine so I can verify the fix on it.
Comment 20 Dexuan Cui 2019-08-14 15:32:37 UTC
(In reply to Wei Hu from comment #19)
> For the PCI passthrough breakage, I have not checked in a fix yet.
> I think we can just reserve the MMIO range (0xf8000000, 0xf8800000)
FWIW, we cannot hardcode the MMIO range that should be reserved, because the range's base address can change if the user sets the Low MMIO space size with
something like "Set-VM -LowMemoryMappedIoSpace 3Gb -VMName $vm" (see https://docs.microsoft.com/en-us/windows-server/virtualization/hyper-v/plan/plan-for-deploying-devices-using-discrete-device-assignment).
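For reference, the examples in that documentation size the MMIO space like this (the values are the documentation's examples and should be adjusted for the devices being assigned):

Set-VM -LowMemoryMappedIoSpace 3Gb -VMName $vm
Set-VM -HighMemoryMappedIoSpace 33280Mb -VMName $vm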
Comment 21 Dexuan Cui 2019-08-14 15:38:40 UTC
(In reply to Michael from comment #18)
Hi Michael,
Wei's patch is needed to make the MMIO mapping work.

However, it looks like your log shows the Intel VF driver doesn't load at all. Did you build the VF driver and make it auto-load?

When the VF driver loads, what's the error from it, if any?
Comment 22 Dexuan Cui 2019-08-15 03:16:09 UTC
(In reply to Dexuan Cui from comment #21)
Hi Michael,
It looks like you're using a Generation 2 VM on Hyper-V? Did you try a Generation 1 VM, and does it work for you?
Comment 23 Michael 2019-08-17 14:25:28 UTC
Passthrough of an Intel X710 to a FreeBSD-CURRENT guest in a Windows Server 2019 Gen 2 Hyper-V VM works, but not as we would like.

Steps taken:
1. Disable the adapter in Device Manager on the host.
2. Add the adapter to the R03 VM:
$MyEth = Get-PnpDevice  -PresentOnly | Where-Object {$_.Class -eq "Net"} | Where-Object {$_.Status -eq "Error"}
$DataOfNetToDDismount = Get-PnpDeviceProperty DEVPKEY_Device_LocationPaths -InstanceId $MyEth[0].InstanceId
$locationpath = ($DataOfNetToDDismount).data[0]
Dismount-VmHostAssignableDevice -locationpath $locationpath -force
Get-VMHostAssignableDevice
Add-VMAssignableDevice -LocationPath $locationpath -VMName R03

R03 dmesg:
pcib0: <Hyper-V PCI Express Pass Through> on vmbus0
pci0: <PCI bus> on pcib0
ixl0: <Intel(R) Ethernet Controller X710 for 10GbE SFP+ - 2.1.0-k> at device 0.0 on pci0
ixl0: fw 5.40.47690 api 1.5 nvm 5.40 etid 80002d35 oem 1.264.0
ixl0: PF-ID[1]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
ixl0: Using 1024 TX descriptors and 1024 RX descriptors
ixl0: queue equality override not set, capping rx_queues at 4 and tx_queues at 4
ixl0: Using 4 RX queues 4 TX queues
ixl0: Using MSI-X interrupts with 5 vectors
ixl0: Ethernet address: 3c:fd:fe:21:02:e2
ixl0: Allocating 4 queues for PF LAN VSI; 4 queues active
ixl0: PCI Express Bus: Speed 8.0GT/s Width x8
ixl0: Failed to initialize SR-IOV (error=2)
ixl0: netmap queues/slots: TX 4/1024, RX 4/1024
ixl0: Link is up, 10 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: False, Flow Control: Full
ixl0: link state changed to UP

3. Add vlan to ixl0
root@r03:~ # ifconfig vlan48 create
root@r03:~ # ifconfig vlan48 vlan 48 vlandev ixl0
root@r03:~ # ifconfig vlan48 inet 10.10.221.51/24
root@r03:~ # ping 10.10.221.1
PING 10.10.221.1 (10.10.221.1): 56 data bytes
ping: sendto: Network is down
ping: sendto: Network is down
^C
--- 10.10.221.1 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
root@r03:~ # ifconfig ixl0 up
root@r03:~ # ping 10.10.221.1
PING 10.10.221.1 (10.10.221.1): 56 data bytes
64 bytes from 10.10.221.1: icmp_seq=0 ttl=64 time=0.965 ms
64 bytes from 10.10.221.1: icmp_seq=1 ttl=64 time=0.381 ms
64 bytes from 10.10.221.1: icmp_seq=2 ttl=64 time=0.526 ms
^C
--- 10.10.221.1 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.381/0.624/0.965/0.248 ms
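
Note that the parent interface has to be up before the VLAN passes traffic, as the ping above shows. To make this persistent across reboots, the rc.conf equivalent would be roughly the following sketch, reusing the names and addresses from step 3:

ifconfig_ixl0="up"
cloned_interfaces="vlan48"
ifconfig_vlan48="inet 10.10.221.51/24 vlan 48 vlandev ixl0"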

4. Remove X710 adapter from R03 VM for "neutered" live migration
$locationpath = (Get-VmHostAssignableDevice).LocationPath
Remove-VMAssignableDevice -LocationPath $locationpath -VMName R03
Mount-VmHostAssignableDevice -locationpath $locationpath

R03 dmesg:
ixl0: Vlan in use, detach first
ixl0: Vlan in use, detach first
pcib0: detached
ixl0: Vlan in use, detach first

5. Add the adapter back to the migrated R03 VM (repeat steps 1 and 2):
root@r03:~ # ifconfig
...
ixl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 3c:fd:fe:21:02:e2
        media: Ethernet autoselect (10GBase-AOC <full-duplex,rxpause,txpause>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan48: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=600703<RXCSUM,TXCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 3c:fd:fe:21:02:e2
        inet 10.10.221.51 netmask 0xffffff00 broadcast 10.10.221.255
        groups: vlan
        vlan: 48 vlanpcp: 0 parent interface: ixl0
        media: Ethernet autoselect (10GBase-AOC <full-duplex,rxpause,txpause>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
ixl1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 3c:fd:fe:21:02:e2
        media: Ethernet autoselect (10GBase-AOC <full-duplex,rxpause,txpause>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
root@r03:~ # ping 10.10.221.1
PING 10.10.221.1 (10.10.221.1): 56 data bytes
^C
--- 10.10.221.1 ping statistics ---
5 packets transmitted, 0 packets received, 100.0% packet loss

How can I make the adapter come back in its place as ixl0, rather than appearing as a duplicate ixl1? Live migration is one of the cornerstones of virtualization.

Also, what does the following error message mean for normal operation:
ixl0: Failed to initialize SR-IOV (error=2)
Comment 24 Dexuan Cui 2019-08-17 15:30:30 UTC
(In reply to Michael from comment #23)
It looks like Wei's patch might help. Wei, please comment.
Comment 25 Wei Hu 2019-08-19 09:06:41 UTC
(In reply to Michael from comment #23)
Can you share the 'pciconf -lvb' output both when the passthrough NIC is working and when it is not? Please also share your /boot/loader.conf file.
Comment 26 Wei Hu 2019-09-04 11:11:57 UTC
I think Comment 18 and Comment 23 are two different scenarios. 

Comment 18 shows a true SR-IOV VF as seen by the FreeBSD guest OS. The device ID for this NIC VF is 0x1571. It should be supported by the Intel ixlv driver; however, the driver in FreeBSD (both 11.3 and head) doesn't have 0x1571 enabled. That's why the driver is not loaded and the BARs are not configured in comment 18. The ixlv driver needs additional code to support the Hyper-V case. If you can provide an environment for me to test, I may be able to enable it.
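
If it is just the probe table, the change would presumably start with something like this (a sketch, not a tested patch; it assumes the shared code already defines I40E_DEV_ID_VF_HV as 0x1571, and the array/macro names follow the iflib convention in head and may differ in the actual sources):

/* Teach the VF driver to probe the Hyper-V flavor of the 700-series VF. */
static pci_vendor_info_t iavf_vendor_info_array[] = {
        PVID(I40E_INTEL_VENDOR_ID, I40E_DEV_ID_VF,
            "Intel(R) Ethernet Virtual Function 700 Series"),
        PVID(I40E_INTEL_VENDOR_ID, I40E_DEV_ID_VF_HV,
            "Intel(R) Ethernet Virtual Function 700 Series (Hyper-V)"),
        PVID(I40E_INTEL_VENDOR_ID, I40E_DEV_ID_ADAPTIVE_VF,
            "Intel(R) Ethernet Adaptive Virtual Function"),
        PVID_END
};

Additional changes (for example, mailbox handling for the Hyper-V case) may still be needed beyond the ID table.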

The scenario in comment 23 looks like a passthrough device to me, which in Microsoft's terminology is DDA (Discrete Device Assignment). I am not sure which device ID is seen in the guest, since Michael did not mention it. Since the ixl driver was loaded successfully and the NIC works, it is not the same VF as the one in comment 18.

Neither scenario is related to the original error in this bug, so they would be better tracked in a new bug.