Bug 243640

Summary: QEMU / KVM Q35 V4.X PCIe Virtual and Physical (Passthrough) Devices not detected
Product: Base System
Component: kern
Status: Open
Severity: Affects Some People
Priority: ---
Version: 12.1-STABLE
Hardware: amd64
OS: Any
Reporter: John Hartley <drum>
Assignee: freebsd-virtualization (Nobody) <virtualization>
CC: accounts.steven.roch, christian.rohmann, freebsd, freebsd, mzaferyahsi, wyatt
Keywords: regression

Description John Hartley 2020-01-27 01:43:16 UTC
Bug / Defect:

PCIe attached devices are not detected when running FreeBSD 12.1 on QEMU Q35 V4.x Virtual Machines.

This bug affects the following PCIe based devices:

VirtIO - All
em - When using e1000e QEMU emulator (PCIe attached Intel 1GbE NIC)
ix - PCI Passthrough Intel X550 10GbE NIC

Likely all other PCIe devices, whether via emulation or PCI Passthrough.

The issue was discovered while testing:

Q35 with VirtIO:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236922

Q35 / OVMF with SCSI and Network Devices: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241774

It also appears to be the root cause of the defect raised with Q35 V4 and PCI Passthrough:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241581


Observed Behaviour:

Running a QEMU/KVM VM with Q35 V4.x / OVMF / SATA / VirtIO / e1000e / e1000 / vmxnet3 / PCI Passthrough to Intel X550 10GbE, on FreeBSD 12.1 with the kernel recompiled to disable netmap ("dev netmap" in sys/amd64/conf/GENERIC). This works around the bug found with Q35 + netmap devices (see the reported and resolved bugs linked above).

SATA - disk found
VirtIO - disk not found
e1000e - PCIe NIC not found
e1000 - Legacy PCI NIC found as em0
vmxnet3 - PCI connected NIC found as vmx0
X550 10GbE - not found.

Get the following:

dmesg errors

<<DMESG>>
...
pcib2: <PCI-PCI bridge> mem 0xc8b87000-0xc8b87fff irq 22 at device 2.1 on pci0
pcib2: Failed to allocate interrupt for PCI-e events
pcib3: <PCI-PCI bridge> mem 0xc8b86000-0xc8b86fff irq 22 at device 2.2 on pci0
pcib3: Failed to allocate interrupt for PCI-e events
pcib4: <PCI-PCI bridge> mem 0xc8b85000-0xc8b85fff irq 22 at device 2.3 on pci0
pcib4: Failed to allocate interrupt for PCI-e events
pcib5: <PCI-PCI bridge> mem 0xc8b84000-0xc8b84fff irq 22 at device 2.4 on pci0
pcib5: Failed to allocate interrupt for PCI-e events
...
<<END DMESG>>
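
For anyone trying to reproduce this, a minimal sketch for pulling the affected bridges out of a boot log. The function name failed_pcie_bridges is only illustrative, and the pattern assumes the message text is exactly as in the dmesg excerpt above:

```shell
# List the bridges that reported the PCI-e event interrupt failure.
# Reads dmesg-style text on stdin, so it also works on a saved log.
failed_pcie_bridges() {
    sed -n 's/^\(pcib[0-9][0-9]*\): Failed to allocate interrupt for PCI-e events.*/\1/p'
}

# Usage on the FreeBSD guest:
#   dmesg | failed_pcie_bridges
```

If this prints nothing on a Q35 V4.x guest, the failure is likely showing up with different wording and the pattern needs adjusting.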

ifconfig

<<IFCONFIG>>
# ifconfig -a
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
	inet 127.0.0.1 netmask 0xff000000
	groups: lo
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=e403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
	ether 52:53:01:17:15:aa
	inet XX.XXX.XXX.53 netmask 0xffffff80 broadcast 203.XXX.XXX.127
	media: Ethernet autoselect
	status: active
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
em0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=81209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER>
	ether 52:54:00:a4:13:df
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
<<END IFCONFIG>>


Expected Behaviour:

All NICs should be visible:
em0 (e1000), em1 (e1000e), ix0 (X550 10GbE PCI Passthrough), vmx0 (vmxnet3)

Should get:
-- SATA - /dev/adaNpN storage devices
-- SCSI - /dev/daNpN VirtIO SCSI devices
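
A quick way to compare the expected NIC list above against what the guest actually attached is sketched below. The helper name missing_nics is hypothetical; it only assumes that ifconfig -l prints the interface names separated by spaces:

```shell
# Print each expected interface that is absent from the given list.
# $1 is a space-separated list of present interfaces; the remaining
# arguments are the interface names that are expected to exist.
missing_nics() {
    present=" $1 "
    shift
    for nic in "$@"; do
        case "$present" in
            *" $nic "*) ;;                 # interface attached, nothing to report
            *) printf '%s\n' "$nic" ;;     # interface missing
        esac
    done
}

# Usage on the FreeBSD guest:
#   missing_nics "$(ifconfig -l)" em0 em1 ix0 vmx0
```

With the interfaces shown in the failing ifconfig output above (lo0, vmx0, em0), this would report em1 and ix0 as missing.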


I then retested with the same configuration, except with QEMU Q35 V3.1:

Storage:
SATA - OK
VirtIO SCSI - only ok when built with new VirtIO sub-system as per https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236922

Networking:
e1000 - ok comes up as em0
e1000e - ok comes up as em1
vmxnet3 - ok comes up as vmx0
X550 10GbE - ok comes up as ix0

ifconfig

<<IFCONFIG>>
# ifconfig -a
vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=e403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
	ether 52:53:01:17:15:aa
	inet XXX.XXX.XXX.53 netmask 0xffffff80 broadcast XXX.XXX.XXX.127
	media: Ethernet autoselect
	status: active
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
em0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=81209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER>
	ether 52:54:00:a4:13:df
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
ix0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=e53fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
	ether b4:96:91:21:4a:ce
	media: Ethernet autoselect
	status: no carrier
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
em1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER>
	ether 52:54:00:f8:3b:94
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
	inet 127.0.0.1 netmask 0xff000000
	groups: lo
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
<<END IFCONFIG>>

Here is the libvirt XML snippet for QEMU Q35 V3.1 with PCI passthrough:

<<LIBVIRT XML>>
virsh dumpxml test-freebsd-12.1 
<domain type='kvm' id='5'>
  <name>test-freebsd-12.1</name>
  <uuid>a50005d7-7425-435f-82e9-e76f18784693</uuid>
  <metadata>
    <libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
      <libosinfo:os id="http://freebsd.org/freebsd/12.0"/>
    </libosinfo:libosinfo>
  </metadata>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>2</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-q35-3.1'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>
    <nvram>/home/XXX/DIR/OVMF_VARS.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <vmport state='off'/>
  </features>
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Broadwell-IBRS</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='vmx'/>
    <feature policy='require' name='f16c'/>
    <feature policy='require' name='rdrand'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='umip'/>
    <feature policy='require' name='md-clear'/>
    <feature policy='require' name='stibp'/>
    <feature policy='require' name='arch-capabilities'/>
    <feature policy='require' name='ssbd'/>
    <feature policy='require' name='xsaveopt'/>
    <feature policy='require' name='pdpe1gb'/>
    <feature policy='require' name='abm'/>
    <feature policy='disable' name='skip-l1dfl-vmentry'/>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/home/XXX/DIR/test-hd1-01.qcow2'/>
      <backingStore/>
      <target dev='sda' bus='sata'/>
      <boot order='1'/>
      <alias name='sata0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/home/XXX/DIR/FreeBSD-12.1-RELEASE-amd64-dvd1.iso'/>
      <backingStore/>
      <target dev='sdb' bus='sata'/>
      <readonly/>
      <boot order='2'/>
      <alias name='sata0-0-1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
...
...
    <controller type='sata' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'>
      <alias name='pcie.0'/>
    </controller>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x10'/>
      <alias name='pci.1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-to-pci-bridge'>
      <model name='pcie-pci-bridge'/>
      <alias name='pci.2'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x11'/>
      <alias name='pci.3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x12'/>
      <alias name='pci.4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x13'/>
      <alias name='pci.5'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0x14'/>
      <alias name='pci.6'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:18:15:aa'/>
      <source bridge='br20'/>
      <target dev='vnet0'/>
      <model type='vmxnet3'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x01' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='52:54:00:a4:13:df'/>
      <source bridge='br20'/>
      <target dev='vnet1'/>
      <model type='e1000'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x02' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='52:54:00:f8:3b:94'/>
      <source bridge='br20'/>
      <target dev='vnet2'/>
      <model type='e1000e'/>
      <alias name='net2'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </interface>
...
...
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x06' slot='0x00' function='0x1'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </hostdev>
...
...

</domain>

<<END LIBVIRT XML>>

Diagnosis:

Considerable testing and diagnosis has been done as part of finding and resolving the VirtIO and netmap issues.

This has shown how to replicate the issue (Q35 V4.x) and how to work around it (Q35 V3.1).

So the problem appears to be within the PCI device driver and relates to Q35 V4.x using emulation aligned with the PCIe Gen4 specification.

Tommy P has done considerable work to pinpoint the likely source of the issues: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241774#c74

I quote: "Starting with the Q35 QEMU 4.0 machine type, generic pcie-root-port will default to the maximum PCIe link speed (16GT/s) and width (x32) provided by the PCIe 4.0 specification."

This now needs to be confirmed by the FreeBSD PCI core team for resolution.

Please advise if you need additional testing or diagnostic information.

Cheers,


John Hartley.
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2020-06-20 16:10:28 UTC
So I am trying to understand how to classify this: does this issue have to do with a defect in the qemu port, a defect in the base system, or both?
Comment 2 John Hartley 2020-06-23 04:38:37 UTC
(In reply to Mark Linimon from comment #1)

Hi Mark,

I believe the bug is in the Base System, as it relates to the PCIe device driver and its behaviour with a particular "hardware" type (PCIe Gen4).

It would seem that the place to start looking is the PCIe device discovery code.

I have obviously only tested this on a virtual machine, but it could be that a similar issue occurs on a physical machine (depending on quality of equation).

Cheers,


John Hartley.
Comment 3 Wyatt 2020-06-30 05:27:42 UTC
Hello!  I experienced what I believe to have been this issue with LXC/QEMU.

A workaround I discovered is to specify "pc-q35-2.6" in the QEMU machine config.


My related thread - https://discuss.linuxcontainers.org/t/lxc-vm-running-freebsd-cant-see-hard-disk/8214/15
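
For libvirt users, the machine-type workaround described in this comment can be sketched as a small filter over the domain XML. The function name pin_q35_machine and the file name pinned.xml are illustrative only, and the pattern assumes the single-quoted attribute style that virsh dumpxml produces in the description above:

```shell
# Rewrite any pc-q35-* machine type in a libvirt domain XML (read on stdin)
# to the Q35 version given as $1, for example 3.1 or 2.6.
# Only the machine='...' attribute is changed; everything else passes through.
pin_q35_machine() {
    sed "s/machine='pc-q35-[0-9.]*'/machine='pc-q35-$1'/"
}

# Usage on the libvirt host (domain name taken from the description):
#   virsh dumpxml --inactive test-freebsd-12.1 | pin_q35_machine 3.1 > pinned.xml
#   virsh define pinned.xml
```

The guest needs a full power off and start for the new machine type to take effect. This only pins the older emulation; it does not address the underlying detection problem.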
Comment 4 John Hartley 2020-06-30 08:20:57 UTC
Hi Wyatt,

Yes, this workaround results in the use of Gen3/Gen3 PCIe emulation.

I have raised bug reports on various NIC/SCSI & VirtIO issues...

I also am running a pfsense machine and have deliberately avoided upgrading this so I do not hit these uncovered bugs.

I did test with 11.4 and believe that the issue with netmap is resolved, but virtio issues are not.

Cheers,


John Hartley.
Comment 5 Steven Roch 2020-07-12 19:14:31 UTC
I can confirm the issue still exists in Proxmox 6.2-9 with 'pve-qemu-kvm/stable 5.0.0-9 amd64' when trying PCIe passthrough of an LSI HBA with the PCIe-device option enabled.

FreeNAS did not recognize the HBA until I manually set machine to pc-q35-3.1 as suggested by John Hartley.
Comment 6 John Hartley 2020-07-13 04:30:29 UTC
(In reply to John Hartley from comment #2)

Hi Mark,

this should have been "depending on the quality of the emulation".

Has there been much testing with physical Gen4 PCIe machines?

These are still few and far between at the moment, last I looked only AMD had Gen4 PCIe machines.

In general though testing via emulation first is a good way to go as emulation is able to deliver new spec "hardware" faster than it can be produced in metal.

Cheers,


John Hartley.
Comment 7 Mina Galić freebsd_triage 2023-06-25 15:29:27 UTC
Has anyone been able to test this lately?
I believe quite a few issues are resolved in 13.2.
Comment 8 John Hartley 2023-07-03 01:53:45 UTC
(In reply to Mina Galić from comment #7)

Hi Mina Galic,

My latest testing has shown that unfortunately FreeBSD 13.1 & 13.2 have regressed and you cannot boot the ISO with modern firmware (UEFI):

QEMU KVM:
- Machine type: Q35 with OVMF / UEFI
- QEMU & Libvirt API Version 8.0.0
- QEMU Hypervisor Version: 6.2.0


Does not boot from CD-ROM at all.

So it looks like I need to raise a new bug report.

Cheers from Oz,


John Hartley.
Comment 9 John Hartley 2023-07-03 01:54:20 UTC
(In reply to Mina Galić from comment #7)

Hi Mina Galic,

My latest testing has shown that unfortunately FreeBSD 13.1 & 13.2 have regressed and you cannot boot the ISO with modern firmware (UEFI):

QEMU KVM:
- Machine type: Q35 with OVMF / UEFI
- QEMU & Libvirt API Version 8.0.0
- QEMU Hypervisor Version: 6.2.0


Does not boot from CD-ROM at all.

So it looks like I need to raise a new bug report.

Cheers from Oz,


John Hartley.
Comment 10 John Hartley 2023-07-18 03:24:17 UTC
(In reply to John Hartley from comment #9)
Hi Mina Galic,

I have now done further testing with 13.2.

You can get this up and running on Q35 V4.2 & 6.2, but you need to ensure that it is using specific OVMF (UEFI) firmware.

My libvirt configuration to achieve this:

<WORKS>
...
  <os>
    <type arch='x86_64' machine='pc-q35-6.2'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>
    <nvram>/home/XXXX/Documents/current.dev.freebsd/OVMF_VARS.fd</nvram>
  </os>
...
</WORKS>

The default UEFI / OVMF libvirt configuration is:

<FAILS>
...
  <os firmware='efi'>
    <type arch='x86_64' machine='pc-q35-6.2'>hvm</type>
  </os>
...
</FAILS>

This results in a different OVMF version being loaded, which then fails the UEFI boot.

With the working boot, e1000e (Intel 1GbE on PCIe bus) is found OK.
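
To see which firmware image a given domain definition points at, something like the sketch below may help. The function name loader_path is illustrative. Whether libvirt records the auto-selected firmware in the XML for the firmware='efi' case depends on the libvirt version, so treat that part as an assumption to verify on your host:

```shell
# Print the firmware path from the <loader> element of a libvirt domain XML
# read on stdin. Prints nothing when no <loader> element is present.
loader_path() {
    sed -n 's/.*<loader[^>]*>\([^<]*\)<\/loader>.*/\1/p'
}

# Usage on the libvirt host (domain name is a placeholder):
#   virsh dumpxml DOMAIN | loader_path
```

For the working configuration above this prints /usr/share/OVMF/OVMF_CODE.fd, which makes it easier to compare against whatever the failing definition resolves to.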

So it looks like 13.2 has fixed the problem with PCIe Devices, but I have not tested with real devices via PCIe passthrough...


Cheers,

John Hartley.