Bug 260178 - bhyve: passthru makes ahci-hd boot fail
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bhyve
Version: CURRENT
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-virtualization (Nobody)
Keywords: bhyve
Reported: 2021-12-02 19:24 UTC by Bjoern A. Zeeb
Modified: 2022-10-31 14:21 UTC
Description Bjoern A. Zeeb freebsd_committer freebsd_triage 2021-12-02 19:24:14 UTC
Next funny problem with passthru;  I am booting from a raw disk which works fine.

        -S \
        -s 0,hostbridge \
        -s 3,ahci-hd,/dev/da0,sectorsize=512 \
        -s 10,virtio-net,tap1 \
        -s 31,lpc \


Now I added a passthru device to the same bhyve config (-S was there already) and the boot starts to fail:

        -s 11,passthru,2/0/0 \

BdsDxe: failed to load Boot0001 "UEFI BHYVE SATA DISK BHYVE-9E55-9829-EEEE" from PciRoot(0x0)/Pci(0x3,0x0)/Sata(0x0,0xFFFF,0x0): Not Found

Removing the passthru device from this config and the virtual machine boots just fine again.

Any suggestions on where to start before I dive in?
Comment 1 Corvin Köhne 2021-12-03 07:04:57 UTC
It would be helpful if you could give some more information about your /dev/da0 device and your PCI device 2/0/0.

Are these devices somehow related to each other?
Comment 2 Bjoern A. Zeeb freebsd_committer freebsd_triage 2021-12-03 09:41:39 UTC
In that case one was an external USB disk, the other an internal M.2 PCIe card.
Comment 3 mario felicioni 2022-02-03 10:50:06 UTC
I have exactly the same problem when I try to boot a Linux, Windows, or FreeBSD guest OS installed physically on a SATA or USB disk (it makes no difference). It happens with every version of bhyve I tried: Corvin's bhyve and the bhyve in FreeBSD 14 (with the most recent version in FreeBSD 13-RELEASE, the passthrough does not start at all). So I've put together some cases to show conclusively that passing through any device (I tried my NVIDIA GeForce RTX 2080 Ti and my Renesas USB controller) interferes with booting any OS installed physically on the disks. It happens with both virtio-blk and ahci-hd. With virtio-blk I have an additional problem on top of the passthrough not working: even without passthrough, at some point the VM can't load the root partition, whereas with ahci-hd it can.

case 1)

bhyve -S -c sockets=1,cores=2,threads=2 -m 4G -w -H \
-s 0,hostbridge \
-s 1,virtio-blk,/dev/da1 \
-s 2:0,passthru,2/0/0,rom=TU102.rom \
-s 31,lpc \
-l com1,stdio \
-l bootrom,/usr/local/share/uefi-firmware/BHYVE_BHF_CODE.fd \
vm1

= no (da1 with FreeBSD 14 installed does not boot)

case 2)

bhyve -S -c sockets=1,cores=2,threads=2 -m 4G -w -H \
-s 0,hostbridge \
-s 1,virtio-blk,/dev/da1 \
-s 31,lpc \
-l com1,stdio \
-l bootrom,/usr/local/share/uefi-firmware/BHYVE_BHF_CODE.fd \
vm1

= yes

case 3)

bhyve -s 0,hostbridge \
-s 1,virtio-blk,/dev/da1 \
-s 31,lpc \
-l com1,stdio \
-l bootrom,/usr/local/share/uefi-firmware/BHYVE_BHF_CODE.fd \
vm1

= yes

case 4)

bhyve -S -s 0,hostbridge \
-s 1,virtio-blk,/dev/da1 \
-s 31,lpc \
-l com1,stdio \
-l bootrom,/usr/local/share/uefi-firmware/BHYVE_BHF_CODE.fd \
vm1

= yes

case 5)

bhyve -S -s 0,hostbridge \
-s 1,virtio-blk,/dev/da1 \
-s 2:0,passthru,2/0/0,rom=TU102.rom \
-s 31,lpc \
-l com1,stdio \
-l bootrom,/usr/local/share/uefi-firmware/BHYVE_BHF_CODE.fd \
vm1

= no
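
The cases above fail exactly when a passthru slot is combined with a block device backed by a host /dev node. As a minimal sketch in POSIX sh, the check below flags that combination in a bhyve argument list. The function name risky_combo is made up for illustration and is not part of bhyve. It assumes each slot is given as a separate "-s" argument and only knows the ahci-hd and virtio-blk backends used in this report.

```sh
#!/bin/sh
# Hypothetical helper (not part of bhyve): succeeds when the argument list
# contains both a passthru slot and a block device backed by a /dev node.
risky_combo() {
    has_passthru=no
    has_raw_disk=no
    prev=
    for arg in "$@"; do
        # Only inspect the value that directly follows a "-s" option.
        if [ "$prev" = "-s" ]; then
            case "$arg" in
                *,passthru,*) has_passthru=yes ;;
                *,ahci-hd,/dev/*|*,virtio-blk,/dev/*) has_raw_disk=yes ;;
            esac
        fi
        prev=$arg
    done
    [ "$has_passthru" = yes ] && [ "$has_raw_disk" = yes ]
}

# Usage with the slot arguments from case 5:
if risky_combo -S -s 0,hostbridge -s 1,virtio-blk,/dev/da1 \
    -s 2:0,passthru,2/0/0,rom=TU102.rom -s 31,lpc; then
    echo "affected"      # prints "affected"
else
    echo "not affected"
fi
```

The check is purely textual. It says nothing about whether the host actually has an IOMMU enabled or whether the running vmm module already contains a fix.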
Comment 4 mario felicioni 2022-02-04 18:56:30 UTC
I passed through the graphics card and the Renesas controller together (because they are in the same IOMMU group) and got a specific error:

This command:

bhyve -S -c sockets=1,cores=2,threads=2 -m 4G -w -H -A \
-s 0,hostbridge \
-s 1,ahci-hd,/dev/da1,sectorsize=512 \
-s 3:0,passthru,2/0/0 \
-s 3:1,passthru,2/0/1 \
-s 3:2,passthru,2/0/2 \
-s 3:3,passthru,2/0/3 \
-s 4:0,passthru,1/0/0 \
-s 8,virtio-net,tap1 \
-s 9,virtio-9p,sharename=/ \
-s 30,xhci,tablet \
-s 31,lpc \
-s 29,fbuf,tcp=0.0.0.0:5901,w=1440,h=900 \
-l bootrom,/usr/local/share/uefi-firmware/BHYVE_BHF_CODE.fd \
-l com1,stdio \
vm1

produces this error:

root@marietto:/usr/home/marietto/bhyve # ./freebsd14.sh

VM:vm0 is not created.
VM:vm1 is not created.
VM:vm0 is not created.
VM:vm1 is not created.
fbuf frame buffer base: 0x945e00000 [sz 16777216]
Assertion failed: (baridx == 0), function pci_fbuf_write, file /usr/src/usr.sbin/bhyve/pci_fbuf.c, line 134

Abort : core dumped.
Comment 5 mario felicioni 2022-02-04 19:08:35 UTC
This is what happens without the framebuffer defined in bhyve:

Assertion failed: (pi->pi_bar[baridx].type == PCIBAR_IO), function passthru_read, file /usr/src/usr.sbin/bhyve/pci_passthru.c, line 942.
Comment 6 Bjoern A. Zeeb freebsd_committer freebsd_triage 2022-02-08 18:38:05 UTC
I am starting to wonder whether we are seeing a UEFI problem here.
Comment 7 mario felicioni 2022-02-08 18:44:32 UTC
Contact me by email. I've studied this situation and started another, aggregated bug report.
Comment 8 Bjoern A. Zeeb freebsd_committer freebsd_triage 2022-03-13 17:59:37 UTC
The original problem is now understood better;  I have a proof-of-concept workaround for a minimal config, fixing reads (which is not fit for bhyve, only to demonstrate the problem).

We'll keep working on this as time permits and as we better understand how to find a proper solution for this.  For the moment please be patient.
Comment 9 mario felicioni 2022-03-13 18:06:51 UTC
I've studied more cases that are closely related to this bug, opened several tickets, and sent you an email some time ago, but you never replied. So I'm sure you don't know all the implications of this bug.
Comment 10 mario felicioni 2022-03-14 12:32:57 UTC
What I want to achieve is to pass two of my NTFS-formatted disks through to a Windows 11 bhyve VM on FreeBSD, but *without* passing them through via the USB controller (in the example below I tried to boot Windows 11 from the NVMe disk nvd0). At the same time I want to pass my graphics card through to a Windows 11 and/or Linux VM.

I'm using this FreeBSD version:

FreeBSD marietto 13.0-RELEASE FreeBSD 13.0-RELEASE #5 n244809-dff3dead3734: Wed Feb 23 13:16:32 CET 2022     marietto@marietto:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

I've configured the bhyve VM like this:

bhyve -S -c sockets=1,cores=2,threads=2 -m 4G -w -H -A \
-s 0,hostbridge \
-s 1,ahci-hd,/dev/nvd0 \
-s 2,virtio-blk,/dev/da4 \
-s 3,virtio-blk,/dev/da2 \
-s 4:0,passthru,2/0/0 \
-s 4:1,passthru,2/0/1 \
-s 4:2,passthru,2/0/2 \
-s 4:3,passthru,2/0/3 \
-s 8,virtio-net,tap2 \
-s 9,virtio-9p,sharename=/ \
-s 10,hda,play=/dev/dsp,rec=/dev/dsp \
-s 29,fbuf,tcp=0.0.0.0:5902,w=1440,h=900 \
-s 30,xhci,tablet \
-s 31,lpc \
-l bootrom,/usr/local/share/uefi-firmware/BHYVE_BHF_CODE.fd \
vm2 < /dev/null & sleep 2 && vncviewer 0:2

These are the NTFS disks that I would like to see inside the Windows 11 guest OS:

-s 2,virtio-blk,/dev/da4 \
-s 3,virtio-blk,/dev/da2 \


=>         34  19532873661  da4  GPT  (9.1T)
           34        32734    1  ms-reserved  (16M)
        32768  19532838912    2  ms-basic-data  (9.1T)
  19532871680         2015       - free -  (1.0M)


=>         34  23437705149  da2  GPT  (11T)
           34         2014       - free -  (1.0M)
         2048  23437701120    1  ms-basic-data  (11T)
  23437703168         2015       - free -  (1.0M)

When I give bhyve the whole disks, they are recognized by fdisk. But as I said, what matters most to me is booting from the physical NVMe disk while passing through my graphics card at the same time, and that does not work. I care about this feature because Ubuntu boots much faster from the physical disk than from an img/raw file. This is even more true for Windows 11: it freezes if I use a raw/img disk, so I can't use it, but from the NVMe disk where it is physically installed it is much faster and usable.

This is what happens (Linux and Windows 11 installed on the NVMe disk don't boot):

https://forums.freebsd.org/attachments/screenshot_2022-03-13_16-35-42-jpg.13323/

I will also keep the discussion updated here:

https://forums.freebsd.org/threads/usb-3-0-disks-not-recognized-by-windows-if-passed-through-as-slots.84402/#post-559892
Comment 11 commit-hook freebsd_committer freebsd_triage 2022-03-24 15:26:56 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=246c398145674e4a9337fd933a6e6da7f160118e

commit 246c398145674e4a9337fd933a6e6da7f160118e
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2022-03-18 20:39:06 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2022-03-24 15:21:24 +0000

    bhyve: Do not remove guest physical addresses from IOMMU host domain

    This permits I/O devices on the host to directly access wired memory
    dedicated to guests using passthru devices.  Note that wired memory
    belonging to guests that do not use passthru devices has always been
    accessible by I/O devices on the host.

    bhyve maps guest physical addresses into the user address space of
    the bhyve process by mmap'ing /dev/vmm/<vmname>.  Device models pass
    pointers derived from this mapping directly to system calls such as
    preadv() to minimize copies when emulating DMA.  If the backing store
    for a device model is a raw host device (e.g. when exporting a raw disk
    device such as /dev/ada<n> as a drive in the guest), the host device
    driver (e.g. ahci for /dev/ada<n>) can itself use DMA on the host
    directly to the guest's memory.  However, if the guest's memory is
    not present in the host IOMMU domain, these DMA requests by the host
    device will fail without raising an error visible to the host device
    driver or to the guest resulting in non-working I/O in the guest.

    It is unclear why guest addresses were removed from the IOMMU host domain
    initially, especially only for VM's with a passthru device as the
    host IOMMU domain does not affect the permissions of passthru devices,
    only devices on the host.

    A considered alternative was using bounce buffers instead (D34535
    is a proof of concept), but that adds additional overhead for unclear
    benefit.

    This solves a long-standing problem when using passthru devices and
    physical disks in the same VM.

    Thanks to:      grehan (patience and help)
    Thanks to:      jhb (for improving the commit message)
    PR:             260178
    Reviewed by:    grehan, jhb
    MFC after:      3 days
    Differential Revision: https://reviews.freebsd.org/D34607

 sys/amd64/vmm/vmm.c | 2 --
 1 file changed, 2 deletions(-)
Comment 12 commit-hook freebsd_committer freebsd_triage 2022-03-27 20:15:13 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=dd113f67dfb5bdaf5d8b3a87bb19924ad447494c

commit dd113f67dfb5bdaf5d8b3a87bb19924ad447494c
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2022-03-18 20:39:06 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2022-03-27 17:57:28 +0000

    bhyve: Do not remove guest physical addresses from IOMMU host domain

    This permits I/O devices on the host to directly access wired memory
    dedicated to guests using passthru devices.  Note that wired memory
    belonging to guests that do not use passthru devices has always been
    accessible by I/O devices on the host.

    bhyve maps guest physical addresses into the user address space of
    the bhyve process by mmap'ing /dev/vmm/<vmname>.  Device models pass
    pointers derived from this mapping directly to system calls such as
    preadv() to minimize copies when emulating DMA.  If the backing store
    for a device model is a raw host device (e.g. when exporting a raw disk
    device such as /dev/ada<n> as a drive in the guest), the host device
    driver (e.g. ahci for /dev/ada<n>) can itself use DMA on the host
    directly to the guest's memory.  However, if the guest's memory is
    not present in the host IOMMU domain, these DMA requests by the host
    device will fail without raising an error visible to the host device
    driver or to the guest resulting in non-working I/O in the guest.

    It is unclear why guest addresses were removed from the IOMMU host domain
    initially, especially only for VM's with a passthru device as the
    host IOMMU domain does not affect the permissions of passthru devices,
    only devices on the host.

    A considered alternative was using bounce buffers instead (D34535
    is a proof of concept), but that adds additional overhead for unclear
    benefit.

    This solves a long-standing problem when using passthru devices and
    physical disks in the same VM.

    Thanks to:      grehan (patience and help)
    Thanks to:      jhb (for improving the commit message)
    PR:             260178, 215740
    Reviewed by:    grehan, jhb
    Differential Revision: https://reviews.freebsd.org/D34607

    (cherry picked from commit 246c398145674e4a9337fd933a6e6da7f160118e)

 sys/amd64/vmm/vmm.c | 2 --
 1 file changed, 2 deletions(-)
Comment 13 commit-hook freebsd_committer freebsd_triage 2022-03-30 15:50:31 UTC
A commit in branch releng/13.1 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=1c6abf864ecd3bbf07ace2018f9aab45b6406ce2

commit 1c6abf864ecd3bbf07ace2018f9aab45b6406ce2
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2022-03-18 20:39:06 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2022-03-30 15:33:47 +0000

    bhyve: Do not remove guest physical addresses from IOMMU host domain

    This permits I/O devices on the host to directly access wired memory
    dedicated to guests using passthru devices.  Note that wired memory
    belonging to guests that do not use passthru devices has always been
    accessible by I/O devices on the host.

    bhyve maps guest physical addresses into the user address space of
    the bhyve process by mmap'ing /dev/vmm/<vmname>.  Device models pass
    pointers derived from this mapping directly to system calls such as
    preadv() to minimize copies when emulating DMA.  If the backing store
    for a device model is a raw host device (e.g. when exporting a raw disk
    device such as /dev/ada<n> as a drive in the guest), the host device
    driver (e.g. ahci for /dev/ada<n>) can itself use DMA on the host
    directly to the guest's memory.  However, if the guest's memory is
    not present in the host IOMMU domain, these DMA requests by the host
    device will fail without raising an error visible to the host device
    driver or to the guest resulting in non-working I/O in the guest.

    It is unclear why guest addresses were removed from the IOMMU host domain
    initially, especially only for VM's with a passthru device as the
    host IOMMU domain does not affect the permissions of passthru devices,
    only devices on the host.

    A considered alternative was using bounce buffers instead (D34535
    is a proof of concept), but that adds additional overhead for unclear
    benefit.

    This solves a long-standing problem when using passthru devices and
    physical disks in the same VM.

    Approved by:    re (gjb)
    Thanks to:      grehan (patience and help)
    Thanks to:      jhb (for improving the commit message)
    PR:             260178, 215740
    Reviewed by:    grehan, jhb
    Differential Revision: https://reviews.freebsd.org/D34607

    (cherry picked from commit 246c398145674e4a9337fd933a6e6da7f160118e)
    (cherry picked from commit dd113f67dfb5bdaf5d8b3a87bb19924ad447494c)

 sys/amd64/vmm/vmm.c | 2 --
 1 file changed, 2 deletions(-)
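
The fix landed under three different hashes (main, stable/13, releng/13.1), so a source tree has to be checked against all of them. A minimal sketch in POSIX sh follows, assuming the tree is a git checkout. The function name tree_has_commit and the variable FIX_COMMITS are made up for illustration. The only git call used is `git merge-base --is-ancestor`.

```sh
#!/bin/sh
# Hypothetical helper: succeeds if any of the given commits is an ancestor
# of HEAD in the git checkout at $1. Unknown hashes count as "not present".
tree_has_commit() {
    repo=$1
    shift
    for hash in "$@"; do
        if git -C "$repo" merge-base --is-ancestor "$hash" HEAD 2>/dev/null; then
            return 0
        fi
    done
    return 1
}

# Hashes copied from the commit-hook comments (main, stable/13, releng/13.1).
FIX_COMMITS="246c398145674e4a9337fd933a6e6da7f160118e
dd113f67dfb5bdaf5d8b3a87bb19924ad447494c
1c6abf864ecd3bbf07ace2018f9aab45b6406ce2"

# Usage, assuming /usr/src is a git checkout (an assumption, not from the report):
#   tree_has_commit /usr/src $FIX_COMMITS && echo "fix present in source tree"
```

This only tells you about the source tree. The kernel and vmm module actually running must also have been built from that tree for the change in sys/amd64/vmm/vmm.c to take effect.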