Installing CentOS or Rocky Linux 8.4 results in a failed boot. The initial install works, but on reboot I get this while loading:
* found guest in /storage/vm/webhost04a
BdsDxe: failed to load Boot0001 "UEFI bhyve-NVMe NVME-4-0" from PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-68-C1-20-FC-9C-58): Not Found
Logging from vm-bhyve:
Jun 04 17:18:00: booting
Jun 04 17:18:00: [bhyve options: -c 8 -m 16G -Hwl bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd -U 62ff48d0-c58d-11eb-9187-f8bc1251963e]
Jun 04 17:18:00: [bhyve devices: -s 0,hostbridge -s 31,lpc -s 4:0,nvme,/dev/zvol/storage/vm/webhost04a/disk0 -s 5:0,virtio-net,tap0,mac=58:9c:fc:07:6d:b7 -s 6:0,fbuf,tcp=192.168.1.150:5900 -s 7:0,xhci,tablet]
Note: Rocky 8.3 and CentOS 8.3 both install and boot fine with exactly the same configs in vm-bhyve.
Does the guest boot if you change the device from nvme to ahci?
Installing the guest again, using virtio-blk instead of nvme, results in a working guest.
The same guest, just changing nvme to ahci-hd, also works.
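For anyone following along, the workaround amounts to a one-key change in the guest's vm-bhyve config. A sketch of the relevant fragment (the path is taken from the log above; assuming the standard disk0_type key and the usual $name/$name.conf layout):

```
# /storage/vm/webhost04a/webhost04a.conf (excerpt, illustrative)
# disk0_type="nvme"      # fails to boot after the 8.4 install
disk0_type="ahci-hd"     # boots; disk0_type="virtio-blk" also works
```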
It appears something changed in RHEL 8.4 and its successors that trips up the NVMe support in the UEFI firmware.
I was able to reproduce this with Alma 8.3/8.4 (behavior identical to CentOS 8.3/8.4).
With a file-backed image on ZFS, I forced the sectorsize parameter to both 4K and 512; neither made a difference in getting the system to boot.
The error appears to be in the EFI loader on CentOS:
BdsDxe: loading Boot0001 "UEFI bhyve-NVMe NVME-4-0" from PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-DD-44-20-FC-9C-58)
BdsDxe: starting Boot0001 "UEFI bhyve-NVMe NVME-4-0" from PciRoot(0x0)/Pci(0x4,0x0)/NVMe(0x1,01-00-DD-44-20-FC-9C-58)
Unexpected return from initial read: Device Error, buffersize 0
Failed to load image \EFI\almalinux\grubx64.efi: Device Error
start_image() returned Device Error
StartImage failed: Device Error
Has this been tested against bare metal that has UEFI and NVMe?
I got the same result as grehan@ when testing with both CentOS 8.4 and Stream. Observations suggest there is something up with the CentOS EFI shim for GRUB.
I have done testing against the following, fully updated as of 20210609 12:10 +10UTC:
Ubuntu impish 21.10 nightly
Artix (Arch) GRUB 2.04-10 Linux 5.12.8
None of these experienced any issues with the NVMe device presented by bhyve.
I successfully updated CentOS from 8.3 to 8.4 and it is running fine on an NVMe bhyve device.
It is looking more like an issue with how the installer determines the boot device and then writes this, along with the GRUB components, to storage.
Is it possible to recompile pci_nvme.c and enable debug in the failing case? I.e. change the code to:
static int nvme_debug = 1;
This looks to be an edge condition in the EFI NVMe driver, caused by the large maximum data transfer size advertised by bhyve NVMe (2MB), and the increase in size of grubx64.efi from 1.9MB in CentOS 8.3 to 2.3MB in CentOS 8.4.
In 8.4, EFI attempts to read 2MB of grubx64.efi. However, the buffer starts at a non page-aligned address, so PRP1 in the command descriptor carries an offset. PRP2 points to a PRP list, and with a 2MB transfer all 512 PRP entries in the list page will be used: because the first page was only partially used, a small amount is left over at the end, requiring that final entry, and EFI is putting garbage into it.
(Copying the smaller 8.3 grubx64.efi to an 8.4 system resulted in a successful boot).
A suggested fix is to drop the advertised MDTS to something that isn't right on the verge of requiring a chained PRP list. QEMU defaults to 512KB, and hardware I've looked at advertises 256KB. For example:
@@ -106,7 +106,7 @@ static int nvme_debug = 0;
#define NVME_MPSMIN_BYTES (1 << (12 + NVME_MPSMIN))
#define NVME_PRP2_ITEMS (PAGE_SIZE/sizeof(uint64_t))
-#define NVME_MDTS 9
+#define NVME_MDTS 7
8.4 boots fine with this change.
I can confirm the patch from grehan@ works as described. Tested against:
Windows Server 2022
No regression was introduced into the existing operating systems on our system.