Bug 262113

Summary: bhyvectl: Every second attempt to destroy->run fails with: invalid argument error
Product: Base System
Reporter: risner <risner>
Component: bhyve
Assignee: freebsd-virtualization (Nobody) <virtualization>
Status: Closed FIXED
Severity: Affects Some People
CC: afedorov, emaste, manu, markj, rew, virtualization
Priority: ---
Keywords: needs-qa
Version: 13.0-STABLE
Flags: koobs: maintainer-feedback? (markj)
koobs: maintainer-feedback? (manu)
koobs: maintainer-feedback? (rew)
Hardware: amd64
OS: Any

Description risner 2022-02-22 00:31:13 UTC
Using a UEFI OS and the commands below, you can start bhyve. Regardless of how you exit (ctrl-c, or halt or reboot from inside the VM), the next attempt fails with an invalid argument error. Retrying immediately afterwards works. The cycle then repeats.

bhyvectl --destroy --vm=test0
bhyve -AHP -c 1 -m 1024M -S -s 0,hostbridge -s 1,lpc -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd -s 3,ahci-hd,efiboot.img -s 29,fbuf,tcp=203.0.113.46:5900,w=800,h=600 test0

From testing modifications to bhyve, libvmmapi, and the kernel module, I found that line 1006 of bhyverun.c calls vm_create, which returns EEXIST despite there being no entry in /dev/vmm and despite having run bhyvectl --destroy immediately before bhyve.

I could not conclusively prove what exactly is going on inside the kernel module.

It fails on FreeBSD-13 p4 and p7. It failed on two different Cisco amd64 systems I had.

If you don't have a testing OS with uefi, you can make a temporary one with this:
dd if=/dev/zero of=efiboot.img bs=300m count=1
mdconfig -a -t vnode -f efiboot.img
gpart create -s gpt md0
gpart add -t efi -s 256m md0
newfs_msdos -F 32 -c 1 -m 0xf8 /dev/md0p1 # format efi
mount -t msdosfs /dev/md0p1 /mnt
mkdir -p /mnt/EFI/BOOT
wget -O /mnt/EFI/BOOT/BOOTX64.efi https://sourceforge.net/projects/supergrub2/files/2.04s1/super_grub2_disk_2.04s1/super_grub2_disk_standalone_x86_64_efi_2.04s1.EFI/download
umount /mnt
mdconfig -d -u 0
Comment 1 Aleksandr Fedorov freebsd_committer freebsd_triage 2022-02-22 07:49:32 UTC
The main problem is that the VM is destroyed asynchronously via sysctl:
https://github.com/freebsd/freebsd-src/blob/main/usr.sbin/bhyvectl/bhyvectl.c#L2397
https://github.com/freebsd/freebsd-src/blob/main/lib/libvmmapi/vmmapi.c#L88
https://github.com/freebsd/freebsd-src/blob/main/sys/amd64/vmm/vmm_dev.c#L1080

Therefore, bhyvectl --destroy --vm=test0 completes before the VM is actually destroyed.

Moreover, even the /dev/vmm/test0 device is removed before the actual destruction of the VM occurs.

And I don't know of a guaranteed way to check if a VM is destroyed from userspace.

As a workaround you can use:
bhyvectl --destroy --vm=test0
sleep N
bhyve -AHP -c 1 -m 1024M ...


But this is not a 100% reliable solution.
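
A fixed sleep guesses at how long the kernel needs. One alternative is to retry the start a bounded number of times. The following is a minimal sketch, not a tested fix: retry_on_status is a hypothetical helper, and it assumes that bhyve exits with status 4 when it fails with an error (as listed under EXIT STATUS in bhyve(8)), which should be checked on the release in use. Status 4 can also come from unrelated errors, so the retry count is kept small.

```shell
#!/bin/sh
# Sketch only: retry a command while it exits with one specific status.
# Usage: retry_on_status STATUS MAX_TRIES DELAY_SECONDS command [args...]
# retry_on_status is a hypothetical helper, not part of bhyve or bhyvectl.
retry_on_status() {
    retry_status=$1
    max_tries=$2
    delay=$3
    shift 3
    try=1
    while :; do
        rc=0
        "$@" || rc=$?
        # Stop on any other status, or once the attempt budget is used up.
        if [ "$rc" -ne "$retry_status" ] || [ "$try" -ge "$max_tries" ]; then
            return "$rc"
        fi
        try=$((try + 1))
        sleep "$delay"
    done
}

# Hypothetical use with the commands from this report (not run here):
#   bhyvectl --destroy --vm=test0
#   retry_on_status 4 5 1 bhyve -AHP -c 1 -m 1024M ... test0
```

Any status other than the retried one (for example, a guest power-off) is returned to the caller unchanged after the first attempt.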
Comment 2 risner 2022-02-22 13:28:23 UTC
It seems I had gotten confused by the handbook.
https://people.freebsd.org/~blackend/doc/handbook/virtualization-host-bhyve.html
Under 21.7.3. Creating a Linux® Guest, it has a line saying "The instance of the virtual machine needs to be destroyed before it can be started again". So I just put that in my script. Many other guides have variations of my script (--destroy before bhyve), so it seemed to be the recommended way to start up guests.

I had a number of issues with following the guide including never getting grub-bhyve working. I abandoned grub-bhyve and went with UEFI booting from the UEFI boot file menu.

I just tested rebooting the host and starting the guest without doing a destroy. All seemed well. So this appears to be a user error or a misunderstanding of the handbook.
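
For reference, a start script along those lines could keep the destroy out of the start path and run it once at the end. This is only a sketch under assumptions: run_until_stopped, start_guest and destroy_guest are hypothetical names, and it assumes that bhyve exits with status 0 when the guest reboots and with a nonzero status otherwise, as described in bhyve(8).

```shell
#!/bin/sh
# Sketch only: restart the guest while it asks for a reboot, then destroy
# the VM once, after the guest has stopped for any other reason.
# Usage: run_until_stopped RUN_CMD DESTROY_CMD
# Both arguments are names of shell functions or commands (hypothetical).
run_until_stopped() {
    run_cmd=$1
    destroy_cmd=$2
    while :; do
        rc=0
        "$run_cmd" || rc=$?
        # Assumption: status 0 means the guest rebooted, so start it again.
        [ "$rc" -eq 0 ] || break
    done
    # Destroy only after the final stop, never right before a start.
    "$destroy_cmd" || echo "destroy failed" >&2
    return "$rc"
}

# Hypothetical use with the commands from this report (not run here):
#   start_guest()   { bhyve -AHP -c 1 -m 1024M ... test0; }
#   destroy_guest() { bhyvectl --destroy --vm=test0; }
#   run_until_stopped start_guest destroy_guest
```

Because the destroy happens after the last run rather than before the next one, the start never races with a pending asynchronous destroy inside this loop.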
Comment 3 Kubilay Kocak freebsd_committer freebsd_triage 2022-02-22 23:50:31 UTC
@Reporter Are you able to run these under truss, dtrace or similar and include the trace output as an attachment, compressed if necessary?

Note: For triage/resolution purposes, re comment 2, we have two issues

- the docs issue that would seem to imply destroy is necessary in certain cases. Is this true or required for any cases at present, and if so, are there alternative changes that may be made to preclude the need for this?

- presumably we'd like a destroy/run cycle/procedure to either a) work, or, at least b) have an improved user experience than an error every second attempt.

^Triage: Request feedback from folks playing in the bhyve tree recently in the event they have opinions or suggestions for resolution of this issue.
Comment 4 risner 2022-02-28 18:12:00 UTC
As advised, I found it worked if I wait several seconds between the destroy and the creation of a new session. The destroy_dev_sched_cb() call is the scheduled task in the kernel. It appears to remove the /dev/vmm entry, but not all the kernel entries have been removed by the time the create call tries to add them back.

While looking at the code, it felt to me that this line:
https://github.com/freebsd/freebsd-src/blob/22054f88914b51113f77f6eccc11353a891f9f3e/usr.sbin/bhyve/bhyverun.c#L1081
only covers the situation of "the VM still exists" and doesn't cover "the VM is being destroyed but not yet completely destroyed."

In the case of it being destroyed, the code falls through to:
https://github.com/freebsd/freebsd-src/blob/22054f88914b51113f77f6eccc11353a891f9f3e/usr.sbin/bhyve/bhyverun.c#L1105
with an invalid argument return value from vm_open(), as the VM has been destroyed by the time this happens.

Is there no way to catch the scheduled destroy and return something other than EEXIST in the create to catch this state?

Addressing the points brought up (lines beginning with >> are my responses):
- the docs issue that would seem to imply destroy is necessary in certain cases.
>> My confusion would have been alleviated by adding a note that destroy is only needed if you have no plans to resume the vm and wish to free the wired memory.

- presumably we'd like a destroy/run cycle/procedure to either a) work, or, at least b) have an improved user experience than an error every second attempt.
>> Adding another state response distinct from EEXIST, say perhaps ESHUTDOWN, for the case where there is a scheduled destroy would make the error less confusing: bhyve could respond with "destroy in progress" or similar.
Comment 5 Mark Johnston freebsd_committer freebsd_triage 2023-10-04 19:04:19 UTC
This should have been fixed by https://cgit.freebsd.org/src/commit/?id=7a0c23da4eaa63f00e53aa18f3ab1f2bb32f593a