| Summary: | bhyvectl: Every second attempt to destroy->run fails with: invalid argument error | ||
|---|---|---|---|
| Product: | Base System | Reporter: | risner <risner> |
| Component: | bhyve | Assignee: | freebsd-virtualization (Nobody) <virtualization> |
| Status: | Closed FIXED | ||
| Severity: | Affects Some People | CC: | afedorov, emaste, manu, markj, rew, virtualization |
| Priority: | --- | Keywords: | needs-qa |
| Version: | 13.0-STABLE | Flags: | koobs: maintainer-feedback? (markj), koobs: maintainer-feedback? (manu), koobs: maintainer-feedback? (rew) |
| Hardware: | amd64 | ||
| OS: | Any | ||
Description
risner
2022-02-22 00:31:13 UTC
The main problem is that the VM is destroyed asynchronously via sysctl:

https://github.com/freebsd/freebsd-src/blob/main/usr.sbin/bhyvectl/bhyvectl.c#L2397
https://github.com/freebsd/freebsd-src/blob/main/lib/libvmmapi/vmmapi.c#L88
https://github.com/freebsd/freebsd-src/blob/main/sys/amd64/vmm/vmm_dev.c#L1080

Therefore, `bhyvectl --destroy --vm=test0` returns before the VM is actually destroyed. Moreover, even the /dev/vmm/test0 device node is removed before the destruction of the VM actually completes, and I know of no guaranteed way to check from userspace whether a VM has been destroyed. As a workaround you can use:

```
bhyvectl --destroy --vm=test0
sleep [N]
bhyve -AHP -c 1 -m 1024M ...
```

But this is not a 100% reliable solution.

It seems I had gotten confused by the handbook:

https://people.freebsd.org/~blackend/doc/handbook/virtualization-host-bhyve.html

Under 21.7.3, "Creating a Linux® Guest", there is a line saying "The instance of the virtual machine needs to be destroyed before it can be started again", so I put that in my script. Many other guides have variations of my script (--destroy before bhyve), so it seemed to be the recommended way to start guests. I had a number of issues following the guide, including never getting grub-bhyve working; I abandoned grub-bhyve and went with UEFI booting from the UEFI boot file menu. I have now tested rebooting the host and starting the guest without doing a destroy, and all seemed well. So this appears to be user error, or a misunderstanding of the handbook.

@Reporter: Are you able to run these under truss, dtrace, or similar and include the trace output as an attachment, compressed if necessary?

Note: For triage/resolution purposes, re comment 2, we have two issues:

- the docs issue that would seem to imply destroy is necessary in certain cases. Is this true or required in any case at present, and if so, are there alternative changes that could be made to preclude the need for it?
- presumably we'd like a destroy/run cycle/procedure to either a) work, or at least b) offer an improved user experience rather than an error on every second attempt.

^Triage: Request feedback from folks playing in the bhyve tree recently, in the event they have opinions or suggestions for resolving this issue.

As advised, I found it works if I allow several seconds between the destroy and the creation of a new session. The destroy_dev_sched_cb() call is the scheduled task in the kernel; it appears to remove the /dev/vmm entry, but not all of the kernel state has been removed by the time the create call tries to add it back. While looking at the code, it seemed to me that this line:

https://github.com/freebsd/freebsd-src/blob/22054f88914b51113f77f6eccc11353a891f9f3e/usr.sbin/bhyve/bhyverun.c#L1081

only covers the situation "the VM still exists" and not "the VM is being destroyed but is not yet completely destroyed". In the latter case, the code falls through to:

https://github.com/freebsd/freebsd-src/blob/22054f88914b51113f77f6eccc11353a891f9f3e/usr.sbin/bhyve/bhyverun.c#L1105

where vm_open() fails with an invalid-argument error, as the VM has been destroyed by the time this happens. Is there no way to catch the scheduled destroy and return something other than EEXIST from the create, to flag this state?
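To illustrate the point above, here is a minimal userspace sketch (not from the report) of a more robust workaround than a fixed sleep: retry vm_create() until the old name is fully released. It assumes libvmmapi as shipped in this era; the retry budget and the helper name create_vm_retry() are arbitrary illustration choices.

```c
/*
 * Minimal sketch, not from the report: retry vm_create() while an
 * asynchronous destroy may still be in flight.  Assumes libvmmapi
 * (link with -lvmmapi); the 50 x 100 ms retry budget is arbitrary.
 */
#include <sys/param.h>
#include <sys/cpuset.h>

#include <errno.h>
#include <stddef.h>
#include <unistd.h>

#include <vmmapi.h>

static struct vmctx *
create_vm_retry(const char *name)
{
	int i;

	for (i = 0; i < 50; i++) {
		if (vm_create(name) == 0)
			return (vm_open(name));	/* old instance fully gone */
		if (errno != EEXIST)
			break;			/* unrelated failure; give up */
		/* Name still registered: a scheduled destroy may be pending. */
		usleep(100 * 1000);
	}
	return (NULL);
}
```

The reason to poll the create rather than the device node is the one given above: /dev/vmm/test0 can disappear while kernel state still lingers, so only a successful vm_create() confirms the old instance is gone.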
Addressing the points brought up (lines beginning with >> are my responses):

- the docs issue that would seem to imply destroy is necessary in certain cases.

>> My confusion would have been alleviated by a note that destroy is only needed if you have no plans to resume the VM and wish to free the wired memory.

- presumably we'd like a destroy/run cycle/procedure to either a) work, or at least b) offer an improved user experience rather than an error on every second attempt.

>> Adding a return value distinct from EEXIST, say ESHUTDOWN, when a destroy is scheduled would let the error be reported as "destroy in progress" or similar, which would be much less confusing.

This should have been fixed by https://cgit.freebsd.org/src/commit/?id=7a0c23da4eaa63f00e53aa18f3ab1f2bb32f593a
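For anyone wanting to verify the behavior on their own kernel, a small reproducer along the following lines would exercise the destroy->create cycle directly. This is a sketch, not part of the report; the VM name racetest0 is an arbitrary example, and it assumes libvmmapi with vmm.ko loaded and root privileges.

```c
/*
 * Sketch of a reproducer for the destroy->create race, not part of the
 * report.  On a kernel with the fix, the second vm_create() should
 * succeed immediately; on an affected kernel it may fail with EEXIST
 * (or the subsequent vm_open() may fail with EINVAL).
 * Build: cc -o vmrace vmrace.c -lvmmapi
 */
#include <sys/param.h>
#include <sys/cpuset.h>

#include <err.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#include <vmmapi.h>

int
main(void)
{
	const char *name = "racetest0";	/* arbitrary example name */
	struct vmctx *ctx;

	if (vm_create(name) != 0)
		err(1, "initial vm_create");
	if ((ctx = vm_open(name)) == NULL)
		err(1, "vm_open");
	vm_destroy(ctx);	/* asynchronous before the fix */

	if (vm_create(name) != 0) {
		printf("immediate re-create failed: %s\n", strerror(errno));
		return (1);
	}
	printf("immediate re-create succeeded\n");
	if ((ctx = vm_open(name)) != NULL)
		vm_destroy(ctx);	/* clean up the test VM */
	return (0);
}
```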