After upgrading to stable/14, a bhyve VM deployed on a ZVOL cannot access its zpool when zfsd(8) is running on the bhyve host. The VM boots fine when zfsd(8) on the host machine is shut down. The sysctl knob vfs.zfs.vol.recursive is set to 0 on the host.

Guest console output:

Trying to mount root from zfs:vmzroot/ROOT/default []...
vtbd0: hard error cmd=write 1592-1607
vtbd0: hard error cmd=write 94371896-94371911
vtbd0: hard error cmd=write 94372408-94372423
Mounting from zfs:vmzroot/ROOT/default failed with error 6; retrying for 3 more seconds
vtbd0: hard error cmd=write 1592-1607
vtbd0: hard error cmd=write 94371896-94371911
vtbd0: hard error cmd=write 94372408-94372423
[the same three "vtbd0: hard error cmd=write" lines repeat many more times]
Mounting from zfs:vmzroot/ROOT/default failed with error 6.

Loader variables:
vfs.root.mountfrom=zfs:vmzroot/ROOT/default

Manual root filesystem specification:
  <fstype>:<device> [options]
      Mount <device> using filesystem <fstype>
      and with the specified (optional) option list.

    eg. ufs:/dev/da0s1a
        zfs:zroot/ROOT/default
        cd9660:/dev/cd0 ro
          (which is equivalent to: mount -t cd9660 -o ro /dev/cd0 /)

  ?               List valid disk boot devices
  .               Yield 1 second (for background tasks)
  <empty line>    Abort manual input
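For reference, a minimal sketch of how the behaviour shows up on the host, assuming zfsd is managed through its standard rc script and the guest is started with vm-bhyve (the vm command and the guest name "virtbsd" are placeholders here, not part of the report):

  # check the relevant sysctl on the host
  sysctl vfs.zfs.vol.recursive       # reports 0 here

  # with zfsd running, the guest fails at mountroot
  service zfsd onestatus
  vm start virtbsd                   # hypothetical vm-bhyve guest name

  # with zfsd stopped, the same guest boots normally
  service zfsd onestop
  vm start virtbsd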
See also bug 273663, the preceding bug report.
More details on the OS versions involved.

Bhyve host:
FreeBSD 14.0-STABLE amd64 1400500 (BSDONDELL) #4 stable/14-n265114-b93f514b9951: Thu Sep 14 08:02:01 CEST 2023
It is stable/14 at commit 3ea83e94cdfa34745641dfa5f43debfdcd79e229 with cherry-picked b93f514b995139c35a23c6a10b700705a07994de ("Fix zfsd with the device_removal pool feature.")

Bhyve guest:
FreeBSD virtbsd.pwste.edu.pl 13.2-STABLE FreeBSD 13.2-STABLE #78 stable/13test-n255603-07255b281a06: Sun Jun 18 10:33:17 CEST 2023
This doesn't make sense to me. zfsd does not do anything that would interfere with some other process reading zvols. I think there must be something else going on. Two things to check:
1) What does "zpool status" show on the host, when the guest is unable to boot?
2) What does "top" show on the host at that time? Is zfsd spinning the CPU?
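A sketch of the host-side checks being asked for here; the exact top options are just one way to get a batch snapshot, not something prescribed in the thread:

  # pool state on the host while the guest is stuck at mountroot
  zpool status -v

  # one batch snapshot of the 20 busiest processes, sorted by CPU
  top -b -o cpu 20

  # is zfsd running, and in what state?
  service zfsd onestatus
  ps auxww | grep '[z]fsd'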
(In reply to Alan Somers from comment #3)

> 1) What does "zpool status" show on the host, when the guest is unable to boot?

It shows the same as it shows when the guest is able to boot. The guest's zpool is never imported on the host, and vfs.zfs.vol.recursive is not enabled.

> 2) What does "top" show on the host at that time? Is zfsd spinning the CPU?

Nothing special; among others:

 7280 root  18  32  0   12G   32M kqread  10  3:51 100.50% bhyve
 6833 root   1  20  0   20M   11M select  26  0:00   0.00% zfsd

It is also probably worth mentioning that I have had problems with importing that pool (booting the bhyve guest VM) from time to time (very rarely, though) in the past. It began in stable/13, somewhere between the 13.1 and 13.2 releases IIRC. To fix it I had to enable vfs.zfs.vol.recursive, import the zpool from the host, perform some writes (for example, clear a few snapshots), export the zpool, and then the VM booted fine. It wasn't reproducible, though. It is worth mentioning that zfsd(8) was running all that time. Now, after upgrading the bhyve host to stable/14, the problem with booting the bhyve guest VM is 100% reproducible and fully dependent on the state of zfsd(8).
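For clarity, the workaround described above, written out as the rough command sequence it amounts to. The zvol path matches the import command shown later in this thread; the snapshot name is illustrative only:

  # allow ZFS on the host to taste pools inside zvols
  sysctl vfs.zfs.vol.recursive=1

  # import the guest's pool on the host, do a few writes, export it again
  zpool import -R /mnt -d /dev/zvol/zroot/ZVOL/virtbsdp2 -f vmzroot
  zfs destroy vmzroot/ROOT/default@some-old-snapshot    # illustrative snapshot name
  zpool export vmzroot

  # turn recursion back off before booting the guest
  sysctl vfs.zfs.vol.recursive=0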
Here's another thing to check:

sudo fstat -p `pgrep zfsd`

That will tell you if zfsd is holding the zvol open for some reason. It shouldn't be.

I can't reproduce this, BTW. I have a server running 14.0-CURRENT from August (and I'm updating it as we speak) hosting multiple VMs with storage on zvols. They boot fine.
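Along the same lines, the question can also be checked from the other direction: which processes, if any, hold the zvol itself open. The zvol path below is the one mentioned later in the thread; any zvol path would do:

  # per-process view: what files does zfsd have open?
  fstat -p `pgrep zfsd`

  # per-file view: which processes have the zvol open?
  fstat /dev/zvol/zroot/ZVOL/virtbsd
  fuser -u /dev/zvol/zroot/ZVOL/virtbsd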
(In reply to Alan Somers from comment #5)

# fstat -p `pgrep zfsd`
USER CMD  PID    FD MOUNT   INUM MODE       SZ|DV  R/W
root zfsd 8038 text /     279328 -r-xr-xr-x 105480 r
root zfsd 8038 wd   /          4 drwxr-xr-x     30 r
root zfsd 8038 root /          4 drwxr-xr-x     30 r
root zfsd 8038    0 /dev      20 crw-rw-rw-  null  rw
root zfsd 8038    1 /dev      20 crw-rw-rw-  null  rw
root zfsd 8038    2 /dev      20 crw-rw-rw-  null  rw
root zfsd 8038    3* pipe fffffe0074847c70 <-> fffffe0074847dc8 0 rw
root zfsd 8038    4* pipe fffffe0074847dc8 <-> fffffe0074847c70 0 rw
root zfsd 8038    5 /dev     116 crw-rw-rw-   zfs  rw
root zfsd 8038    6 /dev     116 crw-rw-rw-   zfs  rw
root zfsd 8038    7* local dgram fffff80891b10400 <-> fffff808916ef200

I tried to access the nested zpool from the host after setting vfs.zfs.vol.recursive=1, to work around the issue the old way by performing a snapshot cleanup, but it seems to be impossible after switching to stable/14. So the whole problem is probably a general ZFS issue, and zfsd(8) only triggers it.
Those are the expected file descriptors. What we don't see is any open zvols. And since zfsd doesn't change any in-kernel state, I really don't see any way that it could interfere with the VM. Unless you can come up with steps to reproduce starting from a clean system, I don't think it would be productive to investigate zfsd any more.

> I tried to access the nested zpool from the host after setting
> vfs.zfs.vol.recursive=1, to work around the issue the old way by performing
> a snapshot cleanup, but it seems to be impossible after switching to stable/14.

In what sense does it "seems to be impossible"? What kind of error do you get when you try this?
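A rough skeleton of what such a from-scratch reproduction could look like, purely as a sketch; the vm-bhyve vm command, the test zvol name, and the guest name are assumptions, not something anyone in the thread has run:

  # create a fresh zvol and install a FreeBSD guest with a ZFS root onto it
  zfs create -V 20G zroot/ZVOL/testvm
  # (install the guest onto /dev/zvol/zroot/ZVOL/testvm by whatever means)

  # then toggle zfsd and compare the guest's boot behaviour
  service zfsd onestart
  vm start testvm        # expected per this report: mountroot failure
  vm poweroff testvm
  service zfsd onestop
  vm start testvm        # expected: boots normally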
(In reply to Alan Somers from comment #7)

> In what sense does it "seems to be impossible"? What kind of error do you get
> when you try this?

The operation is unable to complete.
What operation? Please post the exact command and error message.
(In reply to Alan Somers from comment #9)

> What operation? Please post the exact command and error message.

# sysctl vfs.zfs.vol.recursive=1
vfs.zfs.vol.recursive: 0 -> 1
# zpool import -R /mnt -d /dev/zvol/zroot/ZVOL/virtbsdp2 -f vmzroot
...

IIRC a few months ago the import worked fine from the stable/13 host, but now it doesn't work from either stable/14 or stable/13. Back then the whole import was applied as a workaround for the problems with importing the zpool on the guest, as reported above. Anyway, the bhyve VM works fine and it's not zfsd(8)'s fault, so I don't really know what to do about this PR. To save our time and resources it would be wise to close it or postpone it for a while.
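One additional sanity check that might fit here, not something mentioned in the thread: confirm from the host that the zvol's GPT and its freebsd-zfs partition are visible at all before attempting the import (device paths as used above):

  # the zvol and its partitions should show up as GEOM providers on the host
  ls /dev/zvol/zroot/ZVOL/
  gpart show zvol/zroot/ZVOL/virtbsd

  # ask zpool what it can see there, without actually importing
  zpool import -d /dev/zvol/zroot/ZVOL/virtbsdp2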
Now both the bhyve host and the guest are running FreeBSD stable/14. The ZFS pools on the host and in the guest were upgraded. The report is still valid and the issue persists. Maybe the transition to a mirror (the guest is nested on a ZVOL of a pool with a top-level vdev that had previously been removed) has something to do with it (see bug 273663). The guest is using the host's ZVOL zroot/ZVOL/virtbsd as vtbd0.

This is how ZFS looks on the host:

# zpool status zroot
  pool: zroot
 state: ONLINE
  scan: scrub repaired 0B in 00:11:06 with 0 errors on Fri Jan 26 03:21:55 2024
        scan warning: skipped blocks that are only referenced by the checkpoint.
remove: Removal of vdev 1 copied 15.1G in 0h0m, completed on Fri May 15 20:04:34 2020
        119K memory used for removed device mappings
checkpoint: created Mon Dec  4 20:10:12 2023, consumes 49.6G
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            da0p2   ONLINE       0     0     0
            da1p2   ONLINE       0     0     0

errors: No known data errors

# zpool get all zroot
NAME   PROPERTY                       VALUE                 SOURCE
zroot  size                           476G                  -
zroot  capacity                       63%                   -
zroot  altroot                        -                     default
zroot  health                         ONLINE                -
zroot  guid                           6762271100094851702   -
zroot  version                        -                     default
zroot  bootfs                         zroot/ROOT/default14  local
zroot  delegation                     on                    default
zroot  autoreplace                    off                   default
zroot  cachefile                      -                     default
zroot  failmode                       wait                  default
zroot  listsnapshots                  off                   default
zroot  autoexpand                     off                   default
zroot  dedupratio                     1.00x                 -
zroot  free                           176G                  -
zroot  allocated                      300G                  -
zroot  readonly                       off                   -
zroot  ashift                         0                     default
zroot  comment                        -                     default
zroot  expandsize                     -                     -
zroot  freeing                        0                     -
zroot  fragmentation                  58%                   -
zroot  leaked                         0                     -
zroot  multihost                      off                   default
zroot  checkpoint                     49.6G                 -
zroot  load_guid                      2254167222130845727   -
zroot  autotrim                       on                    local
zroot  compatibility                  off                   default
zroot  bcloneused                     0                     -
zroot  bclonesaved                    0                     -
zroot  bcloneratio                    1.00x                 -
zroot  feature@async_destroy          enabled               local
zroot  feature@empty_bpobj            active                local
zroot  feature@lz4_compress           active                local
zroot  feature@multi_vdev_crash_dump  enabled               local
zroot  feature@spacemap_histogram     active                local
zroot  feature@enabled_txg            active                local
zroot  feature@hole_birth             active                local
zroot  feature@extensible_dataset     active                local
zroot  feature@embedded_data          active                local
zroot  feature@bookmarks              enabled               local
zroot  feature@filesystem_limits      enabled               local
zroot  feature@large_blocks           enabled               local
zroot  feature@large_dnode            enabled               local
zroot  feature@sha512                 enabled               local
zroot  feature@skein                  enabled               local
zroot  feature@edonr                  enabled               local
zroot  feature@userobj_accounting     active                local
zroot  feature@encryption             enabled               local
zroot  feature@project_quota          active                local
zroot  feature@device_removal         active                local
zroot  feature@obsolete_counts        active                local
zroot  feature@zpool_checkpoint       active                local
zroot  feature@spacemap_v2            active                local
zroot  feature@allocation_classes     enabled               local
zroot  feature@resilver_defer         enabled               local
zroot  feature@bookmark_v2            enabled               local
zroot  feature@redaction_bookmarks    enabled               local
zroot  feature@redacted_datasets      enabled               local
zroot  feature@bookmark_written       enabled               local
zroot  feature@log_spacemap           active                local
zroot  feature@livelist               enabled               local
zroot  feature@device_rebuild         enabled               local
zroot  feature@zstd_compress          enabled               local
zroot  feature@draid                  enabled               local
zroot  feature@zilsaxattr             active                local
zroot  feature@head_errlog            active                local
zroot  feature@blake3                 enabled               local
zroot  feature@block_cloning          enabled               local
zroot  feature@vdev_zaps_v2           active                local

# zfs get all zroot/ZVOL/virtbsd
NAME                PROPERTY              VALUE                  SOURCE
zroot/ZVOL/virtbsd  type                  volume                 -
zroot/ZVOL/virtbsd  creation              Wed Mar 10  8:59 2021  -
zroot/ZVOL/virtbsd  used                  109G                   -
zroot/ZVOL/virtbsd  available             144G                   -
zroot/ZVOL/virtbsd  referenced            11.3G                  -
zroot/ZVOL/virtbsd  compressratio         1.13x                  -
zroot/ZVOL/virtbsd  reservation           none                   default
zroot/ZVOL/virtbsd  volsize               50G                    local
zroot/ZVOL/virtbsd  volblocksize          8K                     -
zroot/ZVOL/virtbsd  checksum              on                     default
zroot/ZVOL/virtbsd  compression           lz4                    inherited from zroot
zroot/ZVOL/virtbsd  readonly              off                    default
zroot/ZVOL/virtbsd  createtxg             5153440                -
zroot/ZVOL/virtbsd  copies                1                      default
zroot/ZVOL/virtbsd  refreservation        51.6G                  local
zroot/ZVOL/virtbsd  guid                  1428736673261547338    -
zroot/ZVOL/virtbsd  primarycache          all                    default
zroot/ZVOL/virtbsd  secondarycache        all                    default
zroot/ZVOL/virtbsd  usedbysnapshots       45.8G                  -
zroot/ZVOL/virtbsd  usedbydataset         11.3G                  -
zroot/ZVOL/virtbsd  usedbychildren        0B                     -
zroot/ZVOL/virtbsd  usedbyrefreservation  51.5G                  -
zroot/ZVOL/virtbsd  logbias               latency                default
zroot/ZVOL/virtbsd  objsetid              66109                  -
zroot/ZVOL/virtbsd  dedup                 off                    default
zroot/ZVOL/virtbsd  mlslabel              none                   default
zroot/ZVOL/virtbsd  sync                  standard               default
zroot/ZVOL/virtbsd  refcompressratio      1.21x                  -
zroot/ZVOL/virtbsd  written               85.1M                  -
zroot/ZVOL/virtbsd  logicalused           64.4G                  -
zroot/ZVOL/virtbsd  logicalreferenced     13.6G                  -
zroot/ZVOL/virtbsd  volmode               default                default
zroot/ZVOL/virtbsd  snapshot_limit        none                   default
zroot/ZVOL/virtbsd  snapshot_count        none                   default
zroot/ZVOL/virtbsd  snapdev               hidden                 default
zroot/ZVOL/virtbsd  context               none                   default
zroot/ZVOL/virtbsd  fscontext             none                   default
zroot/ZVOL/virtbsd  defcontext            none                   default
zroot/ZVOL/virtbsd  rootcontext           none                   default
zroot/ZVOL/virtbsd  redundant_metadata    all                    default
zroot/ZVOL/virtbsd  encryption            off                    default
zroot/ZVOL/virtbsd  keylocation           none                   default
zroot/ZVOL/virtbsd  keyformat             none                   default
zroot/ZVOL/virtbsd  pbkdf2iters           0                      default
zroot/ZVOL/virtbsd  snapshots_changed     Fri Feb  2  4:20:00 2024  -

This is how ZFS looks inside the guest:

[virtbsd] ~# gpart show
=>       40  104857520  vtbd0  GPT  (50G)
         40       1024      1  freebsd-boot  (512K)
       1064   94371840      2  freebsd-zfs   (45G)
   94372904   10484656      3  freebsd-swap  (5.0G)

[virtbsd] ~# zpool list
NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
vmzroot  44.5G  8.98G  35.5G        -         -    32%    20%  1.00x  ONLINE  -

[virtbsd] ~# zpool status
  pool: vmzroot
 state: ONLINE
  scan: scrub repaired 0B in 00:00:16 with 0 errors on Sat Jan 27 03:01:33 2024
config:

        NAME          STATE     READ WRITE CKSUM
        vmzroot       ONLINE       0     0     0
          gpt/vmzfs0  ONLINE       0     0     0

errors: No known data errors

[virtbsd] ~# zpool get all vmzroot
NAME     PROPERTY                       VALUE                 SOURCE
vmzroot  size                           44.5G                 -
vmzroot  capacity                       20%                   -
vmzroot  altroot                        -                     default
vmzroot  health                         ONLINE                -
vmzroot  guid                           14496911069470365237  -
vmzroot  version                        -                     default
vmzroot  bootfs                         vmzroot/ROOT/default  local
vmzroot  delegation                     on                    default
vmzroot  autoreplace                    off                   default
vmzroot  cachefile                      -                     default
vmzroot  failmode                       wait                  default
vmzroot  listsnapshots                  off                   default
vmzroot  autoexpand                     off                   default
vmzroot  dedupratio                     1.00x                 -
vmzroot  free                           35.5G                 -
vmzroot  allocated                      8.98G                 -
vmzroot  readonly                       off                   -
vmzroot  ashift                         0                     default
vmzroot  comment                        -                     default
vmzroot  expandsize                     -                     -
vmzroot  freeing                        0                     -
vmzroot  fragmentation                  32%                   -
vmzroot  leaked                         0                     -
vmzroot  multihost                      off                   default
vmzroot  checkpoint                     -                     -
vmzroot  load_guid                      9652613418712187666   -
vmzroot  autotrim                       on                    local
vmzroot  compatibility                  off                   default
vmzroot  bcloneused                     0                     -
vmzroot  bclonesaved                    0                     -
vmzroot  bcloneratio                    1.00x                 -
vmzroot  feature@async_destroy          enabled               local
vmzroot  feature@empty_bpobj            active                local
vmzroot  feature@lz4_compress           active                local
vmzroot  feature@multi_vdev_crash_dump  enabled               local
vmzroot  feature@spacemap_histogram     active                local
vmzroot  feature@enabled_txg            active                local
vmzroot  feature@hole_birth             active                local
vmzroot  feature@extensible_dataset     active                local
vmzroot  feature@embedded_data          active                local
vmzroot  feature@bookmarks              enabled               local
vmzroot  feature@filesystem_limits      enabled               local
vmzroot  feature@large_blocks           enabled               local
vmzroot  feature@large_dnode            enabled               local
vmzroot  feature@sha512                 enabled               local
vmzroot  feature@skein                  enabled               local
vmzroot  feature@edonr                  enabled               local
vmzroot  feature@userobj_accounting     active                local
vmzroot  feature@encryption             enabled               local
vmzroot  feature@project_quota          active                local
vmzroot  feature@device_removal         enabled               local
vmzroot  feature@obsolete_counts        enabled               local
vmzroot  feature@zpool_checkpoint       enabled               local
vmzroot  feature@spacemap_v2            active                local
vmzroot  feature@allocation_classes     enabled               local
vmzroot  feature@resilver_defer         enabled               local
vmzroot  feature@bookmark_v2            enabled               local
vmzroot  feature@redaction_bookmarks    enabled               local
vmzroot  feature@redacted_datasets      enabled               local
vmzroot  feature@bookmark_written       enabled               local
vmzroot  feature@log_spacemap           active                local
vmzroot  feature@livelist               enabled               local
vmzroot  feature@device_rebuild         enabled               local
vmzroot  feature@zstd_compress          enabled               local
vmzroot  feature@draid                  enabled               local
vmzroot  feature@zilsaxattr             active                local
vmzroot  feature@head_errlog            active                local
vmzroot  feature@blake3                 enabled               local
vmzroot  feature@block_cloning          enabled               local
vmzroot  feature@vdev_zaps_v2           active                local
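Given the removed top-level vdev and the active checkpoint shown above, a couple of host-side checks that might help narrow things down the next time the guest fails to boot; nothing here is prescribed elsewhere in the thread, it is only a suggestion:

  # properties that make this pool different from a "plain" mirror
  zpool get checkpoint,freeing,feature@device_removal,feature@obsolete_counts zroot
  zpool status zroot | grep -E 'remove|checkpoint|mappings'

  # and whether recursive zvol tasting is still off on the host
  sysctl vfs.zfs.vol.recursive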