Running makefs twice produces zpools with the same GUID, which makes it impossible to import them at the same time. You have to import them one at a time and `zpool reguid` them.
I see there is a comment [1] about using a fixed seed:

    /*
     * Use a fixed seed to provide reproducible pseudo-random numbers for
     * on-disk structures when needed (e.g., GUIDs, ZAP hash salts).
     */

When do these need to be reproducible? Should makefs take another flag to produce random GUIDs, or have a note in the man page that it will always produce the same GUID? I spent quite a bit of time trying to load two images into bhyve before realizing it was a GUID conflict. I don't think it should be necessary to import the zpool and reguid it, so I'd be in favor of a flag if there's some reason the default should always produce the same GUID.

[1] https://cgit.freebsd.org/src/tree/usr.sbin/makefs/zfs.c#n787
The same GUID is used because I didn't want to break reproducibility of VM images (the main use-case for makefs -t zfs). That is, if you and I both build an image with the same inputs, the output images should be byte-identical. Certainly the documentation is deficient, I'll work on that. I don't have strong feelings on what the default behaviour should be, but I'm a bit inclined towards keeping the current default and adding a non-reproducible mode. How exactly are you using makefs?
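The reproducibility property described here can be checked directly: two builds from the same inputs should compare equal byte for byte. A minimal sketch, assuming POSIX sh and cmp; the function name and the image paths are illustrative and not part of makefs:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if two image files are byte-identical.
# Usage: images_identical <image-a> <image-b>
images_identical() {
    # cmp -s prints nothing and exits 0 only when the files match exactly.
    cmp -s "$1" "$2"
}

# Example use (placeholder paths; build both images with the same
# makefs options and the same input directory first):
#   images_identical a.zfs b.zfs && echo "reproducible"
```

A non-reproducible mode with random GUIDs would be expected to make this check fail, which is the trade-off being weighed here.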
I am using it to create two disk images that I attach to a single bhyve. One is zroot and gets replaced periodically (using it like a BE basically). The second is zdata which is long-lived. So, I made two images, started bhyve, and it failed to boot [1]. Turns out it's because the zpools have the same guid.

[1] https://gist.github.com/patmaddox/da981282718fc033b05053716bc36144#file-2_first_boot-txt
bhyve command is:

    bhyve -c 2 -m 4G -A -H -P \
      -s 0:0,hostbridge \
      -s 1:0,virtio-net,tap1 \
      -s 2:0,ahci-hd,/tmp/bhyve-pb/poudriere-builder-15.0-stabweek-2024-10.img \
      -s 3:0,ahci-hd,/tmp/bhyve-pb/zdata.img \
      -s 31,lpc -l com1,stdio \
      -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
      pb
(In reply to Pat Maddox from comment #3) But the root pool should be reguid'ed on first boot anyway. The official VM images configure this automatically (they set zpool_reguid=zroot in /etc/rc.conf), and in general you'd want to make sure that two VMs using the same image will have different pool GUIDs. How did you build the root pool?
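For a custom image, the same setting could be staged into the root directory before running makefs. A minimal sketch, assuming the staged tree is a plain directory; the helper name is hypothetical, and only the `zpool_reguid` line comes from this thread, so the rest of the first-boot setup is assumed to match what the official images do:

```shell
#!/bin/sh
# Hypothetical helper: request a first-boot reguid for a pool by adding
# zpool_reguid="<pool>" to the rc.conf of a staged root directory.
# Usage: enable_reguid <rootdir> <poolname>
enable_reguid() {
    rootdir=$1
    pool=$2
    conf="$rootdir/etc/rc.conf"
    mkdir -p "$rootdir/etc"
    # Append only once so repeated builds do not duplicate the line.
    if ! grep -qs '^zpool_reguid=' "$conf"; then
        printf 'zpool_reguid="%s"\n' "$pool" >> "$conf"
    fi
}

# Example use (rootdir is the directory later passed to makefs):
#   enable_reguid "${rootdir}" zroot
```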
My root pool is copied from release / our prior convo:

    makefs -t zfs -s 20g \
      -o poolname=zroot -o bootfs=zroot/ROOT/default -o rootpath=/ \
      -o fs=zroot\;mountpoint=none \
      -o fs=zroot/ROOT\;mountpoint=none \
      -o fs=zroot/ROOT/default\;mountpoint=/ \
      -o fs=zroot/home\;mountpoint=/home \
      -o fs=zroot/tmp\;mountpoint=/tmp\;exec=on\;setuid=off \
      -o fs=zroot/usr\;mountpoint=/usr\;canmount=off \
      -o fs=zroot/usr/ports\;setuid=off \
      -o fs=zroot/usr/src \
      -o fs=zroot/usr/obj \
      -o fs=zroot/var\;mountpoint=/var\;canmount=off \
      -o fs=zroot/var/audit\;setuid=off\;exec=off \
      -o fs=zroot/var/crash\;setuid=off\;exec=off \
      -o fs=zroot/var/log\;setuid=off\;exec=off \
      -o fs=zroot/var/mail\;atime=on \
      -o fs=zroot/var/tmp\;setuid=off \
      ${outfileroot} ${rootdir}

and the data pool is another typical invocation:

    makefs -t zfs -s 100m \
      -o poolname=zdata -o rootpath=/ \
      -o fs=zdata\;mountpoint=/\;canmount=noauto \
      -o fs=zdata/usr\;mountpoint=/usr\;canmount=off \
      -o fs=zdata/usr/local\;canmount=off \
      -o fs=zdata/usr/local/poudriere \
      ${BUILDDIR}/zdata.zfs ${BUILDDIR}/data

-----

> the root pool should be reguid'ed on first boot anyway. The official VM images configure this automatically (they set zpool_reguid=zroot in /etc/rc.conf)

Good to know, I will check that out. I would kind of expect it to not work, because I think the boot process doesn't even make it that far as I showed above. I'll try it out and report back though.

So I think it may be worth providing a way to 1) randomize the guid on creation and/or 2) seed the RNG on creation.

Extend this to a third disk: I have one root pool, one read-only pool with a dataset, and a third writable pool that contains long-lived data. I need all of these to have different GUIDs.

The reason I may want to seed the RNG is because if I replace the root pool, I want the VM to think it's the same. From the VM standpoint, it's like I exported the pool, imported it to another host, did some stuff on it, and imported it back on the VM.
I happen to be reconstructing the disk via code, but no reason the VM needs to know that.
I'll have to think about this some more... because I wonder if `zpool reguid` should accept a fixed value? Consider this: a build script that creates a root pool, a read-only data pool, and a writeable data pool. I would want the build script to just produce a single image each. Then when attaching them to VMs, I would want them to have a different GUID per VM - but also retain their GUIDs within a single VM. So it would look like:

    for vm in vm1 vm2 vm3; do
      cp root.zfs ${vm}.root.zfs
      import_pool ${vm}.root.zfs
      zpool reguid ${vm}-root $(lookup_guid ${vm} root)
      export_pool ${vm}-root

      cp data.zfs ${vm}.data.zfs
      import_pool ${vm}.data.zfs
      zpool reguid ${vm}-data $(lookup_guid ${vm} data)
      export_pool ${vm}-data
    done
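The `lookup_guid` helper in the loop above is left undefined. One way it could be backed is a small flat-file registry that hands out a random 64-bit value the first time a VM/role pair is seen and returns the same value afterwards. This is only a sketch of the bookkeeping side: the `GUID_DB` path and record format are assumptions, and whether a chosen GUID can actually be applied to a pool is the open question raised in this comment.

```shell
#!/bin/sh
# Hypothetical registry mapping "<vm> <role>" to a stable 64-bit GUID.
# GUID_DB is an assumed file path; one "vm role guid" record per line.
GUID_DB=${GUID_DB:-./pool-guids.txt}

# Usage: lookup_guid <vm> <role>
lookup_guid() {
    vm=$1
    role=$2
    touch "$GUID_DB"
    # Return the recorded GUID if this vm/role pair has been seen before.
    guid=$(awk -v vm="$vm" -v role="$role" \
        '$1 == vm && $2 == role { print $3; exit }' "$GUID_DB")
    if [ -z "$guid" ]; then
        # First use: read 8 random bytes and print them as one unsigned
        # decimal integer, keeping only the digits of od's output.
        guid=$(od -An -N8 -tu8 /dev/urandom | tr -dc '0-9')
        printf '%s %s %s\n' "$vm" "$role" "$guid" >> "$GUID_DB"
    fi
    printf '%s\n' "$guid"
}
```

With a registry like this, rebuilding and reattaching a root image for vm1 would look up the same GUID each time, while vm2 and vm3 would get their own.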
You're booting a VM with two disks that each have their own pool generated by makefs, and the kernel can't mount root because both pools have the same GUID? That seems surprising. I just tried that experiment myself and was able to boot, so I think something else is going on there.
Here's an example I put together: https://github.com/patmaddox/lab/blob/trunk/share/examples/bhyve/two-makefs-images/Makefile

Does that work for you? Or do you see something wrong I'm doing in it?

For me it fails with:

    Mounting from zfs:zroot failed with error 22; retrying for 3 more seconds
    random: unblocking device.
    Mounting from zfs:zroot failed with error 22.

    Loader variables:
      vfs.root.mountfrom=zfs:zroot

    Manual root filesystem specification:
      <fstype>:<device> [options]
          Mount <device> using filesystem <fstype>
          and with the specified (optional) option list.

        eg. ufs:/dev/da0s1a
            zfs:zroot/ROOT/default
            cd9660:/dev/cd0 ro
              (which is equivalent to: mount -t cd9660 -o ro /dev/cd0 /)

      ?               List valid disk boot devices
      .               Yield 1 second (for background tasks)
      <empty line>    Abort manual input
(In reply to Pat Maddox from comment #9) Just a guess, but is the root pool missing a bootfs property?
> is the root pool missing a bootfs property? I believe per your article it's not necessary: https://freebsdfoundation.org/zfs-images-from-scratch-or-makefs-t-zfs/ And based on my observation, it's not necessary. In any case, I have added bootfs to the example: https://github.com/patmaddox/lab/commit/5e351e22a45cfe43af0d7709b954033319abb457 Same behavior. If you boot without the second disk, it boots. If you reguid either pool, it works. It's when both pools have the same guid that it fails to boot. If my example fails for you, then there's something different between your test and mine. If my example passes for you, then there's something different between your machine and mine.
(In reply to Pat Maddox from comment #11)
If I use your makefile, but modify the bhyve/bhyveload invocation to boot the official 14.2-BETA2 zfs image[1], it boots fine. I verified that all three pools have the same guid, per zdb -u it's 4116862866898151352. So I suspect that there's something else going on, and that importing one of the pools has some side effect which fixes the problem.

In particular, if I generate zdata.zfs using your script, then boot the 14.2 image into single user mode (so reguid hasn't run), I can see that both pools have the same GUID and yet the kernel was able to mount root successfully. So something else is going on.

    root@:/ # zdb -l /dev/ada0p4
    ------------------------------------
    LABEL 0
    ------------------------------------
        txg: 4
        version: 5000
        state: 1
        name: 'zroot'
        pool_guid: 4016146626377348012
        top_guid: 100716240520803340
        guid: 100716240520803340
        vdev_children: 1
        features_for_read:
        vdev_tree:
            type: 'disk'
            ashift: 12
            asize: 5363990528
            guid: 100716240520803340
            id: 0
            path: '/dev/null'
            whole_disk: 1
            create_txg: 4
            metaslab_array: 2
            metaslab_shift: 29
        labels = 0 1 2 3
    root@:/ # zdb -l /dev/ada1
    ------------------------------------
    LABEL 0
    ------------------------------------
        txg: 4
        version: 5000
        state: 1
        name: 'zdata'
        pool_guid: 4016146626377348012
        top_guid: 100716240520803340
        guid: 100716240520803340
        vdev_children: 1
        features_for_read:
        vdev_tree:
            type: 'disk'
            ashift: 12
            asize: 100139008
            guid: 100716240520803340
            id: 0
            path: '/dev/null'
            whole_disk: 1
            create_txg: 4
            metaslab_array: 2
            metaslab_shift: 24
        labels = 0 1 2 3

I tried booting with all three pools, and that works too.

[1] https://download.freebsd.org/releases/VM-IMAGES/14.2-BETA3/amd64/Latest/FreeBSD-14.2-BETA3-amd64-zfs.raw.xz
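Comparing label dumps like the two above by eye is error-prone, so the check can be scripted. A minimal sketch, assuming the `zdb -l` output has been saved to files and has the `pool_guid: <number>` layout shown in this comment; both function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read `zdb -l` output on stdin and print the first
# pool_guid value found.
pool_guid_from_label() {
    awk '$1 == "pool_guid:" { print $2; exit }'
}

# Hypothetical helper: succeed if two saved label dumps carry the same,
# non-empty pool GUID.
# Usage: same_pool_guid <label-dump-a> <label-dump-b>
same_pool_guid() {
    a=$(pool_guid_from_label < "$1")
    b=$(pool_guid_from_label < "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example use (device names taken from the output above):
#   zdb -l /dev/ada0p4 > root.label
#   zdb -l /dev/ada1 > data.label
#   same_pool_guid root.label data.label && echo "GUID conflict"
```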
Just so I'm understanding correctly: if you run `make` on my example (perhaps taking out or adjusting the tap device first), it boots all the way to the login prompt?
(In reply to Pat Maddox from comment #13) No, if I run your example unmodified, I can reproduce the problem. But if I substitute a FreeBSD zfs image for your zroot.zfs, the VM boots to a login prompt, despite both the FreeBSD image and zdata.zfs having the same pool GUID.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=a20249443be111e8a3cb3b7bbe4a0d0e460a6058

commit a20249443be111e8a3cb3b7bbe4a0d0e460a6058
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2024-11-19 21:07:56 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2024-11-19 21:18:38 +0000

    makefs.8: Clarify that makefs-generated zpools always have the same GUID

    PR:             282832
    MFC after:      1 week

 usr.sbin/makefs/makefs.8 | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)
Doc update works for me! Thanks (I don't know if I get to close this PR, or someone else should)
(In reply to Pat Maddox from comment #16) I'll take care of closing this after the change is merged to stable/14.
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=c9f9f1a282ab81276feb81d0509e44535ebda504

commit c9f9f1a282ab81276feb81d0509e44535ebda504
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2024-11-19 21:07:56 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2024-11-28 14:38:17 +0000

    makefs.8: Clarify that makefs-generated zpools always have the same GUID

    PR:             282832
    MFC after:      1 week

    (cherry picked from commit a20249443be111e8a3cb3b7bbe4a0d0e460a6058)

 usr.sbin/makefs/makefs.8 | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)