Created attachment 241996 [details] Installation log for AutoZFS on MBR layout that will destroy disklabel While trying several time this installation: - FreeBSD-13.2-RELEASE-amd64-disc1.iso When I select this layout: - AutoZFS - MBR It always ends up in broken - Non-bootable system (famous error "Missing operating system" Analysis of such system (booting CD in Shell mode) has shown that somehow whole disklabel is missing. Only primary FreeBSD partition exists. I traced down the problem that this command (from bsdinstall_log): # this works disklabel da0s1 # this command wipes 1st 4KB of da0s1a thus destroying disklabel! zpool create -o altroot=/mnt -m "/bootpool" -f "bootpool" da0s1a # the command below will now end with error! disklabel da0s1 Tracing above "zpool create" command I have found that it really wipes 1st 4KB of "da0s1a" thus destroying disklabel that occupies offset from 512 to 1024 bytes. Here is relevant excerpt from truss tracing: 702: fstatat(AT_FDCWD,"/dev/da0s1a",{ mode=crw-r----- ,inode=104,size=0,blksize=4096 },0x0) = 0 (0x0) 702: openat(AT_FDCWD,"/dev/da0s1a",O_WRONLY|O_EXCL,00) = 6 (0x6) 702: write(6,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,4096) = 4096 (0x1000) 702: fdatasync(6) = 0 (0x0) 702: close(6) = 0 (0x0) 702: open("/mnt/bootpool",O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC,015) ERR#2 'No such file or directory' You can clearly see that device da0s1a is opened and immediately wiped with 4096 bytes of zeroes thus liquidating disklabel (!).
Created attachment 241997 [details] Trace of zfs create command that will wipe disklabel s1a
Additional details: The original installation disklabel looks that way: (the da0s1a was create by install using: gpart add -i 1 -t freebsd-zfs -s 2147483648b "da0s1") # /dev/da0s1: 8 partitions: # size offset fstype [fsize bsize bps/cpg] a: 4194304 0 ZFS b: 4194304 4194304 swap c: 18874304 0 unused 0 0 # "raw" part, don't edit d: 4194304 8388608 ZFS Please notice that da0s1 starts at 0 - so extreme care must be taken to not overwrite disklabel on 2nd sector (byte offest from 512 to 1023). =========================================================================== Using development machine with 2nd disk and GDB I have found following stacktrace (by putting breakpoint just at __sys_write): reakpoint 1.2, _write () at _write.S:4 4 _write.S: No such file or directory. (gdb) bt #0 _write () at _write.S:4 #1 0x00000008011af1f6 in __thr_write (fd=6, buf=0x7fffffffb040, nbytes=4096) at /usr/src/lib/libthr/thread/thr_syscalls.c:618 #2 0x000000000104c900 in zero_label (path=path@entry=0x7fffffffc190 "/dev/da1s1a") at /usr/src/sys/contrib/openzfs/cmd/zpool/zpool_vdev.c:905 #3 0x000000000104b5f0 in make_disks (zhp=zhp@entry=0x0, nv=0x801c10420) at /usr/src/sys/contrib/openzfs/cmd/zpool/zpool_vdev.c:1048 #4 0x000000000104b333 in make_disks (zhp=zhp@entry=0x0, nv=nv@entry=0x801c1d080) at /usr/src/sys/contrib/openzfs/cmd/zpool/zpool_vdev.c:1070 #5 0x000000000104b8e1 in make_root_vdev (zhp=0x6, zhp@entry=0x0, props=<optimized out>, force=force@entry=1, check_rep=<optimized out>, replacing=replacing@entry=B_FALSE, dryrun=dryrun@entry=B_FALSE, argc=1, argv=0x801c36040) at /usr/src/sys/contrib/openzfs/cmd/zpool/zpool_vdev.c:1864 #6 0x0000000001036ef9 in zpool_do_create (argc=<optimized out>, argv=<optimized out>) at /usr/src/sys/contrib/openzfs/cmd/zpool/zpool_main.c:1587 #7 0x00000000010364b6 in main (argc=9, argv=0x7fffffffeb38) at /usr/src/sys/contrib/openzfs/cmd/zpool/zpool_main.c:10755 The problematic code is on #3 0x000000000104b5f0 in make_disks (zhp=zhp@entry=0x0, nv=0x801c10420) at /usr/src/sys/contrib/openzfs/cmd/zpool/zpool_vdev.c:1048 Which looks like: ret = zero_label(udevpath); if (ret) return (ret); Here is source pointer but the line offset is a bit different from what I see in local sources of 13.2. release: - https://cgit.freebsd.org/src/blame/sys/contrib/openzfs/cmd/zpool/zpool_vdev.c?h=releng/13.2#n1080 And this is problem - overwriting disklabel..
(In reply to Henryk Paluch from comment #2) Sentence: > Please notice that da0s1 starts at 0 - so extreme care must be taken Should be: > Please notice that da0s1a starts at 0 - so extreme care must be taken
I suspect that this commit that is supposed to fix bootloader problem: - https://cgit.freebsd.org/src/commit/?id=a94af9543df400cc2b8f41a4b6866b9684064728 causes this "wipe disklabel" problem, because it does take into account that "zfs create" wipes 1st 4KB of that slice "da0s1a" thus killing disklabel.
(In reply to Henryk Paluch from comment #4) Correction > causes this "wipe disklabel" problem, because it does take into account that "zfs create" wipes 1st 4KB of that slice "da0s1a" thus killing disklabel. Should be: > causes this "wipe disklabel" problem, because it does NOT take into account that "zfs create" wipes 1st 4KB of that slice "da0s1a" thus killing disklabel.
Might we just remove the MBR option in zfsboot? The usefulness of Auto ZFS + ZFS in 2023 is questionable and it is currently broken as-is. We either need to fix it or remove the option to eliminate footgunning
Also worth noting that this is broken on every release since OpenZFS (13 and beyond).
I would concur that removing MBR support is better. There are significant issues with updating `zfsboot` (the next stage bootstrap) in an MBR pool, since it lives within the partition which is opened exclusively when the system is booted. So you have to use GEOM debug flags to allow scribbling over random offsets of the disk in order to update your bootcode. So even if it was not broken, it is a bad idea in general. It only really existed because of some BIOSes that couldn't boot from GPT, but that was fixed with the "GPT + Active" and "Lenovo Fix" options.
(In reply to Allan Jude from comment #8) Allan, Thanks for the reply. I'm pretty novice when it comes to low-level storage, GEOM, CAM, etc...but I wanted to use this opportunity to learn something even if MBR is indeed on the path to being removed :) A few notes: 1. I tweaked bsdinstall/scripts/zfsboot to offset the first freebsd-zfs label (the boot pool partition) by `-b 8` (4096 bytes). This at least preserves the labels that were getting clobbered (noted earlier in this report). 2. After making the above tweak the /boot/mbr just sits at a blinking cursor, and /boot/boot0 just prints a `#`...but it no longer says "Missing operating system" 3. Booting back into the installer and breaking out into a shell, I confirm bsdlabel is able to read /dev/ada0s1 without any issues. Without the tweak in #1, this failed (again, as noted earlier in this report). 4. Also I reinstalled zfsboot using `dd if=/boot/zfsboot of=/dev/ada0s1 count=1`and `dd if=/boot/zfsboot of=/dev/ada0s1 skip=1 seek=1024` After doing this I am able to boot.
Actually I'm realizing now that I probably just got lucky... It is probably safer to offset the boot pool partition farther than `-b 8` , something like `-b 2048` is safer. I believe my `dd if=/boot/zfsboot of=/dev/ada0s1 skip=1 seek=1024` is likely stomping on something it shouldn't be.
https://reviews.freebsd.org/D40816 diff --git a/usr.sbin/bsdinstall/scripts/zfsboot b/usr.sbin/bsdinstall/scripts/zfsboot --- a/usr.sbin/bsdinstall/scripts/zfsboot +++ b/usr.sbin/bsdinstall/scripts/zfsboot @@ -106,6 +106,11 @@ # : ${ZFSBOOT_BOOT_POOL_SIZE:=2g} +# +# Default offset for the boot pool when enabled (e.g., geli(8) or MBR) +# +: ${ZFSBOOT_BOOT_POOL_OFFSET:=2048} + # # Default disks to use (always empty unless being scripted) # @@ -1008,12 +1013,14 @@ # # Always prepare a boot pool on MBR - # Do not align this partition, there must not be a gap + # + # Offset the boot pool in order to protect the disk label + # and zfsboot from being clobbered by ZFS # ZFSBOOT_BOOT_POOL=1 f_eval_catch $funcname gpart \ "$GPART_ADD_ALIGN_INDEX_WITH_SIZE" \ - "" 1 freebsd-zfs ${bootsize}b ${disk}s1 || + "-b $ZFSBOOT_BOOT_POOL_OFFSET" 1 freebsd-zfs ${bootsize}b ${disk}s1 || return $FAILURE # Pedantically nuke any old labels f_eval_catch -d $funcname zpool "$ZPOOL_LABELCLEAR_F" \ @@ -1424,11 +1431,11 @@ fi f_dprintf "$funcname: Updating MBR boot loader on disks..." - # Stick the ZFS boot loader in the "convenient hole" after - # the ZFS internal metadata + # The boot pool is offset from the beginning of the slice in order + # to protect the disk label and zfsboot from being clobbered by ZFS for disk in $disks; do f_eval_catch $funcname dd "$DD_WITH_OPTIONS" \ - /boot/zfsboot /dev/$disk$bootpart \ + /boot/zfsboot /dev/${disk}s1 \ "skip=1 seek=1024" || return $FAILURE done
(In reply to Allan Jude from comment #8) If MBR is removed, then I (and expect others) wouldn't be able to install from the dvd isos and memstick images available at the moment on download.freebsd.org, for any version 13.1 and upward. https://lists.freebsd.org/archives/freebsd-stable/2023-July/001346.html https://lists.freebsd.org/archives/freebsd-stable/2023-July/001347.html motherboard is an Asus Z87-PRO both UEFI and legacy are supported, and GPT booting has worked before. In the end, I had to install auto-ufs with MBR as I couldn't get a zfs install to boot with any version of the installation media.
(In reply to void from comment #12) I should have mentioned it more clearly before, my issue I concluded in the end was with GPT specifically (auto-ufs2 & GPT). I tried auto-zfs first (I understood this to default to GPT, I can't remember now), failed, tried auto-ufs, selected GPT (defaults to MBR), failed, then noticed the install media which booted fine was MBR, so selected auto-ufs with MBR and that worked. maybe my issue needs a different PR?