Bug 235206

Summary: [loader] add patch to fix starting with a damaged GPT scheme under legacy BIOS
Product: Base System Reporter: Emrion <kmachine>
Component: misc Assignee: Warner Losh <imp>
Status: Open ---    
Severity: Affects Only Me CC: imp, kmachine
Priority: --- Keywords: patch
Version: 12.0-RELEASE   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
Patch to add an automatic GPT recovery feature to gptboot none
Patch to add an automatic GPT recovery feature to gptzfsboot none

Description Emrion 2019-01-25 21:37:36 UTC
Created attachment 201403 [details]
Patch to add an automatic GPT recovery feature to gptboot

pmbr is the protective MBR program used by the FreeBSD GPT implementation. It allows a GPT system disk to boot under legacy BIOS. Its main task is to find a "freebsd-boot" partition, load its content and execute it. This content, depending on the type of root file system, is either gptboot (UFS) or gptzfsboot (ZFS).

pmbr has a nice feature: it can boot a disk even if its primary GPT header is damaged, provided the backup header is healthy. To be clear, pmbr only checks the "EFI PART" signature at the beginning of the header. It verifies neither the CRC32 of the header nor that of the partition table.
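
The difference between a signature-only test and a full header check can be sketched as follows. This is a minimal illustration, not pmbr or loader code: it assumes 512-byte sectors and the header layout from the UEFI specification (header size at byte 12, header CRC32 at byte 16), and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), the variant GPT uses. */
static uint32_t crc32_buf(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t get32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Signature-only test: the kind of check described for pmbr. */
static int gpt_sig_ok(const uint8_t *sector)
{
    assert(sector != NULL);
    return memcmp(sector, "EFI PART", 8) == 0;
}

/* Stricter test: signature plus header CRC32 (hypothetical helper). */
static int gpt_hdr_crc_ok(const uint8_t *sector)
{
    uint8_t tmp[512];
    uint32_t size = get32(sector + 12);   /* header size field */
    uint32_t stored = get32(sector + 16); /* header CRC32 field */

    if (!gpt_sig_ok(sector) || size < 92 || size > sizeof(tmp))
        return 0;
    memcpy(tmp, sector, size);
    memset(tmp + 16, 0, 4);               /* CRC is computed with its own field zeroed */
    return crc32_buf(tmp, size) == stored;
}
```

A header whose body is corrupted but whose first 8 bytes survive passes gpt_sig_ok() and fails gpt_hdr_crc_ok().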

The problem is that neither gptboot nor gptzfsboot correctly handles a damaged primary header and, as a result, the boot process dies after pmbr.

To reproduce this issue (warning! This is destructive; do not do it unless you do not care about the system you operate on):
    - Boot from a disc1.iso or a memstick image.
    - In the shell, identify the target disk.
    - For example, if the target disk is ada0, type: dd if=/dev/zero of=/dev/ada0 seek=1 count=1 (with dd's default 512-byte block size, this zeroes sector 1, where the primary GPT header lives).
    - Reboot (you need to start in legacy BIOS mode).

If the root file system is ZFS, you'll see something like this:
---------------
shortening read at 20971505 from 16 to 15
gptzfsboot: No ZFS pools located, can't boot
---------------

Or:
---------------
gptzfsboot: error 12 lba 20971505
gptzfsboot: No ZFS pools located, can't boot
---------------

This example is from a VM with a 10 GiB disk. The first error seems to occur when the virtual disk is "fixed" and the second when it is "dynamic".
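
The numbers in the first message are consistent with a read being clipped at the end of the disk. The sketch below only checks that arithmetic; it assumes 512-byte sectors and that "shortening read" means trimming a request that would run past the last sector, which is a reading of the log rather than something stated in this report. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { SECTOR_SIZE = 512 };   /* assumed sector size */

/* Number of whole sectors on a disk of the given size in bytes. */
static uint64_t disk_sectors(uint64_t bytes)
{
    return bytes / SECTOR_SIZE;
}

/*
 * Clip a read of nblk sectors starting at lba so that it does not
 * run past the end of a disk holding `total` sectors.
 */
static uint64_t clip_read(uint64_t lba, uint64_t nblk, uint64_t total)
{
    assert(total > 0);
    if (lba >= total)
        return 0;
    if (lba + nblk > total)
        return total - lba;
    return nblk;
}
```

With a 10 GiB disk, disk_sectors() gives 20971520 sectors, so a 16-sector read at LBA 20971505 would end one sector past the disk and clip_read() trims it to 15, matching "from 16 to 15".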

If the root file system is UFS, you will get:
---------------
gptboot: invalid primary GPT header
gptboot: using backup GPT

BTX loader 1.00 BTX version is 1.02
Consoles: internal video/keyboard
BIOS drive C: is disk0
BIOS 639kB/3668928kB available memory

FreeBSD/x86 bootstrap loader, Revision 1.1
Startup error in /boot/lua/loader.lua:
LUA ERROR: cannot open /boot/lua/loader.lua: no such file or directory

can't load 'kernel'

OK _
---------------

I managed to come up with a solution for gptboot. The underlying problem, I think, is that gptboot does not repair the GPT scheme even though it is aware of the corruption. It could, though. This is the reason why I wrote a patch for /usr/src/stand/libsa/gpt.c. With this patch applied, gptboot automatically recovers the GPT scheme when the primary header/table are damaged and the backup header/table are valid.

In such a case, with the patched gptboot you get:
---------------
gptboot: invalid primary GPT header
gptboot: trying to recover GPT...
gptboot: GPT recovering -> SUCCESS
---------------

And then the boot process continues.
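
The general idea of rebuilding a damaged primary header from a valid backup can be sketched as follows. This is not the attached patch. It is a minimal in-memory illustration that assumes 512-byte sectors and the UEFI header layout (own LBA at byte 24, alternate LBA at byte 32, partition table LBA at byte 72); rebuild_primary() and the helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { OFF_SIZE = 12, OFF_CRC = 16, OFF_SELF = 24, OFF_ALT = 32, OFF_TABLE = 72 };

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), the variant GPT uses. */
static uint32_t crc32_buf(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Read an n-byte little-endian value. */
static uint64_t get_le(const uint8_t *p, int n)
{
    uint64_t v = 0;

    while (n-- > 0)
        v = (v << 8) | p[n];
    return v;
}

/* Write an n-byte little-endian value. */
static void put_le(uint8_t *p, uint64_t v, int n)
{
    for (int i = 0; i < n; i++, v >>= 8)
        p[i] = (uint8_t)v;
}

/*
 * Build a primary header sector from a backup header sector.
 * Returns 0 on success, -1 if the backup itself is not trustworthy.
 */
static int rebuild_primary(const uint8_t *backup, uint8_t *primary)
{
    uint8_t tmp[512];
    uint64_t size;

    assert(backup != NULL && primary != NULL);
    size = get_le(backup + OFF_SIZE, 4);
    if (memcmp(backup, "EFI PART", 8) != 0 || size < 92 || size > sizeof(tmp))
        return -1;
    memcpy(tmp, backup, sizeof(tmp));
    put_le(tmp + OFF_CRC, 0, 4);          /* CRC covers the header with this field zeroed */
    if (crc32_buf(tmp, (size_t)size) != get_le(backup + OFF_CRC, 4))
        return -1;

    /* Swap "own" and "alternate" LBAs and point at the primary table. */
    put_le(tmp + OFF_SELF, get_le(backup + OFF_ALT, 8), 8);
    put_le(tmp + OFF_ALT, get_le(backup + OFF_SELF, 8), 8);
    put_le(tmp + OFF_TABLE, 2, 8);        /* primary table conventionally starts at LBA 2 */
    put_le(tmp + OFF_CRC, crc32_buf(tmp, (size_t)size), 4);
    memcpy(primary, tmp, sizeof(tmp));
    return 0;
}
```

A real recovery would also have to validate the backup partition table, copy it to the primary location, and write both structures to disk; the sketch covers only the header transformation.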

I still have to look for a gptzfsboot solution, but I expect it to be more complex. gptzfsboot apparently does not use gpt.c, so I will have to dig deeper for this one.

PS: this is more or less the first time I have proposed a patch, so please be indulgent with me, as I don't know much about FreeBSD programming. I tried to do my best.
Comment 1 Emrion 2019-01-26 20:49:10 UTC
Created attachment 201423 [details]
Patch to add an automatic GPT recovery feature to gptzfsboot

I eventually made a patch for gptzfsboot. It modifies /usr/src/stand/i386/zfsboot/zfsboot.c and adds automatic recovery of the GPT scheme, but not only that: it also verifies the CRC of the GPT header and the CRC of the partition table.

I took the code from /usr/src/stand/libsa/gpt.c, including my own from the previous patch, and adapted it to the zfsboot.c environment.
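
The partition table check mentioned above can be sketched as follows. This is an illustration of the technique, not the code from the patch: it assumes the UEFI header layout (entry count at byte 80, entry size at byte 84, table CRC32 at byte 88), and gpt_table_crc_ok() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), the variant GPT uses. */
static uint32_t crc32_buf(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t get32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Check the partition table against the CRC stored in the header.
 * `table` holds table_len bytes read from the table's location on disk.
 */
static int gpt_table_crc_ok(const uint8_t *hdr, const uint8_t *table,
    size_t table_len)
{
    uint32_t entries, entsz;
    uint64_t nbytes;

    assert(hdr != NULL && table != NULL);
    entries = get32(hdr + 80);            /* number of partition entries */
    entsz = get32(hdr + 84);              /* size of one entry in bytes */
    nbytes = (uint64_t)entries * entsz;
    if (entries == 0 || entsz < 128 || nbytes > table_len)
        return 0;                         /* implausible or truncated table */
    return crc32_buf(table, (size_t)nbytes) == get32(hdr + 88);
}
```

The CRC covers exactly entries * entry-size bytes, so the caller must read the whole table before calling the check.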

It's a pity that the gpt.c code is used in gptboot but not in gptzfsboot. I think sharing it would require rewriting a good part of gpt.c and gptboot.c (not to mention zfsboot.c itself). Besides that, if I understood correctly, the zfsboot.c code is also used for booting from an MBR scheme, so MBR and GPT functions and instructions are mixed in the same file. Could the design be improved?

Well, this is just my limited understanding, as I am far from knowing enough about this matter.
Comment 2 Pokemon999 2019-06-17 07:13:52 UTC
MARKED AS SPAM
Comment 3 Warner Losh freebsd_committer freebsd_triage 2021-07-08 21:36:39 UTC
I'll take a look at these. These patches look like they may be good. Will evaluate.
Comment 4 Emrion 2021-07-09 12:32:01 UTC
(In reply to Warner Losh from comment #3)
Thank you, Warner, for examining these patches. They work, of course, but the problem is that some code is duplicated between /usr/src/stand/libsa/gpt.c and /usr/src/stand/i386/zfsboot/zfsboot.c. That's a design problem which deserves to be solved, I think.
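
One possible way to avoid such duplication, sketched here purely as an illustration, is to write the GPT logic once against a sector-read callback that each loader supplies. Nothing below comes from the FreeBSD tree: sector_read_fn, struct memdisk and gpt_find_header() are hypothetical, 512-byte sectors are assumed, and only the signature is checked to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Callback each boot program would provide: read one 512-byte sector. */
typedef int (*sector_read_fn)(void *ctx, uint64_t lba, uint8_t *buf);

/* In-memory "disk" standing in for a real BIOS disk driver. */
struct memdisk {
    const uint8_t *data;
    uint64_t nsectors;
};

static int memdisk_read(void *ctx, uint64_t lba, uint8_t *buf)
{
    struct memdisk *d = ctx;

    if (lba >= d->nsectors)
        return -1;
    memcpy(buf, d->data + lba * 512, 512);
    return 0;
}

/*
 * Shared routine: return the LBA of the first header carrying the
 * "EFI PART" signature, trying the primary (LBA 1) and then the backup
 * (last LBA). Returns 0 if neither is found.
 */
static uint64_t gpt_find_header(sector_read_fn rd, void *ctx, uint64_t last_lba)
{
    uint8_t buf[512];
    const uint64_t candidates[2] = { 1, last_lba };

    assert(rd != NULL);
    for (int i = 0; i < 2; i++) {
        if (rd(ctx, candidates[i], buf) == 0 &&
            memcmp(buf, "EFI PART", 8) == 0)
            return candidates[i];
    }
    return 0;
}
```

With this shape, both boot programs would call the same routine and differ only in the callback they pass in.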