Bug 222425 - GPT broken after "gpart backup ada0 | gpart restore -F ada1"
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 11.1-RELEASE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-bugs mailing list
 
Reported: 2017-09-18 11:12 UTC by chris
Modified: 2018-12-05 14:43 UTC

Description chris 2017-09-18 11:12:11 UTC
There is a problem with the MBR after running "gpart backup ada0 | gpart restore -F ada1" and because of this it's not possible to boot using this disk.

This issue happens only with FreeBSD 11.1 and maybe 11.0. It doesn't happen with 10.3.

How to reproduce the issue:

gpart backup ada0 | gpart restore -F ada1
gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 ada1

Then, when booting from ada1, I see:

gptboot: invalid primary GPT header
gptboot: invalid backup GPT header
gptboot: unable to load GPT

The only workaround I found is copying the first 40 sectors (the protective MBR and the primary GPT) from ada0 to ada1:

gpart backup ada0 | gpart restore -F ada1
dd if=/dev/ada0 of=/dev/ada1 bs=512 count=40
gpart recover ada1
gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 ada1

More information here:

https://forums.freebsd.org/threads/62376/
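
For reference, the `dd` step in the workaround copies the first 40 sectors of the disk. The arithmetic below is a minimal sketch, assuming 512-byte sectors and the first usable sector of 40 shown by "gpart show" in comment 7; the variable names are illustrative only. LBA 0 holds the protective MBR and LBA 1 the primary GPT header, so both fall inside the copied range.

```shell
#!/bin/sh
# Which sectors does `dd bs=512 count=40` copy?
bs=512            # sector size used in the dd command
count=40          # number of sectors copied
first_usable=40   # assumption: taken from the "=>" line of `gpart show`

last_copied=$((count - 1))   # dd starts at LBA 0
bytes=$((bs * count))

echo "copied LBAs: 0-${last_copied} (${bytes} bytes)"
if [ "$last_copied" -lt "$first_usable" ]; then
    echo "copy ends before first usable sector ${first_usable}"
fi
```

Under these assumptions the copy covers everything in front of the first partition and nothing inside it.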
Comment 1 Andrey V. Elsukov freebsd_committer 2017-09-21 14:23:09 UTC
(In reply to chris from comment #0)
> There is a problem with the MBR after running "gpart backup ada0 | gpart
> restore -F ada1" and because of this it's not possible to boot using this
> disk.
> 
> This issue happens only with FreeBSD 11.1 and maybe 11.0. It doesn't happen
> with 10.3.

On the forum you said that you changed the hardware, so I'm not sure this statement is fully correct. You need to check that 10.3 works on the same hardware, because this could be just a BIOS issue or something similar.
 
> How to reproduce the issue:
> 
> gpart backup ada0 | gpart restore -F ada1
> gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 ada1
> 
> Then boot using ada1 I see:
> 
> gptboot: invalid primary GPT header
> gptboot: invalid backup GPT header
> gptboot: unable to load GPT

There were no changes in gpart(8), but the boot code was changed.
Did you try swapping the disks, i.e. connecting the working ada0 in place of ada1? Does it still work? Does ada1 work when connected to ada0's port?
 
> The only workaround is copying the MBR from ada0 to ada1:
> 
> gpart backup ada0 | gpart restore -F ada1

This should create a partition table on ada1 similar to the one on ada0. What if you run `true > /dev/ada1` after that? This command triggers a GEOM retaste. If the GPT is invalid at that point, something went wrong and the table was not written to the disk. If the GPT is valid, something goes wrong after the reboot.

Also, if you reboot from ada0 and the partition tables on both disks are detected correctly, then something is wrong with the bootcode or the BIOS.
Comment 2 chris 2017-09-21 14:40:58 UTC
Yes, 10.3 works on the same hardware without issue.

The problem is not related to specific hardware, as I tested it on different servers with different motherboards and different disks.

The datacenter also swapped the disks, so ada0 became ada1 and ada1 became ada0, but the same issue appeared again after I installed FreeBSD 11.1.

I will run "true > /dev/ada1" and let you know.
Comment 3 chris 2017-09-21 14:43:39 UTC
> Also, if you reboot from ada0 and partition tables from both disks will be detected correctly, this means that something wrong with bootcode/BIOS.

Partition tables from both disks are correct after the reboot.
Comment 4 chris 2017-09-21 15:39:40 UTC
I ran:

gpart backup ada0 | gpart restore -F ada1
true > /dev/ada1
gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 ada1

The same issue.
Comment 5 Andrey V. Elsukov freebsd_committer 2017-09-21 15:46:27 UTC
(In reply to chris from comment #4)
> I run:
> 
> gpart backup ada0 | gpart restore -F ada1
> true > /dev/ada1
> gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 ada1

So, if the partition table is still there after `true > /dev/ada1`, you have a correct partition table before the reboot.

This means that gptboot, for some reason, considers the GPT invalid at boot time.
I assume that when you switch the BIOS back to booting from ada0, the partition tables on both disks are good, i.e. `gpart show` lists all partitions without the word CORRUPT?
Can you try writing gptboot from 10.3?
I.e. `gpart bootcode -b /boot/pmbr -p /10.3/boot/gptboot -i 1 ada1`,
where /10.3/boot/gptboot is the bootcode from 10.3.
Comment 6 Andrey V. Elsukov freebsd_committer 2017-09-21 15:47:15 UTC
(In reply to chris from comment #4)
> I run:
> 
> gpart backup ada0 | gpart restore -F ada1
> true > /dev/ada1
> gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 ada1

So, if the partition table is still there after `true > /dev/ada1`, you have a correct partition table before the reboot.

This means that gptboot, for some reason, considers the GPT invalid at boot time.
I assume that when you switch the BIOS back to booting from ada0, the partition tables on both disks are good, i.e. `gpart show` lists all partitions without the word CORRUPT?
Can you try writing gptboot from 10.3?
I.e. `gpart bootcode -b /boot/pmbr -p /10.3/boot/gptboot -i 1 ada1`,
where /10.3/boot/gptboot is the bootcode from 10.3.
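
The check for the word CORRUPT can be scripted. This is a minimal sketch, assuming `gpart show` output is piped in on stdin; `gpt_status` is a hypothetical helper name and not part of gpart(8).

```shell
#!/bin/sh
# Hypothetical helper: reads `gpart show` output on stdin and prints
# "corrupt" if the word CORRUPT appears anywhere, "ok" otherwise.
gpt_status() {
    if grep -q 'CORRUPT'; then
        echo "corrupt"
    else
        echo "ok"
    fi
}

# Example with a header line in the shape `gpart show` prints:
printf '=>        40  7814037088  ada1  GPT  (3.6T)\n' | gpt_status
```

On a live system this could be used as `true > /dev/ada1; gpart show ada1 | gpt_status`, once before rebooting and once after booting from ada0.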
Comment 7 chris 2017-09-21 16:31:41 UTC
Yes, after booting from ada0, "gpart show" lists all partitions without the word CORRUPT:

=>        40  7814037088  ada0  GPT  (3.6T)
          40        1024     1  freebsd-boot  (512K)
        1064     8388608     2  freebsd-ufs  (4.0G)
     8389672    33554432     3  freebsd-swap  (16G)
    41944104    33554432     4  freebsd-ufs  (16G)
    75498536   134217728     5  freebsd-ufs  (64G)
   209716264    67108864     6  freebsd-ufs  (32G)
   276825128  1610612736     7  freebsd-ufs  (768G)
  1887437864  5926599256     8  freebsd-ufs  (2.8T)
  7814037120           8        - free -  (4.0K)

=>        40  7814037088  ada1  GPT  (3.6T)
          40        1024     1  freebsd-boot  (512K)
        1064     8388608     2  freebsd-ufs  (4.0G)
     8389672    33554432     3  freebsd-swap  (16G)
    41944104    33554432     4  freebsd-ufs  (16G)
    75498536   134217728     5  freebsd-ufs  (64G)
   209716264    67108864     6  freebsd-ufs  (32G)
   276825128  1610612736     7  freebsd-ufs  (768G)
  1887437864  5926599256     8  freebsd-ufs  (2.8T)
  7814037120           8        - free -  (4.0K)

I wrote the bootcode from 10.3:

root@server2:/home/10.3 # pwd
/home/10.3

root@server2:/home/10.3 # restore -i -f /home2/dump/root.dump
restore > add *
restore > extract
You have not read any tapes yet.
If you are extracting just a few files, start with the last volume
and work towards the first; restore can quickly skip tapes that
have no further files to extract. Otherwise, begin with volume 1.
Specify next volume #: 1
set owner/mode for '.'? [yn] y
restore >
root@server2:/home/10.3 #

root@server2:/home/10.3 # gpart bootcode -b /home/10.3/boot/pmbr -p /home/10.3/boot/gptboot -i 1 ada1
partcode written to ada1p1
bootcode written to ada1
root@server2:/home/10.3 #


The same problem.
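
The two "gpart show" listings above can also be compared mechanically instead of by eye. This is a minimal sketch, assuming the column layout shown above (start, size, index, type); `layout_rows` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads `gpart show` output on stdin and prints one
# "start size index type" row per partition. The "=>" header (which carries
# the disk name) and the "- free -" rows are skipped.
layout_rows() {
    awk '$1 != "=>" && NF >= 4 && $3 != "-" { print $1, $2, $3, $4 }'
}

# Shortened sample taken from the ada1 listing above.
sample='=>        40  7814037088  ada1  GPT  (3.6T)
          40        1024     1  freebsd-boot  (512K)
        1064     8388608     2  freebsd-ufs  (4.0G)
  7814037120           8        - free -  (4.0K)'

printf '%s\n' "$sample" | layout_rows
```

Saving `gpart show ada0 | layout_rows` and `gpart show ada1 | layout_rows` to two files and running `cmp` on them would confirm that both disks carry the same layout.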
Comment 8 Andrey V. Elsukov freebsd_committer 2017-09-21 16:39:23 UTC
Oh, I didn't notice that you use 4T disks. This may be the cause, but it doesn't explain why it works when the GPT is overwritten with a copy from ada0.
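
To put the disk size in numbers, here is a sketch using the values from the "gpart show" output in comment 7, assuming 512-byte sectors. It only shows that the end of the disk lies beyond the 2^32-sector (2 TiB) mark; whether any 32-bit LBA limit is actually involved in this bug is not established here.

```shell
#!/bin/sh
# Values from the "=>" line of `gpart show` (assumption: 512-byte sectors).
first_usable=40
usable_sectors=7814037088
limit_32bit=4294967296    # 2^32 sectors, i.e. 2 TiB at 512 bytes per sector

last_usable=$((first_usable + usable_sectors - 1))

echo "last usable LBA:   ${last_usable}"
echo "32-bit LBA limit:  ${limit_32bit}"
if [ $((last_usable >= limit_32bit)) -eq 1 ]; then
    echo "the end of the disk is beyond the 32-bit LBA range"
fi
```

The backup GPT header is stored after the last usable sector, so it is past this mark as well, while the primary header at LBA 1 is not.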
Comment 9 chris 2017-09-21 16:50:53 UTC
Yes, it doesn't make sense.

I also tried the following, without success:

sysctl kern.geom.debugflags=17
dd if=/dev/zero of=/dev/ada1 bs=512 count=40
gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 ada1
Comment 10 Warren Block freebsd_committer 2017-12-06 15:59:24 UTC
I'm seeing this also when trying to switch from a 64G to 200G virtual disk in a VirtualBox VM.

If I use gpart backup | gpart restore to copy the partition information, then delete and recreate the later partitions as larger ones, then write the bootcode, I get

gptboot: invalid primary GPT header
gptboot: invalid backup GPT header
gptboot: unable to load GPT

flickering and repeating quickly on the VM console.  Rewriting the bootcode again with gpart bootcode -b /boot/pmbr -p /boot/gptboot does not fix it, nor does setting the disk active.  sha256 checksums on the boot partitions showed the working and non-working disks were identical.  gpart says the old and new disk partition tables are valid, but the new one does not boot.

Doing a gpart destroy -F on the new disk and manually creating partitions and writing bootcode works, and the drive boots normally.

I have not tested on physical hardware.
Comment 11 chris 2018-05-23 23:09:12 UTC
Does anyone have more information about this issue?
Comment 12 chris 2018-12-05 14:43:36 UTC
The problem is related to:

gpart backup ada1 | gpart restore -F ada0

If, after the gpart restore, I remove and recreate the freebsd-boot partition, it works:

gpart delete -i 1 ada0
gpart add -a 4k -t freebsd-boot -s 512k ada0
gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 ada0

Keep in mind that you have to use mfsbsd to remove the partition, or you may be able to do it on a live system after running this command:

sysctl kern.geom.debugflags=16
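
The three gpart commands above can be wrapped in a dry-run helper so they can be reviewed for a given disk before anything is executed. This is only a sketch; `recreate_boot_cmds` is a hypothetical name, and the function prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the commands from this comment for
# the disk given as $1. Nothing is executed against the disk.
recreate_boot_cmds() {
    disk=$1
    echo "gpart delete -i 1 ${disk}"
    echo "gpart add -a 4k -t freebsd-boot -s 512k ${disk}"
    echo "gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 ${disk}"
}

recreate_boot_cmds ada0
```

Once the printed commands look right, they can be run by hand or piped to `sh`.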

Another user on the FreeBSD forum reports this:

MBR + UFS --- OK
GPT + UFS --- NO
GPT + ZFS --- OK