Bug 208742

Summary: /boot/gptzfsboot boots kernel from another zpool (not root pool)
Product: Base System
Reporter: Markus Grundmann <markus>
Component: kern
Assignee: freebsd-fs (Nobody) <fs>
Status: Closed Works As Intended
Severity: Affects Some People
CC: eugen, markus, matthew, nowak, smh
Priority: ---
Version: 10.2-RELEASE
Hardware: amd64
OS: Any

Description Markus Grundmann 2016-04-12 14:04:40 UTC
Hi FreeBSD developers,

My server has the following GPT configuration:

$ gpart show -l ada2
=>       34  234441581  ada2  GPT  (112G)
         34          6        - free -  (3.0K)
         40       1024     1  gptboot0  (512K)
       1064        984        - free -  (492K)
       2048   16777216     2  backup0  (8.0G) <= Type ZFS
   16779264  217661440     3  operatingsystem0  (104G) <= Type ZFS
  234440704        911        - free -  (456K)

[mg16373@trinitron ~]$ gpart show -l ada3
=>       34  234441581  ada3  GPT  (112G)
         34          6        - free -  (3.0K)
         40       1024     1  gptboot1  (512K)
       1064        984        - free -  (492K)
       2048   16777216     2  backup1  (8.0G) <= Type ZFS
   16779264  217661440     3  operatingsystem1  (104G) <= Type ZFS
  234440704        911        - free -  (456K)

Partition 3 contains the preferred boot pool, named "zroot", and is mirrored without problems. Partition 2 holds a zpool named "zbackup". This "zbackup" pool has a "/boot" directory that contains a backup of my previous "zroot/boot" directory.

Today I upgraded the server from FreeBSD 10.2-RELEASE P#14 to 10.3-RELEASE, but the server still starts every time with my custom "10.2-RELEASE P#14" kernel. The root filesystem seems to be taken from partition 3. As a result the "pf.ko" module (built for 10.3) was not loaded, and a message along the lines of "kernel version mismatch" was displayed.

After some investigation I copied all files from "zroot/boot" to "zbackup/boot" (the pool had previously been unmounted and not imported by ZFS) and restarted the server. WOW! The system booted the 10.3-RELEASE kernel.

I have now destroyed the ZFS pool "zbackup" and zeroed partition 2. The server now boots from the intended "zroot/boot" filesystem with the right kernel.

I believe that "gptzfsboot" uses the first available ZFS pool that contains a valid "/boot" filesystem or directory.
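
The suspected behaviour can be modelled with a short shell sketch. This is only an illustration of "the first ZFS partition in table order wins" as described in this report, not gptzfsboot source code. The function name first_zfs_partition and the "index type label" input format are made up for the example, and the freebsd-boot type for partition 1 is an assumption (the report only marks partitions 2 and 3 as ZFS).

```shell
#!/bin/sh
# Illustrative model only: this is NOT how gptzfsboot is implemented.
# Input: one partition per line as "index type label", in on-disk order.
# Output: the label of the first partition whose type is freebsd-zfs.
first_zfs_partition() {
    awk '$2 == "freebsd-zfs" { print $3; exit }'
}

# Layout of ada2 from this report. The type of partition 1 is assumed.
first_zfs_partition <<'EOF'
1 freebsd-boot gptboot0
2 freebsd-zfs backup0
3 freebsd-zfs operatingsystem0
EOF
```

With this layout the sketch prints backup0, the partition holding "zbackup", which matches the kernel being loaded from the backup pool rather than from "zroot".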

Best regards,
Markus Grundmann
Germany

Sorry for my bad English ;-)
Comment 1 Andriy Gapon freebsd_committer freebsd_triage 2016-04-13 08:18:21 UTC
(In reply to Markus Grundmann from comment #0)
Apologies, but did you just want to tell a story? Or did you want to ask a question? Or did you want to report a bug? Could you please be a little bit more specific and tell what action you expect in response?
Comment 2 Markus Grundmann 2016-04-13 08:36:42 UTC
(In reply to Andriy Gapon from comment #1)

I have no idea why the wrong pool was used by the first boot stage to load the kernel. I think it is a bug. Once the kernel has booted, the right zpool is used.

"/boot/loader" says:
vfs.root.mountfrom = "zfs:zroot/ROOT/default"

and not

vfs.root.mountfrom = "zfs:zbackup/..."

I think the gptzfsboot code does not take the name of the pool to boot into account.

Best regards,
Markus
Comment 3 Steven Hartland freebsd_committer freebsd_triage 2016-04-13 12:24:29 UTC
vfs.root.mountfrom tells the kernel which FS to use as the root FS. It takes effect only after the kernel is loaded, and the kernel itself is loaded from the first valid partition on the boot drive, hence what you're seeing.

It would be nice if we had more control over this, but the quick fix is to remove the loader from your old device.
Comment 4 Markus Grundmann 2016-04-13 12:45:47 UTC
The old device (partition ID 2) has no boot loader.
Comment 5 Andriy Gapon freebsd_committer freebsd_triage 2016-04-13 12:56:06 UTC
(In reply to Markus Grundmann from comment #2)
The behaviour of gptzfsboot is described in its manual page, likewise the behaviour of zfsloader. Do you see anything that works contrary to the documented behaviour?
https://www.freebsd.org/cgi/man.cgi?query=gptzfsboot&sektion=8&apropos=0&manpath=FreeBSD+10.3-RELEASE
Comment 6 Steven Hartland freebsd_committer freebsd_triage 2016-04-13 12:57:38 UTC
gptzfsboot is not used for EFI.
Comment 7 Steven Hartland freebsd_committer freebsd_triage 2016-04-13 12:59:58 UTC
For clarification, my comments are for EFI, not that this is the case here; sorry if that was confusing.
Comment 8 nowak 2016-04-13 13:03:05 UTC
Changing the partition type to something other than freebsd-zfs should make gptzfsboot skip that partition.

Changing the order of the 2nd and 3rd partitions would also work.
Comment 9 Andriy Gapon freebsd_committer freebsd_triage 2016-04-13 13:12:56 UTC
(In reply to Steven Hartland from comment #3)
We have some control via /boot.config or /boot/config, but they have to be placed on the first pool.  BTW, I am not sure if just removing zfsloader from the first pool is going to help. I suspect that that can result in a boot failure, but I am not sure.

There has also been a proposal for the ZFS boot chain to ignore pools that do not have bootfs explicitly set unless such a pool is the only discovered pool.
I think that that makes sense.
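
A minimal shell sketch of the proposed rule, for illustration only. Nothing here is existing boot code: the function name pick_boot_pool and the "pool bootfs" input format are invented for the example, "-" stands for an unset bootfs as in zpool get output, and selecting nothing when several pools exist and none has bootfs set is an assumption, since the proposal does not cover that case. The bootfs value shown for zroot is also assumed.

```shell
#!/bin/sh
# Illustrative model of the proposal only, not existing boot code.
# Input: one pool per line as "name bootfs", in discovery order;
# "-" means bootfs is not set.
pick_boot_pool() {
    awk '
        { name[NR] = $1; bootfs[NR] = $2 }
        END {
            # A single discovered pool is used even without bootfs.
            if (NR == 1) { print name[1]; exit 0 }
            # Otherwise skip pools that have no explicit bootfs.
            for (i = 1; i <= NR; i++)
                if (bootfs[i] != "-") { print name[i]; exit 0 }
            # Assumption: nothing is selected if no pool qualifies.
            exit 1
        }'
}

# Pools from this report; the bootfs value of zroot is assumed.
pick_boot_pool <<'EOF'
zbackup -
zroot zroot/ROOT/default
EOF
```

Under these assumptions the sketch prints zroot: "zbackup" is discovered first but is skipped because its bootfs is unset.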
Comment 10 Markus Grundmann 2016-04-13 13:18:02 UTC
No "bootfs" is defined. I understand the "issue" if gptzfsboot uses the first available zpool that holds all the files in /boot for the first stage of boot.


# zpool import zbackup
# zpool get all zbackup
NAME     PROPERTY                       VALUE                          SOURCE
zbackup  size                           7.94G                          -
zbackup  capacity                       6%                             -
zbackup  altroot                        -                              default
zbackup  health                         ONLINE                         -
zbackup  guid                           16990376562684171633           default
zbackup  version                        -                              default
zbackup  bootfs                         -                              default
zbackup  delegation                     on                             default
zbackup  autoreplace                    off                            default
zbackup  cachefile                      -                              default
zbackup  failmode                       wait                           default
zbackup  listsnapshots                  on                             local
zbackup  autoexpand                     off                            default
zbackup  dedupditto                     0                              default
zbackup  dedupratio                     1.00x                          -
zbackup  free                           7.40G                          -
zbackup  allocated                      550M                           -
zbackup  readonly                       off                            -
zbackup  comment                        -                              default
zbackup  expandsize                     -                              -
zbackup  freeing                        0                              default
zbackup  fragmentation                  5%                             -
zbackup  leaked                         0                              default
zbackup  feature@async_destroy          enabled                        local
zbackup  feature@empty_bpobj            active                         local
zbackup  feature@lz4_compress           active                         local
zbackup  feature@multi_vdev_crash_dump  enabled                        local
zbackup  feature@spacemap_histogram     active                         local
zbackup  feature@enabled_txg            active                         local
zbackup  feature@hole_birth             active                         local
zbackup  feature@extensible_dataset     enabled                        local
zbackup  feature@embedded_data          active                         local
zbackup  feature@bookmarks              enabled                        local
zbackup  feature@filesystem_limits      enabled                        local
zbackup  feature@large_blocks           enabled                        local
Comment 11 Steven Hartland freebsd_committer freebsd_triage 2016-04-13 14:27:30 UTC
(In reply to Andriy Gapon from comment #9)
That's a good idea in principle but I suspect if implemented it may result in a lot of unbootable machines :(
Comment 12 Andriy Gapon freebsd_committer freebsd_triage 2016-04-14 14:00:52 UTC
(In reply to Steven Hartland from comment #11)
I've never conducted any survey on this, but my impression is that the share of systems with multiple ZFS pools is not that great. Also, among those systems, the share where the bootfs property of the boot pool is not set (left at its default) is even smaller. And the preparation for the change would be trivial given a sufficiently early heads-up warning.
Comment 13 Steven Hartland freebsd_committer freebsd_triage 2016-04-14 15:00:06 UTC
(In reply to Andriy Gapon from comment #12)
Looking at a handful of machines here that seems to be the case:
zpool get bootfs
NAME  PROPERTY  VALUE      SOURCE
data  bootfs    -          default
tank  bootfs    tank/root  local
Comment 14 Matthew Seaman freebsd_committer freebsd_triage 2016-04-14 17:15:54 UTC
But that looks identical to what the installer generates when using an encrypted pool:

backup-4:~:% zpool get bootfs 
NAME      PROPERTY  VALUE               SOURCE
bootpool  bootfs    -                   default
zroot     bootfs    zroot/ROOT/default  local

'bootpool' contains the unencrypted kernel we want to boot from, but zroot 
has the (encrypted) root filesystem.
Comment 15 Eugene Grosbein freebsd_committer freebsd_triage 2019-03-01 17:01:24 UTC
The gptzfsboot(8) manual page clearly states that the partition containing the ZFS boot pool must be first, not second. Just change the order of the partitions and you will be fine.