Bug 271989 - zfs root mount error 6 after upgrade from 11.1-release to 13.2-release
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.2-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: needs-qa
Depends on:
Blocks:
 
Reported: 2023-06-14 09:02 UTC by Markus Wild
Modified: 2026-05-05 21:58 UTC

See Also:


Attachments
boot transcript (21.65 KB, application/x-bzip)
2023-06-14 09:03 UTC, Markus Wild

Description Markus Wild 2023-06-14 09:02:37 UTC
Problem:
- system runs fine with 11.1-RELEASE
- prepared upgrade to 13.2-RELEASE in new boot environment
- updated first efi partition to load loader_4th.efi  (the lua default will fail when trying to load the 11.1 environment)
- upon reboot to 13.2, the kernel loads fine, but fails to mount the root pool:

   Mounting from zfs:zroot/ROOT/fbsd132 failed with error 6.

   Loader variables:
      vfs.root.mountfrom=zfs:zroot/ROOT/fbsd132


Environment:
- Supermicro X10DRi server, used as an iSCSI NAS with a separate data pool (LSI controllers)
- two NVMe P3700 cards host the zroot pool (p3) and carry additional partitions (p4) that provide a mirrored log for the data pool

$ gpart show -l nvd0
=>       40  781422688  nvd0  GPT  (373G)
         40     409600     1  efiboot0  (200M)
     409640       2008        - free -  (1.0M)
     411648   33554432     2  swap0  (16G)
   33966080  104857600     3  (null)  (50G)
  138823680   33554432     4  (null)  (16G)
  172378112  609044616        - free -  (290G)

$ gpart show -l nvd1
=>       40  781422688  nvd1  GPT  (373G)
         40     409600     1  efiboot1  (200M)
     409640       2008        - free -  (1.0M)
     411648   33554432     2  swap1  (16G)
   33966080  104857600     3  (null)  (50G)
  138823680   33554432     4  (null)  (16G)
  172378112  609044616        - free -  (290G)

zroot:
        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            nvd0p3  ONLINE       0     0     0
            nvd1p3  ONLINE       0     0     0

GUIDs:
zroot  guid                          2596257223750470056            default
data  guid                           10144353125609547525           default

I enabled vfs.zfs.debug = "1" in loader.conf, and it seems the boot loader is trying to 
load the wrong pool guid 10350976291831707895, but I have no idea where it gets this guid
from:

vdev_geom_open_by_guids:767[1]: Searching by guids [10350976291831707895:18166718715878495545].
vdev_geom_open_by_guids:767[1]: Searching by guids [10350976291831707895:8932737958006505339].
vdev_attach_ok:687[1]: pool guid mismatch for provider nvd1p4: 10350976291831707895 != 10144353125609547525.
vdev_attach_ok:687[1]: pool guid mismatch for provider nvd1p4: 10350976291831707895 != 10144353125609547525.
vdev_attach_ok:687[1]: pool guid mismatch for provider nvd1p3: 10350976291831707895 != 2596257223750470056.
vdev_attach_ok:687[1]: pool guid mismatch for provider nvd1p3: 10350976291831707895 != 2596257223750470056.
[...]
vdev_geom_open_by_guids:779[1]: Attach by guid [10350976291831707895:18166718715878495545] succeeded, provider nvd0.
vdev_geom_open_by_guids:779[1]: Attach by guid [10350976291831707895:8932737958006505339] succeeded, provider nvd1.
[...]
Mounting from zfs:zroot/ROOT/fbsd132 failed with error 6.

Can I somehow force the use of a specific guid, or blacklist the wrong one?
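
A minimal sketch for condensing the vfs.zfs.debug output above, assuming the "pool guid mismatch" lines always have the shape shown in this report (wanted guid on the left of "!=", guid found on the provider on the right). The function name and the sample text are illustrative only.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   vdev_attach_ok:687[1]: pool guid mismatch for provider nvd1p3: A != B.
# Assumption: A is the guid being searched for, B is what the provider carries.
MISMATCH = re.compile(r"pool guid mismatch for provider (\S+): (\d+) != (\d+)")


def summarize_mismatches(log_text):
    """Return {wanted_guid: {provider: found_guid}} from the debug output."""
    summary = defaultdict(dict)
    for line in log_text.splitlines():
        m = MISMATCH.search(line)
        if m:
            provider = m.group(1)
            wanted, found = int(m.group(2)), int(m.group(3))
            summary[wanted][provider] = found
    return dict(summary)


# Two lines copied from the transcript above.
sample = """\
vdev_attach_ok:687[1]: pool guid mismatch for provider nvd1p4: 10350976291831707895 != 10144353125609547525.
vdev_attach_ok:687[1]: pool guid mismatch for provider nvd1p3: 10350976291831707895 != 2596257223750470056.
"""

for wanted, providers in summarize_mismatches(sample).items():
    print(f"searching for pool guid {wanted}")
    for provider, found in sorted(providers.items()):
        print(f"  {provider} carries pool guid {found}")
```

Comparing the "carries" values with the GUIDs list above makes it easy to see that the providers hold the expected zroot and data pools, and that the guid being searched for belongs to neither.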

Regards,
Markus
Comment 1 Markus Wild 2023-06-14 09:03:52 UTC
Created attachment 242771
boot transcript
Comment 2 Markus Wild 2023-06-14 13:31:30 UTC
Turns out, there was an old zfs label AT THE END of the unused 
disk space of the two nvd drives from the original auto-setup.

Why on earth does the 13.2 kernel prefer an incomplete label (label 0/1 were
missing, label 2/3 at the end of the whole disk contained the old information),
when there was a perfectly valid "freebsd-zfs" gpt partition with all
valid labels!? Shouldn't the search path be:
- check partition table
- check every entry of type "freebsd-zfs" for labels
- if nothing found, check whole disk
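
To make the difference between the two probe orders concrete, here is a toy model. It is not the loader or kernel code; it only assumes that the first provider with a readable label wins. `pick_label` and the `disk` mapping are hypothetical names, and the guids are the ones quoted in the description.

```python
def pick_label(disk, order):
    """Return (provider, pool_guid) for the first provider that has a label.

    `disk` maps provider names to the pool guid found there (or None).
    `order` lists provider names in the order they are probed.
    """
    for provider in order:
        guid = disk.get(provider)
        if guid is not None:
            return provider, guid
    return None


# Toy data modelled on this report: a stale label reachable through the
# whole-disk device and a valid label inside the freebsd-zfs partition.
disk = {
    "nvd0": 10350976291831707895,   # old label left at the end of the disk
    "nvd0p3": 2596257223750470056,  # current zroot
}

whole_disk_first = ["nvd0", "nvd0p3"]   # behaviour observed in this report
partitions_first = ["nvd0p3", "nvd0"]   # order proposed above

print(pick_label(disk, whole_disk_first))
print(pick_label(disk, partitions_first))
```

With the whole disk probed first the stale guid is chosen; with partitions probed first the current zroot guid is chosen, and the whole disk is only consulted when no partition carries a label.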
Comment 3 Graham Perrin freebsd_committer freebsd_triage 2023-06-15 00:54:51 UTC
grep -e kern.geom.label.disk_ident.enable -e kern.geom.label.gptid.enable /boot/loader.conf


What's found?
Comment 4 Markus Wild 2023-06-15 04:59:38 UTC
(In reply to Graham Perrin from comment #3)
kern.geom.label.disk_ident.enable="0"
kern.geom.label.gptid.enable="0"

these were set by the installer as far as I remember.
Comment 5 Dan Langille freebsd_committer freebsd_triage 2023-06-21 17:39:34 UTC
It seems you are not alone. I have the same problem on a 13.1 to 13.2 upgrade.

I'm nearly out of battery here, so: https://twitter.com/DLangille/status/1671569674419437596

Screen shots in that thread.
Comment 6 Magnus Kaiser 2023-06-21 20:12:04 UTC
You may need to upgrade the root zpool you boot from before the first reboot into the 13.x kernel. I had the same error code once, when I did a system upgrade without ever upgrading the zroot pool. Since you are jumping over a major release, it may also help to upgrade in smaller steps and to upgrade the pool whenever that is offered.
Comment 7 Markus Wild 2023-06-22 05:08:34 UTC
(In reply to Dan Langille from comment #5)
Check whether you have "ghost" zfs labels on your main drives, besides
the expected ones in the zfs partitions:

zdb -l /dev/ada0
zdb -l /dev/ada1

If yours is the same problem as mine was, you'll find the last two of
the 4 labels valid, and the kernel will then happily use this wrong label
(it will simply ignore the partitions), only to fail to mount the pool later.
I see no good reason to upgrade the root pool in such a situation; all it
does is make a fallback to the working system impossible.

Cheers,
Markus
Comment 8 Dan Langille freebsd_committer freebsd_triage 2023-06-22 12:03:32 UTC
(In reply to Markus Wild from comment #7)

Here is ada0 (ada1 is similar, differing only by 'guid: 18059354552686318005'):

[x8dtu dan ~] % sudo zdb -l /dev/ada0                                                                          11:57:36
failed to unpack label 0
failed to unpack label 1
------------------------------------
LABEL 2
------------------------------------
    version: 5000
    name: 'zroot'
    state: 0
    txg: 13307170
    pool_guid: 18320603570228782289
    errata: 0
    hostname: ''
    top_guid: 9280302292269909223
    guid: 3142406804989521377
    vdev_children: 1
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 9280302292269909223
        metaslab_array: 68
        metaslab_shift: 31
        ashift: 12
        asize: 228578557952
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 18059354552686318005
            path: '/dev/ada1p3'
            phys_path: 'id1,enc@n3061686369656d30/type@0/slot@2/elmdesc@Slot_01/p3'
            whole_disk: 1
            DTL: 20272
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 3142406804989521377
            path: '/dev/ada0p3'
            phys_path: 'id1,enc@n3061686369656d30/type@0/slot@1/elmdesc@Slot_00/p3'
            whole_disk: 1
            DTL: 20271
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 2 3
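
A minimal sketch for spotting the pattern from comment 7 (labels 0/1 unreadable, labels 2/3 present) in `zdb -l` output. It assumes the output format quoted above; the function names are illustrative and the sample is an abbreviated copy of that output.

```python
import re


def summarize_zdb_labels(text):
    """Summarize `zdb -l` output in the format quoted in this report."""
    failed = sorted(int(n) for n in re.findall(r"failed to unpack label (\d)", text))
    valid = set()
    for group in re.findall(r"labels = ([\d ]+)", text):
        valid.update(int(n) for n in group.split())
    guids = sorted({int(g) for g in re.findall(r"pool_guid: (\d+)", text)})
    return {"failed": failed, "valid": sorted(valid), "pool_guids": guids}


def looks_like_ghost_label(summary):
    """True when labels 0 and 1 are gone but labels 2 and 3 are present."""
    return summary["failed"] == [0, 1] and summary["valid"] == [2, 3]


# Abbreviated from the ada0 output above.
sample = """\
failed to unpack label 0
failed to unpack label 1
------------------------------------
LABEL 2
------------------------------------
    version: 5000
    name: 'zroot'
    pool_guid: 18320603570228782289
    labels = 2 3
"""

summary = summarize_zdb_labels(sample)
print(summary)
print("candidate ghost label:", looks_like_ghost_label(summary))
```

The pattern alone does not prove the label is stale. The reported `pool_guid` still has to be compared with the guid of the pool that is actually imported before anything is cleared.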
Comment 9 Markus Wild 2023-06-22 12:30:28 UTC
(In reply to Dan Langille from comment #8)
there you go, the bogus 

pool_guid: 18320603570228782289

is what causes your kernel to fail to load the pool, since it shows up in
your console messages as mismatched comparisons against the vdevs the kernel found.
This is most likely, as with my installation, the result of originally creating
the zpool on the entire disk, and later removing that pool, shrinking the
zfs partition and recreating the pool. From what I reverse engineered, a zpool
seems to put 2 labels at the beginning of its assigned disk space and
2 labels at the end, most likely so that the labels can be restored
should someone/something accidentally overwrite them.
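
The label placement can be sketched as follows. This assumes a label size of 256 KiB and that the device size is rounded down to a multiple of that size before the two trailing labels are placed, which is my understanding of the ZFS on-disk format. The sizes in the example are hypothetical, not taken from this system.

```python
LABEL_SIZE = 256 * 1024   # assumed size of one vdev label (256 KiB)
LABEL_COUNT = 4


def label_offsets(device_bytes):
    """Byte offsets of labels 0..3 relative to the start of the device."""
    usable = device_bytes - device_bytes % LABEL_SIZE   # round down to 256 KiB
    if usable < LABEL_COUNT * LABEL_SIZE:
        raise ValueError("device too small to hold four labels")
    front = [0, LABEL_SIZE]
    back = [usable - 2 * LABEL_SIZE, usable - LABEL_SIZE]
    return front + back


# Hypothetical layout: a pool first created on a whole 400 GiB device, then
# recreated in a 50 GiB partition that starts 16 GiB into the same device.
GiB = 1024 ** 3
whole = label_offsets(400 * GiB)
part_start = 16 * GiB
part = [part_start + off for off in label_offsets(50 * GiB)]

print("whole-disk labels:", whole)
print("partition labels: ", part)
```

In this layout nothing belonging to the new partition comes near the last 512 KiB of the device, so the two trailing labels of the old whole-disk pool survive unless that area is overwritten explicitly.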

The stupidity of the whole thing is: the kernel code to load the zfs root
filesystem seems to first scan the "entire disk device" for these 4 labels, and
if it finds any, will insist on using them and NOT consider any valid labels
of partitions in the GPT partition table. zpool import doesn't do this, it's
just the mount code in the kernel.

There is a "zpool labelclear" command which is supposed to clear these
wrong old labels, but I personally didn't trust it to not go ahead and 
clear the labels of ALL zfs instances on the disk if you let it loose on the
entire disk device. The man page is not very clear in this respect, and searching 
for this shows I was not the only one confused on the exact behavior of that 
command.

What I did in my case is:
- use gpart to add an additional temporary swap partition to fill the disk:
  gpart add -t freebsd-swap nvd0
- this resulted in a nvd0p5 in my case
- then I did
  dd if=/dev/zero of=/dev/nvd0p5 bs=1024M
  to clear that temp partition, and thus the end of the disk from the old 
  zpool label
- remove the temp partition again:
  gpart delete -i 5 nvd0
if you check the device again after this (zdb -l), it shouldn't find any
labels anymore.

What I'd expect for the future, and why I didn't ask for this bug report 
to be closed after I fixed my problem:
- kernel mount code should first check all valid zfs partitions for
  labels
- only if no labels are found in valid partitions should it also consider the
  entire disk device (nvd0, ada0, etc) to cover the cases where people define
  a zpool like "mirror /dev/ada0 /dev/ada1". I know this works for data pools,
  but I'm not sure you could actually boot from such a pool.

Cheers,
Markus
Comment 10 Dan Langille freebsd_committer freebsd_triage 2023-06-22 14:27:22 UTC
(In reply to Markus Wild from comment #9)
I wound up creating two temporary partitions on each of ada0 and ada1 - then dd'd over them. It did not fix the boot issue.

Details at https://gist.github.com/dlangille/af34e873727c62689ec937530b9ce398#file-gistfile6-txt

I did not do the 'zdb -l' before rebooting, but here it is now.  I think I need to do the same with ada2 and ada3.

[x8dtu dan ~] % zpool status zroot                                                                             14:21:51
  pool: zroot
 state: ONLINE
  scan: scrub repaired 0B in 00:03:15 with 0 errors on Mon Jun 19 03:09:30 2023
config:

	NAME        STATE     READ WRITE CKSUM
	zroot       ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    ada1p3  ONLINE       0     0     0
	    ada0p3  ONLINE       0     0     0

errors: No known data errors
[x8dtu dan ~] % sudo zdb -l /dev/ada0                                                                          14:21:57
failed to unpack label 0
failed to unpack label 1
------------------------------------
LABEL 2
------------------------------------
    version: 5000
    name: 'zroot'
    state: 0
    txg: 13322558
    pool_guid: 18320603570228782289
    errata: 0
    hostname: 'x8dtu.unixathome.org'
    top_guid: 9280302292269909223
    guid: 3142406804989521377
    vdev_children: 1
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 9280302292269909223
        metaslab_array: 68
        metaslab_shift: 31
        ashift: 12
        asize: 228578557952
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 18059354552686318005
            path: '/dev/ada1p3'
            phys_path: 'id1,enc@n3061686369656d30/type@0/slot@2/elmdesc@Slot_01/p3'
            whole_disk: 1
            DTL: 20272
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 3142406804989521377
            path: '/dev/ada0p3'
            phys_path: 'id1,enc@n3061686369656d30/type@0/slot@1/elmdesc@Slot_00/p3'
            whole_disk: 1
            DTL: 20271
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 2 3 
[x8dtu dan ~] % sudo zdb -l /dev/ada1                                                                          14:22:02
failed to unpack label 0
failed to unpack label 1
------------------------------------
LABEL 2
------------------------------------
    version: 5000
    name: 'zroot'
    state: 0
    txg: 13322558
    pool_guid: 18320603570228782289
    errata: 0
    hostname: 'x8dtu.unixathome.org'
    top_guid: 9280302292269909223
    guid: 18059354552686318005
    vdev_children: 1
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 9280302292269909223
        metaslab_array: 68
        metaslab_shift: 31
        ashift: 12
        asize: 228578557952
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 18059354552686318005
            path: '/dev/ada1p3'
            phys_path: 'id1,enc@n3061686369656d30/type@0/slot@2/elmdesc@Slot_01/p3'
            whole_disk: 1
            DTL: 20272
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 3142406804989521377
            path: '/dev/ada0p3'
            phys_path: 'id1,enc@n3061686369656d30/type@0/slot@1/elmdesc@Slot_00/p3'
            whole_disk: 1
            DTL: 20271
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 2 3
Comment 11 Dan Langille freebsd_committer freebsd_triage 2023-06-22 14:28:51 UTC
(In reply to Dan Langille from comment #10)
This is ada2 and ada3 (part of the main_tank zpool). NOTE the references to zroot in here.

[x8dtu dan ~] % sudo zdb -l /dev/ada2                                                                          14:20:51
failed to unpack label 0
failed to unpack label 1
------------------------------------
LABEL 2
------------------------------------
    version: 5000
    name: 'zroot'
    state: 0
    txg: 4
    pool_guid: 18066221975524666244
    hostname: 'x8dtu.unixathome.org'
    top_guid: 1284986517485020772
    guid: 7103001418055434556
    vdev_children: 1
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 1284986517485020772
        metaslab_array: 38
        metaslab_shift: 35
        ashift: 12
        asize: 4998827606016
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 7103001418055434556
            path: '/dev/ada0p3'
            whole_disk: 1
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 16802236127408357963
            path: '/dev/ada1p3'
            whole_disk: 1
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 2 3 
[x8dtu dan ~] % sudo zdb -l /dev/ada3                                                                          14:21:03
failed to unpack label 0
failed to unpack label 1
------------------------------------
LABEL 2
------------------------------------
    version: 5000
    name: 'zroot'
    state: 0
    txg: 4
    pool_guid: 18066221975524666244
    hostname: 'x8dtu.unixathome.org'
    top_guid: 1284986517485020772
    guid: 16802236127408357963
    vdev_children: 1
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 1284986517485020772
        metaslab_array: 38
        metaslab_shift: 35
        ashift: 12
        asize: 4998827606016
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 7103001418055434556
            path: '/dev/ada0p3'
            whole_disk: 1
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 16802236127408357963
            path: '/dev/ada1p3'
            whole_disk: 1
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 2 3
Comment 12 Dan Langille freebsd_committer freebsd_triage 2023-06-22 14:47:31 UTC
Success after clearing ada2/3.

[10:44 air01 dan ~] % x8dtu
Last login: Thu Jun 22 14:41:09 2023 from [redacted]
[x8dtu dan ~] % uname -a                                                                                       14:45:23
FreeBSD x8dtu.unixathome.org 13.2-RELEASE-p1 FreeBSD 13.2-RELEASE-p1 GENERIC amd64
[x8dtu dan ~] % freebsd-version -ukr                                                                           14:45:28
13.2-RELEASE-p1
13.2-RELEASE-p1
13.2-RELEASE-p1
[x8dtu dan ~] % sudo zdb -l /dev/ada0                                                                          14:45:32
failed to unpack label 0
failed to unpack label 1
------------------------------------
LABEL 2
------------------------------------
    version: 5000
    name: 'zroot'
    state: 0
    txg: 13322837
    pool_guid: 18320603570228782289
    errata: 0
    hostname: ''
    top_guid: 9280302292269909223
    guid: 3142406804989521377
    vdev_children: 1
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 9280302292269909223
        metaslab_array: 68
        metaslab_shift: 31
        ashift: 12
        asize: 228578557952
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 18059354552686318005
            path: '/dev/ada1p3'
            phys_path: 'id1,enc@n3061686369656d30/type@0/slot@2/elmdesc@Slot_01/p3'
            whole_disk: 1
            DTL: 20272
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 3142406804989521377
            path: '/dev/ada0p3'
            phys_path: 'id1,enc@n3061686369656d30/type@0/slot@1/elmdesc@Slot_00/p3'
            whole_disk: 1
            DTL: 20271
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 2 3 
[x8dtu dan ~] % sudo zdb -l /dev/ada1                                                                          14:46:36
failed to unpack label 0
failed to unpack label 1
------------------------------------
LABEL 2
------------------------------------
    version: 5000
    name: 'zroot'
    state: 0
    txg: 13322837
    pool_guid: 18320603570228782289
    errata: 0
    hostname: ''
    top_guid: 9280302292269909223
    guid: 18059354552686318005
    vdev_children: 1
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 9280302292269909223
        metaslab_array: 68
        metaslab_shift: 31
        ashift: 12
        asize: 228578557952
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 18059354552686318005
            path: '/dev/ada1p3'
            phys_path: 'id1,enc@n3061686369656d30/type@0/slot@2/elmdesc@Slot_01/p3'
            whole_disk: 1
            DTL: 20272
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 3142406804989521377
            path: '/dev/ada0p3'
            phys_path: 'id1,enc@n3061686369656d30/type@0/slot@1/elmdesc@Slot_00/p3'
            whole_disk: 1
            DTL: 20271
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 2 3 
[x8dtu dan ~] % sudo zdb -l /dev/ada2                                                                          14:46:39
failed to unpack label 0
failed to unpack label 1
failed to unpack label 2
failed to unpack label 3
[x8dtu dan ~] % sudo zdb -l /dev/ada3                                                                          14:46:43
failed to unpack label 0
failed to unpack label 1
failed to unpack label 2
failed to unpack label 3
Comment 13 Graham Perrin freebsd_committer freebsd_triage 2023-06-22 20:24:05 UTC
(In reply to Markus Wild from comment #9)

The four vdev labels are visualised: 

a) in section 1.2 of Sun's 'ZFS On-Disk Specification' (2006, draft), 
   a copy of which is at <https://www.giis.co.in/Zfs_ondiskformat.pdf>

b) in slide 21 of Richard Elling's 'ZFS Tutorial USENIX June 2009' at 
   <https://www.slideshare.net/relling/zfs-tutorial-usenix-june-2009>

c) in figure 2 of 'End-to-end Data Integrity for File Systems: A ZFS 
   Case Study' (PDF) via 
   <https://www.usenix.org/conference/fast-10/end-end-data-integrity-file-systems-zfs-case-study>
Comment 14 Graham Perrin freebsd_committer freebsd_triage 2023-06-22 21:04:20 UTC
zpool-labelclear(8)

<https://openzfs.github.io/openzfs-docs/man/8/zpool-labelclear.8.html>

(In reply to Markus Wild from comment #9)

> … didn't trust it to not go ahead and clear the labels of ALL zfs instances 
> on the disk if you let it loose on the entire disk device. …

I see no reason for distrust. 

<https://reviews.freebsd.org/P584>

Under <https://reviews.freebsd.org/P584$111>: 

> failed to clear label for /dev/da3

– that was correct, for the non-labelled device; and there was no interference with vdev labels of partitions 1 and 2.
Comment 15 Markus Wild 2023-06-23 06:57:47 UTC
thank you for the references to the label definition, looks like I
guessed correctly from ktrace on zdb -l 

about labelclear: the man page for this potentially very destructive
command is very short. It currently just says 
"Removes ZFS label information from the specified device". It's 
great to see it's just clearing what zdb -l would list.

Considering I was not the first to be uncertain what that means
if the "device" is a whole-disk-device, it would help to add something
to the man page like "If the device refers to a whole disk, not a partition,
only label entries at the beginning and end of the disk are acted upon,
individual partitions are not affected as long as they don't overlap with
the beginning and end of the disk".
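
The overlap condition in the proposed wording can be checked with a little arithmetic. The sketch below assumes 256 KiB labels placed as two at the start and two at the (256 KiB aligned) end of the whole-disk device. It only checks geometry and says nothing about what `zpool labelclear` actually writes. The partition numbers come from the `gpart show` output in the description; the total disk size of 781422768 sectors is an assumption, since the listing only shows the usable GPT area.

```python
LABEL_SIZE = 256 * 1024  # assumed vdev label size (256 KiB)


def whole_disk_label_ranges(disk_bytes):
    """(start, end) byte ranges of the four labels of a whole-disk vdev."""
    usable = disk_bytes - disk_bytes % LABEL_SIZE
    starts = [0, LABEL_SIZE, usable - 2 * LABEL_SIZE, usable - LABEL_SIZE]
    return [(s, s + LABEL_SIZE) for s in starts]


def overlapping_labels(disk_bytes, part_start, part_size):
    """Indices of whole-disk labels that intersect the given partition."""
    part_end = part_start + part_size
    return [
        i
        for i, (start, end) in enumerate(whole_disk_label_ranges(disk_bytes))
        if start < part_end and part_start < end
    ]


SECTOR = 512
disk_bytes = 781422768 * SECTOR  # assumed total size, see note above

# Partition 3 (the zroot vdev) and partition 1 (efiboot0) from the description.
print(overlapping_labels(disk_bytes, 33966080 * SECTOR, 104857600 * SECTOR))
print(overlapping_labels(disk_bytes, 40 * SECTOR, 409600 * SECTOR))
```

Under these assumptions the zroot partition does not intersect any whole-disk label area, while the first partition on the disk does intersect the two leading ones, which is exactly the case the proposed man page sentence would need to warn about.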
Comment 16 Allan Jude freebsd_committer freebsd_triage 2023-06-23 19:22:05 UTC
(In reply to Markus Wild from comment #2)
It is the boot loader that has the preference.
The boot loader looks at the raw disk first, then at the partitions.

The boot loader sets the GUID that the kernel uses at mountroot to import the pool. When it is the wrong one, the kernel fails to continue booting.
Comment 17 Dan Langille freebsd_committer freebsd_triage 2023-06-23 21:15:44 UTC
(In reply to Graham Perrin from comment #14)

Graham: do I understand correctly?

If all is true:

* user is about to upgrade to FreeBSD 13.2
* user is running ZFS on partitions, no whole drives

Then:

* user runs "zpool labelclear /dev/N" on each partitioned drive

Result: This extra label situation should not be encountered, meaning, the upgrade "should just work".

I ask because the above is what I'm going to recommend to users if there's a chance they have previously used drives.

Running "zpool labelclear /dev/N" is also what I'm going to recommend should anyone change a ZFS drive from whole-drive to partitioned.
Comment 18 Markus Wild 2023-06-26 09:37:30 UTC
(In reply to Allan Jude from comment #16)
Why would the bootloader prefer the raw disk to the partitions,
what's the rationale behind that decision? And why is there apparently
a regression in behavior as compared to older FreeBSD releases? My
(much older) 11.1 bootloader chose the correct GUID...
Comment 19 Henrich Hartzer 2023-11-18 17:29:05 UTC
I want to add to this that after rebooting today, I had several cases of this error 6. After a few reboots, it eventually booted properly and went away. Very mysterious.

(13.2)
Comment 20 Mark Linimon freebsd_committer freebsd_triage 2026-05-05 21:58:44 UTC
^Triage: I'm sorry that this PR did not get addressed in a timely fashion.

By now, the version that it was created against is out of support.
Please re-open if it is still a problem on a supported version.