Bug 244831

Summary: sysutils/openzfs-kmod 2020022700 mount of encrypted dataset fails with I/O error
Product: Ports & Packages
Component: Individual Port(s)
Status: New ---
Severity: Affects Only Me
Priority: ---
Version: Latest
Hardware: Any
OS: Any
Reporter: rk <rk>
Assignee: Ryan Moeller <freqlabs>
CC: fbsd-bugzilla, freqlabs, grahamperrin, rk
Flags: bugzilla: maintainer-feedback? (kmoore)

Description rk 2020-03-15 17:26:27 UTC
After upgrading the openzfs-kmod port to version 2020022700,
attempts to mount encrypted datasets on 12.1-STABLE r358220 amd64 fail.

Loading the encryption key with "zfs load-key dp/store" works
but mounting using "zfs mount dp/store" fails with
"I/O error" and no further messages.

The pool itself then shows

  pool: dp
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 0 days 00:00:00 with 0 errors on Mon Sep 16 11:07:50 2019
config:

        NAME        STATE     READ WRITE CKSUM
        dp          ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada0p1  ONLINE       0     0     0
            ada1p1  ONLINE       0     0     0
            ada2p1  ONLINE       0     0     0
            ada3p1  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        dp/store:<0x0>

Downgrading openzfs-kmod to version 2019101600 allows me to
mount the encrypted dataset again.
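
For anyone who wants to check for this state from a script, here is a minimal sketch that extracts the entries listed under "errors: Permanent errors ..." from `zpool status -v` output. The function name `list_permanent_errors` is hypothetical, and the parsing assumes the output layout shown above; it is not part of any ZFS tooling.

```shell
# Hypothetical helper: read `zpool status -v` output on stdin and print
# each entry listed under "errors: Permanent errors ...".
# Prints nothing when the pool reports no permanent errors.
list_permanent_errors() {
    awk '
        # A new "pool:" section ends any error list from the previous pool.
        /^[ \t]*pool:/              { in_list = 0 }
        # The entries follow this header line.
        /^errors: Permanent errors/ { in_list = 1; next }
        # Print non-empty entry lines with the leading indentation removed.
        in_list && NF > 0           { sub(/^[ \t]+/, ""); print }
    '
}

# Example use against a live pool (pool name "dp" as in this report):
#   zpool status -v dp | list_permanent_errors
```

With the status output shown above, this should print the single entry for dp/store.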
Comment 1 rk 2020-03-20 10:16:44 UTC
Running a full scrub on the pool with the openzfs/openzfs-kmod ports at
version 2019101600 cleared the error (without actually finding any issues).
While the scrub was still in progress, the error was still listed:

  pool: dp
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: scrub in progress since Thu Mar 19 17:27:55 2020
        19.0T scanned at 731M/s, 19.0T issued at 731M/s, 19.0T total
        0B repaired, 100.00% done, 0 days 00:00:00 to go
config:

        NAME        STATE     READ WRITE CKSUM
        dp          ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada0p1  ONLINE       0     0     0
            ada1p1  ONLINE       0     0     0
            ada2p1  ONLINE       0     0     0
            ada3p1  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        dp/store:<0x0>

After completing the scrub the pool is clean again:

  pool: dp
 state: ONLINE
  scan: scrub repaired 0B in 0 days 07:34:17 with 0 errors on Fri Mar 20 01:02:12 2020
config:

        NAME        STATE     READ WRITE CKSUM
        dp          ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada0p1  ONLINE       0     0     0
            ada1p1  ONLINE       0     0     0
            ada2p1  ONLINE       0     0     0
            ada3p1  ONLINE       0     0     0

errors: No known data errors


Everything looks fine and I can mount the encrypted dataset without
problems.

To investigate further, I upgraded to the current openzfs/openzfs-kmod
port version 2020031600 and tried to reproduce the issue:

# zfs load-key dp/store
Enter passphrase for 'dp/store':
# zfs mount dp/store
cannot mount 'dp/store': Input/output error
# zpool status -v dp
  pool: dp
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://zfsonlinux.org/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 0 days 07:34:17 with 0 errors on Fri Mar 20 01:02:12 2020
config:

        NAME        STATE     READ WRITE CKSUM
        dp          ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada0p1  ONLINE       0     0     0
            ada1p1  ONLINE       0     0     0
            ada2p1  ONLINE       0     0     0
            ada3p1  ONLINE       0     0     0
 
errors: Permanent errors have been detected in the following files:

        dp/store:<0x0>


So version 2020031600 causes the same problem.
Comment 2 rk 2020-08-16 14:23:06 UTC
I just re-tested with the current version available in the
sysutils/openzfs and sysutils/openzfs-kmod ports: 2020080800.
The problem is still there. I cannot mount encrypted datasets
with it. It still fails with I/O error and marks the pool as corrupted.
Going back to openzfs 2019101600 and scrubbing the pool fixes the problems
and I can mount the encrypted datasets again.
Comment 3 rk 2020-08-16 14:35:04 UTC
When running with the newer code (2020080800), I tried to create
a new encrypted dataset. This works OK. I can even mount it successfully.
However, the older datasets that were created with older openzfs versions
cannot be mounted. Attempts to mount them always fail with I/O error
and render the pool corrupted (until zpool scrub is run on it).
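
The sequence described above can be wrapped in a small script to make re-testing against new module versions less tedious. This is only a sketch: the function name `try_mount` and its return codes are made up for illustration, and it assumes the pool name is the first component of the dataset name. It uses only `zfs get`, `zfs load-key`, `zfs mount` and `zpool status`.

```shell
# Hypothetical helper: unlock and mount one encrypted dataset and, if the
# mount fails, print the pool status so the reported error is captured.
# Returns 0 on success, 1 if the key could not be loaded, 2 if mount failed.
try_mount() {
    dataset=$1
    pool=${dataset%%/*}   # assumption: pool name is the first path component

    # Only load the key if it is not already available.
    if [ "$(zfs get -H -o value keystatus "$dataset")" != "available" ]; then
        zfs load-key "$dataset" || return 1
    fi

    if zfs mount "$dataset"; then
        echo "mounted: $dataset"
        return 0
    fi

    echo "mount failed: $dataset"
    zpool status -v "$pool"
    return 2
}

# Example (dataset name taken from this report):
#   try_mount dp/store
```

In my experience so far, after a failed attempt the error only goes away once `zpool scrub` has been run on the pool, so the script deliberately does not retry.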
Comment 4 David Schlachter 2021-02-06 18:30:10 UTC
I'm observing this same behaviour after updating my system from 12.2-RELEASE with openzfs (system built without native zfs support) to 13.0-BETA1 (built with native zfs support, with openzfs from ports uninstalled).

Attachment and verification of the encryption key work, but mounting the encrypted dataset fails with "I/O error". "zpool status" shows the same "Permanent error" described by @rk. A scrub clears the error. An attempt to mount the dataset again will produce the same error, which can be cleared again by scrubbing. In no case am I able to mount the dataset.

However, I'm able to verify that this encrypted dataset is functional and apparently undamaged by mounting it in macOS (zfs-1.9.4-0).
Comment 5 Graham Perrin 2021-02-09 00:06:24 UTC
(In reply to rk from comment #0)

> … 12.1-STABLE r358220 … Downgrading openzfs-kmod to version 
> 2019101600 allows me to mount the encrypted dataset again.

rk, please: do you find the same bug and the same workaround with version 2021012500 of the kernel module on 12.2-STABLE-p3?

<https://www.freshports.org/sysutils/openzfs-kmod/#history>

----

(In reply to David Schlachter from comment #4)

> … 13.0-BETA1 … openzfs from ports uninstalled…

David, please: is the bug reproducible with sysutils/openzfs and sysutils/openzfs-kmod installed from ports? 

If you can, please build your system from the same /usr/src that you'll use to build the two ports. 

Thank you
Comment 6 David Schlachter 2021-02-12 23:27:04 UTC
FYI, I tried to reproduce my problem by creating an encrypted zfs dataset on FreeBSD 12 with OpenZFS, then importing it on FreeBSD 13.0-BETA1, and I was not able to reproduce the problem. It must have been a particular problem with my pool (which I had simply recreated to bypass the problem).
Comment 7 rk 2021-02-13 17:11:55 UTC
> rk, please: do you find the same bug and the same workaround with version 
> 2021012500 of the kernel module on 12.2-STABLE-p3?

Right now I don't have a system with 12.2-STABLE running the openzfs code
from ports, but I tested again using the existing old stable/12 r364260
system and a brand new stable/13 system from Feb 12:

- created a pool and an encrypted dataset with the old system running
  stable/12 r364260 and the openzfs-kmod-2019101600 port

  # zpool create test0 da0p1
  # zfs create -o encryption=on -o keyformat=passphrase test0/encr
  ...
  copy a test file into /test0/encr
  ...
  # zpool export test0
  # zpool import test0
  # zfs load-key test0/encr
  Enter passphrase for 'test0/encr':
  # zfs mount test0/encr
  #
  -> file is there, everything is working fine
  # zpool export test0

- put the device on a system running stable/13-n244514-18097ee2fb7
  built with the git repository state from 2021-02-12 running the bundled
  ZFS code from stable/13:

  # zpool import test0
  # zfs load-key test0/encr
  Enter passphrase for 'test0/encr':
  # zfs mount test0/encr
  cannot mount 'test0/encr': Input/output error
  # zpool status -v test0
  pool: test0
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
config:

        NAME        STATE     READ WRITE CKSUM
        test0       ONLINE       0     0     0
          da2p1     ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        test0/encr:<0x0>


So the problem is still reproducible when encrypted datasets were
created with openzfs-kmod-2019101600 or older. When attempts to
mount them are made using newer openzfs versions (most recently tested
with stable/13 from February 12 2021), the mount fails and the pool is
marked as corrupted. Something changed after 2019101600 so that the
openzfs code can no longer deal with older encrypted datasets, and until
the cause is found and fixed, we can't move existing encrypted datasets
to newer openzfs releases.