Bug 282622 - zfs snapshot corruption when using encryption
Summary: zfs snapshot corruption when using encryption
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 14.0-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-fs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-11-08 09:54 UTC by Palle Girgensohn
Modified: 2025-01-19 23:01 UTC
CC List: 11 users

See Also:


Attachments

Description Palle Girgensohn freebsd_committer freebsd_triage 2024-11-08 09:54:21 UTC
Hi!

We see sporadic corruption in snapshots (not really in files, it seems) since we started using encryption on a previously well-behaved system.

$ sudo zpool status -v tank
  pool: tank
 state: ONLINE
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub in progress since Mon Nov  4 12:49:06 2024
	126T / 128T scanned at 392M/s, 126T / 128T issued at 392M/s
	0B repaired, 99.07% done, 00:53:05 to go
config:

	NAME                          STATE     READ WRITE CKSUM
	tank                          ONLINE       0     0     0
	  raidz2-0                    ONLINE       0     0     0
	    gpt/ZA17TQRZ0000R7412YQE  ONLINE       0     0     0
	    da7                       ONLINE       0     0     0
	    gpt/ZA17YK2F0000R7443JLQ  ONLINE       0     0     0
	    gpt/ZA17YLK20000R744Z3EQ  ONLINE       0     0     0
	    da2                       ONLINE       0     0     0
	    gpt/ZA17YMZM0000R741ZUG7  ONLINE       0     0     0
	    gpt/ZA17YN7N0000R7426S1K  ONLINE       0     0     0
	  raidz2-1                    ONLINE       0     0     0
	    gpt/Z4D4GSYQ0000R642L8LN  ONLINE       0     0     0
	    gpt/S4D198220000K706NYDF  ONLINE       0     0     0
	    gpt/S4D1E37R0000E715QM7B  ONLINE       0     0     0
	    gpt/Z4D4J36Y0000R63167HP  ONLINE       0     0     0
	    gpt/S4D198GF0000K706NYRP  ONLINE       0     0     0
	    gpt/Z4D4J3DX0000R633K23E  ONLINE       0     0     0
	  raidz3-3                    ONLINE       0     0     0
	    gpt/9RKBX1NL              ONLINE       0     0     0
	    gpt/9RKBZ0KL              ONLINE       0     0     0
	    gpt/9RKBBPYL              ONLINE       0     0     0
	    gpt/9RKBM1DC              ONLINE       0     0     0
	    gpt/9RKAW5MC              ONLINE       0     0     0
	    gpt/9RKD3LDC              ONLINE       0     0     0
	    gpt/9RKD2H3C              ONLINE       0     0     0
	    gpt/9RK7XB1C              ONLINE       0     0     0
	logs	
	  mirror-2                    ONLINE       0     0     0
	    gpt/ZIL1                  ONLINE       0     0     0
	    gpt/ZIL2                  ONLINE       0     0     0
	cache
	  gpt/L2ARC1                  ONLINE       0     0     0
	  gpt/L2ARC2                  ONLINE       0     0     0
	spares
	  gpt/ZA17YMCH0000R74264N0    AVAIL   

errors: Permanent errors have been detected in the following files:



No files are listed, though, and the last percent of the scrub has been running for a few days now.

We know which snapshot is problematic since we ship them all off site using `zfs send`. Removing the culprit snapshot fixes the problem, but it keeps popping up. We're planning an upgrade to 14.1 this weekend, but I see no closed PRs about this problem, so I'm posting this before the upgrade. Does anyone know whether anything in the encryption code has been improved?
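
A rough way to locate the offending snapshot is to test-send each incremental to /dev/null; a minimal sh sketch, where tank/fs is a placeholder dataset name:

    #!/bin/sh
    # Walk the snapshots of a dataset in creation order and test each
    # incremental send; a failing snapshot reports an Input/output error.
    prev=""
    for snap in $(zfs list -H -t snapshot -o name -s creation tank/fs); do
        if [ -n "$prev" ]; then
            zfs send -I "$prev" "$snap" > /dev/null || echo "broken: $snap"
        fi
        prev="$snap"
    done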

There is a description in the OpenZFS GitHub, https://github.com/openzfs/zfs/issues/12014 , that seems to be spot on. The issue is still open.

Any ideas? What more can I supply in terms of data?
Comment 1 Palle Girgensohn freebsd_committer freebsd_triage 2024-11-08 09:56:24 UTC
*edit* Removing the snapshot *and* running a scrub, sometimes twice, seems to fix the problem.
Comment 2 Palle Girgensohn freebsd_committer freebsd_triage 2024-11-08 14:28:39 UTC
> No idea but i'd like to ask, in your context: do the snapshots get created, and if
> so, can you restore from them? In other words, does the 'corruption' fix itself
> with zfs self-healing?

I cannot run `zfs send -I fs@previous_snap fs@problematic_snap`; I get

 warning: cannot send 'fs@problematic_snap': Input/output error

Removing the snapshot fixes the problem.
Comment 3 Lexi Winter freebsd_triage 2024-11-09 01:45:31 UTC
i ran into this about a year ago on 13.something, and it's still present in 15.0.

a workaround to avoid removing the snapshot is to reboot, which will cause the snapshot to work again: the issue seems to be purely with the data in memory.

there should probably be a warning somewhere in the manual that ZFS encryption is not actually usable, or perhaps it should even be locked behind a sysctl.
Comment 4 void 2024-11-09 02:08:50 UTC
(In reply to Lexi Winter from comment #3)

> i've run into this about a year ago on 13.something, and 
> it's still present in 15.0.

That's interesting. Can you reproduce the error reliably?

> a workaround to avoid removing the snapshot is to reboot, 
> which will cause the snapshot to work again: the issue seems 
> to be purely with the data in memory.

How much memory? Any zfs sysctls? vfs.zfs.arc.max?

> there should probably be a warning somewhere in the manual that 
> ZFS encryption is not actually usable, or perhaps it should 
> even be locked behind a sysctl.

In my context (15-CURRENT) it can be used, but I haven't used it yet
for snapshot/restore (I am using tar for this).
Comment 5 Lexi Winter freebsd_triage 2024-11-09 10:20:52 UTC
> That's interesting. can you reproduce the error reliably?

when ZFS encryption was enabled, syncoid (part of sanoid) would reproduce the problem every few days.  more snapshots would gradually become inaccessible and 'permanent errors' listed in zpool status would grow over time until the server was rebooted.

i've since disabled encryption on all filesystems (other than the pool's root filesystem) and the problem no longer occurs.

> How much memory?

32GB.

> Any zfs sysctls?

vfs.zfs.min_auto_ashift=12

> vfs.zfs.arc.max?

vfs.zfs.arc.max: 0
Comment 6 Palle Girgensohn freebsd_committer freebsd_triage 2024-11-09 12:00:55 UTC
Destroying the snapshot and running
 zpool scrub -e
is sufficient for me to get rid of the pool error.
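
With placeholder names, the recovery sequence is something like:

    # tank/fs@problematic_snap stands in for the snapshot that fails to send.
    zfs destroy tank/fs@problematic_snap
    # Error scrub: re-verifies only blocks with logged errors (faster than a full scrub).
    zpool scrub -e tank
    # Confirm the "Permanent errors" list is empty again.
    zpool status -v tank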
Comment 7 Palle Girgensohn freebsd_committer freebsd_triage 2024-11-09 12:03:18 UTC
Adding this parallel discussion

https://lists.freebsd.org/archives/freebsd-fs/2024-November/003806.html

Looking further at this, there seem to be many reports of
the same sort of problem you're reporting:

https://github.com/openzfs/openzfs-docs/issues/494
https://github.com/openzfs/zfs/issues/11679
https://github.com/openzfs/zfs/issues/16623
https://github.com/openzfs/zfs/issues/10019
Comment 8 Miroslav Lachman 2024-11-09 12:45:15 UTC
Just a data point.
I am running FreeBSD 13.3-RELEASE-p7 amd64 GENERIC with ZFS native encryption:

# zfs get encryption,encryptionroot tank0/vol0/remote_backup
NAME                      PROPERTY        VALUE                     SOURCE
tank0/vol0/remote_backup  encryption      aes-256-gcm               -
tank0/vol0/remote_backup  encryptionroot  tank0/vol0/remote_backup  -

Under tank0/vol0/remote_backup there are 42 descendant encrypted filesystems.
There are about 10TB of data - rsync backups of remote machines. Each day new snapshots are created for each encrypted FS (1671 currently present).

This machine has been in production since 2023-03 without any of the issues described in this PR.

The key is loaded manually with `zfs load-key tank0/vol0/remote_backup`.
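
For reference, a layout like the one above could be created roughly as follows; keyformat=passphrase is an assumption, since the actual key configuration is not shown:

    # Create the encryption root (prompts for a passphrase).
    zfs create -o encryption=aes-256-gcm -o keyformat=passphrase tank0/vol0/remote_backup
    # Descendant filesystems inherit encryption from the encryption root.
    zfs create tank0/vol0/remote_backup/host1
    # After a reboot, load the key before mounting:
    zfs load-key tank0/vol0/remote_backup
    zfs mount -a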
Comment 9 Lexi Winter freebsd_triage 2024-11-09 12:47:06 UTC
(In reply to Miroslav Lachman from comment #8)

> This machines is in production from 2023-03 without any issues described in this PR.

do you send snapshots *from* this machine to another machine?  this is what triggers the bug described in this PR.  normal local filesystem access or receiving snapshots will not trigger the bug.
Comment 10 Palle Girgensohn freebsd_committer freebsd_triage 2024-11-09 12:50:25 UTC
(In reply to Lexi Winter from comment #9)
The error is triggered by

zfs send -I fs@previous_snap fs@problematic_snap > /dev/null

so yes, zfs send has problems. No files are corrupted, AFAICS.
Comment 11 Lexi Winter freebsd_triage 2024-11-09 12:52:33 UTC
(In reply to Palle Girgensohn from comment #10)

> No files are corrupted, AFAICS.

this is also my experience (i.e., the errors vanish on reboot), so things could be worse. 

however, it's not really ideal to have to reboot a production system every few days :-)
Comment 12 Palle Girgensohn freebsd_committer freebsd_triage 2024-11-09 14:24:08 UTC
(In reply to Lexi Winter from comment #11)
zfs destroy snapshot
zpool scrub -e tank

is sufficient for our case. I tried rebooting (actually failing over to the other node, since we use dual-channel disks and disk boxes). The snapshot was still broken.

`zpool scrub -e` fixed it, though.
Comment 13 Miroslav Lachman 2024-11-10 07:05:23 UTC
(In reply to Lexi Winter from comment #9)
No, not sending snapshots to other machine.
Comment 14 void 2024-11-12 01:53:10 UTC
(In reply to Palle Girgensohn from comment #12)

You mention the sending system has 32GB RAM.

The output of "zpool status -v tank" shows

  scan: scrub in progress since Mon Nov  4 12:49:06 2024
	126T / 128T scanned at 392M/s, 126T / 128T issued at 392M/s
	0B repaired, 99.07% done, 00:53:05 to go

I usually run ZFS with 1GB of RAM per TB of disk, *minimum*.

I'm wondering if, in your context, the addition of zfs encryption has made zfs run out of resources when doing certain operations.
Comment 15 Lexi Winter freebsd_triage 2024-11-12 02:59:10 UTC
(In reply to void from comment #14)

> I'm wondering if, in your context, the addition of zfs encryption has made zfs run out of resources when doing certain operations.

there is no need to wonder about this: ZFS native encryption is broken for everyone regardless of system configuration.  this is a well-known bug which has been present for years that OpenZFS simply cannot fix.
Comment 16 Alexander Ziaee freebsd_triage 2024-11-12 03:29:24 UTC
I see. Until this is fixed, I am declaring an emergency to put this in the manpage.
Comment 17 Alexander Ziaee freebsd_triage 2024-11-12 04:55:52 UTC
Here is a pull request to get this documented upstream until it can be fixed. Since this bug results in people losing their data in a reasonable, out-of-the-box workflow, in a BETA, I am cc'ing both maintainers and elected officers.

Note to users: we have a filesystem-independent encryption method called GELI, which does not have this issue when used with ZFS (or with UFS, which recently gained support for snapshots); this issue only affects snapshots used with ZFS native encryption.

https://github.com/openzfs/zfs/pull/16745
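
For anyone considering the GELI route, a minimal sketch; the device name da0p3 is a placeholder and the single-disk pool layout is deliberately trivial:

    # One-time initialization of the provider with a passphrase (4K sectors).
    geli init -s 4096 /dev/da0p3
    # Attach the provider; this creates the decrypted device /dev/da0p3.eli.
    geli attach /dev/da0p3
    # Build the pool on the encrypted provider; ZFS itself runs unencrypted,
    # so the native-encryption code paths (and this bug) are never exercised.
    zpool create tank /dev/da0p3.eli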
Comment 18 Palle Girgensohn freebsd_committer freebsd_triage 2024-11-12 09:17:45 UTC
(In reply to void from comment #14)
$ sysctl -h hw.physmem
hw.physmem: 137 263 067 136
Comment 19 Palle Girgensohn freebsd_committer freebsd_triage 2024-11-12 12:31:32 UTC
(In reply to Alexander Ziaee from comment #17)
I have not seen any "real" corruption, in the sense that files are corrupted. It seems more like a bug where snapshots hinder the ability to send data using zfs send. No files are ever corrupted.

Lexi writes that "this is a well-known bug which has been present for years that OpenZFS simply cannot fix". I'd say that "cannot" is probably not the correct verb; isn't it more like "has not prioritised to"? Perhaps we could stir up some dust and the attention of some people with the right knowledge to fix it?

How b0rken is it really? Has anyone seen real file corruption?
Comment 20 Alexander Ziaee freebsd_triage 2024-11-12 21:30:31 UTC
> how broken is it really

In the PR I linked, the issue on the OpenZFS issue tracker is linked. It's enough of an issue that I think documenting it until we can fix it would benefit users.
Comment 21 Alexander Ziaee freebsd_triage 2024-11-12 21:37:57 UTC
Over 10 users testifying to snapshot corruption on 3 operating systems makes this a known issue. Documenting it can stop the bleeding until we get a surgeon in.
Comment 22 Graham Perrin 2024-11-15 03:41:24 UTC
(In reply to Lexi Winter from comment #3)

> … 13.something, … still present in 15.0. …

The same pool on the same system?
Comment 23 Graham Perrin 2024-11-15 03:49:37 UTC
(In reply to Palle Girgensohn from comment #0)

> 14.0-RELEASE

Please run: 

zfs version
Comment 24 Graham Perrin 2024-11-15 03:56:15 UTC
… and: 

freebsd-version -kru ; uname -aKU

pkg -vv | grep -B 1 -e url -e priority
Comment 25 Palle Girgensohn freebsd_committer freebsd_triage 2024-11-15 09:58:22 UTC
The machines have been upgraded to 14.1 now:

$ freebsd-version -kru ; uname -aKU

14.1-RELEASE-p5
14.1-RELEASE-p5
14.1-RELEASE-p6
FreeBSD storage2 14.1-RELEASE-p5 FreeBSD 14.1-RELEASE-p5 GENERIC amd64 1401000 1401000

$ zfs version
zfs-2.2.4-FreeBSD_g256659204
zfs-kmod-2.2.4-FreeBSD_g256659204

$ pkg -vv | grep -B 1 -e url -e priority
  pprepo: { 
    url             : "http://obscured-we-build-our-own-pkgs",
    enabled         : yes,
    priority        : 0,
Comment 26 Palle Girgensohn freebsd_committer freebsd_triage 2024-11-25 13:16:10 UTC
Just a quick note. We upgraded all our servers to 14.1 and upgraded all pools to the latest feature set, and we have not seen this problem again since. Famous last words, and it might still show up in a while, but you could safely say that the frequency has gone down considerably.
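
For reference, the pool feature upgrade mentioned above is done along these lines; note that enabling features is one-way:

    zpool upgrade -v      # list the features supported by the running ZFS version
    zpool upgrade tank    # enable all supported features on the pool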
Comment 27 Mark Linimon freebsd_committer freebsd_triage 2025-01-19 23:01:48 UTC
So after 2 months, can we declare this problem fixed?