Created attachment 222535
core.txt

Loading ccp (either via rc.conf's kld_list or by kldloading the module manually after boot) breaks ZFS encryption: I can't load keys for an existing dataset, and creating a new one results in a kernel panic.

Try to load a ZFS dataset key:

% kldload ccp
% zfs load-key data
Enter passphrase for 'data':
Key load error: Incorrect key provided for 'data'.
Enter passphrase for 'data':
Key load error: Incorrect key provided for 'data'.
Enter passphrase for 'data':
Key load error: Incorrect key provided for 'data'.
zsh: exit 255   zfs load-key data

One way to reproduce the kernel panic:

truncate -s 10G pool
mdconfig -at vnode -f pool
zpool create -m /mnt/test -O compress=lz4 -O atime=off -O devices=off -O setuid=off -O exec=off -O encryption=on -O keyformat=passphrase test /dev/md0
<kernel panic>

Another way to reproduce the kernel panic: try to create an encrypted dataset on an existing pool (it doesn't matter whether the root of the pool is encrypted or not):

zfs create -o encryption=on -o keyformat=passphrase zroot/encrypted
<kernel panic>

% cat /var/crash/info.last
Dump header from device: /dev/gpt/hdd-swap
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 1346650112
  Blocksize: 512
  Compression: none
  Dumptime: 2021-02-17 20:47:17 +0100
  Hostname: zen-pobro
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 13.0-BETA2 #2 r13.0-n244512-726e20f45041: Wed Feb 17 20:26:38 CET 2021
    root@zen-pobro:/usr/obj/usr/src/amd64.amd64/sys/GENERIC
  Panic String: VERIFY3(0 == zio_crypt_key_wrap(&dck->dck_wkey->wk_key, key, iv, mac, keydata, hmac_keydata)) failed (0 == 5)
  Dump Parity: 2673242901
  Bounds: 4
  Dump Status: good

% dmesg
...
CPU: AMD Ryzen 7 PRO 4750G with Radeon Graphics (3593.33-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x860f01  Family=0x17  Model=0x60  Stepping=1
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x7ed8320b<SSE3,PCLMULQDQ,MON,SSSE3,FMA,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
  AMD Features2=0x75c237ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,SKINIT,WDT,TCE,Topology,PCXC,PNXC,DBE,PL2I,MWAITX,ADMSKX>
  Structured Extended Features=0x219c91a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,PQM,PQE,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA>
  Structured Extended Features2=0x400004<UMIP,RDPID>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  AMD Extended Feature Extensions ID EBX=0x90cf757<CLZERO,IRPerf,XSaveErPtr,RDPRU,MCOMMIT,WBNOINVD,IBPB,IBRS,STIBP,PREFER_IBRS,SSBD>
  SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
  TSC: P-state invariant, performance statistics
...
ccp0: <AMD CCP-5a> mem 0xfcc00000-0xfccfffff,0xfcd8c000-0xfcd8dfff at device 0.2 on pci9
random: registering fast source AMD CCP TRNG

% pciconf -lv
none2@pci0:9:0:2: class=0x108000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15df subvendor=0x1022 subdevice=0x15df
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 17h (Models 10h-1fh) Platform Security Processor'
    class      = encrypt/decrypt

Reproduced on FreeBSD 13.0-ALPHA3, 13.0-BETA2 and 14.0-CURRENT (commit 4a7d84058d Wed Feb 17 11:45:54 2021 +0100).

If the ccp module is not loaded:

% zfs load-key data
Enter passphrase for 'data':
<ZFS dataset decrypted>

% zfs create -o encryption=on -o keyformat=passphrase zroot/encrypted
<new encrypted ZFS dataset created without panic>
ccp(4) appears to have a constraint that the AAD length with AES-GCM must be a multiple of the cipher block size. ZFS doesn't handle the errors that result when it submits a request satisfying this constraint. See bug 252981 for a related example of the same problem. For 13.0 this will be worked around by simply disabling the use of hardware offloads by ZFS.
> when it submits a request satisfying this constraint

Correction: that should read "a request not satisfying this constraint".
Also, there's no reason to use ccp(4). It's broken (bug 227982) and slower than aesni(4).
(In reply to Conrad Meyer from comment #3)
What do you think we should do with it? I think most in-kernel consumers assume that you can have multiple requests in flight in a session, and without that it's hard if not impossible to get decent throughput from a hardware offload device. I'm not sure whether that's a limitation of the driver or the device though. Perhaps we could simply change ccp(4) to not register itself with OCF for now.
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=940415f20a784156ec0e247989796385896f32a8

commit 940415f20a784156ec0e247989796385896f32a8
Author:     Martin Matuska <mm@FreeBSD.org>
AuthorDate: 2021-02-22 17:37:47 +0000
Commit:     Martin Matuska <mm@FreeBSD.org>
CommitDate: 2021-02-22 17:42:33 +0000

    zfs: disable use of hardware crypto offload drivers

    From openzfs-master e7adccf7f commit message:
      First, the crypto request completion handler contains a bug in that it
      fails to reset fs_done correctly after the request is completed.  This
      is only a problem for asynchronous drivers.

      Second, some hardware drivers have input constraints which ZFS does not
      satisfy.  For instance, ccp(4) apparently requires the AAD length for
      AES-GCM to be a multiple of the cipher block size, and with qat(4) the
      AES-GCM AAD length may not be longer than 240 bytes.  FreeBSD's generic
      crypto framework doesn't have a mechanism to automatically fall back to
      a software implementation if a hardware driver cannot process a request,
      and ZFS does not tolerate such errors.

    Patch Author:   Mark Johnston <markj@freebsd.org>
    Obtained from:  openzfs/zfs@e7adccf7f537a4d07281a2b74b360154bae367bc
    PR:             252981, 253595
    MFS after:      3 days (direct commit)

 sys/contrib/openzfs/module/os/freebsd/zfs/crypto_os.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)
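To make the two input constraints quoted in the commit message concrete, here is a rough Python model of them, together with the kind of automatic software fallback that the commit message says FreeBSD's crypto framework lacks. This is only an illustration, not code from ccp(4), qat(4), OCF or ZFS: the function names, the `HW_CONSTRAINTS` table and the `"software"` label are all made up for the sketch. The only values taken from the thread are the 240-byte qat(4) limit and the "multiple of the cipher block size" rule for ccp(4); the 16-byte figure is the standard AES block size.

```python
AES_BLOCK_LEN = 16     # AES block size in bytes
QAT_MAX_AAD_LEN = 240  # AES-GCM AAD limit quoted for qat(4) in the commit message


def ccp_accepts(aad_len: int) -> bool:
    """Hypothetical check: ccp(4) reportedly needs a block-aligned AAD length."""
    return aad_len % AES_BLOCK_LEN == 0


def qat_accepts(aad_len: int) -> bool:
    """Hypothetical check: qat(4) reportedly rejects AAD longer than 240 bytes."""
    return aad_len <= QAT_MAX_AAD_LEN


# Illustrative table of per-driver predicates (names are assumptions).
HW_CONSTRAINTS = {"ccp": ccp_accepts, "qat": qat_accepts}


def pick_backend(driver: str, aad_len: int) -> str:
    """Return the driver name if it can take the request, else "software".

    This models a fallback that does NOT exist in the real framework;
    per the commit message, the request simply fails instead.
    """
    accepts = HW_CONSTRAINTS.get(driver)
    if accepts is not None and accepts(aad_len):
        return driver
    return "software"
```

With this model, a 32-byte AAD would stay on the hypothetical "ccp" backend, while a 20-byte AAD would be routed to software. Without such a fallback, the second case is where the consumer sees an error it may not be prepared to handle.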
Does this also disable the use of AES-NI and carryless multiply, or does ZFS make use of accelerated software crypto directly? I'm asking because disabling AES-NI would really hurt the vast majority of potential ZFS encryption users and expose them to the timing side channels present in pure software AES and GCM implementations optimized for speed.
(In reply to crest from comment #6)
No, aesni will still be used by the ZFS encryption layer if available (same for armv8crypto). The change applies only to hardware offload drivers.
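As a rough illustration of the distinction drawn in this reply, the sketch below models a consumer that excludes hardware offload drivers but still accepts CPU-instruction-based ones. It is not the actual crypto_os.c change: the `Driver` class, the `hardware_offload` field and the `eligible` helper are hypothetical, and the classification of each driver simply follows what is said in this thread (ccp and qat are hardware offload drivers; aesni and armv8crypto remain usable).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Driver:
    name: str
    hardware_offload: bool  # True for separate offload devices (assumed flag)


# Classification as described in this thread, not read from any kernel source.
DRIVERS = [
    Driver("ccp", True),
    Driver("qat", True),
    Driver("aesni", False),
    Driver("armv8crypto", False),
]


def eligible(drivers, allow_hardware: bool):
    """Return the names of drivers a consumer would still consider."""
    return [d.name for d in drivers if allow_hardware or not d.hardware_offload]
```

Calling `eligible(DRIVERS, allow_hardware=False)` in this model leaves only the aesni and armv8crypto entries, which matches the behaviour described above.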
A commit in branch releng/13.0 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=442719c0c6de93051d4bf9820420e9863ed3de53

commit 442719c0c6de93051d4bf9820420e9863ed3de53
Author:     Martin Matuska <mm@FreeBSD.org>
AuthorDate: 2021-02-22 17:37:47 +0000
Commit:     Martin Matuska <mm@FreeBSD.org>
CommitDate: 2021-02-25 16:20:20 +0000

    zfs: disable use of hardware crypto offload drivers

    From openzfs-master e7adccf7f commit message:
      First, the crypto request completion handler contains a bug in that it
      fails to reset fs_done correctly after the request is completed.  This
      is only a problem for asynchronous drivers.

      Second, some hardware drivers have input constraints which ZFS does not
      satisfy.  For instance, ccp(4) apparently requires the AAD length for
      AES-GCM to be a multiple of the cipher block size, and with qat(4) the
      AES-GCM AAD length may not be longer than 240 bytes.  FreeBSD's generic
      crypto framework doesn't have a mechanism to automatically fall back to
      a software implementation if a hardware driver cannot process a request,
      and ZFS does not tolerate such errors.

    Patch Author:   Mark Johnston <markj@freebsd.org>
    Obtained from:  openzfs/zfs@e7adccf7f537a4d07281a2b74b360154bae367bc
    PR:             252981, 253595
    Approved by:    re (gjb)

    (cherry picked from commit 940415f20a784156ec0e247989796385896f32a8)

 sys/contrib/openzfs/module/os/freebsd/zfs/crypto_os.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)
*** This bug has been marked as a duplicate of bug 252981 ***