Bug 252981

Summary: panic with ZFS encryption and QAT: VERIFY3(0 == spa_do_crypt_bad(...
Product: Base System Reporter: Alan Somers <asomers>
Component: kernAssignee: freebsd-fs (Nobody) <fs>
Status: In Progress ---    
Severity: Affects Only Me CC: allanjude, asomers, grahamperrin, jhb, jsorocil, markj, mmpestorich
Priority: --- Keywords: crash
Version: CURRENT   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
screenshot of the panic none

Description Alan Somers freebsd_committer freebsd_triage 2021-01-24 17:00:12 UTC
Created attachment 221884 [details]
screenshot of the panic

I tried out the new QAT driver today with ZFS.  I'm using a Atom C2758 CPU with 14.0-CURRENT.  When I tried to create an encrypted file system, the kernel immediately paniced.  Note that the same steps worked just fine without the qat driver loaded.

# kldload qat
# mdctl -a -t malloc -s 1g
# zpool create crypttest /dev/md0
# zpool set feature@encryption=enabled crypttest
# zfs create -o encryption=aes-256-gcm -o keyformat=passphrase crypttest/t
Enter passphrase:
Re-enter passphrase:
Comment 1 Mark Johnston freebsd_committer freebsd_triage 2021-01-25 20:21:56 UTC
This looks like a bug in zfs_crypto_dispatch() which is only tickled when cryptop dispatch is asynchronous (i.e., a hardware driver is handling the request): it needs to set session->fs_done = false earlier.

More generally the code is assuming that only one thread is dispatching crypto requests for a given freebsd_session_t.  From my reading of the code this isn't obviously true: zio_do_crypt_data() uses key->zk_session but there doesn't appear to be any mutual exclusion.  Again this is only a problem with hw crypto.  I think it should be sufficient to make fs_done truly thread local.
Comment 2 Mark Johnston freebsd_committer freebsd_triage 2021-01-25 22:56:21 UTC
(In reply to Mark Johnston from comment #1)
With this fixed I hit the same panic as in the PR title.  QAT has some weird constraints around the AAD length for GCM/CCM; in particular, it must be no more than 240 bytes in length.  This is implemented in the Intel and DPDK QAT drivers as well as ours.

ZFS is submitting GCM encryption requests with a large AAD (I think when encrypting a block of dnodes?) and hitting the check in qat_process(), and it can't tolerate the error.  Weirdly, the ZFS QAT stub for Linux has no check for this case, but it also only uses QAT for data blocks, which I believe have no AAD.  See qat_crypt_use_accel().

So for now we should at least change ZFS to not use hardware crypto since it doesn't work and clearly hasn't been tested (see comment 1).  I'm not sure how hard it would be to permit the use of hardware crypto only for data blocks (and not for the ZIL and dnode blocks, which requires support for large AAD buffers.
Comment 3 Alan Somers freebsd_committer freebsd_triage 2021-01-25 23:04:05 UTC
Sounds like qat should refuse to service any consumers that use AAD.  Other consumers can use AAD too.  I think geli with data integrity verification does, for example.
Comment 4 Alan Somers freebsd_committer freebsd_triage 2021-01-25 23:21:42 UTC
Hm, apparently I don't understand the problem with AAD.  Because geli with QAT works fine when I create a device like this:
sudo geli onetime -a HMAC/SHA256 -e aes-cbc -l 256 -s 4096 md0
Comment 5 Mark Johnston freebsd_committer freebsd_triage 2021-01-26 00:03:47 UTC
(In reply to Alan Somers from comment #3)
This constraint only applies to AES-GCM.  Other algorithms that provide integrity checking don't have such a limit on the AAD size.

IPSec's ESP protocol includes the ESP header itself as AAD, for instance, but because the AAD size is fixed and smaller than the limit, it can use QAT/AES-GCM with no problems.  That use-case was the original motivation for the port.

(In reply to Alan Somers from comment #4)
GELI doesn't appear to authenticate anything that isn't also encrypted, so there is no AAD.  Even if it were to use GCM there would be no problem.
Comment 6 Mark Johnston freebsd_committer freebsd_triage 2021-01-26 00:08:13 UTC
A better solution than disabling the use of hardware crypto in ZFS could be to extend opencrypto to optionally fall back to a software implementation if the request doesn't satisfy some constraints.
Comment 7 Mark Johnston freebsd_committer freebsd_triage 2021-02-12 15:35:42 UTC
I think for 13.0 we should just modify ZFS to not use hardware crypto drivers.  Per comment 1 the OCF wrapper in ZFS is buggy with respect to asynchronous completions.

Going forward there are at least three possible solutions:
1. Modify ZFS to use separate sessions for hardware and software crypto, and only use hardware crypto when ot != DMU_OT_INTENT_LOG && ot != DMU_OT_DNODE in zio_do_crypto_data().  This side-steps the AAD problem.
2. Modify qat(4) to authenticate AAD and perform encryption/decryption in separate requests, passing intermediate hash state from the first request to the second.  qat(4) can handle arbitrarily large GMAC requests.  I'm not sure how easy this is but I think it's possible.
3. Modify opencrypto to fall back to software crypto if the hardware can't handle the request for some reason.  jhb suggested that this would be useful for other purposes, e.g., if one is decrypting small packets, where the CPU cost of handling the request in software is the same as the request setup cost in a hardware driver.  This requires some thought around reordering of requests within a session.
Comment 8 John Baldwin freebsd_committer freebsd_triage 2021-02-12 20:16:57 UTC
I concur with changing 13 to only use software for now.  The other solutions aren't going to be ready before 13.0.
Comment 9 Mark Johnston freebsd_committer freebsd_triage 2021-02-17 20:56:55 UTC
https://github.com/openzfs/zfs/pull/11612
Comment 10 commit-hook freebsd_committer freebsd_triage 2021-02-22 17:44:16 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=940415f20a784156ec0e247989796385896f32a8

commit 940415f20a784156ec0e247989796385896f32a8
Author:     Martin Matuska <mm@FreeBSD.org>
AuthorDate: 2021-02-22 17:37:47 +0000
Commit:     Martin Matuska <mm@FreeBSD.org>
CommitDate: 2021-02-22 17:42:33 +0000

    zfs: disable use of hardware crypto offload drivers

    From openzfs-master e7adccf7f commit message:
      First, the crypto request completion handler contains a bug in that it
      fails to reset fs_done correctly after the request is completed.  This
      is only a problem for asynchronous drivers.  Second, some hardware
      drivers have input constraints which ZFS does not satisfy.  For
      instance, ccp(4) apparently requires the AAD length for AES-GCM to be a
      multiple of the cipher block size, and with qat(4) the AES-GCM AAD
      length may not be longer than 240 bytes.  FreeBSD's generic crypto
      framework doesn't have a mechanism to automatically fall back to a
      software implementation if a hardware driver cannot process a request,
      and ZFS does not tolerate such errors.

    Patch Author:   Mark Johnston <markj@freebsd.org>

    Obtained from:  openzfs/zfs@e7adccf7f537a4d07281a2b74b360154bae367bc
    PR:             252981, 253595
    MFS after:      3 days

    (direct commit)

 sys/contrib/openzfs/module/os/freebsd/zfs/crypto_os.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)
Comment 11 commit-hook freebsd_committer freebsd_triage 2021-02-25 16:21:16 UTC
A commit in branch releng/13.0 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=442719c0c6de93051d4bf9820420e9863ed3de53

commit 442719c0c6de93051d4bf9820420e9863ed3de53
Author:     Martin Matuska <mm@FreeBSD.org>
AuthorDate: 2021-02-22 17:37:47 +0000
Commit:     Martin Matuska <mm@FreeBSD.org>
CommitDate: 2021-02-25 16:20:20 +0000

    zfs: disable use of hardware crypto offload drivers

    From openzfs-master e7adccf7f commit message:
      First, the crypto request completion handler contains a bug in that it
      fails to reset fs_done correctly after the request is completed.  This
      is only a problem for asynchronous drivers.  Second, some hardware
      drivers have input constraints which ZFS does not satisfy.  For
      instance, ccp(4) apparently requires the AAD length for AES-GCM to be a
      multiple of the cipher block size, and with qat(4) the AES-GCM AAD
      length may not be longer than 240 bytes.  FreeBSD's generic crypto
      framework doesn't have a mechanism to automatically fall back to a
      software implementation if a hardware driver cannot process a request,
      and ZFS does not tolerate such errors.

    Patch Author:   Mark Johnston <markj@freebsd.org>

    Obtained from:  openzfs/zfs@e7adccf7f537a4d07281a2b74b360154bae367bc
    PR:             252981, 253595
    Approved by:    re (gjb)

    (cherry picked from commit 940415f20a784156ec0e247989796385896f32a8)

 sys/contrib/openzfs/module/os/freebsd/zfs/crypto_os.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)
Comment 12 Mark Johnston freebsd_committer freebsd_triage 2021-02-25 16:24:18 UTC
*** Bug 253595 has been marked as a duplicate of this bug. ***
Comment 13 Mark Johnston freebsd_committer freebsd_triage 2021-02-25 16:25:25 UTC
So the immediate problem is fixed in 13.0.  We can leave this PR open for the longer-term goal of re-enabling hardware crypto support in ZFS, or perhaps submit a new one for this purpose.  Any preferences?
Comment 14 John Baldwin freebsd_committer freebsd_triage 2022-05-07 19:35:35 UTC
Perhaps a smaller fix is to do what I did in ccr(4) recently to explicitly allocate a software session and punt requests that don't "work" to software in the qat(4) driver itself.  The ZFS change can then be reverted.