I'm running 13-stable fdbbd118faab but the code is identical in HEAD. Looking at the backtrace:

#16 <signal handler called>
#17 dmu_dump_write (dscp=dscp@entry=0xfffffe02501abc30, type=<optimized out>, object=<optimized out>, offset=<optimized out>, offset@entry=0, lsize=<optimized out>, lsize@entry=131072, psize=psize@entry=131072, bp=0x0, data=0xfffffe02d94a6000) at /usr/src/sys/contrib/openzfs/module/zfs/dmu_send.c:493
#18 0xffffffff80410a3c in do_dump (dscp=dscp@entry=0xfffffe02501abc30, range=range@entry=0xfffff805fd82d900) at /usr/src/sys/contrib/openzfs/module/zfs/dmu_send.c:1016
#19 0xffffffff8040ead3 in dmu_send_impl (dspp=<optimized out>, dspp@entry=0xfffffe02501abdf0) at /usr/src/sys/contrib/openzfs/module/zfs/dmu_send.c:2537
#20 0xffffffff8040d8fd in dmu_send_obj (pool=<optimized out>, pool@entry=0xfffffe02d3b61000 "tank/compat@20210604bu", tosnap=10690, fromsnap=11065, embedok=<optimized out>, embedok@entry=1, large_block_ok=<optimized out>, large_block_ok@entry=2, compressok=<optimized out>, compressok@entry=4, rawok=8, savedok=0, outfd=1, off=0xfffffe02501ac070, dsop=0xfffffe02501ac058) at /usr/src/sys/contrib/openzfs/module/zfs/dmu_send.c:2695

dmu_send.c:493 is "ASSERT(!BP_IS_EMBEDDED(bp));", which dereferences bp without any NULL check, while dmu_send.c:1016 explicitly passes NULL as bp to dmu_dump_write(). This is obviously a bug somewhere. Judging by the comment at lines 1006-1008, the code expects that raw sends will always have large-block sends enabled, which would avoid the problematic code block. And zfs-send(8) says that --raw implies --large-block if the source is not encrypted. But even if I explicitly specify --large-block, the code panics in the same way. (--large-block as an option doesn't actually make sense with --raw anyway, because the send stream must by definition match what's on the local disk.)
I've modified my kernel to return an error instead of panicking, and done some investigating:
* The problem affects 3 (out of 81) filesystems I have.
* Those 3 filesystems don't have any unusual configuration.
* The encrypted pool reports no errors on a scrub.
* I still have the original pool (from before I did the encryption) and I can "zfs send" those filesystems from that pool without error.
* I created a third pool, with encryption enabled, and copied/encrypted those 3 filesystems from the original pool to the third pool. A send from that pool fails in the same way.
* The error reports that it can't send an intermediate snapshot within the filesystem (the snapshot differs for each filesystem). If I delete the snapshot that reports the error, the error moves to the next most recent snapshot.
* Creating the third pool with a different encryption key has no effect (unsurprising, but I thought I'd check).

I'm doing a scrub of the original pool but would be surprised if it reports any errors. At this point, it looks like there's something in the original filesystems that causes no problem while unencrypted but breaks once the filesystem is encrypted. If anyone wants to investigate, I'm happy to share send streams from 2 of the filesystems - with xz, one shrinks to 39.5MB and the other to 35.8MB.
I've had a rummage around OpenZFS and this is https://github.com/openzfs/zfs/issues/12275, which is, unfortunately, still open without a fix.
Is this reproducible with recent 13.1-STABLE?

(In reply to Peter Jeremy from comment #2)
Closed 2022-06-27: PR <https://github.com/openzfs/zfs/pull/12438> was merged, along with subsequent commits.