On FreeBSD stable/12, you cannot receive a resumed ZFS stream into a dataset that has a mounted clone. The bug was likely introduced by r364412.

The problem is that, when receiving, libzfs tries to unmount any dataset whose mountpoint might be changed. Such datasets include all children of the destination, as well as all clones of those children. Clones of the destination itself SHOULD NOT be included, but libzfs includes them anyway. Datasets whose mountpoint property is locally set also SHOULD NOT be included, but libzfs seems to include them, too.

The problem is not reproducible on head (which has switched to OpenZFS), because OpenZFS's libzfs does not try to unmount datasets when receiving a stream. I don't know why not.

Steps to reproduce:

> sudo zpool create tank vtbd1
> sudo zfs create tank/src
> sudo dd if=/dev/zero bs=1m count=1024 of=/tank/src/zerofile
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 8.282515 secs (129639593 bytes/sec)
> sudo zfs snapshot tank/src@1
> sudo zfs send -R tank/src@1 | sudo zfs recv -vs tank/dst
receiving full stream of tank/src@1 into tank/dst@1
received 1.00GB stream in 4 seconds (257MB/sec)
> sudo zfs clone tank/dst@1 tank/clone
> # In another shell, cd to /tank/clone
> sudo dd if=/dev/zero bs=1m count=1024 of=/tank/src/zerofile2
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 5.961812 secs (180103269 bytes/sec)
> sudo zfs snapshot tank/src@2
> sudo zfs send -i tank/src@1 tank/src@2 | head -c 536870912 | sudo zfs receive -vs tank/dst
receiving incremental stream of tank/src@2 into tank/dst@2
warning: cannot send 'tank/src@2': signal received
cannot receive incremental stream: checksum mismatch or incomplete stream.
Partially received snapshot is saved.
A resuming stream can be generated on the sending system by running:
    zfs send -t 1-XXXXXX
> sudo zfs send -t 1-XXXXXX | sudo zfs receive -vs tank/dst
cannot unmount '/tank/clone': Device busy
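To make the rule about which datasets should be unmounted easier to follow, here is a minimal sketch in Python. It is not the libzfs code: the `Dataset` type, its fields, and both function names are hypothetical, and the model only covers the dependents named in the description above (children of the destination and clones of those children). `observed_unmounts` mirrors what the report says libzfs appears to do, which is itself stated tentatively.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dataset:
    """Hypothetical, simplified view of a ZFS dataset."""
    name: str                       # e.g. "tank/dst/child"
    origin: Optional[str] = None    # filesystem this clone was created from, if any
    local_mountpoint: bool = False  # True if the mountpoint property is set locally


def children_of(datasets, dest):
    """Names of all datasets nested below dest in the namespace."""
    return {d.name for d in datasets if d.name.startswith(dest + "/")}


def expected_unmounts(datasets, dest):
    """Dependents that should be unmounted, per the description above:
    children of dest and clones of those children, excluding anything
    whose mountpoint is set locally. Clones of dest itself are left out."""
    children = children_of(datasets, dest)
    return {
        d.name
        for d in datasets
        if not d.local_mountpoint
        and (d.name in children or d.origin in children)
    }


def observed_unmounts(datasets, dest):
    """What the report says stable/12 libzfs seems to do instead: it also
    picks up clones of dest and ignores a locally set mountpoint."""
    children = children_of(datasets, dest)
    return {
        d.name
        for d in datasets
        if d.name in children or d.origin in children or d.origin == dest
    }


# Example layout modeled on the reproduction steps, plus a few extra
# hypothetical datasets to exercise every rule.
POOL = [
    Dataset("tank/dst"),
    Dataset("tank/clone", origin="tank/dst"),             # clone of the destination
    Dataset("tank/dst/child"),
    Dataset("tank/childclone", origin="tank/dst/child"),  # clone of a child
    Dataset("tank/dst/pinned", local_mountpoint=True),
]

if __name__ == "__main__":
    print(sorted(expected_unmounts(POOL, "tank/dst")))
    print(sorted(observed_unmounts(POOL, "tank/dst")))
```

In this model, the difference between the two sets is exactly the busy `tank/clone` from the reproduction steps plus the dataset with a locally set mountpoint.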
Created attachment 218246: In iter_dependents_cb, don't recurse into clones of the destination
I'm guessing that https://github.com/openzfs/zfs/commit/0c6d09361d is the reason why I can't reproduce this problem on head. But I still don't know why OpenZFS isn't vulnerable to bug 248606.
That patch works, but unfortunately breaks the ability to do "zfs destroy -R" of a snapshot with clones.
Created attachment 218303: Fix unmount/remount when resuming a receive stream

zfs: Fix resuming receive stream to dataset with mounted clone

My fix for bug 248606 (zfs receive: Input/output error accessing dataset after resuming interrupted receive), r364412, introduced a regression: attempting to resume a receive into a dataset with a mounted clone would fail if that clone were in use. This change reverts r364412 and fixes it in a better way.

Background: When ZFS receives a stream, it may decide to unmount and remount the destination and all of its children. However, ever since resumable send/receive was implemented, ZFS has skipped the unmount/remount step when resuming a stream. I don't know why. That led to bug 248606: when resuming the stream, ZFS didn't unmount and remount the destination, leaving a destroyed dataset mounted. My original fix was to always unmount and remount when resuming a receive, but that caused other problems, like bug 249579. A better solution is to unmount and remount when resuming a receive of a stream that would've unmounted and remounted when it was new.

Direct commit to stable/12 because head has moved to OpenZFS. The bug exists there, too, but a change to the OpenZFS code can't be merged to the old ZFS code.

PR: 249579
Test Plan: ZFS test suite
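The three behaviours described in the Background paragraph can be compared side by side with a small sketch. This is only an illustration of the decision logic, not the patch itself: `Policy`, `should_remount`, and the `fresh_would_remount` flag are hypothetical names, and `fresh_would_remount` stands in for whatever libzfs decides for a brand-new stream, which is not spelled out here.

```python
from enum import Enum


class Policy(Enum):
    ORIGINAL = "skip the unmount/remount step whenever resuming"
    R364412 = "always unmount/remount when resuming"
    R366180 = "unmount/remount iff a new receive of the stream would have"


def should_remount(policy, resuming, fresh_would_remount):
    """Return True if the receive should unmount and remount.

    fresh_would_remount is a hypothetical input: the decision libzfs
    would make if the same stream were being received from scratch.
    """
    if not resuming:
        # A new (non-resumed) receive behaves the same under every policy.
        return fresh_would_remount
    if policy is Policy.ORIGINAL:
        return False               # led to bug 248606
    if policy is Policy.R364412:
        return True                # led to bug 249579
    return fresh_would_remount     # the approach described for this change


if __name__ == "__main__":
    for policy in Policy:
        for fresh in (False, True):
            print(policy.name, fresh, should_remount(policy, True, fresh))
```

The point of the last branch is that a resumed receive no longer has its own special rule; it simply inherits the answer the original receive would have had.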
(In reply to Alan Somers from comment #4)
Based on my understanding of the code and your description of the problem, this change looks fine to me.
A commit references this bug:

Author: asomers
Date: Sat Sep 26 02:50:29 UTC 2020
New revision: 366180
URL: https://svnweb.freebsd.org/changeset/base/366180

Log:
  zfs: Fix resuming receive stream to dataset with mounted clone

  My fix for bug 248606 (zfs receive: Input/output error accessing dataset after resuming interrupted receive), r364412, introduced a regression: attempting to resume a receive into a dataset with a mounted clone would fail if that clone were in use. This change reverts r364412 and fixes it in a better way.

  Background: When ZFS receives a stream, it may decide to unmount and remount the destination and all of its children. However, ever since resumable send/receive was implemented, ZFS has skipped the unmount/remount step when resuming a stream. I don't know why. That led to bug 248606: when resuming the stream, ZFS didn't unmount and remount the destination, leaving a destroyed dataset mounted. My original fix was to always unmount and remount when resuming a receive, but that caused other problems, like bug 249579. A better solution is to unmount and remount when resuming a receive of a stream that would've unmounted and remounted when it was new.

  Direct commit to stable/12 because head has moved to OpenZFS. The bug exists there, too, but a change to the OpenZFS code can't be merged to the old ZFS code.

  PR: 249579
  Reviewed by: mmacy
  MFC after: 1 week
  Sponsored by: Axcient

Changes:
  stable/12/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c