When using zfs receive to restore data saved with zfs send there is the possibility that the receive will fail due to replication of snapshots information being processed in a random order. In addition to this when receiving snapshots where the parent from snapshot is replaced by another needless failures occur which can be seen clearly in verbose mode. Fix: The first problem is down to the fact that the replication of snapshots are processed in a random order resulting in snapshot deletes and renames happening in the wrong order. The second problem is caused by deletes / renames effecting the parent snapshot of existing filesystems. The current process needlessly attempts to rename child filesystems of the deleted parents. The attached patch fixes both issues. The first by ensuring the list of snapshots used is sorted The second by maintaining a list of deleted guid's which are used to check against before performing a parent guid based rename. How-To-Repeat: 1. First create some volumes & backups on a test pool (testtank) zfs create testtank/test zfs create testtank/test/1 zfs create testtank/test/2 zfs snapshot -r testtank@backup zfs send -R testtank@backup > recvtest-1 zfs rename -r testtank@backup testtank@backup-last zfs snapshot -r testtank@backup zfs send -R -i testtank@backup-last testtank@backup > recvtest-2 zfs destroy -r testtank@backup-last zfs rename -r testtank@backup testtank@backup-last zfs snapshot -r testtank@backup zfs send -R -i testtank@backup-last testtank@backup > recvtest-3 2. Clean out the pool zfs destroy -r testtank 3. Restore the backups in order zfs recv -duFv testtank < recvtest-1 zfs recv -duFv testtank < recvtest-2 zfs recv -duFv testtank < recvtest-3 At some point usually the third receive it will error and leave the filesystems in a random state. You'll see temporary named filesystems e.g. zfs list -t all testtank/recv-16556-1 testtank/recv-16556-1/1 testtank/recv-16556-1/2
Patch for this issue ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk.
Responsible Changed From-To: freebsd-bugs->freebsd-fs Over to maintainer(s).
Responsible Changed From-To: freebsd-fs->smh I'll take it.
State Changed From-To: open->patched patched in head/ with r244194 - awaiting MFC
Author: smh Date: Wed Jun 5 11:23:10 2013 New Revision: 251411 URL: http://svnweb.freebsd.org/changeset/base/251411 Log: MFC r244194: Fixe zfs receive errors caused by snapshot replication being processed in random order. PR: kern/172259 Modified: stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c Directory Properties: stable/8/cddl/contrib/opensolaris/ (props changed) stable/8/cddl/contrib/opensolaris/lib/libzfs/ (props changed) Modified: stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c ============================================================================== --- stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c Wed Jun 5 11:16:17 2013 (r251410) +++ stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c Wed Jun 5 11:23:10 2013 (r251411) @@ -727,7 +727,7 @@ send_iterate_fs(zfs_handle_t *zhp, void sd->parent_fromsnap_guid = 0; VERIFY(0 == nvlist_alloc(&sd->parent_snaps, NV_UNIQUE_NAME, 0)); VERIFY(0 == nvlist_alloc(&sd->snapprops, NV_UNIQUE_NAME, 0)); - (void) zfs_iter_snapshots(zhp, B_FALSE, send_iterate_snap, sd); + (void) zfs_iter_snapshots_sorted(zhp, send_iterate_snap, sd); VERIFY(0 == nvlist_add_nvlist(nvfs, "snaps", sd->parent_snaps)); VERIFY(0 == nvlist_add_nvlist(nvfs, "snapprops", sd->snapprops)); nvlist_free(sd->parent_snaps); @@ -1945,11 +1945,12 @@ recv_incremental_replication(libzfs_hand recvflags_t *flags, nvlist_t *stream_nv, avl_tree_t *stream_avl, nvlist_t *renamed) { - nvlist_t *local_nv; + nvlist_t *local_nv, *deleted = NULL; avl_tree_t *local_avl; nvpair_t *fselem, *nextfselem; char *fromsnap; char newname[ZFS_MAXNAMELEN]; + char guidname[32]; int error; boolean_t needagain, progress, recursive; char *s1, *s2; @@ -1965,6 +1966,8 @@ recv_incremental_replication(libzfs_hand again: needagain = progress = B_FALSE; + VERIFY(0 == nvlist_alloc(&deleted, NV_UNIQUE_NAME, 0)); + if ((error = gather_nvlist(hdl, tofs, fromsnap, NULL, recursive, &local_nv, &local_avl)) != 0) return (error); @@ -2079,6 +2082,8 @@ again: needagain = B_TRUE; else progress = B_TRUE; + sprintf(guidname, "%lu", thisguid); + nvlist_add_boolean(deleted, guidname); continue; } @@ -2134,6 +2139,8 @@ again: needagain = B_TRUE; else progress = B_TRUE; + sprintf(guidname, "%lu", parent_fromsnap_guid); + nvlist_add_boolean(deleted, guidname); continue; } @@ -2156,6 +2163,24 @@ again: s2 = strrchr(stream_fsname, '/'); /* + * Check if we're going to rename based on parent guid change + * and the current parent guid was also deleted. If it was then + * rename will fail and is likely unneeded, so avoid this and + * force an early retry to determine the new + * parent_fromsnap_guid. + */ + if (stream_parent_fromsnap_guid != 0 && + parent_fromsnap_guid != 0 && + stream_parent_fromsnap_guid != parent_fromsnap_guid) { + sprintf(guidname, "%lu", parent_fromsnap_guid); + if (nvlist_exists(deleted, guidname)) { + progress = B_TRUE; + needagain = B_TRUE; + goto doagain; + } + } + + /* * Check for rename. If the exact receive path is specified, it * does not count as a rename, but we still need to check the * datasets beneath it. @@ -2209,8 +2234,10 @@ again: } } +doagain: fsavl_destroy(local_avl); nvlist_free(local_nv); + nvlist_free(deleted); if (needagain && progress) { /* do another pass to fix up temporary names */ _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Author: smh Date: Wed Jun 5 11:55:35 2013 New Revision: 251417 URL: http://svnweb.freebsd.org/changeset/base/251417 Log: MFC r244194: Fix zfs receive errors caused by snapshot replication being processed in random order. PR: kern/172259 Modified: stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c Directory Properties: stable/9/cddl/contrib/opensolaris/ (props changed) stable/9/cddl/contrib/opensolaris/lib/libzfs/ (props changed) Modified: stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c ============================================================================== --- stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c Wed Jun 5 11:46:43 2013 (r251416) +++ stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c Wed Jun 5 11:55:35 2013 (r251417) @@ -731,7 +731,7 @@ send_iterate_fs(zfs_handle_t *zhp, void sd->parent_fromsnap_guid = 0; VERIFY(0 == nvlist_alloc(&sd->parent_snaps, NV_UNIQUE_NAME, 0)); VERIFY(0 == nvlist_alloc(&sd->snapprops, NV_UNIQUE_NAME, 0)); - (void) zfs_iter_snapshots(zhp, B_FALSE, send_iterate_snap, sd); + (void) zfs_iter_snapshots_sorted(zhp, send_iterate_snap, sd); VERIFY(0 == nvlist_add_nvlist(nvfs, "snaps", sd->parent_snaps)); VERIFY(0 == nvlist_add_nvlist(nvfs, "snapprops", sd->snapprops)); nvlist_free(sd->parent_snaps); @@ -1946,11 +1946,12 @@ recv_incremental_replication(libzfs_hand recvflags_t *flags, nvlist_t *stream_nv, avl_tree_t *stream_avl, nvlist_t *renamed) { - nvlist_t *local_nv; + nvlist_t *local_nv, *deleted = NULL; avl_tree_t *local_avl; nvpair_t *fselem, *nextfselem; char *fromsnap; char newname[ZFS_MAXNAMELEN]; + char guidname[32]; int error; boolean_t needagain, progress, recursive; char *s1, *s2; @@ -1966,6 +1967,8 @@ recv_incremental_replication(libzfs_hand again: needagain = progress = B_FALSE; + VERIFY(0 == nvlist_alloc(&deleted, NV_UNIQUE_NAME, 0)); + if ((error = gather_nvlist(hdl, tofs, fromsnap, NULL, recursive, &local_nv, &local_avl)) != 0) return (error); @@ -2080,6 +2083,8 @@ again: needagain = B_TRUE; else progress = B_TRUE; + sprintf(guidname, "%lu", thisguid); + nvlist_add_boolean(deleted, guidname); continue; } @@ -2135,6 +2140,8 @@ again: needagain = B_TRUE; else progress = B_TRUE; + sprintf(guidname, "%lu", parent_fromsnap_guid); + nvlist_add_boolean(deleted, guidname); continue; } @@ -2157,6 +2164,24 @@ again: s2 = strrchr(stream_fsname, '/'); /* + * Check if we're going to rename based on parent guid change + * and the current parent guid was also deleted. If it was then + * rename will fail and is likely unneeded, so avoid this and + * force an early retry to determine the new + * parent_fromsnap_guid. + */ + if (stream_parent_fromsnap_guid != 0 && + parent_fromsnap_guid != 0 && + stream_parent_fromsnap_guid != parent_fromsnap_guid) { + sprintf(guidname, "%lu", parent_fromsnap_guid); + if (nvlist_exists(deleted, guidname)) { + progress = B_TRUE; + needagain = B_TRUE; + goto doagain; + } + } + + /* * Check for rename. If the exact receive path is specified, it * does not count as a rename, but we still need to check the * datasets beneath it. @@ -2210,8 +2235,10 @@ again: } } +doagain: fsavl_destroy(local_avl); nvlist_free(local_nv); + nvlist_free(deleted); if (needagain && progress) { /* do another pass to fix up temporary names */ _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
State Changed From-To: patched->closed Committed. Thanks!
A commit in branch vendor/openzfs/master references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=7509a3d29963882a0c7fa9e703085c3a6a2ced65 commit 7509a3d29963882a0c7fa9e703085c3a6a2ced65 Author: smh <smh@FreeBSD.org> AuthorDate: 2012-12-13 22:03:07 +0000 Commit: Brian Behlendorf <behlendorf1@llnl.gov> CommitDate: 2014-10-02 23:48:19 +0000 FreeBSD PR kern/172259: Fixes zfs receive errors FreeBSD PR kern/172259: Fixes zfs receive errors caused by snapshot replication being processed in a random order instead of creation order. Eliminates needless filesystem renames caused by removed parent snapshots which subsequently causes many more errors. PR: kern/172259 Submitted by: Steven Hartland Reviewed by: pjd (mentor) Approved by: pjd (mentor) MFC after: 2 weeks References: https://github.com/freebsd/freebsd/commit/4995789 Porting notes: Minor whitespace fixes were made to conform with style requirements: lib/libzfs/libzfs_sendrecv.c: 2269: indent by spaces instead of tabs lib/libzfs/libzfs_sendrecv.c: 2270: indent by spaces instead of tabs Ported-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2729 lib/libzfs/libzfs_sendrecv.c | 31 +++++++++++++++++++++++++++++-- 1 file changed, 29 insertions(+), 2 deletions(-)