Bug 172259 - [zfs] [patch] ZFS fails to receive valid snapshots (patch included)
Summary: [zfs] [patch] ZFS fails to receive valid snapshots (patch included)
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 8.3-RELEASE
Hardware: Any Any
: Normal Affects Only Me
Assignee: Steven Hartland
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-10-02 00:40 UTC by Steven Hartland
Modified: 2021-06-08 14:16 UTC (History)
0 users

See Also:


Attachments
zfs-recv-sortfix.txt (3.01 KB, text/plain; format=flowed)
2012-10-02 01:12 UTC, Steven Hartland &
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Steven Hartland 2012-10-02 00:40:02 UTC
When using zfs receive to restore data saved with zfs send there is the possibility that the receive will fail due to replication of snapshots information being processed in a random order.

In addition to this when receiving snapshots where the parent from snapshot is replaced by another needless failures occur which can be seen clearly in verbose mode.

Fix: 

The first problem is down to the fact that the replication of snapshots are processed in a random order resulting in snapshot deletes and renames happening in the wrong order.

The second problem is caused by deletes / renames effecting the parent snapshot of existing filesystems. The current process needlessly attempts to rename child filesystems of the deleted parents.

The attached patch fixes both issues.

The first by ensuring the list of snapshots used is sorted

The second by maintaining a list of deleted guid's which are used to check against before performing a parent guid based rename.
How-To-Repeat: 1. First create some volumes & backups on a test pool (testtank)
 
zfs create testtank/test
zfs create testtank/test/1
zfs create testtank/test/2
zfs snapshot -r testtank@backup
zfs send -R testtank@backup > recvtest-1
zfs rename -r testtank@backup testtank@backup-last
zfs snapshot -r testtank@backup
zfs send -R -i testtank@backup-last testtank@backup > recvtest-2
zfs destroy -r testtank@backup-last
zfs rename -r testtank@backup testtank@backup-last
zfs snapshot -r testtank@backup
zfs send -R -i testtank@backup-last testtank@backup > recvtest-3

2. Clean out the pool
zfs destroy -r testtank

3. Restore the backups in order
zfs recv -duFv testtank < recvtest-1
zfs recv -duFv testtank < recvtest-2
zfs recv -duFv testtank < recvtest-3

At some point usually the third receive it will error and leave the filesystems in a random state. You'll see temporary named filesystems e.g.
zfs list -t all
testtank/recv-16556-1
testtank/recv-16556-1/1
testtank/recv-16556-1/2
Comment 1 Steven Hartland & 2012-10-02 01:12:23 UTC
Patch for this issue

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 

In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337
or return the E.mail to postmaster@multiplay.co.uk.
Comment 2 Mark Linimon freebsd_committer freebsd_triage 2012-10-05 02:14:41 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
Comment 3 Steven Hartland freebsd_committer 2012-12-11 14:09:31 UTC
Responsible Changed
From-To: freebsd-fs->smh

I'll take it.
Comment 4 Steven Hartland freebsd_committer 2012-12-13 22:05:35 UTC
State Changed
From-To: open->patched

patched in head/ with r244194 - awaiting MFC
Comment 5 dfilter service freebsd_committer 2013-06-05 12:23:19 UTC
Author: smh
Date: Wed Jun  5 11:23:10 2013
New Revision: 251411
URL: http://svnweb.freebsd.org/changeset/base/251411

Log:
  MFC r244194:
  Fixe zfs receive errors caused by snapshot replication being processed in
  random order.
  
  PR:		kern/172259

Modified:
  stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c
Directory Properties:
  stable/8/cddl/contrib/opensolaris/   (props changed)
  stable/8/cddl/contrib/opensolaris/lib/libzfs/   (props changed)

Modified: stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c
==============================================================================
--- stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c	Wed Jun  5 11:16:17 2013	(r251410)
+++ stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c	Wed Jun  5 11:23:10 2013	(r251411)
@@ -727,7 +727,7 @@ send_iterate_fs(zfs_handle_t *zhp, void 
 	sd->parent_fromsnap_guid = 0;
 	VERIFY(0 == nvlist_alloc(&sd->parent_snaps, NV_UNIQUE_NAME, 0));
 	VERIFY(0 == nvlist_alloc(&sd->snapprops, NV_UNIQUE_NAME, 0));
-	(void) zfs_iter_snapshots(zhp, B_FALSE, send_iterate_snap, sd);
+	(void) zfs_iter_snapshots_sorted(zhp, send_iterate_snap, sd);
 	VERIFY(0 == nvlist_add_nvlist(nvfs, "snaps", sd->parent_snaps));
 	VERIFY(0 == nvlist_add_nvlist(nvfs, "snapprops", sd->snapprops));
 	nvlist_free(sd->parent_snaps);
@@ -1945,11 +1945,12 @@ recv_incremental_replication(libzfs_hand
     recvflags_t *flags, nvlist_t *stream_nv, avl_tree_t *stream_avl,
     nvlist_t *renamed)
 {
-	nvlist_t *local_nv;
+	nvlist_t *local_nv, *deleted = NULL;
 	avl_tree_t *local_avl;
 	nvpair_t *fselem, *nextfselem;
 	char *fromsnap;
 	char newname[ZFS_MAXNAMELEN];
+	char guidname[32];
 	int error;
 	boolean_t needagain, progress, recursive;
 	char *s1, *s2;
@@ -1965,6 +1966,8 @@ recv_incremental_replication(libzfs_hand
 again:
 	needagain = progress = B_FALSE;
 
+	VERIFY(0 == nvlist_alloc(&deleted, NV_UNIQUE_NAME, 0));
+
 	if ((error = gather_nvlist(hdl, tofs, fromsnap, NULL,
 	    recursive, &local_nv, &local_avl)) != 0)
 		return (error);
@@ -2079,6 +2082,8 @@ again:
 					needagain = B_TRUE;
 				else
 					progress = B_TRUE;
+				sprintf(guidname, "%lu", thisguid);
+				nvlist_add_boolean(deleted, guidname);
 				continue;
 			}
 
@@ -2134,6 +2139,8 @@ again:
 				needagain = B_TRUE;
 			else
 				progress = B_TRUE;
+			sprintf(guidname, "%lu", parent_fromsnap_guid);
+			nvlist_add_boolean(deleted, guidname);
 			continue;
 		}
 
@@ -2156,6 +2163,24 @@ again:
 		s2 = strrchr(stream_fsname, '/');
 
 		/*
+		 * Check if we're going to rename based on parent guid change
+		 * and the current parent guid was also deleted. If it was then
+		 * rename will fail and is likely unneeded, so avoid this and
+		 * force an early retry to determine the new
+		 * parent_fromsnap_guid.
+		 */
+		if (stream_parent_fromsnap_guid != 0 &&
+                    parent_fromsnap_guid != 0 &&
+                    stream_parent_fromsnap_guid != parent_fromsnap_guid) {
+			sprintf(guidname, "%lu", parent_fromsnap_guid);
+			if (nvlist_exists(deleted, guidname)) {
+				progress = B_TRUE;
+				needagain = B_TRUE;
+				goto doagain;
+			}
+		}
+
+		/*
 		 * Check for rename. If the exact receive path is specified, it
 		 * does not count as a rename, but we still need to check the
 		 * datasets beneath it.
@@ -2209,8 +2234,10 @@ again:
 		}
 	}
 
+doagain:
 	fsavl_destroy(local_avl);
 	nvlist_free(local_nv);
+	nvlist_free(deleted);
 
 	if (needagain && progress) {
 		/* do another pass to fix up temporary names */
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 6 dfilter service freebsd_committer 2013-06-05 12:55:47 UTC
Author: smh
Date: Wed Jun  5 11:55:35 2013
New Revision: 251417
URL: http://svnweb.freebsd.org/changeset/base/251417

Log:
  MFC r244194:
  Fix zfs receive errors caused by snapshot replication being processed in
  random order.
  
  PR:		kern/172259

Modified:
  stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c
Directory Properties:
  stable/9/cddl/contrib/opensolaris/   (props changed)
  stable/9/cddl/contrib/opensolaris/lib/libzfs/   (props changed)

Modified: stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c
==============================================================================
--- stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c	Wed Jun  5 11:46:43 2013	(r251416)
+++ stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c	Wed Jun  5 11:55:35 2013	(r251417)
@@ -731,7 +731,7 @@ send_iterate_fs(zfs_handle_t *zhp, void 
 	sd->parent_fromsnap_guid = 0;
 	VERIFY(0 == nvlist_alloc(&sd->parent_snaps, NV_UNIQUE_NAME, 0));
 	VERIFY(0 == nvlist_alloc(&sd->snapprops, NV_UNIQUE_NAME, 0));
-	(void) zfs_iter_snapshots(zhp, B_FALSE, send_iterate_snap, sd);
+	(void) zfs_iter_snapshots_sorted(zhp, send_iterate_snap, sd);
 	VERIFY(0 == nvlist_add_nvlist(nvfs, "snaps", sd->parent_snaps));
 	VERIFY(0 == nvlist_add_nvlist(nvfs, "snapprops", sd->snapprops));
 	nvlist_free(sd->parent_snaps);
@@ -1946,11 +1946,12 @@ recv_incremental_replication(libzfs_hand
     recvflags_t *flags, nvlist_t *stream_nv, avl_tree_t *stream_avl,
     nvlist_t *renamed)
 {
-	nvlist_t *local_nv;
+	nvlist_t *local_nv, *deleted = NULL;
 	avl_tree_t *local_avl;
 	nvpair_t *fselem, *nextfselem;
 	char *fromsnap;
 	char newname[ZFS_MAXNAMELEN];
+	char guidname[32];
 	int error;
 	boolean_t needagain, progress, recursive;
 	char *s1, *s2;
@@ -1966,6 +1967,8 @@ recv_incremental_replication(libzfs_hand
 again:
 	needagain = progress = B_FALSE;
 
+	VERIFY(0 == nvlist_alloc(&deleted, NV_UNIQUE_NAME, 0));
+
 	if ((error = gather_nvlist(hdl, tofs, fromsnap, NULL,
 	    recursive, &local_nv, &local_avl)) != 0)
 		return (error);
@@ -2080,6 +2083,8 @@ again:
 					needagain = B_TRUE;
 				else
 					progress = B_TRUE;
+				sprintf(guidname, "%lu", thisguid);
+				nvlist_add_boolean(deleted, guidname);
 				continue;
 			}
 
@@ -2135,6 +2140,8 @@ again:
 				needagain = B_TRUE;
 			else
 				progress = B_TRUE;
+			sprintf(guidname, "%lu", parent_fromsnap_guid);
+			nvlist_add_boolean(deleted, guidname);
 			continue;
 		}
 
@@ -2157,6 +2164,24 @@ again:
 		s2 = strrchr(stream_fsname, '/');
 
 		/*
+		 * Check if we're going to rename based on parent guid change
+		 * and the current parent guid was also deleted. If it was then
+		 * rename will fail and is likely unneeded, so avoid this and
+		 * force an early retry to determine the new
+		 * parent_fromsnap_guid.
+		 */
+		if (stream_parent_fromsnap_guid != 0 &&
+                    parent_fromsnap_guid != 0 &&
+                    stream_parent_fromsnap_guid != parent_fromsnap_guid) {
+			sprintf(guidname, "%lu", parent_fromsnap_guid);
+			if (nvlist_exists(deleted, guidname)) {
+				progress = B_TRUE;
+				needagain = B_TRUE;
+				goto doagain;
+			}
+		}
+
+		/*
 		 * Check for rename. If the exact receive path is specified, it
 		 * does not count as a rename, but we still need to check the
 		 * datasets beneath it.
@@ -2210,8 +2235,10 @@ again:
 		}
 	}
 
+doagain:
 	fsavl_destroy(local_avl);
 	nvlist_free(local_nv);
+	nvlist_free(deleted);
 
 	if (needagain && progress) {
 		/* do another pass to fix up temporary names */
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 7 Steven Hartland freebsd_committer 2013-06-07 16:01:28 UTC
State Changed
From-To: patched->closed

Committed. Thanks!
Comment 8 commit-hook freebsd_committer 2021-06-08 14:16:54 UTC
A commit in branch vendor/openzfs/master references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=7509a3d29963882a0c7fa9e703085c3a6a2ced65

commit 7509a3d29963882a0c7fa9e703085c3a6a2ced65
Author:     smh <smh@FreeBSD.org>
AuthorDate: 2012-12-13 22:03:07 +0000
Commit:     Brian Behlendorf <behlendorf1@llnl.gov>
CommitDate: 2014-10-02 23:48:19 +0000

    FreeBSD PR kern/172259: Fixes zfs receive errors

    FreeBSD PR kern/172259: Fixes zfs receive errors caused by snapshot
    replication being processed in a random order instead of creation
    order.

    Eliminates needless filesystem renames caused by removed parent
    snapshots which subsequently causes many more errors.

    PR:             kern/172259
    Submitted by:   Steven Hartland
    Reviewed by:    pjd (mentor)
    Approved by:    pjd (mentor)
    MFC after:      2 weeks

    References:
      https://github.com/freebsd/freebsd/commit/4995789

    Porting notes:

    Minor whitespace fixes were made to conform with style requirements:

    lib/libzfs/libzfs_sendrecv.c: 2269: indent by spaces instead of tabs
    lib/libzfs/libzfs_sendrecv.c: 2270: indent by spaces instead of tabs

    Ported-by: Richard Yao <ryao@gentoo.org>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Issue #2729

 lib/libzfs/libzfs_sendrecv.c | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)