Bug 139076 - [zfs] ZFS file system has SysV group ownership semantics not BSD
Summary: [zfs] ZFS file system has SysV group ownership semantics not BSD
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: Unspecified
Hardware: Any Any
: Normal Affects Only Me
Assignee: Pawel Jakub Dawidek
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-09-23 08:30 UTC by Sean Winn
Modified: 2010-01-06 16:20 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Sean Winn 2009-09-23 08:30:03 UTC
GID of files created in a directory take on the current GID of the user not the GID of the enclosing directory, and SGID on the directory forces BSD semantics; ie. it's using SysV ownerships.

Fix: 

Workaround is to use SGID on the directories, but its a pretty big change when migrating from UFS to ZFS
How-To-Repeat: 17:15 Wed 23-Sep sean@queen [~/tmp] ls -ld .
drwxrwxr-x  95 sean  sean  299 Sep 16 10:55 .
17:16 Wed 23-Sep sean@queen [~/tmp] sudo touch yyy
Password:
17:16 Wed 23-Sep sean@queen [~/tmp] ls -ld yyy
-rw-r--r--  1 root  wheel  0 Sep 23 17:16 yyy
17:16 Wed 23-Sep sean@queen [~/tmp] chmod 2775 .
17:16 Wed 23-Sep sean@queen [~/tmp] sudo touch yyy2
17:16 Wed 23-Sep sean@queen [~/tmp] ls -ld yyy2
-rw-r--r--  1 root  sean  0 Sep 23 17:16 yyy2
Comment 1 Gavin Atkinson freebsd_committer freebsd_triage 2009-09-23 09:10:56 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).  Not sure there's anything that can be done about 
this, other than documenting it.
Comment 2 dfilter service freebsd_committer freebsd_triage 2009-09-23 10:18:26 UTC
Author: pjd
Date: Wed Sep 23 09:18:16 2009
New Revision: 197426
URL: http://svn.freebsd.org/changeset/base/197426

Log:
  Restore BSD behaviour - when creating new directory entry use parent directory
  gid to set group ownership and not process gid.
  
  This was overlooked during v6 -> v13 switch.
  
  PR:		kern/139076
  Reported by:	Sean Winn <sean@gothic.net.au>
  MFC after:	3 days

Modified:
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c

Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c
==============================================================================
--- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c	Wed Sep 23 05:32:33 2009	(r197425)
+++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c	Wed Sep 23 09:18:16 2009	(r197426)
@@ -1841,7 +1841,7 @@ zfs_perm_init(znode_t *zp, znode_t *pare
 				fgid = zfs_fuid_create_cred(zfsvfs,
 				    ZFS_GROUP, tx, cr, fuidp);
 #ifdef __FreeBSD__
-				gid = parent->z_phys->zp_gid;
+				gid = fgid = parent->z_phys->zp_gid;
 #else
 				gid = crgetgid(cr);
 #endif
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 3 dfilter service freebsd_committer freebsd_triage 2009-09-29 11:53:26 UTC
Author: pjd
Date: Tue Sep 29 10:53:06 2009
New Revision: 197613
URL: http://svn.freebsd.org/changeset/base/197613

Log:
  MFC r197287, r197289, r197351, r197426, r197458, r197459, r197497, r197498,
  r197512, r197513, r197514, r197515, r197525:
  
  r197287:
  
  Purge namecache for the file system being rolled back, so it doesn't point at
  invalid vnodes after the rollback resulting in EIO errors when trying to access
  files which are in the namecache.
  
  Reported by:	des
  
  r197289:
  
  Purge file system namecache when receiving incremental stream and rolling back
  to it.
  
  r197351:
  
  Purge namecache in the same place OpenSolaris does.
  
  r197426:
  
  Restore BSD behaviour - when creating new directory entry use parent directory
  gid to set group ownership and not process gid.
  
  This was overlooked during v6 -> v13 switch.
  
  PR:	kern/139076
  Reported by:	Sean Winn <sean@gothic.net.au>
  
  r197458:
  
  Close race in zfs_zget(). We have to increase usecount first and then
  check for VI_DOOMED flag. Before this change vnode could be reclaimed
  between checking for the flag and increasing usecount.
  
  r197459:
  
  Before calling vflush(FORCECLOSE) mark file system as unmounted so the
  following vnops will fail. This is very important, because without this change
  vnode could be reclaimed at any point, even if we increased usecount. The only
  way to ensure that vnode won't be reclaimed was to lock it, which would be very
  hard to do in ZFS without changing a lot of code. With this change simply
  increasing usecount is enough to be sure vnode won't be reclaimed from under
  us. To be precise it can still be reclaimed but we won't be able to see it,
  because every try to enter ZFS through VFS will result in EIO.
  
  The only function that cannot return EIO, because it is needed for vflush() is
  zfs_root(). Introduce ZFS_ENTER_NOERROR() macro that only locks
  z_teardown_lock and never returns EIO.
  
  r197497:
  
  Switch to fletcher4 as the default checksum algorithm. Fletcher2 was proven to
  be a bit weak and OpenSolaris also switched to fletcher4.
  
  r197498:	head/cddl/contrib/opensolaris
  
  Fletcher4 is not the default checksum algorithm.
  
  r197512:
  
  - Don't depend on value returned by gfs_*_inactive(), it doesn't work
    well with forced unmounts when GFS vnodes are referenced.
  - Make other preparations to GFS for forced unmounts.
  
  PR:	kern/139062
  Reported by:	trasz
  
  r197513:
  
  Use traverse() function to find and return mount point's vnode instead of
  covered vnode when snapshot is already mounted.
  
  r197514:
  
  On lookup error VFS expects *vpp to be set to NULL, be sure to do that.
  
  r197515:
  
  Handle cases where virtual (GFS) vnodes are referenced when doing forced
  unmount. In that case we cannot depend on the proper order of invalidating
  vnodes, so we have to free resources when we have a chance.
  
  PR:	kern/139062
  Reported by:	trasz
  
  r197525:
  
  Ensure that tv_sec is between INT32_MIN and INT32_MAX, so ZFS won't object.
  This completes the fix from r185586.
  
  PR:	kern/139059
  Reported by:	Daniel Braniss <danny@cs.huji.ac.il>
  Submitted by:	Jaakko Heinonen <jh@saunalahti.fi>
  Tested by:	Daniel Braniss <danny@cs.huji.ac.il>
  
  Approved by:	re (kib)

Modified:
  stable/8/cddl/contrib/opensolaris/   (props changed)
  stable/8/cddl/contrib/opensolaris/cmd/zfs/zfs.8
  stable/8/sys/   (props changed)
  stable/8/sys/amd64/include/xen/   (props changed)
  stable/8/sys/cddl/contrib/opensolaris/   (props changed)
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/gfs.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/fletcher.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_znode.h
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
  stable/8/sys/contrib/dev/acpica/   (props changed)
  stable/8/sys/contrib/pf/   (props changed)
  stable/8/sys/dev/xen/xenpci/   (props changed)
  stable/8/sys/nfsserver/nfs_serv.c

Modified: stable/8/cddl/contrib/opensolaris/cmd/zfs/zfs.8
==============================================================================
--- stable/8/cddl/contrib/opensolaris/cmd/zfs/zfs.8	Tue Sep 29 10:50:02 2009	(r197612)
+++ stable/8/cddl/contrib/opensolaris/cmd/zfs/zfs.8	Tue Sep 29 10:53:06 2009	(r197613)
@@ -535,7 +535,7 @@ This property is not inherited.
 .ad
 .sp .6
 .RS 4n
-Controls the checksum used to verify data integrity. The default value is "on", which automatically selects an appropriate algorithm (currently, \fIfletcher2\fR, but this may change in future releases). The value "off" disables integrity
+Controls the checksum used to verify data integrity. The default value is "on", which automatically selects an appropriate algorithm (currently, \fIfletcher4\fR, but this may change in future releases). The value "off" disables integrity
 checking on user data. Disabling checksums is NOT a recommended practice.
 .RE
 

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/gfs.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/gfs.c	Tue Sep 29 10:50:02 2009	(r197612)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/gfs.c	Tue Sep 29 10:53:06 2009	(r197613)
@@ -595,7 +595,6 @@ found:
 	if (vp->v_flag & V_XATTRDIR)
 		VI_LOCK(fp->gfs_parent);
 	VI_LOCK(vp);
-	ASSERT(vp->v_count < 2);
 	/*
 	 * Really remove this vnode
 	 */
@@ -607,12 +606,7 @@ found:
 		 */
 		ge->gfse_vnode = NULL;
 	}
-	if (vp->v_count == 1) {
-		vp->v_usecount--;
-		vdropl(vp);
-	} else {
-		VI_UNLOCK(vp);
-	}
+	VI_UNLOCK(vp);
 
 	/*
 	 * Free vnode and release parent
@@ -1084,18 +1078,16 @@ gfs_vop_inactive(ap)
 {
 	vnode_t *vp = ap->a_vp;
 	gfs_file_t *fp = vp->v_data;
-	void *data;
 
 	if (fp->gfs_type == GFS_DIR)
-		data = gfs_dir_inactive(vp);
+		gfs_dir_inactive(vp);
 	else
-		data = gfs_file_inactive(vp);
-
-	if (data != NULL)
-		kmem_free(data, fp->gfs_size);
+		gfs_file_inactive(vp);
 
 	VI_LOCK(vp);
 	vp->v_data = NULL;
 	VI_UNLOCK(vp);
+	kmem_free(fp, fp->gfs_size);
+
 	return (0);
 }

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/fletcher.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/fletcher.c	Tue Sep 29 10:50:02 2009	(r197612)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/fletcher.c	Tue Sep 29 10:53:06 2009	(r197613)
@@ -19,11 +19,111 @@
  * CDDL HEADER END
  */
 /*
- * Copyright 2006 Sun Microsystems, Inc.  All rights reserved.
+ * Copyright 2009 Sun Microsystems, Inc.  All rights reserved.
  * Use is subject to license terms.
  */
 
-#pragma ident	"%Z%%M%	%I%	%E% SMI"
+/*
+ * Fletcher Checksums
+ * ------------------
+ *
+ * ZFS's 2nd and 4th order Fletcher checksums are defined by the following
+ * recurrence relations:
+ *
+ *	a  = a    + f
+ *	 i    i-1    i-1
+ *
+ *	b  = b    + a
+ *	 i    i-1    i
+ *
+ *	c  = c    + b		(fletcher-4 only)
+ *	 i    i-1    i
+ *
+ *	d  = d    + c		(fletcher-4 only)
+ *	 i    i-1    i
+ *
+ * Where
+ *	a_0 = b_0 = c_0 = d_0 = 0
+ * and
+ *	f_0 .. f_(n-1) are the input data.
+ *
+ * Using standard techniques, these translate into the following series:
+ *
+ *	     __n_			     __n_
+ *	     \   |			     \   |
+ *	a  =  >     f			b  =  >     i * f
+ *	 n   /___|   n - i		 n   /___|	 n - i
+ *	     i = 1			     i = 1
+ *
+ *
+ *	     __n_			     __n_
+ *	     \   |  i*(i+1)		     \   |  i*(i+1)*(i+2)
+ *	c  =  >     ------- f		d  =  >     ------------- f
+ *	 n   /___|     2     n - i	 n   /___|	  6	   n - i
+ *	     i = 1			     i = 1
+ *
+ * For fletcher-2, the f_is are 64-bit, and [ab]_i are 64-bit accumulators.
+ * Since the additions are done mod (2^64), errors in the high bits may not
+ * be noticed.  For this reason, fletcher-2 is deprecated.
+ *
+ * For fletcher-4, the f_is are 32-bit, and [abcd]_i are 64-bit accumulators.
+ * A conservative estimate of how big the buffer can get before we overflow
+ * can be estimated using f_i = 0xffffffff for all i:
+ *
+ * % bc
+ *  f=2^32-1;d=0; for (i = 1; d<2^64; i++) { d += f*i*(i+1)*(i+2)/6 }; (i-1)*4
+ * 2264
+ *  quit
+ * %
+ *
+ * So blocks of up to 2k will not overflow.  Our largest block size is
+ * 128k, which has 32k 4-byte words, so we can compute the largest possible
+ * accumulators, then divide by 2^64 to figure the max amount of overflow:
+ *
+ * % bc
+ *  a=b=c=d=0; f=2^32-1; for (i=1; i<=32*1024; i++) { a+=f; b+=a; c+=b; d+=c }
+ *  a/2^64;b/2^64;c/2^64;d/2^64
+ * 0
+ * 0
+ * 1365
+ * 11186858
+ *  quit
+ * %
+ *
+ * So a and b cannot overflow.  To make sure each bit of input has some
+ * effect on the contents of c and d, we can look at what the factors of
+ * the coefficients in the equations for c_n and d_n are.  The number of 2s
+ * in the factors determines the lowest set bit in the multiplier.  Running
+ * through the cases for n*(n+1)/2 reveals that the highest power of 2 is
+ * 2^14, and for n*(n+1)*(n+2)/6 it is 2^15.  So while some data may overflow
+ * the 64-bit accumulators, every bit of every f_i effects every accumulator,
+ * even for 128k blocks.
+ *
+ * If we wanted to make a stronger version of fletcher4 (fletcher4c?),
+ * we could do our calculations mod (2^32 - 1) by adding in the carries
+ * periodically, and store the number of carries in the top 32-bits.
+ *
+ * --------------------
+ * Checksum Performance
+ * --------------------
+ *
+ * There are two interesting components to checksum performance: cached and
+ * uncached performance.  With cached data, fletcher-2 is about four times
+ * faster than fletcher-4.  With uncached data, the performance difference is
+ * negligible, since the cost of a cache fill dominates the processing time.
+ * Even though fletcher-4 is slower than fletcher-2, it is still a pretty
+ * efficient pass over the data.
+ *
+ * In normal operation, the data which is being checksummed is in a buffer
+ * which has been filled either by:
+ *
+ *	1. a compression step, which will be mostly cached, or
+ *	2. a bcopy() or copyin(), which will be uncached (because the
+ *	   copy is cache-bypassing).
+ *
+ * For both cached and uncached data, both fletcher checksums are much faster
+ * than sha-256, and slower than 'off', which doesn't touch the data at all.
+ */
 
 #include <sys/types.h>
 #include <sys/sysmacros.h>

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_znode.h
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_znode.h	Tue Sep 29 10:50:02 2009	(r197612)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_znode.h	Tue Sep 29 10:53:06 2009	(r197613)
@@ -255,6 +255,7 @@ VTOZ(vnode_t *vp)
 
 /*
  * ZFS_ENTER() is called on entry to each ZFS vnode and vfs operation.
+ * ZFS_ENTER_NOERROR() is called when we can't return EIO.
  * ZFS_EXIT() must be called before exitting the vop.
  * ZFS_VERIFY_ZP() verifies the znode is valid.
  */
@@ -267,6 +268,9 @@ VTOZ(vnode_t *vp)
 		} \
 	}
 
+#define	ZFS_ENTER_NOERROR(zfsvfs) \
+	rrw_enter(&(zfsvfs)->z_teardown_lock, RW_READER, FTAG)
+
 #define	ZFS_EXIT(zfsvfs) rrw_exit(&(zfsvfs)->z_teardown_lock, FTAG)
 
 #define	ZFS_VERIFY_ZP(zp) \

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h	Tue Sep 29 10:50:02 2009	(r197612)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h	Tue Sep 29 10:53:06 2009	(r197613)
@@ -76,7 +76,7 @@ enum zio_checksum {
 	ZIO_CHECKSUM_FUNCTIONS
 };
 
-#define	ZIO_CHECKSUM_ON_VALUE	ZIO_CHECKSUM_FLETCHER_2
+#define	ZIO_CHECKSUM_ON_VALUE	ZIO_CHECKSUM_FLETCHER_4
 #define	ZIO_CHECKSUM_DEFAULT	ZIO_CHECKSUM_ON
 
 enum zio_compress {

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c	Tue Sep 29 10:50:02 2009	(r197612)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c	Tue Sep 29 10:53:06 2009	(r197613)
@@ -1841,7 +1841,7 @@ zfs_perm_init(znode_t *zp, znode_t *pare
 				fgid = zfs_fuid_create_cred(zfsvfs,
 				    ZFS_GROUP, tx, cr, fuidp);
 #ifdef __FreeBSD__
-				gid = parent->z_phys->zp_gid;
+				gid = fgid = parent->z_phys->zp_gid;
 #else
 				gid = crgetgid(cr);
 #endif

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c	Tue Sep 29 10:50:02 2009	(r197612)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c	Tue Sep 29 10:53:06 2009	(r197613)
@@ -818,7 +818,11 @@ zfsctl_snapdir_lookup(ap)
 	if ((sep = avl_find(&sdp->sd_snaps, &search, &where)) != NULL) {
 		*vpp = sep->se_root;
 		VN_HOLD(*vpp);
-		if ((*vpp)->v_mountedhere == NULL) {
+		err = traverse(vpp, LK_EXCLUSIVE | LK_RETRY);
+		if (err) {
+			VN_RELE(*vpp);
+			*vpp = NULL;
+		} else if (*vpp == sep->se_root) {
 			/*
 			 * The snapshot was unmounted behind our backs,
 			 * try to remount it.
@@ -832,10 +836,9 @@ zfsctl_snapdir_lookup(ap)
 			 */
 			(*vpp)->v_flag &= ~VROOT;
 		}
-		vn_lock(*vpp, LK_EXCLUSIVE | LK_RETRY);
 		mutex_exit(&sdp->sd_lock);
 		ZFS_EXIT(zfsvfs);
-		return (0);
+		return (err);
 	}
 
 	/*
@@ -895,6 +898,8 @@ domount:
 	}
 	mutex_exit(&sdp->sd_lock);
 	ZFS_EXIT(zfsvfs);
+	if (err != 0)
+		*vpp = NULL;
 	return (err);
 }
 
@@ -1002,15 +1007,24 @@ zfsctl_snapdir_inactive(ap)
 {
 	vnode_t *vp = ap->a_vp;
 	zfsctl_snapdir_t *sdp = vp->v_data;
-	void *private;
+	zfs_snapentry_t *sep;
 
-	private = gfs_dir_inactive(vp);
-	if (private != NULL) {
-		ASSERT(avl_numnodes(&sdp->sd_snaps) == 0);
-		mutex_destroy(&sdp->sd_lock);
-		avl_destroy(&sdp->sd_snaps);
-		kmem_free(private, sizeof (zfsctl_snapdir_t));
+	/*
+	 * On forced unmount we have to free snapshots from here.
+	 */
+	mutex_enter(&sdp->sd_lock);
+	while ((sep = avl_first(&sdp->sd_snaps)) != NULL) {
+		avl_remove(&sdp->sd_snaps, sep);
+		kmem_free(sep->se_name, strlen(sep->se_name) + 1);
+		kmem_free(sep, sizeof (zfs_snapentry_t));
 	}
+	mutex_exit(&sdp->sd_lock);
+	gfs_dir_inactive(vp);
+	ASSERT(avl_numnodes(&sdp->sd_snaps) == 0);
+	mutex_destroy(&sdp->sd_lock);
+	avl_destroy(&sdp->sd_snaps);
+	kmem_free(sdp, sizeof (zfsctl_snapdir_t));
+
 	return (0);
 }
 
@@ -1068,6 +1082,9 @@ zfsctl_snapshot_inactive(ap)
 	int locked;
 	vnode_t *dvp;
 
+	if (vp->v_count > 0)
+		goto end;
+
 	VERIFY(gfs_dir_lookup(vp, "..", &dvp, cr, 0, NULL, NULL) == 0);
 	sdp = dvp->v_data;
 	VOP_UNLOCK(dvp, 0);
@@ -1075,11 +1092,6 @@ zfsctl_snapshot_inactive(ap)
 	if (!(locked = MUTEX_HELD(&sdp->sd_lock)))
 		mutex_enter(&sdp->sd_lock);
 
-	if (vp->v_count > 1) {
-		if (!locked)
-			mutex_exit(&sdp->sd_lock);
-		return (0);
-	}
 	ASSERT(!vn_ismntpt(vp));
 
 	sep = avl_first(&sdp->sd_snaps);
@@ -1099,6 +1111,7 @@ zfsctl_snapshot_inactive(ap)
 	if (!locked)
 		mutex_exit(&sdp->sd_lock);
 	VN_RELE(dvp);
+end:
 	VFS_RELE(vp->v_vfsp);
 
 	/*

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c	Tue Sep 29 10:50:02 2009	(r197612)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c	Tue Sep 29 10:53:06 2009	(r197613)
@@ -864,7 +864,7 @@ zfs_root(vfs_t *vfsp, int flags, vnode_t
 	znode_t *rootzp;
 	int error;
 
-	ZFS_ENTER(zfsvfs);
+	ZFS_ENTER_NOERROR(zfsvfs);
 
 	error = zfs_zget(zfsvfs, zfsvfs->z_root, &rootzp);
 	if (error == 0) {
@@ -898,6 +898,9 @@ zfsvfs_teardown(zfsvfs_t *zfsvfs, boolea
 		 * 'z_parent' is self referential for non-snapshots.
 		 */
 		(void) dnlc_purge_vfsp(zfsvfs->z_parent->z_vfs, 0);
+#ifdef FREEBSD_NAMECACHE
+		cache_purgevfs(zfsvfs->z_parent->z_vfs);
+#endif
 	}
 
 	/*
@@ -1027,6 +1030,17 @@ zfs_umount(vfs_t *vfsp, int fflag)
 		ASSERT(zfsvfs->z_ctldir == NULL);
 	}
 
+	if (fflag & MS_FORCE) {
+		/*
+		 * Mark file system as unmounted before calling
+		 * vflush(FORCECLOSE). This way we ensure no future vnops
+		 * will be called and risk operating on DOOMED vnodes.
+		 */
+		rrw_enter(&zfsvfs->z_teardown_lock, RW_WRITER, FTAG);
+		zfsvfs->z_unmounted = B_TRUE;
+		rrw_exit(&zfsvfs->z_teardown_lock, FTAG);
+	}
+
 	/*
 	 * Flush all the files.
 	 */
@@ -1093,8 +1107,7 @@ zfs_umount(vfs_t *vfsp, int fflag)
 	if (zfsvfs->z_issnap) {
 		vnode_t *svp = vfsp->mnt_vnodecovered;
 
-		ASSERT(svp->v_count == 2 || svp->v_count == 1);
-		if (svp->v_count == 2)
+		if (svp->v_count >= 2)
 			VN_RELE(svp);
 	}
 	zfs_freevfs(vfsp);

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c	Tue Sep 29 10:50:02 2009	(r197612)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c	Tue Sep 29 10:53:06 2009	(r197613)
@@ -890,17 +890,25 @@ again:
 		if (zp->z_unlinked) {
 			err = ENOENT;
 		} else {
-			if ((vp = ZTOV(zp)) != NULL) {
-				VI_LOCK(vp);
+			int dying = 0;
+
+			vp = ZTOV(zp);
+			if (vp == NULL)
+				dying = 1;
+			else {
+				VN_HOLD(vp);
 				if ((vp->v_iflag & VI_DOOMED) != 0) {
-					VI_UNLOCK(vp);
-					vp = NULL;
-				} else
-					VI_UNLOCK(vp);
+					dying = 1;
+					/*
+					 * Don't VN_RELE() vnode here, because
+					 * it can call vn_lock() which creates
+					 * LOR between vnode lock and znode
+					 * lock. We will VN_RELE() the vnode
+					 * after droping znode lock.
+					 */
+				}
 			}
-			if (vp != NULL)
-				VN_HOLD(vp);
-			else {
+			if (dying) {
 				if (first) {
 					ZFS_LOG(1, "dying znode detected (zp=%p)", zp);
 					first = 0;
@@ -912,6 +920,8 @@ again:
 				dmu_buf_rele(db, NULL);
 				mutex_exit(&zp->z_lock);
 				ZFS_OBJ_HOLD_EXIT(zfsvfs, obj_num);
+				if (vp != NULL)
+					VN_RELE(vp);
 				tsleep(zp, 0, "zcollide", 1);
 				goto again;
 			}

Modified: stable/8/sys/nfsserver/nfs_serv.c
==============================================================================
--- stable/8/sys/nfsserver/nfs_serv.c	Tue Sep 29 10:50:02 2009	(r197612)
+++ stable/8/sys/nfsserver/nfs_serv.c	Tue Sep 29 10:53:06 2009	(r197613)
@@ -1332,7 +1332,7 @@ nfsrv_create(struct nfsrv_descript *nfsd
 			tl = nfsm_dissect_nonblock(u_int32_t *,
 			    NFSX_V3CREATEVERF);
 			/* Unique bytes, endianness is not important. */
-			cverf.tv_sec  = tl[0];
+			cverf.tv_sec  = (int32_t)tl[0];
 			cverf.tv_nsec = tl[1];
 			exclusive_flag = 1;
 			break;
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 4 dfilter service freebsd_committer freebsd_triage 2010-01-06 16:10:17 UTC
Author: netchild
Date: Wed Jan  6 16:09:58 2010
New Revision: 201651
URL: http://svn.freebsd.org/changeset/base/201651

Log:
  MFC several ZFS related commits:
  
  r196980:
  ---snip---
      When we automatically mount snapshot we want to return vnode of the mount point
      from the lookup and not covered vnode. This is one of the fixes for using .zfs/
      over NFS.
  ---snip---
  
  r196982:
  ---snip---
      We don't export individual snapshots, so mnt_export field in snapshot's
      mount point is NULL. That's why when we try to access snapshots over NFS
      use mnt_export field from the parent file system.
  ---snip---
  
  r197131:
  ---snip---
      Tighten up the check for race in zfs_zget() - ZTOV(zp) can not only contain
      NULL, but also can point to dead vnode, take that into account.
  
      PR:				kern/132068
      Reported by:		Edward Fisk" <7ogcg7g02@sneakemail.com>, kris
      Fix based on patch from:	Jaakko Heinonen <jh@saunalahti.fi>
  ---snip---
  
  r197133:
  ---snip---
      - Protect reclaim with z_teardown_inactive_lock.
      - Be prepared for dbuf to disappear in zfs_reclaim_complete() and check if
        z_dbuf field is NULL - this might happen in case of rollback or forced
        unmount between zfs_freebsd_reclaim() and zfs_reclaim_complete().
      - On forced unmount wait for all znodes to be destroyed - destruction can be
        done asynchronously via zfs_reclaim_complete().
  ---snip---
  
  r197153:
  ---snip---
      When zfs.ko is compiled with debug, make sure that znode and vnode point at
      each other.
  ---snip---
  
  r197167:
  ---snip---
      Work-around READDIRPLUS problem with .zfs/ and .zfs/snapshot/ directories
      by just returning EOPNOTSUPP. This will allow NFS server to fall back to
      regular READDIR.
  
      Note that converting inode number to snapshot's vnode is expensive operation.
      Snapshots are stored in AVL tree, but based on their names, not inode numbers,
      so to convert inode to snapshot vnode we have to interate over all snalshots.
  
      This is not a problem in OpenSolaris, because in their READDIRPLUS
      implementation they use VOP_LOOKUP() on d_name, instead of VFS_VGET() on
      d_fileno as we do.
  
      PR:			kern/125149
      Reported by:	Weldon Godfrey <wgodfrey@ena.com>
      Analysis by:	Jaakko Heinonen <jh@saunalahti.fi>
  ---snip---
  
  r197177:
  ---snip---
      Support both case: when snapshot is already mounted and when it is not yet
      mounted.
  ---snip---
  
  r197201:
  ---snip---
      - Mount ZFS snapshots with MNT_IGNORE flag, so they are not visible in regular
        df(1) and mount(8) output. This is a bit smilar to OpenSolaris and follows
        ZFS route of not listing snapshots by default with 'zfs list' command.
      - Add UPDATING entry to note that ZFS snapshots are no longer visible in
        mount(8) and df(1) output by default.
  
      Reviewed by:	kib
  ---snip---
  Note: the MNT_IGNORE part is commented out in this commit and the UPDATING
  entry is not merged, as this would be a POLA violation on a stable branch.
  This revision is included here, as it also makes locking changes and makes
  sure that a snapshot is mounted RO.
  
  r197426:
  ---snip---
      Restore BSD behaviour - when creating new directory entry use parent directory
      gid to set group ownership and not process gid.
  
      This was overlooked during v6 -> v13 switch.
  
      PR:			kern/139076
      Reported by:	Sean Winn <sean@gothic.net.au>
  ---snip---
  
  r197458:
  ---snip---
      Close race in zfs_zget(). We have to increase usecount first and then
      check for VI_DOOMED flag. Before this change vnode could be reclaimed
      between checking for the flag and increasing usecount.
  ---snip---

Modified:
  stable/7/sys/cddl/compat/opensolaris/kern/opensolaris_vfs.c
  stable/7/sys/cddl/compat/opensolaris/sys/vfs.h
  stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_znode.h
  stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c
  stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c
  stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c
  stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
  stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
Directory Properties:
  stable/7/sys/   (props changed)
  stable/7/sys/cddl/contrib/opensolaris/   (props changed)
  stable/7/sys/contrib/dev/acpica/   (props changed)
  stable/7/sys/contrib/pf/   (props changed)

Modified: stable/7/sys/cddl/compat/opensolaris/kern/opensolaris_vfs.c
==============================================================================
--- stable/7/sys/cddl/compat/opensolaris/kern/opensolaris_vfs.c	Wed Jan  6 16:05:33 2010	(r201650)
+++ stable/7/sys/cddl/compat/opensolaris/kern/opensolaris_vfs.c	Wed Jan  6 16:09:58 2010	(r201651)
@@ -115,12 +115,13 @@ extern struct mount *vfs_mount_alloc(str
     const char *fspath, struct thread *td);
 
 int
-domount(kthread_t *td, vnode_t *vp, const char *fstype, char *fspath,
+mount_snapshot(kthread_t *td, vnode_t **vpp, const char *fstype, char *fspath,
     char *fspec, int fsflags)
 {
 	struct mount *mp;
 	struct vfsconf *vfsp;
 	struct ucred *cr;
+	vnode_t *vp;
 	int error;
 
 	/*
@@ -135,23 +136,28 @@ domount(kthread_t *td, vnode_t *vp, cons
 	if (vfsp == NULL)
 		return (ENODEV);
 
+	vp = *vpp;
 	if (vp->v_type != VDIR)
 		return (ENOTDIR);
+	/*
+	 * We need vnode lock to protect v_mountedhere and vnode interlock
+	 * to protect v_iflag.
+	 */
+	vn_lock(vp, LK_SHARED | LK_RETRY, td);
 	VI_LOCK(vp);
-	if ((vp->v_iflag & VI_MOUNT) != 0 ||
-	    vp->v_mountedhere != NULL) {
+	if ((vp->v_iflag & VI_MOUNT) != 0 || vp->v_mountedhere != NULL) {
 		VI_UNLOCK(vp);
+		VOP_UNLOCK(vp, 0, td);
 		return (EBUSY);
 	}
 	vp->v_iflag |= VI_MOUNT;
 	VI_UNLOCK(vp);
+	VOP_UNLOCK(vp, 0, td);
 
 	/*
 	 * Allocate and initialize the filesystem.
 	 */
-	vn_lock(vp, LK_SHARED | LK_RETRY, td);
 	mp = vfs_mount_alloc(vp, vfsp, fspath, td);
-	VOP_UNLOCK(vp, 0,td);
 
 	mp->mnt_optnew = NULL;
 	vfs_setmntopt(mp, "from", fspec, 0);
@@ -161,11 +167,20 @@ domount(kthread_t *td, vnode_t *vp, cons
 	/*
 	 * Set the mount level flags.
 	 */
-	if (fsflags & MNT_RDONLY)
-		mp->mnt_flag |= MNT_RDONLY;
-	mp->mnt_flag &=~ MNT_UPDATEMASK;
+	mp->mnt_flag &= ~MNT_UPDATEMASK;
 	mp->mnt_flag |= fsflags & (MNT_UPDATEMASK | MNT_FORCE | MNT_ROOTFS);
 	/*
+	 * Snapshots are always read-only.
+	 */
+	mp->mnt_flag |= MNT_RDONLY;
+#if 0
+	/*
+	 * We don't want snapshots to be visible in regular
+	 * mount(8) and df(1) output.
+	 */
+	mp->mnt_flag |= MNT_IGNORE;
+#endif
+	/*
 	 * Unprivileged user can trigger mounting a snapshot, but we don't want
 	 * him to unmount it, so we switch to privileged of original mount.
 	 */
@@ -173,11 +188,6 @@ domount(kthread_t *td, vnode_t *vp, cons
 	mp->mnt_cred = crdup(vp->v_mount->mnt_cred);
 	mp->mnt_stat.f_owner = mp->mnt_cred->cr_uid;
 	/*
-	 * Mount the filesystem.
-	 * XXX The final recipients of VFS_MOUNT just overwrite the ndp they
-	 * get.  No freeing of cn_pnbuf.
-	 */
-	/*
 	 * XXX: This is evil, but we can't mount a snapshot as a regular user.
 	 * XXX: Is is safe when snapshot is mounted from within a jail?
 	 */
@@ -186,7 +196,7 @@ domount(kthread_t *td, vnode_t *vp, cons
 	error = VFS_MOUNT(mp, td);
 	td->td_ucred = cr;
 
-	if (!error) {
+	if (error == 0) {
 		if (mp->mnt_opt != NULL)
 			vfs_freeopts(mp->mnt_opt);
 		mp->mnt_opt = mp->mnt_optnew;
@@ -198,42 +208,33 @@ domount(kthread_t *td, vnode_t *vp, cons
 	*/
 	mp->mnt_optnew = NULL;
 	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, td);
-	/*
-	 * Put the new filesystem on the mount list after root.
-	 */
 #ifdef FREEBSD_NAMECACHE
 	cache_purge(vp);
 #endif
-	if (!error) {
+	VI_LOCK(vp);
+	vp->v_iflag &= ~VI_MOUNT;
+	VI_UNLOCK(vp);
+	if (error == 0) {
 		vnode_t *mvp;
 
-		VI_LOCK(vp);
-		vp->v_iflag &= ~VI_MOUNT;
-		VI_UNLOCK(vp);
 		vp->v_mountedhere = mp;
+		/*
+		 * Put the new filesystem on the mount list.
+		 */
 		mtx_lock(&mountlist_mtx);
 		TAILQ_INSERT_TAIL(&mountlist, mp, mnt_list);
 		mtx_unlock(&mountlist_mtx);
 		vfs_event_signal(NULL, VQ_MOUNT, 0);
 		if (VFS_ROOT(mp, LK_EXCLUSIVE, &mvp, td))
 			panic("mount: lost mount");
-		mountcheckdirs(vp, mvp);
-		vput(mvp);
-		VOP_UNLOCK(vp, 0, td);
-		if ((mp->mnt_flag & MNT_RDONLY) == 0)
-			error = vfs_allocate_syncvnode(mp);
+		vput(vp);
 		vfs_unbusy(mp, td);
-		if (error)
-			vrele(vp);
-		else
-			vfs_mountedfrom(mp, fspec);
+		*vpp = mvp;
 	} else {
-		VI_LOCK(vp);
-		vp->v_iflag &= ~VI_MOUNT;
-		VI_UNLOCK(vp);
-		VOP_UNLOCK(vp, 0, td);
+		vput(vp);
 		vfs_unbusy(mp, td);
 		vfs_mount_destroy(mp);
+		*vpp = NULL;
 	}
 	return (error);
 }

Modified: stable/7/sys/cddl/compat/opensolaris/sys/vfs.h
==============================================================================
--- stable/7/sys/cddl/compat/opensolaris/sys/vfs.h	Wed Jan  6 16:05:33 2010	(r201650)
+++ stable/7/sys/cddl/compat/opensolaris/sys/vfs.h	Wed Jan  6 16:09:58 2010	(r201651)
@@ -110,8 +110,8 @@ void vfs_setmntopt(vfs_t *vfsp, const ch
     int flags __unused);
 void vfs_clearmntopt(vfs_t *vfsp, const char *name);
 int vfs_optionisset(const vfs_t *vfsp, const char *opt, char **argp);
-int domount(kthread_t *td, vnode_t *vp, const char *fstype, char *fspath,
-    char *fspec, int fsflags);
+int mount_snapshot(kthread_t *td, vnode_t **vpp, const char *fstype,
+    char *fspath, char *fspec, int fsflags);
 
 typedef	uint64_t	vfs_feature_t;
 

Modified: stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_znode.h
==============================================================================
--- stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_znode.h	Wed Jan  6 16:05:33 2010	(r201650)
+++ stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_znode.h	Wed Jan  6 16:09:58 2010	(r201651)
@@ -231,8 +231,27 @@ typedef struct znode {
 /*
  * Convert between znode pointers and vnode pointers
  */
+#ifdef DEBUG
+static __inline vnode_t *
+ZTOV(znode_t *zp)
+{
+	vnode_t *vp = zp->z_vnode;
+
+	ASSERT(vp == NULL || vp->v_data == NULL || vp->v_data == zp);
+	return (vp);
+}
+static __inline znode_t *
+VTOZ(vnode_t *vp)
+{
+	znode_t *zp = (znode_t *)vp->v_data;
+
+	ASSERT(zp == NULL || zp->z_vnode == NULL || zp->z_vnode == vp);
+	return (zp);
+}
+#else
 #define	ZTOV(ZP)	((ZP)->z_vnode)
 #define	VTOZ(VP)	((znode_t *)(VP)->v_data)
+#endif
 
 /*
  * ZFS_ENTER() is called on entry to each ZFS vnode and vfs operation.

Modified: stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c
==============================================================================
--- stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c	Wed Jan  6 16:05:33 2010	(r201650)
+++ stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c	Wed Jan  6 16:09:58 2010	(r201651)
@@ -1841,7 +1841,7 @@ zfs_perm_init(znode_t *zp, znode_t *pare
 				fgid = zfs_fuid_create_cred(zfsvfs,
 				    ZFS_GROUP, tx, cr, fuidp);
 #ifdef __FreeBSD__
-				gid = parent->z_phys->zp_gid;
+				gid = fgid = parent->z_phys->zp_gid;
 #else
 				gid = crgetgid(cr);
 #endif

Modified: stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c
==============================================================================
--- stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c	Wed Jan  6 16:05:33 2010	(r201650)
+++ stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c	Wed Jan  6 16:09:58 2010	(r201651)
@@ -879,20 +879,20 @@ domount:
 	mountpoint = kmem_alloc(mountpoint_len, KM_SLEEP);
 	(void) snprintf(mountpoint, mountpoint_len, "%s/.zfs/snapshot/%s",
 	    dvp->v_vfsp->mnt_stat.f_mntonname, nm);
-	err = domount(curthread, *vpp, "zfs", mountpoint, snapname, 0);
+	err = mount_snapshot(curthread, vpp, "zfs", mountpoint, snapname, 0);
 	kmem_free(mountpoint, mountpoint_len);
-	/* FreeBSD: This line was moved from below to avoid a lock recursion. */
-	if (err == 0)
-		vn_lock(*vpp, LK_EXCLUSIVE | LK_RETRY, curthread);
-	mutex_exit(&sdp->sd_lock);
-	/*
-	 * If we had an error, drop our hold on the vnode and
-	 * zfsctl_snapshot_inactive() will clean up.
-	 */
-	if (err) {
-		VN_RELE(*vpp);
-		*vpp = NULL;
+	if (err == 0) {
+		/*
+		 * Fix up the root vnode mounted on .zfs/snapshot/<snapname>.
+		 *
+		 * This is where we lie about our v_vfsp in order to
+		 * make .zfs/snapshot/<snapname> accessible over NFS
+		 * without requiring manual mounts of <snapname>.
+		 */
+		ASSERT(VTOZ(*vpp)->z_zfsvfs != zfsvfs);
+		VTOZ(*vpp)->z_zfsvfs->z_parent = zfsvfs;
 	}
+	mutex_exit(&sdp->sd_lock);
 	ZFS_EXIT(zfsvfs);
 	return (err);
 }

Modified: stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c
==============================================================================
--- stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c	Wed Jan  6 16:05:33 2010	(r201650)
+++ stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c	Wed Jan  6 16:09:58 2010	(r201651)
@@ -97,6 +97,8 @@ static int zfs_root(vfs_t *vfsp, int fla
 static int zfs_statfs(vfs_t *vfsp, struct statfs *statp, kthread_t *td);
 static int zfs_vget(vfs_t *vfsp, ino_t ino, int flags, vnode_t **vpp);
 static int zfs_sync(vfs_t *vfsp, int waitfor, kthread_t *td);
+static int zfs_checkexp(vfs_t *vfsp, struct sockaddr *nam, int *extflagsp,
+    struct ucred **credanonp);
 static int zfs_fhtovp(vfs_t *vfsp, fid_t *fidp, vnode_t **vpp);
 static void zfs_objset_close(zfsvfs_t *zfsvfs);
 static void zfs_freevfs(vfs_t *vfsp);
@@ -108,6 +110,7 @@ static struct vfsops zfs_vfsops = {
 	.vfs_statfs =		zfs_statfs,
 	.vfs_vget =		zfs_vget,
 	.vfs_sync =		zfs_sync,
+	.vfs_checkexp =		zfs_checkexp,
 	.vfs_fhtovp =		zfs_fhtovp,
 };
 
@@ -955,6 +958,18 @@ zfsvfs_teardown(zfsvfs_t *zfsvfs, boolea
 		zfsvfs->z_unmounted = B_TRUE;
 		rrw_exit(&zfsvfs->z_teardown_lock, FTAG);
 		rw_exit(&zfsvfs->z_teardown_inactive_lock);
+
+#ifdef __FreeBSD__
+		/*
+		 * Some znodes might not be fully reclaimed, wait for them.
+		 */
+		mutex_enter(&zfsvfs->z_znodes_lock);
+		while (list_head(&zfsvfs->z_all_znodes) != NULL) {
+			msleep(zfsvfs, &zfsvfs->z_znodes_lock, 0,
+			    "zteardown", 0);
+		}
+		mutex_exit(&zfsvfs->z_znodes_lock);
+#endif
 	}
 
 	/*
@@ -1114,6 +1129,20 @@ zfs_vget(vfs_t *vfsp, ino_t ino, int fla
 	znode_t		*zp;
 	int 		err;
 
+	/*
+	 * XXXPJD: zfs_zget() can't operate on virtual entires like .zfs/ or
+	 * .zfs/snapshot/ directories, so for now just return EOPNOTSUPP.
+	 * This will make NFS to fall back to using READDIR instead of
+	 * READDIRPLUS.
+	 * Also snapshots are stored in AVL tree, but based on their names,
+	 * not inode numbers, so it will be very inefficient to iterate
+	 * over all snapshots to find the right one.
+	 * Note that OpenSolaris READDIRPLUS implementation does LOOKUP on
+	 * d_name, and not VGET on d_fileno as we do.
+	 */
+	if (ino == ZFSCTL_INO_ROOT || ino == ZFSCTL_INO_SNAPDIR)
+		return (EOPNOTSUPP);
+
 	ZFS_ENTER(zfsvfs);
 	err = zfs_zget(zfsvfs, ino, &zp);
 	if (err == 0 && zp->z_unlinked) {
@@ -1134,6 +1163,26 @@ CTASSERT(SHORT_FID_LEN <= sizeof(struct 
 CTASSERT(LONG_FID_LEN <= sizeof(struct fid));
 
 static int
+zfs_checkexp(vfs_t *vfsp, struct sockaddr *nam, int *extflagsp,
+    struct ucred **credanonp)
+{
+	zfsvfs_t *zfsvfs = vfsp->vfs_data;
+
+	/*
+	 * If this is regular file system vfsp is the same as
+	 * zfsvfs->z_parent->z_vfs, but if it is snapshot,
+	 * zfsvfs->z_parent->z_vfs represents parent file system
+	 * which we have to use here, because only this file system
+	 * has mnt_export configured.
+	 */
+	vfsp = zfsvfs->z_parent->z_vfs;
+
+	return (vfs_stdcheckexp(zfsvfs->z_parent->z_vfs, nam, extflagsp,
+	    credanonp));
+}
+
+
+static int
 zfs_fhtovp(vfs_t *vfsp, fid_t *fidp, vnode_t **vpp)
 {
 	zfsvfs_t	*zfsvfs = vfsp->vfs_data;
@@ -1148,7 +1197,11 @@ zfs_fhtovp(vfs_t *vfsp, fid_t *fidp, vno
 
 	ZFS_ENTER(zfsvfs);
 
-	if (fidp->fid_len == LONG_FID_LEN) {
+	/*
+	 * On FreeBSD we can get snapshot's mount point or its parent file
+	 * system mount point depending if snapshot is already mounted or not.
+	 */
+	if (zfsvfs->z_parent == zfsvfs && fidp->fid_len == LONG_FID_LEN) {
 		zfid_long_t	*zlfid = (zfid_long_t *)fidp;
 		uint64_t	objsetid = 0;
 		uint64_t	setgen = 0;

Modified: stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
==============================================================================
--- stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	Wed Jan  6 16:05:33 2010	(r201650)
+++ stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	Wed Jan  6 16:09:58 2010	(r201651)
@@ -4340,11 +4340,20 @@ zfs_reclaim_complete(void *arg, int pend
 	znode_t	*zp = arg;
 	zfsvfs_t *zfsvfs = zp->z_zfsvfs;
 
-	ZFS_LOG(1, "zp=%p", zp);
-	ZFS_OBJ_HOLD_ENTER(zfsvfs, zp->z_id);
-	zfs_znode_dmu_fini(zp);
-	ZFS_OBJ_HOLD_EXIT(zfsvfs, zp->z_id);
+	rw_enter(&zfsvfs->z_teardown_inactive_lock, RW_READER);
+	if (zp->z_dbuf != NULL) {
+		ZFS_OBJ_HOLD_ENTER(zfsvfs, zp->z_id);
+		zfs_znode_dmu_fini(zp);
+		ZFS_OBJ_HOLD_EXIT(zfsvfs, zp->z_id);
+	}
 	zfs_znode_free(zp);
+	rw_exit(&zfsvfs->z_teardown_inactive_lock);
+	/*
+	 * If the file system is being unmounted, there is a process waiting
+	 * for us, wake it up.
+	 */
+	if (zfsvfs->z_unmounted)
+		wakeup_one(zfsvfs);
 }
 
 static int
@@ -4356,6 +4365,9 @@ zfs_freebsd_reclaim(ap)
 {
 	vnode_t	*vp = ap->a_vp;
 	znode_t	*zp = VTOZ(vp);
+	zfsvfs_t *zfsvfs = zp->z_zfsvfs;
+
+	rw_enter(&zfsvfs->z_teardown_inactive_lock, RW_READER);
 
 	ASSERT(zp != NULL);
 
@@ -4366,7 +4378,7 @@ zfs_freebsd_reclaim(ap)
 
 	mutex_enter(&zp->z_lock);
 	ASSERT(zp->z_phys != NULL);
-	ZTOV(zp) = NULL;
+	zp->z_vnode = NULL;
 	mutex_exit(&zp->z_lock);
 
 	if (zp->z_unlinked)
@@ -4374,7 +4386,6 @@ zfs_freebsd_reclaim(ap)
 	else if (zp->z_dbuf == NULL)
 		zfs_znode_free(zp);
 	else /* if (!zp->z_unlinked && zp->z_dbuf != NULL) */ {
-		zfsvfs_t *zfsvfs = zp->z_zfsvfs;
 		int locked;
 
 		locked = MUTEX_HELD(ZFS_OBJ_MUTEX(zfsvfs, zp->z_id)) ? 2 :
@@ -4397,6 +4408,7 @@ zfs_freebsd_reclaim(ap)
 	vp->v_data = NULL;
 	ASSERT(vp->v_holdcnt >= 1);
 	VI_UNLOCK(vp);
+	rw_exit(&zfsvfs->z_teardown_inactive_lock);
 	return (0);
 }
 

Modified: stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
==============================================================================
--- stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c	Wed Jan  6 16:05:33 2010	(r201650)
+++ stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c	Wed Jan  6 16:09:58 2010	(r201651)
@@ -110,7 +110,7 @@ znode_evict_error(dmu_buf_t *dbuf, void 
 		mutex_exit(&zp->z_lock);
 		zfs_znode_free(zp);
 	} else if (vp->v_count == 0) {
-		ZTOV(zp) = NULL;
+		zp->z_vnode = NULL;
 		vhold(vp);
 		mutex_exit(&zp->z_lock);
 		vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, curthread);
@@ -896,9 +896,25 @@ again:
 		if (zp->z_unlinked) {
 			err = ENOENT;
 		} else {
-			if (ZTOV(zp) != NULL)
-				VN_HOLD(ZTOV(zp));
+			int dying = 0;
+
+			vp = ZTOV(zp);
+			if (vp == NULL)
+				dying = 1;
 			else {
+				VN_HOLD(vp);
+				if ((vp->v_iflag & VI_DOOMED) != 0) {
+					dying = 1;
+					/*
+					 * Don't VN_RELE() vnode here, because
+					 * it can call vn_lock() which creates
+					 * LOR between vnode lock and znode
+					 * lock. We will VN_RELE() the vnode
+					 * after droping znode lock.
+					 */
+				}
+			}
+			if (dying) {
 				if (first) {
 					ZFS_LOG(1, "dying znode detected (zp=%p)", zp);
 					first = 0;
@@ -910,6 +926,8 @@ again:
 				dmu_buf_rele(db, NULL);
 				mutex_exit(&zp->z_lock);
 				ZFS_OBJ_HOLD_EXIT(zfsvfs, obj_num);
+				if (vp != NULL)
+					VN_RELE(vp);
 				tsleep(zp, 0, "zcollide", 1);
 				goto again;
 			}
@@ -1531,7 +1549,7 @@ zfs_create_fs(objset_t *os, cred_t *cr, 
 	ZTOV(rootzp)->v_data = NULL;
 	ZTOV(rootzp)->v_count = 0;
 	ZTOV(rootzp)->v_holdcnt = 0;
-	ZTOV(rootzp) = NULL;
+	rootzp->z_vnode = NULL;
 	VOP_UNLOCK(vp, 0, curthread);
 	vdestroy(vp);
 	dmu_buf_rele(rootzp->z_dbuf, NULL);
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 5 Pawel Jakub Dawidek freebsd_committer freebsd_triage 2014-06-01 07:09:35 UTC
State Changed
From-To: open->patched

Fix committed to HEAD. Thank you for the report! 


Comment 6 Pawel Jakub Dawidek freebsd_committer freebsd_triage 2014-06-01 07:09:35 UTC
Responsible Changed
From-To: freebsd-fs->pjd

I'll take this one.
Comment 7 Pawel Jakub Dawidek freebsd_committer freebsd_triage 2014-06-01 07:09:35 UTC
State Changed
From-To: patched->closed

Fix merged to stable/8.