Bug 184677 - [zfs] [panic] ZFS snapshot umount kernel panic
Summary: [zfs] [panic] ZFS snapshot umount kernel panic
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: Unspecified
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: Josh Paetzel
Reported: 2013-12-11 06:10 UTC by krichy
Modified: 2015-01-13 18:00 UTC

Attachments
Proposed fix for .zfs/snapshot deadlock (10.68 KB, patch)
2014-08-15 14:12 UTC, Josh Paetzel
Description krichy 2013-12-11 06:10:00 UTC
Accessing ZFS snapshots while unmounting them in parallel causes the system to panic.

In a real server setup where Unix users exist, those users can access
.zfs/snapshot/ directories, which causes the snapshots to be mounted. The
system may be set up to clean up those mounts by unmounting them at some
point, and a panic may then occur.

How-To-Repeat: Run the script at http://pastebin.com/Bf15sMhd on an empty ZFS dataset
with a snapshot.
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2014-04-16 02:02:38 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
Comment 2 Josh Paetzel freebsd_committer freebsd_triage 2014-08-15 14:12:52 UTC
Created attachment 145821
Proposed fix for .zfs/snapshot deadlock
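
Illustration of the deadlock class (not taken from the attachment): item (b) of the commit log in Comment 3 describes the ".." lookup convention of dropping the leaf vnode lock, locking the directory vnode, and then re-locking the leaf. The following is a minimal userland sketch of that ordering in Python. The names `Vnode`, `lookup_dotdot`, `run_race`, and the `doomed` flag are hypothetical stand-ins, not kernel APIs, and the re-check after re-locking is an assumption about what a caller would want, not something the log states.

```python
import threading


class Vnode:
    """Toy stand-in for a vnode: a name, a parent pointer, and a lock."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.lock = threading.Lock()
        self.doomed = False  # assumption: set if reclaimed while unlocked


def lookup_dotdot(leaf):
    """Look up '..' for a leaf whose lock the caller already holds.

    Returns the parent with both locks held, or None (leaf lock still
    held) if the leaf was marked doomed while it was unlocked.
    """
    parent = leaf.parent
    leaf.lock.release()    # drop the leaf lock first
    parent.lock.acquire()  # take the directory lock (root-downward order)
    leaf.lock.acquire()    # then re-acquire the leaf lock
    if leaf.doomed:
        parent.lock.release()
        return None
    return parent


def run_race(iterations=1000, timeout=10.0):
    """Run a downward locker against a '..' lookup; True if both finish."""
    root = Vnode("snapshot")
    leaf = Vnode("snap1", parent=root)

    def downward():
        for _ in range(iterations):
            with root.lock:      # parent first
                with leaf.lock:  # then child
                    pass

    def upward():
        for _ in range(iterations):
            leaf.lock.acquire()
            parent = lookup_dotdot(leaf)
            if parent is not None:
                parent.lock.release()
            leaf.lock.release()

    threads = [
        threading.Thread(target=downward, daemon=True),
        threading.Thread(target=upward, daemon=True),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout)
    return not any(t.is_alive() for t in threads)
```

Because neither thread ever waits for the parent lock while holding the leaf lock, the two loops cannot block each other permanently. Acquiring the parent without first releasing the leaf would reintroduce the classic two-lock deadlock.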
Comment 3 commit-hook freebsd_committer freebsd_triage 2014-10-25 17:43:18 UTC
A commit references this bug:

Author: jpaetzel
Date: Sat Oct 25 17:42:46 UTC 2014
New revision: 273641
URL: https://svnweb.freebsd.org/changeset/base/273641

Log:
  This change addresses 4 bugs in ZFS exposed by Richard Kojedzinszky's
  crash.sh script attached to FreeNAS bug 4109:
  https://bugs.freenas.org/issues/4109

  Three are in the snapshot layer:
  a) AVG explains in his notes: https://wiki.freebsd.org/AvgVfsSolarisVsFreeBSD

  "VOP_INACTIVE must not do any destructive actions to a vnode
  and its filesystem node, nor invalidate them in any way."
  gfs_vop_inactive and zfsctl_snapshot_inactive did just that. In
  OpenSolaris VOP_INACTIVE is much closer to FreeBSD's VOP_RECLAIM.
  Rename & move them to gfs_vop_reclaim and zfsctl_snapshot_reclaim
  and merge in the requisite vnode_destroy from zfsctl_common_reclaim.

  b) gfs_lookup_dot and various zfsctl functions do not honor the
  FreeBSD VFS convention of only locking from the root downward. When
  looking up ".." the convention is to drop the current leaf vnode lock before
  acquiring the directory vnode and then subsequently re-acquiring the lock on the
  leaf vnode. This fixes that in all the places that are exercised by crash.sh.

  c) The snapshot may already be unmounted when the directory vnode is reclaimed.
  Check for this case and return.

  One in the common layer:
  d) Callers of traverse expect the reference to the vnode passed in to be
  maintained. Don't release it.

  This last one may be an unclear contract. There may in fact be some callers that
  do expect the reference to be dropped on success, in addition to callers that
  expect it to be maintained. In this case a further audit of the callers is needed,
  along with a consensus on the correct behavior.

  PR:	184677
  Submitted by:	kmacy
  Reviewed by:	delphij, will, avg
  MFC after:	2 weeks
  Sponsored by:	iXsystems

Changes:
  head/sys/cddl/compat/opensolaris/kern/opensolaris_lookup.c
  head/sys/cddl/contrib/opensolaris/uts/common/fs/gfs.c
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c
  head/sys/cddl/contrib/opensolaris/uts/common/sys/gfs.h
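
Illustration of items (a) and (c) from the log above: a toy Python model of the inactive/reclaim split, assuming a hypothetical `SnapshotNode` class that stands in for the filesystem-private state behind a snapshot vnode. It is a sketch of the rule quoted from AVG's notes, not the code in zfs_ctldir.c. The method names, the `mounted` flag, and the order of the steps inside `reclaim` are assumptions made for the example.

```python
class SnapshotNode:
    """Toy stand-in for the private data behind a snapshot vnode."""

    def __init__(self, name):
        self.name = name
        self.mounted = True             # is the snapshot still mounted?
        self.private = {"name": name}   # per-node state the vnode points at
        self.unmount_calls = 0

    def inactive(self):
        """Last user went away. The node may be looked up again later,
        so nothing is destroyed or invalidated here (item a)."""
        return None

    def reclaim(self):
        """The vnode is being recycled: destructive teardown goes here.

        Returns True if this call unmounted the snapshot, False if the
        snapshot was already unmounted and there was nothing to do (item c).
        """
        self.private = None
        if not self.mounted:
            return False
        self.mounted = False
        self.unmount_calls += 1
        return True
```

For example, calling `inactive()` and then using the node again is safe in this model because `private` is untouched; only `reclaim()` clears it, and a second path that has already unmounted the snapshot makes `reclaim()` return early instead of unmounting twice.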
Comment 4 commit-hook freebsd_committer freebsd_triage 2014-11-09 20:05:21 UTC
A commit references this bug:

Author: jpaetzel
Date: Sun Nov  9 20:04:30 UTC 2014
New revision: 274326
URL: https://svnweb.freebsd.org/changeset/base/274326

Log:
  MFC: 273641

  This change addresses 4 bugs in ZFS exposed by Richard Kojedzinszky's
  crash.sh script attached to FreeNAS bug 4109:
  https://bugs.freenas.org/issues/4109

  Three are in the snapshot layer:
  a) AVG explains in his notes: https://wiki.freebsd.org/AvgVfsSolarisVsFreeBSD

  "VOP_INACTIVE must not do any destructive actions to a vnode
  and its filesystem node, nor invalidate them in any way."
  gfs_vop_inactive and zfsctl_snapshot_inactive did just that. In
  OpenSolaris VOP_INACTIVE is much closer to FreeBSD's VOP_RECLAIM.
  Rename & move them to gfs_vop_reclaim and zfsctl_snapshot_reclaim
  and merge in the requisite vnode_destroy from zfsctl_common_reclaim.

  b) gfs_lookup_dot and various zfsctl functions do not honor the
  FreeBSD VFS convention of only locking from the root downward. When
  looking up ".." the convention is to drop the current leaf vnode lock before
  acquiring the directory vnode and then subsequently re-acquiring the lock on the
  leaf vnode. This fixes that in all the places that are exercised by crash.sh.

  c) The snapshot may already be unmounted when the directory vnode is reclaimed.
  Check for this case and return.

  One in the common layer:
  d) Callers of traverse expect the reference to the vnode passed in to be
  maintained. Don't release it.

  This last one may be an unclear contract. There may in fact be some callers that
  do expect the reference to be dropped on success, in addition to callers that
  expect it to be maintained. In this case a further audit of the callers is needed,
  along with a consensus on the correct behavior.

  PR:     184677
  Submitted by:	kmacy
  Reviewed by:	delphij, will, avg
  Sponsored by:	iXsystems

Changes:
_U  stable/10/
  stable/10/sys/cddl/compat/opensolaris/kern/opensolaris_lookup.c
  stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/gfs.c
  stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c
  stable/10/sys/cddl/contrib/opensolaris/uts/common/sys/gfs.h
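
Illustration of item (d) from the log above: a toy reference-counting model in Python of the contract "callers of traverse expect the reference to the vnode passed in to be maintained". The `Vnode` class and the `vref`, `vrele`, and `traverse` functions here are hypothetical simplifications that borrow kernel names for readability. They do not reproduce the real function signatures in opensolaris_lookup.c.

```python
class Vnode:
    """Toy vnode with a reference count and an optional covering mount."""

    def __init__(self, name, mounted_root=None):
        self.name = name
        self.refs = 0
        # Root vnode of a filesystem mounted on top of this one, if any.
        self.mounted_root = mounted_root


def vref(vp):
    """Take one reference on vp and return it."""
    vp.refs += 1
    return vp


def vrele(vp):
    """Drop one reference on vp; underflow indicates a contract bug."""
    assert vp.refs > 0, "reference underflow on " + vp.name
    vp.refs -= 1


def traverse(vp):
    """Follow covering mounts starting at vp.

    Returns the final vnode with one new reference. The caller's own
    reference on vp is left alone, which is the contract item (d) describes.
    """
    cur = vp
    while cur.mounted_root is not None:
        cur = cur.mounted_root
    return vref(cur)
```

In this model a caller that holds one reference on the snapshot directory still holds exactly one after `traverse` returns, and releases it itself. If `traverse` also dropped that reference, the caller's later `vrele` would trip the underflow assertion, which is the userland analogue of the double release the log warns about.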