Bug 225795 - panic: found unreferenced mountpoint when accessing and unmounting snapshots in parallel
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Andriy Gapon
Reported: 2018-02-09 21:01 UTC by Alan Somers
Modified: 2018-03-27 11:42 UTC
Description Alan Somers freebsd_committer freebsd_triage 2018-02-09 21:01:59 UTC
Found by snapshot_019_pos from the ZFS test suite:

panic: found unreferenced mountpoint
cpuid = 8
time = 1518201424
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00cef3f320
vpanic() at vpanic+0x18d/frame 0xfffffe00cef3f380
vpanic() at vpanic/frame 0xfffffe00cef3f400
zfsctl_snapdir_lookup() at zfsctl_snapdir_lookup+0x461/frame 0xfffffe00cef3f680
VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0xd3/frame 0xfffffe00cef3f6b0
lookup() at lookup+0x682/frame 0xfffffe00cef3f750
namei() at namei+0x52b/frame 0xfffffe00cef3f810
vn_open_cred() at vn_open_cred+0x213/frame 0xfffffe00cef3f950
kern_openat() at kern_openat+0x20c/frame 0xfffffe00cef3fac0
amd64_syscall() at amd64_syscall+0x79b/frame 0xfffffe00cef3fbf0
fast_syscall_common() at fast_syscall_common+0xfc/frame 0x7fffffffb240
Uptime: 11h59m40s
Dumping 5340 out of 49055 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at ./machine/pcpu.h:229
229             __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) #0  __curthread () at ./machine/pcpu.h:229
#1  doadump (textdump=1)
    at /usr/home/alans/freebsd/head/sys/kern/kern_shutdown.c:346
#2  0xffffffff80ac7042 in kern_reboot (howto=260)
    at /usr/home/alans/freebsd/head/sys/kern/kern_shutdown.c:415
#3  0xffffffff80ac760d in vpanic (fmt=<optimized out>, ap=0xfffffe00cef3f3c0)
    at /usr/home/alans/freebsd/head/sys/kern/kern_shutdown.c:811
#4  0xffffffff80ac7420 in kassert_panic (
    fmt=0xffffffff8224ea68 "found unreferenced mountpoint")
    at /usr/home/alans/freebsd/head/sys/kern/kern_shutdown.c:697
#5  0xffffffff821eae71 in zfsctl_snapdir_lookup (ap=0xfffffe00cef3f6f0)
    at /usr/home/alans/freebsd/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c:893
#6  0xffffffff810f50a3 in VOP_LOOKUP_APV (vop=<optimized out>,
    a=0xfffffe00cef3f6f0) at vnode_if.c:127
#7  0xffffffff80b89c92 in VOP_LOOKUP (vpp=0xfffffe00cef3f9c8,
    cnp=0xfffffe00cef3f9f0, dvp=<optimized out>) at ./vnode_if.h:54
#8  lookup (ndp=0xfffffe00cef3f968)
    at /usr/home/alans/freebsd/head/sys/kern/vfs_lookup.c:888
#9  0xffffffff80b8921b in namei (ndp=0xfffffe00cef3f968)
    at /usr/home/alans/freebsd/head/sys/kern/vfs_lookup.c:450
#10 0xffffffff80ba6d13 in vn_open_cred (ndp=0xfffffe00cef3f968,
    flagp=0xfffffe00cef3fa94, cmode=1, vn_open_flags=0,
    cred=0xfffff80133087c00, fp=0xfffff807b332e9b0)
    at /usr/home/alans/freebsd/head/sys/kern/vfs_vnops.c:279
#11 0xffffffff80b9f5ec in kern_openat (td=0xfffff804ca09d000, fd=-100,
    path=0x800cde000 <error: Cannot access memory at address 0x800cde000>,
    pathseg=UIO_USERSPACE, flags=1179653, mode=<optimized out>)
    at /usr/home/alans/freebsd/head/sys/kern/vfs_syscalls.c:1084
#12 0xffffffff80f7937b in syscallenter (td=0xfffff804ca09d000)
    at /usr/home/alans/freebsd/head/sys/amd64/amd64/../../kern/subr_syscall.c:134
#13 amd64_syscall (td=0xfffff804ca09d000, traced=0)
    at /usr/home/alans/freebsd/head/sys/amd64/amd64/trap.c:935
#14 0xffffffff80f55e38 in fast_syscall_common ()
    at /usr/home/alans/freebsd/head/sys/amd64/amd64/exception.S:468
Comment 1 commit-hook freebsd_committer freebsd_triage 2018-02-09 21:13:49 UTC
A commit references this bug:

Author: asomers
Date: Fri Feb  9 21:13:20 UTC 2018
New revision: 329083
URL: https://svnweb.freebsd.org/changeset/base/329083

Log:
  Skip snapshot_019_pos

  This test frequently panics: "panic: found unreferenced mountpoint"

  PR:		225795
  Sponsored by:	Spectra Logic Corp

Changes:
  projects/zfsd/head/tests/sys/cddl/zfs/tests/snapshot/snapshot_test.sh
Comment 2 Andriy Gapon freebsd_committer freebsd_triage 2018-02-12 17:26:53 UTC
(In reply to Alan Somers from comment #0)
Have you seen this with a recent CURRENT?

I would like to have a dig at the crash dump, if possible.
Comment 3 Alan Somers freebsd_committer freebsd_triage 2018-02-12 17:43:28 UTC
Yes, my CURRENT dates from Feb-1.  Would you like me to send you my crash dump and kernel symbol files?  They'll take around 400 MB.  Or, you could try to reproduce it yourself.
Comment 4 commit-hook freebsd_committer freebsd_triage 2018-02-19 08:55:52 UTC
A commit references this bug:

Author: avg
Date: Mon Feb 19 08:55:23 UTC 2018
New revision: 329556
URL: https://svnweb.freebsd.org/changeset/base/329556

Log:
  relax an assert in zfsctl_snapdir_lookup to match r323578

  Since r323578 we may remove the last reference to a covered vnode with
  vrele() instead of vput().  So, v_usecount may be decremented before
  the vnode is locked and zfsctl_snapdir_lookup may "catch" the vnode
  with v_usecount of zero and v_holdcnt of one.

  PR:		225795
  Reported by:	asomers
  MFC after:	1 week

Changes:
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c
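
The ordering described in the r329556 log can be illustrated with a small userland model. This is only a sketch, not kernel code: struct toy_vnode and the toy_* helpers are hypothetical names, and the single boolean standing in for the vnode lock is a simplifying assumption. It shows why a lookup that locks the covered vnode right after the use count is dropped can observe v_usecount of zero with v_holdcnt of one.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy stand-in for a vnode.  The field names follow the commit log;
 * everything else is hypothetical and is not kernel code.
 */
struct toy_vnode {
	int	v_usecount;	/* active references */
	int	v_holdcnt;	/* holds that keep the structure alive */
	bool	locked;		/* models the vnode lock */
};

/*
 * vput()-style release: the caller already holds the lock, so the use
 * count only changes while the vnode is locked.
 */
static void
toy_vput(struct toy_vnode *vp)
{
	assert(vp->locked);
	vp->v_usecount--;
	vp->locked = false;
}

/*
 * First half of a vrele()-style release: the use count drops before
 * the vnode lock is taken.
 */
static void
toy_vrele_begin(struct toy_vnode *vp)
{
	assert(!vp->locked);
	vp->v_usecount--;
}

/*
 * What a lookup observes if it locks the vnode right after
 * toy_vrele_begin() and before the release has finished.
 */
static bool
toy_lookup_sees_unreferenced(struct toy_vnode *vp)
{
	bool unreferenced;

	assert(!vp->locked);
	vp->locked = true;
	unreferenced = (vp->v_usecount == 0 && vp->v_holdcnt == 1);
	vp->locked = false;
	return (unreferenced);
}
```

In this model, calling toy_vrele_begin() on a vnode with one use and one hold, then toy_lookup_sees_unreferenced(), returns true. With toy_vput() the decrement happens under the lock, so a lookup that needs the same lock cannot slip in between.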
Comment 5 commit-hook freebsd_committer freebsd_triage 2018-02-19 18:14:48 UTC
A commit references this bug:

Author: asomers
Date: Mon Feb 19 18:14:12 UTC 2018
New revision: 329597
URL: https://svnweb.freebsd.org/changeset/base/329597

Log:
  No longer skip snapshot_019_pos, now that PR 225795 is fixed

  PR:		225795
  Sponsored by:	Spectra Logic Corp

Changes:
  projects/zfsd/head/tests/sys/cddl/zfs/tests/snapshot/snapshot_test.sh
Comment 6 commit-hook freebsd_committer freebsd_triage 2018-02-22 11:41:39 UTC
A commit references this bug:

Author: avg
Date: Thu Feb 22 11:41:00 UTC 2018
New revision: 329820
URL: https://svnweb.freebsd.org/changeset/base/329820

Log:
  followup to r329556, completely remove the covered vnode assert

  vrele() acquires the vnode lock only if the hold count drops to zero.
  In other scenarios it needs only the interlock.  So,
  zfsctl_snapdir_lookup() can race with vfs_mount_destroy() -> vrele()
  such that the lookup adds a new reference and then vrele() drops the
  mountpoint's reference and only then we check the reference count.
  It would be just one in this case.

  In fact, the assert should have been removed in r323483 when the code
  learned how to deal with the uncovered vnode.

  PR:		225795
  MFC after:	4 days
  X-MFC with:	r329556

Changes:
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c
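
The interleaving described in the r329820 log can be replayed with a deterministic toy model. This is a sketch under stated assumptions: enum toy_step and toy_observed_refs() are hypothetical names, a single integer stands in for the covered vnode's reference count, and the removed assertion is assumed to have required more than one reference (the log says the count "would be just one" in the racy case).

```c
#include <assert.h>
#include <stddef.h>

/* The three events from the commit log, as abstract steps. */
enum toy_step {
	LOOKUP_ADD_REF,		/* the lookup takes its own reference */
	UNMOUNT_VRELE,		/* unmount drops the mountpoint's reference */
	LOOKUP_CHECK		/* the lookup inspects the reference count */
};

/*
 * Replay the steps in the given order and return the reference count
 * seen at LOOKUP_CHECK, or -1 if the order contains no check step.
 */
static int
toy_observed_refs(const enum toy_step *order, size_t n)
{
	int refs = 1;		/* reference held by the mounted snapshot */
	int observed = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		switch (order[i]) {
		case LOOKUP_ADD_REF:
			refs++;
			break;
		case UNMOUNT_VRELE:
			refs--;
			break;
		case LOOKUP_CHECK:
			observed = refs;
			break;
		}
	}
	return (observed);
}
```

With the order LOOKUP_ADD_REF, LOOKUP_CHECK, UNMOUNT_VRELE the check sees two references. With LOOKUP_ADD_REF, UNMOUNT_VRELE, LOOKUP_CHECK it sees one, which a "more than one reference" assertion would reject even though the lookup still holds a valid reference of its own.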