Bug 205725 - Lock order reversal in gfs_file_create during zfs unmount
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
Reported: 2015-12-30 10:46 UTC by florian.ermisch
Modified: 2016-02-16 13:56 UTC
Description florian.ermisch 2015-12-30 10:46:17 UTC
Hi *,

Since upgrading my laptop from 10.2-RELEASE to 11-CURRENT (base r292536,
now base r292755), I see this lock order reversal with a stack backtrace
whenever a zpool is exported:

Dec 27 18:44:02 fuchi-cyber220 kernel: lock order reversal:
Dec 27 18:44:02 fuchi-cyber220 kernel: 1st 0xfffff800c7472418 zfs (zfs) @ /usr/src/sys/kern/vfs_mount.c:1224
Dec 27 18:44:02 fuchi-cyber220 kernel: 2nd 0xfffff800c73fad50 zfs_gfs (zfs_gfs) @ /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/gfs.c:494
Dec 27 18:44:02 fuchi-cyber220 kernel: stack backtrace:
Dec 27 18:44:02 fuchi-cyber220 kernel: #0 0xffffffff80a7d6f0 at witness_debugger+0x70
Dec 27 18:44:02 fuchi-cyber220 kernel: #1 0xffffffff80a7d5f1 at witness_checkorder+0xe71
Dec 27 18:44:02 fuchi-cyber220 kernel: #2 0xffffffff809fedcb at __lockmgr_args+0xd3b
Dec 27 18:44:02 fuchi-cyber220 kernel: #3 0xffffffff80ac55ec at vop_stdlock+0x3c
Dec 27 18:44:02 fuchi-cyber220 kernel: #4 0xffffffff80fbb220 at VOP_LOCK1_APV+0x100
Dec 27 18:44:02 fuchi-cyber220 kernel: #5 0xffffffff80ae653a at _vn_lock+0x9a
Dec 27 18:44:02 fuchi-cyber220 kernel: #6 0xffffffff8209db13 at gfs_file_create+0x73
Dec 27 18:44:02 fuchi-cyber220 kernel: #7 0xffffffff8209dbbd at gfs_dir_create+0x1d
Dec 27 18:44:02 fuchi-cyber220 kernel: #8 0xffffffff821649e7 at zfsctl_mknode_snapdir+0x47
Dec 27 18:44:02 fuchi-cyber220 kernel: #9 0xffffffff8209e135 at gfs_dir_lookup+0x185
Dec 27 18:44:02 fuchi-cyber220 kernel: #10 0xffffffff8209e61d at gfs_vop_lookup+0x1d
Dec 27 18:44:02 fuchi-cyber220 kernel: #11 0xffffffff82163a05 at zfsctl_root_lookup+0xf5
Dec 27 18:44:02 fuchi-cyber220 kernel: #12 0xffffffff821648a3 at zfsctl_umount_snapshots+0x83
Dec 27 18:44:02 fuchi-cyber220 kernel: #13 0xffffffff8217d5ab at zfs_umount+0x7b
Dec 27 18:44:02 fuchi-cyber220 kernel: #14 0xffffffff80acf0b0 at dounmount+0x530
Dec 27 18:44:02 fuchi-cyber220 kernel: #15 0xffffffff80aceaed at sys_unmount+0x35d
Dec 27 18:44:02 fuchi-cyber220 kernel: #16 0xffffffff80e6e13b at amd64_syscall+0x2db
Dec 27 18:44:02 fuchi-cyber220 kernel: #17 0xffffffff80e4dd8b at Xfast_syscall+0xfb

(From /var/log/messages)
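
For triage, the lock pair and the call chain can be pulled out of a report like the one above mechanically. The following is a minimal sketch, assuming the syslog line layout shown in this report; the function name `parse_lor` and the regular expressions are illustrative only and are not part of any FreeBSD tool. The sample is an abbreviated copy of the log above.

```python
import re

# Abbreviated copy of the report above (only two of the 18 frames kept).
SAMPLE = """\
Dec 27 18:44:02 fuchi-cyber220 kernel: lock order reversal:
Dec 27 18:44:02 fuchi-cyber220 kernel: 1st 0xfffff800c7472418 zfs (zfs) @ /usr/src/sys/kern/vfs_mount.c:1224
Dec 27 18:44:02 fuchi-cyber220 kernel: 2nd 0xfffff800c73fad50 zfs_gfs (zfs_gfs) @ /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/gfs.c:494
Dec 27 18:44:02 fuchi-cyber220 kernel: stack backtrace:
Dec 27 18:44:02 fuchi-cyber220 kernel: #6 0xffffffff8209db13 at gfs_file_create+0x73
Dec 27 18:44:02 fuchi-cyber220 kernel: #13 0xffffffff8217d5ab at zfs_umount+0x7b
"""

# "1st <addr> <name> (<class>) @ <file>:<line>" as printed in the report.
LOCK_RE = re.compile(
    r"kernel: (1st|2nd) (0x[0-9a-f]+) (\S+) \(([^)]+)\) @ (\S+):(\d+)\s*$"
)
# "#<n> <addr> at <function>+<offset>" as printed in the backtrace.
FRAME_RE = re.compile(r"kernel: #(\d+) (0x[0-9a-f]+) at (\w+)\+(0x[0-9a-f]+)\s*$")


def parse_lor(text):
    """Return (locks, frames) extracted from one lock order reversal report."""
    locks, frames = [], []
    for line in text.splitlines():
        m = LOCK_RE.search(line)
        if m:
            order, _addr, name, _cls, path, lineno = m.groups()
            locks.append(
                {"order": order, "name": name, "file": path, "line": int(lineno)}
            )
            continue
        m = FRAME_RE.search(line)
        if m:
            frames.append(m.group(3))  # keep only the function name
    return locks, frames


locks, frames = parse_lor(SAMPLE)
for lock in locks:
    print(lock["order"], lock["name"], lock["file"], lock["line"])
print(" <- ".join(frames))
```

Dropping the addresses and offsets makes it easier to compare reports from different kernel builds, where only those values change.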

At first I only saw this on the console at shutdown or reboot, but I later
found that I can reproduce it by exporting a zpool.
The only pools for which I can trigger this without a shutdown/reboot are
connected via USB(3), but I also see it at shutdown, just before poweroff,
after the "All synced." message.

When I try to export a zpool under heavy load (`make -C /usr/src -j 4 buildworld`
on a 2-core CPU with HT), the system locks up completely. I don't think this is
related to memory pressure, as I haven't seen swap being used during a buildworld
(with 8 GB of RAM).

PS, for cross-referencing: I posted this issue to the current@ list yesterday,
see https://lists.freebsd.org/pipermail/freebsd-current/2015-December/059117.html
Comment 1 florian.ermisch 2015-12-30 10:50:01 UTC
Might be related to this issue on 10-STABLE:
https://lists.freebsd.org/pipermail/freebsd-stable/2015-December/083895.html
Comment 2 florian.ermisch 2015-12-30 14:31:37 UTC
Bug #203352 might be related: its trace also starts at /usr/src/sys/kern/vfs_mount.c:1224.
Comment 3 Alan Somers freebsd_committer freebsd_triage 2015-12-31 20:07:29 UTC
No, bug #203352 is unrelated.  The guilty party here is the gfs code.
Comment 4 Will Green 2016-02-16 13:56:11 UTC
I have seen a similar LOR with GFS when using a file-backed ZFS pool on FreeBSD-11.0-CURRENT-amd64-20160206-r295345. 

It's easy to reproduce:

# dd bs=1m count=256 if=/dev/zero of=/tmp/disk1
256+0 records in
256+0 records out
268435456 bytes transferred in 0.266620 secs (1006810670 bytes/sec)
# zpool create testpool /tmp/disk1
# zpool destroy testpool

Feb 16 13:53:06 fbsd11 ZFS: vdev state changed, pool_guid=5988351068332835963 vdev_guid=14151074118185943762
Feb 16 13:53:14 fbsd11 kernel: lock order reversal:
Feb 16 13:53:14 fbsd11 kernel: 1st 0xfffff800370c05f0 zfs (zfs) @ /usr/src/sys/kern/vfs_mount.c:1222
Feb 16 13:53:14 fbsd11 kernel: 2nd 0xfffff8007238b240 zfs_gfs (zfs_gfs) @ /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/gfs.c:494
Feb 16 13:53:14 fbsd11 kernel: stack backtrace:
Feb 16 13:53:14 fbsd11 kernel: #0 0xffffffff80a7fc60 at witness_debugger+0x70
Feb 16 13:53:14 fbsd11 kernel: #1 0xffffffff80a7fb61 at witness_checkorder+0xe71
Feb 16 13:53:14 fbsd11 kernel: #2 0xffffffff809ff22b at __lockmgr_args+0xd3b
Feb 16 13:53:14 fbsd11 kernel: #3 0xffffffff80ac64fc at vop_stdlock+0x3c
Feb 16 13:53:14 fbsd11 kernel: #4 0xffffffff80fbdb00 at VOP_LOCK1_APV+0x100
Feb 16 13:53:14 fbsd11 kernel: #5 0xffffffff80ae71ba at _vn_lock+0x9a
Feb 16 13:53:14 fbsd11 kernel: #6 0xffffffff820a2b13 at gfs_file_create+0x73
Feb 16 13:53:14 fbsd11 kernel: #7 0xffffffff820a2bbd at gfs_dir_create+0x1d
Feb 16 13:53:14 fbsd11 kernel: #8 0xffffffff8216bf57 at zfsctl_mknode_snapdir+0x47
Feb 16 13:53:14 fbsd11 kernel: #9 0xffffffff820a3135 at gfs_dir_lookup+0x185
Feb 16 13:53:14 fbsd11 kernel: #10 0xffffffff820a361d at gfs_vop_lookup+0x1d
Feb 16 13:53:14 fbsd11 kernel: #11 0xffffffff8216af75 at zfsctl_root_lookup+0xf5
Feb 16 13:53:14 fbsd11 kernel: #12 0xffffffff8216be13 at zfsctl_umount_snapshots+0x83
Feb 16 13:53:14 fbsd11 kernel: #13 0xffffffff82184cfb at zfs_umount+0x7b
Feb 16 13:53:14 fbsd11 kernel: #14 0xffffffff80acfeb0 at dounmount+0x530
Feb 16 13:53:14 fbsd11 kernel: #15 0xffffffff80acf8ed at sys_unmount+0x35d
Feb 16 13:53:14 fbsd11 kernel: #16 0xffffffff80e6f15b at amd64_syscall+0x2db
Feb 16 13:53:14 fbsd11 kernel: #17 0xffffffff80e4ed5b at Xfast_syscall+0xfb
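
Both reports show the same pair of lock classes: `zfs` is held (taken in vfs_mount.c) when `zfs_gfs` is acquired (in gfs.c:494). A lock order reversal is reported when an acquisition contradicts an order recorded earlier. The toy checker below illustrates that idea only. It is a simplified sketch, not the kernel's WITNESS implementation: it tracks direct pairs rather than a full order graph, and the earlier `zfs_gfs` then `zfs` acquisition is an assumption implied by the report, not something shown in the logs. The class name `LockOrderChecker` is hypothetical.

```python
class LockOrderChecker:
    """Toy lock order checker: remembers which lock class was taken before which."""

    def __init__(self):
        # Set of (held_first, acquired_later) pairs observed so far.
        self.seen = set()

    def acquire(self, held, new):
        """Record acquiring `new` while `held` are held; return any reversals."""
        reversals = []
        for h in held:
            if (new, h) in self.seen:
                # The opposite order was recorded earlier: report a reversal.
                reversals.append((h, new))
            else:
                self.seen.add((h, new))
        return reversals


checker = LockOrderChecker()

# Assumed earlier code path: zfs_gfs held, then zfs acquired.
first = checker.acquire(["zfs_gfs"], "zfs")

# Unmount path from the reports: zfs held, then zfs_gfs acquired.
second = checker.acquire(["zfs"], "zfs_gfs")

print("first path reversals:", first)
print("second path reversals:", second)
```

With two threads taking the two locks in opposite orders at the same time, each can end up waiting for the lock the other holds, which is why such a reversal is flagged even when no deadlock has occurred yet.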