Bug 229958 - ctld / zfs deadlock with 'zfs rename ...'
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.2-RELEASE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-fs mailing list
Reported: 2018-07-22 10:15 UTC by Harald Schmalzbauer
Modified: 2018-12-12 18:40 UTC

Description Harald Schmalzbauer 2018-07-22 10:15:49 UTC
Hello,

'zfs rename poolA/iscsivolumeX poolA/iscsiVOLx' locks up the ZFS pool poolA, if /dev/zvol/poolA/iscsivolumeX is used as target LUN path in ctld(8).

The machine remains operational and datasets from other pools are accessible, but no dataset of poolA is.
Reboot doesn't succeed; the machine hangs after printing:
"Syncing disks, vnodes remaining... 1 1 1 0 0 ..... done"
"All buffers synced."
  HANG

I'm quite sure the pool lockup wasn't an issue in FreeBSD 10, since I often forgot to disable zvol ctld(8) backends before renaming the dataset.
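
For reference, roughly how this gets triggered here (a minimal sketch; the target name, volume size and ctl.conf details are placeholders, not my actual configuration):

# /etc/ctl.conf exports the zvol as a LUN, e.g.:
#   target iqn.2018-07.net.example:poola {
#       lun 0 { path /dev/zvol/poolA/iscsivolumeX }
#   }
zfs create -V 10G poolA/iscsivolumeX
service ctld restart
zfs rename poolA/iscsivolumeX poolA/iscsiVOLx    # poolA locks up here

Removing the LUN from ctld(8) first (or commenting it out in ctl.conf and reloading ctld) before the rename should avoid the lockup.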

Is this in any way related to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220175 ?

And perhaps my other big ctld(8) problem is related too:
https://lists.freebsd.org/pipermail/freebsd-stable/2018-July/089268.html

Sorry for combining problem descriptions, but I'm observing severe iSCSI regressions on all systems I updated to 11.2.  Unfortunately initiators were updated too, so the root cause is not clear.

Thanks,

-harry
Comment 1 emz 2018-12-04 12:32:14 UTC
I'm getting this periodically too. The problem first appeared on 11.x, and persists on 12.x.

I was able to take a crash dump by invoking a panic manually; the compressed crash dump, along with the kernel binaries, is available at

http://82.194.234.25/crash/

Please let me know if I can help with debugging this in any possible way.

The ZFS-related part of the process list at the moment of the lockup:

[root@san01:/tmp]# ps ax | grep zfs
   39  -  DL     5003:13,15 [zfskern]
74934  -  S         0:31,64 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74946  -  SN        0:02,43 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74947  -  I         1:15,30 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74948  -  I         1:16,98 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74949  -  I         0:48,78 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74950  -  I         0:51,96 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74951  -  I         0:52,85 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74952  -  I         0:51,56 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74953  -  I         0:25,39 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74954  -  I         0:49,74 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74955  -  I         1:01,79 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74956  -  I         0:49,77 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74957  -  I         0:55,23 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74958  -  I         0:56,99 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74959  -  I         0:48,99 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74960  -  I         0:51,11 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74961  -  I         1:42,39 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
74962  -  I         1:01,57 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-api.p
75196  -  S         0:13,66 /usr/local/bin/uwsgi-2.7 --master --uid zfsreplica --gid www --pidfile /var/run/uwsgi-repli
93143  -  I         0:00,01 /usr/local/bin/sudo /sbin/zfs clone data/kvm/desktop/desktop-master-m40-woefi-q35@desktop3 
93144  -  I         0:00,01 /usr/local/bin/sudo /sbin/zfs clone data/kvm/desktop/desktop-master-m40-woefi-q35-lp@deskto
93146  -  D         0:00,01 /sbin/zfs clone data/kvm/desktop/desktop-master-m40-woefi-q35@desktop3 data/kvm/desktop/des
93147  -  D         0:00,01 /sbin/zfs clone data/kvm/desktop/desktop-master-m40-woefi-q35-lp@desktop3 data/kvm/desktop/
93148  -  I         0:00,01 /usr/local/bin/sudo /sbin/zfs destroy data/kvm/desktop/games-vm45
93149  -  D         0:00,01 /sbin/zfs destroy data/kvm/desktop/games-vm45
93219  -  I         0:00,01 /usr/local/bin/sudo /sbin/zfs destroy data/kvm/desktop/desktop-vm149
93221  -  D         0:00,01 /sbin/zfs destroy data/kvm/desktop/desktop-vm149
93251  -  D         0:00,00 zfs list -t snapshot
93253  -  D         0:00,00 zfs list -t snapshot
93254  -  D         0:00,00 zfs list -t snapshot
93257  -  D         0:00,00 zfs list -t snapshot
93814  -  I         0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
93815  -  D         0:00,00 /sbin/zfs list -t all
94172  -  D         0:00,00 zfs list -t snapshot
94173  -  D         0:00,00 zfs list -t snapshot
94174  -  D         0:00,00 zfs list -t snapshot
94175  -  D         0:00,00 zfs list -t snapshot
94657  -  I         0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
94658  -  D         0:00,00 /sbin/zfs list -t all
94968  -  I         0:00,01 /usr/local/bin/sudo /sbin/zfs destroy data/reference@ver2_4457
94969  -  D         0:00,00 /sbin/zfs destroy data/reference@ver2_4457
95050  -  I         0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
95051  -  D         0:00,00 /sbin/zfs list -t all
95103  -  D         0:00,00 zfs list -t snapshot
95112  -  D         0:00,00 zfs list -t snapshot
95113  -  D         0:00,00 zfs list -t snapshot
95114  -  D         0:00,00 zfs list -t snapshot
95236  -  I         0:00,01 /usr/local/bin/sudo /sbin/zfs destroy data/reference@ver2_4724
95237  -  D         0:00,00 /sbin/zfs destroy data/reference@ver2_4724
95542  -  I         0:00,01 /usr/local/bin/sudo /sbin/zfs destroy data/reference@ver2_4978
95543  -  D         0:00,00 /sbin/zfs destroy data/reference@ver2_4978
95840  -  I         0:00,01 /usr/local/bin/sudo /sbin/zfs destroy data/reference@ver2_5044
95841  -  D         0:00,00 /sbin/zfs destroy data/reference@ver2_5044
96069  -  D         0:00,00 zfs list -t snapshot
96078  -  D         0:00,00 zfs list -t snapshot
96079  -  D         0:00,00 zfs list -t snapshot
96081  -  D         0:00,00 zfs list -t snapshot
95137  9  D+        0:00,00 zfs list
93187  7  D+        0:00,01 zfs rename data/kvm/desktop/desktop-master-m40-lp@desktop3 data/kvm/desktop/desktop-master-
96210  2  S+        0:00,00 grep zfs
Comment 2 Harald Schmalzbauer 2018-12-05 14:18:41 UTC
nginx/1.10.3 still returns 403 when trying to fetch http://82.194.234.25/crash/
Not that I'd be able to fix it with the help of the dump ;-) :-( just for info.

-harry
Comment 3 emz 2018-12-05 15:06:08 UTC
Oh, thanks a lot for noticing!

I think I fixed that; could you please recheck?
Comment 4 Andriy Gapon freebsd_committer 2018-12-06 08:29:51 UTC
(In reply to emz from comment #1)
I think that the problem is with this thread:
(kgdb) tid 108285
(kgdb) bt
#0  sched_switch (td=0xfffff80130a82000, newtd=0xfffff8048008c000, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2112
#1  0xffffffff80bac7c1 in mi_switch (flags=<optimized out>, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:439
#2  0xffffffff80bfa9cc in sleepq_wait (wchan=<unavailable>, pri=<unavailable>) at /usr/src/sys/kern/subr_sleepqueue.c:692
#3  0xffffffff80b3b6b2 in _cv_wait (cvp=0xfffff8001b148a98, lock=0xfffff8001b148a28) at /usr/src/sys/kern/kern_condvar.c:146
#4  0xffffffff8270f575 in txg_wait_synced (dp=0xfffff8001b148800, txg=38513569) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c:659
#5  0xffffffff827448ab in zil_close (zilog=0xfffff8036f227400) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c:2975
#6  0xffffffff8277f4b5 in zvol_last_close (zv=0xfffff80535721800) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c:893
#7  0xffffffff8278089e in zvol_d_close (dev=<optimized out>, flags=131075, fmt=<optimized out>, td=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c:3138
#8  0xffffffff80a57f0a in devfs_close (ap=<optimized out>) at /usr/src/sys/fs/devfs/devfs_vnops.c:650
#9  0xffffffff811fefc8 in VOP_CLOSE_APV (vop=<optimized out>, a=0xfffffe010d6f9518) at vnode_if.c:534
#10 0xffffffff80c8068f in VOP_CLOSE (vp=<optimized out>, fflag=<optimized out>, cred=<optimized out>, td=<optimized out>) at ./vnode_if.h:225
#11 vn_close1 (vp=0xfffff803ba0705a0, flags=3, file_cred=0x0, td=0xfffff80130a82000, keep_ref=false) at /usr/src/sys/kern/vfs_vnops.c:454
#12 0xffffffff82c3b6d3 in ctl_be_block_close (be_lun=<optimized out>) at /usr/src/sys/cam/ctl/ctl_backend_block.c:2102
#13 ctl_be_block_rm (softc=<optimized out>, req=<optimized out>) at /usr/src/sys/cam/ctl/ctl_backend_block.c:2505
#14 ctl_be_block_ioctl (dev=<optimized out>, cmd=<optimized out>, addr=0xfffff80057162400 "block", flag=<optimized out>, td=<optimized out>) at /usr/src/sys/cam/ctl/ctl_backend_block.c:1794
#15 0xffffffff82c34ba6 in ctl_ioctl (dev=0xfffff804a017f200, cmd=<optimized out>, addr=<optimized out>, flag=3, td=0xfffff80130a82000) at /usr/src/sys/cam/ctl/ctl.c:2967
#16 0xffffffff80a5810d in devfs_ioctl (ap=0xfffffe010d6f9718) at /usr/src/sys/fs/devfs/devfs_vnops.c:807
#17 0xffffffff811ffa6e in VOP_IOCTL_APV (vop=<optimized out>, a=0xfffffe010d6f9718) at vnode_if.c:1067
#18 0xffffffff80c7f0e4 in VOP_IOCTL (vp=<optimized out>, command=<optimized out>, data=<optimized out>, fflag=<unavailable>, cred=<optimized out>, td=<optimized out>) at ./vnode_if.h:448
#19 vn_ioctl (fp=0xfffff8001bd82370, com=<optimized out>, data=0xfffff80057162400, active_cred=0xfffff801b448e400, td=<unavailable>) at /usr/src/sys/kern/vfs_vnops.c:1490
#20 0xffffffff80a586ef in devfs_ioctl_f (fp=<unavailable>, com=<unavailable>, data=<unavailable>, cred=<unavailable>, td=0xfffff80130a82000) at /usr/src/sys/fs/devfs/devfs_vnops.c:765
#21 0xffffffff80c0a36d in fo_ioctl (fp=<optimized out>, com=<optimized out>, active_cred=<unavailable>, td=<optimized out>, data=<optimized out>) at /usr/src/sys/sys/file.h:330
#22 kern_ioctl (td=0xfffff80130a82000, fd=3, com=3244876065, data=<unavailable>) at /usr/src/sys/kern/sys_generic.c:800
#23 0xffffffff80c0a08e in sys_ioctl (td=0xfffff80130a82000, uap=0xfffff80130a823c0) at /usr/src/sys/kern/sys_generic.c:712
#24 0xffffffff81079f92 in syscallenter (td=<optimized out>) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135
#25 amd64_syscall (td=0xfffff80130a82000, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1154

The problem is that it holds the spa_namespace_lock while waiting for a TXG sync:
(kgdb) p spa_namespace_lock
$1 = {lock_object = {lo_name = 0xffffffff828037f2 <.L.str.48+1> "spa_namespace_lock", lo_flags = 577830912, lo_data = 0, lo_witness = 0x0}, sx_lock = 18446735282727821316}
(kgdb) p/x $1
$2 = {lock_object = {lo_name = 0xffffffff828037f2, lo_flags = 0x22710000, lo_data = 0x0, lo_witness = 0x0}, sx_lock = 0xfffff80130a82004}
(kgdb) p ((struct thread *)0xfffff80130a82000)->td_tid
$3 = 108285
(kgdb) tid 108285

Now let's take a look at what the sync thread is doing:
(kgdb) fr 4
#4  0xffffffff8270f575 in txg_wait_synced (dp=0xfffff8001b148800, txg=38513569) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c:659
(kgdb) p dp->dp_tx.tx_sync_thread->td_tid
$10 = 101227
(kgdb) tid $10
(kgdb) bt
#0  sched_switch (td=0xfffff8049493c000, newtd=0xfffff8000362e580, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2112
#1  0xffffffff80bac7c1 in mi_switch (flags=<optimized out>, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:439
#2  0xffffffff80bfa9cc in sleepq_wait (wchan=<unavailable>, pri=<unavailable>) at /usr/src/sys/kern/subr_sleepqueue.c:692
#3  0xffffffff80baac3c in _sx_xlock_hard (sx=0xffffffff8282e3b8 <spa_namespace_lock>, x=<optimized out>, opts=<optimized out>) at /usr/src/sys/kern/kern_sx.c:857
#4  0xffffffff827803ab in __sx_xlock (opts=0, line=0, sx=<optimized out>, td=<optimized out>, file=<optimized out>) at /usr/src/sys/sys/sx.h:168
#5  zvol_rename_minors (oldname=0xfffff8024f1da800 "data/kvm/desktop/desktop-master-m40-lp@desktop3", newname=0xfffff80249a2a400 "data/kvm/desktop/desktop-master-m40-lp@desktop1")
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c:3045
#6  0xffffffff826d32bf in dsl_dataset_rename_snapshot_sync_impl (dp=0xfffff8001b148800, hds=0xfffff80108c93000, arg=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c:2533
#7  0xffffffff826cf4d4 in dsl_dataset_rename_snapshot_sync (arg=0xfffffe010d7c6578, tx=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c:2557
#8  0xffffffff826d964e in dsl_sync_task_sync (dst=0xfffffe010d7c64a8, tx=0xfffff8068cdac300) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_synctask.c:234
#9  0xffffffff826d7ffb in dsl_pool_sync (dp=<optimized out>, txg=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c:819
#10 0xffffffff82701675 in spa_sync (spa=<optimized out>, txg=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:8088
#11 0xffffffff8270efc8 in txg_sync_thread (arg=0xfffff8001b148800) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c:543
#12 0xffffffff80b61e23 in fork_exit (callout=0xffffffff8270edc0 <txg_sync_thread>, arg=0xfffff8001b148800, frame=0xfffffe00a6d5cac0) at /usr/src/sys/kern/kern_fork.c:1057
#13 <signal handler called>

So, the sync thread is blocked on the spa_namespace_lock as suspected.
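
To spell out the dependency cycle between the two threads above:

  tid 108285 (ctl(4) LUN-removal ioctl, ctl_be_block_rm()):
      holds  spa_namespace_lock
      waits  in txg_wait_synced() for txg 38513569
  tid 101227 (txg_sync_thread):
      syncs  that txg, which runs dsl_dataset_rename_snapshot_sync_impl()
      waits  in zvol_rename_minors() for spa_namespace_lock

Neither thread can proceed, and every other operation on the pool queues up behind them.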
Comment 5 Andriy Gapon freebsd_committer 2018-12-06 08:37:09 UTC
I'd say that there is a whole can of worms of problems related to ZVOL support in FreeBSD.
One issue is that we use spa_namespace_lock as zfsdev_state_lock.
Another is that we call functions like zvol_rename_minors() from a syncing context.
The locking logic in ZVOL code is really non-obvious because of all the interactions, including GEOM.

Once upon a time, I tried to solve one part of the problem with this change: https://reviews.freebsd.org/D7179
But I am not sure if that's the right solution.
Comment 6 Dave Baukus 2018-12-11 17:57:24 UTC
Since upgrading to 12.0, I've been hitting this deadlock from at least two zfs test suite tests (zfs_rename_test:zfs_rename_007_pos and history_test:history_002_pos) where zvols get renamed.

Curiously, these tests have not deadlocked when executed independently, only when executed via our CI infrastructure. These tests have not been a problem heretofore.

I can provide detailed backtraces if it's of interest.
Comment 7 Andriy Gapon freebsd_committer 2018-12-12 07:43:15 UTC
(In reply to Dave Baukus from comment #6)
Yes, please.  Especially if you see anything different from comment #4.
Comment 8 Dave Baukus 2018-12-12 18:21:54 UTC
(In reply to Andriy Gapon from comment #7)
I have several deadlocks that all involve zvol and the spa_namespace_lock; the following does not involve a rename:

The spa_namespace_lock holder blocked on the dp->dp_config_rwlock
------------------------------------------------------------------
Thread 1745 (Thread 100163):
(struct thread *)0xfffff80003cdc580, tid 100163
   geom :: (struct proc *)0xfffff80003d6ba50, pid 13

STACK:
#0  sched_switch (td=0xfffff80003cdc580, newtd=0xfffff80003acf580, flags=<optimized out>) at sys/kern/sched_ule.c:2112
#1  0xffffffff80513d6f in mi_switch (flags=260, newtd=0x0) at sys/kern/kern_synch.c:439
#2  0xffffffff8055e04d in sleepq_switch (wchan=0xfffff80e53a8ecc8, pri=0) at sys/kern/subr_sleepqueue.c:613
#3  0xffffffff8055def3 in sleepq_wait (wchan=0xfffff80e53a8ecc8, pri=0) at sys/kern/subr_sleepqueue.c:692
#4  0xffffffff804a4b73 in _cv_wait (cvp=0xfffff80e53a8ecc8, lock=<optimized out>) at sys/kern/kern_condvar.c:146
#5  0xffffffff8183a54b in rrw_enter_read_impl (rrl=0xfffff80e53a8eca8, prio=0, tag=0xffffffff8190d1da) at fs/zfs/rrwlock.c:181
#6  0xffffffff8181eb17 in dsl_pool_config_enter (dp=<optimized out>, tag=0xffffffff8190d1da) at fs/zfs/dsl_pool.c:1338
#7  dsl_pool_hold (name=<optimized out>, tag=0xffffffff8190d1da, dp=0xfffffe010e5a9848) at fs/zfs/dsl_pool.c:1310
#8  0xffffffff817f8991 in dmu_objset_hold (name=0xfffff80546b1a810 "testpool.1470/newvol", tag=0xffffffff8190d1da, osp=0xfffffe010e5a9888) at fs/zfs/dmu_objset.c:626
#9  0xffffffff81823138 in dsl_prop_get (dsname=<unavailable>, propname=0xffffffff8192fda6 "readonly", intsz=8, numints=1, buf=<optimized out>, setpoint=0x0) at fs/zfs/dsl_prop.c:320
#10 dsl_prop_get_integer (ddname=<unavailable>, propname=0xffffffff8192fda6 "readonly", valuep=0xfffffe010e5a9908, setpoint=0x0) at fs/zfs/dsl_prop.c:343
#11 0xffffffff818dc3fc in zvol_first_open (zv=0xfffff80546b1a800) at fs/zfs/zvol.c:880
#12 0xffffffff818dbdbb in zvol_open (pp=<optimized out>, count=<optimized out>, flag=<optimized out>) at fs/zfs/zvol.c:1198
#13 zvol_geom_access (pp=0xfffff8062add8c00, acr=<optimized out>, acw=0, ace=<optimized out>) at fs/zfs/zvol.c:2737
#14 0xffffffff804809f9 in g_access (cp=0xfffff804ef224000, dcr=<optimized out>, dcw=0, dce=0) at sys/geom/geom_subr.c:977
#15 0xffffffff80486974 in g_part_taste (mp=0xffffffff80a7ebf0 <g_part_class>, pp=0xfffff8062add8c00, flags=<optimized out>) at sys/geom/part/g_part.c:1982
#16 0xffffffff8048158a in g_new_provider_event (arg=0xfffff8062add8c00, flag=<optimized out>) at sys/geom/geom_subr.c:557
#17 0xffffffff8047c3b4 in one_event () at sys/geom/geom_event.c:268
#18 g_run_events () at sys/geom/geom_event.c:291
#19 0xffffffff804ca144 in fork_exit (callout=0xffffffff8047e720 <g_event_procbody>, arg=0x0, frame=0xfffffe010e5a9ac0) at sys/kern/kern_fork.c:1057
#20 <signal handler called>
========================================

The spa sync thread is blocked on the spa_namespace_lock:
------------------------------------------------------------
Thread 1774 (Thread 104418):
(struct thread *)0xfffff807bb1af580, tid 104418
   zfskern :: (struct proc *)0xfffff80005942528, pid 15

STACK:
#0  sched_switch (td=0xfffff807bb1af580, newtd=0xfffff80003ae7000, flags=<optimized out>) at sys/kern/sched_ule.c:2112
#1  0xffffffff80513d6f in mi_switch (flags=260, newtd=0x0) at sys/kern/kern_synch.c:439
#2  0xffffffff8055e04d in sleepq_switch (wchan=0xffffffff819aae98 <spa_namespace_lock>, pri=0) at sys/kern/subr_sleepqueue.c:613
#3  0xffffffff8055def3 in sleepq_wait (wchan=0xffffffff819aae98 <spa_namespace_lock>, pri=0) at sys/kern/subr_sleepqueue.c:692
#4  0xffffffff80511af6 in _sx_xlock_hard (sx=0xffffffff819aae98 <spa_namespace_lock>, x=<optimized out>, opts=<optimized out>, file=<optimized out>, line=<optimized out>) at sys/kern/kern_sx.c:857
#5  0xffffffff805115e1 in _sx_xlock (sx=0xffffffff819aae98 <spa_namespace_lock>, opts=0, file=0xffffffff818ffa89 "fs/zfs/dsl_dataset.c", line=3086) at sys/kern/kern_sx.c:328
#6  0xffffffff81815cc4 in dsl_dataset_promote_sync (arg=0xfffffe02597833d0, tx=0xfffff80c74143000) at fs/zfs/dsl_dataset.c:3086
#7  0xffffffff8181f2d9 in dsl_sync_task_sync (dst=0xfffffe0259783308, tx=0xfffff80c74143000) at fs/zfs/dsl_synctask.c:234
#8  0xffffffff8181d8db in dsl_pool_sync (dp=0xfffff80e53a8e800, txg=<optimized out>) at fs/zfs/dsl_pool.c:819
#9  0xffffffff8184bfe3 in spa_sync (spa=<optimized out>, txg=<optimized out>) at fs/zfs/spa.c:8124
#10 0xffffffff8185b206 in txg_sync_thread (arg=0xfffff80e53a8e800) at fs/zfs/txg.c:543
#11 0xffffffff804ca144 in fork_exit (callout=0xffffffff8185af20 <txg_sync_thread>, arg=0xfffff80e53a8e800, frame=0xfffffe039cc11ac0) at sys/kern/kern_fork.c:1057
#12 <signal handler called>

Pending on lock:
SX_LOCK: spa_namespace_lock, (struct sx *)0xffffffff819aae98
$5 = {
  lock_object = {
    lo_name = 0xffffffff8191dd6a,
    lo_flags = 0x22710000,
    lo_data = 0x0,
    lo_witness = 0x0
  },
  sx_lock = 0xfffff80003cdc584
}
LOCK HOLDER:
(struct thread *)0xfffff80003cdc580, tid 100163
   geom :: (struct proc *)0xfffff80003d6ba50, pid 13
========================================

Looking for the holder of the dp->dp_config_rwlock that the spa_namespace_lock holder is blocked on:

From the stack of Thread 1774:
set $dp = (dsl_pool_t *)0xfffff80e53a8e800

p $dp->dp_config_rwlock.rr_writer
$7 = (kthread_t *) 0xfffff807bb1af580

set $th = (struct thread *)0xfffff807bb1af580
p $th->td_tid
$8 = 104418

The spa sync thread holds $dp->dp_config_rwlock for write and wants the spa_namespace_lock, while the GEOM "one_event" thread holds the spa_namespace_lock and wants $dp->dp_config_rwlock for read.
Comment 9 Dave Baukus 2018-12-12 18:40:36 UTC
(In reply to Andriy Gapon from comment #7)

I believe this one involves a rename, but I no longer have the crash dump to verify:

spa_namespace_lock holder waiting for txg 185:
----------------------------------------------

Thread 1740 (Thread 100162):
(struct thread *)0xfffff80003d12000, tid 100162
   geom :: (struct proc *)0xfffff80003d98a50, pid 13

STACK:
#0  sched_switch (td=0xfffff80003d12000, newtd=0xfffff80003ae4580, flags=<optimized out>) at sys/kern/sched_ule.c:2112
#1  0xffffffff80513d6f in mi_switch (flags=260, newtd=0x0) at sys/kern/kern_synch.c:439
#2  0xffffffff8055e04d in sleepq_switch (wchan=0xfffff80c13b6c298, pri=0) at sys/kern/subr_sleepqueue.c:613
#3  0xffffffff8055def3 in sleepq_wait (wchan=0xfffff80c13b6c298, pri=0) at sys/kern/subr_sleepqueue.c:692
#4  0xffffffff804a4b73 in _cv_wait (cvp=0xfffff80c13b6c298, lock=<optimized out>) at sys/kern/kern_condvar.c:146
#5  0xffffffff8165f7ec in txg_wait_synced (dp=0xfffff80c13b6c000, txg=185) at fs/zfs/txg.c:659
#6  0xffffffff8169cd44 in zil_close (zilog=0xfffff80b2f00e400) at fs/zfs/zil.c:2975
#7  0xffffffff816e06c5 in zvol_last_close (zv=0xfffff8000b3fc000) at fs/zfs/zvol.c:893
#8  0xffffffff816dfcc4 in zvol_close (flag=0, pp=<optimized out>, count=<optimized out>) at fs/zfs/zvol.c:1313
#9  zvol_geom_access (pp=0xfffff80df6772b00, acr=<optimized out>, acw=<optimized out>, ace=<optimized out>) at fs/zfs/zvol.c:2739
#10 0xffffffff804809f9 in g_access (cp=0xfffff80657520b00, dcr=<optimized out>, dcw=0, dce=0) at sys/geom/geom_subr.c:977
#11 0xffffffff80486ac6 in g_part_taste (mp=<optimized out>, pp=0xfffff80df6772b00, flags=<optimized out>) at sys/geom/part/g_part.c:2046
#12 0xffffffff8048158a in g_new_provider_event (arg=0xfffff80df6772b00, flag=<optimized out>) at sys/geom/geom_subr.c:557
#13 0xffffffff8047c3b4 in one_event () at sys/geom/geom_event.c:268
#14 g_run_events () at sys/geom/geom_event.c:291
#15 0xffffffff804ca144 in fork_exit (callout=0xffffffff8047e720 <g_event_procbody>, arg=0x0, frame=0xfffffe010e5a0ac0) at sys/kern/kern_fork.c:1057
#16 <signal handler called>
========================================

The spa sync thread wants the spa_namespace_lock and is blocked by the "geom one_event" thread:
-----------------------------------------------------------------------
Thread 1769 (Thread 104558):
(struct thread *)0xfffff80779d9b000, tid 104558
   zfskern :: (struct proc *)0xfffff80005973528, pid 15

STACK:
#0  sched_switch (td=0xfffff80779d9b000, newtd=0xfffff80003ae5580, flags=<optimized out>) at sys/kern/sched_ule.c:2112
#1  0xffffffff80513d6f in mi_switch (flags=260, newtd=0x0) at sys/kern/kern_synch.c:439
#2  0xffffffff8055e04d in sleepq_switch (wchan=0xffffffff817aee98 <spa_namespace_lock>, pri=0) at sys/kern/subr_sleepqueue.c:613
#3  0xffffffff8055def3 in sleepq_wait (wchan=0xffffffff817aee98 <spa_namespace_lock>, pri=0) at sys/kern/subr_sleepqueue.c:692
#4  0xffffffff80511af6 in _sx_xlock_hard (sx=0xffffffff817aee98 <spa_namespace_lock>, x=<optimized out>, opts=<optimized out>, file=<optimized out>, line=<optimized out>) at sys/kern/kern_sx.c:857
#5  0xffffffff805115e1 in _sx_xlock (sx=0xffffffff817aee98 <spa_namespace_lock>, opts=0, file=0xffffffff81703a89 "fs/zfs/dsl_dataset.c", line=3086) at sys/kern/kern_sx.c:328
#6  0xffffffff81619cc4 in dsl_dataset_promote_sync (arg=0xfffffe01e0c3e3d0, tx=0xfffff80ccfc86500) at fs/zfs/dsl_dataset.c:3086
#7  0xffffffff816232d9 in dsl_sync_task_sync (dst=0xfffffe01e0c3e308, tx=0xfffff80ccfc86500) at fs/zfs/dsl_synctask.c:234
#8  0xffffffff816218db in dsl_pool_sync (dp=0xfffff80c13b6c000, txg=<optimized out>) at fs/zfs/dsl_pool.c:819
#9  0xffffffff8164ffe3 in spa_sync (spa=<optimized out>, txg=<optimized out>) at fs/zfs/spa.c:8124
#10 0xffffffff8165f206 in txg_sync_thread (arg=0xfffff80c13b6c000) at fs/zfs/txg.c:543
#11 0xffffffff804ca144 in fork_exit (callout=0xffffffff8165ef20 <txg_sync_thread>, arg=0xfffff80c13b6c000, frame=0xfffffe025a97aac0) at sys/kern/kern_fork.c:1057

Pending on lock:
SX_LOCK: spa_namespace_lock, (struct sx *)0xffffffff817aee98
$5 = {
  lock_object = {
    lo_name = 0xffffffff81721d6a,
    lo_flags = 0x22710000,
    lo_data = 0x0,
    lo_witness = 0x0
  },
  sx_lock = 0xfffff80003d12004
}
LOCK HOLDER:
(struct thread *)0xfffff80003d12000, tid 100162
   geom :: (struct proc *)0xfffff80003d98a50, pid 13