Bug 163770 - [zfs] [hang] LOR between zfs&syncer + vnlru leading to ZFS deadlock
Summary: [zfs] [hang] LOR between zfs&syncer + vnlru leading to ZFS deadlock
Status: Closed (Overcome By Events)
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: Unspecified
Hardware: Any
OS: Any
Importance: Normal Affects Only Me
Assignee: Bugmeister
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-01-02 08:20 UTC by Boris Lytochkin
Modified: 2025-01-19 06:20 UTC
CC List: 1 user

See Also:


Description Boris Lytochkin 2012-01-02 08:20:08 UTC
Deadlocks are seen periodically; most of them can be triggered with a low kern.maxvnodes and a run of /etc/periodic/security/100.chksetuid (which does a big find(1) across the filesystems).
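For reference, the trigger amounts to roughly the following; the kern.maxvnodes value is illustrative only, and the point at which the hang becomes reproducible differs per system:

# shrink the vnode cache so vnlru has to reclaim aggressively (value is an example)
sysctl kern.maxvnodes=100000
# the nightly setuid audit then walks every file on the machine
sh /etc/periodic/security/100.chksetuid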
The LOR itself, as seen a couple of minutes after the server boots into multiuser mode:
--- syscall (5, FreeBSD ELF64, open), rip = 0x800f8666c, rsp = 0x7fffffffe8d8, rbp = 0x1b0 ---
lock order reversal:
 1st 0xffffff00508c1098 syncer (syncer) @ /usr/src/sys/kern/vfs_subr.c:1737
 2nd 0xffffff108f8abba8 zfs (zfs) @ /usr/src/sys/kern/vfs_subr.c:2137
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
_witness_debugger() at _witness_debugger+0x2c
witness_checkorder() at witness_checkorder+0x651
__lockmgr_args() at __lockmgr_args+0xb98
vop_stdlock() at vop_stdlock+0x39
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x52
_vn_lock() at _vn_lock+0x47
vget() at vget+0x56
vfs_msync() at vfs_msync+0xa5
sync_fsync() at sync_fsync+0x12a
VOP_FSYNC_APV() at VOP_FSYNC_APV+0x4a
sync_vnode() at sync_vnode+0x157
sched_sync() at sched_sync+0x1b1
fork_exit() at fork_exit+0x11d
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff9b0b285d00, rbp = 0 ---

When the deadlock appears, the syncer gets stuck in zilog sync:
procstat:   20 100179 vnlru            -                mi_switch sleepq_timedwait _sleep zfs_zget zfs_get_data zil_commit zfs_freebsd_write VOP_WRITE_APV vnode_pager_generic_putpages VOP_PUTPAGES_APV vnode_pager_putpages vm_pageout_flush vm_object_page_collect_flush vm_object_page_clean vm_object_terminate vnode_destroy_vobject zfs_freebsd_reclaim VOP_RECLAIM_APV
procstat:   21 100180 syncer           -                mi_switch sleepq_wait _cv_wait zil_commit zfs_sync sync_fsync VOP_FSYNC_APV sync_vnode sched_sync fork_exit fork_trampoline
debug.procstat:63429 101808 rrdtool          -                mi_switch sleepq_catch_signals sleepq_wait_sig _sleep pipe_read dofileread kern_readv read syscallenter syscall Xfast_syscall
procstat:63430 101562 rrdtool          -                mi_switch sleepq_catch_signals sleepq_wait_sig _sleep pipe_read dofileread kern_readv read syscallenter syscall Xfast_syscall

ps:    0    20     0   0  46  0     0    16 zcolli DL    ??    1:36.89 [vnlru]
ps:    0    21     0   0  51  0     0    16 zilog- DL    ??  151:27.35 [syncer]
ps:   80 63429 63419   0  48  0 13280  2032 piperd I     ??    2:36.48 /usr/local/bin/rrdtool -
ps:   80 63430 63419   0  76  0 13280  1712 piperd I     ??    0:00.01 /usr/local/bin/rrdtool -
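(For anyone debugging the same hang: the output above can be collected with roughly the following commands; the PIDs are the ones from this report.)

procstat -kk 20 21   # kernel stack traces of vnlru and syncer
ps -axlww            # wait channels: zcolli, zilog-, piperd
# and, once dropped into DDB:
#   show alllocks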

At this moment these locks are held:
db> show alllocks
Process 69468 (sshd) thread 0xffffff0f49769460 (102462)
exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xffffff131329a0f8) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148
Process 63419 (php) thread 0xffffff115a574000 (102314)
shared lockmgr zfs (zfs) r = 0 (0xffffff0e7822e448) locked @ /usr/src/sys/kern/vfs_subr.c:2137
Process 21 (syncer) thread 0xffffff0027d96000 (100180)
exclusive lockmgr syncer (syncer) r = 0 (0xffffff0045d94098) locked @ /usr/src/sys/kern/vfs_subr.c:1737
Process 20 (vnlru) thread 0xffffff0027d96460 (100179)
exclusive lockmgr zfs (zfs) r = 0 (0xffffff140637f620) locked @ /usr/src/sys/kern/vfs_subr.c:2249

Excerpts from vfs_subr.c around the lines referenced above:
1732:
                vdrop(vp);
                mtx_lock(&sync_mtx);
                return (*bo == LIST_FIRST(slp));
        }
        vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
        (void) VOP_FSYNC(vp, MNT_LAZY, td);
        VOP_UNLOCK(vp, 0);
        vn_finished_write(mp);
        BO_LOCK(*bo);
        if (((*bo)->bo_flag & BO_ONWORKLST) != 0) {
2132:

        if ((flags & LK_INTERLOCK) == 0)
                VI_LOCK(vp);
        vholdl(vp);
        if ((error = vn_lock(vp, flags | LK_INTERLOCK)) != 0) {
                vdrop(vp);
                CTR2(KTR_VFS, "%s: impossible to lock vnode %p", __func__,
                    vp);
                return (error);
        }
2244:
         */
        vp->v_iflag |= VI_OWEINACT;
        switch (func) {
        case VPUTX_VRELE:
                error = vn_lock(vp, LK_EXCLUSIVE | LK_INTERLOCK);
                VI_LOCK(vp);
                break;
        case VPUTX_VPUT:
                if (VOP_ISLOCKED(vp) != LK_EXCLUSIVE) {
                        error = VOP_LOCK(vp, LK_UPGRADE | LK_INTERLOCK |


The deadlock is observed per filesystem: other ZFS filesystems within the pool that contains the deadlocked filesystem keep working fine.

Fix: 

It seems that setting kern.maxvnodes to a higher value avoids this hang; at least 100.chksetuid can then be executed without triggering the deadlock.
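A sketch of that workaround; the value below is only an example, the right number depends on RAM and on how many vnodes the nightly find actually touches:

# raise the vnode limit at runtime
sysctl kern.maxvnodes=600000
# make it persistent across reboots
echo 'kern.maxvnodes=600000' >> /etc/sysctl.conf
# watch how close the system gets to the limit
sysctl vfs.numvnodes kern.maxvnodes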
How-To-Repeat: Create a ZFS filesystem with a huge number (~300k) of RRD files and update them periodically.
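In command form that amounts to roughly the following; the pool name "tank" and the RRD layout are assumptions, not taken from this report:

zfs create tank/rrd
cd /tank/rrd
# create ~300k small RRD files
for i in $(jot 300000); do
        rrdtool create $i.rrd --step 300 DS:val:GAUGE:600:0:U RRA:AVERAGE:0.5:1:600
done
# then update them periodically, e.g. every 5 minutes from cron
for f in *.rrd; do rrdtool update $f N:$(jot -r 1 0 100); done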
Comment 1 Mark Linimon 2012-01-02 13:39:00 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
Comment 2 Eitan Adler 2017-12-31 07:59:44 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped
Comment 3 Mark Linimon 2025-01-19 06:20:06 UTC
^Triage: I'm sorry that this PR did not get addressed in a timely fashion.

By now, the version that it was created against is long out of support.
As well, many newer versions of ZFS have been imported.

Please re-open if it is still a problem on a supported version.