Bug 257522 - msdosfs rename(2) deadlock
Summary: msdosfs rename(2) deadlock
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-07-31 18:05 UTC by Edward Tomasz Napierala
Modified: 2021-09-05 15:38 UTC (History)
4 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Edward Tomasz Napierala freebsd_committer 2021-07-31 18:05:23 UTC
The script below, when run twice in parallel, immediately deadlocks 14-CURRENT:


#!/bin/sh

while :; do
    mkdir a
    mv a b
    rmdir b
done

There are two locked vnodes, 0xfffff800be0d9000, locked by tid 101332, and 0xfffff8000818331c0, locked by tid 101333.  Backtraces:

(kgdb) thread 637
[Switching to thread 637 (Thread 101332)]
#0  sched_switch (td=td@entry=0xfffffe00d03da000, flags=<optimized out>, flags@entry=260) at /usr/home/trasz/git/freebsd/sys/kern/sched_ule.c:2097
2097                    cpuid = td->td_oncpu = PCPU_GET(cpuid);
(kgdb) bt
#0  sched_switch (td=td@entry=0xfffffe00d03da000, flags=<optimized out>, flags@entry=260) at /usr/home/trasz/git/freebsd/sys/kern/sched_ule.c:2097
#1  0xffffffff80c2e315 in mi_switch (flags=flags@entry=260) at /usr/home/trasz/git/freebsd/sys/kern/kern_synch.c:540
#2  0xffffffff80c7edd9 in sleepq_switch (wchan=0xfffff80081833230, wchan@entry=0xffffffff812567fc, pri=pri@entry=96) at /usr/home/trasz/git/freebsd/sys/kern/subr_sleepqueue.c:608
#3  0xffffffff80c7ec9e in sleepq_wait (wchan=<optimized out>, pri=<optimized out>) at /usr/home/trasz/git/freebsd/sys/kern/subr_sleepqueue.c:659
#4  0xffffffff80bf3110 in sleeplk (lk=lk@entry=0xfffff80081833230, flags=flags@entry=532480, ilk=<optimized out>, ilk@entry=0xfffff80081833258, wmesg=<optimized out>,
wmesg@entry=0xffffffff812567fc "msdosfs", pri=<optimized out>, pri@entry=96, timo=51, timo@entry=532480, queue=0) at /usr/home/trasz/git/freebsd/sys/kern/kern_lock.c:310
#5  0xffffffff80bf1602 in lockmgr_xlock_hard (lk=0xfffff80081833230, flags=<unavailable>, ilk=<unavailable>, file=<optimized out>, line=3009, lwa=<optimized out>)
at /usr/home/trasz/git/freebsd/sys/kern/kern_lock.c:940
#6  0xffffffff811ae37c in VOP_LOCK1_APV (vop=0xffffffff81aef040 <msdosfs_vnodeops>, a=a@entry=0xfffffe00d07fc5d8) at vnode_if.c:2247
#7  0xffffffff80d26a24 in VOP_LOCK1 (vp=0xfffff800818331c0, flags=532480, file=0xffffffff811dffc9 "/usr/home/trasz/git/freebsd/sys/kern/vfs_subr.c", line=3009) at ./vnode_if.h:1131
#8  _vn_lock (vp=vp@entry=0xfffff800818331c0, flags=flags@entry=532480, file=0xffffffff811dffc9 "/usr/home/trasz/git/freebsd/sys/kern/vfs_subr.c", line=line@entry=3009)
at /usr/home/trasz/git/freebsd/sys/kern/vfs_vnops.c:1805
#9  0xffffffff80d0e6fa in vget_finish (vp=0xfffff800818331c0, flags=532480, vs=vs@entry=VGET_USECOUNT) at /usr/home/trasz/git/freebsd/sys/kern/vfs_subr.c:3009
#10 0xffffffff80cea40e in cache_lookup (dvp=<optimized out>, dvp@entry=0xfffff800be0d9000, vpp=<optimized out>, vpp@entry=0xfffffe00d07fc8d8, cnp=<optimized out>,
cnp@entry=0xfffffe00d07fc900, tsp=tsp@entry=0x0, ticksp=<optimized out>, ticksp@entry=0x0) at /usr/home/trasz/git/freebsd/sys/kern/vfs_cache.c:2096
#11 0xffffffff80ceefab in vfs_cache_lookup (ap=<optimized out>) at /usr/home/trasz/git/freebsd/sys/kern/vfs_cache.c:3063
#12 0xffffffff811aab7a in VOP_LOOKUP_APV (vop=0xffffffff81aef040 <msdosfs_vnodeops>, a=a@entry=0xfffffe00d07fc770) at vnode_if.c:117
#13 0xffffffff80cffbc1 in VOP_LOOKUP (dvp=0xfffff800be0d9000, vpp=0xfffffe00d07fc8d8, cnp=0xfffffe00d07fc900) at ./vnode_if.h:69
#14 lookup (ndp=ndp@entry=0xfffffe00d07fc880) at /usr/home/trasz/git/freebsd/sys/kern/vfs_lookup.c:1131
#15 0xffffffff80cfe9f9 in namei (ndp=ndp@entry=0xfffffe00d07fc880) at /usr/home/trasz/git/freebsd/sys/kern/vfs_lookup.c:661
#16 0xffffffff80d1f0ed in kern_statat (td=0xfffffe00d03da000, td@entry=<unavailable>, flag=<optimized out>, fd=-100,
path=0x7fffffffe835 <error: Cannot access memory at address 0x7fffffffe835>, pathseg=pathseg@entry=UIO_USERSPACE, sbp=sbp@entry=0xfffffe00d07fc9d8, hook=0x0)
at /usr/home/trasz/git/freebsd/sys/kern/vfs_syscalls.c:2441
#17 0xffffffff80d1f7df in sys_fstatat (td=<unavailable>, td@entry=<error reading variable: value is not available>, uap=0xfffffe00d03da3e8,
uap@entry=<error reading variable: value is not available>) at /usr/home/trasz/git/freebsd/sys/kern/vfs_syscalls.c:2418
#18 0xffffffff810e4b0e in syscallenter (td=<optimized out>) at /usr/home/trasz/git/freebsd/sys/amd64/amd64/../../kern/subr_syscall.c:189
#19 amd64_syscall (td=0xfffffe00d03da000, traced=0) at /usr/home/trasz/git/freebsd/sys/amd64/amd64/trap.c:1186


(kgdb) thread 636
[Switching to thread 636 (Thread 101333)]
#0  sched_switch (td=td@entry=0xfffffe00d057f3a0, flags=<optimized out>, flags@entry=260) at /usr/home/trasz/git/freebsd/sys/kern/sched_ule.c:2097
2097                    cpuid = td->td_oncpu = PCPU_GET(cpuid);
(kgdb) bt
#0  sched_switch (td=td@entry=0xfffffe00d057f3a0, flags=<optimized out>, flags@entry=260) at /usr/home/trasz/git/freebsd/sys/kern/sched_ule.c:2097
#1  0xffffffff80c2e315 in mi_switch (flags=flags@entry=260) at /usr/home/trasz/git/freebsd/sys/kern/kern_synch.c:540
#2  0xffffffff80c7edd9 in sleepq_switch (wchan=0xfffff800be0d9070, wchan@entry=0xffffffff812567fc, pri=pri@entry=96) at /usr/home/trasz/git/freebsd/sys/kern/subr_sleepqueue.c:608
#3  0xffffffff80c7ec9e in sleepq_wait (wchan=<optimized out>, pri=<optimized out>) at /usr/home/trasz/git/freebsd/sys/kern/subr_sleepqueue.c:659
#4  0xffffffff80bf3110 in sleeplk (lk=lk@entry=0xfffff800be0d9070, flags=flags@entry=525312, ilk=<optimized out>, ilk@entry=0xfffff800be0d9098, wmesg=<optimized out>,
wmesg@entry=0xffffffff812567fc "msdosfs", pri=<optimized out>, pri@entry=96, timo=51, timo@entry=525312, queue=0) at /usr/home/trasz/git/freebsd/sys/kern/kern_lock.c:310
#5  0xffffffff80bf1602 in lockmgr_xlock_hard (lk=0xfffff800be0d9070, flags=<unavailable>, ilk=<unavailable>, file=<optimized out>, line=1411, lwa=<optimized out>)
at /usr/home/trasz/git/freebsd/sys/kern/kern_lock.c:940
#6  0xffffffff811ae37c in VOP_LOCK1_APV (vop=0xffffffff81aef040 <msdosfs_vnodeops>, a=a@entry=0xfffffe00d0ecd698) at vnode_if.c:2247
#7  0xffffffff80d26a24 in VOP_LOCK1 (vp=0xfffff800be0d9000, flags=525312, file=0xffffffff81293608 "/usr/home/trasz/git/freebsd/sys/kern/vfs_lookup.c", line=1411) at ./vnode_if.h:1131
#8  _vn_lock (vp=0xfffff800be0d9000, flags=flags@entry=525312, file=0xffffffff81293608 "/usr/home/trasz/git/freebsd/sys/kern/vfs_lookup.c", line=line@entry=1411)
at /usr/home/trasz/git/freebsd/sys/kern/vfs_vnops.c:1805
#9  0xffffffff80d0093f in relookup (dvp=0xfffff800be0d9000, vpp=vpp@entry=0xfffffe00d0ecd820, cnp=0xfffffe00d0ecd9e0) at /usr/home/trasz/git/freebsd/sys/kern/vfs_lookup.c:1411
#10 0xffffffff80ac3065 in msdosfs_rename (ap=0xfffffe00d0ecda38) at /usr/home/trasz/git/freebsd/sys/fs/msdosfs/msdosfs_vnops.c:1115
#11 0xffffffff811ad4ac in VOP_RENAME_APV (vop=0xffffffff81aef040 <msdosfs_vnodeops>, a=a@entry=0xfffffe00d0ecda38) at vnode_if.c:1678
#12 0xffffffff80d2276d in VOP_RENAME (fdvp=<unavailable>, fvp=<optimized out>, fcnp=<optimized out>, tdvp=<optimized out>, tvp=<unavailable>, tcnp=<optimized out>) at ./vnode_if.h:863
#13 kern_renameat (td=0xfffffe00d057f3a0, td@entry=<error reading variable: value is not available>, oldfd=-100, oldfd@entry=<error reading variable: value is not available>,
old=0x7fffffffe833 <error: Cannot access memory at address 0x7fffffffe833>, old@entry=<error reading variable: value is not available>, newfd=-100,
newfd@entry=<error reading variable: value is not available>, new=0x7fffffffde70 <error: Cannot access memory at address 0x7fffffffde70>,
new@entry=<error reading variable: value is not available>, pathseg=<optimized out>, pathseg@entry=<error reading variable: value is not available>)
at /usr/home/trasz/git/freebsd/sys/kern/vfs_syscalls.c:3743
#14 0xffffffff810e4b0e in syscallenter (td=<optimized out>) at /usr/home/trasz/git/freebsd/sys/amd64/amd64/../../kern/subr_syscall.c:189
#15 amd64_syscall (td=0xfffffe00d057f3a0, traced=0) at /usr/home/trasz/git/freebsd/sys/amd64/amd64/trap.c:1186

The kernel is amd64 FreeBSD 14-CURRENT, with "pragma clang optimize off" added to msdosfs_vnops.c, and DEBUG_VFS_LOCKS option added (also happens without it; there are no additional locking warnings).
Comment 1 Konstantin Belousov freebsd_committer 2021-07-31 23:39:45 UTC
msdosfs_rename() has the same issue as was present in ufs_rename() long time ago.

First, relookup() cannot be used while any vnode is locked.  Instead, special
non-sleeping variant of msdosfs_lookup() needs to be made available and applied
there.  It is already coded (dd_ino != NULL for msdosfs_lookup_() I believe),
just needs to be exposed and applied.

Another issue is that function prologue must not use sleepable vn_lock()
when owning any other vnode lock.

Mostly, it needs to mimic the structure of ufs_rename() for acquiring all
three or four locks.
Comment 2 Peter Holm freebsd_committer 2021-08-01 06:28:26 UTC
Reproduced: https://people.freebsd.org/~pho/stress/log/log0158.txt
Comment 3 commit-hook freebsd_committer 2021-08-27 15:41:04 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=95d42526e92cb2a9842d71d3c585aabf32da7534

commit 95d42526e92cb2a9842d71d3c585aabf32da7534
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2021-08-01 17:46:59 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2021-08-27 15:39:45 +0000

    msdosfs: fix rename

    Use the same locking algorithm for msdosfs_rename() as used by ufs_rename().
    Convert doscheckpath() to non-sleeping version.

    Reported by:    trasz
    PR:     257522
    In collaboration with:  pho
    Reviewed by:    mckusick
    Sponsored by:   The FreeBSD Foundation
    MFC after:      1 week
    Differential revision:  https://reviews.freebsd.org/D31464

 sys/fs/msdosfs/denode.h         |   5 +-
 sys/fs/msdosfs/msdosfs_denode.c |   1 +
 sys/fs/msdosfs/msdosfs_lookup.c |  79 ++++----
 sys/fs/msdosfs/msdosfs_vnops.c  | 430 +++++++++++++++++++++-------------------
 4 files changed, 274 insertions(+), 241 deletions(-)