We hit a panic where nfs_advlock() trapped trying to dereference the v_mount of a doomed vnode (during a race with forced unmount as part of a reboot). The dereference is part of the NFS_ISV4() macro, which nfs_advlock() appears to access outside of any locks and without any check for a doomed vnode. I think, but am not certain, that it is expected that we may enter VOP_ADVLOCK() with a doomed vnode. Given code context, it also seems like in that situation it is okay simply to return EBADF (i.e. that no additional work needs to be done e.g. to attempt to release an advisory lock). If both of these are true, then I think the fix may be straightforward: do an interlocked check for a doomed vnode, and either record the result of NFS_ISV4() under the interlock or continue to hold it up to the vnode lock calls. However, if it is true that it is valid for VOP_ADVLOCK() to be called with doomed vnodes, then I think unionfs_advlock() may have a similar bug. I did not audit all the others, but lf_advlock() does appear to have doomed vnode checks. panic @ time 1540421723.857, thread 0xfffff8022feae000: Fatal trap 12: page fault while in kernel mode cpuid = 4, TSC = 0x4cc4cc33bb8b6 Panic occurred in module kernel loaded at 0xffffffff80200000: Stack: -------------------------------------------------- kernel:trap_fatal+0xa4 kernel:trap_pfault+0x23f kernel:trap+0x308 kernel:nfs_advlock+0x30 kernel:VOP_ADVLOCK_APV+0xda kernel:closef+0x94 kernel:fdescfree_fds+0x94 kernel:fdescfree+0x366 kernel:exit1+0x564 kernel:sigexit+0xde8 kernel:postsig+0x3da kernel:ast+0x336 -------------------------------------------------- cpuid = 4; apic id = 04 fault virtual address = 0x280 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff804d7020 stack pointer = 0x28:0xfffffe1b9d128430 frame pointer = 0x28:0xfffffe1b9d128530 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 Registers: rax: fffff80232da9580 rbx: fffffe1b9d128578 rcx: 0000000000000000 rdx: 0000000000080000 rsi: fffffe1b9d128510 rdi: fffffe1b9d128578 rbp: fffffe1b9d128530 rsp: fffffe1b9d128430 r8: 0000000000000246 r9: 0000000000000016 r10: 0000000000010000 r11: 0000000000000001 r12: 0000000000000000 r13: fffff8027cb0e268 r14: fffff802b7150470 r15: fffff8022feae000 rflags: 0000000000010286 rip: ffffffff804d7020 trapno: 000000000000000c err: 0000000000000000 curthread: 0xfffff8022feae000 current process = 3316 (perl5.26.2) Stack Data from 0xfffffe1b9d128430: 0: fffff8083ffbcd18 fffff8026b5dd790 0000000000000000 fffff8083ffbcd00 fffffe1b9d128490 ffffffff805934d4 0000000000000001 fffff8026b5dd000 8: fffff8026b5ddf90 fffff802bc458000 fffff8083ffab3c0 fffff8026b5ddf90 fffffe1b9d1284c0 ffffffff80888d6f fffff8083ffab430 0000000000000000 16: fffff8026b5dd790 fffff8083ffab3c0 fffffe1b9d1284f0 ffffffff80885197 fffffe1b9d128500 ffffffff809a5805 fffff802b7150470 fffffe1b9d128580 24: ffffffff80b05870 fffff8022feae000 fffffe1b9d128530 ffffffff80d4fb60 fffffe1b9d128580 fffff8027cb0e268 fffffe1b9d128578 00000000ffffffff 32: fffffe1b9d128560 ffffffff809ac91b 0000000000000001 fffff802bc458000 fffff8027cb0e268 fffff8022feae000 fffffe1b9d1285f0 ffffffff8055d665 40: 0000000000000001 ffffffff80dd75d8 fffff802b7150470 fffff80232da9580 fffff80000000002 fffffe1b9d1285a8 fffffe1b00000040 0000000000000000 48: 0000000000000000 000000024882d200 fffff8004882d000 0000000000000001 fffff802bc458000 fffff8027cb0e268 0000000000000030 fffff802bc458080 56: fffffe1b9d128640 ffffffff8055d2c5 0000000000000100 fffff8022feae000 00000001bc458040 fffff802bc458040 fffff8004882d000 fffff8005afb9b18 Dumping stacks (40960 bytes) Tracing command init pid 1 tid 100002 td 0xfffff8001b776000 (CPU 0) cpustop_handler() at cpustop_handler+0x2e/frame 0xffffffff810df8d0 ipi_nmi_handler() at ipi_nmi_handler+0x4a/frame 0xffffffff810df8f0 trap() at trap+0x46/frame 0xffffffff810dfad0 nmi_calltrap() at nmi_calltrap+0x8/frame 0xffffffff810dfad0 --- trap 0x13, rip = 0xffffffff80614b97, rsp = 0xfffffe1977b92400, rbp = 0xfffffe1977b92410 --- _isitmyx() at _isitmyx+0x77/frame 0xfffffe1977b92410 witness_checkorder() at witness_checkorder+0x2b9/frame 0xfffffe1977b92490 __mtx_lock_flags() at __mtx_lock_flags+0x9b/frame 0xfffffe1977b924e0 _vdrop() at _vdrop+0x169/frame 0xfffffe1977b92520 vputx() at vputx+0x23f/frame 0xfffffe1977b92580 vflush() at vflush+0x50e/frame 0xfffffe1977b926c0 nfs_unmount() at nfs_unmount+0x71/frame 0xfffffe1977b92710 dounmount() at dounmount+0x64a/frame 0xfffffe1977b92790 unmount_or_warn() at unmount_or_warn+0x40/frame 0xfffffe1977b927b0 vfs_unmountall() at vfs_unmountall+0x55/frame 0xfffffe1977b927d0 bufshutdown() at bufshutdown+0x3a0/frame 0xfffffe1977b92820 kern_reboot() at kern_reboot+0x197/frame 0xfffffe1977b92860 sys_reboot() at sys_reboot+0x3b5/frame 0xfffffe1977b928b0 amd64_syscall() at amd64_syscall+0x388/frame 0xfffffe1977b92ab0 fast_syscall_common() at fast_syscall_common+0x106/frame 0x7fffffffe860 (gdb) target remote localhost:8011 Remote debugging using localhost:8011 0xffffffff804d7020 in nfs_advlock (ap=0xfffffe1b9d128578) at /b/mnt/src/sys/fs/nfsclient/nfs_clvnops.c:3095 3095 /b/mnt/src/sys/fs/nfsclient/nfs_clvnops.c: No such file or directory. (gdb) p vp $1 = (struct vnode *) 0xfffff802b7150470 (gdb) p vp->v_mount $2 = (struct mount *) 0x0 (gdb) p vp->v_type $3 = VBAD
I'll take a look and see if there is a simple patch for this particular crash. However I do know that, in general, that the NLM is not safe for forced dismounts and I could never figure out a way to make it safe. (Essentially, whenever the vnode is unlocked, a forced dismount can cause a crash. However, there are areas in the code where the vnode must be unlocked. I could reduce the likelyhood of a crash doing something similar to what you suggest, where the code sets local variables before unlocking the vnode and using those instead of the vnode, but I couldn't do all cases.) At the very least, it should check for a doomed vnode and return EBADF when it does lock the vnode. I've said this many times on the mailing lists, but I'll mention it here too. The NLM is a fundamentally flawed protocol that was never published as an RFC. I think avoiding use of it is the best way to go. If locks for a file don't need to be visible to other clients, the "nolockd" mount option should work. For the cases (not very common) where the locks do need to be visible to other client(s), using NFSv4 is a better bet. (And NFSv4.1 fixed a lot of things, so I suggest NFSv4.1 over NFSv4.0 if possible.)
Created attachment 198644 [details] Fix crash in nfs_advlock() during forced dismount Duh, oops... I didn't look at the crash before making the last comment. This case appears to happen right in nfs_advlock() before any call to the NLM. I think this patch should fix this crash and make it work for the non-NLM cases. ("nolockd" option or NFSv4) Maybe the reported can test this patch and report back? Thanks for reporting this, rick
(In reply to Rick Macklem from comment #2) Thanks. I think we should be able to reproduce it, I'll get back to you with test results.
I was able to reproduce the panic as follows. The dtrace script inserts a pause when lockf tries to drop the lock, which opens up the race with unmount. umount -f $mp mount $mp daemon dtrace -wn 'fbt::nfs_advlock:entry /execname == "lockf" && args[0]->a_op == 2/ {chill(500000000);}' lockf $mp/foo true & umount -f $mp; wait; mount $mp [...try last line again if it didn't trigger...] The panic reproduces before, but not after, the patch in comment 2. I did not do any kind of thorough functionality testing. I used NFSv3. Thanks!
Good work w.r.t. the testing. Thanks. I'll be committing the patch to head tomorrow.
A commit references this bug: Author: rmacklem Date: Thu Nov 1 15:27:22 UTC 2018 New revision: 339999 URL: https://svnweb.freebsd.org/changeset/base/339999 Log: Fix NFS client vnode locking to avoid a crash during forced dismount. A crash was reported where the crash occurred in nfs_advlock() when the NFS_ISV4(vp) macro was being executed. This was caused by the vnode being VI_DOOMED due to a forced dismount in progress. This patch fixes the problem by locking the vnode before executing the NFS_ISV4() macro. Tested by: rlibby PR: 232673 Reviewed by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D17757 Changes: head/sys/fs/nfsclient/nfs_clvnops.c
A commit references this bug: Author: rmacklem Date: Sun Nov 18 22:59:55 UTC 2018 New revision: 340588 URL: https://svnweb.freebsd.org/changeset/base/340588 Log: MFC: r339999 Fix NFS client vnode locking to avoid a crash during forced dismount. A crash was reported where the crash occurred in nfs_advlock() when the NFS_ISV4(vp) macro was being executed. This was caused by the vnode being VI_DOOMED due to a forced dismount in progress. This patch fixes the problem by locking the vnode before executing the NFS_ISV4() macro. PR: 232673 Changes: _U stable/11/ stable/11/sys/fs/nfsclient/nfs_clvnops.c
A commit references this bug: Author: rmacklem Date: Sun Nov 18 23:48:16 UTC 2018 New revision: 340589 URL: https://svnweb.freebsd.org/changeset/base/340589 Log: MFC: r339999 Fix NFS client vnode locking to avoid a crash during forced dismount. A crash was reported where the crash occurred in nfs_advlock() when the NFS_ISV4(vp) macro was being executed. This was caused by the vnode being VI_DOOMED due to a forced dismount in progress. This patch fixes the problem by locking the vnode before executing the NFS_ISV4() macro. PR: 232673 Changes: _U stable/10/ stable/10/sys/fs/nfsclient/nfs_clvnops.c
A commit references this bug: Author: rmacklem Date: Mon Nov 19 00:49:08 UTC 2018 New revision: 340590 URL: https://svnweb.freebsd.org/changeset/base/340590 Log: MFC: r339999 Fix NFS client vnode locking to avoid a crash during forced dismount. A crash was reported where the crash occurred in nfs_advlock() when the NFS_ISV4(vp) macro was being executed. This was caused by the vnode being VI_DOOMED due to a forced dismount in progress. This patch fixes the problem by locking the vnode before executing the NFS_ISV4() macro. PR: 232673 Changes: _U stable/12/ stable/12/sys/fs/nfsclient/nfs_clvnops.c
Patch has been MFC'd.