Bug 251320 - [PATCH] vfs_domount_first: do not crash on mount failure
Summary: [PATCH] vfs_domount_first: do not crash on mount failure
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 12.2-RELEASE
Hardware: amd64 Any
: --- Affects Many People
Assignee: Konstantin Belousov
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-11-23 00:21 UTC by Tong Zhang
Modified: 2020-12-01 16:09 UTC (History)
2 users (show)

See Also:


Attachments
patch (1.64 KB, patch)
2020-11-23 00:21 UTC, Tong Zhang
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Tong Zhang 2020-11-23 00:21:57 UTC
Created attachment 219896 [details]
patch

vfs_write_suspend_umnt() expect a non-zero vfs_write_suspend_umnt,
otherwise it will panic, however in vfs_domount_first(), the counter
is zero when calling VFS_UNMOUNT()

 panic: vn_finished_write: neg cnt
 cpuid = 7
 time = 1606004193
 KDB: stack backtrace:
 #0 0xffffffff80c0a8f5 at kdb_backtrace+0x65
 freebsd#1 0xffffffff80bbeb1b at vpanic+0x17b
 freebsd#2 0xffffffff80bbe993 at panic+0x43
 freebsd#3 0xffffffff80c9da05 at vn_finished_write+0xc5
 freebsd#4 0xffffffff80c9f726 at vfs_write_suspend_umnt+0x16
 freebsd#5 0xffffffff80ecdd31 at ffs_unmount+0x71
 freebsd#6 0xffffffff80c844dd at vfs_domount+0xc0d
 freebsd#7 0xffffffff80c83228 at vfs_donmount+0x988
 freebsd#8 0xffffffff80c82871 at sys_nmount+0x71
 freebsd#9 0xffffffff810904c7 at amd64_syscall+0x387
 freebsd#10 0xffffffff8106785e at fast_syscall_common+0xf8
 Uptime: 2h10m11s
Comment 1 Konstantin Belousov freebsd_committer 2020-11-23 08:14:29 UTC
(In reply to Tong Zhang from comment #0)
vn_finished_write() is required after VFS_UNMOUNT().

But there are more issues with VFS_UNMOUNT() call, I wrote more extensive patch
based on your submission, please see https://reviews.freebsd.org/D27327

Since it seems that you can reproduce the situation, please test.
Comment 2 Tong Zhang 2020-11-24 02:28:40 UTC
(In reply to Konstantin Belousov from comment #1)


lock order reversal:
 1st 0xfffff80004da0258 ufs (ufs, lockmgr) @ /usr/src/sys/kern/vfs_mount.c:1739
 2nd 0xfffff80004da0070 devfs (devfs, lockmgr) @ /usr/src/sys/kern/vfs_subr.c:3135
lock order devfs -> ufs established at:
#0 0xffffffff80c47a2d at witness_checkorder+0x46d
#1 0xffffffff80ba8c98 at lockmgr_lock_flags+0x188
#2 0xffffffff80e998e5 at ffs_lock+0x75
#3 0xffffffff80cd22a4 at _vn_lock+0x54
#4 0xffffffff80cb19d5 at vfs_domount+0xee5
#5 0xffffffff80cb0122 at vfs_donmount+0x872
#6 0xffffffff80cb44e7 at kernel_mount+0x57
#7 0xffffffff80cb6e11 at parse_mount+0x4a1
#8 0xffffffff80cb5319 at vfs_mountroot+0x589
#9 0xffffffff80b680af at start_init+0x1f
#10 0xffffffff80b93e50 at fork_exit+0x80
#11 0xffffffff80ffaace at fork_trampoline+0xe
lock order ufs -> devfs attempted at:
#0 0xffffffff80c4838c at witness_checkorder+0xdcc
#1 0xffffffff80ba8c98 at lockmgr_lock_flags+0x188
#2 0xffffffff80cd22a4 at _vn_lock+0x54
#3 0xffffffff80cbb5d1 at vput_final+0x111
#4 0xffffffff80e9762f at ffs_unmount+0x33f
#5 0xffffffff80cb2abc at dounmount+0x42c
#6 0xffffffff80cb261c at kern_unmount+0x2fc
#7 0xffffffff8102715e at amd64_syscall+0x12e
#8 0xffffffff80ffa39e at fast_syscall_common+0xf8
Comment 3 Tong Zhang 2020-11-24 02:47:14 UTC
(In reply to Tong Zhang from comment #2)

root@freebsd:~ # uname -a
FreeBSD freebsd 13.0-CURRENT FreeBSD 13.0-CURRENT #9 r367963M: Tue Nov 24 02:20:06 UTC 2020     root@freebsd:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64
root@freebsd:~ # 

the lock order reversal warning pops up whenever I try to mount then umount the disk.
Comment 4 Tong Zhang 2020-11-24 02:59:45 UTC
(In reply to Tong Zhang from comment #3)

hmmm... after testing an unmodified r367963, my conclusion is that this lock order bug already exists in r367963 which is irrelevant to your patch.
Comment 5 Konstantin Belousov freebsd_committer 2020-11-24 12:37:13 UTC
The patch in the review only modifies path for recovery for failed
VFS_ROOT/VFS_STATFS.  From the backtraces you printed I think that you tested
successful mount which is not touched by the change.
Comment 6 Tong Zhang 2020-11-24 15:42:22 UTC
(In reply to Konstantin Belousov from comment #5)
Right. Your observation is correct.

-- I should have mentioned my reproducer is now triggering more warnings and panic before hitting the VFS_UNMOUNT() on -CURRENT, which makes it unable to trigger the same bug in the original report.
It seems that more things appears to be broken since 12.2.
I will see if I can get around those irrelevant panics and warnings and test the patch.
Comment 7 commit-hook freebsd_committer 2020-11-26 18:08:52 UTC
A commit references this bug:

Author: kib
Date: Thu Nov 26 18:08:42 UTC 2020
New revision: 368075
URL: https://svnweb.freebsd.org/changeset/base/368075

Log:
  More careful handling of the mount failure.

  - VFS_UNMOUNT() requires vn_start_write() around it [*].
  - call VFS_PURGE() before unmount.
  - do not destroy mp if cleanup unmount did not succeed.
  - set MNTK_UNMOUNT, and indicate forced unmount with MNTK_UNMOUNTF
    for VFS_UNMOUNT() in cleanup.

  PR:	251320 [*]
  Reported by:	Tong Zhang <ztong0001@gmail.com>
  Reviewed by:	markj, mjg
  Discussed with:	rmacklem
  Sponsored by:	The FreeBSD Foundation
  Differential revision:	https://reviews.freebsd.org/D27327

Changes:
  head/sys/kern/vfs_mount.c