It looks like on Hyper-V the issue only happens to a UP VM (an SMP VM is fine), but the VM hangs in generic FreeBSD kernel code, so I suspect the issue should reproduce on bare metal as well.

More info: FreeBSD-12.0-CURRENT-amd64-20180322-r331345-disc1.iso is good, but FreeBSD-12.0-CURRENT-amd64-20180329-r331740-disc1.iso is broken.

These are the serial messages:

Tue Apr 10 01:52:02 CST 2018

FreeBSD/amd64 (decui-b11) (ttyu0)

login: Stopping cron.
Stopping sshd.
appending output to nohup.out
Stopping devd.
Writing entropy file:.
Writing early boot entropy file:.
Terminated
.
Apr 10 01:52:10 decui-b11 syslogd: exiting on signal 15
Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `bufdaemon' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 4
NMI ISA 0, EISA ff
NMI/cpu0 ... going to debugger
[ thread pid 18 tid 100055 ]
Stopped at      witness_lock+0x1ae:     popq    %rbp
db> bt
Tracing pid 18 tid 100055 td 0xfffff8000b052000
witness_lock() at witness_lock+0x1ae/frame 0xfffffe0026553930
sched_add() at sched_add+0xe3/frame 0xfffffe0026553970
setrunnable() at setrunnable+0x8f/frame 0xfffffe0026553990
sleepq_broadcast() at sleepq_broadcast+0xe8/frame 0xfffffe00265539d0
wakeup() at wakeup+0x1d/frame 0xfffffe00265539f0
kproc_suspend_check() at kproc_suspend_check+0x58/frame 0xfffffe0026553a20
buf_daemon() at buf_daemon+0x11a/frame 0xfffffe0026553a70
fork_exit() at fork_exit+0x84/frame 0xfffffe0026553ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0026553ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
db>
When the issue happens, the CPU utilization of the UP VM is 100%. While we're trying to find the first bad revision, it would be great if somebody could report whether the issue also happens on bare metal or on other hypervisors.
When the hang happens, the last serial message is the line "Syncing disks, vnodes remaining... 4 ". Later I sent an NMI to the VM with a Hyper-V PowerShell cmdlet, and the kernel printed the remaining lines showing the call stack.
(In reply to Dexuan Cui from comment #0)

I have observed a similar (perhaps the same) issue on real multi-core hardware, but with the SMP kernel (GENERIC) when all APs were disabled via hints (hint.lapic.%d.disabled=1). The reboot got stuck and the thread doing it was in this state:

db> bt
Tracing pid 1 tid 100002 td 0xfffff80003230560
sched_switch() at sched_switch+0x95c/frame 0xfffffe00213b9880
mi_switch() at mi_switch+0x188/frame 0xfffffe00213b98b0
sleepq_switch() at sleepq_switch+0x10d/frame 0xfffffe00213b98f0
sleepq_timedwait() at sleepq_timedwait+0x50/frame 0xfffffe00213b9930
_sleep() at _sleep+0x2fa/frame 0xfffffe00213b99d0
kproc_suspend() at kproc_suspend+0x9a/frame 0xfffffe00213b9a00
kproc_shutdown() at kproc_shutdown+0x4e/frame 0xfffffe00213b9a20
kern_reboot() at kern_reboot+0x19e/frame 0xfffffe00213b9a60
sys_reboot() at sys_reboot+0x3ab/frame 0xfffffe00213b9ac0
amd64_syscall() at amd64_syscall+0x79b/frame 0xfffffe00213b9bf0
fast_syscall_common() at fast_syscall_common+0x105/frame 0xfffffe00213b9bf0

I didn't examine other threads much.
I think the first bad patch is this one: https://github.com/freebsd/freebsd/commit/63a483ed5f4eaadb8979992c7a5de24c7a471c61 ("Fix a bug introduced in r329612 that slowly invalidates all clean bufs."). Today's https://github.com/freebsd/freebsd/commit/66e8725e8d24141506bc4f458ec7d1a51e86304c is broken, but if I revert 63a483ed5f4eaadb8979992c7a5de24c7a471c61, the bug cannot be reproduced.

Cc bde & jeff.
(In reply to Dexuan Cui from comment #4)

When the bug reproduces, the log is:

Stopping cron.
Stopping sshd.
appending output to nohup.out
Stopping devd.
Writing entropy file:.
Writing early boot entropy file:.
Terminated
.
Apr 10 14:46:40 decui-b11 syslogd: exiting on signal 15
Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `bufdaemon' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 4
(It hangs here)

After I revert 63a483ed5f4eaadb8979992c7a5de24c7a471c61, the bug can't be reproduced, despite these messages:

Apr 10 14:28:44 decui-b11 syslogd: exiting on signal 15
Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `bufdaemon' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 4 1 0 done
All buffers synced.
lock order reversal:
 1st 0xfffff80008533ba8 ufs (ufs) @ /root/bsd.git/sys/kern/vfs_mount.c:1335
 2nd 0xfffff800085d0428 syncer (syncer) @ /root/bsd.git/sys/kern/vfs_subr.c:2732
stack backtrace:
#0 0xffffffff80bccfa3 at witness_debugger+0x73
#1 0xffffffff80bcce24 at witness_checkorder+0xe34
#2 0xffffffff80b3bb9b at lockmgr_lock_fast_path+0x17b
#3 0xffffffff8119d069 at VOP_LOCK1_APV+0xd9
#4 0xffffffff80c488a6 at _vn_lock+0x66
#5 0xffffffff80c379a7 at vputx+0x157
#6 0xffffffff80c2f7d9 at dounmount+0x4d9
#7 0xffffffff80c3919b at vfs_unmountall+0x6b
#8 0xffffffff80c14a25 at bufshutdown+0x2c5
#9 0xffffffff80b66d7a at kern_reboot+0x21a
#10 0xffffffff80b66b09 at sys_reboot+0x3a9
#11 0xffffffff8102706b at amd64_syscall+0x79b
#12 0xffffffff8100191d at fast_syscall_common+0x101
lock order reversal:
 1st 0xfffff80008533ba8 ufs (ufs) @ /root/bsd.git/sys/kern/vfs_mount.c:1335
 2nd 0xfffff800085d07e8 devfs (devfs) @ /root/bsd.git/sys/ufs/ffs/ffs_vfsops.c:1371
stack backtrace:
#0 0xffffffff80bccfa3 at witness_debugger+0x73
#1 0xffffffff80bcce24 at witness_checkorder+0xe34
#2 0xffffffff80b3bb9b at lockmgr_lock_fast_path+0x17b
#3 0xffffffff8119d069 at VOP_LOCK1_APV+0xd9
#4 0xffffffff80c488a6 at _vn_lock+0x66
#5 0xffffffff80e67a93 at ffs_flushfiles+0x93
#6 0xffffffff80e4adf2 at softdep_flushfiles+0x82
#7 0xffffffff80e6a147 at ffs_unmount+0x77
#8 0xffffffff80c2f819 at dounmount+0x519
#9 0xffffffff80c3919b at vfs_unmountall+0x6b
#10 0xffffffff80c14a25 at bufshutdown+0x2c5
#11 0xffffffff80b66d7a at kern_reboot+0x21a
#12 0xffffffff80b66b09 at sys_reboot+0x3a9
#13 0xffffffff8102706b at amd64_syscall+0x79b
#14 0xffffffff8100191d at fast_syscall_common+0x101
Uptime: 2m50s
acpi0: Powering system off

BTW, when the patch is reverted, I occasionally get this when the VM boots, but I guess that's a different issue (note that the storvsc probe messages below are interleaved with the witness backtrace on the console):

(da1:storvsc2:0:0:0): storvsc inquiry (6) [0 b2 0 4 1 ... ]
da2 at storvsc2 bus 0 scbus4 target 0 lun 2
da2: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device
da2: 300.000MB/s transfers
da2: Command Queueing enabled
da2: 51200MB (104857600 512 byte sectors)
s_debugger+0x73 #1 0xffffffff80(da2:storvsc2:0:0:2): storvsc inquiry (6) [0 b2 0 4 1 ... ]
(da1:storvsc2:0:0:0): storvsc inquiry (5) [0 b0 0 3c 0 ... ]
bce381 at witness_warn+0x461 #2 0xffffffff81026273 at trap_pfa(da2:storvsc2:0:0:2): storvsc inquiry (5) [0 b0 0 3c 0 ... ]
(da1:storvsc2:0:0:0): storvsc inquiry (5) [0 b1 0 3c 0 ... ]
da1: Delete methods: <UNMAP(*),ZERO>
ult+0x53 #3 0xffffffff81025a72 (da2:storvsc2:0:0:2): storvsc inquiry (5) [0 b1 0 3c 0 ... ]
da2: Delete methods: <UNMAP(*),ZERO>
at trap+0x2f2 #4 0xffffffff810010cc at calltrap+0x8
#5 0xffffffff80c1af78 at vfs_vmio_unwire+0x78
#6 0xffffffff80c16350 at bGEOM: new disk da2
relse+0x3c0
#7 0xffffffff80e6af3a at ffs_use_bread+0x9a
#8 0xffffffff80e6659c at ffs_sbget+0x8c
#9 0xffffffff80e69213 at ffs_mount+0xe03
#10 0xffffffff80c2e449 at vfs_domount+0x719
#11 0xffffffff80c2d727 at vfs_donmount+0x7f7
#12 0xffffffff80c30a32 at kernel_mount+0x62
#13 0xffffffff80c32ddd at parse_mount+0x43d
#14 0xffffffff80c3150c at vfs_mountroot+0x68c
#15 0xffffffff80afe567 at start_init+0x27
#16 0xffffffff80b277b4 at fork_exit+0x84
#17 0xffffffff81001dee at fork_trampoline+0xe

Fatal trap 12: page fault while in kernel mode
cpuid = 12; apic id = 0c
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80ea6081
stack pointer           = 0x28:0xfffffe002d0b1140
frame pointer           = 0x28:0xfffffe002d0b1150
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1 (kernel)
[ thread pid 1 tid 100002 ]
Stopped at      _vm_page_deactivate+0xb1:       cmpq    %rcx,(%rax)
db> bt
Tracing pid 1 tid 100002 td 0xfffff800032e0560
_vm_page_deactivate() at _vm_page_deactivate+0xb1/frame 0xfffffe002d0b1150
vfs_vmio_unwire() at vfs_vmio_unwire+0x78/frame 0xfffffe002d0b1170
brelse() at brelse+0x3c0/frame 0xfffffe002d0b11c0
ffs_use_bread() at ffs_use_bread+0x9a/frame 0xfffffe002d0b1200
ffs_sbget() at ffs_sbget+0x8c/frame 0xfffffe002d0b1270
ffs_mount() at ffs_mount+0xe03/frame 0xfffffe002d0b1410
vfs_domount() at vfs_domount+0x719/frame 0xfffffe002d0b1630
vfs_donmount() at vfs_donmount+0x7f7/frame 0xfffffe002d0b16e0
kernel_mount() at kernel_mount+0x62/frame 0xfffffe002d0b1740
parse_mount() at parse_mount+0x43d/frame 0xfffffe002d0b1880
vfs_mountroot() at vfs_mountroot+0x68c/frame 0xfffffe002d0b1a00
start_init() at start_init+0x27/frame 0xfffffe002d0b1a70
fork_exit() at fork_exit+0x84/frame 0xfffffe002d0b1ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe002d0b1ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
db>
I ran into this a couple of times now on a UP machine. The problem is that base r329612 (jeff) turned the bufspace daemon process into a thread of the buf daemon process. Now both threads call kproc_suspend_check with the same proc argument, and that function contains the following while loop:

while (SIGISMEMBER(p->p_siglist, SIGSTOP)) {
        wakeup(&p->p_siglist);
        msleep(&p->p_siglist, &p->p_mtx, PPAUSE, "kpsusp", 0);
}

So one thread wakes up the other and the other wakes up the first again, starving other processes like the syncer. On an SMP system other processes can still run on another CPU. I suspect one or both daemons need to be converted to use kthread_suspend and kthread_suspend_check instead, but I cannot reboot this machine right now to test this.
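For illustration, here is a minimal sketch of the suggested conversion (not the committed patch; buf_daemon_sketch() is a hypothetical stand-in for the real daemon loop in sys/kern/vfs_bio.c). kthread_suspend_check() suspends only the calling thread, sleeping on that thread's own flags rather than on the shared &p->p_siglist, so two threads of one process can no longer wake each other endlessly:

/*
 * Minimal sketch of per-thread suspension, assuming the usual
 * daemon-loop shape; buf_daemon_sketch() is hypothetical.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kthread.h>

static void
buf_daemon_sketch(void *arg)
{
        for (;;) {
                /*
                 * Was: kproc_suspend_check(curproc) -- with bufdaemon and
                 * bufspacedaemon in one process, the two callers ping-pong
                 * wakeups on &p->p_siglist forever on a UP machine.
                 */
                kthread_suspend_check();

                /* ... flush dirty buffers, then sleep until more work ... */
        }
}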
+1 on a Pentium-M laptop. I'll try the patch sometime this weekend or early next week.
A commit references this bug:

Author: tijl
Date: Sun Apr 22 16:05:30 UTC 2018
New revision: 332871
URL: https://svnweb.freebsd.org/changeset/base/332871

Log:
  Make bufdaemon and bufspacedaemon use kthread_suspend_check instead of
  kproc_suspend_check.  In r329612 bufspacedaemon was turned into a thread
  of the bufdaemon process causing both to call kproc_suspend_check with
  the same proc argument and that function contains the following while
  loop:

  while (SIGISMEMBER(p->p_siglist, SIGSTOP)) {
          wakeup(&p->p_siglist);
          msleep(&p->p_siglist, &p->p_mtx, PPAUSE, "kpsusp", 0);
  }

  So one thread wakes up the other and the other wakes up the first again,
  locking up UP machines on shutdown.

  Also register the shutdown handlers with SHUTDOWN_PRI_LAST + 100 so they
  run after the syncer has shutdown, because the syncer can cause a
  situation where bufdaemon help is needed to proceed.

  PR:           227404
  Reviewed by:  kib
  Tested by:    cy, rmacklem

Changes:
  head/sys/kern/vfs_bio.c
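As a sketch of the shape the second change takes in code (assumed from the log message above, not a verbatim excerpt of r332871; bufdaemon_shutdown_register() is a hypothetical wrapper):

/*
 * Sketch of registering per-thread shutdown handling ordered after the
 * syncer; assumed shape, not a verbatim excerpt of r332871.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/eventhandler.h>
#include <sys/kthread.h>

static void
bufdaemon_shutdown_register(void)
{
        /*
         * kthread_shutdown() suspends a single thread (kproc_shutdown()
         * suspends a whole process), and SHUTDOWN_PRI_LAST + 100 runs this
         * handler after the syncer's, so bufdaemon can still assist while
         * the syncer drains.
         */
        EVENTHANDLER_REGISTER(shutdown_pre_sync, kthread_shutdown, curthread,
            SHUTDOWN_PRI_LAST + 100);
}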
*** Bug 227352 has been marked as a duplicate of this bug. ***
We can confirm this fix works for us. Thanks!!!
Re-tested with the latest -CURRENT on armv6 (where I had originally reported it):

FreeBSD pi1c 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r332894: Tue Apr 24 07:43:22 PDT 2018     bobf@hack.SFT.local:/e-drive/obj.current/e-drive/src.current/arm.armv6/sys/RPI-B  arm

I did not observe the hang. However, I _did_ see this in the console:

Apr 24 16:55:56 pi1c syslogd: exiting on signal 15
Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 1 1 1 0 done
Waiting (max 60 seconds) for system thread `bufdaemon' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-0' to stop... done
All buffers synced.
lock order reversal:
 1st 0xd2c34974 ufs (ufs) @ /e-drive/src.current/sys/kern/vfs_mount.c:1335
 2nd 0xd2c2fbc4 syncer (syncer) @ /e-drive/src.current/sys/kern/vfs_subr.c:2732
stack backtrace:
lock order reversal:
 1st 0xd2c34724 ufs (ufs) @ /e-drive/src.current/sys/kern/vfs_mount.c:1335
 2nd 0xd2c34034 devfs (devfs) @ /e-drive/src.current/sys/fs/msdosfs/msdosfs_vfsops.c:934
stack backtrace:
Uptime: 1m49s
The operating system has halted.
Please press any key to reboot.
Yes, there are several LORs in the filesystem code (search for "LOR" in the search box at the top of this page). The ones you see are probably harmless, because at that point during shutdown no other thread will access the filesystem, so deadlock cannot occur.