The backstory of this is in #244048. Briefly, a test VM with all debug options turned on (INVARIANTS, WITNESS, etc.) deadlocked while taking a snapshot of a non-root filesystem and had to be reset.

Now I've got a snapshot where:
- at reboot fsck is run on /, then the system starts; after 1 minute a background fsck is run and this causes a panic;
- I can boot in single user mode, run fsck -y on the filesystem and boot works again; however, I get the same panic if I try taking a new snapshot.

This makes me think the filesystem is somehow ruined and fsck -y won't fix it.

Panic trace is:

panic: ffs_copyonwrite: bad copy block
cpuid = 0
time = 1581243816
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001beef0e0
vpanic() at vpanic+0x19d/frame 0xfffffe001beef130
panic() at panic+0x43/frame 0xfffffe001beef190
ffs_copyonwrite() at ffs_copyonwrite+0x74c/frame 0xfffffe001beef230
ffs_geom_strategy() at ffs_geom_strategy+0x8c/frame 0xfffffe001beef260
ufs_strategy() at ufs_strategy+0x83/frame 0xfffffe001beef290
VOP_STRATEGY_APV() at VOP_STRATEGY_APV+0xc9/frame 0xfffffe001beef2c0
bufstrategy() at bufstrategy+0x44/frame 0xfffffe001beef2f0
bufwrite() at bufwrite+0x230/frame 0xfffffe001beef330
ffs_snapshot() at ffs_snapshot+0x8e0/frame 0xfffffe001beef630
ffs_mount() at ffs_mount+0xb3a/frame 0xfffffe001beef7d0
vfs_domount() at vfs_domount+0x8b6/frame 0xfffffe001beef9f0
vfs_donmount() at vfs_donmount+0x7e7/frame 0xfffffe001beefa90
sys_nmount() at sys_nmount+0xf2/frame 0xfffffe001beefac0
amd64_syscall() at amd64_syscall+0x281/frame 0xfffffe001beefbf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe001beefbf0
--- syscall (378, FreeBSD ELF64, sys_nmount), rip = 0x8002d88ba, rsp = 0x7fffffffd288, rbp = 0x7fffffffeae0 ---
KDB: enter: panic

I can connect to this VM with remote GDB, but my knowledge of kernel internals is not enough to debug this all by myself.
This bug was reported and fixed in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253158. The fix has not yet been MFC'ed to stable/12, but it will be.
(In reply to Kirk McKusick from comment #1) Thanks! Would this patch affect #244048 too?
(In reply to ml from comment #2) Almost certainly this fix will solve the panic reported in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=244048 too.
(In reply to Kirk McKusick from comment #3) Sorry, I wasn't clear. Of course this will fix the panic in #244048, as this bug is just a spin-off to better describe it. I was asking whether the mentioned patch could help with the main subject of that bug, i.e. the deadlock (which was the root cause of the later panic).
(In reply to ml from comment #4) I do not think that the main problem reported in #244048 (apparently running out of buffers in the buffer cache) will be fixed by this patch. It is not clear to me what changed between the 11-stable release and the 12-stable release to cause the buffer problem. At this point I would consider trying out the 13 release to see if the problem is still present in that distribution.
See also the suggestion to try the patch in https://reviews.freebsd.org/D28901.
The panic described in this report has been fixed in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253158. The buffer cache hang was addressed in https://reviews.freebsd.org/D28901.