Bug 244048

Summary: mksnap_ffs hangs machine (12.1 regression over 11.3)
Product: Base System Reporter: ml
Component: kernAssignee: freebsd-fs (Nobody) <fs>
Status: Open ---    
Severity: Affects Only Me CC: crest, kib, markj, mckusick, mi
Priority: --- Keywords: regression
Version: 12.2-RELEASE   
Hardware: Any   
OS: Any   
See Also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252821

Description ml 2020-02-11 14:13:06 UTC
On several servers I manage, I take backups to an external HD and use mksnap_ffs to create snapshots.
I've never had troubles with this up to 11.3.

Lately I started doing this on a 12.1 server and noticed mksnap_ffs will hang the box for several minutes (services stuck, no login allowed, already established ssh sessions partially work; shutdown not feasible unless reset button is pressed).
N.B. This is an external drive, only mounted when needed and only accessed to make backups, so if *it* got stuck, it should not affect the whole system.

I decided to check this and took the external HD to my desktop (11.3): it worked perfectly.
I then upgraded my desktop to 12.1p2 and it started doing as above:

_ mksnap_ffs will work for several minutes under high I/O (the HD is 6TB); meanwhile, the system is responsive;

_ then mksnap_ffs will drastically reduce its I/O (at least as measured with top), but will keep working for some other minutes: during this phase, I cannot open any new program; ThunderBird gets stuck, already open FireFox windows still works, but I cannot open any new window; audacity keeps playing the current track, but will get stuck when moving on to the next; already open terminal windows might partially work;

_ after several minutes mksnap_ffs will exit and everything will get back to normal.

This is of course unacceptable on a production server.

I built a test machine with 12.1/amd64 with the following kernel options: KDB, KDB_TRACE, DDB, GDB, INVARIANTS, INVARIANT_SUPPORT, WITNESS, WITNESS_SKIPSPIN, DEBUG_VFS_LOCKS, LOCK_PROFILING, KTR, ALQ, KTR_ENTRIES=4096.
Such a kernel paniced immediately after launching mksnap_ffs with LOR #269.

I removed WITNESS, WITNESS_SKIPSPIN, issued a "fsck -y" on the disk and tried again.
This time I got a different panic:
panic: ffs_copyonwrite: bad copy block
cpuid = 0
time = 1581243816
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001beef0e0
vpanic() at vpanic+0x19d/frame 0xfffffe001beef130
panic() at panic+0x43/frame 0xfffffe001beef190
ffs_copyonwrite() at ffs_copyonwrite+0x74c/frame 0xfffffe001beef230
ffs_geom_strategy() at ffs_geom_strategy+0x8c/frame 0xfffffe001beef260
ufs_strategy() at ufs_strategy+0x83/frame 0xfffffe001beef290
VOP_STRATEGY_APV() at VOP_STRATEGY_APV+0xc9/frame 0xfffffe001beef2c0
bufstrategy() at bufstrategy+0x44/frame 0xfffffe001beef2f0
bufwrite() at bufwrite+0x230/frame 0xfffffe001beef330
ffs_snapshot() at ffs_snapshot+0x8e0/frame 0xfffffe001beef630
ffs_mount() at ffs_mount+0xb3a/frame 0xfffffe001beef7d0
vfs_domount() at vfs_domount+0x8b6/frame 0xfffffe001beef9f0
vfs_donmount() at vfs_donmount+0x7e7/frame 0xfffffe001beefa90
sys_nmount() at sys_nmount+0xf2/frame 0xfffffe001beefac0
amd64_syscall() at amd64_syscall+0x281/frame 0xfffffe001beefbf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe001beefbf0
--- syscall (378, FreeBSD ELF64, sys_nmount), rip = 0x8002d88ba, rsp = 0x7fffffffd288, rbp = 0x7fffffffeae0 ---
KDB: enter: panic

So, I also removed INVARIANTS and INVARIANT_SUPPORT (and run "fsck -y" twice) in order to be able to get snapshots.

I haven't been able to collect other data yet.
Comment 1 ml 2020-03-01 12:34:27 UTC
This week another box was upgraded from 11.3 to 12.1: like the first one, this hangs as soon as mksnap_ffs is run on an external USB-attached HD and only the reset button can get it out of this situaation.
The external HD is not the same, so we can exclude it's something wroing in that particular filesystem.

As a side note, I tried setting up debugging on a test machine via serial cable, but was unable to get it to work.
Comment 2 ml 2020-03-04 15:48:33 UTC
Further testing:
_ one one box, running GENERIC instead of custom kernel, still shows the problem;
_ on another box, I installed a 11.3 VM (through bhyve) and connected it to the host via ggate: doing the shapshot on the VM does not halt the system (so this has nothing to do with the data on the disk).
Comment 3 Mikhail Teterin freebsd_committer 2020-06-01 20:21:16 UTC
Confirming :(

I also backup a filesystem from cron. Sometimes this hangs.

When this happens, the mksnap_ffs is running at 100% of one of the processors. It is also unkillable... The hapless dump(8) sits there waiting for mksnap_ffs to exit...

The filesystem becomes unusable -- the already-started processes keep running, but their attempts to access the filesystem hang.

In our case the filesystem's parameters are:

tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       disabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         enabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: space to hold for metadata blocks: (-k)            6408
tunefs: optimization preference: (-o)                      time

I cannot say, if this is a regression as this is a new machine.
Comment 4 ml 2021-01-12 09:29:00 UTC
Today I tried again after upgrading to 12.2.
Still the same behavious (i.e. hardware reset required).

This is becoming more serious, as 11.x is approaching its EOL, as it prevents me from upgrading several machines.
Comment 5 ml 2021-01-19 09:24:15 UTC
I finally was able to reproduce this systematically on a virtual machine (with INVARIANTS & co enabled).
I've got a snapshot where the VM is deadlocked and I can connect to it via remote GDB: I'd appreciate any help on diagnosing this.

I've also got a different snapshot where the described panic happens, but I've opened a different bug for that (#252821).
Comment 6 ml 2021-02-21 18:51:38 UTC
After some investigation this is what I found.
(Notice I'm no kernel expert, so I just hope I'm not saying stupid things).

The thread originating from mksnap_ffs is stuck in softdep_check_suspend, sleeping on mp->mnt_secondary_writes ("secwr" for userland utilities).
Full backtrace:
#0  sched_switch (td=0xfffff8000237e760, newtd=0xfffff8000212c760, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2143
#1  0xffffffff805a5294 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:452
#2  0xffffffff805f272b in sleepq_switch (wchan=0xfffff80004120a00, pri=119) at /usr/src/sys/kern/subr_sleepqueue.c:626
#3  0xffffffff805f25c3 in sleepq_wait (wchan=0xfffff80004120a00, pri=119) at /usr/src/sys/kern/subr_sleepqueue.c:705
#4  0xffffffff805a4a6b in _sleep (ident=0xfffff80004120a00, lock=<optimized out>, priority=631, wmesg=0xffffffff80921d1b "secwr", sbt=0, pr=0, flags=256) at /usr/src/sys/kern/kern_synch.c:217
#5  0xffffffff807b6324 in softdep_check_suspend (mp=0xfffff80004120000, devvp=0xfffff800041ff780, softdep_depcnt=15, softdep_accdepcnt=50188, secondary_writes=1, secondary_accwrites=106) at /usr/src/sys/ufs/ffs/ffs_softdep.c:14299
#6  0xffffffff807c19a6 in ffs_sync (mp=0xfffff80004120000, waitfor=4) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:1620
#7  0xffffffff8067c8bf in vfs_write_suspend (mp=0xfffff80004120000, flags=0) at /usr/src/sys/kern/vfs_vnops.c:1864
#8  0xffffffff8079c0b9 in ffs_snapshot (mp=0xfffff80004120000, snapfile=0xfffff80004214780 "A\003\215\200\377\377\377\377\070\232\254\200\377\377\377\377 \b\361\035") at /usr/src/sys/ufs/ffs/ffs_snapshot.c:430
#9  0xffffffff807bfe5a in ffs_mount (mp=<unavailable>) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:479
#10 0xffffffff80661a54 in vfs_domount_update (td=0xfffff80080ac9401, vp=<optimized out>, fsflags=<optimized out>, optlist=<optimized out>) at /usr/src/sys/kern/vfs_mount.c:1037
#11 vfs_domount (td=0xfffff80080ac9401, fstype=<optimized out>, fspath=<optimized out>, fsflags=<optimized out>, optlist=0xfffffe000053aa38) at /usr/src/sys/kern/vfs_mount.c:1191
#12 0xffffffff80660b27 in vfs_donmount (td=0xfffff8000237e760, fsflags=2166784, fsoptions=0xfffff80004108600) at /usr/src/sys/kern/vfs_mount.c:726
#13 0xffffffff80660312 in sys_nmount (td=0xfffff8000237e760, uap=0xfffff8000237eb20) at /usr/src/sys/kern/vfs_mount.c:431
#14 0xffffffff808418b7 in syscallenter (td=0xfffff8000237e760) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:144
#15 amd64_syscall (td=0xfffff8000237e760, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1163
#16 <signal handler called>
#17 0x00000008002dcb9a in ?? ()

I *think* it should be awakened by softdep_flush thread, in function process_worklist_item.
Alas, this is stuck waiting for a buffer in bufspace_wait.
Full backtrace:
#0  sched_switch (td=0xfffff8000425f760, newtd=0xfffff800040df000, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2143
#1  0xffffffff805a5294 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:452
#2  0xffffffff805f272b in sleepq_switch (wchan=0xffffffff80a0a8b8 <bdomain+33208>, pri=96) at /usr/src/sys/kern/subr_sleepqueue.c:626
#3  0xffffffff805f25c3 in sleepq_wait (wchan=0xffffffff80a0a8b8 <bdomain+33208>, pri=96) at /usr/src/sys/kern/subr_sleepqueue.c:705
#4  0xffffffff805a4a6b in _sleep (ident=0xffffffff80a0a8b8 <bdomain+33208>, lock=<optimized out>, priority=96, wmesg=0xffffffff808e5676 "newbuf", sbt=0, pr=0, flags=256) at /usr/src/sys/kern/kern_synch.c:217
#5  0xffffffff8064f6ff in bufspace_wait (bd=0xffffffff80a02700 <bdomain>, vp=0xfffff800042145a0, gbflags=<optimized out>, slpflag=<optimized out>, slptimeo=<optimized out>) at /usr/src/sys/kern/vfs_bio.c:773
#6  0xffffffff8064bbfc in getnewbuf (vp=<optimized out>, slpflag=0, slptimeo=0, maxsize=32768, gbflags=0) at /usr/src/sys/kern/vfs_bio.c:3284
#7  0xffffffff80649155 in getblkx (vp=0xfffff800042145a0, blkno=<optimized out>, size=32768, slpflag=0, slptimeo=0, flags=0, bpp=0xfffffe00005853d8) at /usr/src/sys/kern/vfs_bio.c:4022
#8  0xffffffff8064b905 in getblk (vp=<unavailable>, blkno=<unavailable>, size=<unavailable>, slpflag=<unavailable>, slptimeo=<unavailable>, flags=<unavailable>) at /usr/src/sys/kern/vfs_bio.c:3802
#9  0xffffffff807c69b5 in readindir (vp=<unavailable>, lbn=<unavailable>, daddr=56528, bpp=0xfffffe0000585478) at /usr/src/sys/ufs/ufs/ufs_bmap.c:111
#10 0xffffffff807c6468 in ufs_bmaparray (vp=0xfffff800042145a0, bn=-6049804, bnp=0xfffffe0000585518, nbp=<optimized out>, runp=<optimized out>, runb=0x0) at /usr/src/sys/ufs/ufs/ufs_bmap.c:266
#11 0xffffffff807d3975 in ufs_strategy (ap=<optimized out>) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2309
#12 0xffffffff808b9911 in VOP_STRATEGY_APV (vop=0xffffffff80aca540 <ufs_vnodeops>, a=0xfffffe0000585570) at vnode_if.c:2279
#13 0xffffffff80647114 in VOP_STRATEGY (vp=<unavailable>, bp=0xfffffe000090b5c0) at ./vnode_if.h:940
#14 bufstrategy (bo=<optimized out>, bp=0xfffffe000090b5c0) at /usr/src/sys/kern/vfs_bio.c:4999
#15 0xffffffff80648b1e in bstrategy (bp=<optimized out>) at /usr/src/sys/sys/buf.h:419
#16 breadn_flags (vp=<optimized out>, blkno=<optimized out>, size=<optimized out>, rablkno=0x0, rabsize=0x0, cnt=0, cred=0x0, flags=0, ckhashfunc=0x0, bpp=0xfffffe0000585780) at /usr/src/sys/kern/vfs_bio.c:2181
#17 0xffffffff807982b8 in ffs_balloc_ufs2 (vp=<optimized out>, startoffset=<optimized out>, size=<optimized out>, cred=0xfffff8000211b000, flags=<optimized out>, bpp=0xfffffe0000585820) at /usr/src/sys/ufs/ffs/ffs_balloc.c:894
#18 0xffffffff8079fb22 in ffs_snapblkfree (fs=0xfffffe0011c76000, devvp=<optimized out>, bno=48416024, size=32768, inum=5, vtype=VREG, wkhd=0xfffffe0000585950) at /usr/src/sys/ufs/ffs/ffs_snapshot.c:1790
#19 0xffffffff80790d16 in ffs_blkfree (ump=0xfffff80004116800, fs=0xfffffe0011c76000, devvp=0xfffff800041ff780, bno=48416024, size=32768, inum=5, vtype=VREG, dephd=0xfffffe0000585950, key=2) at /usr/src/sys/ufs/ffs/ffs_alloc.c:2602
#20 0xffffffff807baa4f in indir_trunc (freework=0xfffff800048ed480, dbn=<optimized out>, lbn=<optimized out>) at /usr/src/sys/ufs/ffs/ffs_softdep.c:8259
#21 0xffffffff807ba93b in indir_trunc (freework=0xfffff800048ed480, dbn=<optimized out>, lbn=<optimized out>) at /usr/src/sys/ufs/ffs/ffs_softdep.c:8240
#22 0xffffffff807ad4f9 in handle_workitem_indirblk (freework=<optimized out>) at /usr/src/sys/ufs/ffs/ffs_softdep.c:7875
#23 handle_workitem_freeblocks (freeblks=0xfffff800048eda00, flags=512) at /usr/src/sys/ufs/ffs/ffs_softdep.c:7970
#24 0xffffffff807b5ba1 in process_worklist_item (mp=0xfffff80004120000, target=10, flags=512) at /usr/src/sys/ufs/ffs/ffs_softdep.c:1806
#25 0xffffffff807a1e92 in softdep_process_worklist (mp=0xfffff80004120000, full=0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:1600
#26 0xffffffff807a580f in softdep_flush (addr=0xfffff80004120000) at /usr/src/sys/ufs/ffs/ffs_softdep.c:1402
#27 0xffffffff80556f8c in fork_exit (callout=0xffffffff807a5720 <softdep_flush>, arg=0xfffff80004120000, frame=0xfffffe0000585c00) at /usr/src/sys/kern/kern_fork.c:1080
#28 <signal handler called>

The above explains why mksnap_ffs is halted; however the whole machine is hanged.
It seems almost any user thread (e.g. an "ls" *on a different filesystem*) is stuck in bufspace_wait.

Neither buf_daemon thread, nor its child bufspace_daemon are stuck (they are running their normal loop).

So far I wasn't able to pinpoint what changed between 11.4 and 12.1 to cause this.
Any hint appreciated.
Comment 7 Konstantin Belousov freebsd_committer 2021-02-22 07:55:23 UTC
What is the physical hardware where the hang occurs?
The sleep of buffer runspace most often means that io hung.  I saw it many
times with USB/umass and low-quality ata/sata-usb adapters.
Comment 8 ml 2021-02-22 11:26:13 UTC
(In reply to Konstantin Belousov from comment #7)

I doubt it's a hardware trouble:
_ this happens on more than one physical machine;
_ I too have seen umass troubles (on other machines), but this is usually logged; 
_ it only happens with 12.1/12.2, not with 11.x;
_ umass works perfectly for everything but mksnap_ffs;
_ if umass hanged, I don't think this should hang the whole system as only one process is using it; the others are depending on different storage and should not be affected (at least that's my experience);
_ the machine I'm debugging is a VirtualBox one (where I was able to recreate the problem): it's not using umass at all. It has system on /dev/ada0 and data on /dev/ada1 (of which I'm taking the snapshot). Underlying hardware is working fine.
Comment 9 Mark Johnston freebsd_committer 2021-02-23 21:42:29 UTC
I have no strong evidence that this is related, but can you try testing https://reviews.freebsd.org/D28901 ?  The problem was introduced in 12.
Comment 10 ml 2021-02-26 16:44:42 UTC
(In reply to Mark Johnston from comment #9)

I'll try.
Any side effects?

Anyway this is gonna need some time: as the problem is not always triggered, I have to rollback the VM, upgrade the system, save state and try, try and try again until I enter the deadlock once more.

I did a test though: while the machine is locked, I put a breakpoint in bufbdflush (the only function changed in that diff) and see it never gets reached.
Is this an indication that this patch won't solve or possibly it would have made a difference earlier?
Comment 11 Kirk McKusick freebsd_committer 2021-02-26 21:15:06 UTC
(In reply to ml from comment #10)
It would have made a difference earlier. By the time you get to your deadlock, it is too late to recover from it.
Comment 12 ml 2021-02-28 14:56:53 UTC
Just for the record...
While preparing to install the mentioned patch, I reached a status where:
_ issuing mksnap_ffs always locks the machine;
_ unmounting the filessytem (the one I would run mksnap_ffs on) also causes a deadlock (I guess it's the same deadlock, but I didn't analyze it);
_ shutting down, the machine cannot write all buffers and after reboot all the filesystem will require fsck;
_ the fsck on said filesystem will be problematic (hangs or panics).

So it seems that, after a lot of snapshots creation/deletion, the situation is already compromised *before* I try and take the snapshot that hangs.
Comment 13 Kirk McKusick freebsd_committer 2021-02-28 21:58:20 UTC
(In reply to ml from comment #12)
I am unclear on the order of events

Is the snapshot hanging on a filesystem that passes fsck, or are you trying it on an unclean filesystem? Snapshots should only be done on clean filesystems.

When you say that fsck panics while trying to clean the filesystem, do you mean that the kernel panics or that fsck fails. If it is fsck that is failing, what are the flags that you are using and how does it fail? If it hangs, please type ^T to get it to report what it is doing.
Comment 14 ml 2021-03-02 09:57:46 UTC
I apologize for not being clear.
The problem is not deterministic: behaviour changes slightly from machine to machine and even subsequent runs starting from the same VM snapshot can differ.

In comment 12 I was talking about the last situation I was able to achieve that I think is interesting.

VirtualBox VM with two disk:
_ ada0 is the "system" disk (UFS SU+J);
_ ada1 is the "data" disk, i.e. the one I'm trying snapshots on (GELI+UFS SU, no journal as it would prevent snapshotting).

The filesystem is clean: fsck finds no errors and everything works normally.
There are already 12 snapshots.

I can take one or two further snapshots (probably more, I didn't try): still everything is fine.

However, if I *delete* an existing snapshot, trouble starts:
_ subsequent mksnap_ffs produces the deadlock I described;
_ trying to unmount the filesystem also hangs (but won't hang the whole machine): ctrl-t may say it's in "secwr", like the above or in "bo_wwait" or in "wdrain"; a LOR (ufs@vfs_mount.c:1349/devfs@ffs_vfsops.c:1610) may appear or not;
_ even shutdown doesn't work properly: vnodes remaining count seems to rise and decrease at the same time; it may reach 0, but will grow again.

What happens after an hard reset can vary: fsck -p in single user mode seems to work and recover the situation; a background fsck is usually asking for troubles, often resulting in a panic (dropping into DDB).

In the end I was perhaps looking at the wrong side of things: it might not be mksnap_ffs that hangs the machine, but *deleting* a snapshot.
I'm not sure, though: I think in the past I've seen situations where it hanged even without deleting, but I cannot reconstruct now.

Unfortunately https://reviews.freebsd.org/D28901 doesn't seem to help.
Next thing I'll do is try 13.0 (unless someone has a better suggestion).