Bug 267628

Summary: growfs deadlocks if output is redirected to the filesystem being grown
Product: Base System Reporter: cgull
Component: binAssignee: freebsd-fs (Nobody) <fs>
Status: Closed FIXED    
Severity: Affects Some People CC: grahamperrin, kib, mckusick
Priority: ---    
Version: CURRENT   
Hardware: Any   
OS: Any   
See Also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267630
https://reviews.freebsd.org/D37896

Description cgull 2022-11-07 22:01:27 UTC
growfs deadlocks in disk wait on 'suspfs' if its stdout/stderr is redirected to a file on the filesystem being grown.  This appears to be because it suspends the filesystem with the UFSSUSPEND ioctl, and then writes disk block numbers as a progress indicator as it writes new cylinder groups.

The reproduction is easy:

mdconfig -s 20m
gpart create -s gpt md0
gpart add -t freebsd-ufs -s 10m -i 1 md0
newfs md0p1
mount /dev/md0p1 /mnt
gpart resize -i 1 md0
growfs -y md0p1 > /mnt/growfs.log


Reproduced in 13.1p2 and a -CURRENT snapshot dated Nov 3 07:57 with identifier 'main-n259005-5cc5c9254da'.

Also, if the system is shutdown with a growfs in this state, disk syncing hangs and the system never halts/reboots.  I'll open a separate bug for that.

Ideas for fixes (some are not complete fixes):

* fstat() stdout/stderr and compare their fsids against the filesystem we're growing.  If either is found to match, exit with an error message.  I think that may be a complete solution for growfs (it can no longer uninterruptibly deadlock).  'growfs | tee /growing-fs/growfs.log' will still hang if growfs produces enough output, but growfs will be interruptible, and tee will make progress after the ufssuspend state is exited.

* Write new cylinder groups first, before doing UFSSUSPEND and writing the formerly-last cg and superblock.  This is not a complete fix for this particular issue (if growfs reports errors it will still deadlock), but is 90%, and has the added benefits of doing most of the I/O (and encountering most possible errors) before making irreversible changes to the existing filesystem, and also reducing the amount of time the filesystem is suspended.

* Don't report progress or errors while in ufssuspend state.

* On non-ttys, set stdio buffers so that progress indicators are not written until completion.

I'll try and come up with patches for this.
Comment 1 cgull 2022-11-08 00:04:52 UTC
The kernel issue is bug 267630.
Comment 2 Kirk McKusick freebsd_committer freebsd_triage 2022-12-28 23:04:30 UTC
I am inclined to your third suggestion in the case that you are writing to a file in the filesystem being grown (i.e., detected as in your first suggestion).
Comment 3 Konstantin Belousov freebsd_committer freebsd_triage 2022-12-28 23:38:52 UTC
I suggest just disable a process to have files opened for write on the
target filesystem.  There are more problems due to this.

https://reviews.freebsd.org/D37896
Comment 4 commit-hook freebsd_committer freebsd_triage 2022-12-29 20:56:42 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=701b36961cbd2084c36eb828402d0ef89211d929

commit 701b36961cbd2084c36eb828402d0ef89211d929
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2022-12-28 18:14:52 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2022-12-29 20:55:39 +0000

    ufs/suspend: deny suspension if the calling process has a file from mp opened for write

    Also deny suspension if we cannot check the above condition race-free
    because there is more than one thread in the calling process.

    PR:     267628, 267630
    Reviewed by:    mckusick
    Sponsored by:   The FreeBSD Foundation
    MFC after:      1 week
    Differential revision:  https://reviews.freebsd.org/D37896

 sys/ufs/ffs/ffs_suspend.c | 11 +++++++++++
 1 file changed, 11 insertions(+)
Comment 5 cgull 2022-12-29 21:22:55 UTC
I have some WIP addressing this and adding enhancements, which I'd not finished testing before the holidays.

It may be a bit late now, but my fix to growfs is to check stdout/stderr for mountpoints in growfs.c instead.  That's less general, but growfs() and /sbin/growfs aren't very general-purpose to begin with.  I think kib's approach is fine, though.

I have some other minor changes to growfs to avoid writing to stdout/stderr unnecessarily, and I also implemented my proposed enhancements to growfs.c, growfs() syscall, and /dev/ufssuspend to write new cylinder groups before rewriting the last existing cylinder group.

I'll finish testing these and make them available for review soon.
Comment 6 Kirk McKusick freebsd_committer freebsd_triage 2022-12-29 23:13:37 UTC
(In reply to cgull from comment #5)
I am happy to review your proposed changes. Kostik's changes are needed in any event as the kernel must protect against the deadlock and not just depend on a utility to do so. However having the utility avoid the problem and give a more specific error message is still good to have. Also the reordering to narrow the window of vulnerability to filesystem corruption is much appreciated.
Comment 7 commit-hook freebsd_committer freebsd_triage 2023-01-20 03:25:08 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=00c0e20c5180284298910dcf0a677243b2c28f7d

commit 00c0e20c5180284298910dcf0a677243b2c28f7d
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2022-12-28 18:14:52 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2023-01-20 03:21:35 +0000

    ufs/suspend: deny suspension if the calling process has a file from mp opened for write

    PR:     267628, 267630
    Tested by:      pho

    (cherry picked from commit 701b36961cbd2084c36eb828402d0ef89211d929)

 sys/ufs/ffs/ffs_suspend.c | 11 +++++++++++
 1 file changed, 11 insertions(+)