255473 – Infinite writes on UFS with SU+J filesystem

Bug 255473 - Infinite writes on UFS with SU+J filesystem

Summary: Infinite writes on UFS with SU+J filesystem

Status:	Closed FIXED

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	kern (show other bugs)
Version:	13.0-STABLE
Hardware:	Any Any

Importance:	--- Affects Some People
Assignee:	freebsd-fs (Nobody)

URL:
Keywords:

Duplicates (1):	253968 (view as bug list)
Depends on:
Blocks:

Reported:	2021-04-28 22:55 UTC by Jack
Modified:	2021-07-10 02:23 UTC (History)
CC List:	8 users (show)

See Also:

Attachments
top with hung machine doing infinite i/o (286.73 KB, image/jpeg) 2021-05-05 10:24 UTC, Jack	no flags	Details
The lockf process (94.49 KB, image/jpeg) 2021-05-05 10:32 UTC, Jack	no flags	Details
Unable to turn off soft update journaling (255.90 KB, image/jpeg) 2021-05-26 20:33 UTC, Jack	no flags	Details
Unable to turn off soft update journaling (381.43 KB, image/jpeg) 2021-05-27 03:15 UTC, Jack	no flags	Details
patch to disable journaled soft updates (800 bytes, patch) 2021-05-31 05:29 UTC, Kirk McKusick	no flags	Details \| Diff
Unable to disable journal on softupdates after applying patch (646.65 KB, image/jpeg) 2021-05-31 07:09 UTC, Jack	no flags	Details
Show Obsolete (1) View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Jack 2021-04-28 22:55:27 UTC

Using FreeBSD 13.0-STABLE #0 stable/13-n245116-edda63ba979a

When I rm -rf a directory with a lot of files like "rm -rf /usr/ports/*" and /usr/ports is symlinked to another directory on another drive like /drive2/ports

The rm -rf finishes but and returns to the prompt but gstat shows 100% writes to the disk forever and attempting to access the directory hangs.

When trying to reboot the system, it says there are outstanding writes and eventually needing to power cycle the system to get it back to a usable state.

This has happened when I had it on 13.0-RELEASE

Comment 1 Mark Millard 2021-04-29 01:41:09 UTC

UFS file systems? ZFS file systems? TRIM enabled
and in use vs. not? (gstat -spod would show
delete activity for TRIM activity.)

There may be other configuration information that
would be relevant that I've not thought of.

Hangs: Do both forms of reference hang:
/drive2/ports/NAME and /usr/ports/NAME ?

Comment 2 Jack 2021-04-29 01:50:55 UTC

UFS2, spinning disks in a gmirror (but also happened on single disk). No trim enabled.

I just tried to ls /usr/ports after but ls just hung, didn't try anything else. kill -9 the ls process didn't respond, just stuck at biorw or something similar to biorsomething.

Comment 3 Jack 2021-04-29 01:58:53 UTC

Looking through bug reports, it looks similar to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224292 in the pkg upgrade case on 13.0-CURRENT

Thinking about it, it does usually happen when I'm attempting to do some pkg operation at the same time I'm deleting a ton of files (usually rm -rf old world builds in /usr/obj and /usr/ports/*/*/work directories set to WRKDIRPREFIX=/usr/obj that weren't cleaned up)

I'll try to reproduce the problem again on a vm.

Comment 4 Jack 2021-04-30 03:06:26 UTC

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=255499 looks to be similar also. I'm also using 
soft update journaling: (-j)                       enabled

Comment 5 Jack 2021-05-05 10:23:20 UTC

I just encountered another infinite write with gmirror setup on UFS2+SUJ
FreeBSD 13.0-STABLE (PXE) #0 stable/13-n245487-d0fbb03a4dc5: Tue May  4 00:36:26                               

I was shutting down a virtualbox vm and then this process shows up and attempting to lsof -p 47491 just hangs without being able to ctrl+c

47491 root          1  20    0    12M  1792K biowr    2   0:11   1.11% lockf

gstat -spod
dT: 1.001s  w: 1.000s
 L(q)  ops/s    r/s     kB   kBps   ms/r    w/s     kB   kBps   ms/w    d/s     kB   kBps   ms/d    o/s   ms/o
  %busy Name
    0      0      0      0      0    0.0      0      0      0    0.0      0      0      0    0.0      0    0.0
   0.0| cd0
    3   4034      0      0      0    0.0   4034     32 129087    0.6      0      0      0    0.0      0    0.0
  98.5| ada0
    2   4033      0      0      0    0.0   4033     32 129055    0.5      0      0      0    0.0      0    0.0
  96.4| ada1
    0      0      0      0      0    0.0      0      0      0    0.0      0      0      0    0.0      0    0.0
   0.0| ada2

but after typing the gstat command, the system is completely frozen from i/o and can't ssh anymore. I had to get on the console and top is running but even top does not exit. Attaching screenshot in next comment.

Comment 6 Jack 2021-05-05 10:24:18 UTC

Created attachment 224684 [details]
top with hung machine doing infinite i/o

top output from console since machine is locked up from infinite i/o

Comment 7 Jack 2021-05-05 10:32:23 UTC

Created attachment 224685 [details]
The lockf process

Comment 8 Jack 2021-05-05 10:39:04 UTC

Let me know if I can run anything else while console is still active and machine frozen

Comment 9 Kirk McKusick freebsd_committer

2021-05-11 23:45:15 UTC

Does the problem go away if you run with just soft updates (rather than journalled soft updates)? We have been investigating a similar issue where the journel gets full and we are not managing to effectively flush it.

Comment 10 Kirk McKusick freebsd_committer

2021-05-11 23:56:37 UTC

(In reply to Kirk McKusick from comment #9)

You can disable journalled soft updates on disk /dev/ada0p2 on /mnt using:

# unmount /mnt
# tunefs -j disable /dev/ada0p2
Clearing journal flags from inode 4
tunefs: soft updates journaling cleared but soft updates still set.
tunefs: remove .sujournal to reclaim space
# mount /dev/ada0p2 /mnt
# rm -f /mnt/.sujournal

Comment 11 Jack 2021-05-12 04:32:37 UTC

(In reply to Kirk McKusick from comment #10)
It doesn't seem to always happen so hard to tell but it's also the / gmirror disk so would need to reboot the system to make the change.

Comment 12 Kirk McKusick freebsd_committer

2021-05-12 20:24:04 UTC

(In reply to Jack from comment #11)
It would be helpful to turn off soft updates journaling when you next schedule a reboot so that we can find out if the problem persists. Journaling slows your disk I/O by 1-2% as a tradeoff for making fsck run much faster after a crash. There is no change in risk of filesystem corruption from turning off journaling.

Comment 13 Jack 2021-05-14 00:06:38 UTC

It's doing the infinite writes again
13.0-STABLE FreeBSD 13.0-STABLE #0 stable/13-n245485-dec9f377531d

    1  13805      0      0    0.0  13805 441756    0.1  100.0| ada0
    1  13805      0      0    0.0  13805 441756    0.1  100.0| ada1
    1  13805      0      0    0.0  13805 441756    0.1  100.0| ada0p2
    1  13805      0      0    0.0  13805 441756    0.1  100.0| ada1p2
    1  13805      0      0    0.0  13805 441756    0.1  100.0| mirror/root1

The command I issued immediately before was
cd /usr/ports;git pull;portupgrade -ua

It's been stuck at

[Reading data from pkg(8) ... - 1088 packages found - done]

for the past 2 hours and everything i type results in a hang.

top shows

 1113 dhcpd         1  24    0    21M  2956K biowr    1  12:51   8.50% dhcpd
50445 root          1  23    0   117M    38M getblk   5  17:50   6.66% ruby27
50776 root          1  22    0    13M  2552K getblk   3   5:09   5.24% sh
50693 root          1  22    0   141M  8316K ufs      2   5:02   4.21% smbd
50783 root          1  22    0   141M  8316K ufs      7   3:27   4.17% smbd
49660 root          1  22    0   141M  8280K RUN      7  46:09   4.16% smbd
50724 root          1  22    0   141M  8316K ufs      5   4:32   3.96% smbd
50884 root          1  22    0   141M  8320K ufs      3   1:22   3.95% smbd
50817 root          1  22    0   141M  8320K ufs      2   2:40   3.94% smbd
50789 root          1  22    0   141M  8316K biowr    6   3:24   3.91% smbd
50532 root          1  21    0   141M  8316K ufs      2   9:06   3.83% smbd
50725 root          1  22    0   141M  8316K ufs      1   4:32   3.83% smbd
50888 root          1  22    0   141M  8324K ufs      5   1:19   3.82% smbd
50711 root          1  21    0   141M  8316K ufs      6   4:41   3.80% smbd
50786 root          1  21    0   141M  8316K ufs      2   3:24   3.79% smbd
50885 root          1  21    0   141M  8320K ufs      5   1:20   3.78% smbd
49659 root          1  21    0    46M  5648K RUN      6  46:17   3.76% nmbd
50887 root          1  22    0   141M  8324K biowr    4   1:19   3.73% smbd
49658 root          1  21    0    54M  7480K CPU2     2  46:17   3.73% winbindd
50889 root          1  21    0   141M  8324K ufs      3   1:17   3.71% smbd
50784 root          1  21    0   141M  8316K ufs      7   3:27   3.64% smbd
50886 root          1  21    0   141M  8320K ufs      4   1:20   3.63% smbd
50818 root          1  21    0   141M  8320K ufs      4   2:39   3.58% smbd
50533 root          1  21    0   141M  8316K ufs      7   9:07   3.57% smbd

Only way to recover is to power cycle the machine as reboot also hangs.

Comment 14 Kirk McKusick freebsd_committer

2021-05-15 01:14:48 UTC

(In reply to Jack from comment #13)
When you do the reboot, please try disabling soft update journaling as described in my comment #10.

Comment 15 Jack 2021-05-15 01:19:16 UTC

(In reply to Kirk McKusick from comment #14)
But if I disable journaling and I power cycle the machine, will fsck take a very long time on a 4T gmirror with maybe over a million files?

Comment 16 Kirk McKusick freebsd_committer

2021-05-15 01:22:41 UTC

(In reply to Jack from comment #15)
Fsck will likely take up to 15 minutes. But if the net result is that the machine stops hanging, then it will be well worth it. And if it does stop hanging the machine, we will have a good idea of where to look to find and hopefully fix the problem.

Comment 17 Jack 2021-05-15 01:30:09 UTC

(In reply to Kirk McKusick from comment #16)
Ok, I'll try. This machine is a router and file server so it's hard to find a good time to not unintentionally take down the machine.

Comment 18 Kirk McKusick freebsd_committer

2021-05-19 22:09:13 UTC

(In reply to Jack from comment #17)
Is it possible for you to try applying the patch described in

https://reviews.freebsd.org/D30041

It is attempting to mitigate the problem that you are experiencing when running with soft updates journaling.

It would still be useful to know if disabling journaling is helpful because if it is then it validates that the fix in D30041 is a useful idea to pursue.

Comment 19 Jack 2021-05-19 22:25:01 UTC

(In reply to Kirk McKusick from comment #18)
I can apply the patch. So far the system hasn't hung yet so I haven't needed to reboot it. The random hangs makes it hard to tell when it will show itself unfortunately.

Comment 20 Jack 2021-05-25 19:09:33 UTC

Had a power outage and still haven't been able to reproduce the issue on the system that was hanging but I now have a new computer with an NVME and SSD in a gmirror together running the latest 5/25/21 b14db362bbd2 git checkout of the source code that hung. I had done a fresh make buildworld,installworld, rebooted, then did a rm -rf /usr/ports. Edited /etc/fstab and system did infinite writes to the gmirror nvme/ssd. It was so hung that my /etc/fstab couldn't save and the only way to recover was to power cycle the machine. The gmirror had su+j+trim. Appears bug still exists in current 13-stable.

Comment 21 sigsys 2021-05-26 04:31:01 UTC

(In reply to Jack from comment #20)
I think I'm seeing the same problem on 14-CURRENT occasionally and when it starts doing crazy writes just running "sync" makes it stop.  When running commands is still possible anyway.

It has gotten rarer for me with recent updates but running chrome and clicking links really fast would sometimes still trigger the problem (with files from chrome's cache directory being involved).

In my case it's a VM with the disks on zvols and the pool has sync=disabled.  Maybe it causes write I/Os to be much faster than what is usual for UFS and that's what can trigger the problem.  Maybe your NVME disks are way faster than what it's used to too.  Maybe regular data being written so fast causes more complex meta-data softupdates structures to be handled together.  I'm really just guessing here... in case it could give someone an idea.  Maybe running UFS on virtual disks with ridiculously fast writes could be a good way to try to reproduce it.

Comment 22 Kirk McKusick freebsd_committer

2021-05-26 05:10:24 UTC

(In reply to Jack from comment #20)
Let me reiterate. Please try disabling journaled soft updates as described in comment #10 to see if the problem persists. This basic test will enable us to understand how to proceed on fixing the problem.

Comment 23 Kirk McKusick freebsd_committer

2021-05-26 05:12:01 UTC

(In reply to sigsys from comment #21)
Please try running with journaled soft updates disabled (but leave soft updates enabled) as described in my comment #10. Let us know if that makes the problem go away.

Comment 24 sigsys 2021-05-26 08:40:00 UTC

(In reply to Kirk McKusick from comment #23)
Welp, now I can't for the life of me reproduce it (with journaling enabled).  Usually I could by running some du/cp/rm loops and chrome but now it seems intent on just working perfectly.  It used to happen a lot (sometimes a few times a day, and not necessarily with a ton of FS activity like that) but it has become much more rare now.  So I might not be very useful to test this anymore but it definitively looked like the same problem.  I'm going to try to reproduce it with journaling enabled some more and then run this system with journaling disabled and see if the problem ever happens again.

Comment 25 Jack 2021-05-26 09:10:32 UTC

(In reply to sigsys from comment #24)
I'm in the same situation, if I try to reproduce it, it doesn't happen but when I'm doing normal things, it crops up. I'm disabling journaling on my nvme/ssd gmirror to see if it comes up again. Brand new install.

Comment 26 Jack 2021-05-26 20:33:30 UTC

Created attachment 225293 [details]
Unable to turn off soft update journaling

I tried to turn off soft update journaling but when I mount the filesystem, tunefs says it's on.

Comment 27 Kirk McKusick freebsd_committer

2021-05-26 23:21:24 UTC

(In reply to Jack from comment #26)
Did you follow my instructions in comment #10? Specifically you need to have
done it so that it gets disabled on both sides of the mirror. Also, did you remove .sujournal? Because if it is gone, it is not possible to mount with journaling enabled.

Comment 28 Jack 2021-05-27 02:24:55 UTC

(In reply to Kirk McKusick from comment #27)
I thought removing it from the gmirror was sufficient, did not know it had to be done on the disk itself. Will remove it from the disks and try again.

Comment 29 Jack 2021-05-27 03:15:18 UTC

Created attachment 225298 [details]
Unable to turn off soft update journaling

I booted into single user mode and I tried to turn off journaling but it's throwing this error (I set kern.geom.debugflags=18)

Comment 30 Kirk McKusick freebsd_committer

2021-05-29 00:20:39 UTC

(In reply to Jack from comment #29)
What does the filesystem report when it is mounted? Does it show as journaling?

Comment 31 Jack 2021-05-29 05:57:34 UTC

Says it is on

gmirror status root
       Name    Status  Components
mirror/root  COMPLETE  nvd0p2 (ACTIVE)
                       ada0p2 (ACTIVE)
                       ada1p2 (ACTIVE)

/dev/mirror/root on / (ufs, local, noatime, journaled soft-updates)


tunefs -p /dev/mirror/root
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       enabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         enabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             0%
tunefs: space to hold for metadata blocks: (-k)            6400
tunefs: optimization preference: (-o)                      space
tunefs: volume label: (-L)

 tunefs -p /dev/nvd0p2
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       enabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         enabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             0%
tunefs: space to hold for metadata blocks: (-k)            6400
tunefs: optimization preference: (-o)                      space
tunefs: volume label: (-L)

tunefs -p /dev/ada0p2
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       enabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         enabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             0%
tunefs: space to hold for metadata blocks: (-k)            6400
tunefs: optimization preference: (-o)                      space
tunefs: volume label: (-L)

tunefs -p /dev/ada1p2
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       enabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         enabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             0%
tunefs: space to hold for metadata blocks: (-k)            6400
tunefs: optimization preference: (-o)                      space
tunefs: volume label: (-L)

Comment 32 Kirk McKusick freebsd_committer

2021-05-29 21:58:06 UTC

(In reply to Jack from comment #31)
When you booted single user on /dev/mirror/root and tried to use tunefs to disable soft updates journaling, what error did you get?

Comment 33 Jack 2021-05-29 22:31:00 UTC

(In reply to Kirk McKusick from comment #32)
I did not get any error. See https://bz-attachments.freebsd.org/attachment.cgi?id=225293

It claims it was disabled but when mounting it, tunefs showed it was enabled.

Comment 34 Kirk McKusick freebsd_committer

2021-05-31 00:23:15 UTC

(In reply to Kirk McKusick from comment #32)
If I provide you with a kernel patch that disables journaled soft updates can you install and run a patched kernel on your system? If so let me know what vintage kernel you are running (output of uname -a).

Comment 35 Jack 2021-05-31 01:43:51 UTC

(In reply to Kirk McKusick from comment #34)
Yes, I can do that. I'll be checking out the current 13-stable so the patch against whatever is in git will be fine.

Comment 36 Kirk McKusick freebsd_committer

2021-05-31 05:29:33 UTC

Created attachment 225397 [details]
patch to disable journaled soft updates

please try this patch.

Comment 37 Jack 2021-05-31 07:09:10 UTC

Created attachment 225399 [details]
Unable to disable journal on softupdates after applying patch

I'm still unable to turn off journaling with the patch.

root@upstairs:/usr/src # git diff
diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
index 2cc16ab2a2c1..cc3f64b4e70e 100644
--- a/sys/ufs/ffs/ffs_softdep.c
+++ b/sys/ufs/ffs/ffs_softdep.c
@@ -2653,6 +2653,10 @@ softdep_uninitialize()
        callout_drain(&softdep_callout);
 }
 
+static int enable_suj = 0;
+SYSCTL_INT(_debug_softdep, OID_AUTO, enable_suj, CTLFLAG_RW,
+    &enable_suj, 0, "Enable the activation of journaled soft updates");
+
 /*
  * Called at mount time to notify the dependency code that a
  * filesystem wishes to use it.
@@ -2716,7 +2720,7 @@ softdep_mount(devvp, mp, fs, cred)
        ACQUIRE_GBLLOCK(&lk);
        TAILQ_INSERT_TAIL(&softdepmounts, sdp, sd_next);
        FREE_GBLLOCK(&lk);
-       if ((fs->fs_flags & FS_SUJ) &&
+       if (enable_suj && (fs->fs_flags & FS_SUJ) &&
            (error = journal_mount(mp, fs, cred)) != 0) {
                printf("Failed to start journal: %d\n", error);
                softdep_unmount(mp);

root@upstairs:/usr/src # uname -a
FreeBSD upstairs.jack.com 13.0-STABLE FreeBSD 13.0-STABLE #0 stable/13-n245805-4a4a174959ce-dirty: Sun May 30 23:31:47 PDT 2021     jack@upstairs.jack.com:/usr/obj/usr/src/amd64.amd64/sys/UPSTAIRS  amd64

Comment 38 Kirk McKusick freebsd_committer

2021-05-31 22:46:17 UTC

(In reply to Jack from comment #37)
The filesystem is still enabled for journaled soft updates, but with my patch the kernel does not mount it with journaled soft updates (unless you re-enable them using `sysctl debug.softdep.enable_suj=1'). You can see that journaled soft updates are not enabled on the filesystem by typing the `mount' command (with no arguments) after it is mounted and you will see that it is running with just soft updates.

Comment 39 Jack 2021-06-02 02:05:03 UTC

Looks like that's the case.
/dev/mirror/root on / (ufs, local, noatime, soft-updates)

I'll try to see if I can make the system hang with the usual ways it happens.

Comment 40 Kirk McKusick freebsd_committer

2021-06-07 19:16:09 UTC

(In reply to Jack from comment #39)
We have had others report that they can reproduce the problem and that
1) disabling journaling makes the problem go away;
2) increasing the journal size moderates the problem;
3) the patch proposed in https://reviews.freebsd.org/D30041 helps to moderate the problem.

Comment 41 Jack 2021-06-07 22:20:01 UTC

That's great that the issue is reproducible, I can't seem to reliably reproduce it. So far I've been running with the disable journal patch and have not come across any hangs yet.

Comment 42 Kirk McKusick freebsd_committer

2021-06-22 06:07:48 UTC

*** Bug 253968 has been marked as a duplicate of this bug. ***

Comment 43 Jack 2021-07-05 05:05:59 UTC

(In reply to sigsys from comment #21)
I have a few more systems that are default installs with the latest 13-STABLE (as of today) with soft updates+journal+ufs+gmirror randomly have this infinite write problem when doing a cd /usr/ports;git pull, some are ssds and some are spinning disks. Simply typing "sync" appears to stop the infinite writes for me too if that helps any for solving the soft updates+journaling.

Comment 44 Kirk McKusick freebsd_committer

2021-07-06 21:57:19 UTC

(In reply to Jack from comment #43)
Thanks for the update. The patch that was committed to head (described in https://reviews.freebsd.org/D30041) includes doing an internal sync when this condition arises and that helps a lot. Assuming that no new issues arise in head, the MFC to 13-stable should happen later this week.

Comment 45 Kirk McKusick freebsd_committer

2021-07-10 02:15:43 UTC

(In reply to Jack from comment #43)
The fix was MFC'ed to 13-STABLE on July 7th, so a kernel built from sources on or after that date should have the fix.

Comment 46 Jack 2021-07-10 02:23:26 UTC

(In reply to Kirk McKusick from comment #45)
Thanks, I'll unpatch my src and rebuild (and upgrade the rest of my systems) to see if the issue resurfaces.