Bug 255261 - Slow unmount of (ZFS) filesystem at reboot time
Summary: Slow unmount of (ZFS) filesystem at reboot time
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.2-RELEASE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-fs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-04-20 13:49 UTC by Peter Eriksson
Modified: 2021-04-22 21:50 UTC
CC List: 3 users

See Also:


Attachments
Patch to add a progress indicator for filesystem unmounting at reboot/shutdown (1.73 KB, patch)
2021-04-20 13:49 UTC, Peter Eriksson
Progress indicator during vfs_unmount at shutdown (1.92 KB, patch)
2021-04-21 09:34 UTC, Peter Eriksson
More verbose during shutdown (2.90 KB, patch)
2021-04-21 09:37 UTC, Peter Eriksson
Patch to make zfs_fini more verbose at shutdown (2.40 KB, patch)
2021-04-21 09:39 UTC, Peter Eriksson
Improved FreeBSD 12.2 version of "verbose shutdown" patch (12.14 KB, patch)
2021-04-22 21:49 UTC, Peter Eriksson
Improved FreeBSD 13 version of "verbose shutdown" patch (13.01 KB, patch)
2021-04-22 21:50 UTC, Peter Eriksson

Description Peter Eriksson 2021-04-20 13:49:52 UTC
Created attachment 224295
Patch to add a progress indicator for filesystem unmounting at reboot/shutdown

On a server with about 140000 ZFS filesystems, a "reboot" sometimes takes a very long time (hours), which the kernel spends unmounting the filesystems. Unfortunately, the kernel doesn't print any kind of progress indication while this happens, so you just see "Uptime: xxxx" printed and then nothing...

Please find enclosed a patch that adds a progress indication for the unmounting part.
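
For readers who have not opened the attachment, the idea of a progress indicator can be illustrated with a small userland sketch. This is not the attached patch: the function name unmount_progress, the interval parameter and the message format are all assumptions made for illustration, and a kernel version would use the kernel's printf rather than stdio.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not taken from the attached patch): print a
 * progress line after every `interval` unmounts, and once more when
 * the last filesystem is done.
 * Returns 1 if a line was printed, 0 otherwise.
 */
static int
unmount_progress(unsigned long done, unsigned long total,
    unsigned long interval)
{
	assert(interval > 0);
	if (done % interval != 0 && done != total)
		return (0);
	printf("Unmounting filesystems: %lu/%lu\n", done, total);
	return (1);
}
```

Called once per unmounted filesystem with an interval of 1000, this prints one line per thousand filesystems instead of staying silent for the whole unmount phase.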

(There used to be a related issue where freeing the ZFS kmem also took a very long time, but that fix is included, so that part goes quickly once the unmounting has finished.)
Comment 1 Peter Eriksson 2021-04-20 16:14:27 UTC
I probably should have mentioned some more details:

FreeBSD 12.2-RELEASE-p6
149354 ZFS filesystems

# zpool list
NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
DATA      582T   272T   310T        -         -     7%    46%  1.00x  ONLINE  -
FILUR00  65.5T   107G  65.4T        -         -     0%     0%  1.00x  ONLINE  -
FILUR02   196T  72.6T   124T        -         -    11%    36%  1.00x  ONLINE  -
FILUR03   196T  78.2T   118T        -         -    11%    39%  1.00x  ONLINE  -
RUNUR01  65.5T  11.8M  65.5T        -         -     0%     0%  1.00x  ONLINE  -
SUSPECT  9.06T  2.79M  9.06T        -         -     0%     0%  1.00x  ONLINE  -
UNUSED    196T  5.38M   196T        -         -     0%     0%  1.00x  ONLINE  -
zroot     444G  9.03G   435G        -         -    21%     2%  1.00x  ONLINE  -

During the reboot when it was really slow, it unmounted around 6-10 filesystems per second -> estimated reboot time around 4-5 hours... (Needless to say, I hard-rebooted it with an "ipmitool power reset" after an hour :-)
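
The estimate above is a plain division of the filesystem count by the observed unmount rate. A minimal sketch of that arithmetic follows; unmount_eta_seconds is a hypothetical helper name, not something from the patches.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: estimated seconds needed to unmount `count`
 * filesystems at `per_sec` unmounts per second, rounded up.
 */
static uint64_t
unmount_eta_seconds(uint64_t count, uint64_t per_sec)
{
	assert(per_sec > 0);
	return ((count + per_sec - 1) / per_sec);
}
```

With the 149354 filesystems listed above, 10 unmounts per second gives 14936 seconds (about 4.1 hours) and 6 per second gives 24893 seconds (about 6.9 hours), so the 4-5 hour figure corresponds to the faster end of the observed range.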

(It hasn't been this slow before, but we don't reboot these servers very often).

Anyway, a more verbose vfs_unmount() would be a good thing even with a saner number of filesystems, in my opinion.

(And preferably a faster zfs unmount operation :-)
Comment 2 dgilbert 2021-04-20 18:39:43 UTC
"That part is included" ... where?  I have only 400-odd filesystems, but I still find that long running ZFS can take minutes if not hours to let the computer reboot.  I'd like to test your patch, but I'd like this other patch, too.
Comment 3 Peter Eriksson 2021-04-20 19:46:03 UTC
(In reply to dgilbert from comment #2)

Ah, sorry. Wasn't really clear there. The fix for that other problem is already in the normal FreeBSD 12.2 kernel. 

Information about that fix (from FreeBSD 11.3) is in bug 242427 (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=242427) - however that is something that happens after all the filesystems are unmounted.

On a machine with a more modest number of ZFS filesystems (3000), the unmounting is much quicker in my tests: about 3000 filesystems unmounted in 10 seconds. That is on a fairly recently rebooted machine with the same hardware as the slow one (an HP DL380g9 with 512-768GB of RAM; 140 10TB drives on the big one and 10 10TB drives on the small one).
Comment 4 dgilbert 2021-04-20 19:56:11 UTC
Hrm.  Not matching my experience, then.  I recently updated from 12.2 to 13 --- basically at the beginning of the release schedule.  My server has 128G RAM, an 8C/16T Threadripper and 60T of disk.

The disk has many uses, but I also do poudriere builds on this machine --- which tend to thrash it pretty hard.

If a reboot is done after low uptime (under a week or so), things are fine.  But the reboot time rises as uptime does ... roughly.  Poudriere runs seem to frustrate it.

So far, with 13, I have seen a reboot take ~ 5 minutes.  Small amounts of disk activity on the array (just noticed the blinken lights).  This is after the buffer messages but before the uptime is printed.
Comment 5 Peter Eriksson 2021-04-20 22:33:09 UTC
FreeBSD 13 uses the new OpenZFS code whereas 12.2 uses the older FreeBSD ZFS code, so it probably differs quite a bit.

If you're seeing the "Uptime: xx" message then it's not the unmounting of the filesystems, since "Uptime: xx" is printed after they are all unmounted. After that it frees kernel memory and other stuff... I have another patch that adds more verbose printing in various parts that were very slow back in the FreeBSD 11.3 days, but I'm not sure how much of it applies directly to FreeBSD 13 due to the new ZFS code.

Hmm.. I wonder if the changes in the memory handling code in FreeBSD might have caused the kmem_cache stuff to become slow again for some reason.
Comment 6 Alan Somers freebsd_committer 2021-04-20 22:38:48 UTC
I too suffer from long reboots, on both 12.2 and 13.0, on servers with large numbers of disks.  I plan to test your patch.  On my systems, the long hang (1m - 30m) happens after "All buffers synced".
Comment 7 Peter Eriksson 2021-04-21 09:34:57 UTC
Created attachment 224325
Progress indicator during vfs_unmount at shutdown

An updated vfs_unmount progress indicator patch with more details. Applies to 12.2 and 13.0 (and compiles on current).
Comment 8 Peter Eriksson 2021-04-21 09:37:44 UTC
Created attachment 224326
More verbose during shutdown

Adds a sysctl to control how verbose the shutdown should be and prints some progress indicators. Written for FreeBSD 13, but probably applies to 12.2 too without much fuzz.
Comment 9 Peter Eriksson 2021-04-21 09:39:16 UTC
Created attachment 224327
Patch to make zfs_fini more verbose at shutdown

A patch to print more details while closing down zfs at shutdown/reboot. Requires the "More verbose during shutdown" patch. For FreeBSD 13 and newer.
Comment 10 Peter Eriksson 2021-04-22 21:49:57 UTC
Created attachment 224366
Improved FreeBSD 12.2 version of "verbose shutdown" patch
Comment 11 Peter Eriksson 2021-04-22 21:50:23 UTC
Created attachment 224367
Improved FreeBSD 13 version of "verbose shutdown" patch