Bug 229829 - [zfs] scrubbing prevents shutdown and slows down startup
Summary: [zfs] scrubbing prevents shutdown and slows down startup
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.2-STABLE
Hardware: Any
OS: Any
Importance: --- Affects Only Me
Assignee: freebsd-fs (Nobody)
 
Reported: 2018-07-17 13:17 UTC by Martin Birgmeier
Modified: 2023-11-06 20:19 UTC
CC List: 3 users

Description Martin Birgmeier 2018-07-17 13:17:54 UTC
Scenario 1:
- FreeBSD 11.2
- UFS root fs
- 6 x 1.5 TB SATA disks in RAIDZ2 pool
- The pool is being scrubbed
- Issue "shutdown -p now"

Actual result 1:
- The system shuts down and reports that the UFS fs has been synced
- Scrubbing continues (seemingly right after all the "usual" kernel log messages) and the system does not power off

Expected result 1:
- The system stops scrubbing and powers off

Scenario 2:
- Continued from scenario 1
- Press hard reset button
- The system starts booting into multi-user

Actual result 2:
- As soon as the pool is imported, scrubbing continues
- As a result, system startup is extremely slow; the UFS fs check does not finish in a reasonable time

Expected result 2:
- Scrubbing should not resume immediately upon pool import
- System startup should proceed at normal speed

Scenario 3:
- Continued from scenario 2
- Press hard reset button
- Boot system single user
- Run "fsck -p" in an effort to fix the UFS fs first; is successful
- Run "zpool list status"

Actual result 3:
- The system starts scrubbing the pool immediately
- As a result, "zpool list status" does not finish for a long time (actual duration can be given later)

Expected result 3:
- Scrubbing should not resume immediately upon pool import
- "zpool import" and "zpool list status" should complete at normal speed

In summary, I believe the solution is to pause all scrubbing activity before shutdown. Similarly, on boot, any pool marked for scrubbing should be treated as if scrubbing were paused on it.
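
For illustration only, the shutdown half could be as simple as the following sketch, assuming the scrub pause support ("zpool scrub -p") available in recent ZFS; where exactly to hook this into the shutdown sequence is left open:

# before the final sync/power-off: pause any active scrub so it cannot block shutdown
for pool in $(zpool list -H -o name); do
        zpool scrub -p "${pool}" 2>/dev/null
done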
Comment 1 Martin Birgmeier 2018-07-17 16:03:10 UTC
I now analyzed the startup time for scenario 3. This scenario continued as follows:

- type "^c"
- result: console shows "^c", otherwise no reaction, no command prompt
- type "^zbg<RET>"
- result: console shows "^zbg", otherwise no reaction, no command prompt
- type "exit<RET>"
- leave console unattended

Using last(1), the time from typing "exit" to the corresponding "boot time" entry was 20 minutes. This most likely means the pool import took 20 minutes to complete (I was no longer at the console at that point).
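
(For reference, the relevant entry can be pulled out with something like

last | grep 'boot time' | head -1

and compared against the time at which "exit" was typed; the exact invocation here is approximate.)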

-- Martin
Comment 2 Martin Birgmeier 2018-07-19 11:10:05 UTC
Scenarios, continued:
- server up, scrub of pool is continuing
- gstat is running
- 2 VirtualBox VMs are running
- another VirtualBox VM is accessing a zvol via iSCSI (ctl)

Result:
- The machine freezes: not immediately, but process after process gets stuck

Scenario continued:
- After 30 minutes watching this, perform hard reset
- Start single user
- "fsck -p" -> cleans UFS root fs
- issue "zpool export pool" in order to start multiuser without the pool being imported

Result:
- Scrubbing starts again immediately; no command prompt appears

Scenario continued:
- Issue "ifconfig <interface> inet <hostname>" into console buffer
- Issue "ifconfig <interface> inet6 <hostname>" into console buffer
- On another machine, wait until ping succeeds

Result:
- First ping after ca. 20 minutes (see previous scenarios)
- Console shows that 20G gmirror for swap has been rebuilt
- pool is exported

I am running with https://reviews.freebsd.org/D7538 applied because otherwise the system was slowly filling all swap space and swapping continuously, even with no activity.

Something is very wrong in FreeBSD 11.2 with ZFS, swapping, and memory management. I did not have these problems with 11.1.

-- Martin
Comment 3 Bryan Drewery freebsd_committer freebsd_triage 2022-06-17 18:46:14 UTC
I think I am running into this on stable/13 3a0fcdb37dffcd28c21c846d6165f6c382d9aac3
Comment 4 Bryan Drewery freebsd_committer freebsd_triage 2022-06-21 14:33:25 UTC
This seems simpler. I might find time to commit this.

diff --git usr.sbin/periodic/etc/daily/800.scrub-zfs usr.sbin/periodic/etc/daily/800.scrub-zfs
index 8cca1ea4d949..474e070153e8 100755
--- usr.sbin/periodic/etc/daily/800.scrub-zfs
+++ usr.sbin/periodic/etc/daily/800.scrub-zfs
@@ -15,6 +15,13 @@ then
     source_periodic_confs
 fi

+doscrub() {
+       local pool="$1"
+
+       zfs set org.freebsd:last-scrub=$(date +%F.%T) "${pool}"
+       zpool scrub "${pool}"
+}
+
 : ${daily_scrub_zfs_default_threshold=35}

 case "$daily_scrub_zfs_enable" in
@@ -55,9 +62,7 @@ case "$daily_scrub_zfs_enable" in
                        _pool_threshold=${daily_scrub_zfs_default_threshold}
                fi

-               _last_scrub=$(zpool history ${pool} | \
-                   egrep "^[0-9\.\:\-]{19} zpool scrub ${pool}\$" | tail -1 |\
-                   cut -d ' ' -f 1)
+               _last_scrub=$(zfs get -s local -H -o value org.freebsd:last-scrub ${pool})
                if [ -z "${_last_scrub}" ]; then
                        # creation time of the pool if no scrub was done
                        _last_scrub=$(zpool history ${pool} | \
@@ -88,12 +93,12 @@ case "$daily_scrub_zfs_enable" in
                                ;;
                        *"none requested"*)
                                echo "   starting first scrub (since reboot) of pool '${pool}':"
-                               zpool scrub ${pool}
+                               doscrub ${pool}
                                [ $rc -eq 0 ] && rc=1
                                ;;
                        *)
                                echo "   starting scrub of pool '${pool}':"
-                               zpool scrub ${pool}
+                               doscrub ${pool}
                                [ $rc -eq 0 ] && rc=1
                                ;;
                esac
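
For anyone testing the patch, the recorded timestamp can be checked directly with the same commands the script runs (pool name "tank" is an example):

# what the patched periodic script records before starting a scrub
zfs set org.freebsd:last-scrub=$(date +%F.%T) tank
# how it reads the value back (-s local limits output to locally set properties)
zfs get -s local -H -o value org.freebsd:last-scrub tank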
Comment 5 Martin Birgmeier 2022-06-21 17:55:24 UTC
Just my 2 cents: For the issue described in this PR, it would be necessary to suspend an ongoing scrub before shutdown and resume it after restart, assuming the system does both (especially the shutdown) cleanly.

Most likely something like a "reverse" rc.d script would be needed, where the shutdown procedure checks which zpools are currently being scrubbed, saves this info, and then suspends the scrubs; conversely, the startup would need to check which scrubs were suspended and resume them.

To avoid needing a separate file recording which zpools were being scrubbed across the shutdown/reboot, it would be nice to be able to query the pool directly for this information, as sketched below.
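
For example (assuming the "scrub paused since" wording that "zpool status" prints for a paused scrub):

zpool status pool | grep -q 'scrub paused since' && echo "pool has a suspended scrub"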

-- Martin
Comment 6 Ronald Klop 2022-06-22 07:41:29 UTC
(In reply to Martin Birgmeier from comment #5)
It should already work that way.
see "man zpool-scrub"

OPTIONS
     -s  Stop scrubbing.

     -p  Pause scrubbing.  Scrub pause state and progress are periodically
         synced to disk.  If the system is restarted or pool is exported
         during a paused scrub, even after import, scrub will remain paused
         until it is resumed.  Once resumed the scrub will pick up from the
         place where it was last checkpointed to disk.  To resume a paused
         scrub issue zpool scrub again.

     -w  Wait until scrub has completed before returning.
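
So the whole sequence, for an example pool "tank", would simply be:

zpool scrub -p tank    # before shutdown: pause; pause state is synced to disk
# ... reboot, pool import ...
zpool scrub tank       # after import: resumes from the last checkpoint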
Comment 7 J.R. Oldroyd 2023-11-06 20:19:52 UTC
This issue appears to be happening to me on 13.2-RELEASE-p1.

I do suspend/resume regularly, mostly with no problems, but occasionally the suspend blocks at the point where you'd expect the power-off, and it then requires a hard power-off. (Or possibly it just requires waiting for the scrub to complete, which is obviously not practical when suspending a laptop.)

I am experimenting with these additions in rc.suspend and rc.resume:

rc.suspend:
# pause any zpool scrub in progress
zpool status | while read KEY VALUE; do
        case "$KEY" in
        pool:)  POOL=$VALUE ;;
        scan:)  case "$VALUE" in
                "scrub in progress since "*)
                        echo "$POOL" >>/var/db/zpool.scrub.resume
                        zpool scrub -p $POOL
                        ;;
                esac
        esac
done

rc.resume:
# resume any scrub that was in progress on suspend
if [ -f /var/db/zpool.scrub.resume ]; then
	while read POOL; do
		zpool scrub "$POOL"
	done </var/db/zpool.scrub.resume
	rm /var/db/zpool.scrub.resume
fi

Something similar may also be needed in rc.shutdown with a suitable rc.d/zpool_scrub_resume script for the shutdown/reboot sequence.
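
As a rough, untested sketch (the script name, rcvar, and state-file path are placeholders of mine), an rc.d script covering the shutdown/reboot case could mirror the rc.suspend/rc.resume logic above:

#!/bin/sh
#
# PROVIDE: zpool_scrub_resume
# REQUIRE: zfs
# KEYWORD: shutdown

. /etc/rc.subr

name="zpool_scrub_resume"
rcvar="zpool_scrub_resume_enable"
start_cmd="${name}_start"
stop_cmd="${name}_stop"

statefile="/var/db/zpool.scrub.resume"

zpool_scrub_resume_start()
{
	# Resume any scrub that was paused at shutdown.
	[ -f "${statefile}" ] || return 0
	while read pool; do
		zpool scrub "${pool}"
	done < "${statefile}"
	rm -f "${statefile}"
}

zpool_scrub_resume_stop()
{
	# Pause any scrub in progress.  The pause state itself is
	# persisted by ZFS (see comment #6); the state file only
	# records which pools to resume at the next boot.
	zpool status | while read key value; do
		case "${key}" in
		pool:)	pool=${value} ;;
		scan:)	case "${value}" in
			"scrub in progress since "*)
				echo "${pool}" >> "${statefile}"
				zpool scrub -p "${pool}"
				;;
			esac
			;;
		esac
	done
}

load_rc_config $name
: ${zpool_scrub_resume_enable:="NO"}
run_rc_command "$1"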