Bug 121366 - [zfs] [patch] Automatic disk scrubbing from periodic(8)
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 7.0-STABLE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-fs (Nobody)
Reported: 2008-03-04 20:10 UTC by Stefan Moeding
Modified: 2011-10-11 08:26 UTC

Description Stefan Moeding 2008-03-04 20:10:01 UTC
The ZFS Best Practices Guide recommends disk scrubbing on a regular
basis.  The attached file fits into the periodic(8) framework to start
a weekly scrub.  It will look at all pools and choose one of them on a
round-robin basis depending on the date of the last scrub/resilver
unless there is already a scrub or resilver running.
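
A minimal sketch of that selection rule, using made-up pool names and
timestamps instead of real zpool output.  Each input line is
"timestamp pool", with 0 for a pool that was never scrubbed and -1 for a
running scrub or resilver.  The helper name pick_pool is hypothetical.

```shell
#!/bin/sh
# pick_pool: read "timestamp pool" lines on stdin and print the line with
# the smallest timestamp.  A running scrub (-1) sorts first, then pools
# that were never scrubbed (0), then the pool scrubbed longest ago.
pick_pool() {
    sort -n | head -1
}

# Hypothetical sample data; the real script derives it from 'zpool status'.
printf '%s\n' '1204661401 tank' '0 backup' '1204056601 data' | pick_pool
```

A pool that has just been scrubbed carries the newest timestamp and moves
to the end of the order, which is what gives the round-robin behaviour.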

The script should probably be installed to run at the end of the weekly
schedule to avoid I/O contention between scrubbing and the other weekly
scripts.

The new periodic(8) switch 'weekly_zfs_scrubbing_enable' is introduced.
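
Assuming the script is installed in the weekly periodic(8) directory, it
would be switched on from /etc/periodic.conf like any other periodic
setting.  A sketch:

```shell
# /etc/periodic.conf
weekly_zfs_scrubbing_enable="YES"
```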

One drawback: ZFS seems to forget the time of a scrub/resilver when
shutting down.  On machines with regular reboots and multiple pools
the script will probably pick the same pool every time.

Fix: 

#!/bin/sh
#
# $FreeBSD$
#

# If there is a global system configuration file, suck it in.
#
if [ -r /etc/defaults/periodic.conf ]
then
    . /etc/defaults/periodic.conf
    source_periodic_confs
fi

scrubdate() {
    # Echo lines with timestamp (0=never, -1=running)
    # of last scrub/resilver and pool name to stdout.
    for pool in $(/sbin/zpool list -H -o name); do
        /sbin/zpool status ${pool} |
        while read line; do
            case "$line" in
                scrub:\ scrub\ completed*|scrub:\ resilver\ completed*)
                    # Extract date from zpool output and convert to epoch
                    date=$(echo $line | sed 's/^scrub: .* on //')
                    date=$(date -jn -f "%+" "$date" +"%s")
                    echo $date $pool
                    ;;
                scrub:\ scrub\ in\ progress*|scrub:\ resilver\ in\ progress*)
                    # Scrub or resilver is running
                    echo -1 $pool
                    ;;
                scrub:\ none\ requested)
                    # Pool has never been scrubbed or resilvered
                    echo 0 $pool
                    ;;
            esac
        done
    done
}

case "$weekly_zfs_scrubbing_enable" in
    [Yy][Ee][Ss])
        line=$(scrubdate | sort -n | head -1)

        if [ -n "$line" ]; then
            date=$(echo $line | cut -d" " -f1)
            pool=$(echo $line | cut -d" " -f2)

            echo ""
            case "$date" in
                -1)
                    echo "Scrub or resilver is running for pool $pool:"
                    /sbin/zpool status $pool
                    rc=1
                    ;;
                *)
                    echo "Starting scrub for pool $pool:"
                    /sbin/zpool scrub $pool
                    rc=$?
                    ;;
            esac
        else
            echo '$weekly_zfs_scrubbing_enable is set' \
                 'but no zfs pools have been created'
            rc=2
        fi;;
    *)  rc=0;;
esac

exit $rc
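
The date extraction in scrubdate() can be tried on its own.  The sketch
below uses a made-up status line in the format the script matches.  The
conversion to epoch seconds depends on the -j and -f options of FreeBSD
date(1), so it is only shown as a comment.

```shell
#!/bin/sh
# Hypothetical 'zpool status' line in the format matched by scrubdate().
line='scrub: scrub completed with 0 errors on Tue Mar 11 20:10:01 2008'

# Strip everything up to and including the last " on ".
stamp=$(printf '%s\n' "$line" | sed 's/^scrub: .* on //')
echo "$stamp"

# On FreeBSD the script then converts the result to epoch seconds:
#   date -jn -f "%+" "$stamp" +"%s"
```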
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2009-05-18 03:56:46 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
Comment 2 Giorgos Keramidas freebsd_committer freebsd_triage 2009-05-19 04:29:07 UTC
On Mon, 18 May 2009 02:57:00 GMT, linimon@FreeBSD.org wrote:
> Synopsis: [zfs] [patch] Automatic disk scrubbing from periodic(8)
>
> Responsible-Changed-From-To: freebsd-bugs->freebsd-fs
> Responsible-Changed-By: linimon
> Responsible-Changed-When: Mon May 18 02:56:46 UTC 2009
> Responsible-Changed-Why: 
> Over to maintainer(s).
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=121366

A variation of the same idea that should auto-kick scrubbing if the
'age' of the last scrub exceeds a configurable threshold would be nice
too.  The _untested_ patch below is a start at implementing that:

    http://people.freebsd.org/~keramida/diff/zfs-scrub.periodic.diff

I'll give this a few test runs and if it looks useful resubmit with the
matching manpage updates too.

%%%
diff -r cc894c49974d etc/defaults/periodic.conf
--- a/etc/defaults/periodic.conf	Mon May 18 10:31:18 2009 +0300
+++ b/etc/defaults/periodic.conf	Tue May 19 06:27:09 2009 +0300
@@ -226,6 +226,11 @@
 pkg_version=pkg_version					# Use this program
 pkg_version_index=/usr/ports/INDEX-8			# Use this index file
 
+# 500.zfs-scrub
+weekly_status_zfs_scrub_enable="NO"			# Scrub zfs pools
+weekly_status_zfs_scrub_auto="NO"			# Auto-scrub old pools
+weekly_status_zfs_scrub_max_age="1728000"		# Scrub age in seconds
+
 # 999.local
 weekly_local="/etc/weekly.local"			# Local scripts
 
diff -r cc894c49974d etc/periodic/weekly/500.zfs-scrub
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/etc/periodic/weekly/500.zfs-scrub	Tue May 19 06:27:09 2009 +0300
@@ -0,0 +1,131 @@
+#!/bin/sh
+#
+# $FreeBSD$
+#
+# Report the last time a zfs pool was scrubbed and if auto-scrubbing is
+# enabled, kick off a 'zpool scrub' run for pools that were checked too
+# far in the past.
+#
+
+# If there is a global system configuration file, suck it in.
+if [ -r /etc/defaults/periodic.conf ]
+then
+    . /etc/defaults/periodic.conf
+    source_periodic_confs
+fi
+
+# scrubdate
+#	Find the time since the Epoch of the last pool scrub for all
+#	active zfs pools (0=never, -1=running now).
+scrubdate()
+{
+    for pool in $(/sbin/zpool list -H -o name) ; do
+	/sbin/zpool status "${pool}" |
+	while read line ; do
+	    case "${line}" in
+		scrub:\ scrub\ completed*|scrub:\ resilver\ completed*)
+		    # Extract date from zpool output and convert to epoch.
+		    date=$(echo $line | sed 's/^scrub: .* on //')
+		    date=$(date -jn -f '%+' "$date" '+%s')
+		    ;;
+		scrub:\ scrub\ in\ progress*|scrub:\ resilver\ in\ progress*)
+		    # Scrub or resilver is running now.
+		    date='-1'
+		    ;;
+		scrub:\ none\ requested)
+		    # Pool has never been scrubbed or resilvered.
+		    date='0'
+		    ;;
+		# Any other line of 'zpool status' output is skipped.
+		*)  continue ;;
+	    esac
+	    echo "${date} ${pool}"
+	done
+    done
+}
+
+# scrubpool POOL TIME
+#	Kick off a scrub of zfs pool `POOL' if the `TIME' of its last
+#	scrub is too old and auto-scrubbing is enabled.
+scrubpool()
+{
+    local _pool="$1"
+    local _time="$2"
+
+    case "${weekly_status_zfs_scrub_auto}" in
+	[Yy][Ee][Ss])
+	    # Reject an empty or non-numeric age limit.
+	    case "${weekly_status_zfs_scrub_max_age}" in
+		''|*[!0-9]*)
+		    echo '$weekly_status_zfs_scrub_max_age is not set' \
+			'properly - see periodic.conf(5)'
+		    return 1
+		    ;;
+	    esac
+	    now=$(date '+%s')
+	    if [ -z "${_time}" ]; then
+		echo "Cannot find scrub age for pool ${_pool} -" \
+		    "forcing a scrub run"
+		/sbin/zpool scrub "${_pool}"
+		return $?
+	    fi
+	    age=$(( now - _time ))
+	    if [ "${age}" -gt "${weekly_status_zfs_scrub_max_age}" ]; then
+		echo "Last scrub for pool ${_pool} was ${age} seconds ago."
+		echo "Starting automatic scrub run."
+		/sbin/zpool scrub "${_pool}"
+		return $?
+	    fi
+	    ;;
+
+	[Nn][Oo])
+	    return 0
+	    ;;
+
+	*)
+	    echo '$weekly_status_zfs_scrub_auto is not set properly -' \
+		'see periodic.conf(5)'
+	    return 1
+	    ;;
+    esac
+}
+
+rc=0
+case "$weekly_status_zfs_scrub_enable" in
+    [Yy][Ee][Ss])
+	havepools=0
+	pools=$(scrubdate | sort -n)	# no pipe into the loop: keep rc
+	while read date pool ; do
+	    [ -n "${pool}" ] || continue
+	    havepools=1
+	    echo ""
+	    case ${date} in
+		-1)
+		    echo "Scrub or resilver is running for pool $pool:"
+		    /sbin/zpool status ${pool}
+		    status=$?
+		    [ ${status} -ne 0 ] && rc=${status}
+		    ;;
+		*)
+		    scrubpool ${pool} ${date}
+		    status=$?
+		    [ ${status} -ne 0 ] && rc=${status}
+		    ;;
+	    esac
+	done <<EOF
+${pools}
+EOF
+	if [ ${havepools} -eq 0 ]; then
+	    echo '$weekly_status_zfs_scrub_enable is set' \
+		'but no zfs pools have been created'
+	    rc=2
+	fi
+	;;
+    [Nn][Oo])
+	;;
+    *)
+	echo '$weekly_status_zfs_scrub_enable is not set properly -' \
+	    'see periodic.conf(5)'
+	;;
+esac
+exit $rc
%%%
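
The age test at the heart of the patch reduces to one comparison.  A
minimal sketch, with a hypothetical helper name (scrub_is_due) and the
timestamps passed in explicitly so the logic can be checked without a
pool.  1728000 seconds is the default from the patch, i.e. 20 days.

```shell
#!/bin/sh
# scrub_is_due LAST NOW MAX_AGE
#   Succeed (exit 0) when the last scrub is more than MAX_AGE seconds old.
#   LAST is seconds since the Epoch, or 0 if the pool was never scrubbed.
scrub_is_due() {
    [ $(( $2 - $1 )) -gt "$3" ]
}

max_age=1728000                 # 20 days, the default in the patch
now=1242700000                  # fixed "current time" for the example

if scrub_is_due 1240000000 "$now" "$max_age"; then
    echo "scrub due"
else
    echo "scrub not due"
fi
```

A pool that was never scrubbed (LAST=0) is always older than the limit,
so it needs no special case in the comparison.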
Comment 3 Pawel Jakub Dawidek freebsd_committer freebsd_triage 2009-09-07 15:58:05 UTC
On Tue, Mar 04, 2008 at 08:46:51PM +0100, Stefan Moeding wrote:
> 
> >Number:         121366
> >Category:       bin
> >Synopsis:       [zfs] [patch] Automatic disk scrubbing from periodic(8)
> >Confidential:   no
> >Severity:       non-critical
> >Priority:       low
> >Responsible:    freebsd-bugs
> >State:          open
> >Quarter:        
> >Keywords:       
> >Date-Required:
> >Class:          change-request
> >Submitter-Id:   current-users
> >Arrival-Date:   Tue Mar 04 20:10:01 UTC 2008
> >Closed-Date:
> >Last-Modified:
> >Originator:     Stefan Moeding
> >Release:        FreeBSD 7.0-STABLE i386
> >Organization:
> >Environment:
> System: FreeBSD elan.setuid.de 7.0-STABLE FreeBSD 7.0-STABLE #23: Sat Mar  1 14:17:18 CET 2008 root@elan.setuid.de:/usr/obj/usr/src/sys/ELAN i386
> >Description:
> 
> The ZFS Best Practices Guide recommends disk scrubbing on a regular
> basis.  The attached file fits into the periodic(8) framework to start
> a weekly scrub.  It will look at all pools and choose one of them on a
> round-robin basis depending on the date of the last scrub/resilver
> unless there is already a scrub or resilver running.
> 
> The script should probably be installed to run at the end of the weekly
> schedule to avoid I/O contention between scrubbing and the other weekly
> scripts.
> 
> The new periodic(8) switch 'weekly_zfs_scrubbing_enable' is introduced.
> 
> One drawback: ZFS seems to forget the time of a scrub/resilver when
> shutting down.  On machines with regular reboots and multiple pools
> the script will probably pick the same pool every time.


Maybe you can use user properties to remember the time of the last
automatic scrub? A user property has to contain ':' in its name, e.g.:

	# zfs set org.freebsd:lastscrub=`date "+%s"` pool
	# zpool scrub pool

You can obtain it later with:

	# zfs get -H -o value org.freebsd:lastscrub pool

It can be removed (if needed) with:

	# zfs inherit org.freebsd:lastscrub pool

If you like the idea, would you also like to update your script to use it?
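
A sketch of how a script might use such a property.  The function names
are hypothetical, and the property name is simply the one suggested
above.  'zfs get' reports an unset user property as "-", which the
sketch treats as "never scrubbed".

```shell
#!/bin/sh
prop='org.freebsd:lastscrub'

# normalize_stamp VALUE
#   Print VALUE if it is a plain non-negative integer, otherwise 0.
normalize_stamp() {
    case "$1" in
        ''|*[!0-9]*) echo 0 ;;
        *)           echo "$1" ;;
    esac
}

# last_scrub POOL: print the recorded time of the last automatic scrub.
last_scrub() {
    normalize_stamp "$(zfs get -H -o value "$prop" "$1" 2>/dev/null)"
}

# record_and_scrub POOL: start a scrub and remember when it was started.
record_and_scrub() {
    zpool scrub "$1" && zfs set "$prop=$(date '+%s')" "$1"
}

normalize_stamp '-'             # unset property
normalize_stamp '1252339085'    # stored timestamp
```

Because the property is stored in the pool itself, it should survive a
reboot, which addresses the drawback noted in the original report.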

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
Comment 4 Martin Matuska freebsd_committer freebsd_triage 2011-10-11 08:26:43 UTC
State Changed
From-To: open->closed

Implemented in r209195 by netchild@ (/etc/periodic/daily/800.scrub-zfs)
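
For reference, the committed daily script is configured from
/etc/periodic.conf.  The variable names below are an assumption based on
periodic.conf(5) and should be checked against
/etc/defaults/periodic.conf on the target release:

```shell
# /etc/periodic.conf
daily_scrub_zfs_enable="YES"
daily_scrub_zfs_pools=""                  # empty means all pools
daily_scrub_zfs_default_threshold="35"    # days between scrubs
```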