Bug 134491 - [zfs] Hot spares are rather cold...
Summary: [zfs] Hot spares are rather cold...
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: Unspecified
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-fs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-05-12 17:10 UTC by Michel Bouissou
Modified: 2017-05-10 16:17 UTC

Description Michel Bouissou 2009-05-12 17:10:01 UTC
Although ZFS offers the possibility to define devices as "spares" for MIRROR / RAIDZ / RAIDZ2 storage pools, and FreeBSD will happily accept this, such "spare" devices will *NOT* automagically take over if a RAID pool device fails.

According to http://docs.sun.com/app/docs/doc/819-5461/gcvcw?a=view , I understand that replacing a device with a spare might not be performed by the kernel ZFS module itself, but by an external agent/daemon:
« Automatic replacement: When a fault is received, an FMA agent examines the pool to see if it has any available hot spares. If so, it replaces the faulted device with an available spare. »

I'm unable to find such a tool in FreeBSD; if it exists, it at least isn't active by default. So in the current state, ZFS "spares" have to be activated and deactivated manually when a disk fails or is replaced.

Not only is this suboptimal, it also presents a data loss risk for people who assume that "spares" will do what they are intended for in all usual RAID implementations. They won't: they will just sit there idle if a disk dies, until the admin manually activates them.

This deserves a fix, preferably, or at least a prominent WARNING note.

Also, although the Sun documentation states « Multiple pools can share devices that are designated as hot spares », in the current FreeBSD implementation ZFS will refuse to assign to a pool a "spare" that is already assigned to another pool, stating that the device is "in use", e.g.:

# zpool status
  pool: syspool
 state: ONLINE
 (Blah-blah)

        NAME        STATE     READ WRITE CKSUM
        syspool     ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            aacd1   ONLINE       0     0     0
            aacd2   ONLINE       0     0     0
        spares
          da15      AVAIL

(Blah-blah)

# zpool add vol01 spare da15
invalid vdev specification
use '-f' to override the following errors:
da15 is in use (r1w1e1)

# zpool add -f vol01 spare da15
invalid vdev specification
the following errors must be manually repaired:
da15 is in use (r1w1e1)

How-To-Repeat: Create any redundant ZFS storage pool with a spare device. Hot-remove (or manually "offline") an active device from the pool. The spare won't take over unless a manual "zpool replace <pool_name> <failed_device> <spare_device>" is issued.
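
As an illustration of the manual workaround described above, here is a minimal sketch (not an official tool) of a shell helper that reads "zpool status" output for a single pool on stdin and prints, without running it, the matching "zpool replace" command. The function name suggest_spare_replace is hypothetical, and the parsing assumes the column layout shown in the status output above, with failed devices reported as OFFLINE, FAULTED, UNAVAIL or REMOVED.

```shell
# Hypothetical helper: read `zpool status <pool>` output on stdin and print
# (not execute) a `zpool replace` command pairing the first failed device
# with the first available spare. Assumes the layout shown in this report.
suggest_spare_replace() {
    awk '
        # "pool: <name>" starts a new pool section.
        $1 == "pool:"  { pool = $2; in_spares = 0; next }
        # The "spares" heading introduces the list of spare devices.
        $1 == "spares" { in_spares = 1; next }
        # First spare that is still available.
        in_spares && $2 == "AVAIL" && spare == "" { spare = $1; next }
        # First failed leaf device; skip "state:"-style header lines,
        # the pool line itself and the mirror/raidz grouping lines.
        !in_spares && failed == "" && $1 != pool && $1 !~ /:$/ && $1 !~ /^(mirror|raidz)/ && $2 ~ /^(OFFLINE|FAULTED|UNAVAIL|REMOVED)$/ { failed = $1 }
        END {
            if (pool != "" && failed != "" && spare != "")
                printf "zpool replace %s %s %s\n", pool, failed, spare
        }
    '
}
```

It could be used as "zpool status syspool | suggest_spare_replace"; the printed command still has to be reviewed and issued by the admin, which is exactly the manual step this report complains about.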
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2009-05-12 20:52:40 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
Comment 2 admin 2009-11-30 21:11:55 UTC
Could some ZFS developers comment on this?
Comment 3 Daniel Black 2010-04-16 00:52:47 UTC
A partial/potential solution is described here:
http://lists.freebsd.org/pipermail/freebsd-stable/2010-March/055686.html
Comment 4 Garrett Cooper 2011-10-09 20:20:09 UTC
delphij and I have verified that this issue is resolved on the
zfsd svn branch, but this hasn't been backported to CURRENT and
contains a number of changes to geom, and a handful of changes to zfs.
I'll leave it to the reader to determine where between geom and zfs
things are getting hung up.
Thanks,
-Garrett
Comment 5 Miroslav Lachman 2012-11-11 15:04:43 UTC
Is this issue really so hard that it cannot be solved in 3 years, even
though there is a possible zfsd solution in another branch?

FreeBSD is used for ZFS storage more and more often, and many users are under
the false impression that they have hot spares configured and fully working.
Comment 6 vsjcfm 2014-01-19 10:52:03 UTC
What is the purpose of ZFS spares on FreeBSD if they don't work at all?
I can't see any point in using a spare; I can just run the "replace"
subcommand, right?
Comment 7 Alan Somers freebsd_committer freebsd_triage 2017-05-10 16:17:02 UTC
zfsd(8) is available in FreeBSD 11.0.  Just add "zfsd_enable=YES" to /etc/rc.conf and spares will automatically take over when a disk fails.
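
For reference, a minimal sketch of doing this from a root shell on FreeBSD 11.0 or later, assuming the stock sysrc(8) and service(8) utilities; the pool name syspool is just the example from the description above.

```shell
# Persist zfsd_enable="YES" in /etc/rc.conf (run as root).
sysrc zfsd_enable=YES

# Start the daemon right away instead of waiting for the next boot.
service zfsd start

# Confirm that it is running, then check that the spare is still listed.
service zfsd status
zpool status syspool
```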