Bug 241069 - zfs: scrub does not detect all errors on active spares
Summary: zfs: scrub does not detect all errors on active spares
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Many People
Assignee: freebsd-fs mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-10-05 00:18 UTC by Alan Somers
Modified: 2019-10-30 02:03 UTC

See Also:


Description Alan Somers freebsd_committer 2019-10-05 00:18:14 UTC
This is a partial regression of https://www.illumos.org/issues/8473, which I fixed in illumos rev 554675e (FreeBSD 323813).  Previously, ZFS scrub would never detect errors on active spares.  Now, it detects some of them, but not all.  The problem can be reproduced with the hotspare_test:hotspare_scrub_002_pos test from the ZFS test suite, or by these commands:

# Create four 64 MiB file-backed vdevs
truncate -s 64m /tmp/a /tmp/b /tmp/c /tmp/d
# Build a raidz1 pool from a, b, and c, with d as a hot spare
sudo zpool create -f testpool raidz1 /tmp/a /tmp/b /tmp/c spare /tmp/d
# Activate the spare in place of a
sudo zpool replace testpool /tmp/a /tmp/d
# Zero everything on the spare past its first 1 MiB
/bin/dd if=/dev/zero bs=1024k count=63 oseek=1 conv=notrunc of=/tmp/d
sync
sudo zpool scrub testpool
zpool status testpool # Will show only a few errors
sudo zpool offline testpool /tmp/a
sudo zpool scrub testpool
zpool status testpool # Will show new errors!!!
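
To compare the two scrubs numerically rather than by eye, one option is to total the CKSUM column for the spare after each scrub. The following is a minimal sketch and not part of the ZFS test suite: the helper name cksum_errors is hypothetical, and it assumes the usual "zpool status" config layout (NAME STATE READ WRITE CKSUM) with counts printed as plain integers rather than abbreviated values such as 1.2K.

```sh
# Hypothetical helper: read `zpool status` output on stdin and print the
# total of the CKSUM column (5th field) for every row whose NAME is $1.
# Rows with fewer than five fields are ignored; non-numeric fifth fields
# (for example in the "spares" section) count as zero.
cksum_errors() {
    awk -v dev="$1" '
        $1 == dev && NF >= 5 { sum += $5 }
        END { print sum + 0 }
    '
}
```

For example, run "zpool status testpool | cksum_errors /tmp/d" after each scrub. If the bug is present, the second total would be expected to exceed the first.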
Comment 1 commit-hook freebsd_committer 2019-10-07 20:19:27 UTC
A commit references this bug:

Author: asomers
Date: Mon Oct  7 20:19:06 UTC 2019
New revision: 353288
URL: https://svnweb.freebsd.org/changeset/base/353288

Log:
  ZFS: mark hotspare_scrub_002_pos as an expected failure

  "zpool scrub" doesn't detect all errors on active spares in raidz arrays

  PR:		241069
  MFC after:	2 weeks
  Sponsored by:	Axcient

Changes:
  head/tests/sys/cddl/zfs/tests/hotspare/hotspare_test.sh
Comment 2 commit-hook freebsd_committer 2019-10-30 02:03:48 UTC
A commit references this bug:

Author: asomers
Date: Wed Oct 30 02:03:42 UTC 2019
New revision: 354165
URL: https://svnweb.freebsd.org/changeset/base/354165

Log:
  MFC r353117-r353118, r353281-r353282, r353284-r353289, r353309-r353310, r353360-r353361, r353366, r353379

  r353117:
  ZFS: the hotspare_add_004_neg test needs at least two disks

  Sponsored by:	Axcient

  r353118:
  ZFS: fix several of the "zpool create" tests

  * Remove zpool_create_013_neg.  FreeBSD doesn't have an equivalent of
    Solaris's metadevices.  GEOM would be the equivalent, but since all geoms
    are the same from ZFS's perspective, this test would be redundant with
    zpool_create_012_neg

  * Remove zpool_create_014_neg.  FreeBSD does not support swapping to regular
    files.

  * Remove zpool_create_016_pos.  This test is redundant with literally every
    other test that creates a disk-backed pool.

  * s:/etc/vfstab:/etc/fstab in zpool_create_011_neg

  * Delete the VTOC-related portion of zpool_create_008_pos.  FreeBSD doesn't
    use VTOC.

  * Replace dumpadm with dumpon and swap with swapon in multiple tests.

  * In zpool_create_015_neg, don't require "zpool create -n" to fail.  It's
    reasonable for that variant to succeed, because it doesn't actually open
    the zvol.

  * Greatly simplify zpool_create_012_neg.  Make it safer, too, by not
    interfering with the system's regular swap devices.

  * Expect zpool_create_011_neg to fail (PR 241070)

  * Delete some redundant cleanup steps in various tests

  * Remove some unneeded ATF timeout specifications.  The default is fine.

  PR:		241070
  Sponsored by:	Axcient

  r353281:
  ZFS: fix several zvol_misc tests

  * Adapt zvol_misc_001_neg to use dumpon instead of Solaris's dumpadm
  * Disable zvol_misc_003_neg, zvol_misc_005_neg, and zvol_misc_006_pos,
    because they involve using a zvol as a dump device, which FreeBSD does not
    yet support.

  Sponsored by:	Axcient

  r353282:
  zfs: fix the slog_012_neg test

  This test attempts to corrupt a file-backed vdev by deleting it and then
  recreating it with truncate.  But that doesn't work, because the pool
  already has the vdev open, and it happily hangs on to the open-but-deleted
  file.  Fix by truncating the file without deleting it.

  Sponsored by:	Axcient

  r353284:
  ZFS: fix the zpool_get_002_pos test

  ZFS has grown some additional properties that hadn't been added to the
  config file yet.  While I'm here, improve the error message, and remove a
  superfluous command.

  Sponsored by:	Axcient

  r353285:
  zfs: fix the zdb_001_neg test

  The test needed to be updated for r331701 (MFV illumos 8671400), which added
  a "-k" option.

  Sponsored by:	Axcient

  r353286:
  zfs: skip the zfsd tests if zfsd is not running

  Sponsored by:	Axcient
  Differential Revision:	https://reviews.freebsd.org/D21878

  r353287:
  ZFS: fix the delegate tests

  These tests have never worked correctly.

  * Replace runwattr with sudo
  * Fix a scoping bug with the "dtst" variable
  * Clean up user properties created during tests
  * Eliminate the checks for refreservation and send support. They will always
    be supported.
  * Fix verify_fs_snapshot. It seemed to assume that permissions would not yet
    be delegated, but that's not how it's actually used.
  * Combine verify_fs_promote with verify_vol_promote
  * Remove some useless sleeps
  * Fix backwards condition in verify_vol_volsize
  * Remove some redundant cleanup steps in the tests. cleanup.ksh will handle
    everything.
  * Disable some parts of the tests that FreeBSD doesn't support:
      * Creating snapshots with mkdir
      * devices
      * shareiscsi
      * sharenfs
      * xattr
      * zoned

  The sharenfs parts could probably be reenabled with more work to remove the
  Solarisms.

  Sponsored by:	Axcient
  Differential Revision:	https://reviews.freebsd.org/D21898

  r353288:
  ZFS: mark hotspare_scrub_002_pos as an expected failure

  "zpool scrub" doesn't detect all errors on active spares in raidz arrays

  PR:		241069
  Sponsored by:	Axcient

  r353289:
  ZFS: fix the redundancy tests

  * Fix force_sync_path, which ensures that a file is fully flushed to disk.
    Apparently "zpool history"'s performance has improved, but exporting and
    importing the pool still works.
  * Fix file_dva by using undocumented zdb syntax to clarify that we're
    interested in the pool's root file system, not the pool itself. This
    should also fix the zpool_clear_001_pos test.
  * Remove a redundant cleanup step

  Sponsored by:	Axcient
  Differential Revision:	https://reviews.freebsd.org/D21901

  r353309:
  zfs: fix the zfsd_autoreplace_003_pos test

  The test declared that it only needed 5 disks, but actually tried to use 6.
  Fix it to use just 5, which is all it really needs.

  Sponsored by:	Axcient

  r353310:
  zfs: fix the zfsd_hotspare_007_pos test

  It was trying to destroy the pool while zfsd was detaching the spare, and
  "zpool destroy" failed.  Fix by waiting until the spare has fully detached.

  Sponsored by:	Axcient

  r353360:
  ZFS: multiple fixes to the zpool_import tests

  * Don't create a UFS mountpoint just to store some temporary files.  The
    tests should always be executed with a sufficiently large TMPDIR.
    Creating the UFS mountpoint is not only unnecessary, but it slowed
    zpool_import_missing_002_pos greatly, because that test moves large files
    between TMPDIR and the UFS mountpoint.  This change also allows many of
    the tests to be executed with just a single test disk, instead of two.

  * Move zpool_import_missing_002_pos's backup device dir from / to $PWD to
    prevent cross-device moves.  On my system, these two changes improved that
    test's speed by 39x.  It should also prevent ENOSPC errors seen in CI.

  * If insufficient disks are available, don't try to partition one of them.
    Just rely on Kyua to skip the test.  Users who care will configure Kyua
    with sufficient disks.

  Sponsored by:	Axcient

  r353361:
  ZFS: in the tests, don't override PWD

  The ZFS test suite was overriding the common $PWD variable with the path to
  the pwd command, even though no test wanted to use it that way.  Most tests
  didn't notice, because ksh93 eventually restored it to its proper meaning.

  Sponsored by:	Axcient

  r353366:
  ZFS: fix the zpool_add_010_pos test

  The test is necessarily racy, because it depends on being able to complete a
  "zpool add" before a previous resilver finishes.  But it was racier than it
  needed to be.  Move the first "zpool add" to before the resilver starts.

  Sponsored by:	Axcient

  r353379:
  zfs: multiple improvements to the zpool_add tests

  * Don't partition a disk if too few are available.  Just rely on Kyua to
    ensure that the tests aren't run with insufficient disks.

  * Remove redundant cleanup steps

  * In zpool_add_003_pos, store the temporary file in $PWD so Kyua will
    automatically clean it up.

  * Update zpool_add_005_pos to use dumpon instead of dumpadm.  This test had
    never been ported to FreeBSD.

  * In zpool_add_005_pos, don't format the dump disk with UFS.  That was
    pointless.

  Sponsored by:	Axcient

  _M   12
  M    12/ObsoleteFiles.inc
  M    12/tests/sys/cddl/zfs/include/commands.txt
  M    12/tests/sys/cddl/zfs/include/libtest.kshlib
  M    12/tests/sys/cddl/zfs/tests/cli_root/zdb/zdb_001_neg.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/cleanup.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/setup.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_001_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_002_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_003_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_004_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_005_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_006_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_007_neg.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_008_neg.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_009_neg.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_010_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_test.sh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/Makefile
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create.kshlib
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_008_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_011_neg.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_012_neg.ksh
  D    12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_013_neg.ksh
  D    12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_014_neg.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_015_neg.ksh
  D    12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_016_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_test.sh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_get/zpool_get.cfg
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_get/zpool_get_002_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_import/cleanup.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_import/setup.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_import/zpool_import.cfg
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_import/zpool_import_all_001_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/cli_root/zpool_import/zpool_import_test.sh
  M    12/tests/sys/cddl/zfs/tests/delegate/delegate_common.kshlib
  M    12/tests/sys/cddl/zfs/tests/delegate/zfs_allow_001_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/delegate/zfs_allow_002_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/delegate/zfs_allow_003_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/delegate/zfs_allow_007_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/delegate/zfs_allow_010_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/delegate/zfs_allow_012_neg.ksh
  M    12/tests/sys/cddl/zfs/tests/delegate/zfs_allow_test.sh
  M    12/tests/sys/cddl/zfs/tests/delegate/zfs_unallow_007_neg.ksh
  M    12/tests/sys/cddl/zfs/tests/delegate/zfs_unallow_test.sh
  M    12/tests/sys/cddl/zfs/tests/hotspare/hotspare_test.sh
  M    12/tests/sys/cddl/zfs/tests/redundancy/redundancy.kshlib
  M    12/tests/sys/cddl/zfs/tests/redundancy/redundancy_001_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/slog/slog_012_neg.ksh
  M    12/tests/sys/cddl/zfs/tests/zfsd/zfsd_autoreplace_003_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/zfsd/zfsd_hotspare_007_pos.ksh
  M    12/tests/sys/cddl/zfs/tests/zfsd/zfsd_test.sh
  M    12/tests/sys/cddl/zfs/tests/zvol/zvol_misc/zvol_misc_001_neg.ksh
  M    12/tests/sys/cddl/zfs/tests/zvol/zvol_misc/zvol_misc_test.sh

Changes:
_U  stable/12/
  stable/12/ObsoleteFiles.inc
  stable/12/tests/sys/cddl/zfs/include/commands.txt
  stable/12/tests/sys/cddl/zfs/include/libtest.kshlib
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zdb/zdb_001_neg.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/cleanup.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/setup.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_001_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_002_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_003_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_004_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_005_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_006_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_007_neg.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_008_neg.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_009_neg.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_010_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_add/zpool_add_test.sh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/Makefile
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create.kshlib
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_008_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_011_neg.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_012_neg.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_013_neg.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_014_neg.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_015_neg.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_016_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_create/zpool_create_test.sh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_get/zpool_get.cfg
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_get/zpool_get_002_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_import/cleanup.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_import/setup.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_import/zpool_import.cfg
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_import/zpool_import_all_001_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/cli_root/zpool_import/zpool_import_test.sh
  stable/12/tests/sys/cddl/zfs/tests/delegate/delegate_common.kshlib
  stable/12/tests/sys/cddl/zfs/tests/delegate/zfs_allow_001_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/delegate/zfs_allow_002_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/delegate/zfs_allow_003_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/delegate/zfs_allow_007_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/delegate/zfs_allow_010_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/delegate/zfs_allow_012_neg.ksh
  stable/12/tests/sys/cddl/zfs/tests/delegate/zfs_allow_test.sh
  stable/12/tests/sys/cddl/zfs/tests/delegate/zfs_unallow_007_neg.ksh
  stable/12/tests/sys/cddl/zfs/tests/delegate/zfs_unallow_test.sh
  stable/12/tests/sys/cddl/zfs/tests/hotspare/hotspare_test.sh
  stable/12/tests/sys/cddl/zfs/tests/redundancy/redundancy.kshlib
  stable/12/tests/sys/cddl/zfs/tests/redundancy/redundancy_001_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/slog/slog_012_neg.ksh
  stable/12/tests/sys/cddl/zfs/tests/zfsd/zfsd_autoreplace_003_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/zfsd/zfsd_hotspare_007_pos.ksh
  stable/12/tests/sys/cddl/zfs/tests/zfsd/zfsd_test.sh
  stable/12/tests/sys/cddl/zfs/tests/zvol/zvol_misc/zvol_misc_001_neg.ksh
  stable/12/tests/sys/cddl/zfs/tests/zvol/zvol_misc/zvol_misc_test.sh